Re: [Qemu-devel] [PATCH v14 05/20] qemu-img: Update documentation for --share-rw

2017-04-23 Thread Fam Zheng
On Fri, 04/21 10:37, Eric Blake wrote:
> On 04/20/2017 10:55 PM, Fam Zheng wrote:
> > Signed-off-by: Fam Zheng 
> > ---
> >  qemu-img-cmds.hx | 48 
> >  1 file changed, 24 insertions(+), 24 deletions(-)
> > 
> > diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx
> > index 8ac7822..1b00bb8 100644
> > --- a/qemu-img-cmds.hx
> > +++ b/qemu-img-cmds.hx
> > @@ -10,15 +10,15 @@ STEXI
> >  ETEXI
> >  
> >  DEF("bench", img_bench,
> > -"bench [-c count] [-d depth] [-f fmt] 
> > [--flush-interval=flush_interval] [-n] [--no-drain] [-o offset] 
> > [--pattern=pattern] [-q] [-s buffer_size] [-S step_size] [-t cache] [-w] 
> > filename")
> > +"bench [-c count] [-d depth] [-f fmt] 
> > [--flush-interval=flush_interval] [-n] [--no-drain] [-o offset] 
> > [--pattern=pattern] [-q] [-s buffer_size] [-S step_size] [-t cache] [-w] 
> > [--share-rw] filename")
> 
> General comment - it seems that we favor the short-option spelling where
> one exists; should all of these updates mention -U instead of --share-rw?

OK, I can change it.

> 
> Also, why did you rename it from --unsafe-reads in an earlier revision?
> After all, if I'm understanding this flag correctly, what you are asking
> for is the ability to read the image in spite of other simultaneous
> writers that may make your reads inconsistent.

It came out of a discussion with Kevin on IRC: "consistent read" in the new
op blocker API refers specifically to the state of the intermediate nodes in a
commit job, and is orthogonal to the share-rw semantics added to qdev. This
option for qemu-img/qemu-io is closer to the latter, so the name was updated
to reflect its use case better.

Fam



[Qemu-devel] [PULL 8/8] COLO-compare: Optimize tcp compare trace event

2017-04-23 Thread Jason Wang
From: Zhang Chen 

Merge two trace events into one and adjust the print format to make
it easier to read. Rename trace_colo_compare_pkt_info_src/dst
to trace_colo_compare_tcp_info.
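
Merging the two events means both addresses are passed to a single call,
and inet_ntoa() returns a pointer to one statically allocated buffer that
the next call overwrites. That is presumably why the patch copies each
result into a local array first. A minimal sketch of the same pattern
follows; format_addr_pair() is a hypothetical helper for illustration and
is not part of the patch.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): write "src/dst" into out.
 * Each inet_ntoa() result is copied into its own buffer before the next
 * call, because inet_ntoa() reuses a single static buffer.
 */
int format_addr_pair(struct in_addr src, struct in_addr dst,
                     char *out, size_t out_len)
{
    /* INET_ADDRSTRLEN (16) holds "255.255.255.255" plus the NUL. */
    char ip_src[INET_ADDRSTRLEN], ip_dst[INET_ADDRSTRLEN];

    assert(out != NULL && out_len > 0);

    snprintf(ip_src, sizeof(ip_src), "%s", inet_ntoa(src));
    snprintf(ip_dst, sizeof(ip_dst), "%s", inet_ntoa(dst));

    /* Returns the length that was (or would have been) written. */
    return snprintf(out, out_len, "%s/%s", ip_src, ip_dst);
}
```

Passing inet_ntoa(src) and inet_ntoa(dst) directly as two arguments of
one call would hand the callee the same pointer twice, so both fields
would show the same address.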

Signed-off-by: Zhang Chen 
Signed-off-by: Jason Wang 
---
 net/colo-compare.c | 29 +
 net/trace-events   |  3 +--
 2 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 4ab80b1..03ddebe 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -264,18 +264,23 @@ static int colo_packet_compare_tcp(Packet *spkt, Packet 
*ppkt)
 res = -1;
 }
 
-if (res != 0 && trace_event_get_state(TRACE_COLO_COMPARE_MISCOMPARE)) {
-trace_colo_compare_pkt_info_src(inet_ntoa(ppkt->ip->ip_src),
-ntohl(stcp->th_seq),
-ntohl(stcp->th_ack),
-res, stcp->th_flags,
-spkt->size);
-
-trace_colo_compare_pkt_info_dst(inet_ntoa(ppkt->ip->ip_dst),
-ntohl(ptcp->th_seq),
-ntohl(ptcp->th_ack),
-res, ptcp->th_flags,
-ppkt->size);
+if (res && trace_event_get_state(TRACE_COLO_COMPARE_MISCOMPARE)) {
+char ip_src[20], ip_dst[20];
+
+strcpy(ip_src, inet_ntoa(ppkt->ip->ip_src));
+strcpy(ip_dst, inet_ntoa(ppkt->ip->ip_dst));
+
+trace_colo_compare_tcp_info(ip_src,
+ip_dst,
+ntohl(ptcp->th_seq),
+ntohl(stcp->th_seq),
+ntohl(ptcp->th_ack),
+ntohl(stcp->th_ack),
+res,
+ptcp->th_flags,
+stcp->th_flags,
+ppkt->size,
+spkt->size);
 
 qemu_hexdump((char *)ppkt->data, stderr,
  "colo-compare ppkt", ppkt->size);
diff --git a/net/trace-events b/net/trace-events
index 35198bc..123cb28 100644
--- a/net/trace-events
+++ b/net/trace-events
@@ -13,8 +13,7 @@ colo_compare_icmp_miscompare(const char *sta, int size) ": %s 
= %d"
 colo_compare_ip_info(int psize, const char *sta, const char *stb, int ssize, 
const char *stc, const char *std) "ppkt size = %d, ip_src = %s, ip_dst = %s, 
spkt size = %d, ip_src = %s, ip_dst = %s"
 colo_old_packet_check_found(int64_t old_time) "%" PRId64
 colo_compare_miscompare(void) ""
-colo_compare_pkt_info_src(const char *src, uint32_t sseq, uint32_t sack, int 
res, uint32_t sflag, int ssize) "src/dst: %s s: seq/ack=%u/%u res=%d flags=%x 
spkt_size: %d\n"
-colo_compare_pkt_info_dst(const char *dst, uint32_t dseq, uint32_t dack, int 
res, uint32_t dflag, int dsize) "src/dst: %s d: seq/ack=%u/%u res=%d flags=%x 
dpkt_size: %d\n"
+colo_compare_tcp_info(const char *src, const char *dst, uint32_t pseq, 
uint32_t sseq, uint32_t pack, uint32_t sack, int res, uint32_t pflag, uint32_t 
sflag, int psize, int ssize) "src/dst: %s/%s pseq/sseq:%u/%u pack/sack:%u/%u 
res=%d pflags/sflag:%x/%x psize/ssize:%d/%d \n"
 
 # net/filter-rewriter.c
 colo_filter_rewriter_debug(void) ""
-- 
2.7.4




[Qemu-devel] [PULL 6/8] slirp: add a fake NC-SI backend

2017-04-23 Thread Jason Wang
From: Cédric Le Goater 

NC-SI (Network Controller Sideband Interface) enables a BMC to manage
a set of NICs on a system. This model takes the simplest approach and
reverses the NC-SI packets to pretend a NIC is present and exercise
the Linux driver.
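
As a rough illustration of what "pretending a NIC is present" can look
like, the sketch below turns a command header into a "command completed"
response header. It reuses the header layout from slirp/ncsi-pkt.h in
this patch. The helper name ncsi_fake_response(), the convention that a
response type is the command type with bit 7 set, and the zero
code/reason values are assumptions for illustration; the body of
slirp/ncsi.c is not shown in this message.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Common header layout, as in the slirp/ncsi-pkt.h added by this patch. */
struct ncsi_pkt_hdr {
    unsigned char mc_id;        /* Management controller ID */
    unsigned char revision;     /* NCSI version             */
    unsigned char reserved;     /* Reserved                 */
    unsigned char id;           /* Packet sequence number   */
    unsigned char type;         /* Packet type              */
    unsigned char channel;      /* Network controller ID    */
    uint16_t      length;       /* Payload length (big-endian) */
    uint32_t      reserved1[2]; /* Reserved                 */
};

struct ncsi_rsp_pkt_hdr {
    struct ncsi_pkt_hdr common; /* Common NCSI packet header  */
    uint16_t code;              /* Response code (big-endian) */
    uint16_t reason;            /* Response reason (big-endian) */
};

/* Assumption: a response carries the command type with bit 7 set. */
#define NCSI_RSP_TYPE_FLAG 0x80

/*
 * Hypothetical helper: build a "completed, no error" response header
 * that echoes the identifying fields of the command.
 */
void ncsi_fake_response(const struct ncsi_pkt_hdr *cmd,
                        struct ncsi_rsp_pkt_hdr *rsp)
{
    assert(cmd != NULL && rsp != NULL);

    memset(rsp, 0, sizeof(*rsp));
    rsp->common.mc_id    = cmd->mc_id;
    rsp->common.revision = cmd->revision;
    rsp->common.id       = cmd->id;
    rsp->common.type     = cmd->type | NCSI_RSP_TYPE_FLAG;
    rsp->common.channel  = cmd->channel;
    rsp->common.length   = htons(4); /* code + reason */
    rsp->code            = htons(0); /* assumed: command completed */
    rsp->reason          = htons(0); /* assumed: no error */
}
```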

The NCSI header file  comes from mainline Linux and was
untabified.

Signed-off-by: Cédric Le Goater 
Reviewed-by: Philippe Mathieu-Daudé 
Acked-by: Samuel Thibault 
Signed-off-by: Jason Wang 
---
 include/net/eth.h   |   1 +
 slirp/Makefile.objs |   2 +-
 slirp/ncsi-pkt.h| 419 
 slirp/ncsi.c| 130 
 slirp/slirp.c   |   4 +
 slirp/slirp.h   |   3 +
 6 files changed, 558 insertions(+), 1 deletion(-)
 create mode 100644 slirp/ncsi-pkt.h
 create mode 100644 slirp/ncsi.c

diff --git a/include/net/eth.h b/include/net/eth.h
index afeb45b..09054a5 100644
--- a/include/net/eth.h
+++ b/include/net/eth.h
@@ -209,6 +209,7 @@ struct tcp_hdr {
 #define ETH_P_IPV6(0x86dd)
 #define ETH_P_VLAN(0x8100)
 #define ETH_P_DVLAN   (0x88a8)
+#define ETH_P_NCSI(0x88f8)
 #define ETH_P_UNKNOWN (0x)
 #define VLAN_VID_MASK 0x0fff
 #define IP_HEADER_VERSION_4   (4)
diff --git a/slirp/Makefile.objs b/slirp/Makefile.objs
index 1baa1f1..28049b0 100644
--- a/slirp/Makefile.objs
+++ b/slirp/Makefile.objs
@@ -2,4 +2,4 @@ common-obj-y = cksum.o if.o ip_icmp.o ip6_icmp.o ip6_input.o 
ip6_output.o \
ip_input.o ip_output.o dnssearch.o dhcpv6.o
 common-obj-y += slirp.o mbuf.o misc.o sbuf.o socket.o tcp_input.o tcp_output.o
 common-obj-y += tcp_subr.o tcp_timer.o udp.o udp6.o bootp.o tftp.o arp_table.o 
\
-ndp_table.o
+ndp_table.o ncsi.o
diff --git a/slirp/ncsi-pkt.h b/slirp/ncsi-pkt.h
new file mode 100644
index 000..ea07d1c
--- /dev/null
+++ b/slirp/ncsi-pkt.h
@@ -0,0 +1,419 @@
+/*
+ * Copyright Gavin Shan, IBM Corporation 2016.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#ifndef NCSI_PKT_H
+#define NCSI_PKT_H
+
+/* from linux/net/ncsi/ncsi-pkt.h */
+#define __be32 uint32_t
+#define __be16 uint16_t
+
+struct ncsi_pkt_hdr {
+unsigned char mc_id;/* Management controller ID */
+unsigned char revision; /* NCSI version - 0x01  */
+unsigned char reserved; /* Reserved */
+unsigned char id;   /* Packet sequence number   */
+unsigned char type; /* Packet type  */
+unsigned char channel;  /* Network controller ID*/
+__be16length;   /* Payload length   */
+__be32reserved1[2]; /* Reserved */
+};
+
+struct ncsi_cmd_pkt_hdr {
+struct ncsi_pkt_hdr common; /* Common NCSI packet header */
+};
+
+struct ncsi_rsp_pkt_hdr {
+struct ncsi_pkt_hdr common; /* Common NCSI packet header */
+__be16  code;   /* Response code */
+__be16  reason; /* Response reason   */
+};
+
+struct ncsi_aen_pkt_hdr {
+struct ncsi_pkt_hdr common;   /* Common NCSI packet header */
+unsigned char   reserved2[3]; /* Reserved  */
+unsigned char   type; /* AEN packet type   */
+};
+
+/* NCSI common command packet */
+struct ncsi_cmd_pkt {
+struct ncsi_cmd_pkt_hdr cmd;  /* Command header */
+__be32  checksum; /* Checksum   */
+unsigned char   pad[26];
+};
+
+struct ncsi_rsp_pkt {
+struct ncsi_rsp_pkt_hdr rsp;  /* Response header */
+__be32  checksum; /* Checksum*/
+unsigned char   pad[22];
+};
+
+/* Select Package */
+struct ncsi_cmd_sp_pkt {
+struct ncsi_cmd_pkt_hdr cmd;/* Command header */
+unsigned char   reserved[3];/* Reserved   */
+unsigned char   hw_arbitration; /* HW arbitration */
+__be32  checksum;   /* Checksum   */
+unsigned char   pad[22];
+};
+
+/* Disable Channel */
+struct ncsi_cmd_dc_pkt {
+struct ncsi_cmd_pkt_hdr cmd; /* Command header  */
+unsigned char   reserved[3]; /* Reserved*/
+unsigned char   ald; /* Allow link down */
+__be32  checksum;/* Checksum*/
+unsigned char   pad[22];
+};
+
+/* Reset Channel */
+struct ncsi_cmd_rc_pkt {
+struct ncsi_cmd_pkt_hdr cmd;  /* Command header */
+__be32  

[Qemu-devel] [PULL 2/8] hw/net: add MII definitions

2017-04-23 Thread Jason Wang
From: Cédric Le Goater 

This adds comments on the Basic mode control and status registers bit
definitions. It also adds a couple of bits for 1000BASE-T and the
RealTek 8211E PHY for the FTGMAC100 model to use.
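
To show how the two speed bits combine, here is a small sketch that
decodes the forced speed from a BMCR value. The bit positions are the
ones defined in this patch; the helper mii_bmcr_speed_mbps() is
hypothetical, and the result is only meaningful when autonegotiation is
disabled.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions as defined in include/hw/net/mii.h by this patch. */
#define MII_BMCR_SPEED100   (1 << 13) /* LSB of Speed (100) */
#define MII_BMCR_SPEED1000  (1 << 6)  /* MSB of Speed (1000) */

/*
 * Hypothetical helper: return the forced link speed in Mbps encoded by
 * the two BMCR speed-selection bits, or 0 for the reserved combination
 * where both bits are set.
 */
int mii_bmcr_speed_mbps(uint16_t bmcr)
{
    int msb = (bmcr & MII_BMCR_SPEED1000) != 0;
    int lsb = (bmcr & MII_BMCR_SPEED100) != 0;

    if (msb && lsb) {
        return 0;       /* reserved */
    }
    if (msb) {
        return 1000;
    }
    if (lsb) {
        return 100;
    }
    return 10;
}
```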

Signed-off-by: Cédric Le Goater 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Jason Wang 
---
 include/hw/net/mii.h | 71 +++-
 1 file changed, 53 insertions(+), 18 deletions(-)

diff --git a/include/hw/net/mii.h b/include/hw/net/mii.h
index 9fdd7bb..6ce48a6 100644
--- a/include/hw/net/mii.h
+++ b/include/hw/net/mii.h
@@ -22,13 +22,20 @@
 #define MII_H
 
 /* PHY registers */
-#define MII_BMCR0
-#define MII_BMSR1
-#define MII_PHYID1  2
-#define MII_PHYID2  3
-#define MII_ANAR4
-#define MII_ANLPAR  5
-#define MII_ANER6
+#define MII_BMCR0  /* Basic mode control register */
+#define MII_BMSR1  /* Basic mode status register */
+#define MII_PHYID1  2  /* ID register 1 */
+#define MII_PHYID2  3  /* ID register 2 */
+#define MII_ANAR4  /* Autonegotiation advertisement */
+#define MII_ANLPAR  5  /* Autonegotiation lnk partner abilities */
+#define MII_ANER6  /* Autonegotiation expansion */
+#define MII_ANNP7  /* Autonegotiation next page */
+#define MII_ANLPRNP 8  /* Autonegotiation link partner rx next page */
+#define MII_CTRL10009  /* 1000BASE-T control */
+#define MII_STAT100010 /* 1000BASE-T status */
+#define MII_MDDACR  13 /* MMD access control */
+#define MII_MDDAADR 14 /* MMD access address data */
+#define MII_EXTSTAT 15 /* Extended Status */
 #define MII_NSR 16
 #define MII_LBREMR  17
 #define MII_REC 18
@@ -38,19 +45,33 @@
 /* PHY registers fields */
 #define MII_BMCR_RESET  (1 << 15)
 #define MII_BMCR_LOOPBACK   (1 << 14)
-#define MII_BMCR_SPEED  (1 << 13)
-#define MII_BMCR_AUTOEN (1 << 12)
-#define MII_BMCR_FD (1 << 8)
+#define MII_BMCR_SPEED100   (1 << 13)  /* LSB of Speed (100) */
+#define MII_BMCR_SPEED  MII_BMCR_SPEED100
+#define MII_BMCR_AUTOEN (1 << 12) /* Autonegotiation enable */
+#define MII_BMCR_PDOWN  (1 << 11) /* Enable low power state */
+#define MII_BMCR_ISOLATE(1 << 10) /* Isolate data paths from MII */
+#define MII_BMCR_ANRESTART  (1 << 9)  /* Auto negotiation restart */
+#define MII_BMCR_FD (1 << 8)  /* Set duplex mode */
+#define MII_BMCR_CTST   (1 << 7)  /* Collision test */
+#define MII_BMCR_SPEED1000  (1 << 6)  /* MSB of Speed (1000) */
 
-#define MII_BMSR_100TX_FD   (1 << 14)
-#define MII_BMSR_100TX_HD   (1 << 13)
-#define MII_BMSR_10T_FD (1 << 12)
-#define MII_BMSR_10T_HD (1 << 11)
-#define MII_BMSR_MFPS   (1 << 6)
-#define MII_BMSR_AN_COMP(1 << 5)
-#define MII_BMSR_AUTONEG(1 << 3)
-#define MII_BMSR_LINK_ST(1 << 2)
+#define MII_BMSR_100TX_FD   (1 << 14) /* Can do 100mbps, full-duplex */
+#define MII_BMSR_100TX_HD   (1 << 13) /* Can do 100mbps, half-duplex */
+#define MII_BMSR_10T_FD (1 << 12) /* Can do 10mbps, full-duplex */
+#define MII_BMSR_10T_HD (1 << 11) /* Can do 10mbps, half-duplex */
+#define MII_BMSR_100T2_FD   (1 << 10) /* Can do 100mbps T2, full-duplex */
+#define MII_BMSR_100T2_HD   (1 << 9)  /* Can do 100mbps T2, half-duplex */
+#define MII_BMSR_EXTSTAT(1 << 8)  /* Extended status in register 15 */
+#define MII_BMSR_MFPS   (1 << 6)  /* MII Frame Preamble Suppression */
+#define MII_BMSR_AN_COMP(1 << 5)  /* Auto-negotiation complete */
+#define MII_BMSR_RFAULT (1 << 4)  /* Remote fault */
+#define MII_BMSR_AUTONEG(1 << 3)  /* Able to do auto-negotiation */
+#define MII_BMSR_LINK_ST(1 << 2)  /* Link status */
+#define MII_BMSR_JABBER (1 << 1)  /* Jabber detected */
+#define MII_BMSR_EXTCAP (1 << 0)  /* Ext-reg capability */
 
+#define MII_ANAR_PAUSE_ASYM (1 << 11) /* Try for asymetric pause */
+#define MII_ANAR_PAUSE  (1 << 10) /* Try for pause */
 #define MII_ANAR_TXFD   (1 << 8)
 #define MII_ANAR_TX (1 << 7)
 #define MII_ANAR_10FD   (1 << 6)
@@ -58,17 +79,31 @@
 #define MII_ANAR_CSMACD (1 << 0)
 
 #define MII_ANLPAR_ACK  (1 << 14)
+#define MII_ANLPAR_PAUSEASY (1 << 11) /* can pause asymmetrically */
+#define MII_ANLPAR_PAUSE(1 << 10) /* can pause */
 #define MII_ANLPAR_TXFD (1 << 8)
 #define MII_ANLPAR_TX   (1 << 7)
 #define MII_ANLPAR_10FD (1 << 6)
 #define MII_ANLPAR_10   (1 << 5)
 #define MII_ANLPAR_CSMACD   (1 << 0)
 
+#define MII_ANER_NWAY   (1 << 0) /* Can do N-way auto-nego */
+
+#define MII_CTRL1000_FULL   (1 << 9)  /* 1000BASE-T full duplex */
+#define MII_CTRL1000_HALF   (1 << 8)  /* 1000BASE-T half duplex */
+
+#define MII_STAT1000_FULL   (1 << 11) /* 1000BASE-T full duplex */
+#define MII_STAT1000_HALF   (1 << 10) /* 1000BASE-T half 

[Qemu-devel] [PULL 3/8] net: add FTGMAC100 support

2017-04-23 Thread Jason Wang
From: Cédric Le Goater 

The FTGMAC100 device is an Ethernet controller with DMA function that
can be found on Aspeed SoCs (which include NCSI).

It is fully compliant with IEEE 802.3 specification for 10/100 Mbps
Ethernet and IEEE 802.3z specification for 1000 Mbps Ethernet and
includes Reduced Media Independent Interface (RMII) and Reduced
Gigabit Media Independent Interface (RGMII) interfaces. It adopts an
AHB bus interface and integrates a link list DMA engine with direct
M-Bus accesses for transmitting and receiving packets. It has
independent TX/RX fifos, supports half and full duplex (1000 Mbps mode
only supports full duplex), flow control for full duplex and
backpressure for half duplex.

The FTGMAC100 also implements IP, TCP, UDP checksum offloads and
supports IEEE 802.1Q VLAN tag insertion and removal. It offers a
high-priority transmit queue for QoS and CoS applications.

This model is backed with a RealTek 8211E PHY which is the chip found
on the AST2500 EVB. It is complete enough to satisfy two different
Linux drivers and a U-Boot driver. Unsupported features are:

 - IEEE 802.1Q VLAN
 - High Priority Transmit Queue
 - Wake-On-LAN functions

The code is based on the Coldfire Fast Ethernet Controller model.

Signed-off-by: Cédric Le Goater 
Signed-off-by: Jason Wang 
---
 default-configs/arm-softmmu.mak |1 +
 hw/net/Makefile.objs|1 +
 hw/net/ftgmac100.c  | 1003 +++
 include/hw/net/ftgmac100.h  |   60 +++
 4 files changed, 1065 insertions(+)
 create mode 100644 hw/net/ftgmac100.c
 create mode 100644 include/hw/net/ftgmac100.h

diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
index 1e3bd2b..78d7af0 100644
--- a/default-configs/arm-softmmu.mak
+++ b/default-configs/arm-softmmu.mak
@@ -29,6 +29,7 @@ CONFIG_LAN9118=y
 CONFIG_SMC91C111=y
 CONFIG_ALLWINNER_EMAC=y
 CONFIG_IMX_FEC=y
+CONFIG_FTGMAC100=y
 CONFIG_DS1338=y
 CONFIG_PFLASH_CFI01=y
 CONFIG_PFLASH_CFI02=y
diff --git a/hw/net/Makefile.objs b/hw/net/Makefile.objs
index 6a95d92..5ddaffe 100644
--- a/hw/net/Makefile.objs
+++ b/hw/net/Makefile.objs
@@ -26,6 +26,7 @@ common-obj-$(CONFIG_IMX_FEC) += imx_fec.o
 common-obj-$(CONFIG_CADENCE) += cadence_gem.o
 common-obj-$(CONFIG_STELLARIS_ENET) += stellaris_enet.o
 common-obj-$(CONFIG_LANCE) += lance.o
+common-obj-$(CONFIG_FTGMAC100) += ftgmac100.o
 
 obj-$(CONFIG_ETRAXFS) += etraxfs_eth.o
 obj-$(CONFIG_COLDFIRE) += mcf_fec.o
diff --git a/hw/net/ftgmac100.c b/hw/net/ftgmac100.c
new file mode 100644
index 000..fc309b3
--- /dev/null
+++ b/hw/net/ftgmac100.c
@@ -0,0 +1,1003 @@
+/*
+ * Faraday FTGMAC100 Gigabit Ethernet
+ *
+ * Copyright (C) 2016-2017, IBM Corporation.
+ *
+ * Based on Coldfire Fast Ethernet Controller emulation.
+ *
+ * Copyright (c) 2007 CodeSourcery.
+ *
+ * This code is licensed under the GPL version 2 or later. See the
+ * COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/net/ftgmac100.h"
+#include "sysemu/dma.h"
+#include "qemu/log.h"
+#include "net/checksum.h"
+#include "net/eth.h"
+#include "hw/net/mii.h"
+
+/* For crc32 */
+#include 
+
+/*
+ * FTGMAC100 registers
+ */
+#define FTGMAC100_ISR 0x00
+#define FTGMAC100_IER 0x04
+#define FTGMAC100_MAC_MADR0x08
+#define FTGMAC100_MAC_LADR0x0c
+#define FTGMAC100_MATH0   0x10
+#define FTGMAC100_MATH1   0x14
+#define FTGMAC100_NPTXPD  0x18
+#define FTGMAC100_RXPD0x1C
+#define FTGMAC100_NPTXR_BADR  0x20
+#define FTGMAC100_RXR_BADR0x24
+#define FTGMAC100_HPTXPD  0x28
+#define FTGMAC100_HPTXR_BADR  0x2c
+#define FTGMAC100_ITC 0x30
+#define FTGMAC100_APTC0x34
+#define FTGMAC100_DBLAC   0x38
+#define FTGMAC100_REVR0x40
+#define FTGMAC100_FEAR1   0x44
+#define FTGMAC100_RBSR0x4c
+#define FTGMAC100_TPAFCR  0x48
+
+#define FTGMAC100_MACCR   0x50
+#define FTGMAC100_MACSR   0x54
+#define FTGMAC100_PHYCR   0x60
+#define FTGMAC100_PHYDATA 0x64
+#define FTGMAC100_FCR 0x68
+
+/*
+ * Interrupt status register & interrupt enable register
+ */
+#define FTGMAC100_INT_RPKT_BUF(1 << 0)
+#define FTGMAC100_INT_RPKT_FIFO   (1 << 1)
+#define FTGMAC100_INT_NO_RXBUF(1 << 2)
+#define FTGMAC100_INT_RPKT_LOST   (1 << 3)
+#define FTGMAC100_INT_XPKT_ETH(1 << 4)
+#define FTGMAC100_INT_XPKT_FIFO   (1 << 5)
+#define FTGMAC100_INT_NO_NPTXBUF  (1 << 6)
+#define FTGMAC100_INT_XPKT_LOST   (1 << 7)
+#define FTGMAC100_INT_AHB_ERR (1 << 8)
+#define FTGMAC100_INT_PHYSTS_CHG  (1 << 9)
+#define FTGMAC100_INT_NO_HPTXBUF  (1 << 10)
+
+/*
+ * Automatic polling timer control register
+ */
+#define FTGMAC100_APTC_RXPOLL_CNT(x)((x) & 0xf)
+#define FTGMAC100_APTC_RXPOLL_TIME_SEL  (1 << 4)
+#define FTGMAC100_APTC_TXPOLL_CNT(x)(((x) >> 8) & 0xf)
+#define 

[Qemu-devel] [PULL 4/8] net/ftgmac100: add a 'aspeed' property

2017-04-23 Thread Jason Wang
From: Cédric Le Goater 

The Aspeed SoCs have a different definition of the end of the ring
buffer bit. Add a property to specify which set of bits should be used
by the NIC.
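
The effect of the property can be sketched as a ring-advance step that
takes the end-of-ring mask as a parameter instead of a compile-time
constant. RingDesc and ring_next() are hypothetical names for
illustration; only the Aspeed bit position (1 << 30) comes from this
patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical descriptor: four 32-bit words, 16 bytes in total. */
typedef struct {
    uint32_t des0, des1, des2, des3;
} RingDesc;

/* End-of-ring bit for Aspeed SoCs, as defined in this patch. */
#define EDOTR_ASPEED (1u << 30)

/*
 * Hypothetical helper: return the address of the next descriptor.
 * If the current descriptor has the end-of-ring bit set, wrap to the
 * ring base; otherwise step over one descriptor.
 */
uint32_t ring_next(uint32_t ring_base, uint32_t addr, uint32_t des0,
                   uint32_t end_of_ring_mask)
{
    assert(end_of_ring_mask != 0);

    if (des0 & end_of_ring_mask) {
        return ring_base;
    }
    return addr + (uint32_t)sizeof(RingDesc);
}
```

With the mask chosen once at realize time, the transmit and receive
paths stay identical for both descriptor layouts.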

Signed-off-by: Cédric Le Goater 
Signed-off-by: Jason Wang 
---
 hw/net/ftgmac100.c | 17 +++--
 include/hw/net/ftgmac100.h |  4 
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/hw/net/ftgmac100.c b/hw/net/ftgmac100.c
index fc309b3..4c218bd 100644
--- a/hw/net/ftgmac100.c
+++ b/hw/net/ftgmac100.c
@@ -126,6 +126,7 @@
 #define FTGMAC100_TXDES0_CRC_ERR (1 << 19)
 #define FTGMAC100_TXDES0_LTS (1 << 28)
 #define FTGMAC100_TXDES0_FTS (1 << 29)
+#define FTGMAC100_TXDES0_EDOTR_ASPEED(1 << 30)
 #define FTGMAC100_TXDES0_TXDMA_OWN   (1 << 31)
 
 #define FTGMAC100_TXDES1_VLANTAG_CI(x)   ((x) & 0x)
@@ -154,6 +155,7 @@
 #define FTGMAC100_RXDES0_PAUSE_FRAME (1 << 25)
 #define FTGMAC100_RXDES0_LRS (1 << 28)
 #define FTGMAC100_RXDES0_FRS (1 << 29)
+#define FTGMAC100_RXDES0_EDORR_ASPEED(1 << 30)
 #define FTGMAC100_RXDES0_RXPKT_RDY   (1 << 31)
 
 #define FTGMAC100_RXDES1_VLANTAG_CI  0x
@@ -462,7 +464,7 @@ static void ftgmac100_do_tx(FTGMAC100State *s, uint32_t 
tx_ring,
 /* Write back the modified descriptor.  */
 ftgmac100_write_bd(, addr);
 /* Advance to the next descriptor.  */
-if (bd.des0 & FTGMAC100_TXDES0_EDOTR) {
+if (bd.des0 & s->txdes0_edotr) {
 addr = tx_ring;
 } else {
 addr += sizeof(FTGMAC100Desc);
@@ -880,7 +882,7 @@ static ssize_t ftgmac100_receive(NetClientState *nc, const 
uint8_t *buf,
 s->isr |= FTGMAC100_INT_RPKT_FIFO;
 }
 ftgmac100_write_bd(, addr);
-if (bd.des0 & FTGMAC100_RXDES0_EDORR) {
+if (bd.des0 & s->rxdes0_edorr) {
 addr = s->rx_ring;
 } else {
 addr += sizeof(FTGMAC100Desc);
@@ -921,6 +923,14 @@ static void ftgmac100_realize(DeviceState *dev, Error 
**errp)
 FTGMAC100State *s = FTGMAC100(dev);
 SysBusDevice *sbd = SYS_BUS_DEVICE(dev);
 
+if (s->aspeed) {
+s->txdes0_edotr = FTGMAC100_TXDES0_EDOTR_ASPEED;
+s->rxdes0_edorr = FTGMAC100_RXDES0_EDORR_ASPEED;
+} else {
+s->txdes0_edotr = FTGMAC100_TXDES0_EDOTR;
+s->rxdes0_edorr = FTGMAC100_RXDES0_EDORR;
+}
+
 memory_region_init_io(>iomem, OBJECT(dev), _ops, s,
   TYPE_FTGMAC100, 0x2000);
 sysbus_init_mmio(sbd, >iomem);
@@ -967,11 +977,14 @@ static const VMStateDescription vmstate_ftgmac100 = {
 VMSTATE_UINT32(phy_advertise, FTGMAC100State),
 VMSTATE_UINT32(phy_int, FTGMAC100State),
 VMSTATE_UINT32(phy_int_mask, FTGMAC100State),
+VMSTATE_UINT32(txdes0_edotr, FTGMAC100State),
+VMSTATE_UINT32(rxdes0_edorr, FTGMAC100State),
 VMSTATE_END_OF_LIST()
 }
 };
 
 static Property ftgmac100_properties[] = {
+DEFINE_PROP_BOOL("aspeed", FTGMAC100State, aspeed, false),
 DEFINE_NIC_PROPERTIES(FTGMAC100State, conf),
 DEFINE_PROP_END_OF_LIST(),
 };
diff --git a/include/hw/net/ftgmac100.h b/include/hw/net/ftgmac100.h
index 962a718..d9bc589 100644
--- a/include/hw/net/ftgmac100.h
+++ b/include/hw/net/ftgmac100.h
@@ -55,6 +55,10 @@ typedef struct FTGMAC100State {
 uint32_t phy_advertise;
 uint32_t phy_int;
 uint32_t phy_int_mask;
+
+bool aspeed;
+uint32_t txdes0_edotr;
+uint32_t rxdes0_edorr;
 } FTGMAC100State;
 
 #endif
-- 
2.7.4




[Qemu-devel] [PULL 7/8] COLO-compare: Optimize tcp compare for option field

2017-04-23 Thread Jason Wang
From: Zhang Chen 

This patch adds support for packets that carry a TCP options field.
Add a check for the options field: if a packet has one, skip it and
compare only the TCP payload. This avoids unnecessary checkpoints
and improves performance.
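
The offset arithmetic can be sketched on its own: th_off counts 32-bit
words, so the payload starts th_off * 4 bytes after the start of the TCP
header. The helper names below are hypothetical, and the 34-byte
transport offset in the usage note assumes a plain Ethernet frame with a
20-byte IP header.

```c
#include <assert.h>
#include <stddef.h>

/* A TCP header without options is 5 words (20 bytes) long. */
#define TCP_MIN_DATA_OFFSET 5
#define TCP_MAX_DATA_OFFSET 15

/* Hypothetical helper: non-zero if the header carries options. */
int tcp_has_options(unsigned th_off)
{
    return th_off > TCP_MIN_DATA_OFFSET;
}

/*
 * Hypothetical helper: offset of the TCP payload from the start of the
 * packet, given the offset of the TCP header and its data-offset field.
 */
size_t tcp_payload_offset(size_t transport_offset, unsigned th_off)
{
    assert(th_off >= TCP_MIN_DATA_OFFSET && th_off <= TCP_MAX_DATA_OFFSET);

    return transport_offset + (size_t)th_off * 4;
}
```

For example, with the TCP header at byte 34, a header without options
puts the payload at byte 54, and one with 12 bytes of options
(th_off = 8) puts it at byte 66.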

Signed-off-by: Zhang Chen 
Signed-off-by: Jason Wang 
---
 net/colo-compare.c | 27 ++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 9b09cfc..4ab80b1 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -233,7 +233,32 @@ static int colo_packet_compare_tcp(Packet *spkt, Packet 
*ppkt)
 spkt->ip->ip_sum = ppkt->ip->ip_sum;
 }
 
-if (ptcp->th_sum == stcp->th_sum) {
+/*
+ * Check tcp header length for tcp option field.
+ * th_off > 5 means this tcp packet have options field.
+ * The tcp options maybe always different.
+ * for example:
+ * From RFC 7323.
+ * TCP Timestamps option (TSopt):
+ * Kind: 8
+ *
+ * Length: 10 bytes
+ *
+ *+---+---+-+-+
+ *|Kind=8 |  10   |   TS Value (TSval)  |TS Echo Reply (TSecr)|
+ *+---+---+-+-+
+ *   1   1  4 4
+ *
+ * In this case the primary guest's timestamp always different with
+ * the secondary guest's timestamp. COLO just focus on payload,
+ * so we just need skip this field.
+ */
+if (ptcp->th_off > 5) {
+ptrdiff_t tcp_offset;
+tcp_offset = ppkt->transport_header - (uint8_t *)ppkt->data
+ + (ptcp->th_off * 4);
+res = colo_packet_compare_common(ppkt, spkt, tcp_offset);
+} else if (ptcp->th_sum == stcp->th_sum) {
 res = colo_packet_compare_common(ppkt, spkt, ETH_HLEN);
 } else {
 res = -1;
-- 
2.7.4




[Qemu-devel] [PULL 5/8] aspeed: add a FTGMAC100 nic

2017-04-23 Thread Jason Wang
From: Cédric Le Goater 

There is a second NIC but we do not use it for the moment. We use the
'aspeed' property to tune the definition of the end of ring buffer bit
for the Aspeed SoCs.

Signed-off-by: Cédric Le Goater 
Signed-off-by: Jason Wang 
---
 hw/arm/aspeed_soc.c | 21 +
 include/hw/arm/aspeed_soc.h |  2 ++
 2 files changed, 23 insertions(+)

diff --git a/hw/arm/aspeed_soc.c b/hw/arm/aspeed_soc.c
index 571e4f0..4937e2b 100644
--- a/hw/arm/aspeed_soc.c
+++ b/hw/arm/aspeed_soc.c
@@ -19,6 +19,7 @@
 #include "hw/char/serial.h"
 #include "qemu/log.h"
 #include "hw/i2c/aspeed_i2c.h"
+#include "net/net.h"
 
 #define ASPEED_SOC_UART_5_BASE  0x00184000
 #define ASPEED_SOC_IOMEM_SIZE   0x0020
@@ -33,6 +34,8 @@
 #define ASPEED_SOC_TIMER_BASE   0x1E782000
 #define ASPEED_SOC_WDT_BASE 0x1E785000
 #define ASPEED_SOC_I2C_BASE 0x1E78A000
+#define ASPEED_SOC_ETH1_BASE0x1E66
+#define ASPEED_SOC_ETH2_BASE0x1E68
 
 static const int uart_irqs[] = { 9, 32, 33, 34, 10 };
 static const int timer_irqs[] = { 16, 17, 18, 35, 36, 37, 38, 39, };
@@ -175,6 +178,10 @@ static void aspeed_soc_init(Object *obj)
 object_initialize(>wdt, sizeof(s->wdt), TYPE_ASPEED_WDT);
 object_property_add_child(obj, "wdt", OBJECT(>wdt), NULL);
 qdev_set_parent_bus(DEVICE(>wdt), sysbus_get_default());
+
+object_initialize(>ftgmac100, sizeof(s->ftgmac100), TYPE_FTGMAC100);
+object_property_add_child(obj, "ftgmac100", OBJECT(>ftgmac100), NULL);
+qdev_set_parent_bus(DEVICE(>ftgmac100), sysbus_get_default());
 }
 
 static void aspeed_soc_realize(DeviceState *dev, Error **errp)
@@ -299,6 +306,20 @@ static void aspeed_soc_realize(DeviceState *dev, Error 
**errp)
 return;
 }
 sysbus_mmio_map(SYS_BUS_DEVICE(>wdt), 0, ASPEED_SOC_WDT_BASE);
+
+/* Net */
+qdev_set_nic_properties(DEVICE(>ftgmac100), _table[0]);
+object_property_set_bool(OBJECT(>ftgmac100), true, "aspeed", );
+object_property_set_bool(OBJECT(>ftgmac100), true, "realized",
+ _err);
+error_propagate(, local_err);
+if (err) {
+error_propagate(errp, err);
+return;
+}
+sysbus_mmio_map(SYS_BUS_DEVICE(>ftgmac100), 0, ASPEED_SOC_ETH1_BASE);
+sysbus_connect_irq(SYS_BUS_DEVICE(>ftgmac100), 0,
+   qdev_get_gpio_in(DEVICE(>vic), 2));
 }
 
 static void aspeed_soc_class_init(ObjectClass *oc, void *data)
diff --git a/include/hw/arm/aspeed_soc.h b/include/hw/arm/aspeed_soc.h
index dbec0c1..4c5fc66 100644
--- a/include/hw/arm/aspeed_soc.h
+++ b/include/hw/arm/aspeed_soc.h
@@ -20,6 +20,7 @@
 #include "hw/i2c/aspeed_i2c.h"
 #include "hw/ssi/aspeed_smc.h"
 #include "hw/watchdog/wdt_aspeed.h"
+#include "hw/net/ftgmac100.h"
 
 #define ASPEED_SPIS_NUM  2
 
@@ -39,6 +40,7 @@ typedef struct AspeedSoCState {
 AspeedSMCState spi[ASPEED_SPIS_NUM];
 AspeedSDMCState sdmc;
 AspeedWDTState wdt;
+FTGMAC100State ftgmac100;
 } AspeedSoCState;
 
 #define TYPE_ASPEED_SOC "aspeed-soc"
-- 
2.7.4




[Qemu-devel] [PULL 1/8] colo-compare: Fix old packet check bug.

2017-04-23 Thread Jason Wang
From: Zhang Chen 

If colo-compare finds one old packet, we can notify the COLO frame to
do a checkpoint; there is no need to keep looking for more old packets.
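
The early-exit behaviour relies on the compare callback returning 0 for
the element that should stop the search. The sketch below mimics that
contract in plain C over an array, without GLib, so it is only a
stand-in for g_queue_find_custom(); Conn, find_first() and
check_one_conn() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical connection record. */
typedef struct {
    int has_old_packet;
} Conn;

typedef int (*ConnCompareFunc)(const Conn *conn, const void *user_data);

/*
 * Stand-in for a find-custom style search: visit elements in order and
 * stop at the first one for which cmp() returns 0. The number of
 * elements visited is stored in *visited.
 */
const Conn *find_first(const Conn *conns, size_t n, const void *user_data,
                       ConnCompareFunc cmp, size_t *visited)
{
    assert(cmp != NULL && visited != NULL);

    *visited = 0;
    for (size_t i = 0; i < n; i++) {
        (*visited)++;
        if (cmp(&conns[i], user_data) == 0) {
            return &conns[i];
        }
    }
    return NULL;
}

/* Returns 0 when an old packet is found (stop), 1 otherwise (continue). */
int check_one_conn(const Conn *conn, const void *user_data)
{
    (void)user_data;
    return conn->has_old_packet ? 0 : 1;
}
```

A foreach-style walk would visit every connection even after the first
hit; the find-style walk stops as soon as one old packet is seen.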

Signed-off-by: Zhang Chen 
Signed-off-by: Jason Wang 
---
 net/colo-compare.c | 15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 54e6d40..9b09cfc 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -372,10 +372,9 @@ static int colo_old_packet_check_one(Packet *pkt, int64_t 
*check_time)
 }
 }
 
-static void colo_old_packet_check_one_conn(void *opaque,
-   void *user_data)
+static int colo_old_packet_check_one_conn(Connection *conn,
+  void *user_data)
 {
-Connection *conn = opaque;
 GList *result = NULL;
 int64_t check_time = REGULAR_PACKET_CHECK_MS;
 
@@ -386,7 +385,10 @@ static void colo_old_packet_check_one_conn(void *opaque,
 if (result) {
 /* do checkpoint will flush old packet */
 /* TODO: colo_notify_checkpoint();*/
+return 0;
 }
+
+return 1;
 }
 
 /*
@@ -398,7 +400,12 @@ static void colo_old_packet_check(void *opaque)
 {
 CompareState *s = opaque;
 
-g_queue_foreach(>conn_list, colo_old_packet_check_one_conn, NULL);
+/*
+ * If we find one old packet, stop finding job and notify
+ * COLO frame do checkpoint.
+ */
+g_queue_find_custom(>conn_list, NULL,
+(GCompareFunc)colo_old_packet_check_one_conn);
 }
 
 /*
-- 
2.7.4




[Qemu-devel] [PULL 0/8] Net patches

2017-04-23 Thread Jason Wang
The following changes since commit 32c7e0ab755745e961f1772e95cac381cc68769d:

  Merge remote-tracking branch 'remotes/juanquintela/tags/migration/20170421' 
into staging (2017-04-21 15:59:27 +0100)

are available in the git repository at:

  https://github.com/jasowang/qemu.git tags/net-pull-request

for you to fetch changes up to 049f6d8237dd0b14dee02e4c22b20114c43cecff:

  COLO-compare: Optimize tcp compare trace event (2017-04-24 11:30:36 +0800)




Cédric Le Goater (5):
  hw/net: add MII definitions
  net: add FTGMAC100 support
  net/ftgmac100: add a 'aspeed' property
  aspeed: add a FTGMAC100 nic
  slirp: add a fake NC-SI backend

Zhang Chen (3):
  colo-compare: Fix old packet check bug.
  COLO-compare: Optimize tcp compare for option field
  COLO-compare: Optimize tcp compare trace event

 default-configs/arm-softmmu.mak |1 +
 hw/arm/aspeed_soc.c |   21 +
 hw/net/Makefile.objs|1 +
 hw/net/ftgmac100.c  | 1016 +++
 include/hw/arm/aspeed_soc.h |2 +
 include/hw/net/ftgmac100.h  |   64 +++
 include/hw/net/mii.h|   71 ++-
 include/net/eth.h   |1 +
 net/colo-compare.c  |   69 ++-
 net/trace-events|3 +-
 slirp/Makefile.objs |2 +-
 slirp/ncsi-pkt.h|  419 
 slirp/ncsi.c|  130 +
 slirp/slirp.c   |4 +
 slirp/slirp.h   |3 +
 15 files changed, 1770 insertions(+), 37 deletions(-)
 create mode 100644 hw/net/ftgmac100.c
 create mode 100644 include/hw/net/ftgmac100.h
 create mode 100644 slirp/ncsi-pkt.h
 create mode 100644 slirp/ncsi.c



Re: [Qemu-devel] [Qemu-ppc] [RESEND PATCH] MAINTAINERS: Remove myself from e500

2017-04-23 Thread David Gibson
On Thu, Apr 20, 2017 at 09:05:28PM -0500, Scott Wood wrote:
> I recently left Freescale/NXP, and even before that it'd been a few years
> since I was actively involved in KVM/QEMU work.
> 
> Signed-off-by: Scott Wood 
> ---
> Sorry for the resend -- fixed mailing list address.

Applied to my ppc-for-2.10 branch.

> 
>  MAINTAINERS | 3 ---
>  1 file changed, 3 deletions(-)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index c60235eaf6..dcb9be1c6c 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -645,7 +645,6 @@ F: hw/ppc/ppc440_bamboo.c
>  
>  e500
>  M: Alexander Graf 
> -M: Scott Wood 
>  L: qemu-...@nongnu.org
>  S: Supported
>  F: hw/ppc/e500.[hc]
> @@ -656,7 +655,6 @@ F: pc-bios/u-boot.e500
>  
>  mpc8544ds
>  M: Alexander Graf 
> -M: Scott Wood 
>  L: qemu-...@nongnu.org
>  S: Supported
>  F: hw/ppc/mpc8544ds.c
> @@ -933,7 +931,6 @@ F: include/hw/ppc/ppc4xx.h
>  
>  ppce500
>  M: Alexander Graf 
> -M: Scott Wood 
>  L: qemu-...@nongnu.org
>  S: Supported
>  F: hw/ppc/e500*

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [Qemu-devel] [Qemu-ppc] [PULL 00/47] ppc-for-2.10 queue 20170424

2017-04-23 Thread David Gibson
On Sun, Apr 23, 2017 at 08:08:23PM -0700, no-re...@patchew.org wrote:
> Hi,
> 
> This series seems to have some coding style problems. See output below for
> more information:
> 
> Message-id: 20170424015927.8933-1-da...@gibson.dropbear.id.au
> Subject: [Qemu-devel] [PULL 00/47] ppc-for-2.10 queue 20170424
> Type: series
> 
> === TEST SCRIPT BEGIN ===
> #!/bin/bash
> 
> BASE=base
> n=1
> total=$(git log --oneline $BASE.. | wc -l)
> failed=0
> 
> # Useful git options
> git config --local diff.renamelimit 0
> git config --local diff.renames True
> 
> commits="$(git log --format=%H --reverse $BASE..)"
> for c in $commits; do
> echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
> if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; 
> then
> failed=1
> echo
> fi
> n=$((n+1))
> done
> 
> exit $failed
> === TEST SCRIPT END ===
> 
> Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
> Switched to a new branch 'test'
> 94bea6a target/ppc: Style fixes
> d23eac0 e500, book3s: mfspr 259: Register mapped/aliased SPRG3 user read
> 3a5bd96 target/ppc: Flush TLB on write to PIDR
> 3240e33 spapr-cpu-core: Release ICPState object during CPU unrealization
> ea6341d ppc/pnv: generate an OEM SEL event on shutdown
> 130cf1f ppc/pnv: add initial IPMI sensors for the BMC simulator
> 7e24b32 ppc/pnv: populate device tree for IPMI BT devices
> 6575717 ppc/pnv: populate device tree for serial devices
> 836d562 ppc/pnv: populate device tree for RTC devices
> dca8231 ppc/pnv: scan ISA bus to populate device tree
> 4d631b7 ppc/pnv: enable only one LPC bus
> af22372 ppc/pnv: Add support for POWER8+ LPC Controller
> 2d79b40 spapr: remove the 'nr_servers' field from the machine
> 8fa3c09 target/ppc: Fix size of struct PPCElfPrstatus
> 212f4d7 ipmi: introduce an ipmi_bmc_gen_event() API
> 240da02 ipmi: introduce an ipmi_bmc_sdr_find() API
> 3607ef7 ipmi: provide support for FRUs
> aa873a2 ipmi: use a file to load SDRs
> ef1ce62 ppc: add IPMI support
> 1a8fffd ppc/pnv: Add OCC model stub with interrupt support
> 828bcfa ppc/pnv: Add cut down PSI bridge model and hookup external interrupt
> ac8392a ppc/pnv: add memory regions for the ICP registers
> d90ca95 ppc/pnv: add a helper to calculate MMIO addresses registers
> a5a614b ppc/pnv: create the ICP object under PnvCore
> 5f43b5e ppc/pnv: extend the machine with a InterruptStatsProvider interface
> 04dfecf ppc/pnv: extend the machine with a XICSFabric interface
> 5b94a0f ppc/pnv: add a PnvICPState object
> 96c645e ppc/xics: add a realize() handler to ICPStateClass
> 4569615 spapr: allocate the ICPState object from under sPAPRCPUCore
> 8257d1e spapr: move the IRQ server number mapping under the machine
> 56d6f91 ppc/xics: introduce an 'intc' backlink under PowerPCCPU
> f024a9a target/ppc: Add ibm, processor-radix-AP-encodings for TCG
> a0d8df3 spapr_pci: Removed unused include
> 7cc7952 spapr_pci: Warn when RAM page size is not enabled in IOMMU page mask
> 2093121 target-ppc/kvm: Enable in-kernel TCE acceleration for multi-tce
> 26538ee spapr: Workaround for broken radix guests
> d2da420 spapr: Enable ISA 3.0 MMU mode selection via CAS
> 50fe08d spapr: move spapr_populate_pa_features()
> d667f21 target/ppc: Implement H_REGISTER_PROCESS_TABLE H_CALL
> 129c199 target/ppc: Add new H-CALL shells for in memory table translation
> d033285 target-ppc: support KVM_CAP_PPC_MMU_RADIX, KVM_CAP_PPC_MMU_HASH_V3
> 38c395b spapr: Add ibm, processor-radix-AP-encodings to the device tree
> 4fe232c target-ppc: kvm: make use of KVM_CREATE_SPAPR_TCE_64
> 90f87ad hw/ppc/pnv: Classify the "PowerNV Chip" devices as CPU devices
> c98837a ppc/spapr: QOM'ify sPAPRRTCState
> 1bc37a7 pseries: Add pseries-2.10 machine type
> 40a726e target/ppc: Improve accuracy of guest HTM availability on P8s
> 
> === OUTPUT BEGIN ===
> Checking PATCH 1/47: target/ppc: Improve accuracy of guest HTM availability 
> on P8s...
> Checking PATCH 2/47: pseries: Add pseries-2.10 machine type...
> Checking PATCH 3/47: ppc/spapr: QOM'ify sPAPRRTCState...
> Checking PATCH 4/47: hw/ppc/pnv: Classify the "PowerNV Chip" devices as CPU 
> devices...
> Checking PATCH 5/47: target-ppc: kvm: make use of KVM_CREATE_SPAPR_TCE_64...
> Checking PATCH 6/47: spapr: Add ibm, processor-radix-AP-encodings to the 
> device tree...
> Checking PATCH 7/47: target-ppc: support KVM_CAP_PPC_MMU_RADIX, 
> KVM_CAP_PPC_MMU_HASH_V3...
> Checking PATCH 8/47: target/ppc: Add new H-CALL shells for in memory table 
> translation...
> Checking PATCH 9/47: target/ppc: Implement H_REGISTER_PROCESS_TABLE H_CALL...
> Checking PATCH 10/47: spapr: move spapr_populate_pa_features()...
> Checking PATCH 11/47: spapr: Enable ISA 3.0 MMU mode selection via CAS...
> Checking PATCH 12/47: spapr: Workaround for broken radix guests...
> Checking PATCH 13/47: target-ppc/kvm: Enable in-kernel TCE acceleration for 
> multi-tce...
> Checking PATCH 14/47: spapr_pci: Warn when RAM page size is not enabled in 
> IOMMU page mask...
> 

Re: [Qemu-devel] [PATCH 2/3] colo-compare: Check main_loop value before call g_main_loop_quit

2017-04-23 Thread Jason Wang



On 2017-04-20 15:46, zhanghailiang wrote:

If an error happens before main_loop is initialized in the colo
compare thread, qemu goes into the finalizing process, where
we call g_main_loop_quit(s->main_loop). If main_loop is NULL, an
error is reported:
"(process:14861): GLib-CRITICAL **: g_main_loop_quit: assertion 'loop != NULL' failed".

We need to check whether main_loop is NULL before calling g_main_loop_quit().


Do we need to check and fail early in colo_compare_thread() too?

Thanks



Signed-off-by: zhanghailiang 
---
  net/colo-compare.c | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index a6bf419..d6a5e4c 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -770,7 +770,9 @@ static void colo_compare_finalize(Object *obj)
   s->worker_context, true);
  qemu_chr_fe_deinit(&s->chr_out);
  
-g_main_loop_quit(s->compare_loop);

+if (s->compare_loop) {
+g_main_loop_quit(s->compare_loop);
+}
  qemu_thread_join(&s->thread);
  
  /* Release all unhandled packets after compare thead exited */
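
For readers following the thread, the guard in this patch can be sketched on its own as below. This is only an illustration under stated assumptions: FakeLoop, FakeCompareState, fake_loop_quit() and finalize_compare() are hypothetical stand-ins so the snippet compiles without GLib; they are not QEMU or GLib API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for GMainLoop; not the GLib type. */
typedef struct FakeLoop {
    bool running;
} FakeLoop;

/* Hypothetical stand-in for the part of CompareState that matters here. */
typedef struct FakeCompareState {
    FakeLoop *compare_loop;   /* stays NULL if thread setup failed early */
} FakeCompareState;

/* Like g_main_loop_quit(), this stand-in must not be handed NULL. */
static void fake_loop_quit(FakeLoop *loop)
{
    assert(loop != NULL);
    loop->running = false;
}

/* Returns true if a loop existed and was asked to quit. */
static bool finalize_compare(FakeCompareState *s)
{
    if (s->compare_loop) {
        fake_loop_quit(s->compare_loop);
        return true;
    }
    return false;
}
```

With a NULL compare_loop the finalize path simply skips the quit call, which is the behaviour the patch is after.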





Re: [Qemu-devel] [PATCH 1/3] colo-compare: serialize compare thread's initialization with main thread

2017-04-23 Thread Jason Wang



On 2017-04-20 15:46, zhanghailiang wrote:

We call qemu_chr_fe_set_handlers() in the colo-compare thread; it is used
to detach the watched fd from the default main context, so the thread may
handle the same watched fd concurrently with the main thread, which
triggers an error report:
"qemu-char.c:918: io_watch_poll_finalize: Assertion `iwp->src == ((void *)0)' failed."


Is there any way to prevent the fd from being handled by the main thread
before creating the colo thread? Using a semaphore does not seem elegant.


Thanks



Fix it by serializing compare thread's initialization with main thread.

Signed-off-by: zhanghailiang 
---
  net/colo-compare.c | 6 ++
  1 file changed, 6 insertions(+)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 54e6d40..a6bf419 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -83,6 +83,7 @@ typedef struct CompareState {
  GHashTable *connection_track_table;
  /* compare thread, a thread for each NIC */
  QemuThread thread;
+QemuSemaphore thread_ready;
  
  GMainContext *worker_context;

  GMainLoop *compare_loop;
@@ -557,6 +558,8 @@ static void *colo_compare_thread(void *opaque)
(GSourceFunc)check_old_packet_regular, s, NULL);
  g_source_attach(timeout_source, s->worker_context);
  
+qemu_sem_post(&s->thread_ready);

+
  g_main_loop_run(s->compare_loop);
  
  g_source_unref(timeout_source);

@@ -707,12 +710,15 @@ static void colo_compare_complete(UserCreatable *uc, Error **errp)
connection_key_equal,
g_free,
connection_destroy);
+qemu_sem_init(&s->thread_ready, 0);
  
  sprintf(thread_name, "colo-compare %d", compare_id);

  qemu_thread_create(&s->thread, thread_name,
 colo_compare_thread, s,
 QEMU_THREAD_JOINABLE);
  compare_id++;
+qemu_sem_wait(&s->thread_ready);
+qemu_sem_destroy(&s->thread_ready);
  
  return;

  }
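
The handshake this patch adds (creator waits on a semaphore that the new thread posts once its setup is done) can be sketched with plain POSIX threads as below. This is a rough illustration only: Worker, worker_main() and worker_start() are hypothetical names, and pthread/sem_t stand in for QEMU's QemuThread/QemuSemaphore wrappers.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical worker state; it only mirrors the thread_ready idea. */
typedef struct Worker {
    pthread_t thread;
    sem_t thread_ready;
    bool initialized;     /* written by the worker before it posts */
} Worker;

static void *worker_main(void *opaque)
{
    Worker *w = opaque;

    /* ... per-thread setup that must not race with the creator ... */
    w->initialized = true;

    sem_post(&w->thread_ready);   /* tell the creator that setup is done */

    /* ... the real thread would now enter its main loop ... */
    return NULL;
}

/* Returns 0 on success; on return the worker has finished its setup. */
static int worker_start(Worker *w)
{
    w->initialized = false;
    if (sem_init(&w->thread_ready, 0, 0) != 0) {
        return -1;
    }
    if (pthread_create(&w->thread, NULL, worker_main, w) != 0) {
        sem_destroy(&w->thread_ready);
        return -1;
    }
    /* Block until the worker posts; retry if interrupted by a signal. */
    while (sem_wait(&w->thread_ready) != 0 && errno == EINTR) {
        /* interrupted, wait again */
    }
    sem_destroy(&w->thread_ready);
    assert(w->initialized);       /* guaranteed by the handshake */
    return 0;
}
```

The caller still has to pthread_join() the worker later, just as the finalize path joins the compare thread. Whether a semaphore is the nicest tool for this is exactly the question raised in the reply above; the sketch does not settle it.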





Re: [Qemu-devel] [PATCH V2 0/6] Add COLO-proxy virtio-net support

2017-04-23 Thread Jason Wang



On 2017-04-20 14:39, Zhang Chen wrote:

If the user uses -device virtio-net-pci, the virtio-net driver adds a header
to the raw net packet that colo-proxy can't handle. COLO-proxy only
cares about the packet payload, so we skip the virtio-net header when
comparing the packets sent by the primary guest with those sent by the
secondary guest.

Zhang Chen (6):
   net/filter-mirror.c: Add filter-mirror and filter-redirector vnet
 support.
   net/net.c: Add vnet header length to SocketReadState
   net/colo-compare.c: Make colo-compare support vnet_hdr_len
   net/socket.c: Add vnet packet support in net_socket_receive()
   net/colo.c: Add vnet packet parse feature in colo-proxy
   net/colo-compare.c: Add vnet packet's tcp/udp/icmp compare

  include/net/net.h |  4 +++-
  net/colo-compare.c| 48 +++-
  net/colo.c|  9 +
  net/colo.h|  4 +++-
  net/filter-mirror.c   | 25 -
  net/filter-rewriter.c |  2 +-
  net/net.c | 24 ++--
  net/socket.c  |  6 ++
  8 files changed, 99 insertions(+), 23 deletions(-)



A quick glance at the series found two issues:

- We can't assume virtio-net is the only user of the vnet header; you need
to query e.g. NetClientState for the correct vnet header length.

- This series breaks qtest:

**
ERROR:tests/e1000e-test.c:296:e1000e_send_verify: assertion failed 
(buffer == "TEST"): ("" == "TEST")

GTester: last random seed: R02S39dd06f7f52013798111df2e4eb602c5
**
ERROR:tests/e1000e-test.c:365:e1000e_receive_verify: assertion failed 
(le32_to_cpu(descr.wb.upper.status_error) & esta_dd == esta_dd): 
(0x == 0x0001)

GTester: last random seed: R02S8c8200b8ec86358cb7addb5c6fe1303c
**
ERROR:tests/e1000e-test.c:296:e1000e_send_verify: assertion failed 
(buffer == "TEST"): ("" == "TEST")

GTester: last random seed: R02S9be86025aa7ded4902bdf644c3964a6e
**
ERROR:tests/libqos/virtio.c:94:qvirtio_wait_queue_isr: assertion failed: 
(g_get_monotonic_time() - start_time <= timeout_us)

GTester: last random seed: R02S30cac33d7a98fa56806ca59b35910ea5
**
ERROR:tests/libqos/virtio.c:94:qvirtio_wait_queue_isr: assertion failed: 
(g_get_monotonic_time() - start_time <= timeout_us)

GTester: last random seed: R02S258359836760a723622abf56cf2e61e7
^C/home/devel/git/qemu/tests/Makefile.include:815: recipe for target 
'check-qtest-x86_64' failed

make: *** [check-qtest-x86_64] Interrupt

Please fix them.

Thanks



Re: [Qemu-devel] [PATCH V2 0/2] COLO-compare: Optimize tcp compare performance and trace format.

2017-04-23 Thread Jason Wang



On 2017-04-18 10:20, Zhang Chen wrote:

The first patch adds tcp options support to optimize compare performance;
the second simplifies the code and adjusts the trace print format.

Zhang Chen (2):
   COLO-compare: Optimize tcp compare for option field
   COLO-compare: Optimize tcp compare trace event

  net/colo-compare.c | 54 ++
  net/trace-events   |  3 +--
  2 files changed, 43 insertions(+), 14 deletions(-)



Applied, thanks.



Re: [Qemu-devel] [PATCH V2 1/2] COLO-compare: Optimize tcp compare for option field

2017-04-23 Thread Jason Wang



On 2017-04-21 16:22, Zhang Chen wrote:



On 04/21/2017 12:10 PM, Jason Wang wrote:



On 2017-04-21 11:48, Zhang Chen wrote:



On 04/20/2017 02:43 PM, Jason Wang wrote:



On 2017-04-18 10:20, Zhang Chen wrote:

This patch adds support for packets that have a tcp options field.
We add a check for the tcp options field: if the packet has an options
field, we skip it and compare only the tcp payload. This avoids
unnecessary checkpoints and improves performance.

Signed-off-by: Zhang Chen 
---
  net/colo-compare.c | 27 ++-
  1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index aada04e..049f6f8 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -248,7 +248,32 @@ static int colo_packet_compare_tcp(Packet *spkt, Packet *ppkt)

  spkt->ip->ip_sum = ppkt->ip->ip_sum;
  }
  -if (ptcp->th_sum == stcp->th_sum) {
+/*
+ * Check tcp header length for tcp option field.
+ * th_off > 5 means this tcp packet have options field.
+ * The tcp options maybe always different.
+ * for example:
+ * From RFC 7323.
+ * TCP Timestamps option (TSopt):
+ * Kind: 8
+ *
+ * Length: 10 bytes
+ *
+ * +-------+-------+---------------------+---------------------+
+ * |Kind=8 |  10   |   TS Value (TSval)  |TS Echo Reply (TSecr)|
+ * +-------+-------+---------------------+---------------------+
+ *     1       1              4                     4
+ *
+ * In this case the primary guest's timestamp always different with
+ * the secondary guest's timestamp. COLO just focus on payload,
+ * so we just need skip this field.h


Probably a good explanation why we can skip this kind of header. 
But it does not explain why we can skip all the rest?


I found that tcp options have many kind numbers with different meanings.
Here I just give one example of options that differ. This field is not
what COLO-proxy focuses on; COLO only cares about the payload.
Maybe we will optimize this in the future. Currently we want to get
full-function COLO running in qemu upstream.


I see, but the questions are:

- If we see packets with different options (except for the case you 
explain above), does it mean primary and secondary run out of sync?
- What will happen if we compare and ask for synchronization if we 
find options are not exactly the same?





The tcp options begin with the first SYN packet. If a client connects to the
guest, the primary guest and secondary guest just respond to the client's
options, so I think the differing options mostly come from the guest's running
state (like the timestamp), and if the kind numbers are not the same, the
payload is likely not the same either. Finally, after the next checkpoint,
everything will be the same. If the guest connects to a client and the primary
guest sends a packet but the secondary does not, it will trigger the old packet
scan to do a checkpoint.

Thanks
Zhang Chen


Ok this sounds more or less a question of whether or not we want to be 
more relaxed.


Thanks





Thanks



Thanks
Zhang Chen



Thanks


+ */
+if (ptcp->th_off > 5) {
+ptrdiff_t tcp_offset;
+tcp_offset = ppkt->transport_header - (uint8_t *)ppkt->data
+ + (ptcp->th_off * 4);
+res = colo_packet_compare_common(ppkt, spkt, tcp_offset);
+} else if (ptcp->th_sum == stcp->th_sum) {
  res = colo_packet_compare_common(ppkt, spkt, ETH_HLEN);
  } else {
  res = -1;
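
To make the offset arithmetic in the hunk above concrete: th_off counts 32-bit words, so the payload starts th_off * 4 bytes into the TCP header, and a value above 5 means options are present. A minimal standalone sketch follows; tcp_header_len() and tcp_payload_compare() are hypothetical helpers, and unlike the patch (which derives one offset from the primary packet) this version takes each packet's own data offset. That is an assumption of the sketch, not what the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Byte length of a TCP header given its data-offset field (th_off),
 * which counts 32-bit words. 5 words (20 bytes) means "no options".
 */
static size_t tcp_header_len(uint8_t th_off)
{
    return (size_t)th_off * 4;
}

/*
 * Compare two TCP segments while ignoring the header, options included.
 * Each buffer starts at the TCP header. Returns 0 when the payloads
 * match, -1 otherwise (including malformed headers).
 */
static int tcp_payload_compare(const uint8_t *p, size_t p_len, uint8_t p_off,
                               const uint8_t *s, size_t s_len, uint8_t s_off)
{
    size_t p_hdr = tcp_header_len(p_off);
    size_t s_hdr = tcp_header_len(s_off);

    assert(p != NULL && s != NULL);

    /* The data offset is a 4-bit field: valid values are 5..15. */
    if (p_off < 5 || s_off < 5 || p_off > 15 || s_off > 15) {
        return -1;
    }
    /* A header longer than the buffer is malformed. */
    if (p_hdr > p_len || s_hdr > s_len) {
        return -1;
    }
    /* Payload lengths must agree before the bytes are compared. */
    if (p_len - p_hdr != s_len - s_hdr) {
        return -1;
    }
    return memcmp(p + p_hdr, s + s_hdr, p_len - p_hdr) == 0 ? 0 : -1;
}
```

Two segments whose timestamp options differ but whose payloads are identical compare equal here, which is the relaxed behaviour being debated in this thread.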












Re: [Qemu-devel] [PULL 00/47] ppc-for-2.10 queue 20170424

2017-04-23 Thread no-reply
Hi,

This series seems to have some coding style problems. See output below for
more information:

Message-id: 20170424015927.8933-1-da...@gibson.dropbear.id.au
Subject: [Qemu-devel] [PULL 00/47] ppc-for-2.10 queue 20170424
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

# Useful git options
git config --local diff.renamelimit 0
git config --local diff.renames True

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
failed=1
echo
fi
n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
94bea6a target/ppc: Style fixes
d23eac0 e500, book3s: mfspr 259: Register mapped/aliased SPRG3 user read
3a5bd96 target/ppc: Flush TLB on write to PIDR
3240e33 spapr-cpu-core: Release ICPState object during CPU unrealization
ea6341d ppc/pnv: generate an OEM SEL event on shutdown
130cf1f ppc/pnv: add initial IPMI sensors for the BMC simulator
7e24b32 ppc/pnv: populate device tree for IPMI BT devices
6575717 ppc/pnv: populate device tree for serial devices
836d562 ppc/pnv: populate device tree for RTC devices
dca8231 ppc/pnv: scan ISA bus to populate device tree
4d631b7 ppc/pnv: enable only one LPC bus
af22372 ppc/pnv: Add support for POWER8+ LPC Controller
2d79b40 spapr: remove the 'nr_servers' field from the machine
8fa3c09 target/ppc: Fix size of struct PPCElfPrstatus
212f4d7 ipmi: introduce an ipmi_bmc_gen_event() API
240da02 ipmi: introduce an ipmi_bmc_sdr_find() API
3607ef7 ipmi: provide support for FRUs
aa873a2 ipmi: use a file to load SDRs
ef1ce62 ppc: add IPMI support
1a8fffd ppc/pnv: Add OCC model stub with interrupt support
828bcfa ppc/pnv: Add cut down PSI bridge model and hookup external interrupt
ac8392a ppc/pnv: add memory regions for the ICP registers
d90ca95 ppc/pnv: add a helper to calculate MMIO addresses registers
a5a614b ppc/pnv: create the ICP object under PnvCore
5f43b5e ppc/pnv: extend the machine with a InterruptStatsProvider interface
04dfecf ppc/pnv: extend the machine with a XICSFabric interface
5b94a0f ppc/pnv: add a PnvICPState object
96c645e ppc/xics: add a realize() handler to ICPStateClass
4569615 spapr: allocate the ICPState object from under sPAPRCPUCore
8257d1e spapr: move the IRQ server number mapping under the machine
56d6f91 ppc/xics: introduce an 'intc' backlink under PowerPCCPU
f024a9a target/ppc: Add ibm, processor-radix-AP-encodings for TCG
a0d8df3 spapr_pci: Removed unused include
7cc7952 spapr_pci: Warn when RAM page size is not enabled in IOMMU page mask
2093121 target-ppc/kvm: Enable in-kernel TCE acceleration for multi-tce
26538ee spapr: Workaround for broken radix guests
d2da420 spapr: Enable ISA 3.0 MMU mode selection via CAS
50fe08d spapr: move spapr_populate_pa_features()
d667f21 target/ppc: Implement H_REGISTER_PROCESS_TABLE H_CALL
129c199 target/ppc: Add new H-CALL shells for in memory table translation
d033285 target-ppc: support KVM_CAP_PPC_MMU_RADIX, KVM_CAP_PPC_MMU_HASH_V3
38c395b spapr: Add ibm, processor-radix-AP-encodings to the device tree
4fe232c target-ppc: kvm: make use of KVM_CREATE_SPAPR_TCE_64
90f87ad hw/ppc/pnv: Classify the "PowerNV Chip" devices as CPU devices
c98837a ppc/spapr: QOM'ify sPAPRRTCState
1bc37a7 pseries: Add pseries-2.10 machine type
40a726e target/ppc: Improve accuracy of guest HTM availability on P8s

=== OUTPUT BEGIN ===
Checking PATCH 1/47: target/ppc: Improve accuracy of guest HTM availability on 
P8s...
Checking PATCH 2/47: pseries: Add pseries-2.10 machine type...
Checking PATCH 3/47: ppc/spapr: QOM'ify sPAPRRTCState...
Checking PATCH 4/47: hw/ppc/pnv: Classify the "PowerNV Chip" devices as CPU 
devices...
Checking PATCH 5/47: target-ppc: kvm: make use of KVM_CREATE_SPAPR_TCE_64...
Checking PATCH 6/47: spapr: Add ibm, processor-radix-AP-encodings to the device 
tree...
Checking PATCH 7/47: target-ppc: support KVM_CAP_PPC_MMU_RADIX, 
KVM_CAP_PPC_MMU_HASH_V3...
Checking PATCH 8/47: target/ppc: Add new H-CALL shells for in memory table 
translation...
Checking PATCH 9/47: target/ppc: Implement H_REGISTER_PROCESS_TABLE H_CALL...
Checking PATCH 10/47: spapr: move spapr_populate_pa_features()...
Checking PATCH 11/47: spapr: Enable ISA 3.0 MMU mode selection via CAS...
Checking PATCH 12/47: spapr: Workaround for broken radix guests...
Checking PATCH 13/47: target-ppc/kvm: Enable in-kernel TCE acceleration for 
multi-tce...
Checking PATCH 14/47: spapr_pci: Warn when RAM page size is not enabled in 
IOMMU page mask...
Checking PATCH 15/47: spapr_pci: Removed unused include...
Checking PATCH 16/47: target/ppc: Add ibm, processor-radix-AP-encodings for 
TCG...
Checking PATCH 17/47: ppc/xics: introduce an 'intc' backlink under PowerPCCPU...
Checking PATCH 18/47: spapr: move the IRQ server number 

[Qemu-devel] [PULL 47/47] target/ppc: Style fixes

2017-04-23 Thread David Gibson
This makes a small step fixing one of many style problems that exist in
the older ppc code.  This removes spaces between function (or macro) name
and the following '('.

Signed-off-by: David Gibson 
---
 target/ppc/translate_init.c | 372 ++--
 1 file changed, 186 insertions(+), 186 deletions(-)

diff --git a/target/ppc/translate_init.c b/target/ppc/translate_init.c
index 0ecf541..e82e3e6 100644
--- a/target/ppc/translate_init.c
+++ b/target/ppc/translate_init.c
@@ -66,7 +66,7 @@ static void spr_store_dump_spr(int sprn)
 #endif
 }
 
-static void spr_write_generic (DisasContext *ctx, int sprn, int gprn)
+static void spr_write_generic(DisasContext *ctx, int sprn, int gprn)
 {
 gen_store_spr(sprn, cpu_gpr[gprn]);
 spr_store_dump_spr(sprn);
@@ -86,7 +86,7 @@ static void spr_write_generic32(DisasContext *ctx, int sprn, int gprn)
 #endif
 }
 
-static void spr_write_clear (DisasContext *ctx, int sprn, int gprn)
+static void spr_write_clear(DisasContext *ctx, int sprn, int gprn)
 {
 TCGv t0 = tcg_temp_new();
 TCGv t1 = tcg_temp_new();
@@ -106,47 +106,47 @@ static void spr_access_nop(DisasContext *ctx, int sprn, int gprn)
 
 /* SPR common to all PowerPC */
 /* XER */
-static void spr_read_xer (DisasContext *ctx, int gprn, int sprn)
+static void spr_read_xer(DisasContext *ctx, int gprn, int sprn)
 {
 gen_read_xer(ctx, cpu_gpr[gprn]);
 }
 
-static void spr_write_xer (DisasContext *ctx, int sprn, int gprn)
+static void spr_write_xer(DisasContext *ctx, int sprn, int gprn)
 {
 gen_write_xer(cpu_gpr[gprn]);
 }
 
 /* LR */
-static void spr_read_lr (DisasContext *ctx, int gprn, int sprn)
+static void spr_read_lr(DisasContext *ctx, int gprn, int sprn)
 {
 tcg_gen_mov_tl(cpu_gpr[gprn], cpu_lr);
 }
 
-static void spr_write_lr (DisasContext *ctx, int sprn, int gprn)
+static void spr_write_lr(DisasContext *ctx, int sprn, int gprn)
 {
 tcg_gen_mov_tl(cpu_lr, cpu_gpr[gprn]);
 }
 
 /* CFAR */
 #if defined(TARGET_PPC64) && !defined(CONFIG_USER_ONLY)
-static void spr_read_cfar (DisasContext *ctx, int gprn, int sprn)
+static void spr_read_cfar(DisasContext *ctx, int gprn, int sprn)
 {
 tcg_gen_mov_tl(cpu_gpr[gprn], cpu_cfar);
 }
 
-static void spr_write_cfar (DisasContext *ctx, int sprn, int gprn)
+static void spr_write_cfar(DisasContext *ctx, int sprn, int gprn)
 {
 tcg_gen_mov_tl(cpu_cfar, cpu_gpr[gprn]);
 }
 #endif /* defined(TARGET_PPC64) && !defined(CONFIG_USER_ONLY) */
 
 /* CTR */
-static void spr_read_ctr (DisasContext *ctx, int gprn, int sprn)
+static void spr_read_ctr(DisasContext *ctx, int gprn, int sprn)
 {
 tcg_gen_mov_tl(cpu_gpr[gprn], cpu_ctr);
 }
 
-static void spr_write_ctr (DisasContext *ctx, int sprn, int gprn)
+static void spr_write_ctr(DisasContext *ctx, int sprn, int gprn)
 {
 tcg_gen_mov_tl(cpu_ctr, cpu_gpr[gprn]);
 }
@@ -157,7 +157,7 @@ static void spr_write_ctr (DisasContext *ctx, int sprn, int gprn)
 /* UPMCx */
 /* USIA */
 /* UDECR */
-static void spr_read_ureg (DisasContext *ctx, int gprn, int sprn)
+static void spr_read_ureg(DisasContext *ctx, int gprn, int sprn)
 {
 gen_load_spr(cpu_gpr[gprn], sprn + 0x10);
 }
@@ -172,7 +172,7 @@ static void spr_write_ureg(DisasContext *ctx, int sprn, int gprn)
 /* SPR common to all non-embedded PowerPC */
 /* DECR */
 #if !defined(CONFIG_USER_ONLY)
-static void spr_read_decr (DisasContext *ctx, int gprn, int sprn)
+static void spr_read_decr(DisasContext *ctx, int gprn, int sprn)
 {
 if (ctx->tb->cflags & CF_USE_ICOUNT) {
 gen_io_start();
@@ -184,7 +184,7 @@ static void spr_read_decr (DisasContext *ctx, int gprn, int sprn)
 }
 }
 
-static void spr_write_decr (DisasContext *ctx, int sprn, int gprn)
+static void spr_write_decr(DisasContext *ctx, int sprn, int gprn)
 {
 if (ctx->tb->cflags & CF_USE_ICOUNT) {
 gen_io_start();
@@ -199,7 +199,7 @@ static void spr_write_decr (DisasContext *ctx, int sprn, int gprn)
 
 /* SPR common to all non-embedded PowerPC, except 601 */
 /* Time base */
-static void spr_read_tbl (DisasContext *ctx, int gprn, int sprn)
+static void spr_read_tbl(DisasContext *ctx, int gprn, int sprn)
 {
 if (ctx->tb->cflags & CF_USE_ICOUNT) {
 gen_io_start();
@@ -211,7 +211,7 @@ static void spr_read_tbl (DisasContext *ctx, int gprn, int sprn)
 }
 }
 
-static void spr_read_tbu (DisasContext *ctx, int gprn, int sprn)
+static void spr_read_tbu(DisasContext *ctx, int gprn, int sprn)
 {
 if (ctx->tb->cflags & CF_USE_ICOUNT) {
 gen_io_start();
@@ -224,19 +224,19 @@ static void spr_read_tbu (DisasContext *ctx, int gprn, int sprn)
 }
 
 __attribute__ (( unused ))
-static void spr_read_atbl (DisasContext *ctx, int gprn, int sprn)
+static void spr_read_atbl(DisasContext *ctx, int gprn, int sprn)
 {
 gen_helper_load_atbl(cpu_gpr[gprn], cpu_env);
 }
 
 __attribute__ (( unused ))
-static void spr_read_atbu (DisasContext *ctx, int gprn, int sprn)
+static void spr_read_atbu(DisasContext 

[Qemu-devel] [PULL 37/47] ppc/pnv: enable only one LPC bus

2017-04-23 Thread David Gibson
From: Cédric Le Goater 

The default LPC bus of a multichip system is on chip 0. It's
recognized by the firmware (skiboot) using a "primary" property in the
device tree.

We introduce a pnv_chip_lpc_offset() routine to locate the LPC node of
a chip and set the property directly from the machine level.

Signed-off-by: Cédric Le Goater 
Signed-off-by: David Gibson 
---
 hw/ppc/pnv.c | 22 ++
 hw/ppc/pnv_lpc.c |  9 -
 2 files changed, 22 insertions(+), 9 deletions(-)

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index 27589b9..9468e99 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -255,6 +255,18 @@ static void powernv_populate_icp(PnvChip *chip, void *fdt, uint32_t pir,
 g_free(reg);
 }
 
+static int pnv_chip_lpc_offset(PnvChip *chip, void *fdt)
+{
+char *name;
+int offset;
+
+name = g_strdup_printf("/xscom@%" PRIx64 "/isa@%x",
+   (uint64_t) PNV_XSCOM_BASE(chip), PNV_XSCOM_LPC_BASE);
+offset = fdt_path_offset(fdt, name);
+g_free(name);
+return offset;
+}
+
 static void powernv_populate_chip(PnvChip *chip, void *fdt)
 {
 PnvChipClass *pcc = PNV_CHIP_GET_CLASS(chip);
@@ -264,6 +276,16 @@ static void powernv_populate_chip(PnvChip *chip, void *fdt)
 
 pnv_xscom_populate(chip, fdt, 0);
 
+/* The default LPC bus of a multichip system is on chip 0. It's
+ * recognized by the firmware (skiboot) using a "primary"
+ * property.
+ */
+if (chip->chip_id == 0x0) {
+int lpc_offset = pnv_chip_lpc_offset(chip, fdt);
+
+_FDT((fdt_setprop(fdt, lpc_offset, "primary", NULL, 0)));
+}
+
 for (i = 0; i < chip->nr_cores; i++) {
 PnvCore *pnv_core = PNV_CORE(chip->cores + i * typesize);
 
diff --git a/hw/ppc/pnv_lpc.c b/hw/ppc/pnv_lpc.c
index 5d20c15..f03a80a 100644
--- a/hw/ppc/pnv_lpc.c
+++ b/hw/ppc/pnv_lpc.c
@@ -92,14 +92,6 @@ enum {
 #define LPC_HC_REGS_OPB_SIZE0x1000
 
 
-/*
- * TODO: the "primary" cell should only be added on chip 0. This is
- * how skiboot chooses the default LPC controller on multichip
- * systems.
- *
- * It would be easly done if we can change the populate() interface to
- * replace the PnvXScomInterface parameter by a PnvChip one
- */
 static int pnv_lpc_populate(PnvXScomInterface *dev, void *fdt, int xscom_offset)
 {
 const char compat[] = "ibm,power8-lpc\0ibm,lpc";
@@ -119,7 +111,6 @@ static int pnv_lpc_populate(PnvXScomInterface *dev, void *fdt, int xscom_offset)
 _FDT((fdt_setprop(fdt, offset, "reg", reg, sizeof(reg))));
 _FDT((fdt_setprop_cell(fdt, offset, "#address-cells", 2)));
 _FDT((fdt_setprop_cell(fdt, offset, "#size-cells", 1)));
-_FDT((fdt_setprop(fdt, offset, "primary", NULL, 0)));
 _FDT((fdt_setprop(fdt, offset, "compatible", compat, sizeof(compat))));
 return 0;
 }
-- 
2.9.3
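
As a side note on pnv_chip_lpc_offset(): the lookup works by formatting the node path "/xscom@<base>/isa@<lpc>" and handing it to fdt_path_offset(). The path-building half can be sketched without libfdt as below. lpc_node_path() is a hypothetical helper, and the base addresses used with it here are made-up example values, not the real PNV_XSCOM_BASE() or PNV_XSCOM_LPC_BASE.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format "/xscom@<base>/isa@<lpc>" into buf, both addresses in lowercase
 * hex without a 0x prefix. Returns the number of characters written
 * (excluding the terminating NUL), or -1 if buf is too small.
 */
static int lpc_node_path(char *buf, size_t len,
                         uint64_t xscom_base, uint32_t lpc_base)
{
    int n;

    assert(buf != NULL);
    n = snprintf(buf, len, "/xscom@%" PRIx64 "/isa@%" PRIx32,
                 xscom_base, lpc_base);
    if (n < 0 || (size_t)n >= len) {
        return -1;   /* encoding error or truncated output */
    }
    return n;
}
```

For example, lpc_node_path(buf, sizeof(buf), 0x1000, 0x20) yields "/xscom@1000/isa@20". In the patch the resulting string is passed to fdt_path_offset(), and the "primary" property is then set at the returned offset for chip 0 only.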




[Qemu-devel] [PULL 35/47] spapr: remove the 'nr_servers' field from the machine

2017-04-23 Thread David Gibson
From: Cédric Le Goater 

xics_system_init() does not need 'nr_servers' anymore as it is only
used to define the 'interrupt-controller' node in the device tree. So
let's just compute the value when calling spapr_dt_xics().

This also gives us an opportunity to simplify the xics_system_init()
routine and introduce a specific spapr_ics_create() helper to create
the sPAPR ICS object.

Signed-off-by: Cédric Le Goater 
Signed-off-by: David Gibson 
---
 hw/ppc/spapr.c | 56 ++
 include/hw/ppc/spapr.h |  1 -
 2 files changed, 24 insertions(+), 33 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 703b14a..80d12d0 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -97,45 +97,40 @@
 
 #define HTAB_SIZE(spapr)(1ULL << ((spapr)->htab_shift))
 
-static int try_create_xics(sPAPRMachineState *spapr, const char *type_ics,
-   const char *type_icp, int nr_servers,
-   int nr_irqs, Error **errp)
+static ICSState *spapr_ics_create(sPAPRMachineState *spapr,
+  const char *type_ics,
+  int nr_irqs, Error **errp)
 {
-XICSFabric *xi = XICS_FABRIC(spapr);
 Error *err = NULL, *local_err = NULL;
-ICSState *ics = NULL;
+Object *obj;
 
-ics = ICS_SIMPLE(object_new(type_ics));
-object_property_add_child(OBJECT(spapr), "ics", OBJECT(ics), NULL);
-object_property_set_int(OBJECT(ics), nr_irqs, "nr-irqs", &err);
-object_property_add_const_link(OBJECT(ics), "xics", OBJECT(xi), NULL);
-object_property_set_bool(OBJECT(ics), true, "realized", &local_err);
+obj = object_new(type_ics);
+object_property_add_child(OBJECT(spapr), "ics", obj, NULL);
+object_property_add_const_link(obj, "xics", OBJECT(spapr), &error_abort);
+object_property_set_int(obj, nr_irqs, "nr-irqs", &err);
+object_property_set_bool(obj, true, "realized", &local_err);
 error_propagate(&err, local_err);
 if (err) {
 error_propagate(errp, err);
-return -1;
+return NULL;
 }
 
-spapr->nr_servers = nr_servers;
-spapr->ics = ics;
-spapr->icp_type = type_icp;
-return 0;
+return ICS_SIMPLE(obj);
 }
 
-static int xics_system_init(MachineState *machine,
-int nr_servers, int nr_irqs, Error **errp)
+static void xics_system_init(MachineState *machine, int nr_irqs, Error **errp)
 {
-int rc = -1;
+sPAPRMachineState *spapr = SPAPR_MACHINE(machine);
 
 if (kvm_enabled()) {
 Error *err = NULL;
 
 if (machine_kernel_irqchip_allowed(machine) &&
-!xics_kvm_init(SPAPR_MACHINE(machine), errp)) {
-rc = try_create_xics(SPAPR_MACHINE(machine), TYPE_ICS_KVM,
- TYPE_KVM_ICP, nr_servers, nr_irqs, &err);
+!xics_kvm_init(spapr, errp)) {
+spapr->icp_type = TYPE_KVM_ICP;
+spapr->ics = spapr_ics_create(spapr, TYPE_ICS_KVM, nr_irqs, &err);
 }
-if (machine_kernel_irqchip_required(machine) && rc < 0) {
+if (machine_kernel_irqchip_required(machine) && !spapr->ics) {
 error_reportf_err(err,
   "kernel_irqchip requested but unavailable: ");
 } else {
@@ -143,13 +138,11 @@ static int xics_system_init(MachineState *machine,
 }
 }
 
-if (rc < 0) {
-xics_spapr_init(SPAPR_MACHINE(machine), errp);
-rc = try_create_xics(SPAPR_MACHINE(machine), TYPE_ICS_SIMPLE,
-   TYPE_ICP, nr_servers, nr_irqs, errp);
+if (!spapr->ics) {
+xics_spapr_init(spapr, errp);
+spapr->icp_type = TYPE_ICP;
+spapr->ics = spapr_ics_create(spapr, TYPE_ICS_SIMPLE, nr_irqs, errp);
 }
-
-return rc;
 }
 
 static int spapr_fixup_cpu_smt_dt(void *fdt, int offset, PowerPCCPU *cpu,
@@ -977,6 +970,7 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
 void *fdt;
 sPAPRPHBState *phb;
 char *buf;
+int smt = kvmppc_smt_threads();
 
 fdt = g_malloc0(FDT_MAX_SIZE);
 _FDT((fdt_create_empty_tree(fdt, FDT_MAX_SIZE)));
@@ -1016,7 +1010,7 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
 _FDT(fdt_setprop_cell(fdt, 0, "#size-cells", 2));
 
 /* /interrupt controller */
-spapr_dt_xics(spapr->nr_servers, fdt, PHANDLE_XICP);
+spapr_dt_xics(DIV_ROUND_UP(max_cpus * smt, smp_threads), fdt, PHANDLE_XICP);
 
 ret = spapr_populate_memory(spapr, fdt);
 if (ret < 0) {
@@ -2045,7 +2039,6 @@ static void ppc_spapr_init(MachineState *machine)
 hwaddr node0_size = spapr_node0_size();
 long load_limit, fw_size;
 char *filename;
-int smt = kvmppc_smt_threads();
 
 msi_nonbroken = true;
 
@@ -2096,8 +2089,7 @@ static void ppc_spapr_init(MachineState *machine)
 load_limit = MIN(spapr->rma_size, RTAS_MAX_ADDR) - FW_OVERHEAD;
 
 /* Set up Interrupt Controller before we create 
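
The server count that this patch now computes at the spapr_dt_xics() call site is a ceiling division, DIV_ROUND_UP(max_cpus * smt, smp_threads). A small sketch of that arithmetic follows; div_round_up() and xics_nr_servers() are hypothetical function versions written for illustration, and the numbers used with them are examples only.

```c
#include <assert.h>

/* Ceiling division for unsigned integers, same idea as DIV_ROUND_UP. */
static unsigned int div_round_up(unsigned int n, unsigned int d)
{
    assert(d > 0);
    return (n + d - 1) / d;
}

/*
 * Number of interrupt servers advertised in the device tree:
 * max_cpus * smt scaled down by the guest's threads per core,
 * rounded up so a partial core still gets its servers.
 */
static unsigned int xics_nr_servers(unsigned int max_cpus,
                                    unsigned int smt,
                                    unsigned int smp_threads)
{
    return div_round_up(max_cpus * smt, smp_threads);
}
```

For instance, with max_cpus = 8, smt = 8 and smp_threads = 2 this gives 32; with max_cpus = 5, smt = 8 and smp_threads = 3 the quotient 40/3 is rounded up to 14.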

[Qemu-devel] [PULL 31/47] ipmi: provide support for FRUs

2017-04-23 Thread David Gibson
From: Cédric Le Goater 

This patch provides simple FRU support for the BMC simulator. FRUs
are loaded from a file whose name is specified in the object
properties, each entry having a fixed size, also specified in the
properties. If the file is unknown or not accessible for some reason,
a single entry of 1024 bytes is created as a default, just enough to
start some simulation.

These commands comply with the IPMI spec: "34. FRU Inventory Device
Commands".

Signed-off-by: Cédric Le Goater 
Acked-by: Corey Minyard 
Signed-off-by: David Gibson 
---
 hw/ipmi/ipmi_bmc_sim.c | 128 +
 qemu-options.hx|   8 +++-
 2 files changed, 134 insertions(+), 2 deletions(-)

diff --git a/hw/ipmi/ipmi_bmc_sim.c b/hw/ipmi/ipmi_bmc_sim.c
index 1142c0c..eae7b2d 100644
--- a/hw/ipmi/ipmi_bmc_sim.c
+++ b/hw/ipmi/ipmi_bmc_sim.c
@@ -80,6 +80,9 @@
 #define IPMI_CMD_ENTER_SDR_REP_UPD_MODE   0x2A
 #define IPMI_CMD_EXIT_SDR_REP_UPD_MODE0x2B
 #define IPMI_CMD_RUN_INIT_AGENT   0x2C
+#define IPMI_CMD_GET_FRU_AREA_INFO0x10
+#define IPMI_CMD_READ_FRU_DATA0x11
+#define IPMI_CMD_WRITE_FRU_DATA   0x12
 #define IPMI_CMD_GET_SEL_INFO 0x40
 #define IPMI_CMD_GET_SEL_ALLOC_INFO   0x41
 #define IPMI_CMD_RESERVE_SEL  0x42
@@ -122,6 +125,13 @@ typedef struct IPMISdr {
 uint8_t overflow;
 } IPMISdr;
 
+typedef struct IPMIFru {
+char *filename;
+unsigned int nentries;
+uint16_t areasize;
+uint8_t *data;
+} IPMIFru;
+
 typedef struct IPMISensor {
 uint8_t status;
 uint8_t reading;
@@ -213,6 +223,7 @@ struct IPMIBmcSim {
 
 IPMISel sel;
 IPMISdr sdr;
+IPMIFru fru;
 IPMISensor sensors[MAX_SENSORS];
 char *sdr_filename;
 
@@ -1317,6 +1328,91 @@ static void get_sel_info(IPMIBmcSim *ibs,
 rsp_buffer_push(rsp, (ibs->sel.overflow << 7) | 0x02);
 }
 
+static void get_fru_area_info(IPMIBmcSim *ibs,
+ uint8_t *cmd, unsigned int cmd_len,
+ RspBuffer *rsp)
+{
+uint8_t fruid;
+uint16_t fru_entry_size;
+
+fruid = cmd[2];
+
+if (fruid >= ibs->fru.nentries) {
+rsp_buffer_set_error(rsp, IPMI_CC_INVALID_DATA_FIELD);
+return;
+}
+
+fru_entry_size = ibs->fru.areasize;
+
+rsp_buffer_push(rsp, fru_entry_size & 0xff);
+rsp_buffer_push(rsp, fru_entry_size >> 8 & 0xff);
+rsp_buffer_push(rsp, 0x0);
+}
+
+static void read_fru_data(IPMIBmcSim *ibs,
+ uint8_t *cmd, unsigned int cmd_len,
+ RspBuffer *rsp)
+{
+uint8_t fruid;
+uint16_t offset;
+int i;
+uint8_t *fru_entry;
+unsigned int count;
+
+fruid = cmd[2];
+offset = (cmd[3] | cmd[4] << 8);
+
+if (fruid >= ibs->fru.nentries) {
+rsp_buffer_set_error(rsp, IPMI_CC_INVALID_DATA_FIELD);
+return;
+}
+
+if (offset >= ibs->fru.areasize - 1) {
+rsp_buffer_set_error(rsp, IPMI_CC_INVALID_DATA_FIELD);
+return;
+}
+
+fru_entry = &ibs->fru.data[fruid * ibs->fru.areasize];
+
+count = MIN(cmd[5], ibs->fru.areasize - offset);
+
+rsp_buffer_push(rsp, count & 0xff);
+for (i = 0; i < count; i++) {
+rsp_buffer_push(rsp, fru_entry[offset + i]);
+}
+}
+
+static void write_fru_data(IPMIBmcSim *ibs,
+ uint8_t *cmd, unsigned int cmd_len,
+ RspBuffer *rsp)
+{
+uint8_t fruid;
+uint16_t offset;
+uint8_t *fru_entry;
+unsigned int count;
+
+fruid = cmd[2];
+offset = (cmd[3] | cmd[4] << 8);
+
+if (fruid >= ibs->fru.nentries) {
+rsp_buffer_set_error(rsp, IPMI_CC_INVALID_DATA_FIELD);
+return;
+}
+
+if (offset >= ibs->fru.areasize - 1) {
+rsp_buffer_set_error(rsp, IPMI_CC_INVALID_DATA_FIELD);
+return;
+}
+
+fru_entry = &ibs->fru.data[fruid * ibs->fru.areasize];
+
+count = MIN(cmd_len - 5, ibs->fru.areasize - offset);
+
+memcpy(fru_entry + offset, cmd + 5, count);
+
+rsp_buffer_push(rsp, count & 0xff);
+}
+
 static void reserve_sel(IPMIBmcSim *ibs,
 uint8_t *cmd, unsigned int cmd_len,
 RspBuffer *rsp)
@@ -1653,6 +1749,9 @@ static const IPMINetfn app_netfn = {
 };
 
 static const IPMICmdHandler storage_cmds[] = {
+[IPMI_CMD_GET_FRU_AREA_INFO] = { get_fru_area_info, 3 },
+[IPMI_CMD_READ_FRU_DATA] = { read_fru_data, 5 },
+[IPMI_CMD_WRITE_FRU_DATA] = { write_fru_data, 5 },
 [IPMI_CMD_GET_SDR_REP_INFO] = { get_sdr_rep_info },
 [IPMI_CMD_RESERVE_SDR_REP] = { reserve_sdr_rep },
 [IPMI_CMD_GET_SDR] = { get_sdr, 8 },
@@ -1755,6 +1854,31 @@ static const VMStateDescription vmstate_ipmi_sim = {
 }
 };
 
+static void ipmi_fru_init(IPMIFru *fru)
+{
+int fsize;
+int size = 0;
+
+fsize = get_image_size(fru->filename);
+if (fsize > 0) {
+size = 

[Qemu-devel] [PULL 36/47] ppc/pnv: Add support for POWER8+ LPC Controller

2017-04-23 Thread David Gibson
From: Benjamin Herrenschmidt 

It adds the Naples chip which supports proper LPC interrupts via the
LPC controller rather than via an external CPLD.

Signed-off-by: Benjamin Herrenschmidt 
[clg: - updated for qemu-2.9
  - ported on latest PowerNV patchset
  - moved the IRQ handler in pnv_lpc.c
  - introduced pnv_lpc_isa_irq_create() to create the ISA IRQs ]
Signed-off-by: Cédric Le Goater 
Signed-off-by: David Gibson 
---
 hw/ppc/pnv.c | 45 +++---
 hw/ppc/pnv_lpc.c | 97 +++-
 include/hw/ppc/pnv_lpc.h |  8 
 3 files changed, 108 insertions(+), 42 deletions(-)

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index 16f32c9..27589b9 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -346,36 +346,6 @@ static void ppc_powernv_reset(void)
 cpu_physical_memory_write(PNV_FDT_ADDR, fdt, fdt_totalsize(fdt));
 }
 
-/* If we don't use the built-in LPC interrupt deserializer, we need
- * to provide a set of qirqs for the ISA bus or things will go bad.
- *
- * Most machines using pre-Naples chips (without said deserializer)
- * have a CPLD that will collect the SerIRQ and shoot them as a
- * single level interrupt to the P8 chip. So let's setup a hook
- * for doing just that.
- */
-static void pnv_lpc_isa_irq_handler_cpld(void *opaque, int n, int level)
-{
-PnvMachineState *pnv = POWERNV_MACHINE(qdev_get_machine());
-uint32_t old_state = pnv->cpld_irqstate;
-PnvChip *chip = opaque;
-
-if (level) {
-pnv->cpld_irqstate |= 1u << n;
-} else {
-pnv->cpld_irqstate &= ~(1u << n);
-}
-if (pnv->cpld_irqstate != old_state) {
-pnv_psi_irq_set(&chip->psi, PSIHB_IRQ_EXTERNAL,
-pnv->cpld_irqstate != 0);
-}
-}
-
-static void pnv_lpc_isa_irq_handler(void *opaque, int n, int level)
-{
- /* XXX TODO */
-}
-
 static ISABus *pnv_isa_create(PnvChip *chip)
 {
 PnvLpcController *lpc = &chip->lpc;
@@ -390,16 +360,7 @@ static ISABus *pnv_isa_create(PnvChip *chip)
 isa_bus = isa_bus_new(NULL, &lpc->isa_mem, &lpc->isa_io,
   &error_fatal);
 
-/* Not all variants have a working serial irq decoder. If not,
- * handling of LPC interrupts becomes a platform issue (some
- * platforms have a CPLD to do it).
- */
-if (pcc->chip_type == PNV_CHIP_POWER8NVL) {
-irqs = qemu_allocate_irqs(pnv_lpc_isa_irq_handler, chip, ISA_NUM_IRQS);
-} else {
-irqs = qemu_allocate_irqs(pnv_lpc_isa_irq_handler_cpld, chip,
-  ISA_NUM_IRQS);
-}
+irqs = pnv_lpc_isa_irq_create(lpc, pcc->chip_type, ISA_NUM_IRQS);
 
 isa_bus_irqs(isa_bus, irqs);
 return isa_bus;
@@ -699,6 +660,10 @@ static void pnv_chip_init(Object *obj)
 object_property_add_child(obj, "occ", OBJECT(&chip->occ), NULL);
 object_property_add_const_link(OBJECT(&chip->occ), "psi",
 OBJECT(&chip->psi), &error_abort);
+
+/* The LPC controller needs PSI to generate interrupts */
+object_property_add_const_link(OBJECT(&chip->lpc), "psi",
+   OBJECT(&chip->psi), &error_abort);
 }
 
 static void pnv_chip_icp_realize(PnvChip *chip, Error **errp)
diff --git a/hw/ppc/pnv_lpc.c b/hw/ppc/pnv_lpc.c
index 78db524..5d20c15 100644
--- a/hw/ppc/pnv_lpc.c
+++ b/hw/ppc/pnv_lpc.c
@@ -250,6 +250,34 @@ static const MemoryRegionOps pnv_lpc_xscom_ops = {
 .endianness = DEVICE_BIG_ENDIAN,
 };
 
+static void pnv_lpc_eval_irqs(PnvLpcController *lpc)
+{
+bool lpc_to_opb_irq = false;
+
+/* Update LPC controller to OPB line */
+if (lpc->lpc_hc_irqser_ctrl & LPC_HC_IRQSER_EN) {
+uint32_t irqs;
+
+irqs = lpc->lpc_hc_irqstat & lpc->lpc_hc_irqmask;
+lpc_to_opb_irq = (irqs != 0);
+}
+
+/* We don't honor the polarity register, it's pointless and unused
+ * anyway
+ */
+if (lpc_to_opb_irq) {
+lpc->opb_irq_input |= OPB_MASTER_IRQ_LPC;
+} else {
+lpc->opb_irq_input &= ~OPB_MASTER_IRQ_LPC;
+}
+
+/* Update OPB internal latch */
+lpc->opb_irq_stat |= lpc->opb_irq_input & lpc->opb_irq_mask;
+
+/* Reflect the interrupt */
+pnv_psi_irq_set(lpc->psi, PSIHB_IRQ_LPC_I2C, lpc->opb_irq_stat != 0);
+}
+
 static uint64_t lpc_hc_read(void *opaque, hwaddr addr, unsigned size)
 {
 PnvLpcController *lpc = opaque;
@@ -300,12 +328,15 @@ static void lpc_hc_write(void *opaque, hwaddr addr, uint64_t val,
 break;
 case LPC_HC_IRQSER_CTRL:
 lpc->lpc_hc_irqser_ctrl = val;
+pnv_lpc_eval_irqs(lpc);
 break;
 case LPC_HC_IRQMASK:
 lpc->lpc_hc_irqmask = val;
+pnv_lpc_eval_irqs(lpc);
 break;
 case LPC_HC_IRQSTAT:
 lpc->lpc_hc_irqstat &= ~val;
+pnv_lpc_eval_irqs(lpc);
 break;
 case LPC_HC_ERROR_ADDRESS:
 break;
@@ -363,14 +394,15 @@ static void opb_master_write(void *opaque, hwaddr 

[Qemu-devel] [PULL 33/47] ipmi: introduce an ipmi_bmc_gen_event() API

2017-04-23 Thread David Gibson
From: Cédric Le Goater 

It will be used to fill the message buffer with custom events expected
by some systems. Typically, an Open PowerNV platform guest is notified
with an OEM SEL message before a shutdown or a reboot.
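
The buffering behavior described above can be sketched as a standalone toy
model. This is not the QEMU code: ToyBmc and toy_gen_event() are hypothetical
names, and the SEL is reduced to a counter. The sketch shows that an event is
logged whenever logging is requested and enabled, but is only copied into the
single 16-byte event buffer when that buffer is not already full.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_EVT_SIZE 16

/* Hypothetical stand-in for the simulator state; not the QEMU struct. */
typedef struct ToyBmc {
    bool evt_buf_enabled;   /* event message buffer enabled */
    bool evt_log_enabled;   /* event logging (SEL) enabled */
    bool evt_buf_full;      /* an unread event is still pending */
    unsigned sel_count;     /* number of events logged so far */
    uint8_t evtbuf[TOY_EVT_SIZE];
} ToyBmc;

/* Returns true when the guest should be signalled (attention raised). */
bool toy_gen_event(ToyBmc *b, const uint8_t *evt, bool log)
{
    if (!b->evt_buf_enabled) {
        return false;
    }
    if (log && b->evt_log_enabled) {
        b->sel_count++;             /* stands in for adding a SEL entry */
    }
    if (b->evt_buf_full) {
        return false;               /* previous event not consumed yet */
    }
    memcpy(b->evtbuf, evt, TOY_EVT_SIZE);
    b->evt_buf_full = true;
    return true;
}
```

In this model a second event arriving before the guest has read the first is
still logged, but it neither overwrites the buffer nor raises attention again.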

Signed-off-by: Cédric Le Goater 
Acked-by: Corey Minyard 
Signed-off-by: David Gibson 
---
 hw/ipmi/ipmi_bmc_sim.c | 24 
 include/hw/ipmi/ipmi.h |  2 ++
 2 files changed, 26 insertions(+)

diff --git a/hw/ipmi/ipmi_bmc_sim.c b/hw/ipmi/ipmi_bmc_sim.c
index 8185a84..155561d 100644
--- a/hw/ipmi/ipmi_bmc_sim.c
+++ b/hw/ipmi/ipmi_bmc_sim.c
@@ -473,6 +473,30 @@ static int attn_irq_enabled(IPMIBmcSim *ibs)
 IPMI_BMC_MSG_FLAG_EVT_BUF_FULL_SET(ibs));
 }
 
+void ipmi_bmc_gen_event(IPMIBmc *b, uint8_t *evt, bool log)
+{
+IPMIBmcSim *ibs = IPMI_BMC_SIMULATOR(b);
+IPMIInterface *s = ibs->parent.intf;
+IPMIInterfaceClass *k = IPMI_INTERFACE_GET_CLASS(s);
+
+if (!IPMI_BMC_EVENT_MSG_BUF_ENABLED(ibs)) {
+return;
+}
+
+if (log && IPMI_BMC_EVENT_LOG_ENABLED(ibs)) {
+sel_add_event(ibs, evt);
+}
+
+if (ibs->msg_flags & IPMI_BMC_MSG_FLAG_EVT_BUF_FULL) {
+goto out;
+}
+
+memcpy(ibs->evtbuf, evt, 16);
+ibs->msg_flags |= IPMI_BMC_MSG_FLAG_EVT_BUF_FULL;
+k->set_atn(s, 1, attn_irq_enabled(ibs));
+ out:
+return;
+}
 static void gen_event(IPMIBmcSim *ibs, unsigned int sens_num, uint8_t deassert,
   uint8_t evd1, uint8_t evd2, uint8_t evd3)
 {
diff --git a/include/hw/ipmi/ipmi.h b/include/hw/ipmi/ipmi.h
index 0d36cfc..0affe5a 100644
--- a/include/hw/ipmi/ipmi.h
+++ b/include/hw/ipmi/ipmi.h
@@ -261,4 +261,6 @@ typedef uint8_t ipmi_sdr_compact_buffer[sizeof(struct ipmi_sdr_compact)];
 
 int ipmi_bmc_sdr_find(IPMIBmc *b, uint16_t recid,
   const struct ipmi_sdr_compact **sdr, uint16_t *nextrec);
+void ipmi_bmc_gen_event(IPMIBmc *b, uint8_t *evt, bool log);
+
 #endif
-- 
2.9.3




[Qemu-devel] [PULL 46/47] e500, book3s: mfspr 259: Register mapped/aliased SPRG3 user read

2017-04-23 Thread David Gibson
From: Bernhard Kaindl 

This patch registers mfspr 259 for Book3S and e500 family cores
following this research:

mfspr 259 provides read-only mapped user access to SPRG3(SPR 275) according to:

- PowerISA 2.02, Book III (documents implementation starting with POWER4+ @ p20)
- IBM PowerPC 970MP RISC Microprocessor User's Manual v2.1, page 48
- Amit Singh: "Mac OS X Internals: A Systems Approach" on 970 and 970FX cores:
  He demonstrates mfspr 259 reading TLS data from Mac OS X on G5 on page 588
- NXP documents it in the Core Reference Manuals of: e500, e500mc and e5500
- getcpu() of the 32 & 64-bit Book3S Linux vDSOs use it to read the core number

mfspr 259 does not appear to be implemented in these cores according to:

- 74xx series: MPC7410/MPC7400 and MPC7450 RISC Microprocessor Reference Manuals
- 4xx series:  PPC440 Processor User's Manual, Revision 1.09 by AMCC
- 750 series:  IBM PowerPC 750CL RISC Microprocessor User's Manual
- e200 series: e200z4 Power Architecture Core Reference Manual

Implementation: gen_spr_usprg3() is called from init_proc_book3s_common()
(covers the 970 and POWER cores) and init_proc_e500() (covers the e500 family)
to register spr_read_ureg() for SPR 259. This mirrors gen_spr_usprgh(), which
already provides the same read-only mapped access to SPRG4-7 through
SPR_USPRG4-7 on cores that implement it.
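
As an illustration of the aliasing only, here is a small standalone sketch. It
is a toy model and not the QEMU translator: ToyCpu, toy_mfspr() and toy_mtspr()
are hypothetical names. The offset of 0x10 is simply 275 - 259, the distance
between the user alias and the privileged SPRG3.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_SPR_USPRG3 259   /* user-mode, read-only alias */
#define TOY_SPR_SPRG3  275   /* privileged register backing the alias */
#define TOY_NUM_SPRS   1024

typedef struct ToyCpu {
    uint64_t spr[TOY_NUM_SPRS];
} ToyCpu;

/* Read through the alias: the value comes from the SPR 0x10 above it. */
uint64_t toy_read_ureg(const ToyCpu *cpu, int sprn)
{
    return cpu->spr[sprn + 0x10];
}

/* mfspr: SPR 259 is routed through the alias, others are read directly. */
uint64_t toy_mfspr(const ToyCpu *cpu, int sprn)
{
    if (sprn == TOY_SPR_USPRG3) {
        return toy_read_ureg(cpu, sprn);
    }
    return cpu->spr[sprn];
}

/* mtspr: the alias is read-only, so a write to it is refused. */
bool toy_mtspr(ToyCpu *cpu, int sprn, uint64_t val)
{
    if (sprn == TOY_SPR_USPRG3) {
        return false;
    }
    cpu->spr[sprn] = val;
    return true;
}
```

A value stored in SPR 275 is then visible through SPR 259, while a store to
SPR 259 leaves it unchanged.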

Verified using Linux by pinning a thread to a core and checking sched_getcpu()
using qemu-system-ppc64 -M pseries -cpu POWER8 using MTTCG on a x86_64 host.

Signed-off-by: Bernhard Kaindl 
Reviewed-by: Stefan Resch 
Signed-off-by: David Gibson 
---
 target/ppc/translate_init.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/target/ppc/translate_init.c b/target/ppc/translate_init.c
index 77e5463..0ecf541 100644
--- a/target/ppc/translate_init.c
+++ b/target/ppc/translate_init.c
@@ -1640,6 +1640,14 @@ static void spr_write_booke_pid (DisasContext *ctx, int sprn, int gprn)
 }
 #endif
 
+static void gen_spr_usprg3 (CPUPPCState *env)
+{
+spr_register(env, SPR_USPRG3, "USPRG3",
+ &spr_read_ureg, SPR_NOACCESS,
+ &spr_read_ureg, SPR_NOACCESS,
+ 0x00000000);
+}
+
 static void gen_spr_usprgh (CPUPPCState *env)
 {
 spr_register(env, SPR_USPRG4, "USPRG4",
@@ -4914,6 +4922,7 @@ static void init_proc_e500 (CPUPPCState *env, int version)
 break;
 }
 gen_spr_BookE(env, ivor_mask);
+gen_spr_usprg3(env);
 /* Processor identification */
 spr_register(env, SPR_BOOKE_PIR, "PIR",
  SPR_NOACCESS, SPR_NOACCESS,
@@ -8245,6 +8254,7 @@ static void init_proc_book3s_common(CPUPPCState *env)
 {
 gen_spr_ne_601(env);
 gen_tbl(env);
+gen_spr_usprg3(env);
 gen_spr_book3s_altivec(env);
 gen_spr_book3s_pmu_sup(env);
 gen_spr_book3s_pmu_user(env);
-- 
2.9.3




[Qemu-devel] [PULL 34/47] target/ppc: Fix size of struct PPCElfPrstatus

2017-04-23 Thread David Gibson
From: Anton Blanchard 

gdb refuses to parse QEMU memory dumps because struct PPCElfPrstatus
is the wrong size. Fix it.
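
To see why the type of the trailing padding matters, here is a sketch under
stated assumptions: toy_reg_t is taken to be a 64-bit type and ToyUserRegs is
a hypothetical block of 44 registers; neither is shown in this patch. The
packed attribute is the GCC/Clang spelling. With 8-byte registers, four
registers of padding occupy 32 bytes, whereas char pad2[40] always occupies
40 bytes, so the two layouts differ by 8 bytes.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t toy_reg_t;                 /* assumption: 64-bit target */

typedef struct ToyUserRegs {
    toy_reg_t regs[44];                     /* assumption: 44 registers */
} __attribute__((packed)) ToyUserRegs;

struct ToyPrstatusOld {
    char pad1[112];
    ToyUserRegs pr_reg;
    toy_reg_t pad2[4];                      /* 4 * 8 = 32 bytes */
} __attribute__((packed));

struct ToyPrstatusNew {
    char pad1[112];
    ToyUserRegs pr_reg;
    char pad2[40];                          /* 40 bytes, whatever reg_t is */
} __attribute__((packed));

size_t toy_old_size(void) { return sizeof(struct ToyPrstatusOld); }
size_t toy_new_size(void) { return sizeof(struct ToyPrstatusNew); }
```

Under these assumptions the old layout is 496 bytes and the new one 504.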

Signed-off-by: Anton Blanchard 
Fixes: e62fbc54d459 ("target-ppc: dump-guest-memory support")
Signed-off-by: David Gibson 
---
 target/ppc/arch_dump.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/ppc/arch_dump.c b/target/ppc/arch_dump.c
index 28d9cc7..8e9397a 100644
--- a/target/ppc/arch_dump.c
+++ b/target/ppc/arch_dump.c
@@ -50,7 +50,7 @@ struct PPCUserRegStruct {
 struct PPCElfPrstatus {
 char pad1[112];
 struct PPCUserRegStruct pr_reg;
-reg_t pad2[4];
+char pad2[40];
 } QEMU_PACKED;
 
 
-- 
2.9.3




[Qemu-devel] [PULL 39/47] ppc/pnv: populate device tree for RTC devices

2017-04-23 Thread David Gibson
From: Cédric Le Goater 

The code could be common to any ISA device but we are missing the IO
length.
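
A minimal sketch of what a generic helper might look like once the IO length
is available. The function names are hypothetical, and the meaning of the
three cells (address space, base, length) is an assumption read off the values
used in this patch. The point is only that the length becomes a parameter
instead of a constant.

```c
#include <stdint.h>

/* Store a 32-bit value as a big-endian device-tree cell. */
void toy_put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Build a 12-byte "reg" value made of three cells: space, base, length. */
void toy_isa_reg(uint8_t out[12], uint32_t space, uint32_t io_base,
                 uint32_t io_len)
{
    toy_put_be32(out, space);
    toy_put_be32(out + 4, io_base);
    toy_put_be32(out + 8, io_len);
}
```

Calling toy_isa_reg(buf, 1, 0x70, 2) would produce the same three cells as
the RTC-specific code for a device at IO port 0x70.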

Signed-off-by: Cédric Le Goater 
Reviewed-by: David Gibson 
Signed-off-by: David Gibson 
---
 hw/ppc/pnv.c | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index c445906..8ab5bb1 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -303,6 +303,26 @@ static void powernv_populate_chip(PnvChip *chip, void *fdt)
 g_free(typename);
 }
 
+static void powernv_populate_rtc(ISADevice *d, void *fdt, int lpc_off)
+{
+uint32_t io_base = d->ioport_id;
+uint32_t io_regs[] = {
+cpu_to_be32(1),
+cpu_to_be32(io_base),
+cpu_to_be32(2)
+};
+char *name;
+int node;
+
+name = g_strdup_printf("%s@i%x", qdev_fw_name(DEVICE(d)), io_base);
+node = fdt_add_subnode(fdt, lpc_off, name);
+_FDT(node);
+g_free(name);
+
+_FDT((fdt_setprop(fdt, node, "reg", io_regs, sizeof(io_regs))));
+_FDT((fdt_setprop_string(fdt, node, "compatible", "pnpPNP,b00")));
+}
+
 typedef struct ForeachPopulateArgs {
 void *fdt;
 int offset;
@@ -310,6 +330,16 @@ typedef struct ForeachPopulateArgs {
 
 static int powernv_populate_isa_device(DeviceState *dev, void *opaque)
 {
+ForeachPopulateArgs *args = opaque;
+ISADevice *d = ISA_DEVICE(dev);
+
+if (object_dynamic_cast(OBJECT(dev), TYPE_MC146818_RTC)) {
+powernv_populate_rtc(d, args->fdt, args->offset);
+} else {
+error_report("unknown isa device %s@i%x", qdev_fw_name(dev),
+ d->ioport_id);
+}
+
 return 0;
 }
 
-- 
2.9.3




[Qemu-devel] [PULL 27/47] ppc/pnv: Add cut down PSI bridge model and hookup external interrupt

2017-04-23 Thread David Gibson
From: Cédric Le Goater 

The Processor Service Interface (PSI) Controller is one of the engines
of the "Bridge" unit which connects the different interfaces to the
Power Processor.

This adds just enough of the PSI bridge to handle the various on-chip
interrupts and the one external interrupt. The rest of PSI has to do with
the link to the IBM FSP service processor, which we don't plan to emulate
(it is not used on OpenPower machines).
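
The external interrupt hookup follows a simple pattern: many ISA IRQ lines are
collapsed into one level, and the output is only re-evaluated when the
combined state changes. A standalone sketch of that pattern, with hypothetical
names (ToyCpld, toy_cpld_set_irq) and a plain boolean standing in for the PSI
external interrupt input:

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct ToyCpld {
    uint32_t irqstate;      /* one bit per ISA IRQ line */
    bool external_level;    /* level presented on the external interrupt */
    unsigned updates;       /* how often the output was re-evaluated */
} ToyCpld;

void toy_cpld_set_irq(ToyCpld *c, int n, bool level)
{
    uint32_t old_state = c->irqstate;

    if (level) {
        c->irqstate |= 1u << n;
    } else {
        c->irqstate &= ~(1u << n);
    }
    /* Only touch the output when the combined state actually changed. */
    if (c->irqstate != old_state) {
        c->external_level = (c->irqstate != 0);
        c->updates++;
    }
}
```

The output stays asserted until every input line has been lowered.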

The ics_get() and ics_resend() handlers of the XICSFabric interface of
the PowerNV machine are now defined to handle the Interrupt Control
Source of PSI. The InterruptStatsProvider interface is also modified
to dump the new ICS.

Originally from Benjamin Herrenschmidt 

Signed-off-by: Cédric Le Goater 
Signed-off-by: David Gibson 
---
 hw/ppc/Makefile.objs   |   2 +-
 hw/ppc/pnv.c   |  65 +-
 hw/ppc/pnv_psi.c   | 571 +
 include/hw/ppc/pnv.h   |  13 ++
 include/hw/ppc/pnv_psi.h   |  67 ++
 include/hw/ppc/pnv_xscom.h |   3 +
 6 files changed, 714 insertions(+), 7 deletions(-)
 create mode 100644 hw/ppc/pnv_psi.c
 create mode 100644 include/hw/ppc/pnv_psi.h

diff --git a/hw/ppc/Makefile.objs b/hw/ppc/Makefile.objs
index 0012934..dc19ee1 100644
--- a/hw/ppc/Makefile.objs
+++ b/hw/ppc/Makefile.objs
@@ -6,7 +6,7 @@ obj-$(CONFIG_PSERIES) += spapr_hcall.o spapr_iommu.o spapr_rtas.o
 obj-$(CONFIG_PSERIES) += spapr_pci.o spapr_rtc.o spapr_drc.o spapr_rng.o
 obj-$(CONFIG_PSERIES) += spapr_cpu_core.o spapr_ovec.o
 # IBM PowerNV
-obj-$(CONFIG_POWERNV) += pnv.o pnv_xscom.o pnv_core.o pnv_lpc.o
+obj-$(CONFIG_POWERNV) += pnv.o pnv_xscom.o pnv_core.o pnv_lpc.o pnv_psi.o
 ifeq ($(CONFIG_PCI)$(CONFIG_PSERIES)$(CONFIG_LINUX), yyy)
 obj-y += spapr_pci_vfio.o
 endif
diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index 1fa90d6..a516acb 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -353,15 +353,22 @@ static void ppc_powernv_reset(void)
  * have a CPLD that will collect the SerIRQ and shoot them as a
  * single level interrupt to the P8 chip. So let's setup a hook
  * for doing just that.
- *
- * Note: The actual interrupt input isn't emulated yet, this will
- * come with the PSI bridge model.
  */
 static void pnv_lpc_isa_irq_handler_cpld(void *opaque, int n, int level)
 {
-/* We don't yet emulate the PSI bridge which provides the external
- * interrupt, so just drop interrupts on the floor
- */
+PnvMachineState *pnv = POWERNV_MACHINE(qdev_get_machine());
+uint32_t old_state = pnv->cpld_irqstate;
+PnvChip *chip = opaque;
+
+if (level) {
+pnv->cpld_irqstate |= 1u << n;
+} else {
+pnv->cpld_irqstate &= ~(1u << n);
+}
+if (pnv->cpld_irqstate != old_state) {
+pnv_psi_irq_set(&chip->psi, PSIHB_IRQ_EXTERNAL,
+pnv->cpld_irqstate != 0);
+}
 }
 
 static void pnv_lpc_isa_irq_handler(void *opaque, int n, int level)
@@ -682,6 +689,11 @@ static void pnv_chip_init(Object *obj)
 
 object_initialize(&chip->lpc, sizeof(chip->lpc), TYPE_PNV_LPC);
 object_property_add_child(obj, "lpc", OBJECT(&chip->lpc), NULL);
+
+object_initialize(&chip->psi, sizeof(chip->psi), TYPE_PNV_PSI);
+object_property_add_child(obj, "psi", OBJECT(&chip->psi), NULL);
+object_property_add_const_link(OBJECT(&chip->psi), "xics",
+   OBJECT(qdev_get_machine()), &error_abort);
 }
 
 static void pnv_chip_icp_realize(PnvChip *chip, Error **errp)
@@ -794,6 +806,16 @@ static void pnv_chip_realize(DeviceState *dev, Error **errp)
 error_propagate(errp, error);
 return;
 }
+
+/* Processor Service Interface (PSI) Host Bridge */
+object_property_set_int(OBJECT(&chip->psi), PNV_PSIHB_BASE(chip),
+"bar", &error_fatal);
+object_property_set_bool(OBJECT(&chip->psi), true, "realized", &error);
+if (error) {
+error_propagate(errp, error);
+return;
+}
+pnv_xscom_add_subregion(chip, PNV_XSCOM_PSIHB_BASE, &chip->psi.xscom_regs);
 }
 
 static Property pnv_chip_properties[] = {
@@ -824,6 +846,29 @@ static const TypeInfo pnv_chip_info = {
 .abstract  = true,
 };
 
+static ICSState *pnv_ics_get(XICSFabric *xi, int irq)
+{
+PnvMachineState *pnv = POWERNV_MACHINE(xi);
+int i;
+
+for (i = 0; i < pnv->num_chips; i++) {
+if (ics_valid_irq(&pnv->chips[i]->psi.ics, irq)) {
+return &pnv->chips[i]->psi.ics;
+}
+}
+return NULL;
+}
+
+static void pnv_ics_resend(XICSFabric *xi)
+{
+PnvMachineState *pnv = POWERNV_MACHINE(xi);
+int i;
+
+for (i = 0; i < pnv->num_chips; i++) {
+ics_resend(&pnv->chips[i]->psi.ics);
+}
+}
+
 static PowerPCCPU *ppc_get_vcpu_by_pir(int pir)
 {
 CPUState *cs;
@@ -850,6 +895,8 @@ static ICPState *pnv_icp_get(XICSFabric *xi, int pir)
 static void pnv_pic_print_info(InterruptStatsProvider *obj,
Monitor *mon)
 {
+PnvMachineState *pnv = POWERNV_MACHINE(obj);

[Qemu-devel] [PULL 45/47] target/ppc: Flush TLB on write to PIDR

2017-04-23 Thread David Gibson
From: Suraj Jitindar Singh 

The PIDR (process id register) is used to store the id of the currently
running process, which is used to select the process table entry used to
perform address translation. This means that when we write to this register
all the translations in the TLB become outdated as they are for a
previously running process. Thus when this register is written to we need
to invalidate the TLB entries to ensure stale entries aren't used to
perform translation for the new process, which would result in at best
segfaults or alternatively just random memory being accessed.
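
The reasoning above can be sketched with a toy translation cache whose entries
are not tagged with a process id. All names here (ToyMmu, toy_store_pidr and
so on) are hypothetical and the direct-mapped layout is an assumption made for
brevity; the only point is that a PIDR write must drop every cached
translation.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_TLB_ENTRIES 8

typedef struct ToyTlbEntry {
    bool valid;
    uint64_t vpage;     /* virtual page number */
    uint64_t ppage;     /* physical page number */
} ToyTlbEntry;

typedef struct ToyMmu {
    uint64_t pidr;
    ToyTlbEntry tlb[TOY_TLB_ENTRIES];
} ToyMmu;

/* Cache a translation made on behalf of the current process. */
void toy_tlb_fill(ToyMmu *m, uint64_t vpage, uint64_t ppage)
{
    ToyTlbEntry *e = &m->tlb[vpage % TOY_TLB_ENTRIES];

    e->valid = true;
    e->vpage = vpage;
    e->ppage = ppage;
}

/* Returns true and sets *ppage on a hit. */
bool toy_tlb_lookup(const ToyMmu *m, uint64_t vpage, uint64_t *ppage)
{
    const ToyTlbEntry *e = &m->tlb[vpage % TOY_TLB_ENTRIES];

    if (e->valid && e->vpage == vpage) {
        *ppage = e->ppage;
        return true;
    }
    return false;
}

/* Writing the process id invalidates all cached translations. */
void toy_store_pidr(ToyMmu *m, uint64_t val)
{
    m->pidr = val;
    memset(m->tlb, 0, sizeof(m->tlb));
}
```

Without the memset() in toy_store_pidr(), a lookup after the switch would
still return the previous process's physical page.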

Signed-off-by: Suraj Jitindar Singh 
Reviewed-by: David Gibson 
[dwg: Fixed compile error for 32-bit targets]
Signed-off-by: David Gibson 
---
 target/ppc/helper.h |  1 +
 target/ppc/misc_helper.c|  8 
 target/ppc/translate_init.c | 10 --
 3 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 6d77661..bb6a94a 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -709,6 +709,7 @@ DEF_HELPER_FLAGS_1(load_601_rtcu, TCG_CALL_NO_RWG, tl, env)
 DEF_HELPER_FLAGS_1(load_purr, TCG_CALL_NO_RWG, tl, env)
 #endif
 DEF_HELPER_2(store_sdr1, void, env, tl)
+DEF_HELPER_2(store_pidr, void, env, tl)
 DEF_HELPER_FLAGS_2(store_tbl, TCG_CALL_NO_RWG, void, env, tl)
 DEF_HELPER_FLAGS_2(store_tbu, TCG_CALL_NO_RWG, void, env, tl)
 DEF_HELPER_FLAGS_2(store_atbl, TCG_CALL_NO_RWG, void, env, tl)
diff --git a/target/ppc/misc_helper.c b/target/ppc/misc_helper.c
index fa573dd..0e42178 100644
--- a/target/ppc/misc_helper.c
+++ b/target/ppc/misc_helper.c
@@ -88,6 +88,14 @@ void helper_store_sdr1(CPUPPCState *env, target_ulong val)
 }
 }
 
+void helper_store_pidr(CPUPPCState *env, target_ulong val)
+{
+PowerPCCPU *cpu = ppc_env_get_cpu(env);
+
+env->spr[SPR_BOOKS_PID] = val;
+tlb_flush(CPU(cpu));
+}
+
 void helper_store_hid0_601(CPUPPCState *env, target_ulong val)
 {
 target_ulong hid0;
diff --git a/target/ppc/translate_init.c b/target/ppc/translate_init.c
index aa0c44d..77e5463 100644
--- a/target/ppc/translate_init.c
+++ b/target/ppc/translate_init.c
@@ -394,8 +394,14 @@ static void spr_write_sdr1 (DisasContext *ctx, int sprn, int gprn)
 gen_helper_store_sdr1(cpu_env, cpu_gpr[gprn]);
 }
 
-/* 64 bits PowerPC specific SPRs */
 #if defined(TARGET_PPC64)
+/* 64 bits PowerPC specific SPRs */
+/* PIDR */
+static void spr_write_pidr(DisasContext *ctx, int sprn, int gprn)
+{
+gen_helper_store_pidr(cpu_env, cpu_gpr[gprn]);
+}
+
 static void spr_read_hior (DisasContext *ctx, int gprn, int sprn)
 {
 tcg_gen_ld_tl(cpu_gpr[gprn], cpu_env, offsetof(CPUPPCState, excp_prefix));
@@ -8200,7 +8206,7 @@ static void gen_spr_power8_book4(CPUPPCState *env)
  KVM_REG_PPC_ACOP, 0);
 spr_register_kvm(env, SPR_BOOKS_PID, "PID",
  SPR_NOACCESS, SPR_NOACCESS,
- &spr_read_generic, &spr_write_generic,
+ &spr_read_generic, &spr_write_pidr,
  KVM_REG_PPC_PID, 0);
 spr_register_kvm(env, SPR_WORT, "WORT",
  SPR_NOACCESS, SPR_NOACCESS,
-- 
2.9.3




[Qemu-devel] [PULL 42/47] ppc/pnv: add initial IPMI sensors for the BMC simulator

2017-04-23 Thread David Gibson
From: Cédric Le Goater 

Skiboot, the firmware for the PowerNV platform, expects the BMC to
provide some specific IPMI sensors. These sensors are exposed in the
device tree and their values are updated by the firmware at boot time.

Sensors of interest are:

"FW Boot Progress"
"Boot Count"

As such a device is defined on the command line, we can only detect
its presence at reset time.

Signed-off-by: Cédric Le Goater 
Reviewed-by: David Gibson 
Signed-off-by: David Gibson 
---
 hw/ppc/Makefile.objs |  2 +-
 hw/ppc/pnv.c | 21 ++
 hw/ppc/pnv_bmc.c | 81 
 include/hw/ppc/pnv.h |  9 ++
 4 files changed, 112 insertions(+), 1 deletion(-)
 create mode 100644 hw/ppc/pnv_bmc.c

diff --git a/hw/ppc/Makefile.objs b/hw/ppc/Makefile.objs
index ef67ea8..7efc686 100644
--- a/hw/ppc/Makefile.objs
+++ b/hw/ppc/Makefile.objs
@@ -6,7 +6,7 @@ obj-$(CONFIG_PSERIES) += spapr_hcall.o spapr_iommu.o spapr_rtas.o
 obj-$(CONFIG_PSERIES) += spapr_pci.o spapr_rtc.o spapr_drc.o spapr_rng.o
 obj-$(CONFIG_PSERIES) += spapr_cpu_core.o spapr_ovec.o
 # IBM PowerNV
-obj-$(CONFIG_POWERNV) += pnv.o pnv_xscom.o pnv_core.o pnv_lpc.o pnv_psi.o pnv_occ.o
+obj-$(CONFIG_POWERNV) += pnv.o pnv_xscom.o pnv_core.o pnv_lpc.o pnv_psi.o pnv_occ.o pnv_bmc.o
 ifeq ($(CONFIG_PCI)$(CONFIG_PSERIES)$(CONFIG_LINUX), yyy)
 obj-y += spapr_pci_vfio.o
 endif
diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index 977e126..685eb12 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -35,6 +35,7 @@
 #include "qapi/visitor.h"
 #include "monitor/monitor.h"
 #include "hw/intc/intc.h"
+#include "hw/ipmi/ipmi.h"
 
 #include "hw/ppc/xics.h"
 #include "hw/ppc/pnv_xscom.h"
@@ -476,16 +477,36 @@ static void *powernv_create_fdt(MachineState *machine)
 /* Populate ISA devices on chip 0 */
 lpc_offset = pnv_chip_lpc_offset(pnv->chips[0], fdt);
 powernv_populate_isa(pnv->isa_bus, fdt, lpc_offset);
+
+if (pnv->bmc) {
+pnv_bmc_populate_sensors(pnv->bmc, fdt);
+}
+
 return fdt;
 }
 
 static void ppc_powernv_reset(void)
 {
 MachineState *machine = MACHINE(qdev_get_machine());
+PnvMachineState *pnv = POWERNV_MACHINE(machine);
 void *fdt;
+Object *obj;
 
 qemu_devices_reset();
 
+/* OpenPOWER systems have a BMC, which can be defined on the
+ * command line with:
+ *
+ *   -device ipmi-bmc-sim,id=bmc0
+ *
+ * This is the internal simulator but it could also be an external
+ * BMC.
+ */
+obj = object_resolve_path_type("", TYPE_IPMI_BMC, NULL);
+if (obj) {
+pnv->bmc = IPMI_BMC(obj);
+}
+
 fdt = powernv_create_fdt(machine);
 
 /* Pack resulting tree */
diff --git a/hw/ppc/pnv_bmc.c b/hw/ppc/pnv_bmc.c
new file mode 100644
index 0000000..a0820dc
--- /dev/null
+++ b/hw/ppc/pnv_bmc.c
@@ -0,0 +1,81 @@
+/*
+ * QEMU PowerNV, BMC related functions
+ *
+ * Copyright (c) 2016-2017, IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/hw.h"
+#include "sysemu/sysemu.h"
+#include "target/ppc/cpu.h"
+#include "qapi/error.h"
+#include "qemu/log.h"
+#include "hw/ipmi/ipmi.h"
+#include "hw/ppc/fdt.h"
+
+#include "hw/ppc/pnv.h"
+
+#include <libfdt.h>
+
+/* TODO: include definition in ipmi.h */
+#define IPMI_SDR_FULL_TYPE 1
+
+void pnv_bmc_populate_sensors(IPMIBmc *bmc, void *fdt)
+{
+int offset;
+int i;
+const struct ipmi_sdr_compact *sdr;
+uint16_t nextrec;
+
+offset = fdt_add_subnode(fdt, 0, "/bmc");
+_FDT(offset);
+
+_FDT((fdt_setprop_string(fdt, offset, "name", "bmc")));
+_FDT((fdt_setprop_cell(fdt, offset, "#address-cells", 0x1)));
+_FDT((fdt_setprop_cell(fdt, offset, "#size-cells", 0x0)));
+
+offset = fdt_add_subnode(fdt, offset, "sensors");
+_FDT(offset);
+
+_FDT((fdt_setprop_cell(fdt, offset, "#address-cells", 0x1)));
+_FDT((fdt_setprop_cell(fdt, offset, "#size-cells", 0x0)));
+
+for (i = 0; !ipmi_bmc_sdr_find(bmc, i, &sdr, &nextrec); i++) {
+int off;
+char *name;
+
+if (sdr->header.rec_type != IPMI_SDR_COMPACT_TYPE &&
+sdr->header.rec_type != IPMI_SDR_FULL_TYPE) {
+continue;
+}
+
+name = g_strdup_printf("sensor@%x", sdr->sensor_owner_number);
+off = fdt_add_subnode(fdt, offset, name);
+_FDT(off);
+

[Qemu-devel] [PULL 21/47] ppc/pnv: add a PnvICPState object

2017-04-23 Thread David Gibson
From: Cédric Le Goater 

This provides a new ICPState object for the PowerNV machine (POWER8).
Access to the Interrupt Management area is done though a memory
region. It contains the registers of the Interrupt Control Presenters
of each thread which are used to accept, return, forward interrupts in
the system.

Signed-off-by: Cédric Le Goater 
Reviewed-by: David Gibson 
Signed-off-by: David Gibson 
---
 hw/intc/Makefile.objs |   1 +
 hw/intc/xics_pnv.c| 192 ++
 include/hw/ppc/xics.h |  12 
 3 files changed, 205 insertions(+)
 create mode 100644 hw/intc/xics_pnv.c

diff --git a/hw/intc/Makefile.objs b/hw/intc/Makefile.objs
index adedd0d..78426a7 100644
--- a/hw/intc/Makefile.objs
+++ b/hw/intc/Makefile.objs
@@ -35,6 +35,7 @@ obj-$(CONFIG_SH4) += sh_intc.o
 obj-$(CONFIG_XICS) += xics.o
 obj-$(CONFIG_XICS_SPAPR) += xics_spapr.o
 obj-$(CONFIG_XICS_KVM) += xics_kvm.o
+obj-$(CONFIG_POWERNV) += xics_pnv.o
 obj-$(CONFIG_ALLWINNER_A10_PIC) += allwinner-a10-pic.o
 obj-$(CONFIG_S390_FLIC) += s390_flic.o
 obj-$(CONFIG_S390_FLIC_KVM) += s390_flic_kvm.o
diff --git a/hw/intc/xics_pnv.c b/hw/intc/xics_pnv.c
new file mode 100644
index 0000000..12ae605
--- /dev/null
+++ b/hw/intc/xics_pnv.c
@@ -0,0 +1,192 @@
+/*
+ * QEMU PowerPC PowerNV Interrupt Control Presenter (ICP) model
+ *
+ * Copyright (c) 2017, IBM Corporation.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public License
+ * as published by the Free Software Foundation; either version 2 of
+ * the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "sysemu/sysemu.h"
+#include "qapi/error.h"
+#include "qemu/log.h"
+#include "hw/ppc/xics.h"
+
+#define ICP_XIRR_POLL    0 /* 1 byte (CPRR) or 4 bytes */
+#define ICP_XIRR         4 /* 1 byte (CPRR) or 4 bytes */
+#define ICP_MFRR        12 /* 1 byte access only */
+
+#define ICP_LINKA   16 /* unused */
+#define ICP_LINKB   20 /* unused */
+#define ICP_LINKC   24 /* unused */
+
+static uint64_t pnv_icp_read(void *opaque, hwaddr addr, unsigned width)
+{
+ICPState *icp = ICP(opaque);
+PnvICPState *picp = PNV_ICP(opaque);
+bool byte0 = (width == 1 && (addr & 0x3) == 0);
+uint64_t val = 0xffffffff;
+
+switch (addr & 0xffc) {
+case ICP_XIRR_POLL:
+val = icp_ipoll(icp, NULL);
+if (byte0) {
+val >>= 24;
+} else if (width != 4) {
+goto bad_access;
+}
+break;
+case ICP_XIRR:
+if (byte0) {
+val = icp_ipoll(icp, NULL) >> 24;
+} else if (width == 4) {
+val = icp_accept(icp);
+} else {
+goto bad_access;
+}
+break;
+case ICP_MFRR:
+if (byte0) {
+val = icp->mfrr;
+} else {
+goto bad_access;
+}
+break;
+case ICP_LINKA:
+if (width == 4) {
+val = picp->links[0];
+} else {
+goto bad_access;
+}
+break;
+case ICP_LINKB:
+if (width == 4) {
+val = picp->links[1];
+} else {
+goto bad_access;
+}
+break;
+case ICP_LINKC:
+if (width == 4) {
+val = picp->links[2];
+} else {
+goto bad_access;
+}
+break;
+default:
+bad_access:
+qemu_log_mask(LOG_GUEST_ERROR, "XICS: Bad ICP access 0x%"
+  HWADDR_PRIx"/%d\n", addr, width);
+}
+
+return val;
+}
+
+static void pnv_icp_write(void *opaque, hwaddr addr, uint64_t val,
+  unsigned width)
+{
+ICPState *icp = ICP(opaque);
+PnvICPState *picp = PNV_ICP(opaque);
+bool byte0 = (width == 1 && (addr & 0x3) == 0);
+
+switch (addr & 0xffc) {
+case ICP_XIRR:
+if (byte0) {
+icp_set_cppr(icp, val);
+} else if (width == 4) {
+icp_eoi(icp, val);
+} else {
+goto bad_access;
+}
+break;
+case ICP_MFRR:
+if (byte0) {
+icp_set_mfrr(icp, val);
+} else {
+goto bad_access;
+}
+break;
+case ICP_LINKA:
+if (width == 4) {
+picp->links[0] = val;
+} else {
+goto bad_access;
+}
+break;
+case ICP_LINKB:
+if (width == 4) {
+picp->links[1] = val;
+} else 

[Qemu-devel] [PULL 28/47] ppc/pnv: Add OCC model stub with interrupt support

2017-04-23 Thread David Gibson
From: Benjamin Herrenschmidt 

The OCC is an on-chip microcontroller based on a ppc405 core used
for various power management tasks. It comes with a pile of additional
hardware sitting on the PIB (aka XSCOM bus). At this point we don't
emulate it (nor plan to do so). However there is one facility which
is provided by the surrounding hardware that we do need, which is the
interrupt generation facility. OPAL uses it to send itself interrupts
under some circumstances and there are other uses around the corner.

So this implements just enough to support this.
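
A minimal sketch of the interrupt facility, assuming (from the register names
only) that the _OR and _AND aliases set and clear bits in the same underlying
register. ToyOcc and the toy_occ_* functions are hypothetical names; the level
is modelled as a boolean rather than a call into the PSI model, and any
masking of the stored value is left out.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct ToyOcc {
    uint64_t occmisc;
    bool irq;               /* level forwarded to the interrupt controller */
} ToyOcc;

/* Store the register and derive the interrupt level from its top bit. */
void toy_occ_set_misc(ToyOcc *occ, uint64_t val)
{
    occ->occmisc = val;
    occ->irq = (val >> 63) != 0;
}

/* Assumed semantics of the _OR alias: set the given bits. */
void toy_occ_write_or(ToyOcc *occ, uint64_t val)
{
    toy_occ_set_misc(occ, occ->occmisc | val);
}

/* Assumed semantics of the _AND alias: keep only the given bits. */
void toy_occ_write_and(ToyOcc *occ, uint64_t val)
{
    toy_occ_set_misc(occ, occ->occmisc & val);
}
```

Firmware could then raise the interrupt by OR-ing in bit 63 and lower it by
AND-ing with the complement.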

Signed-off-by: Benjamin Herrenschmidt 
[clg: - updated for qemu-2.9
  - changed the XSCOM interface to fit new model
  - QOMified the model ]
Signed-off-by: Cédric Le Goater 
Reviewed-by: David Gibson 
Signed-off-by: David Gibson 
---
 hw/ppc/Makefile.objs   |   2 +-
 hw/ppc/pnv.c   |  13 +
 hw/ppc/pnv_occ.c   | 136 +
 include/hw/ppc/pnv.h   |   2 +
 include/hw/ppc/pnv_occ.h   |  38 +
 include/hw/ppc/pnv_xscom.h |   3 +
 6 files changed, 193 insertions(+), 1 deletion(-)
 create mode 100644 hw/ppc/pnv_occ.c
 create mode 100644 include/hw/ppc/pnv_occ.h

diff --git a/hw/ppc/Makefile.objs b/hw/ppc/Makefile.objs
index dc19ee1..ef67ea8 100644
--- a/hw/ppc/Makefile.objs
+++ b/hw/ppc/Makefile.objs
@@ -6,7 +6,7 @@ obj-$(CONFIG_PSERIES) += spapr_hcall.o spapr_iommu.o spapr_rtas.o
 obj-$(CONFIG_PSERIES) += spapr_pci.o spapr_rtc.o spapr_drc.o spapr_rng.o
 obj-$(CONFIG_PSERIES) += spapr_cpu_core.o spapr_ovec.o
 # IBM PowerNV
-obj-$(CONFIG_POWERNV) += pnv.o pnv_xscom.o pnv_core.o pnv_lpc.o pnv_psi.o
+obj-$(CONFIG_POWERNV) += pnv.o pnv_xscom.o pnv_core.o pnv_lpc.o pnv_psi.o pnv_occ.o
 ifeq ($(CONFIG_PCI)$(CONFIG_PSERIES)$(CONFIG_LINUX), yyy)
 obj-y += spapr_pci_vfio.o
 endif
diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index a516acb..16f32c9 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -694,6 +694,11 @@ static void pnv_chip_init(Object *obj)
 object_property_add_child(obj, "psi", OBJECT(&chip->psi), NULL);
 object_property_add_const_link(OBJECT(&chip->psi), "xics",
 OBJECT(qdev_get_machine()), &error_abort);
+
+object_initialize(&chip->occ, sizeof(chip->occ), TYPE_PNV_OCC);
+object_property_add_child(obj, "occ", OBJECT(&chip->occ), NULL);
+object_property_add_const_link(OBJECT(&chip->occ), "psi",
+   OBJECT(&chip->psi), &error_abort);
 }
 
 static void pnv_chip_icp_realize(PnvChip *chip, Error **errp)
@@ -816,6 +821,14 @@ static void pnv_chip_realize(DeviceState *dev, Error 
**errp)
 return;
 }
 pnv_xscom_add_subregion(chip, PNV_XSCOM_PSIHB_BASE, &chip->psi.xscom_regs);
+
+/* Create the simplified OCC model */
+object_property_set_bool(OBJECT(&chip->occ), true, "realized", &error);
+if (error) {
+error_propagate(errp, error);
+return;
+}
+pnv_xscom_add_subregion(chip, PNV_XSCOM_OCC_BASE, &chip->occ.xscom_regs);
 }
 
 static Property pnv_chip_properties[] = {
diff --git a/hw/ppc/pnv_occ.c b/hw/ppc/pnv_occ.c
new file mode 100644
index 000..04880f2
--- /dev/null
+++ b/hw/ppc/pnv_occ.c
@@ -0,0 +1,136 @@
+/*
+ * QEMU PowerPC PowerNV Emulation of a few OCC related registers
+ *
+ * Copyright (c) 2015-2017, IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see .
+ */
+
+#include "qemu/osdep.h"
+#include "hw/hw.h"
+#include "sysemu/sysemu.h"
+#include "target/ppc/cpu.h"
+#include "qapi/error.h"
+#include "qemu/log.h"
+
+#include "hw/ppc/pnv.h"
+#include "hw/ppc/pnv_xscom.h"
+#include "hw/ppc/pnv_occ.h"
+
+#define OCB_OCI_OCCMISC 0x4020
+#define OCB_OCI_OCCMISC_AND 0x4021
+#define OCB_OCI_OCCMISC_OR  0x4022
+
+static void pnv_occ_set_misc(PnvOCC *occ, uint64_t val)
+{
+bool irq_state;
+
+val &= 0xull;
+
+occ->occmisc = val;
+irq_state = !!(val >> 63);
+pnv_psi_irq_set(occ->psi, PSIHB_IRQ_OCC, irq_state);
+}
+
+static uint64_t pnv_occ_xscom_read(void *opaque, hwaddr addr, unsigned size)
+{
+PnvOCC *occ = PNV_OCC(opaque);
+uint32_t offset = addr >> 3;
+uint64_t val = 0;
+
+switch (offset) {
+case OCB_OCI_OCCMISC:
+val = occ->occmisc;
+break;
+default:
+qemu_log_mask(LOG_UNIMP, "OCC Unimplemented register: Ox%"
+  HWADDR_PRIx 

[Qemu-devel] [PULL 38/47] ppc/pnv: scan ISA bus to populate device tree

2017-04-23 Thread David Gibson
From: Cédric Le Goater 

This is an empty shell that we will use to include nodes in the device
tree for ISA devices. We expect RTC, UART and IPMI BT devices.

Signed-off-by: Cédric Le Goater 
Reviewed-by: David Gibson 
Signed-off-by: David Gibson 
---
 hw/ppc/pnv.c | 28 
 1 file changed, 28 insertions(+)

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index 9468e99..c445906 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -303,6 +303,29 @@ static void powernv_populate_chip(PnvChip *chip, void *fdt)
 g_free(typename);
 }
 
+typedef struct ForeachPopulateArgs {
+void *fdt;
+int offset;
+} ForeachPopulateArgs;
+
+static int powernv_populate_isa_device(DeviceState *dev, void *opaque)
+{
+return 0;
+}
+
+static void powernv_populate_isa(ISABus *bus, void *fdt, int lpc_offset)
+{
+ForeachPopulateArgs args = {
+.fdt = fdt,
+.offset = lpc_offset,
+};
+
+/* ISA devices are not necessarily parented to the ISA bus so we
+ * can not use object_child_foreach() */
+qbus_walk_children(BUS(bus), powernv_populate_isa_device,
+   NULL, NULL, NULL, &args);
+}
+
 static void *powernv_create_fdt(MachineState *machine)
 {
 const char plat_compat[] = "qemu,powernv\0ibm,powernv";
@@ -311,6 +334,7 @@ static void *powernv_create_fdt(MachineState *machine)
 char *buf;
 int off;
 int i;
+int lpc_offset;
 
 fdt = g_malloc0(FDT_MAX_SIZE);
 _FDT((fdt_create_empty_tree(fdt, FDT_MAX_SIZE)));
@@ -350,6 +374,10 @@ static void *powernv_create_fdt(MachineState *machine)
 for (i = 0; i < pnv->num_chips; i++) {
 powernv_populate_chip(pnv->chips[i], fdt);
 }
+
+/* Populate ISA devices on chip 0 */
+lpc_offset = pnv_chip_lpc_offset(pnv->chips[0], fdt);
+powernv_populate_isa(pnv->isa_bus, fdt, lpc_offset);
 return fdt;
 }
 
-- 
2.9.3




[Qemu-devel] [PULL 43/47] ppc/pnv: generate an OEM SEL event on shutdown

2017-04-23 Thread David Gibson
From: Cédric Le Goater 

OpenPOWER systems expect to be notified with such an event before a
shutdown or a reboot. An OEM SEL message is sent with specific
identifiers and user data containing the request: OFF or REBOOT.

Signed-off-by: Cédric Le Goater 
Reviewed-by: David Gibson 
Signed-off-by: David Gibson 
---
 hw/ppc/pnv.c | 14 ++
 hw/ppc/pnv_bmc.c | 41 +
 include/hw/ppc/pnv.h |  2 ++
 3 files changed, 57 insertions(+)

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index 685eb12..d4bcdb0 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -485,6 +485,15 @@ static void *powernv_create_fdt(MachineState *machine)
 return fdt;
 }
 
+static void pnv_powerdown_notify(Notifier *n, void *opaque)
+{
+PnvMachineState *pnv = POWERNV_MACHINE(qdev_get_machine());
+
+if (pnv->bmc) {
+pnv_bmc_powerdown(pnv->bmc);
+}
+}
+
 static void ppc_powernv_reset(void)
 {
 MachineState *machine = MACHINE(qdev_get_machine());
@@ -638,6 +647,11 @@ static void ppc_powernv_init(MachineState *machine)
 
 /* Create an RTC ISA device too */
 rtc_init(pnv->isa_bus, 2000, NULL);
+
+/* OpenPOWER systems use an IPMI SEL Event message to notify the
+ * host to powerdown */
+pnv->powerdown_notifier.notify = pnv_powerdown_notify;
+qemu_register_powerdown_notifier(&pnv->powerdown_notifier);
 }
 
 /*
diff --git a/hw/ppc/pnv_bmc.c b/hw/ppc/pnv_bmc.c
index a0820dc..7b60b4c 100644
--- a/hw/ppc/pnv_bmc.c
+++ b/hw/ppc/pnv_bmc.c
@@ -32,6 +32,47 @@
 /* TODO: include definition in ipmi.h */
 #define IPMI_SDR_FULL_TYPE 1
 
+/*
+ * OEM SEL Event data packet sent by BMC in response to a Read Event
+ * Message Buffer command
+ */
+typedef struct OemSel {
+/* SEL header */
+uint8_t id[2];
+uint8_t type;
+uint8_t timestamp[4];
+uint8_t manuf_id[3];
+
+/* OEM SEL data (6 bytes) follows */
+uint8_t netfun;
+uint8_t cmd;
+uint8_t data[4];
+} OemSel;
+
+#define SOFT_OFF0x00
+#define SOFT_REBOOT 0x01
+
+static void pnv_gen_oem_sel(IPMIBmc *bmc, uint8_t reboot)
+{
+/* IPMI SEL events are 16 bytes long */
+OemSel sel = {
+.id= { 0x55 , 0x55 },
+.type  = 0xC0, /* OEM */
+.manuf_id  = { 0x0, 0x0, 0x0 },
+.timestamp = { 0x0, 0x0, 0x0, 0x0 },
+.netfun= 0x3A, /* IBM */
+.cmd   = 0x04, /* AMI OEM SEL Power Notification */
+.data  = { reboot, 0xFF, 0xFF, 0xFF },
+};
+
+ipmi_bmc_gen_event(bmc, (uint8_t *) &sel, 0 /* do not log the event */);
+}
+
+void pnv_bmc_powerdown(IPMIBmc *bmc)
+{
+pnv_gen_oem_sel(bmc, SOFT_OFF);
+}
+
 void pnv_bmc_populate_sensors(IPMIBmc *bmc, void *fdt)
 {
 int offset;
diff --git a/include/hw/ppc/pnv.h b/include/hw/ppc/pnv.h
index 02f6cf5..c1288f9 100644
--- a/include/hw/ppc/pnv.h
+++ b/include/hw/ppc/pnv.h
@@ -134,6 +134,7 @@ typedef struct PnvMachineState {
 uint32_t cpld_irqstate;
 
 IPMIBmc  *bmc;
+Notifier powerdown_notifier;
 } PnvMachineState;
 
 #define PNV_FDT_ADDR  0x0100
@@ -143,6 +144,7 @@ typedef struct PnvMachineState {
  * BMC helpers
  */
 void pnv_bmc_populate_sensors(IPMIBmc *bmc, void *fdt);
+void pnv_bmc_powerdown(IPMIBmc *bmc);
 
 /*
  * POWER8 MMIO base addresses
-- 
2.9.3
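
As a standalone illustration of the 16-byte OEM SEL record built in
pnv_gen_oem_sel() above, the sketch below repeats the field layout from
the patch and checks its size at compile time. The struct and the values
come from the patch; oem_sel_make() is a hypothetical helper written for
this example only and is not part of QEMU.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Field layout copied from the OemSel struct in the patch. */
typedef struct OemSel {
    uint8_t id[2];
    uint8_t type;
    uint8_t timestamp[4];
    uint8_t manuf_id[3];
    uint8_t netfun;
    uint8_t cmd;
    uint8_t data[4];
} OemSel;

/* Every member is a uint8_t, so no padding is expected. */
_Static_assert(sizeof(OemSel) == 16, "IPMI SEL events are 16 bytes long");

#define SOFT_OFF    0x00
#define SOFT_REBOOT 0x01

/* Hypothetical helper: fill a record the way pnv_gen_oem_sel() does. */
static OemSel oem_sel_make(uint8_t request)
{
    OemSel sel;

    memset(&sel, 0, sizeof(sel));   /* timestamp and manuf_id stay zero */
    sel.id[0] = 0x55;
    sel.id[1] = 0x55;
    sel.type = 0xC0;                /* OEM */
    sel.netfun = 0x3A;              /* IBM */
    sel.cmd = 0x04;                 /* power notification */
    sel.data[0] = request;          /* SOFT_OFF or SOFT_REBOOT */
    sel.data[1] = 0xFF;
    sel.data[2] = 0xFF;
    sel.data[3] = 0xFF;
    return sel;
}
```

Because the record is cast to a uint8_t pointer before being handed to
ipmi_bmc_gen_event(), the byte order of the members is what matters:
netfun ends up at offset 10 and the request byte at offset 12.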




[Qemu-devel] [PULL 40/47] ppc/pnv: populate device tree for serial devices

2017-04-23 Thread David Gibson
From: Cédric Le Goater 

Signed-off-by: Cédric Le Goater 
Reviewed-by: David Gibson 
Signed-off-by: David Gibson 
---
 hw/ppc/pnv.c | 33 +
 1 file changed, 33 insertions(+)

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index 8ab5bb1..dfa21e4 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -323,6 +323,37 @@ static void powernv_populate_rtc(ISADevice *d, void *fdt, 
int lpc_off)
 _FDT((fdt_setprop_string(fdt, node, "compatible", "pnpPNP,b00")));
 }
 
+static void powernv_populate_serial(ISADevice *d, void *fdt, int lpc_off)
+{
+const char compatible[] = "ns16550\0pnpPNP,501";
+uint32_t io_base = d->ioport_id;
+uint32_t io_regs[] = {
+cpu_to_be32(1),
+cpu_to_be32(io_base),
+cpu_to_be32(8)
+};
+char *name;
+int node;
+
+name = g_strdup_printf("%s@i%x", qdev_fw_name(DEVICE(d)), io_base);
+node = fdt_add_subnode(fdt, lpc_off, name);
+_FDT(node);
+g_free(name);
+
+_FDT((fdt_setprop(fdt, node, "reg", io_regs, sizeof(io_regs))));
+_FDT((fdt_setprop(fdt, node, "compatible", compatible,
+  sizeof(compatible))));
+
+_FDT((fdt_setprop_cell(fdt, node, "clock-frequency", 1843200)));
+_FDT((fdt_setprop_cell(fdt, node, "current-speed", 115200)));
+_FDT((fdt_setprop_cell(fdt, node, "interrupts", d->isairq[0])));
+_FDT((fdt_setprop_cell(fdt, node, "interrupt-parent",
+   fdt_get_phandle(fdt, lpc_off))));
+
+/* This is needed by Linux */
+_FDT((fdt_setprop_string(fdt, node, "device_type", "serial")));
+}
+
 typedef struct ForeachPopulateArgs {
 void *fdt;
 int offset;
@@ -335,6 +366,8 @@ static int powernv_populate_isa_device(DeviceState *dev, 
void *opaque)
 
 if (object_dynamic_cast(OBJECT(dev), TYPE_MC146818_RTC)) {
 powernv_populate_rtc(d, args->fdt, args->offset);
+} else if (object_dynamic_cast(OBJECT(dev), TYPE_ISA_SERIAL)) {
+powernv_populate_serial(d, args->fdt, args->offset);
 } else {
 error_report("unknown isa device %s@i%x", qdev_fw_name(dev),
  d->ioport_id);
-- 
2.9.3




[Qemu-devel] [PULL 24/47] ppc/pnv: create the ICP object under PnvCore

2017-04-23 Thread David Gibson
From: Cédric Le Goater 

Each thread of a core is linked to an ICP. This allocates a PnvICPState
object before the PowerPCCPU object is realized and lets the XICSFabric
do the store under the 'intc' backlink when xics_cpu_setup() is
called.

This modeling removes the need to maintain an array of ICP objects
under the PowerNV machine and also simplifies the XICSFabric icp_get()
handler.

Signed-off-by: Cédric Le Goater 
Reviewed-by: David Gibson 
Signed-off-by: David Gibson 
---
 hw/ppc/pnv.c  |  2 ++
 hw/ppc/pnv_core.c | 27 +--
 2 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index f3623ee..2add2ad 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -694,6 +694,8 @@ static void pnv_chip_realize(DeviceState *dev, Error **errp)
 object_property_set_int(OBJECT(pnv_core),
 pcc->core_pir(chip, core_hwid),
 "pir", &error_fatal);
+object_property_add_const_link(OBJECT(pnv_core), "xics",
+   qdev_get_machine(), &error_fatal);
 object_property_set_bool(OBJECT(pnv_core), true, "realized",
  &error_fatal);
 object_unref(OBJECT(pnv_core));
diff --git a/hw/ppc/pnv_core.c b/hw/ppc/pnv_core.c
index d79d530..1b7ec70 100644
--- a/hw/ppc/pnv_core.c
+++ b/hw/ppc/pnv_core.c
@@ -25,6 +25,7 @@
 #include "hw/ppc/pnv.h"
 #include "hw/ppc/pnv_core.h"
 #include "hw/ppc/pnv_xscom.h"
+#include "hw/ppc/xics.h"
 
 static void powernv_cpu_reset(void *opaque)
 {
@@ -110,23 +111,37 @@ static const MemoryRegionOps pnv_core_xscom_ops = {
 .endianness = DEVICE_BIG_ENDIAN,
 };
 
-static void pnv_core_realize_child(Object *child, Error **errp)
+static void pnv_core_realize_child(Object *child, XICSFabric *xi, Error **errp)
 {
 Error *local_err = NULL;
 CPUState *cs = CPU(child);
 PowerPCCPU *cpu = POWERPC_CPU(cs);
+Object *obj;
+
+obj = object_new(TYPE_PNV_ICP);
+object_property_add_child(OBJECT(cpu), "icp", obj, NULL);
+object_property_add_const_link(obj, "xics", OBJECT(xi), &error_abort);
+object_property_set_bool(obj, true, "realized", &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+return;
+}
 
 object_property_set_bool(child, true, "realized", &local_err);
 if (local_err) {
+object_unparent(obj);
 error_propagate(errp, local_err);
 return;
 }
 
 powernv_cpu_init(cpu, &local_err);
 if (local_err) {
+object_unparent(obj);
 error_propagate(errp, local_err);
 return;
 }
+
+xics_cpu_setup(xi, cpu, ICP(obj));
 }
 
 static void pnv_core_realize(DeviceState *dev, Error **errp)
@@ -140,6 +155,14 @@ static void pnv_core_realize(DeviceState *dev, Error 
**errp)
 void *obj;
 int i, j;
 char name[32];
+Object *xi;
+
+xi = object_property_get_link(OBJECT(dev), "xics", &local_err);
+if (!xi) {
+error_setg(errp, "%s: required link 'xics' not found: %s",
+   __func__, error_get_pretty(local_err));
+return;
+}
 
 pc->threads = g_malloc0(size * cc->nr_threads);
 for (i = 0; i < cc->nr_threads; i++) {
@@ -160,7 +183,7 @@ static void pnv_core_realize(DeviceState *dev, Error **errp)
 for (j = 0; j < cc->nr_threads; j++) {
 obj = pc->threads + j * size;
 
-pnv_core_realize_child(obj, &local_err);
+pnv_core_realize_child(obj, XICS_FABRIC(xi), &local_err);
 if (local_err) {
 goto err;
 }
-- 
2.9.3




[Qemu-devel] [PULL 30/47] ipmi: use a file to load SDRs

2017-04-23 Thread David Gibson
From: Cédric Le Goater 

The IPMI BMC simulator populates the sdr/sensor tables with a minimal
set of entries (Watchdog). But some qemu platforms might want to use
extra entries for their custom needs.

This patch slightly modifies the initialization routine to take into
account a larger set read from a file. The name of the file to use is
defined through a new 'sdrfile' property of the simulator device.

Signed-off-by: Cédric Le Goater 
Acked-by: Corey Minyard 
Reviewed-by: Marcel Apfelbaum 
Signed-off-by: David Gibson 
---
 hw/ipmi/ipmi_bmc_sim.c | 23 +--
 qemu-options.hx| 11 ++-
 2 files changed, 31 insertions(+), 3 deletions(-)

diff --git a/hw/ipmi/ipmi_bmc_sim.c b/hw/ipmi/ipmi_bmc_sim.c
index c7883d6..1142c0c 100644
--- a/hw/ipmi/ipmi_bmc_sim.c
+++ b/hw/ipmi/ipmi_bmc_sim.c
@@ -27,6 +27,7 @@
 #include "qemu/timer.h"
 #include "hw/ipmi/ipmi.h"
 #include "qemu/error-report.h"
+#include "hw/loader.h"
 
 #define IPMI_NETFN_CHASSIS0x00
 
@@ -213,6 +214,7 @@ struct IPMIBmcSim {
 IPMISel sel;
 IPMISdr sdr;
 IPMISensor sensors[MAX_SENSORS];
+char *sdr_filename;
 
 /* Odd netfns are for responses, so we only need the even ones. */
 const IPMINetfn *netfns[MAX_NETFNS / 2];
@@ -1696,22 +1698,33 @@ static void ipmi_sdr_init(IPMIBmcSim *ibs)
 
 sdrs_size = sizeof(init_sdrs);
 sdrs = init_sdrs;
+if (ibs->sdr_filename &&
+!g_file_get_contents(ibs->sdr_filename, (gchar **) &sdrs, &sdrs_size,
+ NULL)) {
+error_report("failed to load sdr file '%s'", ibs->sdr_filename);
+sdrs_size = sizeof(init_sdrs);
+sdrs = init_sdrs;
+}
 
 for (i = 0; i < sdrs_size; i += len) {
 struct ipmi_sdr_header *sdrh;
 
 if (i + IPMI_SDR_HEADER_SIZE > sdrs_size) {
 error_report("Problem with recid 0x%4.4x", i);
-return;
+break;
 }
 sdrh = (struct ipmi_sdr_header *) &sdrs[i];
 len = ipmi_sdr_length(sdrh);
 if (i + len > sdrs_size) {
 error_report("Problem with recid 0x%4.4x", i);
-return;
+break;
 }
 sdr_add_entry(ibs, sdrh, len, NULL);
 }
+
+if (sdrs != init_sdrs) {
+g_free(sdrs);
+}
 }
 
 static const VMStateDescription vmstate_ipmi_sim = {
@@ -1780,6 +1793,11 @@ static void ipmi_sim_realize(DeviceState *dev, Error 
**errp)
 vmstate_register(NULL, 0, &vmstate_ipmi_sim, ibs);
 }
 
+static Property ipmi_sim_properties[] = {
+DEFINE_PROP_STRING("sdrfile", IPMIBmcSim, sdr_filename),
+DEFINE_PROP_END_OF_LIST(),
+};
+
 static void ipmi_sim_class_init(ObjectClass *oc, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(oc);
@@ -1787,6 +1805,7 @@ static void ipmi_sim_class_init(ObjectClass *oc, void 
*data)
 
 dc->hotpluggable = false;
 dc->realize = ipmi_sim_realize;
+dc->props = ipmi_sim_properties;
 bk->handle_command = ipmi_sim_handle_command;
 }
 
diff --git a/qemu-options.hx b/qemu-options.hx
index 9171bd5..366b088 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -425,7 +425,7 @@ possible drivers and properties, use @code{-device help} and
 @code{-device @var{driver},help}.
 
 Some drivers are:
-@item -device ipmi-bmc-sim,id=@var{id}[,slave_addr=@var{val}]
+@item -device 
ipmi-bmc-sim,id=@var{id}[,slave_addr=@var{val}][,sdrfile=@var{file}]
 
 Add an IPMI BMC.  This is a simulation of a hardware management
 interface processor that normally sits on a system.  It provides
@@ -437,6 +437,15 @@ This address is the BMC's address on the I2C network of 
management
 controllers.  If you don't know what this means, it is safe to ignore
 it.
 
+@table @option
+@item bmc=@var{id}
+The BMC to connect to, one of ipmi-bmc-sim or ipmi-bmc-extern above.
+@item slave_addr=@var{val}
+Define slave address to use for the BMC.  The default is 0x20.
+@item sdrfile=@var{file}
+File containing raw Sensor Data Records (SDR) data.  The default is none.
+@end table
+
 @item -device 
ipmi-bmc-extern,id=@var{id},chardev=@var{id}[,slave_addr=@var{val}]
 
 Add a connection to an external IPMI BMC simulator.  Instead of
-- 
2.9.3




[Qemu-devel] [PULL 25/47] ppc/pnv: add a helper to calculate MMIO addresses registers

2017-04-23 Thread David Gibson
From: Cédric Le Goater 

Some controllers (ICP, PSI) have a base register address which is
calculated using the chip id.

Signed-off-by: Cédric Le Goater 
Reviewed-by: David Gibson 
Signed-off-by: David Gibson 
---
 include/hw/ppc/pnv.h | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/include/hw/ppc/pnv.h b/include/hw/ppc/pnv.h
index df98a72..5693ba1 100644
--- a/include/hw/ppc/pnv.h
+++ b/include/hw/ppc/pnv.h
@@ -91,14 +91,24 @@ typedef struct PnvChipClass {
 OBJECT_CHECK(PnvChip, (obj), TYPE_PNV_CHIP_POWER9)
 
 /*
- * This generates a HW chip id depending on an index:
+ * This generates a HW chip id depending on an index, as found on a
+ * two socket system with dual chip modules :
  *
  *0x0, 0x1, 0x10, 0x11
  *
  * 4 chips should be the maximum
+ *
+ * TODO: use a machine property to define the chip ids
  */
 #define PNV_CHIP_HWID(i) ((((i) & 0x3e) << 3) | ((i) & 0x1))
 
+/*
+ * Converts back a HW chip id to an index. This is useful to calculate
+ * the MMIO addresses of some controllers which depend on the chip id.
+ */
+#define PNV_CHIP_INDEX(chip)\
+(((chip)->chip_id >> 2) * 2 + ((chip)->chip_id & 0x3))
+
 #define TYPE_POWERNV_MACHINE   MACHINE_TYPE_NAME("powernv")
 #define POWERNV_MACHINE(obj) \
 OBJECT_CHECK(PnvMachineState, (obj), TYPE_POWERNV_MACHINE)
-- 
2.9.3
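
A quick worked example of the two macros, evaluated outside QEMU.
DemoChip and demo_index_for_slot() are hypothetical stand-ins for this
sketch; only the macro bodies are taken from the header.

```c
#include <assert.h>
#include <stdint.h>

/* Macro as in include/hw/ppc/pnv.h, with the parentheses written out. */
#define PNV_CHIP_HWID(i) ((((i) & 0x3e) << 3) | ((i) & 0x1))

/* Stand-in for PnvChip: only the field the index macro reads. */
typedef struct DemoChip {
    uint32_t chip_id;
} DemoChip;

/* Macro as added by the patch. */
#define PNV_CHIP_INDEX(chip) \
    (((chip)->chip_id >> 2) * 2 + ((chip)->chip_id & 0x3))

/* Hypothetical helper: index computed for the chip created at slot i. */
static uint32_t demo_index_for_slot(int i)
{
    DemoChip chip = { .chip_id = PNV_CHIP_HWID(i) };
    return PNV_CHIP_INDEX(&chip);
}
```

Working it through by hand, slots 0..3 give HW ids 0x0, 0x1, 0x10 and
0x11, matching the comment in the header. Feeding those ids to
PNV_CHIP_INDEX() gives 0, 1, 8 and 9, so as written the result is not
the original slot number for the second pair of chips. That may well be
intended for the MMIO address calculation, but it is worth keeping in
mind when reading "converts back".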




[Qemu-devel] [PULL 11/47] spapr: Enable ISA 3.0 MMU mode selection via CAS

2017-04-23 Thread David Gibson
From: Sam Bobroff 

Add the new node, /chosen/ibm,arch-vec-5-platform-support to the
device tree. This allows the guest to determine which modes are
supported by the hypervisor.

Update the option vector processing in h_client_architecture_support()
to handle the new MMU bits. This allows guests to request hash or
radix mode and QEMU to create the guest's HPT at this time if it is
necessary but hasn't yet been done.  QEMU will terminate the guest if
it requests an unavailable mode, as required by the architecture.

Extend the ibm,pa-features node with the new ISA 3.0 values
and set the radix bit if KVM supports radix mode. This probably won't
be used directly by guests to determine the availability of radix mode
(that is indicated by the new node added above) but the architecture
requires that it be set when the hardware supports it.

If QEMU is using KVM, and KVM is capable of running in radix mode,
guests can be run in real-mode without allocating a HPT (because KVM
will use a minimal RPT). So in this case, we avoid creating the HPT
at reset time and later (during CAS) create it if it is necessary.

ISA 3.0 guests will now begin to call h_register_process_table(),
which has been added previously.

Signed-off-by: Sam Bobroff 
[dwg: Strip some unneeded prefix from error messages]
Signed-off-by: David Gibson 
---
 hw/ppc/spapr.c  | 71 -
 hw/ppc/spapr_hcall.c| 30 +++
 include/hw/ppc/spapr_ovec.h |  5 
 3 files changed, 93 insertions(+), 13 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 21da9a1..d967ec3 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -237,20 +237,31 @@ static void spapr_populate_pa_features(CPUPPCState *env, 
void *fdt, int offset)
 0x80, 0x00, 0x00, 0x00, 0x00, 0x00,
 0x00, 0x00, 0x00, 0x00, 0x80, 0x00,
 0x80, 0x00, 0x80, 0x00, 0x00, 0x00 };
-/* Currently we don't advertise any of the "new" ISAv3.00 functionality */
-uint8_t pa_features_300[] = { 64, 0,
-0xf6, 0x1f, 0xc7, 0xc0, 0x80, 0xf0, /*  0 -  5 */
-0x80, 0x00, 0x00, 0x00, 0x00, 0x00, /*  6 - 11 */
+uint8_t pa_features_300[] = { 66, 0,
+/* 0: MMU|FPU|SLB|RUN|DABR|NX, 1: fri[nzpm]|DABRX|SPRG3|SLB0|PP110 */
+/* 2: VPM|DS205|PPR|DS202|DS206, 3: LSD|URG, SSO, 5: LE|CFAR|EB|LSQ */
+0xf6, 0x1f, 0xc7, 0xc0, 0x80, 0xf0, /* 0 - 5 */
+/* 6: DS207 */
+0x80, 0x00, 0x00, 0x00, 0x00, 0x00, /* 6 - 11 */
+/* 16: Vector */
 0x00, 0x00, 0x00, 0x00, 0x80, 0x00, /* 12 - 17 */
-0x80, 0x00, 0x80, 0x00, 0x00, 0x00, /* 18 - 23 */
-0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* 24 - 29 */
-0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* 30 - 35 */
-0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* 36 - 41 */
-0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* 42 - 47 */
-0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* 48 - 53 */
-0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* 54 - 59 */
-0x00, 0x00, 0x00, 0x00   }; /* 60 - 63 */
-
+/* 18: Vec. Scalar, 20: Vec. XOR, 22: HTM */
+0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 18 - 23 */
+/* 24: Ext. Dec, 26: 64 bit ftrs, 28: PM ftrs */
+0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 24 - 29 */
+/* 30: MMR, 32: LE atomic, 34: EBB + ext EBB */
+0x80, 0x00, 0x80, 0x00, 0xC0, 0x00, /* 30 - 35 */
+/* 36: SPR SO, 38: Copy/Paste, 40: Radix MMU */
+0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 36 - 41 */
+/* 42: PM, 44: PC RA, 46: SC vec'd */
+0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 42 - 47 */
+/* 48: SIMD, 50: QP BFP, 52: String */
+0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 48 - 53 */
+/* 54: DecFP, 56: DecI, 58: SHA */
+0x80, 0x00, 0x80, 0x00, 0x80, 0x00, /* 54 - 59 */
+/* 60: NM atomic, 62: RNG */
+0x80, 0x00, 0x80, 0x00, 0x00, 0x00, /* 60 - 65 */
+};
 uint8_t *pa_features;
 size_t pa_size;
 
@@ -856,6 +867,33 @@ static void spapr_dt_rtas(sPAPRMachineState *spapr, void 
*fdt)
 spapr_dt_rtas_tokens(fdt, rtas);
 }
 
+/* Prepare ibm,arch-vec-5-platform-support, which indicates the MMU features
+ * that the guest may request and thus the valid values for bytes 24..26 of
+ * option vector 5: */
+static void spapr_dt_ov5_platform_support(void *fdt, int chosen)
+{
+char val[2 * 3] = {
+24, 0x00, /* Hash/Radix, filled in below. */
+25, 0x00, /* Hash options: Segment Tables == no, GTSE == no. */
+26, 0x40, /* Radix options: GTSE == yes. */
+};
+
+if (kvm_enabled()) {
+if (kvmppc_has_cap_mmu_radix() && kvmppc_has_cap_mmu_hash_v3()) {
+val[1] = 0x80; /* OV5_MMU_BOTH */
+} else if (kvmppc_has_cap_mmu_radix()) {
+val[1] = 0x40; /* OV5_MMU_RADIX_300 */
+} else {
+val[1] = 0x00; /* Hash */
+}
+} else {
+   

[Qemu-devel] [PULL 17/47] ppc/xics: introduce an 'intc' backlink under PowerPCCPU

2017-04-23 Thread David Gibson
From: Cédric Le Goater 

Today, the ICPState array of the sPAPR machine is indexed with
'cpu_index' of the CPUState. This numbering of CPUs is internal to
QEMU and the guest only knows about what is exposed in the device
tree, that is the 'cpu_dt_id'. This is why sPAPR uses the helper
xics_get_cpu_index_by_dt_id() to do the mapping in a couple of places.

To provide a more generic XICS layer, we need to abstract the IRQ
'server' number and remove any assumption made on its nature. It
should not be used as a 'cpu_index' for lookups like xics_cpu_setup()
and xics_cpu_destroy() do.

To reach that goal, we choose to introduce a generic 'intc' backlink
under PowerPCCPU, and let the machine core init routine do the
ICPState lookup. The resulting object is passed on to xics_cpu_setup()
which does the store under PowerPCCPU. The IRQ 'server' number in XICS
is now generic. sPAPR uses 'cpu_dt_id' and PowerNV will use 'PIR'
number.

This also has the benefit of simplifying the sPAPR hcall routines
which do not need to do any ICPState lookups anymore.

Signed-off-by: Cédric Le Goater 
Signed-off-by: David Gibson 
---
 hw/intc/xics.c  |  6 +++---
 hw/intc/xics_spapr.c| 20 +---
 hw/ppc/spapr_cpu_core.c |  4 +++-
 include/hw/ppc/xics.h   |  2 +-
 target/ppc/cpu.h|  1 +
 5 files changed, 13 insertions(+), 20 deletions(-)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index e740989..56fe70c 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -52,7 +52,7 @@ int xics_get_cpu_index_by_dt_id(int cpu_dt_id)
 void xics_cpu_destroy(XICSFabric *xi, PowerPCCPU *cpu)
 {
 CPUState *cs = CPU(cpu);
-ICPState *icp = xics_icp_get(xi, cs->cpu_index);
+ICPState *icp = ICP(cpu->intc);
 
 assert(icp);
 assert(cs == icp->cs);
@@ -61,15 +61,15 @@ void xics_cpu_destroy(XICSFabric *xi, PowerPCCPU *cpu)
 icp->cs = NULL;
 }
 
-void xics_cpu_setup(XICSFabric *xi, PowerPCCPU *cpu)
+void xics_cpu_setup(XICSFabric *xi, PowerPCCPU *cpu, ICPState *icp)
 {
 CPUState *cs = CPU(cpu);
 CPUPPCState *env = &cpu->env;
-ICPState *icp = xics_icp_get(xi, cs->cpu_index);
 ICPStateClass *icpc;
 
 assert(icp);
 
+cpu->intc = OBJECT(icp);
 icp->cs = cs;
 
 icpc = ICP_GET_CLASS(icp);
diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
index 84d24b2..58f100d 100644
--- a/hw/intc/xics_spapr.c
+++ b/hw/intc/xics_spapr.c
@@ -43,11 +43,9 @@
 static target_ulong h_cppr(PowerPCCPU *cpu, sPAPRMachineState *spapr,
target_ulong opcode, target_ulong *args)
 {
-CPUState *cs = CPU(cpu);
-ICPState *icp = xics_icp_get(XICS_FABRIC(spapr), cs->cpu_index);
 target_ulong cppr = args[0];
 
-icp_set_cppr(icp, cppr);
+icp_set_cppr(ICP(cpu->intc), cppr);
 return H_SUCCESS;
 }
 
@@ -69,9 +67,7 @@ static target_ulong h_ipi(PowerPCCPU *cpu, sPAPRMachineState 
*spapr,
 static target_ulong h_xirr(PowerPCCPU *cpu, sPAPRMachineState *spapr,
target_ulong opcode, target_ulong *args)
 {
-CPUState *cs = CPU(cpu);
-ICPState *icp = xics_icp_get(XICS_FABRIC(spapr), cs->cpu_index);
-uint32_t xirr = icp_accept(icp);
+uint32_t xirr = icp_accept(ICP(cpu->intc));
 
 args[0] = xirr;
 return H_SUCCESS;
@@ -80,9 +76,7 @@ static target_ulong h_xirr(PowerPCCPU *cpu, sPAPRMachineState 
*spapr,
 static target_ulong h_xirr_x(PowerPCCPU *cpu, sPAPRMachineState *spapr,
  target_ulong opcode, target_ulong *args)
 {
-CPUState *cs = CPU(cpu);
-ICPState *icp = xics_icp_get(XICS_FABRIC(spapr), cs->cpu_index);
-uint32_t xirr = icp_accept(icp);
+uint32_t xirr = icp_accept(ICP(cpu->intc));
 
 args[0] = xirr;
 args[1] = cpu_get_host_ticks();
@@ -92,21 +86,17 @@ static target_ulong h_xirr_x(PowerPCCPU *cpu, 
sPAPRMachineState *spapr,
 static target_ulong h_eoi(PowerPCCPU *cpu, sPAPRMachineState *spapr,
   target_ulong opcode, target_ulong *args)
 {
-CPUState *cs = CPU(cpu);
-ICPState *icp = xics_icp_get(XICS_FABRIC(spapr), cs->cpu_index);
 target_ulong xirr = args[0];
 
-icp_eoi(icp, xirr);
+icp_eoi(ICP(cpu->intc), xirr);
 return H_SUCCESS;
 }
 
 static target_ulong h_ipoll(PowerPCCPU *cpu, sPAPRMachineState *spapr,
 target_ulong opcode, target_ulong *args)
 {
-CPUState *cs = CPU(cpu);
-ICPState *icp = xics_icp_get(XICS_FABRIC(spapr), cs->cpu_index);
 uint32_t mfrr;
-uint32_t xirr = icp_ipoll(icp, &mfrr);
+uint32_t xirr = icp_ipoll(ICP(cpu->intc), &mfrr);
 
 args[0] = xirr;
 args[1] = mfrr;
diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
index 6883f09..7db61bd 100644
--- a/hw/ppc/spapr_cpu_core.c
+++ b/hw/ppc/spapr_cpu_core.c
@@ -63,6 +63,8 @@ static void spapr_cpu_init(sPAPRMachineState *spapr, 
PowerPCCPU *cpu,
Error **errp)
 {
 CPUPPCState *env = &cpu->env;
+XICSFabric *xi = 

[Qemu-devel] [PULL 32/47] ipmi: introduce an ipmi_bmc_sdr_find() API

2017-04-23 Thread David Gibson
From: Cédric Le Goater 

This patch exposes a new IPMI routine to query a sdr entry from the
sdr table maintained by the IPMI BMC simulator. The API is very
similar to the internal sdr_find_entry() routine and should be used
the same way to query one or all sdrs.

A typical use would be to loop on the sdrs to build nodes of a device
tree.

Signed-off-by: Cédric Le Goater 
Acked-by: Corey Minyard 
Signed-off-by: David Gibson 
---
 hw/ipmi/ipmi_bmc_sim.c | 16 
 include/hw/ipmi/ipmi.h |  2 ++
 2 files changed, 18 insertions(+)

diff --git a/hw/ipmi/ipmi_bmc_sim.c b/hw/ipmi/ipmi_bmc_sim.c
index eae7b2d..8185a84 100644
--- a/hw/ipmi/ipmi_bmc_sim.c
+++ b/hw/ipmi/ipmi_bmc_sim.c
@@ -416,6 +416,22 @@ static int sdr_find_entry(IPMISdr *sdr, uint16_t recid,
 return 1;
 }
 
+int ipmi_bmc_sdr_find(IPMIBmc *b, uint16_t recid,
+  const struct ipmi_sdr_compact **sdr, uint16_t *nextrec)
+
+{
+IPMIBmcSim *ibs = IPMI_BMC_SIMULATOR(b);
+unsigned int pos;
+
+pos = 0;
+if (sdr_find_entry(&ibs->sdr, recid, &pos, nextrec)) {
+return -1;
+}
+
+*sdr = (const struct ipmi_sdr_compact *) &ibs->sdr.sdr[pos];
+return 0;
+}
+
 static void sel_inc_reservation(IPMISel *sel)
 {
 sel->reservation++;
diff --git a/include/hw/ipmi/ipmi.h b/include/hw/ipmi/ipmi.h
index 91b83b5..0d36cfc 100644
--- a/include/hw/ipmi/ipmi.h
+++ b/include/hw/ipmi/ipmi.h
@@ -259,4 +259,6 @@ struct ipmi_sdr_compact {
 
 typedef uint8_t ipmi_sdr_compact_buffer[sizeof(struct ipmi_sdr_compact)];
 
+int ipmi_bmc_sdr_find(IPMIBmc *b, uint16_t recid,
+  const struct ipmi_sdr_compact **sdr, uint16_t *nextrec);
 #endif
-- 
2.9.3
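
The "loop on the sdrs" usage mentioned in the commit message could look
roughly like the sketch below. It does not call the real API:
demo_sdr_find() is a hypothetical stand-in with the same return
convention as ipmi_bmc_sdr_find() (0 on success, -1 when the record id
is unknown), operating on a plain byte table. The 5-byte header layout
and the use of 0xffff as the "no next record" marker are assumptions on
my side, and record ids are assumed to be unique.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_HDR 5                      /* id[2], version, type, body length */

/* Hypothetical stand-in for the simulator's SDR table. */
typedef struct DemoSdrTable {
    const uint8_t *sdr;
    size_t next_free;                   /* bytes in use */
} DemoSdrTable;

/*
 * Same calling convention as ipmi_bmc_sdr_find(): returns 0 and sets
 * *rec and *nextrec when recid exists, -1 otherwise.
 * Assumption: *nextrec is 0xffff after the last record.
 */
static int demo_sdr_find(const DemoSdrTable *t, uint16_t recid,
                         const uint8_t **rec, uint16_t *nextrec)
{
    size_t pos = 0;

    while (pos + DEMO_HDR <= t->next_free) {
        uint16_t id = (uint16_t)(t->sdr[pos] | (t->sdr[pos + 1] << 8));
        size_t next = pos + DEMO_HDR + t->sdr[pos + 4];

        if (id == recid) {
            *rec = &t->sdr[pos];
            *nextrec = next + 2 <= t->next_free
                ? (uint16_t)(t->sdr[next] | (t->sdr[next + 1] << 8))
                : 0xffff;
            return 0;
        }
        pos = next;
    }
    return -1;
}

/* Visit every record starting from id 0; returns how many were seen. */
static int demo_sdr_count(const DemoSdrTable *t)
{
    uint16_t id = 0, next;
    const uint8_t *rec;
    int n = 0;

    while (id != 0xffff && demo_sdr_find(t, id, &rec, &next) == 0) {
        n++;
        id = next;
    }
    return n;
}
```

A device tree builder would replace the n++ with whatever it needs to
do per record, for example adding one sensor node per entry.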




[Qemu-devel] [PULL 22/47] ppc/pnv: extend the machine with a XICSFabric interface

2017-04-23 Thread David Gibson
From: Cédric Le Goater 

A XICSFabric QOM interface is used by the XICS layer to manipulate the
ICP and ICS objects. Let's define the associated handlers for the
PowerNV machine. All handlers should be defined even if there is no
ICS under the PowerNV machine yet.

Signed-off-by: Cédric Le Goater 
Signed-off-by: David Gibson 
---
 hw/ppc/pnv.c | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index aad7917..0a0cfe3 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -34,6 +34,7 @@
 #include "qemu/cutils.h"
 #include "qapi/visitor.h"
 
+#include "hw/ppc/xics.h"
 #include "hw/ppc/pnv_xscom.h"
 
 #include "hw/isa/isa.h"
@@ -738,6 +739,29 @@ static const TypeInfo pnv_chip_info = {
 .abstract  = true,
 };
 
+static PowerPCCPU *ppc_get_vcpu_by_pir(int pir)
+{
+CPUState *cs;
+
+CPU_FOREACH(cs) {
+PowerPCCPU *cpu = POWERPC_CPU(cs);
+CPUPPCState *env = &cpu->env;
+
+if (env->spr_cb[SPR_PIR].default_value == pir) {
+return cpu;
+}
+}
+
+return NULL;
+}
+
+static ICPState *pnv_icp_get(XICSFabric *xi, int pir)
+{
+PowerPCCPU *cpu = ppc_get_vcpu_by_pir(pir);
+
+return cpu ? ICP(cpu->intc) : NULL;
+}
+
 static void pnv_get_num_chips(Object *obj, Visitor *v, const char *name,
   void *opaque, Error **errp)
 {
@@ -788,6 +812,7 @@ static void powernv_machine_class_props_init(ObjectClass 
*oc)
 static void powernv_machine_class_init(ObjectClass *oc, void *data)
 {
 MachineClass *mc = MACHINE_CLASS(oc);
+XICSFabricClass *xic = XICS_FABRIC_CLASS(oc);
 
 mc->desc = "IBM PowerNV (Non-Virtualized)";
 mc->init = ppc_powernv_init;
@@ -798,6 +823,7 @@ static void powernv_machine_class_init(ObjectClass *oc, 
void *data)
 mc->no_parallel = 1;
 mc->default_boot_order = NULL;
 mc->default_ram_size = 1 * G_BYTE;
+xic->icp_get = pnv_icp_get;
 
 powernv_machine_class_props_init(oc);
 }
@@ -808,6 +834,10 @@ static const TypeInfo powernv_machine_info = {
 .instance_size = sizeof(PnvMachineState),
 .instance_init = powernv_machine_initfn,
 .class_init= powernv_machine_class_init,
+.interfaces = (InterfaceInfo[]) {
+{ TYPE_XICS_FABRIC },
+{ },
+},
 };
 
 static void powernv_machine_register_types(void)
-- 
2.9.3




[Qemu-devel] [PULL 19/47] spapr: allocate the ICPState object from under sPAPRCPUCore

2017-04-23 Thread David Gibson
From: Cédric Le Goater 

Today, all the ICPs are created before the CPUs, stored in an array
under the sPAPR machine and linked to the CPU when the core threads
are realized. This modeling brings some complexity when a lookup in
the array is required and it can be simplified by allocating the ICPs
when the CPUs are.

This is the purpose of this proposal which introduces a new 'icp_type'
field under the machine and creates the ICP objects of the right type
(KVM or not) before the PowerPCCPU objects are.

This change allows more cleanups: the removal of the icps array under
the sPAPR machine and the removal of the xics_get_cpu_index_by_dt_id()
helper.

Signed-off-by: Cédric Le Goater 
Reviewed-by: David Gibson 
Signed-off-by: David Gibson 
---
 hw/intc/xics.c  | 11 ---
 hw/ppc/spapr.c  | 47 ++-
 hw/ppc/spapr_cpu_core.c | 18 ++
 include/hw/ppc/spapr.h  |  2 +-
 include/hw/ppc/xics.h   |  2 --
 5 files changed, 29 insertions(+), 51 deletions(-)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index 56fe70c..d4428b4 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -38,17 +38,6 @@
 #include "monitor/monitor.h"
 #include "hw/intc/intc.h"
 
-int xics_get_cpu_index_by_dt_id(int cpu_dt_id)
-{
-PowerPCCPU *cpu = ppc_get_vcpu_by_dt_id(cpu_dt_id);
-
-if (cpu) {
-return cpu->parent_obj.cpu_index;
-}
-
-return -1;
-}
-
 void xics_cpu_destroy(XICSFabric *xi, PowerPCCPU *cpu)
 {
 CPUState *cs = CPU(cpu);
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 08f8615..703b14a 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -104,7 +104,6 @@ static int try_create_xics(sPAPRMachineState *spapr, const 
char *type_ics,
 XICSFabric *xi = XICS_FABRIC(spapr);
 Error *err = NULL, *local_err = NULL;
 ICSState *ics = NULL;
-int i;
 
 ics = ICS_SIMPLE(object_new(type_ics));
 object_property_add_child(OBJECT(spapr), "ics", OBJECT(ics), NULL);
@@ -113,34 +112,14 @@ static int try_create_xics(sPAPRMachineState *spapr, 
const char *type_ics,
 object_property_set_bool(OBJECT(ics), true, "realized", &local_err);
 error_propagate(&err, local_err);
 if (err) {
-goto error;
+error_propagate(errp, err);
+return -1;
 }
 
-spapr->icps = g_malloc0(nr_servers * sizeof(ICPState));
 spapr->nr_servers = nr_servers;
-
-for (i = 0; i < nr_servers; i++) {
-ICPState *icp = &spapr->icps[i];
-
-object_initialize(icp, sizeof(*icp), type_icp);
-object_property_add_child(OBJECT(spapr), "icp[*]", OBJECT(icp), NULL);
-object_property_add_const_link(OBJECT(icp), "xics", OBJECT(xi), NULL);
-object_property_set_bool(OBJECT(icp), true, "realized", &err);
-if (err) {
-goto error;
-}
-object_unref(OBJECT(icp));
-}
-
 spapr->ics = ics;
+spapr->icp_type = type_icp;
 return 0;
-
-error:
-error_propagate(errp, err);
-if (ics) {
-object_unparent(OBJECT(ics));
-}
-return -1;
 }
 
 static int xics_system_init(MachineState *machine,
@@ -1441,9 +1420,10 @@ static int spapr_post_load(void *opaque, int version_id)
 int err = 0;
 
 if (!object_dynamic_cast(OBJECT(spapr->ics), TYPE_ICS_KVM)) {
-int i;
-for (i = 0; i < spapr->nr_servers; i++) {
-icp_resend(&spapr->icps[i]);
+CPUState *cs;
+CPU_FOREACH(cs) {
+PowerPCCPU *cpu = POWERPC_CPU(cs);
+icp_resend(ICP(cpu->intc));
 }
 }
 
@@ -3114,20 +3094,21 @@ static void spapr_ics_resend(XICSFabric *dev)
 
 static ICPState *spapr_icp_get(XICSFabric *xi, int cpu_dt_id)
 {
-sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
-int server = xics_get_cpu_index_by_dt_id(cpu_dt_id);
+PowerPCCPU *cpu = ppc_get_vcpu_by_dt_id(cpu_dt_id);
 
-return (server < spapr->nr_servers) ? &spapr->icps[server] : NULL;
+return cpu ? ICP(cpu->intc) : NULL;
 }
 
 static void spapr_pic_print_info(InterruptStatsProvider *obj,
  Monitor *mon)
 {
 sPAPRMachineState *spapr = SPAPR_MACHINE(obj);
-int i;
+CPUState *cs;
+
+CPU_FOREACH(cs) {
+PowerPCCPU *cpu = POWERPC_CPU(cs);
 
-for (i = 0; i < spapr->nr_servers; i++) {
-icp_pic_print_info(&spapr->icps[i], mon);
+icp_pic_print_info(ICP(cpu->intc), mon);
 }
 
 ics_pic_print_info(spapr->ics, mon);
diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
index 4e1a995..2e689b5 100644
--- a/hw/ppc/spapr_cpu_core.c
+++ b/hw/ppc/spapr_cpu_core.c
@@ -63,8 +63,6 @@ static void spapr_cpu_init(sPAPRMachineState *spapr, 
PowerPCCPU *cpu,
Error **errp)
 {
 CPUPPCState *env = &cpu->env;
-XICSFabric *xi = XICS_FABRIC(spapr);
-ICPState *icp = xics_icp_get(xi, cpu->cpu_dt_id);
 
 /* Set time-base frequency to 512 MHz */
 cpu_ppc_tb_init(env, SPAPR_TIMEBASE_FREQ);

[Qemu-devel] [PULL 29/47] ppc: add IPMI support

2017-04-23 Thread David Gibson
From: Cédric Le Goater 

OpenPOWER systems use a BT device to communicate with the BMC.
Provide support for it.

Signed-off-by: Cédric Le Goater 
Signed-off-by: David Gibson 
---
 default-configs/ppc64-softmmu.mak | 4 
 1 file changed, 4 insertions(+)

diff --git a/default-configs/ppc64-softmmu.mak 
b/default-configs/ppc64-softmmu.mak
index 05c8335..7f5b56c 100644
--- a/default-configs/ppc64-softmmu.mak
+++ b/default-configs/ppc64-softmmu.mak
@@ -6,6 +6,10 @@ include usb.mak
 CONFIG_VIRTIO_VGA=y
 CONFIG_ESCC=y
 CONFIG_M48T59=y
+CONFIG_IPMI=y
+CONFIG_IPMI_LOCAL=y
+CONFIG_IPMI_EXTERN=y
+CONFIG_ISA_IPMI_BT=y
 CONFIG_SERIAL=y
 CONFIG_PARALLEL=y
 CONFIG_I8254=y
-- 
2.9.3




[Qemu-devel] [PULL 41/47] ppc/pnv: populate device tree for IPMI BT devices

2017-04-23 Thread David Gibson
From: Cédric Le Goater 

When an ipmi-bt device [1] is defined on the ISA bus, we need to
populate the device tree with the object properties. Such devices are
created with the command line options:

   -device ipmi-bmc-sim,id=bmc0 -device isa-ipmi-bt,bmc=bmc0,irq=10

[1] https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg03168.html
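
As an illustration of the two properties the patch writes, here is a
standalone sketch of how the "compatible" string list and the three
"reg" cells are laid out in memory. The helper names put_be32() and
bt_reg_cells() are hypothetical. The values 1 and 3 are copied from the
patch; reading them as "I/O space" and "register length" follows the
usual ISA binding and is an assumption here.

```c
#include <stdint.h>
#include <stddef.h>

/* Store a 32-bit value as a big-endian cell, independent of host order. */
static void put_be32(uint8_t out[4], uint32_t v)
{
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
}

/*
 * Two compatible strings packed back to back. sizeof() counts both the
 * embedded NUL separator and the terminating NUL, which is why the patch
 * passes sizeof(compatible) rather than strlen().
 */
static const char bt_compatible[] = "bt\0ipmi-bt";

/* Fill a three-cell "reg" value: 1, the I/O base port, 3. */
static void bt_reg_cells(uint8_t out[12], uint32_t io_base)
{
    put_be32(out, 1);
    put_be32(out + 4, io_base);
    put_be32(out + 8, 3);
}
```

Using strlen() on the compatible list would stop at the first NUL and
drop the second string, so the sizeof() form matters.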

Signed-off-by: Cédric Le Goater 
Signed-off-by: David Gibson 
---
 hw/ppc/pnv.c | 35 +++
 1 file changed, 35 insertions(+)

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index dfa21e4..977e126 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -354,6 +354,39 @@ static void powernv_populate_serial(ISADevice *d, void 
*fdt, int lpc_off)
 _FDT((fdt_setprop_string(fdt, node, "device_type", "serial")));
 }
 
+static void powernv_populate_ipmi_bt(ISADevice *d, void *fdt, int lpc_off)
+{
+const char compatible[] = "bt\0ipmi-bt";
+uint32_t io_base;
+uint32_t io_regs[] = {
+cpu_to_be32(1),
+0, /* 'io_base' retrieved from the 'ioport' property of 'isa-ipmi-bt' */
+cpu_to_be32(3)
+};
+uint32_t irq;
+char *name;
+int node;
+
+io_base = object_property_get_int(OBJECT(d), "ioport", &error_fatal);
+io_regs[1] = cpu_to_be32(io_base);
+
+irq = object_property_get_int(OBJECT(d), "irq", &error_fatal);
+
+name = g_strdup_printf("%s@i%x", qdev_fw_name(DEVICE(d)), io_base);
+node = fdt_add_subnode(fdt, lpc_off, name);
+_FDT(node);
+g_free(name);
+
+fdt_setprop(fdt, node, "reg", io_regs, sizeof(io_regs));
+fdt_setprop(fdt, node, "compatible", compatible, sizeof(compatible));
+
+/* Mark it as reserved to avoid Linux trying to claim it */
+_FDT((fdt_setprop_string(fdt, node, "status", "reserved")));
+_FDT((fdt_setprop_cell(fdt, node, "interrupts", irq)));
+_FDT((fdt_setprop_cell(fdt, node, "interrupt-parent",
+   fdt_get_phandle(fdt, lpc_off))));
+}
+
 typedef struct ForeachPopulateArgs {
 void *fdt;
 int offset;
@@ -368,6 +401,8 @@ static int powernv_populate_isa_device(DeviceState *dev, 
void *opaque)
 powernv_populate_rtc(d, args->fdt, args->offset);
 } else if (object_dynamic_cast(OBJECT(dev), TYPE_ISA_SERIAL)) {
 powernv_populate_serial(d, args->fdt, args->offset);
+} else if (object_dynamic_cast(OBJECT(dev), "isa-ipmi-bt")) {
+powernv_populate_ipmi_bt(d, args->fdt, args->offset);
 } else {
 error_report("unknown isa device %s@i%x", qdev_fw_name(dev),
  d->ioport_id);
-- 
2.9.3




[Qemu-devel] [PULL 16/47] target/ppc: Add ibm, processor-radix-AP-encodings for TCG

2017-04-23 Thread David Gibson
From: Suraj Jitindar Singh 

The ibm,processor-radix-AP-encodings device tree property of the cpu node
is used to specify the radix mode supported page sizes of the processor
to the guest os. Contained in the top 3 bits of the msb is the actual
page size (AP) encoding associated with the corresponding radix mode
supported page size. Add this property for a TCG guest, note the TCG code
is capable of translating any format so just add the 4 default page sizes.

The ibm,processor-radix-AP-encodings device tree property is defined as:
One to n cells in ascending order of radix mode supported page sizes
encoded as BE ints (32bit on ppc) in the form:
0bxxxyyyyyyyyyyyyyyyyyyyyyyyyyyyyy
- 0bxxx -> AP encoding
- 0byyyyyyyyyyyyyyyyyyyyyyyyyyyyy -> supported page size encoded as a shift
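
A minimal sketch of packing and unpacking one such cell, assuming the AP
encoding sits in the top 3 bits (as stated above) and the shift occupies
the remaining 29 bits below it. The function names are hypothetical and
the cell is handled in host byte order; conversion to big-endian for the
device tree is left out.

```c
#include <stdint.h>

/* Build a cell from an AP encoding (3 bits) and a page-size shift. */
static uint32_t radix_cell(uint32_t ap, uint32_t shift)
{
    return (ap << 29) | (shift & 0x1fffffffu);
}

/* Top three bits of a cell: the AP encoding. */
static uint32_t radix_ap(uint32_t cell)
{
    return cell >> 29;
}

/* Remaining low bits: the page size expressed as a shift. */
static uint32_t radix_shift(uint32_t cell)
{
    return cell & 0x1fffffffu;
}
```

For example, a 64K page (shift 16) with AP encoding 0x5 packs to
0xa0000010, and a 4K page (shift 12) with AP encoding 0x0 packs to
0x0000000c.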

Signed-off-by: Suraj Jitindar Singh 
Signed-off-by: David Gibson 
---
 target/ppc/translate_init.c | 20 
 1 file changed, 20 insertions(+)

diff --git a/target/ppc/translate_init.c b/target/ppc/translate_init.c
index c1a9014..aa0c44d 100644
--- a/target/ppc/translate_init.c
+++ b/target/ppc/translate_init.c
@@ -8808,6 +8808,25 @@ POWERPC_FAMILY(POWER8)(ObjectClass *oc, void *data)
 pcc->interrupts_big_endian = ppc_cpu_interrupts_big_endian_lpcr;
 }
 
+#ifdef CONFIG_SOFTMMU
+/*
+ * Radix pg sizes and AP encodings for dt node ibm,processor-radix-AP-encodings
+ * Encoded as array of int_32s in the form:
+ *  0bxxxyyyyyyyyyyyyyyyyyyyyyyyyyyyyy
+ *  x -> AP encoding
+ *  y -> radix mode supported page size (encoded as a shift)
+ */
+static struct ppc_radix_page_info POWER9_radix_page_info = {
+.count = 4,
+.entries = {
+0x0000000c, /*  4K - enc: 0x0 */
+0xa0000010, /* 64K - enc: 0x5 */
+0x20000015, /*  2M - enc: 0x1 */
+0x4000001e  /*  1G - enc: 0x2 */
+}
+};
+#endif /* CONFIG_SOFTMMU */
+
 static void init_proc_POWER9(CPUPPCState *env)
 {
 /* Common Registers */
@@ -8959,6 +8978,7 @@ POWERPC_FAMILY(POWER9)(ObjectClass *oc, void *data)
 pcc->handle_mmu_fault = ppc64_v3_handle_mmu_fault;
 /* segment page size remain the same */
 pcc->sps = &POWER7_POWER8_sps;
+pcc->radix_page_info = &POWER9_radix_page_info;
 #endif
 pcc->excp_model = POWERPC_EXCP_POWER8;
 pcc->bus_model = PPC_FLAGS_INPUT_POWER7;
-- 
2.9.3




[Qemu-devel] [PULL 44/47] spapr-cpu-core: Release ICPState object during CPU unrealization

2017-04-23 Thread David Gibson
From: Bharata B Rao 

Recent commits that reorganized the ICPState object failed to destroy
the object when the CPU is unrealized. Fix this so that CPU unplug
doesn't abort QEMU.

Signed-off-by: Bharata B Rao 
Reviewed-by: Cédric Le Goater 
Signed-off-by: David Gibson 
---
 hw/ppc/spapr_cpu_core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
index 2e689b5..4389ef4 100644
--- a/hw/ppc/spapr_cpu_core.c
+++ b/hw/ppc/spapr_cpu_core.c
@@ -127,6 +127,7 @@ static void spapr_cpu_core_unrealizefn(DeviceState *dev, 
Error **errp)
 PowerPCCPU *cpu = POWERPC_CPU(cs);
 
 spapr_cpu_destroy(cpu);
+object_unparent(cpu->intc);
 cpu_remove_sync(cs);
 object_unparent(obj);
 }
-- 
2.9.3




[Qemu-devel] [PULL 18/47] spapr: move the IRQ server number mapping under the machine

2017-04-23 Thread David Gibson
From: Cédric Le Goater 

This is the second step to abstract the IRQ 'server' number of the
XICS layer. Now that the prereq cleanups have been done in the
previous patch, we can move down the 'cpu_dt_id' to 'cpu_index'
mapping in the sPAPR machine handler.

Signed-off-by: Cédric Le Goater 
Reviewed-by: David Gibson 
Signed-off-by: David Gibson 
---
 hw/intc/xics_spapr.c| 5 ++---
 hw/ppc/spapr.c  | 3 ++-
 hw/ppc/spapr_cpu_core.c | 2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
index 58f100d..f05308b8 100644
--- a/hw/intc/xics_spapr.c
+++ b/hw/intc/xics_spapr.c
@@ -52,9 +52,8 @@ static target_ulong h_cppr(PowerPCCPU *cpu, sPAPRMachineState 
*spapr,
 static target_ulong h_ipi(PowerPCCPU *cpu, sPAPRMachineState *spapr,
   target_ulong opcode, target_ulong *args)
 {
-target_ulong server = xics_get_cpu_index_by_dt_id(args[0]);
 target_ulong mfrr = args[1];
-ICPState *icp = xics_icp_get(XICS_FABRIC(spapr), server);
+ICPState *icp = xics_icp_get(XICS_FABRIC(spapr), args[0]);
 
 if (!icp) {
 return H_PARAMETER;
@@ -122,7 +121,7 @@ static void rtas_set_xive(PowerPCCPU *cpu, 
sPAPRMachineState *spapr,
 }
 
 nr = rtas_ld(args, 0);
-server = xics_get_cpu_index_by_dt_id(rtas_ld(args, 1));
+server = rtas_ld(args, 1);
 priority = rtas_ld(args, 2);
 
 if (!ics_valid_irq(ics, nr) || !xics_icp_get(XICS_FABRIC(spapr), server)
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 8749f1b..08f8615 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -3112,9 +3112,10 @@ static void spapr_ics_resend(XICSFabric *dev)
 ics_resend(spapr->ics);
 }
 
-static ICPState *spapr_icp_get(XICSFabric *xi, int server)
+static ICPState *spapr_icp_get(XICSFabric *xi, int cpu_dt_id)
 {
 sPAPRMachineState *spapr = SPAPR_MACHINE(xi);
+int server = xics_get_cpu_index_by_dt_id(cpu_dt_id);
 
 return (server < spapr->nr_servers) ? &spapr->icps[server] : NULL;
 }
diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
index 7db61bd..4e1a995 100644
--- a/hw/ppc/spapr_cpu_core.c
+++ b/hw/ppc/spapr_cpu_core.c
@@ -64,7 +64,7 @@ static void spapr_cpu_init(sPAPRMachineState *spapr, 
PowerPCCPU *cpu,
 {
 CPUPPCState *env = &cpu->env;
 XICSFabric *xi = XICS_FABRIC(spapr);
-ICPState *icp = xics_icp_get(xi, CPU(cpu)->cpu_index);
+ICPState *icp = xics_icp_get(xi, cpu->cpu_dt_id);
 
 /* Set time-base frequency to 512 MHz */
 cpu_ppc_tb_init(env, SPAPR_TIMEBASE_FREQ);
-- 
2.9.3




[Qemu-devel] [PULL 23/47] ppc/pnv: extend the machine with a InterruptStatsProvider interface

2017-04-23 Thread David Gibson
From: Cédric Le Goater 

Signed-off-by: Cédric Le Goater 
Signed-off-by: David Gibson 
---
 hw/ppc/pnv.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index 0a0cfe3..f3623ee 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -33,6 +33,8 @@
 #include "exec/address-spaces.h"
 #include "qemu/cutils.h"
 #include "qapi/visitor.h"
+#include "monitor/monitor.h"
+#include "hw/intc/intc.h"
 
 #include "hw/ppc/xics.h"
 #include "hw/ppc/pnv_xscom.h"
@@ -762,6 +764,18 @@ static ICPState *pnv_icp_get(XICSFabric *xi, int pir)
 return cpu ? ICP(cpu->intc) : NULL;
 }
 
+static void pnv_pic_print_info(InterruptStatsProvider *obj,
+   Monitor *mon)
+{
+CPUState *cs;
+
+CPU_FOREACH(cs) {
+PowerPCCPU *cpu = POWERPC_CPU(cs);
+
+icp_pic_print_info(ICP(cpu->intc), mon);
+}
+}
+
 static void pnv_get_num_chips(Object *obj, Visitor *v, const char *name,
   void *opaque, Error **errp)
 {
@@ -813,6 +827,7 @@ static void powernv_machine_class_init(ObjectClass *oc, 
void *data)
 {
 MachineClass *mc = MACHINE_CLASS(oc);
 XICSFabricClass *xic = XICS_FABRIC_CLASS(oc);
+InterruptStatsProviderClass *ispc = INTERRUPT_STATS_PROVIDER_CLASS(oc);
 
 mc->desc = "IBM PowerNV (Non-Virtualized)";
 mc->init = ppc_powernv_init;
@@ -824,6 +839,7 @@ static void powernv_machine_class_init(ObjectClass *oc, 
void *data)
 mc->default_boot_order = NULL;
 mc->default_ram_size = 1 * G_BYTE;
 xic->icp_get = pnv_icp_get;
+ispc->print_info = pnv_pic_print_info;
 
 powernv_machine_class_props_init(oc);
 }
@@ -836,6 +852,7 @@ static const TypeInfo powernv_machine_info = {
 .class_init= powernv_machine_class_init,
 .interfaces = (InterfaceInfo[]) {
 { TYPE_XICS_FABRIC },
+{ TYPE_INTERRUPT_STATS_PROVIDER },
 { },
 },
 };
-- 
2.9.3




[Qemu-devel] [PULL 13/47] target-ppc/kvm: Enable in-kernel TCE acceleration for multi-tce

2017-04-23 Thread David Gibson
From: Alexey Kardashevskiy 

This enables in-kernel handling of H_PUT_TCE_INDIRECT and
H_STUFF_TCE hypercalls. Host kernel support has been available since v4.6,
in particular since commit d3695aa4f452
("KVM: PPC: Add support for multiple-TCE hcalls").

H_PUT_TCE is already accelerated and does not need any special enablement.

Signed-off-by: Alexey Kardashevskiy 
Signed-off-by: David Gibson 
---
 hw/ppc/spapr.c   |  4 +++-
 target/ppc/kvm.c | 14 ++
 target/ppc/kvm_ppc.h |  6 ++
 3 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index a355512..8749f1b 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2361,10 +2361,12 @@ static void ppc_spapr_init(MachineState *machine)
 
 qemu_register_boot_set(spapr_boot_set, spapr);
 
-/* to stop and start vmclock */
 if (kvm_enabled()) {
+/* to stop and start vmclock */
 qemu_add_vm_change_state_handler(cpu_ppc_clock_vm_state_change,
  &spapr->tb);
+
+kvmppc_spapr_enable_inkernel_multitce();
 }
 }
 
diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
index c959b90..8574c36 100644
--- a/target/ppc/kvm.c
+++ b/target/ppc/kvm.c
@@ -2198,6 +2198,20 @@ bool kvmppc_spapr_use_multitce(void)
 return cap_spapr_multitce;
 }
 
+int kvmppc_spapr_enable_inkernel_multitce(void)
+{
+int ret;
+
+ret = kvm_vm_enable_cap(kvm_state, KVM_CAP_PPC_ENABLE_HCALL, 0,
+H_PUT_TCE_INDIRECT, 1);
+if (!ret) {
+ret = kvm_vm_enable_cap(kvm_state, KVM_CAP_PPC_ENABLE_HCALL, 0,
+H_STUFF_TCE, 1);
+}
+
+return ret;
+}
+
 void *kvmppc_create_spapr_tce(uint32_t liobn, uint32_t page_shift,
   uint64_t bus_offset, uint32_t nb_table,
   int *pfd, bool need_vfio)
diff --git a/target/ppc/kvm_ppc.h b/target/ppc/kvm_ppc.h
index 4b2fd9a..f48243d 100644
--- a/target/ppc/kvm_ppc.h
+++ b/target/ppc/kvm_ppc.h
@@ -39,6 +39,7 @@ target_ulong kvmppc_configure_v3_mmu(PowerPCCPU *cpu,
 #ifndef CONFIG_USER_ONLY
 off_t kvmppc_alloc_rma(void **rma);
 bool kvmppc_spapr_use_multitce(void);
+int kvmppc_spapr_enable_inkernel_multitce(void);
 void *kvmppc_create_spapr_tce(uint32_t liobn, uint32_t page_shift,
   uint64_t bus_offset, uint32_t nb_table,
   int *pfd, bool need_vfio);
@@ -180,6 +181,11 @@ static inline bool kvmppc_spapr_use_multitce(void)
 return false;
 }
 
+static inline int kvmppc_spapr_enable_inkernel_multitce(void)
+{
+return -1;
+}
+
 static inline void *kvmppc_create_spapr_tce(uint32_t liobn, uint32_t 
page_shift,
 uint64_t bus_offset,
 uint32_t nb_table,
-- 
2.9.3




[Qemu-devel] [PULL 26/47] ppc/pnv: add memory regions for the ICP registers

2017-04-23 Thread David Gibson
From: Cédric Le Goater 

This gives a PowerNV chip (POWER8) access to the Interrupt
Management area, which contains the registers of the Interrupt Control
Presenters of each thread. These are used to accept, return, and forward
interrupts in the system.

This area is modeled with a per-chip container memory region holding
all the ICP registers. Each thread of a chip is then associated with
its ICP registers using a memory subregion indexed by its PIR number
in the overall region.

The device tree is populated accordingly.
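
The indexing scheme can be sketched as below. The 12-bit shift and the
0x1000 register size are taken from the code in this patch; the function
names and the chip_icp_base parameter are hypothetical (the patch obtains
the base from PNV_ICP_BASE(chip)), and the sketch adds the offset where
the patch ORs it in, which is equivalent for a suitably aligned base.

```c
#include <stdint.h>

#define ICP_REGS_SHIFT 12                           /* one 4 KiB page per thread */
#define ICP_REGS_SIZE  (UINT64_C(1) << ICP_REGS_SHIFT)

/* Offset of a thread's ICP registers inside the per-chip container region. */
static uint64_t icp_offset(uint32_t pir)
{
    return (uint64_t)pir << ICP_REGS_SHIFT;
}

/* Absolute address of a thread's ICP registers, given the chip's ICP base. */
static uint64_t icp_address(uint64_t chip_icp_base, uint32_t pir)
{
    return chip_icp_base + icp_offset(pir);
}
```

Because the offset depends only on the PIR, a thread's registers can be
located without any per-thread table.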

Signed-off-by: Cédric Le Goater 
Reviewed-by: David Gibson 
Signed-off-by: David Gibson 
---
 hw/ppc/pnv.c | 81 
 include/hw/ppc/pnv.h | 19 
 2 files changed, 100 insertions(+)

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index 2add2ad..1fa90d6 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -218,6 +218,43 @@ static void powernv_create_core_node(PnvChip *chip, 
PnvCore *pc, void *fdt)
 servers_prop, sizeof(servers_prop))));
 }
 
+static void powernv_populate_icp(PnvChip *chip, void *fdt, uint32_t pir,
+ uint32_t nr_threads)
+{
+uint64_t addr = PNV_ICP_BASE(chip) | (pir << 12);
+char *name;
+const char compat[] = "IBM,power8-icp\0IBM,ppc-xicp";
+uint32_t irange[2], i, rsize;
+uint64_t *reg;
+int offset;
+
+irange[0] = cpu_to_be32(pir);
+irange[1] = cpu_to_be32(nr_threads);
+
+rsize = sizeof(uint64_t) * 2 * nr_threads;
+reg = g_malloc(rsize);
+for (i = 0; i < nr_threads; i++) {
+reg[i * 2] = cpu_to_be64(addr | ((pir + i) * 0x1000));
+reg[i * 2 + 1] = cpu_to_be64(0x1000);
+}
+
+name = g_strdup_printf("interrupt-controller@%"PRIX64, addr);
+offset = fdt_add_subnode(fdt, 0, name);
+_FDT(offset);
+g_free(name);
+
+_FDT((fdt_setprop(fdt, offset, "compatible", compat, sizeof(compat))));
+_FDT((fdt_setprop(fdt, offset, "reg", reg, rsize)));
+_FDT((fdt_setprop_string(fdt, offset, "device_type",
+  "PowerPC-External-Interrupt-Presentation")));
+_FDT((fdt_setprop(fdt, offset, "interrupt-controller", NULL, 0)));
+_FDT((fdt_setprop(fdt, offset, "ibm,interrupt-server-ranges",
+   irange, sizeof(irange))));
+_FDT((fdt_setprop_cell(fdt, offset, "#interrupt-cells", 1)));
+_FDT((fdt_setprop_cell(fdt, offset, "#address-cells", 0)));
+g_free(reg);
+}
+
 static void powernv_populate_chip(PnvChip *chip, void *fdt)
 {
 PnvChipClass *pcc = PNV_CHIP_GET_CLASS(chip);
@@ -231,6 +268,10 @@ static void powernv_populate_chip(PnvChip *chip, void *fdt)
 PnvCore *pnv_core = PNV_CORE(chip->cores + i * typesize);
 
 powernv_create_core_node(chip, pnv_core, fdt);
+
+/* Interrupt Control Presenters (ICP). One per core. */
+powernv_populate_icp(chip, fdt, pnv_core->pir,
+ CPU_CORE(pnv_core)->nr_threads);
 }
 
 if (chip->ram_size) {
@@ -643,6 +684,38 @@ static void pnv_chip_init(Object *obj)
 object_property_add_child(obj, "lpc", OBJECT(&chip->lpc), NULL);
 }
 
+static void pnv_chip_icp_realize(PnvChip *chip, Error **errp)
+{
+PnvChipClass *pcc = PNV_CHIP_GET_CLASS(chip);
+char *typename = pnv_core_typename(pcc->cpu_model);
+size_t typesize = object_type_get_instance_size(typename);
+int i, j;
+char *name;
+XICSFabric *xi = XICS_FABRIC(qdev_get_machine());
+
+name = g_strdup_printf("icp-%x", chip->chip_id);
+memory_region_init(&chip->icp_mmio, OBJECT(chip), name, PNV_ICP_SIZE);
+sysbus_init_mmio(SYS_BUS_DEVICE(chip), &chip->icp_mmio);
+g_free(name);
+
+sysbus_mmio_map(SYS_BUS_DEVICE(chip), 1, PNV_ICP_BASE(chip));
+
+/* Map the ICP registers for each thread */
+for (i = 0; i < chip->nr_cores; i++) {
+PnvCore *pnv_core = PNV_CORE(chip->cores + i * typesize);
+int core_hwid = CPU_CORE(pnv_core)->core_id;
+
+for (j = 0; j < CPU_CORE(pnv_core)->nr_threads; j++) {
+uint32_t pir = pcc->core_pir(chip, core_hwid) + j;
+PnvICPState *icp = PNV_ICP(xics_icp_get(xi, pir));
+
+memory_region_add_subregion(&chip->icp_mmio, pir << 12, &icp->mmio);
+}
+}
+
+g_free(typename);
+}
+
 static void pnv_chip_realize(DeviceState *dev, Error **errp)
 {
 PnvChip *chip = PNV_CHIP(dev);
@@ -713,6 +786,14 @@ static void pnv_chip_realize(DeviceState *dev, Error 
**errp)
 object_property_set_bool(OBJECT(&chip->lpc), true, "realized",
  &error_fatal);
 pnv_xscom_add_subregion(chip, PNV_XSCOM_LPC_BASE, &chip->lpc.xscom_regs);
+
+/* Interrupt Management Area. This is the memory region holding
+ * all the Interrupt Control Presenter (ICP) registers */
+pnv_chip_icp_realize(chip, &error);
+if (error) {
+error_propagate(errp, error);
+return;
+}
 }
 
 static Property 

[Qemu-devel] [PULL 15/47] spapr_pci: Removed unused include

2017-04-23 Thread David Gibson
From: Alexey Kardashevskiy 

Signed-off-by: Alexey Kardashevskiy 
Signed-off-by: David Gibson 
---
 hw/ppc/spapr_pci.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 097ebdd..e7567e2 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -50,8 +50,6 @@
 #include "sysemu/hostmem.h"
 #include "sysemu/numa.h"
 
-#include "hw/vfio/vfio.h"
-
 /* Copied from the kernel arch/powerpc/platforms/pseries/msi.c */
 #define RTAS_QUERY_FN   0
 #define RTAS_CHANGE_FN  1
-- 
2.9.3




[Qemu-devel] [PULL 08/47] target/ppc: Add new H-CALL shells for in memory table translation

2017-04-23 Thread David Gibson
From: Suraj Jitindar Singh 

The use of the new in-memory tables introduced in ISAv3.00 for translation,
also referred to as process tables, requires the introduction of 3 new
H-CALLs: H_REGISTER_PROCESS_TABLE, H_CLEAN_SLB, and H_INVALIDATE_PID.

Add shells for each of these and register them as the hypercall handlers.
Currently they all log an unimplemented hypercall and return H_FUNCTION.
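
The shell pattern can be sketched outside QEMU as follows. This is an
illustrative model, not the QEMU implementation: hcall_fn, hcall_table,
register_hcall() and dispatch_hcall() are hypothetical names, indexing
the table by opcode / 4 is a choice made for the sketch, and the value
-2 for H_FUNCTION is an assumption. The three opcodes and the 0x380
upper bound are the ones defined in this patch.

```c
#include <stdint.h>
#include <stdio.h>

#define H_FUNCTION          (-2L)   /* assumed "function not supported" code */
#define H_CLEAN_SLB         0x374
#define H_INVALIDATE_PID    0x378
#define H_REGISTER_PROC_TBL 0x37C
#define MAX_HCALL_OPCODE    0x380

typedef long (*hcall_fn)(unsigned long opcode, unsigned long *args);

/* One slot per opcode; opcodes are multiples of 4. */
static hcall_fn hcall_table[(MAX_HCALL_OPCODE / 4) + 1];

/* Shell handler: report the call as unimplemented and refuse it. */
static long h_unimplemented(unsigned long opcode, unsigned long *args)
{
    (void)args;
    fprintf(stderr, "Unimplemented hcall 0x%lx\n", opcode);
    return H_FUNCTION;
}

static void register_hcall(unsigned long opcode, hcall_fn fn)
{
    hcall_table[opcode / 4] = fn;
}

/* Register the three shells so the opcodes are at least recognised. */
static void register_shells(void)
{
    register_hcall(H_CLEAN_SLB, h_unimplemented);
    register_hcall(H_INVALIDATE_PID, h_unimplemented);
    register_hcall(H_REGISTER_PROC_TBL, h_unimplemented);
}

/* Unknown, misaligned or unregistered opcodes also yield H_FUNCTION. */
static long dispatch_hcall(unsigned long opcode, unsigned long *args)
{
    if (opcode > MAX_HCALL_OPCODE || (opcode & 3) || !hcall_table[opcode / 4]) {
        return H_FUNCTION;
    }
    return hcall_table[opcode / 4](opcode, args);
}
```

Registering shells first means a later patch can swap in a real handler
for one opcode without touching the dispatch path.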

Signed-off-by: Suraj Jitindar Singh 
[dwg: Fix style nits]
Signed-off-by: David Gibson 
---
 hw/ppc/spapr_hcall.c   | 31 +++
 include/hw/ppc/spapr.h |  3 +++
 2 files changed, 34 insertions(+)

diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index f05a90e..7952129 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -878,6 +878,32 @@ static target_ulong h_set_mode(PowerPCCPU *cpu, 
sPAPRMachineState *spapr,
 return ret;
 }
 
+static target_ulong h_clean_slb(PowerPCCPU *cpu, sPAPRMachineState *spapr,
+target_ulong opcode, target_ulong *args)
+{
+qemu_log_mask(LOG_UNIMP, "Unimplemented SPAPR hcall 0x"TARGET_FMT_lx"%s\n",
+  opcode, " (H_CLEAN_SLB)");
+return H_FUNCTION;
+}
+
+static target_ulong h_invalidate_pid(PowerPCCPU *cpu, sPAPRMachineState *spapr,
+ target_ulong opcode, target_ulong *args)
+{
+qemu_log_mask(LOG_UNIMP, "Unimplemented SPAPR hcall 0x"TARGET_FMT_lx"%s\n",
+  opcode, " (H_INVALIDATE_PID)");
+return H_FUNCTION;
+}
+
+static target_ulong h_register_process_table(PowerPCCPU *cpu,
+ sPAPRMachineState *spapr,
+ target_ulong opcode,
+ target_ulong *args)
+{
+qemu_log_mask(LOG_UNIMP, "Unimplemented SPAPR hcall 0x"TARGET_FMT_lx"%s\n",
+  opcode, " (H_REGISTER_PROC_TBL)");
+return H_FUNCTION;
+}
+
 #define H_SIGNAL_SYS_RESET_ALL -1
 #define H_SIGNAL_SYS_RESET_ALLBUTSELF  -2
 
@@ -1084,6 +1110,11 @@ static void hypercall_register_types(void)
 spapr_register_hypercall(H_PAGE_INIT, h_page_init);
 spapr_register_hypercall(H_SET_MODE, h_set_mode);
 
+/* In Memory Table MMU h-calls */
+spapr_register_hypercall(H_CLEAN_SLB, h_clean_slb);
+spapr_register_hypercall(H_INVALIDATE_PID, h_invalidate_pid);
+spapr_register_hypercall(H_REGISTER_PROC_TBL, h_register_process_table);
+
 /* "debugger" hcalls (also used by SLOF). Note: We do -not- differenciate
  * here between the "CI" and the "CACHE" variants, they will use whatever
  * mapping attributes qemu is using. When using KVM, the kernel will
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index ba9e689..342f7a6 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -361,6 +361,9 @@ struct sPAPRMachineState {
 #define H_XIRR_X                0x2FC
 #define H_RANDOM                0x300
 #define H_SET_MODE              0x31C
+#define H_CLEAN_SLB             0x374
+#define H_INVALIDATE_PID        0x378
+#define H_REGISTER_PROC_TBL     0x37C
 #define H_SIGNAL_SYS_RESET      0x380
 #define MAX_HCALL_OPCODE        H_SIGNAL_SYS_RESET
 
-- 
2.9.3




[Qemu-devel] [PULL 10/47] spapr: move spapr_populate_pa_features()

2017-04-23 Thread David Gibson
From: Sam Bobroff 

In the next patch, spapr_fixup_cpu_dt() will need to call
spapr_populate_pa_features(), so move its definition up without making
any other changes.

Signed-off-by: Sam Bobroff 
Signed-off-by: David Gibson 
---
 hw/ppc/spapr.c | 122 -
 1 file changed, 61 insertions(+), 61 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 54391a1..21da9a1 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -227,6 +227,67 @@ static int spapr_fixup_cpu_numa_dt(void *fdt, int offset, 
CPUState *cs)
 return ret;
 }
 
+/* Populate the "ibm,pa-features" property */
+static void spapr_populate_pa_features(CPUPPCState *env, void *fdt, int offset)
+{
+uint8_t pa_features_206[] = { 6, 0,
+0xf6, 0x1f, 0xc7, 0x00, 0x80, 0xc0 };
+uint8_t pa_features_207[] = { 24, 0,
+0xf6, 0x1f, 0xc7, 0xc0, 0x80, 0xf0,
+0x80, 0x00, 0x00, 0x00, 0x00, 0x00,
+0x00, 0x00, 0x00, 0x00, 0x80, 0x00,
+0x80, 0x00, 0x80, 0x00, 0x00, 0x00 };
+/* Currently we don't advertise any of the "new" ISAv3.00 functionality */
+uint8_t pa_features_300[] = { 64, 0,
+0xf6, 0x1f, 0xc7, 0xc0, 0x80, 0xf0, /*  0 -  5 */
+0x80, 0x00, 0x00, 0x00, 0x00, 0x00, /*  6 - 11 */
+0x00, 0x00, 0x00, 0x00, 0x80, 0x00, /* 12 - 17 */
+0x80, 0x00, 0x80, 0x00, 0x00, 0x00, /* 18 - 23 */
+0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* 24 - 29 */
+0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* 30 - 35 */
+0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* 36 - 41 */
+0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* 42 - 47 */
+0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* 48 - 53 */
+0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* 54 - 59 */
+0x00, 0x00, 0x00, 0x00   }; /* 60 - 63 */
+
+uint8_t *pa_features;
+size_t pa_size;
+
+switch (POWERPC_MMU_VER(env->mmu_model)) {
+case POWERPC_MMU_VER_2_06:
+pa_features = pa_features_206;
+pa_size = sizeof(pa_features_206);
+break;
+case POWERPC_MMU_VER_2_07:
+pa_features = pa_features_207;
+pa_size = sizeof(pa_features_207);
+break;
+case POWERPC_MMU_VER_3_00:
+pa_features = pa_features_300;
+pa_size = sizeof(pa_features_300);
+break;
+default:
+return;
+}
+
+if (env->ci_large_pages) {
+/*
+ * Note: we keep CI large pages off by default because a 64K capable
+ * guest provisioned with large pages might otherwise try to map a qemu
+ * framebuffer (or other kind of memory mapped PCI BAR) using 64K pages
+ * even if that qemu runs on a 4k host.
+ * We dd this bit back here if we are confident this is not an issue
+ */
+pa_features[3] |= 0x20;
+}
+if (kvmppc_has_cap_htm() && pa_size > 24) {
+pa_features[24] |= 0x80;/* Transactional memory support */
+}
+
+_FDT((fdt_setprop(fdt, offset, "ibm,pa-features", pa_features, pa_size)));
+}
+
 static int spapr_fixup_cpu_dt(void *fdt, sPAPRMachineState *spapr)
 {
 int ret = 0, offset, cpus_offset;
@@ -379,67 +440,6 @@ static int spapr_populate_memory(sPAPRMachineState *spapr, 
void *fdt)
 return 0;
 }
 
-/* Populate the "ibm,pa-features" property */
-static void spapr_populate_pa_features(CPUPPCState *env, void *fdt, int offset)
-{
-uint8_t pa_features_206[] = { 6, 0,
-0xf6, 0x1f, 0xc7, 0x00, 0x80, 0xc0 };
-uint8_t pa_features_207[] = { 24, 0,
-0xf6, 0x1f, 0xc7, 0xc0, 0x80, 0xf0,
-0x80, 0x00, 0x00, 0x00, 0x00, 0x00,
-0x00, 0x00, 0x00, 0x00, 0x80, 0x00,
-0x80, 0x00, 0x80, 0x00, 0x00, 0x00 };
-/* Currently we don't advertise any of the "new" ISAv3.00 functionality */
-uint8_t pa_features_300[] = { 64, 0,
-0xf6, 0x1f, 0xc7, 0xc0, 0x80, 0xf0, /*  0 -  5 */
-0x80, 0x00, 0x00, 0x00, 0x00, 0x00, /*  6 - 11 */
-0x00, 0x00, 0x00, 0x00, 0x80, 0x00, /* 12 - 17 */
-0x80, 0x00, 0x80, 0x00, 0x00, 0x00, /* 18 - 23 */
-0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* 24 - 29 */
-0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* 30 - 35 */
-0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* 36 - 41 */
-0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* 42 - 47 */
-0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* 48 - 53 */
-0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* 54 - 59 */
-0x00, 0x00, 0x00, 0x00   }; /* 60 - 63 */
-
-uint8_t *pa_features;
-size_t pa_size;
-
-switch (POWERPC_MMU_VER(env->mmu_model)) {
-case POWERPC_MMU_VER_2_06:
-pa_features = pa_features_206;
-pa_size = sizeof(pa_features_206);
-break;
-case POWERPC_MMU_VER_2_07:
-pa_features = pa_features_207;
-pa_size = sizeof(pa_features_207);
-break;
-case POWERPC_MMU_VER_3_00:
-pa_features = pa_features_300;
-pa_size = 

[Qemu-devel] [PULL 20/47] ppc/xics: add a realize() handler to ICPStateClass

2017-04-23 Thread David Gibson
From: Cédric Le Goater 

It will be used by derived classes in PowerNV for customization.

Signed-off-by: Cédric Le Goater 
Reviewed-by: David Gibson 
Signed-off-by: David Gibson 
---
 hw/intc/xics.c| 5 +
 include/hw/ppc/xics.h | 1 +
 2 files changed, 6 insertions(+)

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index d4428b4..292fffe 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -337,6 +337,7 @@ static void icp_reset(void *dev)
 static void icp_realize(DeviceState *dev, Error **errp)
 {
 ICPState *icp = ICP(dev);
+ICPStateClass *icpc = ICP_GET_CLASS(dev);
 Object *obj;
 Error *err = NULL;
 
@@ -349,6 +350,10 @@ static void icp_realize(DeviceState *dev, Error **errp)
 
 icp->xics = XICS_FABRIC(obj);
 
+if (icpc->realize) {
+icpc->realize(dev, errp);
+}
+
 qemu_register_reset(icp_reset, dev);
 }
 
diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
index b07f56f..731e177 100644
--- a/include/hw/ppc/xics.h
+++ b/include/hw/ppc/xics.h
@@ -60,6 +60,7 @@ typedef struct XICSFabric XICSFabric;
 struct ICPStateClass {
 DeviceClass parent_class;
 
+void (*realize)(DeviceState *dev, Error **errp);
 void (*pre_save)(ICPState *s);
 int (*post_load)(ICPState *s, int version_id);
 void (*cpu_setup)(ICPState *icp, PowerPCCPU *cpu);
-- 
2.9.3




[Qemu-devel] [PULL 07/47] target-ppc: support KVM_CAP_PPC_MMU_RADIX, KVM_CAP_PPC_MMU_HASH_V3

2017-04-23 Thread David Gibson
From: Sam Bobroff 

Query and cache the value of two new KVM capabilities that indicate
KVM's support for new radix and hash modes of the MMU.

Signed-off-by: Sam Bobroff 
Signed-off-by: David Gibson 
---
 target/ppc/kvm.c | 14 ++
 target/ppc/kvm_ppc.h | 12 
 2 files changed, 26 insertions(+)

diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
index 9dc2f7f..38db27b 100644
--- a/target/ppc/kvm.c
+++ b/target/ppc/kvm.c
@@ -86,6 +86,8 @@ static int cap_papr;
 static int cap_htab_fd;
 static int cap_fixup_hcalls;
 static int cap_htm; /* Hardware transactional memory support */
+static int cap_mmu_radix;
+static int cap_mmu_hash_v3;
 
 static uint32_t debug_inst_opcode;
 
@@ -140,6 +142,8 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
 cap_htab_fd = kvm_check_extension(s, KVM_CAP_PPC_HTAB_FD);
 cap_fixup_hcalls = kvm_check_extension(s, KVM_CAP_PPC_FIXUP_HCALL);
 cap_htm = kvm_vm_check_extension(s, KVM_CAP_PPC_HTM);
+cap_mmu_radix = kvm_vm_check_extension(s, KVM_CAP_PPC_MMU_RADIX);
+cap_mmu_hash_v3 = kvm_vm_check_extension(s, KVM_CAP_PPC_MMU_HASH_V3);
 
 if (!cap_interrupt_level) {
 fprintf(stderr, "KVM: Couldn't find level irq capability. Expect the "
@@ -2354,6 +2358,16 @@ bool kvmppc_has_cap_htm(void)
 return cap_htm;
 }
 
+bool kvmppc_has_cap_mmu_radix(void)
+{
+return cap_mmu_radix;
+}
+
+bool kvmppc_has_cap_mmu_hash_v3(void)
+{
+return cap_mmu_hash_v3;
+}
+
 static PowerPCCPUClass *ppc_cpu_get_family_class(PowerPCCPUClass *pcc)
 {
 ObjectClass *oc = OBJECT_CLASS(pcc);
diff --git a/target/ppc/kvm_ppc.h b/target/ppc/kvm_ppc.h
index 08ecf75..64189a4 100644
--- a/target/ppc/kvm_ppc.h
+++ b/target/ppc/kvm_ppc.h
@@ -54,6 +54,8 @@ void kvmppc_read_hptes(ppc_hash_pte64_t *hptes, hwaddr ptex, 
int n);
 void kvmppc_write_hpte(hwaddr ptex, uint64_t pte0, uint64_t pte1);
 bool kvmppc_has_cap_fixup_hcalls(void);
 bool kvmppc_has_cap_htm(void);
+bool kvmppc_has_cap_mmu_radix(void);
+bool kvmppc_has_cap_mmu_hash_v3(void);
 int kvmppc_enable_hwrng(void);
 int kvmppc_put_books_sregs(PowerPCCPU *cpu);
 PowerPCCPUClass *kvm_ppc_get_host_cpu_class(void);
@@ -254,6 +256,16 @@ static inline bool kvmppc_has_cap_htm(void)
 return false;
 }
 
+static inline bool kvmppc_has_cap_mmu_radix(void)
+{
+return false;
+}
+
+static inline bool kvmppc_has_cap_mmu_hash_v3(void)
+{
+return false;
+}
+
 static inline int kvmppc_enable_hwrng(void)
 {
 return -1;
-- 
2.9.3
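
The query-once, read-many pattern used by this patch can be sketched outside QEMU roughly as follows. This is only an illustration: every demo_* name is hypothetical, and demo_check_extension() is a stand-in for a kvm_vm_check_extension()-style query rather than the real KVM interface.

```c
#include <stdbool.h>

/* Hypothetical capability numbers, for illustration only. */
enum { DEMO_CAP_MMU_RADIX = 1, DEMO_CAP_MMU_HASH_V3 = 2 };

/* Counts how often the (pretend) kernel is asked, so caching is observable. */
static int demo_query_count;

/* Stand-in for a capability query: returns > 0 when supported.
 * Here only the radix capability is reported as present. */
static int demo_check_extension(int cap)
{
    demo_query_count++;
    return cap == DEMO_CAP_MMU_RADIX ? 1 : 0;
}

/* Cached results, filled in once at init time. */
static int demo_cap_mmu_radix;
static int demo_cap_mmu_hash_v3;

static void demo_arch_init(void)
{
    demo_cap_mmu_radix = demo_check_extension(DEMO_CAP_MMU_RADIX);
    demo_cap_mmu_hash_v3 = demo_check_extension(DEMO_CAP_MMU_HASH_V3);
}

/* Accessors read the cache and never query again. */
static bool demo_has_cap_mmu_radix(void)
{
    return demo_cap_mmu_radix;
}

static bool demo_has_cap_mmu_hash_v3(void)
{
    return demo_cap_mmu_hash_v3;
}
```

Callers would run demo_arch_init() once and then use the accessors freely. The non-KVM build in the patch gets the same accessor names as inline stubs returning false, which keeps callers free of #ifdefs.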




[Qemu-devel] [PULL 09/47] target/ppc: Implement H_REGISTER_PROCESS_TABLE H_CALL

2017-04-23 Thread David Gibson
From: Suraj Jitindar Singh 

The H_REGISTER_PROCESS_TABLE H_CALL is used by a guest to indicate to the
hypervisor where in memory its process table is and how translation should
be performed using this process table.

Provide the implementation of this H_CALL for a guest.

We first check for invalid flags, then parse the flags to determine the
operation, and then check the other parameters for valid values based on
the operation (register new table/deregister table/maintain registration).
The process table is then stored in the appropriate location and registered
with the hypervisor (if running under KVM), and the LPCR_[UPRT/GTSE] bits
are updated as required.
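
The hash/radix transition rule that spapr_check_setup_free_hpt() documents in this patch can be sketched as a small pure function. This is a sketch under assumptions: the demo_* names are hypothetical, and the bit position given for the GR flag is assumed rather than taken from the patch.

```c
#include <stdint.h>

/* Assumed position of the "guest radix" bit in a partition table entry. */
#define DEMO_PATBE1_GR 0x8000000000000000ULL

enum demo_hpt_action {
    DEMO_HPT_DO_NOTHING, /* HASH->HASH, RADIX->RADIX, NOTHING->RADIX */
    DEMO_HPT_FREE,       /* HASH->RADIX */
    DEMO_HPT_ALLOCATE    /* RADIX->HASH, NOTHING->HASH */
};

/* Decide what to do with the hash page table when the partition table
 * entry changes from patbe_old to patbe_new. "NOTHING" is treated as
 * radix, matching the assumption stated in the patch comment. */
static enum demo_hpt_action demo_hpt_action_for(uint64_t patbe_old,
                                                uint64_t patbe_new)
{
    uint64_t old_radix = patbe_old & DEMO_PATBE1_GR;
    uint64_t new_radix = patbe_new & DEMO_PATBE1_GR;

    if (old_radix == new_radix) {
        return DEMO_HPT_DO_NOTHING;
    }
    /* The modes differ: leaving radix needs an HPT, entering it frees one. */
    return old_radix ? DEMO_HPT_ALLOCATE : DEMO_HPT_FREE;
}
```

Only the GR bit is compared, so changes to other fields of the entry never trigger an allocation or a free.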

Signed-off-by: Suraj Jitindar Singh 
Signed-off-by: Sam Bobroff 
[dwg: Correct missing prototype and uninitialized variable]
Signed-off-by: David Gibson 
---
 hw/ppc/spapr.c |  35 +--
 hw/ppc/spapr_hcall.c   | 113 +++--
 include/hw/ppc/spapr.h |   2 +
 target/ppc/kvm.c   |  31 ++
 target/ppc/kvm_ppc.h   |  10 +
 5 files changed, 176 insertions(+), 15 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index ea247e6..54391a1 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -40,6 +40,7 @@
 #include "kvm_ppc.h"
 #include "migration/migration.h"
 #include "mmu-hash64.h"
+#include "mmu-book3s-v3.h"
 #include "qom/cpu.h"
 
 #include "hw/boards.h"
@@ -1113,7 +1114,7 @@ static int get_htab_fd(sPAPRMachineState *spapr)
 return spapr->htab_fd;
 }
 
-static void close_htab_fd(sPAPRMachineState *spapr)
+void close_htab_fd(sPAPRMachineState *spapr)
 {
 if (spapr->htab_fd >= 0) {
 close(spapr->htab_fd);
@@ -1240,6 +1241,19 @@ static void spapr_reallocate_hpt(sPAPRMachineState *spapr, int shift,
 }
 }
 
+void spapr_setup_hpt_and_vrma(sPAPRMachineState *spapr)
+{
+spapr_reallocate_hpt(spapr,
+ spapr_hpt_shift_for_ramsize(MACHINE(spapr)->maxram_size),
+ &error_fatal);
+if (spapr->vrma_adjust) {
+spapr->rma_size = kvmppc_rma_size(spapr_node0_size(),
+  spapr->htab_shift);
+}
+/* We're setting up a hash table, so that means we're not radix */
+spapr->patb_entry = 0;
+}
+
 static void find_unknown_sysbus_device(SysBusDevice *sbdev, void *opaque)
 {
 bool matched = false;
@@ -1268,17 +1282,14 @@ static void ppc_spapr_reset(void)
 /* Check for unknown sysbus devices */
 foreach_dynamic_sysbus_device(find_unknown_sysbus_device, NULL);
 
-spapr->patb_entry = 0;
-
-/* Allocate and/or reset the hash page table */
-spapr_reallocate_hpt(spapr,
- spapr_hpt_shift_for_ramsize(machine->maxram_size),
- &error_fatal);
-
-/* Update the RMA size if necessary */
-if (spapr->vrma_adjust) {
-spapr->rma_size = kvmppc_rma_size(spapr_node0_size(),
-  spapr->htab_shift);
+if (kvm_enabled() && kvmppc_has_cap_mmu_radix()) {
+/* If using KVM with radix mode available, VCPUs can be started
+ * without a HPT because KVM will start them in radix mode.
+ * Set the GR bit in PATB so that we know there is no HPT. */
+spapr->patb_entry = PATBE1_GR;
+} else {
+spapr->patb_entry = 0;
+spapr_setup_hpt_and_vrma(spapr);
 }
 
 qemu_devices_reset();
diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index 7952129..a958fee 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -12,6 +12,8 @@
 #include "trace.h"
 #include "kvm_ppc.h"
 #include "hw/ppc/spapr_ovec.h"
+#include "qemu/error-report.h"
+#include "mmu-book3s-v3.h"
 
 struct SPRSyncState {
 int spr;
@@ -894,14 +896,119 @@ static target_ulong h_invalidate_pid(PowerPCCPU *cpu, sPAPRMachineState *spapr,
 return H_FUNCTION;
 }
 
+static void spapr_check_setup_free_hpt(sPAPRMachineState *spapr,
+   uint64_t patbe_old, uint64_t patbe_new)
+{
+/*
+ * We have 4 Options:
+ * HASH->HASH || RADIX->RADIX || NOTHING->RADIX : Do Nothing
+ * HASH->RADIX  : Free HPT
+ * RADIX->HASH  : Allocate HPT
+ * NOTHING->HASH: Allocate HPT
+ * Note: NOTHING implies the case where we said the guest could choose
+ *   later and so assumed radix and now it's called H_REG_PROC_TBL
+ */
+
+if ((patbe_old & PATBE1_GR) == (patbe_new & PATBE1_GR)) {
+/* We assume RADIX, so this catches all the "Do Nothing" cases */
+} else if (!(patbe_old & PATBE1_GR)) {
+/* HASH->RADIX : Free HPT */
+g_free(spapr->htab);
+spapr->htab = NULL;
+spapr->htab_shift = 0;
+close_htab_fd(spapr);
+} else if (!(patbe_new & PATBE1_GR)) {
+/* RADIX->HASH 

[Qemu-devel] [PULL 06/47] spapr: Add ibm, processor-radix-AP-encodings to the device tree

2017-04-23 Thread David Gibson
From: Sam Bobroff 

Use the new ioctl, KVM_PPC_GET_RMMU_INFO, to fetch radix MMU
information from KVM and present the page encodings in the device tree
under ibm,processor-radix-AP-encodings. This provides page size
information to the guest which is necessary for it to use radix mode.

Signed-off-by: Sam Bobroff 
[dwg: Compile fix for 32-bit targets, style nit fix]
Signed-off-by: David Gibson 
---
 hw/ppc/spapr.c   | 13 +
 include/sysemu/kvm.h |  1 +
 target/ppc/cpu-qom.h |  1 +
 target/ppc/cpu.h |  4 
 target/ppc/kvm.c | 29 +
 5 files changed, 48 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 3edc3dd..ea247e6 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -459,6 +459,8 @@ static void spapr_populate_cpu_dt(CPUState *cs, void *fdt, int offset,
 sPAPRDRConnector *drc;
 sPAPRDRConnectorClass *drck;
 int drc_index;
+uint32_t radix_AP_encodings[PPC_PAGE_SIZES_MAX_SZ];
+int i;
 
 drc = spapr_dr_connector_by_id(SPAPR_DR_CONNECTOR_TYPE_CPU, index);
 if (drc) {
@@ -544,6 +546,17 @@ static void spapr_populate_cpu_dt(CPUState *cs, void *fdt, int offset,
 _FDT(spapr_fixup_cpu_numa_dt(fdt, offset, cs));
 
 _FDT(spapr_fixup_cpu_smt_dt(fdt, offset, cpu, compat_smt));
+
+if (pcc->radix_page_info) {
+for (i = 0; i < pcc->radix_page_info->count; i++) {
+radix_AP_encodings[i] =
+cpu_to_be32(pcc->radix_page_info->entries[i]);
+}
+_FDT((fdt_setprop(fdt, offset, "ibm,processor-radix-AP-encodings",
+  radix_AP_encodings,
+  pcc->radix_page_info->count *
+  sizeof(radix_AP_encodings[0]))));
+}
 }
 
 static void spapr_populate_cpus_dt_node(void *fdt, sPAPRMachineState *spapr)
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index 24281fc..5cc83f2 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -527,5 +527,6 @@ int kvm_set_one_reg(CPUState *cs, uint64_t id, void *source);
  * Returns: 0 on success, or a negative errno on failure.
  */
 int kvm_get_one_reg(CPUState *cs, uint64_t id, void *target);
+struct ppc_radix_page_info *kvm_get_radix_page_info(void);
 int kvm_get_max_memslots(void);
 #endif
diff --git a/target/ppc/cpu-qom.h b/target/ppc/cpu-qom.h
index 81500e5..d0cf6ca 100644
--- a/target/ppc/cpu-qom.h
+++ b/target/ppc/cpu-qom.h
@@ -197,6 +197,7 @@ typedef struct PowerPCCPUClass {
 int bfd_mach;
 uint32_t l1_dcache_size, l1_icache_size;
 const struct ppc_segment_page_sizes *sps;
+struct ppc_radix_page_info *radix_page_info;
 void (*init_proc)(CPUPPCState *env);
 int  (*check_pow)(CPUPPCState *env);
 int (*handle_mmu_fault)(PowerPCCPU *cpu, vaddr eaddr, int rwx, int mmu_idx);
diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index 5ee33b3..cacdd0a 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -943,6 +943,10 @@ struct ppc_segment_page_sizes {
 struct ppc_one_seg_page_size sps[PPC_PAGE_SIZES_MAX_SZ];
 };
 
+struct ppc_radix_page_info {
+uint32_t count;
+uint32_t entries[PPC_PAGE_SIZES_MAX_SZ];
+};
 
 /*/
 /* The whole PowerPC CPU context */
diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
index 79b90a6..9dc2f7f 100644
--- a/target/ppc/kvm.c
+++ b/target/ppc/kvm.c
@@ -50,6 +50,7 @@
 #include "hw/ppc/spapr_cpu_core.h"
 #endif
 #include "elf.h"
+#include "sysemu/kvm_int.h"
 
 //#define DEBUG_KVM
 
@@ -333,6 +334,30 @@ static void kvm_get_smmu_info(PowerPCCPU *cpu, struct kvm_ppc_smmu_info *info)
 kvm_get_fallback_smmu_info(cpu, info);
 }
 
+struct ppc_radix_page_info *kvm_get_radix_page_info(void)
+{
+KVMState *s = KVM_STATE(current_machine->accelerator);
+struct ppc_radix_page_info *radix_page_info;
+struct kvm_ppc_rmmu_info rmmu_info;
+int i;
+
+if (!kvm_check_extension(s, KVM_CAP_PPC_MMU_RADIX)) {
+return NULL;
+}
+if (kvm_vm_ioctl(s, KVM_PPC_GET_RMMU_INFO, &rmmu_info)) {
+return NULL;
+}
+radix_page_info = g_malloc0(sizeof(*radix_page_info));
+radix_page_info->count = 0;
+for (i = 0; i < PPC_PAGE_SIZES_MAX_SZ; i++) {
+if (rmmu_info.ap_encodings[i]) {
+radix_page_info->entries[i] = rmmu_info.ap_encodings[i];
+radix_page_info->count++;
+}
+}
+return radix_page_info;
+}
+
 static bool kvm_valid_page_size(uint32_t flags, long rampgsize, uint32_t shift)
 {
 if (!(flags & KVM_PPC_PAGE_SIZES_REAL)) {
@@ -2303,6 +2328,10 @@ static void kvmppc_host_cpu_class_init(ObjectClass *oc, void *data)
 if (icache_size != -1) {
 pcc->l1_icache_size = icache_size;
 }
+
+#if defined(TARGET_PPC64)
+pcc->radix_page_info = kvm_get_radix_page_info();
+#endif /* defined(TARGET_PPC64) */
 }
 
 bool kvmppc_has_cap_epr(void)
-- 
2.9.3
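
As a rough illustration of how the encodings end up in the device tree property, the sketch below packs 32-bit values into a byte buffer in big-endian order, which is what the cpu_to_be32() loop in this patch does before calling fdt_setprop(). The demo_* helpers are hypothetical and do not use libfdt.

```c
#include <stddef.h>
#include <stdint.h>

/* Store one 32-bit value in big-endian byte order, independent of the
 * host's endianness. */
static void demo_store_be32(uint8_t *dst, uint32_t value)
{
    dst[0] = (uint8_t)(value >> 24);
    dst[1] = (uint8_t)(value >> 16);
    dst[2] = (uint8_t)(value >> 8);
    dst[3] = (uint8_t)value;
}

/* Pack `count` encodings into `out` as a property payload.
 * Returns the payload length in bytes, or 0 if `out` is too small. */
static size_t demo_pack_ap_encodings(const uint32_t *entries, uint32_t count,
                                     uint8_t *out, size_t out_size)
{
    size_t needed = (size_t)count * sizeof(uint32_t);
    uint32_t i;

    if (needed > out_size) {
        return 0;
    }
    for (i = 0; i < count; i++) {
        demo_store_be32(out + (size_t)i * 4, entries[i]);
    }
    return needed;
}
```

The returned length corresponds to the count * sizeof(entry) expression passed as the property size in the patch.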




[Qemu-devel] [PULL 12/47] spapr: Workaround for broken radix guests

2017-04-23 Thread David Gibson
From: Sam Bobroff 

For a little while around 4.9, Linux kernels that saw the radix bit in
ibm,pa-features would attempt to set up the MMU as if they were a
hypervisor, even if they were a guest, which would cause them to
crash.

Work around this by detecting pre-ISA 3.0 guests by their lack of that
bit in option vector 1, and then removing the radix bit from
ibm,pa-features. Note: This now requires regeneration of that node
after CAS negotiation.
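
The core of the workaround is a single bit clear in the pa-features byte array. A minimal sketch, with hypothetical demo_* names and a bounds check on the byte actually written (the patch itself tests pa_size > 40):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Byte index and mask of the radix MMU bit, as used in the patch
 * (attribute byte 40 plus the 2-byte property header). */
#define DEMO_RADIX_BYTE (40 + 2)
#define DEMO_RADIX_BIT  0x80

/* Hide the radix bit from guests flagged as legacy; leave everything
 * else in the property untouched. */
static void demo_hide_radix_bit(uint8_t *pa_features, size_t pa_size,
                                bool legacy_guest)
{
    if (legacy_guest && pa_size > DEMO_RADIX_BYTE) {
        pa_features[DEMO_RADIX_BYTE] &= (uint8_t)~DEMO_RADIX_BIT;
    }
}
```

Because the legacy flag is only known after CAS negotiation, the property has to be regenerated at that point, as the commit message notes.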

Signed-off-by: Sam Bobroff 
[dwg: Fix style nits]
Signed-off-by: David Gibson 
---
 hw/ppc/spapr.c  | 15 +--
 hw/ppc/spapr_hcall.c|  6 --
 include/hw/ppc/spapr.h  |  1 +
 include/hw/ppc/spapr_ovec.h |  3 +++
 4 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index d967ec3..a355512 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -228,7 +228,8 @@ static int spapr_fixup_cpu_numa_dt(void *fdt, int offset, CPUState *cs)
 }
 
 /* Populate the "ibm,pa-features" property */
-static void spapr_populate_pa_features(CPUPPCState *env, void *fdt, int offset)
+static void spapr_populate_pa_features(CPUPPCState *env, void *fdt, int offset,
+  bool legacy_guest)
 {
 uint8_t pa_features_206[] = { 6, 0,
 0xf6, 0x1f, 0xc7, 0x00, 0x80, 0xc0 };
@@ -295,6 +296,12 @@ static void spapr_populate_pa_features(CPUPPCState *env, void *fdt, int offset)
 if (kvmppc_has_cap_htm() && pa_size > 24) {
 pa_features[24] |= 0x80;/* Transactional memory support */
 }
+if (legacy_guest && pa_size > 40) {
+/* Workaround for broken kernels that attempt (guest) radix
+ * mode when they can't handle it, if they see the radix bit set
+ * in pa-features. So hide it from them. */
+pa_features[40 + 2] &= ~0x80; /* Radix MMU */
+}
 
 _FDT((fdt_setprop(fdt, offset, "ibm,pa-features", pa_features, pa_size)));
 }
@@ -309,6 +316,7 @@ static int spapr_fixup_cpu_dt(void *fdt, sPAPRMachineState *spapr)
 
 CPU_FOREACH(cs) {
 PowerPCCPU *cpu = POWERPC_CPU(cs);
+CPUPPCState *env = &cpu->env;
 DeviceClass *dc = DEVICE_GET_CLASS(cs);
 int index = ppc_get_vcpu_dt_id(cpu);
 int compat_smt = MIN(smp_threads, ppc_compat_max_threads(cpu));
@@ -350,6 +358,9 @@ static int spapr_fixup_cpu_dt(void *fdt, sPAPRMachineState *spapr)
 if (ret < 0) {
 return ret;
 }
+
+spapr_populate_pa_features(env, fdt, offset,
+ spapr->cas_legacy_guest_workaround);
 }
 return ret;
 }
@@ -547,7 +558,7 @@ static void spapr_populate_cpu_dt(CPUState *cs, void *fdt, int offset,
   page_sizes_prop, page_sizes_prop_size)));
 }
 
-spapr_populate_pa_features(env, fdt, offset);
+spapr_populate_pa_features(env, fdt, offset, false);
 
 _FDT((fdt_setprop_cell(fdt, offset, "ibm,chip-id",
cs->cpu_index / vcpus_per_socket)));
diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index cbd1f29..9f18f75 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -1062,7 +1062,7 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu,
 uint32_t max_compat = cpu->max_compat;
 uint32_t best_compat = 0;
 int i;
-sPAPROptionVector *ov5_guest, *ov5_cas_old, *ov5_updates;
+sPAPROptionVector *ov1_guest, *ov5_guest, *ov5_cas_old, *ov5_updates;
 bool guest_radix;
 
 /*
@@ -1114,6 +1114,7 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu,
 /* For the future use: here @ov_table points to the first option vector */
 ov_table = list;
 
+ov1_guest = spapr_ovec_parse_vector(ov_table, 1);
 ov5_guest = spapr_ovec_parse_vector(ov_table, 5);
 if (spapr_ovec_test(ov5_guest, OV5_MMU_BOTH)) {
 error_report("guest requested hash and radix MMU, which is invalid.");
@@ -1155,7 +1156,8 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu,
 exit(EXIT_FAILURE);
 }
 }
-
+spapr->cas_legacy_guest_workaround = !spapr_ovec_test(ov1_guest,
+  OV1_PPC_3_00);
 if (!spapr->cas_reboot) {
 spapr->cas_reboot =
 (spapr_h_cas_compose_response(spapr, args[1], args[2],
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index d234efc..e27de64 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -89,6 +89,7 @@ struct sPAPRMachineState {
 sPAPROptionVector *ov5; /* QEMU-supported option vectors */
 sPAPROptionVector *ov5_cas; /* negotiated (via CAS) option vectors */
 bool cas_reboot;
+bool cas_legacy_guest_workaround;
 
 Notifier epow_notifier;
 QTAILQ_HEAD(, sPAPREventLogEntry) pending_events;
diff --git a/include/hw/ppc/spapr_ovec.h b/include/hw/ppc/spapr_ovec.h

[Qemu-devel] [PULL 14/47] spapr_pci: Warn when RAM page size is not enabled in IOMMU page mask

2017-04-23 Thread David Gibson
From: Alexey Kardashevskiy 

If a page size used by QEMU is not enabled in the PHB IOMMU page mask,
in-kernel acceleration of TCE handling won't be enabled and performance
might be slower than expected.

This prints a warning if system page size is not enabled. This should
print a warning if huge pages are enabled but sphb.pgsz still uses
the default value of 4K|64K.

Signed-off-by: Alexey Kardashevskiy 
Signed-off-by: David Gibson 
---
 hw/ppc/spapr_pci.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 98c52e4..097ebdd 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -1771,6 +1771,12 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
 }
 
 /* DMA setup */
+if ((sphb->page_size_mask & qemu_getrampagesize()) == 0) {
+error_report("System page size 0x%lx is not enabled in page_size_mask "
+ "(0x%"PRIx64"). Performance may be slow",
+ qemu_getrampagesize(), sphb->page_size_mask);
+}
+
 for (i = 0; i < windows_supported; ++i) {
 tcet = spapr_tce_new_table(DEVICE(sphb), sphb->dma_liobn[i]);
 if (!tcet) {
-- 
2.9.3
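
The check added here is a plain bitmask test: each supported IOMMU page size is one bit in the mask, and the RAM page size is itself a power of two. A minimal sketch, with a hypothetical helper name:

```c
#include <stdbool.h>
#include <stdint.h>

/* True when `page_size` (a power of two) is one of the sizes enabled in
 * `page_size_mask`, where each enabled size contributes its own bit. */
static bool demo_page_size_enabled(uint64_t page_size_mask,
                                   uint64_t page_size)
{
    return (page_size_mask & page_size) != 0;
}
```

With the default mask of 4K|64K mentioned in the commit message, a 64K RAM page size passes while a 16M huge page size would trigger the warning.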




[Qemu-devel] [PULL 03/47] ppc/spapr: QOM'ify sPAPRRTCState

2017-04-23 Thread David Gibson
From: Cédric Le Goater 

Also use an 'sPAPRRTCState' attribute under the sPAPR machine to hold
the RTC object. Overall, these changes remove an unnecessary and
implicit dependency on SysBus.

Signed-off-by: Cédric Le Goater 
Signed-off-by: David Gibson 
---
 hw/ppc/spapr.c | 16 
 hw/ppc/spapr_events.c  |  2 +-
 hw/ppc/spapr_rtc.c | 41 +++--
 include/hw/ppc/spapr.h | 21 -
 include/hw/ppc/xics.h  |  2 +-
 5 files changed, 33 insertions(+), 49 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index de5db75..3edc3dd 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1333,13 +1333,13 @@ static void spapr_create_nvram(sPAPRMachineState *spapr)
 
 static void spapr_rtc_create(sPAPRMachineState *spapr)
 {
-DeviceState *dev = qdev_create(NULL, TYPE_SPAPR_RTC);
-
-qdev_init_nofail(dev);
-spapr->rtc = dev;
-
-object_property_add_alias(qdev_get_machine(), "rtc-time",
-  OBJECT(spapr->rtc), "date", NULL);
+object_initialize(&spapr->rtc, sizeof(spapr->rtc), TYPE_SPAPR_RTC);
+object_property_add_child(OBJECT(spapr), "rtc", OBJECT(&spapr->rtc),
+  &error_fatal);
+object_property_set_bool(OBJECT(&spapr->rtc), true, "realized",
+  &error_fatal);
+object_property_add_alias(OBJECT(spapr), "rtc-time", OBJECT(&spapr->rtc),
+  "date", &error_fatal);
 }
 
 /* Returns whether we want to use VGA or not */
@@ -1377,7 +1377,7 @@ static int spapr_post_load(void *opaque, int version_id)
  * So when migrating from those versions, poke the incoming offset
  * value into the RTC device */
 if (version_id < 3) {
-err = spapr_rtc_import_offset(spapr->rtc, spapr->rtc_offset);
+err = spapr_rtc_import_offset(&spapr->rtc, spapr->rtc_offset);
 }
 
 return err;
diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
index 24a5758..f0b28d8 100644
--- a/hw/ppc/spapr_events.c
+++ b/hw/ppc/spapr_events.c
@@ -422,7 +422,7 @@ static void spapr_init_maina(struct rtas_event_log_v6_maina *maina,
 maina->hdr.section_id = cpu_to_be16(RTAS_LOG_V6_SECTION_ID_MAINA);
 maina->hdr.section_length = cpu_to_be16(sizeof(*maina));
 /* FIXME: section version, subtype and creator id? */
-spapr_rtc_read(spapr->rtc, &tm, NULL);
+spapr_rtc_read(&spapr->rtc, &tm, NULL);
 year = tm.tm_year + 1900;
 maina->creation_date = cpu_to_be32((to_bcd(year / 100) << 24)
| (to_bcd(year % 100) << 16)
diff --git a/hw/ppc/spapr_rtc.c b/hw/ppc/spapr_rtc.c
index 3a17ac4..00a4e4c 100644
--- a/hw/ppc/spapr_rtc.c
+++ b/hw/ppc/spapr_rtc.c
@@ -33,19 +33,8 @@
 #include "qapi-event.h"
 #include "qemu/cutils.h"
 
-#define SPAPR_RTC(obj) \
-OBJECT_CHECK(sPAPRRTCState, (obj), TYPE_SPAPR_RTC)
-
-typedef struct sPAPRRTCState sPAPRRTCState;
-struct sPAPRRTCState {
-/*< private >*/
-SysBusDevice parent_obj;
-int64_t ns_offset;
-};
-
-void spapr_rtc_read(DeviceState *dev, struct tm *tm, uint32_t *ns)
+void spapr_rtc_read(sPAPRRTCState *rtc, struct tm *tm, uint32_t *ns)
 {
-sPAPRRTCState *rtc = SPAPR_RTC(dev);
 int64_t host_ns = qemu_clock_get_ns(rtc_clock);
 int64_t guest_ns;
 time_t guest_s;
@@ -63,16 +52,12 @@ void spapr_rtc_read(DeviceState *dev, struct tm *tm, uint32_t *ns)
 }
 }
 
-int spapr_rtc_import_offset(DeviceState *dev, int64_t legacy_offset)
+int spapr_rtc_import_offset(sPAPRRTCState *rtc, int64_t legacy_offset)
 {
-sPAPRRTCState *rtc;
-
-if (!dev) {
+if (!rtc) {
 return -ENODEV;
 }
 
-rtc = SPAPR_RTC(dev);
-
 rtc->ns_offset = legacy_offset * NANOSECONDS_PER_SECOND;
 
 return 0;
@@ -91,12 +76,7 @@ static void rtas_get_time_of_day(PowerPCCPU *cpu, sPAPRMachineState *spapr,
 return;
 }
 
-if (!spapr->rtc) {
-rtas_st(rets, 0, RTAS_OUT_HW_ERROR);
-return;
-}
-
-spapr_rtc_read(spapr->rtc, &tm, &ns);
+spapr_rtc_read(&spapr->rtc, &tm, &ns);
 
 rtas_st(rets, 0, RTAS_OUT_SUCCESS);
 rtas_st(rets, 1, tm.tm_year + 1900);
@@ -113,7 +93,7 @@ static void rtas_set_time_of_day(PowerPCCPU *cpu, sPAPRMachineState *spapr,
  target_ulong args,
  uint32_t nret, target_ulong rets)
 {
-sPAPRRTCState *rtc;
+sPAPRRTCState *rtc = &spapr->rtc;
 struct tm tm;
 time_t new_s;
 int64_t host_ns;
@@ -123,11 +103,6 @@ static void rtas_set_time_of_day(PowerPCCPU *cpu, sPAPRMachineState *spapr,
 return;
 }
 
-if (!spapr->rtc) {
-rtas_st(rets, 0, RTAS_OUT_HW_ERROR);
-return;
-}
-
 tm.tm_year = rtas_ld(args, 0) - 1900;
 tm.tm_mon = rtas_ld(args, 1) - 1;
 tm.tm_mday = rtas_ld(args, 2);
@@ -144,8 +119,6 @@ static void rtas_set_time_of_day(PowerPCCPU *cpu, sPAPRMachineState *spapr,
 /* Generate a monitor event for the change */
 

[Qemu-devel] [PULL 00/47] ppc-for-2.10 queue 20170424

2017-04-23 Thread David Gibson
The following changes since commit 32c7e0ab755745e961f1772e95cac381cc68769d:

  Merge remote-tracking branch 'remotes/juanquintela/tags/migration/20170421' into staging (2017-04-21 15:59:27 +0100)

are available in the git repository at:

  git://github.com/dgibson/qemu.git tags/ppc-for-2.10-20170424

for you to fetch changes up to 4cab48942a1c5353f0a314fab1aa85a5f0a61461:

  target/ppc: Style fixes (2017-04-24 08:56:19 +1000)


ppc patch queue 2017-04-24

Here's my first pull request for qemu-2.10, consisting of assorted
patches which have accumulated while qemu-2.9 stabilized.  Highlights
are:
* Rework / cleanup of the XICS interrupt controller
* Substantial improvement to the 'powernv' machine type
- Includes an MMIO XICS version
* POWER9 support improvements
- POWER9 guests with KVM
- Partial support for POWER9 guests with TCG
* IOMMU and VFIO improvements
* Assorted minor changes

There are several IPMI patches here that aren't usually in my area of
maintenance, but there isn't a regular maintainer and these patches
are for the benefit of the powernv machine type.


Alexey Kardashevskiy (4):
  target-ppc: kvm: make use of KVM_CREATE_SPAPR_TCE_64
  target-ppc/kvm: Enable in-kernel TCE acceleration for multi-tce
  spapr_pci: Warn when RAM page size is not enabled in IOMMU page mask
  spapr_pci: Removed unused include

Anton Blanchard (1):
  target/ppc: Fix size of struct PPCElfPrstatus

Benjamin Herrenschmidt (2):
  ppc/pnv: Add OCC model stub with interrupt support
  ppc/pnv: Add support for POWER8+ LPC Controller

Bernhard Kaindl (1):
  e500,book3s: mfspr 259: Register mapped/aliased SPRG3 user read

Bharata B Rao (1):
  spapr-cpu-core: Release ICPState object during CPU unrealization

Cédric Le Goater (25):
  ppc/spapr: QOM'ify sPAPRRTCState
  ppc/xics: introduce an 'intc' backlink under PowerPCCPU
  spapr: move the IRQ server number mapping under the machine
  spapr: allocate the ICPState object from under sPAPRCPUCore
  ppc/xics: add a realize() handler to ICPStateClass
  ppc/pnv: add a PnvICPState object
  ppc/pnv: extend the machine with a XICSFabric interface
  ppc/pnv: extend the machine with a InterruptStatsProvider interface
  ppc/pnv: create the ICP object under PnvCore
  ppc/pnv: add a helper to calculate MMIO addresses registers
  ppc/pnv: add memory regions for the ICP registers
  ppc/pnv: Add cut down PSI bridge model and hookup external interrupt
  ppc: add IPMI support
  ipmi: use a file to load SDRs
  ipmi: provide support for FRUs
  ipmi: introduce an ipmi_bmc_sdr_find() API
  ipmi: introduce an ipmi_bmc_gen_event() API
  spapr: remove the 'nr_servers' field from the machine
  ppc/pnv: enable only one LPC bus
  ppc/pnv: scan ISA bus to populate device tree
  ppc/pnv: populate device tree for RTC devices
  ppc/pnv: populate device tree for serial devices
  ppc/pnv: populate device tree for IPMI BT devices
  ppc/pnv: add initial IPMI sensors for the BMC simulator
  ppc/pnv: generate an OEM SEL event on shutdown

David Gibson (2):
  pseries: Add pseries-2.10 machine type
  target/ppc: Style fixes

Sam Bobroff (6):
  target/ppc: Improve accuracy of guest HTM availability on P8s
  spapr: Add ibm,processor-radix-AP-encodings to the device tree
  target-ppc: support KVM_CAP_PPC_MMU_RADIX, KVM_CAP_PPC_MMU_HASH_V3
  spapr: move spapr_populate_pa_features()
  spapr: Enable ISA 3.0 MMU mode selection via CAS
  spapr: Workaround for broken radix guests

Suraj Jitindar Singh (4):
  target/ppc: Add new H-CALL shells for in memory table translation
  target/ppc: Implement H_REGISTER_PROCESS_TABLE H_CALL
  target/ppc: Add ibm,processor-radix-AP-encodings for TCG
  target/ppc: Flush TLB on write to PIDR

Thomas Huth (1):
  hw/ppc/pnv: Classify the "PowerNV Chip" devices as CPU devices

 default-configs/ppc64-softmmu.mak |   4 +
 hw/intc/Makefile.objs |   1 +
 hw/intc/xics.c|  22 +-
 hw/intc/xics_pnv.c| 192 +
 hw/intc/xics_spapr.c  |  25 +-
 hw/ipmi/ipmi_bmc_sim.c| 191 -
 hw/ppc/Makefile.objs  |   2 +-
 hw/ppc/pnv.c  | 411 ---
 hw/ppc/pnv_bmc.c  | 122 
 hw/ppc/pnv_core.c |  27 +-
 hw/ppc/pnv_lpc.c  | 106 ++-
 hw/ppc/pnv_occ.c  | 136 +
 hw/ppc/pnv_psi.c  | 571 ++
 hw/ppc/spapr.c| 371 +++--
 hw/ppc/spapr_cpu_core.c   |  17 +-
 hw/ppc/spapr_events.c |   2 +-
 hw/ppc/spapr_hcall.c  | 174 

[Qemu-devel] [PULL 05/47] target-ppc: kvm: make use of KVM_CREATE_SPAPR_TCE_64

2017-04-23 Thread David Gibson
From: Alexey Kardashevskiy 

KVM_CAP_SPAPR_TCE capability allows creating TCE tables in KVM which
allows having in-kernel acceleration for H_PUT_TCE_xxx hypercalls.
However it only supports 32bit DMA windows at zero bus offset.

There is a new KVM_CAP_SPAPR_TCE_64 capability which supports 64bit
window size, variable page size and bus offset.

This makes use of the new capability. The kernel headers are already
updated as the kernel support went in to v4.6.
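
The selection between the two capabilities can be sketched as a pure decision function. This is an illustration only: the demo_* names are hypothetical, and the legacy constraints (window size must fit in 32 bits, bus offset must be zero) are taken from the description above and the fallback branch in this patch.

```c
#include <stdbool.h>
#include <stdint.h>

enum demo_tce_path {
    DEMO_TCE_NONE,   /* no in-kernel table; caller falls back to userspace */
    DEMO_TCE_LEGACY, /* KVM_CAP_SPAPR_TCE style: 32-bit window at offset 0 */
    DEMO_TCE_64      /* KVM_CAP_SPAPR_TCE_64 style: 64-bit window, any offset */
};

/* Pick which in-kernel TCE table interface could serve the request. */
static enum demo_tce_path demo_pick_tce_path(bool cap_tce, bool cap_tce_64,
                                             uint32_t page_shift,
                                             uint64_t bus_offset,
                                             uint32_t nb_table)
{
    if (cap_tce_64) {
        return DEMO_TCE_64;
    }
    if (cap_tce && page_shift < 64) {
        uint64_t window_size = (uint64_t)nb_table << page_shift;

        /* The legacy interface carries the window size in 32 bits and
         * has no field for a bus offset. */
        if (window_size > UINT32_MAX || bus_offset != 0) {
            return DEMO_TCE_NONE;
        }
        return DEMO_TCE_LEGACY;
    }
    return DEMO_TCE_NONE;
}
```

For example, 0x40000 entries of 4K pages give a 1 GiB window that still fits the legacy interface, while 0x10000 entries of 64K pages give a 4 GiB window that does not.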

Signed-off-by: Alexey Kardashevskiy 
Signed-off-by: David Gibson 
---
 hw/ppc/spapr_iommu.c |  8 +---
 target/ppc/kvm.c | 48 +---
 target/ppc/kvm_ppc.h | 12 +++-
 3 files changed, 49 insertions(+), 19 deletions(-)

diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c
index ae30bbe..29c80bb 100644
--- a/hw/ppc/spapr_iommu.c
+++ b/hw/ppc/spapr_iommu.c
@@ -79,15 +79,16 @@ static IOMMUAccessFlags spapr_tce_iommu_access_flags(uint64_t tce)
 
 static uint64_t *spapr_tce_alloc_table(uint32_t liobn,
uint32_t page_shift,
+   uint64_t bus_offset,
uint32_t nb_table,
int *fd,
bool need_vfio)
 {
 uint64_t *table = NULL;
-uint64_t window_size = (uint64_t)nb_table << page_shift;
 
-if (kvm_enabled() && !(window_size >> 32)) {
-table = kvmppc_create_spapr_tce(liobn, window_size, fd, need_vfio);
+if (kvm_enabled()) {
+table = kvmppc_create_spapr_tce(liobn, page_shift, bus_offset, nb_table,
+fd, need_vfio);
 }
 
 if (!table) {
@@ -342,6 +343,7 @@ void spapr_tce_table_enable(sPAPRTCETable *tcet,
 tcet->nb_table = nb_table;
 tcet->table = spapr_tce_alloc_table(tcet->liobn,
 tcet->page_shift,
+tcet->bus_offset,
 tcet->nb_table,
 &tcet->fd,
 tcet->need_vfio);
diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
index c3d4262..79b90a6 100644
--- a/target/ppc/kvm.c
+++ b/target/ppc/kvm.c
@@ -74,6 +74,7 @@ static int cap_booke_sregs;
 static int cap_ppc_smt;
 static int cap_ppc_rma;
 static int cap_spapr_tce;
+static int cap_spapr_tce_64;
 static int cap_spapr_multitce;
 static int cap_spapr_vfio;
 static int cap_hior;
@@ -126,6 +127,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
 cap_ppc_smt = kvm_check_extension(s, KVM_CAP_PPC_SMT);
 cap_ppc_rma = kvm_check_extension(s, KVM_CAP_PPC_RMA);
 cap_spapr_tce = kvm_check_extension(s, KVM_CAP_SPAPR_TCE);
+cap_spapr_tce_64 = kvm_check_extension(s, KVM_CAP_SPAPR_TCE_64);
 cap_spapr_multitce = kvm_check_extension(s, KVM_CAP_SPAPR_MULTITCE);
 cap_spapr_vfio = false;
 cap_one_reg = kvm_check_extension(s, KVM_CAP_ONE_REG);
@@ -2136,13 +2138,10 @@ bool kvmppc_spapr_use_multitce(void)
 return cap_spapr_multitce;
 }
 
-void *kvmppc_create_spapr_tce(uint32_t liobn, uint32_t window_size, int *pfd,
-  bool need_vfio)
+void *kvmppc_create_spapr_tce(uint32_t liobn, uint32_t page_shift,
+  uint64_t bus_offset, uint32_t nb_table,
+  int *pfd, bool need_vfio)
 {
-struct kvm_create_spapr_tce args = {
-.liobn = liobn,
-.window_size = window_size,
-};
 long len;
 int fd;
 void *table;
@@ -2155,14 +2154,41 @@ void *kvmppc_create_spapr_tce(uint32_t liobn, uint32_t window_size, int *pfd,
 return NULL;
 }
 
-fd = kvm_vm_ioctl(kvm_state, KVM_CREATE_SPAPR_TCE, &args);
-if (fd < 0) {
-fprintf(stderr, "KVM: Failed to create TCE table for liobn 0x%x\n",
-liobn);
+if (cap_spapr_tce_64) {
+struct kvm_create_spapr_tce_64 args = {
+.liobn = liobn,
+.page_shift = page_shift,
+.offset = bus_offset >> page_shift,
+.size = nb_table,
+.flags = 0
+};
+fd = kvm_vm_ioctl(kvm_state, KVM_CREATE_SPAPR_TCE_64, &args);
+if (fd < 0) {
+fprintf(stderr,
+"KVM: Failed to create TCE64 table for liobn 0x%x\n",
+liobn);
+return NULL;
+}
+} else if (cap_spapr_tce) {
+uint64_t window_size = (uint64_t) nb_table << page_shift;
+struct kvm_create_spapr_tce args = {
+.liobn = liobn,
+.window_size = window_size,
+};
+if ((window_size != args.window_size) || bus_offset) {
+return NULL;
+}
+fd = kvm_vm_ioctl(kvm_state, KVM_CREATE_SPAPR_TCE, &args);
+if (fd < 0) {
+fprintf(stderr, "KVM: Failed to create TCE table for liobn 0x%x\n",
+

[Qemu-devel] [PULL 04/47] hw/ppc/pnv: Classify the "PowerNV Chip" devices as CPU devices

2017-04-23 Thread David Gibson
From: Thomas Huth 

The devices that are derived from TYPE_PNV_CHIP currently show up
as "uncategorized" devices in the help text of "-device ?". Since
they obviously are related to the CPU, let's put them into the
CPU category instead.

Signed-off-by: Thomas Huth 
Signed-off-by: David Gibson 
---
 hw/ppc/pnv.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index 3fa722a..aad7917 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -723,6 +723,7 @@ static void pnv_chip_class_init(ObjectClass *klass, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
 
+set_bit(DEVICE_CATEGORY_CPU, dc->categories);
 dc->realize = pnv_chip_realize;
 dc->props = pnv_chip_properties;
 dc->desc = "PowerNV Chip";
-- 
2.9.3
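
For readers unfamiliar with the idiom, the category marking is a bit set in a small bitmap, one bit per category. A self-contained sketch follows; the demo_* names and the category list are hypothetical, not QEMU's actual definitions.

```c
#include <limits.h>
#include <stdbool.h>

#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical category list, for illustration only. */
enum demo_category {
    DEMO_CATEGORY_MISC,
    DEMO_CATEGORY_CPU,
    DEMO_CATEGORY_MAX
};

/* Mark bit `nr` in the bitmap. */
static void demo_set_bit(unsigned int nr, unsigned long *map)
{
    map[nr / DEMO_BITS_PER_LONG] |= 1UL << (nr % DEMO_BITS_PER_LONG);
}

/* Report whether bit `nr` is set in the bitmap. */
static bool demo_test_bit(unsigned int nr, const unsigned long *map)
{
    return (map[nr / DEMO_BITS_PER_LONG] >> (nr % DEMO_BITS_PER_LONG)) & 1UL;
}
```

A device class that sets no category bit at all is what ends up listed as "uncategorized" in the help output described above.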




[Qemu-devel] [PULL 01/47] target/ppc: Improve accuracy of guest HTM availability on P8s

2017-04-23 Thread David Gibson
From: Sam Bobroff 

On Power8 hosts it is currently theoretically possible for QEMU/KVM-HV guests
to receive an ibm,pa-features property indicating that HTM support is available
when it is not.  The situation would occur if the platform firmware of
a Power8 host cleared the HTM bit of the ibm,pa-features property.
QEMU would query KVM for the availability of HTM, which will return no
support, but workaround code in kvm_arch_init_vcpu() would then
re-enable it because KVM_HV is in use and the processor is P8.

This patch adjusts the workaround in kvm_arch_init_vcpu() so that it does not
enable HTM (in the above case) unless the host kernel indicates to the QEMU
process, via the auxiliary vector, that userspace can use HTM (via the HWCAP2
bit PPC_FEATURE2_HAS_HTM).

The reason to use the value from the auxiliary vector is that it is
set based only on what the host kernel found in the ibm,pa-features
HTM bit at boot time.

Signed-off-by: Sam Bobroff 
Signed-off-by: David Gibson 
---
 target/ppc/kvm.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
index 64017ac..c3d4262 100644
--- a/target/ppc/kvm.c
+++ b/target/ppc/kvm.c
@@ -49,6 +49,7 @@
 #if defined(TARGET_PPC64)
 #include "hw/ppc/spapr_cpu_core.h"
 #endif
+#include "elf.h"
 
 //#define DEBUG_KVM
 
@@ -509,8 +510,11 @@ int kvm_arch_init_vcpu(CPUState *cs)
 case POWERPC_MMU_2_07:
 if (!cap_htm && !kvmppc_is_pr(cs->kvm_state)) {
 /* KVM-HV has transactional memory on POWER8 also without the
- * KVM_CAP_PPC_HTM extension, so enable it here instead. */
-cap_htm = true;
+ * KVM_CAP_PPC_HTM extension, so enable it here instead as
+ * long as it's available to userspace on the host. */
+if (qemu_getauxval(AT_HWCAP2) & PPC_FEATURE2_HAS_HTM) {
+cap_htm = true;
+}
 }
 break;
 default:
-- 
2.9.3
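
The adjusted decision for the POWER8 (MMU 2.07) case can be sketched as a pure function of three inputs. This is a sketch under assumptions: the demo_* names are hypothetical, the HWCAP2 bit value is assumed, and in real code the hwcap2 argument would come from the auxiliary vector (AT_HWCAP2).

```c
#include <stdbool.h>

/* Assumed value of the HWCAP2 bit that advertises HTM to userspace. */
#define DEMO_PPC_FEATURE2_HAS_HTM 0x40000000UL

/* Decide whether the guest should be told HTM is available on a POWER8
 * host. kvm_reports_htm is the result of the capability query,
 * is_kvm_pr says whether KVM-PR (rather than KVM-HV) is in use, and
 * hwcap2 is the host's AT_HWCAP2 word. */
static bool demo_effective_htm(bool kvm_reports_htm, bool is_kvm_pr,
                               unsigned long hwcap2)
{
    if (!kvm_reports_htm && !is_kvm_pr) {
        /* KVM-HV without the capability: trust what the host kernel
         * exposes to userspace instead of enabling unconditionally. */
        return (hwcap2 & DEMO_PPC_FEATURE2_HAS_HTM) != 0;
    }
    return kvm_reports_htm;
}
```

Before this patch the first branch would have returned true unconditionally, which is the case the commit message describes as re-enabling HTM after firmware had cleared it.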




[Qemu-devel] [PULL 02/47] pseries: Add pseries-2.10 machine type

2017-04-23 Thread David Gibson
Signed-off-by: David Gibson 
---
 hw/ppc/spapr.c | 23 +--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 35db949..de5db75 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -3158,18 +3158,37 @@ static const TypeInfo spapr_machine_info = {
 type_init(spapr_machine_register_##suffix)
 
 /*
+ * pseries-2.10
+ */
+static void spapr_machine_2_10_instance_options(MachineState *machine)
+{
+}
+
+static void spapr_machine_2_10_class_options(MachineClass *mc)
+{
+/* Defaults for the latest behaviour inherited from the base class */
+}
+
+DEFINE_SPAPR_MACHINE(2_10, "2.10", true);
+
+/*
  * pseries-2.9
  */
+#define SPAPR_COMPAT_2_9   \
+HW_COMPAT_2_9
+
 static void spapr_machine_2_9_instance_options(MachineState *machine)
 {
+spapr_machine_2_10_instance_options(machine);
 }
 
 static void spapr_machine_2_9_class_options(MachineClass *mc)
 {
-/* Defaults for the latest behaviour inherited from the base class */
+spapr_machine_2_10_class_options(mc);
+SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_9);
 }
 
-DEFINE_SPAPR_MACHINE(2_9, "2.9", true);
+DEFINE_SPAPR_MACHINE(2_9, "2.9", false);
 
 /*
  * pseries-2.8
-- 
2.9.3




Re: [Qemu-devel] Subject: [PATCH] target-ppc: Add global timer group A to open-pic.

2017-04-23 Thread alarson
David Gibson wrote on 04/23/2017 06:17:22 PM:

> From: David Gibson 
> To: Aaron Larson 
> Cc: ag...@suse.de, qemu-devel@nongnu.org, qemu-...@nongnu.org
> Date: 04/23/2017 06:54 PM
> Subject: Re: Subject: [PATCH] target-ppc: Add global timer group A to open-pic.
> 
> On Fri, Apr 21, 2017 at 02:58:23PM -0700, Aaron Larson wrote:
> > Add global timer group A to open-pic.  This patch is still somewhat
> > dubious because I'm not sure how to determine what QEMU wants for the
> > timer frequency.  Suggestions solicited.
> 
> This commit message really needs some more context.  What's a "global
> timer group A" and why do we want it?

The open-pic spec includes two sets/groups of timers, groups A and B,
4 timers in each group.  Previously in QEMU the timer group A
registers were implemented but they did not "count" or generate any
interrupts.  The patch makes the timers do both (count and generate
interrupts).

About a year ago I mentioned that we had implemented this and offered
to submit a patch, which seemed to be acceptable.  Sadly, when I
reviewed the implementation it had several egregious errors that I
didn't know how to fix until recently.

Quite frankly I didn't expect the patch to be accepted in its current
form for the reason mentioned in the above commit log and was hoping
for some guidance.  If this is no longer a desirable patch, I can
certainly continue to maintain it locally for our use.

Aaron.
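
For readers following the thread, here is a minimal sketch of the counting behaviour described above. It assumes the 40 ns per tick figure from the patch's own TODO comment; `timer_remaining_ticks` and its parameters are hypothetical names for illustration, not functions from the patch or from QEMU.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed conversion factor: 25 MHz timer clock, i.e. 40 ns per tick. */
#define CONV_FACTOR 40ULL

static inline uint64_t ns_to_ticks(uint64_t ns)   { return ns / CONV_FACTOR; }
static inline uint64_t ticks_to_ns(uint64_t tick) { return tick * CONV_FACTOR; }

/*
 * Hypothetical helper: ticks left on a down-counter that was loaded with
 * base_count at virtual time origin_ns, observed at virtual time now_ns.
 * The result saturates at zero once the counter has expired.
 */
static uint64_t timer_remaining_ticks(uint64_t base_count,
                                      uint64_t origin_ns,
                                      uint64_t now_ns)
{
    assert(now_ns >= origin_ns);
    uint64_t used = ns_to_ticks(now_ns - origin_ns);
    return used >= base_count ? 0 : base_count - used;
}
```

Deriving the current count from elapsed virtual time on each read means the model does not need to wake up on every tick; only the expiry needs a host timer.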



Re: [Qemu-devel] [PATCH v2 0/9] Openrisc misc features / fixes

2017-04-23 Thread no-reply
Hi,

This series seems to have some coding style problems. See output below for
more information:

Message-id: cover.1492986468.git.sho...@gmail.com
Subject: [Qemu-devel] [PATCH v2 0/9] Openrisc misc features / fixes
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

# Useful git options
git config --local diff.renamelimit 0
git config --local diff.renames True

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
    echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
    if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
        failed=1
        echo
    fi
    n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
e25195d target/openrisc: Remove duplicate features property
7ed50bd target/openrisc: Implement full vmstate serialization
49d0127 migration: Add VMSTATE_STRUCT_2DARRAY()
27c47a9 target/openrisc: implement shadow registers
7cf6151 migration: Add VMSTATE_UINTTL_2DARRAY()
649863f target/openrisc: add numcores and coreid support
dab96ab target/openrisc: Fixes for memory debugging
9fe0bd9 target/openrisc: Implement EPH bit
42314a9 target/openrisc: Implement EVBAR register

=== OUTPUT BEGIN ===
Checking PATCH 1/9: target/openrisc: Implement EVBAR register...
Checking PATCH 2/9: target/openrisc: Implement EPH bit...
Checking PATCH 3/9: target/openrisc: Fixes for memory debugging...
WARNING: line over 80 characters
#45: FILE: target/openrisc/mmu.c:232:
+miss = cpu_openrisc_get_phys_addr(cpu, &phys_addr, &prot, addr, MMU_INST_FETCH);

total: 0 errors, 1 warnings, 38 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
Checking PATCH 4/9: target/openrisc: add numcores and coreid support...
Checking PATCH 5/9: migration: Add VMSTATE_UINTTL_2DARRAY()...
Checking PATCH 6/9: target/openrisc: implement shadow registers...
WARNING: line over 80 characters
#135: FILE: linux-user/signal.c:4492:
+__put_user(sas_ss_flags(cpu_get_gpr(env, 1)), 
>uc.tuc_stack.ss_flags);

total: 0 errors, 1 warnings, 230 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
Checking PATCH 7/9: migration: Add VMSTATE_STRUCT_2DARRAY()...
ERROR: line over 90 characters
#20: FILE: include/migration/vmstate.h:503:
+#define VMSTATE_STRUCT_2DARRAY_TEST(_field, _state, _n1, _n2, _test, _version, _vmsd, _type) { \

ERROR: spaces required around that '|' (ctx:VxV)
#27: FILE: include/migration/vmstate.h:510:
+.flags= VMS_STRUCT|VMS_ARRAY,   \
   ^

WARNING: line over 80 characters
#38: FILE: include/migration/vmstate.h:761:
+#define VMSTATE_STRUCT_2DARRAY(_field, _state, _n1, _n2, _version, _vmsd, _type) \

total: 2 errors, 1 warnings, 27 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 8/9: target/openrisc: Implement full vmstate serialization...
Checking PATCH 9/9: target/openrisc: Remove duplicate features property...
=== OUTPUT END ===

Test command exited with code: 1


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-de...@freelists.org

Re: [Qemu-devel] Subject: [PATCH] target-ppc: Add global timer group A to open-pic.

2017-04-23 Thread David Gibson
On Fri, Apr 21, 2017 at 02:58:23PM -0700, Aaron Larson wrote:
> Add global timer group A to open-pic.  This patch is still somewhat
> dubious because I'm not sure how to determine what QEMU wants for the
> timer frequency.  Suggestions solicited.

This commit message really needs some more context.  What's a "global
timer group A" and why do we want it?

> 
> Signed-off-by: Aaron Larson 
> ---
>  hw/intc/openpic.c | 134 
> ++
>  1 file changed, 116 insertions(+), 18 deletions(-)
> 
> diff --git a/hw/intc/openpic.c b/hw/intc/openpic.c
> index 4349e45..769a31a 100644
> --- a/hw/intc/openpic.c
> +++ b/hw/intc/openpic.c
> @@ -45,6 +45,7 @@
>  #include "qemu/bitops.h"
>  #include "qapi/qmp/qerror.h"
>  #include "qemu/log.h"
> +#include "qemu/timer.h"
>  
>  //#define DEBUG_OPENPIC
>  
> @@ -54,8 +55,10 @@ static const int debug_openpic = 1;
>  static const int debug_openpic = 0;
>  #endif
>  
> +static int get_current_cpu(void);
>  #define DPRINTF(fmt, ...) do { \
>  if (debug_openpic) { \
> +printf("Core%d: ", get_current_cpu()); \
>  printf(fmt , ## __VA_ARGS__); \
>  } \
>  } while (0)
> @@ -246,9 +249,24 @@ typedef struct IRQSource {
>  #define IDR_EP  0x8000  /* external pin */
>  #define IDR_CI  0x4000  /* critical interrupt */
>  
> +/* Conversion between ticks and nanosecs: TODO: need better way for this.
> + Assume a 100mhz clock, divided by 8, or 25mhz
> + 25,000,000 ticks/sec, 25,000/ms, 25/us, 1 tick/40ns
> +*/
> +#define CONV_FACTOR 40LL
> +static inline uint64_t ns_to_ticks(uint64_t ns)   { return ns   / CONV_FACTOR; }
> +static inline uint64_t ticks_to_ns(uint64_t tick) { return tick * CONV_FACTOR; }
> +
>  typedef struct OpenPICTimer {
>  uint32_t tccr;  /* Global timer current count register */
>  uint32_t tbcr;  /* Global timer base count register */
> +int   n_IRQ;
> +bool  qemu_timer_active; /* Is the qemu_timer running? */
> +struct QEMUTimer *qemu_timer;   /* May be 0 if not created. */
> +struct OpenPICState  *opp;  /* device timer is part of. */
> +/* the time corresponding to the last current_count written or read,
> +   only defined if qemu_timer_active. */
> +uint64_t  originTime;
>  } OpenPICTimer;
>  
>  typedef struct OpenPICMSI {
> @@ -795,37 +813,102 @@ static uint64_t openpic_gbl_read(void *opaque, hwaddr addr, unsigned len)
>  return retval;
>  }
>  
> +static void openpic_tmr_set_tmr(OpenPICTimer *tmr, uint32_t val, bool enabled);
> +
> +static void qemu_timer_cb(void *opaque)
> +{
> +OpenPICTimer *tmr = opaque;
> +OpenPICState *opp = tmr->opp;
> +uint32_t  n_IRQ = tmr->n_IRQ;
> +uint32_t val =   tmr->tbcr & ~TBCR_CI;
> +uint32_t tog = ((tmr->tccr & TCCR_TOG) ^ TCCR_TOG);  /* invert toggle. */
> +
> +DPRINTF("%s n_IRQ=%d\n", __func__, n_IRQ);
> +/* reload current count from base count and setup timer. */
> +tmr->tccr = val | tog;
> +openpic_tmr_set_tmr(tmr, val, /*enabled=*/true);
> +/* raise the interrupt. */
> +opp->src[n_IRQ].destmask = read_IRQreg_idr(opp, n_IRQ);
> +openpic_set_irq(opp, n_IRQ, 1);
> +openpic_set_irq(opp, n_IRQ, 0);
> +}
> +
> +/* If enabled is true, arranges for an interrupt to be raised val clocks into
> +   the future, if enabled is false cancels the timer. */
> +static void openpic_tmr_set_tmr(OpenPICTimer *tmr, uint32_t val, bool enabled)
> +{
> +/* if timer doesn't exist, create it. */
> +if (tmr->qemu_timer == 0) {
> +tmr->qemu_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, &qemu_timer_cb, tmr);
> +DPRINTF("Created timer for n_IRQ %d\n", tmr->n_IRQ);
> +}
> +uint64_t ns = ticks_to_ns(val & ~TCCR_TOG);
> +/* A count of zero causes a timer to be set to expire immediately.  This
> +   effectively stops the simulation so we don't honor that configuration.
> +   On real hardware, this would generate an interrupt on every clock cycle
> +   if the interrupt was unmasked. */
> +if ((ns == 0) || !enabled) {
> +tmr->qemu_timer_active = false;
> +tmr->tccr = tmr->tccr & TCCR_TOG;
> +timer_del(tmr->qemu_timer); /* set timer to never expire. */
> +} else {
> +tmr->qemu_timer_active = true;
> +uint64_t now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
> +tmr->originTime = now;
> +timer_mod(tmr->qemu_timer, now + ns); /* set timer expiration. */
> +}
> +}
> +
> +/* Returns the current tccr value, i.e., timer value (in clocks) with
> +   appropriate TOG. */
> +static uint64_t openpic_tmr_get_timer(OpenPICTimer *tmr)
> +{
> +uint64_t retval;
> +if (!tmr->qemu_timer_active) {
> +retval = tmr->tccr;
> +} else {
> +uint64_t now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
> +uint64_t used = now - tmr->originTime;  /* nsecs */
> + 

Re: [Qemu-devel] [PATCH v2] s390x/misc_helper.c: wrap s390_virtio_hypercall in BQL

2017-04-23 Thread Philippe Mathieu-Daudé

On 04/23/2017 07:47 PM, Aurelien Jarno wrote:
> On 2017-04-23 19:38, Philippe Mathieu-Daudé wrote:
>> Hi Aurelien!
>>
>> Why not lock inside s390_virtio_hypercall() directly around the diag500
>> dispatch call?
>
> s390_virtio_hypercall is shared between TCG and KVM. For KVM the lock is
> already done before calling s390_virtio_hypercall in kvm_arch_handle_exit.

Fair enough!

Reviewed-by: Philippe Mathieu-Daudé 



Re: [Qemu-devel] [PATCH v2] s390x/misc_helper.c: wrap s390_virtio_hypercall in BQL

2017-04-23 Thread Aurelien Jarno
On 2017-04-23 19:38, Philippe Mathieu-Daudé wrote:
> Hi Aurelien!
> 
> Why not lock inside s390_virtio_hypercall() directly around the diag500
> dispatch call?

s390_virtio_hypercall is shared between TCG and KVM. For KVM the lock is
already done before calling s390_virtio_hypercall in kvm_arch_handle_exit.

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net



[Qemu-devel] [PATCH v2 8/9] target/openrisc: Implement full vmstate serialization

2017-04-23 Thread Stafford Horne
Previously serialization did not persist the tlb, timer, pic and other
key state items.  This meant snapshotting and restoring a running os
would crash. After adding these I am able to take snapshots of a
running linux os and restore at a later time.

I am currently not trying to maintain compatibility with older versions
as I do not believe this really worked before or anyone used it.

Signed-off-by: Stafford Horne 
---
Changes since v1:

 o Added evbar
 o Bumped version on vmstate_env

 target/openrisc/machine.c | 73 +--
 1 file changed, 71 insertions(+), 2 deletions(-)

diff --git a/target/openrisc/machine.c b/target/openrisc/machine.c
index 2bf71c3..a82be62 100644
--- a/target/openrisc/machine.c
+++ b/target/openrisc/machine.c
@@ -24,6 +24,63 @@
 #include "hw/boards.h"
 #include "migration/cpu.h"
 
+static int env_post_load(void *opaque, int version_id)
+{
+CPUOpenRISCState *env = opaque;
+
+/* Restore MMU handlers */
+if (env->sr & SR_DME) {
+env->tlb->cpu_openrisc_map_address_data =
+&cpu_openrisc_get_phys_data;
+} else {
+env->tlb->cpu_openrisc_map_address_data =
+&cpu_openrisc_get_phys_nommu;
+}
+
+if (env->sr & SR_IME) {
+env->tlb->cpu_openrisc_map_address_code =
+&cpu_openrisc_get_phys_code;
+} else {
+env->tlb->cpu_openrisc_map_address_code =
+&cpu_openrisc_get_phys_nommu;
+}
+
+
+return 0;
+}
+
+static const VMStateDescription vmstate_tlb_entry = {
+.name = "tlb_entry",
+.version_id = 1,
+.minimum_version_id = 1,
+.minimum_version_id_old = 1,
+.fields = (VMStateField[]) {
+VMSTATE_UINTTL(mr, OpenRISCTLBEntry),
+VMSTATE_UINTTL(tr, OpenRISCTLBEntry),
+VMSTATE_END_OF_LIST()
+}
+};
+
+static const VMStateDescription vmstate_cpu_tlb = {
+.name = "cpu_tlb",
+.version_id = 1,
+.minimum_version_id = 1,
+.minimum_version_id_old = 1,
+.fields = (VMStateField[]) {
+VMSTATE_STRUCT_2DARRAY(itlb, CPUOpenRISCTLBContext,
+ ITLB_WAYS, ITLB_SIZE, 0,
+ vmstate_tlb_entry, OpenRISCTLBEntry),
+VMSTATE_STRUCT_2DARRAY(dtlb, CPUOpenRISCTLBContext,
+ DTLB_WAYS, DTLB_SIZE, 0,
+ vmstate_tlb_entry, OpenRISCTLBEntry),
+VMSTATE_END_OF_LIST()
+}
+};
+
+#define VMSTATE_CPU_TLB(_f, _s) \
+VMSTATE_STRUCT_POINTER(_f, _s, vmstate_cpu_tlb, CPUOpenRISCTLBContext)
+
+
 static int get_sr(QEMUFile *f, void *opaque, size_t size, VMStateField *field)
 {
 CPUOpenRISCState *env = opaque;
@@ -47,8 +104,9 @@ static const VMStateInfo vmstate_sr = {
 
 static const VMStateDescription vmstate_env = {
 .name = "env",
-.version_id = 5,
-.minimum_version_id = 5,
+.version_id = 6,
+.minimum_version_id = 6,
+.post_load = env_post_load,
 .fields = (VMStateField[]) {
 VMSTATE_UINTTL_2DARRAY(shadow_gpr, CPUOpenRISCState, 16, 32),
 VMSTATE_UINTTL(pc, CPUOpenRISCState),
@@ -79,9 +137,20 @@ static const VMStateDescription vmstate_env = {
 VMSTATE_UINT32(cpucfgr, CPUOpenRISCState),
 VMSTATE_UINT32(dmmucfgr, CPUOpenRISCState),
 VMSTATE_UINT32(immucfgr, CPUOpenRISCState),
+VMSTATE_UINT32(evbar, CPUOpenRISCState),
 VMSTATE_UINT32(esr, CPUOpenRISCState),
 VMSTATE_UINT32(fpcsr, CPUOpenRISCState),
 VMSTATE_UINT64(mac, CPUOpenRISCState),
+
+VMSTATE_CPU_TLB(tlb, CPUOpenRISCState),
+
+VMSTATE_TIMER_PTR(timer, CPUOpenRISCState),
+VMSTATE_UINT32(ttmr, CPUOpenRISCState),
+VMSTATE_UINT32(ttcr, CPUOpenRISCState),
+
+VMSTATE_UINT32(picmr, CPUOpenRISCState),
+VMSTATE_UINT32(picsr, CPUOpenRISCState),
+
 VMSTATE_END_OF_LIST()
 }
 };
-- 
2.9.3
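
The post_load hook in this patch exists because function pointers cannot be carried in the migration stream; they have to be re-derived from state that is saved. A minimal sketch of that pattern follows. All `Demo*`/`demo_*` names, the bit position of `DEMO_SR_DME`, and the stand-in translation functions are hypothetical and only illustrate the idea; they are not QEMU definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_SR_DME (1u << 5)   /* illustrative "data MMU enabled" bit */

typedef int (*demo_translate_fn)(uint32_t vaddr, uint32_t *paddr);

/* Stand-in for translation with the MMU disabled: identity mapping. */
static int demo_translate_nommu(uint32_t vaddr, uint32_t *paddr)
{
    *paddr = vaddr;
    return 0;
}

/* Stand-in for translation through a TLB (details omitted). */
static int demo_translate_mmu(uint32_t vaddr, uint32_t *paddr)
{
    *paddr = vaddr & 0x00ffffffu;
    return 0;
}

typedef struct DemoState {
    uint32_t sr;                 /* saved and restored with the snapshot */
    demo_translate_fn map_data;  /* NOT saved; rebuilt after load */
} DemoState;

/* Rebuild the handler pointer from the restored status register. */
static int demo_post_load(void *opaque, int version_id)
{
    DemoState *s = opaque;
    (void)version_id;
    assert(s != NULL);
    s->map_data = (s->sr & DEMO_SR_DME) ? demo_translate_mmu
                                        : demo_translate_nommu;
    return 0;
}
```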




[Qemu-devel] [PATCH v2 5/9] migration: Add VMSTATE_UINTTL_2DARRAY()

2017-04-23 Thread Stafford Horne
In openRISC we are implementing the shadow registers as a 2d array.
Using this target long method rather than direct 32-bit alternatives is
consistent with the rest of our vm state serialization logic.

Signed-off-by: Stafford Horne 
---
 include/migration/cpu.h | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/include/migration/cpu.h b/include/migration/cpu.h
index f3d5dfc..a40bd35 100644
--- a/include/migration/cpu.h
+++ b/include/migration/cpu.h
@@ -18,6 +18,8 @@
 VMSTATE_UINT64_EQUAL_V(_f, _s, _v)
 #define VMSTATE_UINTTL_ARRAY_V(_f, _s, _n, _v)\
 VMSTATE_UINT64_ARRAY_V(_f, _s, _n, _v)
+#define VMSTATE_UINTTL_2DARRAY_V(_f, _s, _n1, _n2, _v)\
+VMSTATE_UINT64_2DARRAY_V(_f, _s, _n1, _n2, _v)
 #define VMSTATE_UINTTL_TEST(_f, _s, _t)   \
 VMSTATE_UINT64_TEST(_f, _s, _t)
 #define vmstate_info_uinttl vmstate_info_uint64
@@ -37,6 +39,8 @@
 VMSTATE_UINT32_EQUAL_V(_f, _s, _v)
 #define VMSTATE_UINTTL_ARRAY_V(_f, _s, _n, _v)\
 VMSTATE_UINT32_ARRAY_V(_f, _s, _n, _v)
+#define VMSTATE_UINTTL_2DARRAY_V(_f, _s, _n1, _n2, _v)\
+VMSTATE_UINT32_2DARRAY_V(_f, _s, _n1, _n2, _v)
 #define VMSTATE_UINTTL_TEST(_f, _s, _t)   \
 VMSTATE_UINT32_TEST(_f, _s, _t)
 #define vmstate_info_uinttl vmstate_info_uint32
@@ -48,5 +52,8 @@
 VMSTATE_UINTTL_EQUAL_V(_f, _s, 0)
 #define VMSTATE_UINTTL_ARRAY(_f, _s, _n)  \
 VMSTATE_UINTTL_ARRAY_V(_f, _s, _n, 0)
+#define VMSTATE_UINTTL_2DARRAY(_f, _s, _n1, _n2)  \
+VMSTATE_UINTTL_2DARRAY_V(_f, _s, _n1, _n2, 0)
+
 
 #endif
-- 
2.9.3
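
As a rough illustration of why a "target long" variant is convenient: the element width is chosen once at compile time, and everything sized from it follows. The sketch below uses a hypothetical `DEMO_TARGET_LONG_BITS` switch and `demo_*` names; it does not use the real QEMU macros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the target's register width; defaults to 32. */
#ifndef DEMO_TARGET_LONG_BITS
#define DEMO_TARGET_LONG_BITS 32
#endif

#if DEMO_TARGET_LONG_BITS == 64
typedef uint64_t demo_ulong;
#else
typedef uint32_t demo_ulong;
#endif

/* Bytes occupied by an n1 x n2 array of target-long values. */
static size_t demo_2darray_bytes(size_t n1, size_t n2)
{
    assert(n1 > 0 && n2 > 0);
    return n1 * n2 * sizeof(demo_ulong);
}
```

With the 16 x 32 shadow register array used elsewhere in this series, a 32-bit build of this sketch would give 2048 bytes.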




Re: [Qemu-devel] NetBSD maintenance

2017-04-23 Thread Kamil Rytarowski
On 23.04.2017 12:19, Peter Maydell wrote:
> On 23 April 2017 at 00:27, Kamil Rytarowski  wrote:
>> I noted a call for NetBSD maintainers in the 2.9.0 release notes.
>>
>> I'm willing to attach a NetBSD machine to CI cluster and volunteer basic
>> maintenance. I'm mostly interested in NetBSD as host & as guest as this
>> is my daily and work driver on my desktop and development machines.
> 
> Thanks for the offer of assistance. At the moment I have a
> NetBSD VM set up so I can run the usual make/make check
> tests that I run on other hosts. So the most immediate
> requirement is for somebody to investigate and send patches
> for the bugs which mean it doesn't build at all.
> My initial investigation suggests that at least one bug is that
> ivshmem-server uses shm_open but doesn't link -lrt, so
> some fixes to the build machinery are needed.
> 

I will have a look and try to build qemu from git (master branch).

I also find it surprising that there was a call for CI machines and
there is no infrastructure for it.

>> As of today NetBSD patches for qemu are maintained in pkgsrc. There are
>> also at least DragonFly and SunOS (SmartOS) diffs available.
>> https://github.com/NetBSD/pkgsrc/tree/trunk/emulators/qemu/patches
> 
> Accumulating patches downstream like this is I think
> a big part of the problem -- if QEMU has bugs on NetBSD
> then we need NetBSD users to report the problems and
> provide us with fixes. Otherwise you get what's happened:
> we try to build NetBSD with upstream QEMU, find it doesn't
> even compile, and conclude that obviously nobody's using
> QEMU on NetBSD because nobody's complained that it doesn't
> work, so we might as well drop it.
> 

We should maintain a buildable and functional version for major BSDs (Net,
Free, Open, DragonFly), Linux, Darwin and SunOS in pkgsrc. Not
necessarily every release is verified on each Operating System, but our
users at some point managed to get it functional on all of them.

> (I recall hitting that "ssp/unistd.h defines macros for read, etc"
> bug a few years back when I last tried NetBSD as a host;
> it looks like it's fixed in the headers on newer NetBSD, though,
> so maybe those patches could just be dropped if we're lucky.)
> 

I will have a look.

>> Qemu is one of the core tools in NetBSD development and it's used in our
>> release engineering infrastructure:
>>
>> http://releng.netbsd.org/test-results.html
> 
> Do you use/test the bsd-user code, or just system emulation?
> I know the FreeBSD folks have extensive patches to the usermode
> code, but does it work OK on NetBSD hosts, or is it just unused?
> 

I'm afraid that there is no support in qemu usermode on NetBSD right now.

I just used unicorn-engine (qemu fork) for this purpose.

>> I will start with upstreaming local diffs and move on to running tests.
> 
> Thanks, I think that is the right place to start.
> 

Thanks for reply!

> -- PMM
> 






[Qemu-devel] [PATCH v2 7/9] migration: Add VMSTATE_STRUCT_2DARRAY()

2017-04-23 Thread Stafford Horne
For openrisc we implement tlb state as a 2d array of tlb entry structs.
This is added to allow easy storing of state of 2d arrays.

Signed-off-by: Stafford Horne 
---
 include/migration/vmstate.h | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
index f2dbf84..9b7dcdc 100644
--- a/include/migration/vmstate.h
+++ b/include/migration/vmstate.h
@@ -499,6 +499,17 @@ extern const VMStateInfo vmstate_info_qtailq;
 .offset   = vmstate_offset_array(_state, _field, _type, _num),\
 }
 
+#define VMSTATE_STRUCT_2DARRAY_TEST(_field, _state, _n1, _n2, _test, _version, _vmsd, _type) { \
+.name = (stringify(_field)),\
+.num  = (_n1) * (_n2),  \
+.field_exists = (_test),\
+.version_id   = (_version), \
+.vmsd = &(_vmsd),   \
+.size = sizeof(_type),  \
+.flags= VMS_STRUCT|VMS_ARRAY,   \
+.offset   = vmstate_offset_2darray(_state, _field, _type, _n1, _n2),\
+}
+
 #define VMSTATE_STRUCT_VARRAY_UINT8(_field, _state, _field_num, _version, _vmsd, _type) { \
 .name   = (stringify(_field)),   \
 .num_offset = vmstate_offset_value(_state, _field_num, uint8_t), \
@@ -746,6 +757,10 @@ extern const VMStateInfo vmstate_info_qtailq;
 VMSTATE_STRUCT_ARRAY_TEST(_field, _state, _num, NULL, _version,   \
 _vmsd, _type)
 
+#define VMSTATE_STRUCT_2DARRAY(_field, _state, _n1, _n2, _version, _vmsd, _type) \
+VMSTATE_STRUCT_2DARRAY_TEST(_field, _state, _n1, _n2, NULL, _version,   \
+_vmsd, _type)
+
 #define VMSTATE_BUFFER_UNSAFE_INFO(_field, _state, _version, _info, _size) \
 VMSTATE_BUFFER_UNSAFE_INFO_TEST(_field, _state, NULL, _version, _info, \
 _size)
-- 
2.9.3
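
The macro above sets `.num` to `(_n1) * (_n2)`, which relies on a C 2D array being one contiguous run of elements. A minimal sketch of that idea, assuming a byte-offset walk similar in spirit to what a generic serializer would do; `DemoEntry`, `DEMO_WAYS`, `DEMO_SIZE` and `demo_sum_mr` are hypothetical names, not QEMU API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct DemoEntry {
    uint32_t mr;
    uint32_t tr;
} DemoEntry;

#define DEMO_WAYS 2
#define DEMO_SIZE 4

/*
 * Walk `num` elements of `size` bytes each, starting at `base`, and sum
 * their mr fields.  A DemoEntry[n1][n2] array can be passed with
 * num = n1 * n2 because its rows are laid out back to back.
 */
static uint32_t demo_sum_mr(const void *base, size_t num, size_t size)
{
    const unsigned char *p = base;
    uint32_t sum = 0;

    assert(base != NULL && size == sizeof(DemoEntry));
    for (size_t i = 0; i < num; i++) {
        DemoEntry e;
        memcpy(&e, p + i * size, sizeof(e));
        sum += e.mr;
    }
    return sum;
}
```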




[Qemu-devel] [PATCH v2 2/9] target/openrisc: Implement EPH bit

2017-04-23 Thread Stafford Horne
From: Tim 'mithro' Ansell 

Exception Prefix High (EPH) control bit of the Supervision Register
(SR).

The significant bits (31-12) of the vector offset address for each
exception depend on the setting of the Supervision Register (SR)'s EPH
bit and the Exception Vector Base Address Register (EVBAR).

If SR[EPH] is set, the vector offset is logically ORed with the offset
0xF0000000.

This means if EPH is:
 * 0 - Exception vectors start at EVBAR
 * 1 - Exception vectors start at EVBAR | 0xF0000000

Signed-off-by: Tim 'mithro' Ansell 
Signed-off-by: Stafford Horne 
---
 target/openrisc/interrupt.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/target/openrisc/interrupt.c b/target/openrisc/interrupt.c
index 78f0ba9..2c91fab 100644
--- a/target/openrisc/interrupt.c
+++ b/target/openrisc/interrupt.c
@@ -69,6 +69,9 @@ void openrisc_cpu_do_interrupt(CPUState *cs)
 if (env->cpucfgr & CPUCFGR_EVBARP) {
 vect_pc |= env->evbar;
 }
+if (env->sr & SR_EPH) {
+vect_pc |= 0xf0000000;
+}
 env->pc = vect_pc;
 } else {
 cpu_abort(cs, "Unhandled exception 0x%x\n", cs->exception_index);
-- 
2.9.3
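
The vector address computation described in the commit message can be sketched as below. This is an illustration only: `demo_vector_pc` and its parameters are hypothetical, and the prefix value 0xF0000000 is an assumption taken from the OpenRISC architecture manual's description of the EPH bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed high exception prefix applied when SR[EPH] is set. */
#define DEMO_EPH_PREFIX 0xF0000000u

/*
 * Compute the exception entry address: the exception index selects a
 * 0x100-byte slot, EVBAR (if present) supplies the base, and EPH moves
 * the vectors to the high prefix.
 */
static uint32_t demo_vector_pc(unsigned excp_index, bool has_evbar,
                               uint32_t evbar, bool eph)
{
    uint32_t pc;

    assert(excp_index > 0);
    pc = (uint32_t)excp_index << 8;
    if (has_evbar) {
        pc |= evbar;
    }
    if (eph) {
        pc |= DEMO_EPH_PREFIX;
    }
    return pc;
}
```

For example, exception index 0x8 with EVBAR absent and EPH clear lands at 0x800, the same vector mentioned in the memory debugging patch of this series.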




[Qemu-devel] [PATCH v2 4/9] target/openrisc: add numcores and coreid support

2017-04-23 Thread Stafford Horne
These are used to identify the processor in an SMP system.  They are
already implemented in verilog cores; they are not yet part of the
spec, but will be soon.

The proposal for this is available:
  https://openrisc.io/proposals/core-identifier-and-number-of-cores

Reviewed-by: Richard Henderson 
Signed-off-by: Stafford Horne 
---
 target/openrisc/sys_helper.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/target/openrisc/sys_helper.c b/target/openrisc/sys_helper.c
index 6ba8162..e13666b 100644
--- a/target/openrisc/sys_helper.c
+++ b/target/openrisc/sys_helper.c
@@ -233,6 +233,12 @@ target_ulong HELPER(mfspr)(CPUOpenRISCState *env,
 case TO_SPR(0, 64): /* ESR */
 return env->esr;
 
+case TO_SPR(0, 128): /* COREID */
+return 0;
+
+case TO_SPR(0, 129): /* NUMCORES */
+return 1;
+
 case TO_SPR(1, 512) ... TO_SPR(1, 512+DTLB_SIZE-1): /* DTLBW0MR 0-127 */
 idx = spr - TO_SPR(1, 512);
 return env->tlb->dtlb[0][idx].mr;
-- 
2.9.3




[Qemu-devel] [PATCH v2 6/9] target/openrisc: implement shadow registers

2017-04-23 Thread Stafford Horne
Shadow registers are part of the openrisc spec along with sr[cid], as
part of the fast context switching feature.  When an exception occurs
and the feature is enabled, the CID is incremented and a new set of
registers becomes available, so registers do not have to be saved to
the stack.

This patch only implements the shadow registers, which can be used as
extra scratch registers via mfspr and mtspr if required.  It is
implemented so that fast context switching would be easy to add later;
currently cid is hardcoded to 0.

This is needed for openrisc linux smp kernels to boot correctly.

Signed-off-by: Stafford Horne 
---

Changes since v1:

 o Use accessor functions cpu_get_gpr()/cpu_set_gpr()
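
A minimal sketch of what such accessors could look like, assuming a 16 x 32 register file (the dimensions used for shadow_gpr in the vmstate patch of this series) and a context id fixed at 0. The `Demo*`/`demo_*` names are hypothetical and are not the definitions from this patch.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NUM_CONTEXTS 16
#define DEMO_NUM_GPRS     32

typedef struct DemoCPUState {
    /* One bank of general purpose registers per context id. */
    uint32_t shadow_gpr[DEMO_NUM_CONTEXTS][DEMO_NUM_GPRS];
} DemoCPUState;

/* Read register i of the active context (hardcoded to context 0). */
static inline uint32_t demo_get_gpr(const DemoCPUState *env, int i)
{
    assert(i >= 0 && i < DEMO_NUM_GPRS);
    return env->shadow_gpr[0][i];
}

/* Write register i of the active context (hardcoded to context 0). */
static inline void demo_set_gpr(DemoCPUState *env, int i, uint32_t val)
{
    assert(i >= 0 && i < DEMO_NUM_GPRS);
    env->shadow_gpr[0][i] = val;
}
```

Routing every access through two small functions means that selecting the bank from a live context id later would only touch these accessors, not each caller.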

 linux-user/elfload.c|  2 +-
 linux-user/main.c   | 18 +-
 linux-user/openrisc/target_cpu.h|  6 +++---
 linux-user/openrisc/target_signal.h |  2 +-
 linux-user/signal.c | 16 
 target/openrisc/cpu.c   |  4 +++-
 target/openrisc/cpu.h   | 15 +--
 target/openrisc/gdbstub.c   |  4 ++--
 target/openrisc/machine.c   |  6 +++---
 target/openrisc/sys_helper.c|  9 +
 target/openrisc/translate.c |  5 +++--
 11 files changed, 55 insertions(+), 32 deletions(-)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index f520d77..ce77317 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -1052,7 +1052,7 @@ static void elf_core_copy_regs(target_elf_gregset_t *regs,
 int i;
 
 for (i = 0; i < 32; i++) {
-(*regs)[i] = tswapreg(env->gpr[i]);
+(*regs)[i] = tswapreg(cpu_get_gpr(env, i));
 }
 (*regs)[32] = tswapreg(env->pc);
 (*regs)[33] = tswapreg(cpu_get_sr(env));
diff --git a/linux-user/main.c b/linux-user/main.c
index 10a3bb3..79d621b 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -2590,17 +2590,17 @@ void cpu_loop(CPUOpenRISCState *env)
 case EXCP_SYSCALL:
 env->pc += 4;   /* 0xc00; */
 ret = do_syscall(env,
- env->gpr[11], /* return value   */
- env->gpr[3],  /* r3 - r7 are params */
- env->gpr[4],
- env->gpr[5],
- env->gpr[6],
- env->gpr[7],
- env->gpr[8], 0, 0);
+ cpu_get_gpr(env, 11), /* return value   */
+ cpu_get_gpr(env, 3),  /* r3 - r7 are params */
+ cpu_get_gpr(env, 4),
+ cpu_get_gpr(env, 5),
+ cpu_get_gpr(env, 6),
+ cpu_get_gpr(env, 7),
+ cpu_get_gpr(env, 8), 0, 0);
 if (ret == -TARGET_ERESTARTSYS) {
 env->pc -= 4;
 } else if (ret != -TARGET_QEMU_ESIGRETURN) {
-env->gpr[11] = ret;
+cpu_set_gpr(env, 11, ret);
 }
 break;
 case EXCP_DPF:
@@ -4765,7 +4765,7 @@ int main(int argc, char **argv, char **envp)
 int i;
 
 for (i = 0; i < 32; i++) {
-env->gpr[i] = regs->gpr[i];
+cpu_set_gpr(env, i, regs->gpr[i]);
 }
 env->pc = regs->pc;
 cpu_set_sr(env, regs->sr);
diff --git a/linux-user/openrisc/target_cpu.h b/linux-user/openrisc/target_cpu.h
index f283d96..606ad6f 100644
--- a/linux-user/openrisc/target_cpu.h
+++ b/linux-user/openrisc/target_cpu.h
@@ -23,14 +23,14 @@
 static inline void cpu_clone_regs(CPUOpenRISCState *env, target_ulong newsp)
 {
 if (newsp) {
-env->gpr[1] = newsp;
+cpu_set_gpr(env, 1, newsp);
 }
-env->gpr[11] = 0;
+cpu_set_gpr(env, 11, 0);
 }
 
 static inline void cpu_set_tls(CPUOpenRISCState *env, target_ulong newtls)
 {
-env->gpr[10] = newtls;
+cpu_set_gpr(env, 10, newtls);
 }
 
 #endif
diff --git a/linux-user/openrisc/target_signal.h b/linux-user/openrisc/target_signal.h
index 9f2c493..95a733e 100644
--- a/linux-user/openrisc/target_signal.h
+++ b/linux-user/openrisc/target_signal.h
@@ -20,7 +20,7 @@ typedef struct target_sigaltstack {
 
 static inline abi_ulong get_sp_from_cpustate(CPUOpenRISCState *state)
 {
-return state->gpr[1];
+return cpu_get_gpr(state, 1);
 }
 
 
diff --git a/linux-user/signal.c b/linux-user/signal.c
index a67db04..eb6cb9f 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -4411,7 +4411,7 @@ static void setup_sigcontext(struct target_sigcontext *sc,
  CPUOpenRISCState *regs,
  unsigned long mask)
 {
-unsigned long usp = regs->gpr[1];
+unsigned long usp = cpu_get_gpr(regs, 1);
 
 /* copy the regs. they are first in sc so we can use sc directly */
 
@@ -4436,7 +4436,7 @@ static inline abi_ulong get_sigframe(struct target_sigaction *ka,
   

[Qemu-devel] [PATCH v2 1/9] target/openrisc: Implement EVBAR register

2017-04-23 Thread Stafford Horne
From: Tim 'mithro' Ansell 

Exception Vector Base Address Register (EVBAR) - This optional register
can be used to apply an offset to the exception vector addresses.

The significant bits (31-12) of the vector offset address for each
exception depend on the setting of the Supervision Register (SR)'s EPH
bit and the Exception Vector Base Address Register (EVBAR).

Its presence is indicated by the EVBARP bit in the CPU Configuration
Register (CPUCFGR).

Signed-off-by: Tim 'mithro' Ansell 
Signed-off-by: Stafford Horne 
---
 target/openrisc/cpu.c| 2 ++
 target/openrisc/cpu.h| 7 +++
 target/openrisc/interrupt.c  | 6 +-
 target/openrisc/sys_helper.c | 7 +++
 4 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/target/openrisc/cpu.c b/target/openrisc/cpu.c
index 7fd2b9a..1524ed9 100644
--- a/target/openrisc/cpu.c
+++ b/target/openrisc/cpu.c
@@ -134,6 +134,7 @@ static void or1200_initfn(Object *obj)
 
 set_feature(cpu, OPENRISC_FEATURE_OB32S);
 set_feature(cpu, OPENRISC_FEATURE_OF32S);
+set_feature(cpu, OPENRISC_FEATURE_EVBAR);
 }
 
 static void openrisc_any_initfn(Object *obj)
@@ -141,6 +142,7 @@ static void openrisc_any_initfn(Object *obj)
 OpenRISCCPU *cpu = OPENRISC_CPU(obj);
 
 set_feature(cpu, OPENRISC_FEATURE_OB32S);
+set_feature(cpu, OPENRISC_FEATURE_EVBAR);
 }
 
 typedef struct OpenRISCCPUInfo {
diff --git a/target/openrisc/cpu.h b/target/openrisc/cpu.h
index 418a0e6..1958b72 100644
--- a/target/openrisc/cpu.h
+++ b/target/openrisc/cpu.h
@@ -111,6 +111,11 @@ enum {
 CPUCFGR_OF32S = (1 << 7),
 CPUCFGR_OF64S = (1 << 8),
 CPUCFGR_OV64S = (1 << 9),
+/* CPUCFGR_ND = (1 << 10), */
+/* CPUCFGR_AVRP = (1 << 11), */
+CPUCFGR_EVBARP = (1 << 12),
+/* CPUCFGR_ISRP = (1 << 13), */
+/* CPUCFGR_AECSRP = (1 << 14), */
 };
 
 /* DMMU configure register */
@@ -200,6 +205,7 @@ enum {
 OPENRISC_FEATURE_OF32S = (1 << 7),
 OPENRISC_FEATURE_OF64S = (1 << 8),
 OPENRISC_FEATURE_OV64S = (1 << 9),
+OPENRISC_FEATURE_EVBAR = (1 << 12),
 };
 
 /* Tick Timer Mode Register */
@@ -289,6 +295,7 @@ typedef struct CPUOpenRISCState {
 uint32_t dmmucfgr;/* DMMU configure register */
 uint32_t immucfgr;/* IMMU configure register */
 uint32_t esr; /* Exception supervisor register */
+uint32_t evbar;   /* Exception vector base address register */
 uint32_t fpcsr;   /* Float register */
 float_status fp_status;
 
diff --git a/target/openrisc/interrupt.c b/target/openrisc/interrupt.c
index a2eec6f..78f0ba9 100644
--- a/target/openrisc/interrupt.c
+++ b/target/openrisc/interrupt.c
@@ -65,7 +65,11 @@ void openrisc_cpu_do_interrupt(CPUState *cs)
 env->lock_addr = -1;
 
 if (cs->exception_index > 0 && cs->exception_index < EXCP_NR) {
-env->pc = (cs->exception_index << 8);
+hwaddr vect_pc = cs->exception_index << 8;
+if (env->cpucfgr & CPUCFGR_EVBARP) {
+vect_pc |= env->evbar;
+}
+env->pc = vect_pc;
 } else {
 cpu_abort(cs, "Unhandled exception 0x%x\n", cs->exception_index);
 }
diff --git a/target/openrisc/sys_helper.c b/target/openrisc/sys_helper.c
index 60c3193..6ba8162 100644
--- a/target/openrisc/sys_helper.c
+++ b/target/openrisc/sys_helper.c
@@ -39,6 +39,10 @@ void HELPER(mtspr)(CPUOpenRISCState *env,
 env->vr = rb;
 break;
 
+case TO_SPR(0, 11): /* EVBAR */
+env->evbar = rb;
+break;
+
 case TO_SPR(0, 16): /* NPC */
 cpu_restore_state(cs, GETPC());
 /* ??? Mirror or1ksim in not trashing delayed branch state
@@ -206,6 +210,9 @@ target_ulong HELPER(mfspr)(CPUOpenRISCState *env,
 case TO_SPR(0, 4): /* IMMUCFGR */
 return env->immucfgr;
 
+case TO_SPR(0, 11): /* EVBAR */
+return env->evbar;
+
 case TO_SPR(0, 16): /* NPC (equals PC) */
 cpu_restore_state(cs, GETPC());
 return env->pc;
-- 
2.9.3




[Qemu-devel] [PATCH v2 3/9] target/openrisc: Fixes for memory debugging

2017-04-23 Thread Stafford Horne
When debugging in gdb you might want to inspect instructions in mapped
pages or in exception vectors like 0x800 etc.  This was previously not
possible in qemu since the *get_phys_page_debug() routine only looked
into the data tlb.

Change to fall back to look into instruction tlb and plain physical
pages.

Reviewed-by: Richard Henderson 
Signed-off-by: Stafford Horne 
---
 target/openrisc/mmu.c | 23 +++
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/target/openrisc/mmu.c b/target/openrisc/mmu.c
index 56b11d3..a6d7bcd 100644
--- a/target/openrisc/mmu.c
+++ b/target/openrisc/mmu.c
@@ -124,7 +124,7 @@ static int cpu_openrisc_get_phys_addr(OpenRISCCPU *cpu,
 {
 int ret = TLBRET_MATCH;
 
-if (rw == 2) {/* ITLB */
+if (rw == MMU_INST_FETCH) {/* ITLB */
*physical = 0;
 ret = cpu->env.tlb->cpu_openrisc_map_address_code(cpu, physical,
   prot, address, rw);
@@ -221,12 +221,27 @@ hwaddr openrisc_cpu_get_phys_page_debug(CPUState *cs, vaddr addr)
 OpenRISCCPU *cpu = OPENRISC_CPU(cs);
 hwaddr phys_addr;
 int prot;
+int miss;
 
-if (cpu_openrisc_get_phys_addr(cpu, _addr, , addr, 0)) {
-return -1;
+/* Check memory for any kind of address, since during debug the
+   gdb can ask for anything, check data tlb for address */
+miss = cpu_openrisc_get_phys_addr(cpu, &phys_addr, &prot, addr, 0);
+
+/* Check instruction tlb */
+if (miss) {
+miss = cpu_openrisc_get_phys_addr(cpu, &phys_addr, &prot, addr, 
MMU_INST_FETCH);
+}
+
+/* Last, fall back to a plain address */
+if (miss) {
+miss = cpu_openrisc_get_phys_nommu(cpu, &phys_addr, &prot, addr, 0);
 }
 
-return phys_addr;
+if (miss) {
+return -1;
+} else {
+return phys_addr;
+}
 }
 
 void cpu_openrisc_mmu_init(OpenRISCCPU *cpu)
-- 
2.9.3




[Qemu-devel] [PATCH v2 9/9] target/openrisc: Remove duplicate features property

2017-04-23 Thread Stafford Horne
The features property has stored the exact same thing as the cpucfgr
spr. Remove the feature enum and property as it is not needed.

In order to preserve the behavior of keeping features across reset this
patch moves cpucfgr into the non reset region of the state struct.  Since
the cpucfgr is read only this means we only need to set cpucfgr once
during class init.

Signed-off-by: Stafford Horne 
---
 target/openrisc/cpu.c | 17 +++--
 target/openrisc/cpu.h | 16 ++--
 2 files changed, 5 insertions(+), 28 deletions(-)

diff --git a/target/openrisc/cpu.c b/target/openrisc/cpu.c
index 6c1ed07..c9b3f22 100644
--- a/target/openrisc/cpu.c
+++ b/target/openrisc/cpu.c
@@ -52,7 +52,6 @@ static void openrisc_cpu_reset(CPUState *s)
 s->exception_index = -1;
 
 cpu->env.upr = UPR_UP | UPR_DMP | UPR_IMP | UPR_PICP | UPR_TTP;
-cpu->env.cpucfgr = CPUCFGR_OB32S | CPUCFGR_OF32S | CPUCFGR_NSGF;
 cpu->env.dmmucfgr = (DMMUCFGR_NTW & (0 << 2)) | (DMMUCFGR_NTS & (6 << 2));
 cpu->env.immucfgr = (IMMUCFGR_NTW & (0 << 2)) | (IMMUCFGR_NTS & (6 << 2));
 
@@ -65,12 +64,6 @@ static void openrisc_cpu_reset(CPUState *s)
 #endif
 }
 
-static inline void set_feature(OpenRISCCPU *cpu, int feature)
-{
-cpu->feature |= feature;
-cpu->env.cpucfgr = cpu->feature;
-}
-
 static void openrisc_cpu_realizefn(DeviceState *dev, Error **errp)
 {
 CPUState *cs = CPU(dev);
@@ -132,19 +125,15 @@ static void or1200_initfn(Object *obj)
 {
 OpenRISCCPU *cpu = OPENRISC_CPU(obj);
 
-set_feature(cpu, OPENRISC_FEATURE_NSGF);
-set_feature(cpu, OPENRISC_FEATURE_OB32S);
-set_feature(cpu, OPENRISC_FEATURE_OF32S);
-set_feature(cpu, OPENRISC_FEATURE_EVBAR);
+cpu->env.cpucfgr = CPUCFGR_NSGF | CPUCFGR_OB32S | CPUCFGR_OF32S |
+   CPUCFGR_EVBARP;
 }
 
 static void openrisc_any_initfn(Object *obj)
 {
 OpenRISCCPU *cpu = OPENRISC_CPU(obj);
 
-set_feature(cpu, OPENRISC_FEATURE_NSGF);
-set_feature(cpu, OPENRISC_FEATURE_OB32S);
-set_feature(cpu, OPENRISC_FEATURE_EVBAR);
+cpu->env.cpucfgr = CPUCFGR_NSGF | CPUCFGR_OB32S | CPUCFGR_EVBARP;
 }
 
 typedef struct OpenRISCCPUInfo {
diff --git a/target/openrisc/cpu.h b/target/openrisc/cpu.h
index e159b22..938ccc3 100644
--- a/target/openrisc/cpu.h
+++ b/target/openrisc/cpu.h
@@ -196,18 +196,6 @@ enum {
 SR_SCE = (1 << 17),
 };
 
-/* OpenRISC Hardware Capabilities */
-enum {
-OPENRISC_FEATURE_NSGF = (15 << 0),
-OPENRISC_FEATURE_CGF = (1 << 4),
-OPENRISC_FEATURE_OB32S = (1 << 5),
-OPENRISC_FEATURE_OB64S = (1 << 6),
-OPENRISC_FEATURE_OF32S = (1 << 7),
-OPENRISC_FEATURE_OF64S = (1 << 8),
-OPENRISC_FEATURE_OV64S = (1 << 9),
-OPENRISC_FEATURE_EVBAR = (1 << 12),
-};
-
 /* Tick Timer Mode Register */
 enum {
 TTMR_TP = (0xfff),
@@ -292,7 +280,6 @@ typedef struct CPUOpenRISCState {
 uint32_t sr;  /* Supervisor register, without SR_{F,CY,OV} */
 uint32_t vr;  /* Version register */
 uint32_t upr; /* Unit presence register */
-uint32_t cpucfgr; /* CPU configure register */
 uint32_t dmmucfgr;/* DMMU configure register */
 uint32_t immucfgr;/* IMMU configure register */
 uint32_t esr; /* Exception supervisor register */
@@ -311,6 +298,8 @@ typedef struct CPUOpenRISCState {
 CPU_COMMON
 
 /* Fields from here on are preserved across CPU reset. */
+uint32_t cpucfgr; /* CPU configure register */
+
 #ifndef CONFIG_USER_ONLY
 CPUOpenRISCTLBContext * tlb;
 
@@ -337,7 +326,6 @@ typedef struct OpenRISCCPU {
 
 CPUOpenRISCState env;
 
-uint32_t feature;   /* CPU Capabilities */
 } OpenRISCCPU;
 
 static inline OpenRISCCPU *openrisc_env_get_cpu(CPUOpenRISCState *env)
-- 
2.9.3
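
The commit message depends on fields placed after the "preserved across CPU reset" marker surviving a reset. One common way to get that behaviour is to zero the state only up to the first preserved field. The sketch below assumes that mechanism; `ToyCPUState`, the `TOY_CFG_*` bits and both functions are hypothetical and only echo names from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, trimmed-down CPU state; the layout is illustrative. */
typedef struct ToyCPUState {
    uint32_t sr;
    uint32_t evbar;

    /* Fields from here on are preserved across reset. */
    uint32_t cpucfgr;
} ToyCPUState;

/* Hypothetical configuration bits, for illustration only. */
enum {
    TOY_CFG_OB32S  = 1 << 5,
    TOY_CFG_EVBARP = 1 << 12,
};

/* One-time setup: the read-only configuration is written here only. */
static void toy_cpu_init(ToyCPUState *env)
{
    assert(env != NULL);
    memset(env, 0, sizeof(*env));
    env->cpucfgr = TOY_CFG_OB32S | TOY_CFG_EVBARP;
}

/* Reset: clear everything that precedes the first preserved field. */
static void toy_cpu_reset(ToyCPUState *env)
{
    assert(env != NULL);
    memset(env, 0, offsetof(ToyCPUState, cpucfgr));
}
```

Because the reset never touches `cpucfgr`, there is no need to rewrite it on every reset, which is what lets the patch drop the separate `feature` copy.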




[Qemu-devel] [PATCH v2 0/9] Openrisc misc features / fixes

2017-04-23 Thread Stafford Horne
Hello,

I got a few responses on the last series and have addressed them.  Also
I have dropped the shutdown patch.

I added these patches while working on upcoming openrisc smp support.  This
does not allow for SMP openrisc on qemu "yet" but it does help to allow
booting of an SMP kernel on the uni-processor qemu system which I was using
as a sanity check before testing on fpga hardware.

Changes Since v1:

 o Added Tim's patches to this series (since the vmstate patch depends on
   it)
 o Changed vmstate patch to bump version numbers + support evbar saving
 o Changed Shadow Register patch to use accessor functions
 o Added reviewed-by's
 o Added a patch to remove the `features` field and enums
 o Dropped shutdown on `l.nop 1` patch

-Stafford

Stafford Horne (7):
  target/openrisc: Fixes for memory debugging
  target/openrisc: add numcores and coreid support
  migration: Add VMSTATE_UINTTL_2DARRAY()
  target/openrisc: implement shadow registers
  migration: Add VMSTATE_STRUCT_2DARRAY()
  target/openrisc: Implement full vmstate serialization
  target/openrisc: Remove duplicate features property

Tim 'mithro' Ansell (2):
  target/openrisc: Implement EVBAR register
  target/openrisc: Implement EPH bit

 include/migration/cpu.h |  7 
 include/migration/vmstate.h | 15 
 linux-user/elfload.c|  2 +-
 linux-user/main.c   | 18 -
 linux-user/openrisc/target_cpu.h|  6 +--
 linux-user/openrisc/target_signal.h |  2 +-
 linux-user/signal.c | 16 
 target/openrisc/cpu.c   | 13 ++-
 target/openrisc/cpu.h   | 36 ++
 target/openrisc/gdbstub.c   |  4 +-
 target/openrisc/interrupt.c |  9 -
 target/openrisc/machine.c   | 75 +++--
 target/openrisc/mmu.c   | 23 ++--
 target/openrisc/sys_helper.c| 22 +++
 target/openrisc/translate.c |  5 ++-
 15 files changed, 194 insertions(+), 59 deletions(-)

-- 
2.9.3




Re: [Qemu-devel] [PATCH v2] s390x/misc_helper.c: wrap s390_virtio_hypercall in BQL

2017-04-23 Thread Philippe Mathieu-Daudé

Hi Aurelien!

Why not take the lock inside s390_virtio_hypercall() directly, around the 
diag500 dispatch call?


regards,

Phil.

On 04/23/2017 07:32 PM, Aurelien Jarno wrote:

s390_virtio_hypercall can trigger IO events and interrupts, most notably
when using virtio-ccw devices.

Reviewed-by: Alexander Graf 
Signed-off-by: Aurelien Jarno 
---
 target/s390x/misc_helper.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/target/s390x/misc_helper.c b/target/s390x/misc_helper.c
index 4946b56ab3..aec737d707 100644
--- a/target/s390x/misc_helper.c
+++ b/target/s390x/misc_helper.c
@@ -307,7 +307,9 @@ void HELPER(diag)(CPUS390XState *env, uint32_t r1, uint32_t 
r3, uint32_t num)
 switch (num) {
 case 0x500:
 /* KVM hypercall */
+qemu_mutex_lock_iothread();
 r = s390_virtio_hypercall(env);
+qemu_mutex_unlock_iothread();
 break;
 case 0x44:
 /* yield */





[Qemu-devel] [PATCH] target-s390x: Mask the SIGP order_code to 8bit.

2017-04-23 Thread Aurelien Jarno
From: Philipp Kern 

According to "CPU Signaling and Response", "Signal-Processor Orders",
the order field occupies bit positions 56-63. Without this, the Linux
guest kernel is sometimes unable to stop emulation and enters
an infinite loop of "XXX unknown sigp: 0x0005".

Signed-off-by: Philipp Kern 
Signed-off-by: Aurelien Jarno 
---
 target/s390x/misc_helper.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

This patch was sent by Philipp Kern a long time ago and seems to have
been lost. I am resending it, as it is still useful.

diff --git a/target/s390x/misc_helper.c b/target/s390x/misc_helper.c
index 3bf09ea222..4946b56ab3 100644
--- a/target/s390x/misc_helper.c
+++ b/target/s390x/misc_helper.c
@@ -534,7 +534,7 @@ uint32_t HELPER(sigp)(CPUS390XState *env, uint64_t 
order_code, uint32_t r1,
 /* Remember: Use "R1 or R1 + 1, whichever is the odd-numbered register"
as parameter (input). Status (output) is always R1. */
 
-switch (order_code) {
+switch (order_code & 0xff) {
 case SIGP_SET_ARCH:
 /* switch arch */
 break;
-- 
2.11.0
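
The mask in this patch follows from the big-endian bit numbering the commit message uses, where bit 0 is the most significant bit of a 64-bit register, so bits 56-63 are the least significant byte. If any higher bits are set, a switch on the full value cannot match the 8-bit order constants. The sketch below is only an illustration: `ibm_bits` and `sigp_order` are hypothetical helpers, not QEMU functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: extract bits first..last (inclusive) of a 64-bit
 * value, using numbering where bit 0 is the most significant bit. */
static inline uint64_t ibm_bits(uint64_t value, unsigned first, unsigned last)
{
    assert(first <= last && last <= 63);
    unsigned width = last - first + 1;
    uint64_t shifted = value >> (63 - last);
    if (width == 64) {
        return shifted;
    }
    return shifted & ((UINT64_C(1) << width) - 1);
}

/* The order field is bits 56-63, which is the same as masking with 0xff. */
static inline uint8_t sigp_order(uint64_t order_code)
{
    return (uint8_t)(order_code & 0xff);
}
```

For a register value of 0x0000000100000005 both helpers yield order 0x05, whereas comparing the unmasked value against 0x05 would fail.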




[Qemu-devel] [PATCH v2] s390x/misc_helper.c: wrap s390_virtio_hypercall in BQL

2017-04-23 Thread Aurelien Jarno
s390_virtio_hypercall can trigger IO events and interrupts, most notably
when using virtio-ccw devices.

Reviewed-by: Alexander Graf 
Signed-off-by: Aurelien Jarno 
---
 target/s390x/misc_helper.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/target/s390x/misc_helper.c b/target/s390x/misc_helper.c
index 4946b56ab3..aec737d707 100644
--- a/target/s390x/misc_helper.c
+++ b/target/s390x/misc_helper.c
@@ -307,7 +307,9 @@ void HELPER(diag)(CPUS390XState *env, uint32_t r1, uint32_t 
r3, uint32_t num)
 switch (num) {
 case 0x500:
 /* KVM hypercall */
+qemu_mutex_lock_iothread();
 r = s390_virtio_hypercall(env);
+qemu_mutex_unlock_iothread();
 break;
 case 0x44:
 /* yield */
-- 
2.11.0
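
The patch takes the lock at the call site, so the handler always runs with the lock held. The sketch below shows that discipline with a plain pthread mutex standing in for the BQL. All names (`big_lock`, `toy_hypercall`, `dispatch_diag`) are hypothetical, and the real `qemu_mutex_lock_iothread()` does more than this.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the big lock; not QEMU's implementation. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
static bool big_lock_held;  /* only written while big_lock is owned */

static void big_lock_acquire(void)
{
    pthread_mutex_lock(&big_lock);
    big_lock_held = true;
}

static void big_lock_release(void)
{
    big_lock_held = false;
    pthread_mutex_unlock(&big_lock);
}

/* Hypothetical handler that touches shared device state and therefore
 * requires the caller to hold the lock. */
static int toy_hypercall(int arg)
{
    assert(big_lock_held);
    return arg + 1;
}

/* Call-site locking, the shape used by the patch. */
static int dispatch_diag(int num, int arg)
{
    int r = -1;

    switch (num) {
    case 0x500:
        big_lock_acquire();
        r = toy_hypercall(arg);
        big_lock_release();
        break;
    default:
        break;
    }
    return r;
}
```

Taking the lock inside the handler instead would also work in this toy version; the difference is whether other callers of the handler already hold the lock, which a sketch like this cannot decide.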




Re: [Qemu-devel] [PATCH] s390x/misc_helper.c: wrap s390_virtio_hypercall in BQL

2017-04-23 Thread Aurelien Jarno
On 2017-04-23 19:19, Alexander Graf wrote:
> 
> 
> > Am 23.04.2017 um 18:08 schrieb Aurelien Jarno :
> > 
> > s390_virtio_hypercall can trigger IO events and interrupts, most notably
> > when using virtio-ccw devices.
> > 
> > Signed-off-by: Aurelien Jarno 
> > ---
> > roms/qemu-palcode  | 2 +-
> > roms/seabios   | 2 +-
> > target/s390x/misc_helper.c | 2 ++
> > 3 files changed, 4 insertions(+), 2 deletions(-)
> > 
> > diff --git a/roms/qemu-palcode b/roms/qemu-palcode
> > index f3c7e44c70..c87a92639b 16
> > --- a/roms/qemu-palcode
> > +++ b/roms/qemu-palcode
> > @@ -1 +1 @@
> > -Subproject commit f3c7e44c70254975df2a00af39701eafbac4d471
> > +Subproject commit c87a92639b28ac42bc8f6c67443543b405dc479b
> > diff --git a/roms/seabios b/roms/seabios
> > index 5f4c7b13cd..e2fc41e24e 16
> > --- a/roms/seabios
> > +++ b/roms/seabios
> > @@ -1 +1 @@
> > -Subproject commit 5f4c7b13cdf9c450eb55645f4362ea58fa61b79b
> > +Subproject commit e2fc41e24ee0ada60fc511d60b15a41b294538be
> 
> I guess you didn't mean to send those hunks?

Indeed, that's a mistake, sorry. I'll send a new version.

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net



[Qemu-devel] [PATCH RFC] target/openrisc: Support non-busy idle state using PMR SPR

2017-04-23 Thread Stafford Horne
The OpenRISC architecture has the Power Management Register (PMR)
special purpose register to manage cpu power states.  The interesting
modes are:

 * Doze Mode (DME) - Stop cpu except timer & pic - wake on interrupt
 * Sleep Mode (SME) - Stop cpu and all units - wake on interrupt
 * Suspend Mode (SUME) - Stop cpu and all units - wake on reset

The linux kernel will set DME when idle.

This patch implements the PMR SPR and halts the qemu cpu when there is a
change to DME or SME.  This means that openrisc qemu no longer pegs
a host cpu at 100%.

Signed-off-by: Stafford Horne 
---
(Sorry, resending this again, there was something wrong with my mail setup
on the last)

Hello,

This patch seems to work fine, but I am not sure if it is the right way to
trigger the halt signal.  Should I do it via raising an interrupt or
exception and exiting the cpu?

Also, I don't know if it's due to this patch or an issue with the timer
interrupts.  After applying this patch the timer interrupts do not trigger
until a keypress is made, i.e. something like this...

  $ sleep 5
  

It may or may not be related to this patch, as I noticed that things
like this sometimes happened before this patch.
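
The behaviour described above (halt when DME or SME is written, clear both bits when an interrupt is taken) can be sketched in isolation as follows. The `PMR_*` values are taken from the patch below; `ToyCPU`, `toy_write_pmr` and `toy_take_interrupt` are hypothetical and leave out the exception machinery the real helper uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Power management register bits, as defined in the patch. */
enum {
    PMR_SDF  = (15 << 0),
    PMR_DME  = (1 << 4),
    PMR_SME  = (1 << 5),
    PMR_DCGE = (1 << 6),
    PMR_SUME = (1 << 7),
};

/* Hypothetical minimal CPU model. */
typedef struct ToyCPU {
    uint32_t pmr;
    bool halted;
} ToyCPU;

/* Writing PMR with doze or sleep mode enabled halts the CPU. */
static void toy_write_pmr(ToyCPU *cpu, uint32_t value)
{
    assert(cpu != NULL);
    cpu->pmr = value;
    if (cpu->pmr & (PMR_DME | PMR_SME)) {
        cpu->halted = true;
    }
}

/* Taking an interrupt wakes the CPU and clears both mode bits. */
static void toy_take_interrupt(ToyCPU *cpu)
{
    assert(cpu != NULL);
    cpu->pmr &= ~(uint32_t)(PMR_DME | PMR_SME);
    cpu->halted = false;
}
```

Note that SUME does not halt the toy CPU here, matching the patch, which only acts on DME and SME.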


 target/openrisc/cpu.c|  3 ++-
 target/openrisc/cpu.h| 10 ++
 target/openrisc/interrupt.c  |  2 ++
 target/openrisc/machine.c|  1 +
 target/openrisc/sys_helper.c | 12 
 5 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/target/openrisc/cpu.c b/target/openrisc/cpu.c
index c9b3f22..1d6330c 100644
--- a/target/openrisc/cpu.c
+++ b/target/openrisc/cpu.c
@@ -51,7 +51,8 @@ static void openrisc_cpu_reset(CPUState *s)
 cpu->env.lock_addr = -1;
 s->exception_index = -1;
 
-cpu->env.upr = UPR_UP | UPR_DMP | UPR_IMP | UPR_PICP | UPR_TTP;
+cpu->env.upr = UPR_UP | UPR_DMP | UPR_IMP | UPR_PICP | UPR_TTP |
+   UPR_PMP;
 cpu->env.dmmucfgr = (DMMUCFGR_NTW & (0 << 2)) | (DMMUCFGR_NTS & (6 << 2));
 cpu->env.immucfgr = (IMMUCFGR_NTW & (0 << 2)) | (IMMUCFGR_NTS & (6 << 2));
 
diff --git a/target/openrisc/cpu.h b/target/openrisc/cpu.h
index 938ccc3..2721432 100644
--- a/target/openrisc/cpu.h
+++ b/target/openrisc/cpu.h
@@ -140,6 +140,15 @@ enum {
 IMMUCFGR_HTR = (1 << 11),
 };
 
+/* Power management register */
+enum {
+PMR_SDF = (15 << 0),
+PMR_DME = (1 << 4),
+PMR_SME = (1 << 5),
+PMR_DCGE = (1 << 6),
+PMR_SUME = (1 << 7),
+};
+
 /* Float point control status register */
 enum {
 FPCSR_FPEE = 1,
@@ -284,6 +293,7 @@ typedef struct CPUOpenRISCState {
 uint32_t immucfgr;/* IMMU configure register */
 uint32_t esr; /* Exception supervisor register */
 uint32_t evbar;   /* Exception vector base address register */
+uint32_t pmr; /* Power Management Register */
 uint32_t fpcsr;   /* Float register */
 float_status fp_status;
 
diff --git a/target/openrisc/interrupt.c b/target/openrisc/interrupt.c
index 2c91fab..3959671 100644
--- a/target/openrisc/interrupt.c
+++ b/target/openrisc/interrupt.c
@@ -60,6 +60,8 @@ void openrisc_cpu_do_interrupt(CPUState *cs)
 env->sr |= SR_SM;
 env->sr &= ~SR_IEE;
 env->sr &= ~SR_TEE;
+env->pmr &= ~PMR_DME;
+env->pmr &= ~PMR_SME;
 env->tlb->cpu_openrisc_map_address_data = &cpu_openrisc_get_phys_nommu;
 env->tlb->cpu_openrisc_map_address_code = &cpu_openrisc_get_phys_nommu;
 env->lock_addr = -1;
diff --git a/target/openrisc/machine.c b/target/openrisc/machine.c
index a82be62..a20cce7 100644
--- a/target/openrisc/machine.c
+++ b/target/openrisc/machine.c
@@ -138,6 +138,7 @@ static const VMStateDescription vmstate_env = {
 VMSTATE_UINT32(dmmucfgr, CPUOpenRISCState),
 VMSTATE_UINT32(immucfgr, CPUOpenRISCState),
 VMSTATE_UINT32(evbar, CPUOpenRISCState),
+VMSTATE_UINT32(pmr, CPUOpenRISCState),
 VMSTATE_UINT32(esr, CPUOpenRISCState),
 VMSTATE_UINT32(fpcsr, CPUOpenRISCState),
 VMSTATE_UINT64(mac, CPUOpenRISCState),
diff --git a/target/openrisc/sys_helper.c b/target/openrisc/sys_helper.c
index fa3d6a4..cb1e085 100644
--- a/target/openrisc/sys_helper.c
+++ b/target/openrisc/sys_helper.c
@@ -22,6 +22,7 @@
 #include "cpu.h"
 #include "exec/exec-all.h"
 #include "exec/helper-proto.h"
+#include "exception.h"
 
 #define TO_SPR(group, number) (((group) << 11) + (number))
 
@@ -141,6 +142,14 @@ void HELPER(mtspr)(CPUOpenRISCState *env,
 case TO_SPR(5, 2):  /* MACHI */
 env->mac = deposit64(env->mac, 32, 32, rb);
 break;
+case TO_SPR(8, 0):  /* PMR */
+env->pmr = rb;
+if (env->pmr & PMR_DME || env->pmr & PMR_SME) {
+cpu_restore_state(cs, GETPC() + 4);
+cs->halted = 1;
+raise_exception(cpu, EXCP_HLT);
+}
+break;
 case TO_SPR(9, 0):  /* PICMR */
 env->picmr |= rb;
 break;
@@ -287,6 +296,9 @@ target_ulong HELPER(mfspr)(CPUOpenRISCState *env,
 return 

Re: [Qemu-devel] [PATCH] hw/core/generic-loader: Fix crash when running without CPU

2017-04-23 Thread Michael Tokarev
27.02.2017 22:36, Thomas Huth wrote:
> On 25.01.2017 21:45, Thomas Huth wrote:
>> When running QEMU with "-M none -device loader,file=kernel.elf", it
>> currently crashes with a segmentation fault, because the "none"-machine
>> does not have any CPU by default and the generic loader code tries
>> to dereference s->cpu. Fix it by adding an appropriate check for a
>> NULL pointer.

Applied to -trivial, thanks!

/mjt



Re: [Qemu-devel] [PATCH 1/2] virtio-blk: Remove useless condition around g_free()

2017-04-23 Thread Michael Tokarev
07.02.2017 16:27, Fam Zheng wrote:
> Laszlo spotted and studied this wasteful "if". He pointed out:
> 
> The original virtio_blk_free_request needed an "if" as it accesses one
> field, since 671ec3f05655 ("virtio-blk: Convert VirtIOBlockReq.elem to
> pointer", 2014-06-11); later on in f897bf751fbd ("virtio-blk: embed
> VirtQueueElement in VirtIOBlockReq", 2014-07-09) the field became
> embedded, so the "if" became unnecessary (at which point we were using
> g_slice_free(), but it is the same.
> 
> Now drop it.

Applied to -trivial, thanks!

/mjt



Re: [Qemu-devel] [PATCH] qemu-doc: Fix broken URLs of amnhltm.zip and dosidle210.zip

2017-04-23 Thread Michael Tokarev
08.03.2017 15:13, Thomas Huth wrote:
> There are some broken URLs in the qemu-doc which reference tools that
> are not available at their original location anymore. Fortunately, they
> have been mirrored to archive.org, so point to that location instead.

Applied to -trivial, thanks!

/mjt



Re: [Qemu-devel] [PATCH v2] use _Static_assert in QEMU_BUILD_BUG_ON

2017-04-23 Thread Michael Tokarev
14.03.2017 19:59, Andreas Grapentin wrote:
> QEMU_BUILD_BUG_ON should use C11's _Static_assert, if the compiler supports 
> it,
> to provide more readable messages on failure.
...

Applied to -trivial, with trivial commit comment fix.  Thanks!

/mjt



Re: [Qemu-devel] [PATCH v5]COLO:Fix spell error in Colo doc

2017-04-23 Thread Michael Tokarev
21.03.2017 04:53, Eric Blake wrote:
> On 03/20/2017 08:39 PM, wangguang wrote:
>> Subject: [PATCH]COLO:Fix spell error in Colo doc
> 
> I added qemu-trivial in v4; you should keep it in the loop.
> 
> Still missing a space after ':' in the subject line, and still the
> awkward duplication of the subject line in the body of the commit message.
...

As Zhang Chen pointed out in the original patch submission,
this patch isn't really needed since the command is actually
different, and he'll update the doc later.

Thanks,

/mjt



Re: [Qemu-devel] [PATCH] s390x/misc_helper.c: wrap s390_virtio_hypercall in BQL

2017-04-23 Thread Alexander Graf


> Am 23.04.2017 um 18:08 schrieb Aurelien Jarno :
> 
> s390_virtio_hypercall can trigger IO events and interrupts, most notably
> when using virtio-ccw devices.
> 
> Signed-off-by: Aurelien Jarno 
> ---
> roms/qemu-palcode  | 2 +-
> roms/seabios   | 2 +-
> target/s390x/misc_helper.c | 2 ++
> 3 files changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/roms/qemu-palcode b/roms/qemu-palcode
> index f3c7e44c70..c87a92639b 16
> --- a/roms/qemu-palcode
> +++ b/roms/qemu-palcode
> @@ -1 +1 @@
> -Subproject commit f3c7e44c70254975df2a00af39701eafbac4d471
> +Subproject commit c87a92639b28ac42bc8f6c67443543b405dc479b
> diff --git a/roms/seabios b/roms/seabios
> index 5f4c7b13cd..e2fc41e24e 16
> --- a/roms/seabios
> +++ b/roms/seabios
> @@ -1 +1 @@
> -Subproject commit 5f4c7b13cdf9c450eb55645f4362ea58fa61b79b
> +Subproject commit e2fc41e24ee0ada60fc511d60b15a41b294538be

I guess you didn't mean to send those hunks?

> diff --git a/target/s390x/misc_helper.c b/target/s390x/misc_helper.c
> index 4946b56ab3..aec737d707 100644
> --- a/target/s390x/misc_helper.c
> +++ b/target/s390x/misc_helper.c
> @@ -307,7 +307,9 @@ void HELPER(diag)(CPUS390XState *env, uint32_t r1, 
> uint32_t r3, uint32_t num)
> switch (num) {
> case 0x500:
> /* KVM hypercall */
> +qemu_mutex_lock_iothread();
> r = s390_virtio_hypercall(env);
> +qemu_mutex_unlock_iothread();

That change however looks good to me. So without the subprj commits,

Reviewed-by: Alexander Graf 

Alex

> break;
> case 0x44:
> /* yield */
> -- 
> 2.11.0
> 




Re: [Qemu-devel] [PATCH 03/21] object: fix potential leak in getters

2017-04-23 Thread Michael Tokarev
11.03.2017 16:22, Marc-André Lureau wrote:
> If the property is not of the requested type, the getters will leak a
> QObject.

I'm not really sure it's -trivial material.

Not applying 01/23 either.

Should the whole series be applied to the same tree perhaps?

Thanks,

/mjt



[Qemu-devel] [PATCH] s390x/misc_helper.c: wrap s390_virtio_hypercall in BQL

2017-04-23 Thread Aurelien Jarno
s390_virtio_hypercall can trigger IO events and interrupts, most notably
when using virtio-ccw devices.

Signed-off-by: Aurelien Jarno 
---
 roms/qemu-palcode  | 2 +-
 roms/seabios   | 2 +-
 target/s390x/misc_helper.c | 2 ++
 3 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/roms/qemu-palcode b/roms/qemu-palcode
index f3c7e44c70..c87a92639b 16
--- a/roms/qemu-palcode
+++ b/roms/qemu-palcode
@@ -1 +1 @@
-Subproject commit f3c7e44c70254975df2a00af39701eafbac4d471
+Subproject commit c87a92639b28ac42bc8f6c67443543b405dc479b
diff --git a/roms/seabios b/roms/seabios
index 5f4c7b13cd..e2fc41e24e 16
--- a/roms/seabios
+++ b/roms/seabios
@@ -1 +1 @@
-Subproject commit 5f4c7b13cdf9c450eb55645f4362ea58fa61b79b
+Subproject commit e2fc41e24ee0ada60fc511d60b15a41b294538be
diff --git a/target/s390x/misc_helper.c b/target/s390x/misc_helper.c
index 4946b56ab3..aec737d707 100644
--- a/target/s390x/misc_helper.c
+++ b/target/s390x/misc_helper.c
@@ -307,7 +307,9 @@ void HELPER(diag)(CPUS390XState *env, uint32_t r1, uint32_t 
r3, uint32_t num)
 switch (num) {
 case 0x500:
 /* KVM hypercall */
+qemu_mutex_lock_iothread();
 r = s390_virtio_hypercall(env);
+qemu_mutex_unlock_iothread();
 break;
 case 0x44:
 /* yield */
-- 
2.11.0




[Qemu-devel] [PATCH 2/2] qemu-img: fix some spelling errors

2017-04-23 Thread jemmy858585
From: Lidong Chen 

Fix some spelling errors in is_allocated_sectors comment.

Signed-off-by: Lidong Chen 
---
 qemu-img.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index df6d165..0b3349c 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -1033,8 +1033,8 @@ done:
 }
 
 /*
- * Returns true iff the first sector pointed to by 'buf' contains at least
- * a non-NUL byte.
+ * Returns true if the first sector pointed to by 'buf' contains at least
+ * a non-NULL byte.
  *
  * 'pnum' is set to the number of sectors (including and immediately following
  * the first one) that are known to be in the same allocated/unallocated state.
-- 
1.8.3.1




[Qemu-devel] [PATCH 1/2] qemu-img: make sure contain the consecutive number of zero bytes

2017-04-23 Thread jemmy858585
From: Lidong Chen 

is_allocated_sectors_min does not guarantee that the range it reports
contains the required number of consecutive zero bytes. This patch fixes
that bug.

Signed-off-by: Lidong Chen 
---
 qemu-img.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index b220cf7..df6d165 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -1060,9 +1060,9 @@ static int is_allocated_sectors(const uint8_t *buf, int 
n, int *pnum)
 }
 
 /*
- * Like is_allocated_sectors, but if the buffer starts with a used sector,
- * up to 'min' consecutive sectors containing zeros are ignored. This avoids
- * breaking up write requests for only small sparse areas.
+ * Like is_allocated_sectors, but up to 'min' consecutive sectors
+ * containing zeros are ignored. This avoids breaking up write requests
+ * for only small sparse areas.
  */
 static int is_allocated_sectors_min(const uint8_t *buf, int n, int *pnum,
 int min)
@@ -1071,11 +1071,12 @@ static int is_allocated_sectors_min(const uint8_t *buf, 
int n, int *pnum,
 int num_checked, num_used;
 
 if (n < min) {
-min = n;
+*pnum = n;
+return 1;
 }
 
 ret = is_allocated_sectors(buf, n, pnum);
-if (!ret) {
+if (!ret && *pnum >= min) {
 return ret;
 }
 
-- 
1.8.3.1
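
The semantics documented in the updated comment (zero runs shorter than 'min' sectors are treated as part of the surrounding data) can be illustrated with a simplified scanner. This is not the qemu-img implementation: `sector_is_zero`, `leading_run` and `leading_run_min` are hypothetical names, and the sketch assumes 512-byte sectors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SIZE 512

/* True if every byte of the sector at buf is zero. */
static bool sector_is_zero(const uint8_t *buf)
{
    for (int i = 0; i < SECTOR_SIZE; i++) {
        if (buf[i] != 0) {
            return false;
        }
    }
    return true;
}

/* Returns true if the first sector holds data. *pnum is set to the
 * length of the leading run of sectors in that same state. */
static bool leading_run(const uint8_t *buf, int n, int *pnum)
{
    assert(n > 0);
    bool used = !sector_is_zero(buf);
    int i;

    for (i = 1; i < n; i++) {
        /* Stop as soon as a sector's state differs from the first. */
        if (used == sector_is_zero(buf + i * SECTOR_SIZE)) {
            break;
        }
    }
    *pnum = i;
    return used;
}

/* Like leading_run, but a zero run only counts as sparse when it is at
 * least 'min' sectors long; shorter zero runs are folded into data. */
static bool leading_run_min(const uint8_t *buf, int n, int *pnum, int min)
{
    bool used = leading_run(buf, n, pnum);
    if (!used && *pnum >= min) {
        return false;               /* a zero run worth keeping sparse */
    }

    int total = *pnum;
    while (total < n) {
        int run;
        bool run_used = leading_run(buf + total * SECTOR_SIZE,
                                    n - total, &run);
        if (!run_used && run >= min) {
            break;                  /* a long enough hole starts here */
        }
        total += run;
    }
    *pnum = total;
    return true;
}
```

With min = 3 and the sector layout data, zero, zero, data, zero, zero, zero, zero, the first four sectors are reported as one allocated run and the last four as one sparse run.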




Re: [Qemu-devel] [PATCH] Video and sound capture to a videofile through ffmpeg

2017-04-23 Thread no-reply
Hi,

This series failed automatic build test. Please find the testing commands and
their output below. If you have docker installed, you can probably reproduce it
locally.

Message-id: 1492956615-2395-1-git-send-email-vip-a...@yandex.ru
Type: series
Subject: [Qemu-devel] [PATCH] Video and sound capture to a videofile through 
ffmpeg

=== TEST SCRIPT BEGIN ===
#!/bin/bash
set -e
git submodule update --init dtc
# Let docker tests dump environment info
export SHOW_ENV=1
export J=8
make docker-test-quick@centos6
make docker-test-mingw@fedora
make docker-test-build@min-glib
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag] 
patchew/1492956615-2395-1-git-send-email-vip-a...@yandex.ru -> 
patchew/1492956615-2395-1-git-send-email-vip-a...@yandex.ru
Switched to a new branch 'test'
23dca8f Video and sound capture to a videofile through ffmpeg

=== OUTPUT BEGIN ===
Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc'
Cloning into '/var/tmp/patchew-tester-tmp-1audw8b2/src/dtc'...
Submodule path 'dtc': checked out '558cd81bdd432769b59bff01240c44f82cfb1a9d'
  BUILD   centos6
make[1]: Entering directory '/var/tmp/patchew-tester-tmp-1audw8b2/src'
  ARCHIVE qemu.tgz
  ARCHIVE dtc.tgz
  COPYRUNNER
RUN test-quick in qemu:centos6 
Packages installed:
SDL-devel-1.2.14-7.el6_7.1.x86_64
ccache-3.1.6-2.el6.x86_64
epel-release-6-8.noarch
gcc-4.4.7-17.el6.x86_64
git-1.7.1-4.el6_7.1.x86_64
glib2-devel-2.28.8-5.el6.x86_64
libfdt-devel-1.4.0-1.el6.x86_64
make-3.81-23.el6.x86_64
package g++ is not installed
pixman-devel-0.32.8-1.el6.x86_64
tar-1.23-15.el6_8.x86_64
zlib-devel-1.2.3-29.el6.x86_64

Environment variables:
PACKAGES=libfdt-devel ccache tar git make gcc g++ zlib-devel 
glib2-devel SDL-devel pixman-devel epel-release
HOSTNAME=a361ce3f9e8b
TERM=xterm
MAKEFLAGS= -j8
HISTSIZE=1000
J=8
USER=root
CCACHE_DIR=/var/tmp/ccache
EXTRA_CONFIGURE_OPTS=
V=
SHOW_ENV=1
MAIL=/var/spool/mail/root
PATH=/usr/lib/ccache:/usr/lib64/ccache:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PWD=/
LANG=en_US.UTF-8
TARGET_LIST=
HISTCONTROL=ignoredups
SHLVL=1
HOME=/root
TEST_DIR=/tmp/qemu-test
LOGNAME=root
LESSOPEN=||/usr/bin/lesspipe.sh %s
FEATURES= dtc
DEBUG=
G_BROKEN_FILENAMES=1
CCACHE_HASHDIR=
_=/usr/bin/env

Configure options:
--enable-werror --target-list=x86_64-softmmu,aarch64-softmmu 
--prefix=/var/tmp/qemu-build/install
grep: scripts/tracetool/backend/*.py: No such file or directory
No C++ compiler available; disabling C++ specific optional code

ERROR: libav check failed
   Make sure to have the libav libs and headers installed.

tests/docker/Makefile.include:118: recipe for target 'docker-run' failed
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory '/var/tmp/patchew-tester-tmp-1audw8b2/src'
tests/docker/Makefile.include:149: recipe for target 
'docker-run-test-quick@centos6' failed
make: *** [docker-run-test-quick@centos6] Error 2
=== OUTPUT END ===

Test command exited with code: 2


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-de...@freelists.org

[Qemu-devel] [PATCH] Video and sound capture to a videofile through ffmpeg

2017-04-23 Thread Alex K
Hello everyone,

I've made a patch that adds the ability to record video of what's going on
inside qemu. It uses the ffmpeg libraries. Basically, the patch adds
2 new commands to the console:
capture_start path framerate
capture_stop

path is required.
framerate can be 24, 25, 30 or 60; the default is 60.
The video codec is always h264.

The patch uses ffmpeg so you will need to install these packages:
ffmpeg libavformat-dev libavcodec-dev libavutil-dev libswscale-dev

This is my first time posting here, so please correct me if I'm doing
something wrong

Signed-off-by: Alex K 
---
 configure  |  20 +
 default-configs/i386-softmmu.mak   |   1 +
 default-configs/x86_64-softmmu.mak |   1 +
 hmp-commands.hx|  34 ++
 hmp.h  |   2 +
 hw/display/Makefile.objs   |   2 +
 hw/display/capture.c   | 761 +
 hw/display/capture.h   |  78 
 8 files changed, 899 insertions(+)
 create mode 100644 hw/display/capture.c
 create mode 100644 hw/display/capture.h

diff --git a/configure b/configure
index 6db3044..0b927f8 100755
--- a/configure
+++ b/configure
@@ -281,6 +281,7 @@ opengl=""
 opengl_dmabuf="no"
 avx2_opt="no"
 zlib="yes"
+libav="yes"
 lzo=""
 snappy=""
 bzip2=""
@@ -1987,6 +1988,25 @@ if test "$seccomp" != "no" ; then
 seccomp="no"
 fi
 fi
+#
+# libav check
+
+if test "$libav" != "no" ; then
+cat > $TMPC << EOF
+#include <libavformat/avformat.h>
+#include <libavcodec/avcodec.h>
+
+int main(void){ av_register_all(); avcodec_register_all(); return 0; }
+EOF
+if compile_prog "" "-lm -lpthread -lavformat -lavcodec -lavutil -lswscale 
-lswresample" ; then
+:
+else
+error_exit "libav check failed" \
+"Make sure to have the libav libs and headers installed."
+fi
+fi
+LIBS="$LIBS -lm -lpthread -lavformat -lavcodec -lavutil -lswscale -lswresample"
+
 ##
 # xen probe
 
diff --git a/default-configs/i386-softmmu.mak b/default-configs/i386-softmmu.mak
index 029e952..a24ac7c 100644
--- a/default-configs/i386-softmmu.mak
+++ b/default-configs/i386-softmmu.mak
@@ -60,3 +60,4 @@ CONFIG_SMBIOS=y
 CONFIG_HYPERV_TESTDEV=$(CONFIG_KVM)
 CONFIG_PXB=y
 CONFIG_ACPI_VMGENID=y
+CONFIG_CAPTURE=y
diff --git a/default-configs/x86_64-softmmu.mak 
b/default-configs/x86_64-softmmu.mak
index d1d7432..9919e93 100644
--- a/default-configs/x86_64-softmmu.mak
+++ b/default-configs/x86_64-softmmu.mak
@@ -60,3 +60,4 @@ CONFIG_SMBIOS=y
 CONFIG_HYPERV_TESTDEV=$(CONFIG_KVM)
 CONFIG_PXB=y
 CONFIG_ACPI_VMGENID=y
+CONFIG_CAPTURE=y
diff --git a/hmp-commands.hx b/hmp-commands.hx
index 8819281..2c708ae 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1777,3 +1777,37 @@ ETEXI
 STEXI
 @end table
 ETEXI
+
+{
+.name   = "capture_start",
+.args_type  = "filename:F,fps:i?",
+.params = "filename [framerate]",
+.help   = "Start video capture",
+.cmd= hmp_capture_start,
+},
+
+STEXI
+@item capture_start @var{filename} [@var{framerate}]
+@findex capture_start
+Start video capture.
+Capture video into @var{filename} with framerate @var{framerate}.
+
+Defaults:
+@itemize @minus
+@item framerate = 60
+@end itemize
+ETEXI
+
+{
+.name   = "capture_stop",
+.args_type  = "",
+.params = "",
+.help   = "Stop video capture",
+.cmd= hmp_capture_stop,
+},
+
+STEXI
+@item capture_stop
+@findex capture_stop
+Stop video capture.
+ETEXI
diff --git a/hmp.h b/hmp.h
index 799fd37..36c7a4d 100644
--- a/hmp.h
+++ b/hmp.h
@@ -138,5 +138,7 @@ void hmp_rocker_of_dpa_groups(Monitor *mon, const QDict 
*qdict);
 void hmp_info_dump(Monitor *mon, const QDict *qdict);
 void hmp_hotpluggable_cpus(Monitor *mon, const QDict *qdict);
 void hmp_info_vm_generation_id(Monitor *mon, const QDict *qdict);
+void hmp_capture_start(Monitor *mon, const QDict *qdict);
+void hmp_capture_stop(Monitor *mon, const QDict *qdict);
 
 #endif
diff --git a/hw/display/Makefile.objs b/hw/display/Makefile.objs
index 551c050..a918896 100644
--- a/hw/display/Makefile.objs
+++ b/hw/display/Makefile.objs
@@ -20,6 +20,8 @@ common-obj-$(CONFIG_ZAURUS) += tc6393xb.o
 
 common-obj-$(CONFIG_MILKYMIST_TMU2) += milkymist-tmu2.o
 
+obj-$(CONFIG_CAPTURE) += capture.o
+
 obj-$(CONFIG_OMAP) += omap_dss.o
 obj-$(CONFIG_OMAP) += omap_lcdc.o
 obj-$(CONFIG_PXA2XX) += pxa2xx_lcd.o
diff --git a/hw/display/capture.c b/hw/display/capture.c
new file mode 100644
index 000..c89aaa0
--- /dev/null
+++ b/hw/display/capture.c
@@ -0,0 +1,761 @@
+#include "capture.h"
+
+static void sound_capture_notify(void *opaque, audcnotification_e cmd)
+{
+(void) opaque;
+(void) cmd;
+}
+
+static void sound_capture_destroy(void *opaque)
+{
+(void) opaque;
+}
+
+static void write_empty_sound(void *opaque, struct CaptureThreadWorkerData 
*data)
+{
+AVFormatContext *oc = data->oc;
+   

Re: [Qemu-devel] [PATCH] qemu-img: use blk_co_pwrite_zeroes for zero sectors when compressed

2017-04-23 Thread 858585 jemmy
On Fri, Apr 21, 2017 at 1:37 PM, 858585 jemmy  wrote:
> On Fri, Apr 21, 2017 at 10:58 AM, 858585 jemmy  wrote:
>> On Thu, Apr 20, 2017 at 6:00 PM, Kevin Wolf  wrote:
>>> Am 20.04.2017 um 10:38 hat jemmy858...@gmail.com geschrieben:
>>>> From: Lidong Chen 
>>>>
>>>> When the buffer is zero, blk_co_pwrite_zeroes is more efficient than
>>>> blk_co_pwritev with BDRV_REQ_WRITE_COMPRESSED. This patch reduces
>>>> the time needed to convert a qcow2 image with lots of zeroes.
>>>>
>>>> Signed-off-by: Lidong Chen 
>>>
>>> Good catch, using blk_co_pwrite_zeroes() makes sense even for compressed
>>> images.
>>>
>>>> diff --git a/qemu-img.c b/qemu-img.c
>>>> index b220cf7..0256539 100644
>>>> --- a/qemu-img.c
>>>> +++ b/qemu-img.c
>>>> @@ -1675,13 +1675,20 @@ static int coroutine_fn convert_co_write(ImgConvertState *s, int64_t sector_num,
>>>>    * write if the buffer is completely zeroed and we're allowed to
>>>>    * keep the target sparse. */
>>>>   if (s->compressed) {
>>>> -if (s->has_zero_init && s->min_sparse &&
>>>> -buffer_is_zero(buf, n * BDRV_SECTOR_SIZE))
>>>> -{
>>>> -assert(!s->target_has_backing);
>>>> -break;
>>>> +if (buffer_is_zero(buf, n * BDRV_SECTOR_SIZE)) {
>>>> +if (s->has_zero_init && s->min_sparse) {
>>>> +assert(!s->target_has_backing);
>>>> +break;
>>>> +} else {
>>>> +ret = blk_co_pwrite_zeroes(s->target,
>>>> +   sector_num << BDRV_SECTOR_BITS,
>>>> +   n << BDRV_SECTOR_BITS, 0);
>>>> +if (ret < 0) {
>>>> +return ret;
>>>> +}
>>>> +break;
>>>> +}
>>>>   }
>>>
>>> If s->min_sparse == 0, we may neither skip the write nor use
>>> blk_co_pwrite_zeroes(), because this requires actual full allocation
>>> with explicit zero sectors.
>>>
>>> Of course, if you fix this, what you end up with here is a duplicate of
>>> the code path for non-compressed images. The remaining difference seems
>>> to be the BDRV_REQ_WRITE_COMPRESSED flag and buffer_is_zero() vs.
>>> is_allocated_sectors_min() (because uncompressed clusters can be written
>>> partially, but compressed clusters can't).
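The three cases described above (skip the write, write zeroes, write the data) can be sketched as follows. This is only a minimal illustration under stated assumptions: the enum and choose_action() are hypothetical names and not part of qemu-img, and the sketch ignores backing files and the partial-cluster handling that the uncompressed path needs.

```c
#include <stdbool.h>

/* Hypothetical names for illustration only; this is not qemu-img code. */
enum write_action { SKIP_WRITE, WRITE_ZEROES, WRITE_DATA };

/*
 * Decide what to do with one whole buffer.
 *   buf_is_zero:   the buffer contains only zero bytes
 *   has_zero_init: the target already reads back as zeroes
 *   min_sparse:    value derived from -S, in sectors; 0 means "allocate fully"
 */
static enum write_action choose_action(bool buf_is_zero, bool has_zero_init,
                                       int min_sparse)
{
    if (min_sparse == 0 || !buf_is_zero) {
        /* Full allocation was requested, or there is real data to write. */
        return WRITE_DATA;
    }
    if (has_zero_init) {
        /* The target already reads as zeroes, so the write can be skipped. */
        return SKIP_WRITE;
    }
    /* Zero buffer, but the target is not known to be zero-initialized. */
    return WRITE_ZEROES;
}
```

With min_sparse == 0 the sketch always returns WRITE_DATA, which matches the point that neither skipping nor a zero write is allowed in that case.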
>>
>> I'll try to unify the code.
>>
>> I don't understand why is_allocated_sectors_min() is used in the
>> uncompressed case. s->min_sparse defaults to 8 sectors, which is smaller
>> than cluster_sectors.
>>
>> If a cluster contains 8 zero sectors followed by 8 non-zero sectors,
>> repeated, blk_co_pwritev and blk_co_pwrite_zeroes will each be called
>> many times for that one cluster.
>>
>> Why not check for zeroes at cluster_sectors granularity?
>
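> I wrote this program and ran it in the guest OS.
>
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
>
> int main()
> {
>     char *zero;
>     char *nonzero;
>     FILE *fp = fopen("./test.dat", "ab");
>
>     zero = malloc(sizeof(char)*512*8);
>     nonzero = malloc(sizeof(char)*512*8);
>     if (!fp || !zero || !nonzero) {
>         return 1;
>     }
>
>     memset(zero, 0, sizeof(char)*512*8);
>     memset(nonzero, 1, sizeof(char)*512*8);
>
>     /* Alternate 8 zero sectors and 8 non-zero sectors until a write
>      * fails (for example, when the disk is full). */
>     while (1) {
>         if (fwrite(zero, sizeof(char)*512*8, 1, fp) != 1 ||
>             fwrite(nonzero, sizeof(char)*512*8, 1, fp) != 1) {
>             break;
>         }
>     }
>     fclose(fp);
>     return 0;
> }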
>
> qemu-img info /mnt/img2016111016860868.qcow2
> image: /mnt/img2016111016860868.qcow2
> file format: qcow2
> virtual size: 20G (21474836480 bytes)
> disk size: 19G (20061552640 bytes)
> cluster_size: 65536
> backing file: /baseimage/img2016042213665396/img2016042213665396.qcow2
>
> With the -S 65536 option:
>
> time /root/kvm/bin/qemu-img convert -p -B
> /baseimage/img2016042213665396/img2016042213665396.qcow2 -O qcow2
> /mnt/img2016111016860868.qcow2 /mnt/img2017041611162809_zip_new.qcow2
> -S 65536
> (100.00/100%)
>
> real    0m32.203s
> user    0m5.165s
> sys     0m27.887s
>
> time /root/kvm/bin/qemu-img convert -p -B
> /baseimage/img2016042213665396/img2016042213665396.qcow2 -O qcow2
> /mnt/img2016111016860868.qcow2 /mnt/img2017041611162809_zip_new.qcow2
> (100.00/100%)
>
> real    1m38.665s
> user    0m45.418s
> sys     1m7.518s
>
> Should we make cluster_sectors the default value of s->min_sparse?

Changing the default value of s->min_sparse would break the API:
qemu-img --help documents the default as 4k.

  '-S' indicates the consecutive number of bytes (defaults to 4k) that must
       contain only zeros for qemu-img to create a sparse image during
       conversion. If the number of bytes is 0, the source will not be
       scanned for unallocated or zero sectors, and the destination image
       will always be fully allocated
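
To make the arithmetic concrete: with 512-byte sectors, the default -S 4k is 8 sectors and -S 65536 is 128 sectors, which is one 64k cluster. The sketch below counts how many separate requests one cluster of the test pattern (8 zero sectors, 8 non-zero sectors, repeated) would need under a simplified model. It is an assumption-laden illustration, not qemu-img's is_allocated_sectors_min(): count_requests(), fill_alternating() and sparse_bytes_to_sectors() are hypothetical helpers, and the model simply folds any zero run shorter than min_sparse into the neighbouring data request.

```c
#include <stdbool.h>

#define SECTOR_SIZE 512

/* Convert a -S value given in bytes to a number of 512-byte sectors. */
static int sparse_bytes_to_sectors(long bytes)
{
    return (int)(bytes / SECTOR_SIZE);
}

/* Fill is_zero[] with the test pattern: 8 zero sectors, 8 non-zero, repeated. */
static void fill_alternating(bool *is_zero, int n)
{
    for (int i = 0; i < n; i++) {
        is_zero[i] = ((i / 8) % 2) == 0;
    }
}

/*
 * Count the requests needed for n sectors under the simplified model:
 * a zero run of at least min_sparse sectors becomes its own zero request,
 * while a shorter zero run is merged into a data request.
 * min_sparse == 0 means zero runs are never split out.
 */
static int count_requests(const bool *is_zero, int n, int min_sparse)
{
    int requests = 0;
    bool in_data = false;   /* currently inside an open data request */
    int i = 0;

    while (i < n) {
        if (!is_zero[i]) {
            if (!in_data) {
                requests++;
                in_data = true;
            }
            i++;
            continue;
        }

        /* Measure the length of this run of zero sectors. */
        int run = 0;
        while (i + run < n && is_zero[i + run]) {
            run++;
        }

        if (min_sparse > 0 && run >= min_sparse) {
            requests++;         /* separate zero request */
            in_data = false;
        } else if (!in_data) {
            requests++;         /* short zero run opens a data request */
            in_data = true;
        }
        i += run;
    }
    return requests;
}
```

Under this model, a 128-sector cluster of the alternating pattern needs 16 requests with min_sparse = 8 and a single request with min_sparse = 128, which is consistent with the direction of the timing difference measured above.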

>
>>
>>>
>>> So I suppose that instead of just fixing the above bug, we could 
