date:20170131

Re: [Qemu-devel] [PATCH v8 25/25] tcg: enable MTTCG by default for ARM on x86 hosts

2017-01-31 Thread Alex Bennée


Richard Henderson  writes:

> On 01/27/2017 02:39 AM, Alex Bennée wrote:
>> +/* This defines the natural memory order supported by this
>> + * architecture before guarantees made by various barrier
>> + * instructions.
>> + *
>> + * The x86 has a pretty strong memory ordering which only really
>> + * allows for some stores to be re-ordered after loads.
>> + */
>> +#include "tcg-mo.h"
>> +
>> +static inline int get_tcg_target_mo(void)
>> +{
>> +return TCG_MO_ALL & ~TCG_MO_LD_ST;
>> +}
>> +
>> +#define TCG_TARGET_DEFAULT_MO get_tcg_target_mo()
>> +
>
> Why the inline function?  The expression in the define would seem sufficient.
> Otherwise,

Good point. It was just to get around the definition at pre-processor and
compile time.

>
> Reviewed-by: Richard Henderson 
>
>
> r~


--
Alex Bennée

Re: [Qemu-devel] [PATCH v8 16/25] cputlb and arm/sparc targets: convert mmuidx flushes from varg to bitmap

2017-01-31 Thread Alex Bennée


Richard Henderson  writes:

> On 01/27/2017 02:39 AM, Alex Bennée wrote:
>> +for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
>>
>> -tlb_debug("%d\n", mmu_idx);
>> +if (test_bit(mmu_idx, _idx_bitmask)) {
>> +tlb_debug("%d\n", mmu_idx);
>>
>> -memset(env->tlb_table[mmu_idx], -1, sizeof(env->tlb_table[0]));
>> -memset(env->tlb_v_table[mmu_idx], -1, sizeof(env->tlb_v_table[0]));
>> +memset(env->tlb_table[mmu_idx], -1, sizeof(env->tlb_table[0]));
>> +memset(env->tlb_v_table[mmu_idx], -1, 
>> sizeof(env->tlb_v_table[0]));
>> +}
>
> Perhaps it doesn't matter since NB_MMU_MODES is so small but
>
>for (; idxmap != 0; idxmap &= idxmap - 1) {
>  int mmu_idx = ctz32(idxmap);
>  ...
>}
>
> iterates only for the bits that are set.
>
>> -typedef enum ARMMMUIdx {
>> -ARMMMUIdx_S12NSE0 = 0,
>> -ARMMMUIdx_S12NSE1 = 1,
>> -ARMMMUIdx_S1E2 = 2,
>> -ARMMMUIdx_S1E3 = 3,
>> -ARMMMUIdx_S1SE0 = 4,
>> -ARMMMUIdx_S1SE1 = 5,
>> -ARMMMUIdx_S2NS = 6,
>> +typedef enum ARMMMUBitMap {
>> +ARMMMUBit_S12NSE0 = 1 << 0,
>> +ARMMMUBit_S12NSE1 = 1 << 1,
>> +ARMMMUBit_S1E2 = 1 << 2,
>> +ARMMMUBit_S1E3 = 1 << 3,
>> +ARMMMUBit_S1SE0 = 1 << 4,
>> +ARMMMUBit_S1SE1 = 1 << 5,
>> +ARMMMUBit_S2NS = 1 << 6,
>>  /* Indexes below here don't have TLBs and are used only for AT system
>>   * instructions or for the first stage of an S12 page table walk.
>>   */
>> -ARMMMUIdx_S1NSE0 = 7,
>> -ARMMMUIdx_S1NSE1 = 8,
>> -} ARMMMUIdx;
>> +ARMMMUBit_S1NSE0 = 1 << 7,
>> +ARMMMUBit_S1NSE1 = 1 << 8,
>> +} ARMMMUBitMap;
>>
>> -#define MMU_USER_IDX 0
>> +typedef int ARMMMUIdx;
>> +
>> +static inline ARMMMUIdx arm_mmu_bit_to_idx(ARMMMUBitMap bit)
>> +{
>> +g_assert(ctpop16(bit) == 1);
>> +return ctz32(bit);
>> +}
>> +
>> +static inline ARMMMUBitMap arm_mmu_idx_to_bit(ARMMMUIdx idx)
>> +{
>> +return 1 << idx;
>> +}
>
> I don't understand this redefinition though, causing...
>
>> @@ -2109,13 +2109,13 @@ static void ats_write(CPUARMState *env, const 
>> ARMCPRegInfo *ri, uint64_t value)
>>  /* stage 1 current state PL1: ATS1CPR, ATS1CPW */
>>  switch (el) {
>>  case 3:
>> -mmu_idx = ARMMMUIdx_S1E3;
>> +mmu_bit = ARMMMUBit_S1E3;
>>  break;
>>  case 2:
>> -mmu_idx = ARMMMUIdx_S1NSE1;
>> +mmu_bit = ARMMMUBit_S1NSE1;
>>  break;
>>  case 1:
>> -mmu_idx = secure ? ARMMMUIdx_S1SE1 : ARMMMUIdx_S1NSE1;
>> +mmu_bit = secure ? ARMMMUBit_S1SE1 : ARMMMUBit_S1NSE1;
>>  break;
>>  default:
>>  g_assert_not_reached();
>> @@ -2125,13 +2125,13 @@ static void ats_write(CPUARMState *env, const 
>> ARMCPRegInfo *ri, uint64_t value)
>>  /* stage 1 current state PL0: ATS1CUR, ATS1CUW */
>>  switch (el) {
>>  case 3:
>> -mmu_idx = ARMMMUIdx_S1SE0;
>> +mmu_bit = ARMMMUBit_S1SE0;
>>  break;
>>  case 2:
>> -mmu_idx = ARMMMUIdx_S1NSE0;
>> +mmu_bit = ARMMMUBit_S1NSE0;
>>  break;
>>  case 1:
>> -mmu_idx = secure ? ARMMMUIdx_S1SE0 : ARMMMUIdx_S1NSE0;
>> +mmu_bit = secure ? ARMMMUBit_S1SE0 : ARMMMUBit_S1NSE0;
>>  break;
>>  default:
>>  g_assert_not_reached();
>> @@ -2139,17 +2139,17 @@ static void ats_write(CPUARMState *env, const 
>> ARMCPRegInfo *ri, uint64_t value)
>>  break;
>>  case 4:
>>  /* stage 1+2 NonSecure PL1: ATS12NSOPR, ATS12NSOPW */
>> -mmu_idx = ARMMMUIdx_S12NSE1;
>> +mmu_bit = ARMMMUBit_S12NSE1;
>>  break;
>>  case 6:
>>  /* stage 1+2 NonSecure PL0: ATS12NSOUR, ATS12NSOUW */
>> -mmu_idx = ARMMMUIdx_S12NSE0;
>> +mmu_bit = ARMMMUBit_S12NSE0;
>>  break;
>>  default:
>>  g_assert_not_reached();
>>  }
>>
>> -par64 = do_ats_write(env, value, access_type, mmu_idx);
>> +par64 = do_ats_write(env, value, access_type, 
>> arm_mmu_bit_to_idx(mmu_bit));
>
> ... this sort of churn, only to convert *back* to an index.

Yeah I've dropped this is v9, ARM now justs does 1 << ARMMMUIdx_foo at
the tlb_flush call sites.

>
>> @@ -2185,26 +2186,26 @@ static void ats_write64(CPUARMState *env, const 
>> ARMCPRegInfo *ri,
>>  case 0:
>>  switch (ri->opc1) {
>>  case 0: /* AT S1E1R, AT S1E1W */
>> -mmu_idx = secure ? ARMMMUIdx_S1SE1 : ARMMMUIdx_S1NSE1;
>> +mmu_idx = secure ? ARMMMUBit_S1SE1 : ARMMMUBit_S1NSE1;
>>  break;
>>  case 4: /* AT S1E2R, AT S1E2W */
>> -mmu_idx = ARMMMUIdx_S1E2;
>> +mmu_idx = ARMMMUBit_S1E2;
>>  break;
>>  case 6: /* AT S1E3R, AT S1E3W */
>> -mmu_idx = ARMMMUIdx_S1E3;
>> +mmu_idx = ARMMMUBit_S1E3;
>>

Re: [Qemu-devel] [PATCH 1/2] ppc/kvm: Handle the "family" CPU via alias instead of registering new types

2017-01-31 Thread Thomas Huth

On 01.02.2017 01:10, David Gibson wrote:
> On Tue, Jan 31, 2017 at 02:11:58PM +0100, Thomas Huth wrote:
>> When running with KVM on POWER, we are registering a "family" CPU
>> type for the host CPU that we are running on. For example, on all
>> POWER8-compatible hosts, we register a "POWER8" CPU type, so that
>> you can always start QEMU with "-cpu POWER8" there, without the
>> need to know whether you are running on a POWER8, POWER8E or POWER8NVL
>> host machine.
>> However, we also have a "POWER8" CPU alias in the ppc_cpu_aliases list
>> (that is mainly useful for TCG). This leads to two cosmetical drawbacks:
>> If the user runs QEMU with "-cpu ?", we always claim that POWER8 is an
>> "alias for POWER8_v2.0" - which is simply not true when running with
>> KVM on POWER. And when using the 'query-cpu-definitions' QMP call,
>> there are currently two entries for "POWER8", one for the alias, and
>> one for the additional registered type.
>> To solve the two problems, we should rather update the "family" alias
>> instead of registering a new types. We then only have one "POWER8"
>> CPU definition around, an alias, which also points to the right
>> destination.
>>
>> Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1396536
>> Signed-off-by: Thomas Huth 
> 
> Reviewed-by: David Gibson 
> 
> Updating the otherwise static table of aliases is kind of ugly, but
> then so is registering an extra full type as we do now.
> 
> Is this safe to apply without the follow up patch to vl.c.

Yes. It fixes the problem with "query-cpu-definitions" already. You just
need the other patch to get the output of "-cpu ?" right, too.

 Thomas




signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v4 0/2] block/qapi: refactor and optimize the qmp_query_blockstats()

2017-01-31 Thread Markus Armbruster

Max Reitz  writes:

> Thanks, dropped one of the duplicate info->value assignments in patch 2
> and applied the series to my block tree:
>
> https://github.com/XanClic/qemu/commits/block
>
> (Assuming Markus' R-b meant that he does not intends to take it through
> his tree.)

"Does not intend" is a bit too strong, but please go ahead and save me
work :)

[Qemu-devel] [PATCH v2] xen: use qdev_unplug() instead of g_free() in xen_pv_find_xendev()

2017-01-31 Thread Juergen Gross

The error exits of xen_pv_find_xendev() free the new xen-device via
g_free() which is wrong.

As the xen-device has been initialized as qdev it must be removed
via qdev_unplug().

This bug has been introduced with commit 3a6c9172ac5951e6dac2b3f6
("xen: create qdev for each backend device").

Reported-by: Roger Pau Monné 
Tested-by: Roger Pau Monné 
Signed-off-by: Juergen Gross 
---
V2: set free method to avoid memory leak (Peter Maydell)
use DEVICE(xendev) instead of >qdev (Peter Maydell)
---
 hw/xen/xen_backend.c | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/hw/xen/xen_backend.c b/hw/xen/xen_backend.c
index d119004..6c21c37 100644
--- a/hw/xen/xen_backend.c
+++ b/hw/xen/xen_backend.c
@@ -124,10 +124,11 @@ static struct XenDevice *xen_be_get_xendev(const char 
*type, int dom, int dev,
 /* init new xendev */
 xendev = g_malloc0(ops->size);
 object_initialize(>qdev, ops->size, TYPE_XENBACKEND);
-qdev_set_parent_bus(>qdev, xen_sysbus);
-qdev_set_id(>qdev, g_strdup_printf("xen-%s-%d", type, dev));
-qdev_init_nofail(>qdev);
-object_unref(OBJECT(>qdev));
+OBJECT(xendev)->free = g_free;
+qdev_set_parent_bus(DEVICE(xendev), xen_sysbus);
+qdev_set_id(DEVICE(xendev), g_strdup_printf("xen-%s-%d", type, dev));
+qdev_init_nofail(DEVICE(xendev));
+object_unref(OBJECT(xendev));
 
 xendev->type  = type;
 xendev->dom   = dom;
@@ -145,7 +146,7 @@ static struct XenDevice *xen_be_get_xendev(const char 
*type, int dom, int dev,
 xendev->evtchndev = xenevtchn_open(NULL, 0);
 if (xendev->evtchndev == NULL) {
 xen_pv_printf(NULL, 0, "can't open evtchn device\n");
-g_free(xendev);
+qdev_unplug(DEVICE(xendev), NULL);
 return NULL;
 }
 fcntl(xenevtchn_fd(xendev->evtchndev), F_SETFD, FD_CLOEXEC);
@@ -155,7 +156,7 @@ static struct XenDevice *xen_be_get_xendev(const char 
*type, int dom, int dev,
 if (xendev->gnttabdev == NULL) {
 xen_pv_printf(NULL, 0, "can't open gnttab device\n");
 xenevtchn_close(xendev->evtchndev);
-g_free(xendev);
+qdev_unplug(DEVICE(xendev), NULL);
 return NULL;
 }
 } else {
-- 
2.10.2

Re: [Qemu-devel] [PATCH] Drop QEMU_GNUC_PREREQ() checks for gcc older than 4.1

2017-01-31 Thread Markus Armbruster

Paolo Bonzini  writes:

> On 31/01/2017 12:40, Markus Armbruster wrote:
>>>  
>>>  #define QEMU_NORETURN __attribute__ ((__noreturn__))
>>>  
>>> -#if QEMU_GNUC_PREREQ(3, 4)
>>>  #define QEMU_WARN_UNUSED_RESULT __attribute__((warn_unused_result))
>>> -#else
>>> -#define QEMU_WARN_UNUSED_RESULT
>>> -#endif
>> Should we inline this macro?
>> 
>>>  
>>> -#if QEMU_GNUC_PREREQ(4, 0)
>>>  #define QEMU_SENTINEL __attribute__((sentinel))
>>> -#else
>>> -#define QEMU_SENTINEL
>>> -#endif
>> Likewise.
>
> Why, since we don't do that for QEMU_NORETURN, QEMU_PACKED, etc.?

Because we do it for aligned, always_inline, constructor, format, mode,
noinline, and in places even noreturn and packed:

$ git-grep __attribute__ | sed '/define/d;s/.*__attribute__ 
*((\([A-Za-z0-9_]*\).*/\1/' | sort -u

[Qemu-devel] [PULL 2/5] sheepdog: reorganize coroutine flow

2017-01-31 Thread Jeff Cody

From: Paolo Bonzini 

Delimit co_recv's lifetime clearly in aio_read_response.

Do a simple qemu_coroutine_enter in aio_read_response, letting
sd_co_writev call sd_write_done.

Handle nr_pending in the same way in sd_co_rw_vector,
sd_write_done and sd_co_flush_to_disk.

Remove sd_co_rw_vector's return value; just leave with no
pending requests.

[Jeff: added missing 'return' back, spotted by Paolo after
   series was applied.]

Signed-off-by: Jeff Cody 
---
 block/sheepdog.c | 115 ---
 1 file changed, 42 insertions(+), 73 deletions(-)

diff --git a/block/sheepdog.c b/block/sheepdog.c
index 5fde37a..e0985df 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -345,8 +345,6 @@ struct SheepdogAIOCB {
 enum AIOCBState aiocb_type;
 
 Coroutine *coroutine;
-void (*aio_done_func)(SheepdogAIOCB *);
-
 int nr_pending;
 
 uint32_t min_affect_data_idx;
@@ -449,14 +447,13 @@ static const char * sd_strerror(int err)
  *
  * 1. In sd_co_rw_vector, we send the I/O requests to the server and
  *link the requests to the inflight_list in the
- *BDRVSheepdogState.  The function exits without waiting for
+ *BDRVSheepdogState.  The function yields while waiting for
  *receiving the response.
  *
  * 2. We receive the response in aio_read_response, the fd handler to
- *the sheepdog connection.  If metadata update is needed, we send
- *the write request to the vdi object in sd_write_done, the write
- *completion function.  We switch back to sd_co_readv/writev after
- *all the requests belonging to the AIOCB are finished.
+ *the sheepdog connection.  We switch back to sd_co_readv/sd_writev
+ *after all the requests belonging to the AIOCB are finished.  If
+ *needed, sd_co_writev will send another requests for the vdi object.
  */
 
 static inline AIOReq *alloc_aio_req(BDRVSheepdogState *s, SheepdogAIOCB *acb,
@@ -491,12 +488,6 @@ static inline void free_aio_req(BDRVSheepdogState *s, 
AIOReq *aio_req)
 acb->nr_pending--;
 }
 
-static void coroutine_fn sd_finish_aiocb(SheepdogAIOCB *acb)
-{
-qemu_coroutine_enter(acb->coroutine);
-qemu_aio_unref(acb);
-}
-
 static const AIOCBInfo sd_aiocb_info = {
 .aiocb_size = sizeof(SheepdogAIOCB),
 };
@@ -517,7 +508,6 @@ static SheepdogAIOCB *sd_aio_setup(BlockDriverState *bs, 
QEMUIOVector *qiov,
 acb->sector_num = sector_num;
 acb->nb_sectors = nb_sectors;
 
-acb->aio_done_func = NULL;
 acb->coroutine = qemu_coroutine_self();
 acb->ret = 0;
 acb->nr_pending = 0;
@@ -788,9 +778,6 @@ static void coroutine_fn aio_read_response(void *opaque)
 
 switch (acb->aiocb_type) {
 case AIOCB_WRITE_UDATA:
-/* this coroutine context is no longer suitable for co_recv
- * because we may send data to update vdi objects */
-s->co_recv = NULL;
 if (!is_data_obj(aio_req->oid)) {
 break;
 }
@@ -838,6 +825,11 @@ static void coroutine_fn aio_read_response(void *opaque)
 }
 }
 
+/* No more data for this aio_req (reload_inode below uses its own file
+ * descriptor handler which doesn't use co_recv).
+*/
+s->co_recv = NULL;
+
 switch (rsp.result) {
 case SD_RES_SUCCESS:
 break;
@@ -855,7 +847,7 @@ static void coroutine_fn aio_read_response(void *opaque)
 aio_req->oid = vid_to_vdi_oid(s->inode.vdi_id);
 }
 resend_aioreq(s, aio_req);
-goto out;
+return;
 default:
 acb->ret = -EIO;
 error_report("%s", sd_strerror(rsp.result));
@@ -868,13 +860,12 @@ static void coroutine_fn aio_read_response(void *opaque)
  * We've finished all requests which belong to the AIOCB, so
  * we can switch back to sd_co_readv/writev now.
  */
-acb->aio_done_func(acb);
+qemu_coroutine_enter(acb->coroutine);
 }
-out:
-s->co_recv = NULL;
+
 return;
+
 err:
-s->co_recv = NULL;
 reconnect_to_sdog(opaque);
 }
 
@@ -1973,7 +1964,6 @@ static int sd_truncate(BlockDriverState *bs, int64_t 
offset)
 /*
  * This function is called after writing data objects.  If we need to
  * update metadata, this sends a write request to the vdi object.
- * Otherwise, this switches back to sd_co_readv/writev.
  */
 static void coroutine_fn sd_write_done(SheepdogAIOCB *acb)
 {
@@ -1986,6 +1976,7 @@ static void coroutine_fn sd_write_done(SheepdogAIOCB *acb)
 mx = acb->max_dirty_data_idx;
 if (mn <= mx) {
 /* we need to update the vdi object. */
+++acb->nr_pending;
 offset = sizeof(s->inode) - sizeof(s->inode.data_vdi_id) +
 mn * sizeof(s->inode.data_vdi_id[0]);
 data_len = (mx - mn + 1) * sizeof(s->inode.data_vdi_id[0]);
@@ -1999,13 +1990,10 @@ static void coroutine_fn sd_write_done(SheepdogAIOCB 
*acb)
 data_len, offset, 0, false, 0, offset);

[Qemu-devel] [PULL 5/5] sheepdog: reorganize check for overlapping requests

2017-01-31 Thread Jeff Cody

From: Paolo Bonzini 

Wrap the code that was copied repeatedly in the two functions,
sd_aio_setup and sd_aio_complete.

Signed-off-by: Paolo Bonzini 
Message-id: 20161129113245.32724-6-pbonz...@redhat.com
Signed-off-by: Jeff Cody 
---
 block/sheepdog.c | 66 ++--
 1 file changed, 30 insertions(+), 36 deletions(-)

diff --git a/block/sheepdog.c b/block/sheepdog.c
index 3ef7601..f757157 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -479,6 +479,19 @@ static inline AIOReq *alloc_aio_req(BDRVSheepdogState *s, 
SheepdogAIOCB *acb,
 return aio_req;
 }
 
+static void wait_for_overlapping_aiocb(BDRVSheepdogState *s, SheepdogAIOCB 
*acb)
+{
+SheepdogAIOCB *cb;
+
+retry:
+QLIST_FOREACH(cb, >inflight_aiocb_head, aiocb_siblings) {
+if (AIOCBOverlapping(acb, cb)) {
+qemu_co_queue_wait(>overlapping_queue);
+goto retry;
+}
+}
+}
+
 static void sd_aio_setup(SheepdogAIOCB *acb, BDRVSheepdogState *s,
  QEMUIOVector *qiov, int64_t sector_num, int 
nb_sectors,
  int type)
@@ -505,6 +518,13 @@ static void sd_aio_setup(SheepdogAIOCB *acb, 
BDRVSheepdogState *s,
 acb->min_dirty_data_idx = UINT32_MAX;
 acb->max_dirty_data_idx = 0;
 acb->aiocb_type = type;
+
+if (type == AIOCB_FLUSH_CACHE) {
+return;
+}
+
+wait_for_overlapping_aiocb(s, acb);
+QLIST_INSERT_HEAD(>inflight_aiocb_head, acb, aiocb_siblings);
 }
 
 /* Return -EIO in case of error, file descriptor on success */
@@ -2189,18 +2209,14 @@ static void coroutine_fn sd_co_rw_vector(SheepdogAIOCB 
*acb)
 }
 }
 
-static bool check_overlapping_aiocb(BDRVSheepdogState *s, SheepdogAIOCB *aiocb)
+static void sd_aio_complete(SheepdogAIOCB *acb)
 {
-SheepdogAIOCB *cb;
-
-QLIST_FOREACH(cb, >inflight_aiocb_head, aiocb_siblings) {
-if (AIOCBOverlapping(aiocb, cb)) {
-return true;
-}
+if (acb->aiocb_type == AIOCB_FLUSH_CACHE) {
+return;
 }
 
-QLIST_INSERT_HEAD(>inflight_aiocb_head, aiocb, aiocb_siblings);
-return false;
+QLIST_REMOVE(acb, aiocb_siblings);
+qemu_co_queue_restart_all(>s->overlapping_queue);
 }
 
 static coroutine_fn int sd_co_writev(BlockDriverState *bs, int64_t sector_num,
@@ -2219,18 +2235,10 @@ static coroutine_fn int sd_co_writev(BlockDriverState 
*bs, int64_t sector_num,
 }
 
 sd_aio_setup(, s, qiov, sector_num, nb_sectors, AIOCB_WRITE_UDATA);
-
-retry:
-if (check_overlapping_aiocb(s, )) {
-qemu_co_queue_wait(>overlapping_queue);
-goto retry;
-}
-
 sd_co_rw_vector();
 sd_write_done();
+sd_aio_complete();
 
-QLIST_REMOVE(, aiocb_siblings);
-qemu_co_queue_restart_all(>overlapping_queue);
 return acb.ret;
 }
 
@@ -2241,17 +2249,9 @@ static coroutine_fn int sd_co_readv(BlockDriverState 
*bs, int64_t sector_num,
 BDRVSheepdogState *s = bs->opaque;
 
 sd_aio_setup(, s, qiov, sector_num, nb_sectors, AIOCB_READ_UDATA);
-
-retry:
-if (check_overlapping_aiocb(s, )) {
-qemu_co_queue_wait(>overlapping_queue);
-goto retry;
-}
-
 sd_co_rw_vector();
+sd_aio_complete();
 
-QLIST_REMOVE(, aiocb_siblings);
-qemu_co_queue_restart_all(>overlapping_queue);
 return acb.ret;
 }
 
@@ -2275,6 +2275,8 @@ static int coroutine_fn 
sd_co_flush_to_disk(BlockDriverState *bs)
 if (--acb.nr_pending) {
 qemu_coroutine_yield();
 }
+
+sd_aio_complete();
 return acb.ret;
 }
 
@@ -2730,17 +2732,9 @@ static coroutine_fn int sd_co_pdiscard(BlockDriverState 
*bs, int64_t offset,
 }
 sd_aio_setup(, s, _iov, offset >> BDRV_SECTOR_BITS,
  count >> BDRV_SECTOR_BITS, AIOCB_DISCARD_OBJ);
-
-retry:
-if (check_overlapping_aiocb(s, )) {
-qemu_co_queue_wait(>overlapping_queue);
-goto retry;
-}
-
 sd_co_rw_vector();
+sd_aio_complete();
 
-QLIST_REMOVE(, aiocb_siblings);
-qemu_co_queue_restart_all(>overlapping_queue);
 return acb.ret;
 }
 
-- 
2.9.3

[Qemu-devel] [PULL 3/5] sheepdog: do not use BlockAIOCB

2017-01-31 Thread Jeff Cody

From: Paolo Bonzini 

Sheepdog's AIOCB are completely internal entities for a group of
requests and do not need dynamic allocation.

Signed-off-by: Paolo Bonzini 
Message-id: 20161129113245.32724-4-pbonz...@redhat.com
Signed-off-by: Jeff Cody 
---
 block/sheepdog.c | 99 ++--
 1 file changed, 39 insertions(+), 60 deletions(-)

diff --git a/block/sheepdog.c b/block/sheepdog.c
index e0985df..33ded57 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -306,6 +306,7 @@ static inline size_t count_data_objs(const struct 
SheepdogInode *inode)
 } while (0)
 
 typedef struct SheepdogAIOCB SheepdogAIOCB;
+typedef struct BDRVSheepdogState BDRVSheepdogState;
 
 typedef struct AIOReq {
 SheepdogAIOCB *aiocb;
@@ -334,7 +335,7 @@ enum AIOCBState {
|| y->max_affect_data_idx < x->min_affect_data_idx))
 
 struct SheepdogAIOCB {
-BlockAIOCB common;
+BDRVSheepdogState *s;
 
 QEMUIOVector *qiov;
 
@@ -362,7 +363,7 @@ struct SheepdogAIOCB {
 QLIST_ENTRY(SheepdogAIOCB) aiocb_siblings;
 };
 
-typedef struct BDRVSheepdogState {
+struct BDRVSheepdogState {
 BlockDriverState *bs;
 AioContext *aio_context;
 
@@ -389,7 +390,7 @@ typedef struct BDRVSheepdogState {
 
 CoQueue overlapping_queue;
 QLIST_HEAD(inflight_aiocb_head, SheepdogAIOCB) inflight_aiocb_head;
-} BDRVSheepdogState;
+};
 
 typedef struct BDRVSheepdogReopenState {
 int fd;
@@ -488,20 +489,15 @@ static inline void free_aio_req(BDRVSheepdogState *s, 
AIOReq *aio_req)
 acb->nr_pending--;
 }
 
-static const AIOCBInfo sd_aiocb_info = {
-.aiocb_size = sizeof(SheepdogAIOCB),
-};
-
-static SheepdogAIOCB *sd_aio_setup(BlockDriverState *bs, QEMUIOVector *qiov,
-   int64_t sector_num, int nb_sectors)
+static void sd_aio_setup(SheepdogAIOCB *acb, BDRVSheepdogState *s,
+ QEMUIOVector *qiov, int64_t sector_num, int 
nb_sectors,
+ int type)
 {
-SheepdogAIOCB *acb;
 uint32_t object_size;
-BDRVSheepdogState *s = bs->opaque;
 
 object_size = (UINT32_C(1) << s->inode.block_size_shift);
 
-acb = qemu_aio_get(_aiocb_info, bs, NULL, NULL);
+acb->s = s;
 
 acb->qiov = qiov;
 
@@ -518,8 +514,7 @@ static SheepdogAIOCB *sd_aio_setup(BlockDriverState *bs, 
QEMUIOVector *qiov,
 
 acb->min_dirty_data_idx = UINT32_MAX;
 acb->max_dirty_data_idx = 0;
-
-return acb;
+acb->aiocb_type = type;
 }
 
 /* Return -EIO in case of error, file descriptor on success */
@@ -1967,7 +1962,7 @@ static int sd_truncate(BlockDriverState *bs, int64_t 
offset)
  */
 static void coroutine_fn sd_write_done(SheepdogAIOCB *acb)
 {
-BDRVSheepdogState *s = acb->common.bs->opaque;
+BDRVSheepdogState *s = acb->s;
 struct iovec iov;
 AIOReq *aio_req;
 uint32_t offset, data_len, mn, mx;
@@ -2105,16 +2100,15 @@ out:
  * Returns 1 when we need to wait a response, 0 when there is no sent
  * request and -errno in error cases.
  */
-static void coroutine_fn sd_co_rw_vector(void *p)
+static void coroutine_fn sd_co_rw_vector(SheepdogAIOCB *acb)
 {
-SheepdogAIOCB *acb = p;
 int ret = 0;
 unsigned long len, done = 0, total = acb->nb_sectors * BDRV_SECTOR_SIZE;
 unsigned long idx;
 uint32_t object_size;
 uint64_t oid;
 uint64_t offset;
-BDRVSheepdogState *s = acb->common.bs->opaque;
+BDRVSheepdogState *s = acb->s;
 SheepdogInode *inode = >inode;
 AIOReq *aio_req;
 
@@ -,7 +2216,7 @@ static bool check_overlapping_aiocb(BDRVSheepdogState *s, 
SheepdogAIOCB *aiocb)
 static coroutine_fn int sd_co_writev(BlockDriverState *bs, int64_t sector_num,
 int nb_sectors, QEMUIOVector *qiov)
 {
-SheepdogAIOCB *acb;
+SheepdogAIOCB acb;
 int ret;
 int64_t offset = (sector_num + nb_sectors) * BDRV_SECTOR_SIZE;
 BDRVSheepdogState *s = bs->opaque;
@@ -2234,76 +2228,65 @@ static coroutine_fn int sd_co_writev(BlockDriverState 
*bs, int64_t sector_num,
 }
 }
 
-acb = sd_aio_setup(bs, qiov, sector_num, nb_sectors);
-acb->aiocb_type = AIOCB_WRITE_UDATA;
+sd_aio_setup(, s, qiov, sector_num, nb_sectors, AIOCB_WRITE_UDATA);
 
 retry:
-if (check_overlapping_aiocb(s, acb)) {
+if (check_overlapping_aiocb(s, )) {
 qemu_co_queue_wait(>overlapping_queue);
 goto retry;
 }
 
-sd_co_rw_vector(acb);
-sd_write_done(acb);
+sd_co_rw_vector();
+sd_write_done();
 
-QLIST_REMOVE(acb, aiocb_siblings);
+QLIST_REMOVE(, aiocb_siblings);
 qemu_co_queue_restart_all(>overlapping_queue);
-ret = acb->ret;
-qemu_aio_unref(acb);
-return ret;
+return acb.ret;
 }
 
 static coroutine_fn int sd_co_readv(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, QEMUIOVector *qiov)
 {
-SheepdogAIOCB *acb;
-int ret;
+SheepdogAIOCB acb;
 BDRVSheepdogState *s =

Re: [Qemu-devel] [PATCH for-2.9 v3 0/5] Sheepdog cleanups

2017-01-31 Thread Jeff Cody

On Mon, Jan 30, 2017 at 07:48:56PM -0600, Paolo Bonzini wrote:
> 
> 
> On 04/01/2017 11:47, Jeff Cody wrote:
> > On Wed, Jan 04, 2017 at 12:42:53PM +0100, Paolo Bonzini wrote:
> >>
> >>
> >> On 04/01/2017 05:07, Jeff Cody wrote:
> >>> On Wed, Dec 21, 2016 at 03:07:07PM +0100, Paolo Bonzini wrote:
> 
> 
>  On 29/11/2016 12:32, Paolo Bonzini wrote:
> > Cleaning up the code and removing duplication makes it simpler to
> > later adapt it for the multiqueue work.
> >
> > Tested against sheepdog 1.0.  I also tested taking snapshots and 
> > reverting
> > to older snapshots, but the latter only worked with "dog vdi rollback".
> > Neither loadvm nor qemu-img worked for me.
> >
> > Paolo
> >
> > v1->v2: placate patchew
> > v2->v3: rebase
> >
> > Paolo Bonzini (5):
> >   sheepdog: remove unused cancellation support
> >   sheepdog: reorganize coroutine flow
> >   sheepdog: do not use BlockAIOCB
> >   sheepdog: simplify inflight_aio_head management
> >   sheepdog: reorganize check for overlapping requests
> >
> >  block/sheepdog.c | 289 
> > ---
> >  1 file changed, 83 insertions(+), 206 deletions(-)
> >
> 
>  2.8 is now out, so ping.
> 
> >>>
> >>> I don't have a functional sheepdog setup at the moment; have you tested
> >>> these patches, or should I set up a test rig for them? (I am guessing I
> >>> should do the latter; either way, I'll pull the patches in once I or 
> >>> someone
> >>> else has tested them).
> >>
> >> Yes, I have tested them.  Fedora's sheepdog package is fubar right now.
> >> I've built an updated one and asked for ownership but haven't had any
> >> answer yet.
> >>
> >> Paolo
> > 
> > Great, thanks.
> > 
> > Applied to my block branch:
> > 
> > git://github.com/codyprime/qemu-kvm-jtc.git block
> > 
> > -Jeff
> > 
> > 
> 
> Ping?
> 

Sorry, never flushed my pull request queue.  Flushed now.

Thanks,
Jeff

[Qemu-devel] [PULL 1/5] sheepdog: remove unused cancellation support

2017-01-31 Thread Jeff Cody

From: Paolo Bonzini 

SheepdogAIOCB is internal to sheepdog.c, hence it is never canceled.

Signed-off-by: Paolo Bonzini 
Message-id: 20161129113245.32724-2-pbonz...@redhat.com
Signed-off-by: Jeff Cody 
---
 block/sheepdog.c | 52 
 1 file changed, 52 deletions(-)

diff --git a/block/sheepdog.c b/block/sheepdog.c
index 5637e0c..5fde37a 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -347,7 +347,6 @@ struct SheepdogAIOCB {
 Coroutine *coroutine;
 void (*aio_done_func)(SheepdogAIOCB *);
 
-bool cancelable;
 int nr_pending;
 
 uint32_t min_affect_data_idx;
@@ -486,7 +485,6 @@ static inline void free_aio_req(BDRVSheepdogState *s, 
AIOReq *aio_req)
 {
 SheepdogAIOCB *acb = aio_req->aiocb;
 
-acb->cancelable = false;
 QLIST_REMOVE(aio_req, aio_siblings);
 g_free(aio_req);
 
@@ -499,57 +497,8 @@ static void coroutine_fn sd_finish_aiocb(SheepdogAIOCB 
*acb)
 qemu_aio_unref(acb);
 }
 
-/*
- * Check whether the specified acb can be canceled
- *
- * We can cancel aio when any request belonging to the acb is:
- *  - Not processed by the sheepdog server.
- *  - Not linked to the inflight queue.
- */
-static bool sd_acb_cancelable(const SheepdogAIOCB *acb)
-{
-BDRVSheepdogState *s = acb->common.bs->opaque;
-AIOReq *aioreq;
-
-if (!acb->cancelable) {
-return false;
-}
-
-QLIST_FOREACH(aioreq, >inflight_aio_head, aio_siblings) {
-if (aioreq->aiocb == acb) {
-return false;
-}
-}
-
-return true;
-}
-
-static void sd_aio_cancel(BlockAIOCB *blockacb)
-{
-SheepdogAIOCB *acb = (SheepdogAIOCB *)blockacb;
-BDRVSheepdogState *s = acb->common.bs->opaque;
-AIOReq *aioreq, *next;
-
-if (sd_acb_cancelable(acb)) {
-/* Remove outstanding requests from failed queue.  */
-QLIST_FOREACH_SAFE(aioreq, >failed_aio_head, aio_siblings,
-   next) {
-if (aioreq->aiocb == acb) {
-free_aio_req(s, aioreq);
-}
-}
-
-assert(acb->nr_pending == 0);
-if (acb->common.cb) {
-acb->common.cb(acb->common.opaque, -ECANCELED);
-}
-sd_finish_aiocb(acb);
-}
-}
-
 static const AIOCBInfo sd_aiocb_info = {
 .aiocb_size = sizeof(SheepdogAIOCB),
-.cancel_async   = sd_aio_cancel,
 };
 
 static SheepdogAIOCB *sd_aio_setup(BlockDriverState *bs, QEMUIOVector *qiov,
@@ -569,7 +518,6 @@ static SheepdogAIOCB *sd_aio_setup(BlockDriverState *bs, 
QEMUIOVector *qiov,
 acb->nb_sectors = nb_sectors;
 
 acb->aio_done_func = NULL;
-acb->cancelable = true;
 acb->coroutine = qemu_coroutine_self();
 acb->ret = 0;
 acb->nr_pending = 0;
-- 
2.9.3

[Qemu-devel] [PULL 4/5] sheepdog: simplify inflight_aio_head management

2017-01-31 Thread Jeff Cody

From: Paolo Bonzini 

Add to the list in add_aio_request and, indirectly, resend_aioreq.  Inline
free_aio_req in the caller, it does not simply undo alloc_aio_req's job.

Signed-off-by: Paolo Bonzini 
Message-id: 20161129113245.32724-5-pbonz...@redhat.com
Signed-off-by: Jeff Cody 
---
 block/sheepdog.c | 23 ++-
 1 file changed, 6 insertions(+), 17 deletions(-)

diff --git a/block/sheepdog.c b/block/sheepdog.c
index 33ded57..3ef7601 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -479,16 +479,6 @@ static inline AIOReq *alloc_aio_req(BDRVSheepdogState *s, 
SheepdogAIOCB *acb,
 return aio_req;
 }
 
-static inline void free_aio_req(BDRVSheepdogState *s, AIOReq *aio_req)
-{
-SheepdogAIOCB *acb = aio_req->aiocb;
-
-QLIST_REMOVE(aio_req, aio_siblings);
-g_free(aio_req);
-
-acb->nr_pending--;
-}
-
 static void sd_aio_setup(SheepdogAIOCB *acb, BDRVSheepdogState *s,
  QEMUIOVector *qiov, int64_t sector_num, int 
nb_sectors,
  int type)
@@ -730,7 +720,6 @@ static coroutine_fn void reconnect_to_sdog(void *opaque)
 while (!QLIST_EMPTY(>failed_aio_head)) {
 aio_req = QLIST_FIRST(>failed_aio_head);
 QLIST_REMOVE(aio_req, aio_siblings);
-QLIST_INSERT_HEAD(>inflight_aio_head, aio_req, aio_siblings);
 resend_aioreq(s, aio_req);
 }
 }
@@ -825,6 +814,7 @@ static void coroutine_fn aio_read_response(void *opaque)
 */
 s->co_recv = NULL;
 
+QLIST_REMOVE(aio_req, aio_siblings);
 switch (rsp.result) {
 case SD_RES_SUCCESS:
 break;
@@ -849,8 +839,9 @@ static void coroutine_fn aio_read_response(void *opaque)
 break;
 }
 
-free_aio_req(s, aio_req);
-if (!acb->nr_pending) {
+g_free(aio_req);
+
+if (!--acb->nr_pending) {
 /*
  * We've finished all requests which belong to the AIOCB, so
  * we can switch back to sd_co_readv/writev now.
@@ -1110,6 +1101,8 @@ static void coroutine_fn 
add_aio_request(BDRVSheepdogState *s, AIOReq *aio_req,
 uint64_t old_oid = aio_req->base_oid;
 bool create = aio_req->create;
 
+QLIST_INSERT_HEAD(>inflight_aio_head, aio_req, aio_siblings);
+
 if (!nr_copies) {
 error_report("bug");
 }
@@ -1983,7 +1976,6 @@ static void coroutine_fn sd_write_done(SheepdogAIOCB *acb)
 iov.iov_len = sizeof(s->inode);
 aio_req = alloc_aio_req(s, acb, vid_to_vdi_oid(s->inode.vdi_id),
 data_len, offset, 0, false, 0, offset);
-QLIST_INSERT_HEAD(>inflight_aio_head, aio_req, aio_siblings);
 add_aio_request(s, aio_req, , 1, AIOCB_WRITE_UDATA);
 if (--acb->nr_pending) {
 qemu_coroutine_yield();
@@ -2185,8 +2177,6 @@ static void coroutine_fn sd_co_rw_vector(SheepdogAIOCB 
*acb)
 old_oid,
 acb->aiocb_type == AIOCB_DISCARD_OBJ ?
 0 : done);
-QLIST_INSERT_HEAD(>inflight_aio_head, aio_req, aio_siblings);
-
 add_aio_request(s, aio_req, acb->qiov->iov, acb->qiov->niov,
 acb->aiocb_type);
 done:
@@ -2280,7 +2270,6 @@ static int coroutine_fn 
sd_co_flush_to_disk(BlockDriverState *bs)
 acb.nr_pending++;
 aio_req = alloc_aio_req(s, , vid_to_vdi_oid(s->inode.vdi_id),
 0, 0, 0, false, 0, 0);
-QLIST_INSERT_HEAD(>inflight_aio_head, aio_req, aio_siblings);
 add_aio_request(s, aio_req, NULL, 0, acb.aiocb_type);
 
 if (--acb.nr_pending) {
-- 
2.9.3

[Qemu-devel] [PULL 0/5] Block patches

2017-01-31 Thread Jeff Cody

The following changes since commit a0def594286d9110a6035e02eef558cf3cf5d847:

  Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging 
(2017-01-30 10:23:20 +)

are available in the git repository at:

  https://github.com/codyprime/qemu-kvm-jtc.git tags/block-pull-request

for you to fetch changes up to acf6e5f0962c4be670d4a93ede77423512521876:

  sheepdog: reorganize check for overlapping requests (2017-02-01 00:17:20 
-0500)


Block patches


Paolo Bonzini (5):
  sheepdog: remove unused cancellation support
  sheepdog: reorganize coroutine flow
  sheepdog: do not use BlockAIOCB
  sheepdog: simplify inflight_aio_head management
  sheepdog: reorganize check for overlapping requests

 block/sheepdog.c | 289 ---
 1 file changed, 84 insertions(+), 205 deletions(-)

-- 
2.9.3

Re: [Qemu-devel] [RFC PATCH 12/17] target/ppc/POWER9: Add POWER9 pa-features definition

2017-01-31 Thread David Gibson

On Fri, Jan 13, 2017 at 05:28:18PM +1100, Suraj Jitindar Singh wrote:
> Add a pa-features definition which includes all of the new fields which
> have been added, note we don't claim support for any of these new features
> at this stage.
> 
> Signed-off-by: Suraj Jitindar Singh 

Reviewed-by: David Gibson 

> ---
>  hw/ppc/spapr.c | 18 ++
>  1 file changed, 18 insertions(+)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 45bd2de..35799da 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -357,6 +357,20 @@ static void spapr_populate_pa_features(CPUPPCState *env, 
> void *fdt, int offset)
>  0x80, 0x00, 0x00, 0x00, 0x00, 0x00,
>  0x00, 0x00, 0x00, 0x00, 0x80, 0x00,
>  0x80, 0x00, 0x80, 0x00, 0x00, 0x00 };
> +/* Currently we don't advertise any of the "new" ISAv3.00 functionality 
> */
> +uint8_t pa_features_300[] = { 64, 0,
> +0xf6, 0x1f, 0xc7, 0xc0, 0x80, 0xf0, /*  0 -  5 */
> +0x80, 0x00, 0x00, 0x00, 0x00, 0x00, /*  6 - 11 */
> +0x00, 0x00, 0x00, 0x00, 0x80, 0x00, /* 12 - 17 */
> +0x80, 0x00, 0x80, 0x00, 0x00, 0x00, /* 18 - 23 */
> +0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* 24 - 29 */
> +0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* 30 - 35 */
> +0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* 36 - 41 */
> +0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* 42 - 47 */
> +0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* 48 - 53 */
> +0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* 54 - 59 */
> +0x00, 0x00, 0x00, 0x00   }; /* 60 - 63 */
> +
>  uint8_t *pa_features;
>  size_t pa_size;
>  
> @@ -371,6 +385,10 @@ static void spapr_populate_pa_features(CPUPPCState *env, 
> void *fdt, int offset)
>  pa_features = pa_features_207;
>  pa_size = sizeof(pa_features_207);
>  break;
> +case POWERPC_MMU_3_00:
> +pa_features = pa_features_300;
> +pa_size = sizeof(pa_features_300);
> +break;
>  default:
>  return;
>  }

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

Re: [Qemu-devel] [RFC PATCH 13/17] target/ppc/POWER9: Add cpu_has_work function for POWER9

2017-01-31 Thread David Gibson

On Fri, Jan 13, 2017 at 05:28:19PM +1100, Suraj Jitindar Singh wrote:
> The cpu has work function is used to mask interrupts used to determine
> if there is work for the cpu based on the LPCR. Add a function to do this
> for POWER9 and add it to the POWER9 cpu definition. This is similar to that
> for POWER8 except using the LPCR bits as defined for POWER9.
> 
> Signed-off-by: Suraj Jitindar Singh 

Reviewed-by: David Gibson 

> ---
>  target/ppc/translate_init.c | 45 
> +
>  1 file changed, 45 insertions(+)
> 
> diff --git a/target/ppc/translate_init.c b/target/ppc/translate_init.c
> index 87297a7..9db004d 100644
> --- a/target/ppc/translate_init.c
> +++ b/target/ppc/translate_init.c
> @@ -8797,10 +8797,54 @@ static bool ppc_pvr_match_power9(PowerPCCPUClass 
> *pcc, uint32_t pvr)
>  return false;
>  }
>  
> +static bool cpu_has_work_POWER9(CPUState *cs)
> +{
> +PowerPCCPU *cpu = POWERPC_CPU(cs);
> +CPUPPCState *env = >env;
> +
> +if (cs->halted) {
> +if (!(cs->interrupt_request & CPU_INTERRUPT_HARD)) {
> +return false;
> +}
> +/* External Exception */
> +if ((env->pending_interrupts & (1u << PPC_INTERRUPT_EXT)) &&
> +(env->spr[SPR_LPCR] & LPCR_EEE)) {
> +return true;
> +}
> +/* Decrementer Exception */
> +if ((env->pending_interrupts & (1u << PPC_INTERRUPT_DECR)) &&
> +(env->spr[SPR_LPCR] & LPCR_DEE)) {
> +return true;
> +}
> +/* Machine Check or Hypervisor Maintenance Exception */
> +if ((env->pending_interrupts & (1u << PPC_INTERRUPT_MCK |
> +1u << PPC_INTERRUPT_HMI)) && (env->spr[SPR_LPCR] & LPCR_OEE)) {
> +return true;
> +}
> +/* Privileged Doorbell Exception */
> +if ((env->pending_interrupts & (1u << PPC_INTERRUPT_DOORBELL)) &&
> +(env->spr[SPR_LPCR] & LPCR_PDEE)) {
> +return true;
> +}
> +/* Hypervisor Doorbell Exception */
> +if ((env->pending_interrupts & (1u << PPC_INTERRUPT_HDOORBELL)) &&
> +(env->spr[SPR_LPCR] & LPCR_HDEE)) {
> +return true;
> +}
> +if (env->pending_interrupts & (1u << PPC_INTERRUPT_RESET)) {
> +return true;
> +}
> +return false;
> +} else {
> +return msr_ee && (cs->interrupt_request & CPU_INTERRUPT_HARD);
> +}
> +}
> +
>  POWERPC_FAMILY(POWER9)(ObjectClass *oc, void *data)
>  {
>  DeviceClass *dc = DEVICE_CLASS(oc);
>  PowerPCCPUClass *pcc = POWERPC_CPU_CLASS(oc);
> +CPUClass *cc = CPU_CLASS(oc);
>  
>  dc->fw_name = "PowerPC,POWER9";
>  dc->desc = "POWER9";
> @@ -8811,6 +8855,7 @@ POWERPC_FAMILY(POWER9)(ObjectClass *oc, void *data)
>   PCR_COMPAT_2_05;
>  pcc->init_proc = init_proc_POWER9;
>  pcc->check_pow = check_pow_nocheck;
> +cc->has_work = cpu_has_work_POWER9;
>  pcc->insns_flags = PPC_INSNS_BASE | PPC_ISEL | PPC_STRING | PPC_MFTB |
> PPC_FLOAT | PPC_FLOAT_FSEL | PPC_FLOAT_FRES |
> PPC_FLOAT_FSQRT | PPC_FLOAT_FRSQRTE |

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

Re: [Qemu-devel] [RFC PATCH 10/17] target/ppc/POWER9: Add POWER9 mmu fault handler

2017-01-31 Thread David Gibson

On Fri, Jan 13, 2017 at 05:28:16PM +1100, Suraj Jitindar Singh wrote:
> Add a new mmu fault handler for the POWER9 cpu and add it as the handler
> for the POWER9 cpu definition.
> 
> This handler checks if the guest is radix or hash based on the value in the
> partition table entry and calls the correct fault handler accordingly.
> 
> The hash fault handling code has also been updated to check if the
> partition is using segment tables.
> 
> Currently only legacy hash (no segment tables) is supported.
> 
> Signed-off-by: Suraj Jitindar Singh 
> ---
>  target/ppc/mmu-hash64.c |  8 
>  target/ppc/mmu.h|  8 
>  target/ppc/mmu_helper.c | 47 
> +
>  target/ppc/translate_init.c |  2 +-
>  4 files changed, 64 insertions(+), 1 deletion(-)
> 
> diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
> index b9d4f4e..b476b3f 100644
> --- a/target/ppc/mmu-hash64.c
> +++ b/target/ppc/mmu-hash64.c
> @@ -73,6 +73,14 @@ static ppc_slb_t *slb_lookup(PowerPCCPU *cpu, target_ulong 
> eaddr)
>  }
>  }
>  
> +/* Check if in-memory segment tables are in use */
> +if (ppc64_use_proc_tbl(cpu)) {
> +/* TODO - Unsupported */
> +qemu_log_mask(LOG_UNIMP, "%s: unimplemented - segment table 
> support\n",
> +  __func__);
> +/* Not much we can do here, caller will generate a segment interrupt 
> */
> +}

I think this logic would be better in the fault handler.  For the
fault path in the segment table case, I don't think we want to even
model the SLB (in hardware the SLB is an important optimization, but I
don't think the software SLB will be notably better than just looking
up the seg table directly).  I think the other users of slb_lookup()
are in contexts that only make sense in the context of a software
managed SLB anyway.

>  return NULL;
>  }
>  
> diff --git a/target/ppc/mmu.h b/target/ppc/mmu.h
> index c7967c3..e07b128 100644
> --- a/target/ppc/mmu.h
> +++ b/target/ppc/mmu.h
> @@ -3,6 +3,10 @@
>  
>  #ifndef CONFIG_USER_ONLY
>  
> +/* Common Partition Table Entry Fields */
> +#define PATBE0_HR0x8000
> +#define PATBE1_GR0x8000
> +
>  /* Partition Table Entry */
>  struct patb_entry {
>  uint64_t patbe0, patbe1;
> @@ -11,6 +15,10 @@ struct patb_entry {
>  #ifdef TARGET_PPC64
>  
>  void ppc64_set_external_patb(PowerPCCPU *cpu, void *patb, Error **errp);
> +bool ppc64_use_proc_tbl(PowerPCCPU *cpu);
> +bool ppc64_radix_guest(PowerPCCPU *cpu);
> +int ppc64_handle_mmu_fault(PowerPCCPU *cpu, vaddr eaddr, int rwx,
> +   int mmu_idx);
>  
>  #endif /* TARGET_PPC64 */
>  
> diff --git a/target/ppc/mmu_helper.c b/target/ppc/mmu_helper.c
> index bc6c117..612f407 100644
> --- a/target/ppc/mmu_helper.c
> +++ b/target/ppc/mmu_helper.c
> @@ -28,6 +28,7 @@
>  #include "exec/cpu_ldst.h"
>  #include "exec/log.h"
>  #include "helper_regs.h"
> +#include "qemu/error-report.h"
>  #include "mmu.h"
>  
>  //#define DEBUG_MMU
> @@ -1281,6 +1282,17 @@ void dump_mmu(FILE *f, fprintf_function cpu_fprintf, 
> CPUPPCState *env)
>  case POWERPC_MMU_2_07a:
>  dump_slb(f, cpu_fprintf, ppc_env_get_cpu(env));
>  break;
> +case POWERPC_MMU_3_00:
> +if (ppc64_radix_guest(ppc_env_get_cpu(env))) {
> +/* TODO - Unsupported */
> +} else {
> +if (ppc64_use_proc_tbl(ppc_env_get_cpu(env))) {
> +/* TODO - Unsupported */
> +} else {
> +dump_slb(f, cpu_fprintf, ppc_env_get_cpu(env));
> +break;
> +}
> +}
>  #endif
>  default:
>  qemu_log_mask(LOG_UNIMP, "%s: unimplemented\n", __func__);
> @@ -1422,6 +1434,17 @@ hwaddr ppc_cpu_get_phys_page_debug(CPUState *cs, vaddr 
> addr)
>  case POWERPC_MMU_2_07:
>  case POWERPC_MMU_2_07a:
>  return ppc_hash64_get_phys_page_debug(cpu, addr);
> +case POWERPC_MMU_3_00:
> +if (ppc64_radix_guest(ppc_env_get_cpu(env))) {
> +/* TODO - Unsupported */
> +} else {
> +if (ppc64_use_proc_tbl(ppc_env_get_cpu(env))) {
> +/* TODO - Unsupported */
> +} else {
> +return ppc_hash64_get_phys_page_debug(cpu, addr);
> +}
> +}
> +break;
>  #endif
>  
>  case POWERPC_MMU_32B:
> @@ -2919,3 +2942,27 @@ void ppc64_set_external_patb(PowerPCCPU *cpu, void 
> *patb, Error **errp)
>  
>  env->external_patbe = (struct patb_entry *) patb;
>  }
> +
> +inline bool ppc64_use_proc_tbl(PowerPCCPU *cpu)

There's basically no point putting an inline on functions that are in
a .c rather than a .h (it will only be inlined for callers in this .o,
not elsewhere).  These are simple enough that I think they do belong
in the .h instead.

> +{
> +return !!(cpu->env.spr[SPR_LPCR] & LPCR_UPRT);
> +}
> +
> +inline bool

Re: [Qemu-devel] [RFC PATCH 11/17] target/ppc/POWER9: Update to new pte format for POWER9 accesses

2017-01-31 Thread David Gibson

On Fri, Jan 13, 2017 at 05:28:17PM +1100, Suraj Jitindar Singh wrote:
> The page table entry format was updated for the POWER9 processor.
> 
> It was decided that kernels would used the old format irrespective
> with the translation occuring at the hypervisor level. Thus we convert
> between the old and new format when accessing the ptes. Since we need the
> whole pte to perform this conversion, we remove the old functions which
> accessed either the first or second doubleword and introduce a new
> functions which access the entire pte, returning the entry converted
> back to the old format (if required). Update call sites accordingly.
> 
> Signed-off-by: Suraj Jitindar Singh 
> ---
>  hw/ppc/spapr_hcall.c| 51 ++-
>  target/ppc/mmu-hash64.c | 13 +
>  target/ppc/mmu-hash64.h | 71 
> -
>  3 files changed, 86 insertions(+), 49 deletions(-)
> 
> diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
> index 9a9bedf..9f0c20d 100644
> --- a/hw/ppc/spapr_hcall.c
> +++ b/hw/ppc/spapr_hcall.c
> @@ -125,7 +125,8 @@ static target_ulong h_enter(PowerPCCPU *cpu, 
> sPAPRMachineState *spapr,
>  pte_index &= ~7ULL;
>  token = ppc_hash64_start_access(cpu, pte_index);
>  for (; index < 8; index++) {
> -if (!(ppc_hash64_load_hpte0(cpu, token, index) & 
> HPTE64_V_VALID)) {
> +ppc_hash_pte64_t pte = ppc_hash64_load_hpte(cpu, token, index);
> +if (!(pte.pte0 & HPTE64_V_VALID)) {
>  break;
>  }
>  }
> @@ -134,8 +135,10 @@ static target_ulong h_enter(PowerPCCPU *cpu, 
> sPAPRMachineState *spapr,
>  return H_PTEG_FULL;
>  }
>  } else {
> +ppc_hash_pte64_t pte;
>  token = ppc_hash64_start_access(cpu, pte_index);

IIRC the only reason for the clumsy start_access / stop_access stuff
was because we wanted to do the two loads separately (to avoid loading
word1 in cases we don't need it).  Since you're combining both loads
into a single helper, I think you can put the start_access /
stop_access into the same helper.

Or have I missed something.

> -if (ppc_hash64_load_hpte0(cpu, token, 0) & HPTE64_V_VALID) {
> +pte = ppc_hash64_load_hpte(cpu, token, 0);
> +if (pte.pte0 & HPTE64_V_VALID) {
>  ppc_hash64_stop_access(cpu, token);
>  return H_PTEG_FULL;
>  }
> @@ -163,26 +166,25 @@ static RemoveResult remove_hpte(PowerPCCPU *cpu, 
> target_ulong ptex,
>  {
>  CPUPPCState *env = >env;
>  uint64_t token;
> -target_ulong v, r;
> +ppc_hash_pte64_t pte;
>  
>  if (!valid_pte_index(env, ptex)) {
>  return REMOVE_PARM;
>  }
>  
>  token = ppc_hash64_start_access(cpu, ptex);
> -v = ppc_hash64_load_hpte0(cpu, token, 0);
> -r = ppc_hash64_load_hpte1(cpu, token, 0);
> +pte = ppc_hash64_load_hpte(cpu, token, 0);
>  ppc_hash64_stop_access(cpu, token);
>  
> -if ((v & HPTE64_V_VALID) == 0 ||
> -((flags & H_AVPN) && (v & ~0x7fULL) != avpn) ||
> -((flags & H_ANDCOND) && (v & avpn) != 0)) {
> +if ((pte.pte0 & HPTE64_V_VALID) == 0 ||
> +((flags & H_AVPN) && (pte.pte0 & ~0x7fULL) != avpn) ||
> +((flags & H_ANDCOND) && (pte.pte0 & avpn) != 0)) {
>  return REMOVE_NOT_FOUND;
>  }
> -*vp = v;
> -*rp = r;
> +*vp = pte.pte0;
> +*rp = pte.pte1;
>  ppc_hash64_store_hpte(cpu, ptex, HPTE64_V_HPTE_DIRTY, 0);
> -ppc_hash64_tlb_flush_hpte(cpu, ptex, v, r);
> +ppc_hash64_tlb_flush_hpte(cpu, ptex, pte.pte0, pte.pte1);
>  return REMOVE_SUCCESS;
>  }
>  
> @@ -293,35 +295,36 @@ static target_ulong h_protect(PowerPCCPU *cpu, 
> sPAPRMachineState *spapr,
>  target_ulong flags = args[0];
>  target_ulong pte_index = args[1];
>  target_ulong avpn = args[2];
> +ppc_hash_pte64_t pte;
>  uint64_t token;
> -target_ulong v, r;
>  
>  if (!valid_pte_index(env, pte_index)) {
>  return H_PARAMETER;
>  }
>  
>  token = ppc_hash64_start_access(cpu, pte_index);
> -v = ppc_hash64_load_hpte0(cpu, token, 0);
> -r = ppc_hash64_load_hpte1(cpu, token, 0);
> +pte = ppc_hash64_load_hpte(cpu, token, 0);
>  ppc_hash64_stop_access(cpu, token);
>  
> -if ((v & HPTE64_V_VALID) == 0 ||
> -((flags & H_AVPN) && (v & ~0x7fULL) != avpn)) {
> +if ((pte.pte0 & HPTE64_V_VALID) == 0 ||
> +((flags & H_AVPN) && (pte.pte0 & ~0x7fULL) != avpn)) {
>  return H_NOT_FOUND;
>  }
>  
> -r &= ~(HPTE64_R_PP0 | HPTE64_R_PP | HPTE64_R_N |
> -   HPTE64_R_KEY_HI | HPTE64_R_KEY_LO);
> -r |= (flags << 55) & HPTE64_R_PP0;
> -r |= (flags << 48) & HPTE64_R_KEY_HI;
> -r |= flags & (HPTE64_R_PP | HPTE64_R_N | HPTE64_R_KEY_LO);
> +pte.pte1 &= ~(HPTE64_R_PP0 | HPTE64_R_PP | HPTE64_R_N |
> +  HPTE64_R_KEY_HI | HPTE64_R_KEY_LO);
> +pte.pte1 |= (flags <<

Re: [Qemu-devel] [RFC PATCH 08/17] target/ppc/POWER9: Add external partition table pointer to cpu state

2017-01-31 Thread David Gibson

On Fri, Jan 13, 2017 at 05:28:14PM +1100, Suraj Jitindar Singh wrote:
> Similarly to how we have an external hpt pointer in the cpu state, add
> an external partition table pointer and update it to point to the
> partition table entry in the machine state struct on cpu reset.
> 
> Signed-off-by: Suraj Jitindar Singh 

As with the previous patch, I don't quite follow what's going on
here.  It seems to me that if the external HPT is set, that should
make the softmmu logic bypass *both* an HPT set by SDR1 (<= POWER8) or
an HPT set by the partition table (POWER9).  So I'm not sure why we
need the dummy partition table entry.

To look at it another way, the external HPT is special because it lies
outside the guest's address space, but we need its state because the
guest can manipulate it via hypercall.  For the partition table entry,
even if we're minimally modelling the HV parts of the POWER9 MMU,
isn't the partition table entry just fixed at startup?

> ---
>  hw/ppc/spapr_cpu_core.c | 12 ++--
>  target/ppc/cpu.h|  3 +++
>  target/ppc/mmu.h|  6 ++
>  target/ppc/mmu_helper.c | 12 
>  4 files changed, 31 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
> index 8cc7058..72a7f90 100644
> --- a/hw/ppc/spapr_cpu_core.c
> +++ b/hw/ppc/spapr_cpu_core.c
> @@ -17,6 +17,7 @@
>  #include "hw/ppc/ppc.h"
>  #include "target/ppc/mmu-hash64.h"
>  #include "sysemu/numa.h"
> +#include "mmu.h"
>  
>  static void spapr_cpu_reset(void *opaque)
>  {
> @@ -34,8 +35,15 @@ static void spapr_cpu_reset(void *opaque)
>  
>  env->spr[SPR_HIOR] = 0;
>  
> -ppc_hash64_set_external_hpt(cpu, spapr->htab, spapr->htab_shift,
> -_fatal);
> +switch (env->mmu_model) {
> +case POWERPC_MMU_3_00:
> +ppc64_set_external_patb(cpu, spapr->patb, _fatal);
> +default:
> +/* We assume legacy until told otherwise, thus set HPT irrespective 
> */
> +ppc_hash64_set_external_hpt(cpu, spapr->htab, spapr->htab_shift,
> +_fatal);
> +break;
> +}
>  }
>  
>  static void spapr_cpu_destroy(PowerPCCPU *cpu)
> diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
> index 0ab49b3..e8b7c06 100644
> --- a/target/ppc/cpu.h
> +++ b/target/ppc/cpu.h
> @@ -77,6 +77,7 @@
>  #include "exec/cpu-defs.h"
>  #include "cpu-qom.h"
>  #include "fpu/softfloat.h"
> +#include "mmu.h"
>  
>  #if defined (TARGET_PPC64)
>  #define PPC_ELF_MACHINE EM_PPC64
> @@ -1009,6 +1010,8 @@ struct CPUPPCState {
>  target_ulong sr[32];
>  /* externally stored hash table */
>  uint8_t *external_htab;
> +/* externally stored partition table entry */
> +struct patb_entry *external_patbe;
>  /* BATs */
>  uint32_t nb_BATs;
>  target_ulong DBAT[2][8];
> diff --git a/target/ppc/mmu.h b/target/ppc/mmu.h
> index 67b9707..c7967c3 100644
> --- a/target/ppc/mmu.h
> +++ b/target/ppc/mmu.h
> @@ -8,6 +8,12 @@ struct patb_entry {
>  uint64_t patbe0, patbe1;
>  };
>  
> +#ifdef TARGET_PPC64
> +
> +void ppc64_set_external_patb(PowerPCCPU *cpu, void *patb, Error **errp);
> +
> +#endif /* TARGET_PPC64 */
> +
>  #endif /* CONFIG_USER_ONLY */
>  
>  #endif /* MMU_H */
> diff --git a/target/ppc/mmu_helper.c b/target/ppc/mmu_helper.c
> index 2ab4562..bc6c117 100644
> --- a/target/ppc/mmu_helper.c
> +++ b/target/ppc/mmu_helper.c
> @@ -28,6 +28,7 @@
>  #include "exec/cpu_ldst.h"
>  #include "exec/log.h"
>  #include "helper_regs.h"
> +#include "mmu.h"
>  
>  //#define DEBUG_MMU
>  //#define DEBUG_BATS
> @@ -2907,3 +2908,14 @@ void tlb_fill(CPUState *cs, target_ulong addr, 
> MMUAccessType access_type,
> retaddr);
>  }
>  }
> +
> +/**/
> +
> +/* ISA v3.00 (POWER9) Generic MMU Helpers */
> +
> +void ppc64_set_external_patb(PowerPCCPU *cpu, void *patb, Error **errp)
> +{
> +CPUPPCState *env = >env;
> +
> +env->external_patbe = (struct patb_entry *) patb;
> +}

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

Re: [Qemu-devel] [RFC PATCH 07/17] target/ppc/POWER9: Add partition table pointer to sPAPRMachineState

2017-01-31 Thread David Gibson

On Fri, Jan 13, 2017 at 05:28:13PM +1100, Suraj Jitindar Singh wrote:
> POWER9 uses a partition table to store information relating to how
> address translation is performed on a per partition basis.
> 
> Add a data area for this to the sPAPRMachineState struct and (re)allocate
> it on machine reset.
> 
> Signed-off-by: Suraj Jitindar Singh 

Hm.  I'm having trouble understanding this one.

IIUC, for the "pseries" machine type this partition table entry is
essentially a dummy one, since there's actually only one partition.
For KVM the machine would be represented by one of many partition
table entries on the real host - so the entry here isn't really
relevant.  For TCG, I'm not sure what would be looking at it.  We
haven't implemented the partition->host level translation in TCG
anyway, and in any case pseries (rather than powernv) should be
bypassing that - in which case I'd expect it to bypass looking at the
dummy partition table entry as well.

> ---
>  hw/ppc/spapr.c | 38 --
>  include/hw/ppc/spapr.h |  1 +
>  target/ppc/mmu.h   | 13 +
>  3 files changed, 46 insertions(+), 6 deletions(-)
>  create mode 100644 target/ppc/mmu.h
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 208ef7b..45bd2de 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -41,6 +41,7 @@
>  #include "migration/migration.h"
>  #include "mmu-hash64.h"
>  #include "qom/cpu.h"
> +#include "mmu.h"
>  
>  #include "hw/boards.h"
>  #include "hw/ppc/ppc.h"
> @@ -1115,6 +1116,26 @@ static void spapr_reallocate_hpt(sPAPRMachineState 
> *spapr, int shift,
>  }
>  }
>  
> +static void spapr_reallocate_patb(sPAPRMachineState *spapr, Error **errp)
> +{
> +g_free(spapr->patb);
> +spapr->patb = NULL;
> +
> +if (!kvm_enabled()) {
> +/* We need to allocate a partition table entry */
> +size_t size = sizeof(struct patb_entry);
> +
> +spapr->patb = qemu_memalign(size, size);
> +if (!spapr->patb) {
> +error_setg_errno(errp, errno, "Could not allocate memory for "
> +  "partition table entry");
> +return;
> +}
> +
> +memset(spapr->patb, 0, size);
> +}
> +}
> +
>  static void find_unknown_sysbus_device(SysBusDevice *sbdev, void *opaque)
>  {
>  bool matched = false;
> @@ -1134,7 +1155,7 @@ static void ppc_spapr_reset(void)
>  {
>  MachineState *machine = MACHINE(qdev_get_machine());
>  sPAPRMachineState *spapr = SPAPR_MACHINE(machine);
> -PowerPCCPU *first_ppc_cpu;
> +PowerPCCPU *first_ppc_cpu = POWERPC_CPU(first_cpu);
>  uint32_t rtas_limit;
>  hwaddr rtas_addr, fdt_addr;
>  void *fdt;
> @@ -1143,10 +1164,16 @@ static void ppc_spapr_reset(void)
>  /* Check for unknown sysbus devices */
>  foreach_dynamic_sysbus_device(find_unknown_sysbus_device, NULL);
>  
> -/* Allocate and/or reset the hash page table */
> -spapr_reallocate_hpt(spapr,
> - spapr_hpt_shift_for_ramsize(machine->maxram_size),
> - _fatal);
> +switch (first_ppc_cpu->env.mmu_model) {
> +case POWERPC_MMU_3_00:
> +/* Allocate the partition table */
> +spapr_reallocate_patb(spapr, _fatal);
> +default:
> +/* Allocate and/or reset the hash page table */
> +spapr_reallocate_hpt(spapr,
> + 
> spapr_hpt_shift_for_ramsize(machine->maxram_size),
> + _fatal);
> +}
>  
>  /* Update the RMA size if necessary */
>  if (spapr->vrma_adjust) {
> @@ -1193,7 +1220,6 @@ static void ppc_spapr_reset(void)
>  g_free(fdt);
>  
>  /* Set up the entry state */
> -first_ppc_cpu = POWERPC_CPU(first_cpu);
>  first_ppc_cpu->env.gpr[3] = fdt_addr;
>  first_ppc_cpu->env.gpr[5] = 0;
>  first_cpu->halted = 0;
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index bd5bcf7..b654773 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -63,6 +63,7 @@ struct sPAPRMachineState {
>  
>  void *htab;
>  uint32_t htab_shift;
> +void *patb;
>  hwaddr rma_size;
>  int vrma_adjust;
>  ssize_t rtas_size;
> diff --git a/target/ppc/mmu.h b/target/ppc/mmu.h
> new file mode 100644
> index 000..67b9707
> --- /dev/null
> +++ b/target/ppc/mmu.h
> @@ -0,0 +1,13 @@
> +#ifndef MMU_H
> +#define MMU_H
> +
> +#ifndef CONFIG_USER_ONLY
> +
> +/* Partition Table Entry */
> +struct patb_entry {
> +uint64_t patbe0, patbe1;
> +};
> +
> +#endif /* CONFIG_USER_ONLY */
> +
> +#endif /* MMU_H */

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

Re: [Qemu-devel] [RFC PATCH 09/17] target/ppc/POWER9: Remove SDR1 register

2017-01-31 Thread David Gibson

On Fri, Jan 13, 2017 at 05:28:15PM +1100, Suraj Jitindar Singh wrote:
> The SDR1 registers was used to store the location of the hash page table.
> 
> This register no longer exists on POWER9 processors, so don't create it.
> 
> We now store the hash page table location in the process table entry.
> 
> We now check if the SDR1 register exists before printing its value when
> displaying register debug info.
> 
> Signed-off-by: Suraj Jitindar Singh 
> ---
>  target/ppc/kvm.c| 10 +-
>  target/ppc/mmu-hash64.c | 15 ++-
>  target/ppc/translate.c  |  6 --
>  target/ppc/translate_init.c | 16 +---
>  4 files changed, 40 insertions(+), 7 deletions(-)
> 
> diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
> index 9c4834c..6016930 100644
> --- a/target/ppc/kvm.c
> +++ b/target/ppc/kvm.c
> @@ -930,7 +930,15 @@ int kvmppc_put_books_sregs(PowerPCCPU *cpu)
>  
>  sregs.pvr = env->spr[SPR_PVR];
>  
> -sregs.u.s.sdr1 = env->spr[SPR_SDR1];
> +switch (env->mmu_model) {
> +case POWERPC_MMU_3_00:
> +if (env->external_patbe)
> +sregs.u.s.sdr1 = env->external_patbe->patbe0;

I take it from this that KVM is expecting the address of the guest HPT
in the SDR1 slot of sregs, even on POWER9?  Rather than using a new
ONE_REG entry for the POWER9 data.

Note that this slot was already a hack for PR KVM - we are putting a
userspace address in there, which is clearly not a real SDR1 value.
For HV KVM, I think this value is ignored entirely, since in that case
KVM allocates / controls the guest HPT, not qemu.

> +break;
> +default:
> +sregs.u.s.sdr1 = env->spr[SPR_SDR1];
> +break;
> +}

So.. maybe we should take this opportunity to clean this path up and
special case all "external HPT" cases to pass the relevant value to PR
KVM without abusing the env->spr[SPR_SDR1] field.

>  
>  /* Sync SLB */
>  #ifdef TARGET_PPC64
> diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
> index fe7da18..b9d4f4e 100644
> --- a/target/ppc/mmu-hash64.c
> +++ b/target/ppc/mmu-hash64.c
> @@ -291,7 +291,20 @@ void ppc_hash64_set_sdr1(PowerPCCPU *cpu, target_ulong 
> value,
>  CPUPPCState *env = >env;
>  target_ulong htabsize = value & SDR_64_HTABSIZE;
>  
> -env->spr[SPR_SDR1] = value;
> +switch (env->mmu_model) {
> +case POWERPC_MMU_3_00:
> +/*
> + * Technically P9 doesn't have a SDR1, the hash table address should 
> be
> + * stored in the partition table entry instead.
> + */
> +if (env->external_patbe)
> +env->external_patbe->patbe0 = value;

This definitely doesn't belong in a function called ..._set_sdr1().
Either rename this function or (probably better), set the partition
table entry from a new function and make the decision in the caller.


> +break;
> +default:
> +env->spr[SPR_SDR1] = value;
> +break;
> +}
> +
>  if (htabsize > 28) {
>  error_setg(errp,
> "Invalid HTABSIZE 0x" TARGET_FMT_lx" stored in SDR1",
> diff --git a/target/ppc/translate.c b/target/ppc/translate.c
> index 1212180..521aed3 100644
> --- a/target/ppc/translate.c
> +++ b/target/ppc/translate.c
> @@ -6922,9 +6922,11 @@ void ppc_cpu_dump_state(CPUState *cs, FILE *f, 
> fprintf_function cpu_fprintf,
>  case POWERPC_MMU_2_06a:
>  case POWERPC_MMU_2_07:
>  case POWERPC_MMU_2_07a:
> +case POWERPC_MMU_3_00:
>  #endif
> -cpu_fprintf(f, " SDR1 " TARGET_FMT_lx "   DAR " TARGET_FMT_lx
> -   "  DSISR " TARGET_FMT_lx "\n", env->spr[SPR_SDR1],
> +if (env->spr_cb[SPR_SDR1].name)
> +cpu_fprintf(f, " SDR1 " TARGET_FMT_lx " ", env->spr[SPR_SDR1]);
> +cpu_fprintf(f, "  DAR " TARGET_FMT_lx "  DSISR " TARGET_FMT_lx "\n",
>  env->spr[SPR_DAR], env->spr[SPR_DSISR]);
>  break;
>  case POWERPC_MMU_BOOKE206:
> diff --git a/target/ppc/translate_init.c b/target/ppc/translate_init.c
> index a1994d3..c771ba3 100644
> --- a/target/ppc/translate_init.c
> +++ b/target/ppc/translate_init.c
> @@ -32,6 +32,7 @@
>  #include "qapi/visitor.h"
>  #include "hw/qdev-properties.h"
>  #include "hw/ppc/ppc.h"
> +#include "mmu.h"
>  
>  //#define PPC_DUMP_CPU
>  //#define PPC_DEBUG_SPR
> @@ -722,8 +723,8 @@ static void gen_spr_generic (CPUPPCState *env)
>   0x);
>  }
>  
> -/* SPR common to all non-embedded PowerPC, including 601 */
> -static void gen_spr_ne_601 (CPUPPCState *env)
> +/* SPR common to all non-embedded PowerPC, including POWER9 */
> +static void gen_spr_ne_power9 (CPUPPCState *env)
>  {
>  /* Exception processing */
>  spr_register_kvm(env, SPR_DSISR, "DSISR",
> @@ -739,6 +740,12 @@ static void gen_spr_ne_601 (CPUPPCState *env)
>   SPR_NOACCESS, SPR_NOACCESS,
>   _read_decr, _write_decr,
>   0x);
> +}
> +
> +/* SPR common to all

Re: [Qemu-devel] [PATCH v4 2/2] block/qapi: reduce the execution time of qmp_query_blockstats

2017-01-31 Thread Dou Liyang


Hi, Max

At 02/01/2017 11:02 AM, Max Reitz wrote:

On 15.01.2017 09:01, Dou Liyang wrote:

In order to reduce the execution time, this patch optimize
the qmp_query_blockstats():
Remove the next_query_bds function.
Remove the bdrv_query_stats function.
Remove some judgement sentence.

The original qmp_query_blockstats calls next_query_bds to get
the next objects in each loops. In the next_query_bds, it checks
the query_nodes and blk. It also call bdrv_query_stats to get
the stats, In the bdrv_query_stats, it checks blk and bs each
times. This waste more times, which may stall the main loop a
bit. And if the disk is too many and donot use the dataplane
feature, this may affect the performance in main loop thread.

This patch removes that two functions, and makes the structure
clearly.

Signed-off-by: Dou Liyang 
---
 block/qapi.c | 73 
 1 file changed, 29 insertions(+), 44 deletions(-)

diff --git a/block/qapi.c b/block/qapi.c
index bc622cd..45e9d47 100644
--- a/block/qapi.c
+++ b/block/qapi.c
@@ -456,23 +456,6 @@ static BlockStats *bdrv_query_bds_stats(const 
BlockDriverState *bs,
 return s;
 }

-static BlockStats *bdrv_query_stats(BlockBackend *blk,
-const BlockDriverState *bs,
-bool query_backing)
-{
-BlockStats *s;
-
-s = bdrv_query_bds_stats(bs, query_backing);
-
-if (blk) {
-s->has_device = true;
-s->device = g_strdup(blk_name(blk));
-bdrv_query_blk_stats(s->stats, blk);
-}
-
-return s;
-}
-
 BlockInfoList *qmp_query_block(Error **errp)
 {
 BlockInfoList *head = NULL, **p_next = 
@@ -496,42 +479,44 @@ BlockInfoList *qmp_query_block(Error **errp)
 return head;
 }

-static bool next_query_bds(BlockBackend **blk, BlockDriverState **bs,
-   bool query_nodes)
-{
-if (query_nodes) {
-*bs = bdrv_next_node(*bs);
-return !!*bs;
-}
-
-*blk = blk_next(*blk);
-*bs = *blk ? blk_bs(*blk) : NULL;
-
-return !!*blk;
-}
-
 BlockStatsList *qmp_query_blockstats(bool has_query_nodes,
  bool query_nodes,
  Error **errp)
 {
 BlockStatsList *head = NULL, **p_next = 
-BlockBackend *blk = NULL;
-BlockDriverState *bs = NULL;
+BlockBackend *blk;
+BlockDriverState *bs;

 /* Just to be safe if query_nodes is not always initialized */
-query_nodes = has_query_nodes && query_nodes;
-
-while (next_query_bds(, , query_nodes)) {
-BlockStatsList *info = g_malloc0(sizeof(*info));
-AioContext *ctx = blk ? blk_get_aio_context(blk)
-  : bdrv_get_aio_context(bs);
+if (has_query_nodes && query_nodes) {
+for (bs = bdrv_next_node(NULL); bs; bs = bdrv_next_node(bs)) {
+BlockStatsList *info = g_malloc0(sizeof(*info));
+AioContext *ctx = bdrv_get_aio_context(bs);

-aio_context_acquire(ctx);
-info->value = bdrv_query_stats(blk, bs, !query_nodes);
-aio_context_release(ctx);
+aio_context_acquire(ctx);
+info->value = bdrv_query_bds_stats(bs, false);
+aio_context_release(ctx);

-*p_next = info;
-p_next = >next;
+*p_next = info;
+p_next = >next;
+}
+} else {
+for (blk = blk_next(NULL); blk; blk = blk_next(blk)) {
+BlockStatsList *info = g_malloc0(sizeof(*info));
+AioContext *ctx = blk_get_aio_context(blk);
+BlockStats *s;
+
+aio_context_acquire(ctx);
+info->value = s = bdrv_query_bds_stats(blk_bs(blk), true);


Superfluous assignment to info->value here...


Yes, It is.




+s->has_device = true;
+s->device = g_strdup(blk_name(blk));
+bdrv_query_blk_stats(s->stats, blk);
+aio_context_release(ctx);
+
+info->value = s;


...since it's written here anyway. I'll remove the first assignment when
applying the patch to my tree.


Thanks very much!



Max


+*p_next = info;
+p_next = >next;
+}
 }

 return head;






Thanks,
Liyang

Re: [Qemu-devel] [PATCH v4 0/2] block/qapi: refactor and optimize the qmp_query_blockstats()

2017-01-31 Thread Max Reitz

On 15.01.2017 09:01, Dou Liyang wrote:
> Change log v3 -> v4:
>  1. Develop these into the non-RFC patches.
>  2. Fix some comments.
>  3. do declarations first.
> 
> Change log v2 -> v3:
>  1. Remove the unnecessary code for the bdrv_next_node().
>  2. Remove the change of the locking rules.
> Even if this change can improve the performance, but it may
> effect the consistency.
> 
> The Qemu uses the qmp_query_blockstats() function to get the
> blockstats. As the function has been changed several times, it
> becomes more complex and longer.
> 
> For the multi-disks guest, if we want to execute I/O operations
> and this function at the same time. we can use the dataplane
> feature to hold I/O performance does not drop. But, Normally
> without this feature, How to reduce the decline in performance?
> 
> These patches refactor the qmp_query_blockstats() to make the
> code easier to follow, and shorter as follows:
> 
> From:
> 
> +--+  +-+
>  | 1|  | 4.  |
>  |next_query_bds|  |bdrv_query_bds_stats +---+
>  |  |  | |   |
>  +^-+  +-^---+   |
>   |  |   |
> +-+--+  ++---+   |
> | 0. |  | 2. |   |
> |qmp_query_blockstats+-->bdrv_query_stats<
> ||  ||
> ++  ++---+
>  |
>+-v---+
>| 3.  |
>|bdrv_query_blk_stats |
>| |
>+-+
> 
> To:
> 
> +--+
> |  |
>+v---+  |
>+--->  3.|  |
> +---+  |   |bdrv_query_bds_stats+--+
> | 1.+--+   ||
> |   +  ++
> |qmp_query_blockstats--+
> |   |  |
> +---+  |   ++
>|   | 2. |
>+--->|
>|bdrv_query_blk_stats|
>||
>++
> 
> 
> They also optimize the fuction by reducing its running time.
> 
> 1. The function running time
> 
> the time it takes(ns) in each requests:
> the disk numbers | 10| 500
> -
> before these patches | 19429 | 667722 
> after these patches  | 18536 | 627945
> 
> 2. For the performance
> used the dd command likes this to test: 
> dd if=date_1.dat of=date_2.dat conv=fsync oflag=direct bs=1k count=100k.
> 
> the I/O performance is degraded(%) during the monitor:
> the disk numbers | 10| 500
> -
> before these patches | 1.3   | 14.2
> after these patches  | 1.0   | 11.3
> 
> Dou Liyang (2):
>   block/qapi: reduce the coupling between the bdrv_query_stats and
> bdrv_query_bds_stats
>   block/qapi: reduce the execution time of qmp_query_blockstats
> 
>  block/qapi.c | 95 
> ++--
>  1 file changed, 41 insertions(+), 54 deletions(-)

Thanks, dropped one of the duplicate info->value assignments in patch 2
and applied the series to my block tree:

https://github.com/XanClic/qemu/commits/block

(Assuming Markus' R-b meant that he does not intends to take it through
his tree.)

Max



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v4 2/2] block/qapi: reduce the execution time of qmp_query_blockstats

2017-01-31 Thread Max Reitz

On 15.01.2017 09:01, Dou Liyang wrote:
> In order to reduce the execution time, this patch optimize
> the qmp_query_blockstats():
> Remove the next_query_bds function.
> Remove the bdrv_query_stats function.
> Remove some judgement sentence.
> 
> The original qmp_query_blockstats calls next_query_bds to get
> the next objects in each loops. In the next_query_bds, it checks
> the query_nodes and blk. It also call bdrv_query_stats to get
> the stats, In the bdrv_query_stats, it checks blk and bs each
> times. This waste more times, which may stall the main loop a
> bit. And if the disk is too many and donot use the dataplane
> feature, this may affect the performance in main loop thread.
> 
> This patch removes that two functions, and makes the structure
> clearly.
> 
> Signed-off-by: Dou Liyang 
> ---
>  block/qapi.c | 73 
> 
>  1 file changed, 29 insertions(+), 44 deletions(-)
> 
> diff --git a/block/qapi.c b/block/qapi.c
> index bc622cd..45e9d47 100644
> --- a/block/qapi.c
> +++ b/block/qapi.c
> @@ -456,23 +456,6 @@ static BlockStats *bdrv_query_bds_stats(const 
> BlockDriverState *bs,
>  return s;
>  }
>  
> -static BlockStats *bdrv_query_stats(BlockBackend *blk,
> -const BlockDriverState *bs,
> -bool query_backing)
> -{
> -BlockStats *s;
> -
> -s = bdrv_query_bds_stats(bs, query_backing);
> -
> -if (blk) {
> -s->has_device = true;
> -s->device = g_strdup(blk_name(blk));
> -bdrv_query_blk_stats(s->stats, blk);
> -}
> -
> -return s;
> -}
> -
>  BlockInfoList *qmp_query_block(Error **errp)
>  {
>  BlockInfoList *head = NULL, **p_next = 
> @@ -496,42 +479,44 @@ BlockInfoList *qmp_query_block(Error **errp)
>  return head;
>  }
>  
> -static bool next_query_bds(BlockBackend **blk, BlockDriverState **bs,
> -   bool query_nodes)
> -{
> -if (query_nodes) {
> -*bs = bdrv_next_node(*bs);
> -return !!*bs;
> -}
> -
> -*blk = blk_next(*blk);
> -*bs = *blk ? blk_bs(*blk) : NULL;
> -
> -return !!*blk;
> -}
> -
>  BlockStatsList *qmp_query_blockstats(bool has_query_nodes,
>   bool query_nodes,
>   Error **errp)
>  {
>  BlockStatsList *head = NULL, **p_next = 
> -BlockBackend *blk = NULL;
> -BlockDriverState *bs = NULL;
> +BlockBackend *blk;
> +BlockDriverState *bs;
>  
>  /* Just to be safe if query_nodes is not always initialized */
> -query_nodes = has_query_nodes && query_nodes;
> -
> -while (next_query_bds(, , query_nodes)) {
> -BlockStatsList *info = g_malloc0(sizeof(*info));
> -AioContext *ctx = blk ? blk_get_aio_context(blk)
> -  : bdrv_get_aio_context(bs);
> +if (has_query_nodes && query_nodes) {
> +for (bs = bdrv_next_node(NULL); bs; bs = bdrv_next_node(bs)) {
> +BlockStatsList *info = g_malloc0(sizeof(*info));
> +AioContext *ctx = bdrv_get_aio_context(bs);
>  
> -aio_context_acquire(ctx);
> -info->value = bdrv_query_stats(blk, bs, !query_nodes);
> -aio_context_release(ctx);
> +aio_context_acquire(ctx);
> +info->value = bdrv_query_bds_stats(bs, false);
> +aio_context_release(ctx);
>  
> -*p_next = info;
> -p_next = >next;
> +*p_next = info;
> +p_next = >next;
> +}
> +} else {
> +for (blk = blk_next(NULL); blk; blk = blk_next(blk)) {
> +BlockStatsList *info = g_malloc0(sizeof(*info));
> +AioContext *ctx = blk_get_aio_context(blk);
> +BlockStats *s;
> +
> +aio_context_acquire(ctx);
> +info->value = s = bdrv_query_bds_stats(blk_bs(blk), true);

Superfluous assignment to info->value here...

> +s->has_device = true;
> +s->device = g_strdup(blk_name(blk));
> +bdrv_query_blk_stats(s->stats, blk);
> +aio_context_release(ctx);
> +
> +info->value = s;

...since it's written here anyway. I'll remove the first assignment when
applying the patch to my tree.

Max

> +*p_next = info;
> +p_next = >next;
> +}
>  }
>  
>  return head;
> 




signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v2 0/3] block: check full backing filename when searching protocol filenames

2017-01-31 Thread Max Reitz

On 26.01.2017 02:08, Jeff Cody wrote:
> Differences from v1:
> 
> Patch 1: set local_error = NULL after freeing (Thanks Eric)
> Patch 2: new patch, groundwork for qemu-iotest in patch 3
> Patch 3: qemu-iotest, as requested (Thanks Max)
> 
> 
> Jeff Cody (3):
>   block: check full backing filename when searching protocol filenames
>   qemu-iotests: Don't create fifos / pidfiles with protocol paths
>   qemu-iotest: test to lookup protocol-based image with relative backing
> 
>  block.c  | 13 ++
>  tests/qemu-iotests/173   | 97 
> 
>  tests/qemu-iotests/173.out   | 12 +
>  tests/qemu-iotests/common.config |  6 ++-
>  tests/qemu-iotests/common.qemu   | 10 ++---
>  tests/qemu-iotests/common.rc |  6 +--
>  tests/qemu-iotests/group |  1 +
>  7 files changed, 135 insertions(+), 10 deletions(-)
>  create mode 100755 tests/qemu-iotests/173
>  create mode 100644 tests/qemu-iotests/173.out

Thanks, applied to my block tree:

https://github.com/XanClic/qemu/commits/block

Max



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PULL v3 03/21] ppc: switch to constants within BUILD_BUG_ON

2017-01-31 Thread David Gibson

On Wed, Feb 01, 2017 at 03:37:54AM +0200, Michael S. Tsirkin wrote:
> On Wed, Feb 01, 2017 at 09:16:28AM +1100, David Gibson wrote:
> > On Tue, Jan 31, 2017 at 03:14:29PM +0200, Michael S. Tsirkin wrote:
> > > We are switching BUILD_BUG_ON to verify that it's parameter is a
> > > compile-time constant, and it turns out that some gcc versions
> > > (specifically gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609) are
> > > not smart enough to figure it out for expressions involving local
> > > variables. This is harmless but means that the check is ineffective for
> > > these platforms.  To fix, replace the variable with macros.
> > > 
> > > Reported-by: Peter Maydell 
> > > Signed-off-by: Michael S. Tsirkin 
> > 
> > Ugh.. I pulled this into my tree already (I've been having trouble
> > with my test setup, which is why I haven't sent my own pullreq yet).
> > However, it needs a s/%u/%llu/ to compile correctly on some platforms.
> 
> I need it to avoid breaking build. I fixed it up - want to drop it
> from you tree? Or leave it there - git will resolve the trivial
> conflict.

Since you've already sent a pull, I'm guessing it will be merged by
the time I'm able to get my pullreq together.  Since I'll rebase
before that, I expect it will drop out of my tree naturally.

> 
> > > ---
> > >  hw/ppc/spapr.c | 14 --
> > >  1 file changed, 8 insertions(+), 6 deletions(-)
> > > 
> > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > index a642e66..b81f2ac 100644
> > > --- a/hw/ppc/spapr.c
> > > +++ b/hw/ppc/spapr.c
> > > @@ -2630,8 +2630,8 @@ static void spapr_phb_placement(sPAPRMachineState 
> > > *spapr, uint32_t index,
> > >   * 1TiB 64-bit MMIO windows for each PHB.
> > >   */
> > >  const uint64_t base_buid = 0x8002000ULL;
> > > -const int max_phbs =
> > > -(SPAPR_PCI_LIMIT - SPAPR_PCI_BASE) / SPAPR_PCI_MEM64_WIN_SIZE - 
> > > 1;
> > > +#define SPAPR_MAX_PHBS ((SPAPR_PCI_LIMIT - SPAPR_PCI_BASE) / \
> > > +SPAPR_PCI_MEM64_WIN_SIZE - 1)
> > >  int i;
> > >  
> > >  /* Sanity check natural alignments */
> > > @@ -2640,12 +2640,14 @@ static void spapr_phb_placement(sPAPRMachineState 
> > > *spapr, uint32_t index,
> > >  QEMU_BUILD_BUG_ON((SPAPR_PCI_MEM64_WIN_SIZE % 
> > > SPAPR_PCI_MEM32_WIN_SIZE) != 0);
> > >  QEMU_BUILD_BUG_ON((SPAPR_PCI_MEM32_WIN_SIZE % SPAPR_PCI_IO_WIN_SIZE) 
> > > != 0);
> > >  /* Sanity check bounds */
> > > -QEMU_BUILD_BUG_ON((max_phbs * SPAPR_PCI_IO_WIN_SIZE) > 
> > > SPAPR_PCI_MEM32_WIN_SIZE);
> > > -QEMU_BUILD_BUG_ON((max_phbs * SPAPR_PCI_MEM32_WIN_SIZE) > 
> > > SPAPR_PCI_MEM64_WIN_SIZE);
> > > +QEMU_BUILD_BUG_ON((SPAPR_MAX_PHBS * SPAPR_PCI_IO_WIN_SIZE) >
> > > +  SPAPR_PCI_MEM32_WIN_SIZE);
> > > +QEMU_BUILD_BUG_ON((SPAPR_MAX_PHBS * SPAPR_PCI_MEM32_WIN_SIZE) >
> > > +  SPAPR_PCI_MEM64_WIN_SIZE);
> > >  
> > > -if (index >= max_phbs) {
> > > +if (index >= SPAPR_MAX_PHBS) {
> > >  error_setg(errp, "\"index\" for PAPR PHB is too large (max %u)",
> > > -   max_phbs - 1);
> > > +   SPAPR_MAX_PHBS - 1);
> > >  return;
> > >  }
> > >  
> > 
> 
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

Re: [Qemu-devel] [RFC PATCH 00/17] target/ppc: Implement POWER9 pseries tcg legacy kernel support

2017-01-31 Thread David Gibson

On Wed, Feb 01, 2017 at 01:16:00PM +1100, David Gibson wrote:
> On Fri, Jan 13, 2017 at 05:28:06PM +1100, Suraj Jitindar Singh wrote:
> > This patch set provides the initial implementation of support for the 
> > POWER9 processor running in tcg mode under the pseries machine type.
> > 
> > To use a POWER9 cpu provide the command line option "-cpu POWER9".
> > 
> > This is the initial work to make the mmu emulation model look like a POWER9
> > 
> > Currently only legacy kernels are supported (hash without segment tables).
> > 
> > This patch set will be followed by two more in the near future to support
> > the new process table capabilities added in ISAv3.00; hash via segment
> > tables and radix. Kernels with support for this will currently fail when
> > they try to register a process table as this isn't yet implemented.
> > 
> > The assumption is that we're running a legacy kernel, in the event the
> > guest registers a process table (when support exists) we will handle it
> > accordingly.
> > 
> > The main changes are:
> >  - Define a new mmu model and fault handler
> >  - Add new LPCR fields and check them accordingly
> >  - Add a partition table entry to the machine state
> >  - Point to the partition table entry in the cpu state
> >  - Remove SDR1
> >  - Adapt to new pte format
> >  - NOOP the cp_abort instruction
> >  - Small bug fixes
> > 
> > This was intially one huge patch so I've tried to break it up into what I
> > think are logical chunks, how exactly this should be split up is up for 
> > debate.
> > 
> > A current upstream kernel with POWER9 support added to the architecture
> > vector should correctly report a POWER9 cpu under /proc/cpuinfo.
> 
> I've merged 14-16/17 since they seem to stand on their own as fixes.

And 17/17 as well.

> 
> > 
> > Suraj Jitindar Singh (17):
> >   powerpc/cpu-models: rename ISAv3.00 logical PVR definition
> >   hw/ppc/spapr: Add POWER9 to pseries cpu models
> >   target/ppc: Add pcr_supported to POWER9 cpu class definition
> >   target/ppc/POWER9: Add ISAv3.00 MMU definition
> >   target/ppc/POWER9: Adapt LPCR handling for POWER9
> >   target/ppc/POWER9: Direct all instr and data storage interrupts to the
> > hypv
> >   target/ppc/POWER9: Add partition table pointer to sPAPRMachineState
> >   target/ppc/POWER9: Add external partition table pointer to cpu state
> >   target/ppc/POWER9: Remove SDR1 register
> >   target/ppc/POWER9: Add POWER9 mmu fault handler
> >   target/ppc/POWER9: Update to new pte format for POWER9 accesses
> >   target/ppc/POWER9: Add POWER9 pa-features definition
> >   target/ppc/POWER9: Add cpu_has_work function for POWER9
> >   target/ppc/debug: Print LPCR register value if register exists
> >   tcg/POWER9: NOOP the cp_abort instruction
> >   target/ppc/mmu_hash64: Fix printing unsigned as signed int
> >   target/ppc/mmu_hash64: Fix incorrect shift value in amr calculation
> > 
> >  hw/ppc/spapr.c  | 56 ---
> >  hw/ppc/spapr_cpu_core.c | 15 +++-
> >  hw/ppc/spapr_hcall.c| 51 +
> >  include/hw/ppc/spapr.h  |  1 +
> >  target/ppc/cpu-models.h |  2 +-
> >  target/ppc/cpu-qom.h|  5 ++-
> >  target/ppc/cpu.h| 24 +++-
> >  target/ppc/kvm.c| 10 -
> >  target/ppc/mmu-hash64.c | 68 +++--
> >  target/ppc/mmu-hash64.h | 73 +--
> >  target/ppc/mmu.h| 27 +
> >  target/ppc/mmu_helper.c | 61 ++
> >  target/ppc/translate.c  | 14 ++-
> >  target/ppc/translate_init.c | 92 
> > +++--
> >  14 files changed, 418 insertions(+), 81 deletions(-)
> >  create mode 100644 target/ppc/mmu.h
> > 
> 



-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

Re: [Qemu-devel] [RFC PATCH 00/17] target/ppc: Implement POWER9 pseries tcg legacy kernel support

2017-01-31 Thread David Gibson

On Fri, Jan 13, 2017 at 05:28:06PM +1100, Suraj Jitindar Singh wrote:
> This patch set provides the initial implementation of support for the 
> POWER9 processor running in tcg mode under the pseries machine type.
> 
> To use a POWER9 cpu provide the command line option "-cpu POWER9".
> 
> This is the initial work to make the mmu emulation model look like a POWER9
> 
> Currently only legacy kernels are supported (hash without segment tables).
> 
> This patch set will be followed by two more in the near future to support
> the new process table capabilities added in ISAv3.00; hash via segment
> tables and radix. Kernels with support for this will currently fail when
> they try to register a process table as this isn't yet implemented.
> 
> The assumption is that we're running a legacy kernel, in the event the
> guest registers a process table (when support exists) we will handle it
> accordingly.
> 
> The main changes are:
>  - Define a new mmu model and fault handler
>  - Add new LPCR fields and check them accordingly
>  - Add a partition table entry to the machine state
>  - Point to the partition table entry in the cpu state
>  - Remove SDR1
>  - Adapt to new pte format
>  - NOOP the cp_abort instruction
>  - Small bug fixes
> 
> This was intially one huge patch so I've tried to break it up into what I
> think are logical chunks, how exactly this should be split up is up for 
> debate.
> 
> A current upstream kernel with POWER9 support added to the architecture
> vector should correctly report a POWER9 cpu under /proc/cpuinfo.

I've merged 14-16/17 since they seem to stand on their own as fixes.

> 
> Suraj Jitindar Singh (17):
>   powerpc/cpu-models: rename ISAv3.00 logical PVR definition
>   hw/ppc/spapr: Add POWER9 to pseries cpu models
>   target/ppc: Add pcr_supported to POWER9 cpu class definition
>   target/ppc/POWER9: Add ISAv3.00 MMU definition
>   target/ppc/POWER9: Adapt LPCR handling for POWER9
>   target/ppc/POWER9: Direct all instr and data storage interrupts to the
> hypv
>   target/ppc/POWER9: Add partition table pointer to sPAPRMachineState
>   target/ppc/POWER9: Add external partition table pointer to cpu state
>   target/ppc/POWER9: Remove SDR1 register
>   target/ppc/POWER9: Add POWER9 mmu fault handler
>   target/ppc/POWER9: Update to new pte format for POWER9 accesses
>   target/ppc/POWER9: Add POWER9 pa-features definition
>   target/ppc/POWER9: Add cpu_has_work function for POWER9
>   target/ppc/debug: Print LPCR register value if register exists
>   tcg/POWER9: NOOP the cp_abort instruction
>   target/ppc/mmu_hash64: Fix printing unsigned as signed int
>   target/ppc/mmu_hash64: Fix incorrect shift value in amr calculation
> 
>  hw/ppc/spapr.c  | 56 ---
>  hw/ppc/spapr_cpu_core.c | 15 +++-
>  hw/ppc/spapr_hcall.c| 51 +
>  include/hw/ppc/spapr.h  |  1 +
>  target/ppc/cpu-models.h |  2 +-
>  target/ppc/cpu-qom.h|  5 ++-
>  target/ppc/cpu.h| 24 +++-
>  target/ppc/kvm.c| 10 -
>  target/ppc/mmu-hash64.c | 68 +++--
>  target/ppc/mmu-hash64.h | 73 +--
>  target/ppc/mmu.h| 27 +
>  target/ppc/mmu_helper.c | 61 ++
>  target/ppc/translate.c  | 14 ++-
>  target/ppc/translate_init.c | 92 
> +++--
>  14 files changed, 418 insertions(+), 81 deletions(-)
>  create mode 100644 target/ppc/mmu.h
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

[Qemu-devel] [PATCH v3] vl: Move the cpu_synchronize_all_post_init() after generic devices initialization

2017-01-31 Thread Dou Liyang

At the Qemu initialization, we call the cpu_synchronize_all_post_init()
to synchronize All CPU states to KVM in the ./vl.c::main().

Currently, it is called before we initialize the CPUs, which is created
by "-device" command and parsed by generic devices initialization, So,
these CPUs may be ignored to synchronize.

The patch moves the cpu_synchronize_all_post_init func after generic
devices initialization to make sure that all the CPUs can be included.

Signed-off-by: Dou Liyang 
---
Change log:
v2-> v3:
1. Rebase the patch.
2. Rewrite the log
v1-> v2:
1. Split it from
https://lists.nongnu.org/archive/html/qemu-devel/2017-01/msg03354.html
2. Rewrite the log.

 vl.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/vl.c b/vl.c
index 0b72b12..bceb835 100644
--- a/vl.c
+++ b/vl.c
@@ -4490,8 +4490,6 @@ int main(int argc, char **argv, char **envp)
 
 audio_init();
 
-cpu_synchronize_all_post_init();
-
 if (hax_enabled()) {
 hax_sync_vcpus();
 }
@@ -4517,6 +4515,8 @@ int main(int argc, char **argv, char **envp)
 exit(1);
 }
 
+cpu_synchronize_all_post_init();
+
 numa_post_machine_init();
 
 rom_reset_order_override();
-- 
2.5.5

Re: [Qemu-devel] [RFC PATCH 00/17] target/ppc: Implement POWER9 pseries tcg legacy kernel support

2017-01-31 Thread David Gibson

On Thu, Jan 12, 2017 at 10:55:49PM -0800, no-re...@patchew.org wrote:
> Hi,
> 
> Your series seems to have some coding style problems. See output below for
> more information:
> 
> Message-id: 1484288903-18807-1-git-send-email-sjitindarsi...@gmail.com
> Subject: [Qemu-devel] [RFC PATCH 00/17] target/ppc: Implement POWER9 pseries 
> tcg legacy kernel support
> Type: series

The style bot does generate a fair few false positives, but AFAICT all
the things below should be addressed.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH] block: bdrv_invalidate_cache: invalidate children first

2017-01-31 Thread Max Reitz

On 31.01.2017 12:23, Vladimir Sementsov-Ogievskiy wrote:
> Current implementation invalidates firstly parent bds and then its
> children. This leads to the following bug:
> 
> after incoming migration, in bdrv_invalidate_cache_all:
> 1. invalidate parent bds - reopen it with BDRV_O_INACTIVE cleared
> 2. child is not yet invalidated
> 3. parent check that its BDRV_O_INACTIVE is cleared
> 4. parent writes to child
> 5. assert in bdrv_co_pwritev, as BDRV_O_INACTIVE is set for child
> 
> This patch fixes it by just changing invalidate sequence: invalidate
> children first.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
> 
> v2: I've missed that bdrv_invalidate_cache is already recursive, so we
> can change sequence here. Also v1 doesn't cover the case when
> bdrv_invalidate_cache is called not from bdrv_invalidate_cache_all.
> 
>  block.c | 11 +--
>  1 file changed, 5 insertions(+), 6 deletions(-)

Reviewed-by: Max Reitz 

I'll wait a bit until I apply this patch (please remind me next week if
I forget it...), though, because migration-related things are not
exactly my forte, so maybe someone else has something to say.

Max



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v6 1/2] block/vxhs.c: Add support for a new block device type called "vxhs"

2017-01-31 Thread ashish mittal

Hi,

There has been some work going on on the VxHS libvirt patches and we
are making good progress.

A suggestion has been made on the libvirt community to check if we can
change the qemu VxHS device specification syntax as follows.

Replace

(a) +file.server.host=192.168.0.1,file.server.port=

with

(b) +file.host=192.168.0.1,file.port=

The reasoning being that since we have only one host (true as the
failover is now being handled completely/transparently) within the
libqnio library), the "server" part is redundant.

Excerpt from John Ferlan's email -
/==/
#2. Is the desire to ever support more than 1 host? If not, then is the
"server" syntax you've borrowed from the Gluster code necessary? Could
you just go with the single "host" like NBD and SSH. As it relates to
the qemu command line - I'm not quite as clear. From the example I see
in commit id '7b7da9e28', the gluster syntax would have:

+file.server.0.type=tcp,file.server.0.host=example.org,file.server.0.port=6000,\
+file.server.1.type=tcp,file.server.1.host=example.org,file.server.1.port=24007,\
+file.server.2.type=unix,file.server.2.socket=/path/to/sock,format=qcow2,\

whereas, the VxHS syntax is:
 +file.server.host=192.168.0.1,file.server.port=,format=raw,if=none,\

FWIW: I also note there is no ".type=tcp" in your output - so perhaps
the "default" is tcp unless otherwise specified, but I'm sure of the
qemu syntax requirements in this area. I assume that since there's only
1 server, the ".0, .1, .2" become unnecessary (something added by commit
id 'f1bbc7df4' for multiple gluster hosts).

I haven't closedly followed the qemu syntax discussion, but it would it
would be possible to use:

+file.host=192.168.0.1,file.port=

Similar to how NBD (see commit id 'a1674fd9') and SSH (see commit id
'bc225b1b5') are handled.
/==/

If this proposal looks OK to the community, then I will make this user
interface change in the next VxHS qemu patch.

Thanks,
Ashish

On Wed, Nov 16, 2016 at 9:05 AM, ashish mittal  wrote:
> On Wed, Nov 16, 2016 at 3:27 AM, Stefan Hajnoczi  wrote:
>> On Wed, Nov 16, 2016 at 9:49 AM, Fam Zheng  wrote:
>>> On Wed, 11/16 10:04, Markus Armbruster wrote:
 ashish mittal  writes:

 > Thanks for concluding on this.
 >
 > I will rearrange the qnio_api.h header accordingly as follows:
 >
 > +#include "qemu/osdep.h"

 Headers should not include osdep.h.
>>>
>>> This is about including "osdep.h" _and_ "qnio_api.h" in block/vxhs.c, so 
>>> what
>>> Ashish means looks good to me.
>>
>> Yes, I think "will rearrange the qnio_api.h header" was a typo and was
>> supposed to be block/vxhs.c.
>>
>> Stefan
>
> Thanks for the correction. Yes, i meant rearrange headers in block/vxhs.c.

Re: [Qemu-devel] [PATCH v4 2/7] qcow2: Discard/zero clusters by byte count

2017-01-31 Thread Max Reitz

On 30.01.2017 17:52, Eric Blake wrote:
> On 01/28/2017 02:21 PM, Max Reitz wrote:
>> On 20.12.2016 20:15, Eric Blake wrote:
>>> Passing a byte offset, but sector count, when we ultimately
>>> want to operate on cluster granularity, is madness.  Clean up
>>> the interfaces to take both offset and count as bytes, while
>>> still keeping the assertion added previously that the caller
>>> must align the values to a cluster.  Then rename things to
>>> make sure backports don't get confused by changed units:
>>> instead of qcow2_discard_clusters() and qcow2_zero_clusters(),
>>> we now have qcow2_cluster_discard and qcow2_cluster_zeroize().
>>>
>>> Signed-off-by: Eric Blake 
>>>
> 
>>> +int qcow2_cluster_zeroize(BlockDriverState *bs, uint64_t offset,
>>> +  uint64_t count, int flags);
>>
>> I know byte count parameters are often called "count", but I find it a
>> bit problematic if the function name contains a word that can be a unit.
>> In these cases, it's not entirely clear whether "count" refers to bytes
>> or clusters. Maybe "length" or "size" would be better?
> 
> Now that you mention it, 'cluster' and 'count' is indeed confusing.  I'm
> leaning towards 'size'.  Do I need to respin, or is that something that
> the maintainer is comfortable tweaking?

There's probably not too much that could go wrong, but I wouldn't be
exactly comfortable with it.

>> Purely subjective and thus optional, of course.
>>
>> Another thing: I'm not quite positive whether qcow2_cluster_zeroize()
>> can actually handle a byte count greater than INT_MAX. If you could pass
>> such a number to it, the multiplication "ret * s->cluster_size" might
>> overflow. It's only caller qcow2_co_pwrite_zeroes() will never exceed
>> that limit, but maybe this should be made clear somewhere -- either by
>> making the parameter an int instead of a uint64_t or by asserting it in
>> the function.
> 
> I'd lean towards an assertion, especially since it would be nice to
> someday unify all the internal block interfaces to get to a 64-bit
> interface wherever possible

Kevin once said "We should write code that makes sense today, not code
that will make sense some day in the future." Or something like that. :-)

> (limiting ourselves to 32-bit is necessary
> for some drivers, like NBD, but that limitation should be enforced via
> proper BlockLimits, rather than by inherent limits in our API).  An
> assertion with 64-bit types is better than yet one more place to audit
> for missing assertions if we use a 32-bit type now and then widen to
> 64-bit later.

Depends on the viewpoint. You're right, it does prevent us from
accidentally implicitly narrowing a 64 bit size, but on the other hand,
it won't be clear just from looking at the function signature that this
function actually only supports a 32 bit parameter.

Not sure what to do here. Maybe switch to a language that would warn us
before implicitly narrowing values. */me ducks*

As long as the assertion is visible enough (i.e. immediately after the
variable declaration block), it should be fine.

Max

>>> -int qcow2_discard_clusters(BlockDriverState *bs, uint64_t offset,
>>> -int nb_sectors, enum qcow2_discard_type type, bool full_discard)
>>> +int qcow2_cluster_discard(BlockDriverState *bs, uint64_t offset,
>>> +  uint64_t count, enum qcow2_discard_type type,
>>> +  bool full_discard)
>>>  {
>>>  BDRVQcow2State *s = bs->opaque;
>>> -uint64_t end_offset;
>>> +uint64_t end_offset = offset + count;
>>>  uint64_t nb_clusters;
>>>  int ret;
>>>
>>> -end_offset = offset + (nb_sectors << BDRV_SECTOR_BITS);
>>> -
>>>  /* Caller must pass aligned values */
>>>  assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
>>>  assert(QEMU_IS_ALIGNED(end_offset, s->cluster_size));
>>
>> Directly below this we have "offset - end_offset" which could be
>> shortened to "count" (or "length" or "size" or...).
> 
> Okay, there's now enough comments that it's worth me respinning a v5.



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH] qcow2: Optimize the refcount-block overlap check

2017-01-31 Thread Max Reitz

On 30.01.2017 17:14, Alberto Garcia wrote:
> The metadata overlap checks introduced in a40f1c2add help detect
> corruption in the qcow2 image by verifying that data writes don't
> overlap with existing metadata sections.
> 
> The 'refcount-block' check in particular iterates over the refcount
> table in order to get the addresses of all refcount blocks and check
> that none of them overlap with the region where we want to write.
> 
> The problem with the refcount table is that since it always occupies
> complete clusters its size is usually very big.

Actually the main problem is that BDRVQcow2State.refcount_table_size is
updated very generously as opposed to BDRVQcow2State.l1_size. The latter
is usually exactly as large as it needs to be (because the L1 table size
usually doesn't change), whereas the refcount_table_size is just bumped
up every time the image gets bigger until it reaches or exceeds the
value it needs to be.

> With the default
> values of cluster_size=64KB and refcount_bits=16 this table holds 8192
> entries, each one of them enough to map 2GB worth of host clusters.
> 
> So unless we're using images with several TB of allocated data this
> table is going to be mostly empty, and iterating over it is a waste of
> CPU. If the storage backend is fast enough this can have an effect on
> I/O performance.
> 
> This patch keeps the index of the last used (i.e. non-zero) entry in
> the refcount table and updates it every time the table changes. The
> refcount-block overlap check then uses that index instead of reading
> the whole table.
> 
> In my tests with a 4GB qcow2 file stored in RAM this doubles the
> amount of write IOPS.
> 
> Signed-off-by: Alberto Garcia 
> ---
>  block/qcow2-refcount.c | 21 -
>  block/qcow2.c  |  1 +
>  block/qcow2.h  |  1 +
>  3 files changed, 22 insertions(+), 1 deletion(-)
> 
> diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
> index cbfb3fe064..5e4d36587f 100644
> --- a/block/qcow2-refcount.c
> +++ b/block/qcow2-refcount.c
> @@ -83,6 +83,16 @@ static Qcow2SetRefcountFunc *const set_refcount_funcs[] = {
>  /*/
>  /* refcount handling */
>  
> +static void update_max_refcount_table_index(BDRVQcow2State *s)
> +{
> +unsigned i = s->refcount_table_size - 1;
> +while (i > 0 && (s->refcount_table[i] & REFT_OFFSET_MASK) == 0) {
> +i--;
> +}
> +/* Set s->max_refcount_table_index to the index of the last used entry */
> +s->max_refcount_table_index = i;
> +}
> +
>  int qcow2_refcount_init(BlockDriverState *bs)
>  {
>  BDRVQcow2State *s = bs->opaque;
> @@ -111,6 +121,7 @@ int qcow2_refcount_init(BlockDriverState *bs)
>  }
>  for(i = 0; i < s->refcount_table_size; i++)
>  be64_to_cpus(>refcount_table[i]);
> +update_max_refcount_table_index(s);
>  }
>  return 0;
>   fail:
> @@ -439,6 +450,7 @@ static int alloc_refcount_block(BlockDriverState *bs,
>  }
>  
>  s->refcount_table[refcount_table_index] = new_block;
> +s->max_refcount_table_index = refcount_table_index;

Should be MAX(s->max_refcount_table_index, refcount_table_index) or this
won't support refcount tables with holes in them.

Apart from this, the patch looks good to me. (Just a nagging comment below.)

>  
>  /* The new refcount block may be where the caller intended to put its
>   * data, so let it restart the search. */
> @@ -580,6 +592,7 @@ static int alloc_refcount_block(BlockDriverState *bs,
>  s->refcount_table = new_table;
>  s->refcount_table_size = table_size;
>  s->refcount_table_offset = table_offset;
> +update_max_refcount_table_index(s);
>  
>  /* Free old table. */
>  qcow2_free_clusters(bs, old_table_offset, old_table_size * 
> sizeof(uint64_t),
> @@ -2171,6 +2184,7 @@ write_refblocks:
>  s->refcount_table = on_disk_reftable;
>  s->refcount_table_offset = reftable_offset;
>  s->refcount_table_size = reftable_size;
> +update_max_refcount_table_index(s);
>  
>  return 0;
>  
> @@ -2383,7 +2397,11 @@ int qcow2_check_metadata_overlap(BlockDriverState *bs, 
> int ign, int64_t offset,
>  }
>  
>  if ((chk & QCOW2_OL_REFCOUNT_BLOCK) && s->refcount_table) {
> -for (i = 0; i < s->refcount_table_size; i++) {
> +unsigned last_entry = s->max_refcount_table_index;
> +assert(last_entry < s->refcount_table_size);
> +assert(last_entry + 1 == s->refcount_table_size ||
> +   (s->refcount_table[last_entry + 1] & REFT_OFFSET_MASK) == 0);

I'm not sure we need this second assertion, but I don't mind it too much
either.

I'd actually find an assertion that last_entry is less than UNSIGNED_MAX
more important because otherwise the loop below would be an endless one. :-)

(Yes, it's pretty obvious that it's less than UNSIGNED_MAX because it's
less than

Re: [Qemu-devel] [PULL v3 03/21] ppc: switch to constants within BUILD_BUG_ON

2017-01-31 Thread Michael S. Tsirkin

On Wed, Feb 01, 2017 at 09:16:28AM +1100, David Gibson wrote:
> On Tue, Jan 31, 2017 at 03:14:29PM +0200, Michael S. Tsirkin wrote:
> > We are switching BUILD_BUG_ON to verify that it's parameter is a
> > compile-time constant, and it turns out that some gcc versions
> > (specifically gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609) are
> > not smart enough to figure it out for expressions involving local
> > variables. This is harmless but means that the check is ineffective for
> > these platforms.  To fix, replace the variable with macros.
> > 
> > Reported-by: Peter Maydell 
> > Signed-off-by: Michael S. Tsirkin 
> 
> Ugh.. I pulled this into my tree already (I've been having trouble
> with my test setup, which is why I haven't sent my own pullreq yet).
> However, it needs a s/%u/%llu/ to compile correctly on some platforms.

I need it to avoid breaking build. I fixed it up - want to drop it
from you tree? Or leave it there - git will resolve the trivial conflict.

> > ---
> >  hw/ppc/spapr.c | 14 --
> >  1 file changed, 8 insertions(+), 6 deletions(-)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index a642e66..b81f2ac 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -2630,8 +2630,8 @@ static void spapr_phb_placement(sPAPRMachineState 
> > *spapr, uint32_t index,
> >   * 1TiB 64-bit MMIO windows for each PHB.
> >   */
> >  const uint64_t base_buid = 0x8002000ULL;
> > -const int max_phbs =
> > -(SPAPR_PCI_LIMIT - SPAPR_PCI_BASE) / SPAPR_PCI_MEM64_WIN_SIZE - 1;
> > +#define SPAPR_MAX_PHBS ((SPAPR_PCI_LIMIT - SPAPR_PCI_BASE) / \
> > +SPAPR_PCI_MEM64_WIN_SIZE - 1)
> >  int i;
> >  
> >  /* Sanity check natural alignments */
> > @@ -2640,12 +2640,14 @@ static void spapr_phb_placement(sPAPRMachineState 
> > *spapr, uint32_t index,
> >  QEMU_BUILD_BUG_ON((SPAPR_PCI_MEM64_WIN_SIZE % 
> > SPAPR_PCI_MEM32_WIN_SIZE) != 0);
> >  QEMU_BUILD_BUG_ON((SPAPR_PCI_MEM32_WIN_SIZE % SPAPR_PCI_IO_WIN_SIZE) 
> > != 0);
> >  /* Sanity check bounds */
> > -QEMU_BUILD_BUG_ON((max_phbs * SPAPR_PCI_IO_WIN_SIZE) > 
> > SPAPR_PCI_MEM32_WIN_SIZE);
> > -QEMU_BUILD_BUG_ON((max_phbs * SPAPR_PCI_MEM32_WIN_SIZE) > 
> > SPAPR_PCI_MEM64_WIN_SIZE);
> > +QEMU_BUILD_BUG_ON((SPAPR_MAX_PHBS * SPAPR_PCI_IO_WIN_SIZE) >
> > +  SPAPR_PCI_MEM32_WIN_SIZE);
> > +QEMU_BUILD_BUG_ON((SPAPR_MAX_PHBS * SPAPR_PCI_MEM32_WIN_SIZE) >
> > +  SPAPR_PCI_MEM64_WIN_SIZE);
> >  
> > -if (index >= max_phbs) {
> > +if (index >= SPAPR_MAX_PHBS) {
> >  error_setg(errp, "\"index\" for PAPR PHB is too large (max %u)",
> > -   max_phbs - 1);
> > +   SPAPR_MAX_PHBS - 1);
> >  return;
> >  }
> >  
> 
> -- 
> David Gibson  | I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au| minimalist, thank you.  NOT _the_ 
> _other_
>   | _way_ _around_!
> http://www.ozlabs.org/~dgibson

Re: [Qemu-devel] [PATCH v2] vl: Ensure the cpu_synchronize_all_post_init() in the appropriate location

2017-01-31 Thread Dou Liyang


Hi, Alex

Sorry for late reply.

At 01/27/2017 10:03 PM, Alex Bennée wrote:


Dou Liyang  writes:


At the Qemu initialization, we call the cpu_synchronize_all_post_init()
to synchronize All CPU states to KVM in the ./vl.c::main().

Currently, it is called before we initialize the CPUs, which created by
"-device" command, So, these CPUs may be ignored to synchronize.

The patch moves the numa_post_machine_init func in the appropriate
location to make sure that all the CPUs has already been created
when it is called.


This doesn't match what the patch does (the function is
cpu_synchronise_all_post_init) but I see another patch is already merged
that moved the NUMA one.



Oops, Yes, I make the mistake.


I'm afraid you'll need to re-base anyway because it no longer applies
cleanly.



Yes, I will re-base it.

Thanks,

Liyang






Signed-off-by: Dou Liyang 
---

Change log v1-> v2:
1. Split it from
https://lists.nongnu.org/archive/html/qemu-devel/2017-01/msg03354.html
2. Rewrite the log.

 vl.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/vl.c b/vl.c
index c643d3f..38269be 100644
--- a/vl.c
+++ b/vl.c
@@ -4547,8 +4547,6 @@ int main(int argc, char **argv, char **envp)

 audio_init();

-cpu_synchronize_all_post_init();
-
 numa_post_machine_init();

 if (qemu_opts_foreach(qemu_find_opts("fw_cfg"),
@@ -4571,6 +4569,9 @@ int main(int argc, char **argv, char **envp)
   device_init_func, NULL, NULL)) {
 exit(1);
 }
+
+cpu_synchronize_all_post_init();
+
 rom_reset_order_override();

 /* Did we create any drives that we failed to create a device for? */



--
Alex Bennée

Re: [Qemu-devel] [PATCH 1/1] kvm: log available guest crash parameters

2017-01-31 Thread Denis V. Lunev

On 01/31/2017 06:43 PM, Paolo Bonzini wrote:
>
> On 31/01/2017 05:07, Denis V. Lunev wrote:
>> From: Anton Nefedov 
>>
>> Windows reports BSOD parameters through Hyper-V crash MSRs. This
>> information is very useful for initial crash analysis and thus
>> it would be nice to see it in the QEMU log file. There is suitable
>> log mask for the purpose.
>>
>> Linux guest does not provide this information, but still it would
>> be nice to log that we have crashed.
> We tell libvirt that the guest crashed through a QMP event, so I think
> this should be passed as arguments of the QMP event.  You could add:
>
> - a QAPI type GuestPanicInformation
>
> {'union': 'GuestPanicInformation',
>  'data': { 'hyper-v': 'GuestPanicInformationHyperV' } }
>
> {'struct': 'GuestPanicInformationHyperV',
>  'data': { 'arg1': 'uint64',
>'arg2': 'uint64',
>'arg3': 'uint64',
>'arg4': 'uint64' } }
>
> - a QOM property on the CPU object, crash-information, whose getter
> returns an error unless cpu->crash_occurred = true
>
> - an optional GuestPanicInformation argument to the GUEST_PANICKED
> event, which is passed if the QOM property returns a non-error value
I like the idea of the event but also want to have the log message.
LOG_GUEST_ERROR level fits well to the idea at my opinion.

Thank you for the idea :)

Den

Re: [Qemu-devel] [PATCH] qcow2: Optimize the refcount-block overlap check

2017-01-31 Thread Max Reitz

On 30.01.2017 17:34, Alberto Garcia wrote:
> On Mon 30 Jan 2017 05:14:41 PM CET, Alberto Garcia wrote:
> 
>> This patch keeps the index of the last used (i.e. non-zero) entry in
>> the refcount table and updates it every time the table changes. The
>> refcount-block overlap check then uses that index instead of reading
>> the whole table.
> 
> Note that while I decided to go for this approach the patch can be made
> much simpler by simply stopping at the first empty entry in the refcount
> table:
> 
> if ((chk & QCOW2_OL_REFCOUNT_BLOCK) && s->refcount_table) {
> for (i = 0; i < s->refcount_table_size; i++) {
> if (!(s->refcount_table[i] & REFT_OFFSET_MASK)) {
> break;
> }
> if (overlaps_with(s->refcount_table[i] & REFT_OFFSET_MASK,
> s->cluster_size)) {
> return QCOW2_OL_REFCOUNT_BLOCK;
> }
> }
> }
> 
> I don't think QEMU produces files where refcount_table[i] == 0 but
> refcount_table[i + 1] != 0. Do they even make sense? In any case, my
> patch would cover those cases too, but this simplified version wouldn't.

I think it would be reasonable enough to assume that such images don't
exist (and if they do, we don't do overlap checks above the hole -- that
isn't exactly the end of the world). But your patch looks simple enough
as it is.

Max



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH V2 2/2] block/nfs: fix naming of runtime opts

2017-01-31 Thread Max Reitz

On 31.01.2017 16:56, Peter Lieven wrote:
> commit 94d6a7a accidently left the naming of runtime opts and QAPI
> scheme inconsistent. As one consequence passing of parameters in the
> URI is broken. Sync the naming of the runtime opts to the QAPI
> scheme.
> 
> Please note that this is technically backwards incompatible with the 2.8
> release, but the 2.8 release is the only version that had the wrong naming.
> Furthermore release 2.8 suffered from a NULL pointer deference during
> URI parsing.

I always love things like this. "This technically breaks X, but
obviously nobody used X, because X always crashed." Reminds me of how we
success^W accidentally removed qcow2 encryption.

Technically however this also changes the runtime parameter names, I
think. Giving them probably did work in 2.8, so it is not compatible
there. I don't mind very much, though, and you're the maintainer, so
it's fine with me.

Anyway, I think this patch should also update nfs_refresh_filename().

Max

signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PULL v3 03/21] ppc: switch to constants within BUILD_BUG_ON

2017-01-31 Thread David Gibson

On Tue, Jan 31, 2017 at 03:14:29PM +0200, Michael S. Tsirkin wrote:
> We are switching BUILD_BUG_ON to verify that it's parameter is a
> compile-time constant, and it turns out that some gcc versions
> (specifically gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609) are
> not smart enough to figure it out for expressions involving local
> variables. This is harmless but means that the check is ineffective for
> these platforms.  To fix, replace the variable with macros.
> 
> Reported-by: Peter Maydell 
> Signed-off-by: Michael S. Tsirkin 

Ugh.. I pulled this into my tree already (I've been having trouble
with my test setup, which is why I haven't sent my own pullreq yet).
However, it needs a s/%u/%llu/ to compile correctly on some platforms.

> ---
>  hw/ppc/spapr.c | 14 --
>  1 file changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index a642e66..b81f2ac 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -2630,8 +2630,8 @@ static void spapr_phb_placement(sPAPRMachineState 
> *spapr, uint32_t index,
>   * 1TiB 64-bit MMIO windows for each PHB.
>   */
>  const uint64_t base_buid = 0x8002000ULL;
> -const int max_phbs =
> -(SPAPR_PCI_LIMIT - SPAPR_PCI_BASE) / SPAPR_PCI_MEM64_WIN_SIZE - 1;
> +#define SPAPR_MAX_PHBS ((SPAPR_PCI_LIMIT - SPAPR_PCI_BASE) / \
> +SPAPR_PCI_MEM64_WIN_SIZE - 1)
>  int i;
>  
>  /* Sanity check natural alignments */
> @@ -2640,12 +2640,14 @@ static void spapr_phb_placement(sPAPRMachineState 
> *spapr, uint32_t index,
>  QEMU_BUILD_BUG_ON((SPAPR_PCI_MEM64_WIN_SIZE % SPAPR_PCI_MEM32_WIN_SIZE) 
> != 0);
>  QEMU_BUILD_BUG_ON((SPAPR_PCI_MEM32_WIN_SIZE % SPAPR_PCI_IO_WIN_SIZE) != 
> 0);
>  /* Sanity check bounds */
> -QEMU_BUILD_BUG_ON((max_phbs * SPAPR_PCI_IO_WIN_SIZE) > 
> SPAPR_PCI_MEM32_WIN_SIZE);
> -QEMU_BUILD_BUG_ON((max_phbs * SPAPR_PCI_MEM32_WIN_SIZE) > 
> SPAPR_PCI_MEM64_WIN_SIZE);
> +QEMU_BUILD_BUG_ON((SPAPR_MAX_PHBS * SPAPR_PCI_IO_WIN_SIZE) >
> +  SPAPR_PCI_MEM32_WIN_SIZE);
> +QEMU_BUILD_BUG_ON((SPAPR_MAX_PHBS * SPAPR_PCI_MEM32_WIN_SIZE) >
> +  SPAPR_PCI_MEM64_WIN_SIZE);
>  
> -if (index >= max_phbs) {
> +if (index >= SPAPR_MAX_PHBS) {
>  error_setg(errp, "\"index\" for PAPR PHB is too large (max %u)",
> -   max_phbs - 1);
> +   SPAPR_MAX_PHBS - 1);
>  return;
>  }
>  

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

Re: [Qemu-devel] [RFC PATCH 06/17] target/ppc/POWER9: Direct all instr and data storage interrupts to the hypv

2017-01-31 Thread David Gibson

On Fri, Jan 13, 2017 at 05:28:12PM +1100, Suraj Jitindar Singh wrote:
> The vpm0 bit was removed from the LPCR in POWER9, this bit controlled
> whether ISI and DSI interrupts were directed to the hypervisor or the
> partition. These interrupts now go to the hypervisor irrespective, thus
> it is no longer necessary to check the vmp0 bit in the LPCR.
> 
> Signed-off-by: Suraj Jitindar Singh 

Reviewed-by: David Gibson 

> ---
>  target/ppc/mmu-hash64.c | 20 ++--
>  1 file changed, 18 insertions(+), 2 deletions(-)
> 
> diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
> index 3a2acb8..fe7da18 100644
> --- a/target/ppc/mmu-hash64.c
> +++ b/target/ppc/mmu-hash64.c
> @@ -640,7 +640,15 @@ static void ppc_hash64_set_isi(CPUState *cs, CPUPPCState 
> *env,
>  if (msr_ir) {
>  vpm = !!(env->spr[SPR_LPCR] & LPCR_VPM1);
>  } else {
> -vpm = !!(env->spr[SPR_LPCR] & LPCR_VPM0);
> +switch (env->mmu_model) {
> +case POWERPC_MMU_3_00:
> +/* Field deprecated in ISAv3.00 - interrupts always go to hyperv 
> */
> +vpm = true;
> +break;
> +default:
> +vpm = !!(env->spr[SPR_LPCR] & LPCR_VPM0);
> +break;
> +}
>  }
>  if (vpm && !msr_hv) {
>  cs->exception_index = POWERPC_EXCP_HISI;
> @@ -658,7 +666,15 @@ static void ppc_hash64_set_dsi(CPUState *cs, CPUPPCState 
> *env, uint64_t dar,
>  if (msr_dr) {
>  vpm = !!(env->spr[SPR_LPCR] & LPCR_VPM1);
>  } else {
> -vpm = !!(env->spr[SPR_LPCR] & LPCR_VPM0);
> +switch (env->mmu_model) {
> +case POWERPC_MMU_3_00:
> +/* Field deprecated in ISAv3.00 - interrupts always go to hyperv 
> */
> +vpm = true;
> +break;
> +default:
> +vpm = !!(env->spr[SPR_LPCR] & LPCR_VPM0);
> +break;
> +}
>  }
>  if (vpm && !msr_hv) {
>  cs->exception_index = POWERPC_EXCP_HDSI;

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH 2/2] vl: Print CPU help after we've registered the CPU accelerators

2017-01-31 Thread David Gibson

On Tue, Jan 31, 2017 at 02:11:59PM +0100, Thomas Huth wrote:
> When running with KVM on POWER, we register some CPU types during
> the initialization function of the ppc64 KVM code (which unfortunately
> also can not be done via a type_init() like it is done on x86). So to
> be able to see these updates in the CPU help text, the code that calls
> list_cpus() has to be run after configure_accelerator(). This move should
> be fine since the "cpu_model" variable is also never used before the call
> to configure_accelerator(), and thus there should not be any unwanted
> side effects in the code before configure_accelerator() if the user
> started QEMU with "-cpu ?" or "-cpu help".
> 
> Signed-off-by: Thomas Huth 

Reviewed-by: David Gibson 

This looks fine to me, but it's not within my area to apply.  Paolo,
if you want to ack I'm happy to take it through my tree if that's
convenient for you.

> ---
>  vl.c | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/vl.c b/vl.c
> index 0b72b12..315c5c3 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -4055,11 +4055,6 @@ int main(int argc, char **argv, char **envp)
>  qemu_set_hw_version(machine_class->hw_version);
>  }
>  
> -if (cpu_model && is_help_option(cpu_model)) {
> -list_cpus(stdout, , cpu_model);
> -exit(0);
> -}
> -
>  if (!trace_init_backends()) {
>  exit(1);
>  }
> @@ -4298,6 +4293,11 @@ int main(int argc, char **argv, char **envp)
>  
>  configure_accelerator(current_machine);
>  
> +if (cpu_model && is_help_option(cpu_model)) {
> +list_cpus(stdout, , cpu_model);
> +exit(0);
> +}
> +
>  if (qtest_chrdev) {
>  qtest_init(qtest_chrdev, qtest_log, _fatal);
>  }

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH 1/2] ppc/kvm: Handle the "family" CPU via alias instead of registering new types

2017-01-31 Thread David Gibson

On Tue, Jan 31, 2017 at 02:11:58PM +0100, Thomas Huth wrote:
> When running with KVM on POWER, we are registering a "family" CPU
> type for the host CPU that we are running on. For example, on all
> POWER8-compatible hosts, we register a "POWER8" CPU type, so that
> you can always start QEMU with "-cpu POWER8" there, without the
> need to know whether you are running on a POWER8, POWER8E or POWER8NVL
> host machine.
> However, we also have a "POWER8" CPU alias in the ppc_cpu_aliases list
> (that is mainly useful for TCG). This leads to two cosmetical drawbacks:
> If the user runs QEMU with "-cpu ?", we always claim that POWER8 is an
> "alias for POWER8_v2.0" - which is simply not true when running with
> KVM on POWER. And when using the 'query-cpu-definitions' QMP call,
> there are currently two entries for "POWER8", one for the alias, and
> one for the additional registered type.
> To solve the two problems, we should rather update the "family" alias
> instead of registering a new types. We then only have one "POWER8"
> CPU definition around, an alias, which also points to the right
> destination.
> 
> Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1396536
> Signed-off-by: Thomas Huth 

Reviewed-by: David Gibson 

Updating the otherwise static table of aliases is kind of ugly, but
then so is registering an extra full type as we do now.

Is this safe to apply without the follow up patch to vl.c.

> ---
>  target/ppc/kvm.c | 36 +++-
>  1 file changed, 23 insertions(+), 13 deletions(-)
> 
> diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
> index ec92c64..f58c260 100644
> --- a/target/ppc/kvm.c
> +++ b/target/ppc/kvm.c
> @@ -24,6 +24,7 @@
>  #include "qemu-common.h"
>  #include "qemu/error-report.h"
>  #include "cpu.h"
> +#include "cpu-models.h"
>  #include "qemu/timer.h"
>  #include "sysemu/sysemu.h"
>  #include "sysemu/hw_accel.h"
> @@ -2412,6 +2413,7 @@ static int kvm_ppc_register_host_cpu_type(void)
>  };
>  PowerPCCPUClass *pvr_pcc;
>  DeviceClass *dc;
> +int i;
>  
>  pvr_pcc = kvm_ppc_get_host_cpu_class();
>  if (pvr_pcc == NULL) {
> @@ -2420,13 +2422,6 @@ static int kvm_ppc_register_host_cpu_type(void)
>  type_info.parent = object_class_get_name(OBJECT_CLASS(pvr_pcc));
>  type_register(_info);
>  
> -/* Register generic family CPU class for a family */
> -pvr_pcc = ppc_cpu_get_family_class(pvr_pcc);
> -dc = DEVICE_CLASS(pvr_pcc);
> -type_info.parent = object_class_get_name(OBJECT_CLASS(pvr_pcc));
> -type_info.name = g_strdup_printf("%s-"TYPE_POWERPC_CPU, dc->desc);
> -type_register(_info);
> -
>  #if defined(TARGET_PPC64)
>  type_info.name = g_strdup_printf("%s-"TYPE_SPAPR_CPU_CORE, "host");
>  type_info.parent = TYPE_SPAPR_CPU_CORE,
> @@ -2436,14 +2431,29 @@ static int kvm_ppc_register_host_cpu_type(void)
>  type_info.class_data = (void *) "host";
>  type_register(_info);
>  g_free((void *)type_info.name);
> -
> -/* Register generic spapr CPU family class for current host CPU type */
> -type_info.name = g_strdup_printf("%s-"TYPE_SPAPR_CPU_CORE, dc->desc);
> -type_info.class_data = (void *) dc->desc;
> -type_register(_info);
> -g_free((void *)type_info.name);
>  #endif
>  
> +/*
> + * Update generic CPU family class alias (e.g. on a POWER8NVL host,
> + * we want "POWER8" to be a "family" alias that points to the current
> + * host CPU type, too)
> + */
> +dc = DEVICE_CLASS(ppc_cpu_get_family_class(pvr_pcc));
> +for (i = 0; ppc_cpu_aliases[i].alias != NULL; i++) {
> +if (strcmp(ppc_cpu_aliases[i].alias, dc->desc) == 0) {
> +ObjectClass *oc = OBJECT_CLASS(pvr_pcc);
> +char *suffix;
> +
> +ppc_cpu_aliases[i].model = g_strdup(object_class_get_name(oc));
> +suffix = strstr(ppc_cpu_aliases[i].model, "-"TYPE_POWERPC_CPU);
> +if (suffix) {
> +*suffix = 0;
> +}
> +ppc_cpu_aliases[i].oc = oc;
> +break;
> +}
> +}
> +
>  return 0;
>  }
>  

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH v2.1] target-ppc: Add MMU model check for booke machines

2017-01-31 Thread David Gibson

On Thu, Jan 26, 2017 at 10:15:59AM +, Valentin Plotkin wrote:
> From: Valentin Plotkin 
> 
> Machines bamboo, e500 and virtex-ml507 assume a certain MMU model,
> otherwise resulting in unpredictable behavior. Add apropriate checks
> into *_init functions.
> 
> Signed-off-by: Valentin Plotkin 

Applied to ppc-for-2.9.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH v2 0/2] POWER9 TCG enablements - part12

2017-01-31 Thread David Gibson

On Tue, Jan 24, 2017 at 12:17:22PM +0530, Nikunj A Dadhania wrote:
> Nikunj A Dadhania  writes:
> 
> > This series contains 5 new instructions for POWER9 ISA3.0
> > VSX Scalar Test Data Class
> > VSX Vector Test Data Class
> >
> > Changelog:
> > v1:
> > * Zero the match variable in the element loops
> >
> > v0:
> > * Concise logic for identifying data class in Scalar/Vector 
> >   test data class instructions
> >
> > Nikunj A Dadhania (2):
> >   target-ppc: Add xvtstdc[sp,dp] instructions
> >   target-ppc: Add xststdc[sp, dp, qp] instructions
> 
> Ping?

Sorry, I've been away.  I've applied them now.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

[Qemu-devel] [PATCH v5 3/3] qemu-io: Add failure regression tests

2017-01-31 Thread Nir Soffer

Add regression tests checking that qemu-io fails with non-zero exit code
when reading non-existing file or using the wrong image format.

Signed-off-by: Nir Soffer 
---
 tests/qemu-iotests/174 | 59 ++
 tests/qemu-iotests/174.out |  7 ++
 tests/qemu-iotests/group   |  1 +
 3 files changed, 67 insertions(+)
 create mode 100755 tests/qemu-iotests/174
 create mode 100644 tests/qemu-iotests/174.out

diff --git a/tests/qemu-iotests/174 b/tests/qemu-iotests/174
new file mode 100755
index 000..c1c20a1
--- /dev/null
+++ b/tests/qemu-iotests/174
@@ -0,0 +1,59 @@
+#!/bin/bash
+#
+# Test that qemu-io fail with non-zero exit code
+#
+# Copyright (C) 2017 Nir Soffer 
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+#
+
+# creator
+owner=nir...@gmail.com
+
+seq=`basename $0`
+echo "QA output created by $seq"
+
+here=`pwd`
+status=1   # failure is the default!
+
+_cleanup()
+{
+   _cleanup_test_img
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+
+_unsupported_fmt raw
+
+
+size=256K
+IMGFMT=raw IMGOPTS= _make_test_img $size | _filter_imgfmt
+
+echo
+echo "== reading wrong format should fail =="
+$QEMU_IO -f $IMGFMT -c "read 0 $size" "$TEST_IMG" 2>/dev/null
+test $? -eq 1 || _fail "did not fail"
+
+echo
+echo "== reading missing file should fail =="
+$QEMU_IO -c "read 0 $size" "$TEST_DIR/missing" 2>/dev/null
+test $? -eq 1 || _fail "did not fail"
+
+# success, all done
+echo "*** done"
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/174.out b/tests/qemu-iotests/174.out
new file mode 100644
index 000..a06d237
--- /dev/null
+++ b/tests/qemu-iotests/174.out
@@ -0,0 +1,7 @@
+QA output created by 174
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=262144
+
+== reading wrong format should fail ==
+
+== reading missing file should fail ==
+*** done
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index 866c1a0..1732a8b 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -165,3 +165,4 @@
 170 rw auto quick
 171 rw auto quick
 172 auto
+174 auto
-- 
2.9.3

[Qemu-devel] [PATCH v5 1/3] qemu-io: Return non-zero exit code on failure

2017-01-31 Thread Nir Soffer

The result of openfile was not checked, leading to failure deep in the
actual command with confusing error message, and exiting with exit code 0.

Here is a simple example - trying to read with the wrong format:

$ touch file
$ qemu-io -f qcow2 -c 'read -P 1 0 1024' file; echo $?
can't open device file: Image is not in qcow2 format
no file open, try 'help open'
0

With this patch, we fail earlier with exit code 1:

$ ./qemu-io -f qcow2 -c 'read -P 1 0 1024' file; echo $?
can't open device file: Image is not in qcow2 format
1

Failing earlier, we don't log this error now:

no file open, try 'help open'

But some tests expected it; the line was removed from the test output.

Signed-off-by: Nir Soffer 
Reviewed-by: Eric Blake 
---
 qemu-io.c  |  8 ++--
 tests/qemu-iotests/059.out |  3 ---
 tests/qemu-iotests/070.out |  1 -
 tests/qemu-iotests/075.out |  7 ---
 tests/qemu-iotests/076.out |  3 ---
 tests/qemu-iotests/078.out |  6 --
 tests/qemu-iotests/080.out | 18 --
 tests/qemu-iotests/083.out | 17 -
 tests/qemu-iotests/088.out |  6 --
 tests/qemu-iotests/092.out | 12 
 tests/qemu-iotests/116.out |  7 ---
 tests/qemu-iotests/131.out |  1 -
 tests/qemu-iotests/140.out |  1 -
 13 files changed, 6 insertions(+), 84 deletions(-)

diff --git a/qemu-io.c b/qemu-io.c
index 23a229f..427cbae 100644
--- a/qemu-io.c
+++ b/qemu-io.c
@@ -595,13 +595,17 @@ int main(int argc, char **argv)
 exit(1);
 }
 opts = qemu_opts_to_qdict(qopts, NULL);
-openfile(NULL, flags, writethrough, opts);
+if (openfile(NULL, flags, writethrough, opts)) {
+exit(1);
+}
 } else {
 if (format) {
 opts = qdict_new();
 qdict_put(opts, "driver", qstring_from_str(format));
 }
-openfile(argv[optind], flags, writethrough, opts);
+if (openfile(argv[optind], flags, writethrough, opts)) {
+exit(1);
+}
 }
 }
 command_loop();
diff --git a/tests/qemu-iotests/059.out b/tests/qemu-iotests/059.out
index 678adb4..898b528 100644
--- a/tests/qemu-iotests/059.out
+++ b/tests/qemu-iotests/059.out
@@ -3,17 +3,14 @@ QA output created by 059
 === Testing invalid granularity ===
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
 can't open device TEST_DIR/t.vmdk: Invalid granularity, image may be corrupt
-no file open, try 'help open'
 
 === Testing too big L2 table size ===
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
 can't open device TEST_DIR/t.vmdk: L2 table size too big
-no file open, try 'help open'
 
 === Testing too big L1 table size ===
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
 can't open device TEST_DIR/t.vmdk: L1 size too big
-no file open, try 'help open'
 
 === Testing monolithicFlat creation and opening ===
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=2147483648 
subformat=monolithicFlat
diff --git a/tests/qemu-iotests/070.out b/tests/qemu-iotests/070.out
index 131a5b1..c269d99 100644
--- a/tests/qemu-iotests/070.out
+++ b/tests/qemu-iotests/070.out
@@ -4,7 +4,6 @@ QA output created by 070
 can't open device TEST_DIR/iotest-dirtylog-10G-4M.vhdx: VHDX image file 
'TEST_DIR/iotest-dirtylog-10G-4M.vhdx' opened read-only, but contains a log 
that needs to be replayed
 To replay the log, run:
 qemu-img check -r all 'TEST_DIR/iotest-dirtylog-10G-4M.vhdx'
- no file open, try 'help open'
 === Verify open image replays log  ===
 read 18874368/18874368 bytes at offset 0
 18 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
diff --git a/tests/qemu-iotests/075.out b/tests/qemu-iotests/075.out
index 87beae4..b234b75 100644
--- a/tests/qemu-iotests/075.out
+++ b/tests/qemu-iotests/075.out
@@ -10,29 +10,22 @@ read 512/512 bytes at offset 1048064
 
 == block_size must be a multiple of 512 ==
 can't open device TEST_DIR/simple-pattern.cloop: block_size 513 must be a 
multiple of 512
-no file open, try 'help open'
 
 == block_size cannot be zero ==
 can't open device TEST_DIR/simple-pattern.cloop: block_size cannot be zero
-no file open, try 'help open'
 
 == huge block_size ===
 can't open device TEST_DIR/simple-pattern.cloop: block_size 4294966784 must be 
64 MB or less
-no file open, try 'help open'
 
 == offsets_size overflow ===
 can't open device TEST_DIR/simple-pattern.cloop: n_blocks 4294967295 must be 
536870911 or less
-no file open, try 'help open'
 
 == refuse images that require too many offsets ===
 can't open device TEST_DIR/simple-pattern.cloop: image requires too many 
offsets, try increasing block size
-no file open, try 'help open'
 
 == refuse images with non-monotonically increasing offsets ==
 can't open device TEST_DIR/simple-pattern.cloop: offsets not monotonically 
increasing at index 1, image file is corrupt
-no file open, try 'help

[Qemu-devel] [PATCH v5 2/3] qemu-iotests: Add _unsupported_fmt helper

2017-01-31 Thread Nir Soffer

This helper allows adding tests supporting any format expect the
specified formats. This may be useful to test that many formats behave
in a common way.

Signed-off-by: Nir Soffer 
---
 tests/qemu-iotests/common.rc | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc
index 3213765..c6d5d81 100644
--- a/tests/qemu-iotests/common.rc
+++ b/tests/qemu-iotests/common.rc
@@ -355,6 +355,17 @@ _supported_fmt()
 _notrun "not suitable for this image format: $IMGFMT"
 }
 
+# tests whether $IMGFMT is one of the unsupported image format for a test
+#
+_unsupported_fmt()
+{
+for f; do
+if [ "$f" = "$IMGFMT" ]; then
+_notrun "not suitable for this image format: $IMGFMT"
+fi
+done
+}
+
 # tests whether $IMGPROTO is one of the supported image protocols for a test
 #
 _supported_proto()
-- 
2.9.3

[Qemu-devel] [PATCH v5 0/3] Fix qemu-io return value on failure

2017-01-31 Thread Nir Soffer

This series fix qemu-io to fail with non zero exit code when failing to open
the file.

Changes since v4:
- Added _unsupported_fmt helper
- Test any format except raw, instead of only raw
- Don't test stderr content, depends on the format
- New test move to 174 since 173 is pending
- Private copyright for new test
- Fix commit message issues

Changes since v3:
- Add cover letter
- Squash the tests fix with the behavior change, so git bisect landing on the
  change in behavior does not hit unnecessarily-broken tests.

Tested by running qemu-io manually and by running tests/check-block.sh.
Note that test 059 has one unrelated test failure.

Nir Soffer (3):
  qemu-io: Return non-zero exit code on failure
  qemu-iotests: Add _unsupported_fmt helper
  qemu-io: Add failure regression tests

 qemu-io.c|  8 --
 tests/qemu-iotests/059.out   |  3 ---
 tests/qemu-iotests/070.out   |  1 -
 tests/qemu-iotests/075.out   |  7 --
 tests/qemu-iotests/076.out   |  3 ---
 tests/qemu-iotests/078.out   |  6 -
 tests/qemu-iotests/080.out   | 18 --
 tests/qemu-iotests/083.out   | 17 -
 tests/qemu-iotests/088.out   |  6 -
 tests/qemu-iotests/092.out   | 12 -
 tests/qemu-iotests/116.out   |  7 --
 tests/qemu-iotests/131.out   |  1 -
 tests/qemu-iotests/140.out   |  1 -
 tests/qemu-iotests/174   | 59 
 tests/qemu-iotests/174.out   |  7 ++
 tests/qemu-iotests/common.rc | 11 +
 tests/qemu-iotests/group |  1 +
 17 files changed, 84 insertions(+), 84 deletions(-)
 create mode 100755 tests/qemu-iotests/174
 create mode 100644 tests/qemu-iotests/174.out

-- 
2.9.3

Re: [Qemu-devel] [Qemu-block] [PATCH v3 2/3] block: Fix target variable of BLKSECTGET ioctl

2017-01-31 Thread Max Reitz

On 20.01.2017 17:25, Eric Farman wrote:
> Commit 6f6071745bd0 ("raw-posix: Fetch max sectors for host block device")
> introduced a routine to call the kernel BLKSECTGET ioctl, which stores the
> result back to user space.  However, the size of the data returned depends
> on the routine handling the ioctl.  The (compat_)blkdev_ioctl returns a
> short, while sg_ioctl returns an int.  Thus, on big-endian systems, we can
> find ourselves accidentally shifting the result to a much larger value.
> (On s390x, a short is 16 bits while an int is 32 bits.)
> 
> Also, the two ioctl handlers return values in different scales (block
> returns sectors, while sg returns bytes), so some tweaking of the outputs
> is required such that hdev_get_max_transfer_length returns a value in a
> consistent set of units.
> 
> Signed-off-by: Eric Farman 
> ---
>  block/file-posix.c| 17 ++---
>  include/block/block.h |  1 +
>  2 files changed, 11 insertions(+), 7 deletions(-)
> 
> diff --git a/block/file-posix.c b/block/file-posix.c
> index 28b47d9..9f83725 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -651,12 +651,15 @@ static void raw_reopen_abort(BDRVReopenState *state)
>  state->opaque = NULL;
>  }
>  
> -static int hdev_get_max_transfer_length(int fd)
> +static int hdev_get_max_transfer_length(BlockDriverState *bs, int fd)
>  {
>  #ifdef BLKSECTGET
> -int max_sectors = 0;
> -if (ioctl(fd, BLKSECTGET, _sectors) == 0) {
> -return max_sectors;
> +int max_bytes = 0;
> +short max_sectors = 0;
> +if (bs->sg && ioctl(fd, BLKSECTGET, _bytes) == 0) {
> +return max_bytes;
> +} else if (!bs->sg && ioctl(fd, BLKSECTGET, _sectors) == 0) {
> +return max_sectors << BDRV_SECTOR_BITS;
>  } else {
>  return -errno;
>  }
> @@ -672,9 +675,9 @@ static void raw_refresh_limits(BlockDriverState *bs, 
> Error **errp)
>  
>  if (!fstat(s->fd, )) {
>  if (S_ISBLK(st.st_mode)) {
> -int ret = hdev_get_max_transfer_length(s->fd);
> -if (ret > 0 && ret <= BDRV_REQUEST_MAX_SECTORS) {
> -bs->bl.max_transfer = pow2floor(ret << BDRV_SECTOR_BITS);
> +int ret = hdev_get_max_transfer_length(bs, s->fd);
> +if (ret > 0 && ret <= BDRV_REQUEST_MAX_BYTES) {
> +bs->bl.max_transfer = pow2floor(ret);
>  }
>  }
>  }
> diff --git a/include/block/block.h b/include/block/block.h
> index 8b0dcda..4e81f20 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h
> @@ -116,6 +116,7 @@ typedef struct HDGeometry {
>  
>  #define BDRV_REQUEST_MAX_SECTORS MIN(SIZE_MAX >> BDRV_SECTOR_BITS, \
>   INT_MAX >> BDRV_SECTOR_BITS)
> +#define BDRV_REQUEST_MAX_BYTES (BDRV_REQUEST_MAX_SECTORS << BDRV_SECTOR_BITS)

I'd rather like it the other way round (i.e.

#define BDRV_REQUEST_MAX_BYTES MIN(SIZE_MAX, INT_MAX)
#define BDRV_REQUEST_MAX_SECTORS (BDRV_REQUEST_MAX_BYTES >>
BDRV_SECTOR_BITS)

), but I don't suppose I can interest you in a respin, can I? Would it
be OK with you if I changed you commit when I apply it?

Max

>  
>  /*
>   * Allocation status flags
> 




signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [Qemu-block] [PATCH v3 0/3] scsi-generic and BLKSECTGET

2017-01-31 Thread Max Reitz

On 31.01.2017 12:37, John Snow wrote:
> 
> 
> On 01/24/2017 07:09 AM, Fam Zheng wrote:
>> On Tue, 01/24 12:23, Paolo Bonzini wrote:
>>>
>>>
>>> On 22/01/2017 15:29, Fam Zheng wrote:
 On Fri, 01/20 17:25, Eric Farman wrote:
> Changes:
>  v2->v3:
>   - Move byte/sector conversions to patch 2 [Fam Zheng]
>   - Rename "max_sectors" when holding a byte value [Fam Zheng]
>  v1->v2:
>   - Patch 3, make hdev_get_max_transfer_length return bytes [Fam Zheng]

 Reviewed-by: Fam Zheng 
>>>
>>> Looks good, who's merging this? :)
>>
>> All I know is not me :) Adding Kevin and Max who maintain block/file-posix.c.
>>
>> Fam
>>
> 
> Kevin's traveling for DevConf.cz, so how about Max? :)

Did Kevin tell you to write that? If so, tell him he's got to merge all
of my patches once he's back. :-P

Ooooh, I see, he's trying to get me to merge so many patches I'll have
to send my own pull request. Cleverrr. Verrry clever. ;-)

Max



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v8 00/25] Remaining MTTCG Base patches and ARM enablement

2017-01-31 Thread Richard Henderson


On 01/27/2017 02:38 AM, Alex Bennée wrote:

Hi,

All of the changes in this revision are addressing comments from v7
posted last week. A new pre-cursor patch was added:

  cputlb and arm/sparc targets: convert mmuidx flushes from varg to
bitmap

To change the cputlb API to use a bitmap instead of vargs. This has
generated quite a bit of churn in the ARM target but it is pretty
mechanical.

I also folded the BQL irq protection patches from v7 into:

  tcg: drop global lock during TCG code execution

This is required to keep the series bisectable although the BQL safety
is only really relevant to guests using MTTCG. I didn't think it was
worth making the asserts conditional on parallel_cpus although it does
mean this patch gets a little bigger.

The other big change was to:

  cputlb: introduce tlb_flush_*_all_cpus[_synced]

Where I replaced the wait flag with an expanded set of API calls. The
*_synced variants which are marked as QEMU_NORETURN to make their
behaviour clear.

The series applies to origin/master as of today and you can find my
tree at:

  https://github.com/stsquad/qemu/tree/mttcg/base-patches-v8

There is the usual collection of r-b tags and minor merge/re-base
fixes all documented in the --- sections of the commit messages.

In terms of merging strategy I would appreciate some thoughts. While I
think the series is ready to go I appreciate it is quite a chunk to
merge in one go. That said an early merge gives us plenty of time to
shake out any lingering issues before feature freeze.


As far as I'm concerned, I'm done with my review and this can go in.

Those remaining nits in target/arm are extremely minor and ought to be between 
you and PMM as ARM maintainer -- whatever he's happy with should go.


Once this is in I'll send a patch to enable mttcg for Alpha, which has also 
tested well.



r~

Re: [Qemu-devel] [PATCH v8 25/25] tcg: enable MTTCG by default for ARM on x86 hosts

2017-01-31 Thread Richard Henderson


On 01/27/2017 02:39 AM, Alex Bennée wrote:

+/* This defines the natural memory order supported by this
+ * architecture before guarantees made by various barrier
+ * instructions.
+ *
+ * The x86 has a pretty strong memory ordering which only really
+ * allows for some stores to be re-ordered after loads.
+ */
+#include "tcg-mo.h"
+
+static inline int get_tcg_target_mo(void)
+{
+return TCG_MO_ALL & ~TCG_MO_LD_ST;
+}
+
+#define TCG_TARGET_DEFAULT_MO get_tcg_target_mo()
+


Why the inline function?  The expression in the define would seem sufficient. 
Otherwise,


Reviewed-by: Richard Henderson 


r~

Re: [Qemu-devel] [PATCH v8 24/25] target-arm: ensure all cross vCPUs TLB flushes complete

2017-01-31 Thread Richard Henderson


On 01/27/2017 02:39 AM, Alex Bennée wrote:

Previously flushes on other vCPUs would only get serviced when they
exited their TranslationBlocks. While this isn't overly problematic it
violates the semantics of TLB flush from the point of view of source
vCPU.

To solve this we call the cputlb *_all_cpus() functions to do the
flushes and ask it to ensure all flushes are completed before we start
the next instruction. As this involves exiting the cpu_loop we need to
ensure the PC is saved before the tlb helper functions are called.

Signed-off-by: Alex Bennée 


Reviewed-by: Richard Henderson 


r~

Re: [Qemu-devel] [PATCH v8 19/25] cputlb: introduce tlb_flush_*_all_cpus[_synced]

2017-01-31 Thread Richard Henderson


On 01/27/2017 02:39 AM, Alex Bennée wrote:

This introduces support to the cputlb API for flushing all CPUs TLBs
with one call. This avoids the need for target helpers to iterate
through the vCPUs themselves.

An additional variant of the API (_synced) do not return from the
caller and will cause the work to be scheduled as "safe work". The
result will be all the flush operations will be complete by the time
the originating vCPU starts executing again. It is up to the caller to
ensure enough state has been saved so execution can be restarted at
the next appropriate instruction.

Some guest architectures can defer completion of flush operations
until later. If they later schedule work using the async_safe_work
mechanism they can be sure other vCPUs will have flushed their TLBs by
the point execution returns from the safe work.

Signed-off-by: Alex Bennée 


Reviewed-by: Richard Henderson 


r~

Re: [Qemu-devel] [PATCH 24/24] qcow2-bitmap: cache bitmap list in BDRVQcow2State

2017-01-31 Thread Max Reitz

On 23.01.2017 13:10, Vladimir Sementsov-Ogievskiy wrote:
> Do not reload bitmap list every time it is needed.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block/qcow2-bitmap.c | 125 
> ++-
>  block/qcow2.c|   2 +
>  block/qcow2.h|   4 ++
>  3 files changed, 89 insertions(+), 42 deletions(-)

Hm, difficult. On a glance, this patch seems alright, and it probably
makes sense. Maybe it would make the code actually a bit easier to
understand if it was squashed into the other patches. But OTOH it would
mean that I would have to start review from scratch for those patches...

So in my opinion this patch makes sense and you should keep it updated,
but I think we should keep it separate from the rest of the series. Get
the first 23 patches in first and then we can merge this on top later on.

Would that be OK?

Max

signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH] Drop QEMU_GNUC_PREREQ() checks for gcc older than 4.1

2017-01-31 Thread Paolo Bonzini



On 31/01/2017 11:58, Peter Maydell wrote:
> On 31 January 2017 at 16:55, Daniel P. Berrange  wrote:
>> On Tue, Jan 31, 2017 at 04:14:47PM +, Peter Maydell wrote:
>>> We already require gcc 4.1 or newer (for the atomic
>>> support), so the fallback codepaths for older gcc
>>> versions than that are now dead code and we can
>>> just delete them.
>>
>> Do we have any explicit check alreadu for 4.1, or do we
>> just let the build fail on the atomic code ?
>>
>> IOW, is there any use in having..
>>
>> #if !QEMU_GNUC_PREREQ(4, 1)
>> # error "QEMU requires GCC >= 4.1 or CLang"
>> #endif
> 
> I guess we just barf in the atomics. An explicit check might
> not be a bad idea I guess, though it depends a bit on how
> many releases we've been implicitly requiring 4.1 for. I
> suppose it would act as documentation of our current
> minimum req.

We have:

- first occurrence on Linux hosts: 0.13.0 (for vhost)

- first occurrence on non-Linux hosts: 0.15.0 (simpletrace backend),
then backed out in 1.0

- next occurrence on all hosts, but only in tests: 1.4.0

- finally, first occurrence on all hosts with simple ./configure &&
make: 2.0.0

Paolo

Re: [Qemu-devel] [PATCH] Drop QEMU_GNUC_PREREQ() checks for gcc older than 4.1

2017-01-31 Thread Paolo Bonzini



On 31/01/2017 12:40, Markus Armbruster wrote:
>>  
>>  #define QEMU_NORETURN __attribute__ ((__noreturn__))
>>  
>> -#if QEMU_GNUC_PREREQ(3, 4)
>>  #define QEMU_WARN_UNUSED_RESULT __attribute__((warn_unused_result))
>> -#else
>> -#define QEMU_WARN_UNUSED_RESULT
>> -#endif
> Should we inline this macro?
> 
>>  
>> -#if QEMU_GNUC_PREREQ(4, 0)
>>  #define QEMU_SENTINEL __attribute__((sentinel))
>> -#else
>> -#define QEMU_SENTINEL
>> -#endif
> Likewise.

Why, since we don't do that for QEMU_NORETURN, QEMU_PACKED, etc.?

Paolo

Re: [Qemu-devel] [PATCH 23/24] qcow2: add .bdrv_remove_persistent_dirty_bitmap

2017-01-31 Thread Max Reitz

On 23.01.2017 13:10, Vladimir Sementsov-Ogievskiy wrote:
> Realize .bdrv_remove_persistent_dirty_bitmap interface.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block/qcow2-bitmap.c | 40 
>  block/qcow2.c|  1 +
>  block/qcow2.h|  3 +++
>  3 files changed, 44 insertions(+)
> 
> diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
> index 2687a3acd5..be026fc80e 100644
> --- a/block/qcow2-bitmap.c
> +++ b/block/qcow2-bitmap.c
> @@ -1064,6 +1064,46 @@ static Qcow2Bitmap 
> *find_bitmap_by_name(Qcow2BitmapList *bm_list,
>  return NULL;
>  }
>  
> +void qcow2_remove_persistent_dirty_bitmap(BlockDriverState *bs,
> +  const char *name,
> +  Error **errp)
> +{
> +int ret;
> +BDRVQcow2State *s = bs->opaque;
> +Qcow2Bitmap *bm;
> +Qcow2BitmapList *bm_list;
> +
> +if (s->nb_bitmaps == 0) {
> +/* No bitmaps - nothing to do */

Shouldn't it be an error? I.e. "bitmap not found"?

> +return;
> +}
> +
> +bm_list = bitmap_list_load(bs, s->bitmap_directory_offset,
> +   s->bitmap_directory_size, errp);
> +if (bm_list == NULL) {
> +return;
> +}
> +
> +bm = find_bitmap_by_name(bm_list, name);
> +if (bm == NULL) {
> +goto fail;

What about setting errp? Or do you not consider this an error?

I think it should be an error.

> +}
> +
> +QSIMPLEQ_REMOVE(bm_list, bm, Qcow2Bitmap, entry);

bm->name is leaked here.

Max

> +
> +ret = update_ext_header_and_dir(bs, bm_list);
> +if (ret < 0) {
> +error_setg_errno(errp, -ret, "Failed to update bitmap extension");
> +goto fail;
> +}
> +
> +free_bitmap_clusters(bs, >table);
> +
> +fail:
> +bitmap_free(bm);
> +bitmap_list_free(bm_list);
> +}
> +
>  void qcow2_store_persistent_dirty_bitmaps(BlockDriverState *bs, Error **errp)
>  {
>  BdrvDirtyBitmap *bitmap;
> diff --git a/block/qcow2.c b/block/qcow2.c
> index 76a6a2ba28..f6bac38e51 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -3544,6 +3544,7 @@ BlockDriver bdrv_qcow2 = {
>  .bdrv_load_autoloading_dirty_bitmaps = 
> qcow2_load_autoloading_dirty_bitmaps,
>  .bdrv_store_persistent_dirty_bitmaps = 
> qcow2_store_persistent_dirty_bitmaps,
>  .bdrv_can_store_new_dirty_bitmap = qcow2_can_store_new_dirty_bitmap,
> +.bdrv_remove_persistent_dirty_bitmap = 
> qcow2_remove_persistent_dirty_bitmap,
>  };
>  
>  static void bdrv_qcow2_init(void)
> diff --git a/block/qcow2.h b/block/qcow2.h
> index eaad34a527..a009827f60 100644
> --- a/block/qcow2.h
> +++ b/block/qcow2.h
> @@ -627,5 +627,8 @@ bool qcow2_can_store_new_dirty_bitmap(BlockDriverState 
> *bs,
>  int qcow2_check_bitmaps_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
>void **refcount_table,
>int64_t *refcount_table_size);
> +void qcow2_remove_persistent_dirty_bitmap(BlockDriverState *bs,
> +  const char *name,
> +  Error **errp);
>  
>  #endif
> 




signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH 22/24] block/dirty-bitmap: deep release dirty bitmaps

2017-01-31 Thread Max Reitz

On 01.02.2017 00:00, Max Reitz wrote:
> On 31.01.2017 17:03, Vladimir Sementsov-Ogievskiy wrote:

[...]

>> I've started to rewrite it in that way and it seems like deep almost
>> always will be false. What about just move call to
>> bs->drv->bdrv_remove_persistent_dirty_bitmap() directly to
>> qmp_block_dirty_bitmap_remove and add parameter
>> 'remove_persistent' (or 'remove_from_storage', to make it maximum
>> descriptive) only to this qmp command?
> 
> Or you could add an explicit bdrv_remove_persistent_dirty_bitmap() which
> just wraps the bs->drv function.
> 
>> Or even without that parameter, as leaving inconsistent version of
>> bitmap in the storage doesn't seem useful.
> 
> Hm, right, the bitmap will be automatically removed from the image once
> it's closed, right?

[No, it won't.]

But you're right, if a persistent bitmap is completely removed by the
user, it should probably be always removed from the image file, too.

The only use I could think of for dropping a bitmap from memory but
wanting to keep it in the file is for keeping it at a certain state. It
would make more sense to allow the user to disable a bitmap to achieve
this, though.

We don't have a block-dirty-bitmap-disable command yet, though, but I
guess it shouldn't be too hard to add...?

(Anyway, it's something that's out of scope for this series, I think.)

Max

signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v8 17/25] cputlb: add tlb_flush_by_mmuidx async routines

2017-01-31 Thread Richard Henderson


On 01/27/2017 02:39 AM, Alex Bennée wrote:

This converts the remaining TLB flush routines to use async work when
detecting a cross-vCPU flush. The only minor complication is having to
serialise the var_list of MMU indexes into a form that can be punted
to an asynchronous job.

The pending_tlb_flush field on QOM's CPU structure also becomes a
bitfield rather than a boolean.

Signed-off-by: Alex Bennée 


Reviewed-by: Richard Henderson 


r~

Re: [Qemu-devel] [PATCH v8 16/25] cputlb and arm/sparc targets: convert mmuidx flushes from varg to bitmap

2017-01-31 Thread Richard Henderson


On 01/27/2017 02:39 AM, Alex Bennée wrote:

+for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {

-tlb_debug("%d\n", mmu_idx);
+if (test_bit(mmu_idx, _idx_bitmask)) {
+tlb_debug("%d\n", mmu_idx);

-memset(env->tlb_table[mmu_idx], -1, sizeof(env->tlb_table[0]));
-memset(env->tlb_v_table[mmu_idx], -1, sizeof(env->tlb_v_table[0]));
+memset(env->tlb_table[mmu_idx], -1, sizeof(env->tlb_table[0]));
+memset(env->tlb_v_table[mmu_idx], -1, sizeof(env->tlb_v_table[0]));
+}


Perhaps it doesn't matter since NB_MMU_MODES is so small but

  for (; idxmap != 0; idxmap &= idxmap - 1) {
int mmu_idx = ctz32(idxmap);
...
  }

iterates only for the bits that are set.


-typedef enum ARMMMUIdx {
-ARMMMUIdx_S12NSE0 = 0,
-ARMMMUIdx_S12NSE1 = 1,
-ARMMMUIdx_S1E2 = 2,
-ARMMMUIdx_S1E3 = 3,
-ARMMMUIdx_S1SE0 = 4,
-ARMMMUIdx_S1SE1 = 5,
-ARMMMUIdx_S2NS = 6,
+typedef enum ARMMMUBitMap {
+ARMMMUBit_S12NSE0 = 1 << 0,
+ARMMMUBit_S12NSE1 = 1 << 1,
+ARMMMUBit_S1E2 = 1 << 2,
+ARMMMUBit_S1E3 = 1 << 3,
+ARMMMUBit_S1SE0 = 1 << 4,
+ARMMMUBit_S1SE1 = 1 << 5,
+ARMMMUBit_S2NS = 1 << 6,
 /* Indexes below here don't have TLBs and are used only for AT system
  * instructions or for the first stage of an S12 page table walk.
  */
-ARMMMUIdx_S1NSE0 = 7,
-ARMMMUIdx_S1NSE1 = 8,
-} ARMMMUIdx;
+ARMMMUBit_S1NSE0 = 1 << 7,
+ARMMMUBit_S1NSE1 = 1 << 8,
+} ARMMMUBitMap;

-#define MMU_USER_IDX 0
+typedef int ARMMMUIdx;
+
+static inline ARMMMUIdx arm_mmu_bit_to_idx(ARMMMUBitMap bit)
+{
+g_assert(ctpop16(bit) == 1);
+return ctz32(bit);
+}
+
+static inline ARMMMUBitMap arm_mmu_idx_to_bit(ARMMMUIdx idx)
+{
+return 1 << idx;
+}


I don't understand this redefinition though, causing...


@@ -2109,13 +2109,13 @@ static void ats_write(CPUARMState *env, const 
ARMCPRegInfo *ri, uint64_t value)
 /* stage 1 current state PL1: ATS1CPR, ATS1CPW */
 switch (el) {
 case 3:
-mmu_idx = ARMMMUIdx_S1E3;
+mmu_bit = ARMMMUBit_S1E3;
 break;
 case 2:
-mmu_idx = ARMMMUIdx_S1NSE1;
+mmu_bit = ARMMMUBit_S1NSE1;
 break;
 case 1:
-mmu_idx = secure ? ARMMMUIdx_S1SE1 : ARMMMUIdx_S1NSE1;
+mmu_bit = secure ? ARMMMUBit_S1SE1 : ARMMMUBit_S1NSE1;
 break;
 default:
 g_assert_not_reached();
@@ -2125,13 +2125,13 @@ static void ats_write(CPUARMState *env, const 
ARMCPRegInfo *ri, uint64_t value)
 /* stage 1 current state PL0: ATS1CUR, ATS1CUW */
 switch (el) {
 case 3:
-mmu_idx = ARMMMUIdx_S1SE0;
+mmu_bit = ARMMMUBit_S1SE0;
 break;
 case 2:
-mmu_idx = ARMMMUIdx_S1NSE0;
+mmu_bit = ARMMMUBit_S1NSE0;
 break;
 case 1:
-mmu_idx = secure ? ARMMMUIdx_S1SE0 : ARMMMUIdx_S1NSE0;
+mmu_bit = secure ? ARMMMUBit_S1SE0 : ARMMMUBit_S1NSE0;
 break;
 default:
 g_assert_not_reached();
@@ -2139,17 +2139,17 @@ static void ats_write(CPUARMState *env, const 
ARMCPRegInfo *ri, uint64_t value)
 break;
 case 4:
 /* stage 1+2 NonSecure PL1: ATS12NSOPR, ATS12NSOPW */
-mmu_idx = ARMMMUIdx_S12NSE1;
+mmu_bit = ARMMMUBit_S12NSE1;
 break;
 case 6:
 /* stage 1+2 NonSecure PL0: ATS12NSOUR, ATS12NSOUW */
-mmu_idx = ARMMMUIdx_S12NSE0;
+mmu_bit = ARMMMUBit_S12NSE0;
 break;
 default:
 g_assert_not_reached();
 }

-par64 = do_ats_write(env, value, access_type, mmu_idx);
+par64 = do_ats_write(env, value, access_type, arm_mmu_bit_to_idx(mmu_bit));


... this sort of churn, only to convert *back* to an index.


@@ -2185,26 +2186,26 @@ static void ats_write64(CPUARMState *env, const 
ARMCPRegInfo *ri,
 case 0:
 switch (ri->opc1) {
 case 0: /* AT S1E1R, AT S1E1W */
-mmu_idx = secure ? ARMMMUIdx_S1SE1 : ARMMMUIdx_S1NSE1;
+mmu_idx = secure ? ARMMMUBit_S1SE1 : ARMMMUBit_S1NSE1;
 break;
 case 4: /* AT S1E2R, AT S1E2W */
-mmu_idx = ARMMMUIdx_S1E2;
+mmu_idx = ARMMMUBit_S1E2;
 break;
 case 6: /* AT S1E3R, AT S1E3W */
-mmu_idx = ARMMMUIdx_S1E3;
+mmu_idx = ARMMMUBit_S1E3;
 break;
 default:
 g_assert_not_reached();
 }
 break;
 case 2: /* AT S1E0R, AT S1E0W */
-mmu_idx = secure ? ARMMMUIdx_S1SE0 : ARMMMUIdx_S1NSE0;
+mmu_idx = secure ? ARMMMUBit_S1SE0 : ARMMMUBit_S1NSE0;
 break;
 case 4: /* AT S12E1R, AT S12E1W */
-mmu_idx = secure ? ARMMMUIdx_S1SE1 : ARMMMUIdx_S12NSE1;
+mmu_idx = secure ? ARMMMUBit_S1SE1 : ARMMMUBit_S12NSE1;
 break;
 case 6: /* AT S12E0R, AT S12E0W */
-mmu_idx = secure ?

Re: [Qemu-devel] [PATCH 22/24] block/dirty-bitmap: deep release dirty bitmaps

2017-01-31 Thread Max Reitz

On 31.01.2017 17:03, Vladimir Sementsov-Ogievskiy wrote:
> 29.01.2017 19:54, Max Reitz wrote:
>> On 23.01.2017 13:10, Vladimir Sementsov-Ogievskiy wrote:
>>> Remove persistent bitmap from its storage on bdrv_release_dirty_bitmap.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy 
>>> ---
>>>   block/dirty-bitmap.c  | 21 ++---
>>>   include/block/block_int.h |  3 +++
>>>   2 files changed, 21 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
>>> index 775181c9ab..0977df6309 100644
>>> --- a/block/dirty-bitmap.c
>>> +++ b/block/dirty-bitmap.c
>>> @@ -297,7 +297,8 @@ void bdrv_dirty_bitmap_truncate(BlockDriverState
>>> *bs)
>>> static void
>>> bdrv_do_release_matching_dirty_bitmap(BlockDriverState *bs,
>>> BdrvDirtyBitmap
>>> *bitmap,
>>> -  bool only_named)
>>> +  bool only_named,
>>> +  bool deep)
>> I'd rather call it "remove_persistent" or something, which is bad
>> grammar, but it's better at getting the point across.
>>
>>>   {
>>>   BdrvDirtyBitmap *bm, *next;
>>>   QLIST_FOREACH_SAFE(bm, >dirty_bitmaps, list, next) {
>>> @@ -305,6 +306,19 @@ static void
>>> bdrv_do_release_matching_dirty_bitmap(BlockDriverState *bs,
>>>   assert(!bm->active_iterators);
>>>   assert(!bdrv_dirty_bitmap_frozen(bm));
>>>   assert(!bm->meta);
>>> +
>>> +if (deep && bm->persistent && bs->drv &&
>>> +bs->drv->bdrv_remove_persistent_dirty_bitmap)
>>> +{
>>> +Error *local_err = NULL;
>>> +bs->drv->bdrv_remove_persistent_dirty_bitmap(bs,
>>> bm->name,
>>> +
>>> _err);
>>> +if (local_err != NULL) {
>>> +error_report_err(local_err);
>> This looks like maybe it's the wrong thing to do... I do agree that
>> sometimes it may not be fatal, but sometimes it probably is.
>>
>>> +}
>>> +}
>>> +
>>> +
>>>   QLIST_REMOVE(bm, list);
>>>   hbitmap_free(bm->bitmap);
>>>   g_free(bm->name);
>>> @@ -322,16 +336,17 @@ static void
>>> bdrv_do_release_matching_dirty_bitmap(BlockDriverState *bs,
>>> void bdrv_release_dirty_bitmap(BlockDriverState *bs,
>>> BdrvDirtyBitmap *bitmap)
>>>   {
>>> -bdrv_do_release_matching_dirty_bitmap(bs, bitmap, false);
>>> +bdrv_do_release_matching_dirty_bitmap(bs, bitmap, false, true);
>> The @deep parameter (or however you decide to call it) should be
>> available to this function's caller, too. I don't believe qcow2's
>> load_bitmap() wants to delete the persistent bitmap in the error path;
>> and block-dirty-bitmap-remove should probably allow the user to decide
>> what to do.
>>
>> Which brings me to the error thing above: If the user (or management
>> tool) decides that block-dirty-bitmap-remove should remove the
>> persistent bitmap, I believe that the operation should be aborted if the
>> persistent bitmap cannot be removed. It should not just go on and
>> release the in-memory bitmap but abort altogether. If the user just
>> wants to release that in-memory bitmap then, they can still go on an
>> call block-dirty-bitmap-remove with deep=false.
> 
> I've started to rewrite it in that way and it seems like deep almost
> always will be false. What about just move call to
> bs->drv->bdrv_remove_persistent_dirty_bitmap() directly to
> qmp_block_dirty_bitmap_remove and add parameter
> 'remove_persistent' (or 'remove_from_storage', to make it maximum
> descriptive) only to this qmp command?

Or you could add an explicit bdrv_remove_persistent_dirty_bitmap() which
just wraps the bs->drv function.

> Or even without that parameter, as leaving inconsistent version of
> bitmap in the storage doesn't seem useful.

Hm, right, the bitmap will be automatically removed from the image once
it's closed, right?

Max



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH 09/24] qcow2: add .bdrv_load_autoloading_dirty_bitmaps

2017-01-31 Thread Max Reitz

On 31.01.2017 16:11, Vladimir Sementsov-Ogievskiy wrote:
> 29.01.2017 01:20, Max Reitz wrote:
>> On 23.01.2017 13:10, Vladimir Sementsov-Ogievskiy wrote:
>>> Auto loading bitmaps are bitmaps in Qcow2, with the AUTO flag set. They
>>> are loaded when the image is opened and become BdrvDirtyBitmaps for the
>>> corresponding drive.
>>>
>>> Extra data in bitmaps is not supported for now.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy 
>>> ---
>>>  block/Makefile.objs  |   2 +-
>>>  block/qcow2-bitmap.c | 712 
>>> +++
>>>  block/qcow2.c|   2 +
>>>  block/qcow2.h|   3 +
>>>  4 files changed, 718 insertions(+), 1 deletion(-)
>>>  create mode 100644 block/qcow2-bitmap.c
>> [...]
>>
>>> new file mode 100644
>>> index 00..758622dcfc
>>> --- /dev/null
>>> +++ b/block/qcow2-bitmap.c
>> [...]
>>
>>> +
>>> +/*
>>> + * Bitmap List end
>>> + */
>>> +
>>> +static int update_ext_header_and_dir_in_place(BlockDriverState *bs,
>>> +  Qcow2BitmapList *bm_list)
>>> +{
>>> +BDRVQcow2State *s = bs->opaque;
>>> +int ret;
>>> +
>>> +if (!(s->autoclear_features & QCOW2_AUTOCLEAR_BITMAPS) ||
>>> +bm_list == NULL || QSIMPLEQ_EMPTY(bm_list) ||
>>> +bitmap_list_count(bm_list) != s->nb_bitmaps)
>>> +{
>>> +return -EINVAL;
>>> +}
>>> +
>>> +s->autoclear_features &= ~(uint64_t)QCOW2_AUTOCLEAR_BITMAPS;
>>> +ret = update_header_sync(bs);
>> This function is defined only in patch 13, until then the code won't
>> compile; the definition should be moved into this patch.
>>
>> Max
>>
> 
> Stupid omission, sorry =(. I'll move it and add your r-b, ok?

Yep, that's fine.

Max



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH 07/24] qcow2: add bitmaps extension

2017-01-31 Thread Max Reitz

On 30.01.2017 08:21, Vladimir Sementsov-Ogievskiy wrote:
> 29.01.2017 00:29, Max Reitz wrote:
>> On 23.01.2017 13:10, Vladimir Sementsov-Ogievskiy wrote:
>>> Add bitmap extension as specified in docs/specs/qcow2.txt.
>>> For now, just mirror extension header into Qcow2 state and check
>>> constraints.
>>>
>>> For now, disable image resize if it has bitmaps. It will be fixed later.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy 
>>> ---
>>>   block/qcow2.c | 119
>>> +-
>>>   block/qcow2.h |  24 
>>>   2 files changed, 141 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/block/qcow2.c b/block/qcow2.c
>>> index 96fb8a8f16..a8835988c7 100644
>>> --- a/block/qcow2.c
>>> +++ b/block/qcow2.c
>> [...]
>>
>>> @@ -185,6 +265,15 @@ static int
>>> qcow2_read_extensions(BlockDriverState *bs, uint64_t start_offset,
>>>   offset += ((ext.len + 7) & ~7);
>>>   }
>>>   +if (need_update && !bdrv_is_root_node(bs) &&
>>> +!(bdrv_get_flags(bs) & BDRV_O_INACTIVE))
>>> +{
>>> +ret = qcow2_update_header(bs);
>> Good idea, but qcow2_read_extensions() can be called pretty early in
>> qcow2_open(). Therefore, some fields in the BDRVQcow2State are not
>> necessarily set already.
>>
>> For instance, it is always called before s->snapshots_offset and
>> s->nb_snapshots are set. Thus, this will effectively always discard all
>> snapshots.
>>
>> Max
>>
> 
> So, I need to move need_update to qcow2_read_extensions parameters as
> 'bool *' and handle it later, in the end of qcow2_open. Ok?

Yes, that sounds reasonable.

Max



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH 1/2] qemu-io: don't allow I/O operations larger than INT_MAX

2017-01-31 Thread Max Reitz

On 31.01.2017 19:11, Alberto Garcia wrote:
> On Tue 31 Jan 2017 05:41:23 PM CET, Eric Blake  wrote:
> 
 Ideally, it would be nice to fix the block layer to allow larger
 requests (since we already have code to auto-fragment down to device
 limits, we should be able to rely on that code instead of having to
 duplicate artificial constraints everywhere else in the tree).  But
 that's a bigger task, and this is a good patch in the interim.
>>>
>>> Related question: what's the largest request than a guest can
>>> theoretically submit?
>>
>> off_t supports up to 2^63 (not 2^64, because it is a signed type).
>> Ideally, we should be constrained only by the disk size (as no one
>> actually has 2^63 bytes of storage available), by using uint64_t
>> offset AND length in all our APIs; but right now, we still have a lot
>> of 32-bit length issues, and often signed length limiting us to 2^31
>> depending on the API.
> 
> My question was more like: what happens if the guest submits a request
> with the maximum possible size? Does QEMU handle that correctly?

As far as I'm aware we rely on the device emulation code to split the
request.

Max



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH] char: drop data written to a disconnected pty

2017-01-31 Thread Ed Swierk

On Tue, Jan 31, 2017 at 6:06 AM, Marc-André Lureau  wrote:
> I think this can be confusing if some backends silently drop the data (under 
> disconnected state), while other don't. Perhaps we should have instead a new 
> common chardev property "hup-drop" ? (suggestions for better name welcome)

Either way, when a pty is disconnected data will get dropped, whether
by the pty backend (as proposed) or by the serial port device or other
frontend (as currently).

The only difference from a user's perspective is whether the dropped
data gets written to the log file, which is the main motivation for
this change. The backward compatibility concern would be an existing
application that relies on data not being logged when the pty is
disconnected.

--Ed

Re: [Qemu-devel] [PATCH Risu 1/2] risu_ppc64: Fix Risu to run under qemu linux user

2017-01-31 Thread joserz

On Mon, Jan 30, 2017 at 11:49:34AM +, Peter Maydell wrote:
> On 30 January 2017 at 02:47, Jose Ricardo Ziviani
>  wrote:
> > Qemu linux-user doesn't fill uc_mcontext completely like full emul. does.
> > For instance, uc->uc_mcontext.regs->nip is an invalid so this
> > commit replaces it by uc->uc_mcontext.gp_regs[PT_NIP]
> 
> It's not clear to me from this commit message whether this is
> a bug in QEMU's userspace emulation which this is trying to work
> around (in which case we should just fix it in QEMU), or a
> bug in risu where we were incorrectly relying on something the
> kernel doesn't actually guarantee. Which is it?
> 
> Also, looking at the kernel source and headers as far
> as I can see uc_context.regs is a pointer set up such that
> uc->uc_mcontext.regs->nip is pointing at the same bit of
> memory where uc->uc_mcontext.gp_regs[PT_NIP] is,
> and the QEMU code does similar, so I don't see how you can
> get two different values from the two things.
> 
> (It is certainly the case that risu is quite good at exercising
> odd corner cases of the signal handling code in QEMU which most
> normal programs don't care about...)

Peter

I just sent a patch "linux-user: fill target sigcontext struct accordingly" to 
fix it in QEMU. Please, forget this patchset, I'll reorganize it and send it 
later.

Thank you

Ziviani

[Qemu-devel] [PATCH] linux-user: fill target sigcontext struct accordingly

2017-01-31 Thread Jose Ricardo Ziviani

A segfault is noticed when an emulated program uses any of ucontext
regs fields. Risu detected this issue in the following operation when
handling a signal:
  ucontext_t *uc = (ucontext_t*)uc;
  uc->uc_mcontext.regs->nip += 4;

but this works fine:
  uc->uc_mcontext.gp_regs[PT_NIP] += 4;

This patch set regs to a valid location as well as other sigcontext
fields.

Signed-off-by: Jose Ricardo Ziviani 
---
 linux-user/signal.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/linux-user/signal.c b/linux-user/signal.c
index 5064de0..8209539 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -5155,6 +5155,7 @@ static void setup_rt_frame(int sig, struct 
target_sigaction *ka,
 target_ulong rt_sf_addr, newsp = 0;
 int i, err = 0;
 #if defined(TARGET_PPC64)
+struct target_sigcontext *sc = 0;
 struct image_info *image = ((TaskState *)thread_cpu->opaque)->info;
 #endif
 
@@ -5183,6 +5184,10 @@ static void setup_rt_frame(int sig, struct 
target_sigaction *ka,
 #if defined(TARGET_PPC64)
 mctx = _sf->uc.tuc_sigcontext.mcontext;
 trampptr = _sf->trampoline[0];
+
+sc = _sf->uc.tuc_sigcontext;
+__put_user(h2g(mctx), >regs);
+__put_user(sig, >signal);
 #else
 mctx = _sf->uc.tuc_mcontext;
 trampptr = (uint32_t *)_sf->uc.tuc_mcontext.tramp;
-- 
2.7.4

Re: [Qemu-devel] [libvirt] char: Logging serial pty output when disconnected

2017-01-31 Thread Ed Swierk

On Tue, Jan 31, 2017 at 7:34 AM, Paolo Bonzini  wrote:
>
>
> On 31/01/2017 04:39, Daniel P. Berrange wrote:
>> I don't think so - retrying in this way is pointless IMHO - it is just
>> going to get the same result on every retry on 99% of occassions.
>
> Just to provide the full context, the retry happens even if you get
> EAGAIN, and in that case it does makes sense.
>
> But if the pty is disconnected I agree it should discard the data.

Thanks for the feedback. Please refer to this proposed patch:
http://patchwork.ozlabs.org/patch/721974/ .

--Ed

Re: [Qemu-devel] [Bug 1250360] Re: qcow2 image logical corruption after host crash

2017-01-31 Thread Blue

Since then, I switched all my vm images to (sparse) raw  and never
experienced corruption problems again.
I could not say if this can still be reproduced today, even then it was
probably a corner case.
I would suggest the closing of the issue as we cannot gather newer and more
relevant data.

On Tue, Jan 31, 2017 at 10:49 PM, Thomas Huth <1250...@bugs.launchpad.net>
wrote:

> Did this issue happen ever again with a recent version of QEMU? ... if
> not, should we close this bug nowadays?
>
> ** Changed in: qemu
>Status: New => Incomplete
>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/1250360
>
> Title:
>   qcow2 image logical corruption after host crash
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/qemu/+bug/1250360/+subscriptions
>

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1250360

Title:
  qcow2 image logical corruption after host crash

Status in QEMU:
  Incomplete

Bug description:
  Description of problem:
  In case of power failure disk images that were active and created in qcow2 
format can become logically corrupt so that they actually appear as unused 
(full of zeroes).
  Data seems to be there, but at this moment i cannot find any reliable method 
to recover it. Should it be a raw image, a recovery path would be available, 
but a qcow2 image only presents zeroes once it gets corrupted. My understanding 
is that the blockmap of the image gets reset and the image is then assumed to 
be unused.
  My detailed setup :

  Kernel 2.6.32-358.18.1.el6.x86_64
  qemu-kvm-0.12.1.2-2.355.0.1.el6.centos.7.x86_64
  Used via libvirt libvirt-0.10.2-18.el6_4.14.x86_64
  The image was used from a NFS share (the nfs server did NOT crash and 
remained permanently active).

  qemu-img check finds no corruption;
  qemu-img convert will fully convert the image to raw at a raw image full of 
zeroes. However, there is data in the file, and the storage backend was not 
restarted, inactivated during the incident.
  I encountered this issue on two different machines, in both cases i was not 
able to recover the data.
  Image was qcow2, thin provisioned, created like this :
   qemu-img create -f qcow2 -o cluster_size=2M imagename.img

  While addressing the root cause in order to not have this issue repeat
  would be the ideal scenario, a temporary workaround to run on the
  affected qcow2 image to "patch" it and recover the data (eventually
  after a full fsck/recovery inside the guest) would also be good.
  Otherwise we are basically losing data on a large scale when using
  qcow2.

  Version-Release number of selected component (if applicable):
  Kernel 2.6.32-358.18.1.el6.x86_64
  qemu-kvm-0.12.1.2-2.355.0.1.el6.centos.7.x86_64
  Used via libvirt libvirt-0.10.2-18.el6_4.14.x86_64

  How reproducible:
  I am not able (and don't have at the moment enough resources to try to 
manually reproduce it), but the probability of the issue seems quite high as 
this is the second case of such corruption in weeks.
  Additional info:
  I can privately provide an image displaying the corruption.

  The reported problem has actually two aspects : first is the cause that 
eventually produces this issue.
  The second is the fact that once the logical corruption has occured, qemu-img 
check finds nothing wrong with the image - this is obviously wrong.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1250360/+subscriptions

Re: [Qemu-devel] qemu-pcc 2.8.0 linux-user segfaults

2017-01-31 Thread Aníbal Limón



On 01/30/2017 05:52 PM, Sam Bobroff wrote:
> On Mon, Jan 16, 2017 at 04:03:21PM -0600, Aníbal Limón wrote:
>>
>>
>> On 01/16/2017 03:56 PM, Aníbal Limón wrote:
>>> Hi folks,
>>>
>>> I'm trying to upgrade qemu to 2.8.0 in Openembedded-core and segfaults
>>> in qemu-ppc when is executing:
> 
> Hi Aníbal,
> 
> I've recently encountered a similar problem and I've posted a fix. If
> you'd like to try it, it's here:
> 
> https://lists.gnu.org/archive/html/qemu-ppc/2017-01/msg00413.html

The patch works, thanks for made it.

Cheers,
alimon

> 
> Cheers,
> Sam.
> 
>>> /home/alimon/repos/poky/build-ppc/tmp/work/x86_64-linux/qemu-native/2.8.0-r0/build/ppc-linux-user/qemu-ppc
>>> -s 16M -r 3.2.0 -cpu 7400 -L
>>> /home/alimon/repos/poky/build-ppc/tmp/sysroots/qemuppc -E
>>> LD_LIBRARY_PATH=/home/alimon/repos/poky/build-ppc/tmp/work/ppc7400-poky-linux/gobject-introspection/1.50.0-r0/build/.libs:.libs:/home/alimon/repos/poky/build-ppc/tmp/sysroots/qemuppc//usr/lib:/home/alimon/repos/poky/build-ppc/tmp/sysroots/qemuppc//lib
>>> /home/alimon/repos/poky/build-ppc/tmp/work/ppc7400-poky-linux/gobject-introspection/1.50.0-r0/build/tmp-introspectu_ewt_1z/Gio-2.0
>>> --introspect-dump=/home/alimon/repos/poky/build-ppc/tmp/work/ppc7400-poky-linux/gobject-introspection/1.50.0-r0/build/tmp-introspectu_ewt_1z/functions.txt,/home/alimon/repos/poky/build-ppc/tmp/work/ppc7400-poky-linux/gobject-introspection/1.50.0-r0/build/tmp-introspectu_ewt_1z/dump.xml
>>> [Thread debugging using libthread_db enabled]
>>>
>>>
>>> And the debug info,
>>>
>>> (gdb) info threads
>>>   Id   Target Id Frame
>>>   1Thread 0x77fd0780 (LWP 25457) "qemu-ppc"
>>> pthread_cond_wait@@GLIBC_2.3.2 () at
>>> ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
>>>   2Thread 0x7647e700 (LWP 25461) "qemu-ppc" syscall () at
>>> ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
>>> * 3Thread 0x77f4d700 (LWP 25462) "qemu-ppc" 0x0086fba4
>>> in static_code_gen_buffer ()
>>>
>>> (gdb) bt
>>> #0  0x0086fba4 in static_code_gen_buffer ()
>>> #1  0x0040e922 in cpu_tb_exec (itb=,
>>> itb=, cpu=0x29864d0) at
>>> /home/alimon/repos/poky/build-ppc/tmp/work/x86_64-linux/qemu-native/2.8.0-r0/qemu-2.8.0/cpu-exec.c:164
>>> #2  cpu_loop_exec_tb (sc=, tb_exit=,
>>> last_tb=, tb=, cpu=0x29864d0) at
>>> /home/alimon/repos/poky/build-ppc/tmp/work/x86_64-linux/qemu-native/2.8.0-r0/qemu-2.8.0/cpu-exec.c:544
>>> #3  cpu_exec (cpu=cpu@entry=0x29864d0) at
>>> /home/alimon/repos/poky/build-ppc/tmp/work/x86_64-linux/qemu-native/2.8.0-r0/qemu-2.8.0/cpu-exec.c:638
>>> #4  0x00445fba in cpu_loop (env=env@entry=0x298e750) at
>>> /home/alimon/repos/poky/build-ppc/tmp/work/x86_64-linux/qemu-native/2.8.0-r0/qemu-2.8.0/linux-user/main.c:1359
>>> #5  0x00448a95 in clone_func (arg=0x7fffa910) at
>>> /home/alimon/repos/poky/build-ppc/tmp/work/x86_64-linux/qemu-native/2.8.0-r0/qemu-2.8.0/linux-user/syscall.c:6090
>>> #6  0x76a750a4 in start_thread (arg=0x77f4d700) at
>>> pthread_create.c:309
>>> #7  0x767aa62d in clone () at
>>> ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
>>
>> Here is the core dump,
>>
>> https://drive.google.com/file/d/0B9uDfO-FJ1kgY3ZhendISTZzOUU/view?usp=sharing
>>
>>>
>>> Any help will be appreciated.
>>>
>>> Best regards,
>>> alimon
>>>
>>
> 
> 
> 



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v4 1/9] ACPI: Add a function for building named qword entries

2017-01-31 Thread Michael S. Tsirkin

On Tue, Jan 31, 2017 at 10:51:02AM +0100, Igor Mammedov wrote:
> On Mon, 30 Jan 2017 22:28:41 +0200
> "Michael S. Tsirkin"  wrote:
> 
> > On Fri, Jan 27, 2017 at 10:43:13AM -0500, Kevin O'Connor wrote:
> > > On Fri, Jan 27, 2017 at 03:46:33PM +0100, Laszlo Ersek wrote:  
> > > > On 01/27/17 15:18, Kevin O'Connor wrote:  
> > > > > If an offset is going to be added, shouldn't both a source offset and
> > > > > destination offset be used?
> > > > > 
> > > > > /*
> > > > >  * COMMAND_WRITE_POINTER - update a writeable file named
> > > > >  * @pointer.dest_file at @pointer.dest_offset, by writing 
> > > > > pointer
> > > > >  * plus @pointer.src_offset to the blob originating from
> > > > >  * @src_file. 1,2,4 or 8 byte unsigned write is used depending
> > > > >  * on @pointer.size.
> > > > >  */
> > > > > struct {
> > > > > char dest_file[BIOS_LINKER_LOADER_FILESZ];
> > > > > char src_file[BIOS_LINKER_LOADER_FILESZ];
> > > > > uint32_t src_offset, dest_offset;
> > > > > uint8_t size;
> > > > > } pointer;
> > > > > 
> > > > > I doubt the offsets or size is really all that important though.  
> > > > 
> > > > The offset into the fw_cfg file that receives the allocation address is
> > > > important, that allows the same file to receive several different
> > > > addresses (for different downloaded blobs), at different offsets.
> > > > 
> > > > OTOH, asking the firmware to add a constant to the address value before
> > > > writing it to the fw_cfg file is not necessary, in my opinion. The blob
> > > > that the firmware allocated and downloaded originates from QEMU to begin
> > > > with, so QEMU knows its internal structure.  
> > > 
> > > I guess I'm missing why QEMU would want to use the same writable file
> > > for multiple pointers as well as why it would want support for
> > > pointers smaller than 8 bytes in size.  If it's because it may be
> > > easier to support an internal QEMU blob of a particular format, then
> > > adding a src_offset would facilitate that.
> > > 
> > > However, if it was done so that WRITE_POINTER mimicks ADD_POINTER then
> > > that's fine too.  I'm okay with either format.
> > > 
> > > -Kevin  
> > 
> > Both reasons :) offset is because it's easier for QEMU not to have to add
> > more files (e.g. it simplifies cross-version migration if we don't).
> On one hand offset simplifies since one file could be re-used for
> several pointers, on the other hand it doesn't make difference wrt
> migration since offset becomes ABI and has to be maintained in
> cross-version migration scenario (size of file shouldn't be issue
> as they are re-sizable now). So we just end-up with offset vs new file
> versioning.

Not really - offset is migrated automatically since it's in RAM.
No need to version it.

> However considering that number of files is limited,
> offset scales up better.
> 
> > size is to mimick ADD_POINTER.
> >

[Qemu-devel] [Bug 1250360] Re: qcow2 image logical corruption after host crash

2017-01-31 Thread Thomas Huth

Did this issue happen ever again with a recent version of QEMU? ... if
not, should we close this bug nowadays?

** Changed in: qemu
   Status: New => Incomplete

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1250360

Title:
  qcow2 image logical corruption after host crash

Status in QEMU:
  Incomplete

Bug description:
  Description of problem:
  In case of power failure disk images that were active and created in qcow2 
format can become logically corrupt so that they actually appear as unused 
(full of zeroes).
  Data seems to be there, but at this moment i cannot find any reliable method 
to recover it. Should it be a raw image, a recovery path would be available, 
but a qcow2 image only presents zeroes once it gets corrupted. My understanding 
is that the blockmap of the image gets reset and the image is then assumed to 
be unused.
  My detailed setup :

  Kernel 2.6.32-358.18.1.el6.x86_64
  qemu-kvm-0.12.1.2-2.355.0.1.el6.centos.7.x86_64
  Used via libvirt libvirt-0.10.2-18.el6_4.14.x86_64
  The image was used from a NFS share (the nfs server did NOT crash and 
remained permanently active).

  qemu-img check finds no corruption;
  qemu-img convert will fully convert the image to raw at a raw image full of 
zeroes. However, there is data in the file, and the storage backend was not 
restarted, inactivated during the incident.
  I encountered this issue on two different machines, in both cases i was not 
able to recover the data.
  Image was qcow2, thin provisioned, created like this :
   qemu-img create -f qcow2 -o cluster_size=2M imagename.img

  While addressing the root cause in order to not have this issue repeat
  would be the ideal scenario, a temporary workaround to run on the
  affected qcow2 image to "patch" it and recover the data (eventually
  after a full fsck/recovery inside the guest) would also be good.
  Otherwise we are basically losing data on a large scale when using
  qcow2.


  Version-Release number of selected component (if applicable):
  Kernel 2.6.32-358.18.1.el6.x86_64
  qemu-kvm-0.12.1.2-2.355.0.1.el6.centos.7.x86_64
  Used via libvirt libvirt-0.10.2-18.el6_4.14.x86_64

  How reproducible:
  I am not able (and don't have at the moment enough resources to try to 
manually reproduce it), but the probability of the issue seems quite high as 
this is the second case of such corruption in weeks.
  Additional info:
  I can privately provide an image displaying the corruption.

  The reported problem has actually two aspects : first is the cause that 
eventually produces this issue.
  The second is the fact that once the logical corruption has occured, qemu-img 
check finds nothing wrong with the image - this is obviously wrong.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1250360/+subscriptions

Re: [Qemu-devel] [PATCH] q35: Provide improved sample configurations

2017-01-31 Thread Eric Blake

On 01/31/2017 02:34 PM, Gerd Hoffmann wrote:

> 
> Note that you can specify -readconfig multiple times, so you can split
> out common stuff and ask people to run "qemu -readconfig
> q35-paravirt-base.cfg -readconfig q35-paravirt-$style.cfg"

Can we add #include-like semantics, so that one .cfg file can
automatically pull in pre-requisites without having to document a
command-line use of multiple -readconfig?

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org

signature.asc
Description: OpenPGP digital signature

[Qemu-devel] [Bug 1241569] Re: qemu-system-alpha console unresponsive

2017-01-31 Thread Thomas Huth

** Changed in: qemu
   Status: New => Invalid

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1241569

Title:
  qemu-system-alpha console unresponsive

Status in QEMU:
  Invalid

Bug description:
  I have created a virtual machine using the QEMU Alpha emulator (very
  basic, 1 scsi disc, 1 scsi CDROM, 1gb memory). The machine starts, but
  entering any system commands at the prompt just echs back the command
  typed. For example

  >>> show device
  got: show device
  >>> 

  Obviously booting any OS from this is not possible.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1241569/+subscriptions

[Qemu-devel] [PULL v5 21/22] arm: better stub version for MISMATCH_CHECK

2017-01-31 Thread Michael S. Tsirkin

stub version of MISMATCH_CHECK is empty so it's easy to misuse for
people not building kvm on arm.  Use QEMU_BUILD_BUG_ON similar to the
non-stub version to make it easier to catch bugs.

Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Peter Maydell 
---
 target/arm/kvm-consts.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/target/arm/kvm-consts.h b/target/arm/kvm-consts.h
index a2c9518..06b6c92 100644
--- a/target/arm/kvm-consts.h
+++ b/target/arm/kvm-consts.h
@@ -21,7 +21,9 @@
 #define MISMATCH_CHECK(X, Y) QEMU_BUILD_BUG_ON(X != Y)
 
 #else
-#define MISMATCH_CHECK(X, Y)
+
+#define MISMATCH_CHECK(X, Y) QEMU_BUILD_BUG_ON(0)
+
 #endif
 
 #define CP_REG_SIZE_SHIFT 52
-- 
MST

[Qemu-devel] [PULL v5 19/22] vhost-user: delete chardev on cleanup

2017-01-31 Thread Michael S. Tsirkin

From: Marc-André Lureau 

Remove the chardev implicitly when cleaning up the netdev. This
prevents from reusing the chardev since it would be in an incorrect
state with the slave.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1256618

Signed-off-by: Marc-André Lureau 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Eric Blake 
---
 net/vhost-user.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/vhost-user.c b/net/vhost-user.c
index b0f0ab6..77b8110 100644
--- a/net/vhost-user.c
+++ b/net/vhost-user.c
@@ -151,7 +151,10 @@ static void vhost_user_cleanup(NetClientState *nc)
 s->vhost_net = NULL;
 }
 if (nc->queue_index == 0) {
+Chardev *chr = qemu_chr_fe_get_driver(>chr);
+
 qemu_chr_fe_deinit(>chr);
+qemu_chr_delete(chr);
 }
 
 qemu_purge_queued_packets(nc);
-- 
MST

[Qemu-devel] [PULL v5 16/22] pci: Convert msix_init() to Error and fix callers

2017-01-31 Thread Michael S. Tsirkin

From: Cao jin 

msix_init() reports errors with error_report(), which is wrong when
it's used in realize().  The same issue was fixed for msi_init() in
commit 1108b2f. In order to make the API change as small as possible,
leave the return value check to later patch.

For some devices(like e1000e, vmxnet3, nvme) who won't fail because of
msix_init's failure, suppress the error report by passing NULL error
object.

Bonus: add comment for msix_init.

CC: Jiri Pirko 
CC: Gerd Hoffmann 
CC: Dmitry Fleytman 
CC: Jason Wang 
CC: Michael S. Tsirkin 
CC: Hannes Reinecke 
CC: Paolo Bonzini 
CC: Alex Williamson 
CC: Markus Armbruster 
CC: Marcel Apfelbaum 
Signed-off-by: Cao jin 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/hw/pci/msix.h  |  5 +++--
 hw/block/nvme.c|  2 +-
 hw/misc/ivshmem.c  |  8 
 hw/net/e1000e.c|  2 +-
 hw/net/rocker/rocker.c |  4 +++-
 hw/net/vmxnet3.c   |  2 +-
 hw/pci-bridge/gen_pcie_root_port.c |  3 +--
 hw/pci/msix.c  | 36 +++-
 hw/scsi/megasas.c  |  4 +++-
 hw/usb/hcd-xhci.c  |  4 ++--
 hw/vfio/pci.c  |  8 ++--
 hw/virtio/virtio-pci.c |  4 ++--
 12 files changed, 58 insertions(+), 24 deletions(-)

diff --git a/include/hw/pci/msix.h b/include/hw/pci/msix.h
index 048a29d..1f27658 100644
--- a/include/hw/pci/msix.h
+++ b/include/hw/pci/msix.h
@@ -9,9 +9,10 @@ MSIMessage msix_get_message(PCIDevice *dev, unsigned int 
vector);
 int msix_init(PCIDevice *dev, unsigned short nentries,
   MemoryRegion *table_bar, uint8_t table_bar_nr,
   unsigned table_offset, MemoryRegion *pba_bar,
-  uint8_t pba_bar_nr, unsigned pba_offset, uint8_t cap_pos);
+  uint8_t pba_bar_nr, unsigned pba_offset, uint8_t cap_pos,
+  Error **errp);
 int msix_init_exclusive_bar(PCIDevice *dev, unsigned short nentries,
-uint8_t bar_nr);
+uint8_t bar_nr, Error **errp);
 
 void msix_write_config(PCIDevice *dev, uint32_t address, uint32_t val, int 
len);
 
diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index d479fd2..ae91a18 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -872,7 +872,7 @@ static int nvme_init(PCIDevice *pci_dev)
 pci_register_bar(>parent_obj, 0,
 PCI_BASE_ADDRESS_SPACE_MEMORY | PCI_BASE_ADDRESS_MEM_TYPE_64,
 >iomem);
-msix_init_exclusive_bar(>parent_obj, n->num_queues, 4);
+msix_init_exclusive_bar(>parent_obj, n->num_queues, 4, NULL);
 
 id->vid = cpu_to_le16(pci_get_word(pci_conf + PCI_VENDOR_ID));
 id->ssvid = cpu_to_le16(pci_get_word(pci_conf + PCI_SUBSYSTEM_VENDOR_ID));
diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
index 846e903..bf57e63 100644
--- a/hw/misc/ivshmem.c
+++ b/hw/misc/ivshmem.c
@@ -749,13 +749,13 @@ static void ivshmem_reset(DeviceState *d)
 }
 }
 
-static int ivshmem_setup_interrupts(IVShmemState *s)
+static int ivshmem_setup_interrupts(IVShmemState *s, Error **errp)
 {
 /* allocate QEMU callback data for receiving interrupts */
 s->msi_vectors = g_malloc0(s->vectors * sizeof(MSIVector));
 
 if (ivshmem_has_feature(s, IVSHMEM_MSI)) {
-if (msix_init_exclusive_bar(PCI_DEVICE(s), s->vectors, 1)) {
+if (msix_init_exclusive_bar(PCI_DEVICE(s), s->vectors, 1, errp)) {
 return -1;
 }
 
@@ -898,8 +898,8 @@ static void ivshmem_common_realize(PCIDevice *dev, Error 
**errp)
 qemu_chr_fe_set_handlers(>server_chr, ivshmem_can_receive,
  ivshmem_read, NULL, s, NULL, true);
 
-if (ivshmem_setup_interrupts(s) < 0) {
-error_setg(errp, "failed to initialize interrupts");
+if (ivshmem_setup_interrupts(s, errp) < 0) {
+error_prepend(errp, "Failed to initialize interrupts: ");
 return;
 }
 }
diff --git a/hw/net/e1000e.c b/hw/net/e1000e.c
index 0e9a25b..b0f429b 100644
--- a/hw/net/e1000e.c
+++ b/hw/net/e1000e.c
@@ -292,7 +292,7 @@ e1000e_init_msix(E1000EState *s)
 E1000E_MSIX_IDX, E1000E_MSIX_TABLE,
 >msix,
 E1000E_MSIX_IDX, E1000E_MSIX_PBA,
-0xA0);
+0xA0, NULL);
 
 if (res < 0) {
 trace_e1000e_msix_init_fail(res);
diff --git a/hw/net/rocker/rocker.c b/hw/net/rocker/rocker.c
index e9d215a..6e70fdd 100644
--- a/hw/net/rocker/rocker.c
+++ b/hw/net/rocker/rocker.c
@@ -1256,14 +1256,16 @@ static int rocker_msix_init(Rocker *r)
 {
 PCIDevice *dev

[Qemu-devel] [PULL v5 20/22] hw/pci: disable pci-bridge's shpc by default

2017-01-31 Thread Michael S. Tsirkin

From: Marcel Apfelbaum 

The shpc component is optional while  ACPI hotplug is used
for hot-plugging PCI devices into a PCI-PCI bridge.
Disabling the shpc by default will make slot 0 usable at boot time
and not only for hot-plug, without loosing any functionality.
Older machines will have shpc enabled for compatibility reasons.

Signed-off-by: Marcel Apfelbaum 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/hw/compat.h| 4 
 hw/pci-bridge/pci_bridge_dev.c | 2 +-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/include/hw/compat.h b/include/hw/compat.h
index ee0dd1b..b7db438 100644
--- a/include/hw/compat.h
+++ b/include/hw/compat.h
@@ -14,6 +14,10 @@
 .driver   = "pflash_cfi01",\
 .property = "old-multiple-chip-handling",\
 .value= "on",\
+},{\
+.driver   = "pci-bridge",\
+.property = "shpc",\
+.value= "on",\
 },
 
 #define HW_COMPAT_2_7 \
diff --git a/hw/pci-bridge/pci_bridge_dev.c b/hw/pci-bridge/pci_bridge_dev.c
index 5dbd933..647ad80 100644
--- a/hw/pci-bridge/pci_bridge_dev.c
+++ b/hw/pci-bridge/pci_bridge_dev.c
@@ -163,7 +163,7 @@ static Property pci_bridge_dev_properties[] = {
 DEFINE_PROP_ON_OFF_AUTO(PCI_BRIDGE_DEV_PROP_MSI, PCIBridgeDev, msi,
 ON_OFF_AUTO_AUTO),
 DEFINE_PROP_BIT(PCI_BRIDGE_DEV_PROP_SHPC, PCIBridgeDev, flags,
-PCI_BRIDGE_DEV_F_SHPC_REQ, true),
+PCI_BRIDGE_DEV_F_SHPC_REQ, false),
 DEFINE_PROP_END_OF_LIST(),
 };
 
-- 
MST

[Qemu-devel] [PULL v5 17/22] virtio: make virtio_should_notify static

2017-01-31 Thread Michael S. Tsirkin

From: Paolo Bonzini 

Signed-off-by: Paolo Bonzini 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Stefan Hajnoczi 
---
 include/hw/virtio/virtio.h | 1 -
 hw/virtio/virtio.c | 2 +-
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index 6523bac..525da24 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -182,7 +182,6 @@ void virtqueue_get_avail_bytes(VirtQueue *vq, unsigned int 
*in_bytes,
unsigned int *out_bytes,
unsigned max_in_bytes, unsigned max_out_bytes);
 
-bool virtio_should_notify(VirtIODevice *vdev, VirtQueue *vq);
 void virtio_notify_irqfd(VirtIODevice *vdev, VirtQueue *vq);
 void virtio_notify(VirtIODevice *vdev, VirtQueue *vq);
 
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index f292a53..6365706 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -1383,7 +1383,7 @@ static void virtio_set_isr(VirtIODevice *vdev, int value)
 }
 }
 
-bool virtio_should_notify(VirtIODevice *vdev, VirtQueue *vq)
+static bool virtio_should_notify(VirtIODevice *vdev, VirtQueue *vq)
 {
 uint16_t old, new;
 bool v;
-- 
MST

[Qemu-devel] [PULL v5 15/22] hcd-xhci: check & correct param before using it

2017-01-31 Thread Michael S. Tsirkin

From: Cao jin 

usb_xhci_realize() corrects invalid values of property "intrs"
automatically, but the uncorrected value is passed to msi_init(),
which chokes on invalid values.  Delay that until after the
correction.

Resources allocated by usb_xhci_init() are leaked when msi_init()
fails.  Fix by calling it after msi_init().

CC: Gerd Hoffmann 
CC: Markus Armbruster 
CC: Marcel Apfelbaum 
CC: Michael S. Tsirkin 

Reviewed-by: Markus Armbruster 
Acked-by: Marcel Apfelbaum 
Signed-off-by: Cao jin 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/usb/hcd-xhci.c | 37 ++---
 1 file changed, 18 insertions(+), 19 deletions(-)

diff --git a/hw/usb/hcd-xhci.c b/hw/usb/hcd-xhci.c
index e0b5169..6575d05 100644
--- a/hw/usb/hcd-xhci.c
+++ b/hw/usb/hcd-xhci.c
@@ -3627,25 +3627,6 @@ static void usb_xhci_realize(struct PCIDevice *dev, 
Error **errp)
 dev->config[PCI_CACHE_LINE_SIZE] = 0x10;
 dev->config[0x60] = 0x30; /* release number */
 
-usb_xhci_init(xhci);
-
-if (xhci->msi != ON_OFF_AUTO_OFF) {
-ret = msi_init(dev, 0x70, xhci->numintrs, true, false, );
-/* Any error other than -ENOTSUP(board's MSI support is broken)
- * is a programming error */
-assert(!ret || ret == -ENOTSUP);
-if (ret && xhci->msi == ON_OFF_AUTO_ON) {
-/* Can't satisfy user's explicit msi=on request, fail */
-error_append_hint(, "You have to use msi=auto (default) or "
-"msi=off with this machine type.\n");
-error_propagate(errp, err);
-return;
-}
-assert(!err || xhci->msi == ON_OFF_AUTO_AUTO);
-/* With msi=auto, we fall back to MSI off silently */
-error_free(err);
-}
-
 if (xhci->numintrs > MAXINTRS) {
 xhci->numintrs = MAXINTRS;
 }
@@ -3667,6 +3648,24 @@ static void usb_xhci_realize(struct PCIDevice *dev, 
Error **errp)
 xhci->max_pstreams_mask = 0;
 }
 
+if (xhci->msi != ON_OFF_AUTO_OFF) {
+ret = msi_init(dev, 0x70, xhci->numintrs, true, false, );
+/* Any error other than -ENOTSUP(board's MSI support is broken)
+ * is a programming error */
+assert(!ret || ret == -ENOTSUP);
+if (ret && xhci->msi == ON_OFF_AUTO_ON) {
+/* Can't satisfy user's explicit msi=on request, fail */
+error_append_hint(, "You have to use msi=auto (default) or "
+"msi=off with this machine type.\n");
+error_propagate(errp, err);
+return;
+}
+assert(!err || xhci->msi == ON_OFF_AUTO_AUTO);
+/* With msi=auto, we fall back to MSI off silently */
+error_free(err);
+}
+
+usb_xhci_init(xhci);
 xhci->mfwrap_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, xhci_mfwrap_timer, 
xhci);
 
 memory_region_init(>mem, OBJECT(xhci), "xhci", LEN_REGS);
-- 
MST

[Qemu-devel] [PULL v5 08/22] pci: mark ROMs read-only

2017-01-31 Thread Michael S. Tsirkin

Looks like we didn't mark PCI ROMs as RO allowing
mischief such as guests writing there.
Further, e.g. vhost gets confused trying to allocate
enough space to log writes there. Fix it up.

Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Marcel Apfelbaum 
Tested-by: Laurent Vivier 
---
 hw/pci/pci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 47ca3af..a563555 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2195,7 +2195,7 @@ static void pci_add_option_rom(PCIDevice *pdev, bool 
is_default_rom,
 snprintf(name, sizeof(name), "%s.rom", 
object_get_typename(OBJECT(pdev)));
 }
 pdev->has_rom = true;
-memory_region_init_ram(>rom, OBJECT(pdev), name, size, _fatal);
+memory_region_init_rom(>rom, OBJECT(pdev), name, size, _fatal);
 vmstate_register_ram(>rom, >qdev);
 ptr = memory_region_get_ram_ptr(>rom);
 load_image(path, ptr);
-- 
MST

[Qemu-devel] [PULL v5 14/22] msix: Follow CODING_STYLE

2017-01-31 Thread Michael S. Tsirkin

From: Cao jin 

CC: Markus Armbruster 
CC: Marcel Apfelbaum 
CC: Michael S. Tsirkin 

Reviewed-by: Markus Armbruster 
Acked-by: Marcel Apfelbaum 
Signed-off-by: Cao jin 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/pci/msix.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/hw/pci/msix.c b/hw/pci/msix.c
index ee1714d..c938a9b 100644
--- a/hw/pci/msix.c
+++ b/hw/pci/msix.c
@@ -447,8 +447,10 @@ void msix_notify(PCIDevice *dev, unsigned vector)
 {
 MSIMessage msg;
 
-if (vector >= dev->msix_entries_nr || !dev->msix_entry_used[vector])
+if (vector >= dev->msix_entries_nr || !dev->msix_entry_used[vector]) {
 return;
+}
+
 if (msix_is_masked(dev, vector)) {
 msix_set_pending(dev, vector);
 return;
@@ -483,8 +485,10 @@ void msix_reset(PCIDevice *dev)
 /* Mark vector as used. */
 int msix_vector_use(PCIDevice *dev, unsigned vector)
 {
-if (vector >= dev->msix_entries_nr)
+if (vector >= dev->msix_entries_nr) {
 return -EINVAL;
+}
+
 dev->msix_entry_used[vector]++;
 return 0;
 }
-- 
MST

Re: [Qemu-devel] [PATCH] q35: Provide improved sample configurations

2017-01-31 Thread Gerd Hoffmann

  Hi,

> Both configuration file are full commented, neatly
> organized, and use paravirtualized devices instead of
> emulated devices whenever possible for a better user
> experience. Moreover, they follow the PCI Express
> Guidelines (docs/pcie.txt) for their topology.
> ---
> I plan to provide similar sample configurations for
> aarch64/virt guests once the generic PCIe Root Ports
> have been merged.
> 
>  docs/q35-chipset.cfg   | 152 

Please leave q35-chipset.cfg there.

The purpose of q35-chipset.cfg is to document how you can build a
virtual machine which comes as close as possible to physical hardware
with q35 northbridge and ich9 southbridge.  So it is pretty much
fixed ;)  Maybe it makes sense to add a comment at the top clarifying
this.  The (currently commented out) pcie switch configuration can be
dropped or moved to another file.  And maybe we can add the ethernet
device meanwhile (has the e1000e emulation a ich9 device?).

Adding another sample config focusing on pcie and good performance is
perfectly fine of course.

>  docs/q35-graphical.cfg | 154 
> +
>  docs/q35-serial.cfg| 110 +++

Note that you can specify -readconfig multiple times, so you can split
out common stuff and ask people to run "qemu -readconfig
q35-paravirt-base.cfg -readconfig q35-paravirt-$style.cfg"

> +# PCIe controllers
> +# =
> +#
> +# We create four PCIe Root Ports: the first three are used
> +# by devices defined below, while the last one is left
> +# unused so that one more device can be hotplugged.
> +
> +[device "pci.1"]
> +  driver = "ioh3420"
> +  bus = "pcie.0"
> +  addr = "0x2"
> +  port = "0x10"
> +  chassis = "1"

I'd suggest to create 8 pcie root ports, as multifunction device in a
single slot, following pcie.txt suggestions.

And I'd pick slot 1c for that, simply because the pcie root ports are in
that slot on physical hardware, but that is just cosmetic and a matter
of taste.

> +[device "usb"]
> +  driver = "nec-usb-xhci"
> +  bus = "pci.3"
> +  addr = "0x0"

Good, needs promotion ;)

> +[device "keyboard"]
> +  driver = "usb-kbd"
> +  bus = "usb.0"
> +  port = "1"

We have a ps/2 keyboard anyway, so this is pointless (on x86, arm is a
different story of course).

> +[device "tablet"]
> +  driver = "usb-tablet"
> +  bus = "usb.0"
> +  port = "2"

There is also virtio-tablet, but for a generic config usb is probably
the better choice as virtio-tablet is supported by modern linux only (on
x86, on arm we have only modern linux anyway so picking virtio-tablet
should be fine).

> +# Video card
> +# =
> +#
> +# We plug the QXL video card directly into the PCIe Root
> +# Bus as it is a legacy PCI device; this way, we can reduce
> +# the number of PCIe controllers in the guest.
> +
> +[device "video"]
> +  driver = "qxl-vga"
> +  bus = "pcie.0"
> +  addr = "0x1"

Note that you can put multiple devices into a single root port, using
multifunction.  I have a guest here which looks like this:

root@localhost ~# lspci -vt
-[:00]-+-00.0  Intel Corporation 82G33/G31/P35/P31 Express DRAM
Controller
   +-02.0  Device 1234:
   +-1b.0  Intel Corporation 82801I (ICH9 Family) HD Audio
Controller
   +-1c.0-[01]00.0  Red Hat, Inc Virtio SCSI
   +-1c.1-[02]00.0  Red Hat, Inc Virtio network device
   +-1c.2-[03]--
   +-1c.3-[04]--
   +-1c.4-[05]--+-00.0  Red Hat, Inc Virtio console
   |+-00.1  Red Hat, Inc Virtio memory balloon
   |\-00.2  Red Hat, Inc Virtio input
   +-1c.5-[06]00.0  NEC Corporation uPD720200 USB 3.0 Host
Controller
   +-1c.6-[07]--
   +-1c.7-[08]--
   +-1e.0-[09-0a]00.0-[0a]--
   +-1f.0  Intel Corporation 82801IB (ICH9) LPC Interface
Controller
   +-1f.2  Intel Corporation 82801IR/IO/IH (ICH9R/DO/DH) 6 port
SATA Controller [AHCI mode]
   \-1f.3  Intel Corporation 82801I (ICH9 Family) SMBus
Controller

cheers,
  Gerd

[Qemu-devel] [PULL v5 10/22] hw/pcie: Introduce a base class for PCI Express Root Ports

2017-01-31 Thread Michael S. Tsirkin

From: Marcel Apfelbaum 

The 'base' PCI Express Root Port includes
the common code to be re-used for all
Root Ports implementations. Most of the code
was taken from the current implementation
of Intel's IOH 3420 Root Port.

Signed-off-by: Marcel Apfelbaum 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 default-configs/arm-softmmu.mak|   1 +
 default-configs/i386-softmmu.mak   |   1 +
 default-configs/x86_64-softmmu.mak |   1 +
 include/hw/pci/pcie_port.h |  19 +
 hw/pci-bridge/pcie_root_port.c | 171 +
 hw/pci-bridge/Makefile.objs|   1 +
 6 files changed, 194 insertions(+)
 create mode 100644 hw/pci-bridge/pcie_root_port.c

diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
index 6de3e16..6f2a180 100644
--- a/default-configs/arm-softmmu.mak
+++ b/default-configs/arm-softmmu.mak
@@ -108,6 +108,7 @@ CONFIG_FSL_IMX25=y
 
 CONFIG_IMX_I2C=y
 
+CONFIG_PCIE_PORT=y
 CONFIG_XIO3130=y
 CONFIG_IOH3420=y
 CONFIG_I82801B11=y
diff --git a/default-configs/i386-softmmu.mak b/default-configs/i386-softmmu.mak
index 0b51360..9288838 100644
--- a/default-configs/i386-softmmu.mak
+++ b/default-configs/i386-softmmu.mak
@@ -51,6 +51,7 @@ CONFIG_PVPANIC=y
 CONFIG_MEM_HOTPLUG=y
 CONFIG_NVDIMM=y
 CONFIG_ACPI_NVDIMM=y
+CONFIG_PCIE_PORT=y
 CONFIG_XIO3130=y
 CONFIG_IOH3420=y
 CONFIG_I82801B11=y
diff --git a/default-configs/x86_64-softmmu.mak 
b/default-configs/x86_64-softmmu.mak
index 7f89503..7d2c2d4 100644
--- a/default-configs/x86_64-softmmu.mak
+++ b/default-configs/x86_64-softmmu.mak
@@ -51,6 +51,7 @@ CONFIG_PVPANIC=y
 CONFIG_MEM_HOTPLUG=y
 CONFIG_NVDIMM=y
 CONFIG_ACPI_NVDIMM=y
+CONFIG_PCIE_PORT=y
 CONFIG_XIO3130=y
 CONFIG_IOH3420=y
 CONFIG_I82801B11=y
diff --git a/include/hw/pci/pcie_port.h b/include/hw/pci/pcie_port.h
index f7b64db..1333266 100644
--- a/include/hw/pci/pcie_port.h
+++ b/include/hw/pci/pcie_port.h
@@ -57,4 +57,23 @@ PCIESlot *pcie_chassis_find_slot(uint8_t chassis, uint16_t 
slot);
 int pcie_chassis_add_slot(struct PCIESlot *slot);
 void pcie_chassis_del_slot(PCIESlot *s);
 
+#define TYPE_PCIE_ROOT_PORT "pcie-root-port-base"
+#define PCIE_ROOT_PORT_CLASS(klass) \
+ OBJECT_CLASS_CHECK(PCIERootPortClass, (klass), TYPE_PCIE_ROOT_PORT)
+#define PCIE_ROOT_PORT_GET_CLASS(obj) \
+ OBJECT_GET_CLASS(PCIERootPortClass, (obj), TYPE_PCIE_ROOT_PORT)
+
+typedef struct PCIERootPortClass {
+PCIDeviceClass parent_class;
+
+uint8_t (*aer_vector)(const PCIDevice *dev);
+int (*interrupts_init)(PCIDevice *dev, Error **errp);
+void (*interrupts_uninit)(PCIDevice *dev);
+
+int exp_offset;
+int aer_offset;
+int ssvid_offset;
+int ssid;
+} PCIERootPortClass;
+
 #endif /* QEMU_PCIE_PORT_H */
diff --git a/hw/pci-bridge/pcie_root_port.c b/hw/pci-bridge/pcie_root_port.c
new file mode 100644
index 000..cf36318
--- /dev/null
+++ b/hw/pci-bridge/pcie_root_port.c
@@ -0,0 +1,171 @@
+/*
+ * Base class for PCI Express Root Ports
+ *
+ * Copyright (C) 2017 Red Hat Inc
+ *
+ * Authors:
+ *   Marcel Apfelbaum 
+ *
+ * Most of the code was migrated from hw/pci-bridge/ioh3420.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "hw/pci/pcie_port.h"
+
+static void rp_aer_vector_update(PCIDevice *d)
+{
+PCIERootPortClass *rpc = PCIE_ROOT_PORT_GET_CLASS(d);
+
+if (rpc->aer_vector) {
+pcie_aer_root_set_vector(d, rpc->aer_vector(d));
+}
+}
+
+static void rp_write_config(PCIDevice *d, uint32_t address,
+uint32_t val, int len)
+{
+uint32_t root_cmd =
+pci_get_long(d->config + d->exp.aer_cap + PCI_ERR_ROOT_COMMAND);
+
+pci_bridge_write_config(d, address, val, len);
+rp_aer_vector_update(d);
+pcie_cap_slot_write_config(d, address, val, len);
+pcie_aer_write_config(d, address, val, len);
+pcie_aer_root_write_config(d, address, val, len, root_cmd);
+}
+
+static void rp_reset(DeviceState *qdev)
+{
+PCIDevice *d = PCI_DEVICE(qdev);
+
+rp_aer_vector_update(d);
+pcie_cap_root_reset(d);
+pcie_cap_deverr_reset(d);
+pcie_cap_slot_reset(d);
+pcie_cap_arifwd_reset(d);
+pcie_aer_root_reset(d);
+pci_bridge_reset(qdev);
+pci_bridge_disable_base_limit(d);
+}
+
+static void rp_realize(PCIDevice *d, Error **errp)
+{
+PCIEPort *p = PCIE_PORT(d);
+PCIESlot *s = PCIE_SLOT(d);
+PCIDeviceClass *dc = PCI_DEVICE_GET_CLASS(d);
+PCIERootPortClass *rpc = PCIE_ROOT_PORT_GET_CLASS(d);
+int rc;
+Error *local_err = NULL;
+
+pci_config_set_interrupt_pin(d->config, 1);
+pci_bridge_initfn(d, TYPE_PCIE_BUS);
+pcie_port_init_reg(d);
+
+rc = pci_bridge_ssvid_init(d, rpc->ssvid_offset, dc->vendor_id, rpc->ssid);
+if (rc < 0) {
+

[Qemu-devel] [PULL v5 22/22] arm: add trailing ; after MISMATCH_CHECK

2017-01-31 Thread Michael S. Tsirkin

Macro calls without a trailing ; look weird in C, this works as a side
effect of how QEMU_BUILD_BUG_ON is implemented. Fix this up.

Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Peter Maydell 
---
 target/arm/kvm-consts.h | 98 -
 1 file changed, 49 insertions(+), 49 deletions(-)

diff --git a/target/arm/kvm-consts.h b/target/arm/kvm-consts.h
index 06b6c92..aad2825 100644
--- a/target/arm/kvm-consts.h
+++ b/target/arm/kvm-consts.h
@@ -33,12 +33,12 @@
 #define CP_REG_ARM 0x4000ULL
 #define CP_REG_ARCH_MASK   0xff00ULL
 
-MISMATCH_CHECK(CP_REG_SIZE_SHIFT, KVM_REG_SIZE_SHIFT)
-MISMATCH_CHECK(CP_REG_SIZE_MASK, KVM_REG_SIZE_MASK)
-MISMATCH_CHECK(CP_REG_SIZE_U32, KVM_REG_SIZE_U32)
-MISMATCH_CHECK(CP_REG_SIZE_U64, KVM_REG_SIZE_U64)
-MISMATCH_CHECK(CP_REG_ARM, KVM_REG_ARM)
-MISMATCH_CHECK(CP_REG_ARCH_MASK, KVM_REG_ARCH_MASK)
+MISMATCH_CHECK(CP_REG_SIZE_SHIFT, KVM_REG_SIZE_SHIFT);
+MISMATCH_CHECK(CP_REG_SIZE_MASK, KVM_REG_SIZE_MASK);
+MISMATCH_CHECK(CP_REG_SIZE_U32, KVM_REG_SIZE_U32);
+MISMATCH_CHECK(CP_REG_SIZE_U64, KVM_REG_SIZE_U64);
+MISMATCH_CHECK(CP_REG_ARM, KVM_REG_ARM);
+MISMATCH_CHECK(CP_REG_ARCH_MASK, KVM_REG_ARCH_MASK);
 
 #define QEMU_PSCI_0_1_FN_BASE 0x95c1ba5e
 #define QEMU_PSCI_0_1_FN(n) (QEMU_PSCI_0_1_FN_BASE + (n))
@@ -47,10 +47,10 @@ MISMATCH_CHECK(CP_REG_ARCH_MASK, KVM_REG_ARCH_MASK)
 #define QEMU_PSCI_0_1_FN_CPU_ON QEMU_PSCI_0_1_FN(2)
 #define QEMU_PSCI_0_1_FN_MIGRATE QEMU_PSCI_0_1_FN(3)
 
-MISMATCH_CHECK(QEMU_PSCI_0_1_FN_CPU_SUSPEND, KVM_PSCI_FN_CPU_SUSPEND)
-MISMATCH_CHECK(QEMU_PSCI_0_1_FN_CPU_OFF, KVM_PSCI_FN_CPU_OFF)
-MISMATCH_CHECK(QEMU_PSCI_0_1_FN_CPU_ON, KVM_PSCI_FN_CPU_ON)
-MISMATCH_CHECK(QEMU_PSCI_0_1_FN_MIGRATE, KVM_PSCI_FN_MIGRATE)
+MISMATCH_CHECK(QEMU_PSCI_0_1_FN_CPU_SUSPEND, KVM_PSCI_FN_CPU_SUSPEND);
+MISMATCH_CHECK(QEMU_PSCI_0_1_FN_CPU_OFF, KVM_PSCI_FN_CPU_OFF);
+MISMATCH_CHECK(QEMU_PSCI_0_1_FN_CPU_ON, KVM_PSCI_FN_CPU_ON);
+MISMATCH_CHECK(QEMU_PSCI_0_1_FN_MIGRATE, KVM_PSCI_FN_MIGRATE);
 
 #define QEMU_PSCI_0_2_FN_BASE 0x8400
 #define QEMU_PSCI_0_2_FN(n) (QEMU_PSCI_0_2_FN_BASE + (n))
@@ -77,13 +77,13 @@ MISMATCH_CHECK(QEMU_PSCI_0_1_FN_MIGRATE, 
KVM_PSCI_FN_MIGRATE)
 #define QEMU_PSCI_0_2_FN64_AFFINITY_INFO QEMU_PSCI_0_2_FN64(4)
 #define QEMU_PSCI_0_2_FN64_MIGRATE QEMU_PSCI_0_2_FN64(5)
 
-MISMATCH_CHECK(QEMU_PSCI_0_2_FN_CPU_SUSPEND, PSCI_0_2_FN_CPU_SUSPEND)
-MISMATCH_CHECK(QEMU_PSCI_0_2_FN_CPU_OFF, PSCI_0_2_FN_CPU_OFF)
-MISMATCH_CHECK(QEMU_PSCI_0_2_FN_CPU_ON, PSCI_0_2_FN_CPU_ON)
-MISMATCH_CHECK(QEMU_PSCI_0_2_FN_MIGRATE, PSCI_0_2_FN_MIGRATE)
-MISMATCH_CHECK(QEMU_PSCI_0_2_FN64_CPU_SUSPEND, PSCI_0_2_FN64_CPU_SUSPEND)
-MISMATCH_CHECK(QEMU_PSCI_0_2_FN64_CPU_ON, PSCI_0_2_FN64_CPU_ON)
-MISMATCH_CHECK(QEMU_PSCI_0_2_FN64_MIGRATE, PSCI_0_2_FN64_MIGRATE)
+MISMATCH_CHECK(QEMU_PSCI_0_2_FN_CPU_SUSPEND, PSCI_0_2_FN_CPU_SUSPEND);
+MISMATCH_CHECK(QEMU_PSCI_0_2_FN_CPU_OFF, PSCI_0_2_FN_CPU_OFF);
+MISMATCH_CHECK(QEMU_PSCI_0_2_FN_CPU_ON, PSCI_0_2_FN_CPU_ON);
+MISMATCH_CHECK(QEMU_PSCI_0_2_FN_MIGRATE, PSCI_0_2_FN_MIGRATE);
+MISMATCH_CHECK(QEMU_PSCI_0_2_FN64_CPU_SUSPEND, PSCI_0_2_FN64_CPU_SUSPEND);
+MISMATCH_CHECK(QEMU_PSCI_0_2_FN64_CPU_ON, PSCI_0_2_FN64_CPU_ON);
+MISMATCH_CHECK(QEMU_PSCI_0_2_FN64_MIGRATE, PSCI_0_2_FN64_MIGRATE);
 
 /* PSCI v0.2 return values used by TCG emulation of PSCI */
 
@@ -93,9 +93,9 @@ MISMATCH_CHECK(QEMU_PSCI_0_2_FN64_MIGRATE, 
PSCI_0_2_FN64_MIGRATE)
 /* We implement version 0.2 only */
 #define QEMU_PSCI_0_2_RET_VERSION_0_2   2
 
-MISMATCH_CHECK(QEMU_PSCI_0_2_RET_TOS_MIGRATION_NOT_REQUIRED, PSCI_0_2_TOS_MP)
+MISMATCH_CHECK(QEMU_PSCI_0_2_RET_TOS_MIGRATION_NOT_REQUIRED, PSCI_0_2_TOS_MP);
 MISMATCH_CHECK(QEMU_PSCI_0_2_RET_VERSION_0_2,
-   (PSCI_VERSION_MAJOR(0) | PSCI_VERSION_MINOR(2)))
+   (PSCI_VERSION_MAJOR(0) | PSCI_VERSION_MINOR(2)));
 
 /* PSCI return values (inclusive of all PSCI versions) */
 #define QEMU_PSCI_RET_SUCCESS 0
@@ -108,15 +108,15 @@ MISMATCH_CHECK(QEMU_PSCI_0_2_RET_VERSION_0_2,
 #define QEMU_PSCI_RET_NOT_PRESENT -7
 #define QEMU_PSCI_RET_DISABLED-8
 
-MISMATCH_CHECK(QEMU_PSCI_RET_SUCCESS, PSCI_RET_SUCCESS)
-MISMATCH_CHECK(QEMU_PSCI_RET_NOT_SUPPORTED, PSCI_RET_NOT_SUPPORTED)
-MISMATCH_CHECK(QEMU_PSCI_RET_INVALID_PARAMS, PSCI_RET_INVALID_PARAMS)
-MISMATCH_CHECK(QEMU_PSCI_RET_DENIED, PSCI_RET_DENIED)
-MISMATCH_CHECK(QEMU_PSCI_RET_ALREADY_ON, PSCI_RET_ALREADY_ON)
-MISMATCH_CHECK(QEMU_PSCI_RET_ON_PENDING, PSCI_RET_ON_PENDING)
-MISMATCH_CHECK(QEMU_PSCI_RET_INTERNAL_FAILURE, PSCI_RET_INTERNAL_FAILURE)
-MISMATCH_CHECK(QEMU_PSCI_RET_NOT_PRESENT, PSCI_RET_NOT_PRESENT)
-MISMATCH_CHECK(QEMU_PSCI_RET_DISABLED, PSCI_RET_DISABLED)
+MISMATCH_CHECK(QEMU_PSCI_RET_SUCCESS, PSCI_RET_SUCCESS);
+MISMATCH_CHECK(QEMU_PSCI_RET_NOT_SUPPORTED, PSCI_RET_NOT_SUPPORTED);
+MISMATCH_CHECK(QEMU_PSCI_RET_INVALID_PARAMS, PSCI_RET_INVALID_PARAMS);

[Qemu-devel] [PULL v5 06/22] compiler: expression version of QEMU_BUILD_BUG_ON

2017-01-31 Thread Michael S. Tsirkin

QEMU_BUILD_BUG_ON uses a typedef in order to be safe
to use outside functions, but sometimes it's useful
to have a version that can be used within an expression.
Following what Linux does, introduce QEMU_BUILD_BUG_ON_ZERO
that return zero after checking condition at build time.

Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Eric Blake 
Reviewed-by: Markus Armbruster 
---
 include/qemu/compiler.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/qemu/compiler.h b/include/qemu/compiler.h
index 057639a..e0ce9ff 100644
--- a/include/qemu/compiler.h
+++ b/include/qemu/compiler.h
@@ -97,6 +97,9 @@
 #define QEMU_BUILD_BUG_ON(x)
 #endif
 
+#define QEMU_BUILD_BUG_ON_ZERO(x) (sizeof(QEMU_BUILD_BUG_ON_STRUCT(x)) - \
+   sizeof(QEMU_BUILD_BUG_ON_STRUCT(x)))
+
 #if defined __GNUC__
 # if !QEMU_GNUC_PREREQ(4, 4)
/* gcc versions before 4.4.x don't support gnu_printf, so use printf. */
-- 
MST

[Qemu-devel] [PULL v5 05/22] compiler: rework BUG_ON using a struct

2017-01-31 Thread Michael S. Tsirkin

There are theoretical concerns that some compilers might not trigger
build failures on attempts to define an array of size (x ? -1 : 1) where
x is a variable and make it a variable sized array instead. Let rewrite
using a struct with a negative bit field size instead as there are no
dynamic bit field sizes.  This is similar to what Linux does.

Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Eric Blake 
Reviewed-by: Markus Armbruster 
---
 include/qemu/compiler.h | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/include/qemu/compiler.h b/include/qemu/compiler.h
index 77b9ce3..057639a 100644
--- a/include/qemu/compiler.h
+++ b/include/qemu/compiler.h
@@ -85,10 +85,14 @@
 #define typeof_field(type, field) typeof(((type *)0)->field)
 #define type_check(t1,t2) ((t1*)0 - (t2*)0)
 
+#define QEMU_BUILD_BUG_ON_STRUCT(x) \
+struct { \
+int:(x) ? -1 : 1; \
+}
+
 #ifdef __COUNTER__
-#define QEMU_BUILD_BUG_ON(x) \
-typedef char glue(qemu_build_bug_on__, __COUNTER__)[(x) ? -1 : 1] \
-__attribute__((unused))
+#define QEMU_BUILD_BUG_ON(x) typedef QEMU_BUILD_BUG_ON_STRUCT(x) \
+glue(qemu_build_bug_on__, __COUNTER__) __attribute__((unused))
 #else
 #define QEMU_BUILD_BUG_ON(x)
 #endif
-- 
MST

[Qemu-devel] [PULL v5 11/22] hw/ioh3420: derive from PCI Express Root Port base class

2017-01-31 Thread Michael S. Tsirkin

From: Marcel Apfelbaum 

Preserve only Intel specific details.

Signed-off-by: Marcel Apfelbaum 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/pci-bridge/ioh3420.c | 121 ++--
 1 file changed, 15 insertions(+), 106 deletions(-)

diff --git a/hw/pci-bridge/ioh3420.c b/hw/pci-bridge/ioh3420.c
index 0eef87a..da4e5bd 100644
--- a/hw/pci-bridge/ioh3420.c
+++ b/hw/pci-bridge/ioh3420.c
@@ -61,119 +61,28 @@ static uint8_t ioh3420_aer_vector(const PCIDevice *d)
 return 0;
 }
 
-static void ioh3420_aer_vector_update(PCIDevice *d)
+static int ioh3420_interrupts_init(PCIDevice *d, Error **errp)
 {
-pcie_aer_root_set_vector(d, ioh3420_aer_vector(d));
-}
-
-static void ioh3420_write_config(PCIDevice *d,
-   uint32_t address, uint32_t val, int len)
-{
-uint32_t root_cmd =
-pci_get_long(d->config + d->exp.aer_cap + PCI_ERR_ROOT_COMMAND);
-
-pci_bridge_write_config(d, address, val, len);
-ioh3420_aer_vector_update(d);
-pcie_cap_slot_write_config(d, address, val, len);
-pcie_aer_write_config(d, address, val, len);
-pcie_aer_root_write_config(d, address, val, len, root_cmd);
-}
-
-static void ioh3420_reset(DeviceState *qdev)
-{
-PCIDevice *d = PCI_DEVICE(qdev);
-
-ioh3420_aer_vector_update(d);
-pcie_cap_root_reset(d);
-pcie_cap_deverr_reset(d);
-pcie_cap_slot_reset(d);
-pcie_cap_arifwd_reset(d);
-pcie_aer_root_reset(d);
-pci_bridge_reset(qdev);
-pci_bridge_disable_base_limit(d);
-}
-
-static int ioh3420_initfn(PCIDevice *d)
-{
-PCIEPort *p = PCIE_PORT(d);
-PCIESlot *s = PCIE_SLOT(d);
 int rc;
-Error *err = NULL;
-
-pci_config_set_interrupt_pin(d->config, 1);
-pci_bridge_initfn(d, TYPE_PCIE_BUS);
-pcie_port_init_reg(d);
-
-rc = pci_bridge_ssvid_init(d, IOH_EP_SSVID_OFFSET,
-   IOH_EP_SSVID_SVID, IOH_EP_SSVID_SSID);
-if (rc < 0) {
-goto err_bridge;
-}
+Error *local_err = NULL;
 
 rc = msi_init(d, IOH_EP_MSI_OFFSET, IOH_EP_MSI_NR_VECTOR,
   IOH_EP_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_64BIT,
-  IOH_EP_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_MASKBIT, );
+  IOH_EP_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_MASKBIT,
+  _err);
 if (rc < 0) {
 assert(rc == -ENOTSUP);
-error_report_err(err);
-goto err_bridge;
+error_propagate(errp, local_err);
 }
 
-rc = pcie_cap_init(d, IOH_EP_EXP_OFFSET, PCI_EXP_TYPE_ROOT_PORT, p->port);
-if (rc < 0) {
-goto err_msi;
-}
-
-pcie_cap_arifwd_init(d);
-pcie_cap_deverr_init(d);
-pcie_cap_slot_init(d, s->slot);
-pcie_cap_root_init(d);
-
-pcie_chassis_create(s->chassis);
-rc = pcie_chassis_add_slot(s);
-if (rc < 0) {
-goto err_pcie_cap;
-}
-
-rc = pcie_aer_init(d, PCI_ERR_VER, IOH_EP_AER_OFFSET,
-   PCI_ERR_SIZEOF, );
-if (rc < 0) {
-error_report_err(err);
-goto err;
-}
-pcie_aer_root_init(d);
-ioh3420_aer_vector_update(d);
-
-return 0;
-
-err:
-pcie_chassis_del_slot(s);
-err_pcie_cap:
-pcie_cap_exit(d);
-err_msi:
-msi_uninit(d);
-err_bridge:
-pci_bridge_exitfn(d);
 return rc;
 }
 
-static void ioh3420_exitfn(PCIDevice *d)
+static void ioh3420_interrupts_uninit(PCIDevice *d)
 {
-PCIESlot *s = PCIE_SLOT(d);
-
-pcie_aer_exit(d);
-pcie_chassis_del_slot(s);
-pcie_cap_exit(d);
 msi_uninit(d);
-pci_bridge_exitfn(d);
 }
 
-static Property ioh3420_props[] = {
-DEFINE_PROP_BIT(COMPAT_PROP_PCP, PCIDevice, cap_present,
-QEMU_PCIE_SLTCAP_PCP_BITNR, true),
-DEFINE_PROP_END_OF_LIST()
-};
-
 static const VMStateDescription vmstate_ioh3420 = {
 .name = "ioh-3240-express-root-port",
 .version_id = 1,
@@ -191,25 +100,25 @@ static void ioh3420_class_init(ObjectClass *klass, void 
*data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
 PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
+PCIERootPortClass *rpc = PCIE_ROOT_PORT_CLASS(klass);
 
-k->is_express = 1;
-k->is_bridge = 1;
-k->config_write = ioh3420_write_config;
-k->init = ioh3420_initfn;
-k->exit = ioh3420_exitfn;
 k->vendor_id = PCI_VENDOR_ID_INTEL;
 k->device_id = PCI_DEVICE_ID_IOH_EPORT;
 k->revision = PCI_DEVICE_ID_IOH_REV;
-set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
 dc->desc = "Intel IOH device id 3420 PCIE Root Port";
-dc->reset = ioh3420_reset;
 dc->vmsd = _ioh3420;
-dc->props = ioh3420_props;
+rpc->aer_vector = ioh3420_aer_vector;
+rpc->interrupts_init = ioh3420_interrupts_init;
+rpc->interrupts_uninit = ioh3420_interrupts_uninit;
+rpc->exp_offset = IOH_EP_EXP_OFFSET;
+rpc->aer_offset = IOH_EP_AER_OFFSET;
+rpc->ssvid_offset = IOH_EP_SSVID_OFFSET;
+

[Qemu-devel] [PULL v5 12/22] hw/pcie: Introduce Generic PCI Express Root Port

2017-01-31 Thread Michael S. Tsirkin

From: Marcel Apfelbaum 

The Generic Root Port behaves almost the same as the
Intel's IOH device with id 3420, without having
Intel specific attributes.

The device has two purposes:
 (1) Can be used on both X86 and ARM machines.
 (2) It will allow us to tweak the behaviour
(e.g add vendor-specific PCI capabilities)
 - something that obviously cannot be done
   on a known device.

Signed-off-by: Marcel Apfelbaum 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Tested-by: Andrea Bolognani 
---
 include/hw/pci/pci.h   |  1 +
 hw/pci-bridge/gen_pcie_root_port.c | 88 ++
 hw/pci-bridge/Makefile.objs|  2 +-
 3 files changed, 90 insertions(+), 1 deletion(-)
 create mode 100644 hw/pci-bridge/gen_pcie_root_port.c

diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index 772692f..cbc1fdf 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -96,6 +96,7 @@
 #define PCI_DEVICE_ID_REDHAT_PXB 0x0009
 #define PCI_DEVICE_ID_REDHAT_BRIDGE_SEAT 0x000a
 #define PCI_DEVICE_ID_REDHAT_PXB_PCIE0x000b
+#define PCI_DEVICE_ID_REDHAT_PCIE_RP 0x000c
 #define PCI_DEVICE_ID_REDHAT_QXL 0x0100
 
 #define FMT_PCIBUS  PRIx64
diff --git a/hw/pci-bridge/gen_pcie_root_port.c 
b/hw/pci-bridge/gen_pcie_root_port.c
new file mode 100644
index 000..185fd90
--- /dev/null
+++ b/hw/pci-bridge/gen_pcie_root_port.c
@@ -0,0 +1,88 @@
+/*
+ * Generic PCI Express Root Port emulation
+ *
+ * Copyright (C) 2017 Red Hat Inc
+ *
+ * Authors:
+ *   Marcel Apfelbaum 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "hw/pci/msix.h"
+#include "hw/pci/pcie_port.h"
+
+#define TYPE_GEN_PCIE_ROOT_PORT"pcie-root-port"
+
+#define GEN_PCIE_ROOT_PORT_AER_OFFSET   0x100
+#define GEN_PCIE_ROOT_PORT_MSIX_NR_VECTOR   1
+
+static uint8_t gen_rp_aer_vector(const PCIDevice *d)
+{
+return 0;
+}
+
+static int gen_rp_interrupts_init(PCIDevice *d, Error **errp)
+{
+int rc;
+
+rc = msix_init_exclusive_bar(d, GEN_PCIE_ROOT_PORT_MSIX_NR_VECTOR, 0);
+
+if (rc < 0) {
+assert(rc == -ENOTSUP);
+error_setg(errp, "Unable to init msix vectors");
+} else {
+msix_vector_use(d, 0);
+}
+
+return rc;
+}
+
+static void gen_rp_interrupts_uninit(PCIDevice *d)
+{
+msix_uninit_exclusive_bar(d);
+}
+
+static const VMStateDescription vmstate_rp_dev = {
+.name = "pcie-root-port",
+.version_id = 1,
+.minimum_version_id = 1,
+.post_load = pcie_cap_slot_post_load,
+.fields = (VMStateField[]) {
+VMSTATE_PCI_DEVICE(parent_obj.parent_obj.parent_obj, PCIESlot),
+VMSTATE_STRUCT(parent_obj.parent_obj.parent_obj.exp.aer_log,
+   PCIESlot, 0, vmstate_pcie_aer_log, PCIEAERLog),
+VMSTATE_END_OF_LIST()
+}
+};
+
+static void gen_rp_dev_class_init(ObjectClass *klass, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(klass);
+PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
+PCIERootPortClass *rpc = PCIE_ROOT_PORT_CLASS(klass);
+
+k->vendor_id = PCI_VENDOR_ID_REDHAT;
+k->device_id = PCI_DEVICE_ID_REDHAT_PCIE_RP;
+dc->desc = "PCI Express Root Port";
+dc->vmsd = _rp_dev;
+rpc->aer_vector = gen_rp_aer_vector;
+rpc->interrupts_init = gen_rp_interrupts_init;
+rpc->interrupts_uninit = gen_rp_interrupts_uninit;
+rpc->aer_offset = GEN_PCIE_ROOT_PORT_AER_OFFSET;
+}
+
+static const TypeInfo gen_rp_dev_info = {
+.name  = TYPE_GEN_PCIE_ROOT_PORT,
+.parent= TYPE_PCIE_ROOT_PORT,
+.class_init= gen_rp_dev_class_init,
+};
+
+ static void gen_rp_register_types(void)
+ {
+type_register_static(_rp_dev_info);
+ }
+ type_init(gen_rp_register_types)
diff --git a/hw/pci-bridge/Makefile.objs b/hw/pci-bridge/Makefile.objs
index 4f2039f..85e8e39 100644
--- a/hw/pci-bridge/Makefile.objs
+++ b/hw/pci-bridge/Makefile.objs
@@ -1,6 +1,6 @@
 common-obj-y += pci_bridge_dev.o
 common-obj-y += pci_expander_bridge.o
-common-obj-$(CONFIG_PCIE_PORT) += pcie_root_port.o
+common-obj-$(CONFIG_PCIE_PORT) += pcie_root_port.o gen_pcie_root_port.o
 common-obj-$(CONFIG_XIO3130) += xio3130_upstream.o xio3130_downstream.o
 common-obj-$(CONFIG_IOH3420) += ioh3420.o
 common-obj-$(CONFIG_I82801B11) += i82801b11.o
-- 
MST

[Qemu-devel] [PULL v5 18/22] vhost: skip ROM sections

2017-01-31 Thread Michael S. Tsirkin

vhost does not support RO protections on memory at the moment - adding
ROMs would mean that e.g. a buggy guest might change them in-memory - a
condition from which guest reset does not recover. Not nice.

We also definitely don't want to try logging writes into ROMs -
in particular guests set very high addresses for ROM BARs
so logging these writes would waste a lot of memory.

Maybe ROMs could be supported with the iotlb variant -
not sure, but there seems to be no good reason for virtio
to try to do DMA from ROM. So let's just skip ROM memory.

Suggested-by: Laurent Vivier 
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Laurent Vivier 
Tested-by: Laurent Vivier 
---
 hw/virtio/vhost.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index b124d97..febe519 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -612,7 +612,8 @@ static void vhost_set_memory(MemoryListener *listener,
 
 static bool vhost_section(MemoryRegionSection *section)
 {
-return memory_region_is_ram(section->mr);
+return memory_region_is_ram(section->mr) &&
+!memory_region_is_rom(section->mr);
 }
 
 static void vhost_begin(MemoryListener *listener)
-- 
MST

[Qemu-devel] [PULL v5 09/22] intel_iommu: fix and simplify size calculation in process_device_iotlb_desc()

2017-01-31 Thread Michael S. Tsirkin

From: Jason Wang 

We don't use 1ULL which is wrong during size calculation. Fix it, and
while at it, switch to use cto64() and adds a comments to make it
simpler and easier to be understood.

Reported-by: Paolo Bonzini 
Cc: Paolo Bonzini 
Signed-off-by: Jason Wang 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Paolo Bonzini 
---
 hw/i386/intel_iommu.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index ec62239..3270fb9 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -1485,8 +1485,16 @@ static bool 
vtd_process_device_iotlb_desc(IntelIOMMUState *s,
 goto done;
 }
 
+/* According to ATS spec table 2.4:
+ * S = 0, bits 15:12 =  range size: 4K
+ * S = 1, bits 15:12 = xxx0 range size: 8K
+ * S = 1, bits 15:12 = xx01 range size: 16K
+ * S = 1, bits 15:12 = x011 range size: 32K
+ * S = 1, bits 15:12 = 0111 range size: 64K
+ * ...
+ */
 if (size) {
-sz = 1 << (ctz64(~(addr | (VTD_PAGE_MASK_4K - 1))) + 1);
+sz = (VTD_PAGE_SIZE * 2) << cto64(addr >> VTD_PAGE_SHIFT);
 addr &= ~(sz - 1);
 } else {
 sz = VTD_PAGE_SIZE;
-- 
MST

[Qemu-devel] [PULL v5 04/22] QEMU_BUILD_BUG_ON: use COUNTER

2017-01-31 Thread Michael S. Tsirkin

Some headers use QEMU_BUILD_BUG_ON. This causes a problem
if the C file including that header happens to have
QEMU_BUILD_BUG_ON at the same line number.

Fix using a widely available extension: __COUNTER__.
If unavailable, provide a stub.

Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Markus Armbruster 
Reviewed-by: Eric Blake 
---
 include/qemu/compiler.h | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/include/qemu/compiler.h b/include/qemu/compiler.h
index 7512082..77b9ce3 100644
--- a/include/qemu/compiler.h
+++ b/include/qemu/compiler.h
@@ -85,9 +85,13 @@
 #define typeof_field(type, field) typeof(((type *)0)->field)
 #define type_check(t1,t2) ((t1*)0 - (t2*)0)
 
+#ifdef __COUNTER__
 #define QEMU_BUILD_BUG_ON(x) \
-typedef char glue(qemu_build_bug_on__, __LINE__)[(x) ? -1 : 1] \
+typedef char glue(qemu_build_bug_on__, __COUNTER__)[(x) ? -1 : 1] \
 __attribute__((unused))
+#else
+#define QEMU_BUILD_BUG_ON(x)
+#endif
 
 #if defined __GNUC__
 # if !QEMU_GNUC_PREREQ(4, 4)
-- 
MST

[Qemu-devel] [PULL v5 03/22] ppc: switch to constants within BUILD_BUG_ON

2017-01-31 Thread Michael S. Tsirkin

We are switching BUILD_BUG_ON to verify that it's parameter is a
compile-time constant, and it turns out that some gcc versions
(specifically gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609) are
not smart enough to figure it out for expressions involving local
variables. This is harmless but means that the check is ineffective for
these platforms.  To fix, replace the variable with macros.

Reported-by: Peter Maydell 
Signed-off-by: Michael S. Tsirkin 
---
 hw/ppc/spapr.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index a642e66..b81f2ac 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2630,8 +2630,8 @@ static void spapr_phb_placement(sPAPRMachineState *spapr, 
uint32_t index,
  * 1TiB 64-bit MMIO windows for each PHB.
  */
 const uint64_t base_buid = 0x8002000ULL;
-const int max_phbs =
-(SPAPR_PCI_LIMIT - SPAPR_PCI_BASE) / SPAPR_PCI_MEM64_WIN_SIZE - 1;
+#define SPAPR_MAX_PHBS ((SPAPR_PCI_LIMIT - SPAPR_PCI_BASE) / \
+SPAPR_PCI_MEM64_WIN_SIZE - 1)
 int i;
 
 /* Sanity check natural alignments */
@@ -2640,12 +2640,14 @@ static void spapr_phb_placement(sPAPRMachineState 
*spapr, uint32_t index,
 QEMU_BUILD_BUG_ON((SPAPR_PCI_MEM64_WIN_SIZE % SPAPR_PCI_MEM32_WIN_SIZE) != 
0);
 QEMU_BUILD_BUG_ON((SPAPR_PCI_MEM32_WIN_SIZE % SPAPR_PCI_IO_WIN_SIZE) != 0);
 /* Sanity check bounds */
-QEMU_BUILD_BUG_ON((max_phbs * SPAPR_PCI_IO_WIN_SIZE) > 
SPAPR_PCI_MEM32_WIN_SIZE);
-QEMU_BUILD_BUG_ON((max_phbs * SPAPR_PCI_MEM32_WIN_SIZE) > 
SPAPR_PCI_MEM64_WIN_SIZE);
+QEMU_BUILD_BUG_ON((SPAPR_MAX_PHBS * SPAPR_PCI_IO_WIN_SIZE) >
+  SPAPR_PCI_MEM32_WIN_SIZE);
+QEMU_BUILD_BUG_ON((SPAPR_MAX_PHBS * SPAPR_PCI_MEM32_WIN_SIZE) >
+  SPAPR_PCI_MEM64_WIN_SIZE);
 
-if (index >= max_phbs) {
+if (index >= SPAPR_MAX_PHBS) {
 error_setg(errp, "\"index\" for PAPR PHB is too large (max %u)",
-   max_phbs - 1);
+   SPAPR_MAX_PHBS - 1);
 return;
 }
 
-- 
MST

[Qemu-devel] [PULL v5 13/22] hw/i386: check if nvdimm is enabled before plugging

2017-01-31 Thread Michael S. Tsirkin

From: Haozhong Zhang 

The missing of 'nvdimm' in the machine type option '-M' means NVDIMM
is disabled. QEMU should refuse to plug any NVDIMM device in this case
and report the misconfiguration.

The behavior of NVDIMM on unsupported platform (HW/FW) is vendor
specific. For some vendors, it's undefined and the platform may do
anything. Thus, I think QEMU is free to choose the implementation.
Aborting QEMU (i.e. refusing to boot) is the easiest one.

Reported-by: Stefan Hajnoczi 
Signed-off-by: Haozhong Zhang 
Message-Id: 20170112110928.GF4621@stefanha-x1.localdomain
Message-Id: 20170111093630.2088-1-stefa...@redhat.com
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Xiao Guangrong 
Reviewed-by: Eduardo Habkost 
Reviewed-by: Stefan Hajnoczi 
---
 hw/i386/pc.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 706e233..e3fcd51 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1708,6 +1708,11 @@ static void pc_dimm_plug(HotplugHandler *hotplug_dev,
 }
 
 if (object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM)) {
+if (!pcms->acpi_nvdimm_state.is_enabled) {
+error_setg(_err,
+   "nvdimm is not enabled: missing 'nvdimm' in '-M'");
+goto out;
+}
 nvdimm_plug(>acpi_nvdimm_state);
 }
 
-- 
MST

[Qemu-devel] [PULL v5 07/22] ARRAY_SIZE: check that argument is an array

2017-01-31 Thread Michael S. Tsirkin

It's a familiar pattern: some code uses ARRAY_SIZE, then refactoring
changes the argument from an array to a pointer to a dynamically
allocated buffer.  Code keeps compiling but any ARRAY_SIZE calls now
return the size of the pointer divided by element size.

Let's add build time checks to ARRAY_SIZE before we allow more
of these in the code-base.

Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Markus Armbruster 
Reviewed-by: Eric Blake 
---
 include/qemu/osdep.h | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
index 689f253..56c9e22 100644
--- a/include/qemu/osdep.h
+++ b/include/qemu/osdep.h
@@ -198,8 +198,15 @@ extern int daemon(int, int);
 #define DIV_ROUND_UP(n,d) (((n) + (d) - 1) / (d))
 #endif
 
+/*
+ * &(x)[0] is always a pointer - if it's same type as x then the argument is a
+ * pointer, not an array.
+ */
+#define QEMU_IS_ARRAY(x) (!__builtin_types_compatible_p(typeof(x), \
+typeof(&(x)[0])))
 #ifndef ARRAY_SIZE
-#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
+#define ARRAY_SIZE(x) ((sizeof(x) / sizeof((x)[0])) + \
+   QEMU_BUILD_BUG_ON_ZERO(!QEMU_IS_ARRAY(x)))
 #endif
 
 int qemu_daemon(int nochdir, int noclose);
-- 
MST

[Qemu-devel] [PULL v5 01/22] compiler: drop ; after BUILD_BUG_ON

2017-01-31 Thread Michael S. Tsirkin

All users include the trailing ; anyway, let's require that -
it seems cleaner.

Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Eric Blake 
Reviewed-by: Markus Armbruster 
---
 include/qemu/compiler.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/qemu/compiler.h b/include/qemu/compiler.h
index 157698b..7512082 100644
--- a/include/qemu/compiler.h
+++ b/include/qemu/compiler.h
@@ -86,7 +86,8 @@
 #define type_check(t1,t2) ((t1*)0 - (t2*)0)
 
 #define QEMU_BUILD_BUG_ON(x) \
-typedef char glue(qemu_build_bug_on__,__LINE__)[(x)?-1:1] 
__attribute__((unused));
+typedef char glue(qemu_build_bug_on__, __LINE__)[(x) ? -1 : 1] \
+__attribute__((unused))
 
 #if defined __GNUC__
 # if !QEMU_GNUC_PREREQ(4, 4)
-- 
MST

[Qemu-devel] [PULL v5 00/22] virtio, vhost, pci: fixes, features

2017-01-31 Thread Michael S. Tsirkin

The following changes since commit a0def594286d9110a6035e02eef558cf3cf5d847:

  Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging 
(2017-01-30 10:23:20 +)

are available in the git repository at:

  git://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git tags/for_upstream

for you to fetch changes up to 4f81928e550032f758c9b5207e403b3671da0eed:

  arm: add trailing ; after MISMATCH_CHECK (2017-01-31 22:16:39 +0200)


virtio, vhost, pci: fixes, features

generic pci root port support
disable shpc by default
safer version of ARRAY_SIZE and QEMU_BUILD_BUG_ON
fixes and cleanups all over the place

Signed-off-by: Michael S. Tsirkin 


Cao jin (3):
  msix: Follow CODING_STYLE
  hcd-xhci: check & correct param before using it
  pci: Convert msix_init() to Error and fix callers

Haozhong Zhang (1):
  hw/i386: check if nvdimm is enabled before plugging

Jason Wang (1):
  intel_iommu: fix and simplify size calculation in 
process_device_iotlb_desc()

Marc-André Lureau (1):
  vhost-user: delete chardev on cleanup

Marcel Apfelbaum (4):
  hw/pcie: Introduce a base class for PCI Express Root Ports
  hw/ioh3420: derive from PCI Express Root Port base class
  hw/pcie: Introduce Generic PCI Express Root Port
  hw/pci: disable pci-bridge's shpc by default

Michael S. Tsirkin (11):
  compiler: drop ; after BUILD_BUG_ON
  qxl: switch to constants within BUILD_BUG_ON
  ppc: switch to constants within BUILD_BUG_ON
  QEMU_BUILD_BUG_ON: use __COUNTER__
  compiler: rework BUG_ON using a struct
  compiler: expression version of QEMU_BUILD_BUG_ON
  ARRAY_SIZE: check that argument is an array
  pci: mark ROMs read-only
  vhost: skip ROM sections
  arm: better stub version for MISMATCH_CHECK
  arm: add trailing ; after MISMATCH_CHECK

Paolo Bonzini (1):
  virtio: make virtio_should_notify static

 default-configs/arm-softmmu.mak|   1 +
 default-configs/i386-softmmu.mak   |   1 +
 default-configs/x86_64-softmmu.mak |   1 +
 include/hw/compat.h|   4 +
 include/hw/pci/msix.h  |   5 +-
 include/hw/pci/pci.h   |   1 +
 include/hw/pci/pcie_port.h |  19 +
 include/hw/virtio/virtio.h |   1 -
 include/qemu/compiler.h|  16 +++-
 include/qemu/osdep.h   |   9 +-
 target/arm/kvm-consts.h| 102 +++---
 hw/block/nvme.c|   2 +-
 hw/display/qxl.c   |   9 +-
 hw/i386/intel_iommu.c  |  10 ++-
 hw/i386/pc.c   |   5 ++
 hw/misc/ivshmem.c  |   8 +-
 hw/net/e1000e.c|   2 +-
 hw/net/rocker/rocker.c |   4 +-
 hw/net/vmxnet3.c   |   2 +-
 hw/pci-bridge/gen_pcie_root_port.c |  87 +++
 hw/pci-bridge/ioh3420.c| 121 --
 hw/pci-bridge/pci_bridge_dev.c |   2 +-
 hw/pci-bridge/pcie_root_port.c | 171 +
 hw/pci/msix.c  |  44 --
 hw/pci/pci.c   |   2 +-
 hw/ppc/spapr.c |  14 +--
 hw/scsi/megasas.c  |   4 +-
 hw/usb/hcd-xhci.c  |  41 +
 hw/vfio/pci.c  |   8 +-
 hw/virtio/vhost.c  |   3 +-
 hw/virtio/virtio-pci.c |   4 +-
 hw/virtio/virtio.c |   2 +-
 net/vhost-user.c   |   3 +
 hw/pci-bridge/Makefile.objs|   1 +
 34 files changed, 490 insertions(+), 219 deletions(-)
 create mode 100644 hw/pci-bridge/gen_pcie_root_port.c
 create mode 100644 hw/pci-bridge/pcie_root_port.c

[Qemu-devel] [PULL v5 02/22] qxl: switch to constants within BUILD_BUG_ON

2017-01-31 Thread Michael S. Tsirkin

We are switching BUILD_BUG_ON to verify that it's parameter is a
compile-time constant, and it turns out that some gcc versions
(specifically gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609) are
not smart enough to figure it out for expressions involving local
variables. This is harmless but means that the check is ineffective for
these platforms.  To fix, replace variables with macros.

Reported-by: Peter Maydell 
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Eric Blake 
---
 hw/display/qxl.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/hw/display/qxl.c b/hw/display/qxl.c
index 62d0c80..af4c0ca 100644
--- a/hw/display/qxl.c
+++ b/hw/display/qxl.c
@@ -306,12 +306,11 @@ void qxl_spice_reset_cursor(PCIQXLDevice *qxl)
 
 static ram_addr_t qxl_rom_size(void)
 {
-uint32_t required_rom_size = sizeof(QXLRom) + sizeof(QXLModes) +
- sizeof(qxl_modes);
-uint32_t rom_size = 8192; /* two pages */
+#define QXL_REQUIRED_SZ (sizeof(QXLRom) + sizeof(QXLModes) + sizeof(qxl_modes))
+#define QXL_ROM_SZ 8192
 
-QEMU_BUILD_BUG_ON(required_rom_size > rom_size);
-return rom_size;
+QEMU_BUILD_BUG_ON(QXL_REQUIRED_SZ > QXL_ROM_SZ);
+return QXL_ROM_SZ;
 }
 
 static void init_qxl_rom(PCIQXLDevice *d)
-- 
MST

Re: [Qemu-devel] [PATCH v8 15/25] cputlb: introduce tlb_flush_* async work.

2017-01-31 Thread Richard Henderson


On 01/27/2017 02:39 AM, Alex Bennée wrote:

From: KONRAD Frederic 

Some architectures allow to flush the tlb of other VCPUs. This is not a problem
when we have only one thread for all VCPUs but it definitely needs to be an
asynchronous work when we are in true multithreaded work.

We take the tb_lock() when doing this to avoid racing with other threads
which may be invalidating TB's at the same time. The alternative would
be to use proper atomic primitives to clear the tlb entries en-mass.

This patch doesn't do anything to protect other cputlb function being
called in MTTCG mode making cross vCPU changes.

Signed-off-by: KONRAD Frederic 
[AJB: remove need for g_malloc on defer, make check fixes, tb_lock]
Signed-off-by: Alex Bennée 


Reviewed-by: Richard Henderson 

r~

1 2 3 4 5 >

1 - 100 of 406 matches

Mail list logo