date:20160129

[Qemu-devel] [PATCH v10 20/25] qapi: Swap 'name' in visit_* callbacks to match public API

2016-01-29 Thread Eric Blake

As explained in the previous patches, matching argument order of
'name, &value' to JSON's "name":value makes sense.  However,
while the last two patches were easy with Coccinelle, I ended up
doing this one all by hand.  Now all the visitor callbacks match
the main interface.

The compiler is able to enforce that all clients match the changed
interface in visitor-impl.h, even where two pointers are being
swapped, because only one of the two pointers is const (if that
were not the case, then C's looseness on treating 'char *' like
'void *' would have made review a bit harder).
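
The const-based enforcement described above can be illustrated with a minimal sketch. It assumes a hypothetical callback `demo_accessor` and caller `demo_call` (neither is part of QEMU) that take the name first and an opaque pointer second, and it only shows why a call site left in the old order would draw a diagnostic.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback in the new order: name first, then the data. */
static size_t demo_accessor(const char *name, void *opaque)
{
    size_t *calls = opaque;     /* 'void *' converts implicitly in C */

    assert(name && opaque);
    (*calls)++;
    return strlen(name);
}

/* Hypothetical caller already converted to the new order. */
static size_t demo_call(const char *name, void *opaque)
{
    /*
     * The old order, demo_accessor(opaque, name), would pass a
     * 'const char *' where 'void *' is expected.  That discards the
     * const qualifier, so the compiler diagnoses the stale call site.
     * Had 'name' been a plain 'char *', both arguments would convert
     * silently and the swap would go unnoticed.
     */
    return demo_accessor(name, opaque);
}
```

Building with warnings treated as errors turns the discarded-qualifier diagnostic into a hard failure, which is what makes a tree-wide manual conversion practical to review.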

Signed-off-by: Eric Blake 
Reviewed-by: Marc-André Lureau 

---
v10: enhance commit message, tweak spacing of 'const char *const'
v9: no change
v8: new patch
---
 include/qapi/visitor-impl.h  | 39 +--
 qapi/qapi-visit-core.c   | 38 +++---
 qapi/opts-visitor.c  | 16 
 qapi/qapi-dealloc-visitor.c  | 25 -
 qapi/qmp-input-visitor.c | 22 +++---
 qapi/qmp-output-visitor.c| 16 
 qapi/string-input-visitor.c  | 16 
 qapi/string-output-visitor.c | 16 
 8 files changed, 95 insertions(+), 93 deletions(-)

diff --git a/include/qapi/visitor-impl.h b/include/qapi/visitor-impl.h
index 29e2c08..734cc13 100644
--- a/include/qapi/visitor-impl.h
+++ b/include/qapi/visitor-impl.h
@@ -18,8 +18,8 @@
 struct Visitor
 {
 /* Must be set */
-void (*start_struct)(Visitor *v, void **obj, const char *kind,
- const char *name, size_t size, Error **errp);
+void (*start_struct)(Visitor *v, const char *name, void **obj,
+ const char *kind, size_t size, Error **errp);
 void (*end_struct)(Visitor *v, Error **errp);

 void (*start_implicit_struct)(Visitor *v, void **obj, size_t size,
@@ -30,38 +30,41 @@ struct Visitor
 GenericList *(*next_list)(Visitor *v, GenericList **list, Error **errp);
 void (*end_list)(Visitor *v, Error **errp);

-void (*type_enum)(Visitor *v, int *obj, const char * const strings[],
-  const char *kind, const char *name, Error **errp);
+void (*type_enum)(Visitor *v, const char *name, int *obj,
+  const char *const strings[], const char *kind,
+  Error **errp);
 /* May be NULL; only needed for input visitors. */
-void (*get_next_type)(Visitor *v, QType *type, bool promote_int,
-  const char *name, Error **errp);
+void (*get_next_type)(Visitor *v, const char *name, QType *type,
+  bool promote_int, Error **errp);

 /* Must be set. */
-void (*type_int64)(Visitor *v, int64_t *obj, const char *name,
+void (*type_int64)(Visitor *v, const char *name, int64_t *obj,
Error **errp);
 /* Must be set. */
-void (*type_uint64)(Visitor *v, uint64_t *obj, const char *name,
+void (*type_uint64)(Visitor *v, const char *name, uint64_t *obj,
 Error **errp);
 /* Optional; fallback is type_uint64().  */
-void (*type_size)(Visitor *v, uint64_t *obj, const char *name,
+void (*type_size)(Visitor *v, const char *name, uint64_t *obj,
   Error **errp);
 /* Must be set. */
-void (*type_bool)(Visitor *v, bool *obj, const char *name, Error **errp);
-void (*type_str)(Visitor *v, char **obj, const char *name, Error **errp);
-void (*type_number)(Visitor *v, double *obj, const char *name,
+void (*type_bool)(Visitor *v, const char *name, bool *obj, Error **errp);
+void (*type_str)(Visitor *v, const char *name, char **obj, Error **errp);
+void (*type_number)(Visitor *v, const char *name, double *obj,
 Error **errp);
-void (*type_any)(Visitor *v, QObject **obj, const char *name,
+void (*type_any)(Visitor *v, const char *name, QObject **obj,
  Error **errp);

 /* May be NULL; most useful for input visitors. */
-void (*optional)(Visitor *v, bool *present, const char *name);
+void (*optional)(Visitor *v, const char *name, bool *present);

 bool (*start_union)(Visitor *v, bool data_present, Error **errp);
 };

-void input_type_enum(Visitor *v, int *obj, const char * const strings[],
- const char *kind, const char *name, Error **errp);
-void output_type_enum(Visitor *v, int *obj, const char * const strings[],
-  const char *kind, const char *name, Error **errp);
+void input_type_enum(Visitor *v, const char *name, int *obj,
+ const char *const strings[], const char *kind,
+ Error **errp);
+void output_type_enum(Visitor *v, const char *name, int *obj,
+  const char *const strings[], const char *kind,
+  Error **errp);

 #endif
diff --git

Re: [Qemu-devel] [PATCH v8 14/16] block: Rewrite bdrv_close_all()

2016-01-29 Thread Max Reitz

On 28.01.2016 05:17, Fam Zheng wrote:
> On Wed, 01/27 18:59, Max Reitz wrote:
>> This patch rewrites bdrv_close_all(): Until now, all root BDSs have been
>> force-closed. This is bad because it can lead to cached data not being
>> flushed to disk.
>>
>> Instead, try to make all reference holders relinquish their reference
>> voluntarily:
>>
>> 1. All BlockBackend users are handled by making all BBs simply eject
>>their BDS tree. Since a BDS can never be on top of a BB, this will
>>not cause any of the issues as seen with the force-closing of BDSs.
>>The references will be relinquished and any further access to the BB
>>will fail gracefully.
>> 2. All BDSs which are owned by the monitor itself (because they do not
>>have a BB) are relinquished next.
>> 3. Besides BBs and the monitor, block jobs and other BDSs are the only
>>things left that can hold a reference to BDSs. After every remaining
>>block job has been canceled, there should not be any BDSs left (and
>>the loop added here will always terminate (as long as NDEBUG is not
>>defined), because either all_bdrv_states will be empty or there will
>>not be any block job left to cancel, failing the assertion).
>>
>> Signed-off-by: Max Reitz 
>> Reviewed-by: Kevin Wolf 
>> ---
>>  block.c | 45 +
>>  1 file changed, 37 insertions(+), 8 deletions(-)
>>
>> diff --git a/block.c b/block.c
>> index f8dd4a3..478e0db 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -2145,9 +2145,7 @@ static void bdrv_close(BlockDriverState *bs)
>>  {
>>  BdrvAioNotifier *ban, *ban_next;
>>  
>> -if (bs->job) {
>> -block_job_cancel_sync(bs->job);
>> -}
>> +assert(!bs->job);
>>  
>>  /* Disable I/O limits and drain all pending throttled requests */
>>  if (bs->throttle_state) {
>> @@ -2213,13 +2211,44 @@ static void bdrv_close(BlockDriverState *bs)
>>  void bdrv_close_all(void)
>>  {
>>  BlockDriverState *bs;
>> +AioContext *aio_context;
>> +int original_refcount = 0;
>>  
>> -QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
>> -AioContext *aio_context = bdrv_get_aio_context(bs);
>> +/* Drop references from requests still in flight, such as canceled block
>> + * jobs whose AIO context has not been polled yet */
>> +bdrv_drain_all();
>>  
>> -aio_context_acquire(aio_context);
>> -bdrv_close(bs);
>> -aio_context_release(aio_context);
>> +blockdev_close_all_bdrv_states();
>> +blk_remove_all_bs();
> 
> This (monitor before BB) doesn't match the order in the commit message (BB
> before monitor).

Will ask random.org whether to change the order here or in the commit
message. :-)

>> +
>> +/* Cancel all block jobs */
>> +while (!QTAILQ_EMPTY(&all_bdrv_states)) {
>> +QTAILQ_FOREACH(bs, &all_bdrv_states, bs_list) {
>> +aio_context = bdrv_get_aio_context(bs);
>> +
>> +aio_context_acquire(aio_context);
>> +if (bs->job) {
>> +/* So we can safely query the current refcount */
>> +bdrv_ref(bs);
>> +original_refcount = bs->refcnt;
>> +
>> +block_job_cancel_sync(bs->job);
>> +aio_context_release(aio_context);
>> +break;
>> +}
>> +aio_context_release(aio_context);
>> +}
>> +
>> +/* All the remaining BlockDriverStates are referenced directly or
>> + * indirectly from block jobs, so there needs to be at least one BDS
>> + * directly used by a block job */
>> +assert(bs);
>> +
>> +/* Wait for the block job to release its reference */
>> +while (bs->refcnt >= original_refcount) {
>> +aio_poll(aio_context, true);
> 
> Why is this safe without acquiring aio_context? But oh wait, completions of
> block jobs are deferred to main loop BH, so I think to release the reference,
> aio_poll(qemu_get_aio_context(), ...) is the right thing to do.

Actually, I think, commit 94db6d2d30962cc0422a69c88c3b3e9981b33e50 made
this loop completely unnecessary. Will investigate.

Max

> This is also the problem in block_job_cancel_sync, which can dead loop waiting
> for job->completed flag, without processing main loop BH.
> 
> Fam
> 
>> +}
>> +bdrv_unref(bs);
>>  }
>>  }
>>  
>> -- 
>> 2.7.0
>>





Re: [Qemu-devel] [PATCH 1/1] arm: virt: change GPIO trigger interrupt to pulse

2016-01-29 Thread Peter Maydell

On 29 January 2016 at 14:46, Shannon Zhao  wrote:
> On 2016/1/29 22:35, Wei Huang wrote:
>> On 01/29/2016 04:10 AM, Shannon Zhao wrote:
>>> This makes ACPI work well but breaks DT. The reason is that systemd or
>>> acpid fails to open /dev/input/event0. So the interrupt is injected and
>>> shows up under /proc/interrupts, but the guest takes no action. I'll
>>> investigate why the open fails later.
>>
>>
>> That is interesting. Could you try it with the following? This reverses
>> the order to down-up and worked on ACPI case.
>>
> Yeah, that's very weird.
>
>> qemu_set_irq(qdev_get_gpio_in(pl061_dev, 3), 0);
>> qemu_set_irq(qdev_get_gpio_in(pl061_dev, 3), 1);
>>
> I'll try this tomorrow. But even if this works, it's still weird.

I wonder if we should be asserting the GPIO pin in the powerdown-request
hook and then deasserting it on system reset somewhere...

thanks
-- PMM

Re: [Qemu-devel] [PATCH v9 35/37] qapi: Change visit_type_FOO() to no longer return partial objects

2016-01-29 Thread Eric Blake

On 01/29/2016 05:03 AM, Markus Armbruster wrote:

> 
> With (1) don't assign, the caller can pick an error value by assigning
> it before the visit, and it must not access the value on error unless it
> does.
> 
> With (2) assign zero, the caller can't pick an error value, but may
> safely access the value even on error.
> 
> Tradeoff.  I figure either can work for us.
> 
>>> (3) Assign null pointer, else don't assign anything
>>>
>>> CON: inconsistent
>>> CON: mix of (1)'s and (2)'s CON
>>
>> Which I think is what I did in this patch.
> 
> I don't like the inconsistency.  It complicates the interface.

I'll go ahead and audit to see whether more scalar visits were relying
on (1) and would have to be rewritten to use style (2); vs. whether more
pointer visits were passing in uninitialized obj and would have to be
rewritten to use style (1).

> I think behavior (1) don't assign and (2) assign zero both work, we just
> have to pick one and run with it.
> 
> If we pick behavior (1) don't assign, then we should assert something
> like !obj || !*obj on entry.  With such assertions in place, I think (1)
> should be roughly as safe as (2).

I think your assessment is right, and it's now just a matter of seeing
which way to get to a consistent state is less effort (I may still end
up doing both ways as competing patches, for comparison purposes).
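
The entry assertion suggested for behavior (1) can be sketched as follows. This is only an illustration under stated assumptions: `DemoThing` and `demo_visit_thing` are hypothetical names, not QAPI-generated code, and the `fail` flag merely simulates an input error.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct DemoThing {
    int value;
} DemoThing;

/*
 * Hypothetical input visit in style (1): on error, *obj is not
 * assigned.  Returns 0 on success, -1 on failure.
 */
static int demo_visit_thing(DemoThing **obj, int input, int fail)
{
    /* The caller must pass no object, or a pointer that is still NULL. */
    assert(!obj || !*obj);

    if (fail) {
        return -1;              /* leave *obj exactly as the caller set it */
    }
    if (obj) {
        *obj = malloc(sizeof(**obj));
        if (!*obj) {
            return -1;
        }
        (*obj)->value = input;
    }
    return 0;
}
```

With the assertion in place, a caller that reads `*obj` after a failed visit sees NULL rather than stale or uninitialized data, which is why (1) ends up roughly as safe as (2).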

> or maybe returns whether something was allocated:
> 
> out_obj:
> if (visit_end_struct(v) && err) {
>qapi_free_T(*obj);
> }

I'm liking that.  Dealloc and output visitors always return false, and
input visitors may need to track something on their stack for whether
they allocated or returned error earlier on, but it results in less
generated output.  Basically, it's lowering the 'bool allocated' that I
added in this attempt out of the generated code and into the visitors.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


Re: [Qemu-devel] [Qemu-ppc] [PULL 03/39] macio: use the existing IDEDMA aiocb to hold the active DMA aiocb

2016-01-29 Thread Aurelien Jarno

On 2016-01-29 16:06, David Gibson wrote:
> From: Mark Cave-Ayland 
> 
> Currently the aiocb is held within MACIOIDEState, however the IDE core code
> assumes that the current active DMA aiocb is held in aiocb in a few places,
> e.g. ide_bus_reset() and ide_reset().
> 
> Switch over to using IDEDMA aiocb to store the aiocb for the current active
> DMA request so that bus resets and restarts are handled correctly. As a
> consequence we can now use ide_set_inactive() rather than handling its
> functionality ourselves.
> 
> Signed-off-by: Mark Cave-Ayland 
> Reviewed-by: John Snow 
> Signed-off-by: David Gibson 
> ---
>  hw/ide/macio.c  |  20 +-
>  hw/ide/macio.c.orig | 634 
> 

I don't think you want to add this file to the git.

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net

Re: [Qemu-devel] [PATCH v8 00/16] block: Rework bdrv_close_all()

2016-01-29 Thread Kevin Wolf

Am 27.01.2016 um 18:59 hat Max Reitz geschrieben:
> Currently, bdrv_close_all() force-closes all BDSs with a BlockBackend,
> which can lead to data corruption (see the iotest added in the final
> patch of this series) and is most certainly very ugly.
> 
> This series reworks bdrv_close_all() to instead eject the BDS trees from
> all BlockBackends and then close the monitor-owned BDS trees, which are
> the only BDSs without a BB. In effect, all BDSs are closed just by
> getting closed automatically due to their reference count becoming 0.
> 
> Note that the approach taken here leaks all BlockBackends. This does not
> really matter, however, since qemu is about to exit anyway.

Apart from patch 5:
Reviewed-by: Kevin Wolf

Re: [Qemu-devel] [PATCH] usb: ehci: add capability mmio write function

2016-01-29 Thread Gerd Hoffmann

On Fr, 2016-01-29 at 18:30 +0530, P J P wrote:
> From: Prasad J Pandit 
> 
> USB Ehci emulation supports host controller capability registers.
> But its mmio '.write' function was missing, which led to a null
> pointer dereference issue. Add a do nothing 'ehci_caps_write'
> definition to avoid it; Do nothing because capability registers
> are Read Only(RO).

Surely makes sense, xhci does the same, I'll pick it up.

Maybe we should have a generic nop_write function somewhere.  Not that
there can much go wrong by cut&paste here, but still ...

cheers,
  Gerd
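
A generic no-op write helper of the kind suggested above might look like the sketch below. `DemoRegionOps` is a simplified, hypothetical stand-in for a device's access callbacks (it is not QEMU's MemoryRegionOps), and all `demo_*` names are illustrative assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a region's access callbacks. */
typedef struct DemoRegionOps {
    uint64_t (*read)(void *opaque, uint64_t addr, unsigned size);
    void (*write)(void *opaque, uint64_t addr, uint64_t data, unsigned size);
} DemoRegionOps;

/* Read one byte of a read-only capability register block. */
static uint64_t demo_caps_read(void *opaque, uint64_t addr, unsigned size)
{
    const uint8_t *caps = opaque;

    (void)size;
    return caps[addr];
}

/* Generic no-op write: read-only registers silently ignore guest writes. */
static void demo_nop_write(void *opaque, uint64_t addr, uint64_t data,
                           unsigned size)
{
    (void)opaque;
    (void)addr;
    (void)data;
    (void)size;
}

static const DemoRegionOps demo_caps_ops = {
    .read = demo_caps_read,
    .write = demo_nop_write,    /* leaving this NULL is the bug being fixed */
};

/* Dispatch a guest write; without a .write callback this is a NULL call. */
static void demo_dispatch_write(const DemoRegionOps *ops, void *opaque,
                                uint64_t addr, uint64_t data, unsigned size)
{
    assert(ops->write != NULL);
    ops->write(opaque, addr, data, size);
}
```

Sharing one such helper would avoid each read-only register block carrying its own copy of an empty function.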

[Qemu-devel] [PATCH v10 08/25] vl: Ensure qapi visitor properly ends struct visit

2016-01-29 Thread Eric Blake

Guarantee that visit_end_struct() is called if
visit_start_struct() succeeded.  This matches the behavior of
most other uses of visitors, and is a step towards the possibility
of a future patch that adds and enforces some tighter semantics to
the visitor interface (namely, cleanup of the visitor would no
longer have to mop up as many leftovers from an aborted partial
visit).

The change to code here matches the flow of hmp.c:hmp_object_add();
a later patch will then further simplify the cleanup logic of both
places by refactoring visit_end_struct() to not require a second
local error object.
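
The control flow described above (always end a struct visit that was successfully started, and let the first error win) can be sketched in isolation. `DemoVisitor` and the `demo_*` functions are hypothetical stand-ins rather than the real visitor API, and plain integer return codes replace Error objects to keep the sketch self-contained.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical visitor state used only for this illustration. */
typedef struct DemoVisitor {
    int depth;          /* number of struct visits currently open */
    bool fail_member;   /* simulate an error while visiting a member */
    bool fail_end;      /* simulate an error reported when ending */
} DemoVisitor;

static int demo_start_struct(DemoVisitor *v)
{
    v->depth++;
    return 0;
}

static int demo_visit_member(DemoVisitor *v)
{
    return v->fail_member ? -1 : 0;
}

static int demo_end_struct(DemoVisitor *v)
{
    assert(v->depth > 0);       /* every end must pair with a start */
    v->depth--;
    return v->fail_end ? -1 : 0;
}

static int demo_object_create(DemoVisitor *v)
{
    int err, err_end;

    err = demo_start_struct(v);
    if (err) {
        return err;             /* nothing was opened, nothing to end */
    }

    err = demo_visit_member(v);

    /* Reached on success and on failure: the struct is always ended. */
    err_end = demo_end_struct(v);

    /* Keep the first error; report err_end only if nothing failed earlier. */
    return err ? err : err_end;
}
```

The final return mirrors the role of the second local error in the patch: an end-of-struct failure is only reported when no earlier error is already pending.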

Signed-off-by: Eric Blake 
Reviewed-by: Marc-André Lureau 

---
v10: resplit 4/37 and 5/37 by action rather than file, retain R-b.
v9: no change
v8: no change
v7: place earlier in series, drop attempts to provide a 'kind' string,
drop bogus avoidance of qmp_object_del() on error
v6: new patch, split from RFC on v5 7/46
---
 vl.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/vl.c b/vl.c
index 24d30f2..aaa5403 100644
--- a/vl.c
+++ b/vl.c
@@ -2822,6 +2822,7 @@ static bool object_create_delayed(const char *type)
 static int object_create(void *opaque, QemuOpts *opts, Error **errp)
 {
 Error *err = NULL;
+Error *err_end = NULL;
 char *type = NULL;
 char *id = NULL;
 OptsVisitor *ov;
@@ -2844,23 +2845,24 @@ static int object_create(void *opaque, QemuOpts *opts, 
Error **errp)
 goto out;
 }
 if (!type_predicate(type)) {
+visit_end_struct(v, NULL);
 goto out;
 }

 qdict_del(pdict, "id");
 visit_type_str(v, &id, "id", &err);
 if (err) {
-goto out;
+goto out_end;
 }

 object_add(type, id, pdict, v, &err);
-if (err) {
-goto out;
-}
-visit_end_struct(v, &err);
-if (err) {
+
+out_end:
+visit_end_struct(v, &err_end);
+if (!err && err_end) {
 qmp_object_del(id, NULL);
 }
+error_propagate(&err, err_end);

 out:
 opts_visitor_cleanup(ov);
-- 
2.5.0

[Qemu-devel] [PATCH v10 14/25] qapi: Make all visitors supply uint64 callbacks

2016-01-29 Thread Eric Blake

Our qapi visitor contract supports multiple integer visitors,
but left the type_uint64 visitor as optional (falling back on
type_int64); which in turn can lead to awkward behavior with
numbers larger than INT64_MAX (the user has to be aware of
twos complement, and deal with negatives).

This patch does not address the disparity in handling large
values as negatives.  It merely moves the fallback from uint64
to int64 from the visitor core to the visitors, where the issue
can actually be fixed, by implementing the missing type_uint64()
callbacks on top of the respective type_int64() callbacks, and
with a FIXME comment explaining why that's wrong.

With that done, we now have a type_uint64() callback in every
driver, so we can make it mandatory from the core.  And although
the type_int64() callback can cover the entire valid range of
type_uint{8,16,32} on valid user input, using type_uint64() to
avoid mixed signedness makes more sense.
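
The awkwardness of layering uint64 on top of an int64 callback can be shown with a small sketch. `demo_parse_int64` and `demo_parse_uint64_via_int64` are hypothetical helpers, not the actual visitor callbacks; they only model a string input parsed as a signed value and then stored into an unsigned field.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for a visitor's type_int64 callback: parse a
 * signed 64-bit decimal value.  Returns 0 on success, -1 on error.
 */
static int demo_parse_int64(const char *str, int64_t *obj)
{
    char *end;
    long long val;

    assert(str && obj);
    errno = 0;
    val = strtoll(str, &end, 10);
    if (errno || end == str || *end) {
        return -1;
    }
    *obj = val;
    return 0;
}

/*
 * uint64 implemented on top of int64, as in the fallback being moved.
 * FIXME-style caveat: values above INT64_MAX can only be expressed by
 * the user as negative numbers.
 */
static int demo_parse_uint64_via_int64(const char *str, uint64_t *obj)
{
    int64_t val;

    if (demo_parse_int64(str, &val) < 0) {
        return -1;
    }
    *obj = (uint64_t)val;       /* signed-to-unsigned conversion wraps */
    return 0;
}
```

In this sketch the input "-1" is accepted and yields UINT64_MAX, while the natural spelling "18446744073709551615" is rejected as out of range for the signed parser.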

Signed-off-by: Eric Blake 

---
v10: improve commit message, split out dealloc type_size change
v9: hoist in part of 11/35, drop Marc-Andre's R-b
v8: no change
v7: split off int64 callbacks and retitle, add more FIXMEs in the
code, hoist use of type_uint64 here from 3/23, improved commit
message
v6: new patch, but stems from v5 23/46
---
 include/qapi/visitor-impl.h  |  9 ++---
 qapi/qapi-visit-core.c   | 36 +++-
 qapi/qapi-dealloc-visitor.c  |  6 ++
 qapi/qmp-input-visitor.c | 17 +
 qapi/qmp-output-visitor.c|  9 +
 qapi/string-input-visitor.c  | 15 +++
 qapi/string-output-visitor.c |  9 +
 7 files changed, 73 insertions(+), 28 deletions(-)

diff --git a/include/qapi/visitor-impl.h b/include/qapi/visitor-impl.h
index 319efe8..92c4bcb 100644
--- a/include/qapi/visitor-impl.h
+++ b/include/qapi/visitor-impl.h
@@ -40,6 +40,12 @@ struct Visitor
 void (*type_int64)(Visitor *v, int64_t *obj, const char *name,
Error **errp);
 /* Must be set. */
+void (*type_uint64)(Visitor *v, uint64_t *obj, const char *name,
+Error **errp);
+/* Optional; fallback is type_uint64().  */
+void (*type_size)(Visitor *v, uint64_t *obj, const char *name,
+  Error **errp);
+/* Must be set. */
 void (*type_bool)(Visitor *v, bool *obj, const char *name, Error **errp);
 void (*type_str)(Visitor *v, char **obj, const char *name, Error **errp);
 void (*type_number)(Visitor *v, double *obj, const char *name,
@@ -53,12 +59,9 @@ struct Visitor
 void (*type_uint8)(Visitor *v, uint8_t *obj, const char *name, Error 
**errp);
 void (*type_uint16)(Visitor *v, uint16_t *obj, const char *name, Error 
**errp);
 void (*type_uint32)(Visitor *v, uint32_t *obj, const char *name, Error 
**errp);
-void (*type_uint64)(Visitor *v, uint64_t *obj, const char *name, Error 
**errp);
 void (*type_int8)(Visitor *v, int8_t *obj, const char *name, Error **errp);
 void (*type_int16)(Visitor *v, int16_t *obj, const char *name, Error 
**errp);
 void (*type_int32)(Visitor *v, int32_t *obj, const char *name, Error 
**errp);
-/* visit_type_size() falls back to (*type_uint64)() if type_size is unset 
*/
-void (*type_size)(Visitor *v, uint64_t *obj, const char *name, Error 
**errp);
 bool (*start_union)(Visitor *v, bool data_present, Error **errp);
 };

diff --git a/qapi/qapi-visit-core.c b/qapi/qapi-visit-core.c
index 3a888ab..ac5a861 100644
--- a/qapi/qapi-visit-core.c
+++ b/qapi/qapi-visit-core.c
@@ -96,14 +96,14 @@ void visit_type_int(Visitor *v, int64_t *obj, const char 
*name, Error **errp)

 void visit_type_uint8(Visitor *v, uint8_t *obj, const char *name, Error **errp)
 {
-int64_t value;
+uint64_t value;

 if (v->type_uint8) {
 v->type_uint8(v, obj, name, errp);
 } else {
 value = *obj;
-v->type_int64(v, &value, name, errp);
-if (value < 0 || value > UINT8_MAX) {
+v->type_uint64(v, &value, name, errp);
+if (value > UINT8_MAX) {
 /* FIXME questionable reuse of errp if callback changed
value on error */
 error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
@@ -116,14 +116,14 @@ void visit_type_uint8(Visitor *v, uint8_t *obj, const 
char *name, Error **errp)

 void visit_type_uint16(Visitor *v, uint16_t *obj, const char *name, Error 
**errp)
 {
-int64_t value;
+uint64_t value;

 if (v->type_uint16) {
 v->type_uint16(v, obj, name, errp);
 } else {
 value = *obj;
-v->type_int64(v, &value, name, errp);
-if (value < 0 || value > UINT16_MAX) {
+v->type_uint64(v, &value, name, errp);
+if (value > UINT16_MAX) {
 /* FIXME questionable reuse of errp if callback changed
value on error */
 error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
@@ -136,14 +136,14 @@ void visit_type_uint16(Visitor *v, uint16_t *obj, const 
char

[Qemu-devel] [PATCH v10 15/25] qapi: Consolidate visitor small integer callbacks

2016-01-29 Thread Eric Blake

Commit 4e27e819 introduced optional visitor callbacks for all
sorts of int types, but no visitor has supplied any of the
callbacks for sizes less than 64 bits.  In other words, the
generic implementation based on using type_[u]int64() followed
by bounds-checking works just fine. In the interest of
simplicity, it's easier to make the visitor callback interface
not have to worry about the other sizes.

Adding some helper functions minimizes the boilerplate required
to correct FIXMEs added earlier with regards to questionable
reuse of errp, particularly now that we can guarantee from a
single file audit that value is unchanged if an error is set.

Signed-off-by: Eric Blake 
Reviewed-by: Marc-André Lureau 

---
v10: tweak copyright line, rebase to earlier changes
v9: hoist some of visitor-impl.h changes into 9/35 and 10/35
v8: no change
v7: further factor out helper functions that eliminate the
questionable errp reuse
v6: split off from v5 23/46
original version also appeared in v6-v9 of subset D
---
 include/qapi/visitor-impl.h |   8 +--
 qapi/qapi-visit-core.c  | 158 +---
 2 files changed, 60 insertions(+), 106 deletions(-)

diff --git a/include/qapi/visitor-impl.h b/include/qapi/visitor-impl.h
index 92c4bcb..29e2c08 100644
--- a/include/qapi/visitor-impl.h
+++ b/include/qapi/visitor-impl.h
@@ -1,7 +1,7 @@
 /*
  * Core Definitions for QAPI Visitor implementations
  *
- * Copyright (C) 2012 Red Hat, Inc.
+ * Copyright (C) 2012-2016 Red Hat, Inc.
  *
  * Author: Paolo Bonizni 
  *
@@ -56,12 +56,6 @@ struct Visitor
 /* May be NULL; most useful for input visitors. */
 void (*optional)(Visitor *v, bool *present, const char *name);

-void (*type_uint8)(Visitor *v, uint8_t *obj, const char *name, Error 
**errp);
-void (*type_uint16)(Visitor *v, uint16_t *obj, const char *name, Error 
**errp);
-void (*type_uint32)(Visitor *v, uint32_t *obj, const char *name, Error 
**errp);
-void (*type_int8)(Visitor *v, int8_t *obj, const char *name, Error **errp);
-void (*type_int16)(Visitor *v, int16_t *obj, const char *name, Error 
**errp);
-void (*type_int32)(Visitor *v, int32_t *obj, const char *name, Error 
**errp);
 bool (*start_union)(Visitor *v, bool data_present, Error **errp);
 };

diff --git a/qapi/qapi-visit-core.c b/qapi/qapi-visit-core.c
index ac5a861..d65521f 100644
--- a/qapi/qapi-visit-core.c
+++ b/qapi/qapi-visit-core.c
@@ -94,129 +94,89 @@ void visit_type_int(Visitor *v, int64_t *obj, const char 
*name, Error **errp)
 v->type_int64(v, obj, name, errp);
 }

+static void visit_type_uintN(Visitor *v, uint64_t *obj, const char *name,
+ uint64_t max, const char *type, Error **errp)
+{
+Error *err = NULL;
+uint64_t value = *obj;
+
+v->type_uint64(v, &value, name, &err);
+if (err) {
+error_propagate(errp, err);
+} else if (value > max) {
+error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
+   name ? name : "null", type);
+} else {
+*obj = value;
+}
+}
+
 void visit_type_uint8(Visitor *v, uint8_t *obj, const char *name, Error **errp)
 {
-uint64_t value;
-
-if (v->type_uint8) {
-v->type_uint8(v, obj, name, errp);
-} else {
-value = *obj;
-v->type_uint64(v, &value, name, errp);
-if (value > UINT8_MAX) {
-/* FIXME questionable reuse of errp if callback changed
-   value on error */
-error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
-   name ? name : "null", "uint8_t");
-return;
-}
-*obj = value;
-}
+uint64_t value = *obj;
+visit_type_uintN(v, &value, name, UINT8_MAX, "uint8_t", errp);
+*obj = value;
 }

-void visit_type_uint16(Visitor *v, uint16_t *obj, const char *name, Error 
**errp)
+void visit_type_uint16(Visitor *v, uint16_t *obj, const char *name,
+   Error **errp)
 {
-uint64_t value;
-
-if (v->type_uint16) {
-v->type_uint16(v, obj, name, errp);
-} else {
-value = *obj;
-v->type_uint64(v, &value, name, errp);
-if (value > UINT16_MAX) {
-/* FIXME questionable reuse of errp if callback changed
-   value on error */
-error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
-   name ? name : "null", "uint16_t");
-return;
-}
-*obj = value;
-}
+uint64_t value = *obj;
+visit_type_uintN(v, &value, name, UINT16_MAX, "uint16_t", errp);
+*obj = value;
 }

-void visit_type_uint32(Visitor *v, uint32_t *obj, const char *name, Error 
**errp)
+void visit_type_uint32(Visitor *v, uint32_t *obj, const char *name,
+   Error **errp)
 {
-uint64_t value;
-
-if (v->type_uint32) {
-v->type_uint32(v, obj, name, errp);
-} else {
-value = *obj;
-v->type_uint64(v, &value, name, errp);
-if

[Qemu-devel] [PATCH v10 19/25] qom: Swap 'name' next to visitor in ObjectPropertyAccessor

2016-01-29 Thread Eric Blake

Similar to the previous patch, it's nice to have all functions
in the tree that involve a visitor and a name for conversion to
or from QAPI to consistently stick the 'name' parameter next
to the Visitor parameter.

Done by manually changing include/qom/object.h and qom/object.c,
then running this Coccinelle script and touching up the fallout
(Coccinelle insisted on adding some trailing whitespace).

@ rule1 @
identifier fn;
typedef Object, Visitor, Error;
identifier obj, v, opaque, name, errp;
@@
 void fn
- (Object *obj, Visitor *v, void *opaque, const char *name,
+ (Object *obj, Visitor *v, const char *name, void *opaque,
   Error **errp) { ... }

@@
identifier rule1.fn;
expression obj, v, opaque, name, errp;
@@
 fn(obj, v,
-   opaque, name,
+   name, opaque,
errp)

Signed-off-by: Eric Blake 
Reviewed-by: Marc-André Lureau 

---
v10: redo Coccinelle script with 'typedef' instead of type; net
result is the same so R-b retained
v9: typo fix in commit message, rebase to master context
v8: new patch
---
 include/qom/object.h |   4 +-
 backends/hostmem.c   |  16 +++---
 bootdevice.c |   8 +--
 hw/acpi/ich9.c   |  35 +---
 hw/block/nvme.c  |   8 +--
 hw/core/machine.c|  14 ++---
 hw/core/qdev-properties-system.c |  32 +--
 hw/core/qdev-properties.c| 116 +++
 hw/core/qdev.c   |   5 +-
 hw/i386/pc.c |  29 +-
 hw/ide/qdev.c|   8 +--
 hw/intc/xics.c   |  12 ++--
 hw/isa/lpc_ich9.c|   5 +-
 hw/mem/pc-dimm.c |   4 +-
 hw/misc/edu.c|   4 +-
 hw/misc/tmp105.c |   8 +--
 hw/net/ne2000-isa.c  |  10 ++--
 hw/pci-host/piix.c   |  10 ++--
 hw/pci-host/q35.c|  13 ++---
 hw/ppc/spapr_drc.c   |  16 +++---
 hw/usb/dev-storage.c |   8 +--
 hw/virtio/virtio-balloon.c   |   8 +--
 memory.c |  18 +++---
 net/dump.c   |   8 +--
 net/filter-buffer.c  |  10 ++--
 qom/object.c |  75 +
 target-i386/cpu.c|  67 +++---
 target-ppc/translate_init.c  |   8 +--
 tests/test-qdev-global-props.c   |  14 ++---
 29 files changed, 282 insertions(+), 291 deletions(-)

diff --git a/include/qom/object.h b/include/qom/object.h
index 3e7e99d..698827d 100644
--- a/include/qom/object.h
+++ b/include/qom/object.h
@@ -290,16 +290,16 @@ typedef struct InterfaceInfo InterfaceInfo;
  * ObjectPropertyAccessor:
  * @obj: the object that owns the property
  * @v: the visitor that contains the property data
+ * @name: the name of the property
  * @opaque: the object property opaque
- * @name: the name of the property
  * @errp: a pointer to an Error that is filled if getting/setting fails.
  *
  * Called when trying to get/set a property.
  */
 typedef void (ObjectPropertyAccessor)(Object *obj,
   Visitor *v,
-  void *opaque,
   const char *name,
+  void *opaque,
   Error **errp);

 /**
diff --git a/backends/hostmem.c b/backends/hostmem.c
index a9d30d8..2657907 100644
--- a/backends/hostmem.c
+++ b/backends/hostmem.c
@@ -26,8 +26,8 @@ QEMU_BUILD_BUG_ON(HOST_MEM_POLICY_INTERLEAVE != 
MPOL_INTERLEAVE);
 #endif

 static void
-host_memory_backend_get_size(Object *obj, Visitor *v, void *opaque,
- const char *name, Error **errp)
+host_memory_backend_get_size(Object *obj, Visitor *v, const char *name,
+ void *opaque, Error **errp)
 {
 HostMemoryBackend *backend = MEMORY_BACKEND(obj);
 uint64_t value = backend->size;
@@ -36,8 +36,8 @@ host_memory_backend_get_size(Object *obj, Visitor *v, void 
*opaque,
 }

 static void
-host_memory_backend_set_size(Object *obj, Visitor *v, void *opaque,
- const char *name, Error **errp)
+host_memory_backend_set_size(Object *obj, Visitor *v, const char *name,
+ void *opaque, Error **errp)
 {
 HostMemoryBackend *backend = MEMORY_BACKEND(obj);
 Error *local_err = NULL;
@@ -63,8 +63,8 @@ out:
 }

 static void
-host_memory_backend_get_host_nodes(Object *obj, Visitor *v, void *opaque,
-   const char *name, Error **errp)
+host_memory_backend_get_host_nodes(Object *obj, Visitor *v, const char *name,
+   void *opaque, Error **errp)
 {
 HostMemoryBackend *backend = MEMORY_BACKEND(obj);
 uint16List *host_nodes = NULL;
@@ -95,8 +95,8 @@

[Qemu-devel] [PATCH v10 23/25] qapi: Drop unused error argument for list and implicit struct

2016-01-29 Thread Eric Blake

No backend was setting an error when ending an implicit struct,
or when iterating a list.  Make the callers a bit easier to follow
by making this a part of the contract, and removing the errp
argument - callers can then unconditionally end an object as
part of cleanup without having to think about whether a second
error is dominated by a first, because there is no second error.

A later patch will then tackle the larger task of splitting
visit_end_struct(), which can indeed set an error (and that
cleanup will also have the side-effect of removing the use of
error_abort added here).

Signed-off-by: Eric Blake 

---
v10: split out qmp input changes, also fix visit_next_list(), drop R-b
v9: enhance commit message
v8: no change
v7: place earlier in series, rebase to earlier changes
v6: new patch, split from RFC on v5 7/46
---
 include/qapi/visitor.h   |  8 +---
 include/qapi/visitor-impl.h  |  9 ++---
 scripts/qapi-visit.py| 12 
 qapi/qapi-visit-core.c   | 12 ++--
 hw/ppc/spapr_drc.c   |  6 +-
 qapi/opts-visitor.c  |  6 +++---
 qapi/qapi-dealloc-visitor.c  |  8 
 qapi/qmp-input-visitor.c | 11 +++
 qapi/qmp-output-visitor.c|  6 +++---
 qapi/string-input-visitor.c  |  8 +++-
 qapi/string-output-visitor.c |  8 +++-
 11 files changed, 41 insertions(+), 53 deletions(-)

diff --git a/include/qapi/visitor.h b/include/qapi/visitor.h
index 997555d..5e581dc 100644
--- a/include/qapi/visitor.h
+++ b/include/qapi/visitor.h
@@ -1,6 +1,7 @@
 /*
  * Core Definitions for QAPI Visitor Classes
  *
+ * Copyright (C) 2012-2016 Red Hat, Inc.
  * Copyright IBM, Corp. 2011
  *
  * Authors:
@@ -32,10 +33,11 @@ void visit_start_struct(Visitor *v, const char *name, void 
**obj,
 void visit_end_struct(Visitor *v, Error **errp);
 void visit_start_implicit_struct(Visitor *v, void **obj, size_t size,
  Error **errp);
-void visit_end_implicit_struct(Visitor *v, Error **errp);
+void visit_end_implicit_struct(Visitor *v);
+
 void visit_start_list(Visitor *v, const char *name, Error **errp);
-GenericList *visit_next_list(Visitor *v, GenericList **list, Error **errp);
-void visit_end_list(Visitor *v, Error **errp);
+GenericList *visit_next_list(Visitor *v, GenericList **list);
+void visit_end_list(Visitor *v);

 /**
  * Check if an optional member @name of an object needs visiting.
diff --git a/include/qapi/visitor-impl.h b/include/qapi/visitor-impl.h
index 337f999..ea252f8 100644
--- a/include/qapi/visitor-impl.h
+++ b/include/qapi/visitor-impl.h
@@ -24,11 +24,14 @@ struct Visitor

 void (*start_implicit_struct)(Visitor *v, void **obj, size_t size,
   Error **errp);
-void (*end_implicit_struct)(Visitor *v, Error **errp);
+/* May be NULL */
+void (*end_implicit_struct)(Visitor *v);

 void (*start_list)(Visitor *v, const char *name, Error **errp);
-GenericList *(*next_list)(Visitor *v, GenericList **list, Error **errp);
-void (*end_list)(Visitor *v, Error **errp);
+/* Must be set */
+GenericList *(*next_list)(Visitor *v, GenericList **list);
+/* Must be set */
+void (*end_list)(Visitor *v);

 void (*type_enum)(Visitor *v, const char *name, int *obj,
   const char *const strings[], Error **errp);
diff --git a/scripts/qapi-visit.py b/scripts/qapi-visit.py
index 308000f..0fdcebc 100644
--- a/scripts/qapi-visit.py
+++ b/scripts/qapi-visit.py
@@ -62,7 +62,7 @@ static void visit_type_implicit_%(c_type)s(Visitor *v, 
%(c_type)s **obj, Error *
 visit_start_implicit_struct(v, (void **)obj, sizeof(%(c_type)s), &err);
 if (!err) {
 visit_type_%(c_type)s_fields(v, obj, errp);
-visit_end_implicit_struct(v, &err);
+visit_end_implicit_struct(v);
 }
 error_propagate(errp, err);
 }
@@ -161,15 +161,13 @@ void visit_type_%(c_name)s(Visitor *v, const char *name, 
%(c_name)s **obj, Error
 }

 for (prev = (GenericList **)obj;
- !err && (i = visit_next_list(v, prev, &err)) != NULL;
+ !err && (i = visit_next_list(v, prev)) != NULL;
  prev = &i) {
 %(c_name)s *native_i = (%(c_name)s *)i;
 visit_type_%(c_elt_type)s(v, NULL, &native_i->value, &err);
 }

-error_propagate(errp, err);
-err = NULL;
-visit_end_list(v, &err);
+visit_end_list(v);
 out:
 error_propagate(errp, err);
 }
@@ -230,9 +228,7 @@ void visit_type_%(c_name)s(Visitor *v, const char *name, 
%(c_name)s **obj, Error
"%(name)s");
 }
 out_obj:
-error_propagate(errp, err);
-err = NULL;
-visit_end_implicit_struct(v, &err);
+visit_end_implicit_struct(v);
 out:
 error_propagate(errp, err);
 }
diff --git a/qapi/qapi-visit-core.c b/qapi/qapi-visit-core.c
index 295..89599e8 100644
--- a/qapi/qapi-visit-core.c
+++ b/qapi/qapi-visit-core.c
@@ -38,10 +38,10 @@ void visit_start_implicit_struct(Visitor *v, void **obj, 
size_t size,
 }
 }

-void

Re: [Qemu-devel] [PATCH v7 01/13] machine: Don't allow CPU toplogies with partially filled cores

2016-01-29 Thread Eduardo Habkost

On Fri, Jan 29, 2016 at 02:52:30PM +1100, David Gibson wrote:
> On Thu, Jan 28, 2016 at 11:19:43AM +0530, Bharata B Rao wrote:
> > Prevent guests from booting with CPU topologies that have partially
> > filled CPU cores or can result in partially filled CPU cores after
> > CPU hotplug like
> > 
> > -smp 15,sockets=1,cores=4,threads=4,maxcpus=16 or
> > -smp 15,sockets=1,cores=4,threads=4,maxcpus=17.
> > 
> > This is enforced by introducing MachineClass::validate_smp_config()
> > that gets called from generic SMP parsing code. Machine type versions
> > that want to enforce this can define this to the generic version
> > provided.
> > 
> > Only sPAPR and PC machine types starting from version 2.6 enforce this in
> > this patch.
> > 
> > Signed-off-by: Bharata B Rao 
> 
> I've been kind of lost in the back and forth about
> threads/cores/sockets.
> 
> What, in the end, is the rationale for allowing partially filled
> sockets, but not partially filled cores?

I don't think there's a good reason for that (at least for PC).

It's easier to relax the requirements later, if necessary, than
to deal with compatibility issues again when making the code
stricter. So I suggest we make validate_smp_config_generic() also
check if smp_cpus % (smp_threads * smp_cores) == 0.

-- 
Eduardo
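
The stricter check suggested above can be sketched as follows. This is a minimal Python model, not QEMU code: the function name smp_config_is_valid and its whole_sockets flag are hypothetical, and the core-level rule (CPU counts must be multiples of the threads per core) is an assumption inferred from the -smp examples in the commit message.

```python
def smp_config_is_valid(cpus, cores, threads, maxcpus, whole_sockets=False):
    """Return True if the topology has no partially filled cores.

    With whole_sockets=True, additionally require completely filled
    sockets, i.e. cpus % (threads * cores) == 0.
    """
    # A core is partially filled when the CPU count is not a multiple of
    # the threads per core, either at boot (cpus) or after hotplug (maxcpus).
    if cpus % threads != 0 or maxcpus % threads != 0:
        return False
    # Optional stricter rule: the boot CPUs must fill whole sockets.
    if whole_sockets and cpus % (threads * cores) != 0:
        return False
    return True
```

Under these assumptions, -smp 15,sockets=1,cores=4,threads=4,maxcpus=16 is rejected by the core-level rule, while -smp 12,cores=4,threads=4,maxcpus=16 passes it and is rejected only when whole_sockets is enabled.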

Re: [Qemu-devel] [PATCH 1/1] arm: virt: change GPIO trigger interrupt to pulse

2016-01-29 Thread Shannon Zhao

On 2016/1/29 22:50, Wei Huang wrote:

On 01/29/2016 08:46 AM, Shannon Zhao wrote:

>
>
>On 2016/1/29 22:35, Wei Huang wrote:

>>
>>
>>On 01/29/2016 04:10 AM, Shannon Zhao wrote:

>>>Hi,
>>>
>>>This makes ACPI work well but breaks DT. The reason is that systemd or
>>>acpid fails to open /dev/input/event0. So the interrupt is injected and
>>>shows up in /proc/interrupts, but the guest takes no action. I'll
>>>investigate later why the open fails.

>>
>>That is interesting. Could you try it with the following? This reverses
>>the order to down-up and worked on ACPI case.
>>

>Yeah, that's very weird.
>

>>qemu_set_irq(qdev_get_gpio_in(pl061_dev, 3), 0);
>>qemu_set_irq(qdev_get_gpio_in(pl061_dev, 3), 1);
>>

>I'll try this tomorrow. But even if this works, it's still weird.

To reproduce this case, do the following steps using current upstream
qemu: create vm => reboot vm (succeed) => reboot or shutdown vm (fail).
Apparently the last interrupt wasn't received correctly.

Yes, I reproduced this today. Let's clarify the current state.
First, for ACPI we should use qemu_irq_pulse, since we make the GPIO
pin edge-triggered. DT uses gpio-key, which is also edge-triggered,
as the guest's /proc/interrupts output shows.

Second, current upstream QEMU with your patch makes the second reboot
work when using ACPI. But the first shutdown/reboot doesn't work when
using DT, because systemd or acpid fails to open /dev/input/event0. This
is what surprises me.

Wei, which userspace program does your guest use, systemd or acpid? Could you
please test your patch with DT and see whether you get the same
result as I do? (I know the Red Hat kernel uses ACPI by default, so you can
append acpi=off to switch to DT.)

Thanks,
--
Shannon
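
The ACPI symptom discussed in this thread (first request works, later ones do not) can be illustrated with a toy model. This is a sketch in Python, not QEMU code: EdgeTriggeredLine is a hypothetical class, and its pulse() method is assumed to mirror qemu_irq_pulse() by raising and then lowering the line. It does not model the DT failure to open /dev/input/event0.

```python
class EdgeTriggeredLine:
    """Toy model of a rising-edge-triggered GPIO input (not QEMU code)."""

    def __init__(self):
        self.level = 0
        self.edges = 0  # number of rising edges the guest would observe

    def set_level(self, level):
        # Only a 0 -> 1 transition counts as an event.
        if self.level == 0 and level == 1:
            self.edges += 1
        self.level = level

    def pulse(self):
        # Raise, then lower the line, so the next request sees a new edge.
        self.set_level(1)
        self.set_level(0)


held = EdgeTriggeredLine()
held.set_level(1)  # first power-button request: one rising edge
held.set_level(1)  # second request: line is already high, no new edge

pulsed = EdgeTriggeredLine()
pulsed.pulse()     # first request
pulsed.pulse()     # second request produces a fresh edge

print(held.edges, pulsed.edges)  # prints "1 2"
```

In this model, leaving the line high after the first request hides every later request from an edge-triggered consumer, which matches the behavior the patch description reports.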

Re: [Qemu-devel] [PATCH v8 06/16] nbd: Switch from close to eject notifier

2016-01-29 Thread Max Reitz

On 28.01.2016 04:26, Fam Zheng wrote:
> On Wed, 01/27 18:59, Max Reitz wrote:
>> The NBD code uses the BDS close notifier to determine when a medium is
>> ejected. However, now it should use the BB's BDS removal notifier for
>> that instead of the BDS's close notifier.
>>
>> Signed-off-by: Max Reitz 
>> ---
>>  blockdev-nbd.c | 40 +---
>>  nbd/server.c   | 13 +
>>  2 files changed, 18 insertions(+), 35 deletions(-)
>>
>> diff --git a/blockdev-nbd.c b/blockdev-nbd.c
>> index 4a758ac..9d6a21c 100644
>> --- a/blockdev-nbd.c
>> +++ b/blockdev-nbd.c
>> @@ -45,37 +45,11 @@ void qmp_nbd_server_start(SocketAddress *addr, Error 
>> **errp)
>>  }
>>  }
>>  
>> -/*
>> - * Hook into the BlockBackend notifiers to close the export when the
>> - * backend is closed.
>> - */
>> -typedef struct NBDCloseNotifier {
>> -Notifier n;
>> -NBDExport *exp;
>> -QTAILQ_ENTRY(NBDCloseNotifier) next;
>> -} NBDCloseNotifier;
>> -
>> -static QTAILQ_HEAD(, NBDCloseNotifier) close_notifiers =
>> -QTAILQ_HEAD_INITIALIZER(close_notifiers);
>> -
>> -static void nbd_close_notifier(Notifier *n, void *data)
>> -{
>> -NBDCloseNotifier *cn = DO_UPCAST(NBDCloseNotifier, n, n);
>> -
>> -notifier_remove(&cn->n);
>> -QTAILQ_REMOVE(&close_notifiers, cn, next);
>> -
>> -nbd_export_close(cn->exp);
>> -nbd_export_put(cn->exp);
>> -g_free(cn);
>> -}
>> -
>>  void qmp_nbd_server_add(const char *device, bool has_writable, bool 
>> writable,
>>  Error **errp)
>>  {
>>  BlockBackend *blk;
>>  NBDExport *exp;
>> -NBDCloseNotifier *n;
>>  
>>  if (server_fd == -1) {
>>  error_setg(errp, "NBD server not running");
>> @@ -113,19 +87,15 @@ void qmp_nbd_server_add(const char *device, bool 
>> has_writable, bool writable,
>>  
>>  nbd_export_set_name(exp, device);
>>  
>> -n = g_new0(NBDCloseNotifier, 1);
>> -n->n.notify = nbd_close_notifier;
>> -n->exp = exp;
>> -blk_add_close_notifier(blk, &n->n);
>> -QTAILQ_INSERT_TAIL(&close_notifiers, n, next);
>> +/* The list of named exports has a strong reference to this export now 
>> and
>> + * our only way of accessing it is through nbd_export_find(), so we can 
>> drop
>> + * the strong reference that is @exp. */
> 
> Not quite sure about the meaning of "the strong reference that is @exp", I
> guess you mean the one reference born in nbd_export_new(), which would match
> the code.  Other than this,

Yes, the reference returned by nbd_export_new(), which is then stored in
@exp. This is a strong reference, so once @exp goes out of scope, the
reference counter has to be decremented.

Max

> Reviewed-by: Fam Zheng 
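
The reference handling discussed above can be modeled with a small sketch. Everything here is hypothetical Python, not the NBD server code: Export stands in for NBDExport, set_name() is assumed to take its own reference when it adds the export to the list of named exports, and server_add() drops the creator's reference once the list owns one.

```python
class Export:
    """Toy reference-counted object standing in for NBDExport."""

    def __init__(self):
        self.refcount = 1   # the creator owns one strong reference
        self.closed = False

    def get(self):
        self.refcount += 1

    def put(self):
        assert self.refcount > 0
        self.refcount -= 1
        if self.refcount == 0:
            self.closed = True  # last reference gone: release the export


named_exports = {}


def set_name(exp, name):
    # Assumption: the list of named exports holds its own strong reference.
    exp.get()
    named_exports[name] = exp


def server_add(name):
    exp = Export()       # refcount 1, held by the local variable
    set_name(exp, name)  # refcount 2, one held by the list
    exp.put()            # drop the local reference before leaving scope
    return named_exports[name]
```

After server_add() returns, the only remaining reference is the one held by the list, so the export stays alive exactly as long as it is findable by name.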




Re: [Qemu-devel] [PATCH v8 12/16] blockdev: Keep track of monitor-owned BDS

2016-01-29 Thread Max Reitz

On 28.01.2016 04:33, Fam Zheng wrote:
> On Wed, 01/27 18:59, Max Reitz wrote:
>> Signed-off-by: Max Reitz 
>> ---
>>  blockdev.c | 26 ++
>>  include/block/block_int.h  |  4 
>>  stubs/Makefile.objs|  1 +
>>  stubs/blockdev-close-all-bdrv-states.c |  5 +
>>  4 files changed, 36 insertions(+)
>>  create mode 100644 stubs/blockdev-close-all-bdrv-states.c
>>
>> diff --git a/blockdev.c b/blockdev.c
>> index 09d4621..ac93f43 100644
>> --- a/blockdev.c
>> +++ b/blockdev.c
>> @@ -50,6 +50,9 @@
>>  #include "trace.h"
>>  #include "sysemu/arch_init.h"
>>  
>> +static QTAILQ_HEAD(, BlockDriverState) monitor_bdrv_states =
>> +QTAILQ_HEAD_INITIALIZER(monitor_bdrv_states);
>> +
>>  static const char *const if_name[IF_COUNT] = {
>>  [IF_NONE] = "none",
>>  [IF_IDE] = "ide",
>> @@ -702,6 +705,19 @@ fail:
>>  return NULL;
>>  }
>>  
>> +void blockdev_close_all_bdrv_states(void)
>> +{
>> +BlockDriverState *bs, *next_bs;
>> +
>> +QTAILQ_FOREACH_SAFE(bs, &monitor_bdrv_states, monitor_list, next_bs) {
>> +AioContext *ctx = bdrv_get_aio_context(bs);
>> +
>> +aio_context_acquire(ctx);
>> +bdrv_unref(bs);
>> +aio_context_release(ctx);
>> +}
>> +}
>> +
>>  static void qemu_opt_rename(QemuOpts *opts, const char *from, const char 
>> *to,
>>  Error **errp)
>>  {
>> @@ -3875,12 +3891,15 @@ void qmp_blockdev_add(BlockdevOptions *options, 
>> Error **errp)
>>  if (!bs) {
>>  goto fail;
>>  }
>> +
>> +QTAILQ_INSERT_TAIL(&monitor_bdrv_states, bs, monitor_list);
>>  }
>>  
>>  if (bs && bdrv_key_required(bs)) {
>>  if (blk) {
>>  blk_unref(blk);
>>  } else {
>> +QTAILQ_REMOVE(&monitor_bdrv_states, bs, monitor_list);
>>  bdrv_unref(bs);
>>  }
>>  error_setg(errp, "blockdev-add doesn't support encrypted devices");
>> @@ -3945,11 +3964,18 @@ void qmp_x_blockdev_del(bool has_id, const char *id,
>> bdrv_get_device_or_node_name(bs));
>>  goto out;
>>  }
>> +
>> +if (!blk && !bs->monitor_list.tqe_prev) {
>> +error_setg(errp, "Node %s is not owned by the monitor",
>> +   bs->node_name);
>> +goto out;
>> +}
> 
> Is this an extra restriction added by this patch?

I hope not. This is just an additional check that should not change
behavior; if it does, we did something wrong.

>   Deserve some words in the
> commit message?

I'll see if I can come up with something.

Max

>>  }
>>  
>>  if (blk) {
>>  blk_unref(blk);
>>  } else {
>> +QTAILQ_REMOVE(&monitor_bdrv_states, bs, monitor_list);
>>  bdrv_unref(bs);
>>  }
>>  
>> diff --git a/include/block/block_int.h b/include/block/block_int.h
>> index 1e4c518..dd00d12 100644
>> --- a/include/block/block_int.h
>> +++ b/include/block/block_int.h
>> @@ -445,6 +445,8 @@ struct BlockDriverState {
>>  QTAILQ_ENTRY(BlockDriverState) device_list;
>>  /* element of the list of all BlockDriverStates (all_bdrv_states) */
>>  QTAILQ_ENTRY(BlockDriverState) bs_list;
>> +/* element of the list of monitor-owned BDS */
>> +QTAILQ_ENTRY(BlockDriverState) monitor_list;
>>  QLIST_HEAD(, BdrvDirtyBitmap) dirty_bitmaps;
>>  int refcnt;
>>  
>> @@ -707,4 +709,6 @@ bool bdrv_requests_pending(BlockDriverState *bs);
>>  void bdrv_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap **out);
>>  void bdrv_undo_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap *in);
>>  
>> +void blockdev_close_all_bdrv_states(void);
>> +
>>  #endif /* BLOCK_INT_H */
>> diff --git a/stubs/Makefile.objs b/stubs/Makefile.objs
>> index d7898a0..e922de9 100644
>> --- a/stubs/Makefile.objs
>> +++ b/stubs/Makefile.objs
>> @@ -1,5 +1,6 @@
>>  stub-obj-y += arch-query-cpu-def.o
>>  stub-obj-y += bdrv-commit-all.o
>> +stub-obj-y += blockdev-close-all-bdrv-states.o
>>  stub-obj-y += clock-warp.o
>>  stub-obj-y += cpu-get-clock.o
>>  stub-obj-y += cpu-get-icount.o
>> diff --git a/stubs/blockdev-close-all-bdrv-states.c 
>> b/stubs/blockdev-close-all-bdrv-states.c
>> new file mode 100644
>> index 000..12d2442
>> --- /dev/null
>> +++ b/stubs/blockdev-close-all-bdrv-states.c
>> @@ -0,0 +1,5 @@
>> +#include "block/block_int.h"
>> +
>> +void blockdev_close_all_bdrv_states(void)
>> +{
>> +}
>> -- 
>> 2.7.0
>>





[Qemu-devel] [PATCH v10 10/25] qapi: Improve generated event use of qapi visitor

2016-01-29 Thread Eric Blake

All other successful clients of visit_start_struct() were paired
with an unconditional visit_end_struct(); but the generated
code for events was relying on qmp_output_visitor_cleanup() to
work on an incomplete visit.  Alter the code to guarantee that
the struct is completed, which will make a future patch to
split visit_end_struct() easier to reason about.  While at it,
drop some assertions and comments that are not present in other
uses of the qmp output visitor, and pass NULL rather than "" as
the 'kind' parameter (matching most other uses where obj is NULL).

The changes to the generated code look like:

| qov = qmp_output_visitor_new();
|-g_assert(qov);
|-
| v = qmp_output_get_visitor(qov);
|-g_assert(v);
|
|-/* Fake visit, as if all members are under a structure */
|-visit_start_struct(v, NULL, "", "ACPI_DEVICE_OST", 0, &err);
|+visit_start_struct(v, NULL, NULL, "ACPI_DEVICE_OST", 0, &err);
| if (err) {
| goto out;
| }
| visit_type_ACPIOSTInfo(v, &info, "info", &err);
| if (err) {
|-goto out;
|+goto out_obj;
| }
|-visit_end_struct(v, &err);
|+out_obj:
|+visit_end_struct(v, err ? NULL : &err);
| if (err) {
| goto out;
| }
|
| obj = qmp_output_get_qobject(qov);
|-g_assert(obj != NULL);
|+g_assert(obj);
|
| qdict_put_obj(qmp, "data", obj);
| emit(QAPI_EVENT_ACPI_DEVICE_OST, qmp, &err);
|
|out:
| qmp_output_visitor_cleanup(qov);
| error_propagate(errp, err);

Note that the 'goto out_obj' with no intervening code before the
label, as well as the construct of 'err ? NULL : &err', are both
a bit unusual but also temporary; they get fixed in a later patch
that splits visit_end_struct() to drop its errp parameter by moving
some checking before the label.  But until that time, this was the
simplest way to avoid the appearance of passing a possibly-set
error to visit_end_struct(), even though actual code inspection
shows that visit_end_struct() for a QMP output visitor will never
set an error.

Signed-off-by: Eric Blake 

---
v10: avoid appearance of &err misuse; enhance commit message
v9: save churn in declaration order for later series on boxed params,
drop Marc-Andre's R-b
v8: no change
v7: place earlier in series, adjust handling of 'kind'
v6: new patch

If desired, I can defer the hunk re-ordering the declaration of
obj to later in the series where it actually comes in handy.
---
 scripts/qapi-event.py | 16 +++-
 scripts/qapi.py   |  5 +++--
 2 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/scripts/qapi-event.py b/scripts/qapi-event.py
index 720486f..0f5534f 100644
--- a/scripts/qapi-event.py
+++ b/scripts/qapi-event.py
@@ -2,7 +2,7 @@
 # QAPI event generator
 #
 # Copyright (c) 2014 Wenchao Xia
-# Copyright (c) 2015 Red Hat Inc.
+# Copyright (c) 2015-2016 Red Hat Inc.
 #
 # Authors:
 #  Wenchao Xia 
@@ -61,25 +61,23 @@ def gen_event_send(name, arg_type):
 if arg_type and arg_type.members:
 ret += mcgen('''
 qov = qmp_output_visitor_new();
-g_assert(qov);
-
 v = qmp_output_get_visitor(qov);
-g_assert(v);

-/* Fake visit, as if all members are under a structure */
-visit_start_struct(v, NULL, "", "%(name)s", 0, &err);
+visit_start_struct(v, NULL, NULL, "%(name)s", 0, &err);
 ''',
  name=name)
 ret += gen_err_check()
-ret += gen_visit_fields(arg_type.members, need_cast=True)
+ret += gen_visit_fields(arg_type.members, need_cast=True,
+label='out_obj')
 ret += mcgen('''
-visit_end_struct(v, &err);
+out_obj:
+visit_end_struct(v, err ? NULL : &err);
 if (err) {
 goto out;
 }

 obj = qmp_output_get_qobject(qov);
-g_assert(obj != NULL);
+g_assert(obj);

 qdict_put_obj(qmp, "data", obj);
 ''')
diff --git a/scripts/qapi.py b/scripts/qapi.py
index 37aa6fe..43b3251 100644
--- a/scripts/qapi.py
+++ b/scripts/qapi.py
@@ -1636,7 +1636,8 @@ def gen_err_check(label='out', skiperr=False):
  label=label)


-def gen_visit_fields(members, prefix='', need_cast=False, skiperr=False):
+def gen_visit_fields(members, prefix='', need_cast=False, skiperr=False,
+ label='out'):
 ret = ''
 if skiperr:
 errparg = 'NULL'
@@ -1664,7 +1665,7 @@ def gen_visit_fields(members, prefix='', need_cast=False, 
skiperr=False):
  c_type=memb.type.c_name(), prefix=prefix, cast=cast,
  c_name=c_name(memb.name), name=memb.name,
  errp=errparg)
-ret += gen_err_check(skiperr=skiperr)
+ret += gen_err_check(skiperr=skiperr, label=label)

 if memb.optional:
 pop_indent()
-- 
2.5.0
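
The effect of the new label parameter can be sketched with a simplified stand-in for the generator helpers. This is not the code in scripts/qapi.py: the two functions below reuse the names gen_err_check and gen_visit_fields but are reduced, hypothetical versions that take (C type, member name) pairs and omit mcgen(), casts, prefixes and optional members.

```python
def gen_err_check(label='out', skiperr=False):
    """Emit the C error check that jumps to 'label' (simplified)."""
    if skiperr:
        return ''
    return ('    if (err) {\n'
            '        goto %s;\n'
            '    }\n' % label)


def gen_visit_fields(members, label='out'):
    """Emit one visit call per member, each followed by an error check."""
    ret = ''
    for c_type, name in members:
        # Argument order matches this point in the series (obj before name).
        ret += '    visit_type_%s(v, &%s, "%s", &err);\n' % (c_type, name, name)
        ret += gen_err_check(label=label)
    return ret


print(gen_visit_fields([('ACPIOSTInfo', 'info')], label='out_obj'))
```

Passing label='out_obj' makes every member visit jump to the label placed just before visit_end_struct(), so the struct visit is always completed, while callers that omit the argument keep jumping to 'out'.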

[Qemu-devel] [PATCH v10 18/25] qapi: Swap visit_* arguments for consistent 'name' placement

2016-01-29 Thread Eric Blake

JSON uses "name":value, but many of our visitor interfaces were
called with visit_type_FOO(v, &obj, name, errp).  It can be
a bit confusing to have to mentally swap the parameter order to
match JSON order.  It's particularly bad for visit_start_struct(),
where the 'name' parameter is smack in the middle of the
otherwise-related group of 'obj, kind, size' parameters! It's
time to do a global swap of the parameter ordering, so that the
'name' parameter is always immediately after the Visitor argument.

Additional reason in favor of the swap: the existing include/qjson.h
prefers listing 'name' first in json_prop_*(), and I have plans to
unify that file with the qapi visitors; listing 'name' first in
qapi will minimize churn to the (admittedly few) qjson.h clients.

Later patches will then fix docs, object.h, visitor-impl.h, and
those clients to match.

Done by first patching scripts/qapi*.py by hand to make generated
files do what I want, then by running the following Coccinelle
script to affect the rest of the code base:
 $ spatch --sp-file script `git grep -l '\bvisit_' -- '**/*.[ch]'`
I then had to apply some touchups (Coccinelle insisted on TAB
indentation in visitor.h, and botched the signature of
visit_type_enum() by rewriting 'const char *const strings[]' to
the syntactically invalid 'const char*const[] strings').  The
movement of parameters is sufficient to provoke compiler errors
if any callers were missed.

// Part 1: Swap declaration order
@@
type TV, TErr, TObj, T1, T2;
identifier OBJ, ARG1, ARG2;
@@
 void visit_start_struct
-(TV v, TObj OBJ, T1 ARG1, const char *name, T2 ARG2, TErr errp)
+(TV v, const char *name, TObj OBJ, T1 ARG1, T2 ARG2, TErr errp)
 { ... }

@@
type bool, TV, T1;
identifier ARG1;
@@
 bool visit_optional
-(TV v, T1 ARG1, const char *name)
+(TV v, const char *name, T1 ARG1)
 { ... }

@@
type TV, TErr, TObj, T1;
identifier OBJ, ARG1;
@@
 void visit_get_next_type
-(TV v, TObj OBJ, T1 ARG1, const char *name, TErr errp)
+(TV v, const char *name, TObj OBJ, T1 ARG1, TErr errp)
 { ... }

@@
type TV, TErr, TObj, T1, T2;
identifier OBJ, ARG1, ARG2;
@@
 void visit_type_enum
-(TV v, TObj OBJ, T1 ARG1, T2 ARG2, const char *name, TErr errp)
+(TV v, const char *name, TObj OBJ, T1 ARG1, T2 ARG2, TErr errp)
 { ... }

@@
type TV, TErr, TObj;
identifier OBJ;
identifier VISIT_TYPE =~ "^visit_type_";
@@
 void VISIT_TYPE
-(TV v, TObj OBJ, const char *name, TErr errp)
+(TV v, const char *name, TObj OBJ, TErr errp)
 { ... }

// Part 2: swap caller order
@@
expression V, NAME, OBJ, ARG1, ARG2, ERR;
identifier VISIT_TYPE =~ "^visit_type_";
@@
(
-visit_start_struct(V, OBJ, ARG1, NAME, ARG2, ERR)
+visit_start_struct(V, NAME, OBJ, ARG1, ARG2, ERR)
|
-visit_optional(V, ARG1, NAME)
+visit_optional(V, NAME, ARG1)
|
-visit_get_next_type(V, OBJ, ARG1, NAME, ERR)
+visit_get_next_type(V, NAME, OBJ, ARG1, ERR)
|
-visit_type_enum(V, OBJ, ARG1, ARG2, NAME, ERR)
+visit_type_enum(V, NAME, OBJ, ARG1, ARG2, ERR)
|
-VISIT_TYPE(V, OBJ, NAME, ERR)
+VISIT_TYPE(V, NAME, OBJ, ERR)
)
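
The argument movement performed by the semantic patch can be summarized in a few lines. The sketch below is a plain Python model of the reordering, not a replacement for Coccinelle: reorder_args is a hypothetical helper that works on an already-split argument list, and the index table is taken from the old signatures shown in the script above.

```python
# Position of 'name' in the OLD argument order, per function.
OLD_NAME_INDEX = {
    "visit_start_struct": 3,   # (v, obj, kind, name, size, errp)
    "visit_optional": 2,       # (v, present, name)
    "visit_get_next_type": 3,  # (v, obj, arg, name, errp)
    "visit_type_enum": 4,      # (v, obj, arg1, arg2, name, errp)
}


def reorder_args(func, args):
    """Return args with 'name' moved to directly follow the visitor."""
    index = OLD_NAME_INDEX.get(func)
    if index is None:
        if not func.startswith("visit_type_"):
            return list(args)  # not a visitor call: leave untouched
        index = 2              # generic visit_type_FOO(v, obj, name, errp)
    args = list(args)
    name = args.pop(index)
    args.insert(1, name)
    return args
```

For example, the old call visit_type_str(v, &type, "qom-type", &err) becomes visit_type_str(v, "qom-type", &type, &err) under this mapping.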

Signed-off-by: Eric Blake 
Reviewed-by: Marc-André Lureau 

---
v10: commit message improvement, rebase to earlier changes
v9: mention later docs cleanup in commit message, rebase to master
v8: new patch
---
 include/qapi/visitor.h | 52 +++--
 scripts/qapi-commands.py   |  4 +-
 scripts/qapi-event.py  |  2 +-
 scripts/qapi-types.py  |  2 +-
 scripts/qapi-visit.py  | 28 ++---
 scripts/qapi.py|  4 +-
 qapi/qapi-visit-core.c | 54 +
 backends/hostmem.c |  8 ++--
 block/qapi.c   |  2 +-
 blockdev.c |  4 +-
 bootdevice.c   |  4 +-
 hmp.c  |  8 ++--
 hw/acpi/core.c |  4 +-
 hw/acpi/ich9.c | 14 +++
 hw/block/nvme.c|  4 +-
 hw/core/machine.c  | 10 ++---
 hw/core/qdev-properties-system.c   | 12 +++---
 hw/core/qdev-properties.c  | 66 +++
 hw/core/qdev.c |  2 +-
 hw/i386/pc.c   | 14 +++
 hw/ide/qdev.c  |  4 +-
 hw/intc/xics.c |  8 ++--
 hw/isa/lpc_ich9.c  |  2 +-
 hw/mem/pc-dimm.c   |  2 +-
 hw/misc/edu.c  |  2 +-
 hw/misc/tmp105.c   |  4 +-
 hw/net/ne2000-isa.c|  4 +-
 hw/pci-host/piix.c |  8 ++--
 hw/pci-host/q35.c  | 10 ++---
 hw/ppc/spapr_drc.c | 12 +++---
 hw/usb/dev-storage.c

Re: [Qemu-devel] [PATCH] build: Add include check on syscall.h

2016-01-29 Thread Lluís Vilanova

Peter Maydell writes:

> On 29 January 2016 at 13:30, Lluís Vilanova  wrote:
>> Peter Maydell writes:
>>> Adding include guards is fine, but it sounds to me like what we
>>> should actually do to fix this confusion is rename all the linux-user
>>> local headers to target_syscall.h.
>> 
>> Hmmm, I didn't know if using the same name was on purpose or not. If the
>> intention was *not* to override the system's syscall.h, then a rename is the
>> proper solution.


> Yes, the intention is absolutely not to override any system header
> (the constants defined are only relevant to the guest, and if the
> header got included and overrode the host's syscall.h then nothing
> would work and it probably wouldn't even compile). It just ended up
> with the same name by accident.

Aha, then I'll resend with the files renamed.

Cheers,
  Lluis

[Qemu-devel] [PATCH] arm: virt-acpi: each MADT.GICC entry as enabled unconditionally

2016-01-29 Thread Igor Mammedov

In the current implementation, the condition

build_madt() {
  ...
  if (test_bit(i, cpuinfo->found_cpus))

is always true, since the loop handles only present CPUs
in the range [0..smp_cpus).
But to fill the useless cpuinfo->found_cpus we do an unnecessary
scan over the QOM tree to find the same CPUs.
So always mark GICC as present and drop the unneeded
code that fills cpuinfo->found_cpus.

Signed-off-by: Igor Mammedov 
---
It's just a simple cleanup, but I'm trying to generalize the
CPU-related ACPI tables a bit and, as part of that, get rid
of the found_cpus bitmap and, if possible, the cpu_index usage
in the ACPI parts of the code.
---
 hw/arm/virt-acpi-build.c | 26 +++---
 1 file changed, 3 insertions(+), 23 deletions(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 87fbe7c..3ed39fc 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -46,20 +46,6 @@
 #define ARM_SPI_BASE 32
 #define ACPI_POWER_BUTTON_DEVICE "PWRB"
 
-typedef struct VirtAcpiCpuInfo {
-DECLARE_BITMAP(found_cpus, VIRT_ACPI_CPU_ID_LIMIT);
-} VirtAcpiCpuInfo;
-
-static void virt_acpi_get_cpu_info(VirtAcpiCpuInfo *cpuinfo)
-{
-CPUState *cpu;
-
-memset(cpuinfo->found_cpus, 0, sizeof cpuinfo->found_cpus);
-CPU_FOREACH(cpu) {
-set_bit(cpu->cpu_index, cpuinfo->found_cpus);
-}
-}
-
 static void acpi_dsdt_add_cpus(Aml *scope, int smp_cpus)
 {
 uint16_t i;
@@ -458,8 +444,7 @@ build_gtdt(GArray *table_data, GArray *linker)
 
 /* MADT */
 static void
-build_madt(GArray *table_data, GArray *linker, VirtGuestInfo *guest_info,
-   VirtAcpiCpuInfo *cpuinfo)
+build_madt(GArray *table_data, GArray *linker, VirtGuestInfo *guest_info)
 {
 int madt_start = table_data->len;
 const MemMapEntry *memmap = guest_info->memmap;
@@ -489,9 +474,7 @@ build_madt(GArray *table_data, GArray *linker, 
VirtGuestInfo *guest_info,
 gicc->cpu_interface_number = i;
 gicc->arm_mpidr = armcpu->mp_affinity;
 gicc->uid = i;
-if (test_bit(i, cpuinfo->found_cpus)) {
-gicc->flags = cpu_to_le32(ACPI_GICC_ENABLED);
-}
+gicc->flags = cpu_to_le32(ACPI_GICC_ENABLED);
 }
 
 if (guest_info->gic_version == 3) {
@@ -599,11 +582,8 @@ void virt_acpi_build(VirtGuestInfo *guest_info, 
AcpiBuildTables *tables)
 {
 GArray *table_offsets;
 unsigned dsdt, rsdt;
-VirtAcpiCpuInfo cpuinfo;
 GArray *tables_blob = tables->table_data;
 
-virt_acpi_get_cpu_info(&cpuinfo);
-
 table_offsets = g_array_new(false, true /* clear */,
 sizeof(uint32_t));
 
@@ -630,7 +610,7 @@ void virt_acpi_build(VirtGuestInfo *guest_info, 
AcpiBuildTables *tables)
 build_fadt(tables_blob, tables->linker, dsdt);
 
 acpi_add_table(table_offsets, tables_blob);
-build_madt(tables_blob, tables->linker, guest_info, &cpuinfo);
+build_madt(tables_blob, tables->linker, guest_info);
 
 acpi_add_table(table_offsets, tables_blob);
 build_gtdt(tables_blob, tables->linker);
-- 
1.8.3.1

Re: [Qemu-devel] [PULL 00/39] ppc-for-2.6 queue 20160129

2016-01-29 Thread Peter Maydell

On 29 January 2016 at 05:06, David Gibson <da...@gibson.dropbear.id.au> wrote:
> The following changes since commit 357e81c7e880f868833edf9f53cce1f3b09ea8ec:
>
>   Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20160128' into 
> staging (2016-01-28 11:46:34 +)
>
> are available in the git repository at:
>
>   git://github.com/dgibson/qemu.git tags/ppc-for-2.6-20160129
>
> for you to fetch changes up to 1699679e699276c0538008f6ca74cd04e6c68b42:
>
>   target-ppc: Make every FPSCR_ macro have a corresponding FP_ macro 
> (2016-01-29 14:01:52 +1100)
>
> 
> ppc patch queue for 2016-01-29
>
> Currently accumulated patches for target-ppc, pseries machine type and
> related devices.
>   * Cleanup of error handling code in spapr
>   * A number of fixes for Macintosh devices for the benefit of MacOS 9 and X
>   * Remove some abuses of the RTAS memory access functions in spapr
>   * Fixes for the gdbstub (and monitor debug) for VMX and VSX extensions.
>   * Fix pseries machine hotplug memory under TCG
>   * Clean up and extend handling of multiple page sizes with 64-bit hash MMUs
>

Hi. Unfortunately this generates errors when built with clang:

/home/petmay01/linaro/qemu-for-merges/target-ppc/mmu_helper.c:660:20:
error: unused function 'ppc4xx_tlb_invalidate_virt'
[-Werror,-Wunused-function]
static inline void ppc4xx_tlb_invalidate_virt(CPUPPCState *env,
   ^
1 error generated.

The function does appear from a quick grep to be entirely unused...

(GCC doesn't complain about this because it doesn't warn about unused
static inline functions in a .c file, but clang does.)

thanks
-- PMM

Re: [Qemu-devel] [PATCH 1/1] arm: virt: change GPIO trigger interrupt to pulse

2016-01-29 Thread Shannon Zhao




On 2016/1/29 22:35, Wei Huang wrote:



On 01/29/2016 04:10 AM, Shannon Zhao wrote:

Hi,

This makes ACPI work well but breaks DT. The reason is that systemd or
acpid fails to open /dev/input/event0. So the interrupt is injected and
shows up in /proc/interrupts, but the guest takes no action. I'll
investigate later why the open fails.


That is interesting. Could you try it with the following? This reverses
the order to down-up and worked on ACPI case.


Yeah, that's very weird.


qemu_set_irq(qdev_get_gpio_in(pl061_dev, 3), 0);
qemu_set_irq(qdev_get_gpio_in(pl061_dev, 3), 1);


I'll try this tomorrow. But even if this works, it's still weird.


Thanks,
-Wei



On Friday, January 29, 2016, Wei Huang wrote:


When QEMU is hooked up with libvirt/virsh, the first ACPI reboot
request will succeed; but the following shutdown/reboot requests
fail to trigger VMs to react. Notice that in mach-virt machine
model GPIO is defined as edge-triggered and active-high in ACPI.
This patch changes the behavior of powerdown notifier from PULLUP
to PULSE. It solves the problem described above (i.e. reboot
continues to work).

Signed-off-by: Wei Huang
---
  hw/arm/virt.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 05f9087..b5468a9 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -546,7 +546,7 @@ static DeviceState *pl061_dev;
  static void virt_powerdown_req(Notifier *n, void *opaque)
  {
  /* use gpio Pin 3 for power button event */
-qemu_set_irq(qdev_get_gpio_in(pl061_dev, 3), 1);
+qemu_irq_pulse(qdev_get_gpio_in(pl061_dev, 3));
  }

  static Notifier virt_system_powerdown_notifier = {
--
1.8.3.1






--
Shannon

Re: [Qemu-devel] help with understanding qcow2 file format

2016-01-29 Thread lspnet

Hi, I have read 2015-qcow2-expanded.pdf and qcow2.txt,
so I understand how to convert an offset in the virtual disk to the
offset in the image file (qcow2).

But I would like to know how to convert a block number reported by ext4
to the offset in the virtual disk. Please help me.

The file's block information is below:

cloud@cloud-pc:$ sudo debugfs /dev/vda1
debugfs: blocks /home/cloud/test
347008

cloud@cloud-pc:$ stat /home/cloud/test
File: /home/cloud/test
size:8   Blocks:8   IO Block:4096  regular file
Device: fd01h/64769d  Inode:131601  Links:1

The guest OS's file system information is below (Ubuntu 14.04, ext4):

cloud@cloud-pc:$ fdisk -l 
Disk /dev/vda: 8589 MB, 8589934592 bytes
16 heads, 63 sectors/track, 16644 cylinders, total 16777216 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x05f4

   Device Boot      Start         End      Blocks   Id  System
/dev/vda1   *        2048    12582911     6290432   83  Linux
/dev/vda2        12584958    16775167     2095105    5  Extended
/dev/vda5        12584960    16775167     2095104   82  Linux swap / Solaris

cloud@cloud-pc:$ dumpe2fs -h /dev/vad1
Filesystem volume name:   
Last mounted on:  /
Filesystem UUID:  bbcbbb0e-a335-46fe-b829-b4bf7bef513b
Filesystem magic number:  0xEF53
Filesystem revision #:1 (dynamic)
Filesystem features:  has_journal ext_attr resize_inode dir_index 
filetype needs_recovery extent flex_bg sparse_super large_file 
huge_file uninit_bg dir_nlink 
extra_isize
Filesystem flags: signed_directory_hash 
Default mount options:user_xattr acl
Filesystem state: clean
Errors behavior:  Continue
Filesystem OS type:   Linux
Inode count:  393216
Block count:  1572608
Reserved block count: 78630
Free blocks:  716304
Free inodes:  215296
First block:  0
Block size:   4096
Fragment size:4096
Reserved GDT blocks:  383
Blocks per group: 32768
Fragments per group:  32768
Inodes per group: 8192
Inode blocks per group:   512
Flex block group size:16
Filesystem created:   Thu Jan 21 14:42:55 2016
Last mount time:  Fri Jan 29 21:25:03 2016
Last write time:  Fri Jan 29 21:25:03 2016
Mount count:  7
Maximum mount count:  -1
Last checked: Thu Jan 21 14:42:55 2016
Check interval:   0 ()
Lifetime writes:  4221 MB
Reserved blocks uid:  0 (user root)
Reserved blocks gid:  0 (group root)
First inode:  11
Inode size:   256
Required extra isize: 28
Desired extra isize:  28
Journal inode:8
First orphan inode:   131699
Default directory hash:   half_md4
Directory Hash Seed:  475ee556-9a7f-4c7b-93c3-21249906efea
Journal backup:   inode blocks
Journal features: journal_incompat_revoke
Journal size: 128M
Journal length:   32768
Journal sequence: 0x0dc8
Journal start:18

cloud@cloud-pc:$ cat /etc/fstab
# /etc/fstab: static file system information.
#
# / was on /dev/vda1 during installation
UUID=bbcbbb0e-a335-46fe-b829-b4bf7bef513b /
   ext4errors=remount-ro 0   1
# swap was on /dev/vda5 during installation
UUID=e2e3ec5c-dc4c-4f5d-a176-0f166b419785 none  swap  sw   0  0
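
One way to approach the conversion asked about above is sketched below. This is a hedged Python example, not an authoritative answer: fs_block_to_disk_offset is a hypothetical helper, and it assumes the file system starts at the first sector of the partition (no LVM or other mapping layer in between) and that the block number printed by debugfs counts file system blocks from the start of that file system. The inputs are the values posted above: block 347008, block size 4096, /dev/vda1 starting at sector 2048 with 512-byte sectors.

```python
SECTOR_SIZE = 512  # logical sector size reported by fdisk above


def fs_block_to_disk_offset(block, block_size, part_start_sector,
                            sector_size=SECTOR_SIZE):
    """Map a file system block number to a byte offset in the virtual disk."""
    partition_offset = part_start_sector * sector_size  # bytes before the fs
    return partition_offset + block * block_size


offset = fs_block_to_disk_offset(block=347008, block_size=4096,
                                 part_start_sector=2048)
print(offset)  # prints 1422393344
```

Under these assumptions, the resulting byte offset is the guest offset that the qcow2 L1/L2 table lookup described in qcow2.txt takes as its input.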

Re: [Qemu-devel] [PATCH] build: Add include check on syscall.h

2016-01-29 Thread Lluís Vilanova

Peter Maydell writes:

> On 28 January 2016 at 20:36, Lluís Vilanova  wrote:
>> The LTTng tracing backend includes the system's "syscall.h", but QEMU
>> replaces it with its own for linux-user builds. This results in a double
>> include on some targets (when LTTng is enabled).
>> 
>> Signed-off-by: Lluís Vilanova 

> Adding include guards is fine, but it sounds to me like what we
> should actually do to fix this confusion is rename all the linux-user
> local headers to target_syscall.h.

Hmmm, I didn't know if using the same name was on purpose or not. If the
intention was *not* to override the system's syscall.h, then a rename is the
proper solution.

Can you or someone else confirm this?

Thanks,
  Lluis

[Qemu-devel] [PATCH v10 07/25] hmp: Cache use of qapi visitor

2016-01-29 Thread Eric Blake

Cache the visitor in a local variable instead of repeatedly
calling the accessor.

Signed-off-by: Eric Blake 
Reviewed-by: Marc-André Lureau 

---
v10: resplit 4/37 and 5/37 by action rather than file, retain R-b.
v9: no change
v8: no change
v7: place earlier in series, drop attempts to provide a 'kind' string,
drop bogus avoidance of qmp_object_del() on error
v6: new patch, split from RFC on v5 7/46
---
 hmp.c | 12 +++-
 vl.c  | 12 +++-
 2 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/hmp.c b/hmp.c
index 9537f7b..6d67f9b 100644
--- a/hmp.c
+++ b/hmp.c
@@ -1658,6 +1658,7 @@ void hmp_object_add(Monitor *mon, const QDict *qdict)
     char *id = NULL;
     OptsVisitor *ov;
     QDict *pdict;
+    Visitor *v;

     opts = qemu_opts_from_qdict(qemu_find_opts("object"), qdict, &err);
     if (err) {
@@ -1666,28 +1667,29 @@ void hmp_object_add(Monitor *mon, const QDict *qdict)

     ov = opts_visitor_new(opts);
     pdict = qdict_clone_shallow(qdict);
+    v = opts_get_visitor(ov);

-    visit_start_struct(opts_get_visitor(ov), NULL, NULL, NULL, 0, &err);
+    visit_start_struct(v, NULL, NULL, NULL, 0, &err);
     if (err) {
         goto out_clean;
     }

     qdict_del(pdict, "qom-type");
-    visit_type_str(opts_get_visitor(ov), &type, "qom-type", &err);
+    visit_type_str(v, &type, "qom-type", &err);
     if (err) {
         goto out_end;
     }

     qdict_del(pdict, "id");
-    visit_type_str(opts_get_visitor(ov), &id, "id", &err);
+    visit_type_str(v, &id, "id", &err);
     if (err) {
         goto out_end;
     }

-    object_add(type, id, pdict, opts_get_visitor(ov), &err);
+    object_add(type, id, pdict, v, &err);

 out_end:
-    visit_end_struct(opts_get_visitor(ov), &err_end);
+    visit_end_struct(v, &err_end);
     if (!err && err_end) {
         qmp_object_del(id, NULL);
     }
diff --git a/vl.c b/vl.c
index b96590a..24d30f2 100644
--- a/vl.c
+++ b/vl.c
@@ -2827,17 +2827,19 @@ static int object_create(void *opaque, QemuOpts *opts, Error **errp)
     OptsVisitor *ov;
     QDict *pdict;
     bool (*type_predicate)(const char *) = opaque;
+    Visitor *v;

     ov = opts_visitor_new(opts);
     pdict = qemu_opts_to_qdict(opts, NULL);
+    v = opts_get_visitor(ov);

-    visit_start_struct(opts_get_visitor(ov), NULL, NULL, NULL, 0, &err);
+    visit_start_struct(v, NULL, NULL, NULL, 0, &err);
     if (err) {
         goto out;
     }

     qdict_del(pdict, "qom-type");
-    visit_type_str(opts_get_visitor(ov), &type, "qom-type", &err);
+    visit_type_str(v, &type, "qom-type", &err);
     if (err) {
         goto out;
     }
@@ -2846,16 +2848,16 @@ static int object_create(void *opaque, QemuOpts *opts, Error **errp)
     }

     qdict_del(pdict, "id");
-    visit_type_str(opts_get_visitor(ov), &id, "id", &err);
+    visit_type_str(v, &id, "id", &err);
     if (err) {
         goto out;
     }

-    object_add(type, id, pdict, opts_get_visitor(ov), &err);
+    object_add(type, id, pdict, v, &err);
     if (err) {
         goto out;
     }
-    visit_end_struct(opts_get_visitor(ov), &err);
+    visit_end_struct(v, &err);
     if (err) {
         qmp_object_del(id, NULL);
     }
-- 
2.5.0

[Qemu-devel] [PATCH v10 11/25] qapi: Track all failures between visit_start/stop

2016-01-29 Thread Eric Blake

Inside the generated code between visit_start_struct() and
visit_end_struct(), we were blindly setting the error into
the caller's errp parameter.  But a future patch to split
visit_end_struct() will require that we take action based
on whether an error has occurred, which requires us to track
all actions through a local err.  Rewrite the visits to be
more in line with the other generated calls.

Generated code changes look like:

|     visit_start_struct(v, (void **)obj, "Abort", name, sizeof(Abort), &err);
|-    if (!err) {
|-        if (*obj) {
|-            visit_type_Abort_fields(v, obj, errp);
|-        }
|-        visit_end_struct(v, &err);
|+    if (err) {
|+        goto out;
|     }
|+    if (!*obj) {
|+        goto out_obj;
|+    }
|+    visit_type_Abort_fields(v, obj, &err);
|+    error_propagate(errp, err);
|+    err = NULL;
|+out_obj:
|+    visit_end_struct(v, &err);
|+out:
|     error_propagate(errp, err);
| }

Signed-off-by: Eric Blake 

---
v10: enhance commit message, move out_obj label and drop R-b
v9: enhance commit message
v8: no change
v7: place earlier in series
v6: based loosely on v5 7/46, but mostly a rewrite to get the last
generated code in the same form as all the others, so that the
later conversion to split visit_check_struct() will be easier
---
 scripts/qapi-visit.py | 18 --
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/scripts/qapi-visit.py b/scripts/qapi-visit.py
index b93690b..ec16e36 100644
--- a/scripts/qapi-visit.py
+++ b/scripts/qapi-visit.py
@@ -2,7 +2,7 @@
 # QAPI visitor generator
 #
 # Copyright IBM, Corp. 2011
-# Copyright (C) 2014-2015 Red Hat, Inc.
+# Copyright (C) 2014-2016 Red Hat, Inc.
 #
 # Authors:
 #  Anthony Liguori 
@@ -123,12 +123,18 @@ void visit_type_%(c_name)s(Visitor *v, %(c_name)s **obj, const char *name, Error
     Error *err = NULL;

     visit_start_struct(v, (void **)obj, "%(name)s", name, sizeof(%(c_name)s), &err);
-    if (!err) {
-        if (*obj) {
-            visit_type_%(c_name)s_fields(v, obj, errp);
-        }
-        visit_end_struct(v, &err);
+    if (err) {
+        goto out;
     }
+    if (!*obj) {
+        goto out_obj;
+    }
+    visit_type_%(c_name)s_fields(v, obj, &err);
+    error_propagate(errp, err);
+    err = NULL;
+out_obj:
+    visit_end_struct(v, &err);
+out:
     error_propagate(errp, err);
 }
 ''',
-- 
2.5.0

[Qemu-devel] [PATCH v10 03/25] qapi: Drop dead dealloc visitor variable

2016-01-29 Thread Eric Blake

Commit 0b9d8542 added StackEntry.is_list_head, but forgot to
delete the now-unused QapiDeallocVisitor.is_list_head.

Signed-off-by: Eric Blake 
Reviewed-by: Marc-André Lureau 

---
v10: no change
v9: no change
v8: no change
v7: new patch
---
 qapi/qapi-dealloc-visitor.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/qapi/qapi-dealloc-visitor.c b/qapi/qapi-dealloc-visitor.c
index 737deab..204de8f 100644
--- a/qapi/qapi-dealloc-visitor.c
+++ b/qapi/qapi-dealloc-visitor.c
@@ -28,7 +28,6 @@ struct QapiDeallocVisitor
 {
 Visitor visitor;
 QTAILQ_HEAD(, StackEntry) stack;
-bool is_list_head;
 };

 static QapiDeallocVisitor *to_qov(Visitor *v)
-- 
2.5.0

[Qemu-devel] [PATCH v10 13/25] qapi: Prefer type_int64 over type_int in visitors

2016-01-29 Thread Eric Blake

The qapi builtin type 'int' is basically shorthand for the type
'int64'.  In fact, since no visitor was providing the optional
type_int64() callback, visit_type_int64() was just always falling
back to type_int(), cementing the equivalence between the types.

However, some visitors are providing a type_uint64() callback.
For purposes of code consistency, it is nicer if all visitors
use the paired type_int64/type_uint64 names rather than the
mismatched type_int/type_uint64.  So this patch just renames
the signed int callbacks in place, dropping the type_int()
callback as redundant, and a later patch will focus on the
unsigned int callbacks.

Add some FIXMEs to questionable reuse of errp in code touched
by the rename, while at it (the reuse works as long as the
callbacks don't modify value when setting an error, but it's not
a good example to set) - a later patch will then fix those.
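
To make the hazard behind those FIXMEs concrete, here is a minimal
standalone sketch of a narrow-integer visit that falls back to a 64-bit
callback.  It is not QEMU code: the DemoInt64Fn callback type, the
demo_input_*() callbacks and demo_visit_uint8() are hypothetical names,
and a plain bool return stands in for QEMU's Error ** reporting.  The
point is only that the range check should run after the callback is known
to have succeeded, so the result does not depend on whether a failing
callback touched the value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a type_int64() callback: false means error. */
typedef bool (*DemoInt64Fn)(int64_t *obj);

/* Demo callbacks: two that succeed, one that fails after scribbling. */
static bool demo_input_42(int64_t *obj)   { *obj = 42;  return true; }
static bool demo_input_300(int64_t *obj)  { *obj = 300; return true; }
static bool demo_input_fail(int64_t *obj) { *obj = -1;  return false; }

/* Visit a uint8_t through the 64-bit callback, then range-check it. */
static bool demo_visit_uint8(DemoInt64Fn type_int64, uint8_t *obj)
{
    int64_t value;

    assert(type_int64 && obj);
    value = *obj;
    if (!type_int64(&value)) {
        /* Callback failed: report that error, never inspect value. */
        return false;
    }
    if (value < 0 || value > UINT8_MAX) {
        /* Callback succeeded but the value does not fit. */
        return false;
    }
    *obj = (uint8_t)value;
    return true;
}
```

With these definitions, demo_visit_uint8(demo_input_300, &x) fails on the
range check, while demo_visit_uint8(demo_input_fail, &x) fails before the
range check is reached; in both cases x is left untouched.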

No change in functionality here, although further cleanups are
in the pipeline.

Signed-off-by: Eric Blake 

---
v10: reword comments for less churn in next patch
v9: improve commit message, hoist in part of 11/35, drop
Marc-Andre's R-b
v8: no change
v7: split off of 1/23 and 2/23 for easier-to-read diffs
---
 include/qapi/visitor-impl.h  |  6 --
 qapi/qapi-visit-core.c   | 36 ++--
 qapi/opts-visitor.c  |  4 ++--
 qapi/qapi-dealloc-visitor.c  |  6 +++---
 qapi/qmp-input-visitor.c |  6 +++---
 qapi/qmp-output-visitor.c|  6 +++---
 qapi/string-input-visitor.c  |  6 +++---
 qapi/string-output-visitor.c |  6 +++---
 8 files changed, 43 insertions(+), 33 deletions(-)

diff --git a/include/qapi/visitor-impl.h b/include/qapi/visitor-impl.h
index f314894..319efe8 100644
--- a/include/qapi/visitor-impl.h
+++ b/include/qapi/visitor-impl.h
@@ -36,7 +36,10 @@ struct Visitor
 void (*get_next_type)(Visitor *v, QType *type, bool promote_int,
   const char *name, Error **errp);

-void (*type_int)(Visitor *v, int64_t *obj, const char *name, Error **errp);
+/* Must be set. */
+void (*type_int64)(Visitor *v, int64_t *obj, const char *name,
+   Error **errp);
+/* Must be set. */
 void (*type_bool)(Visitor *v, bool *obj, const char *name, Error **errp);
 void (*type_str)(Visitor *v, char **obj, const char *name, Error **errp);
 void (*type_number)(Visitor *v, double *obj, const char *name,
@@ -54,7 +57,6 @@ struct Visitor
 void (*type_int8)(Visitor *v, int8_t *obj, const char *name, Error **errp);
 void (*type_int16)(Visitor *v, int16_t *obj, const char *name, Error **errp);
 void (*type_int32)(Visitor *v, int32_t *obj, const char *name, Error **errp);
-void (*type_int64)(Visitor *v, int64_t *obj, const char *name, Error **errp);
 /* visit_type_size() falls back to (*type_uint64)() if type_size is unset */
 void (*type_size)(Visitor *v, uint64_t *obj, const char *name, Error **errp);
 bool (*start_union)(Visitor *v, bool data_present, Error **errp);
diff --git a/qapi/qapi-visit-core.c b/qapi/qapi-visit-core.c
index f96d82f..3a888ab 100644
--- a/qapi/qapi-visit-core.c
+++ b/qapi/qapi-visit-core.c
@@ -91,7 +91,7 @@ void visit_type_enum(Visitor *v, int *obj, const char * const 
strings[],

 void visit_type_int(Visitor *v, int64_t *obj, const char *name, Error **errp)
 {
-v->type_int(v, obj, name, errp);
+v->type_int64(v, obj, name, errp);
 }

 void visit_type_uint8(Visitor *v, uint8_t *obj, const char *name, Error **errp)
@@ -102,8 +102,10 @@ void visit_type_uint8(Visitor *v, uint8_t *obj, const char *name, Error **errp)
 v->type_uint8(v, obj, name, errp);
 } else {
 value = *obj;
-v->type_int(v, &value, name, errp);
+v->type_int64(v, &value, name, errp);
 if (value < 0 || value > UINT8_MAX) {
+/* FIXME questionable reuse of errp if callback changed
+   value on error */
 error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
name ? name : "null", "uint8_t");
 return;
@@ -120,8 +122,10 @@ void visit_type_uint16(Visitor *v, uint16_t *obj, const char *name, Error **errp
 v->type_uint16(v, obj, name, errp);
 } else {
 value = *obj;
-v->type_int(v, &value, name, errp);
+v->type_int64(v, &value, name, errp);
 if (value < 0 || value > UINT16_MAX) {
+/* FIXME questionable reuse of errp if callback changed
+   value on error */
 error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
name ? name : "null", "uint16_t");
 return;
@@ -138,8 +142,10 @@ void visit_type_uint32(Visitor *v, uint32_t *obj, const char *name, Error **errp
 v->type_uint32(v, obj, name, errp);
 } else {
 value = *obj;
-v->type_int(v, &value, name, errp);
+v->type_int64(v, &value, name, errp);
 if (value < 0 || value > UINT32_MAX) {
+/* FIXME questionable reuse of errp

[Qemu-devel] [PATCH v10 24/25] qmp: Fix reference-counting of qnull on empty output visit

2016-01-29 Thread Eric Blake

Commit 6c2f9a15 ensured that we would not return NULL when the
caller used an output visitor but had nothing to visit. But
in doing so, it added a FIXME about a reference count leak
that could abort qemu in the (unlikely) case of SIZE_MAX such
visits (more plausible on 32-bit).  (Although that commit
suggested we might fix it in time for 2.5, we ran out of time;
fortunately, it is unlikely enough to bite that it was not
worth worrying about during the 2.5 release.)

This fixes things by documenting the internal contracts, and
explaining why the internal function can return NULL and only
the public facing interface needs to worry about qnull(),
thus avoiding over-referencing the qnull_ global object.

It does not, however, fix the stupidity of the stack mixing
up two separate pieces of information; add a FIXME to explain
that issue, which will be fixed shortly in a future patch.

Signed-off-by: Eric Blake 
Cc: qemu-sta...@nongnu.org

---
v10: tweak comments and add assert, drop R-b
v9: enhance commit message
v8: rebase to earlier changes
v7: cc qemu-stable, tweak some asserts, drop stale comment, add more
comments
v6: no change
---
 qapi/qmp-output-visitor.c   | 41 ++---
 tests/test-qmp-output-visitor.c |  2 ++
 2 files changed, 36 insertions(+), 7 deletions(-)

diff --git a/qapi/qmp-output-visitor.c b/qapi/qmp-output-visitor.c
index 6ab2e04..b0e3980 100644
--- a/qapi/qmp-output-visitor.c
+++ b/qapi/qmp-output-visitor.c
@@ -30,6 +30,15 @@ typedef QTAILQ_HEAD(QStack, QStackEntry) QStack;
 struct QmpOutputVisitor
 {
 Visitor visitor;
+/* FIXME: we are abusing stack to hold two separate pieces of
+ * information: the current root object in slot 0, and the stack
+ * of N objects still being built in slots 1 through N (for N+1
+ * slots in use).  Worse, our behavior is inconsistent:
+ * qmp_output_add_obj() visiting two top-level scalars in a row
+ * discards the first in favor of the second, but visiting two
+ * top-level objects in a row tries to append the second object
+ * into the first (since the first object was placed in the stack
+ * in both slot 0 and 1, but only popped from slot 1).  */
 QStack stack;
 };

@@ -42,10 +51,12 @@ static QmpOutputVisitor *to_qov(Visitor *v)
 return container_of(v, QmpOutputVisitor, visitor);
 }

+/* Push @value onto the stack of current QObjects being built */
 static void qmp_output_push_obj(QmpOutputVisitor *qov, QObject *value)
 {
 QStackEntry *e = g_malloc0(sizeof(*e));

+assert(value);
 e->value = value;
 if (qobject_type(e->value) == QTYPE_QLIST) {
 e->is_list_head = true;
@@ -53,44 +64,53 @@ static void qmp_output_push_obj(QmpOutputVisitor *qov, QObject *value)
 QTAILQ_INSERT_HEAD(&qov->stack, e, node);
 }

+/* Pop a value off the stack of QObjects being built, and return it. */
 static QObject *qmp_output_pop(QmpOutputVisitor *qov)
 {
 QStackEntry *e = QTAILQ_FIRST(&qov->stack);
 QObject *value;
+
+assert(e);
 QTAILQ_REMOVE(&qov->stack, e, node);
 value = e->value;
+assert(value);
 g_free(e);
 return value;
 }

+/* Grab the root QObject, if any */
 static QObject *qmp_output_first(QmpOutputVisitor *qov)
 {
 QStackEntry *e = QTAILQ_LAST(&qov->stack, QStack);

-/*
- * FIXME Wrong, because qmp_output_get_qobject() will increment
- * the refcnt *again*.  We need to think through how visitors
- * handle null.
- */
 if (!e) {
-return qnull();
+/* No root */
+return NULL;
 }
-
+assert(e->value);
 return e->value;
 }

+/* Peek at the top of the stack of QObjects being built.
+ * The stack must not be empty. */
 static QObject *qmp_output_last(QmpOutputVisitor *qov)
 {
 QStackEntry *e = QTAILQ_FIRST(&qov->stack);
+
+assert(e && e->value);
 return e->value;
 }

+/* Add @value to the current QObject being built.
+ * If the stack is visiting a dictionary or list, @value is now owned
+ * by that container. Otherwise, @value is now the root.  */
 static void qmp_output_add_obj(QmpOutputVisitor *qov, const char *name,
QObject *value)
 {
 QObject *cur;

 if (QTAILQ_EMPTY(&qov->stack)) {
+/* Stack was empty, track this object as root */
 qmp_output_push_obj(qov, value);
 return;
 }
@@ -99,13 +119,17 @@ static void qmp_output_add_obj(QmpOutputVisitor *qov, 
const char *name,

 switch (qobject_type(cur)) {
 case QTYPE_QDICT:
+assert(name);
 qdict_put_obj(qobject_to_qdict(cur), name, value);
 break;
 case QTYPE_QLIST:
 qlist_append_obj(qobject_to_qlist(cur), value);
 break;
 default:
+/* The previous root was a scalar, replace it with a new root */
+/* FIXME this is abusing the stack; see comment above */
 qobject_decref(qmp_output_pop(qov));
+assert(QTAILQ_EMPTY(&qov->stack));
 qmp_output_push_obj(qov, value);

[Qemu-devel] [PATCH v10 02/25] qapi: Avoid use of misnamed DO_UPCAST()

2016-01-29 Thread Eric Blake

The macro DO_UPCAST() is incorrectly named: it converts from a
parent class to a derived class (which is a downcast).  Better,
and more consistent with some of the other qapi visitors, is
to use the container_of() macro through a to_FOO() helper.  Names
like 'to_ov()' may be a bit short, but for a static helper it
doesn't hurt too much, and matches existing practice in files
like qmp-input-visitor.c.

Our current definition of container_of() is weaker than
DO_UPCAST(), in that it does not require the derived class to
have Visitor as its first member, but this does not hurt our
usage patterns in qapi visitors.
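
As a standalone illustration of that last point, the sketch below uses a
simplified stand-in for container_of() (QEMU's own macro also type-checks
the member) and places the embedded base struct second rather than first.
DemoVisitor, DemoOptsVisitor, demo_container_of() and demo_to_ov() are
hypothetical names used only for this example; they are not part of the
patch.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for container_of(): step back from a member
 * pointer to the start of the structure that embeds it. */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical "parent class". */
typedef struct DemoVisitor {
    int depth;
} DemoVisitor;

/* Hypothetical "derived class"; the base is deliberately not the
 * first member, which container_of() tolerates. */
typedef struct DemoOptsVisitor {
    int list_mode;
    DemoVisitor visitor;
} DemoOptsVisitor;

/* Downcast helper in the style of to_ov(). */
static DemoOptsVisitor *demo_to_ov(DemoVisitor *v)
{
    assert(v);
    return demo_container_of(v, DemoOptsVisitor, visitor);
}
```

Given a DemoOptsVisitor ov, demo_to_ov(&ov.visitor) yields &ov again,
regardless of where the visitor member sits inside the struct.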

Signed-off-by: Eric Blake 
Reviewed-by: Marc-André Lureau 

---
v10: improve commit message
v9: no change
v8: no change
v7: new patch
---
 qapi/opts-visitor.c  | 28 +---
 qapi/string-input-visitor.c  | 23 ++-
 qapi/string-output-visitor.c | 21 +
 3 files changed, 44 insertions(+), 28 deletions(-)

diff --git a/qapi/opts-visitor.c b/qapi/opts-visitor.c
index ef5fb8b..dd4094c 100644
--- a/qapi/opts-visitor.c
+++ b/qapi/opts-visitor.c
@@ -89,6 +89,12 @@ struct OptsVisitor
 };


+static OptsVisitor *to_ov(Visitor *v)
+{
+return container_of(v, OptsVisitor, visitor);
+}
+
+
 static void
 destroy_list(gpointer list)
 {
@@ -121,7 +127,7 @@ static void
 opts_start_struct(Visitor *v, void **obj, const char *kind,
   const char *name, size_t size, Error **errp)
 {
-OptsVisitor *ov = DO_UPCAST(OptsVisitor, visitor, v);
+OptsVisitor *ov = to_ov(v);
 const QemuOpt *opt;

 if (obj) {
@@ -160,7 +166,7 @@ ghr_true(gpointer ign_key, gpointer ign_value, gpointer 
ign_user_data)
 static void
 opts_end_struct(Visitor *v, Error **errp)
 {
-OptsVisitor *ov = DO_UPCAST(OptsVisitor, visitor, v);
+OptsVisitor *ov = to_ov(v);
 GQueue *any;

 if (--ov->depth > 0) {
@@ -202,7 +208,7 @@ lookup_distinct(const OptsVisitor *ov, const char *name, 
Error **errp)
 static void
 opts_start_list(Visitor *v, const char *name, Error **errp)
 {
-OptsVisitor *ov = DO_UPCAST(OptsVisitor, visitor, v);
+OptsVisitor *ov = to_ov(v);

 /* we can't traverse a list in a list */
 assert(ov->list_mode == LM_NONE);
@@ -216,7 +222,7 @@ opts_start_list(Visitor *v, const char *name, Error **errp)
 static GenericList *
 opts_next_list(Visitor *v, GenericList **list, Error **errp)
 {
-OptsVisitor *ov = DO_UPCAST(OptsVisitor, visitor, v);
+OptsVisitor *ov = to_ov(v);
 GenericList **link;

 switch (ov->list_mode) {
@@ -265,7 +271,7 @@ opts_next_list(Visitor *v, GenericList **list, Error **errp)
 static void
 opts_end_list(Visitor *v, Error **errp)
 {
-OptsVisitor *ov = DO_UPCAST(OptsVisitor, visitor, v);
+OptsVisitor *ov = to_ov(v);

 assert(ov->list_mode == LM_STARTED ||
ov->list_mode == LM_IN_PROGRESS ||
@@ -307,7 +313,7 @@ processed(OptsVisitor *ov, const char *name)
 static void
 opts_type_str(Visitor *v, char **obj, const char *name, Error **errp)
 {
-OptsVisitor *ov = DO_UPCAST(OptsVisitor, visitor, v);
+OptsVisitor *ov = to_ov(v);
 const QemuOpt *opt;

 opt = lookup_scalar(ov, name, errp);
@@ -323,7 +329,7 @@ opts_type_str(Visitor *v, char **obj, const char *name, 
Error **errp)
 static void
 opts_type_bool(Visitor *v, bool *obj, const char *name, Error **errp)
 {
-OptsVisitor *ov = DO_UPCAST(OptsVisitor, visitor, v);
+OptsVisitor *ov = to_ov(v);
 const QemuOpt *opt;

 opt = lookup_scalar(ov, name, errp);
@@ -356,7 +362,7 @@ opts_type_bool(Visitor *v, bool *obj, const char *name, 
Error **errp)
 static void
 opts_type_int(Visitor *v, int64_t *obj, const char *name, Error **errp)
 {
-OptsVisitor *ov = DO_UPCAST(OptsVisitor, visitor, v);
+OptsVisitor *ov = to_ov(v);
 const QemuOpt *opt;
 const char *str;
 long long val;
@@ -412,7 +418,7 @@ opts_type_int(Visitor *v, int64_t *obj, const char *name, 
Error **errp)
 static void
 opts_type_uint64(Visitor *v, uint64_t *obj, const char *name, Error **errp)
 {
-OptsVisitor *ov = DO_UPCAST(OptsVisitor, visitor, v);
+OptsVisitor *ov = to_ov(v);
 const QemuOpt *opt;
 const char *str;
 unsigned long long val;
@@ -464,7 +470,7 @@ opts_type_uint64(Visitor *v, uint64_t *obj, const char 
*name, Error **errp)
 static void
 opts_type_size(Visitor *v, uint64_t *obj, const char *name, Error **errp)
 {
-OptsVisitor *ov = DO_UPCAST(OptsVisitor, visitor, v);
+OptsVisitor *ov = to_ov(v);
 const QemuOpt *opt;
 int64_t val;
 char *endptr;
@@ -490,7 +496,7 @@ opts_type_size(Visitor *v, uint64_t *obj, const char *name, 
Error **errp)
 static void
 opts_optional(Visitor *v, bool *present, const char *name)
 {
-OptsVisitor *ov = DO_UPCAST(OptsVisitor, visitor, v);
+OptsVisitor *ov = to_ov(v);

 /* we only support a single mandatory scalar field in a list node */
 assert(ov->list_mode == LM_NONE);
diff

Re: [Qemu-devel] [PATCH v8 05/16] virtio-scsi: Catch BDS-BB removal/insertion

2016-01-29 Thread Max Reitz

On 29.01.2016 13:41, Kevin Wolf wrote:
> Am 27.01.2016 um 18:59 hat Max Reitz geschrieben:
>> Make use of the BDS-BB removal and insertion notifiers to remove or set
>> up, respectively, virtio-scsi's op blockers.
>>
>> Signed-off-by: Max Reitz 
>> ---
>>  hw/scsi/virtio-scsi.c   | 55 
>> +
>>  include/hw/virtio/virtio-scsi.h | 10 
>>  2 files changed, 65 insertions(+)
>>
>> diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
>> index 607593c..b508b81 100644
>> --- a/hw/scsi/virtio-scsi.c
>> +++ b/hw/scsi/virtio-scsi.c
>> @@ -757,6 +757,22 @@ static void virtio_scsi_change(SCSIBus *bus, SCSIDevice 
>> *dev, SCSISense sense)
>>  }
>>  }
>>  
>> +static void virtio_scsi_blk_insert_notifier(Notifier *n, void *data)
>> +{
>> +VirtIOSCSIBlkChangeNotifier *cn = DO_UPCAST(VirtIOSCSIBlkChangeNotifier,
>> +n, n);
>> +assert(cn->sd->conf.blk == data);
>> +blk_op_block_all(cn->sd->conf.blk, cn->s->blocker);
>> +}
>> +
>> +static void virtio_scsi_blk_remove_notifier(Notifier *n, void *data)
>> +{
>> +VirtIOSCSIBlkChangeNotifier *cn = DO_UPCAST(VirtIOSCSIBlkChangeNotifier,
>> +n, n);
>> +assert(cn->sd->conf.blk == data);
>> +blk_op_unblock_all(cn->sd->conf.blk, cn->s->blocker);
>> +}
>> +
>>  static void virtio_scsi_hotplug(HotplugHandler *hotplug_dev, DeviceState 
>> *dev,
>>  Error **errp)
>>  {
>> @@ -765,6 +781,22 @@ static void virtio_scsi_hotplug(HotplugHandler 
>> *hotplug_dev, DeviceState *dev,
>>  SCSIDevice *sd = SCSI_DEVICE(dev);
>>  
>>  if (s->ctx && !s->dataplane_disabled) {
>> +VirtIOSCSIBlkChangeNotifier *insert_notifier, *remove_notifier;
>> +
>> +insert_notifier = g_new0(VirtIOSCSIBlkChangeNotifier, 1);
>> +insert_notifier->n.notify = virtio_scsi_blk_insert_notifier;
>> +insert_notifier->s = s;
>> +insert_notifier->sd = sd;
>> +blk_add_insert_bs_notifier(sd->conf.blk, &insert_notifier->n);
>> +QTAILQ_INSERT_TAIL(&s->insert_notifiers, insert_notifier, next);
>> +
>> +remove_notifier = g_new0(VirtIOSCSIBlkChangeNotifier, 1);
>> +remove_notifier->n.notify = virtio_scsi_blk_remove_notifier;
>> +remove_notifier->s = s;
>> +remove_notifier->sd = sd;
>> +blk_add_remove_bs_notifier(sd->conf.blk, &remove_notifier->n);
>> +QTAILQ_INSERT_TAIL(&s->remove_notifiers, remove_notifier, next);
>> +
>>  if (blk_op_is_blocked(sd->conf.blk, BLOCK_OP_TYPE_DATAPLANE, errp)) 
>> {
>>  return;
>>  }
> 
> If we take the error path here, won't we have dangling pointers in the
> notifier list?

Yes, I'll move it below that error path.

Max




Re: [Qemu-devel] [PATCH] arm: virt-acpi: each MADT.GICC entry as enabled unconditionally

2016-01-29 Thread Shannon Zhao




On 2016/1/29 22:24, Igor Mammedov wrote:

in current impl. condition

build_madt() {
   ...
   if (test_bit(i, cpuinfo->found_cpus))

is always true since loop handles only present CPUs
in range [0..smp_cpus).
But to fill the useless cpuinfo->found_cpus we do an unnecessary
scan over the QOM tree to find the same CPUs.
So always mark GICC as present and drop the unneeded
code that fills cpuinfo->found_cpus.

Signed-off-by: Igor Mammedov
---
It's just simple cleanup but I'm trying to generalize
a bit CPU related ACPI tables and as part of it get rid
of found_cpus bitmap and if possible cpu_index usage
in ACPI parts of code.
---
  hw/arm/virt-acpi-build.c | 26 +++---
  1 file changed, 3 insertions(+), 23 deletions(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 87fbe7c..3ed39fc 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -46,20 +46,6 @@
  #define ARM_SPI_BASE 32
  #define ACPI_POWER_BUTTON_DEVICE "PWRB"

-typedef struct VirtAcpiCpuInfo {
-DECLARE_BITMAP(found_cpus, VIRT_ACPI_CPU_ID_LIMIT);
-} VirtAcpiCpuInfo;
-
-static void virt_acpi_get_cpu_info(VirtAcpiCpuInfo *cpuinfo)
-{
-CPUState *cpu;
-
-memset(cpuinfo->found_cpus, 0, sizeof cpuinfo->found_cpus);
-CPU_FOREACH(cpu) {
-set_bit(cpu->cpu_index, cpuinfo->found_cpus);
-}
-}
-
  static void acpi_dsdt_add_cpus(Aml *scope, int smp_cpus)
  {
  uint16_t i;
@@ -458,8 +444,7 @@ build_gtdt(GArray *table_data, GArray *linker)

  /* MADT */
  static void
-build_madt(GArray *table_data, GArray *linker, VirtGuestInfo *guest_info,
-   VirtAcpiCpuInfo *cpuinfo)
+build_madt(GArray *table_data, GArray *linker, VirtGuestInfo *guest_info)
  {
  int madt_start = table_data->len;
  const MemMapEntry *memmap = guest_info->memmap;
@@ -489,9 +474,7 @@ build_madt(GArray *table_data, GArray *linker, 
VirtGuestInfo *guest_info,
  gicc->cpu_interface_number = i;
  gicc->arm_mpidr = armcpu->mp_affinity;
  gicc->uid = i;
-if (test_bit(i, cpuinfo->found_cpus)) {
-gicc->flags = cpu_to_le32(ACPI_GICC_ENABLED);
-}
+gicc->flags = cpu_to_le32(ACPI_GICC_ENABLED);
  }
Ah, yes, it uses smp_cpus, not max_cpus. But we still need to support
max_cpus even though vcpu hotplug is not supported currently, so
we may need to introduce guest_info->max_cpus and use it here.
Also, the check below in virt.c is not right: it should compare the
global max_cpus with the maximum number of CPUs the GIC supports.


if (smp_cpus > max_cpus) {
error_report("Number of SMP CPUs requested (%d) exceeds max CPUs "
 "supported by machine 'mach-virt' (%d)",
 smp_cpus, max_cpus);
exit(1);
}

Thanks,
--
Shannon

Re: [Qemu-devel] [PATCH v7 01/13] machine: Don't allow CPU toplogies with partially filled cores

2016-01-29 Thread Igor Mammedov

On Fri, 29 Jan 2016 12:24:18 -0200
Eduardo Habkost  wrote:

> On Fri, Jan 29, 2016 at 02:52:30PM +1100, David Gibson wrote:
> > On Thu, Jan 28, 2016 at 11:19:43AM +0530, Bharata B Rao wrote:  
> > > Prevent guests from booting with CPU topologies that have partially
> > > filled CPU cores or can result in partially filled CPU cores after
> > > CPU hotplug like
> > > 
> > > -smp 15,sockets=1,cores=4,threads=4,maxcpus=16 or
> > > -smp 15,sockets=1,cores=4,threads=4,maxcpus=17.
> > > 
> > > This is enforced by introducing MachineClass::validate_smp_config()
> > > that gets called from generic SMP parsing code. Machine type versions
> > > that want to enforce this can define this to the generic version
> > > provided.
> > > 
> > > Only sPAPR and PC machine types starting from version 2.6 enforce this in
> > > this patch.
> > > 
> > > Signed-off-by: Bharata B Rao   
> > 
> > I've been kind of lost in the back and forth about
> > threads/cores/sockets.
> > 
> > What, in the end, is the rationale for allowing partially filled
> > sockets, but not partially filled cores?  
> 
> I don't think there's a good reason for that (at least for PC).
> 
> It's easier to relax the requirements later if necessary, than
> dealing with compatibility issues again when making the code more
> strict. So I suggest we make validate_smp_config_generic() also
> check if smp_cpus % (smp_threads * smp_cores) == 0.

That would break existing setups.

Also, in the case of CPU hotplug, this patch will break migration,
because the target QEMU might refuse to start with a hotplugged CPU thread.

Perhaps this check should be enforced per target/machine, if the
arch requires it.
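
For reference, the two levels of strictness being discussed can be
sketched as standalone predicates.  This is only one plausible reading of
the checks, not the code from the patch: demo_fills_whole_cores() and
demo_fills_whole_sockets() are hypothetical names, and the parameters
simply mirror the -smp cpus/cores/threads values.

```c
#include <assert.h>
#include <stdbool.h>

/* True if the boot CPUs leave no core partially filled
 * (assumed reading of the check the patch enforces). */
static bool demo_fills_whole_cores(unsigned cpus, unsigned threads)
{
    assert(threads > 0);
    return cpus % threads == 0;
}

/* Stricter variant suggested in the quoted review:
 * smp_cpus % (smp_threads * smp_cores) == 0, i.e. whole sockets only. */
static bool demo_fills_whole_sockets(unsigned cpus, unsigned cores,
                                     unsigned threads)
{
    assert(cores > 0 && threads > 0);
    return cpus % (cores * threads) == 0;
}
```

With cores=4,threads=4, a guest started with 15 CPUs fails both
predicates, one started with 12 CPUs passes the per-core check but fails
the per-socket one, and 16 CPUs pass both.  The middle case is the kind
of existing setup the stricter check would start rejecting.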

Re: [Qemu-devel] [PATCH 1/1] arm: virt: change GPIO trigger interrupt to pulse

2016-01-29 Thread Wei Huang



On 01/29/2016 08:50 AM, Peter Maydell wrote:
> On 29 January 2016 at 14:46, Shannon Zhao  wrote:
>> On 2016/1/29 22:35, Wei Huang wrote:
>>> On 01/29/2016 04:10 AM, Shannon Zhao wrote:
>>>> This makes ACPI work well but makes DT not work. The reason is systemd or
>>>> acpid open /dev/input/event0 failed. So the interrupt could be injected and
>>>> could see under /proc/interrupts but guest doesn't have any action. I'll
>>>> investigate why it opens failed later.
>>>
>>>
>>> That is interesting. Could you try it with the following? This reverses
>>> the order to down-up and worked on ACPI case.
>>>
>> Yeah, that's very weird.
>>
>>> qemu_set_irq(qdev_get_gpio_in(pl061_dev, 3), 0);
>>> qemu_set_irq(qdev_get_gpio_in(pl061_dev, 3), 1);
>>>
>> I'll try this tomorrow. But even if this works, it's still weird.
> 
> I wonder if we should be asserting the GPIO pin in the powerdown-request
> hook and then deasserting it on system reset somewhere...

This is another possibility. We can try to reset the pl061 state by
hooking up with dc->reset and see what happens.

> 
> thanks
> -- PMM
>

Re: [Qemu-devel] [PATCH] build: Add include check on syscall.h

2016-01-29 Thread Peter Maydell

On 29 January 2016 at 13:30, Lluís Vilanova  wrote:
> Peter Maydell writes:
>> Adding include guards is fine, but it sounds to me like what we
>> should actually do to fix this confusion is rename all the linux-user
>> local headers to target_syscall.h.
>
> Hmmm, I didn't know if using the same name was on purpose or not. If the
> intention was *not* to override the system's syscall.h, then a rename is the
> proper solution.

Yes, the intention is absolutely not to override any system header
(the constants defined are only relevant to the guest, and if the
header got included and overrode the host's syscall.h then nothing
would work and it probably wouldn't even compile). It just ended up
with the same name by accident.

thanks
-- PMM

[Qemu-devel] [PATCH v10 01/25] qobject: Document more shortcomings in our number handling

2016-01-29 Thread Eric Blake

We've already documented that our JSON parsing is locale dependent;
but we should also document that our JSON output has the same
problem.  Additionally, JSON requires finite values (you have to
upgrade to JSON5 to get support for Inf or NaN), and our output
truncates floating point numbers to the point of losing significant
precision that could cause the receiver to read a different value.

Sadly, this series is not going to be the one that addresses these
problems.
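
The precision loss is easy to reproduce outside of QEMU.  The sketch
below formats a double the way to_json() currently does ("%f", default
precision 6) and with 17 significant digits, then parses the text back
and compares.  The demo_*() helpers are hypothetical names for this
illustration only, and the comparison assumes the C locale for both
snprintf() and strtod().

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Format with "%f" (precision 6), parse back, and report whether the
 * receiver would read the same value. */
static bool demo_f_round_trips(double value)
{
    char buffer[512];

    assert(isfinite(value));    /* JSON cannot represent Inf or NaN */
    snprintf(buffer, sizeof(buffer), "%f", value);
    return strtod(buffer, NULL) == value;
}

/* Same test with 17 significant digits, which is enough to
 * distinguish any two IEEE double values. */
static bool demo_g17_round_trips(double value)
{
    char buffer[64];

    assert(isfinite(value));
    snprintf(buffer, sizeof(buffer), "%.17g", value);
    return strtod(buffer, NULL) == value;
}
```

A value such as 0.5 survives either format, but 0.1234567890123 or 1e-9
only survives the 17-digit form; with "%f" the latter is printed as
0.000000.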

Fix some trailing whitespace I noticed in the vicinity.

Signed-off-by: Eric Blake 

---
v10: comment improvements, drop Marc-Andre's R-b.
v9: no change
v8: no change
v7: new patch
---
 qobject/json-parser.c |  6 --
 qobject/qjson.c   | 11 ++-
 2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/qobject/json-parser.c b/qobject/json-parser.c
index 3c5d35d..95bb054 100644
--- a/qobject/json-parser.c
+++ b/qobject/json-parser.c
@@ -1,5 +1,5 @@
 /*
- * JSON Parser 
+ * JSON Parser
  *
  * Copyright IBM, Corp. 2009
  *
@@ -518,7 +518,9 @@ static QObject *parse_literal(JSONParserContext *ctxt)
 /* fall through to JSON_FLOAT */
 }
 case JSON_FLOAT:
-/* FIXME dependent on locale */
+/* FIXME dependent on locale; a pervasive issue in QEMU */
+/* FIXME our lexer matches RFC 7159 in forbidding Inf or NaN,
+ * but those might be useful extensions beyond JSON */
 return QOBJECT(qfloat_from_double(strtod(token->str, NULL)));
 default:
 abort();
diff --git a/qobject/qjson.c b/qobject/qjson.c
index a3e6a7c..8bc7f20 100644
--- a/qobject/qjson.c
+++ b/qobject/qjson.c
@@ -237,6 +237,15 @@ static void to_json(const QObject *obj, QString *str, int 
pretty, int indent)
 char buffer[1024];
 int len;

+/* FIXME: snprintf() is locale dependent; but JSON requires
+ * numbers to be formatted as if in the C locale. Dependence
+ * on C locale is a pervasive issue in QEMU. */
+/* FIXME: This risks printing Inf or NaN, which are not valid
+ * JSON values. */
+/* FIXME: the default precision of 6 for %f often causes
+ * rounding errors; we should be using DBL_DECIMAL_DIG (17),
+ * and only rounding to a shorter number if the result would
+ * still produce the same floating point value.  */
 len = snprintf(buffer, sizeof(buffer), "%f", qfloat_get_double(val));
 while (len > 0 && buffer[len - 1] == '0') {
 len--;
@@ -247,7 +256,7 @@ static void to_json(const QObject *obj, QString *str, int 
pretty, int indent)
 } else {
 buffer[len] = 0;
 }
-
+
 qstring_append(str, buffer);
 break;
 }
-- 
2.5.0

[Qemu-devel] [PATCH v10 00/25] qapi visitor cleanups part 1 (post-introspection cleanups subset E)

2016-01-29 Thread Eric Blake

Based on qemu.git master. No pending prerequisites

Also available as a tag at this location:
git fetch git://repo.or.cz/qemu/ericb.git qapi-cleanupv10e

and will soon be part of my branch with the rest of the v5 series, at:
http://repo.or.cz/qemu/ericb.git/shortlog/refs/heads/qapi

v10 notes:
This is patches 1-20 of v9; 21-37 of that series will come later,
but this half was relatively clean and should be ready to merge.
Plus, this half includes the argument reordering, which touches
a lot of the tree, so getting it in sooner rather than later will
minimize rebase churn.  A couple patches were split differently
or retitled.  Reviewed-by was kept on patches that didn't change
in content (even if the content was split across different patch
boundaries).

Most of the work here was addressing Markus' review comments,
or rebasing later patches on top of earlier changes.

001/25:[0013] [FC] 'qobject: Document more shortcomings in our number handling'
002/25:[----] [--] 'qapi: Avoid use of misnamed DO_UPCAST()'
003/25:[----] [--] 'qapi: Drop dead dealloc visitor variable'
004/25:[down] 'qapi: Dealloc visitor does not need a type_size()'
005/25:[down] 'qapi: Drop dead parameter in gen_params()'
006/25:[down] 'hmp: Drop pointless allocation during qapi visit'
007/25:[down] 'hmp: Cache use of qapi visitor'
008/25:[down] 'vl: Ensure qapi visitor properly ends struct visit'
009/25:[0003] [FC] 'balloon: Improve use of qapi visitor'
010/25:[0004] [FC] 'qapi: Improve generated event use of qapi visitor'
011/25:[0004] [FC] 'qapi: Track all failures between visit_start/stop'
012/25:[down] 'qapi-visit: Kill unused visit_end_union()'
013/25:[0012] [FC] 'qapi: Prefer type_int64 over type_int in visitors'
014/25:[0012] [FC] 'qapi: Make all visitors supply uint64 callbacks'
015/25:[0014] [FC] 'qapi: Consolidate visitor small integer callbacks'
016/25:[0006] [FC] 'qapi: Don't cast Enum* to int*'
017/25:[----] [--] 'qom: Use typedef for Visitor'
018/25:[0004] [FC] 'qapi: Swap visit_* arguments for consistent 'name' placement'
019/25:[0005] [FC] 'qom: Swap 'name' next to visitor in ObjectPropertyAccessor'
020/25:[0010] [FC] 'qapi: Swap 'name' in visit_* callbacks to match public API'
021/25:[0018] [FC] 'qapi: Drop unused 'kind' for struct/enum visit'
022/25:[down] 'qapi: Tighten qmp_input_end_list()'
023/25:[0046] [FC] 'qapi: Drop unused error argument for list and implicit struct'
024/25:[0008] [FC] 'qmp: Fix reference-counting of qnull on empty output visit'
025/25:[0018] [FC] 'qmp: Don't abuse stack to track qmp-output root'

v9 notes:
https://lists.gnu.org/archive/html/qemu-devel/2016-01/msg03504.html
Rebase to master, incorporate findings from Marc-André.

v8 notes:
https://lists.gnu.org/archive/html/qemu-devel/2015-12/msg03863.html
For notes here and earlier, look in the archives

 include/qapi/visitor.h |  60 +
 include/qapi/visitor-impl.h|  63 +-
 scripts/qapi-commands.py   |   4 +-
 scripts/qapi-event.py  |  16 ++-
 scripts/qapi-types.py  |   2 +-
 scripts/qapi-visit.py  |  72 ++-
 scripts/qapi.py|  13 +-
 include/qom/object.h   |  13 +-
 qapi/qapi-visit-core.c | 252 +++--
 backends/hostmem.c |  24 ++--
 block/qapi.c   |   2 +-
 blockdev.c |   4 +-
 bootdevice.c   |  12 +-
 hmp.c  |  18 +--
 hw/acpi/core.c |   4 +-
 hw/acpi/ich9.c |  49 
 hw/block/nvme.c|  12 +-
 hw/core/machine.c  |  24 ++--
 hw/core/qdev-properties-system.c   |  44 +++
 hw/core/qdev-properties.c  | 180 +-
 hw/core/qdev.c |   7 +-
 hw/i386/pc.c   |  43 +++
 hw/ide/qdev.c  |  12 +-
 hw/intc/xics.c |  20 +--
 hw/isa/lpc_ich9.c  |   7 +-
 hw/mem/pc-dimm.c   |   6 +-
 hw/misc/edu.c  |   6 +-
 hw/misc/tmp105.c   |  12 +-
 hw/net/ne2000-isa.c|  14 ++-
 hw/pci-host/piix.c |  18 +--
 hw/pci-host/q35.c  |  23 ++--
 hw/ppc/spapr_drc.c |  34 +++--
 hw/usb/dev-storage.c   |  12 +-
 hw/virtio/virtio-balloon.c |  30 ++---
 memory.c   |  26 ++--
 net/dump.c |  12 +-
 net/filter-buffer.c|  14 ++-
 net/net.c  |   4 +-
 numa.c |   6 +-
 qapi/opts-visitor.c|  52 
 qapi/qapi-dealloc-visitor.c|  48 ---
 qapi/qmp-input-visitor.c   |  54 
 qapi/qmp-output-visitor.c  | 120 +-
 qapi/string-input-visitor.c|  62 +
 qapi/string-output-visitor.c   |  54

[Qemu-devel] [PATCH v10 05/25] qapi: Drop dead parameter in gen_params()

2016-01-29 Thread Eric Blake

Commit 5cdc8831 reworked gen_params() to be simpler, but forgot
to clean up a now-unused errp named argument.

No change to generated code.

Reported-by: Markus Armbruster 
Signed-off-by: Eric Blake 

---
v10: new patch
---
 scripts/qapi.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/scripts/qapi.py b/scripts/qapi.py
index 7dec611..37aa6fe 100644
--- a/scripts/qapi.py
+++ b/scripts/qapi.py
@@ -2,7 +2,7 @@
 # QAPI helper library
 #
 # Copyright IBM, Corp. 2011
-# Copyright (c) 2013-2015 Red Hat Inc.
+# Copyright (c) 2013-2016 Red Hat Inc.
 #
 # Authors:
 #  Anthony Liguori 
@@ -1649,7 +1649,7 @@ def gen_visit_fields(members, prefix='', need_cast=False, skiperr=False):
 if (visit_optional(v, &%(prefix)shas_%(c_name)s, "%(name)s")) {
 ''',
  prefix=prefix, c_name=c_name(memb.name),
- name=memb.name, errp=errparg)
+ name=memb.name)
 push_indent()

 # Ugly: sometimes we need to cast away const
-- 
2.5.0

[Qemu-devel] [PATCH v10 04/25] qapi: Dealloc visitor does not need a type_size()

2016-01-29 Thread Eric Blake

The intent of having the visitor type_size() callback differ
from type_uint64() is to allow special handling for sizes; the
visitor core gracefully falls back to type_uint64() if there is
no need for the distinction.  Since the dealloc visitor does
nothing for any of the int visits, drop the pointless size
handler.

Signed-off-by: Eric Blake 

---
v10: new patch, split out from 10/37
---
 qapi/qapi-dealloc-visitor.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/qapi/qapi-dealloc-visitor.c b/qapi/qapi-dealloc-visitor.c
index 204de8f..4d1ef93 100644
--- a/qapi/qapi-dealloc-visitor.c
+++ b/qapi/qapi-dealloc-visitor.c
@@ -158,11 +158,6 @@ static void qapi_dealloc_type_anything(Visitor *v, QObject **obj,
 }
 }

-static void qapi_dealloc_type_size(Visitor *v, uint64_t *obj, const char *name,
-   Error **errp)
-{
-}
-
 static void qapi_dealloc_type_enum(Visitor *v, int *obj,
const char * const strings[],
const char *kind, const char *name,
@@ -224,7 +219,6 @@ QapiDeallocVisitor *qapi_dealloc_visitor_new(void)
 v->visitor.type_str = qapi_dealloc_type_str;
 v->visitor.type_number = qapi_dealloc_type_number;
 v->visitor.type_any = qapi_dealloc_type_anything;
-v->visitor.type_size = qapi_dealloc_type_size;
 v->visitor.start_union = qapi_dealloc_start_union;

 QTAILQ_INIT(&v->stack);
-- 
2.5.0
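
The commit message above relies on the visitor core falling back to the uint64 callback when no size callback is installed. The sketch below models that fallback with a toy structure; ToyVisitor and the toy_* functions are illustrative names and do not reproduce QEMU's actual definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only; field and function names are assumptions. */
typedef struct ToyVisitor ToyVisitor;
struct ToyVisitor {
    /* Must be set */
    void (*type_uint64)(ToyVisitor *v, const char *name, uint64_t *obj);
    /* May be NULL; only needed when sizes want special handling */
    void (*type_size)(ToyVisitor *v, const char *name, uint64_t *obj);
    int uint64_calls;
    int size_calls;
};

/* Core entry point: prefer type_size, else fall back to type_uint64. */
static void toy_visit_type_size(ToyVisitor *v, const char *name,
                                uint64_t *obj)
{
    assert(v->type_uint64 != NULL);
    if (v->type_size) {
        v->type_size(v, name, obj);
    } else {
        v->type_uint64(v, name, obj);
    }
}

/* Callbacks that only count how often they were reached. */
static void toy_count_uint64(ToyVisitor *v, const char *name, uint64_t *obj)
{
    (void)name;
    (void)obj;
    v->uint64_calls++;
}

static void toy_count_size(ToyVisitor *v, const char *name, uint64_t *obj)
{
    (void)name;
    (void)obj;
    v->size_calls++;
}
```

With this shape, a visitor whose size handling would be identical to its uint64 handling can simply leave type_size unset, which is the situation the patch describes for the dealloc visitor.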

Re: [Qemu-devel] [PATCH v8 12/16] blockdev: Keep track of monitor-owned BDS

2016-01-29 Thread Kevin Wolf

Am 29.01.2016 um 14:44 hat Max Reitz geschrieben:
> On 28.01.2016 04:33, Fam Zheng wrote:
> > On Wed, 01/27 18:59, Max Reitz wrote:
> >> Signed-off-by: Max Reitz 
> >> ---
> >>  blockdev.c | 26 ++
> >>  include/block/block_int.h  |  4 
> >>  stubs/Makefile.objs|  1 +
> >>  stubs/blockdev-close-all-bdrv-states.c |  5 +
> >>  4 files changed, 36 insertions(+)
> >>  create mode 100644 stubs/blockdev-close-all-bdrv-states.c
> >>
> >> diff --git a/blockdev.c b/blockdev.c
> >> index 09d4621..ac93f43 100644
> >> --- a/blockdev.c
> >> +++ b/blockdev.c
> >> @@ -50,6 +50,9 @@
> >>  #include "trace.h"
> >>  #include "sysemu/arch_init.h"
> >>  
> >> +static QTAILQ_HEAD(, BlockDriverState) monitor_bdrv_states =
> >> +QTAILQ_HEAD_INITIALIZER(monitor_bdrv_states);
> >> +
> >>  static const char *const if_name[IF_COUNT] = {
> >>  [IF_NONE] = "none",
> >>  [IF_IDE] = "ide",
> >> @@ -702,6 +705,19 @@ fail:
> >>  return NULL;
> >>  }
> >>  
> >> +void blockdev_close_all_bdrv_states(void)
> >> +{
> >> +BlockDriverState *bs, *next_bs;
> >> +
> >> +QTAILQ_FOREACH_SAFE(bs, &monitor_bdrv_states, monitor_list, next_bs) {
> >> +AioContext *ctx = bdrv_get_aio_context(bs);
> >> +
> >> +aio_context_acquire(ctx);
> >> +bdrv_unref(bs);
> >> +aio_context_release(ctx);
> >> +}
> >> +}
> >> +
> >>  static void qemu_opt_rename(QemuOpts *opts, const char *from, const char 
> >> *to,
> >>  Error **errp)
> >>  {
> >> @@ -3875,12 +3891,15 @@ void qmp_blockdev_add(BlockdevOptions *options, 
> >> Error **errp)
> >>  if (!bs) {
> >>  goto fail;
> >>  }
> >> +
> >> +QTAILQ_INSERT_TAIL(&monitor_bdrv_states, bs, monitor_list);
> >>  }
> >>  
> >>  if (bs && bdrv_key_required(bs)) {
> >>  if (blk) {
> >>  blk_unref(blk);
> >>  } else {
> >> +QTAILQ_REMOVE(&monitor_bdrv_states, bs, monitor_list);
> >>  bdrv_unref(bs);
> >>  }
> >>  error_setg(errp, "blockdev-add doesn't support encrypted 
> >> devices");
> >> @@ -3945,11 +3964,18 @@ void qmp_x_blockdev_del(bool has_id, const char 
> >> *id,
> >> bdrv_get_device_or_node_name(bs));
> >>  goto out;
> >>  }
> >> +
> >> +if (!blk && !bs->monitor_list.tqe_prev) {
> >> +error_setg(errp, "Node %s is not owned by the monitor",
> >> +   bs->node_name);
> >> +goto out;
> >> +}
> > 
> > Is this an extra restriction added by this patch?
> 
> I hope not. This is just an additional check that should not change
> behavior; if it does, we did something wrong.

Actually, if you were to respin the series, you could remove the check
that it replaces:

if (bs->refcnt > 1 || !QLIST_EMPTY(&bs->parents)) {

The QLIST_EMPTY() check is trying to achieve the same as what you
introduce in a clean way now. It doesn't hurt at the moment, but I had
to get rid of the QLIST_EMPTY() check when I tried to convert BB to
BdrvChild.

Kevin
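
The quoted patch tests bs->monitor_list.tqe_prev to decide whether a node is on the monitor-owned list. The sketch below shows the general technique with a toy intrusive list whose remove operation resets the back-pointer; ToyNode, ToyList and the toy_* functions are hypothetical and are not QEMU's queue macros.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy intrusive list; all names here are assumptions for illustration. */
typedef struct ToyNode ToyNode;
struct ToyNode {
    ToyNode *next;
    ToyNode **prev;   /* address of the pointer that points at this node;
                       * NULL while the node is not on any list */
};

typedef struct {
    ToyNode *first;
} ToyList;

static void toy_list_insert_head(ToyList *list, ToyNode *node)
{
    assert(node->prev == NULL);
    node->next = list->first;
    if (list->first) {
        list->first->prev = &node->next;
    }
    list->first = node;
    node->prev = &list->first;
}

static void toy_list_remove(ToyNode *node)
{
    assert(node->prev != NULL);
    if (node->next) {
        node->next->prev = node->prev;
    }
    *node->prev = node->next;
    node->next = NULL;
    node->prev = NULL;   /* marks the node as unlinked */
}

/* Membership test in O(1), without walking the list. */
static bool toy_node_is_linked(const ToyNode *node)
{
    return node->prev != NULL;
}
```

The test only works if every node starts out zero-initialized and removal always clears the back-pointer, so both conditions would need to hold for the real list as well.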



Re: [Qemu-devel] [PATCH] Fix virtio migration

2016-01-29 Thread Cornelia Huck

On Fri, 29 Jan 2016 13:18:56 +
"Dr. David Alan Gilbert (git)"  wrote:

> From: "Dr. David Alan Gilbert" 
> 
> I misunderstood the vmstate macro definition when I reworked the
> virtio .get/.put.
> The VMSTATE_STRUCT_VARRAY_KNOWN, was described as being for "a
> variable length array (i.e. _type *_field) but we know the
> length".  However it actually specified operation for arrays embedded in
> the struct (i.e. _type _field[]) since it lacked the VMS_POINTER
> flag. This caused offset calculation to be completely off, examining and
> potentially sending random data instead of the VirtQueue content.
> 
> Replace the otherwise unused VMSTATE_STRUCT_VARRAY_KNOWN with a
> VMSTATE_STRUCT_VARRAY_POINTER_KNOWN that includes the VMS_POINTER flag
> (so now actually doing what it advertises) and use it in the virtio
> migration code.
> 
> Fixes and description as per Sascha's suggestions/debug.
> 
> Signed-off-by: Dr. David Alan Gilbert 
> Reported-by: Sascha Silbe 
> Tested-By: Sascha Silbe 
> Reviewed-By: Sascha Silbe 
> 
> Fixes: 50e5ae4dc3e4f21e874512f9e87b93b5472d26e0
> Fixes: 2cf0148674430b6693c60d42b7eef721bfa9509f
> ---
>  hw/virtio/virtio.c  |  8 
>  include/migration/vmstate.h | 18 +-
>  2 files changed, 13 insertions(+), 13 deletions(-)

Tested-by: Cornelia Huck
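
The bug described in the quoted commit message comes down to where the first array element lives: inside the structure for an embedded array, or behind a pointer stored in the structure. A minimal sketch of that difference follows, assuming a toy flag in place of VMS_POINTER; ToyDev, TOY_FLAG_POINTER and toy_first_elem() are illustrative names and not the vmstate implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    int value;
} ToyElem;

typedef struct {
    ToyElem embedded[2];   /* like "_type _field[]": storage is in the struct */
    ToyElem *pointed;      /* like "_type *_field": struct holds a pointer */
} ToyDev;

enum { TOY_FLAG_POINTER = 1 };   /* stand-in for a VMS_POINTER-style flag */

/*
 * Return the address of element 0 of the field at 'offset'.
 * Without the flag the field itself is the array; with the flag the
 * field holds a pointer that must be loaded first.  The memcpy assumes
 * object pointers share one representation, as on common platforms.
 */
static void *toy_first_elem(void *opaque, size_t offset, int flags)
{
    char *addr;

    assert(opaque != NULL);
    addr = (char *)opaque + offset;
    if (flags & TOY_FLAG_POINTER) {
        void *target;

        memcpy(&target, addr, sizeof(target));
        return target;
    }
    return addr;
}
```

Omitting the flag for a pointer field yields the address of the pointer member itself, so any walk over "elements" from there reads unrelated memory, which matches the symptom of sending random data instead of the queue contents.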

Re: [Qemu-devel] [PATCH] Fix virtio migration

2016-01-29 Thread Peter Maydell

On 29 January 2016 at 13:18, Dr. David Alan Gilbert (git)
 wrote:
> From: "Dr. David Alan Gilbert" 
>
> I misunderstood the vmstate macro definition when I reworked the
> virtio .get/.put.
> The VMSTATE_STRUCT_VARRAY_KNOWN, was described as being for "a
> variable length array (i.e. _type *_field) but we know the
> length".  However it actually specified operation for arrays embedded in
> the struct (i.e. _type _field[]) since it lacked the VMS_POINTER
> flag. This caused offset calculation to be completely off, examining and
> potentially sending random data instead of the VirtQueue content.
>
> Replace the otherwise unused VMSTATE_STRUCT_VARRAY_KNOWN with a
> VMSTATE_STRUCT_VARRAY_POINTER_KNOWN that includes the VMS_POINTER flag
> (so now actually doing what it advertises) and use it in the virtio
> migration code.

Yeah, these macro names are a bit of a mess. I had an idea ages
back about autogenerating them all as an orthogonal cross product
of the different kinds of thing (and filling in some of the random
gaps in coverage as a side effect), but I never really got anywhere
with it.

thanks
-- PMM

Re: [Qemu-devel] [PATCH 1/1] arm: virt: change GPIO trigger interrupt to pulse

2016-01-29 Thread Wei Huang



On 01/29/2016 04:10 AM, Shannon Zhao wrote:
> Hi,
> 
> This makes ACPI work well but makes DT not work. The reason is systemd or
> acpid open /dev/input/event0 failed. So the interrupt could be injected and
> could see under /proc/interrupts but guest doesn't have any action. I'll
> investigate why it opens failed later.

That is interesting. Could you try it with the following? This reverses
the order to down-up and worked on ACPI case.

qemu_set_irq(qdev_get_gpio_in(pl061_dev, 3), 0);
qemu_set_irq(qdev_get_gpio_in(pl061_dev, 3), 1);

Thanks,
-Wei

> 
> On Friday, January 29, 2016, Wei Huang wrote:
> 
>> When QEMU is hook'ed up with libvirt/virsh, the first ACPI reboot
>> request will succeed; but the following shutdown/reboot requests
>> fail to trigger VMs to react. Notice that in mach-virt machine
>> model GPIO is defined as edge-triggered and active-high in ACPI.
>> This patch changes the behavior of powerdown notifier from PULLUP
>> to PULSE. It solves the problem described above (i.e. reboot
>> continues to work).
>>
>> Signed-off-by: Wei Huang >
>> ---
>>  hw/arm/virt.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>> index 05f9087..b5468a9 100644
>> --- a/hw/arm/virt.c
>> +++ b/hw/arm/virt.c
>> @@ -546,7 +546,7 @@ static DeviceState *pl061_dev;
>>  static void virt_powerdown_req(Notifier *n, void *opaque)
>>  {
>>  /* use gpio Pin 3 for power button event */
>> -qemu_set_irq(qdev_get_gpio_in(pl061_dev, 3), 1);
>> +qemu_irq_pulse(qdev_get_gpio_in(pl061_dev, 3));
>>  }
>>
>>  static Notifier virt_system_powerdown_notifier = {
>> --
>> 1.8.3.1
>>
>>
>
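
The commit message quoted above says the GPIO is edge-triggered and active-high, which would explain why only the first request is seen when the line is raised and never lowered. The toy model below counts rising edges to show the effect; ToyLine and the toy_line_* functions are assumptions for illustration and do not model the PL061 or QEMU's qemu_irq API.

```c
#include <assert.h>

/* Toy input line that only reacts to 0 -> 1 transitions. */
typedef struct {
    int level;
    int rising_edges;
} ToyLine;

static void toy_line_set(ToyLine *line, int level)
{
    assert(level == 0 || level == 1);
    if (!line->level && level) {
        line->rising_edges++;   /* an edge-triggered consumer fires here */
    }
    line->level = level;
}

/* Raise and then lower the line, leaving it ready for the next event. */
static void toy_line_pulse(ToyLine *line)
{
    toy_line_set(line, 1);
    toy_line_set(line, 0);
}
```

Setting the line to 1 twice produces a single edge, while two pulses produce two. The down-up order suggested in the thread also yields one edge per request in this model, because the line is lowered before it is raised again.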

Re: [Qemu-devel] [PATCH 1/1] arm: virt: change GPIO trigger interrupt to pulse

2016-01-29 Thread Wei Huang



On 01/29/2016 08:46 AM, Shannon Zhao wrote:
> 
> 
> On 2016/1/29 22:35, Wei Huang wrote:
>>
>>
>> On 01/29/2016 04:10 AM, Shannon Zhao wrote:
>>> Hi,
>>>
>>> This makes ACPI work well but makes DT not work. The reason is
>>> systemd or
>>> acpid open /dev/input/event0 failed. So the interrupt could be
>>> injected and
>>> could see under /proc/interrupts but guest doesn't have any action. I'll
>>> investigate why it opens failed later.
>>
>> That is interesting. Could you try it with the following? This reverses
>> the order to down-up and worked on ACPI case.
>>
> Yeah, that's very weird.
> 
>> qemu_set_irq(qdev_get_gpio_in(pl061_dev, 3), 0);
>> qemu_set_irq(qdev_get_gpio_in(pl061_dev, 3), 1);
>>
> I'll try this tomorrow. But even if this works, it's still weird.

To reproduce this case, do the following steps using current upstream
qemu: create vm => reboot vm (succeed) => reboot or shutdown vm (fail).
Apparently the last interrupt wasn't received correctly.

-Wei


> 
>> Thanks,
>> -Wei
>>
>>>
>>> On Friday, January 29, 2016, Wei Huang wrote:
>>>
>>>> When QEMU is hook'ed up with libvirt/virsh, the first ACPI reboot
>>>> request will succeed; but the following shutdown/reboot requests
>>>> fail to trigger VMs to react. Notice that in mach-virt machine
>>>> model GPIO is defined as edge-triggered and active-high in ACPI.
>>>> This patch changes the behavior of powerdown notifier from PULLUP
>>>> to PULSE. It solves the problem described above (i.e. reboot
>>>> continues to work).
>>>>
>>>> Signed-off-by: Wei Huang >
>>>> ---
>>>>  hw/arm/virt.c | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>>>> index 05f9087..b5468a9 100644
>>>> --- a/hw/arm/virt.c
>>>> +++ b/hw/arm/virt.c
>>>> @@ -546,7 +546,7 @@ static DeviceState *pl061_dev;
>>>>  static void virt_powerdown_req(Notifier *n, void *opaque)
>>>>  {
>>>>  /* use gpio Pin 3 for power button event */
>>>> -qemu_set_irq(qdev_get_gpio_in(pl061_dev, 3), 1);
>>>> +qemu_irq_pulse(qdev_get_gpio_in(pl061_dev, 3));
>>>>  }
>>>>
>>>>  static Notifier virt_system_powerdown_notifier = {
>>>> --
>>>> 1.8.3.1
>>>>
>>>>
>>>
>

Re: [Qemu-devel] [PATCH v5 5/5] doc: Introduce coding style for errors

2016-01-29 Thread Lluís Vilanova

Eric Blake writes:

> On 01/28/2016 02:53 PM, Lluís Vilanova wrote:
>> Gives some general guidelines for reporting errors in QEMU.
>> 
>> Signed-off-by: Lluís Vilanova 
>> ---
>> HACKING |   33 +
>> 1 file changed, 33 insertions(+)

> I'm not sure if my v4 review crossed paths with this, but I still see typos:

>> +
>> +WARNING: The special 'error_fatal' and 'error_abort' objects follow the same
>> +constrains as the 'error_report_fatal' and 'error_report_abort' functions.

> s/constrains/constraints/

We did cross paths, sorry. It's fixed on upcoming v6.

Lluis

Re: [Qemu-devel] [PATCH v5 2/5] util: Use new error_report_fatal/abort instead of error_setg(_fatal/abort)

2016-01-29 Thread Lluís Vilanova

David Gibson writes:

> On Thu, Jan 28, 2016 at 10:53:43PM +0100, Lluís Vilanova wrote:
>> Replaces all direct uses of 'error_setg(_fatal/abort)' with
>> 'error_report_fatal/abort'. Also reimplements the former on top of the
>> latter.
>> 
>> Signed-off-by: Lluís Vilanova 

> I think the spapr parts of this will be obsoleted by the cleanups to
> error handling included in the pull request I sent today.

Ok, then I'll rebase once merged. Is there an ETA for the merge?

Thanks,
  Lluis

Re: [Qemu-devel] [PATCH v8 01/16] block: Release dirty bitmaps in bdrv_close()

2016-01-29 Thread Max Reitz

On 28.01.2016 04:01, Fam Zheng wrote:
> On Wed, 01/27 18:59, Max Reitz wrote:
>> bdrv_delete() is not very happy about deleting BlockDriverStates with
>> dirty bitmaps still attached to them. In the past, we got around that
>> very easily by relying on bdrv_close_all() bypassing bdrv_delete(), and
>> bdrv_close() simply ignoring that condition. We should fix that by
>> releasing all dirty bitmaps in bdrv_close() and drop the assertion in
>> bdrv_delete().
>>
>> Signed-off-by: Max Reitz 
>> Reviewed-by: John Snow 
>> ---
>>  block.c | 37 +
>>  1 file changed, 29 insertions(+), 8 deletions(-)
>>
>> diff --git a/block.c b/block.c
>> index 5709d3d..9a31e20 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -88,6 +88,8 @@ static int bdrv_open_inherit(BlockDriverState **pbs, const 
>> char *filename,
>>   const BdrvChildRole *child_role, Error **errp);
>>  
>>  static void bdrv_dirty_bitmap_truncate(BlockDriverState *bs);
>> +static void bdrv_release_all_dirty_bitmaps(BlockDriverState *bs);
>> +
>>  /* If non-zero, use only whitelisted block drivers */
>>  static int use_bdrv_whitelist;
>>  
>> @@ -2157,6 +2159,8 @@ void bdrv_close(BlockDriverState *bs)
>>  
>>  notifier_list_notify(&bs->close_notifiers, bs);
>>  
>> +bdrv_release_all_dirty_bitmaps(bs);
>> +
>>  if (bs->blk) {
>>  blk_dev_change_media_cb(bs->blk, false);
>>  }
>> @@ -2366,7 +2370,6 @@ static void bdrv_delete(BlockDriverState *bs)
>>  assert(!bs->job);
>>  assert(bdrv_op_blocker_is_empty(bs));
>>  assert(!bs->refcnt);
>> -assert(QLIST_EMPTY(&bs->dirty_bitmaps));
>>  
>>  bdrv_close(bs);
>>  
>> @@ -3582,21 +3585,39 @@ static void 
>> bdrv_dirty_bitmap_truncate(BlockDriverState *bs)
>>  }
>>  }
>>  
>> -void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap 
>> *bitmap)
>> +static void bdrv_do_release_matching_dirty_bitmap(BlockDriverState *bs,
>> +  BdrvDirtyBitmap *bitmap)
>>  {
>>  BdrvDirtyBitmap *bm, *next;
>>  QLIST_FOREACH_SAFE(bm, &bs->dirty_bitmaps, list, next) {
>> -if (bm == bitmap) {
>> +if (!bitmap || bm == bitmap) {
>>  assert(!bdrv_dirty_bitmap_frozen(bm));
>> -QLIST_REMOVE(bitmap, list);
>> -hbitmap_free(bitmap->bitmap);
>> -g_free(bitmap->name);
>> -g_free(bitmap);
>> -return;
>> +QLIST_REMOVE(bm, list);
>> +hbitmap_free(bm->bitmap);
>> +g_free(bm->name);
>> +g_free(bm);
>> +
>> +if (bitmap) {
>> +return;
>> +}
>>  }
>>  }
>>  }
>>  
>> +void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap 
>> *bitmap)
>> +{
>> +bdrv_do_release_matching_dirty_bitmap(bs, bitmap);
>> +}
>> +
>> +/**
>> + * Release all dirty bitmaps attached to a BDS (for use in bdrv_close()). 
>> There
>> + * must not be any frozen bitmaps attached.
> 
> Should we assert that?

Well, it is asserted in bdrv_do_release_matching_dirty_bitmap().

> And IIUC the intention of this function is to release
> all monitor owned (i.e. user created) dirty bitmaps, which must be named. If
> so, can we assert that too?

Probably we can. I don't really know if it would be worth it, though
(even if it isn't much code). There's not much harm in releasing unnamed
bitmaps here, the main issue would be the question "where did they come
from?".

If I do it, I'll rename the function bdrv_release_named_dirty_bitmaps()
and then assert in bdrv_close() that no bitmaps are attached any more.

Max

> 
> Fam
> 
>> + */
>> +static void bdrv_release_all_dirty_bitmaps(BlockDriverState *bs)
>> +{
>> +bdrv_do_release_matching_dirty_bitmap(bs, NULL);
>> +}
>> +
>>  void bdrv_disable_dirty_bitmap(BdrvDirtyBitmap *bitmap)
>>  {
>>  assert(!bdrv_dirty_bitmap_frozen(bitmap));
>> -- 
>> 2.7.0
>>
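
The quoted patch folds "release one bitmap" and "release all bitmaps" into a single loop where a NULL argument means "match everything". A minimal sketch of that pattern on a plain singly linked list follows; ToyState, ToyBitmap and the toy_* functions are hypothetical and do not use QEMU's QLIST macros or its frozen-bitmap assertion.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-ins for a block state and its attached bitmaps. */
typedef struct ToyBitmap ToyBitmap;
struct ToyBitmap {
    ToyBitmap *next;
};

typedef struct {
    ToyBitmap *bitmaps;
} ToyState;

static ToyBitmap *toy_bitmap_create(ToyState *s)
{
    ToyBitmap *bm = calloc(1, sizeof(*bm));

    assert(bm != NULL);
    bm->next = s->bitmaps;
    s->bitmaps = bm;
    return bm;
}

/*
 * Release 'match', or every bitmap when 'match' is NULL.
 * Returns the number of bitmaps freed.
 */
static int toy_release_matching(ToyState *s, ToyBitmap *match)
{
    ToyBitmap **link = &s->bitmaps;
    int freed = 0;

    while (*link) {
        ToyBitmap *bm = *link;

        if (!match || bm == match) {
            *link = bm->next;   /* unlink before freeing */
            free(bm);
            freed++;
            if (match) {
                break;          /* a specific bitmap appears only once */
            }
        } else {
            link = &bm->next;
        }
    }
    return freed;
}
```

Note that the loop frees the iterated element rather than the argument, which is the same switch from 'bitmap' to 'bm' that the patch makes so that the NULL case works.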





[Qemu-devel] [PATCH v10 17/25] qom: Use typedef for Visitor

2016-01-29 Thread Eric Blake

No need to repeat 'struct Visitor' when we already have it in
typedefs.h.  Omitting the redundant 'struct' also makes a later
patch easier to search for all object property callbacks that
are associated with a Visitor.

Signed-off-by: Eric Blake 
Reviewed-by: Marc-André Lureau 

---
v10: no change
v9: no change
v8: new patch
---
 include/qom/object.h   | 9 -
 hw/misc/edu.c  | 4 ++--
 hw/virtio/virtio-balloon.c | 6 +++---
 qom/object.c   | 4 ++--
 target-i386/cpu.c  | 4 ++--
 5 files changed, 13 insertions(+), 14 deletions(-)

diff --git a/include/qom/object.h b/include/qom/object.h
index d0dafe9..3e7e99d 100644
--- a/include/qom/object.h
+++ b/include/qom/object.h
@@ -18,10 +18,9 @@
 #include 
 #include 
 #include "qemu/queue.h"
+#include "qemu/typedefs.h"
 #include "qapi/error.h"

-struct Visitor;
-
 struct TypeImpl;
 typedef struct TypeImpl *Type;

@@ -298,7 +297,7 @@ typedef struct InterfaceInfo InterfaceInfo;
  * Called when trying to get/set a property.
  */
 typedef void (ObjectPropertyAccessor)(Object *obj,
-  struct Visitor *v,
+  Visitor *v,
   void *opaque,
   const char *name,
   Error **errp);
@@ -1025,7 +1024,7 @@ void object_unparent(Object *obj);
  *
  * Reads a property from a object.
  */
-void object_property_get(Object *obj, struct Visitor *v, const char *name,
+void object_property_get(Object *obj, Visitor *v, const char *name,
  Error **errp);

 /**
@@ -1161,7 +1160,7 @@ void object_property_get_uint16List(Object *obj, const char *name,
  *
  * Writes a property to a object.
  */
-void object_property_set(Object *obj, struct Visitor *v, const char *name,
+void object_property_set(Object *obj, Visitor *v, const char *name,
  Error **errp);

 /**
diff --git a/hw/misc/edu.c b/hw/misc/edu.c
index 43d5b18..a7171eb 100644
--- a/hw/misc/edu.c
+++ b/hw/misc/edu.c
@@ -362,8 +362,8 @@ static void pci_edu_uninit(PCIDevice *pdev)
 timer_del(>dma_timer);
 }

-static void edu_obj_uint64(Object *obj, struct Visitor *v, void *opaque,
-const char *name, Error **errp)
+static void edu_obj_uint64(Object *obj, Visitor *v, void *opaque,
+   const char *name, Error **errp)
 {
 uint64_t *val = opaque;

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index ba1d393..02e2bb9 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -110,7 +110,7 @@ static void balloon_stats_poll_cb(void *opaque)
 virtio_notify(vdev, s->svq);
 }

-static void balloon_stats_get_all(Object *obj, struct Visitor *v,
+static void balloon_stats_get_all(Object *obj, Visitor *v,
   void *opaque, const char *name, Error **errp)
 {
 Error *err = NULL;
@@ -148,7 +148,7 @@ out:
 error_propagate(errp, err);
 }

-static void balloon_stats_get_poll_interval(Object *obj, struct Visitor *v,
+static void balloon_stats_get_poll_interval(Object *obj, Visitor *v,
 void *opaque, const char *name,
 Error **errp)
 {
@@ -156,7 +156,7 @@ static void balloon_stats_get_poll_interval(Object *obj, struct Visitor *v,
 visit_type_int(v, >stats_poll_interval, name, errp);
 }

-static void balloon_stats_set_poll_interval(Object *obj, struct Visitor *v,
+static void balloon_stats_set_poll_interval(Object *obj, Visitor *v,
 void *opaque, const char *name,
 Error **errp)
 {
diff --git a/qom/object.c b/qom/object.c
index 5ff97ab..4d7d8c8 100644
--- a/qom/object.c
+++ b/qom/object.c
@@ -2184,7 +2184,7 @@ typedef struct {
 char *target_name;
 } AliasProperty;

-static void property_get_alias(Object *obj, struct Visitor *v, void *opaque,
+static void property_get_alias(Object *obj, Visitor *v, void *opaque,
const char *name, Error **errp)
 {
 AliasProperty *prop = opaque;
@@ -2192,7 +2192,7 @@ static void property_get_alias(Object *obj, struct Visitor *v, void *opaque,
 object_property_get(prop->target_obj, v, prop->target_name, errp);
 }

-static void property_set_alias(Object *obj, struct Visitor *v, void *opaque,
+static void property_set_alias(Object *obj, Visitor *v, void *opaque,
const char *name, Error **errp)
 {
 AliasProperty *prop = opaque;
diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index b255644..60bfa80 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -2945,7 +2945,7 @@ typedef struct BitProperty {
 } BitProperty;

 static void x86_cpu_get_bit_prop(Object *obj,
- struct Visitor *v,
+

[Qemu-devel] [PATCH v10 06/25] hmp: Drop pointless allocation during qapi visit

2016-01-29 Thread Eric Blake

The qapi visitor contract allows us to visit a virtual structure,
where we don't have any corresponding qapi struct.  Most such uses
pass NULL for @obj; but these two callers were passing a dummy
pointer, which then gets allocated to heap memory but then
immediately freed without use.  Clean this up to suppress unwanted
allocation, like we do elsewhere.

Signed-off-by: Eric Blake 
Reviewed-by: Marc-André Lureau 

---
v10: resplit 4/37 and 5/37 by action rather than file, retain R-b.
v9: no change
v8: no change
v7: place earlier in series, drop attempts to provide a 'kind' string,
drop bogus avoidance of qmp_object_del() on error
v6: new patch, split from RFC on v5 7/46
---
 hmp.c | 4 +---
 vl.c  | 4 +---
 2 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/hmp.c b/hmp.c
index 54f2620..9537f7b 100644
--- a/hmp.c
+++ b/hmp.c
@@ -1656,7 +1656,6 @@ void hmp_object_add(Monitor *mon, const QDict *qdict)
 QemuOpts *opts;
 char *type = NULL;
 char *id = NULL;
-void *dummy = NULL;
 OptsVisitor *ov;
 QDict *pdict;

@@ -1668,7 +1667,7 @@ void hmp_object_add(Monitor *mon, const QDict *qdict)
 ov = opts_visitor_new(opts);
 pdict = qdict_clone_shallow(qdict);

-visit_start_struct(opts_get_visitor(ov), &dummy, NULL, NULL, 0, &err);
+visit_start_struct(opts_get_visitor(ov), NULL, NULL, NULL, 0, &err);
 if (err) {
 goto out_clean;
 }
@@ -1700,7 +1699,6 @@ out_clean:
 qemu_opts_del(opts);
 g_free(id);
 g_free(type);
-g_free(dummy);

 out:
 hmp_handle_error(mon, &err);
diff --git a/vl.c b/vl.c
index f043009..b96590a 100644
--- a/vl.c
+++ b/vl.c
@@ -2824,7 +2824,6 @@ static int object_create(void *opaque, QemuOpts *opts, Error **errp)
 Error *err = NULL;
 char *type = NULL;
 char *id = NULL;
-void *dummy = NULL;
 OptsVisitor *ov;
 QDict *pdict;
 bool (*type_predicate)(const char *) = opaque;
@@ -2832,7 +2831,7 @@ static int object_create(void *opaque, QemuOpts *opts, Error **errp)
 ov = opts_visitor_new(opts);
 pdict = qemu_opts_to_qdict(opts, NULL);

-visit_start_struct(opts_get_visitor(ov), &dummy, NULL, NULL, 0, &err);
+visit_start_struct(opts_get_visitor(ov), NULL, NULL, NULL, 0, &err);
 if (err) {
 goto out;
 }
@@ -2867,7 +2866,6 @@ out:
 QDECREF(pdict);
 g_free(id);
 g_free(type);
-g_free(dummy);
 if (err) {
 error_report_err(err);
 return -1;
-- 
2.5.0
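
The commit message above describes a "virtual" struct visit, where passing NULL for the object pointer means no storage is wanted. The sketch below models why the dummy pointer caused a pointless allocation; toy_start_struct() and toy_alloc_count are illustrative assumptions and not the OptsVisitor code.

```c
#include <assert.h>
#include <stdlib.h>

static int toy_alloc_count;   /* how many times storage was allocated */

/*
 * Illustrative stand-in for an input visitor's start_struct step:
 * allocate backing storage only when the caller asks for an object.
 */
static void toy_start_struct(void **obj, size_t size)
{
    if (obj) {
        *obj = calloc(1, size ? size : 1);
        assert(*obj != NULL);
        toy_alloc_count++;
    }
}
```

A caller that passes the address of a dummy pointer gets a small allocation it must later free, while a caller that passes NULL gets none, so dropping the dummy also drops the matching free.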

[Qemu-devel] [PATCH v10 09/25] balloon: Improve use of qapi visitor

2016-01-29 Thread Eric Blake

Rework the control flow of balloon_stats_get_all() to make it
easier for a later patch to split visit_end_struct().  Also
switch to the uint64 visitor to match the data type.

Signed-off-by: Eric Blake 

---
v10: defer out_nested label to later patch, drop Marc-Andre's R-b
v9: no change
v8: no change
v7: place earlier in series
v6: new patch, split from RFC on v5 7/46
---
 hw/virtio/virtio-balloon.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 9671635..ba1d393 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -130,9 +130,11 @@ static void balloon_stats_get_all(Object *obj, struct Visitor *v,
 if (err) {
 goto out_end;
 }
-for (i = 0; !err && i < VIRTIO_BALLOON_S_NR; i++) {
-visit_type_int64(v, (int64_t *) &s->stats[i], balloon_stat_names[i],
- &err);
+for (i = 0; i < VIRTIO_BALLOON_S_NR; i++) {
+visit_type_uint64(v, &s->stats[i], balloon_stat_names[i], &err);
+if (err) {
+break;
+}
 }
 error_propagate(errp, err);
 err = NULL;
-- 
2.5.0

[Qemu-devel] [PATCH v10 16/25] qapi: Don't cast Enum* to int*

2016-01-29 Thread Eric Blake

C compilers are allowed to represent enums as a smaller type
than int, if all enum values fit in the smaller type.  There
are even compiler flags that force the use of this smaller
representation, although using them changes the ABI of a
binary. Therefore, our generated code for visit_type_ENUM()
(for all qapi enums) was wrong for casting Enum* to int* when
calling visit_type_enum().

It appears that no one has been using compiler ABI switches
for qemu, because if they had, we are potentially dereferencing
beyond bounds or even risking a SIGBUS on platforms where
unaligned pointer dereferencing is fatal.  But it is still
better to avoid the practice entirely, and just use the correct
types.

This matches the fix for alternate qapi types, done earlier in
commit 0426d53 "qapi: Simplify visiting of alternate types",
with generated code changing as:

| void visit_type_QType(Visitor *v, QType *obj, const char *name, Error **errp)
| {
|-visit_type_enum(v, (int *)obj, QType_lookup, "QType", name, errp);
|+int value = *obj;
|+visit_type_enum(v, &value, QType_lookup, "QType", name, errp);
|+*obj = value;
| }

Signed-off-by: Eric Blake 
Reviewed-by: Marc-André Lureau 

---
v10: s/tmp/value/, shorter commit message
v9: mention earlier commit id, enhance commit message
v8: no change
v7: rebase on typo fix
v6: new patch
---
 scripts/qapi-visit.py | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/scripts/qapi-visit.py b/scripts/qapi-visit.py
index f98bb5f..ba75667 100644
--- a/scripts/qapi-visit.py
+++ b/scripts/qapi-visit.py
@@ -178,12 +178,13 @@ out:


 def gen_visit_enum(name):
-# FIXME cast from enum *obj to int * invalidly assumes enum is int
 return mcgen('''

 void visit_type_%(c_name)s(Visitor *v, %(c_name)s *obj, const char *name, Error **errp)
 {
-visit_type_enum(v, (int *)obj, %(c_name)s_lookup, "%(name)s", name, errp);
+int value = *obj;
+visit_type_enum(v, &value, %(c_name)s_lookup, "%(name)s", name, errp);
+*obj = value;
 }
 ''',
  c_name=c_name(name), name=name)
-- 
2.5.0

[Qemu-devel] [PATCH v10 21/25] qapi: Drop unused 'kind' for struct/enum visit

2016-01-29 Thread Eric Blake

visit_start_struct() and visit_type_enum() had a 'kind' argument
that was usually set to either the stringized version of the
corresponding qapi type name, or to NULL (although some clients
didn't even get that right).  But nothing ever used the argument.
It's even hard to argue that it would be useful in a debugger,
as a stack backtrace also tells which type is being visited.

Therefore, drop the 'kind' argument as dead.

Signed-off-by: Eric Blake 
Reviewed-by: Marc-André Lureau 

---
v10: rebase to earlier changes, tweak copyright
v9: no change
v8: rebase to 'name' motion
v7: new patch
---
 include/qapi/visitor.h  |  5 ++---
 include/qapi/visitor-impl.h | 11 ---
 scripts/qapi-event.py   |  2 +-
 scripts/qapi-visit.py   | 12 ++--
 qapi/qapi-visit-core.c  | 16 +++-
 hmp.c   |  2 +-
 hw/core/qdev-properties.c   |  6 ++
 hw/ppc/spapr_drc.c  |  4 ++--
 hw/virtio/virtio-balloon.c  |  4 ++--
 qapi/opts-visitor.c |  2 +-
 qapi/qapi-dealloc-visitor.c |  6 ++
 qapi/qmp-input-visitor.c|  2 +-
 qapi/qmp-output-visitor.c   |  3 +--
 qom/object.c|  8 
 vl.c|  2 +-
 15 files changed, 37 insertions(+), 48 deletions(-)

diff --git a/include/qapi/visitor.h b/include/qapi/visitor.h
index 0b5cd41..997555d 100644
--- a/include/qapi/visitor.h
+++ b/include/qapi/visitor.h
@@ -28,7 +28,7 @@ typedef struct GenericList
 } GenericList;

 void visit_start_struct(Visitor *v, const char *name, void **obj,
-const char *kind, size_t size, Error **errp);
+size_t size, Error **errp);
 void visit_end_struct(Visitor *v, Error **errp);
 void visit_start_implicit_struct(Visitor *v, void **obj, size_t size,
  Error **errp);
@@ -54,8 +54,7 @@ bool visit_optional(Visitor *v, const char *name, bool 
*present);
 void visit_get_next_type(Visitor *v, const char *name, QType *type,
  bool promote_int, Error **errp);
 void visit_type_enum(Visitor *v, const char *name, int *obj,
- const char *const strings[], const char *kind,
- Error **errp);
+ const char *const strings[], Error **errp);
 void visit_type_int(Visitor *v, const char *name, int64_t *obj, Error **errp);
 void visit_type_uint8(Visitor *v, const char *name, uint8_t *obj,
   Error **errp);
diff --git a/include/qapi/visitor-impl.h b/include/qapi/visitor-impl.h
index 734cc13..337f999 100644
--- a/include/qapi/visitor-impl.h
+++ b/include/qapi/visitor-impl.h
@@ -19,7 +19,7 @@ struct Visitor
 {
 /* Must be set */
 void (*start_struct)(Visitor *v, const char *name, void **obj,
- const char *kind, size_t size, Error **errp);
+ size_t size, Error **errp);
 void (*end_struct)(Visitor *v, Error **errp);

 void (*start_implicit_struct)(Visitor *v, void **obj, size_t size,
@@ -31,8 +31,7 @@ struct Visitor
 void (*end_list)(Visitor *v, Error **errp);

 void (*type_enum)(Visitor *v, const char *name, int *obj,
-  const char *const strings[], const char *kind,
-  Error **errp);
+  const char *const strings[], Error **errp);
 /* May be NULL; only needed for input visitors. */
 void (*get_next_type)(Visitor *v, const char *name, QType *type,
   bool promote_int, Error **errp);
@@ -61,10 +60,8 @@ struct Visitor
 };

 void input_type_enum(Visitor *v, const char *name, int *obj,
- const char *const strings[], const char *kind,
- Error **errp);
+ const char *const strings[], Error **errp);
 void output_type_enum(Visitor *v, const char *name, int *obj,
-  const char *const strings[], const char *kind,
-  Error **errp);
+  const char *const strings[], Error **errp);

 #endif
diff --git a/scripts/qapi-event.py b/scripts/qapi-event.py
index edd446b..07bcb73 100644
--- a/scripts/qapi-event.py
+++ b/scripts/qapi-event.py
@@ -63,7 +63,7 @@ def gen_event_send(name, arg_type):
 qov = qmp_output_visitor_new();
 v = qmp_output_get_visitor(qov);

-visit_start_struct(v, "%(name)s", NULL, NULL, 0, &err);
+visit_start_struct(v, "%(name)s", NULL, 0, &err);
 ''',
  name=name)
 ret += gen_err_check()
diff --git a/scripts/qapi-visit.py b/scripts/qapi-visit.py
index 35505ac..308000f 100644
--- a/scripts/qapi-visit.py
+++ b/scripts/qapi-visit.py
@@ -122,7 +122,7 @@ void visit_type_%(c_name)s(Visitor *v, const char *name, 
%(c_name)s **obj, Error
 {
 Error *err = NULL;

-visit_start_struct(v, name, (void **)obj, "%(name)s", sizeof(%(c_name)s), &err);
+visit_start_struct(v, name, (void **)obj, sizeof(%(c_name)s), &err);
 if (err) {

[Qemu-devel] [PATCH v10 25/25] qmp: Don't abuse stack to track qmp-output root

2016-01-29 Thread Eric Blake

The previous commit documented an inconsistency in how we are
using the stack of qmp-output-visitor.  Normally, pushing a
single top-level object puts the object on the stack twice:
once as the root, and once as the current container being
appended to; but popping that struct only pops once.  However,
qmp_output_add() was trying to either set up the added object
as the new root (works if you parse two top-level scalars in a
row: the second replaces the first as the root) or as a member
of the current container (works as long as you have an open
container on the stack; but if you have popped the first
top-level container, it then resolves to the root and still
tries to add into that existing container).

Fix the stupidity by not tracking two separate things in the
stack.  Drop the now-useless qmp_output_first() and
qmp_output_last() while at it.

Saved for a later patch: we still are rather sloppy in that
qmp_output_get_object() can be called in the middle of a parse,
rather than requiring that a visit is complete.
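
The separation described above can be sketched as follows. This is a simplified model under stated assumptions, not the patch itself: Node, OutputVisitor, ov_add(), ov_start() and ov_end() are hypothetical stand-ins for QObject, QmpOutputVisitor and its helpers, and reference counting is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for QEMU's QObject containers; not the real API. */
typedef enum { NODE_SCALAR, NODE_CONTAINER } NodeKind;

typedef struct Node {
    NodeKind kind;
    int n_children;          /* number of values added to a container */
} Node;

#define MAX_DEPTH 8

typedef struct OutputVisitor {
    Node *stack[MAX_DEPTH];  /* containers that are still open */
    int depth;
    Node *root;              /* root of the visit, tracked separately */
} OutputVisitor;

/* Add a value: into the innermost open container if any, else as root. */
static void ov_add(OutputVisitor *ov, Node *value)
{
    if (ov->depth == 0) {
        ov->root = value;    /* a new top-level value replaces the old root */
    } else {
        ov->stack[ov->depth - 1]->n_children++;
    }
}

/* Open a container: add it first, then push it so later values nest in it. */
static void ov_start(OutputVisitor *ov, Node *container)
{
    assert(container->kind == NODE_CONTAINER && ov->depth < MAX_DEPTH);
    ov_add(ov, container);
    ov->stack[ov->depth++] = container;
}

/* Close the innermost container; the root pointer is left untouched. */
static void ov_end(OutputVisitor *ov)
{
    assert(ov->depth > 0);
    ov->depth--;
}
```

With the root held outside the stack, visiting two top-level containers in a row behaves like visiting two top-level scalars: the second replaces the first as root instead of being appended into it.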

Signed-off-by: Eric Blake 

---
v10: rebase to earlier changes, inline qmp_output_last, drop R-b
v9: no change
v8: rebase to earlier changes
v7: retitle; rebase to earlier changes, drop qmp_output_first()
v6: no change
---
 qapi/qmp-output-visitor.c | 89 ++-
 1 file changed, 26 insertions(+), 63 deletions(-)

diff --git a/qapi/qmp-output-visitor.c b/qapi/qmp-output-visitor.c
index b0e3980..05b872a 100644
--- a/qapi/qmp-output-visitor.c
+++ b/qapi/qmp-output-visitor.c
@@ -30,16 +30,8 @@ typedef QTAILQ_HEAD(QStack, QStackEntry) QStack;
 struct QmpOutputVisitor
 {
 Visitor visitor;
-/* FIXME: we are abusing stack to hold two separate pieces of
- * information: the current root object in slot 0, and the stack
- * of N objects still being built in slots 1 through N (for N+1
- * slots in use).  Worse, our behavior is inconsistent:
- * qmp_output_add_obj() visiting two top-level scalars in a row
- * discards the first in favor of the second, but visiting two
- * top-level objects in a row tries to append the second object
- * into the first (since the first object was placed in the stack
- * in both slot 0 and 1, but only popped from slot 1).  */
-QStack stack;
+QStack stack; /* Stack of containers that haven't yet been finished */
+QObject *root; /* Root of the output visit */
 };

 #define qmp_output_add(qov, name, value) \
@@ -56,6 +48,7 @@ static void qmp_output_push_obj(QmpOutputVisitor *qov, 
QObject *value)
 {
 QStackEntry *e = g_malloc0(sizeof(*e));

+assert(qov->root);
 assert(value);
 e->value = value;
 if (qobject_type(e->value) == QTYPE_QLIST) {
@@ -78,60 +71,32 @@ static QObject *qmp_output_pop(QmpOutputVisitor *qov)
 return value;
 }

-/* Grab the root QObject, if any */
-static QObject *qmp_output_first(QmpOutputVisitor *qov)
-{
-QStackEntry *e = QTAILQ_LAST(&qov->stack, QStack);
-
-if (!e) {
-/* No root */
-return NULL;
-}
-assert(e->value);
-return e->value;
-}
-
-/* Peek at the top of the stack of QObjects being built.
- * The stack must not be empty. */
-static QObject *qmp_output_last(QmpOutputVisitor *qov)
-{
-QStackEntry *e = QTAILQ_FIRST(&qov->stack);
-
-assert(e && e->value);
-return e->value;
-}
-
 /* Add @value to the current QObject being built.
  * If the stack is visiting a dictionary or list, @value is now owned
  * by that container. Otherwise, @value is now the root.  */
 static void qmp_output_add_obj(QmpOutputVisitor *qov, const char *name,
QObject *value)
 {
-QObject *cur;
+QStackEntry *e = QTAILQ_FIRST(&qov->stack);
+QObject *cur = e ? e->value : NULL;

-if (QTAILQ_EMPTY(&qov->stack)) {
-/* Stack was empty, track this object as root */
-qmp_output_push_obj(qov, value);
-return;
-}
-
-cur = qmp_output_last(qov);
-
-switch (qobject_type(cur)) {
-case QTYPE_QDICT:
-assert(name);
-qdict_put_obj(qobject_to_qdict(cur), name, value);
-break;
-case QTYPE_QLIST:
-qlist_append_obj(qobject_to_qlist(cur), value);
-break;
-default:
-/* The previous root was a scalar, replace it with a new root */
-/* FIXME this is abusing the stack; see comment above */
-qobject_decref(qmp_output_pop(qov));
-assert(QTAILQ_EMPTY(&qov->stack));
-qmp_output_push_obj(qov, value);
-break;
+if (!cur) {
+/* FIXME we should require the user to reset the visitor, rather
+ * than throwing away the previous root */
+qobject_decref(qov->root);
+qov->root = value;
+} else {
+switch (qobject_type(cur)) {
+case QTYPE_QDICT:
+assert(name);
+qdict_put_obj(qobject_to_qdict(cur), name, value);
+break;
+case QTYPE_QLIST:
+qlist_append_obj(qobject_to_qlist(cur),

[Qemu-devel] [PATCH v10 22/25] qapi: Tighten qmp_input_end_list()

2016-01-29 Thread Eric Blake

The only way that qmp_input_pop() will set errp is if a dictionary
was the most recent thing pushed.  Since we don't have any
push(struct)/pop(list) or push(list)/pop(struct) mismatches (such
a mismatch is a programming bug), we therefore cannot set errp
inside qmp_input_end_list().  Make this obvious by
using &error_abort.  A later patch will then remove the errp
parameter of qmp_input_pop(), but that will first require the
larger task of splitting visit_end_struct().
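
The idiom can be sketched in isolation as below. This is a miniature model, not QEMU's Error API: Error, abort_sentinel, set_error(), pop_frame() and end_list() are hypothetical names, and the "unvisited keys" failure condition is an assumption chosen only to give the dictionary case something to report.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical miniature of an Error API; QEMU's real one is richer. */
typedef struct Error { const char *msg; } Error;

static Error *abort_sentinel;   /* plays the role of error_abort */

/* Report an error through errp; abort if the caller passed the sentinel. */
static void set_error(Error **errp, Error *err, const char *msg)
{
    if (errp == &abort_sentinel) {
        fprintf(stderr, "unexpected error: %s\n", msg);
        abort();
    }
    if (errp) {
        err->msg = msg;
        *errp = err;
    }
}

typedef enum { FRAME_DICT, FRAME_LIST } FrameKind;

/* Pop a frame: in this sketch only a dictionary frame can fail. */
static void pop_frame(FrameKind kind, int unvisited_keys, Error **errp)
{
    static Error storage;

    if (kind == FRAME_DICT && unvisited_keys > 0) {
        set_error(errp, &storage, "unvisited dictionary keys");
    }
}

/* Ending a list cannot fail, so any error here is a programming bug. */
static void end_list(void)
{
    pop_frame(FRAME_LIST, 0, &abort_sentinel);
}
```

Passing the sentinel documents at the call site that failure is impossible, and turns a violated assumption into an immediate abort rather than an error that a caller might silently ignore.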

Signed-off-by: Eric Blake 

---
v10: new patch, split out from 18/37
---
 qapi/qmp-input-visitor.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/qapi/qmp-input-visitor.c b/qapi/qmp-input-visitor.c
index bf25249..ce3c0d6 100644
--- a/qapi/qmp-input-visitor.c
+++ b/qapi/qmp-input-visitor.c
@@ -205,7 +205,7 @@ static void qmp_input_end_list(Visitor *v, Error **errp)
 {
 QmpInputVisitor *qiv = to_qiv(v);

-qmp_input_pop(qiv, errp);
+qmp_input_pop(qiv, &error_abort);
 }

 static void qmp_input_get_next_type(Visitor *v, const char *name, QType *type,
-- 
2.5.0

[Qemu-devel] [PATCH v10 12/25] qapi-visit: Kill unused visit_end_union()

2016-01-29 Thread Eric Blake

The generated code can call visit_end_union() without having called
visit_start_union().  Example:

if (!*obj) {
goto out_obj;
}
visit_type_CpuInfoBase_fields(v, (CpuInfoBase **)obj, &err);
if (err) {
goto out_obj; // if we go from here...
}
if (!visit_start_union(v, !!(*obj)->u.data, &err) || err) {
goto out_obj;
}
switch ((*obj)->arch) {
[...]
}
out_obj:
// ... then *obj is true, and ...
error_propagate(errp, err);
err = NULL;
if (*obj) {
// we end up here
visit_end_union(v, !!(*obj)->u.data, &err);
}
error_propagate(errp, err);

Harmless only because no visitor implements end_union().  Clean it up
anyway, by deleting the function as useless.

Messed up since we have visit_end_union (commit cee2ded).

Signed-off-by: Markus Armbruster 
Message-Id: <1453902888-20457-3-git-send-email-arm...@redhat.com>
[expand scope of patch to delete rather than repair]
Signed-off-by: Eric Blake 

---
v10: new patch; take Markus' commit message as a starting point (hence
his S-o-b), but rewrite the patch and claim ownership
---
 include/qapi/visitor.h  | 1 -
 include/qapi/visitor-impl.h | 1 -
 scripts/qapi-visit.py   | 5 -
 qapi/qapi-visit-core.c  | 8 +---
 4 files changed, 1 insertion(+), 14 deletions(-)

diff --git a/include/qapi/visitor.h b/include/qapi/visitor.h
index a14a16d..83ac3ad 100644
--- a/include/qapi/visitor.h
+++ b/include/qapi/visitor.h
@@ -70,6 +70,5 @@ void visit_type_str(Visitor *v, char **obj, const char *name, 
Error **errp);
 void visit_type_number(Visitor *v, double *obj, const char *name, Error 
**errp);
 void visit_type_any(Visitor *v, QObject **obj, const char *name, Error **errp);
 bool visit_start_union(Visitor *v, bool data_present, Error **errp);
-void visit_end_union(Visitor *v, bool data_present, Error **errp);

 #endif
diff --git a/include/qapi/visitor-impl.h b/include/qapi/visitor-impl.h
index 44a21b7..f314894 100644
--- a/include/qapi/visitor-impl.h
+++ b/include/qapi/visitor-impl.h
@@ -58,7 +58,6 @@ struct Visitor
 /* visit_type_size() falls back to (*type_uint64)() if type_size is unset 
*/
 void (*type_size)(Visitor *v, uint64_t *obj, const char *name, Error 
**errp);
 bool (*start_union)(Visitor *v, bool data_present, Error **errp);
-void (*end_union)(Visitor *v, bool data_present, Error **errp);
 };

 void input_type_enum(Visitor *v, int *obj, const char * const strings[],
diff --git a/scripts/qapi-visit.py b/scripts/qapi-visit.py
index ec16e36..f98bb5f 100644
--- a/scripts/qapi-visit.py
+++ b/scripts/qapi-visit.py
@@ -320,11 +320,6 @@ void visit_type_%(c_name)s(Visitor *v, %(c_name)s **obj, 
const char *name, Error
 out_obj:
 error_propagate(errp, err);
 err = NULL;
-if (*obj) {
-visit_end_union(v, !!(*obj)->u.data, &err);
-}
-error_propagate(errp, err);
-err = NULL;
 visit_end_struct(v, &err);
 out:
 error_propagate(errp, err);
diff --git a/qapi/qapi-visit-core.c b/qapi/qapi-visit-core.c
index 6d63e40..f96d82f 100644
--- a/qapi/qapi-visit-core.c
+++ b/qapi/qapi-visit-core.c
@@ -1,6 +1,7 @@
 /*
  * Core Definitions for QAPI Visitor Classes
  *
+ * Copyright (C) 2012-2016 Red Hat, Inc.
  * Copyright IBM, Corp. 2011
  *
  * Authors:
@@ -66,13 +67,6 @@ bool visit_start_union(Visitor *v, bool data_present, Error 
**errp)
 return true;
 }

-void visit_end_union(Visitor *v, bool data_present, Error **errp)
-{
-if (v->end_union) {
-v->end_union(v, data_present, errp);
-}
-}
-
 bool visit_optional(Visitor *v, bool *present, const char *name)
 {
 if (v->optional) {
-- 
2.5.0

Re: [Qemu-devel] [PATCH] qdev: Use GList for global properties

2016-01-29 Thread Eduardo Habkost

On Thu, Jan 28, 2016 at 07:01:12PM +0200, Michael S. Tsirkin wrote:
> On Thu, Jan 28, 2016 at 01:02:26PM -0200, Eduardo Habkost wrote:
> > If the same GlobalProperty struct is registered twice, the list
> > entry gets corrupted, making tqe_next point to itself, and
> > qdev_prop_set_globals() gets stuck in a loop. The bug can be
> > easily reproduced by running:
> > 
> >   $ qemu-system-x86_64 -rtc-td-hack -rtc-td-hack
> > 
> > Change global_props to use GList instead of queue.h, making the
> > code simpler and able to deal with properties being registered
> > twice.
> > 
> > Signed-off-by: Eduardo Habkost 
> 
> Reviewed-by: Michael S. Tsirkin 

Thanks! It seems we don't have a maintainer for hw/core/qdev*.
Who should merge this?

-- 
Eduardo

[Qemu-devel] [RFC v7 01/16] exec.c: Add new exclusive bitmap to ram_list

2016-01-29 Thread Alvise Rigo

The purpose of this new bitmap is to flag the memory pages that are in
the middle of LL/SC operations (after a LL, before a SC). For all these
pages, the corresponding TLB entries will be generated in such a way to
force the slow-path for all the VCPUs (see the following patches).

When the system starts, the whole memory is set to dirty.
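
The helpers added in this patch use inverted semantics: a set (dirty) bit means the page is not exclusive, and clearing the bit marks it exclusive. A minimal, non-atomic sketch of that convention follows; PAGE_BITS, NUM_PAGES and the function names are hypothetical, and the real helpers use atomic bitmap operations on ram_list.dirty_memory[].

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sizes for illustration; QEMU derives these per target. */
#define PAGE_BITS 12
#define NUM_PAGES 64
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

static unsigned long excl_bitmap[(NUM_PAGES + BITS_PER_WORD - 1) / BITS_PER_WORD];

/* Map an address to the index of its page in the bitmap. */
static size_t page_of(unsigned long addr)
{
    return addr >> PAGE_BITS;
}

/* Mark every page dirty, i.e. not exclusive (the state at start-up). */
static void excl_init(void)
{
    for (size_t i = 0; i < sizeof(excl_bitmap) / sizeof(excl_bitmap[0]); i++) {
        excl_bitmap[i] = ~0UL;
    }
}

/* A page is exclusive when its bit is clear. */
static bool page_is_excl(unsigned long addr)
{
    size_t p = page_of(addr);

    return !((excl_bitmap[p / BITS_PER_WORD] >> (p % BITS_PER_WORD)) & 1UL);
}

/* Make the page exclusive by clearing its bit; return the previous bit. */
static bool page_set_excl(unsigned long addr)
{
    size_t p = page_of(addr);
    unsigned long mask = 1UL << (p % BITS_PER_WORD);
    bool was_set = excl_bitmap[p / BITS_PER_WORD] & mask;

    excl_bitmap[p / BITS_PER_WORD] &= ~mask;
    return was_set;
}

/* Make the page not exclusive again by setting its bit. */
static void page_unset_excl(unsigned long addr)
{
    size_t p = page_of(addr);

    excl_bitmap[p / BITS_PER_WORD] |= 1UL << (p % BITS_PER_WORD);
}
```

Because the bitmap starts out all ones, no page is exclusive until a load-link clears its bit, which matches the "whole memory is set to dirty" start-up state described above.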

Suggested-by: Jani Kokkonen 
Suggested-by: Claudio Fontana 
Signed-off-by: Alvise Rigo 
---
 exec.c  |  7 +--
 include/exec/memory.h   |  3 ++-
 include/exec/ram_addr.h | 31 +++
 3 files changed, 38 insertions(+), 3 deletions(-)

diff --git a/exec.c b/exec.c
index 7115403..51f366d 100644
--- a/exec.c
+++ b/exec.c
@@ -1575,11 +1575,14 @@ static ram_addr_t ram_block_add(RAMBlock *new_block, 
Error **errp)
 int i;
 
 /* ram_list.dirty_memory[] is protected by the iothread lock.  */
-for (i = 0; i < DIRTY_MEMORY_NUM; i++) {
+for (i = 0; i < DIRTY_MEMORY_EXCLUSIVE; i++) {
 ram_list.dirty_memory[i] =
 bitmap_zero_extend(ram_list.dirty_memory[i],
old_ram_size, new_ram_size);
-   }
+}
+ram_list.dirty_memory[DIRTY_MEMORY_EXCLUSIVE] =
+bitmap_zero_extend(ram_list.dirty_memory[DIRTY_MEMORY_EXCLUSIVE],
+   old_ram_size, new_ram_size);
 }
 cpu_physical_memory_set_dirty_range(new_block->offset,
 new_block->used_length,
diff --git a/include/exec/memory.h b/include/exec/memory.h
index c92734a..71e0480 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -19,7 +19,8 @@
 #define DIRTY_MEMORY_VGA   0
 #define DIRTY_MEMORY_CODE  1
 #define DIRTY_MEMORY_MIGRATION 2
-#define DIRTY_MEMORY_NUM   3/* num of dirty bits */
+#define DIRTY_MEMORY_EXCLUSIVE 3
+#define DIRTY_MEMORY_NUM   4/* num of dirty bits */
 
 #include 
 #include 
diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index ef1489d..19789fc 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -21,6 +21,7 @@
 
 #ifndef CONFIG_USER_ONLY
 #include "hw/xen/xen.h"
+#include "sysemu/sysemu.h"
 
 struct RAMBlock {
 struct rcu_head rcu;
@@ -172,6 +173,9 @@ static inline void 
cpu_physical_memory_set_dirty_range(ram_addr_t start,
 if (unlikely(mask & (1 << DIRTY_MEMORY_CODE))) {
 bitmap_set_atomic(d[DIRTY_MEMORY_CODE], page, end - page);
 }
+if (unlikely(mask & (1 << DIRTY_MEMORY_EXCLUSIVE))) {
+bitmap_set_atomic(d[DIRTY_MEMORY_EXCLUSIVE], page, end - page);
+}
 xen_modified_memory(start, length);
 }
 
@@ -287,5 +291,32 @@ uint64_t cpu_physical_memory_sync_dirty_bitmap(unsigned 
long *dest,
 }
 
 void migration_bitmap_extend(ram_addr_t old, ram_addr_t new);
+
+/* Exclusive bitmap support. */
+#define EXCL_BITMAP_GET_OFFSET(addr) (addr >> TARGET_PAGE_BITS)
+
+/* Make the page of @addr not exclusive. */
+static inline void cpu_physical_memory_unset_excl(ram_addr_t addr)
+{
+set_bit_atomic(EXCL_BITMAP_GET_OFFSET(addr),
+   ram_list.dirty_memory[DIRTY_MEMORY_EXCLUSIVE]);
+}
+
+/* Return true if the page of @addr is exclusive, i.e. the EXCL bit is set. */
+static inline int cpu_physical_memory_is_excl(ram_addr_t addr)
+{
+return !test_bit(EXCL_BITMAP_GET_OFFSET(addr),
+ ram_list.dirty_memory[DIRTY_MEMORY_EXCLUSIVE]);
+}
+
+/* Set the page of @addr as exclusive clearing its EXCL bit and return the
+ * previous bit's state. */
+static inline int cpu_physical_memory_set_excl(ram_addr_t addr)
+{
+return bitmap_test_and_clear_atomic(
+ram_list.dirty_memory[DIRTY_MEMORY_EXCLUSIVE],
+EXCL_BITMAP_GET_OFFSET(addr), 1);
+}
+
 #endif
 #endif
-- 
2.7.0

[Qemu-devel] [RFC v7 14/16] target-arm: translate: Use ld/st excl for atomic insns

2016-01-29 Thread Alvise Rigo

Use the new LL/SC runtime helpers to handle the ARM atomic instructions
in softmmu_llsc_template.h.

In general, the helper generator
gen_{ldrex,strex}_{8,16a,32a,64a}() calls the function
helper_{le,be}_{ldlink,stcond}{ub,uw,ul,q}_mmu() implemented in
softmmu_llsc_template.h, doing an alignment check.

In addition, add a simple helper function to emulate the CLREX instruction.

Suggested-by: Jani Kokkonen 
Suggested-by: Claudio Fontana 
Signed-off-by: Alvise Rigo 
---
 target-arm/cpu.h   |   2 +
 target-arm/helper.h|   4 ++
 target-arm/machine.c   |   2 +
 target-arm/op_helper.c |  10 +++
 target-arm/translate.c | 188 +++--
 5 files changed, 202 insertions(+), 4 deletions(-)

diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index b8b3364..bb5361f 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -462,9 +462,11 @@ typedef struct CPUARMState {
 float_status fp_status;
 float_status standard_fp_status;
 } vfp;
+#if !defined(CONFIG_ARM_USE_LDST_EXCL)
 uint64_t exclusive_addr;
 uint64_t exclusive_val;
 uint64_t exclusive_high;
+#endif
 #if defined(CONFIG_USER_ONLY)
 uint64_t exclusive_test;
 uint32_t exclusive_info;
diff --git a/target-arm/helper.h b/target-arm/helper.h
index c2a85c7..6bc3c0a 100644
--- a/target-arm/helper.h
+++ b/target-arm/helper.h
@@ -532,6 +532,10 @@ DEF_HELPER_2(dc_zva, void, env, i64)
 DEF_HELPER_FLAGS_2(neon_pmull_64_lo, TCG_CALL_NO_RWG_SE, i64, i64, i64)
 DEF_HELPER_FLAGS_2(neon_pmull_64_hi, TCG_CALL_NO_RWG_SE, i64, i64, i64)
 
+#ifdef CONFIG_ARM_USE_LDST_EXCL
+DEF_HELPER_1(atomic_clear, void, env)
+#endif
+
 #ifdef TARGET_AARCH64
 #include "helper-a64.h"
 #endif
diff --git a/target-arm/machine.c b/target-arm/machine.c
index ed1925a..7adfb4d 100644
--- a/target-arm/machine.c
+++ b/target-arm/machine.c
@@ -309,9 +309,11 @@ const VMStateDescription vmstate_arm_cpu = {
 VMSTATE_VARRAY_INT32(cpreg_vmstate_values, ARMCPU,
  cpreg_vmstate_array_len,
  0, vmstate_info_uint64, uint64_t),
+#if !defined(CONFIG_ARM_USE_LDST_EXCL)
 VMSTATE_UINT64(env.exclusive_addr, ARMCPU),
 VMSTATE_UINT64(env.exclusive_val, ARMCPU),
 VMSTATE_UINT64(env.exclusive_high, ARMCPU),
+#endif
 VMSTATE_UINT64(env.features, ARMCPU),
 VMSTATE_UINT32(env.exception.syndrome, ARMCPU),
 VMSTATE_UINT32(env.exception.fsr, ARMCPU),
diff --git a/target-arm/op_helper.c b/target-arm/op_helper.c
index a5ee65f..404c13b 100644
--- a/target-arm/op_helper.c
+++ b/target-arm/op_helper.c
@@ -51,6 +51,14 @@ static int exception_target_el(CPUARMState *env)
 return target_el;
 }
 
+#ifdef CONFIG_ARM_USE_LDST_EXCL
+void HELPER(atomic_clear)(CPUARMState *env)
+{
+ENV_GET_CPU(env)->excl_protected_range.begin = EXCLUSIVE_RESET_ADDR;
+ENV_GET_CPU(env)->ll_sc_context = false;
+}
+#endif
+
 uint32_t HELPER(neon_tbl)(CPUARMState *env, uint32_t ireg, uint32_t def,
   uint32_t rn, uint32_t maxindex)
 {
@@ -689,7 +697,9 @@ void HELPER(exception_return)(CPUARMState *env)
 
 aarch64_save_sp(env, cur_el);
 
+#if !defined(CONFIG_ARM_USE_LDST_EXCL)
 env->exclusive_addr = -1;
+#endif
 
 /* We must squash the PSTATE.SS bit to zero unless both of the
  * following hold:
diff --git a/target-arm/translate.c b/target-arm/translate.c
index cff511b..5150841 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -60,8 +60,10 @@ TCGv_ptr cpu_env;
 static TCGv_i64 cpu_V0, cpu_V1, cpu_M0;
 static TCGv_i32 cpu_R[16];
 TCGv_i32 cpu_CF, cpu_NF, cpu_VF, cpu_ZF;
+#if !defined(CONFIG_ARM_USE_LDST_EXCL)
 TCGv_i64 cpu_exclusive_addr;
 TCGv_i64 cpu_exclusive_val;
+#endif
 #ifdef CONFIG_USER_ONLY
 TCGv_i64 cpu_exclusive_test;
 TCGv_i32 cpu_exclusive_info;
@@ -94,10 +96,12 @@ void arm_translate_init(void)
 cpu_VF = tcg_global_mem_new_i32(TCG_AREG0, offsetof(CPUARMState, VF), 
"VF");
 cpu_ZF = tcg_global_mem_new_i32(TCG_AREG0, offsetof(CPUARMState, ZF), 
"ZF");
 
+#if !defined(CONFIG_ARM_USE_LDST_EXCL)
 cpu_exclusive_addr = tcg_global_mem_new_i64(TCG_AREG0,
 offsetof(CPUARMState, exclusive_addr), "exclusive_addr");
 cpu_exclusive_val = tcg_global_mem_new_i64(TCG_AREG0,
 offsetof(CPUARMState, exclusive_val), "exclusive_val");
+#endif
 #ifdef CONFIG_USER_ONLY
 cpu_exclusive_test = tcg_global_mem_new_i64(TCG_AREG0,
 offsetof(CPUARMState, exclusive_test), "exclusive_test");
@@ -7413,15 +7417,145 @@ static void gen_logicq_cc(TCGv_i32 lo, TCGv_i32 hi)
 tcg_gen_or_i32(cpu_ZF, lo, hi);
 }
 
-/* Load/Store exclusive instructions are implemented by remembering
+/* If the softmmu is enabled, the translation of Load/Store exclusive
+   instructions will rely on the gen_helper_{ldlink,stcond} helpers,
+   offloading most of the work to the softmmu_llsc_template.h functions.
+   All the

[Qemu-devel] [RFC v7 05/16] softmmu: Add new TLB_EXCL flag

2016-01-29 Thread Alvise Rigo

Add a new TLB flag to force all the accesses made to a page to follow
the slow-path.

The TLB entries referring to guest pages with the DIRTY_MEMORY_EXCLUSIVE
bit clean will have this flag set.
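
How a flag in the low bits of a TLB address forces the slow path can be sketched as below. The flag values are the ones in this patch; PAGE_BITS, tlb_entry() and needs_slow_path() are hypothetical, and the check mirrors the "tlb_addr & ~TARGET_PAGE_MASK" test used by the softmmu store helpers rather than the full TLB lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical page size for illustration; the real value is per target. */
#define PAGE_BITS 12
#define PAGE_SIZE (1u << PAGE_BITS)
#define PAGE_MASK (~(uint64_t)(PAGE_SIZE - 1))

/* Flag values as in the patch: they live in the low bits of the address. */
#define TLB_NOTDIRTY (1 << 4)
#define TLB_MMIO     (1 << 5)
#define TLB_EXCL     (1 << 6)

/* The flags must fit below the page offset boundary. */
#if TLB_EXCL >= PAGE_SIZE
#error flag bits would overlap the page number
#endif

/* Any flag bit below the page boundary sends the access down the slow path. */
static bool needs_slow_path(uint64_t tlb_addr)
{
    return (tlb_addr & ~PAGE_MASK) != 0;
}

/* Build a TLB address for a page, tagging it when the page is exclusive. */
static uint64_t tlb_entry(uint64_t page_addr, bool exclusive)
{
    uint64_t addr = page_addr & PAGE_MASK;

    return exclusive ? (addr | TLB_EXCL) : addr;
}
```

Since the page number occupies only the bits at and above PAGE_BITS, tagging an entry does not change which page it refers to; it only makes the fast-path comparison fail.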

Suggested-by: Jani Kokkonen 
Suggested-by: Claudio Fontana 
Signed-off-by: Alvise Rigo 
---
 include/exec/cpu-all.h | 8 
 1 file changed, 8 insertions(+)

diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index 83b1781..f8d8feb 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -277,6 +277,14 @@ CPUArchState *cpu_copy(CPUArchState *env);
 #define TLB_NOTDIRTY    (1 << 4)
 /* Set if TLB entry is an IO callback.  */
 #define TLB_MMIO        (1 << 5)
+/* Set if TLB entry references a page that requires exclusive access.  */
+#define TLB_EXCL        (1 << 6)
+
+/* Do not allow a TARGET_PAGE_MASK which covers one or more bits defined
+ * above. */
+#if TLB_EXCL >= TARGET_PAGE_SIZE
+#error TARGET_PAGE_MASK covering the low bits of the TLB virtual address
+#endif
 
 void dump_exec_info(FILE *f, fprintf_function cpu_fprintf);
 void dump_opcount_info(FILE *f, fprintf_function cpu_fprintf);
-- 
2.7.0

[Qemu-devel] [RFC v7 03/16] softmmu: Simplify helper_*_st_name, wrap MMIO code

2016-01-29 Thread Alvise Rigo

To simplify helper_*_st_name, wrap the MMIO code in an
inline function.

Based on this work, Alex proposed the following patch series
https://lists.gnu.org/archive/html/qemu-devel/2016-01/msg01136.html
that reduces code duplication of the softmmu_helpers.

Suggested-by: Jani Kokkonen 
Suggested-by: Claudio Fontana 
Signed-off-by: Alvise Rigo 
---
 softmmu_template.h | 66 --
 1 file changed, 44 insertions(+), 22 deletions(-)

diff --git a/softmmu_template.h b/softmmu_template.h
index 7029a03..3d388ec 100644
--- a/softmmu_template.h
+++ b/softmmu_template.h
@@ -396,6 +396,26 @@ static inline void glue(helper_le_st_name, 
_do_unl_access)(CPUArchState *env,
 }
 }
 
+static inline void glue(helper_le_st_name, _do_mmio_access)(CPUArchState *env,
+DATA_TYPE val,
+target_ulong addr,
+TCGMemOpIdx oi,
+unsigned mmu_idx,
+int index,
+uintptr_t retaddr)
+{
+CPUIOTLBEntry *iotlbentry = &env->iotlb[mmu_idx][index];
+
+if ((addr & (DATA_SIZE - 1)) != 0) {
+glue(helper_le_st_name, _do_unl_access)(env, val, addr, mmu_idx,
+oi, retaddr);
+}
+/* ??? Note that the io helpers always read data in the target
+   byte ordering.  We should push the LE/BE request down into io.  */
+val = TGT_LE(val);
+glue(io_write, SUFFIX)(env, iotlbentry, val, addr, retaddr);
+}
+
 void helper_le_st_name(CPUArchState *env, target_ulong addr, DATA_TYPE val,
TCGMemOpIdx oi, uintptr_t retaddr)
 {
@@ -423,17 +443,8 @@ void helper_le_st_name(CPUArchState *env, target_ulong 
addr, DATA_TYPE val,
 
 /* Handle an IO access.  */
 if (unlikely(tlb_addr & ~TARGET_PAGE_MASK)) {
-CPUIOTLBEntry *iotlbentry;
-if ((addr & (DATA_SIZE - 1)) != 0) {
-glue(helper_le_st_name, _do_unl_access)(env, val, addr, mmu_idx,
-oi, retaddr);
-}
-iotlbentry = &env->iotlb[mmu_idx][index];
-
-/* ??? Note that the io helpers always read data in the target
-   byte ordering.  We should push the LE/BE request down into io.  */
-val = TGT_LE(val);
-glue(io_write, SUFFIX)(env, iotlbentry, val, addr, retaddr);
+glue(helper_le_st_name, _do_mmio_access)(env, val, addr, oi,
+ mmu_idx, index, retaddr);
 return;
 }
 
@@ -488,6 +499,26 @@ static inline void glue(helper_be_st_name, 
_do_unl_access)(CPUArchState *env,
 }
 }
 
+static inline void glue(helper_be_st_name, _do_mmio_access)(CPUArchState *env,
+DATA_TYPE val,
+target_ulong addr,
+TCGMemOpIdx oi,
+unsigned mmu_idx,
+int index,
+uintptr_t retaddr)
+{
+CPUIOTLBEntry *iotlbentry = &env->iotlb[mmu_idx][index];
+
+if ((addr & (DATA_SIZE - 1)) != 0) {
+glue(helper_be_st_name, _do_unl_access)(env, val, addr, mmu_idx,
+oi, retaddr);
+}
+/* ??? Note that the io helpers always read data in the target
+   byte ordering.  We should push the LE/BE request down into io.  */
+val = TGT_BE(val);
+glue(io_write, SUFFIX)(env, iotlbentry, val, addr, retaddr);
+}
+
 void helper_be_st_name(CPUArchState *env, target_ulong addr, DATA_TYPE val,
TCGMemOpIdx oi, uintptr_t retaddr)
 {
@@ -515,17 +546,8 @@ void helper_be_st_name(CPUArchState *env, target_ulong 
addr, DATA_TYPE val,
 
 /* Handle an IO access.  */
 if (unlikely(tlb_addr & ~TARGET_PAGE_MASK)) {
-CPUIOTLBEntry *iotlbentry;
-if ((addr & (DATA_SIZE - 1)) != 0) {
-glue(helper_be_st_name, _do_unl_access)(env, val, addr, mmu_idx,
-oi, retaddr);
-}
-iotlbentry = &env->iotlb[mmu_idx][index];
-
-/* ??? Note that the io helpers always read data in the target
-   byte ordering.  We should push the LE/BE request down into io.  */
-val = TGT_BE(val);
-glue(io_write, SUFFIX)(env, iotlbentry, val, addr, retaddr);
+glue(helper_be_st_name, _do_mmio_access)(env, val, addr, oi,
+

Re: [Qemu-devel] [PATCH v8 05/16] virtio-scsi: Catch BDS-BB removal/insertion

2016-01-29 Thread Kevin Wolf

Am 27.01.2016 um 18:59 hat Max Reitz geschrieben:
> Make use of the BDS-BB removal and insertion notifiers to remove or set
> up, respectively, virtio-scsi's op blockers.
> 
> Signed-off-by: Max Reitz 
> ---
>  hw/scsi/virtio-scsi.c   | 55 
> +
>  include/hw/virtio/virtio-scsi.h | 10 
>  2 files changed, 65 insertions(+)
> 
> diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
> index 607593c..b508b81 100644
> --- a/hw/scsi/virtio-scsi.c
> +++ b/hw/scsi/virtio-scsi.c
> @@ -757,6 +757,22 @@ static void virtio_scsi_change(SCSIBus *bus, SCSIDevice 
> *dev, SCSISense sense)
>  }
>  }
>  
> +static void virtio_scsi_blk_insert_notifier(Notifier *n, void *data)
> +{
> +VirtIOSCSIBlkChangeNotifier *cn = DO_UPCAST(VirtIOSCSIBlkChangeNotifier,
> +n, n);
> +assert(cn->sd->conf.blk == data);
> +blk_op_block_all(cn->sd->conf.blk, cn->s->blocker);
> +}
> +
> +static void virtio_scsi_blk_remove_notifier(Notifier *n, void *data)
> +{
> +VirtIOSCSIBlkChangeNotifier *cn = DO_UPCAST(VirtIOSCSIBlkChangeNotifier,
> +n, n);
> +assert(cn->sd->conf.blk == data);
> +blk_op_unblock_all(cn->sd->conf.blk, cn->s->blocker);
> +}
> +
>  static void virtio_scsi_hotplug(HotplugHandler *hotplug_dev, DeviceState 
> *dev,
>  Error **errp)
>  {
> @@ -765,6 +781,22 @@ static void virtio_scsi_hotplug(HotplugHandler 
> *hotplug_dev, DeviceState *dev,
>  SCSIDevice *sd = SCSI_DEVICE(dev);
>  
>  if (s->ctx && !s->dataplane_disabled) {
> +VirtIOSCSIBlkChangeNotifier *insert_notifier, *remove_notifier;
> +
> +insert_notifier = g_new0(VirtIOSCSIBlkChangeNotifier, 1);
> +insert_notifier->n.notify = virtio_scsi_blk_insert_notifier;
> +insert_notifier->s = s;
> +insert_notifier->sd = sd;
> +blk_add_insert_bs_notifier(sd->conf.blk, &insert_notifier->n);
> +QTAILQ_INSERT_TAIL(&s->insert_notifiers, insert_notifier, next);
> +
> +remove_notifier = g_new0(VirtIOSCSIBlkChangeNotifier, 1);
> +remove_notifier->n.notify = virtio_scsi_blk_remove_notifier;
> +remove_notifier->s = s;
> +remove_notifier->sd = sd;
> +blk_add_remove_bs_notifier(sd->conf.blk, &remove_notifier->n);
> +QTAILQ_INSERT_TAIL(&s->remove_notifiers, remove_notifier, next);
> +
>  if (blk_op_is_blocked(sd->conf.blk, BLOCK_OP_TYPE_DATAPLANE, errp)) {
>  return;
>  }

If we take the error path here, won't we have dangling pointers in the
notifier list?

Kevin

Re: [Qemu-devel] [PATCH v8 02/16] iotests: Add test for eject under NBD server

2016-01-29 Thread Max Reitz

On 27.01.2016 21:56, Eric Blake wrote:
> On 01/27/2016 10:59 AM, Max Reitz wrote:
>> This patch adds a test for ejecting the BlockBackend an NBD server is
>> connected to (the NBD server is supposed to stop).
>>
>> Signed-off-by: Max Reitz 
>> ---
>>  tests/qemu-iotests/140 | 92 
>> ++
>>  tests/qemu-iotests/140.out | 16 
>>  tests/qemu-iotests/group   |  1 +
>>  3 files changed, 109 insertions(+)
>>  create mode 100755 tests/qemu-iotests/140
>>  create mode 100644 tests/qemu-iotests/140.out
>>
>> diff --git a/tests/qemu-iotests/140 b/tests/qemu-iotests/140
>> new file mode 100755
>> index 000..3434997
>> --- /dev/null
>> +++ b/tests/qemu-iotests/140
>> @@ -0,0 +1,92 @@
>> +#!/bin/bash
>> +#
>> +# Test case for ejecting a BB with an NBD server attached to it
>> +#
>> +# Copyright (C) 2015 Red Hat, Inc.
> 
> Do you want to add 2016 now?

Or just replace it; yes, I probably do. :-)

Max




Re: [Qemu-devel] [PATCH 2/2] migration/virtio: Remove simple .get/.put use

2016-01-29 Thread Dr. David Alan Gilbert

* Cornelia Huck (cornelia.h...@de.ibm.com) wrote:
> On Thu, 21 Jan 2016 21:56:22 +0100
> Sascha Silbe  wrote:
> 
> > "Dr. David Alan Gilbert"  writes:
> 
> > > Thanks I've reused a chunk of that;  I'll post the fix soon.
> > > Thanks for your help on this.
> > 
> > Thanks, looking forward to the next version of your patch. Will use the
> > the previous one in the meantime as a local work-around.
> 
> Any updates around this? I always need to remember to put this patch on
> top when I test migration...

Hmm - I thought I posted this properly but I can't see it on patchwork; I'll
repost it.

Dave

> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-devel] [PATCH v13 00/10] Block replication for continuous checkpoints

2016-01-29 Thread Dr. David Alan Gilbert

* Wen Congyang (we...@cn.fujitsu.com) wrote:
> On 01/29/2016 06:07 PM, Dr. David Alan Gilbert wrote:
> > * Wen Congyang (we...@cn.fujitsu.com) wrote:
> >> On 01/27/2016 07:03 PM, Dr. David Alan Gilbert wrote:
> >>> Hi,
> >>>   I've got a block error if I kill the secondary.
> >>>
> >>> Start both primary & secondary
> >>> kill -9 secondary qemu
> >>> x_colo_lost_heartbeat on primary
> >>>
> >>> The guest sees a block error and the ext4 root switches to read-only.
> >>>
> >>> I gdb'd the primary with a breakpoint on quorum_report_bad; see
> >>> backtrace below.
> >>> (This is based on colo-v2.4-periodic-mode of the framework
> >>> code with the block and network proxy merged in; so it could be my
> >>> merging but I don't think so ?)
> >>>
> >>>
> >>> (gdb) where
> >>> #0  quorum_report_bad (node_name=0x7f2946a0892c "node0", ret=-5, 
> >>> acb=0x7f2946cb3910, acb=0x7f2946cb3910)
> >>> at /root/colo/jan-2016/qemu/block/quorum.c:222
> >>> #1  0x7f2943b23058 in quorum_aio_cb (opaque=<optimized out>, 
> >>> ret=<optimized out>)
> >>> at /root/colo/jan-2016/qemu/block/quorum.c:315
> >>> #2  0x7f2943b311be in bdrv_co_complete (acb=0x7f2946cb3f60) at 
> >>> /root/colo/jan-2016/qemu/block/io.c:2122
> >>> #3  0x7f2943ae777d in aio_bh_call (bh=<optimized out>) at 
> >>> /root/colo/jan-2016/qemu/async.c:64
> >>> #4  aio_bh_poll (ctx=ctx@entry=0x7f2945b771d0) at 
> >>> /root/colo/jan-2016/qemu/async.c:92
> >>> #5  0x7f2943af5090 in aio_dispatch (ctx=0x7f2945b771d0) at 
> >>> /root/colo/jan-2016/qemu/aio-posix.c:305
> >>> #6  0x7f2943ae756e in aio_ctx_dispatch (source=, 
> >>> callback=, 
> >>> user_data=) at /root/colo/jan-2016/qemu/async.c:231
> >>> #7  0x7f293b84a79a in g_main_context_dispatch () from 
> >>> /lib64/libglib-2.0.so.0
> >>> #8  0x7f2943af3a00 in glib_pollfds_poll () at 
> >>> /root/colo/jan-2016/qemu/main-loop.c:211
> >>> #9  os_host_main_loop_wait (timeout=) at 
> >>> /root/colo/jan-2016/qemu/main-loop.c:256
> >>> #10 main_loop_wait (nonblocking=) at 
> >>> /root/colo/jan-2016/qemu/main-loop.c:504
> >>> #11 0x7f29438529ee in main_loop () at 
> >>> /root/colo/jan-2016/qemu/vl.c:1945
> >>> #12 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at /root/colo/jan-2016/qemu/vl.c:4707
> >>>
> >>> (gdb) p s->num_children
> >>> $1 = 2
> >>> (gdb) p acb->success_count
> >>> $2 = 0
> >>> (gdb) p acb->is_read
> >>> $5 = false
> >>
> >> Sorry for the late reply.
> > 
> > No problem.
> > 
> >> What is the value of acb->count?
> > 
> > (gdb) p acb->count
> > $1 = 1
> 
> Note, the count is 1, not 2. Writing to children.0 is in flight. If writing
> to children.0 succeeds, the guest doesn't see this error.
> >> If secondary host is down, you should remove quorum's children.1.
> >> Otherwise, you will get an I/O error event.
> > 
> > Is that safe?  If the secondary fails, do you always have time to issue the
> > command to remove the children.1 before the guest sees the error?
> 
> We will write to two children, and expect that writing to children.0 will
> succeed. If so, the guest doesn't see this error. You just get the I/O error
> event.

I think children.0 is the disk, and that should be OK - so only the
children.1/replication should be failing - so in that case why do I see the
error?
The 'node0' in the backtrace above is the name of the replication, so it does
look like the error is coming from the replication.

> > Anyway, I tried removing children.1 but it segfaults now, I guess the 
> > replication is unhappy:
> > 
> > (qemu) x_block_change colo-disk0 -d children.1
> > (qemu) x_colo_lost_heartbeat 
> 
> Hmm, you should not remove the child before failover. I will check how to
> avoid it in the code.

 But you said 'If secondary host is down, you should remove quorum's 
children.1' - is that not
what you meant?

> > 12973 Segmentation fault  (core dumped) 
> > ./try/x86_64-softmmu/qemu-system-x86_64 -enable-kvm $console_param -S -boot 
> > c -m 4080 -smp 4 -machine pc-i440fx-2.5,accel=kvm -name debug-threads=on 
> > -trace events=trace-file -device virtio-rng-pci $block_param $net_param
> > 
> > #0  0x7f0a398a864c in bdrv_stop_replication (bs=0x7f0a3b0a8430, 
> > failover=true, errp=0x7fff6a5c3420)
> > at /root/colo/jan-2016/qemu/block.c:4426
> > 
> > (gdb) p drv
> > $1 = (BlockDriver *) 0x5d2a
> > 
> >   it looks like the whole of bs is bogus.
> > 
> > #1  0x7f0a398d87f6 in quorum_stop_replication (bs=, 
> > failover=, 
> > errp=) at /root/colo/jan-2016/qemu/block/quorum.c:1213
> > 
> > (gdb) p s->replication_index
> > $3 = 1
> > 
> > I guess quorum_del_child needs to stop replication before it removes the 
> > child?
> 
> Yes, but in the newest version, quorum doesn't know about the block
> replication, and I think we should add a reference to the bs when starting
> block replication.

Do you have a new version ready to test?  I'm interested to try it (and also 
interested
to try the latest version of the colo-proxy)

Dave

> Thanks
> Wen Congyang
> 
> > (although it would have to be

[Qemu-devel] [PATCH] Fix virtio migration

2016-01-29 Thread Dr. David Alan Gilbert (git)

From: "Dr. David Alan Gilbert" 

I misunderstood the vmstate macro definition when I reworked the
virtio .get/.put.
The VMSTATE_STRUCT_VARRAY_KNOWN was described as being for "a
variable length array (i.e. _type *_field) but we know the
length".  However it actually specified operation for arrays embedded in
the struct (i.e. _type _field[]) since it lacked the VMS_POINTER
flag. This caused offset calculation to be completely off, examining and
potentially sending random data instead of the VirtQueue content.

Replace the otherwise unused VMSTATE_STRUCT_VARRAY_KNOWN with a
VMSTATE_STRUCT_VARRAY_POINTER_KNOWN that includes the VMS_POINTER flag
(so now actually doing what it advertises) and use it in the virtio
migration code.
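
The difference between an embedded array and a pointer field can be shown
outside QEMU. The following is only a minimal sketch, not the vmstate code:
Dev, Elem, SKETCH_POINTER and first_elem() are hypothetical names, with
SKETCH_POINTER standing in for the role VMS_POINTER plays. Without the flag,
the address computed from the field offset is the pointer field itself, not
the array it points to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct Elem { uint16_t idx; } Elem;

typedef struct Dev {
    int   pad;
    Elem  embedded[4];   /* _type _field[]: elements live inside Dev */
    Elem *pointed;       /* _type *_field: Dev only holds a pointer  */
} Dev;

enum { SKETCH_POINTER = 1 };   /* hypothetical stand-in for VMS_POINTER */

/* Return the address of the first array element for a field at 'offset'. */
static void *first_elem(void *state, size_t offset, int flags)
{
    void *addr = (uint8_t *)state + offset;

    if (flags & SKETCH_POINTER) {
        /* The field stores a pointer: load it and follow it. */
        void *target;
        memcpy(&target, addr, sizeof(target));
        addr = target;
    }
    return addr;
}

/* Returns 0 when all three lookups behave as described above. */
static int pointer_flag_demo(void)
{
    Dev d = { 0 };
    int ok;

    d.pointed = calloc(4, sizeof(Elem));
    if (!d.pointed) {
        return -1;
    }
    d.embedded[0].idx = 7;
    d.pointed[0].idx = 42;

    Elem *a = first_elem(&d, offsetof(Dev, embedded), 0);
    Elem *b = first_elem(&d, offsetof(Dev, pointed), SKETCH_POINTER);
    /* The mistake: pointer field, but no pointer flag. */
    void *c = first_elem(&d, offsetof(Dev, pointed), 0);

    ok = a->idx == 7 && b->idx == 42 && c == (void *)&d.pointed;
    free(d.pointed);
    return ok ? 0 : 1;
}
```

In the last lookup the result is the address of the pointer field inside Dev,
so treating it as element storage would read whatever follows that field.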

Fixes and description as per Sascha's suggestions/debug.

Signed-off-by: Dr. David Alan Gilbert 
Reported-by: Sascha Silbe 
Tested-By: Sascha Silbe 
Reviewed-By: Sascha Silbe 

Fixes: 50e5ae4dc3e4f21e874512f9e87b93b5472d26e0
Fixes: 2cf0148674430b6693c60d42b7eef721bfa9509f
---
 hw/virtio/virtio.c  |  8 
 include/migration/vmstate.h | 18 +-
 2 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index bd6b4df..41a8a8a 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -1143,8 +1143,8 @@ static const VMStateDescription vmstate_virtio_virtqueues 
= {
 .minimum_version_id = 1,
 .needed = &virtio_virtqueue_needed,
 .fields = (VMStateField[]) {
-VMSTATE_STRUCT_VARRAY_KNOWN(vq, struct VirtIODevice, VIRTIO_QUEUE_MAX,
-  0, vmstate_virtqueue, VirtQueue),
+VMSTATE_STRUCT_VARRAY_POINTER_KNOWN(vq, struct VirtIODevice,
+  VIRTIO_QUEUE_MAX, 0, vmstate_virtqueue, VirtQueue),
 VMSTATE_END_OF_LIST()
 }
 };
@@ -1165,8 +1165,8 @@ static const VMStateDescription vmstate_virtio_ringsize = 
{
 .minimum_version_id = 1,
 .needed = &virtio_ringsize_needed,
 .fields = (VMStateField[]) {
-VMSTATE_STRUCT_VARRAY_KNOWN(vq, struct VirtIODevice, VIRTIO_QUEUE_MAX,
-  0, vmstate_ringsize, VirtQueue),
+VMSTATE_STRUCT_VARRAY_POINTER_KNOWN(vq, struct VirtIODevice,
+  VIRTIO_QUEUE_MAX, 0, vmstate_ringsize, VirtQueue),
 VMSTATE_END_OF_LIST()
 }
 };
diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
index a4b81bb..7246f29 100644
--- a/include/migration/vmstate.h
+++ b/include/migration/vmstate.h
@@ -386,26 +386,26 @@ extern const VMStateInfo vmstate_info_bitmap;
 .offset   = vmstate_offset_array(_state, _field, _type, _num),\
 }
 
-/* a variable length array (i.e. _type *_field) but we know the
- * length
- */
-#define VMSTATE_STRUCT_VARRAY_KNOWN(_field, _state, _num, _version, _vmsd, 
_type) { \
+#define VMSTATE_STRUCT_VARRAY_UINT8(_field, _state, _field_num, _version, 
_vmsd, _type) { \
 .name   = (stringify(_field)),   \
-.num  = (_num),  \
+.num_offset = vmstate_offset_value(_state, _field_num, uint8_t), \
 .version_id = (_version),\
 .vmsd   = &(_vmsd),  \
 .size   = sizeof(_type), \
-.flags  = VMS_STRUCT|VMS_ARRAY,  \
+.flags  = VMS_STRUCT|VMS_VARRAY_UINT8,   \
 .offset = offsetof(_state, _field),  \
 }
 
-#define VMSTATE_STRUCT_VARRAY_UINT8(_field, _state, _field_num, _version, 
_vmsd, _type) { \
+/* a variable length array (i.e. _type *_field) but we know the
+ * length
+ */
+#define VMSTATE_STRUCT_VARRAY_POINTER_KNOWN(_field, _state, _num, _version, 
_vmsd, _type) { \
 .name   = (stringify(_field)),   \
-.num_offset = vmstate_offset_value(_state, _field_num, uint8_t), \
+.num  = (_num),  \
 .version_id = (_version),\
 .vmsd   = &(_vmsd),  \
 .size   = sizeof(_type), \
-.flags  = VMS_STRUCT|VMS_VARRAY_UINT8,   \
+.flags  = VMS_STRUCT|VMS_ARRAY|VMS_POINTER,  \
 .offset = offsetof(_state, _field),  \
 }
 
-- 
2.5.0

[Qemu-devel] [PATCH v4] Add optionrom compatible with fw_cfg DMA version

2016-01-29 Thread Marc Marí

This optionrom is based on linuxboot.S.

Added changes proposed by Gerd Hoffmann, Stefan Hajnoczi and Kevin O'Connor.

All optionroms are now compiled as 32-bit code. This also means no standard C
header can be used, because that would need a cross-compiling support check
and a big modification to the configuration script.

Signed-off-by: Marc Marí 
---
 .gitignore|   4 +
 hw/i386/pc.c  |   9 +-
 hw/nvram/fw_cfg.c |   2 +-
 include/hw/nvram/fw_cfg.h |   1 +
 pc-bios/optionrom/Makefile|   7 +-
 pc-bios/optionrom/linuxboot_dma.c | 288 ++
 6 files changed, 306 insertions(+), 5 deletions(-)
 create mode 100644 pc-bios/optionrom/linuxboot_dma.c

diff --git a/.gitignore b/.gitignore
index 88a80ff..101d1e0 100644
--- a/.gitignore
+++ b/.gitignore
@@ -94,6 +94,10 @@
 /pc-bios/optionrom/linuxboot.bin
 /pc-bios/optionrom/linuxboot.raw
 /pc-bios/optionrom/linuxboot.img
+/pc-bios/optionrom/linuxboot_dma.asm
+/pc-bios/optionrom/linuxboot_dma.bin
+/pc-bios/optionrom/linuxboot_dma.raw
+/pc-bios/optionrom/linuxboot_dma.img
 /pc-bios/optionrom/multiboot.asm
 /pc-bios/optionrom/multiboot.bin
 /pc-bios/optionrom/multiboot.raw
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 459260b..00339fa 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1007,8 +1007,13 @@ static void load_linux(PCMachineState *pcms,
 fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, setup_size);
 fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA, setup, setup_size);
 
-option_rom[nb_option_roms].name = "linuxboot.bin";
-option_rom[nb_option_roms].bootindex = 0;
+if (fw_cfg_dma_enabled(fw_cfg)) {
+option_rom[nb_option_roms].name = "linuxboot_dma.bin";
+option_rom[nb_option_roms].bootindex = 0;
+} else {
+option_rom[nb_option_roms].name = "linuxboot.bin";
+option_rom[nb_option_roms].bootindex = 0;
+}
 nb_option_roms++;
 }
 
diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
index a1d650d..d0a5753 100644
--- a/hw/nvram/fw_cfg.c
+++ b/hw/nvram/fw_cfg.c
@@ -546,7 +546,7 @@ static bool is_version_1(void *opaque, int version_id)
 return version_id == 1;
 }
 
-static bool fw_cfg_dma_enabled(void *opaque)
+bool fw_cfg_dma_enabled(void *opaque)
 {
 FWCfgState *s = opaque;
 
diff --git a/include/hw/nvram/fw_cfg.h b/include/hw/nvram/fw_cfg.h
index 664eaf6..953e58d 100644
--- a/include/hw/nvram/fw_cfg.h
+++ b/include/hw/nvram/fw_cfg.h
@@ -219,6 +219,7 @@ FWCfgState *fw_cfg_init_mem_wide(hwaddr ctl_addr,
  hwaddr dma_addr, AddressSpace *dma_as);
 
 FWCfgState *fw_cfg_find(void);
+bool fw_cfg_dma_enabled(void *opaque);
 
 #endif /* NO_QEMU_PROTOS */
 
diff --git a/pc-bios/optionrom/Makefile b/pc-bios/optionrom/Makefile
index ce4852a..bdd0cc1 100644
--- a/pc-bios/optionrom/Makefile
+++ b/pc-bios/optionrom/Makefile
@@ -13,15 +13,18 @@ CFLAGS := -Wall -Wstrict-prototypes -Werror 
-fomit-frame-pointer -fno-builtin
 CFLAGS += -I$(SRC_PATH)
 CFLAGS += $(call cc-option, $(CFLAGS), -fno-stack-protector)
 CFLAGS += $(CFLAGS_NOPIE)
+CFLAGS += -m32
 QEMU_CFLAGS = $(CFLAGS)
 
-build-all: multiboot.bin linuxboot.bin kvmvapic.bin
+ASFLAGS += -32
+
+build-all: multiboot.bin linuxboot.bin linuxboot_dma.bin kvmvapic.bin
 
 # suppress auto-removal of intermediate files
 .SECONDARY:
 
 %.img: %.o
-   $(call quiet-command,$(LD) $(LDFLAGS_NOPIE) -Ttext 0 -e _start -s -o $@ 
$<,"  Building $(TARGET_DIR)$@")
+   $(call quiet-command,$(LD) $(LDFLAGS_NOPIE) -m elf_i386 -Ttext 0 -e 
_start -s -o $@ $<,"  Building $(TARGET_DIR)$@")
 
 %.raw: %.img
$(call quiet-command,$(OBJCOPY) -O binary -j .text $< $@,"  Building 
$(TARGET_DIR)$@")
diff --git a/pc-bios/optionrom/linuxboot_dma.c 
b/pc-bios/optionrom/linuxboot_dma.c
new file mode 100644
index 000..c1181cd
--- /dev/null
+++ b/pc-bios/optionrom/linuxboot_dma.c
@@ -0,0 +1,288 @@
+/*
+ * Linux Boot Option ROM for fw_cfg DMA
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ *
+ * Copyright (c) 2015 Red Hat Inc.
+ *   Authors: Marc Marí 
+ */
+
+asm(
+".text\n"
+".global _start\n"
+"_start:\n"
+"   .short 0xaa55\n"
+"   .byte (_end - _start) / 512\n"
+"   lret\n"
+"   .org 0x18\n"
+"   .short 0\n"
+"   .short _pnph\n"
+"_pnph:\n"
+"   .ascii \"$PnP\"\n"
+"   .byte 0x01\n"
+"   .byte ( _pnph_len / 16

Re: [Qemu-devel] [PATCH v9 35/37] qapi: Change visit_type_FOO() to no longer return partial objects

2016-01-29 Thread Markus Armbruster

Eric Blake  writes:

> On 01/28/2016 08:24 AM, Markus Armbruster wrote:
>> Eric Blake  writes:
>> 
>>> Returning a partial object on error is an invitation for a careless
>>> caller to leak memory.  As no one outside the testsuite was actually
>>> relying on these semantics, it is cleaner to just document and
>>> guarantee that ALL pointer-based visit_type_FOO() functions always
>>> leave a safe value in *obj during an input visitor (either the new
>>> object on success, or NULL if an error is encountered).
>>>
>>> Since input visitors have blind assignment semantics, we have to
>>> track the result of whether an assignment is made all the way down
>>> to each visitor callback implementation, to avoid making decisions
>>> based on potentially uninitialized storage.
>> 
>> I'm not sure I get this paragraph.  What's "blind assignment semantics"?
>
> The caller does:
>
> {
> Foo *bar; /* uninit */
> visit_type_Foo();
> if (no error) {
> /* use bar */
> }
> }
>
> Which means the visitor core can't do 'if (*obj)', because obj may be
> uninitialized on entry; if it dereferences obj at all, it must be via
> '*obj = ...' which I'm terming a blind assignment.
>
> But I can try and come up with better wording.

I'd suggest one, but I think we should first make up our minds on error
behavior.

>>> Note that we still leave *obj unchanged after a scalar-based
>>> visit_type_FOO(); I did not feel like auditing all uses of
>>> visit_type_Enum() to see if the callers would tolerate a specific
>>> sentinel value (not to mention having to decide whether it would
>>> be better to use 0 or ENUM__MAX as that sentinel).
>> 
>> The assigning input visitor functions (core and generated) all assign
>> either a pointer to a newly allocated object, or a non-pointer scalar
>> value.
>> 
>> Possible behaviors on error:
>> 
>> (0) What we have now: assign something that must be cleaned up with the
>> dealloc visitor if it's a pointer, but is otherwise useless
>> 
>> CON: callers have to clean up
>> CON: exposes careless callers to useless values
>> 
>> (1) Don't assign anything

Need to be very careful to store only when success is assured.

Consider visiting a QAPI type that is actually two levels of containers,
say a list of a struct of scalars.  The visit is a walk of this tree
then:

                    list
               ____/ ... \____
              /               \
          struct1           structN
         /  ...  \         /  ...  \
    scal11    scal1M   scalN1    scalNM

Now let's consider the state when the visit reaches scalN1:

* The visits of scal11..scal1M and struct1 have all succeeded already,
  and stored their value into their container.  Same for the visits of
  structI (I < N) omitted in the diagram, and their scalars.

* The visits of list and structN are in progress: the object has been
  partially constructed, but not yet stored.

* The remaining visits haven't begun, and their members in objects
  already allocated are still zero.

Now the visit of scalN1 fails.  The visit of structN fails in turn,
freeing its (not yet stored, partially constructed) object.  Finally,
the visit of list fails, freeing its object.

That the failed visits of scalN1 and structN didn't store is actually
unimportant.

Of course, this isn't how things work now.  Visits of containers store
the newly allocated object before visiting members, in visit_start_*(),
and the member visits use that to find the object.
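
To make the "store only when success is assured" walk above concrete, here
is a minimal sketch in plain C. It is not the QAPI visitor code: Pair,
PairList, visit_scalar(), visit_pair() and visit_list() are hypothetical
names, and a negative input merely stands in for a failing scalar visit.
Each container is built in a local variable and stored into *obj only after
all of its members have been visited successfully.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct Pair { int a, b; } Pair;     /* the "struct of scalars" */
typedef struct PairList {
    Pair *value;
    struct PairList *next;
} PairList;

/* Hypothetical scalar visit: a negative input simulates a failure. */
static bool visit_scalar(int in, int *out)
{
    if (in < 0) {
        return false;       /* nothing stored */
    }
    *out = in;
    return true;
}

static bool visit_pair(const int *in, Pair **obj)
{
    Pair *p = calloc(1, sizeof(*p));

    if (!p) {
        return false;
    }
    if (!visit_scalar(in[0], &p->a) || !visit_scalar(in[1], &p->b)) {
        free(p);            /* partially constructed, never stored */
        return false;
    }
    *obj = p;               /* store only on success */
    return true;
}

static void free_list(PairList *l)
{
    while (l) {
        PairList *next = l->next;
        free(l->value);
        free(l);
        l = next;
    }
}

/* 'in' holds n pairs laid out as a0, b0, a1, b1, ... */
static bool visit_list(const int *in, size_t n, PairList **obj)
{
    PairList *head = NULL, **tail = &head;

    for (size_t i = 0; i < n; i++) {
        PairList *node = calloc(1, sizeof(*node));

        if (!node || !visit_pair(&in[2 * i], &node->value)) {
            free(node);
            free_list(head);    /* drop everything built so far */
            return false;       /* *obj is left untouched */
        }
        *tail = node;
        tail = &node->next;
    }
    *obj = head;
    return true;
}
```

With this shape, a caller that passes an uninitialized pointer still has an
uninitialized pointer after a failure, which is the CON listed for (1).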

>> PRO: consistent
>> CON: exposes careless callers to uninitialized values
>
> Half-PRO: Caller can pre-initialize a default, and rely on that value on
> error.  In fact, I think we have callers doing that when visiting an
> enum, and I didn't feel up to auditing them all when first writing the
> patch.
>
> But a small audit right now shows:
>
> qom/object.c:object_property_get_enum() starts with uninitialized 'int
> ret;', hardcodes 'return 0;' on some failures, but otherwise passes it
> to visit_type_enum() then blindly returns that value even if errp is
> set.  Yuck.  Callers HAVE to check errp rather than relying on the
> return value to flag errors; although it looks like the lone caller is
> in numa.c and passes &error_abort.
>
> Maybe I should just bite the bullet, and audit ALL uses of visitor for
> their behavior of what to expect in *obj on error.
>> 
>> (2) Assign zero bits
>> 
>> PRO: consistent
>> CON: exposes careless callers to bogus zero values
>
> Half-CON: Caller cannot pre-initialize a default

With (1) don't assign, the caller can pick an error value by assigning
it before the visit, and it must not access the value on error unless it
does.

With (2) assign zero, the caller can't pick an error value, but may
safely access the value even on error.

Tradeoff.  I figure either can work for us.
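
The tradeoff between (1) and (2) can be seen in a few lines of C. This is a
sketch under assumptions, not visitor code: Mode, parse_mode_keep() and
parse_mode_zero() are hypothetical, and an out-of-range integer stands in
for a failed visit.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum Mode { MODE_A = 0, MODE_B = 1, MODE_C = 2 } Mode;

/* Behavior (1): do not assign anything on error. */
static bool parse_mode_keep(int wire, Mode *obj)
{
    if (wire < MODE_A || wire > MODE_C) {
        return false;           /* *obj keeps whatever the caller put there */
    }
    *obj = (Mode)wire;
    return true;
}

/* Behavior (2): always assign; store zero bits on error. */
static bool parse_mode_zero(int wire, Mode *obj)
{
    bool ok = wire >= MODE_A && wire <= MODE_C;

    *obj = ok ? (Mode)wire : (Mode)0;
    return ok;
}
```

With the first variant a caller may preset a default and rely on it after a
failure; with the second the preset is overwritten, but the value is always
defined.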

>> (3) Assign null pointer, else don't assign anything
>> 
>> CON: inconsistent
>> CON: mix of (1)'s and (2)'s CON
>
> Which I think is what

Re: [Qemu-devel] [PATCH v19 7/9] machine: add properties to compat_props incrementaly

2016-01-29 Thread Cornelia Huck

On Thu, 28 Jan 2016 11:58:08 +0100
Igor Mammedov  wrote:

> Switch to adding compat properties incrementally instead of
> completely overwriting compat_props per machine type.
> That removes data duplication which we have due to nested
> [PC|SPAPR]_COMPAT_* macros.

We'll try to switch to something similar to spapr for ccw so we can get
rid of the nesting as well (once one of us has time to look into that).

> 
> It also allows setting default device properties from the
> default foo_machine_options() hook, which will be used
> in the following patch for putting the VMGENID device as
> a function of the ISA bridge on pc/q35 machines.
> 
> Suggested-by: Eduardo Habkost 
> Signed-off-by: Igor Mammedov 

master + this patch (+ <20160115120143.GB2432@work-vm>) survives some
playing around with virsh managedsave and the 2.4/2.5/2.6 ccw machines,
so

Acked-by: Cornelia Huck

[Qemu-devel] [PATCH] usb: ehci: add capability mmio write function

2016-01-29 Thread P J P

From: Prasad J Pandit 

USB EHCI emulation supports host controller capability registers.
But its mmio '.write' function was missing, which led to a null
pointer dereference issue. Add a do-nothing 'ehci_caps_write'
definition to avoid it; it does nothing because the capability
registers are read-only (RO).
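
The failure mode described above can be sketched without any QEMU types.
RegionOps, dispatch_write() and the caps_* names below are hypothetical and
only illustrate the idea: a dispatcher that calls through a callback table
needs every callback it may invoke to be non-NULL, so a read-only region
gets a write handler that simply discards the value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct RegionOps {
    uint64_t (*read)(void *opaque, uint64_t addr, unsigned size);
    void (*write)(void *opaque, uint64_t addr, uint64_t val, unsigned size);
} RegionOps;

#define CAPS_LEN 4

static uint64_t caps_read(void *opaque, uint64_t addr, unsigned size)
{
    const uint8_t *caps = opaque;

    (void)size;
    return addr < CAPS_LEN ? caps[addr] : 0;
}

/* Read-only registers: accept the access and discard the value. */
static void caps_write(void *opaque, uint64_t addr, uint64_t val,
                       unsigned size)
{
    (void)opaque;
    (void)addr;
    (void)val;
    (void)size;
}

static const RegionOps caps_ops = {
    .read = caps_read,
    .write = caps_write,    /* leaving this NULL would crash dispatch_write() */
};

/* Simplified dispatcher that calls the callback unconditionally. */
static void dispatch_write(const RegionOps *ops, void *opaque,
                           uint64_t addr, uint64_t val, unsigned size)
{
    ops->write(opaque, addr, val, size);
}

/* Returns the register value observed after a guest write attempt. */
static uint64_t caps_demo(void)
{
    uint8_t caps[CAPS_LEN] = { 0x20, 0x00, 0x00, 0x01 };

    dispatch_write(&caps_ops, caps, 0, 0xff, 1);
    return caps_ops.read(caps, 0, 1);
}
```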

Reported-by: Zuozhi Fzz 
Signed-off-by: Prasad J Pandit 
---
 hw/usb/hcd-ehci.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/hw/usb/hcd-ehci.c b/hw/usb/hcd-ehci.c
index c40013e..b08ff62 100644
--- a/hw/usb/hcd-ehci.c
+++ b/hw/usb/hcd-ehci.c
@@ -893,6 +893,11 @@ static uint64_t ehci_caps_read(void *ptr, hwaddr addr,
 return s->caps[addr];
 }
 
+static void ehci_caps_write(void *ptr, hwaddr addr,
+ uint64_t val, unsigned size)
+{
+}
+
 static uint64_t ehci_opreg_read(void *ptr, hwaddr addr,
 unsigned size)
 {
@@ -2313,6 +2318,7 @@ static void ehci_frame_timer(void *opaque)
 
 static const MemoryRegionOps ehci_mmio_caps_ops = {
 .read = ehci_caps_read,
+.write = ehci_caps_write,
 .valid.min_access_size = 1,
 .valid.max_access_size = 4,
 .impl.min_access_size = 1,
-- 
2.5.0

[Qemu-devel] [Bug 1538541] Re: qcow2 rejects request to use preallocation with backing file

2016-01-29 Thread Max Reitz

It is not exactly trivial, and it being in qemu does not make it
simpler.
http://git.qemu.org/?p=qemu.git;a=blob;f=block/qcow2.c;h=fd8436c5f8b13ab0e8c605147ce76e6b6a8e5f95;hb=HEAD#l2105
is the code that calculates the size; as you can see, it is pretty self-
contained.

Max

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1538541

Title:
  qcow2 rejects request to use preallocation with backing file

Status in QEMU:
  New

Bug description:
  The 'preallocation=full' option to qemu-img / qcow2 block driver
  instructs QEMU to fully allocate the host file to the maximum size
  needed by the logical disk size.

  $ qemu-img create -f qcow2 -o preallocation=full base.qcow2 200M
  Formatting 'base.qcow2', fmt=qcow2 size=209715200 encryption=off 
cluster_size=65536 preallocation='full' lazy_refcounts=off refcount_bits=16

  $ ls -alhs base.qcow2 
  201M -rw-r--r--. 1 berrange berrange 201M Jan 27 12:49 base.qcow2

  
  When specifying a backing file for the qcow2 file, however, it rejects the 
preallocation request

  $ qemu-img create -f qcow2 -o preallocation=full,backing_file=base.qcow2 
front.qcow2 200M
  Formatting 'front.qcow2', fmt=qcow2 size=209715200 backing_file='base.qcow2' 
encryption=off cluster_size=65536 preallocation='full' lazy_refcounts=off 
refcount_bits=16
  qemu-img: front.qcow2: Backing file and preallocation cannot be used at the 
same time

  
  It might seem like requesting full preallocation is redundant because most
  data associated with the image will be present in the backing file, and so the
  top layer is unlikely to ever need the full preallocation.  Rejecting this,
  however, means it is not (officially) possible to reserve disk space for the
  top layer to guarantee that future copy-on-writes will never get ENOSPC.

  OpenStack in particular uses backing files with all images, in order
  to avoid the I/O overhead of copying the backing file contents into
  the per-VM disk image. It, however, still wants to have a guarantee
  that the per-VM image will never hit an ENOSPC scenario.

  Currently it has to hack around QEMU's refusal to allow backing_file +
  preallocation, by calling 'fallocate' on the qcow2 file after it has
  been created. This is an inexact fix though, because it doesn't take
  account of the fact that qcow2 metadata can take some MBs of space.

  Thus, it would like to see preallocation=full supported in combination
  with backing files.
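
  To illustrate why a plain fallocate of the virtual size falls short, the
  metadata overhead can be estimated roughly. The sketch below is not the
  calculation in block/qcow2.c; it is a simplified approximation that assumes
  one header cluster, 8-byte L1, L2 and refcount-table entries, and no
  snapshots or other extras. div_up() and qcow2_size_estimate() are
  hypothetical helpers.

```c
#include <assert.h>
#include <stdint.h>

static uint64_t div_up(uint64_t a, uint64_t b)
{
    return (a + b - 1) / b;
}

/* Rough host size, in bytes, of a fully allocated qcow2 image. */
static uint64_t qcow2_size_estimate(uint64_t virt_size,
                                    uint64_t cluster_size,
                                    unsigned refcount_bits)
{
    uint64_t data_clusters = div_up(virt_size, cluster_size);
    /* Each L2 table fills one cluster with 8-byte entries. */
    uint64_t l2_tables = div_up(data_clusters, cluster_size / 8);
    uint64_t l1_clusters = div_up(l2_tables * 8, cluster_size);
    /* 1 accounts for the header cluster. */
    uint64_t clusters = 1 + data_clusters + l2_tables + l1_clusters;
    uint64_t refs_per_block = cluster_size * 8 / refcount_bits;
    uint64_t refblocks = 0, reftable = 0, prev;

    /* Refcount structures must also cover themselves: iterate to a fixpoint. */
    do {
        prev = refblocks + reftable;
        refblocks = div_up(clusters + prev, refs_per_block);
        reftable = div_up(refblocks * 8, cluster_size);
    } while (refblocks + reftable != prev);

    return (clusters + refblocks + reftable) * cluster_size;
}
```

  For the 200M example above (64 KiB clusters, 16-bit refcounts) this estimate
  comes to 210042880 bytes, i.e. 320 KiB more than the virtual size. The
  overhead grows with the image size.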


Re: [Qemu-devel] [PATCH] build: Add include check on syscall.h

2016-01-29 Thread Peter Maydell

On 28 January 2016 at 20:36, Lluís Vilanova  wrote:
> The LTTng tracing backend includes the system's "syscall.h", but QEMU
> replaces it with its own for linux-user builds. This results in a double
> include on some targets (when LTTng is enabled).
>
> Signed-off-by: Lluís Vilanova 

Adding include guards is fine, but it sounds to me like what we
should actually do to fix this confusion is rename all the linux-user
local headers to target_syscall.h.

thanks
-- PMM

Re: [Qemu-devel] [PATCH 2/2] migration/virtio: Remove simple .get/.put use

2016-01-29 Thread Cornelia Huck

On Thu, 21 Jan 2016 21:56:22 +0100
Sascha Silbe  wrote:

> "Dr. David Alan Gilbert"  writes:

> > Thanks I've reused a chunk of that;  I'll post the fix soon.
> > Thanks for your help on this.
> 
> Thanks, looking forward to the next version of your patch. Will use the
> previous one in the meantime as a local work-around.

Any updates around this? I always need to remember to put this patch on
top when I test migration...

Re: [Qemu-devel] [PATCH COLO-Frame v13 10/39] COLO: Implement colo checkpoint protocol

2016-01-29 Thread Dr. David Alan Gilbert

* zhanghailiang (zhang.zhanghaili...@huawei.com) wrote:
> We need a user-defined communication protocol to control the checkpoint
> process.
> 
> A new checkpoint request is started by the Primary VM, and the interaction
> proceeds as shown below:
> Checkpoint synchronizing points:
> 
>                       Primary               Secondary
>                                             initial work
> 'checkpoint-ready'    <-------------------- @
> 
> 'checkpoint-request'  @ -------------------->
>                                             Suspend (Only in hybrid mode)
> 'checkpoint-reply'    <-------------------- @
>                       Suspend state
> 'vmstate-send'        @ -------------------->
>                       Send state            Receive state
> 'vmstate-received'    <-------------------- @
>                       Release packets       Load state
> 'vmstate-load'        <-------------------- @
>                       Resume                Resume (Only in hybrid mode)
> 
>                       Start Comparing (Only in hybrid mode)
> NOTE:
>  1) '@' marks who sends the message
>  2) Every sync-point is synchronized by the two sides with only
>     one handshake (single direction) for low latency.
>     If stricter synchronization is required, an opposite-direction
>     sync-point should be added.
>  3) Since sync-points are single direction, the remote side may
>     go forward a lot when this side just receives the sync-point.
>  4) For now, we only support 'periodic' checkpoint, for which
>     the Secondary VM is not running; later we will support 'hybrid' mode.
> 
> Signed-off-by: zhanghailiang 
> Signed-off-by: Li Zhijian 
> Signed-off-by: Gonglei 
> Cc: Eric Blake 
> Cc: Markus Armbruster 
> Cc: Dr. David Alan Gilbert 
> ---
> v13:
> - Refactor colo command related helper functions, use 'Error **errp' parameter
>   instead of return value to indicate success or failure.
> - Fix some other comments from Markus.
> 
> v12:
> - Rename colo_ctl_put() to colo_put_cmd()
> - Rename colo_ctl_get() to colo_get_check_cmd() and drop
>   the third parameter
> - Rename colo_ctl_get_cmd() to colo_get_cmd()
> - Remove useless 'invalid' member for COLOcommand enum.
> v11:
> - Add missing 'checkpoint-ready' communication in comment.
> - Use parameter to return 'value' for colo_ctl_get() (Dave's suggestion)
> - Fix trace for colo_ctl_get() to trace command and value both
> v10:
> - Rename enum COLOCmd to COLOCommand (Eric's suggestion).
> - Remove unused 'ram-steal'
> ---
>  migration/colo.c | 201 
> ++-
>  qapi-schema.json |  25 +++
>  trace-events |   2 +
>  3 files changed, 226 insertions(+), 2 deletions(-)
> 
> diff --git a/migration/colo.c b/migration/colo.c
> index 65ac0c9..c76e1fa 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -10,6 +10,7 @@
>   * later.  See the COPYING file in the top-level directory.
>   */
>  
> +#include 
>  #include "sysemu/sysemu.h"
>  #include "migration/colo.h"
>  #include "trace.h"
> @@ -34,22 +35,147 @@ bool migration_incoming_in_colo_state(void)
>  return mis && (mis->state == MIGRATION_STATUS_COLO);
>  }
>  
> +static void colo_put_cmd(QEMUFile *f, COLOCommand cmd,
> + Error **errp)
> +{
> +int ret;
> +
> +if (cmd >= COLO_COMMAND__MAX) {
> +error_setg(errp, "%s: Invalid cmd", __func__);
> +return;
> +}
> +qemu_put_be32(f, cmd);
> +qemu_fflush(f);
> +
> +ret = qemu_file_get_error(f);
> +if (ret < 0) {
> +error_setg_errno(errp, -ret, "Can't put COLO command");
> +}
> +trace_colo_put_cmd(COLOCommand_lookup[cmd]);
> +}
> +
> +static COLOCommand colo_get_cmd(QEMUFile *f, Error **errp)
> +{
> +COLOCommand cmd;
> +int ret;
> +
> +cmd = qemu_get_be32(f);
> +ret = qemu_file_get_error(f);
> +if (ret < 0) {
> +error_setg_errno(errp, -ret, "Can't get COLO command");
> +return cmd;
> +}
> +if (cmd >= COLO_COMMAND__MAX) {

I'm not sure this is guaranteed to do the right thing if the value
read was negative and converted to the enum;  it seems to depend
a bit on the compiler as to whether an enum is signed or not;
but it seems the normal behaviour is for the enum to be unsigned
as long as there are no negative values when it's declared; so it looks
safe in practice.
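
One way to sidestep the signedness question is to range-check the raw
32-bit value before converting it to the enum. This is only a sketch of the
idea, not a proposed change to the patch: Cmd, CMD__MAX and cmd_from_wire()
are hypothetical stand-ins for COLOCommand and its lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum Cmd { CMD_READY, CMD_REQUEST, CMD_REPLY, CMD__MAX } Cmd;

/*
 * Validate the value while it is still an unsigned integer, so the
 * comparison does not depend on whether the compiler picks a signed
 * or unsigned underlying type for the enum.
 */
static bool cmd_from_wire(uint32_t raw, Cmd *out)
{
    if (raw >= (uint32_t)CMD__MAX) {
        return false;       /* *out is left untouched */
    }
    *out = (Cmd)raw;
    return true;
}
```

A value such as 0xffffffff is rejected here regardless of how the enum is
represented.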

Reviewed-by: Dr. David Alan Gilbert 

> +error_setg(errp, "%s: Invalid cmd", __func__);
> +return cmd;
> +}
> +trace_colo_get_cmd(COLOCommand_lookup[cmd]);
> +return cmd;
> +}
> +
> +static void colo_get_check_cmd(QEMUFile *f, COLOCommand expect_cmd,
> +   Error **errp)
> +{
> +COLOCommand cmd;
> +Error *local_err = NULL;
> +
> +cmd = colo_get_cmd(f, &local_err);
> +if (local_err) {
> +

Re: [Qemu-devel] [PATCH v4 0/5] ARM: Add NUMA support for machine virt

2016-01-29 Thread Andrew Jones

On Fri, Jan 29, 2016 at 02:52:35PM +0800, Shannon Zhao wrote:
> 
> 
> On 2016/1/29 14:32, Ashok Kumar wrote:
> > Hi, 
> > 
> > On Sat, Jan 23, 2016 at 07:36:41PM +0800, Shannon Zhao wrote:
> >> > From: Shannon Zhao 
> >> > 
> >> > Add NUMA support for machine virt. Tested successfully running a guest
> >> > Linux kernel with the following patch applied:
> >> > 
> >> > - [PATCH v9 0/6] arm64, numa: Add numa support for arm64 platforms
> >> > https://lwn.net/Articles/672329/
> >> > - [PATCH v2 0/4] ACPI based NUMA support for ARM64
> >> > http://www.spinics.net/lists/linux-acpi/msg61795.html
> >> > 
> >> > Changes since v3:
> >> > * based on new kernel driver and device bindings
> >> > * add ACPI part
> >> > 
> >> > Changes since v2:
> >> > * update to use NUMA node property arm,associativity.
> >> > 
> >> > Changes since v1:
> >> > Take into account Peter's comments:
> >> > * rename virt_memory_init to arm_generate_memory_dtb
> >> > * move arm_generate_memory_dtb to boot.c and make it a common func
> >> > * use a struct numa_map to generate numa dtb
> >> > 
> >> > Example qemu command line:
> >> > qemu-system-aarch64 \
> >> > -enable-kvm -smp 4\
> >> > -kernel Image \
> >> > -m 512 -machine virt,kernel_irqchip=on \
> >> > -initrd guestfs.cpio.gz \
> >> > -cpu host -nographic \
> >> > -numa node,mem=256M,cpus=0-1,nodeid=0 \
> >> > -numa node,mem=256M,cpus=2-3,nodeid=1 \
> >> > -append "console=ttyAMA0 root=/dev/ram"
> >> > 
> >> > Shannon Zhao (5):
> >> >   ARM: Virt: Add /distance-map node for NUMA
> >> >   ARM: Virt: Set numa-node-id for CPUs
> >> >   ARM: Add numa-node-id for /memory node
> >> >   include/hw/acpi/acpi-defs: Add GICC Affinity Structure
> >> >   hw/arm/virt-acpi-build: Generate SRAT table
> >> > 
> >> >  hw/arm/boot.c   | 29 ++-
> >> >  hw/arm/virt-acpi-build.c| 58 
> >> > +
> >> >  hw/arm/virt.c   | 37 +
> >> >  hw/i386/acpi-build.c|  2 +-
> >> >  include/hw/acpi/acpi-defs.h | 15 +++-
> >> >  5 files changed, 138 insertions(+), 3 deletions(-)
> >> > 
> >> > -- 
> >> > 2.0.4
> >> > 
> > Don't we need to populate the NUMA node in the Affinity byte of MPIDR?
> > Linux uses the Affinity information in MPIDR to build topology which
> > might go wrong for the guest in this case. 
> > Maybe a non Linux OS might be impacted more?
> > 
> Ah, yes. It needs to update the MPIDR. But currently QEMU uses the value
> from KVM when using KVM. It needs to call kvm_set_one_reg to set the
> MPIDR and I'm not sure if this will affect KVM by looking at following
> comments:
> /*
>  * When KVM is in use, PSCI is emulated in-kernel and not by qemu.
>  * Currently KVM has its own idea about MPIDR assignment, so we
>  * override our defaults with what we get from KVM.
>  */
> 
> Peter, do you have any suggestion?
> 
> > distance-map compatible string has been changed from
> > "numa,distance-map-v1" to "numa-distance-map-v1"
> Will update this.

Sigh... I've had the MPIDR rework (to let QEMU dictate it) on my TODO list
for a loong time. I'll pick it up today and hopefully have something to
discuss next week. I'll keep this series in mind too.

Thanks,
drew

Re: [Qemu-devel] [PATCH v3] blockjob: Fix hang in block_job_finish_sync

2016-01-29 Thread Stefan Hajnoczi

On Fri, Jan 29, 2016 at 10:19:49AM +0800, Fam Zheng wrote:
> @@ -402,6 +407,10 @@ typedef void BlockJobDeferToMainLoopFn(BlockJob *job, 
> void *opaque);
>   * AioContext acquired.  Block jobs must call bdrv_unref(), bdrv_close(), and
>   * anything that uses bdrv_drain_all() in the main loop.
>   *
> + * The job->deferred_to_main_loop flag will be set. Caller must clear it once
> + * the deferred work is done and the block job coroutine continues, unless 
> it's
> + * completing immediately.
> + *

It's not necessary to expose job->deferred_to_main_loop to the user.
Just clear it:

static void block_job_defer_to_main_loop_bh(void *opaque)
{
BlockJobDeferToMainLoopData *data = opaque;
AioContext *aio_context;

qemu_bh_delete(data->bh);

/* Prevent race with block_job_defer_to_main_loop() */
aio_context_acquire(data->aio_context);

/* Fetch BDS AioContext again, in case it has changed */
aio_context = bdrv_get_aio_context(data->job->bs);
aio_context_acquire(aio_context);

data->fn(data->job, data->opaque);
data->job->deferred_to_main_loop = false;  /* <- HERE */

aio_context_release(aio_context);

aio_context_release(data->aio_context);

g_free(data);
}



Re: [Qemu-devel] [PATCH v19 3/9] pc: add a Virtual Machine Generation ID device

2016-01-29 Thread Igor Mammedov

On Thu, 28 Jan 2016 14:59:25 +0200
"Michael S. Tsirkin"  wrote:

> On Thu, Jan 28, 2016 at 01:03:16PM +0100, Igor Mammedov wrote:
> > On Thu, 28 Jan 2016 13:13:04 +0200
> > "Michael S. Tsirkin"  wrote:
> >   
> > > On Thu, Jan 28, 2016 at 11:54:25AM +0100, Igor Mammedov wrote:  
> > > > Based on Microsoft's specifications (paper can be
> > > > downloaded from http://go.microsoft.com/fwlink/?LinkId=260709,
> > > > easily found by "Virtual Machine Generation ID" keywords),
> > > > add a PCI device with corresponding description in
> > > > SSDT ACPI table.
> > > > 
> > > > The GUID is set using "vmgenid.guid" property or
> > > > a corresponding HMP/QMP command.
> > > > 
> > > > Example of using vmgenid device:
> > > >  -device vmgenid,id=FOO,guid="324e6eaf-d1d1-4bf6-bf41-b9bb6c91fb87"
> > > > 
> > > > 'vmgenid' device initialization flow is as following:
> > > >  1. vmgenid has RAM BAR registered with size of GUID buffer
> > > >  2. BIOS initializes PCI devices and it maps BAR in PCI hole
> > > >  3. BIOS reads ACPI tables from QEMU, at that moment tables
> > > > are generated with \_SB.VMGI.ADDR constant pointing to
> > > > GPA where BIOS's mapped vmgenid's BAR earlier
> > > > 
> > > > Note:
> > > > This implementation uses PCI class 0x0500 code for vmgenid device,
> > > > that is marked as NO_DRV in Windows's machine.inf.
> > > > Testing various Windows versions showed that, OS
> > > > doesn't touch nor checks for resource conflicts
> > > > for such PCI devices.
> > > > There was concern that during PCI rebalancing, OS
> > > > could reprogram the BAR at other place, which would
> > > > leave VGEN.ADDR pointing to the old (no more valid)
> > > > address.
> > > > However testing showed that Windows does rebalancing
> > > > only for PCI device that have a driver attached
> > > > and completely ignores NO_DRV class of devices.
> > > > Which in turn creates a problem where OS could remap
> > > > one of PCI devices(with driver) over BAR used by
> > > > a driver-less PCI device.
> > > > Statically declaring used memory range as VGEN._CRS
> > > > makes OS to honor resource reservation and an ignored
> > > > BAR range is not longer touched during PCI rebalancing.
> > > > 
> > > > Signed-off-by: Gal Hammer 
> > > > Signed-off-by: Igor Mammedov 
> > > 
> > > It's an interesting hack, but this needs some thought. BIOS has no idea
> > > this BAR is special and can not be rebalanced, so it might put the BAR
> > > in the middle of the range, in effect fragmenting it.  
> > yep that's the only drawback in PCI approach.
> >   
> > > Really I think something like V12 just rewritten using the new APIs
> > > (probably with something like build_append_named_dword that I suggested)
> > > would be much a simpler way to implement this device, given
> > > the weird API limitations.  
> > We went over stating drawbacks of both approaches several times 
> > and that's where I strongly disagree with using v12 AML patching
> > approach for reasons stated in those discussions.  
> 
> Yes, IIRC you dislike the need to allocate an IO range to pass address
> to host, and to have custom code to migrate the address.
Allocating IO ports is fine by me, but I'm against using the bios_linker (ACPI)
approach for the task at hand.
Let me enumerate one more time the issues that make me dislike it so much
(most disliked first):

1. over-engineered for the task at hand, 
   for device to become initialized guest OS has to execute AML,
   so init chain looks like:
 QEMU -> BIOS (patch AML) -> OS (AML write buf address to IO port) ->
 QEMU (update buf address)
   it's hell to debug when something doesn't work right in this chain
   even if there isn't any memory corruption that incorrect AML patching
   could introduce.
   As result of complexity patches are hard to review since one has
   to remember/relearn all details how bios_linker in QEMU and BIOS works,
   hence chance of regression is very high.
   Dynamically patched AML also introduces its own share of AML
   code that has to deal with dynamic buff address value.
   For an example:
 "nvdimm acpi: add _CRS" https://patchwork.ozlabs.org/patch/566697/
   27 liner patch could be just 5-6 lines if static (known in advance)
   buffer address were used to declare static _CRS variable.

2. ACPI approach consumes guest usable RAM to allocate buffer
   and then makes device to DMA data in that RAM.
   That's a design point I don't agree with.
   Just compare with a graphics card design, where on device memory
   is mapped directly at some GPA not wasting RAM that guest could
   use for other tasks.
   VMGENID and NVDIMM use-cases look to me exactly the same, i.e.
   instead of consuming guest's RAM they should be mapped at
   some GPA and their memory accessed directly.
   In that case NVDIMM could even map whole label area and
   significantly simplify QEMU<->OSPM protocol that

Re: [Qemu-devel] [PATCH 1/2] block: fix assert in qcow2_get_specific_info

2016-01-29 Thread Denis V. Lunev


On 01/20/2016 10:12 AM, Denis V. Lunev wrote:

It is possible to hit the assert in qcow2_get_specific_info because
s->qcow_version is undefined. This happens when a VM is starting from a
suspended state, i.e. it is processing an incoming migration, and
'info block' is called at the same time.

The problem is that qcow2_invalidate_cache closes the image and memsets
BDRVQcow2State in the middle.

The patch moves the processing of qcow2_get_specific_info into coroutine
context and ensures that qcow2_invalidate_cache and qcow2_get_specific_info
cannot run simultaneously.
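
The patch keeps the new flag alive across the reset by zeroing only the part of the state that precedes it. A minimal sketch of that offsetof() trick, assuming a made-up ToyState struct (not the real BDRVQcow2State) where the preserved field is declared last:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical state struct; only the field order matters here. */
typedef struct ToyState {
    int version;
    long cluster_size;
    /* Everything from this field to the end survives toy_reset(). */
    bool in_transient_state;
} ToyState;

/* Zero every byte that precedes in_transient_state. */
static void toy_reset(ToyState *s)
{
    assert(s != NULL);
    memset(s, 0, offsetof(ToyState, in_transient_state));
}
```

The approach depends on field order: any member later added after in_transient_state is silently exempt from the reset, and any member that must survive has to be placed there deliberately.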

Signed-off-by: Denis V. Lunev 
CC: Kevin Wolf 
CC: Paolo Bonzini 
---
  block/qcow2.c | 64 ++-
  block/qcow2.h |  2 ++
  2 files changed, 61 insertions(+), 5 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 1789af4..12eda24 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1740,6 +1740,10 @@ static void qcow2_invalidate_cache(BlockDriverState *bs, Error **errp)
  Error *local_err = NULL;
  int ret;
  
+qemu_co_mutex_lock(&s->lock);
+s->in_transient_state = true;
+qemu_co_mutex_unlock(&s->lock);
+
  /*
   * Backing files are read-only which makes all of their metadata immutable,
   * that means we don't have to worry about reopening them here.
@@ -1753,10 +1757,10 @@ static void qcow2_invalidate_cache(BlockDriverState *bs, Error **errp)
  bdrv_invalidate_cache(bs->file->bs, &local_err);
  if (local_err) {
  error_propagate(errp, local_err);
-return;
+goto done;
  }
  
-memset(s, 0, sizeof(BDRVQcow2State));
+memset(s, 0, offsetof(BDRVQcow2State, in_transient_state));
  options = qdict_clone_shallow(bs->options);
  
  ret = qcow2_open(bs, options, flags, &local_err);
@@ -1765,13 +1769,18 @@ static void qcow2_invalidate_cache(BlockDriverState *bs, Error **errp)
  error_setg(errp, "Could not reopen qcow2 layer: %s",
 error_get_pretty(local_err));
  error_free(local_err);
-return;
+goto done;
  } else if (ret < 0) {
  error_setg_errno(errp, -ret, "Could not reopen qcow2 layer");
-return;
+goto done;
  }
  
  s->cipher = cipher;
+
+done:
+qemu_co_mutex_lock(&s->lock);
+s->in_transient_state = false;
+qemu_co_mutex_unlock(&s->lock);
  }
  
  static size_t header_ext_add(char *buf, uint32_t magic, const void *s,
@@ -2778,11 +2787,21 @@ static int qcow2_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
  return 0;
  }
  
-static ImageInfoSpecific *qcow2_get_specific_info(BlockDriverState *bs)
+
+static ImageInfoSpecific *qcow2_co_get_specific_info(BlockDriverState *bs)
  {
  BDRVQcow2State *s = bs->opaque;
+AioContext *ctx = bdrv_get_aio_context(bs);
+
  ImageInfoSpecific *spec_info = g_new(ImageInfoSpecific, 1);
  
+qemu_co_mutex_lock(&s->lock);
+while (s->in_transient_state) {
+qemu_co_mutex_unlock(&s->lock);
+aio_poll(ctx, true);
+qemu_co_mutex_lock(&s->lock);
+}
+
  *spec_info = (ImageInfoSpecific){
  .type  = IMAGE_INFO_SPECIFIC_KIND_QCOW2,
  .u.qcow2 = g_new(ImageInfoSpecificQCow2, 1),
@@ -2808,10 +2827,45 @@ static ImageInfoSpecific *qcow2_get_specific_info(BlockDriverState *bs)
   * added without having it covered here */
  assert(false);
  }
+qemu_co_mutex_unlock(&s->lock);
  
  return spec_info;
  }
  
+struct InfoCo {
+BlockDriverState *bs;
+ImageInfoSpecific *info;
+};
+
+static void qcow2_co_get_specific_info_entry(void *opaque)
+{
+struct InfoCo *ret = opaque;
+ret->info = qcow2_co_get_specific_info(ret->bs);
+}
+
+static ImageInfoSpecific *qcow2_get_specific_info(BlockDriverState *bs)
+{
+Coroutine *co;
+struct InfoCo info_co = {
+.bs = bs,
+};
+
+if (qemu_in_coroutine()) {
+/* Fast-path if already in coroutine context */
+qcow2_co_get_specific_info_entry(&info_co);
+} else {
+AioContext *aio_context = bdrv_get_aio_context(bs);
+
+co = qemu_coroutine_create(qcow2_co_get_specific_info_entry);
+qemu_coroutine_enter(co, &info_co);
+while (info_co.info == NULL) {
+aio_poll(aio_context, true);
+}
+}
+
+return info_co.info;
+}
+
  #if 0
  static void dump_refcounts(BlockDriverState *bs)
  {
diff --git a/block/qcow2.h b/block/qcow2.h
index a063a3c..1114528 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -293,6 +293,8 @@ typedef struct BDRVQcow2State {
   * override) */
  char *image_backing_file;
  char *image_backing_format;
+
+bool in_transient_state;
  } BDRVQcow2State;
  
  typedef struct Qcow2COWRegion {

ping v2

Re: [Qemu-devel] [PATCH v9 36/37] RFC: qapi: Adjust layout of FooList types

2016-01-29 Thread Markus Armbruster

Eric Blake  writes:

> On 01/28/2016 08:34 AM, Markus Armbruster wrote:
>> Eric Blake  writes:
>> 
>>> By sticking the next pointer first, we don't need a union with
>>> 64-bit padding for smaller types.  On 32-bit platforms, this
>>> can reduce the size of uint8List from 16 bytes (or 12, depending
>>> on whether 64-bit ints can tolerate 4-byte alignment) down to 8.
>>> It has no effect on 64-bit platforms (where alignment still
>>> dictates a 16-byte struct); but fewer anonymous unions is still
>>> a win in my book.
>>>
>>> However, this requires visit_start_list() and visit_next_list()
>>> to gain a size parameter, to know what size element to allocate.
>>>
>>> I debated about going one step further, to allow for fewer casts,
>>> by doing:
>>> typedef GenericList GenericList;
>>> struct GenericList {
>>> GenericList *next;
>>> };
>>> struct FooList {
>>> GenericList base;
>>> Foo value;
>>> };
>>> so that you convert to 'GenericList *' by '&foolist->base', and
>>> back by 'container_of(generic, GenericList, base)' (as opposed to
>>> the existing '(GenericList *)foolist' and '(FooList *)generic').
>>> But doing that would require hoisting the declaration of
>>> GenericList prior to inclusion of qapi-types.h, rather than its
>>> current spot in visitor.h; it also makes iteration a bit more
>>> verbose through 'foolist->base.next' instead of 'foolist->next'.
>
> Should I attempt this?

A quick grep for '(GenericList' finds two in qapi-code-gen.txt, and two
in qapi-visit-py.  I doubt avoiding them is worth much of your time or
mine :)
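
For readers weighing the two conversion styles discussed above, here is a small sketch of the embedded-base variant. The Toy* types and the toy_container_of macro are defined locally for illustration and are assumptions, not the generated QAPI types or QEMU's own container_of.

```c
#include <assert.h>
#include <stddef.h>

/* Local macro for the sketch; QEMU ships its own container_of. */
#define toy_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

typedef struct ToyGenericList ToyGenericList;
struct ToyGenericList {
    ToyGenericList *next;
};

typedef struct ToyIntList {
    ToyGenericList base;   /* link embedded as the first member */
    int value;
} ToyIntList;

/* FooList -> GenericList without a cast. */
static ToyGenericList *toy_to_generic(ToyIntList *list)
{
    assert(list != NULL);
    return &list->base;
}

/* GenericList -> FooList via the embedded member. */
static ToyIntList *toy_from_generic(ToyGenericList *generic)
{
    assert(generic != NULL);
    return toy_container_of(generic, ToyIntList, base);
}

/* Iteration goes through base.next, which is the extra verbosity
 * mentioned in the quoted text. */
static int toy_sum(ToyIntList *head)
{
    int sum = 0;
    ToyGenericList *g;

    for (g = head ? &head->base : NULL; g != NULL; g = g->next) {
        sum += toy_from_generic(g)->value;
    }
    return sum;
}
```

The gain is type checking on the conversions; the cost is the longer member path during iteration, which matches the trade-off described in the commit message.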

>>>  typedef struct GenericList
>>>  {
>>> -union {
>>> -void *value;
>>> -uint64_t padding;
>>> -};
>>>  struct GenericList *next;
>>> +char padding[];
>>>  } GenericList;
>> 
>> Less trickery, I like it.
>> 
>> Member padding appears to be unused.
>
> or just leave it at this?

I'd say good enough.

>>>  bool visit_start_list(Visitor *v, const char *name, GenericList **list,
>>> -  Error **errp)
>>> +  size_t size, Error **errp)
>>>  {
>>> -bool result = v->start_list(v, name, list, errp);
>>> +bool result;
>>> +
>>> +assert(list ? size : !size);
>> 
>> Tighter than size != 0 would be size >= sizeof(GenericList).  Same elsewhere.
>
> Makes sense.
>
>> 
>> Rest looks good.  Didn't look as closely as for the previous patches
>> (getting tired), but so far I like the idea.
>
> Okay, I'll keep it and drop the RFC.

Thanks!
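
To make the size parameter concrete, here is a rough sketch of a next-pointer-first list where the allocator is told how large each node is. ToyGenericList, ToyByteList and toy_next_list are hypothetical names; the real visitor callbacks take more arguments and report errors differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct ToyGenericList {
    struct ToyGenericList *next;
} ToyGenericList;

typedef struct ToyByteList {
    struct ToyByteList *next;   /* same position as ToyGenericList.next */
    unsigned char value;        /* payload follows the link directly */
} ToyByteList;

/* Allocate a zeroed node of `size` bytes and hook it into *tail.
 * The caller passes sizeof its own list type. Returns NULL if the
 * allocation fails. */
static ToyGenericList *toy_next_list(ToyGenericList **tail, size_t size)
{
    /* The tighter check suggested in review: a node must at least
     * hold the link itself. */
    assert(size >= sizeof(ToyGenericList));
    *tail = calloc(1, size);
    return *tail;
}
```

Because the generic code never looks past the link, the payload no longer needs a padded union, which is where the size reduction on 32-bit hosts comes from.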

[Qemu-devel] [RFC v7 09/16] softmmu: Include MMIO/invalid exclusive accesses

2016-01-29 Thread Alvise Rigo

Enable exclusive accesses when the MMIO/invalid flag is set in the TLB
entry.

In case a LL access is done to MMIO memory, we treat it differently from
a RAM access in that we do not rely on the EXCL bitmap to flag the page
as exclusive. In fact, we don't even need the TLB_EXCL flag to force the
slow path, since it is always forced anyway.

This commit does not take care of invalidating an MMIO exclusive range from
other non-exclusive accesses i.e. CPU1 LoadLink to MMIO address X and
CPU2 writes to X. This will be addressed in the following commit.
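
The mask changes in the diff below can be seen in isolation with a toy model of the TLB flag bits. The constants here are illustrative assumptions (QEMU's real values depend on the target), and the function names are made up for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values only; not QEMU's actual constants. */
#define TOY_PAGE_BITS 12
#define TOY_PAGE_MASK (~(((uint64_t)1 << TOY_PAGE_BITS) - 1))
#define TOY_TLB_MMIO  ((uint64_t)1 << 5)
#define TOY_TLB_EXCL  ((uint64_t)1 << 6)

/* Old test: true only when EXCL is the sole flag set. */
static bool toy_is_excl_old(uint64_t tlb_addr)
{
    return (tlb_addr & ~TOY_PAGE_MASK) == TOY_TLB_EXCL;
}

/* New test: true whenever EXCL is set, whatever else is set. */
static bool toy_is_excl_new(uint64_t tlb_addr)
{
    return (tlb_addr & TOY_TLB_EXCL) != 0;
}

/* Any flag other than EXCL means the access is not plain RAM. */
static bool toy_is_mmio(uint64_t tlb_addr)
{
    return (tlb_addr & ~(TOY_PAGE_MASK | TOY_TLB_EXCL)) != 0;
}
```

With both flags set, the old equality test misses the exclusive case, which is why the patch switches to a bit test and then picks the MMIO or RAM path separately.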

Suggested-by: Jani Kokkonen 
Suggested-by: Claudio Fontana 
Signed-off-by: Alvise Rigo 
---
 cputlb.c   |  7 +++
 softmmu_template.h | 26 --
 2 files changed, 23 insertions(+), 10 deletions(-)

diff --git a/cputlb.c b/cputlb.c
index aa9cc17..87d09c8 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -424,7 +424,7 @@ void tlb_set_page_with_attrs(CPUState *cpu, target_ulong 
vaddr,
 if ((memory_region_is_ram(section->mr) && section->readonly)
 || memory_region_is_romd(section->mr)) {
 /* Write access calls the I/O callback.  */
-te->addr_write = address | TLB_MMIO;
+address |= TLB_MMIO;
 } else if (memory_region_is_ram(section->mr)
&& cpu_physical_memory_is_clean(section->mr->ram_addr
+ xlat)) {
@@ -437,11 +437,10 @@ void tlb_set_page_with_attrs(CPUState *cpu, target_ulong 
vaddr,
 if (cpu_physical_memory_is_excl(section->mr->ram_addr + xlat)) {
 /* There is at least one vCPU that has flagged the address as
  * exclusive. */
-te->addr_write = address | TLB_EXCL;
-} else {
-te->addr_write = address;
+address |= TLB_EXCL;
 }
 }
+te->addr_write = address;
 } else {
 te->addr_write = -1;
 }
diff --git a/softmmu_template.h b/softmmu_template.h
index 267c52a..c54bdc9 100644
--- a/softmmu_template.h
+++ b/softmmu_template.h
@@ -476,7 +476,7 @@ void helper_le_st_name(CPUArchState *env, target_ulong 
addr, DATA_TYPE val,
 
 /* Handle an IO access or exclusive access.  */
 if (unlikely(tlb_addr & ~TARGET_PAGE_MASK)) {
-if ((tlb_addr & ~TARGET_PAGE_MASK) == TLB_EXCL) {
+if (tlb_addr & TLB_EXCL) {
 CPUIOTLBEntry *iotlbentry = &env->iotlb[mmu_idx][index];
 CPUState *cpu = ENV_GET_CPU(env);
 CPUClass *cc = CPU_GET_CLASS(cpu);
@@ -500,8 +500,15 @@ void helper_le_st_name(CPUArchState *env, target_ulong 
addr, DATA_TYPE val,
 }
 }
 
-glue(helper_le_st_name, _do_ram_access)(env, val, addr, oi,
-mmu_idx, index, retaddr);
+if (tlb_addr & ~(TARGET_PAGE_MASK | TLB_EXCL)) { /* MMIO access */
+glue(helper_le_st_name, _do_mmio_access)(env, val, addr, oi,
+ mmu_idx, index,
+ retaddr);
+} else {
+glue(helper_le_st_name, _do_ram_access)(env, val, addr, oi,
+mmu_idx, index,
+retaddr);
+}
 
 lookup_and_reset_cpus_ll_addr(hw_addr, DATA_SIZE);
 
@@ -620,7 +627,7 @@ void helper_be_st_name(CPUArchState *env, target_ulong 
addr, DATA_TYPE val,
 
 /* Handle an IO access or exclusive access.  */
 if (unlikely(tlb_addr & ~TARGET_PAGE_MASK)) {
-if ((tlb_addr & ~TARGET_PAGE_MASK) == TLB_EXCL) {
+if (tlb_addr & TLB_EXCL) {
 CPUIOTLBEntry *iotlbentry = &env->iotlb[mmu_idx][index];
 CPUState *cpu = ENV_GET_CPU(env);
 CPUClass *cc = CPU_GET_CLASS(cpu);
@@ -644,8 +651,15 @@ void helper_be_st_name(CPUArchState *env, target_ulong 
addr, DATA_TYPE val,
 }
 }
 
-glue(helper_be_st_name, _do_ram_access)(env, val, addr, oi,
-mmu_idx, index, retaddr);
+if (tlb_addr & ~(TARGET_PAGE_MASK | TLB_EXCL)) { /* MMIO access */
+glue(helper_be_st_name, _do_mmio_access)(env, val, addr, oi,
+ mmu_idx, index,
+ retaddr);
+} else {
+glue(helper_be_st_name, _do_ram_access)(env, val, addr, oi,
+mmu_idx, index,
+retaddr);
+}
 
 lookup_and_reset_cpus_ll_addr(hw_addr, DATA_SIZE);
 
-- 
2.7.0

[Qemu-devel] [RFC v7 02/16] softmmu: Simplify helper_*_st_name, wrap unaligned code

2016-01-29 Thread Alvise Rigo

In an attempt to simplify helper_*_st_name, wrap the
do_unaligned_access code in an inline function.
Also remove the goto statement.
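
The byte-wise fallback that gets wrapped here boils down to shifting out one byte at a time. A standalone sketch of the little-endian and big-endian extraction, using hypothetical toy_store_le/toy_store_be helpers that write into a plain buffer instead of calling the byte store helper:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store the low `size` bytes of val at dst, least significant first. */
static void toy_store_le(uint8_t *dst, uint64_t val, size_t size)
{
    assert(size >= 1 && size <= 8);
    for (size_t i = 0; i < size; i++) {
        dst[i] = (uint8_t)(val >> (i * 8));
    }
}

/* Same bytes, most significant first. */
static void toy_store_be(uint8_t *dst, uint64_t val, size_t size)
{
    assert(size >= 1 && size <= 8);
    for (size_t i = 0; i < size; i++) {
        dst[i] = (uint8_t)(val >> ((size - 1 - i) * 8));
    }
}
```

The patch walks the index downwards rather than upwards; the shift expressions are the same, so only the order in which bytes are written differs.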

Based on this work, Alex proposed the following patch series
https://lists.gnu.org/archive/html/qemu-devel/2016-01/msg01136.html
that reduces code duplication of the softmmu_helpers.

Suggested-by: Jani Kokkonen 
Suggested-by: Claudio Fontana 
Signed-off-by: Alvise Rigo 
---
 softmmu_template.h | 96 ++
 1 file changed, 60 insertions(+), 36 deletions(-)

diff --git a/softmmu_template.h b/softmmu_template.h
index 208f808..7029a03 100644
--- a/softmmu_template.h
+++ b/softmmu_template.h
@@ -370,6 +370,32 @@ static inline void glue(io_write, SUFFIX)(CPUArchState 
*env,
  iotlbentry->attrs);
 }
 
+static inline void glue(helper_le_st_name, _do_unl_access)(CPUArchState *env,
+   DATA_TYPE val,
+   target_ulong addr,
+   TCGMemOpIdx oi,
+   unsigned mmu_idx,
+   uintptr_t retaddr)
+{
+int i;
+
+if ((get_memop(oi) & MO_AMASK) == MO_ALIGN) {
+cpu_unaligned_access(ENV_GET_CPU(env), addr, MMU_DATA_STORE,
+ mmu_idx, retaddr);
+}
+/* XXX: not efficient, but simple */
+/* Note: relies on the fact that tlb_fill() does not remove the
+ * previous page from the TLB cache.  */
+for (i = DATA_SIZE - 1; i >= 0; i--) {
+/* Little-endian extract.  */
+uint8_t val8 = val >> (i * 8);
+/* Note the adjustment at the beginning of the function.
+   Undo that for the recursion.  */
+glue(helper_ret_stb, MMUSUFFIX)(env, addr + i, val8,
+oi, retaddr + GETPC_ADJ);
+}
+}
+
 void helper_le_st_name(CPUArchState *env, target_ulong addr, DATA_TYPE val,
TCGMemOpIdx oi, uintptr_t retaddr)
 {
@@ -399,7 +425,8 @@ void helper_le_st_name(CPUArchState *env, target_ulong 
addr, DATA_TYPE val,
 if (unlikely(tlb_addr & ~TARGET_PAGE_MASK)) {
 CPUIOTLBEntry *iotlbentry;
 if ((addr & (DATA_SIZE - 1)) != 0) {
-goto do_unaligned_access;
+glue(helper_le_st_name, _do_unl_access)(env, val, addr, oi,
+mmu_idx, retaddr);
 }
 iotlbentry = &env->iotlb[mmu_idx][index];
 
@@ -414,23 +441,8 @@ void helper_le_st_name(CPUArchState *env, target_ulong 
addr, DATA_TYPE val,
 if (DATA_SIZE > 1
 && unlikely((addr & ~TARGET_PAGE_MASK) + DATA_SIZE - 1
  >= TARGET_PAGE_SIZE)) {
-int i;
-do_unaligned_access:
-if ((get_memop(oi) & MO_AMASK) == MO_ALIGN) {
-cpu_unaligned_access(ENV_GET_CPU(env), addr, MMU_DATA_STORE,
- mmu_idx, retaddr);
-}
-/* XXX: not efficient, but simple */
-/* Note: relies on the fact that tlb_fill() does not remove the
- * previous page from the TLB cache.  */
-for (i = DATA_SIZE - 1; i >= 0; i--) {
-/* Little-endian extract.  */
-uint8_t val8 = val >> (i * 8);
-/* Note the adjustment at the beginning of the function.
-   Undo that for the recursion.  */
-glue(helper_ret_stb, MMUSUFFIX)(env, addr + i, val8,
-oi, retaddr + GETPC_ADJ);
-}
+glue(helper_le_st_name, _do_unl_access)(env, val, addr, oi,
+mmu_idx, retaddr);
 return;
 }
 
@@ -450,6 +462,32 @@ void helper_le_st_name(CPUArchState *env, target_ulong 
addr, DATA_TYPE val,
 }
 
 #if DATA_SIZE > 1
+static inline void glue(helper_be_st_name, _do_unl_access)(CPUArchState *env,
+   DATA_TYPE val,
+   target_ulong addr,
+   TCGMemOpIdx oi,
+   unsigned mmu_idx,
+   uintptr_t retaddr)
+{
+int i;
+
+if ((get_memop(oi) & MO_AMASK) == MO_ALIGN) {
+cpu_unaligned_access(ENV_GET_CPU(env), addr, MMU_DATA_STORE,
+ mmu_idx, retaddr);
+}
+/* XXX: not efficient, but simple */
+/* Note: relies on the fact that tlb_fill() does not remove the
+ * previous page from the TLB cache.  */
+for (i = DATA_SIZE - 1; i >= 0; i--) {
+/* Big-endian extract.  */
+uint8_t val8 = val >> (((DATA_SIZE - 1) * 8) - (i * 8));
+

[Qemu-devel] [RFC v7 07/16] softmmu: Add helpers for a new slowpath

2016-01-29 Thread Alvise Rigo

The new helpers rely on the legacy ones to perform the actual read/write.

The LoadLink helper (helper_ldlink_name) prepares the way for the
following StoreCond operation. It sets the linked address and the size
of the access. The LoadLink helper also updates the TLB entry of the
page involved in the LL/SC to all vCPUs by forcing a TLB flush, so that
the following accesses made by all the vCPUs will follow the slow path.

The StoreConditional helper (helper_stcond_name) returns 1 if the
store has to fail due to a concurrent access to the same page by
another vCPU. A 'concurrent access' can be a store made by *any* vCPU
(although, some implementations allow stores made by the CPU that issued
the LoadLink).
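
The LoadLink/StoreConditional contract described above can be modelled in a few lines. This is a deliberately simplified sketch: it tracks one linked word per vCPU in a toy memory, has no TLB or page granularity, and every Toy* name is hypothetical. It follows the stricter reading where a store by any vCPU, including the one that issued the LoadLink, breaks the link.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NCPUS 2
#define TOY_WORDS 16
#define TOY_RESET_ADDR UINT64_MAX   /* "no LL in progress" marker */

typedef struct ToyMachine {
    uint32_t mem[TOY_WORDS];        /* word-addressed toy memory */
    uint64_t linked[TOY_NCPUS];     /* linked word index per vCPU */
} ToyMachine;

static void toy_init(ToyMachine *m)
{
    for (int i = 0; i < TOY_WORDS; i++) {
        m->mem[i] = 0;
    }
    for (int c = 0; c < TOY_NCPUS; c++) {
        m->linked[c] = TOY_RESET_ADDR;
    }
}

/* LoadLink: remember the address and return the current value. */
static uint32_t toy_ldlink(ToyMachine *m, int cpu, uint64_t idx)
{
    assert(idx < TOY_WORDS);
    m->linked[cpu] = idx;
    return m->mem[idx];
}

/* Plain store: breaks every link on that word. */
static void toy_store(ToyMachine *m, uint64_t idx, uint32_t val)
{
    assert(idx < TOY_WORDS);
    m->mem[idx] = val;
    for (int c = 0; c < TOY_NCPUS; c++) {
        if (m->linked[c] == idx) {
            m->linked[c] = TOY_RESET_ADDR;
        }
    }
}

/* StoreConditional: returns 0 on success, 1 if the store has to fail. */
static int toy_stcond(ToyMachine *m, int cpu, uint64_t idx, uint32_t val)
{
    bool ok = (m->linked[cpu] == idx);

    m->linked[cpu] = TOY_RESET_ADDR;
    if (!ok) {
        return 1;
    }
    toy_store(m, idx, val);
    return 0;
}
```

The real helpers get the same effect by forcing accesses onto the slow path, where the check against the protected range can be made.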

Suggested-by: Jani Kokkonen 
Suggested-by: Claudio Fontana 
Signed-off-by: Alvise Rigo 
---
 cputlb.c|   3 ++
 include/qom/cpu.h   |   5 ++
 softmmu_llsc_template.h | 133 
 softmmu_template.h  |  12 +
 tcg/tcg.h   |  31 +++
 5 files changed, 184 insertions(+)
 create mode 100644 softmmu_llsc_template.h

diff --git a/cputlb.c b/cputlb.c
index f6fb161..ce6d720 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -476,6 +476,8 @@ tb_page_addr_t get_page_addr_code(CPUArchState *env1, 
target_ulong addr)
 
 #define MMUSUFFIX _mmu
 
+/* Generates LoadLink/StoreConditional helpers in softmmu_template.h */
+#define GEN_EXCLUSIVE_HELPERS
 #define SHIFT 0
 #include "softmmu_template.h"
 
@@ -488,6 +490,7 @@ tb_page_addr_t get_page_addr_code(CPUArchState *env1, 
target_ulong addr)
 #define SHIFT 3
 #include "softmmu_template.h"
 #undef MMUSUFFIX
+#undef GEN_EXCLUSIVE_HELPERS
 
 #define MMUSUFFIX _cmmu
 #undef GETPC_ADJ
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 682c81d..6f6c1c0 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -351,10 +351,15 @@ struct CPUState {
  */
 bool throttle_thread_scheduled;
 
+/* Used by the atomic insn translation backend. */
+bool ll_sc_context;
 /* vCPU's exclusive addresses range.
  * The address is set to EXCLUSIVE_RESET_ADDR if the vCPU is not
  * in the middle of a LL/SC. */
 struct Range excl_protected_range;
+/* Used to carry the SC result but also to flag a normal store access made
+ * by a stcond (see softmmu_template.h). */
+bool excl_succeeded;
 
 /* Note that this is accessed at the start of every TB via a negative
offset from AREG0.  Leave this field at the end so as to make the
diff --git a/softmmu_llsc_template.h b/softmmu_llsc_template.h
new file mode 100644
index 000..101f5e8
--- /dev/null
+++ b/softmmu_llsc_template.h
@@ -0,0 +1,133 @@
+/*
+ *  Software MMU support (exclusive load/store operations)
+ *
+ * Generate helpers used by TCG for qemu_ldlink/stcond ops.
+ *
+ * Included from softmmu_template.h only.
+ *
+ * Copyright (c) 2015 Virtual Open Systems
+ *
+ * Authors:
+ *  Alvise Rigo 
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see .
+ */
+
+/* This template does not generate together the le and be version, but only one
+ * of the two depending on whether BIGENDIAN_EXCLUSIVE_HELPERS has been set.
+ * The same nomenclature as softmmu_template.h is used for the exclusive
+ * helpers.  */
+
+#ifdef BIGENDIAN_EXCLUSIVE_HELPERS
+
+#define helper_ldlink_name  glue(glue(helper_be_ldlink, USUFFIX), MMUSUFFIX)
+#define helper_stcond_name  glue(glue(helper_be_stcond, SUFFIX), MMUSUFFIX)
+#define helper_ld glue(glue(helper_be_ld, USUFFIX), MMUSUFFIX)
+#define helper_st glue(glue(helper_be_st, SUFFIX), MMUSUFFIX)
+
+#else /* LE helpers + 8bit helpers (generated only once for both LE and BE) */
+
+#if DATA_SIZE > 1
+#define helper_ldlink_name  glue(glue(helper_le_ldlink, USUFFIX), MMUSUFFIX)
+#define helper_stcond_name  glue(glue(helper_le_stcond, SUFFIX), MMUSUFFIX)
+#define helper_ld glue(glue(helper_le_ld, USUFFIX), MMUSUFFIX)
+#define helper_st glue(glue(helper_le_st, SUFFIX), MMUSUFFIX)
+#else /* DATA_SIZE <= 1 */
+#define helper_ldlink_name  glue(glue(helper_ret_ldlink, USUFFIX), MMUSUFFIX)
+#define helper_stcond_name  glue(glue(helper_ret_stcond, SUFFIX), MMUSUFFIX)
+#define helper_ld glue(glue(helper_ret_ld, USUFFIX), MMUSUFFIX)
+#define helper_st glue(glue(helper_ret_st,

[Qemu-devel] [RFC v7 06/16] qom: cpu: Add CPUClass hooks for exclusive range

2016-01-29 Thread Alvise Rigo

The excl_protected_range is a hwaddr range set by the VCPU at the
execution of a LoadLink instruction. If a normal access writes to this
range, the corresponding StoreCond will fail.

Each architecture can set the exclusive range when issuing the LoadLink
operation through a CPUClass hook. This comes in handy to emulate, for
instance, the exclusive monitor implemented in some ARM architectures
(more precisely, the Exclusive Reservation Granule).

In addition, add another CPUClass hook called to decide whether a
StoreCond has to fail or not.
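
A compact sketch of the two hooks, using a local ToyRange instead of QEMU's struct Range and plain uint64_t instead of hwaddr. The granule variant is an assumption about how a target with a reservation granule might install its own hook; it is not part of this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct ToyRange {
    uint64_t begin;   /* first protected byte */
    uint64_t end;     /* one past the last protected byte */
} ToyRange;

/* Default behaviour: protect exactly [addr, addr + size). */
static void toy_set_excl_range(ToyRange *r, uint64_t addr, uint64_t size)
{
    assert(size > 0 && addr <= UINT64_MAX - size);
    r->begin = addr;
    r->end = addr + size;
}

/* Hypothetical target override: protect the whole aligned granule
 * that contains addr. granule must be a power of two. */
static void toy_set_excl_granule(ToyRange *r, uint64_t addr, uint64_t granule)
{
    assert(granule != 0 && (granule & (granule - 1)) == 0);
    r->begin = addr & ~(granule - 1);
    r->end = r->begin + granule;
}

/* True if [addr, addr + size) lies entirely inside the range. */
static bool toy_valid_excl_access(const ToyRange *r, uint64_t addr,
                                  uint64_t size)
{
    return r->begin <= addr && addr + size <= r->end;
}
```

Keeping the range setter behind a hook lets each architecture choose between the exact access and a wider granule without touching the common check.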

Suggested-by: Jani Kokkonen 
Suggested-by: Claudio Fontana 
Signed-off-by: Alvise Rigo 
---
 include/qom/cpu.h | 15 +++
 qom/cpu.c | 20 
 2 files changed, 35 insertions(+)

diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 2e5229d..682c81d 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -29,6 +29,7 @@
 #include "qemu/queue.h"
 #include "qemu/thread.h"
 #include "qemu/typedefs.h"
+#include "qemu/range.h"
 
 typedef int (*WriteCoreDumpFunction)(const void *buf, size_t size,
  void *opaque);
@@ -183,6 +184,12 @@ typedef struct CPUClass {
 void (*cpu_exec_exit)(CPUState *cpu);
 bool (*cpu_exec_interrupt)(CPUState *cpu, int interrupt_request);
 
+/* Atomic instruction handling */
+void (*cpu_set_excl_protected_range)(CPUState *cpu, hwaddr addr,
+ hwaddr size);
+int (*cpu_valid_excl_access)(CPUState *cpu, hwaddr addr,
+ hwaddr size);
+
 void (*disas_set_info)(CPUState *cpu, disassemble_info *info);
 } CPUClass;
 
@@ -219,6 +226,9 @@ struct kvm_run;
 #define TB_JMP_CACHE_BITS 12
 #define TB_JMP_CACHE_SIZE (1 << TB_JMP_CACHE_BITS)
 
+/* Atomic insn translation TLB support. */
+#define EXCLUSIVE_RESET_ADDR ULLONG_MAX
+
 /**
  * CPUState:
  * @cpu_index: CPU index (informative).
@@ -341,6 +351,11 @@ struct CPUState {
  */
 bool throttle_thread_scheduled;
 
+/* vCPU's exclusive addresses range.
+ * The address is set to EXCLUSIVE_RESET_ADDR if the vCPU is not
+ * in the middle of a LL/SC. */
+struct Range excl_protected_range;
+
 /* Note that this is accessed at the start of every TB via a negative
offset from AREG0.  Leave this field at the end so as to make the
(absolute value) offset as small as possible.  This reduces code
diff --git a/qom/cpu.c b/qom/cpu.c
index 8f537a4..a5d360c 100644
--- a/qom/cpu.c
+++ b/qom/cpu.c
@@ -203,6 +203,24 @@ static bool cpu_common_exec_interrupt(CPUState *cpu, int 
int_req)
 return false;
 }
 
+static void cpu_common_set_excl_range(CPUState *cpu, hwaddr addr, hwaddr size)
+{
+cpu->excl_protected_range.begin = addr;
+cpu->excl_protected_range.end = addr + size;
+}
+
+static int cpu_common_valid_excl_access(CPUState *cpu, hwaddr addr, hwaddr 
size)
+{
+/* Check if the excl range completely covers the access */
+if (cpu->excl_protected_range.begin <= addr &&
+cpu->excl_protected_range.end >= addr + size) {
+
+return 1;
+}
+
+return 0;
+}
+
 void cpu_dump_state(CPUState *cpu, FILE *f, fprintf_function cpu_fprintf,
 int flags)
 {
@@ -355,6 +373,8 @@ static void cpu_class_init(ObjectClass *klass, void *data)
 k->cpu_exec_enter = cpu_common_noop;
 k->cpu_exec_exit = cpu_common_noop;
 k->cpu_exec_interrupt = cpu_common_exec_interrupt;
+k->cpu_set_excl_protected_range = cpu_common_set_excl_range;
+k->cpu_valid_excl_access = cpu_common_valid_excl_access;
 dc->realize = cpu_common_realizefn;
 /*
  * Reason: CPUs still need special care by board code: wiring up
-- 
2.7.0

Re: [Qemu-devel] [PATCH v13 00/10] Block replication for continuous checkpoints

2016-01-29 Thread Dr. David Alan Gilbert

* Wen Congyang (we...@cn.fujitsu.com) wrote:
> On 01/27/2016 07:03 PM, Dr. David Alan Gilbert wrote:
> > Hi,
> >   I've got a block error if I kill the secondary.
> > 
> > Start both primary & secondary
> > kill -9 secondary qemu
> > x_colo_lost_heartbeat on primary
> > 
> > The guest sees a block error and the ext4 root switches to read-only.
> > 
> > I gdb'd the primary with a breakpoint on quorum_report_bad; see
> > backtrace below.
> > (This is based on colo-v2.4-periodic-mode of the framework
> > code with the block and network proxy merged in; so it could be my
> > merging but I don't think so ?)
> > 
> > 
> > (gdb) where
> > #0  quorum_report_bad (node_name=0x7f2946a0892c "node0", ret=-5, 
> > acb=0x7f2946cb3910, acb=0x7f2946cb3910)
> > at /root/colo/jan-2016/qemu/block/quorum.c:222
> > #1  0x7f2943b23058 in quorum_aio_cb (opaque=, 
> > ret=)
> > at /root/colo/jan-2016/qemu/block/quorum.c:315
> > #2  0x7f2943b311be in bdrv_co_complete (acb=0x7f2946cb3f60) at 
> > /root/colo/jan-2016/qemu/block/io.c:2122
> > #3  0x7f2943ae777d in aio_bh_call (bh=) at 
> > /root/colo/jan-2016/qemu/async.c:64
> > #4  aio_bh_poll (ctx=ctx@entry=0x7f2945b771d0) at 
> > /root/colo/jan-2016/qemu/async.c:92
> > #5  0x7f2943af5090 in aio_dispatch (ctx=0x7f2945b771d0) at 
> > /root/colo/jan-2016/qemu/aio-posix.c:305
> > #6  0x7f2943ae756e in aio_ctx_dispatch (source=, 
> > callback=, 
> > user_data=) at /root/colo/jan-2016/qemu/async.c:231
> > #7  0x7f293b84a79a in g_main_context_dispatch () from 
> > /lib64/libglib-2.0.so.0
> > #8  0x7f2943af3a00 in glib_pollfds_poll () at 
> > /root/colo/jan-2016/qemu/main-loop.c:211
> > #9  os_host_main_loop_wait (timeout=) at 
> > /root/colo/jan-2016/qemu/main-loop.c:256
> > #10 main_loop_wait (nonblocking=) at 
> > /root/colo/jan-2016/qemu/main-loop.c:504
> > #11 0x7f29438529ee in main_loop () at /root/colo/jan-2016/qemu/vl.c:1945
> > #12 main (argc=, argv=, envp=) 
> > at /root/colo/jan-2016/qemu/vl.c:4707
> > 
> > (gdb) p s->num_children
> > $1 = 2
> > (gdb) p acb->success_count
> > $2 = 0
> > (gdb) p acb->is_read
> > $5 = false
> 
> Sorry for the late reply.

No problem.

> What is the value of acb->count?

(gdb) p acb->count
$1 = 1

> If secondary host is down, you should remove quorum's children.1. Otherwise,
> you will get I/O error event.

Is that safe?  If the secondary fails, do you always have time to issue the
command to remove the children.1 before the guest sees the error?

Anyway, I tried removing children.1 but it segfaults now, I guess the 
replication is unhappy:

(qemu) x_block_change colo-disk0 -d children.1
(qemu) x_colo_lost_heartbeat 

12973 Segmentation fault  (core dumped) 
./try/x86_64-softmmu/qemu-system-x86_64 -enable-kvm $console_param -S -boot c 
-m 4080 -smp 4 -machine pc-i440fx-2.5,accel=kvm -name debug-threads=on -trace 
events=trace-file -device virtio-rng-pci $block_param $net_param

#0  0x7f0a398a864c in bdrv_stop_replication (bs=0x7f0a3b0a8430, 
failover=true, errp=0x7fff6a5c3420)
at /root/colo/jan-2016/qemu/block.c:4426

(gdb) p drv
$1 = (BlockDriver *) 0x5d2a

  it looks like the whole of bs is bogus.

#1  0x7f0a398d87f6 in quorum_stop_replication (bs=, 
failover=, 
errp=) at /root/colo/jan-2016/qemu/block/quorum.c:1213

(gdb) p s->replication_index
$3 = 1

I guess quorum_del_child needs to stop replication before it removes the child?
(although it would have to be careful not to block on the dead nbd).

#2  0x7f0a398a8901 in bdrv_stop_replication_all 
(failover=failover@entry=true, errp=errp@entry=0x7fff6a5c3478)
at /root/colo/jan-2016/qemu/block.c:4504
#3  0x7f0a3984b0af in primary_vm_do_failover () at 
/root/colo/jan-2016/qemu/migration/colo.c:144
#4  colo_do_failover (s=) at 
/root/colo/jan-2016/qemu/migration/colo.c:162
#5  0x7f0a3989d7fd in aio_bh_call (bh=) at 
/root/colo/jan-2016/qemu/async.c:64
#6  aio_bh_poll (ctx=ctx@entry=0x7f0a3a6c21d0) at 
/root/colo/jan-2016/qemu/async.c:92
#7  0x7f0a398ab110 in aio_dispatch (ctx=0x7f0a3a6c21d0) at 
/root/colo/jan-2016/qemu/aio-posix.c:305
#8  0x7f0a3989d5ee in aio_ctx_dispatch (source=, 
callback=, 
user_data=) at /root/colo/jan-2016/qemu/async.c:231
#9  0x7f0a3160079a in g_main_context_dispatch () from 
/lib64/libglib-2.0.so.0
#10 0x7f0a398a9a80 in glib_pollfds_poll () at 
/root/colo/jan-2016/qemu/main-loop.c:211
#11 os_host_main_loop_wait (timeout=) at 
/root/colo/jan-2016/qemu/main-loop.c:256
#12 main_loop_wait (nonblocking=) at 
/root/colo/jan-2016/qemu/main-loop.c:504
#13 0x7f0a396089ee in main_loop () at /root/colo/jan-2016/qemu/vl.c:1945
#14 main (argc=, argv=, envp=) at 
/root/colo/jan-2016/qemu/vl.c:4707

Dave

> Thanks
> Wen Congyang
> 
> > 
> > (qemu) info block
> > colo-disk0 (#block080): json:{"children": [{"driver": "raw", "file": 
> > {"driver": "file", "filename": "/root/colo/bugzilla.raw"}}, {"driver": 
> > "replication", "mode": "primary", "file": {"port": "8889", "host": 
>

Re: [Qemu-devel] [PATCH 0/8] ivshmem: test msi=off, remove CharDriver

2016-01-29 Thread Marc-André Lureau

ping

On Thu, Jan 7, 2016 at 4:52 PM, Marc-André Lureau
 wrote:
> Hi
>
> On Mon, Dec 21, 2015 at 12:30 PM,   wrote:
>> From: Marc-André Lureau 
>>
>> This is an ivshmem series with various bits:
>> - add a test for a recently introduced regression
>> - the fix is included in the series but was sent separately to cc -stable
>> - fix some test leaks
>> - get rid of CharDriver usage for eventfd
>> - simplify event callback
>>
>
> Adding a few people in CC who might help with reviewing.
>
> thanks
>
>> Marc-André Lureau (8):
>>   ivshmem: no need for opaque argument
>>   ivshmem: remove redundant assignment, fix crash with msi=off
>>   ivshmem-test: leak fixes
>>   libqos: remove some leaks
>>   ivshmem-test: test both msi & irq cases
>>   ivshmem: generalize ivshmem_setup_interrupts
>>   ivshmem: use a single eventfd callback, get rid of CharDriver
>>   char: remove qemu_chr_open_eventfd
>>
>>  hw/misc/ivshmem.c | 85 
>> +--
>>  include/sysemu/char.h |  3 --
>>  qemu-char.c   | 13 
>>  tests/ivshmem-test.c  | 81 
>>  tests/libqos/pci.c|  2 ++
>>  5 files changed, 91 insertions(+), 93 deletions(-)
>>
>> --
>> 2.5.0
>>
>>
>
>
>
> --
> Marc-André Lureau



-- 
Marc-André Lureau

Re: [Qemu-devel] [PATCH] target-mips: Stop using uint_fast*_t types in r4k_tlb_t struct

2016-01-29 Thread Aurelien Jarno

On 2016-01-25 17:40, Peter Maydell wrote:
> The r4k_tlb_t structure uses the uint_fast*_t types. Most of these
> uses are in bitfields and are thus pointless, because the bitfield
> itself specifies the width of the type; just use 'unsigned int'
> instead. (On glibc uint_fast16_t is defined as either 32 or 64 bits,
> so we know the code is not reliant on it being exactly 16 bits.)
> There is also one use of uint_fast8_t, which we replace with uint8_t,
> because both are exactly 8 bits on glibc and this is the only
> place outside the softfloat code which uses an int_fast*_t type.
> 
> Signed-off-by: Peter Maydell 
> ---
> I'm going to have a go at getting rid of the int_fast16_t usage
> in the softfloat code too, but in the meantime this is an
> independent cleanup.
> 
>  target-mips/cpu.h | 26 +-
>  1 file changed, 13 insertions(+), 13 deletions(-)

Thanks for the cleanup.
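
A minimal sketch of the bitfield argument, not taken from the patch. The
struct and function names are hypothetical and the layout is not the real
r4k_tlb_t. It shows that a bitfield stores exactly its declared width
whatever the base type, while uint_fast16_t only guarantees at least 16 bits:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a TLB entry; not the real r4k_tlb_t layout. */
struct demo_tlb {
    unsigned int asid:8;   /* holds 0..255 regardless of sizeof(unsigned int) */
    unsigned int g:1;      /* holds 0..1 */
    uint8_t flags;         /* plain member with an exact 8-bit width */
};

/* Store a value into the 8-bit field and return what survived. */
static unsigned int demo_store_asid(unsigned int value)
{
    struct demo_tlb t = { 0 };

    t.asid = value;        /* unsigned bitfield: reduced modulo 2^8 */
    return t.asid;
}

/* Report the real width of uint_fast16_t on this platform. */
static unsigned int demo_fast16_bits(void)
{
    return (unsigned int)(sizeof(uint_fast16_t) * 8);
}
```

On glibc demo_fast16_bits() is expected to report 32 or 64, which is why
code that worked with uint_fast16_t cannot have depended on exactly 16 bits.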

Reviewed-by: Aurelien Jarno 

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net

Re: [Qemu-devel] [iGVT-g] VFIO based vGPU(was Re: [Announcement] 2015-Q3 release of XenGT - a Mediated ...)

2016-01-29 Thread Jike Song

On 01/29/2016 03:20 PM, Jike Song wrote:
> This discussion becomes a little difficult for a newbie like me :(
> 
> On 01/28/2016 11:23 PM, Alex Williamson wrote:
>> On Thu, 2016-01-28 at 14:00 +0800, Jike Song wrote:
>>> On 01/28/2016 12:19 AM, Alex Williamson wrote:
>>>> On Wed, 2016-01-27 at 13:43 +0800, Jike Song wrote:
>>> {snip}
>>>  
>>>>> Had a look at eventfd, I would say yes, technically we are able to
>>>>> achieve the goal: introduce a fd, with fop->{read|write} defined in KVM,
>>>>> call into vgpu device-model, also an iodev registered for a MMIO GPA
>>>>> range to invoke the fop->{read|write}.  I just didn't understand why
>>>>> userspace can't register an iodev via API directly.
>>>>
>>>> Please elaborate on how it would work via iodev.
>>>>
>>>  
>>> QEMU forwards BAR0 write to the bus driver, in the bus driver, if
>>> found that MEM bit is enabled, register an iodev to KVM: with an
>>> ops:
>>>  
>>> const struct kvm_io_device_ops trap_mmio_ops = {
>>> .read   = kvmgt_guest_mmio_read,
>>> .write  = kvmgt_guest_mmio_write,
>>> };
>>>  
>>> I may not be able to illustrate it clearly with descriptions, but this
>>> should not be a problem; thanks to your explanation, I can understand
>>> and adopt it for KVMGT.
>>
>> You're still crossing modules with direct callbacks, right?  What's the
>> advantage versus using the file descriptor + offset approach which could
>> offer the same performance and improve KVM overall by creating a new
>> option for generically handling MMIO?
>>
> 
> Yes, the method I gave above is the current way: calling kvm_io_device_ops
> from KVM hypervisor, and then going to vgpu device-model directly.
> 
> From KVMGT's side this is almost the same as what you suggested, I don't
> think now we have a problem here. I will adopt your suggestion.
> 
>>>>> Besides, this doesn't necessarily require another thread, right?
>>>>> I guess it can be within the VCPU thread?
>>>>
>>>> I would think so too, the vcpu is blocked on the MMIO access, we should
>>>> be able to service it in that context.  I hope.
>>>>
>>>  
>>> Thanks for confirmation.
>>>  
>>>>> And this brought another question: except the vfio bus driver and
>>>>> iommu backend (and the page_track utility used for guest memory
>>>>> write-protection),
>>>>> is KVMGT allowed to call into kvm.ko (or modify it)? Though we are
>>>>> becoming less and less willing to do that with VFIO, it's still better
>>>>> to know that before going wrong.
>>>>
>>>> kvm and vfio are separate modules, for the most part, they know nothing
>>>> about each other and have no hard dependencies between them.  We do have
>>>> various accelerations we can use to avoid paths through userspace, but
>>>> these are all via APIs that are agnostic of the party on the other end.
>>>> For example, vfio signals interrupts through eventfds and has no concept
>>>> of whether that eventfd terminates in userspace or into an irqfd in KVM.
>>>> vfio supports direct access to device MMIO regions via mmaps, but vfio
>>>> has no idea if that mmap gets directly mapped into a VM address space.
>>>> Even with posted interrupts, we've introduced an irq bypass manager
>>>> allowing interrupt producers and consumers to register independently to
>>>> form a connection without directly knowing anything about the other
>>>> module.  That sort of proper software layering needs to continue.  It
>>>> would be wrong for a vfio bus driver to assume KVM is the user and
>>>> directly call into KVM interfaces.  Thanks,
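
The eventfd decoupling described above can be sketched from userspace as
follows. This is only an illustration, assuming Linux and the eventfd(2)
interface from glibc; demo_signal and demo_consume are hypothetical names and
nothing here is vfio or KVM code. The producer writes to the fd without
knowing who is reading it:

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Producer side: add n to the eventfd counter. Returns 0 on success. */
static int demo_signal(int efd, uint64_t n)
{
    return write(efd, &n, sizeof(n)) == (ssize_t)sizeof(n) ? 0 : -1;
}

/* Consumer side: fetch the accumulated count, which resets it to zero.
 * The consumer could be userspace or anything else holding the fd. */
static int demo_consume(int efd, uint64_t *out)
{
    return read(efd, out, sizeof(*out)) == (ssize_t)sizeof(*out) ? 0 : -1;
}
```

A caller would create the descriptor with eventfd(0, 0) and hand it to both
sides; neither side needs to know the identity of the other.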
  
>>>  
>>> I understand and agree with your point, it's bad if the bus driver
>>> assume KVM is the user and/or call into KVM interfaces.
>>>  
>>> However, the vgpu device-model, in intel case also a part of i915 driver,
>>> will always need to call some hypervisor-specific interfaces.
>>
>> No, think differently.
>>
>>> For example, when a guest gfx driver submits GPU commands, the device-model
>>> may want to scan it for security or whatever-else purpose:
>>>  
>>> - get a GPA (from GPU page tables)
>>> - want to read 16 bytes from that GPA
>>> - call hypervisor-specific read_gpa() method
>>> - for Xen, the GPA belongs to a foreign domain, it must find
>>>   a way to map & read it - beyond our scope here;
>>> - for KVM, the GPA can be converted to HVA, copy_from_user (if
>>>   called from vcpu thread) or access_remote_vm (if called from
>>>   other threads);
>>>  
>>> Please note that this is not from the vfio bus driver, but from the vgpu
>>> device-model; also this is not DMA addr from GPU tables, but real GPA.
>>
>> This is exactly why we're proposing that the vfio IOMMU interface be
>> used as a database of guest translations. 
>> The type1 IOMMU model in QEMU
>> maps all of guest memory through the IOMMU, in the vGPU model type1 is
>> simply collecting these and they map GPA to process virtual memory.
> 
> GPA to HVA mappings are maintained in

Re: [Qemu-devel] [PATCH 0/4] fpu: Remove use of int_fast*_t types

2016-01-29 Thread Aurelien Jarno

On 2016-01-26 11:30, Peter Maydell wrote:
> This patchset removes the uses of int_fast*_t types from the
> softfloat code:
>  * the return types for the "convert to 16 bit integer" functions
>are changed to int16_t
>  * uses of int_fast*_t for a shift count or an exponent value
>are changed to int
> 
> Basically, where the type was being used to mean what should
> logically be exactly 16 bits we use int16_t; where it was just
> being used for a value which isn't inherently 16 bits wide
> we switch to plain int.
> 
> Compatibility note: both these changes match the logical definition
> of int_fast*_t so if the code was not previously buggily relying
> on the width it happened to be they will not introduce any new bugs.
> In practice on glibc int_fast16_t is 32-bits on 32-bit platforms
> and 64-bits on 64-bit platforms so we are changing the underlying
> type size. I have tested by running a bunch of ARM regression
> tests with 'risu', so I'm pretty happy this doesn't cause problems.
> 
> The final patch removes some back-compat defines from osdep.h;
> it depends on both the earlier patches in this series and also
> on the target-mips patch I sent out yesterday:
>   http://patchwork.ozlabs.org/patch/572843/
> (there are no other uses of the int_fast* types in QEMU.)
> 
> thanks
> -- PMM
> 
> Peter Maydell (4):
>   fpu: Remove use of int_fast16_t in conversions to int16
>   fpu: Use plain 'int' rather than 'int_fast16_t' for shift counts
>   fpu: Use plain 'int' rather than 'int_fast16_t' for exponents
>   osdep.h: Remove int_fast*_t Solaris compatibility code
> 
>  fpu/softfloat-macros.h  |  18 +++---
>  fpu/softfloat.c | 162 
> ++--
>  include/fpu/softfloat.h |  16 ++---
>  include/qemu/osdep.h|   7 ---
>  4 files changed, 104 insertions(+), 99 deletions(-)

Great work.
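
A small sketch of the distinction drawn in the cover letter, with hypothetical
function names that are not part of the softfloat API: a result that is
logically 16 bits is typed int16_t, while a shift count is a plain int.

```c
#include <assert.h>
#include <stdint.h>

/* Saturate a wider integer to the int16_t range. The return type states
 * the 16-bit contract directly instead of relying on int_fast16_t. */
static int16_t demo_saturate_to_int16(int64_t v)
{
    if (v > INT16_MAX) {
        return INT16_MAX;
    }
    if (v < INT16_MIN) {
        return INT16_MIN;
    }
    return (int16_t)v;
}

/* Shift right and fold any bits shifted out into bit 0 (a "sticky" bit).
 * The count is not inherently 16 bits wide, so it is a plain int. */
static uint64_t demo_shift_right_sticky(uint64_t a, int count)
{
    assert(count >= 0);
    if (count == 0) {
        return a;
    }
    if (count < 64) {
        return (a >> count) | ((a << (64 - count)) != 0);
    }
    return a != 0;
}
```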

Reviewed-by: Aurelien Jarno 

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net

[Qemu-devel] [RFC v7 15/16] target-arm: cpu64: use custom set_excl hook

2016-01-29 Thread Alvise Rigo

On AArch64, the LDXP/STXP instructions can perform exclusive accesses of
up to 128 bits. However, due to a softmmu limitation, such wide
accesses are not allowed.

To work around this limitation, we need to support LoadLink instructions
that cover at least 128 consecutive bits (see the next patch for more
details).

Suggested-by: Jani Kokkonen 
Suggested-by: Claudio Fontana 
Signed-off-by: Alvise Rigo 
---
 target-arm/cpu64.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/target-arm/cpu64.c b/target-arm/cpu64.c
index cc177bb..1d45e66 100644
--- a/target-arm/cpu64.c
+++ b/target-arm/cpu64.c
@@ -287,6 +287,13 @@ static void aarch64_cpu_set_pc(CPUState *cs, vaddr value)
 }
 }
 
+static void aarch64_set_excl_range(CPUState *cpu, hwaddr addr, hwaddr size)
+{
+cpu->excl_protected_range.begin = addr;
+/* At least cover 128 bits for a STXP access (two paired doublewords case)*/
+cpu->excl_protected_range.end = addr + 16;
+}
+
 static void aarch64_cpu_class_init(ObjectClass *oc, void *data)
 {
 CPUClass *cc = CPU_CLASS(oc);
@@ -297,6 +304,7 @@ static void aarch64_cpu_class_init(ObjectClass *oc, void *data)
 cc->gdb_write_register = aarch64_cpu_gdb_write_register;
 cc->gdb_num_core_regs = 34;
 cc->gdb_core_xml_file = "aarch64-core.xml";
+cc->cpu_set_excl_protected_range = aarch64_set_excl_range;
 }
 
 static void aarch64_cpu_register(const ARMCPUInfo *info)
-- 
2.7.0
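
A stand-alone model of what the hook above sets up, offered only as a sketch:
the demo_ names and the reset marker value are hypothetical, and the overlap
test is a simplified stand-in for the range check used elsewhere in the
series. The reservation always spans 16 bytes, and any store touching it
drops the reservation:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_RESET_ADDR UINT64_MAX  /* hypothetical "no reservation" marker */

struct demo_range {
    uint64_t begin;
    uint64_t end;       /* exclusive upper bound */
};

/* Same idea as the hook: always reserve 16 bytes (128 bits). */
static void demo_set_excl_range(struct demo_range *r, uint64_t addr)
{
    r->begin = addr;
    r->end = addr + 16;
}

/* True if the store [addr, addr + size) overlaps the reservation. */
static bool demo_store_hits_range(const struct demo_range *r,
                                  uint64_t addr, uint64_t size)
{
    if (r->begin == DEMO_RESET_ADDR) {
        return false;
    }
    return addr < r->end && r->begin < addr + size;
}

/* A conflicting store invalidates the reservation. */
static void demo_store(struct demo_range *r, uint64_t addr, uint64_t size)
{
    if (demo_store_hits_range(r, addr, size)) {
        r->begin = DEMO_RESET_ADDR;
    }
}
```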

[Qemu-devel] [RFC v7 16/16] target-arm: aarch64: add atomic instructions

2016-01-29 Thread Alvise Rigo

Use the new LL/SC runtime helpers to handle the aarch64 atomic instructions
in softmmu_llsc_template.h.

The STXP emulation required a dedicated helper to handle the paired
doubleword case.
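
As a rough model of the paired case (a simplified sketch with hypothetical
names, not the helper from this patch): both doublewords are written as
ordinary stores under a single reservation, and the reservation is consumed
whether or not the store succeeds. The real helper additionally rechecks the
state between the two stores, which this model omits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct demo_cpu {
    bool ll_sc_context;   /* set by a preceding load-exclusive */
    uint64_t mem[2];      /* stand-in for the 16 bytes being stored */
};

/* Returns 0 on success and 1 on failure, following the STXP status
 * convention. */
static uint32_t demo_stxp(struct demo_cpu *cpu, uint64_t lo, uint64_t hi)
{
    uint32_t ret = 1;

    if (cpu->ll_sc_context) {
        /* Two normal 64-bit stores covered by one reservation. */
        cpu->mem[0] = lo;
        cpu->mem[1] = hi;
        ret = 0;
    }
    cpu->ll_sc_context = false;   /* the LL/SC pair is closed either way */
    return ret;
}
```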

Suggested-by: Jani Kokkonen 
Suggested-by: Claudio Fontana 
Signed-off-by: Alvise Rigo 
---
 configure  |   6 +-
 target-arm/helper-a64.c|  55 +++
 target-arm/helper-a64.h|   4 ++
 target-arm/op_helper.c |   8 +++
 target-arm/translate-a64.c | 134 -
 5 files changed, 204 insertions(+), 3 deletions(-)

diff --git a/configure b/configure
index 915efcc..38121ff 100755
--- a/configure
+++ b/configure
@@ -5873,9 +5873,11 @@ echo "QEMU_CFLAGS+=$cflags" >> $config_target_mak
 # Use tcg LL/SC tcg backend for exclusive instruction is arm/aarch64
 # softmmus targets
 if test "$arm_tcg_use_llsc" = "yes" ; then
-  if test "$target" = "arm-softmmu" ; then
+  case "$target" in
+arm-softmmu | aarch64-softmmu)
 echo "CONFIG_ARM_USE_LDST_EXCL=y" >> $config_target_mak
-  fi
+;;
+  esac
 fi
 done # for target in $targets
 
diff --git a/target-arm/helper-a64.c b/target-arm/helper-a64.c
index c7bfb4d..dcee66f 100644
--- a/target-arm/helper-a64.c
+++ b/target-arm/helper-a64.c
@@ -26,6 +26,7 @@
 #include "qemu/bitops.h"
 #include "internals.h"
 #include "qemu/crc32c.h"
+#include "tcg/tcg.h"
 #include <zlib.h> /* For crc32 */
 
 /* C2.4.7 Multiply and divide */
@@ -443,3 +444,57 @@ uint64_t HELPER(crc32c_64)(uint64_t acc, uint64_t val, uint32_t bytes)
 /* Linux crc32c converts the output to one's complement.  */
 return crc32c(acc, buf, bytes) ^ 0xffffffff;
 }
+
+#ifdef CONFIG_ARM_USE_LDST_EXCL
+/* STXP emulation for two 64 bit doublewords. We can't use directly two
+ * stcond_i64 accesses, otherwise the first will conclude the LL/SC pair.
+ * Instead, two normal 64-bit accesses are used and the CPUState is
+ * updated accordingly. */
+target_ulong HELPER(stxp_i128)(CPUArchState *env, target_ulong addr,
+   uint64_t vall, uint64_t valh,
+   uint32_t mmu_idx)
+{
+CPUState *cpu = ENV_GET_CPU(env);
+TCGMemOpIdx op;
+target_ulong ret = 0;
+
+if (!cpu->ll_sc_context) {
+cpu->excl_succeeded = false;
+ret = 1;
+goto out;
+}
+
+op = make_memop_idx(MO_BEQ, mmu_idx);
+
+/* According to section C6.6.191 of ARM ARM DDI 0487A.h, the access has to
+ * be quadword aligned.  For the time being, we do not support paired STXPs
+ * to MMIO memory, this will become trivial when the softmmu will support
+ * 128bit memory accesses. */
+if (addr & 0xf) {
+/* TODO: Do unaligned access */
+}
+
+/* Setting excl_succeeded to true will make the store exclusive. */
+cpu->excl_succeeded = true;
+helper_ret_stq_mmu(env, addr, vall, op, GETRA());
+
+if (!cpu->excl_succeeded) {
+ret = 1;
+goto out;
+}
+
+helper_ret_stq_mmu(env, addr + 8, valh, op, GETRA());
+if (!cpu->excl_succeeded) {
+ret = 1;
+} else {
+cpu->excl_succeeded = false;
+}
+
+out:
+/* Unset LL/SC context */
+cpu->ll_sc_context = false;
+cpu->excl_protected_range.begin = EXCLUSIVE_RESET_ADDR;
+
+return ret;
+}
+#endif
diff --git a/target-arm/helper-a64.h b/target-arm/helper-a64.h
index 1d3d10f..c416a83 100644
--- a/target-arm/helper-a64.h
+++ b/target-arm/helper-a64.h
@@ -46,3 +46,7 @@ DEF_HELPER_FLAGS_2(frecpx_f32, TCG_CALL_NO_RWG, f32, f32, ptr)
 DEF_HELPER_FLAGS_2(fcvtx_f64_to_f32, TCG_CALL_NO_RWG, f32, f64, env)
 DEF_HELPER_FLAGS_3(crc32_64, TCG_CALL_NO_RWG_SE, i64, i64, i64, i32)
 DEF_HELPER_FLAGS_3(crc32c_64, TCG_CALL_NO_RWG_SE, i64, i64, i64, i32)
+#ifdef CONFIG_ARM_USE_LDST_EXCL
+/* STXP helper */
+DEF_HELPER_5(stxp_i128, i64, env, i64, i64, i64, i32)
+#endif
diff --git a/target-arm/op_helper.c b/target-arm/op_helper.c
index 404c13b..146fc9a 100644
--- a/target-arm/op_helper.c
+++ b/target-arm/op_helper.c
@@ -34,6 +34,14 @@ static void raise_exception(CPUARMState *env, uint32_t excp,
 cs->exception_index = excp;
 env->exception.syndrome = syndrome;
 env->exception.target_el = target_el;
+#ifdef CONFIG_ARM_USE_LDST_EXCL
+HELPER(atomic_clear)(env);
+/* If the exception happens in the middle of a LL/SC, we need to clear
+ * excl_succeeded to avoid that the normal store following the exception is
+ * wrongly interpreted as exclusive.
+ * */
+cs->excl_succeeded = 0;
+#endif
 cpu_loop_exit(cs);
 }
 
diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index 80f6c20..f34e957 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -37,8 +37,10 @@
 static TCGv_i64 cpu_X[32];
 static TCGv_i64 cpu_pc;
 
+#if !defined(CONFIG_ARM_USE_LDST_EXCL)
 /* Load/store exclusive handling */
 static TCGv_i64 cpu_exclusive_high;
+#endif
 
 static const

Re: [Qemu-devel] [PATCH v5] qom, qmp, hmp, qapi: create qom-type-prop-list for class properties

2016-01-29 Thread Valentin Rakush

Hi Eduardo, hi Daniel,

I checked most of the classes that are used for x86_64 qemu simulation with
this command line:
x86_64-softmmu/qemu-system-x86_64 -qmp tcp:localhost:,server,nowait
-machine pc -cpu core2duo

Here are some of the classes that cannot provide properties with
device_list_properties call:
/object/machine/generic-pc-machine/pc-0.13-machine
/object/bus/i2c-bus
/interface/user-creatable
/object/tls-creds/tls-creds-anon
/object/memory-backend/memory-backend-file
/object/qemu:memory-region
/object/rng-backend/rng-random
/object/tpm-backend/tpm-passthrough
/object/tls-creds/tls-creds-x509
/object/secret

They cannot provide properties because these classes cannot be cast to
TYPE_DEVICE. This is done intentionally because TYPE_DEVICE has its own
properties. Also, TYPE_MACHINE has its own properties of type GlobalProperty.
Here are two ways forward (AFAICS):
- we refactor TYPE_DEVICE and TYPE_MACHINE so they store their properties
in the ObjectClass properties.
- we change device_list_properties so it processes different classes
differently.

The disadvantage of the second approach is that it complicates the code in
favor of simplifying the qapi interface. I like the first approach with
refactoring, although it is more complex. The first approach should put all
properties in the base classes and then use these properties everywhere
(command line help, qmp etc.). The simplest way to do the refactoring
is to move the TYPE_DEVICE properties to ObjectClass and merge them somehow
with the TYPE_MACHINE GlobalProperty. Then we will use these properties for all
other types of classes.

Of course, we can leave device_list_properties as it is and use
qom-type-prop-list instead.
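
To make the first option more concrete, here is a toy model of properties
stored on the class and found by walking the parent chain. It is only a
sketch: the demo_ types, the class names and the property names are
illustrative and do not reproduce the QOM structures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct demo_prop {
    const char *name;
    const char *type;
};

struct demo_class {
    const char *name;
    const struct demo_class *parent;   /* NULL for the root class */
    const struct demo_prop *props;
    size_t nprops;
};

/* Count every property visible from a class, including inherited ones. */
static size_t demo_count_props(const struct demo_class *k)
{
    size_t n = 0;

    for (; k != NULL; k = k->parent) {
        n += k->nprops;
    }
    return n;
}

/* Look a property up by name, searching the class first, then parents. */
static const struct demo_prop *demo_find_prop(const struct demo_class *k,
                                              const char *name)
{
    for (; k != NULL; k = k->parent) {
        for (size_t i = 0; i < k->nprops; i++) {
            if (strcmp(k->props[i].name, name) == 0) {
                return &k->props[i];
            }
        }
    }
    return NULL;
}

static const struct demo_prop demo_object_props[] = {
    { "type", "string" },
};
static const struct demo_prop demo_device_props[] = {
    { "realized", "bool" },
    { "hotpluggable", "bool" },
};
static const struct demo_class demo_object = {
    "object", NULL, demo_object_props, 1
};
static const struct demo_class demo_device = {
    "device", &demo_object, demo_device_props, 2
};
```

With this shape a single listing routine works for any class, which is the
attraction of the refactoring.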

What do you think? Do these design options make sense to you?


Thank you,
Valentin






On Wed, Jan 27, 2016 at 6:23 PM, Daniel P. Berrange 
wrote:

> On Wed, Jan 27, 2016 at 01:09:37PM -0200, Eduardo Habkost wrote:
> > On Tue, Jan 26, 2016 at 10:19:13PM +0000, Daniel P. Berrange wrote:
> > > On Tue, Jan 26, 2016 at 03:26:35PM -0200, Eduardo Habkost wrote:
> > > > On Tue, Jan 26, 2016 at 03:51:21PM +0000, Daniel P. Berrange wrote:
> > > > > On Tue, Jan 26, 2016 at 01:35:38PM -0200, Eduardo Habkost wrote:
> > > > > > On Mon, Jan 25, 2016 at 11:24:47AM +0300, Valentin Rakush wrote:
> > > > > > > This patch adds support for qom-type-prop-list command to list object
> > > > > > > class properties. A later patch will use this functionality to
> > > > > > > implement x86_64-cpu properties.
> > > > > > >
> > > > > > > Signed-off-by: Valentin Rakush 
> > > > > > > Cc: Luiz Capitulino 
> > > > > > > Cc: Eric Blake 
> > > > > > > Cc: Markus Armbruster 
> > > > > > > Cc: Andreas Färber 
> > > > > > > Cc: Daniel P. Berrange 
> > > > > > > Cc: Eduardo Habkost 
> > > > > > > ---
> > > > > > [...]
> > > > > > > diff --git a/qmp.c b/qmp.c
> > > > > > > index 53affe2..baf25c0 100644
> > > > > > > --- a/qmp.c
> > > > > > > +++ b/qmp.c
> > > > > > > @@ -460,6 +460,37 @@ ObjectTypeInfoList *qmp_qom_list_types(bool has_implements,
> > > > > > >  return ret;
> > > > > > >  }
> > > > > > >
> > > > > > > +ObjectPropertyInfoList *qmp_qom_type_prop_list(const char *typename, Error **errp)
> > > > > > > +{
> > > > > > > +ObjectClass *klass;
> > > > > > > +ObjectPropertyInfoList *props = NULL;
> > > > > > > +ObjectProperty *prop;
> > > > > > > +ObjectPropertyIterator iter;
> > > > > > > +
> > > > > > > +klass = object_class_by_name(typename);
> > > > > > > +if (!klass) {
> > > > > > > +error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
> > > > > > > +  "Object class '%s' not found", typename);
> > > > > > > +return NULL;
> > > > > > > +}
> > > > > > > +
> > > > > > > +object_class_property_iter_init(&iter, klass);
> > > > > > > +while ((prop = object_property_iter_next(&iter))) {
> > > > > > > +ObjectPropertyInfoList *entry = g_new0(ObjectPropertyInfoList, 1);
> > > > > > > +
> > > > > > > +if (entry) {
> > > > > > > +entry->value = g_new0(ObjectPropertyInfo, 1);
> > > > > > > +entry->next = props;
> > > > > > > +props = entry;
> > > > > > > +
> > > > > > > +entry->value->name = g_strdup(prop->name);
> > > > > > > +entry->value->type = g_strdup(prop->type);
> > > > > > > +}
> > > > > > > +}
> > > > > > > +
> > > > > > > +return props;
> > > > > > > +}
> > > > > > > +
> > > > > >
> > > > > > We already have "-device ,help", and it uses a completely
> > > > > > different mechanism for listing properties. There's no reason for
> > > > > > having two arbitrarily different APIs for listing properties
> > > > > > returning different results.
> > > > > >
> > > > > > If qmp_device_list_properties() is not enough for you, please
> > > > > > clarify why, so

Re: [Qemu-devel] [PATCH v13 00/10] Block replication for continuous checkpoints

2016-01-29 Thread Wen Congyang

On 01/29/2016 06:07 PM, Dr. David Alan Gilbert wrote:
> * Wen Congyang (we...@cn.fujitsu.com) wrote:
>> On 01/27/2016 07:03 PM, Dr. David Alan Gilbert wrote:
>>> Hi,
>>>   I've got a block error if I kill the secondary.
>>>
>>> Start both primary & secondary
>>> kill -9 secondary qemu
>>> x_colo_lost_heartbeat on primary
>>>
>>> The guest sees a block error and the ext4 root switches to read-only.
>>>
>>> I gdb'd the primary with a breakpoint on quorum_report_bad; see
>>> backtrace below.
>>> (This is based on colo-v2.4-periodic-mode of the framework
>>> code with the block and network proxy merged in; so it could be my
>>> merging but I don't think so ?)
>>>
>>>
>>> (gdb) where
>>> #0  quorum_report_bad (node_name=0x7f2946a0892c "node0", ret=-5, 
>>> acb=0x7f2946cb3910, acb=0x7f2946cb3910)
>>> at /root/colo/jan-2016/qemu/block/quorum.c:222
>>> #1  0x7f2943b23058 in quorum_aio_cb (opaque=, 
>>> ret=)
>>> at /root/colo/jan-2016/qemu/block/quorum.c:315
>>> #2  0x7f2943b311be in bdrv_co_complete (acb=0x7f2946cb3f60) at 
>>> /root/colo/jan-2016/qemu/block/io.c:2122
>>> #3  0x7f2943ae777d in aio_bh_call (bh=) at 
>>> /root/colo/jan-2016/qemu/async.c:64
>>> #4  aio_bh_poll (ctx=ctx@entry=0x7f2945b771d0) at 
>>> /root/colo/jan-2016/qemu/async.c:92
>>> #5  0x7f2943af5090 in aio_dispatch (ctx=0x7f2945b771d0) at 
>>> /root/colo/jan-2016/qemu/aio-posix.c:305
>>> #6  0x7f2943ae756e in aio_ctx_dispatch (source=, 
>>> callback=, 
>>> user_data=) at /root/colo/jan-2016/qemu/async.c:231
>>> #7  0x7f293b84a79a in g_main_context_dispatch () from 
>>> /lib64/libglib-2.0.so.0
>>> #8  0x7f2943af3a00 in glib_pollfds_poll () at 
>>> /root/colo/jan-2016/qemu/main-loop.c:211
>>> #9  os_host_main_loop_wait (timeout=) at 
>>> /root/colo/jan-2016/qemu/main-loop.c:256
>>> #10 main_loop_wait (nonblocking=) at 
>>> /root/colo/jan-2016/qemu/main-loop.c:504
>>> #11 0x7f29438529ee in main_loop () at /root/colo/jan-2016/qemu/vl.c:1945
>>> #12 main (argc=, argv=, envp=) 
>>> at /root/colo/jan-2016/qemu/vl.c:4707
>>>
>>> (gdb) p s->num_children
>>> $1 = 2
>>> (gdb) p acb->success_count
>>> $2 = 0
>>> (gdb) p acb->is_read
>>> $5 = false
>>
>> Sorry for the late reply.
> 
> No problem.
> 
>> What is the value of acb->count?
> 
> (gdb) p acb->count
> $1 = 1

Note that the count is 1, not 2: the write to children.0 is still in flight.
If the write to children.0 succeeds, the guest does not see this error.

> 
>> If the secondary host is down, you should remove quorum's children.1.
>> Otherwise, you will get an I/O error event.
> 
> Is that safe?  If the secondary fails, do you always have time to issue the
> command to remove the children.1 before the guest sees the error?

We write to both children and expect the write to children.0 to succeed.
If it does, the guest does not see this error; you just get the I/O error event.

> 
> Anyway, I tried removing children.1 but it segfaults now, I guess the 
> replication is unhappy:
> 
> (qemu) x_block_change colo-disk0 -d children.1
> (qemu) x_colo_lost_heartbeat 

Hmm, you should not remove the child before failover. I will check how to
prevent that in the code.

> 
> 12973 Segmentation fault  (core dumped) 
> ./try/x86_64-softmmu/qemu-system-x86_64 -enable-kvm $console_param -S -boot c 
> -m 4080 -smp 4 -machine pc-i440fx-2.5,accel=kvm -name debug-threads=on -trace 
> events=trace-file -device virtio-rng-pci $block_param $net_param
> 
> #0  0x7f0a398a864c in bdrv_stop_replication (bs=0x7f0a3b0a8430, 
> failover=true, errp=0x7fff6a5c3420)
> at /root/colo/jan-2016/qemu/block.c:4426
> 
> (gdb) p drv
> $1 = (BlockDriver *) 0x5d2a
> 
>   it looks like the whole of bs is bogus.
> 
> #1  0x7f0a398d87f6 in quorum_stop_replication (bs=, 
> failover=, 
> errp=) at /root/colo/jan-2016/qemu/block/quorum.c:1213
> 
> (gdb) p s->replication_index
> $3 = 1
> 
> I guess quorum_del_child needs to stop replication before it removes the child?

Yes, but in the newest version, quorum doesn't know about block replication,
and I think we should add a reference to the bs when starting block replication.
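
The idea of holding a reference can be sketched generically as below. This is
a plain reference-counting illustration with hypothetical demo_ names, not
the block layer's own API: as long as replication holds its reference,
removing the child from quorum cannot free the node underneath it.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_node {
    int refcnt;
};

/* Create a node owned by its creator (one reference). */
static struct demo_node *demo_node_new(void)
{
    struct demo_node *n = malloc(sizeof(*n));

    if (n != NULL) {
        n->refcnt = 1;
    }
    return n;
}

/* Taken by replication when it starts using the node. */
static void demo_node_ref(struct demo_node *n)
{
    n->refcnt++;
}

/* Drop one reference. Returns 1 if this call freed the node, else 0. */
static int demo_node_unref(struct demo_node *n)
{
    if (--n->refcnt == 0) {
        free(n);
        return 1;
    }
    return 0;
}
```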

Thanks
Wen Congyang

> (although it would have to be careful not to block on the dead nbd).
> 
> #2  0x7f0a398a8901 in bdrv_stop_replication_all 
> (failover=failover@entry=true, errp=errp@entry=0x7fff6a5c3478)
> at /root/colo/jan-2016/qemu/block.c:4504
> #3  0x7f0a3984b0af in primary_vm_do_failover () at 
> /root/colo/jan-2016/qemu/migration/colo.c:144
> #4  colo_do_failover (s=) at 
> /root/colo/jan-2016/qemu/migration/colo.c:162
> #5  0x7f0a3989d7fd in aio_bh_call (bh=) at 
> /root/colo/jan-2016/qemu/async.c:64
> #6  aio_bh_poll (ctx=ctx@entry=0x7f0a3a6c21d0) at 
> /root/colo/jan-2016/qemu/async.c:92
> #7  0x7f0a398ab110 in aio_dispatch (ctx=0x7f0a3a6c21d0) at 
> /root/colo/jan-2016/qemu/aio-posix.c:305
> #8  0x7f0a3989d5ee in aio_ctx_dispatch (source=, 
> callback=, 
> user_data=) at /root/colo/jan-2016/qemu/async.c:231
> #9

[Qemu-devel] [RFC v7 08/16] softmmu: Honor the new exclusive bitmap

2016-01-29 Thread Alvise Rigo

The pages set as exclusive (clean) in the DIRTY_MEMORY_EXCLUSIVE bitmap
have to have their TLB entries flagged with TLB_EXCL. The accesses to
pages with TLB_EXCL flag set have to be properly handled in that they
can potentially invalidate an open LL/SC transaction.

Modify the TLB entries generation to honor the new bitmap and extend
the softmmu_template to handle the accesses made to guest pages marked
as exclusive.

In the case we remove a TLB entry marked as EXCL, we unset the
corresponding exclusive bit in the bitmap.
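
The flagging scheme described above can be illustrated with a small model.
The bit positions, the 4 KiB page size and all demo_ names are assumptions
made for this sketch and are not the values used by the patch. Flags live in
the low bits of the entry's write address, so any flag forces the slow path:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical values: a 4 KiB page and two flag bits below the page mask. */
#define DEMO_PAGE_MASK (~(uint64_t)0xfff)
#define DEMO_TLB_MMIO  ((uint64_t)1 << 5)
#define DEMO_TLB_EXCL  ((uint64_t)1 << 6)

/* Build the write address of a TLB entry, tagging exclusive RAM pages. */
static uint64_t demo_make_addr_write(uint64_t addr, bool is_mmio,
                                     bool is_excl)
{
    uint64_t address = addr & DEMO_PAGE_MASK;

    if (is_mmio) {
        /* MMIO already takes the slow path, no extra flag is needed. */
        return address | DEMO_TLB_MMIO;
    }
    if (is_excl) {
        address |= DEMO_TLB_EXCL;
    }
    return address;
}

/* Any flag bit set below the page mask diverts the access. */
static bool demo_needs_slow_path(uint64_t tlb_addr)
{
    return (tlb_addr & ~DEMO_PAGE_MASK) != 0;
}

/* True when the only reason for the slow path is the exclusive flag. */
static bool demo_is_excl_only(uint64_t tlb_addr)
{
    return (tlb_addr & ~DEMO_PAGE_MASK) == DEMO_TLB_EXCL;
}
```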

Suggested-by: Jani Kokkonen 
Suggested-by: Claudio Fontana 
Signed-off-by: Alvise Rigo 
---
 cputlb.c   | 44 --
 softmmu_template.h | 80 --
 2 files changed, 113 insertions(+), 11 deletions(-)

diff --git a/cputlb.c b/cputlb.c
index ce6d720..aa9cc17 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -395,6 +395,16 @@ void tlb_set_page_with_attrs(CPUState *cpu, target_ulong vaddr,
 env->tlb_v_table[mmu_idx][vidx] = *te;
 env->iotlb_v[mmu_idx][vidx] = env->iotlb[mmu_idx][index];
 
+if (unlikely(!(te->addr_write & TLB_MMIO) && (te->addr_write & TLB_EXCL))) {
+/* We are removing an exclusive entry, set the page to dirty. This
+ * is not be necessary if the vCPU has performed both SC and LL. */
+hwaddr hw_addr = (env->iotlb[mmu_idx][index].addr & TARGET_PAGE_MASK) +
+  (te->addr_write & TARGET_PAGE_MASK);
+if (!cpu->ll_sc_context) {
+cpu_physical_memory_unset_excl(hw_addr);
+}
+}
+
 /* refill the tlb */
 env->iotlb[mmu_idx][index].addr = iotlb - vaddr;
 env->iotlb[mmu_idx][index].attrs = attrs;
@@ -418,9 +428,19 @@ void tlb_set_page_with_attrs(CPUState *cpu, target_ulong vaddr,
 } else if (memory_region_is_ram(section->mr)
&& cpu_physical_memory_is_clean(section->mr->ram_addr
+ xlat)) {
-te->addr_write = address | TLB_NOTDIRTY;
-} else {
-te->addr_write = address;
+address |= TLB_NOTDIRTY;
+}
+
+/* Since the MMIO accesses follow always the slow path, we do not need
+ * to set any flag to trap the access */
+if (!(address & TLB_MMIO)) {
+if (cpu_physical_memory_is_excl(section->mr->ram_addr + xlat)) {
+/* There is at least one vCPU that has flagged the address as
+ * exclusive. */
+te->addr_write = address | TLB_EXCL;
+} else {
+te->addr_write = address;
+}
 }
 } else {
 te->addr_write = -1;
@@ -474,6 +494,24 @@ tb_page_addr_t get_page_addr_code(CPUArchState *env1, target_ulong addr)
 return qemu_ram_addr_from_host_nofail(p);
 }
 
+/* For every vCPU compare the exclusive address and reset it in case of a
+ * match. Since only one vCPU is running at once, no lock has to be held to
+ * guard this operation. */
+static inline void lookup_and_reset_cpus_ll_addr(hwaddr addr, hwaddr size)
+{
+CPUState *cpu;
+
+CPU_FOREACH(cpu) {
+if (cpu->excl_protected_range.begin != EXCLUSIVE_RESET_ADDR &&
+ranges_overlap(cpu->excl_protected_range.begin,
+   cpu->excl_protected_range.end -
+   cpu->excl_protected_range.begin,
+   addr, size)) {
+cpu->excl_protected_range.begin = EXCLUSIVE_RESET_ADDR;
+}
+}
+}
+
 #define MMUSUFFIX _mmu
 
 /* Generates LoadLink/StoreConditional helpers in softmmu_template.h */
diff --git a/softmmu_template.h b/softmmu_template.h
index 4332db2..267c52a 100644
--- a/softmmu_template.h
+++ b/softmmu_template.h
@@ -474,11 +474,43 @@ void helper_le_st_name(CPUArchState *env, target_ulong addr, DATA_TYPE val,
 tlb_addr = env->tlb_table[mmu_idx][index].addr_write;
 }
 
-/* Handle an IO access.  */
+/* Handle an IO access or exclusive access.  */
 if (unlikely(tlb_addr & ~TARGET_PAGE_MASK)) {
-glue(helper_le_st_name, _do_mmio_access)(env, val, addr, oi,
- mmu_idx, index, retaddr);
-return;
+if ((tlb_addr & ~TARGET_PAGE_MASK) == TLB_EXCL) {
+CPUIOTLBEntry *iotlbentry = &env->iotlb[mmu_idx][index];
+CPUState *cpu = ENV_GET_CPU(env);
+CPUClass *cc = CPU_GET_CLASS(cpu);
+/* The slow-path has been forced since we are writing to
+ * exclusive-protected memory. */
+hwaddr hw_addr = (iotlbentry->addr & TARGET_PAGE_MASK) + addr;
+
+/* The function lookup_and_reset_cpus_ll_addr could have reset the
+ * exclusive address. Fail the SC in this case.
+ * N.B.: here excl_succeed == true means that the caller is
+ *

[Qemu-devel] [RFC v7 11/16] tcg: Create new runtime helpers for excl accesses

2016-01-29 Thread Alvise Rigo

Introduce a set of new runtime helpers to handle exclusive instructions.
These helpers are used as hooks to call the respective LL/SC helpers in
softmmu_llsc_template.h from TCG code.

The helpers ending with an "a" make an alignment check.
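
The pattern of generating one helper per access size, plus an aligned "a"
variant, can be sketched as follows. The macro and the demo_ function names
are hypothetical and the bodies are plain memory loads, not the LL/SC
helpers from this series:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical generator: one load helper per access size, plus an "a"
 * variant that performs the same load but rejects unaligned offsets. */
#define DEMO_LD_HELPER(SUFF, TYPE)                                        \
static uint64_t demo_ld_i##SUFF(const unsigned char *base, size_t off)    \
{                                                                         \
    TYPE v;                                                               \
    memcpy(&v, base + off, sizeof(v)); /* host-endian load */             \
    return v;                                                             \
}                                                                         \
static int demo_ld_i##SUFF##a(const unsigned char *base, size_t off,      \
                              uint64_t *out)                              \
{                                                                         \
    if (off % sizeof(TYPE) != 0) {                                        \
        return -1; /* alignment check failed */                           \
    }                                                                     \
    *out = demo_ld_i##SUFF(base, off);                                    \
    return 0;                                                             \
}

DEMO_LD_HELPER(16, uint16_t)
DEMO_LD_HELPER(32, uint32_t)
```

Each expansion yields a pair such as demo_ld_i32() and demo_ld_i32a(), so
adding a new access size is a one-line change.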

Suggested-by: Jani Kokkonen 
Suggested-by: Claudio Fontana 
Signed-off-by: Alvise Rigo 
---
 Makefile.target |   2 +-
 include/exec/helper-gen.h   |   3 ++
 include/exec/helper-proto.h |   1 +
 include/exec/helper-tcg.h   |   3 ++
 tcg-llsc-helper.c   | 104 
 tcg-llsc-helper.h   |  61 ++
 tcg/tcg-llsc-gen-helper.h   |  67 
 7 files changed, 240 insertions(+), 1 deletion(-)
 create mode 100644 tcg-llsc-helper.c
 create mode 100644 tcg-llsc-helper.h
 create mode 100644 tcg/tcg-llsc-gen-helper.h

diff --git a/Makefile.target b/Makefile.target
index 34ddb7e..faf32a2 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -135,7 +135,7 @@ obj-y += arch_init.o cpus.o monitor.o gdbstub.o balloon.o ioport.o numa.o
 obj-y += qtest.o bootdevice.o
 obj-y += hw/
 obj-$(CONFIG_KVM) += kvm-all.o
-obj-y += memory.o cputlb.o
+obj-y += memory.o cputlb.o tcg-llsc-helper.o
 obj-y += memory_mapping.o
 obj-y += dump.o
 obj-y += migration/ram.o migration/savevm.o
diff --git a/include/exec/helper-gen.h b/include/exec/helper-gen.h
index 0d0da3a..f8483a9 100644
--- a/include/exec/helper-gen.h
+++ b/include/exec/helper-gen.h
@@ -60,6 +60,9 @@ static inline void glue(gen_helper_, name)(dh_retvar_decl(ret)  \
 #include "trace/generated-helpers.h"
 #include "trace/generated-helpers-wrappers.h"
 #include "tcg-runtime.h"
+#if defined(CONFIG_SOFTMMU)
+#include "tcg-llsc-gen-helper.h"
+#endif
 
 #undef DEF_HELPER_FLAGS_0
 #undef DEF_HELPER_FLAGS_1
diff --git a/include/exec/helper-proto.h b/include/exec/helper-proto.h
index effdd43..90be2fd 100644
--- a/include/exec/helper-proto.h
+++ b/include/exec/helper-proto.h
@@ -29,6 +29,7 @@ dh_ctype(ret) HELPER(name) (dh_ctype(t1), dh_ctype(t2), dh_ctype(t3), \
 #include "helper.h"
 #include "trace/generated-helpers.h"
 #include "tcg-runtime.h"
+#include "tcg/tcg-llsc-gen-helper.h"
 
 #undef DEF_HELPER_FLAGS_0
 #undef DEF_HELPER_FLAGS_1
diff --git a/include/exec/helper-tcg.h b/include/exec/helper-tcg.h
index 79fa3c8..6228a7f 100644
--- a/include/exec/helper-tcg.h
+++ b/include/exec/helper-tcg.h
@@ -38,6 +38,9 @@
 #include "helper.h"
 #include "trace/generated-helpers.h"
 #include "tcg-runtime.h"
+#ifdef CONFIG_SOFTMMU
+#include "tcg-llsc-gen-helper.h"
+#endif
 
 #undef DEF_HELPER_FLAGS_0
 #undef DEF_HELPER_FLAGS_1
diff --git a/tcg-llsc-helper.c b/tcg-llsc-helper.c
new file mode 100644
index 000..646b4ba
--- /dev/null
+++ b/tcg-llsc-helper.c
@@ -0,0 +1,104 @@
+/*
+ * Runtime helpers for atomic instruction emulation
+ *
+ * Copyright (c) 2015 Virtual Open Systems
+ *
+ * Authors:
+ *  Alvise Rigo 
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see .
+ */
+
+#include "exec/cpu_ldst.h"
+#include "exec/helper-head.h"
+#include "tcg-llsc-helper.h"
+
+#define LDEX_HELPER(SUFF, OPC, FUNC)   \
+uint32_t HELPER(ldlink_i##SUFF)(CPUArchState *env, target_ulong addr,  \
+uint32_t index)\
+{  \
+CPUArchState *state = env; \
+TCGMemOpIdx op;\
+   \
+op = make_memop_idx((OPC), index); \
+   \
+return (uint32_t)FUNC(state, addr, op, GETRA());   \
+}
+
+#define STEX_HELPER(SUFF, DATA_TYPE, OPC, FUNC)\
+target_ulong HELPER(stcond_i##SUFF)(CPUArchState *env, target_ulong addr,  \
+uint32_t val, uint32_t index)  \
+{  \
+CPUArchState *state = env; \

Re: [Qemu-devel] [PATCH] MAINTAINERS: Add section for FPU emulation

2016-01-29 Thread Aurelien Jarno

On 2016-01-26 13:27, Peter Maydell wrote:
> Add an entry to the MAINTAINERS file for our softfloat FPU
> emulation code. This code is only 'odd fixes' but it's useful to
> record who to cc on patches to it.
> 
> Signed-off-by: Peter Maydell 
> ---
> Would anybody else like to be listed here (ie to be cc'd on softfloat
> patches) ? Richard? Aurelien?

As long as it is in "Odd Fixes" mode, I would like to be listed,
please. I don't have time to follow the whole mailing list anymore, so
being Cc'd on softfloat patches would be nice.

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net

[Qemu-devel] [RFC v7 04/16] softmmu: Simplify helper_*_st_name, wrap RAM code

2016-01-29 Thread Alvise Rigo

In an attempt to simplify helper_*_st_name, wrap the code related to a
RAM access in an inline function.

Based on this work, Alex proposed the following patch series
https://lists.gnu.org/archive/html/qemu-devel/2016-01/msg01136.html
that reduces code duplication of the softmmu_helpers.
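
The first test in the new _do_ram_access function decides whether a store spans two pages and must take the slow unaligned path. A standalone sketch of that test follows; the SKETCH_* constants are assumed values chosen for illustration (in QEMU the page size depends on the target), and access_spans_two_pages is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for illustration only; the real ones are per-target. */
#define SKETCH_PAGE_BITS 12
#define SKETCH_PAGE_SIZE (UINT64_C(1) << SKETCH_PAGE_BITS)
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/* Hypothetical helper: true when the last byte of the access falls in the
 * page following the one that contains `addr`. */
static bool access_spans_two_pages(uint64_t addr, unsigned size)
{
    assert(size != 0);

    /* (addr & ~PAGE_MASK) is the offset of the access within its page. */
    return size > 1 &&
           (addr & ~SKETCH_PAGE_MASK) + size - 1 >= SKETCH_PAGE_SIZE;
}
```

With 4 KiB pages, a 4-byte store at offset 0xffe crosses into the next page, while the same store at offset 0xffc does not.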

Suggested-by: Jani Kokkonen 
Suggested-by: Claudio Fontana 
Signed-off-by: Alvise Rigo 
---
 softmmu_template.h | 110 +
 1 file changed, 68 insertions(+), 42 deletions(-)

diff --git a/softmmu_template.h b/softmmu_template.h
index 3d388ec..6279437 100644
--- a/softmmu_template.h
+++ b/softmmu_template.h
@@ -416,13 +416,46 @@ static inline void glue(helper_le_st_name, _do_mmio_access)(CPUArchState *env,
 glue(io_write, SUFFIX)(env, iotlbentry, val, addr, retaddr);
 }
 
+static inline void glue(helper_le_st_name, _do_ram_access)(CPUArchState *env,
+   DATA_TYPE val,
+   target_ulong addr,
+   TCGMemOpIdx oi,
+   unsigned mmu_idx,
+   int index,
+   uintptr_t retaddr)
+{
+uintptr_t haddr;
+
+/* Handle slow unaligned access (it spans two pages or IO).  */
+if (DATA_SIZE > 1
+&& unlikely((addr & ~TARGET_PAGE_MASK) + DATA_SIZE - 1
+ >= TARGET_PAGE_SIZE)) {
+glue(helper_le_st_name, _do_unl_access)(env, val, addr, oi, mmu_idx,
+retaddr);
+return;
+}
+
+/* Handle aligned access or unaligned access in the same page.  */
+if ((addr & (DATA_SIZE - 1)) != 0
+&& (get_memop(oi) & MO_AMASK) == MO_ALIGN) {
+cpu_unaligned_access(ENV_GET_CPU(env), addr, MMU_DATA_STORE,
+ mmu_idx, retaddr);
+}
+
+haddr = addr + env->tlb_table[mmu_idx][index].addend;
+#if DATA_SIZE == 1
+glue(glue(st, SUFFIX), _p)((uint8_t *)haddr, val);
+#else
+glue(glue(st, SUFFIX), _le_p)((uint8_t *)haddr, val);
+#endif
+}
+
 void helper_le_st_name(CPUArchState *env, target_ulong addr, DATA_TYPE val,
TCGMemOpIdx oi, uintptr_t retaddr)
 {
 unsigned mmu_idx = get_mmuidx(oi);
 int index = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
 target_ulong tlb_addr = env->tlb_table[mmu_idx][index].addr_write;
-uintptr_t haddr;
 
 /* Adjust the given return address.  */
 retaddr -= GETPC_ADJ;
@@ -448,28 +481,8 @@ void helper_le_st_name(CPUArchState *env, target_ulong addr, DATA_TYPE val,
 return;
 }
 
-/* Handle slow unaligned access (it spans two pages or IO).  */
-if (DATA_SIZE > 1
-&& unlikely((addr & ~TARGET_PAGE_MASK) + DATA_SIZE - 1
- >= TARGET_PAGE_SIZE)) {
-glue(helper_le_st_name, _do_unl_access)(env, val, addr, mmu_idx,
-oi, retaddr);
-return;
-}
-
-/* Handle aligned access or unaligned access in the same page.  */
-if ((addr & (DATA_SIZE - 1)) != 0
-&& (get_memop(oi) & MO_AMASK) == MO_ALIGN) {
-cpu_unaligned_access(ENV_GET_CPU(env), addr, MMU_DATA_STORE,
- mmu_idx, retaddr);
-}
-
-haddr = addr + env->tlb_table[mmu_idx][index].addend;
-#if DATA_SIZE == 1
-glue(glue(st, SUFFIX), _p)((uint8_t *)haddr, val);
-#else
-glue(glue(st, SUFFIX), _le_p)((uint8_t *)haddr, val);
-#endif
+glue(helper_le_st_name, _do_ram_access)(env, val, addr, oi, mmu_idx, index,
+retaddr);
 }
 
 #if DATA_SIZE > 1
@@ -519,13 +532,42 @@ static inline void glue(helper_be_st_name, _do_mmio_access)(CPUArchState *env,
 glue(io_write, SUFFIX)(env, iotlbentry, val, addr, retaddr);
 }
 
+static inline void glue(helper_be_st_name, _do_ram_access)(CPUArchState *env,
+   DATA_TYPE val,
+   target_ulong addr,
+   TCGMemOpIdx oi,
+   unsigned mmu_idx,
+   int index,
+   uintptr_t retaddr)
+{
+uintptr_t haddr;
+
+/* Handle slow unaligned access (it spans two pages or IO).  */
+if (DATA_SIZE > 1
+&& unlikely((addr & ~TARGET_PAGE_MASK) + DATA_SIZE - 1
+ >= TARGET_PAGE_SIZE)) {
+glue(helper_be_st_name, _do_unl_access)(env, val, addr, oi, mmu_idx,
+retaddr);
+

[Qemu-devel] [RFC v7 10/16] softmmu: Protect MMIO exclusive range

2016-01-29 Thread Alvise Rigo

As in the RAM case, the MMIO exclusive ranges also have to be protected
from other CPUs' accesses. To do that, we flag the accessed
MemoryRegion to mark that an exclusive access has been performed and has
not concluded yet.

This flag will force the other CPUs to invalidate the exclusive range in
case of collision.
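
The collision handling described above can be sketched outside of QEMU as follows. All Sketch* and sketch_* names are hypothetical stand-ins, and the array of ranges is an assumption made to keep the example self-contained; the patch itself iterates over vCPUs with CPU_FOREACH. As in the patch, the function reports whether any range was reset, so that a caller could clear its pending flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_RESET_ADDR UINT64_MAX

/* Hypothetical per-vCPU protected range. */
typedef struct {
    uint64_t begin;   /* SKETCH_RESET_ADDR when nothing is linked */
    uint64_t end;     /* one past the last protected byte */
} SketchExclRange;

static bool sketch_ranges_overlap(uint64_t a, uint64_t a_len,
                                  uint64_t b, uint64_t b_len)
{
    return a < b + b_len && b < a + a_len;
}

/* Reset the protected range of every vCPU other than `self` that overlaps
 * the write [addr, addr + size). Returns true if at least one was reset. */
static bool sketch_reset_colliding(SketchExclRange *cpus, size_t n,
                                   size_t self, uint64_t addr, uint64_t size)
{
    bool hit = false;

    assert(self < n);

    for (size_t i = 0; i < n; i++) {
        if (i == self || cpus[i].begin == SKETCH_RESET_ADDR) {
            continue;
        }
        if (sketch_ranges_overlap(cpus[i].begin,
                                  cpus[i].end - cpus[i].begin,
                                  addr, size)) {
            cpus[i].begin = SKETCH_RESET_ADDR;
            hit = true;
        }
    }

    return hit;
}
```

Skipping the writing vCPU itself mirrors the `current_cpu != cpu` test that this patch adds to lookup_and_reset_cpus_ll_addr.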

Suggested-by: Jani Kokkonen 
Suggested-by: Claudio Fontana 
Signed-off-by: Alvise Rigo 
---
 cputlb.c| 20 +---
 include/exec/memory.h   |  1 +
 softmmu_llsc_template.h | 11 +++
 softmmu_template.h  | 22 ++
 4 files changed, 43 insertions(+), 11 deletions(-)

diff --git a/cputlb.c b/cputlb.c
index 87d09c8..06ce2da 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -496,19 +496,25 @@ tb_page_addr_t get_page_addr_code(CPUArchState *env1, target_ulong addr)
 /* For every vCPU compare the exclusive address and reset it in case of a
  * match. Since only one vCPU is running at once, no lock has to be held to
  * guard this operation. */
-static inline void lookup_and_reset_cpus_ll_addr(hwaddr addr, hwaddr size)
+static inline bool lookup_and_reset_cpus_ll_addr(hwaddr addr, hwaddr size)
 {
 CPUState *cpu;
+bool ret = false;
 
 CPU_FOREACH(cpu) {
-if (cpu->excl_protected_range.begin != EXCLUSIVE_RESET_ADDR &&
-ranges_overlap(cpu->excl_protected_range.begin,
-   cpu->excl_protected_range.end -
-   cpu->excl_protected_range.begin,
-   addr, size)) {
-cpu->excl_protected_range.begin = EXCLUSIVE_RESET_ADDR;
+if (current_cpu != cpu) {
+if (cpu->excl_protected_range.begin != EXCLUSIVE_RESET_ADDR &&
+ranges_overlap(cpu->excl_protected_range.begin,
+   cpu->excl_protected_range.end -
+   cpu->excl_protected_range.begin,
+   addr, size)) {
+cpu->excl_protected_range.begin = EXCLUSIVE_RESET_ADDR;
+ret = true;
+}
 }
 }
+
+return ret;
 }
 
 #define MMUSUFFIX _mmu
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 71e0480..bacb3ad 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -171,6 +171,7 @@ struct MemoryRegion {
 bool rom_device;
 bool flush_coalesced_mmio;
 bool global_locking;
+bool pending_excl_access; /* A vCPU issued an exclusive access */
 uint8_t dirty_log_mask;
 ram_addr_t ram_addr;
 Object *owner;
diff --git a/softmmu_llsc_template.h b/softmmu_llsc_template.h
index 101f5e8..b4712ba 100644
--- a/softmmu_llsc_template.h
+++ b/softmmu_llsc_template.h
@@ -81,15 +81,18 @@ WORD_TYPE helper_ldlink_name(CPUArchState *env, target_ulong addr,
 }
 }
 }
+/* For this vCPU, just update the TLB entry, no need to flush. */
+env->tlb_table[mmu_idx][index].addr_write |= TLB_EXCL;
 } else {
-hw_error("EXCL accesses to MMIO regions not supported yet.");
+/* Set a pending exclusive access in the MemoryRegion */
+MemoryRegion *mr = iotlb_to_region(this,
+   env->iotlb[mmu_idx][index].addr,
+   env->iotlb[mmu_idx][index].attrs);
+mr->pending_excl_access = true;
 }
 
 cc->cpu_set_excl_protected_range(this, hw_addr, DATA_SIZE);
 
-/* For this vCPU, just update the TLB entry, no need to flush. */
-env->tlb_table[mmu_idx][index].addr_write |= TLB_EXCL;
-
 /* From now on we are in LL/SC context */
 this->ll_sc_context = true;
 
diff --git a/softmmu_template.h b/softmmu_template.h
index c54bdc9..71c5152 100644
--- a/softmmu_template.h
+++ b/softmmu_template.h
@@ -360,6 +360,14 @@ static inline void glue(io_write, SUFFIX)(CPUArchState *env,
 MemoryRegion *mr = iotlb_to_region(cpu, physaddr, iotlbentry->attrs);
 
 physaddr = (physaddr & TARGET_PAGE_MASK) + addr;
+
+/* Invalidate the exclusive range that overlaps this access */
+if (mr->pending_excl_access) {
+if (lookup_and_reset_cpus_ll_addr(physaddr, 1 << SHIFT)) {
+mr->pending_excl_access = false;
+}
+}
+
 if (mr != &io_mem_rom && mr != &io_mem_notdirty && !cpu->can_do_io) {
 cpu_io_recompile(cpu, retaddr);
 }
@@ -504,6 +512,13 @@ void helper_le_st_name(CPUArchState *env, target_ulong addr, DATA_TYPE val,
 glue(helper_le_st_name, _do_mmio_access)(env, val, addr, oi,
  mmu_idx, index,
  retaddr);
+/* N.B.: Here excl_succeeded == true means that this access
+ * comes from an exclusive instruction. */
+if (cpu->excl_succeeded) {
+MemoryRegion *mr =

[Qemu-devel] [RFC v7 00/16] Slow-path for atomic instruction translation

2016-01-29 Thread Alvise Rigo

This is the seventh iteration of the patch series which applies to the
upstream branch of QEMU (v2.5.0-rc4).

Changes versus previous versions are at the bottom of this cover letter.

The code is also available at the following repository:
https://git.virtualopensystems.com/dev/qemu-mt.git
branch:
slowpath-for-atomic-v7-no-mttcg

This patch series provides an infrastructure for atomic instruction
implementation in QEMU, thus offering a 'legacy' solution for
translating guest atomic instructions. Moreover, it can be considered as
a first step toward a multi-thread TCG.

The underlying idea is to provide new TCG helpers (sort of softmmu
helpers) that guarantee atomicity to some memory accesses or in general
a way to define memory transactions.

More specifically, the new softmmu helpers behave as LoadLink and
StoreConditional instructions, and are called from TCG code by means of
target specific helpers. This work includes the implementation for all
the ARM atomic instructions, see target-arm/op_helper.c.

The implementation heavily uses the software TLB together with a new
bitmap that has been added to the ram_list structure which flags, on a
per-CPU basis, all the memory pages that are in the middle of a LoadLink
(LL), StoreConditional (SC) operation.  Since all these pages can be
accessed directly through the fast-path and alter a vCPU's linked value,
the new bitmap has been coupled with a new TLB flag for the TLB virtual
address which forces the slow-path execution for all the accesses to a
page containing a linked address.

The new slow-path is implemented such that:
- the LL behaves as a normal load slow-path, except for clearing the
  dirty flag in the bitmap.  While generating a TLB entry, the cputlb.c
  code checks whether at least one vCPU has the bit cleared in the
  exclusive bitmap; in that case the TLB entry will have the EXCL
  flag set, thus forcing the slow-path.  In order to ensure that all the
  vCPUs will follow the slow-path for that page, we flush the TLB cache
  of all the other vCPUs.

  The LL will also set the linked address and size of the access in a
  vCPU's private variable. After the corresponding SC, this address will
  be set to a reset value.

- the SC can fail, returning 1, or succeed, returning 0.  It always has
  to come after an LL and has to access the same address 'linked' by the
  previous LL, otherwise it will fail. If, in the time window delimited
  by a legitimate pair of LL/SC operations, another write access happens
  to the linked address, the SC will fail.

In theory, the provided implementation of TCG LoadLink/StoreConditional
can be used to properly handle atomic instructions on any architecture.
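
The LL/SC contract described above can be modelled in a few lines of plain C. This is a toy sketch under strong assumptions, not the series' implementation: it tracks a single linked address per vCPU and compares addresses exactly, whereas the series protects a range and relies on the software TLB and the exclusive bitmap. All sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_RESET_ADDR UINT64_MAX
#define SKETCH_NCPUS      2

static uint8_t  sketch_ram[64];
static uint64_t sketch_linked[SKETCH_NCPUS] = {
    SKETCH_RESET_ADDR, SKETCH_RESET_ADDR
};

/* LoadLink: perform the load and remember the address as linked. */
static uint32_t sketch_ll(int cpu, uint64_t addr)
{
    uint32_t val;

    assert(addr + sizeof(val) <= sizeof(sketch_ram));
    memcpy(&val, &sketch_ram[addr], sizeof(val));
    sketch_linked[cpu] = addr;
    return val;
}

/* Ordinary store: breaks the link of any other vCPU on the same address. */
static void sketch_store(int cpu, uint64_t addr, uint32_t val)
{
    assert(addr + sizeof(val) <= sizeof(sketch_ram));
    for (int i = 0; i < SKETCH_NCPUS; i++) {
        if (i != cpu && sketch_linked[i] == addr) {
            sketch_linked[i] = SKETCH_RESET_ADDR;
        }
    }
    memcpy(&sketch_ram[addr], &val, sizeof(val));
}

/* StoreConditional: returns 0 on success and 1 on failure. The link is
 * reset in both cases, so an SC without a preceding LL always fails. */
static int sketch_sc(int cpu, uint64_t addr, uint32_t val)
{
    int failed = sketch_linked[cpu] != addr;

    if (!failed) {
        sketch_store(cpu, addr, val);
    }
    sketch_linked[cpu] = SKETCH_RESET_ADDR;
    return failed;
}
```

In this model, an LL followed directly by an SC to the same address succeeds, while an intervening sketch_store from the other vCPU makes the SC return 1.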

The code has been tested with bare-metal test cases and by booting Linux.

* Performance considerations
The new slow-path adds some overhead to the translation of the ARM
atomic instructions, since their emulation no longer happens only
in the guest (by means of pure TCG generated code), but requires the
execution of two helper functions. Despite this, the additional time
required to boot an ARM Linux kernel on an i7 clocked at 2.5GHz is
negligible.
Instead, on a LL/SC bound test scenario - like:
https://git.virtualopensystems.com/dev/tcg_baremetal_tests.git - this
solution requires 30% (1 million iterations) and 70% (10 million
iterations) of additional time for the test to complete.

Changes from v6:
- Included aligned variants of the exclusive helpers
- Reverted to single bit per page design in DIRTY_MEMORY_EXCLUSIVE
  bitmap. The new way we restore the pages as non-exclusive (PATCH 13)
  made the per-VCPU design unnecessary.
- arm32 now uses aligned exclusive accesses
- aarch64 exclusive instructions implemented [PATCH 15-16]
- Addressed comments from Alex

Changes from v5:
- The exclusive memory region is now set through a CPUClass hook,
  allowing any architecture to decide the memory area that will be
  protected during a LL/SC operation [PATCH 3]
- The runtime helpers dropped any target dependency and are now in a
  common file [PATCH 5]
- Improved the way we restore a guest page as non-exclusive [PATCH 9]
- Included MMIO memory as possible target of LL/SC
  instructions. This also required simplifying the
  helper_*_st_name helpers in softmmu_template.h [PATCH 8-14]

Changes from v4:
- Reworked the exclusive bitmap to be of fixed size (8 bits per address)
- The slow-path is now TCG backend independent, no need to touch
  tcg/* anymore as suggested by Aurelien Jarno.

Changes from v3:
- based on upstream QEMU
- addressed comments from Alex Bennée
- the slow path can be enabled by the user with:
  ./configure --enable-tcg-ldst-excl only if the backend supports it
- all the ARM ldex/stex instructions now make use of the slow path
- added aarch64 TCG backend support
- part of the code has been rewritten

Changes from v2:
- the bitmap accessors are now atomic
- a rendezvous between vCPUs and a simple callback support before executing
  a TB have been

[Qemu-devel] [RFC v7 12/16] configure: Use slow-path for atomic only when the softmmu is enabled

2016-01-29 Thread Alvise Rigo

Use the new slow path for atomic instruction translation when the
softmmu is enabled.

At the moment only arm and aarch64 use the new LL/SC backend. It is
possible to disable this backend with --disable-arm-llsc-backend.

Suggested-by: Jani Kokkonen 
Suggested-by: Claudio Fontana 
Signed-off-by: Alvise Rigo 
---
 configure | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/configure b/configure
index 44ac9ab..915efcc 100755
--- a/configure
+++ b/configure
@@ -294,6 +294,7 @@ solaris="no"
 profiler="no"
 cocoa="no"
 softmmu="yes"
+arm_tcg_use_llsc="yes"
 linux_user="no"
 bsd_user="no"
 aix="no"
@@ -880,6 +881,10 @@ for opt do
   ;;
   --disable-debug-tcg) debug_tcg="no"
   ;;
+  --enable-arm-llsc-backend) arm_tcg_use_llsc="yes"
+  ;;
+  --disable-arm-llsc-backend) arm_tcg_use_llsc="no"
+  ;;
   --enable-debug)
   # Enable debugging options that aren't excessively noisy
   debug_tcg="yes"
@@ -4751,6 +4756,7 @@ echo "host CPU  $cpu"
 echo "host big endian   $bigendian"
 echo "target list   $target_list"
 echo "tcg debug enabled $debug_tcg"
+echo "arm use llsc backend" $arm_tcg_use_llsc
 echo "gprof enabled $gprof"
 echo "sparse enabled$sparse"
 echo "strip binaries$strip_opt"
@@ -4806,6 +4812,7 @@ echo "Install blobs $blobs"
 echo "KVM support   $kvm"
 echo "RDMA support  $rdma"
 echo "TCG interpreter   $tcg_interpreter"
+echo "use ld/st excl$softmmu"
 echo "fdt support   $fdt"
 echo "preadv support$preadv"
 echo "fdatasync $fdatasync"
@@ -5863,6 +5870,13 @@ fi
 echo "LDFLAGS+=$ldflags" >> $config_target_mak
 echo "QEMU_CFLAGS+=$cflags" >> $config_target_mak
 
+# Use tcg LL/SC tcg backend for exclusive instruction is arm/aarch64
+# softmmus targets
+if test "$arm_tcg_use_llsc" = "yes" ; then
+  if test "$target" = "arm-softmmu" ; then
+echo "CONFIG_ARM_USE_LDST_EXCL=y" >> $config_target_mak
+  fi
+fi
 done # for target in $targets
 
 if [ "$pixman" = "internal" ]; then
-- 
2.7.0

[Qemu-devel] [RFC v7 13/16] softmmu: Add history of excl accesses

2016-01-29 Thread Alvise Rigo

Add a circular buffer to store the hw addresses used in the last
EXCLUSIVE_HISTORY_LEN exclusive accesses.

When an address is popped from the buffer, its page will be set as not
exclusive. In this way, we avoid:
- frequent set/unset of a page (causing frequent flushes as well)
- the possibility of leaving the EXCL bit set.
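
The circular history described above can be sketched in isolation as follows. The Sketch* names and the fixed length of 4 are assumptions for illustration; the patch sizes the buffer from EXCLUSIVE_HISTORY_CPU_LEN and max_cpus and calls cpu_physical_memory_unset_excl on the evicted entry, whereas this sketch simply returns the evicted page to the caller.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_RESET_ADDR  UINT64_MAX
#define SKETCH_HISTORY_LEN 4   /* assumed length, for illustration only */

typedef struct {
    unsigned last_idx;                    /* index of the last insertion */
    uint64_t slots[SKETCH_HISTORY_LEN];   /* circular array of pages */
} SketchHistory;

static void sketch_history_init(SketchHistory *h)
{
    h->last_idx = 0;
    for (unsigned i = 0; i < SKETCH_HISTORY_LEN; i++) {
        h->slots[i] = SKETCH_RESET_ADDR;
    }
}

/* Record `page`, overwriting the oldest entry. Returns the evicted page,
 * or SKETCH_RESET_ADDR if the slot was still empty; the caller would then
 * clear the EXCL state of the evicted page. */
static uint64_t sketch_history_put(SketchHistory *h, uint64_t page)
{
    uint64_t evicted;

    assert(page != SKETCH_RESET_ADDR);

    h->last_idx = (h->last_idx + 1) % SKETCH_HISTORY_LEN;
    evicted = h->slots[h->last_idx];
    h->slots[h->last_idx] = page;
    return evicted;
}
```

Because a page only leaves the exclusive state when its slot is reused, repeated LL/SC sequences on the same few pages would not toggle the EXCL bit, and so would not trigger a flush, on every iteration.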

Suggested-by: Jani Kokkonen 
Suggested-by: Claudio Fontana 
Signed-off-by: Alvise Rigo 
---
 cputlb.c| 29 +++--
 exec.c  | 19 +++
 include/qom/cpu.h   |  8 
 softmmu_llsc_template.h |  1 +
 vl.c|  3 +++
 5 files changed, 50 insertions(+), 10 deletions(-)

diff --git a/cputlb.c b/cputlb.c
index 06ce2da..f3c4d97 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -395,16 +395,6 @@ void tlb_set_page_with_attrs(CPUState *cpu, target_ulong vaddr,
 env->tlb_v_table[mmu_idx][vidx] = *te;
 env->iotlb_v[mmu_idx][vidx] = env->iotlb[mmu_idx][index];
 
-if (unlikely(!(te->addr_write & TLB_MMIO) && (te->addr_write & TLB_EXCL))) {
-/* We are removing an exclusive entry, set the page to dirty. This
- * is not be necessary if the vCPU has performed both SC and LL. */
-hwaddr hw_addr = (env->iotlb[mmu_idx][index].addr & TARGET_PAGE_MASK) +
-  (te->addr_write & TARGET_PAGE_MASK);
-if (!cpu->ll_sc_context) {
-cpu_physical_memory_unset_excl(hw_addr);
-}
-}
-
 /* refill the tlb */
 env->iotlb[mmu_idx][index].addr = iotlb - vaddr;
 env->iotlb[mmu_idx][index].attrs = attrs;
@@ -517,6 +507,25 @@ static inline bool lookup_and_reset_cpus_ll_addr(hwaddr addr, hwaddr size)
 return ret;
 }
 
+extern CPUExclusiveHistory excl_history;
+static inline void excl_history_put_addr(hwaddr addr)
+{
+hwaddr last;
+
+/* Calculate the index of the next exclusive address */
+excl_history.last_idx = (excl_history.last_idx + 1) % excl_history.length;
+
+last = excl_history.c_array[excl_history.last_idx];
+
+/* Unset EXCL bit of the oldest entry */
+if (last != EXCLUSIVE_RESET_ADDR) {
+cpu_physical_memory_unset_excl(last);
+}
+
+/* Add a new address, overwriting the oldest one */
+excl_history.c_array[excl_history.last_idx] = addr & TARGET_PAGE_MASK;
+}
+
 #define MMUSUFFIX _mmu
 
 /* Generates LoadLink/StoreConditional helpers in softmmu_template.h */
diff --git a/exec.c b/exec.c
index 51f366d..2e123f1 100644
--- a/exec.c
+++ b/exec.c
@@ -177,6 +177,25 @@ struct CPUAddressSpace {
 MemoryListener tcg_as_listener;
 };
 
+/* Exclusive memory support */
+CPUExclusiveHistory excl_history;
+void cpu_exclusive_history_init(void)
+{
+/* Initialize exclusive history for atomic instruction handling. */
+if (tcg_enabled()) {
+g_assert(EXCLUSIVE_HISTORY_CPU_LEN * max_cpus <= UINT16_MAX);
+excl_history.length = EXCLUSIVE_HISTORY_CPU_LEN * max_cpus;
+excl_history.c_array = g_malloc(excl_history.length * sizeof(hwaddr));
+memset(excl_history.c_array, -1, excl_history.length * sizeof(hwaddr));
+}
+}
+
+void cpu_exclusive_history_free(void)
+{
+if (tcg_enabled()) {
+g_free(excl_history.c_array);
+}
+}
 #endif
 
 #if !defined(CONFIG_USER_ONLY)
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 6f6c1c0..0452fd0 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -227,7 +227,15 @@ struct kvm_run;
 #define TB_JMP_CACHE_SIZE (1 << TB_JMP_CACHE_BITS)
 
 /* Atomic insn translation TLB support. */
+typedef struct CPUExclusiveHistory {
+uint16_t last_idx;   /* index of last insertion */
+uint16_t length; /* history's length, it depends on smp_cpus */
+hwaddr *c_array; /* history's circular array */
+} CPUExclusiveHistory;
 #define EXCLUSIVE_RESET_ADDR ULLONG_MAX
+#define EXCLUSIVE_HISTORY_CPU_LEN 256
+void cpu_exclusive_history_init(void);
+void cpu_exclusive_history_free(void);
 
 /**
  * CPUState:
diff --git a/softmmu_llsc_template.h b/softmmu_llsc_template.h
index b4712ba..b4e7f9d 100644
--- a/softmmu_llsc_template.h
+++ b/softmmu_llsc_template.h
@@ -75,6 +75,7 @@ WORD_TYPE helper_ldlink_name(CPUArchState *env, target_ulong addr,
  * to request any flush. */
 if (!cpu_physical_memory_is_excl(hw_addr)) {
 cpu_physical_memory_set_excl(hw_addr);
+excl_history_put_addr(hw_addr);
 CPU_FOREACH(cpu) {
 if (current_cpu != cpu) {
 tlb_flush(cpu, 1);
diff --git a/vl.c b/vl.c
index f043009..b22d99b 100644
--- a/vl.c
+++ b/vl.c
@@ -547,6 +547,7 @@ static void res_free(void)
 {
 g_free(boot_splash_filedata);
 boot_splash_filedata = NULL;
+cpu_exclusive_history_free();
 }
 
 static int default_driver_check(void *opaque, QemuOpts *opts, Error **errp)
@@ -4322,6 +4323,8 @@ int main(int argc, char **argv, char

Re: [Qemu-devel] [PATCH 1/1] arm: virt: change GPIO trigger interrupt to pulse

2016-01-29 Thread Shannon Zhao

Hi,

This makes ACPI work well but breaks DT. The reason is that systemd or
acpid fails to open /dev/input/event0. So the interrupt is injected and
can be seen in /proc/interrupts, but the guest doesn't take any action. I'll
investigate later why the open fails.

On Friday, January 29, 2016, Wei Huang wrote:

> When QEMU is hook'ed up with libvirt/virsh, the first ACPI reboot
> request will succeed; but the following shutdown/reboot requests
> fail to trigger VMs to react. Notice that in mach-virt machine
> model GPIO is defined as edge-triggered and active-high in ACPI.
> This patch changes the behavior of powerdown notifier from PULLUP
> to PULSE. It solves the problem described above (i.e. reboot
> continues to work).
>
> Signed-off-by: Wei Huang
> ---
>  hw/arm/virt.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 05f9087..b5468a9 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -546,7 +546,7 @@ static DeviceState *pl061_dev;
>  static void virt_powerdown_req(Notifier *n, void *opaque)
>  {
>  /* use gpio Pin 3 for power button event */
> -qemu_set_irq(qdev_get_gpio_in(pl061_dev, 3), 1);
> +qemu_irq_pulse(qdev_get_gpio_in(pl061_dev, 3));
>  }
>
>  static Notifier virt_system_powerdown_notifier = {
> --
> 1.8.3.1
>
>
