[Qemu-devel] [Bug 1717708] Re: QEMU aarch64 can't run Windows ARM64 iso's

2017-11-17 Thread Andrew Baumann
I just noticed this bug via qemu-devel. For the record, recent QEMU
(2.11-rc1 and later) can run recent arm64 Windows images using -M virt
and the virtio storage/input/networking drivers in the guest. I was
using older UEFI build that still includes the VGA framebuffer support.
(I only just learned via this bug that it had been removed, but as
PeterM pointed out this isn't the right place to discuss it.)

Here's a sample invocation:
qemu-system-aarch64 -M virt -cpu cortex-a57 -m 4G -smp 4 -bios 
QEMU_EFI-arm64.fd \
  -serial stdio -device VGA -device virtio-keyboard-pci -device 
virtio-mouse-pci \
  -netdev user,id=net0 -device 
virtio-net-pci,disable-legacy=on,netdev=net0,mac=00:11:22:33:44:55,addr=08 \
  -drive if=none,file=myimage.vhdx,id=hd0 -device virtio-blk-pci,drive=hd0

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1717708

Title:
  QEMU aarch64 can't run Windows ARM64 iso's

Status in QEMU:
  New

Bug description:
  Hi,
  recently Windows ARM64 ISOs have been posted on the internet..
  just checked with latest QEMU 2.10 release from 
  https://qemu.weilnetz.de/w64/qemu-w64-setup-20170830.exe 
  "h:\qemu\qemu-system-aarch64.exe" -boot d -cdrom 
h:\iso\16353.1000.170825-1423.RS_PRERELEASE_CLIENTPRO_OEMRET_ARM64FRE_ES-ES.ISO 
-m 2048 -cpu cortex-a57 -smp 1 -machine virt
  seems no video output..
  checked various machine options for example versatilepb (says guest has not 
initialized the guest)..

  so don't know if it's a QEMU bug or lacking feature but can support
  running Windows ARM64 builds (would be nice if you can add a
  Snapdragon835 machine type which is which first machines will be
  running..)

  
  note running a Windows x64 ISO with similar parameters works (removing -cpu 
and -machine as not needed)

  thanks..

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1717708/+subscriptions



Re: [Qemu-devel] [virtio-dev] Re: [PATCH v17 6/6] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_VQ

2017-11-17 Thread Wang, Wei W
On Friday, November 17, 2017 8:45 PM, Michael S. Tsirkin wrote:
> On Fri, Nov 17, 2017 at 07:35:03PM +0800, Wei Wang wrote:
> > On 11/16/2017 09:27 PM, Wei Wang wrote:
> > > On 11/16/2017 04:32 AM, Michael S. Tsirkin wrote:
> > > > On Fri, Nov 03, 2017 at 04:13:06PM +0800, Wei Wang wrote:
> > > > > Negotiation of the VIRTIO_BALLOON_F_FREE_PAGE_VQ feature
> > > > > indicates the support of reporting hints of guest free pages to
> > > > > the host via virtio-balloon. The host requests the guest to
> > > > > report the free pages by sending commands via the virtio-balloon
> configuration registers.
> > > > >
> > > > > When the guest starts to report, the first element added to the
> > > > > free page vq is a sequence id of the start reporting command.
> > > > > The id is given by the host, and it indicates whether the
> > > > > following free pages correspond to the command. For example, the
> > > > > host may stop the report and start again with a new command id.
> > > > > The obsolete pages for the previous start command can be
> > > > > detected by the id dismatching on the host. The id is added to
> > > > > the vq using an output buffer, and the free pages are added to
> > > > > the vq using input buffer.
> > > > >
> > > > > Here are some explainations about the added configuration registers:
> > > > > - host2guest_cmd: a register used by the host to send commands
> > > > > to the guest.
> > > > > - guest2host_cmd: written by the guest to ACK to the host about
> > > > > the commands that have been received. The host will clear the
> > > > > corresponding bits on the host2guest_cmd register. The guest
> > > > > also uses this register to send commands to the host (e.g. when finish
> free page reporting).
> > > > > - free_page_cmd_id: the sequence id of the free page report
> > > > > command given by the host.
> > > > >
> > > > > Signed-off-by: Wei Wang 
> > > > > Signed-off-by: Liang Li 
> > > > > Cc: Michael S. Tsirkin 
> > > > > Cc: Michal Hocko 
> > > > > ---
> > > > >
> > > > > +
> > > > > +static void report_free_page(struct work_struct *work) {
> > > > > +struct virtio_balloon *vb;
> > > > > +
> > > > > +vb = container_of(work, struct virtio_balloon,
> > > > > report_free_page_work);
> > > > > +report_free_page_cmd_id(vb);
> > > > > +walk_free_mem_block(vb, 0, _balloon_send_free_pages);
> > > > > +/*
> > > > > + * The last few free page blocks that were added may not reach 
> > > > > the
> > > > > + * batch size, but need a kick to notify the device to
> > > > > handle them.
> > > > > + */
> > > > > +virtqueue_kick(vb->free_page_vq);
> > > > > +report_free_page_end(vb);
> > > > > +}
> > > > > +
> > > > I think there's an issue here: if pages are poisoned and
> > > > hypervisor subsequently drops them, testing them after allocation
> > > > will trigger a false positive.
> > > >
> > > > The specific configuration:
> > > >
> > > > PAGE_POISONING on
> > > > PAGE_POISONING_NO_SANITY off
> > > > PAGE_POISONING_ZERO off
> > > >
> > > >
> > > > Solutions:
> > > > 1. disable the feature in that configuration
> > > > suggested as an initial step
> > >
> > > Thanks for the finding.
> > > Similar to this option: I'm thinking could we make
> > > walk_free_mem_block() simply return if that option is on?
> > > That is, at the beginning of the function:
> > > if (!page_poisoning_enabled())
> > > return;
> > >
> >
> >
> > Thought about it more, I think it would be better to put this logic to
> > virtio_balloon:
> >
> > send_free_page_cmd_id(vb, >start_cmd_id);
> > if (page_poisoning_enabled() &&
> > !IS_ENABLED(CONFIG_PAGE_POISONING_NO_SANITY))
> > walk_free_mem_block(vb, 0, _balloon_send_free_pages);
> > send_free_page_cmd_id(vb, >stop_cmd_id);
> >
> >
> > walk_free_mem_block() should be a more generic API, and this potential
> > page poisoning issue is specific to live migration which is only one
> > use case of this function, so I think it is better to handle it in the
> > special use case itself.
> >
> > Best,
> > Wei
> >
> 
> It's a quick work-around but it doesn't make me very happy.
> 
> AFAIK e.g. RHEL has a debug kernel with poisoning enabled.
> If this never uses free page hinting at all, it will be much less useful for
> debugging guests.
> 

I understand your concern. I think people who use debugging guests don't regard 
performance as the first priority, and most vendors usually wouldn't use 
debugging guests for their products.

How about taking it as the initial solution? We can exploit more solutions 
after this series is done.

Best,
Wei





Re: [Qemu-devel] [PATCH v8 04/14] block/dirty-bitmap: add bdrv_dirty_bitmap_set_frozen

2017-11-17 Thread John Snow


On 11/17/2017 12:30 PM, Vladimir Sementsov-Ogievskiy wrote:
> 17.11.2017 20:20, John Snow wrote:
>>
>> On 11/17/2017 09:46 AM, Vladimir Sementsov-Ogievskiy wrote:
>>> 14.11.2017 02:32, John Snow wrote:
 On 10/30/2017 12:32 PM, Vladimir Sementsov-Ogievskiy wrote:
> Make it possible to set bitmap 'frozen' without a successor.
> This is needed to protect the bitmap during outgoing bitmap postcopy
> migration.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>    include/block/dirty-bitmap.h |  1 +
>    block/dirty-bitmap.c | 22 --
>    2 files changed, 21 insertions(+), 2 deletions(-)
>
> diff --git a/include/block/dirty-bitmap.h
> b/include/block/dirty-bitmap.h
> index a9e2a92e4f..ae6d697850 100644
> --- a/include/block/dirty-bitmap.h
> +++ b/include/block/dirty-bitmap.h
> @@ -39,6 +39,7 @@ uint32_t
> bdrv_get_default_bitmap_granularity(BlockDriverState *bs);
>    uint32_t bdrv_dirty_bitmap_granularity(const BdrvDirtyBitmap
> *bitmap);
>    bool bdrv_dirty_bitmap_enabled(BdrvDirtyBitmap *bitmap);
>    bool bdrv_dirty_bitmap_frozen(BdrvDirtyBitmap *bitmap);
> +void bdrv_dirty_bitmap_set_frozen(BdrvDirtyBitmap *bitmap, bool
> frozen);
>    const char *bdrv_dirty_bitmap_name(const BdrvDirtyBitmap *bitmap);
>    int64_t bdrv_dirty_bitmap_size(const BdrvDirtyBitmap *bitmap);
>    DirtyBitmapStatus bdrv_dirty_bitmap_status(BdrvDirtyBitmap
> *bitmap);
> diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
> index 7578863aa1..67fc6bd6e0 100644
> --- a/block/dirty-bitmap.c
> +++ b/block/dirty-bitmap.c
> @@ -40,6 +40,8 @@ struct BdrvDirtyBitmap {
>    QemuMutex *mutex;
>    HBitmap *bitmap;    /* Dirty bitmap implementation */
>    HBitmap *meta;  /* Meta dirty bitmap */
> +    bool frozen;    /* Bitmap is frozen, it can't be
> modified
> +   through QMP */
 I hesitate, because this now means that we have two independent bits of
 state we need to update and maintain consistency with.
>>> it was your proposal)))
>>>
>>> https://lists.gnu.org/archive/html/qemu-devel/2017-07/msg01172.html
>>>
>>> "
>>> Your new use case here sounds like Frozen to me, but it simply does not
>>> have an anonymous successor to force it to be recognized as "frozen." We
>>> can add a `bool protected` or `bool frozen` field to force recognition
>>> of this status and adjust the json documentation accordingly.
>>>
>>> I think then we'd have four recognized states:
>>>
>>> FROZEN: Cannot be reset/deleted. Bitmap is in-use by a block job or
>>> other internal process. Bitmap is otherwise ACTIVE.
>>> DISABLED: Not recording any writes (by choice.)
>>> READONLY: Not able to record any writes (by necessity.)
>>> ACTIVE: Normal bitmap status.
>>>
>>> Sound right?
>>> "
>>>
>>>
>> I was afraid you'd say that :(
>>
>> It's okay, anyway. I shouldn't let myself go so long between reviews
>> like this, because you catch me changing my mind. Anyway, please go
>> ahead with it. I don't want to delay you on something that works because
>> I can't make up *my* mind.
> 
> Hm, if you remember, reusing "frozen" state was strange for me too. And
> it's not
> late to move to
> 1. make a new state: MIGRATION, which disallows qmp operations on bitmap
> 2. or just make them unnamed, so they can't be touched by qmp

"Migrating" is fine as a state name. You could probably announce this by
having it be "frozen" in the usual way (it has a successor) and a new
bool that lets you do whatever special handling you need to do for it.

> 
> anything is ok for me as well as leaving it as is. It's all little
> things, the core is patch 10.
> 
> "frozen" sounds like unchanged, but user will see dirty-count changing
> in query-block.

I guess it's a strange misnomer now... or maybe just always was a bad
name, since it's not really "frozen" but rather "locked" in a way that
the QMP user cannot interfere with it -- but it's still a live,
functioning object.

> 

I'm remembering what I was talking about, but I think my preference is
illustrably worse. I was trying to avoid boolean bloat by advocating
re-use, but the re-use is kind of confusing...

I think I was hoping that a bitmap on the receiving end could simply be
"frozen" in the usual way, in that it has a successor recording writes.

I think the way you want to handle it though is with different semantics
for cleanup and recovery which makes it not quite the same state, which
needs either a new bool or some such.

Go with what you think is cleanest at this point, including if you want
to leave it alone. I don't want to cause you to respin it over bikeshedding.

--js



Re: [Qemu-devel] [Qemu-block] [PATCH for-2.11] iotests: Fix 176 on 32-bit host

2017-11-17 Thread John Snow


On 11/17/2017 02:10 PM, Eric Blake wrote:
> On 11/17/2017 01:04 PM, Eric Blake wrote:
>> The contents of a qcow2 bitmap are rounded up to a size that
>> matches the number of bits available for the granularity, but
>> that granularity differs for 32-bit hosts (our default 64k
>> cluster allows for 2M bitmap coverage per 'long') and 64-bit
>> hosts (4M bitmap per 'long').  If the image is a multiple of
>> 2M but not 4M, then the number of bytes occupied by the array
>> of longs in memory differs between architecture, thus
>> resulting in different SHA256 hashes.
>>
>> Furthermore (but untested by me), if our computation of the
>> SHA256 hash is at all endian-dependent because of how we store
>> data in memory, that's another variable we'd have to account
>> for (ideally, we specified the bitmap stored in qcow2 as
>> fixed-endian on disk, because the same qcow2 file must be
>> usable across any architecture; but that says nothing about
>> how we represent things in memory).  But we already have test
>> 165 to validate that bitmaps are stored correctly on disk,
>> while this test is merely testing that the bitmap exists.
>>
>> So for this test, the easiest solution is to filter out the
>> actual hash value.  Broken in commit 4096974e.
> 
> Of course, if Kevin sends a v2 pull, it's probably better to just squash
> this in to my original testsuite change (since a v2 would probably
> invalidate that commit id).
> 

Whichever way you go,

Reviewed-by: John Snow 

I don't know how important it is to nail this down, but the purpose of
this command is primarily for sanity testing and not necessarily between
architectures, so this might just be a limitation to document.

Also, don't use this for anything except testing.

:(



Re: [Qemu-devel] [PATCH for-2.12 0/4] qmp dirty bitmap API

2017-11-17 Thread John Snow


On 11/17/2017 03:22 AM, Vladimir Sementsov-Ogievskiy wrote:
> 17.11.2017 06:10, John Snow wrote:
>>
>> On 11/16/2017 03:17 AM, Vladimir Sementsov-Ogievskiy wrote:
>>> 16.11.2017 00:20, John Snow wrote:
 On 11/13/2017 11:20 AM, Vladimir Sementsov-Ogievskiy wrote:
> Hi all.
>
> There are three qmp commands, needed to implement external backup API.
>
> Using these three commands, client may do all needed bitmap
> management by
> hand:
>
> on backup start we need to do a transaction:
>   {disable old bitmap, create new bitmap}
>
> on backup success:
>   drop old bitmap
>
> on backup fail:
>   enable old bitmap
>   merge new bitmap to old bitmap
>   drop new bitmap
>
 Can you give me an example of how you expect these commands to be used,
 and why they're required?

 I'm a little weary about how error-prone these commands might be and the
 potential for incorrect usage seems... high. Why do we require them?
>>> It is needed for incremental backup. It looks like bad idea to export
>>> abdicate/reclaim functionality, it is simpler
>>> and clearer to allow user to merge/enable/disable bitmaps by hand.
>>>
>>> usage is like this:
>>>
>>> 1. we have dirty bitmap bitmap0 for incremental backup.
>>>
>>> 2. prepare image fleecing (create temporary image with backing=our_disk)
>>> 3. in qmp transaction:
>>>     - disable bitmap0
>>>     - create bitmap1
>>>     - start image fleecing (backup sync=none our_disk -> temp_disk)
>> This could probably just be its own command, though:
>>
>> block-job-fleece node=foobar bitmap=bitmap0 etc=etera etc=etera
>>
>> Could handle forking the bitmap. I'm not sure what the arguments would
>> look like, but we could name the NBD export here, too. (Assuming the
>> server is already started and we just need to create the share.)
>>
>> Then, we can basically do what mirror does:
>>
>> (1) Cancel
>> (2) Complete
>>
>> Cancel would instruct QEMU to keep the bitmap changes (i.e. roll back),
>> and Complete would instruct QEMU to discard the changes.
>>
>> This way we don't need to expose commands like split or merge that will
>> almost always be dangerous to use over QMP.
>>
>> In fact, a fleecing job would be really convenient even without a
>> bitmap, because it'd still be nice to have a convenience command for it.
>> Using an existing infrastructure and understood paradigm is just a bonus.
> 
> 1. If I understand correctly, Kevin and Max said in their report in
> Prague about new block-job approach,
>   using filter nodes, so I'm not sure that this is a good Idea to
> introduce now new old-style block-job, where we can
>   do without it.
> 

We could do without it, but it might be a lot better to have everything
wrapped up in a command that's easy to digest instead of releasing 10
smaller commands that have to be executed in a very specific way in
order to work correctly.

I'm thinking about the complexity of error checking here with all the
smaller commands, versus error checking on a larger workflow we
understand and can quality test better.

I'm not sure that filter nodes becoming the new normal for block jobs
precludes our ability to use the job-management API as a handle for
managing the lifetime of a long-running task like fleecing, but I'll
check with Max and Kevin about this.

> 2. there is the following scenario: customers needs a possibility to
> create a backup of data changed since some
> point in time. So, maintaining several static and one (or several) activ
> bitmaps with a possiblity of merge some of them
> and create a backup using this merged bitmap may be convenient.
> 

I think the ability to copy bitmaps and issue differential backups would
be sufficient in all cases I could think of...

> 3. And I don't see any danger here: we already can create, delete and
> clear dirty bitmaps through qmp. Some additional
> operations should not be bad.
> 

Not bad, just risk of misuse for little benefit, IMO. I'm wary of
raising the external complexity too much. I'd like to keep the API as
simple as I can.

> 4. Interesting: "disabled" field looks like never used..

Ah! That's true. It was in an early draft and Stefan noticed that there
was no reason to let users disable or enable bitmaps in the first
version of the feature set, so those QMP commands (along with copy) got
removed to keep the feature set smaller.

The mechanisms that paid attention to a bitmap being enabled or not
remained behind, though.

> 
>>
>>> 4. export temp_disk (it looks like a snapshot) and bitmap0 through NBD
>>>
>>> === external tool can get bitmap0 and then copy some clusters through
>>> NBD ===
>>>
>>> on successful backup external tool can drop bitmap0. or can save it and
>>> reuse somehow
>>>
>>> on failed backup external tool can do the following:
>>>     - enable bitmap0
>>>     - merge bitmap1 to bitmap0
>>>     - drop bitmap1
>>>
> 
> 
> -- 
> Best regards,
> Vladimir
> 



Re: [Qemu-devel] [PATCH 3/4] multiboot: load elf sections and section headers

2017-11-17 Thread Anatol Pomozov
Hello

On Tue, Oct 31, 2017 at 11:38 AM, Anatol Pomozov
 wrote:
> Hi
>
> On Thu, Oct 19, 2017 at 2:36 AM, Kevin Wolf  wrote:
>> Am 18.10.2017 um 19:22 hat Anatol Pomozov geschrieben:
>>> Hello qemu-devs,
>>>
>>> This patchset has been posted a while ago. I would like to move
>>> forward with it and look at the next task (e.g. multiboot2 support in
>>> QEMU). Who is the multiboot code maintainer who can help to review
>>> this patchset and give go/no-go answer?
>>
>> I think that's exactly your problem, there is nobody who feels
>> responsible for Multiboot support. Officially, it is part of the PC/x86
>> subsystems:
>
> In this case I would like to become the official Qemu/Multiboot
> maintainer. I do development related to x86 osdev and actively use
> qemu for my projects. I am interested in having a feature-rich and
> robust Multiboot implementation.

It is very unfortunate to see lack of interest in improving
qemu/multiboot functionality. I guess my best option for now is to
avoid using qemu multiboot implementation and use grub instead.



Re: [Qemu-devel] [PATCH 02/24] sdl: remove -alt-grab and -ctrl-grab support

2017-11-17 Thread Gerd Hoffmann
On Fri, Nov 17, 2017 at 11:06:12AM -0500, Programmingkid wrote:
> 
> > With absolute pointer devices such as usb-tablet being widely used
> > mouse grabs (for relative pointing devices) should be rarely needed
> > these days.  So the benefit of the options to configure the hotkey
> > modifiers for grab (and other actions) seems questionable.  Which
> > is expecially true for the -ctrl-grab which isn't handled in the
> > handle_keyup() code.
> 
> So does this mean you are against a patch that would allow the user to
> chose a key to act as a mouse ungrab key?

I don't consider it a top priority, and given there I have lots of other
things to do I most likely wouldn't implement that.  But I wouldn't
object if someone comes up with a sane implementation.

What we have right now is just a big mess though, both in code and the
user interface.  First, these are toplevel command line options.  It
belongs to -display though.  Second, it's two hard-coded names for two
hard-coded variations.  So it's not like you can configure much.  On top
of that the names are not exactly intuitive.  And it works with SDL only.

cheers,
  Gerd




Re: [Qemu-devel] [Qemu-ppc] [PATCH for-2.11] spapr: reset DRCs after devices

2017-11-17 Thread Daniel Henrique Barboza



On 11/17/2017 04:19 PM, Michael Roth wrote:

Quoting Daniel Henrique Barboza (2017-11-17 10:21:27)


On 11/17/2017 10:56 AM, Greg Kurz wrote:

A DRC with a pending unplug request releases its associated device at
machine reset time.

In the case of LMB, when all DRCs for a DIMM device have been reset,
the DIMM gets unplugged, causing guest memory to disappear. This may
be very confusing for anything still using this memory.

This is exactly what happens with vhost backends, and QEMU aborts
with:

qemu-system-ppc64: used ring relocated for ring 2
qemu-system-ppc64: qemu/hw/virtio/vhost.c:649: vhost_commit: Assertion
   `r >= 0' failed.

The issue is that each DRC registers a QEMU reset handler, and we
don't control the order in which these handlers are called (ie,
a LMB DRC will unplug a DIMM before the virtio device using the
memory on this DIMM could stop its vhost backend).

To avoid such situations, let's reset DRCs after all devices
have been reset.

Reported-by: Mallesh N. Koti 
Signed-off-by: Greg Kurz 
---

I believe this has to do with the problem discussed at the memory
unplug reboot with vhost=on, right?

I don't see any problem with this patch and it's probably the best solution
short-term for the problem (and potentially other related LMB release
problems), but that said, how hard it is to forbid LMB unplug at all if the
memory is being used in vhost?

The way I see it, the root problem is that a device unplug failed at the
guest
side and we don't know about it, leaving drc->unplug_requested flags set
around the code when it shouldn't. Reading that thread I see a comment from

I'm not sure we want to go down the road of preemptively aborting
specific cases where we think unplug will fail in the guest. We already
need to deal with the cases where we don't know whether they'll fail
ahead of time, so it seems like more of a headache to add special cases
on top of that that management would need to deal with.


Just to be clear, I only advocate this kind of constraint if the LMB unplug,
at this specific scenario, has a 100% chance of fail. Otherwise we're
better of rolling the dice.




And in this particular case we'd never be able to unplug the DIMM if
the guest happens to use it for the vring each boot (unless maybe the
unplug request occured during early boot), even though there's no
technical reason we shouldn't be able to at least handle it on the
next reset.


I am ok going this route if we have our bases covered. Does this
behavior of unplugging the LMBs pending unplug at reset documented
in the PAPR specs? If not, we would need to document it somewhere else
(or ask for a PAPR revision to add it).

Otherwise, I am not sure if management will be OK with LMBs being
"inadvertently" hot unplugged in a reset due to a failed LMB hot unplug that
might have occurred days ago and the user doesn't even remember it
anymore :)


Thanks,


Daniel



Michael S. Tsirkin saying that this bug is somewhat expected, that vhost
backend will
not behave well if memory is going away. Unlike other LMB unplug
failures from
the guest side that might happen for any other reason, this one sees
predicable
and we can avoid it.

I think this is something worth considering. But for now,


Reviewed-by: Daniel Henrique Barboza 


   hw/ppc/spapr.c |   21 +
   hw/ppc/spapr_drc.c |7 ---
   2 files changed, 21 insertions(+), 7 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 6841bd294b3c..6285f7211f9a 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1411,6 +1411,19 @@ static void find_unknown_sysbus_device(SysBusDevice 
*sbdev, void *opaque)
   }
   }

+static int spapr_reset_drcs(Object *child, void *opaque)
+{
+sPAPRDRConnector *drc =
+(sPAPRDRConnector *) object_dynamic_cast(child,
+ TYPE_SPAPR_DR_CONNECTOR);
+
+if (drc) {
+spapr_drc_reset(drc);
+}
+
+return 0;
+}
+
   static void ppc_spapr_reset(void)
   {
   MachineState *machine = MACHINE(qdev_get_machine());
@@ -1434,6 +1447,14 @@ static void ppc_spapr_reset(void)
   }

   qemu_devices_reset();
+
+/* DRC reset may cause a device to be unplugged. This will cause troubles
+ * if this device is used by another device (eg, a running vhost backend
+ * will crash QEMU if the DIMM holding the vring goes away). To avoid such
+ * situations, we reset DRCs after all devices have been reset.
+ */
+object_child_foreach_recursive(object_get_root(), spapr_reset_drcs, NULL);
+
   spapr_clear_pending_events(spapr);

   /*
diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
index 915e9b51c40c..e3b122968e89 100644
--- a/hw/ppc/spapr_drc.c
+++ b/hw/ppc/spapr_drc.c
@@ -455,11 +455,6 @@ void spapr_drc_reset(sPAPRDRConnector *drc)
   }
   }

-static void drc_reset(void *opaque)
-{
-spapr_drc_reset(SPAPR_DR_CONNECTOR(opaque));
-}
-

Re: [Qemu-devel] [PATCH] block/snapshot: dirty all dirty bitmaps on snapshot-switch

2017-11-17 Thread John Snow


On 11/17/2017 01:25 PM, Kevin Wolf wrote:
> Am 17.11.2017 um 19:15 hat John Snow geschrieben:
>>
>>
>> On 11/17/2017 10:01 AM, Max Reitz wrote:
>>> On 2017-11-17 13:30, Kevin Wolf wrote:
 Am 23.10.2017 um 11:29 hat Vladimir Sementsov-Ogievskiy geschrieben:
> Snapshot-switch actually changes active state of disk so it should
> reflect on dirty bitmaps. Otherwise next incremental backup using
> these bitmaps will be invalid.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy 

 We discussed this quite a while ago, and I'm still not convinced that
 this approach makes sense.
>>>
>>> I think it at least makes more sense than not handling this case at all.
>>>
 Can you give just one example of a use case where dirtying the whole
 bitmap while loading a snapshot is the desired behaviour?

 I think the most useful behaviour would be something where the bitmaps
 themselves are snapshotted, too.
>>>
>>> Agreed.
>>>
  But for the time being, the easiest and
 safest solution might just be to error out in any snapshot operations
 if any bitmaps are in use.
>>>
>>> Sounds OK, too.  I personally don't have an opinion either way.
>>>
>>> But in any case, what we did before this patch was definitely wrong so I
>>> consider it an improvement.
>>>
>>
>> This is how I feel about it too. Erroring out entirely is an option, but
>> code-wise just dirtying everything is at least verifiably not-wrong and
>> pretty simple to implement.
> 
> But that's exactly the problem: If something is just plain wrong, you
> can always replace it with something that makes sense. If it errors out,
> you can still remove that error later. But if you have something that
> doesn't make a whole lot of sense, but kinda sorta works (like after
> this patch), it's ABI and you can't implement something more useful
> later any more.
> 
>> It's an improvement... Don't do it, but at least you won't get
>> something wrong after, just something heinously unoptimal.
> 
> It's a short-term improvement that may become a long-term burden.
> 
> Kevin
> 

I see your point.

If we enable it now, we always have to.
If we disable it now, we *can* later if we wish.

...however, I think it's been the case that we haven't prohibited it
before, but also it's a pretty good case that nobody has been using this
feature in production because they're not yet persistent and migratable.

An error message asking the user to delete bitmaps (which will very
obviously invalidate them) would be fine, too. I was erring on the side
of "just let things work," but you have a point that making sure the
user knows that what they're trying to accomplish is not a good idea is
probably better than silently doing the very stupid thing.



Re: [Qemu-devel] [PULL 07/11] cpu-exec: don't overwrite exception_index

2017-11-17 Thread Peter Maydell
On 17 November 2017 at 20:26, Paolo Bonzini  wrote:
> On 17/11/2017 21:07, Peter Maydell wrote:
>> Hi. This commit breaks booting of Debian on aarch64 virt board.
>> (repro instructions for creating the image available at:
>> https://translatedcode.wordpress.com/2017/07/24/installing-debian-on-qemus-64-bit-arm-virt-board/)
>> The guest kernel never prints anything to the serial port.
>>
>> Reverting this commit fixes master for me, so I plan to do
>> that on Monday.
>
> Maybe you can also test moving the atomic_set inside the "if".  It does
> seem to be a genuine bugfix.

No, that doesn't help: guest still sits there like a lemon.

thanks
-- PMM



Re: [Qemu-devel] [PULL 07/11] cpu-exec: don't overwrite exception_index

2017-11-17 Thread Paolo Bonzini
On 17/11/2017 21:07, Peter Maydell wrote:
> On 16 November 2017 at 11:59, Paolo Bonzini  wrote:
>> From: Pavel Dovgalyuk 
>>
>> This patch adds a condition before overwriting exception_index fiels.
>> It is needed when exception_index is already set to some meaningful value.
>>
>> Signed-off-by: Pavel Dovgalyuk 
>>
>> Message-Id: <20171114081812.27640.26372.stgit@pasha-VirtualBox>
>> Signed-off-by: Paolo Bonzini 
>> ---
>>  accel/tcg/cpu-exec.c | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
>> index 61297f8f4a..0473055a08 100644
>> --- a/accel/tcg/cpu-exec.c
>> +++ b/accel/tcg/cpu-exec.c
>> @@ -594,7 +594,9 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
>>  if (unlikely(atomic_read(>exit_request)
>>  || (use_icount && cpu->icount_decr.u16.low + cpu->icount_extra == 
>> 0))) {
>>  atomic_set(>exit_request, 0);
>> -cpu->exception_index = EXCP_INTERRUPT;
>> +if (cpu->exception_index == -1) {
>> +cpu->exception_index = EXCP_INTERRUPT;
>> +}
>>  return true;
>>  }
> 
> Hi. This commit breaks booting of Debian on aarch64 virt board.
> (repro instructions for creating the image available at:
> https://translatedcode.wordpress.com/2017/07/24/installing-debian-on-qemus-64-bit-arm-virt-board/)
> The guest kernel never prints anything to the serial port.
> 
> Reverting this commit fixes master for me, so I plan to do
> that on Monday.

Maybe you can also test moving the atomic_set inside the "if".  It does
seem to be a genuine bugfix.

Paolo



[Qemu-devel] [Bug 1732981] [NEW] usb-net on aarch64 has wrong class IDs, isn't recognized by Windows

2017-11-17 Thread Googulator
Public bug reported:

If I run qemu-system-aarch64 with "-device usb-net,...", the resulting
USB device will be seen in the guest as class 0x2, subclass 0x2,
protocol 0xFF, instead of the expected class 0xe0, subclass 0x1,
protocol 0x3. This prevents Windows' in-box class driver from binding to
the device. On x86 and x64 Windows, this is not an issue, as another
driver will bind to the device, but in Windows for ARM64, that driver is
unavailable, so the USB device is incorrectly recognized as a serial
port.

Installing a modified version of the inbox driver in "Disable Driver
Signature Enforcement" mode solves the issue, but it's not a very clean
solution.

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1732981

Title:
  usb-net on aarch64 has wrong class IDs, isn't recognized by Windows

Status in QEMU:
  New

Bug description:
  If I run qemu-system-aarch64 with "-device usb-net,...", the resulting
  USB device will be seen in the guest as class 0x2, subclass 0x2,
  protocol 0xFF, instead of the expected class 0xe0, subclass 0x1,
  protocol 0x3. This prevents Windows' in-box class driver from binding
  to the device. On x86 and x64 Windows, this is not an issue, as
  another driver will bind to the device, but in Windows for ARM64, that
  driver is unavailable, so the USB device is incorrectly recognized as
  a serial port.

  Installing a modified version of the inbox driver in "Disable Driver
  Signature Enforcement" mode solves the issue, but it's not a very
  clean solution.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1732981/+subscriptions



Re: [Qemu-devel] [PULL 07/11] cpu-exec: don't overwrite exception_index

2017-11-17 Thread Peter Maydell
On 16 November 2017 at 11:59, Paolo Bonzini  wrote:
> From: Pavel Dovgalyuk 
>
> This patch adds a condition before overwriting exception_index fiels.
> It is needed when exception_index is already set to some meaningful value.
>
> Signed-off-by: Pavel Dovgalyuk 
>
> Message-Id: <20171114081812.27640.26372.stgit@pasha-VirtualBox>
> Signed-off-by: Paolo Bonzini 
> ---
>  accel/tcg/cpu-exec.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
> index 61297f8f4a..0473055a08 100644
> --- a/accel/tcg/cpu-exec.c
> +++ b/accel/tcg/cpu-exec.c
> @@ -594,7 +594,9 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
>  if (unlikely(atomic_read(>exit_request)
>  || (use_icount && cpu->icount_decr.u16.low + cpu->icount_extra == 
> 0))) {
>  atomic_set(>exit_request, 0);
> -cpu->exception_index = EXCP_INTERRUPT;
> +if (cpu->exception_index == -1) {
> +cpu->exception_index = EXCP_INTERRUPT;
> +}
>  return true;
>  }

Hi. This commit breaks booting of Debian on aarch64 virt board.
(repro instructions for creating the image available at:
https://translatedcode.wordpress.com/2017/07/24/installing-debian-on-qemus-64-bit-arm-virt-board/)
The guest kernel never prints anything to the serial port.

Reverting this commit fixes master for me, so I plan to do
that on Monday.

thanks
-- PMM



Re: [Qemu-devel] [PATCH 03/15] sun4u: move ISABus inside of EBusState

2017-11-17 Thread Artyom Tarasenko
On Fri, Nov 17, 2017 at 4:46 PM, Mark Cave-Ayland
 wrote:
> On 17/11/17 14:53, Artyom Tarasenko wrote:
>
>> On Fri, Nov 17, 2017 at 2:42 PM, Mark Cave-Ayland
>>  wrote:
>>> Since the EBus is effectively a PCI-ISA bridge then the underlying ISA bus
>>> should be contained within the PCI bridge itself.
>>
>> While it's like that on the Sabre chipset, the Spitfire chipset (which
>> I hope to add at some point) has the EBus, but no PCI, so maybe it's
>> better to model it separately.
>> On the other hand, the Spitfire has different EBus devices
>> (particularly different type of the serial ports), so I'm not sure.
>
> Oh I didn't realise you had more plans in this area :)  Any idea when
> you'll be able to work on the them?

After I make AIX boot. :-)

> TBH as you probably already know,
> even the patchset in its current form with the ISA bus encapsulation is
> so much better than what is already there, so I'd prefer to merge it and
> help you work through any problems later unless you feel particularly
> strongly?

Ok, let's do it.

Reviewed-by: Artyom Tarasenko 

-- 
Regards,
Artyom Tarasenko

SPARC and PPC PReP under qemu blog: http://tyom.blogspot.com/search/label/qemu



Re: [Qemu-devel] [PATCH 01/15] apb: move QOM macros and typedefs from apb.c to apb.h

2017-11-17 Thread Artyom Tarasenko
On Fri, Nov 17, 2017 at 4:40 PM, Mark Cave-Ayland
 wrote:
> On 17/11/17 14:24, Artyom Tarasenko wrote:
>
>> Hi Mark,
>>
>> On Fri, Nov 17, 2017 at 2:42 PM, Mark Cave-Ayland
>>  wrote:
>>> This also includes the related IOMMUState typedef and defines.
>>>
>>> Signed-off-by: Mark Cave-Ayland 
>>> ---
>>>  hw/pci-host/apb.c |   85 
>>> 
>>>  include/hw/pci-host/apb.h |   86 
>>> +
>>>  2 files changed, 86 insertions(+), 85 deletions(-)
>>>
>>> diff --git a/hw/pci-host/apb.c b/hw/pci-host/apb.c
>>> index 64025cd..f743a4e 100644
>>> --- a/hw/pci-host/apb.c
>>> +++ b/hw/pci-host/apb.c
>>> @@ -82,91 +82,6 @@ do { printf("IOMMU: " fmt , ## __VA_ARGS__); } while (0)
>>>  #define MAX_IVEC 0x40
>>>  #define NO_IRQ_REQUEST (MAX_IVEC + 1)
>>>
>>> -#define IOMMU_PAGE_SIZE_8K  (1ULL << 13)
>>> -#define IOMMU_PAGE_MASK_8K  (~(IOMMU_PAGE_SIZE_8K - 1))
>>> -#define IOMMU_PAGE_SIZE_64K (1ULL << 16)
>>> -#define IOMMU_PAGE_MASK_64K (~(IOMMU_PAGE_SIZE_64K - 1))
>>> -
>>> -#define IOMMU_NREGS 3
>>> -
>>> -#define IOMMU_CTRL  0x0
>>> -#define IOMMU_CTRL_TBW_SIZE (1ULL << 2)
>>> -#define IOMMU_CTRL_MMU_EN   (1ULL)
>>> -
>>> -#define IOMMU_CTRL_TSB_SHIFT16
>>> -
>>> -#define IOMMU_BASE  0x8
>>> -#define IOMMU_FLUSH 0x10
>>> -
>>> -#define IOMMU_TTE_DATA_V(1ULL << 63)
>>> -#define IOMMU_TTE_DATA_SIZE (1ULL << 61)
>>> -#define IOMMU_TTE_DATA_W(1ULL << 1)
>>> -
>>> -#define IOMMU_TTE_PHYS_MASK_8K  0x1ffe000ULL
>>> -#define IOMMU_TTE_PHYS_MASK_64K 0x1ff8000ULL
>>> -
>>> -#define IOMMU_TSB_8K_OFFSET_MASK_8M0x007fe000ULL
>>> -#define IOMMU_TSB_8K_OFFSET_MASK_16M   0x00ffe000ULL
>>> -#define IOMMU_TSB_8K_OFFSET_MASK_32M   0x01ffe000ULL
>>> -#define IOMMU_TSB_8K_OFFSET_MASK_64M   0x03ffe000ULL
>>> -#define IOMMU_TSB_8K_OFFSET_MASK_128M  0x07ffe000ULL
>>> -#define IOMMU_TSB_8K_OFFSET_MASK_256M  0x0fffe000ULL
>>> -#define IOMMU_TSB_8K_OFFSET_MASK_512M  0x1fffe000ULL
>>> -#define IOMMU_TSB_8K_OFFSET_MASK_1G0x3fffe000ULL
>>> -
>>> -#define IOMMU_TSB_64K_OFFSET_MASK_64M  0x03ffULL
>>> -#define IOMMU_TSB_64K_OFFSET_MASK_128M 0x07ffULL
>>> -#define IOMMU_TSB_64K_OFFSET_MASK_256M 0x0fffULL
>>> -#define IOMMU_TSB_64K_OFFSET_MASK_512M 0x1fffULL
>>> -#define IOMMU_TSB_64K_OFFSET_MASK_1G   0x3fffULL
>>> -#define IOMMU_TSB_64K_OFFSET_MASK_2G   0x7fffULL
>>> -
>>> -typedef struct IOMMUState {
>>> -AddressSpace iommu_as;
>>> -IOMMUMemoryRegion iommu;
>>> -
>>> -uint64_t regs[IOMMU_NREGS];
>>> -} IOMMUState;
>>> -
>>> -#define TYPE_APB "pbm"
>>> -
>>> -#define APB_DEVICE(obj) \
>>> -OBJECT_CHECK(APBState, (obj), TYPE_APB)
>>> -
>>> -#define TYPE_APB_IOMMU_MEMORY_REGION "pbm-iommu-memory-region"
>>> -
>>> -typedef struct APBState {
>>> -PCIHostState parent_obj;
>>> -
>>> -MemoryRegion apb_config;
>>> -MemoryRegion pci_config;
>>> -MemoryRegion pci_mmio;
>>> -MemoryRegion pci_ioport;
>>> -uint64_t pci_irq_in;
>>> -IOMMUState iommu;
>>> -uint32_t pci_control[16];
>>> -uint32_t pci_irq_map[8];
>>> -uint32_t pci_err_irq_map[4];
>>> -uint32_t obio_irq_map[32];
>>> -qemu_irq *pbm_irqs;
>>> -qemu_irq *ivec_irqs;
>>> -unsigned int irq_request;
>>> -uint32_t reset_control;
>>> -unsigned int nr_resets;
>>> -} APBState;
>>> -
>>> -#define TYPE_PBM_PCI_BRIDGE "pbm-bridge"
>>> -#define PBM_PCI_BRIDGE(obj) \
>>> -OBJECT_CHECK(PBMPCIBridge, (obj), TYPE_PBM_PCI_BRIDGE)
>>> -
>>> -typedef struct PBMPCIBridge {
>>> -/*< private >*/
>>> -PCIBridge parent_obj;
>>> -
>>> -/* Is this busA with in-built devices (ebus)? */
>>> -bool busA;
>>> -} PBMPCIBridge;
>>> -
>>>  static inline void pbm_set_request(APBState *s, unsigned int irq_num)
>>>  {
>>>  APB_DPRINTF("%s: request irq %d\n", __func__, irq_num);
>>> diff --git a/include/hw/pci-host/apb.h b/include/hw/pci-host/apb.h
>>> index b19bd55..5d39c03 100644
>>> --- a/include/hw/pci-host/apb.h
>>> +++ b/include/hw/pci-host/apb.h
>>> @@ -2,6 +2,92 @@
>>>  #define PCI_HOST_APB_H
>>>
>>>  #include "qemu-common.h"
>>> +#include "hw/pci/pci_host.h"
>>> +
>>> +#define IOMMU_NREGS 3
>>> +
>>> +#define IOMMU_PAGE_SIZE_8K  (1ULL << 13)
>>> +#define IOMMU_PAGE_MASK_8K  (~(IOMMU_PAGE_SIZE_8K - 1))
>>> +#define IOMMU_PAGE_SIZE_64K (1ULL << 16)
>>> +#define IOMMU_PAGE_MASK_64K (~(IOMMU_PAGE_SIZE_64K - 1))
>>> +
>>> +#define IOMMU_CTRL  0x0
>>> +#define IOMMU_CTRL_TBW_SIZE (1ULL << 2)
>>> +#define IOMMU_CTRL_MMU_EN   (1ULL)
>>
>> While at it, I think the naming for the model-specific constants
>> should be more explicit.
>> How about US2I_IOMMU_ or SABRE_IOMMU_?
>

Re: [Qemu-devel] [PATCH v4 for-2.11] QAPI & interop: Clarify events emitted by 'block-job-cancel'

2017-11-17 Thread Kashyap Chamarthy
On Fri, Nov 17, 2017 at 07:54:44PM +0100, Kashyap Chamarthy wrote:
> When you cancel an in-progress 'mirror' job (or "active `block-commit`)

When applying, can the maintainer please fix-up the missing closing
double-quote in the brackets above?  It should be: "active
`block-commit`"

[...]

-- 
/kashyap



[Qemu-devel] [Bug 1717708] Re: QEMU aarch64 can't run Windows ARM64 iso's

2017-11-17 Thread Laszlo Ersek (Red Hat)
(1) See also .

(2) One idea is (a) letting Windows use VirtioGpuDxe's GOP (blt only)
until it calls ExitBootServices() -- which I have already confirmed to
work with an ARM64 Windows Server variant that I was legally given
access to --, and (b) once available, injecting a native ARM64 virtio-
gpu-pci driver into the installer image, which Windows could start using
right after ExitBootServices().

(3) BTW, Gerd recently posted

[Qemu-devel] [PATCH 0/4] ramfb: simple boot framebuffer, no legacy vga
http://lists.nongnu.org/archive/html/qemu-devel/2017-11/msg03299.html

which could be an alternative. (Albeit -- in my opinion! -- inferior to
(2).)

(4) In the above-referenced article at , Microsoft write, "Microsoft welcomes feedback and
comments from implementers on this set of requirements". That sounds
awesome: for a change, how about we stop reverse engineering Windows
boot (off of lawfully obtained media of course) and engage in a
conversation. Writing VirtioGpuDxe wasn't trivial, and I'm not amused at
the thought of (a) it being a waste, (b) struggling with more and more
hacks / custom devices & drivers until one set happens to work. We
should stop fabricating one-sided hacks for placating Windows.

(5) Can we please stop making references to leaked or otherwise
unlawfully obtained Windows media, and warez sites. In the upstream QEMU
bug tracker no less. Thanks.

** Bug watch added: bugzilla.tianocore.org/ #785
   https://bugzilla.tianocore.org/show_bug.cgi?id=785

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1717708

Title:
  QEMU aarch64 can't run Windows ARM64 iso's

Status in QEMU:
  New

Bug description:
  Hi,
  recently Windows ARM64 ISOs have been posted on the internet..
  just checked with latest QEMU 2.10 release from 
  https://qemu.weilnetz.de/w64/qemu-w64-setup-20170830.exe 
  "h:\qemu\qemu-system-aarch64.exe" -boot d -cdrom 
h:\iso\16353.1000.170825-1423.RS_PRERELEASE_CLIENTPRO_OEMRET_ARM64FRE_ES-ES.ISO 
-m 2048 -cpu cortex-a57 -smp 1 -machine virt
  seems no video output..
  checked various machine options for example versatilepb (says guest has not 
initialized the guest)..

  so don't know if it's a QEMU bug or lacking feature but can support
  running Windows ARM64 builds (would be nice if you can add a
  Snapdragon835 machine type which is which first machines will be
  running..)

  
  note running a Windows x64 ISO with similar parameters works (removing -cpu 
and -machine as not needed)

  thanks..

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1717708/+subscriptions



[Qemu-devel] [Bug 1717708] Re: QEMU aarch64 can't run Windows ARM64 iso's

2017-11-17 Thread Laszlo Ersek (Red Hat)
Sigh, in bullet (1) of comment #10, I obviously didn't mean to link this
same LP entry. Instead I meant
. Sorry.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1717708

Title:
  QEMU aarch64 can't run Windows ARM64 iso's

Status in QEMU:
  New

Bug description:
  Hi,
  recently Windows ARM64 ISOs have been posted on the internet..
  just checked with latest QEMU 2.10 release from 
  https://qemu.weilnetz.de/w64/qemu-w64-setup-20170830.exe 
  "h:\qemu\qemu-system-aarch64.exe" -boot d -cdrom 
h:\iso\16353.1000.170825-1423.RS_PRERELEASE_CLIENTPRO_OEMRET_ARM64FRE_ES-ES.ISO 
-m 2048 -cpu cortex-a57 -smp 1 -machine virt
  seems no video output..
  checked various machine options for example versatilepb (says guest has not 
initialized the guest)..

  so don't know if it's a QEMU bug or lacking feature but can support
  running Windows ARM64 builds (would be nice if you can add a
  Snapdragon835 machine type which is which first machines will be
  running..)

  
  note running a Windows x64 ISO with similar parameters works (removing -cpu 
and -machine as not needed)

  thanks..

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1717708/+subscriptions



Re: [Qemu-devel] [PATCH 15/15] sun4u: switch from EBUS_DPRINTF() macro to trace-events

2017-11-17 Thread Philippe Mathieu-Daudé
On 11/17/2017 10:42 AM, Mark Cave-Ayland wrote:
> Signed-off-by: Mark Cave-Ayland 

Reviewed-by: Philippe Mathieu-Daudé 

> ---
>  hw/sparc64/sun4u.c  |   12 ++--
>  hw/sparc64/trace-events |3 +++
>  2 files changed, 5 insertions(+), 10 deletions(-)
> 
> diff --git a/hw/sparc64/sun4u.c b/hw/sparc64/sun4u.c
> index da386d3..50ccf75 100644
> --- a/hw/sparc64/sun4u.c
> +++ b/hw/sparc64/sun4u.c
> @@ -47,17 +47,9 @@
>  #include "hw/ide/pci.h"
>  #include "hw/loader.h"
>  #include "elf.h"
> +#include "trace.h"
>  #include "qemu/cutils.h"
>  
> -//#define DEBUG_EBUS
> -
> -#ifdef DEBUG_EBUS
> -#define EBUS_DPRINTF(fmt, ...)  \
> -do { printf("EBUS: " fmt , ## __VA_ARGS__); } while (0)
> -#else
> -#define EBUS_DPRINTF(fmt, ...)
> -#endif
> -
>  #define KERNEL_LOAD_ADDR 0x00404000
>  #define CMDLINE_ADDR 0x003ff000
>  #define PROM_SIZE_MAX(4 * 1024 * 1024)
> @@ -218,7 +210,7 @@ static void ebus_isa_irq_handler(void *opaque, int n, int 
> level)
>  qemu_irq irq = s->isa_bus_irqs[n];
>  
>  /* Pass ISA bus IRQs onto their gpio equivalent */
> -EBUS_DPRINTF("Set ISA IRQ %d level %d\n", n, level);
> +trace_ebus_isa_irq_handler(n, level);
>  if (irq) {
>  qemu_set_irq(irq, level);
>  }
> diff --git a/hw/sparc64/trace-events b/hw/sparc64/trace-events
> index 9284b1f..04d80b7 100644
> --- a/hw/sparc64/trace-events
> +++ b/hw/sparc64/trace-events
> @@ -1 +1,4 @@
>  # See docs/devel/tracing.txt for syntax documentation.
> +
> +# hw/sparc64/sun4u.c
> +ebus_isa_irq_handler(int n, int level) "Set ISA IRQ %d level %d"
> 



Re: [Qemu-devel] [PATCH for-2.11] iotests: Fix 176 on 32-bit host

2017-11-17 Thread Eric Blake
On 11/17/2017 01:04 PM, Eric Blake wrote:
> The contents of a qcow2 bitmap are rounded up to a size that
> matches the number of bits available for the granularity, but
> that granularity differs for 32-bit hosts (our default 64k
> cluster allows for 2M bitmap coverage per 'long') and 64-bit
> hosts (4M bitmap per 'long').  If the image is a multiple of
> 2M but not 4M, then the number of bytes occupied by the array
> of longs in memory differs between architecture, thus
> resulting in different SHA256 hashes.
> 
> Furthermore (but untested by me), if our computation of the
> SHA256 hash is at all endian-dependent because of how we store
> data in memory, that's another variable we'd have to account
> for (ideally, we specified the bitmap stored in qcow2 as
> fixed-endian on disk, because the same qcow2 file must be
> usable across any architecture; but that says nothing about
> how we represent things in memory).  But we already have test
> 165 to validate that bitmaps are stored correctly on disk,
> while this test is merely testing that the bitmap exists.
> 
> So for this test, the easiest solution is to filter out the
> actual hash value.  Broken in commit 4096974e.

Of course, if Kevin sends a v2 pull, it's probably better to just squash
this in to my original testsuite change (since a v2 would probably
invalidate that commit id).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 07/15] apb: return APBState from pci_apb_init() rather then PCIBus

2017-11-17 Thread Philippe Mathieu-Daudé
"rather then PCIBus" -> "rather than PCIBus"

On 11/17/2017 10:42 AM, Mark Cave-Ayland wrote:
> This is a first step towards removing pci_apb_init() completely.
> 
> Signed-off-by: Mark Cave-Ayland 

Reviewed-by: Philippe Mathieu-Daudé 

> ---
>  hw/pci-host/apb.c |8 
>  hw/sparc64/sun4u.c|6 --
>  include/hw/pci-host/apb.h |6 +++---
>  3 files changed, 11 insertions(+), 9 deletions(-)
> 
> diff --git a/hw/pci-host/apb.c b/hw/pci-host/apb.c
> index c7837ef..0c70999 100644
> --- a/hw/pci-host/apb.c
> +++ b/hw/pci-host/apb.c
> @@ -612,9 +612,9 @@ static void apb_pci_bridge_realize(PCIDevice *dev, Error 
> **errp)
>  pci_bridge_update_mappings(PCI_BRIDGE(br));
>  }
>  
> -PCIBus *pci_apb_init(hwaddr special_base,
> - hwaddr mem_base,
> - qemu_irq *ivec_irqs, PCIBus **busA, PCIBus **busB)
> +APBState *pci_apb_init(hwaddr special_base,
> +   hwaddr mem_base,
> +   qemu_irq *ivec_irqs, PCIBus **busA, PCIBus **busB)
>  {
>  DeviceState *dev;
>  SysBusDevice *s;
> @@ -675,7 +675,7 @@ PCIBus *pci_apb_init(hwaddr special_base,
>  qdev_init_nofail(_dev->qdev);
>  *busA = pci_bridge_get_sec_bus(br);
>  
> -return phb->bus;
> +return d;
>  }
>  
>  static void pci_pbm_reset(DeviceState *d)
> diff --git a/hw/sparc64/sun4u.c b/hw/sparc64/sun4u.c
> index b441f1e..a64ddc5 100644
> --- a/hw/sparc64/sun4u.c
> +++ b/hw/sparc64/sun4u.c
> @@ -482,6 +482,7 @@ static void sun4uv_init(MemoryRegion *address_space_mem,
>  Nvram *nvram;
>  unsigned int i;
>  uint64_t initrd_addr, initrd_size, kernel_addr, kernel_size, 
> kernel_entry;
> +APBState *apb;
>  PCIBus *pci_bus, *pci_busA, *pci_busB;
>  PCIDevice *ebus, *pci_dev;
>  SysBusDevice *s;
> @@ -502,8 +503,9 @@ static void sun4uv_init(MemoryRegion *address_space_mem,
>  prom_init(hwdef->prom_addr, bios_name);
>  
>  ivec_irqs = qemu_allocate_irqs(sparc64_cpu_set_ivec_irq, cpu, IVEC_MAX);
> -pci_bus = pci_apb_init(APB_SPECIAL_BASE, APB_MEM_BASE, ivec_irqs, 
> _busA,
> -   _busB);
> +apb = pci_apb_init(APB_SPECIAL_BASE, APB_MEM_BASE, ivec_irqs, _busA,
> +   _busB);
> +pci_bus = PCI_HOST_BRIDGE(apb)->bus;
>  
>  /* Only in-built Simba PBMs can exist on the root bus, slot 0 on busA is
> reserved (leaving no slots free after on-board devices) however slots
> diff --git a/include/hw/pci-host/apb.h b/include/hw/pci-host/apb.h
> index 35d7d5a..a4ef51a 100644
> --- a/include/hw/pci-host/apb.h
> +++ b/include/hw/pci-host/apb.h
> @@ -89,7 +89,7 @@ typedef struct PBMPCIBridge {
>  #define PBM_PCI_BRIDGE(obj) \
>  OBJECT_CHECK(PBMPCIBridge, (obj), TYPE_PBM_PCI_BRIDGE)
>  
> -PCIBus *pci_apb_init(hwaddr special_base,
> - hwaddr mem_base,
> - qemu_irq *ivec_irqs, PCIBus **bus2, PCIBus **bus3);
> +APBState *pci_apb_init(hwaddr special_base,
> +   hwaddr mem_base,
> +   qemu_irq *ivec_irqs, PCIBus **bus2, PCIBus **bus3);
>  #endif
> 



Re: [Qemu-devel] [PULL 0/4] NBD patches for 2.11-rc2

2017-11-17 Thread Peter Maydell
On 17 November 2017 at 15:05, Eric Blake <ebl...@redhat.com> wrote:
> The following changes since commit fec035a53fa15c4c8c4e62bfef56a35df4161e38:
>
>   Merge remote-tracking branch 'remotes/kraxel/tags/ui-20171117-pull-request' 
> into staging (2017-11-17 10:18:41 +)
>
> are available in the git repository at:
>
>   git://repo.or.cz/qemu/ericb.git tags/pull-nbd-2017-11-17
>
> for you to fetch changes up to fed5f8f82056c9f222433c41aeb9fca50c89f297:
>
>   nbd/server: Fix error reporting for bad requests (2017-11-17 08:38:38 -0600)
>
> 
> nbd patches for 2017-11-17
>
> Eric Blake - nbd: Don't crash when server reports NBD_CMD_READ failure
> Eric Blake - nbd/client: Use error_prepend() correctly
> Eric Blake - nbd/client: Don't hard-disconnect on ESHUTDOWN from server
> Eric Blake - nbd/server: Fix error reporting for bad requests
>
> 

Applied, thanks.

-- PMM



[Qemu-devel] [PATCH for-2.11] iotests: Fix 176 on 32-bit host

2017-11-17 Thread Eric Blake
The contents of a qcow2 bitmap are rounded up to a size that
matches the number of bits available for the granularity, but
that granularity differs for 32-bit hosts (our default 64k
cluster allows for 2M bitmap coverage per 'long') and 64-bit
hosts (4M bitmap per 'long').  If the image is a multiple of
2M but not 4M, then the number of bytes occupied by the array
of longs in memory differs between architecture, thus
resulting in different SHA256 hashes.

Furthermore (but untested by me), if our computation of the
SHA256 hash is at all endian-dependent because of how we store
data in memory, that's another variable we'd have to account
for (ideally, we specified the bitmap stored in qcow2 as
fixed-endian on disk, because the same qcow2 file must be
usable across any architecture; but that says nothing about
how we represent things in memory).  But we already have test
165 to validate that bitmaps are stored correctly on disk,
while this test is merely testing that the bitmap exists.

So for this test, the easiest solution is to filter out the
actual hash value.  Broken in commit 4096974e.

Reported-by: Max Reitz 
Signed-off-by: Eric Blake 
---
 tests/qemu-iotests/176 | 3 ++-
 tests/qemu-iotests/176.out | 8 
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/tests/qemu-iotests/176 b/tests/qemu-iotests/176
index 0f31a20294..b8dc17c592 100755
--- a/tests/qemu-iotests/176
+++ b/tests/qemu-iotests/176
@@ -52,7 +52,8 @@ _supported_os Linux
 function run_qemu()
 {
 $QEMU -nographic -qmp stdio -serial none "$@" 2>&1 \
-   | _filter_testdir | _filter_qmp | _filter_qemu
+   | _filter_testdir | _filter_qmp | _filter_qemu \
+| sed 's/"sha256": ".\{64\}"/"sha256": HASH/'
 }

 for reason in snapshot bitmap; do
diff --git a/tests/qemu-iotests/176.out b/tests/qemu-iotests/176.out
index e62085cd0a..f03a2e776c 100644
--- a/tests/qemu-iotests/176.out
+++ b/tests/qemu-iotests/176.out
@@ -205,7 +205,7 @@ Offset  Length  File
 QMP_VERSION
 {"return": {}}
 {"return": {}}
-{"return": {"sha256": 
"e12600978d86b5a453861ae5c17d275204673fef3874b7c3c5433c6153d84706"}}
+{"return": {"sha256": HASH}}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
"SHUTDOWN", "data": {"guest": false}}

@@ -255,7 +255,7 @@ Offset  Length  File
 QMP_VERSION
 {"return": {}}
 {"return": {}}
-{"return": {"sha256": 
"e12600978d86b5a453861ae5c17d275204673fef3874b7c3c5433c6153d84706"}}
+{"return": {"sha256": HASH}}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
"SHUTDOWN", "data": {"guest": false}}

@@ -305,7 +305,7 @@ Offset  Length  File
 QMP_VERSION
 {"return": {}}
 {"return": {}}
-{"return": {"sha256": 
"e12600978d86b5a453861ae5c17d275204673fef3874b7c3c5433c6153d84706"}}
+{"return": {"sha256": HASH}}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
"SHUTDOWN", "data": {"guest": false}}

@@ -352,7 +352,7 @@ Offset  Length  File
 QMP_VERSION
 {"return": {}}
 {"return": {}}
-{"return": {"sha256": 
"e12600978d86b5a453861ae5c17d275204673fef3874b7c3c5433c6153d84706"}}
+{"return": {"sha256": HASH}}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
"SHUTDOWN", "data": {"guest": false}}
 *** done
-- 
2.13.6




Re: [Qemu-devel] [PATCH for-2.11] qcow2: fix image corruption on commit with persistent snapshot

2017-11-17 Thread Eric Blake
On 11/17/2017 12:46 PM, Eric Blake wrote:

>>
>> The test fails with -m32 and probably also on Big Endian architectures,
>> because the bitmap hash differs.
> 
> Oh my.  Thanks for the rapid testing.
> 
>> We could "fix" this by replacing the 2100 by a 2102, so for both bit
>> widths rounding is the same.

Actually, thinking about it more, are we sure that a .qcow2 image
created on an -m32 host can be properly loaded on a 64-bit host, despite
their different rounded-up sizes for the bitmaps?  If the difference in
hash is due only to the in-memory representation, but the on-disk layout
is identical, then we are safe; but if the difference in hash is visible
on-disk, then we have a real bug in our qcow2 specification or
implementation of persistent bitmaps.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


[Qemu-devel] [PATCH v4 for-2.11] QAPI & interop: Clarify events emitted by 'block-job-cancel'

2017-11-17 Thread Kashyap Chamarthy
When you cancel an in-progress 'mirror' job (or "active `block-commit`)
with QMP `block-job-cancel`, it emits the event: BLOCK_JOB_CANCELLED.
However, when `block-job-cancel` is issued *after* `drive-mirror` has
indicated (via the event BLOCK_JOB_READY) that the source and
destination have reached synchronization:

[...] # Snip `drive-mirror` invocation & outputs
{
  "execute":"block-job-cancel",
  "arguments":{
"device":"virtio0"
  }
}

{"return": {}}

It (`block-job-cancel`) will counterintuitively emit the event
'BLOCK_JOB_COMPLETED':

{
  "timestamp":{
"seconds":1510678024,
"microseconds":526240
  },
  "event":"BLOCK_JOB_COMPLETED",
  "data":{
"device":"virtio0",
"len":41126400,
"offset":41126400,
"speed":0,
"type":"mirror"
  }
}

But this is expected behaviour, where the _COMPLETED event indicates
that synchronization has successfully ended (and the destination now has
a point-in-time copy, which is at the time of cancel).

So add a small note to this effect in 'block-core.json'.  While at it,
also update the "Live disk synchronization -- drive-mirror and
blockdev-mirror" section in 'live-block-operations.rst'.

(Thanks: Max Reitz for reminding me of this caveat on IRC.)

Signed-off-by: Kashyap Chamarthy 
Reviewed-by: Eric Blake 
---
Changes in v4:
 - Make the wording nicer [Eric Blake]

Changes in v3:
 - Fix / rewrite the section "Live disk synchronization -- drive-mirror
   and blockdev-mirror" to note this gotcha [John Snow]

Changes in v2:
 - "Note:" seems to be a special construct in Patchew, my usage caused a
build failure.  So do: s/Note:/Note that/
 - Add the missing 'Signed-off-by'
---
 docs/interop/live-block-operations.rst | 50 ++
 qapi/block-core.json   |  6 
 2 files changed, 38 insertions(+), 18 deletions(-)

diff --git a/docs/interop/live-block-operations.rst 
b/docs/interop/live-block-operations.rst
index 5f01797..e811c95 100644
--- a/docs/interop/live-block-operations.rst
+++ b/docs/interop/live-block-operations.rst
@@ -506,26 +506,40 @@ Again, given our familiar disk image chain::
 
 [A] <-- [B] <-- [C] <-- [D]
 
-The ``drive-mirror`` (and its newer equivalent ``blockdev-mirror``) allows
-you to copy data from the entire chain into a single target image (which
-can be located on a different host).
-
-Once a 'mirror' job has started, there are two possible actions while a
-``drive-mirror`` job is active:
-
-(1) Issuing the command ``block-job-cancel`` after it emits the event
-``BLOCK_JOB_CANCELLED``: will (after completing synchronization of
-the content from the disk image chain to the target image, [E])
-create a point-in-time (which is at the time of *triggering* the
-cancel command) copy, contained in image [E], of the the entire disk
+The ``drive-mirror`` (and its newer equivalent ``blockdev-mirror``)
+allows you to copy data from the entire chain into a single target image
+(which can be located on a different host), [E].
+
+.. note::
+
+When you cancel an in-progress 'mirror' job *before* the source and
+target are synchronized, ``block-job-cancel`` will emit the event
+``BLOCK_JOB_CANCELLED``.  However, note that if you cancel a
+'mirror' job *after* it has indicated (via the event
+``BLOCK_JOB_READY``) that the source and target have reached
+synchronization, then the event emitted by ``block-job-cancel``
+changes to ``BLOCK_JOB_COMPLETED``.
+
+Besides the 'mirror' job, the "active ``block-commit``" is the only
+other block device job that emits the event ``BLOCK_JOB_READY``.
+The rest of the block device jobs ('stream', "non-active
+``block-commit``", and 'backup') end automatically.
+
+So there are two possible actions to take, after a 'mirror' job has
+emitted the event ``BLOCK_JOB_READY``, indicating that the source and
+target have reached synchronization:
+
+(1) Issuing the command ``block-job-cancel`` (after it emits the event
+``BLOCK_JOB_COMPLETED``) will create a point-in-time (which is at
+the time of *triggering* the cancel command) copy of the entire disk
 image chain (or only the top-most image, depending on the ``sync``
-mode).
+mode), contained in the target image [E].
 
-(2) Issuing the command ``block-job-complete`` after it emits the event
-``BLOCK_JOB_COMPLETED``: will, after completing synchronization of
-the content, adjust the guest device (i.e. live QEMU) to point to
-the target image, and, causing all the new writes from this point on
-to happen there.  One use case for this is live storage migration.
+(2) Issuing the command ``block-job-complete`` (after it emits the event
+``BLOCK_JOB_COMPLETED``) will adjust the guest device (i.e. live
+QEMU) to point to the target image, [E], causing all the new writes
+from this point on 

Re: [Qemu-devel] [PATCH for-2.11] qcow2: fix image corruption on commit with persistent snapshot

2017-11-17 Thread Eric Blake
On 11/17/2017 12:17 PM, Max Reitz wrote:
> On 2017-11-17 17:47, Eric Blake wrote:
>> If an image contains persistent snapshots, we cannot use the
>> fast path of bdrv_make_empty() to clear the image during
>> qemu-img commit, because that will lose the clusters related
>> to the bitmaps.
>>
>> Also leave a comment in qcow2_read_extensions to remind future
>> feature additions to think about fast-path removal, since we
>> just barely fixed the same bug for LUKS encryption.
>>
>> It's a pain that qemu-img has not yet been taught to manipulate,
>> or even at a very minimum display, information about persistent
>> bitmaps; instead, we have to use QMP commands.  It's also a
>> pain that only qeury-block and x-debug-block-dirty-bitmap-sha256
>> will allow bitmap introspection; but the former requires the
>> node to be hooked to a block device, and the latter is experimental.
>>
>> Signed-off-by: Eric Blake 
>> ---
>>
>> As promised, based on Dan's similar patch for LUKS encryption
>>
>>  block/qcow2.c  |  17 ++--
>>  tests/qemu-iotests/176 |  55 ++--
>>  tests/qemu-iotests/176.out | 216 
>> -
>>  3 files changed, 270 insertions(+), 18 deletions(-)
> 
> The test fails with -m32 and probably also on Big Endian architectures,
> because the bitmap hash differs.

Oh my.  Thanks for the rapid testing.

> We could "fix" this by replacing the 2100 by a 2102, so for both bit
> widths rounding is the same.  But that would still leave the issue open
> for Big Endian architectures (which generate just completely different
> hashes)

Isn't sha256 supposed to be a byte-based hash that is not endian
specific?  What causes that difference?

> -- and I'm not really a fan of testing this on every possible
> architecture and then adding different reference outputs.
> 
> Therefore, the best fix is probably to just filter the hashes out (you
> don't need the exact value anyway, do you?), and I think it's fine to do
> this as a follow-up.

At any rate, I concur with this conclusion; I'll post a followup that
filters out the hash (for this test, we only care that the existence of
a hash proves the bitmap exists; unlike 165 where we want to validate
that it is actually tracking correct information).

I missed Kevin's -rc2 pull, unless he wants to send a v2; but we also
have time (it's not the end of the world if the fix goes in -rc3).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH for-2.11] pc-bios/s390-ccw: Fix problem with invalid virtio-scsi LUN when rebooting

2017-11-17 Thread Christian Borntraeger


On 11/17/2017 07:10 PM, Thomas Huth wrote:
> When rebooting a guest that has a virtio-scsi disk, the s390-ccw
> bios sometimes bails out with an error message like this:
> 
> ! SCSI cannot report LUNs: STATUS=02 RSPN=70 KEY=05 CODE=25 QLFR=00, sure !
> 
> Enabling the scsi_req* tracing in QEMU shows that the ccw bios is
> trying to execute the REPORT LUNS SCSI command with a LUN != 0, and
> this causes the SCSI command to fail.
> Looks like we neither clear the BSS of the s390-ccw bios during reboot,
> nor do we explicitly set the default_scsi_device.lun value to 0, so
> this variable can contain random values from the OS after the reboot.
> By setting this variable explicitly to 0, the problem is fixed and
> the reboots always succeed.
> 
> Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1514352
> Signed-off-by: Thomas Huth 

Acked-by: Christian Borntraeger 

We had these things multile times in the past. The QEMU elf loader does not
zero the BSS (it just loads the load section).  Hmm, what about letting the 
BIOS zero its bss itself. IIRC the kernel does the same thing. I will have a
look into that for 2.12.

PS: Please do not forget to rebuild the bios
> ---
>  pc-bios/s390-ccw/virtio-scsi.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/pc-bios/s390-ccw/virtio-scsi.c b/pc-bios/s390-ccw/virtio-scsi.c
> index c92f5d3..4fe4b9d 100644
> --- a/pc-bios/s390-ccw/virtio-scsi.c
> +++ b/pc-bios/s390-ccw/virtio-scsi.c
> @@ -223,7 +223,8 @@ static void virtio_scsi_locate_device(VDev *vdev)
> 
>  for (target = 0; target <= vdev->config.scsi.max_target; target++) {
>  sdev->channel = channel;
> -sdev->target = target; /* sdev->lun will be 0 here */
> +sdev->target = target;
> +sdev->lun = 0;  /* LUN has to be 0 for REPORT LUNS */
>  if (!scsi_report_luns(vdev, data, sizeof(data))) {
>  if (resp.response == VIRTIO_SCSI_S_BAD_TARGET) {
>  continue;
> 




Re: [Qemu-devel] [PATCH] block/snapshot: dirty all dirty bitmaps on snapshot-switch

2017-11-17 Thread Kevin Wolf
Am 17.11.2017 um 19:15 hat John Snow geschrieben:
> 
> 
> On 11/17/2017 10:01 AM, Max Reitz wrote:
> > On 2017-11-17 13:30, Kevin Wolf wrote:
> >> Am 23.10.2017 um 11:29 hat Vladimir Sementsov-Ogievskiy geschrieben:
> >>> Snapshot-switch actually changes active state of disk so it should
> >>> reflect on dirty bitmaps. Otherwise next incremental backup using
> >>> these bitmaps will be invalid.
> >>>
> >>> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> >>
> >> We discussed this quite a while ago, and I'm still not convinced that
> >> this approach makes sense.
> > 
> > I think it at least makes more sense than not handling this case at all.
> > 
> >> Can you give just one example of a use case where dirtying the whole
> >> bitmap while loading a snapshot is the desired behaviour?
> >>
> >> I think the most useful behaviour would be something where the bitmaps
> >> themselves are snapshotted, too.
> > 
> > Agreed.
> > 
> >>  But for the time being, the easiest and
> >> safest solution might just be to error out in any snapshot operations
> >> if any bitmaps are in use.
> > 
> > Sounds OK, too.  I personally don't have an opinion either way.
> > 
> > But in any case, what we did before this patch was definitely wrong so I
> > consider it an improvement.
> > 
> 
> This is how I feel about it too. Erroring out entirely is an option, but
> code-wise just dirtying everything is at least verifiably not-wrong and
> pretty simple to implement.

But that's exactly the problem: If something is just plain wrong, you
can always replace it with something that makes sense. If it errors out,
you can still remove that error later. But if you have something that
doesn't make a whole lot of sense, but kinda sorta works (like after
this patch), it's ABI and you can't implement something more useful
later any more.

> It's an improvement... Don't do it, but at least you won't get
> something wrong after, just something heinously unoptimal.

It's a short-term improvement that may become a long-term burden.

Kevin



Re: [Qemu-devel] [PATCH for-2.11] spapr: reset DRCs after devices

2017-11-17 Thread Michael Roth
Quoting Greg Kurz (2017-11-17 06:56:48)
> A DRC with a pending unplug request releases its associated device at
> machine reset time.
> 
> In the case of LMB, when all DRCs for a DIMM device have been reset,
> the DIMM gets unplugged, causing guest memory to disappear. This may
> be very confusing for anything still using this memory.
> 
> This is exactly what happens with vhost backends, and QEMU aborts
> with:
> 
> qemu-system-ppc64: used ring relocated for ring 2
> qemu-system-ppc64: qemu/hw/virtio/vhost.c:649: vhost_commit: Assertion
>  `r >= 0' failed.
> 
> The issue is that each DRC registers a QEMU reset handler, and we
> don't control the order in which these handlers are called (ie,
> a LMB DRC will unplug a DIMM before the virtio device using the
> memory on this DIMM could stop its vhost backend).
> 
> To avoid such situations, let's reset DRCs after all devices
> have been reset.
> 
> Reported-by: Mallesh N. Koti 
> Signed-off-by: Greg Kurz 

Reviewed-by: Michael Roth 

> ---
>  hw/ppc/spapr.c |   21 +
>  hw/ppc/spapr_drc.c |7 ---
>  2 files changed, 21 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 6841bd294b3c..6285f7211f9a 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1411,6 +1411,19 @@ static void find_unknown_sysbus_device(SysBusDevice 
> *sbdev, void *opaque)
>  }
>  }
> 
> +static int spapr_reset_drcs(Object *child, void *opaque)
> +{
> +sPAPRDRConnector *drc =
> +(sPAPRDRConnector *) object_dynamic_cast(child,
> + TYPE_SPAPR_DR_CONNECTOR);
> +
> +if (drc) {
> +spapr_drc_reset(drc);
> +}
> +
> +return 0;
> +}
> +
>  static void ppc_spapr_reset(void)
>  {
>  MachineState *machine = MACHINE(qdev_get_machine());
> @@ -1434,6 +1447,14 @@ static void ppc_spapr_reset(void)
>  }
> 
>  qemu_devices_reset();
> +
> +/* DRC reset may cause a device to be unplugged. This will cause troubles
> + * if this device is used by another device (eg, a running vhost backend
> + * will crash QEMU if the DIMM holding the vring goes away). To avoid 
> such
> + * situations, we reset DRCs after all devices have been reset.
> + */
> +object_child_foreach_recursive(object_get_root(), spapr_reset_drcs, 
> NULL);
> +
>  spapr_clear_pending_events(spapr);
> 
>  /*
> diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
> index 915e9b51c40c..e3b122968e89 100644
> --- a/hw/ppc/spapr_drc.c
> +++ b/hw/ppc/spapr_drc.c
> @@ -455,11 +455,6 @@ void spapr_drc_reset(sPAPRDRConnector *drc)
>  }
>  }
> 
> -static void drc_reset(void *opaque)
> -{
> -spapr_drc_reset(SPAPR_DR_CONNECTOR(opaque));
> -}
> -
>  bool spapr_drc_needed(void *opaque)
>  {
>  sPAPRDRConnector *drc = (sPAPRDRConnector *)opaque;
> @@ -518,7 +513,6 @@ static void realize(DeviceState *d, Error **errp)
>  }
>  vmstate_register(DEVICE(drc), spapr_drc_index(drc), _spapr_drc,
>   drc);
> -qemu_register_reset(drc_reset, drc);
>  trace_spapr_drc_realize_complete(spapr_drc_index(drc));
>  }
> 
> @@ -529,7 +523,6 @@ static void unrealize(DeviceState *d, Error **errp)
>  gchar *name;
> 
>  trace_spapr_drc_unrealize(spapr_drc_index(drc));
> -qemu_unregister_reset(drc_reset, drc);
>  vmstate_unregister(DEVICE(drc), _spapr_drc, drc);
>  root_container = container_get(object_get_root(), DRC_CONTAINER_PATH);
>  name = g_strdup_printf("%x", spapr_drc_index(drc));
> 




[Qemu-devel] [Bug 1732959] [NEW] [regression] Clock jump on source VM after migration

2017-11-17 Thread Neil Skrypuch
Public bug reported:

We (ab)use migration + block mirroring to perform transparent zero
downtime VM backups. Basically:

1) do a block mirror of the source VM's disk
2) migrate the source VM to a destination VM using the disk copy
3) cancel the block mirroring
4) resume the source VM
5) shut down the destination VM gracefully and move the disk to backup

Relatively recently, the source VM's clock started jumping after step
#4. More specifically, the clock jumps an amount of time proportional to
the time since it was last migrated. With a week between migrations,
clock jumps between ~2.5s and ~12s have been observed. For a particular
host, the amount of clock jump is fairly consistent, but there is a
large variation from one host to the next (this is likely down to
hardware variations and the amount of NTP adjusted clock drift on the
host).

This is caused by a kernel regression which I was able to bisect. The
result of the bisect was:

108b249c453dd7132599ab6dc7e435a7036c193f is the first bad commit
commit 108b249c453dd7132599ab6dc7e435a7036c193f
Author: Paolo Bonzini 
Date:   Thu Sep 1 14:21:03 2016 +0200

KVM: x86: introduce get_kvmclock_ns

Introduce a function that reads the exact nanoseconds value that is
provided to the guest in kvmclock.  This crystallizes the notion of
kvmclock as a thin veneer over a stable TSC, that the guest will
(hopefully) convert with NTP.  In other words, kvmclock is *not* a
paravirtualized host-to-guest NTP.

Drop the get_kernel_ns() function, that was used both to get the base
value of the master clock and to get the current value of kvmclock.
The former use is replaced by ktime_get_boot_ns(), the latter is
the purpose of get_kernel_ns().

This also allows KVM to provide a Hyper-V time reference counter that
is synchronized with the time that is computed from the TSC page.

Reviewed-by: Roman Kagan 
Signed-off-by: Paolo Bonzini 

I am able to reproduce the issue with much newer kernels as well,
including 4.12.5 and 4.9.6.

Reliably reproducing the problem in isolation is difficult, as one must
run a VM for many hours before the clock jump from this bug is
noticeable over the clock jump inherent with a pause and resume of the
VM. The reproducer I am including is set to run the VM for 18 hours
before migration and looks for >= 150 ms of clock jump. On different
hardware, you may need to let the VM run for more than 18 hours to
reliably reproduce the issue.

To reproduce the issue, please see the attached reproducer. The host
needs to have perl, screen and socat installed for the backup script to
work. Both the host and guest need to be running NTP (and NTP must
autostart at boot in the guest). The host needs to be able to SSH into
the guest using SSH keys (to measure the clock jump), so you will need
to configure the network and SSH keys appropriately, then change the
hardcoded IP address in checktime.sh and test.sh. I have only tested
with CentOS 7 guests.

The qemu command that gets run is in .kvmscreen (the destination VM's
command line is programmatically constructed from this command as well),
you may need to tweak the bridge configuration. Also, although the
reproducer is relatively self contained, it has several built in
assumptions that will break if the image file is not in the /var/lib/kvm
directory or if the monitor file is not in the /var/lib/kvm/monitor
directory, or if the /backup directory does not exist. Finally, if you
change the process name or socket name in .kvmscreen, you'll need to
adjust the cleanup section in test.sh.

With all of the above in place, run test.sh and check back in a little
over 18 hours, part of the output should include something along these
lines:

Target not found (wanted 150, at 10)

- or -

Target found (wanted 150, found 340)

If the target is reported as found, that means that we have probably
reproduced the described issue.

The version of QEMU in use does not appear to matter. At one point I
tested every major version from 2.4 to 2.9 (inclusive) and reproduced
the issue in all of them.

This was initially observed on two different Gentoo hosts. I have also
started to see this issue happening with four different RHEL 7 hosts as
of the upgrade to RHEL 7.4. This is not too surprising as it appears
that the above commit has been backported into RHEL 7. All hosts and
guests are 64-bit.

** Affects: qemu
 Importance: Undecided
 Status: New

** Attachment added: "qemu-clock-jump-reproducer.tar.bz2"
   
https://bugs.launchpad.net/bugs/1732959/+attachment/5010639/+files/qemu-clock-jump-reproducer.tar.bz2

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1732959

Title:
  [regression] Clock jump on source VM after migration

Status in QEMU:
  New

Bug description:
  We (ab)use migration + block 

[Qemu-devel] [PULL 24/25] block: Make bdrv_next() keep strong references

2017-11-17 Thread Kevin Wolf
From: Max Reitz 

On one hand, it is a good idea for bdrv_next() to return a strong
reference because ideally nearly every pointer should be refcounted.
This fixes intermittent failure of iotest 194.

On the other, it is absolutely necessary for bdrv_next() itself to keep
a strong reference to both the BB (in its first phase) and the BDS (at
least in the second phase) because when called the next time, it will
dereference those objects to get a link to the next one.  Therefore, it
needs these objects to stay around until then.  Just storing the pointer
to the next in the iterator is not really viable because that pointer
might become invalid as well.

Both arguments taken together means we should probably just invoke
bdrv_ref() and blk_ref() in bdrv_next().  This means we have to assert
that bdrv_next() is always called from the main loop, but that was
probably necessary already before this patch and judging from the
callers, it also looks to actually be the case.

Keeping these strong references means however that callers need to give
them up if they decide to abort the iteration early.  They can do so
through the new bdrv_next_cleanup() function.

Suggested-by: Kevin Wolf 
Signed-off-by: Max Reitz 
Message-id: 20171110172545.32609-1-mre...@redhat.com
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Max Reitz 
---
 include/block/block.h |  1 +
 block.c   |  3 +++
 block/block-backend.c | 48 ++--
 block/snapshot.c  |  6 ++
 migration/block.c |  1 +
 5 files changed, 57 insertions(+), 2 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index fbc21daf62..c05cac57e5 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -461,6 +461,7 @@ typedef struct BdrvNextIterator {
 
 BlockDriverState *bdrv_first(BdrvNextIterator *it);
 BlockDriverState *bdrv_next(BdrvNextIterator *it);
+void bdrv_next_cleanup(BdrvNextIterator *it);
 
 BlockDriverState *bdrv_next_monitor_owned(BlockDriverState *bs);
 bool bdrv_is_encrypted(BlockDriverState *bs);
diff --git a/block.c b/block.c
index 996778cfa0..6c8ef98dfa 100644
--- a/block.c
+++ b/block.c
@@ -4255,6 +4255,7 @@ void bdrv_invalidate_cache_all(Error **errp)
 aio_context_release(aio_context);
 if (local_err) {
 error_propagate(errp, local_err);
+bdrv_next_cleanup();
 return;
 }
 }
@@ -4330,6 +4331,7 @@ int bdrv_inactivate_all(void)
 for (bs = bdrv_first(); bs; bs = bdrv_next()) {
 ret = bdrv_inactivate_recurse(bs, pass);
 if (ret < 0) {
+bdrv_next_cleanup();
 goto out;
 }
 }
@@ -4864,6 +4866,7 @@ bool bdrv_is_first_non_filter(BlockDriverState *candidate)
 
 /* candidate is the first non filter */
 if (perm) {
+bdrv_next_cleanup();
 return true;
 }
 }
diff --git a/block/block-backend.c b/block/block-backend.c
index f10b1db612..5836cb3087 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -442,21 +442,37 @@ BlockBackend *blk_next(BlockBackend *blk)
  * the monitor or attached to a BlockBackend */
 BlockDriverState *bdrv_next(BdrvNextIterator *it)
 {
-BlockDriverState *bs;
+BlockDriverState *bs, *old_bs;
+
+/* Must be called from the main loop */
+assert(qemu_get_current_aio_context() == qemu_get_aio_context());
 
 /* First, return all root nodes of BlockBackends. In order to avoid
  * returning a BDS twice when multiple BBs refer to it, we only return it
  * if the BB is the first one in the parent list of the BDS. */
 if (it->phase == BDRV_NEXT_BACKEND_ROOTS) {
+BlockBackend *old_blk = it->blk;
+
+old_bs = old_blk ? blk_bs(old_blk) : NULL;
+
 do {
 it->blk = blk_all_next(it->blk);
 bs = it->blk ? blk_bs(it->blk) : NULL;
 } while (it->blk && (bs == NULL || bdrv_first_blk(bs) != it->blk));
 
+if (it->blk) {
+blk_ref(it->blk);
+}
+blk_unref(old_blk);
+
 if (bs) {
+bdrv_ref(bs);
+bdrv_unref(old_bs);
 return bs;
 }
 it->phase = BDRV_NEXT_MONITOR_OWNED;
+} else {
+old_bs = it->bs;
 }
 
 /* Then return the monitor-owned BDSes without a BB attached. Ignore all
@@ -467,18 +483,46 @@ BlockDriverState *bdrv_next(BdrvNextIterator *it)
 bs = it->bs;
 } while (bs && bdrv_has_blk(bs));
 
+if (bs) {
+bdrv_ref(bs);
+}
+bdrv_unref(old_bs);
+
 return bs;
 }
 
-BlockDriverState *bdrv_first(BdrvNextIterator *it)
+static void bdrv_next_reset(BdrvNextIterator *it)
 {
 *it = (BdrvNextIterator) {
 .phase = BDRV_NEXT_BACKEND_ROOTS,
 };
+}
 
+BlockDriverState *bdrv_first(BdrvNextIterator *it)
+{
+bdrv_next_reset(it);
 return bdrv_next(it);
 

Re: [Qemu-devel] [PATCH v8 02/14] block/dirty-bitmap: add locked version of bdrv_release_dirty_bitmap

2017-11-17 Thread John Snow


On 11/17/2017 03:07 AM, Vladimir Sementsov-Ogievskiy wrote:
> 11.11.2017 01:52, John Snow wrote:
>>
>> On 10/30/2017 12:32 PM, Vladimir Sementsov-Ogievskiy wrote:
>>> It is needed to realize bdrv_dirty_bitmap_release_successor in
>>> the following patch.
>>>
>> OK, but...
>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy 
>>> ---
>>>   block/dirty-bitmap.c | 25 -
>>>   1 file changed, 20 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
>>> index 81adbeb6d4..981f99d362 100644
>>> --- a/block/dirty-bitmap.c
>>> +++ b/block/dirty-bitmap.c
>>> @@ -326,13 +326,13 @@ static bool
>>> bdrv_dirty_bitmap_has_name(BdrvDirtyBitmap *bitmap)
>>>   return !!bdrv_dirty_bitmap_name(bitmap);
>>>   }
>>>   -/* Called with BQL taken.  */
>>> -static void bdrv_do_release_matching_dirty_bitmap(
>>> +/* Called within bdrv_dirty_bitmap_lock..unlock */
>> ...Add this so it will compile:
> 
> how do you compile to get an error? and what is unused?
> 

.../src/qemu/block/dirty-bitmap.c:368:13: error:
‘bdrv_release_dirty_bitmap_locked’ defined but not used
[-Werror=unused-function]
 static void bdrv_release_dirty_bitmap_locked(BlockDriverState *bs,
 ^~~~
cc1: all warnings being treated as errors


I commented on the wrong prototype. The ((__unused__)) attribute just
quiets this warning so it can compile without you having to refactor.

--js



[Qemu-devel] [PULL 18/25] qcow2: check_errors are fatal

2017-11-17 Thread Kevin Wolf
From: Max Reitz 

When trying to repair a dirty image, qcow2_check() may apparently
succeed (no really fatal error occurred that would prevent the check
from continuing), but if check_errors in the result object is non-zero,
we cannot trust the image to be usable.

Reported-by: R. Nageswara Sastry 
Buglink: https://bugs.launchpad.net/qemu/+bug/1728639
Signed-off-by: Max Reitz 
Message-id: 20171110203111.7666-2-mre...@redhat.com
Reviewed-by: Eric Blake 
Signed-off-by: Max Reitz 
---
 block/qcow2.c  |  5 -
 tests/qemu-iotests/060 | 20 
 tests/qemu-iotests/060.out | 23 +++
 3 files changed, 47 insertions(+), 1 deletion(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 811b913233..1914a940e5 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1477,7 +1477,10 @@ static int qcow2_do_open(BlockDriverState *bs, QDict 
*options, int flags,
 BdrvCheckResult result = {0};
 
 ret = qcow2_check(bs, , BDRV_FIX_ERRORS | BDRV_FIX_LEAKS);
-if (ret < 0) {
+if (ret < 0 || result.check_errors) {
+if (ret >= 0) {
+ret = -EIO;
+}
 error_setg_errno(errp, -ret, "Could not repair dirty image");
 goto fail;
 }
diff --git a/tests/qemu-iotests/060 b/tests/qemu-iotests/060
index fae08b03bf..56bdf1ee2e 100755
--- a/tests/qemu-iotests/060
+++ b/tests/qemu-iotests/060
@@ -301,6 +301,26 @@ _make_test_img 64M
 poke_file "$TEST_IMG" "48""\x00\x00\x00\x00\x00\x00\x00\x00"
 $QEMU_IO -c "write 0 64k" "$TEST_IMG" | _filter_qemu_io
 
+echo
+echo "=== Testing dirty corrupt image ==="
+echo
+
+_make_test_img 64M
+
+# Let the refblock appear unaligned
+poke_file "$TEST_IMG" "$rt_offset""\x00\x00\x00\x00\xff\xff\x2a\x00"
+# Mark the image dirty, thus forcing an automatic check when opening it
+poke_file "$TEST_IMG" 72 "\x00\x00\x00\x00\x00\x00\x00\x01"
+# Open the image (qemu should refuse to do so)
+$QEMU_IO -c close "$TEST_IMG" 2>&1 | _filter_testdir | _filter_imgfmt
+
+echo '--- Repairing ---'
+
+# The actual repair should have happened (because of the dirty bit),
+# but some cleanup may have failed (like freeing the old reftable)
+# because the image was already marked corrupt by that point
+_check_test_img -r all
+
 # success, all done
 echo "*** done"
 rm -f $seq.full
diff --git a/tests/qemu-iotests/060.out b/tests/qemu-iotests/060.out
index 62c22701b8..f013fe73c0 100644
--- a/tests/qemu-iotests/060.out
+++ b/tests/qemu-iotests/060.out
@@ -284,4 +284,27 @@ No errors were found on the image.
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
 qcow2: Marking image as corrupt: Preventing invalid allocation of L2 table at 
offset 0; further corruption events will be suppressed
 write failed: Input/output error
+
+=== Testing dirty corrupt image ===
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
+ERROR refcount block 0 is not cluster aligned; refcount table entry corrupted
+IMGFMT: Marking image as corrupt: Refblock offset 0x2a00 unaligned 
(reftable index: 0); further corruption events will be suppressed
+Can't get refcount for cluster 0: Input/output error
+Can't get refcount for cluster 1: Input/output error
+Can't get refcount for cluster 2: Input/output error
+Can't get refcount for cluster 3: Input/output error
+Rebuilding refcount structure
+Repairing cluster 1 refcount=1 reference=0
+can't open device TEST_DIR/t.IMGFMT: Could not repair dirty image: 
Input/output error
+--- Repairing ---
+Leaked cluster 1 refcount=1 reference=0
+Repairing cluster 1 refcount=1 reference=0
+The following inconsistencies were found and repaired:
+
+1 leaked clusters
+0 corruptions
+
+Double checking the fixed image now...
+No errors were found on the image.
 *** done
-- 
2.13.6




[Qemu-devel] [PULL 23/25] qcow2: Fix overly broad madvise()

2017-11-17 Thread Kevin Wolf
From: Max Reitz 

@mem_size and @offset are both size_t, thus subtracting them from one
another will just return a big size_t if mem_size < offset -- even more
obvious here because the result is stored in another size_t.

Checking that result to be positive is therefore not sufficient to
exclude the case that offset > mem_size.  Thus, we currently sometimes
issue an madvise() over a very large address range.

This is triggered by iotest 163, but with -m64, this does not result in
tangible problems.  But with -m32, this test produces three segfaults,
all of which are fixed by this patch.

Signed-off-by: Max Reitz 
Message-id: 20171114184127.24238-1-mre...@redhat.com
Reviewed-by: Eric Blake 
Reviewed-by: Alberto Garcia 
Reviewed-by: Darren Kenny 
Signed-off-by: Max Reitz 
---
 block/qcow2-cache.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
index a5baaba0ff..c48ffebd8f 100644
--- a/block/qcow2-cache.c
+++ b/block/qcow2-cache.c
@@ -85,7 +85,7 @@ static void qcow2_cache_table_release(BlockDriverState *bs, 
Qcow2Cache *c,
 size_t mem_size = (size_t) s->cluster_size * num_tables;
 size_t offset = QEMU_ALIGN_UP((uintptr_t) t, align) - (uintptr_t) t;
 size_t length = QEMU_ALIGN_DOWN(mem_size - offset, align);
-if (length > 0) {
+if (mem_size > offset && length > 0) {
 madvise((uint8_t *) t + offset, length, MADV_DONTNEED);
 }
 #endif
-- 
2.13.6




[Qemu-devel] [PULL 25/25] iotests: Make 087 pass without AIO enabled

2017-11-17 Thread Kevin Wolf
From: Max Reitz 

If AIO has not been enabled in the qemu build that is to be tested, we
should skip the "aio=native without O_DIRECT" test instead of failing.

Signed-off-by: Max Reitz 
Message-id: 20171115180732.31753-1-mre...@redhat.com
Reviewed-by: Eric Blake 
Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/087 | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/087 b/tests/qemu-iotests/087
index 27ab6c5151..2561a14456 100755
--- a/tests/qemu-iotests/087
+++ b/tests/qemu-iotests/087
@@ -102,7 +102,14 @@ echo
 echo === aio=native without O_DIRECT ===
 echo
 
-run_qemu <

[Qemu-devel] [PULL 22/25] qcow2: Refuse to get unaligned offsets from cache

2017-11-17 Thread Kevin Wolf
From: Max Reitz 

Instead of using an assertion, it is better to emit a corruption event
here.  Checking all offsets for correct alignment can be tedious and it
is easily possible to forget to do so.  qcow2_cache_do_get() is a
function every L2 and refblock access has to go through, so this is a
good central point to add such a check.

And for good measure, let us also add an assertion that the offset is
non-zero.  Making this a corruption event is not feasible, because a
zero offset usually means something special (such as the cluster is
unused), so all callers should be checking this anyway.  If they do not,
it is their fault, hence the assertion here.

Signed-off-by: Max Reitz 
Message-id: 20171110203111.7666-6-mre...@redhat.com
Reviewed-by: Eric Blake 
Reviewed-by: Alberto Garcia 
Signed-off-by: Max Reitz 
---
 block/qcow2-cache.c| 21 +
 tests/qemu-iotests/060 | 21 +
 tests/qemu-iotests/060.out | 29 +
 3 files changed, 71 insertions(+)

diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
index 75746a7f43..a5baaba0ff 100644
--- a/block/qcow2-cache.c
+++ b/block/qcow2-cache.c
@@ -62,6 +62,18 @@ static inline int qcow2_cache_get_table_idx(BlockDriverState 
*bs,
 return idx;
 }
 
+static inline const char *qcow2_cache_get_name(BDRVQcow2State *s, Qcow2Cache 
*c)
+{
+if (c == s->refcount_block_cache) {
+return "refcount block";
+} else if (c == s->l2_table_cache) {
+return "L2 table";
+} else {
+/* Do not abort, because this is not critical */
+return "unknown";
+}
+}
+
 static void qcow2_cache_table_release(BlockDriverState *bs, Qcow2Cache *c,
   int i, int num_tables)
 {
@@ -314,9 +326,18 @@ static int qcow2_cache_do_get(BlockDriverState *bs, 
Qcow2Cache *c,
 uint64_t min_lru_counter = UINT64_MAX;
 int min_lru_index = -1;
 
+assert(offset != 0);
+
 trace_qcow2_cache_get(qemu_coroutine_self(), c == s->l2_table_cache,
   offset, read_from_disk);
 
+if (offset_into_cluster(s, offset)) {
+qcow2_signal_corruption(bs, true, -1, -1, "Cannot get entry from %s "
+"cache: Offset %#" PRIx64 " is unaligned",
+qcow2_cache_get_name(s, c), offset);
+return -EIO;
+}
+
 /* Check if the table is already cached */
 i = lookup_index = (offset / s->cluster_size * 4) % c->size;
 do {
diff --git a/tests/qemu-iotests/060 b/tests/qemu-iotests/060
index c230696b3a..1eca09417b 100755
--- a/tests/qemu-iotests/060
+++ b/tests/qemu-iotests/060
@@ -405,6 +405,27 @@ _check_test_img -r all
 $QEMU_IMG resize --shrink "$TEST_IMG" 32M
 _img_info | grep 'virtual size'
 
+echo
+echo "=== Discarding a refblock covered by an unaligned refblock ==="
+echo
+
+IMGOPTS='refcount_bits=1' _make_test_img 64M
+
+# Same as above
+poke_file "$TEST_IMG" "$(($rt_offset+8))" "\x00\x00\x00\x10\x00\x00\x00\x00"
+# But now we actually "create" an unaligned third refblock
+poke_file "$TEST_IMG" "$(($rt_offset+16))" "\x00\x00\x00\x00\x00\x00\x02\x00"
+$QEMU_IMG resize --shrink "$TEST_IMG" 32M
+
+echo '--- Repairing ---'
+# Fails the first repair because the corruption prevents the check
+# function from double-checking
+# (Using -q for the first invocation, because otherwise the
+#  double-check error message appears above the summary for some
+#  reason -- so let's just hide the summary)
+_check_test_img -q -r all
+_check_test_img -r all
+
 # success, all done
 echo "*** done"
 rm -f $seq.full
diff --git a/tests/qemu-iotests/060.out b/tests/qemu-iotests/060.out
index 358e54cdc9..56f5eb15d8 100644
--- a/tests/qemu-iotests/060.out
+++ b/tests/qemu-iotests/060.out
@@ -370,4 +370,33 @@ virtual size: 64M (67108864 bytes)
 No errors were found on the image.
 Image resized.
 virtual size: 32M (33554432 bytes)
+
+=== Discarding a refblock covered by an unaligned refblock ===
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
+qcow2: Marking image as corrupt: Cannot get entry from refcount block cache: 
Offset 0x200 is unaligned; further corruption events will be suppressed
+qemu-img: Failed to discard unused refblocks: Input/output error
+--- Repairing ---
+Repairing refcount block 1 is outside image
+ERROR refcount block 2 is not cluster aligned; refcount table entry corrupted
+qcow2: Marking image as corrupt: Refblock offset 0x200 unaligned (reftable 
index: 0x2); further corruption events will be suppressed
+Can't get refcount for cluster 1048576: Input/output error
+Rebuilding refcount structure
+Repairing cluster 1 refcount=1 reference=0
+Repairing cluster 2 refcount=1 reference=0
+Repairing cluster 1048576 refcount=1 reference=0
+qemu-img: Check failed: No medium found
+Leaked cluster 1 refcount=1 reference=0
+Leaked cluster 2 refcount=1 reference=0
+Leaked 

Re: [Qemu-devel] [Qemu-ppc] [PATCH for-2.11] spapr: reset DRCs after devices

2017-11-17 Thread Michael Roth
Quoting Daniel Henrique Barboza (2017-11-17 10:21:27)
> 
> 
> On 11/17/2017 10:56 AM, Greg Kurz wrote:
> > A DRC with a pending unplug request releases its associated device at
> > machine reset time.
> >
> > In the case of LMB, when all DRCs for a DIMM device have been reset,
> > the DIMM gets unplugged, causing guest memory to disappear. This may
> > be very confusing for anything still using this memory.
> >
> > This is exactly what happens with vhost backends, and QEMU aborts
> > with:
> >
> > qemu-system-ppc64: used ring relocated for ring 2
> > qemu-system-ppc64: qemu/hw/virtio/vhost.c:649: vhost_commit: Assertion
> >   `r >= 0' failed.
> >
> > The issue is that each DRC registers a QEMU reset handler, and we
> > don't control the order in which these handlers are called (ie,
> > a LMB DRC will unplug a DIMM before the virtio device using the
> > memory on this DIMM could stop its vhost backend).
> >
> > To avoid such situations, let's reset DRCs after all devices
> > have been reset.
> >
> > Reported-by: Mallesh N. Koti 
> > Signed-off-by: Greg Kurz 
> > ---
> I believe this has to do with the problem discussed at the memory
> unplug reboot with vhost=on, right?
> 
> I don't see any problem with this patch and it's probably the best solution
> short-term for the problem (and potentially other related LMB release
> problems), but that said, how hard it is to forbid LMB unplug at all if the
> memory is being used in vhost?
> 
> The way I see it, the root problem is that a device unplug failed at the 
> guest
> side and we don't know about it, leaving drc->unplug_requested flags set
> around the code when it shouldn't. Reading that thread I see a comment from

I'm not sure we want to go down the road of preemptively aborting
specific cases where we think unplug will fail in the guest. We already
need to deal with the cases where we don't know whether they'll fail
ahead of time, so it seems like more of a headache to add special cases
on top of that that management would need to deal with.

And in this particular case we'd never be able to unplug the DIMM if
the guest happens to use it for the vring each boot (unless maybe the
unplug request occured during early boot), even though there's no
technical reason we shouldn't be able to at least handle it on the
next reset.

> Michael S. Tsirkin saying that this bug is somewhat expected, that vhost 
> backend will
> not behave well if memory is going away. Unlike other LMB unplug 
> failures from
> the guest side that might happen for any other reason, this one sees 
> predicable
> and we can avoid it.
> 
> I think this is something worth considering. But for now,
> 
> 
> Reviewed-by: Daniel Henrique Barboza 
> 
> >   hw/ppc/spapr.c |   21 +
> >   hw/ppc/spapr_drc.c |7 ---
> >   2 files changed, 21 insertions(+), 7 deletions(-)
> >
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 6841bd294b3c..6285f7211f9a 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -1411,6 +1411,19 @@ static void find_unknown_sysbus_device(SysBusDevice 
> > *sbdev, void *opaque)
> >   }
> >   }
> >
> > +static int spapr_reset_drcs(Object *child, void *opaque)
> > +{
> > +sPAPRDRConnector *drc =
> > +(sPAPRDRConnector *) object_dynamic_cast(child,
> > + TYPE_SPAPR_DR_CONNECTOR);
> > +
> > +if (drc) {
> > +spapr_drc_reset(drc);
> > +}
> > +
> > +return 0;
> > +}
> > +
> >   static void ppc_spapr_reset(void)
> >   {
> >   MachineState *machine = MACHINE(qdev_get_machine());
> > @@ -1434,6 +1447,14 @@ static void ppc_spapr_reset(void)
> >   }
> >
> >   qemu_devices_reset();
> > +
> > +/* DRC reset may cause a device to be unplugged. This will cause 
> > troubles
> > + * if this device is used by another device (eg, a running vhost 
> > backend
> > + * will crash QEMU if the DIMM holding the vring goes away). To avoid 
> > such
> > + * situations, we reset DRCs after all devices have been reset.
> > + */
> > +object_child_foreach_recursive(object_get_root(), spapr_reset_drcs, 
> > NULL);
> > +
> >   spapr_clear_pending_events(spapr);
> >
> >   /*
> > diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
> > index 915e9b51c40c..e3b122968e89 100644
> > --- a/hw/ppc/spapr_drc.c
> > +++ b/hw/ppc/spapr_drc.c
> > @@ -455,11 +455,6 @@ void spapr_drc_reset(sPAPRDRConnector *drc)
> >   }
> >   }
> >
> > -static void drc_reset(void *opaque)
> > -{
> > -spapr_drc_reset(SPAPR_DR_CONNECTOR(opaque));
> > -}
> > -
> >   bool spapr_drc_needed(void *opaque)
> >   {
> >   sPAPRDRConnector *drc = (sPAPRDRConnector *)opaque;
> > @@ -518,7 +513,6 @@ static void realize(DeviceState *d, Error **errp)
> >   }
> >   vmstate_register(DEVICE(drc), spapr_drc_index(drc), 
> > _spapr_drc,
> >drc);
> > -

Re: [Qemu-devel] [PATCH for-2.11] qcow2: fix image corruption on commit with persistent snapshot

2017-11-17 Thread Max Reitz
On 2017-11-17 17:47, Eric Blake wrote:
> If an image contains persistent snapshots, we cannot use the
> fast path of bdrv_make_empty() to clear the image during
> qemu-img commit, because that will lose the clusters related
> to the bitmaps.
> 
> Also leave a comment in qcow2_read_extensions to remind future
> feature additions to think about fast-path removal, since we
> just barely fixed the same bug for LUKS encryption.
> 
> It's a pain that qemu-img has not yet been taught to manipulate,
> or even at a very minimum display, information about persistent
> bitmaps; instead, we have to use QMP commands.  It's also a
> pain that only qeury-block and x-debug-block-dirty-bitmap-sha256
> will allow bitmap introspection; but the former requires the
> node to be hooked to a block device, and the latter is experimental.
> 
> Signed-off-by: Eric Blake 
> ---
> 
> As promised, based on Dan's similar patch for LUKS encryption
> 
>  block/qcow2.c  |  17 ++--
>  tests/qemu-iotests/176 |  55 ++--
>  tests/qemu-iotests/176.out | 216 
> -
>  3 files changed, 270 insertions(+), 18 deletions(-)

The test fails with -m32 and probably also on Big Endian architectures,
because the bitmap hash differs.

The reason the hash differs for -m32 is because of different bitmap
sizes: The default bitmap granularity is 64 kB, so for 32 bit this means
the bitmap always covers a multiple of 2 MB, whereas for 64 bit, the
bitmap always covers a multiple of 4 MB.
> diff --git a/tests/qemu-iotests/176 b/tests/qemu-iotests/176
> index 950b28720e..0f31a20294 100755
> --- a/tests/qemu-iotests/176
> +++ b/tests/qemu-iotests/176

[...]

> @@ -66,14 +74,29 @@ _supported_os Linux
>  for i in 0 1 2 3; do
> 
>  echo
> -echo "=== Test pass $i ==="
> +echo "=== Test pass $reason.$i ==="
>  echo
> 
>  len=$((2100 * 1024 * 1024 + 512)) # larger than 2G, and not cluster aligned

That means that with -m64, the bitmap will behave as if a 2104 MB image
has been created, but with -m32, it will behave as if it is a 2102 MB
image.  Therefore, you get different hashes (because the 64 bit bitmap
has 32 more zero bits than the 32 bit bitmap).

We could "fix" this by replacing the 2100 by a 2102, so for both bit
widths rounding is the same.  But that would still leave the issue open
for Big Endian architectures (which generate just completely different
hashes) -- and I'm not really a fan of testing this on every possible
architecture and then adding different reference outputs.

Therefore, the best fix is probably to just filter the hashes out (you
don't need the exact value anyway, do you?), and I think it's fine to do
this as a follow-up.

Max



signature.asc
Description: OpenPGP digital signature


[Qemu-devel] [PULL 15/25] tests: Add check-qobject for equality tests

2017-11-17 Thread Kevin Wolf
From: Max Reitz 

Add a new test file (check-qobject.c) for unit tests that concern
QObjects as a whole.

Its only purpose for now is to test the qobject_is_equal() function.

Signed-off-by: Max Reitz 
Message-id: 20171114180128.17076-7-mre...@redhat.com
Reviewed-by: Eric Blake 
Signed-off-by: Max Reitz 
---
 tests/check-qobject.c  | 328 +
 tests/.gitignore   |   1 +
 tests/Makefile.include |   4 +-
 3 files changed, 332 insertions(+), 1 deletion(-)
 create mode 100644 tests/check-qobject.c

diff --git a/tests/check-qobject.c b/tests/check-qobject.c
new file mode 100644
index 00..03e9175113
--- /dev/null
+++ b/tests/check-qobject.c
@@ -0,0 +1,328 @@
+/*
+ * Generic QObject unit-tests.
+ *
+ * Copyright (C) 2017 Red Hat Inc.
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2.1 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ */
+#include "qemu/osdep.h"
+
+#include "qapi/qmp/types.h"
+#include "qemu-common.h"
+
+#include 
+
+/* Marks the end of the test_equality() argument list.
+ * We cannot use NULL there because that is a valid argument. */
+static QObject test_equality_end_of_arguments;
+
+/**
+ * Test whether all variadic QObject *arguments are equal (@expected
+ * is true) or whether they are all not equal (@expected is false).
+ * Every QObject is tested to be equal to itself (to test
+ * reflexivity), all tests are done both ways (to test symmetry), and
+ * transitivity is not assumed but checked (each object is compared to
+ * every other one).
+ *
+ * Note that qobject_is_equal() is not really an equivalence relation,
+ * so this function may not be used for all objects (reflexivity is
+ * not guaranteed, e.g. in the case of a QNum containing NaN).
+ *
+ * The @_ argument is required because a boolean may not be the last
+ * argument before a variadic argument list (C11 7.16.1.4 para. 4).
+ */
+static void do_test_equality(bool expected, int _, ...)
+{
+va_list ap_count, ap_extract;
+QObject **args;
+int arg_count = 0;
+int i, j;
+
+va_start(ap_count, _);
+va_copy(ap_extract, ap_count);
+while (va_arg(ap_count, QObject *) != _equality_end_of_arguments) {
+arg_count++;
+}
+va_end(ap_count);
+
+args = g_new(QObject *, arg_count);
+for (i = 0; i < arg_count; i++) {
+args[i] = va_arg(ap_extract, QObject *);
+}
+va_end(ap_extract);
+
+for (i = 0; i < arg_count; i++) {
+g_assert(qobject_is_equal(args[i], args[i]) == true);
+
+for (j = i + 1; j < arg_count; j++) {
+g_assert(qobject_is_equal(args[i], args[j]) == expected);
+}
+}
+}
+
+#define check_equal(...) \
+do_test_equality(true, 0, __VA_ARGS__, _equality_end_of_arguments)
+#define check_unequal(...) \
+do_test_equality(false, 0, __VA_ARGS__, _equality_end_of_arguments)
+
+static void do_free_all(int _, ...)
+{
+va_list ap;
+QObject *obj;
+
+va_start(ap, _);
+while ((obj = va_arg(ap, QObject *)) != NULL) {
+qobject_decref(obj);
+}
+va_end(ap);
+}
+
+#define free_all(...) \
+do_free_all(0, __VA_ARGS__, NULL)
+
+static void qobject_is_equal_null_test(void)
+{
+check_unequal(qnull(), NULL);
+}
+
+static void qobject_is_equal_num_test(void)
+{
+QNum *u0, *i0, *d0, *dnan, *um42, *im42, *dm42;
+
+u0 = qnum_from_uint(0u);
+i0 = qnum_from_int(0);
+d0 = qnum_from_double(0.0);
+dnan = qnum_from_double(NAN);
+um42 = qnum_from_uint((uint64_t)-42);
+im42 = qnum_from_int(-42);
+dm42 = qnum_from_double(-42.0);
+
+/* Integers representing a mathematically equal number should
+ * compare equal */
+check_equal(u0, i0);
+/* Doubles, however, are always unequal to integers */
+check_unequal(u0, d0);
+check_unequal(i0, d0);
+
+/* Do not assume any object is equal to itself -- note however
+ * that NaN cannot occur in a JSON object anyway. */
+g_assert(qobject_is_equal(QOBJECT(dnan), QOBJECT(dnan)) == false);
+
+/* No unsigned overflow */
+check_unequal(um42, im42);
+check_unequal(um42, dm42);
+check_unequal(im42, dm42);
+
+free_all(u0, i0, d0, dnan, um42, im42, dm42);
+}
+
+static void qobject_is_equal_bool_test(void)
+{
+QBool *btrue_0, *btrue_1, *bfalse_0, *bfalse_1;
+
+btrue_0 = qbool_from_bool(true);
+btrue_1 = qbool_from_bool(true);
+bfalse_0 = qbool_from_bool(false);
+bfalse_1 = qbool_from_bool(false);
+
+check_equal(btrue_0, btrue_1);
+check_equal(bfalse_0, bfalse_1);
+check_unequal(btrue_0, bfalse_0);
+
+free_all(btrue_0, btrue_1, bfalse_0, bfalse_1);
+}
+
+static void qobject_is_equal_string_test(void)
+{
+QString *str_base, *str_whitespace_0, *str_whitespace_1, *str_whitespace_2;
+QString *str_whitespace_3, *str_case, *str_built;
+
+str_base = qstring_from_str("foo");
+str_whitespace_0 = 

[Qemu-devel] [PULL 19/25] qcow2: Unaligned zero cluster in handle_alloc()

2017-11-17 Thread Kevin Wolf
From: Max Reitz 

We should check whether the cluster offset we are about to use is
actually valid; that is, whether it is aligned to cluster boundaries.

Reported-by: R. Nageswara Sastry 
Buglink: https://bugs.launchpad.net/qemu/+bug/1728643
Buglink: https://bugs.launchpad.net/qemu/+bug/1728657
Signed-off-by: Max Reitz 
Message-id: 20171110203111.7666-3-mre...@redhat.com
Reviewed-by: Eric Blake 
Reviewed-by: Alberto Garcia 
Signed-off-by: Max Reitz 
---
 block/qcow2-cluster.c  | 13 -
 tests/qemu-iotests/060 | 16 
 tests/qemu-iotests/060.out | 10 ++
 3 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 2e072ed155..a3fec27bf9 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -1308,10 +1308,21 @@ static int handle_alloc(BlockDriverState *bs, uint64_t 
guest_offset,
 (!*host_offset ||
  start_of_cluster(s, *host_offset) == (entry & L2E_OFFSET_MASK)))
 {
+int preallocated_nb_clusters;
+
+if (offset_into_cluster(s, entry & L2E_OFFSET_MASK)) {
+qcow2_signal_corruption(bs, true, -1, -1, "Preallocated zero "
+"cluster offset %#llx unaligned (guest "
+"offset: %#" PRIx64 ")",
+entry & L2E_OFFSET_MASK, guest_offset);
+ret = -EIO;
+goto fail;
+}
+
 /* Try to reuse preallocated zero clusters; contiguous normal clusters
  * would be fine, too, but count_cow_clusters() above has limited
  * nb_clusters already to a range of COW clusters */
-int preallocated_nb_clusters =
+preallocated_nb_clusters =
 count_contiguous_clusters(nb_clusters, s->cluster_size,
   _table[l2_index], QCOW_OFLAG_COPIED);
 assert(preallocated_nb_clusters > 0);
diff --git a/tests/qemu-iotests/060 b/tests/qemu-iotests/060
index 56bdf1ee2e..49bc89df38 100755
--- a/tests/qemu-iotests/060
+++ b/tests/qemu-iotests/060
@@ -321,6 +321,22 @@ echo '--- Repairing ---'
 # because the image was already marked corrupt by that point
 _check_test_img -r all
 
+echo
+echo "=== Writing to an unaligned preallocated zero cluster ==="
+echo
+
+_make_test_img 64M
+
+# Allocate the L2 table
+$QEMU_IO -c "write 0 64k" -c "discard 0 64k" "$TEST_IMG" | _filter_qemu_io
+# Pretend there is a preallocated zero cluster somewhere inside the
+# image header
+poke_file "$TEST_IMG" "$l2_offset" "\x80\x00\x00\x00\x00\x00\x2a\x01"
+# Let's write to it!
+$QEMU_IO -c "write 0 64k" "$TEST_IMG" | _filter_qemu_io
+
+# Can't repair this yet (TODO: We can just deallocate the cluster)
+
 # success, all done
 echo "*** done"
 rm -f $seq.full
diff --git a/tests/qemu-iotests/060.out b/tests/qemu-iotests/060.out
index f013fe73c0..c583076808 100644
--- a/tests/qemu-iotests/060.out
+++ b/tests/qemu-iotests/060.out
@@ -307,4 +307,14 @@ The following inconsistencies were found and repaired:
 
 Double checking the fixed image now...
 No errors were found on the image.
+
+=== Writing to an unaligned preallocated zero cluster ===
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
+wrote 65536/65536 bytes at offset 0
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+discard 65536/65536 bytes at offset 0
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+qcow2: Marking image as corrupt: Preallocated zero cluster offset 0x2a00 
unaligned (guest offset: 0); further corruption events will be suppressed
+write failed: Input/output error
 *** done
-- 
2.13.6




[Qemu-devel] [PULL 20/25] block: Guard against NULL bs->drv

2017-11-17 Thread Kevin Wolf
From: Max Reitz 

We currently do not guard everywhere against a NULL bs->drv where we
should be doing so.  Most of the places fixed here just do not care
about that case at all.

Some care implicitly, e.g. through a prior function call to
bdrv_getlength() which would always fail for an ejected BDS.  Add an
assert there to make it more obvious.

Other places seem to care, but do so insufficiently: Freeing clusters in
a qcow2 image is an error-free operation, but it may leave the image in
an unusable state anyway.  Giving qcow2_free_clusters() an error code is
not really viable, it is much easier to note that bs->drv may be NULL
even after a successful driver call.  This concerns bdrv_co_flush(), and
the way the check is added to bdrv_co_pdiscard() (in every iteration
instead of only once).

Finally, some places employ at least an assert(bs->drv); somewhere, that
may be reasonable (such as in the reopen code), but in
bdrv_has_zero_init(), it is definitely not.  Returning 0 there in case
of an ejected BDS saves us much headache instead.

Reported-by: R. Nageswara Sastry 
Buglink: https://bugs.launchpad.net/qemu/+bug/1728660
Signed-off-by: Max Reitz 
Message-id: 20171110203111.7666-4-mre...@redhat.com
Reviewed-by: Eric Blake 
Signed-off-by: Max Reitz 
---
 block.c| 19 ++-
 block/io.c | 36 
 block/qapi.c   |  8 +++-
 block/replication.c| 15 +++
 block/vvfat.c  |  2 +-
 tests/qemu-iotests/060 | 22 ++
 tests/qemu-iotests/060.out | 31 +++
 7 files changed, 130 insertions(+), 3 deletions(-)

diff --git a/block.c b/block.c
index 70c6d7cf94..996778cfa0 100644
--- a/block.c
+++ b/block.c
@@ -720,6 +720,10 @@ static int refresh_total_sectors(BlockDriverState *bs, 
int64_t hint)
 {
 BlockDriver *drv = bs->drv;
 
+if (!drv) {
+return -ENOMEDIUM;
+}
+
 /* Do not attempt drv->bdrv_getlength() on scsi-generic devices */
 if (bdrv_is_sg(bs))
 return 0;
@@ -3431,6 +3435,10 @@ int bdrv_change_backing_file(BlockDriverState *bs,
 BlockDriver *drv = bs->drv;
 int ret;
 
+if (!drv) {
+return -ENOMEDIUM;
+}
+
 /* Backing file format doesn't make sense without a backing file */
 if (backing_fmt && !backing_file) {
 return -EINVAL;
@@ -3916,7 +3924,9 @@ int bdrv_has_zero_init_1(BlockDriverState *bs)
 
 int bdrv_has_zero_init(BlockDriverState *bs)
 {
-assert(bs->drv);
+if (!bs->drv) {
+return 0;
+}
 
 /* If BS is a copy on write image, it is initialized to
the contents of the base image, which may not be zeroes.  */
@@ -4256,6 +4266,10 @@ static int bdrv_inactivate_recurse(BlockDriverState *bs,
 BdrvChild *child, *parent;
 int ret;
 
+if (!bs->drv) {
+return -ENOMEDIUM;
+}
+
 if (!setting_flag && bs->drv->bdrv_inactivate) {
 ret = bs->drv->bdrv_inactivate(bs);
 if (ret < 0) {
@@ -4790,6 +4804,9 @@ void bdrv_remove_aio_context_notifier(BlockDriverState 
*bs,
 int bdrv_amend_options(BlockDriverState *bs, QemuOpts *opts,
BlockDriverAmendStatusCB *status_cb, void *cb_opaque)
 {
+if (!bs->drv) {
+return -ENOMEDIUM;
+}
 if (!bs->drv->bdrv_amend_options) {
 return -ENOTSUP;
 }
diff --git a/block/io.c b/block/io.c
index 3d5ef2cabe..4fdf93a014 100644
--- a/block/io.c
+++ b/block/io.c
@@ -853,6 +853,10 @@ static int coroutine_fn 
bdrv_driver_preadv(BlockDriverState *bs,
 
 assert(!(flags & ~BDRV_REQ_MASK));
 
+if (!drv) {
+return -ENOMEDIUM;
+}
+
 if (drv->bdrv_co_preadv) {
 return drv->bdrv_co_preadv(bs, offset, bytes, qiov, flags);
 }
@@ -894,6 +898,10 @@ static int coroutine_fn 
bdrv_driver_pwritev(BlockDriverState *bs,
 
 assert(!(flags & ~BDRV_REQ_MASK));
 
+if (!drv) {
+return -ENOMEDIUM;
+}
+
 if (drv->bdrv_co_pwritev) {
 ret = drv->bdrv_co_pwritev(bs, offset, bytes, qiov,
flags & bs->supported_write_flags);
@@ -945,6 +953,10 @@ bdrv_driver_pwritev_compressed(BlockDriverState *bs, 
uint64_t offset,
 {
 BlockDriver *drv = bs->drv;
 
+if (!drv) {
+return -ENOMEDIUM;
+}
+
 if (!drv->bdrv_co_pwritev_compressed) {
 return -ENOTSUP;
 }
@@ -975,6 +987,10 @@ static int coroutine_fn bdrv_co_do_copy_on_readv(BdrvChild 
*child,
 BDRV_REQUEST_MAX_BYTES);
 unsigned int progress = 0;
 
+if (!drv) {
+return -ENOMEDIUM;
+}
+
 /* FIXME We cannot require callers to have write permissions when all they
  * are doing is a read request. If we did things right, write permissions
  * would be obtained anyway, but internally by the copy-on-read code. As
@@ -1291,6 

[Qemu-devel] [PULL 12/25] qapi: Add qobject_is_equal()

2017-11-17 Thread Kevin Wolf
From: Max Reitz 

This generic function (along with its implementations for different
types) determines whether two QObjects are equal.

Signed-off-by: Max Reitz 
Reviewed-by: Eric Blake 
Reviewed-by: Alberto Garcia 
Reviewed-by: Markus Armbruster 
Message-id: 20171114180128.17076-4-mre...@redhat.com
Signed-off-by: Max Reitz 
---
 include/qapi/qmp/qbool.h   |  1 +
 include/qapi/qmp/qdict.h   |  1 +
 include/qapi/qmp/qlist.h   |  1 +
 include/qapi/qmp/qnull.h   |  2 ++
 include/qapi/qmp/qnum.h|  1 +
 include/qapi/qmp/qobject.h |  9 
 include/qapi/qmp/qstring.h |  1 +
 qobject/qbool.c|  8 +++
 qobject/qdict.c| 29 +
 qobject/qlist.c| 32 +++
 qobject/qnull.c|  9 
 qobject/qnum.c | 54 ++
 qobject/qobject.c  | 29 +
 qobject/qstring.c  |  9 
 14 files changed, 186 insertions(+)

diff --git a/include/qapi/qmp/qbool.h b/include/qapi/qmp/qbool.h
index a4c309..f77ea86c4e 100644
--- a/include/qapi/qmp/qbool.h
+++ b/include/qapi/qmp/qbool.h
@@ -24,6 +24,7 @@ typedef struct QBool {
 QBool *qbool_from_bool(bool value);
 bool qbool_get_bool(const QBool *qb);
 QBool *qobject_to_qbool(const QObject *obj);
+bool qbool_is_equal(const QObject *x, const QObject *y);
 void qbool_destroy_obj(QObject *obj);
 
 #endif /* QBOOL_H */
diff --git a/include/qapi/qmp/qdict.h b/include/qapi/qmp/qdict.h
index 7ea5120c4a..fc218e7be6 100644
--- a/include/qapi/qmp/qdict.h
+++ b/include/qapi/qmp/qdict.h
@@ -43,6 +43,7 @@ void qdict_del(QDict *qdict, const char *key);
 int qdict_haskey(const QDict *qdict, const char *key);
 QObject *qdict_get(const QDict *qdict, const char *key);
 QDict *qobject_to_qdict(const QObject *obj);
+bool qdict_is_equal(const QObject *x, const QObject *y);
 void qdict_iter(const QDict *qdict,
 void (*iter)(const char *key, QObject *obj, void *opaque),
 void *opaque);
diff --git a/include/qapi/qmp/qlist.h b/include/qapi/qmp/qlist.h
index 59d209bbae..ec3fcc1a4c 100644
--- a/include/qapi/qmp/qlist.h
+++ b/include/qapi/qmp/qlist.h
@@ -61,6 +61,7 @@ QObject *qlist_peek(QList *qlist);
 int qlist_empty(const QList *qlist);
 size_t qlist_size(const QList *qlist);
 QList *qobject_to_qlist(const QObject *obj);
+bool qlist_is_equal(const QObject *x, const QObject *y);
 void qlist_destroy_obj(QObject *obj);
 
 static inline const QListEntry *qlist_first(const QList *qlist)
diff --git a/include/qapi/qmp/qnull.h b/include/qapi/qmp/qnull.h
index d075549283..c992ee2ae1 100644
--- a/include/qapi/qmp/qnull.h
+++ b/include/qapi/qmp/qnull.h
@@ -27,4 +27,6 @@ static inline QNull *qnull(void)
 return _;
 }
 
+bool qnull_is_equal(const QObject *x, const QObject *y);
+
 #endif /* QNULL_H */
diff --git a/include/qapi/qmp/qnum.h b/include/qapi/qmp/qnum.h
index d6b0791139..c3d86794bb 100644
--- a/include/qapi/qmp/qnum.h
+++ b/include/qapi/qmp/qnum.h
@@ -69,6 +69,7 @@ double qnum_get_double(QNum *qn);
 char *qnum_to_string(QNum *qn);
 
 QNum *qobject_to_qnum(const QObject *obj);
+bool qnum_is_equal(const QObject *x, const QObject *y);
 void qnum_destroy_obj(QObject *obj);
 
 #endif /* QNUM_H */
diff --git a/include/qapi/qmp/qobject.h b/include/qapi/qmp/qobject.h
index ef1d1a9237..38ac68845c 100644
--- a/include/qapi/qmp/qobject.h
+++ b/include/qapi/qmp/qobject.h
@@ -68,6 +68,15 @@ static inline void qobject_incref(QObject *obj)
 }
 
 /**
+ * qobject_is_equal(): Return whether the two objects are equal.
+ *
+ * Any of the pointers may be NULL; return true if both are.  Always
+ * return false if only one is (therefore a QNull object is not
+ * considered equal to a NULL pointer).
+ */
+bool qobject_is_equal(const QObject *x, const QObject *y);
+
+/**
  * qobject_destroy(): Free resources used by the object
  */
 void qobject_destroy(QObject *obj);
diff --git a/include/qapi/qmp/qstring.h b/include/qapi/qmp/qstring.h
index 10076b7c8c..65c05a9be5 100644
--- a/include/qapi/qmp/qstring.h
+++ b/include/qapi/qmp/qstring.h
@@ -31,6 +31,7 @@ void qstring_append_int(QString *qstring, int64_t value);
 void qstring_append(QString *qstring, const char *str);
 void qstring_append_chr(QString *qstring, int c);
 QString *qobject_to_qstring(const QObject *obj);
+bool qstring_is_equal(const QObject *x, const QObject *y);
 void qstring_destroy_obj(QObject *obj);
 
 #endif /* QSTRING_H */
diff --git a/qobject/qbool.c b/qobject/qbool.c
index 0606bbd2a3..ac825fc5a2 100644
--- a/qobject/qbool.c
+++ b/qobject/qbool.c
@@ -52,6 +52,14 @@ QBool *qobject_to_qbool(const QObject *obj)
 }
 
 /**
+ * qbool_is_equal(): Test whether the two QBools are equal
+ */
+bool qbool_is_equal(const QObject *x, const QObject *y)
+{
+return qobject_to_qbool(x)->value == qobject_to_qbool(y)->value;
+}
+
+/**
  * 

[Qemu-devel] [PULL 21/25] qcow2: Add bounds check to get_refblock_offset()

2017-11-17 Thread Kevin Wolf
From: Max Reitz 

Reported-by: R. Nageswara Sastry 
Buglink: https://bugs.launchpad.net/qemu/+bug/1728661
Signed-off-by: Max Reitz 
Message-id: 20171110203111.7666-5-mre...@redhat.com
Reviewed-by: Eric Blake 
Reviewed-by: Alberto Garcia 
Signed-off-by: Max Reitz 
---
 block/qcow2.h  |  6 --
 block/qcow2-refcount.c | 26 +-
 tests/qemu-iotests/060 | 46 ++
 tests/qemu-iotests/060.out | 22 ++
 4 files changed, 93 insertions(+), 7 deletions(-)

diff --git a/block/qcow2.h b/block/qcow2.h
index 782a206ecb..6f0ff15dd0 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -527,12 +527,6 @@ uint32_t offset_to_reftable_index(BDRVQcow2State *s, 
uint64_t offset)
 return offset >> (s->refcount_block_bits + s->cluster_bits);
 }
 
-static inline uint64_t get_refblock_offset(BDRVQcow2State *s, uint64_t offset)
-{
-uint32_t index = offset_to_reftable_index(s, offset);
-return s->refcount_table[index] & REFT_OFFSET_MASK;
-}
-
 /* qcow2.c functions */
 int qcow2_backing_read1(BlockDriverState *bs, QEMUIOVector *qiov,
   int64_t sector_num, int nb_sectors);
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 60b8eef3e8..3de1ab51ba 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -3077,16 +3077,40 @@ done:
 return ret;
 }
 
+static int64_t get_refblock_offset(BlockDriverState *bs, uint64_t offset)
+{
+BDRVQcow2State *s = bs->opaque;
+uint32_t index = offset_to_reftable_index(s, offset);
+int64_t covering_refblock_offset = 0;
+
+if (index < s->refcount_table_size) {
+covering_refblock_offset = s->refcount_table[index] & REFT_OFFSET_MASK;
+}
+if (!covering_refblock_offset) {
+qcow2_signal_corruption(bs, true, -1, -1, "Refblock at %#" PRIx64 " is 
"
+"not covered by the refcount structures",
+offset);
+return -EIO;
+}
+
+return covering_refblock_offset;
+}
+
 static int qcow2_discard_refcount_block(BlockDriverState *bs,
 uint64_t discard_block_offs)
 {
 BDRVQcow2State *s = bs->opaque;
-uint64_t refblock_offs = get_refblock_offset(s, discard_block_offs);
+int64_t refblock_offs;
 uint64_t cluster_index = discard_block_offs >> s->cluster_bits;
 uint32_t block_index = cluster_index & (s->refcount_block_size - 1);
 void *refblock;
 int ret;
 
+refblock_offs = get_refblock_offset(bs, discard_block_offs);
+if (refblock_offs < 0) {
+return refblock_offs;
+}
+
 assert(discard_block_offs != 0);
 
 ret = qcow2_cache_get(bs, s->refcount_block_cache, refblock_offs,
diff --git a/tests/qemu-iotests/060 b/tests/qemu-iotests/060
index 44141f6243..c230696b3a 100755
--- a/tests/qemu-iotests/060
+++ b/tests/qemu-iotests/060
@@ -359,6 +359,52 @@ echo '--- Repairing ---'
 _check_test_img -q -r all
 _check_test_img -r all
 
+echo
+echo "=== Discarding an out-of-bounds refblock ==="
+echo
+
+_make_test_img 64M
+
+# Pretend there's a refblock really up high
+poke_file "$TEST_IMG" "$(($rt_offset+8))" "\x00\xff\xff\xff\x00\x00\x00\x00"
+# Let's try to shrink the qcow2 image so that the block driver tries
+# to discard that refblock (and see what happens!)
+$QEMU_IMG resize --shrink "$TEST_IMG" 32M
+
+echo '--- Checking and retrying ---'
+# Image should not be resized
+_img_info | grep 'virtual size'
+# But it should pass this check, because the "partial" resize has
+# already overwritten refblocks past the end
+_check_test_img -r all
+# So let's try again
+$QEMU_IMG resize --shrink "$TEST_IMG" 32M
+_img_info | grep 'virtual size'
+
+echo
+echo "=== Discarding a non-covered in-bounds refblock ==="
+echo
+
+IMGOPTS='refcount_bits=1' _make_test_img 64M
+
+# Pretend there's a refblock somewhere where there is no refblock to
+# cover it (but the covering refblock has a valid index in the
+# reftable)
+# Every refblock covers 65536 * 8 * 65536 = 32 GB, so we have to point
+# to 0x10__ (64G) to point to the third refblock
+poke_file "$TEST_IMG" "$(($rt_offset+8))" "\x00\x00\x00\x10\x00\x00\x00\x00"
+$QEMU_IMG resize --shrink "$TEST_IMG" 32M
+
+echo '--- Checking and retrying ---'
+# Image should not be resized
+_img_info | grep 'virtual size'
+# But it should pass this check, because the "partial" resize has
+# already overwritten refblocks past the end
+_check_test_img -r all
+# So let's try again
+$QEMU_IMG resize --shrink "$TEST_IMG" 32M
+_img_info | grep 'virtual size'
+
 # success, all done
 echo "*** done"
 rm -f $seq.full
diff --git a/tests/qemu-iotests/060.out b/tests/qemu-iotests/060.out
index 07dfdcac99..358e54cdc9 100644
--- a/tests/qemu-iotests/060.out
+++ b/tests/qemu-iotests/060.out
@@ -348,4 +348,26 @@ The following inconsistencies were found 

[Qemu-devel] [PULL 16/25] iotests: Add test for failing qemu-img commit

2017-11-17 Thread Kevin Wolf
From: Max Reitz 

Signed-off-by: Max Reitz 
Message-id: 20170616135847.17726-1-mre...@redhat.com
Reviewed-by: Eric Blake 
Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/020 | 27 +++
 tests/qemu-iotests/020.out | 17 +
 2 files changed, 44 insertions(+)

diff --git a/tests/qemu-iotests/020 b/tests/qemu-iotests/020
index 7a00ec..cefe3a830e 100755
--- a/tests/qemu-iotests/020
+++ b/tests/qemu-iotests/020
@@ -110,6 +110,33 @@ for offset in $TEST_OFFSETS; do
 io_zero readv $(( offset + 64 * 1024 + 65536 * 4 )) 65536 65536 1
 done
 _check_test_img
+_cleanup
+TEST_IMG=$TEST_IMG_SAVE
+
+echo
+echo 'Testing failing commit'
+echo
+
+# Create an image with a null backing file to which committing will fail (with
+# ENOSPC so we can distinguish the result from some generic EIO which may be
+# generated anywhere in the block layer)
+_make_test_img -b "json:{'driver': 'raw',
+ 'file': {
+ 'driver': 'blkdebug',
+ 'inject-error': [{
+ 'event': 'write_aio',
+ 'errno': 28,
+ 'once': true
+ }],
+ 'image': {
+ 'driver': 'null-co'
+ }}}"
+
+# Just write anything so committing will not be a no-op
+$QEMU_IO -c 'writev 0 64k' "$TEST_IMG" | _filter_qemu_io
+
+$QEMU_IMG commit "$TEST_IMG"
+_cleanup_test_img
 
 # success, all done
 echo "*** done"
diff --git a/tests/qemu-iotests/020.out b/tests/qemu-iotests/020.out
index 42f6c1b151..165b70aa49 100644
--- a/tests/qemu-iotests/020.out
+++ b/tests/qemu-iotests/020.out
@@ -1075,4 +1075,21 @@ read 65536/65536 bytes at offset 4295098368
 read 65536/65536 bytes at offset 4295294976
 64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 No errors were found on the image.
+
+Testing failing commit
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1073741824 
backing_file=json:{'driver': 'raw',,
+ 'file': {
+ 'driver': 'blkdebug',,
+ 'inject-error': [{
+ 'event': 'write_aio',,
+ 'errno': 28,,
+ 'once': true
+ }],,
+ 'image': {
+ 'driver': 'null-co'
+ }}}
+wrote 65536/65536 bytes at offset 0
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+qemu-img: Block job failed: No space left on device
 *** done
-- 
2.13.6




[Qemu-devel] [PULL 08/25] iotests: test clearing unknown autoclear_features by qcow2

2017-11-17 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

Test clearing unknown autoclear_features by qcow2 on incoming
migration.

[ kwolf: Fixed wait for destination VM startup ]

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Signed-off-by: Kevin Wolf 
Reviewed-by: Max Reitz 
---
 tests/qemu-iotests/196 | 66 ++
 tests/qemu-iotests/196.out |  5 
 tests/qemu-iotests/group   |  1 +
 3 files changed, 72 insertions(+)
 create mode 100755 tests/qemu-iotests/196
 create mode 100644 tests/qemu-iotests/196.out

diff --git a/tests/qemu-iotests/196 b/tests/qemu-iotests/196
new file mode 100755
index 00..4116ebc92b
--- /dev/null
+++ b/tests/qemu-iotests/196
@@ -0,0 +1,66 @@
+#!/usr/bin/env python
+#
+# Test clearing unknown autoclear_features flag by qcow2 after
+# migration. This test mimics migration to older qemu.
+#
+# Copyright (c) 2017 Parallels International GmbH
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+#
+
+import os
+import iotests
+from iotests import qemu_img
+
+disk = os.path.join(iotests.test_dir, 'disk')
+migfile = os.path.join(iotests.test_dir, 'migfile')
+
+class TestInvalidateAutoclear(iotests.QMPTestCase):
+
+def tearDown(self):
+self.vm_a.shutdown()
+self.vm_b.shutdown()
+os.remove(disk)
+os.remove(migfile)
+
+def setUp(self):
+qemu_img('create', '-f', iotests.imgfmt, disk, '1M')
+
+self.vm_a = iotests.VM(path_suffix='a').add_drive(disk)
+self.vm_a.launch()
+
+self.vm_b = iotests.VM(path_suffix='b').add_drive(disk)
+self.vm_b.add_incoming("exec: cat '" + migfile + "'")
+
+def test_migration(self):
+result = self.vm_a.qmp('migrate', uri='exec:cat>' + migfile)
+self.assert_qmp(result, 'return', {});
+self.assertNotEqual(self.vm_a.event_wait("STOP"), None)
+
+with open(disk, 'r+b') as f:
+f.seek(88) # first byte of autoclear_features field
+f.write(b'\xff')
+
+self.vm_b.launch()
+while True:
+result = self.vm_b.qmp('query-status')
+if result['return']['status'] == 'running':
+break
+
+with open(disk, 'rb') as f:
+f.seek(88)
+self.assertEqual(f.read(1), b'\x00')
+
+if __name__ == '__main__':
+iotests.main(supported_fmts=['qcow2'])
diff --git a/tests/qemu-iotests/196.out b/tests/qemu-iotests/196.out
new file mode 100644
index 00..ae1213e6f8
--- /dev/null
+++ b/tests/qemu-iotests/196.out
@@ -0,0 +1,5 @@
+.
+--
+Ran 1 tests
+
+OK
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index 83b839bbe3..1fad602152 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -193,5 +193,6 @@
 192 rw auto quick
 194 rw auto migration quick
 195 rw auto quick
+196 rw auto quick
 197 rw auto quick
 198 rw auto
-- 
2.13.6




[Qemu-devel] [PULL 09/25] qcow2: fix image corruption on commit with persistent bitmap

2017-11-17 Thread Kevin Wolf
From: Eric Blake 

If an image contains persistent bitmaps, we cannot use the
fast path of bdrv_make_empty() to clear the image during
qemu-img commit, because that will lose the clusters related
to the bitmaps.

Also leave a comment in qcow2_read_extensions to remind future
feature additions to think about fast-path removal, since we
just barely fixed the same bug for LUKS encryption.

It's a pain that qemu-img has not yet been taught to manipulate,
or even at a very minimum display, information about persistent
bitmaps; instead, we have to use QMP commands.  It's also a
pain that only qeury-block and x-debug-block-dirty-bitmap-sha256
will allow bitmap introspection; but the former requires the
node to be hooked to a block device, and the latter is experimental.

Signed-off-by: Eric Blake 
Signed-off-by: Kevin Wolf 
---
 block/qcow2.c  |  17 ++--
 tests/qemu-iotests/176 |  55 ++--
 tests/qemu-iotests/176.out | 216 -
 3 files changed, 270 insertions(+), 18 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index e9a86b7443..f2731a7cb5 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -376,6 +376,8 @@ static int qcow2_read_extensions(BlockDriverState *bs, 
uint64_t start_offset,
 
 default:
 /* unknown magic - save it in case we need to rewrite the header */
+/* If you add a new feature, make sure to also update the fast
+ * path of qcow2_make_empty() to deal with it. */
 {
 Qcow2UnknownHeaderExtension *uext;
 
@@ -3600,15 +3602,16 @@ static int qcow2_make_empty(BlockDriverState *bs)
 
 l1_clusters = DIV_ROUND_UP(s->l1_size, s->cluster_size / sizeof(uint64_t));
 
-if (s->qcow_version >= 3 && !s->snapshots &&
+if (s->qcow_version >= 3 && !s->snapshots && !s->nb_bitmaps &&
 3 + l1_clusters <= s->refcount_block_size &&
 s->crypt_method_header != QCOW_CRYPT_LUKS) {
-/* The following function only works for qcow2 v3 images (it requires
- * the dirty flag) and only as long as there are no snapshots (because
- * it completely empties the image). Furthermore, the L1 table and 
three
- * additional clusters (image header, refcount table, one refcount
- * block) have to fit inside one refcount block. It cannot be used
- * for LUKS (yet) as it throws away the LUKS header cluster(s) */
+/* The following function only works for qcow2 v3 images (it
+ * requires the dirty flag) and only as long as there are no
+ * features that reserve extra clusters (such as snapshots,
+ * LUKS header, or persistent bitmaps), because it completely
+ * empties the image.  Furthermore, the L1 table and three
+ * additional clusters (image header, refcount table, one
+ * refcount block) have to fit inside one refcount block. */
 return make_completely_empty(bs);
 }
 
diff --git a/tests/qemu-iotests/176 b/tests/qemu-iotests/176
index 950b28720e..0f31a20294 100755
--- a/tests/qemu-iotests/176
+++ b/tests/qemu-iotests/176
@@ -3,10 +3,11 @@
 # Commit changes into backing chains and empty the top image if the
 # backing image is not explicitly specified.
 #
-# Variant of 097, which includes snapshots to test different codepath
-# in qcow2
+# Variant of 097, which includes snapshots and persistent bitmaps, to
+# tickle the slow codepath in qcow2. See also 198, for another feature
+# that tickles the slow codepath.
 #
-# Copyright (C) 2014 Red Hat, Inc.
+# Copyright (C) 2014, 2017 Red Hat, Inc.
 #
 # This program is free software; you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
@@ -43,11 +44,18 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 . ./common.filter
 . ./common.pattern
 
-# Any format supporting backing files and bdrv_make_empty
+# This test is specific to qcow2
 _supported_fmt qcow2
 _supported_proto file
 _supported_os Linux
 
+function run_qemu()
+{
+$QEMU -nographic -qmp stdio -serial none "$@" 2>&1 \
+   | _filter_testdir | _filter_qmp | _filter_qemu
+}
+
+for reason in snapshot bitmap; do
 
 # Four passes:
 #  0: Two-layer backing chain, commit to upper backing file (implicitly)
@@ -66,14 +74,29 @@ _supported_os Linux
 for i in 0 1 2 3; do
 
 echo
-echo "=== Test pass $i ==="
+echo "=== Test pass $reason.$i ==="
 echo
 
 len=$((2100 * 1024 * 1024 + 512)) # larger than 2G, and not cluster aligned
 TEST_IMG="$TEST_IMG.base" _make_test_img $len
 TEST_IMG="$TEST_IMG.itmd" _make_test_img -b "$TEST_IMG.base" $len
 _make_test_img -b "$TEST_IMG.itmd" $len
-$QEMU_IMG snapshot -c snap "$TEST_IMG"
+# Update the top image to use a feature that is incompatible with fast path
+case $reason in
+snapshot) $QEMU_IMG snapshot -c snap "$TEST_IMG" ;;
+bitmap)
+   run_qemu <

[Qemu-devel] [PULL 17/25] qcow2: reject unaligned offsets in write compressed

2017-11-17 Thread Kevin Wolf
From: Anton Nefedov 

Misaligned compressed write is not supported.

Signed-off-by: Anton Nefedov 
Message-id: 1510654613-47868-2-git-send-email-anton.nefe...@virtuozzo.com
Reviewed-by: Eric Blake 
Signed-off-by: Max Reitz 
---
 block/qcow2.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/block/qcow2.c b/block/qcow2.c
index f2731a7cb5..811b913233 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -3358,6 +3358,10 @@ qcow2_co_pwritev_compressed(BlockDriverState *bs, 
uint64_t offset,
 return bdrv_truncate(bs->file, cluster_offset, PREALLOC_MODE_OFF, 
NULL);
 }
 
+if (offset_into_cluster(s, offset)) {
+return -EINVAL;
+}
+
 buf = qemu_blockalign(bs, s->cluster_size);
 if (bytes != s->cluster_size) {
 if (bytes > s->cluster_size ||
-- 
2.13.6




[Qemu-devel] [PULL 11/25] qapi/qlist: Add qlist_append_null() macro

2017-11-17 Thread Kevin Wolf
From: Max Reitz 

Besides the macro itself, this patch also adds a corresponding
Coccinelle rule.

Signed-off-by: Max Reitz 
Reviewed-by: Eric Blake 
Reviewed-by: Alberto Garcia 
Message-id: 20171114180128.17076-3-mre...@redhat.com
Signed-off-by: Max Reitz 
---
 include/qapi/qmp/qlist.h | 3 +++
 scripts/coccinelle/qobject.cocci | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/include/qapi/qmp/qlist.h b/include/qapi/qmp/qlist.h
index c4b5fdad9b..59d209bbae 100644
--- a/include/qapi/qmp/qlist.h
+++ b/include/qapi/qmp/qlist.h
@@ -15,6 +15,7 @@
 
 #include "qapi/qmp/qobject.h"
 #include "qapi/qmp/qnum.h"
+#include "qapi/qmp/qnull.h"
 #include "qemu/queue.h"
 
 typedef struct QListEntry {
@@ -37,6 +38,8 @@ typedef struct QList {
 qlist_append(qlist, qbool_from_bool(value))
 #define qlist_append_str(qlist, value) \
 qlist_append(qlist, qstring_from_str(value))
+#define qlist_append_null(qlist) \
+qlist_append(qlist, qnull())
 
 #define QLIST_FOREACH_ENTRY(qlist, var) \
 for ((var) = ((qlist)->head.tqh_first); \
diff --git a/scripts/coccinelle/qobject.cocci b/scripts/coccinelle/qobject.cocci
index 1120eb1a42..47bcafe9a9 100644
--- a/scripts/coccinelle/qobject.cocci
+++ b/scripts/coccinelle/qobject.cocci
@@ -41,4 +41,7 @@ expression Obj, E;
 |
 - qlist_append(Obj, qstring_from_str(E));
 + qlist_append_str(Obj, E);
+|
+- qlist_append(Obj, qnull());
++ qlist_append_null(Obj);
 )
-- 
2.13.6




[Qemu-devel] [PULL 05/25] block: Deprecate bdrv_set_read_only() and users

2017-11-17 Thread Kevin Wolf
bdrv_set_read_only() is used by some block drivers to override the
read-only option given by the user. This is not how read-only images
generally work in QEMU: Instead of second guessing what the user really
meant (which currently includes making an image read-only even if the
user didn't only use the default, but explicitly said read-only=off), we
should error out if we can't provide what the user requested.

This adds deprecation warnings to all callers of bdrv_set_read_only() so
that the behaviour can be corrected after the usual deprecation period.

Signed-off-by: Kevin Wolf 
---
 qapi/block-core.json |  7 +--
 block.c  |  5 +
 block/bochs.c| 13 ++---
 block/cloop.c| 13 ++---
 block/dmg.c  | 12 +---
 block/rbd.c  | 14 ++
 block/vvfat.c|  6 +-
 7 files changed, 54 insertions(+), 16 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index ab96e348e6..76bf50f813 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -3134,8 +3134,11 @@
 # This option is required on the top level of blockdev-add.
 # @discard:   discard-related options (default: ignore)
 # @cache: cache-related options
-# @read-only: whether the block device should be read-only
-# (default: false)
+# @read-only: whether the block device should be read-only (default: 
false).
+# Note that some block drivers support only read-only access,
+# either generally or in certain configurations. In this case,
+# the default value does not work and the option must be
+# specified explicitly.
 # @detect-zeroes: detect and optimize zero writes (Since 2.1)
 # (default: off)
 # @force-share:   force share all permission on added nodes.
diff --git a/block.c b/block.c
index f6415547fe..0ed0c27140 100644
--- a/block.c
+++ b/block.c
@@ -261,6 +261,11 @@ int bdrv_can_set_read_only(BlockDriverState *bs, bool 
read_only,
 return 0;
 }
 
+/* TODO Remove (deprecated since 2.11)
+ * Block drivers are not supposed to automatically change bs->read_only.
+ * Instead, they should just check whether they can provide what the user
+ * explicitly requested and error out if read-write is requested, but they can
+ * only provide read-only access. */
 int bdrv_set_read_only(BlockDriverState *bs, bool read_only, Error **errp)
 {
 int ret = 0;
diff --git a/block/bochs.c b/block/bochs.c
index a759b6eff0..50c630047b 100644
--- a/block/bochs.c
+++ b/block/bochs.c
@@ -28,6 +28,7 @@
 #include "block/block_int.h"
 #include "qemu/module.h"
 #include "qemu/bswap.h"
+#include "qemu/error-report.h"
 
 /**/
 
@@ -110,9 +111,15 @@ static int bochs_open(BlockDriverState *bs, QDict 
*options, int flags,
 return -EINVAL;
 }
 
-ret = bdrv_set_read_only(bs, true, errp); /* no write support yet */
-if (ret < 0) {
-return ret;
+if (!bdrv_is_read_only(bs)) {
+error_report("Opening bochs images without an explicit read-only=on "
+ "option is deprecated. Future versions will refuse to "
+ "open the image instead of automatically marking the "
+ "image read-only.");
+ret = bdrv_set_read_only(bs, true, errp); /* no write support yet */
+if (ret < 0) {
+return ret;
+}
 }
 
 ret = bdrv_pread(bs->file, 0, , sizeof(bochs));
diff --git a/block/cloop.c b/block/cloop.c
index d6597fcf78..2be68987bd 100644
--- a/block/cloop.c
+++ b/block/cloop.c
@@ -23,6 +23,7 @@
  */
 #include "qemu/osdep.h"
 #include "qapi/error.h"
+#include "qemu/error-report.h"
 #include "qemu-common.h"
 #include "block/block_int.h"
 #include "qemu/module.h"
@@ -72,9 +73,15 @@ static int cloop_open(BlockDriverState *bs, QDict *options, 
int flags,
 return -EINVAL;
 }
 
-ret = bdrv_set_read_only(bs, true, errp);
-if (ret < 0) {
-return ret;
+if (!bdrv_is_read_only(bs)) {
+error_report("Opening cloop images without an explicit read-only=on "
+ "option is deprecated. Future versions will refuse to "
+ "open the image instead of automatically marking the "
+ "image read-only.");
+ret = bdrv_set_read_only(bs, true, errp);
+if (ret < 0) {
+return ret;
+}
 }
 
 /* read header */
diff --git a/block/dmg.c b/block/dmg.c
index 6c0711f563..c9b3c519c4 100644
--- a/block/dmg.c
+++ b/block/dmg.c
@@ -419,9 +419,15 @@ static int dmg_open(BlockDriverState *bs, QDict *options, 
int flags,
 return -EINVAL;
 }
 
-ret = bdrv_set_read_only(bs, true, errp);
-if (ret < 0) {
-return ret;
+if (!bdrv_is_read_only(bs)) {
+error_report("Opening dmg images without an explicit read-only=on "
+   

[Qemu-devel] [PULL 07/25] block: Fix permissions in image activation

2017-11-17 Thread Kevin Wolf
Inactive images generally request less permissions for their image files
than they would if they were active (in particular, write permissions).
Activating the image involves extending the permissions, therefore.

drv->bdrv_invalidate_cache() can already require write access to the
image file, so we have to update the permissions earlier than that.
The current code does it only later, so we have to move up this part.

Signed-off-by: Kevin Wolf 
Reviewed-by: Max Reitz 
---
 block.c | 32 ++--
 1 file changed, 22 insertions(+), 10 deletions(-)

diff --git a/block.c b/block.c
index 0ed0c27140..752fe6192b 100644
--- a/block.c
+++ b/block.c
@@ -4174,7 +4174,29 @@ void bdrv_invalidate_cache(BlockDriverState *bs, Error 
**errp)
 }
 }
 
+/*
+ * Update permissions, they may differ for inactive nodes.
+ *
+ * Note that the required permissions of inactive images are always a
+ * subset of the permissions required after activating the image. This
+ * allows us to just get the permissions upfront without restricting
+ * drv->bdrv_invalidate_cache().
+ *
+ * It also means that in error cases, we don't have to try and revert to
+ * the old permissions (which is an operation that could fail, too). We can
+ * just keep the extended permissions for the next time that an activation
+ * of the image is tried.
+ */
 bs->open_flags &= ~BDRV_O_INACTIVE;
+bdrv_get_cumulative_perm(bs, , _perm);
+ret = bdrv_check_perm(bs, NULL, perm, shared_perm, NULL, _err);
+if (ret < 0) {
+bs->open_flags |= BDRV_O_INACTIVE;
+error_propagate(errp, local_err);
+return;
+}
+bdrv_set_perm(bs, perm, shared_perm);
+
 if (bs->drv->bdrv_invalidate_cache) {
 bs->drv->bdrv_invalidate_cache(bs, _err);
 if (local_err) {
@@ -4191,16 +4213,6 @@ void bdrv_invalidate_cache(BlockDriverState *bs, Error 
**errp)
 return;
 }
 
-/* Update permissions, they may differ for inactive nodes */
-bdrv_get_cumulative_perm(bs, , _perm);
-ret = bdrv_check_perm(bs, NULL, perm, shared_perm, NULL, _err);
-if (ret < 0) {
-bs->open_flags |= BDRV_O_INACTIVE;
-error_propagate(errp, local_err);
-return;
-}
-bdrv_set_perm(bs, perm, shared_perm);
-
 QLIST_FOREACH(parent, >parents, next_parent) {
 if (parent->role->activate) {
 parent->role->activate(parent, _err);
-- 
2.13.6




[Qemu-devel] [PULL 14/25] iotests: Add test for non-string option reopening

2017-11-17 Thread Kevin Wolf
From: Max Reitz 

Signed-off-by: Max Reitz 
Reviewed-by: Kevin Wolf 
Reviewed-by: Eric Blake 
Message-id: 20171114180128.17076-6-mre...@redhat.com
Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/133 | 9 +
 tests/qemu-iotests/133.out | 5 +
 2 files changed, 14 insertions(+)

diff --git a/tests/qemu-iotests/133 b/tests/qemu-iotests/133
index 9d35a6a1ca..af6b3e1dd4 100755
--- a/tests/qemu-iotests/133
+++ b/tests/qemu-iotests/133
@@ -83,6 +83,15 @@ $QEMU_IO -c 'reopen -o driver=qcow2' $TEST_IMG
 $QEMU_IO -c 'reopen -o file.driver=file' $TEST_IMG
 $QEMU_IO -c 'reopen -o backing.driver=qcow2' $TEST_IMG
 
+echo
+echo "=== Check that reopening works with non-string options ==="
+echo
+
+# Using the json: pseudo-protocol we can create non-string options
+# (Invoke 'info' just so we get some output afterwards)
+IMGOPTSSYNTAX=false $QEMU_IO -f null-co -c 'reopen' -c 'info' \
+"json:{'driver': 'null-co', 'size': 65536}"
+
 # success, all done
 echo "*** done"
 rm -f $seq.full
diff --git a/tests/qemu-iotests/133.out b/tests/qemu-iotests/133.out
index cc86b94880..f4a85aeb63 100644
--- a/tests/qemu-iotests/133.out
+++ b/tests/qemu-iotests/133.out
@@ -19,4 +19,9 @@ Cannot change the option 'driver'
 
 === Check that unchanged driver is okay ===
 
+
+=== Check that reopening works with non-string options ===
+
+format name: null-co
+format name: null-co
 *** done
-- 
2.13.6




[Qemu-devel] [PULL 06/25] qcow2: fix image corruption after committing qcow2 image into base

2017-11-17 Thread Kevin Wolf
From: "Daniel P. Berrange" 

After committing the qcow2 image contents into the base image, qemu-img
will call bdrv_make_empty to drop the payload in the layered image.

When this is done for qcow2 images, it blows away the LUKS encryption
header, making the resulting image unusable. There are two codepaths
for emptying a qcow2 image, and the second (slower) codepath leaves
the LUKS header intact, so force use of that codepath.

Reviewed-by: Eric Blake 
Signed-off-by: Daniel P. Berrange 
Signed-off-by: Kevin Wolf 
---
 block/qcow2.c|   6 +-
 tests/qemu-iotests/198   | 104 
 tests/qemu-iotests/198.out   | 126 +++
 tests/qemu-iotests/common.filter |   4 +-
 tests/qemu-iotests/group |   1 +
 5 files changed, 238 insertions(+), 3 deletions(-)
 create mode 100755 tests/qemu-iotests/198
 create mode 100644 tests/qemu-iotests/198.out

diff --git a/block/qcow2.c b/block/qcow2.c
index 92e5d548e3..e9a86b7443 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -3601,12 +3601,14 @@ static int qcow2_make_empty(BlockDriverState *bs)
 l1_clusters = DIV_ROUND_UP(s->l1_size, s->cluster_size / sizeof(uint64_t));
 
 if (s->qcow_version >= 3 && !s->snapshots &&
-3 + l1_clusters <= s->refcount_block_size) {
+3 + l1_clusters <= s->refcount_block_size &&
+s->crypt_method_header != QCOW_CRYPT_LUKS) {
 /* The following function only works for qcow2 v3 images (it requires
  * the dirty flag) and only as long as there are no snapshots (because
  * it completely empties the image). Furthermore, the L1 table and 
three
  * additional clusters (image header, refcount table, one refcount
- * block) have to fit inside one refcount block. */
+ * block) have to fit inside one refcount block. It cannot be used
+ * for LUKS (yet) as it throws away the LUKS header cluster(s) */
 return make_completely_empty(bs);
 }
 
diff --git a/tests/qemu-iotests/198 b/tests/qemu-iotests/198
new file mode 100755
index 00..34ef666381
--- /dev/null
+++ b/tests/qemu-iotests/198
@@ -0,0 +1,104 @@
+#!/bin/bash
+#
+# Test commit of encrypted qcow2 files
+#
+# Copyright (C) 2017 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+#
+
+# creator
+owner=berra...@redhat.com
+
+seq=`basename $0`
+echo "QA output created by $seq"
+
+here=`pwd`
+status=1   # failure is the default!
+
+_cleanup()
+{
+   _cleanup_test_img
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+
+_supported_fmt qcow2
+_supported_proto generic
+_supported_os Linux
+
+
+size=16M
+TEST_IMG_BASE=$TEST_IMG.base
+SECRET0="secret,id=sec0,data=astrochicken"
+SECRET1="secret,id=sec1,data=furby"
+
+TEST_IMG_SAVE=$TEST_IMG
+TEST_IMG=$TEST_IMG_BASE
+echo "== create base =="
+_make_test_img --object $SECRET0 -o 
"encrypt.format=luks,encrypt.key-secret=sec0,encrypt.iter-time=10" $size
+TEST_IMG=$TEST_IMG_SAVE
+
+IMGSPECBASE="driver=$IMGFMT,file.filename=$TEST_IMG_BASE,encrypt.key-secret=sec0"
+IMGSPECLAYER="driver=$IMGFMT,file.filename=$TEST_IMG,encrypt.key-secret=sec1"
+IMGSPEC="$IMGSPECLAYER,backing.driver=$IMGFMT,backing.file.filename=$TEST_IMG_BASE,backing.encrypt.key-secret=sec0"
+QEMU_IO_OPTIONS=$QEMU_IO_OPTIONS_NO_FMT
+
+echo
+echo "== writing whole image base =="
+$QEMU_IO --object $SECRET0 -c "write -P 0xa 0 $size" --image-opts $IMGSPECBASE 
| _filter_qemu_io | _filter_testdir
+
+echo "== create overlay =="
+_make_test_img --object $SECRET1 -o 
"encrypt.format=luks,encrypt.key-secret=sec1,encrypt.iter-time=10" -u -b 
"$TEST_IMG_BASE" $size
+
+echo
+echo "== writing whole image layer =="
+$QEMU_IO --object $SECRET0 --object $SECRET1 -c "write -P 0x9 0 $size" 
--image-opts $IMGSPEC | _filter_qemu_io | _filter_testdir
+
+echo
+echo "== verify pattern base =="
+$QEMU_IO --object $SECRET0 -c "read -P 0xa 0 $size" --image-opts $IMGSPECBASE 
| _filter_qemu_io | _filter_testdir
+
+echo
+echo "== verify pattern layer =="
+$QEMU_IO --object $SECRET0 --object $SECRET1 -c "read -P 0x9 0 $size" 
--image-opts $IMGSPEC | _filter_qemu_io | _filter_testdir
+
+echo
+echo "== committing layer into base =="
+$QEMU_IMG commit --object $SECRET0 

[Qemu-devel] [PULL 00/25] Block layer patches for 2.11.0-rc2

2017-11-17 Thread Kevin Wolf
The following changes since commit fec035a53fa15c4c8c4e62bfef56a35df4161e38:

  Merge remote-tracking branch 'remotes/kraxel/tags/ui-20171117-pull-request' 
into staging (2017-11-17 10:18:41 +)

are available in the git repository at:

  git://repo.or.cz/qemu/kevin.git tags/for-upstream

for you to fetch changes up to d5a49c6e7d9e42059450674ec845b7bc0d62cb7e:

  Merge remote-tracking branch 'mreitz/tags/pull-block-2017-11-17' into 
queue-block (2017-11-17 18:24:30 +0100)


Block layer patches for 2.11.0-rc2


Anton Nefedov (1):
  qcow2: reject unaligned offsets in write compressed

Daniel P. Berrange (2):
  qcow2: don't permit changing encryption parameters
  qcow2: fix image corruption after committing qcow2 image into base

Eric Blake (1):
  qcow2: fix image corruption on commit with persistent bitmap

Kevin Wolf (5):
  qemu-iotests: Use -nographic in 182
  block: Fix error path in bdrv_backing_update_filename()
  block: Deprecate bdrv_set_read_only() and users
  block: Fix permissions in image activation
  Merge remote-tracking branch 'mreitz/tags/pull-block-2017-11-17' into 
queue-block

Max Reitz (15):
  qapi/qnull: Add own header
  qapi/qlist: Add qlist_append_null() macro
  qapi: Add qobject_is_equal()
  block: qobject_is_equal() in bdrv_reopen_prepare()
  iotests: Add test for non-string option reopening
  tests: Add check-qobject for equality tests
  iotests: Add test for failing qemu-img commit
  qcow2: check_errors are fatal
  qcow2: Unaligned zero cluster in handle_alloc()
  block: Guard against NULL bs->drv
  qcow2: Add bounds check to get_refblock_offset()
  qcow2: Refuse to get unaligned offsets from cache
  qcow2: Fix overly broad madvise()
  block: Make bdrv_next() keep strong references
  iotests: Make 087 pass without AIO enabled

Vladimir Sementsov-Ogievskiy (1):
  iotests: test clearing unknown autoclear_features by qcow2

Wang Guang (1):
  replication: Fix replication open fail

 qapi/block-core.json |   7 +-
 block/qcow2.h|   6 -
 include/block/block.h|   1 +
 include/qapi/qmp/qbool.h |   1 +
 include/qapi/qmp/qdict.h |   2 +
 include/qapi/qmp/qlist.h |   4 +
 include/qapi/qmp/qnull.h |  32 
 include/qapi/qmp/qnum.h  |   1 +
 include/qapi/qmp/qobject.h   |  21 ++-
 include/qapi/qmp/qstring.h   |   1 +
 include/qapi/qmp/types.h |   1 +
 block.c  |  90 ---
 block/block-backend.c|  48 +-
 block/bochs.c|  13 +-
 block/cloop.c|  13 +-
 block/dmg.c  |  12 +-
 block/io.c   |  36 +
 block/qapi.c |   8 +-
 block/qcow2-cache.c  |  23 ++-
 block/qcow2-cluster.c|  13 +-
 block/qcow2-refcount.c   |  26 +++-
 block/qcow2.c|  31 +++-
 block/rbd.c  |  14 +-
 block/replication.c  |  26 +++-
 block/snapshot.c |   6 +
 block/vvfat.c|   8 +-
 migration/block.c|   1 +
 qapi/qapi-clone-visitor.c|   1 +
 qapi/string-input-visitor.c  |   1 +
 qobject/qbool.c  |   8 +
 qobject/qdict.c  |  29 
 qobject/qlist.c  |  32 
 qobject/qnull.c  |  11 +-
 qobject/qnum.c   |  54 +++
 qobject/qobject.c|  29 
 qobject/qstring.c|   9 ++
 tests/check-qnull.c  |   2 +-
 tests/check-qobject.c| 328 +++
 scripts/coccinelle/qobject.cocci |   3 +
 tests/.gitignore |   1 +
 tests/Makefile.include   |   4 +-
 tests/qemu-iotests/020   |  27 
 tests/qemu-iotests/020.out   |  17 ++
 tests/qemu-iotests/060   | 125 +++
 tests/qemu-iotests/060.out   | 115 ++
 tests/qemu-iotests/087   |   9 +-
 tests/qemu-iotests/133   |   9 ++
 tests/qemu-iotests/133.out   |   5 +
 tests/qemu-iotests/176   |  55 ++-
 tests/qemu-iotests/176.out   | 216 +-
 tests/qemu-iotests/182   |   2 +-
 tests/qemu-iotests/196   |  66 
 tests/qemu-iotests/196.out   |   5 +
 tests/qemu-iotests/198   | 104 +
 tests/qemu-iotests/198.out   | 126 +++
 tests/qemu-iotests/common.filter |   4 +-
 tests/qemu-iotests/group |   2 +
 57 files changed, 1751 insertions(+), 93 deletions(-)
 create mode 100644 include/qapi/qmp/qnull.h
 create mode 100644 tests/check-qobject.c
 create mode 100755 tests/qemu-iotests/196
 create mode 100644 tests/qemu-iotests/196.

[Qemu-devel] [PULL 01/25] replication: Fix replication open fail

2017-11-17 Thread Kevin Wolf
From: Wang Guang 

replication_child_perm request write
permissions for all child which will lead bdrv_check_perm fail.
replication_child_perm() should request write
permissions only if it is writable itself.

Signed-off-by: Wang Guang 
Signed-off-by: Wang Yong 
Reviewed-by: Xie Changlong 
Signed-off-by: Kevin Wolf 
---
 block/replication.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/block/replication.c b/block/replication.c
index 3a4e6822e4..1c95d673ff 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -161,10 +161,13 @@ static void replication_child_perm(BlockDriverState *bs, 
BdrvChild *c,
uint64_t perm, uint64_t shared,
uint64_t *nperm, uint64_t *nshared)
 {
-*nperm = *nshared = BLK_PERM_CONSISTENT_READ \
-| BLK_PERM_WRITE \
-| BLK_PERM_WRITE_UNCHANGED;
-
+*nperm = BLK_PERM_CONSISTENT_READ;
+if ((bs->open_flags & (BDRV_O_INACTIVE | BDRV_O_RDWR)) == BDRV_O_RDWR) {
+*nperm |= BLK_PERM_WRITE;
+}
+*nshared = BLK_PERM_CONSISTENT_READ \
+   | BLK_PERM_WRITE \
+   | BLK_PERM_WRITE_UNCHANGED;
 return;
 }
 
-- 
2.13.6




[Qemu-devel] [PULL 13/25] block: qobject_is_equal() in bdrv_reopen_prepare()

2017-11-17 Thread Kevin Wolf
From: Max Reitz 

Currently, bdrv_reopen_prepare() assumes that all BDS options are
strings. However, this is not the case if the BDS has been created
through the json: pseudo-protocol or blockdev-add.

Note that the user-invokable reopen command is an HMP command, so you
can only specify strings there. Therefore, specifying a non-string
option with the "same" value as it was when originally created will now
return an error because the values are supposedly similar (and there is
no way for the user to circumvent this but to just not specify the
option again -- however, this is still strictly better than just
crashing).

Signed-off-by: Max Reitz 
Reviewed-by: Eric Blake 
Message-id: 20171114180128.17076-5-mre...@redhat.com
Signed-off-by: Max Reitz 
---
 block.c | 29 ++---
 1 file changed, 18 insertions(+), 11 deletions(-)

diff --git a/block.c b/block.c
index 752fe6192b..70c6d7cf94 100644
--- a/block.c
+++ b/block.c
@@ -3074,19 +3074,26 @@ int bdrv_reopen_prepare(BDRVReopenState *reopen_state, 
BlockReopenQueue *queue,
 const QDictEntry *entry = qdict_first(reopen_state->options);
 
 do {
-QString *new_obj = qobject_to_qstring(entry->value);
-const char *new = qstring_get_str(new_obj);
+QObject *new = entry->value;
+QObject *old = qdict_get(reopen_state->bs->options, entry->key);
+
 /*
- * Caution: while qdict_get_try_str() is fine, getting
- * non-string types would require more care.  When
- * bs->options come from -blockdev or blockdev_add, its
- * members are typed according to the QAPI schema, but
- * when they come from -drive, they're all QString.
+ * TODO: When using -drive to specify blockdev options, all values
+ * will be strings; however, when using -blockdev, blockdev-add or
+ * filenames using the json:{} pseudo-protocol, they will be
+ * correctly typed.
+ * In contrast, reopening options are (currently) always strings
+ * (because you can only specify them through qemu-io; all other
+ * callers do not specify any options).
+ * Therefore, when using anything other than -drive to create a 
BDS,
+ * this cannot detect non-string options as unchanged, because
+ * qobject_is_equal() always returns false for objects of different
+ * type.  In the future, this should be remedied by correctly 
typing
+ * all options.  For now, this is not too big of an issue because
+ * the user can simply omit options which cannot be changed anyway,
+ * so they will stay unchanged.
  */
-const char *old = qdict_get_try_str(reopen_state->bs->options,
-entry->key);
-
-if (!old || strcmp(new, old)) {
+if (!qobject_is_equal(new, old)) {
 error_setg(errp, "Cannot change the option '%s'", entry->key);
 ret = -EINVAL;
 goto error;
-- 
2.13.6




[Qemu-devel] [PULL 03/25] block: Fix error path in bdrv_backing_update_filename()

2017-11-17 Thread Kevin Wolf
error_setg_errno() takes a positive errno code. Spotted by Coverity
(CID 1381628).

Signed-off-by: Kevin Wolf 
Reviewed-by: Alberto Garcia 
Reviewed-by: Eric Blake 
Reviewed-by: Stefan Hajnoczi 
---
 block.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block.c b/block.c
index 684cb018da..f6415547fe 100644
--- a/block.c
+++ b/block.c
@@ -998,7 +998,7 @@ static int bdrv_backing_update_filename(BdrvChild *c, 
BlockDriverState *base,
 ret = bdrv_change_backing_file(parent, filename,
base->drv ? base->drv->format_name : "");
 if (ret < 0) {
-error_setg_errno(errp, ret, "Could not update backing file link");
+error_setg_errno(errp, -ret, "Could not update backing file link");
 }
 
 if (!(orig_flags & BDRV_O_RDWR)) {
-- 
2.13.6




[Qemu-devel] [PULL 02/25] qemu-iotests: Use -nographic in 182

2017-11-17 Thread Kevin Wolf
This avoids that random UI frontend error messages end up in the output.
In particular, we were seeing this line in CI error logs:

+Unable to init server: Could not connect: Connection refused

Signed-off-by: Kevin Wolf 
Reviewed-by: Kashyap Chamarthy 
Reviewed-by: Jeff Cody 
---
 tests/qemu-iotests/182 | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/182 b/tests/qemu-iotests/182
index 2e078ceed8..4b31592fb8 100755
--- a/tests/qemu-iotests/182
+++ b/tests/qemu-iotests/182
@@ -62,7 +62,7 @@ _launch_qemu -drive 
file=$TEST_IMG,if=none,id=drive0,file.locking=on \
 
 echo
 echo "Starting a second QEMU using the same image should fail"
-echo 'quit' | $QEMU -monitor stdio \
+echo 'quit' | $QEMU -nographic -monitor stdio \
 -drive file=$TEST_IMG,if=none,id=drive0,file.locking=on \
 -device $virtioblk,drive=drive0 2>&1 | _filter_testdir 2>&1 |
 _filter_qemu |
-- 
2.13.6




[Qemu-devel] [PULL 04/25] qcow2: don't permit changing encryption parameters

2017-11-17 Thread Kevin Wolf
From: "Daniel P. Berrange" 

Currently if trying to change encryption parameters on a qcow2 image, qemu-img
will abort. We already explicitly check for attempt to change encrypt.format
but missed other parameters like encrypt.key-secret. Rather than list each
parameter, just blacklist changing of all parameters with a 'encrypt.' prefix.

Signed-off-by: Daniel P. Berrange 
Reviewed-by: Alberto Garcia 
Reviewed-by: Eric Blake 
Signed-off-by: Kevin Wolf 
---
 block/qcow2.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/block/qcow2.c b/block/qcow2.c
index b3d66a0e88..92e5d548e3 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -4069,6 +4069,9 @@ static int qcow2_amend_options(BlockDriverState *bs, 
QemuOpts *opts,
 error_report("Changing the encryption format is not 
supported");
 return -ENOTSUP;
 }
+} else if (g_str_has_prefix(desc->name, "encrypt.")) {
+error_report("Changing the encryption parameters is not 
supported");
+return -ENOTSUP;
 } else if (!strcmp(desc->name, BLOCK_OPT_CLUSTER_SIZE)) {
 cluster_size = qemu_opt_get_size(opts, BLOCK_OPT_CLUSTER_SIZE,
  cluster_size);
-- 
2.13.6




[Qemu-devel] [PULL 10/25] qapi/qnull: Add own header

2017-11-17 Thread Kevin Wolf
From: Max Reitz 

Signed-off-by: Max Reitz 
Reviewed-by: Eric Blake 
Reviewed-by: Alberto Garcia 
Reviewed-by: Markus Armbruster 
Message-id: 20171114180128.17076-2-mre...@redhat.com
Signed-off-by: Max Reitz 
---
 include/qapi/qmp/qdict.h|  1 +
 include/qapi/qmp/qnull.h| 30 ++
 include/qapi/qmp/qobject.h  | 12 
 include/qapi/qmp/types.h|  1 +
 qapi/qapi-clone-visitor.c   |  1 +
 qapi/string-input-visitor.c |  1 +
 qobject/qnull.c |  2 +-
 tests/check-qnull.c |  2 +-
 8 files changed, 36 insertions(+), 14 deletions(-)
 create mode 100644 include/qapi/qmp/qnull.h

diff --git a/include/qapi/qmp/qdict.h b/include/qapi/qmp/qdict.h
index 6588c7f0c8..7ea5120c4a 100644
--- a/include/qapi/qmp/qdict.h
+++ b/include/qapi/qmp/qdict.h
@@ -15,6 +15,7 @@
 
 #include "qapi/qmp/qobject.h"
 #include "qapi/qmp/qlist.h"
+#include "qapi/qmp/qnull.h"
 #include "qapi/qmp/qnum.h"
 #include "qemu/queue.h"
 
diff --git a/include/qapi/qmp/qnull.h b/include/qapi/qmp/qnull.h
new file mode 100644
index 00..d075549283
--- /dev/null
+++ b/include/qapi/qmp/qnull.h
@@ -0,0 +1,30 @@
+/*
+ * QNull
+ *
+ * Copyright (C) 2015 Red Hat, Inc.
+ *
+ * Authors:
+ *  Markus Armbruster 
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2.1
+ * or later.  See the COPYING.LIB file in the top-level directory.
+ */
+
+#ifndef QNULL_H
+#define QNULL_H
+
+#include "qapi/qmp/qobject.h"
+
+struct QNull {
+QObject base;
+};
+
+extern QNull qnull_;
+
+static inline QNull *qnull(void)
+{
+QINCREF(_);
+return _;
+}
+
+#endif /* QNULL_H */
diff --git a/include/qapi/qmp/qobject.h b/include/qapi/qmp/qobject.h
index eab29edd12..ef1d1a9237 100644
--- a/include/qapi/qmp/qobject.h
+++ b/include/qapi/qmp/qobject.h
@@ -93,16 +93,4 @@ static inline QType qobject_type(const QObject *obj)
 return obj->type;
 }
 
-struct QNull {
-QObject base;
-};
-
-extern QNull qnull_;
-
-static inline QNull *qnull(void)
-{
-QINCREF(_);
-return _;
-}
-
 #endif /* QOBJECT_H */
diff --git a/include/qapi/qmp/types.h b/include/qapi/qmp/types.h
index a4bc662bfb..749ac44dcb 100644
--- a/include/qapi/qmp/types.h
+++ b/include/qapi/qmp/types.h
@@ -19,5 +19,6 @@
 #include "qapi/qmp/qstring.h"
 #include "qapi/qmp/qdict.h"
 #include "qapi/qmp/qlist.h"
+#include "qapi/qmp/qnull.h"
 
 #endif /* QAPI_QMP_TYPES_H */
diff --git a/qapi/qapi-clone-visitor.c b/qapi/qapi-clone-visitor.c
index d8b62792bc..daab6819b4 100644
--- a/qapi/qapi-clone-visitor.c
+++ b/qapi/qapi-clone-visitor.c
@@ -12,6 +12,7 @@
 #include "qapi/clone-visitor.h"
 #include "qapi/visitor-impl.h"
 #include "qapi/error.h"
+#include "qapi/qmp/qnull.h"
 
 struct QapiCloneVisitor {
 Visitor visitor;
diff --git a/qapi/string-input-visitor.c b/qapi/string-input-visitor.c
index 67a0a4a58b..b3fdd0827d 100644
--- a/qapi/string-input-visitor.c
+++ b/qapi/string-input-visitor.c
@@ -16,6 +16,7 @@
 #include "qapi/string-input-visitor.h"
 #include "qapi/visitor-impl.h"
 #include "qapi/qmp/qerror.h"
+#include "qapi/qmp/qnull.h"
 #include "qemu/option.h"
 #include "qemu/queue.h"
 #include "qemu/range.h"
diff --git a/qobject/qnull.c b/qobject/qnull.c
index 69a21d1059..bc9fd31626 100644
--- a/qobject/qnull.c
+++ b/qobject/qnull.c
@@ -12,7 +12,7 @@
 
 #include "qemu/osdep.h"
 #include "qemu-common.h"
-#include "qapi/qmp/qobject.h"
+#include "qapi/qmp/qnull.h"
 
 QNull qnull_ = {
 .base = {
diff --git a/tests/check-qnull.c b/tests/check-qnull.c
index 5c6eb0adc8..afa4400da1 100644
--- a/tests/check-qnull.c
+++ b/tests/check-qnull.c
@@ -8,7 +8,7 @@
  */
 #include "qemu/osdep.h"
 
-#include "qapi/qmp/qobject.h"
+#include "qapi/qmp/qnull.h"
 #include "qemu-common.h"
 #include "qapi/qobject-input-visitor.h"
 #include "qapi/qobject-output-visitor.h"
-- 
2.13.6




Re: [Qemu-devel] [PATCH] block/snapshot: dirty all dirty bitmaps on snapshot-switch

2017-11-17 Thread John Snow


On 11/17/2017 10:01 AM, Max Reitz wrote:
> On 2017-11-17 13:30, Kevin Wolf wrote:
>> Am 23.10.2017 um 11:29 hat Vladimir Sementsov-Ogievskiy geschrieben:
>>> Snapshot-switch actually changes active state of disk so it should
>>> reflect on dirty bitmaps. Otherwise next incremental backup using
>>> these bitmaps will be invalid.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy 
>>
>> We discussed this quite a while ago, and I'm still not convinced that
>> this approach makes sense.
> 
> I think it at least makes more sense than not handling this case at all.
> 
>> Can you give just one example of a use case where dirtying the whole
>> bitmap while loading a snapshot is the desired behaviour?
>>
>> I think the most useful behaviour would be something where the bitmaps
>> themselves are snapshotted, too.
> 
> Agreed.
> 
>>  But for the time being, the easiest and
>> safest solution might just be to error out in any snapshot operations
>> if any bitmaps are in use.
> 
> Sounds OK, too.  I personally don't have an opinion either way.
> 
> But in any case, what we did before this patch was definitely wrong so I
> consider it an improvement.
> 

This is how I feel about it too. Erroring out entirely is an option, but
code-wise just dirtying everything is at least verifiably not-wrong and
pretty simple to implement.

It's an improvement... Don't do it, but at least you won't get something
wrong after, just something heinously unoptimal.

> Max
> 



[Qemu-devel] [PATCH for-2.11] pc-bios/s390-ccw: Fix problem with invalid virtio-scsi LUN when rebooting

2017-11-17 Thread Thomas Huth
When rebooting a guest that has a virtio-scsi disk, the s390-ccw
bios sometimes bails out with an error message like this:

! SCSI cannot report LUNs: STATUS=02 RSPN=70 KEY=05 CODE=25 QLFR=00, sure !

Enabling the scsi_req* tracing in QEMU shows that the ccw bios is
trying to execute the REPORT LUNS SCSI command with a LUN != 0, and
this causes the SCSI command to fail.
Looks like we neither clear the BSS of the s390-ccw bios during reboot,
nor do we explicitly set the default_scsi_device.lun value to 0, so
this variable can contain random values from the OS after the reboot.
By setting this variable explicitly to 0, the problem is fixed and
the reboots always succeed.

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1514352
Signed-off-by: Thomas Huth 
---
 pc-bios/s390-ccw/virtio-scsi.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/pc-bios/s390-ccw/virtio-scsi.c b/pc-bios/s390-ccw/virtio-scsi.c
index c92f5d3..4fe4b9d 100644
--- a/pc-bios/s390-ccw/virtio-scsi.c
+++ b/pc-bios/s390-ccw/virtio-scsi.c
@@ -223,7 +223,8 @@ static void virtio_scsi_locate_device(VDev *vdev)
 
 for (target = 0; target <= vdev->config.scsi.max_target; target++) {
 sdev->channel = channel;
-sdev->target = target; /* sdev->lun will be 0 here */
+sdev->target = target;
+sdev->lun = 0;  /* LUN has to be 0 for REPORT LUNS */
 if (!scsi_report_luns(vdev, data, sizeof(data))) {
 if (resp.response == VIRTIO_SCSI_S_BAD_TARGET) {
 continue;
-- 
1.8.3.1




Re: [Qemu-devel] [PATCH 09/10] qemu-iotests: remove unused "here" variable

2017-11-17 Thread Eric Blake
On 11/16/2017 11:38 AM, Cleber Rosa wrote:
> Another legacy variable that did not convince me it has any
> purpose whatsoever.
> 
> Signed-off-by: Cleber Rosa 
> ---

> +++ b/tests/qemu-iotests/001
> @@ -23,7 +23,6 @@
>  seq=`basename $0`
>  echo "QA output created by $seq"
>  
> -here=`pwd`

Good riddance.  And in the majority of the tests, it was a needless
fork(), compared to my preference (which is also guaranteed by POSIX and
by /bin/bash):

> +++ b/tests/qemu-iotests/176
> @@ -27,7 +27,6 @@
>  seq="$(basename $0)"
>  echo "QA output created by $seq"
>  
> -here="$PWD"

since $PWD and `pwd` should always produce the same thing, if we even
had a reason to use it. ;)

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 02/15] sun4u: ebus QOMify tidy-up

2017-11-17 Thread Philippe Mathieu-Daudé
On 11/17/2017 10:42 AM, Mark Cave-Ayland wrote:
> The main change here is to introduce the proper TYPE_EBUS/EBUS QOM macros
> and remove the use of DO_UPCAST.
> 
> Alongside this there are some a couple of minor cosmetic changes and a rename
> of pci_ebus_realize() to ebus_realize() since the ebus device is always what
> is effectively a PCI-ISA bridge.
> 
> Signed-off-by: Mark Cave-Ayland 

Reviewed-by: Philippe Mathieu-Daudé 

> ---
>  hw/sparc64/sun4u.c |   19 ---
>  1 file changed, 12 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/sparc64/sun4u.c b/hw/sparc64/sun4u.c
> index 1672f25..394b7d6 100644
> --- a/hw/sparc64/sun4u.c
> +++ b/hw/sparc64/sun4u.c
> @@ -81,11 +81,16 @@ struct hwdef {
>  };
>  
>  typedef struct EbusState {
> -PCIDevice pci_dev;
> +/*< private >*/
> +PCIDevice parent_obj;
> +
>  MemoryRegion bar0;
>  MemoryRegion bar1;
>  } EbusState;
>  
> +#define TYPE_EBUS "ebus"
> +#define EBUS(obj) OBJECT_CHECK(EbusState, (obj), TYPE_EBUS)
> +
>  void DMA_init(ISABus *bus, int high_page_enable)
>  {
>  }
> @@ -236,9 +241,9 @@ pci_ebus_init(PCIDevice *pci_dev, qemu_irq *irqs)
>  return isa_bus;
>  }
>  
> -static void pci_ebus_realize(PCIDevice *pci_dev, Error **errp)
> +static void ebus_realize(PCIDevice *pci_dev, Error **errp)
>  {
> -EbusState *s = DO_UPCAST(EbusState, pci_dev, pci_dev);
> +EbusState *s = EBUS(pci_dev);
>  
>  if (!isa_bus_new(DEVICE(pci_dev), get_system_memory(),
>   pci_address_space_io(pci_dev), errp)) {
> @@ -264,7 +269,7 @@ static void ebus_class_init(ObjectClass *klass, void 
> *data)
>  {
>  PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
>  
> -k->realize = pci_ebus_realize;
> +k->realize = ebus_realize;
>  k->vendor_id = PCI_VENDOR_ID_SUN;
>  k->device_id = PCI_DEVICE_ID_SUN_EBUS;
>  k->revision = 0x01;
> @@ -272,10 +277,10 @@ static void ebus_class_init(ObjectClass *klass, void 
> *data)
>  }
>  
>  static const TypeInfo ebus_info = {
> -.name  = "ebus",
> +.name  = TYPE_EBUS,
>  .parent= TYPE_PCI_DEVICE,
> -.instance_size = sizeof(EbusState),
>  .class_init= ebus_class_init,
> +.instance_size = sizeof(EbusState),
>  .interfaces = (InterfaceInfo[]) {
>  { INTERFACE_CONVENTIONAL_PCI_DEVICE },
>  { },
> @@ -463,7 +468,7 @@ static void sun4uv_init(MemoryRegion *address_space_mem,
>  pci_busA->slot_reserved_mask = 0xfff1;
>  pci_busB->slot_reserved_mask = 0xfff0;
>  
> -ebus = pci_create_multifunction(pci_busA, PCI_DEVFN(1, 0), true, "ebus");
> +ebus = pci_create_multifunction(pci_busA, PCI_DEVFN(1, 0), true, 
> TYPE_EBUS);
>  qdev_init_nofail(DEVICE(ebus));
>  
>  isa_bus = pci_ebus_init(ebus, pbm_irqs);
> 



Re: [Qemu-devel] [Qemu-block] [PATCH 06/10] qemu-iotests: turn owner variable into a comment

2017-11-17 Thread Eric Blake
On 11/17/2017 07:18 AM, Cleber Rosa wrote:
> 
> 
> On 11/17/2017 02:19 AM, Paolo Bonzini wrote:
>> On 16/11/2017 18:38, Cleber Rosa wrote:
>>> This variables has no real use.  To avoid pretending it does, while
>>> still keeping the information, let's turn it into a comment.
>>>
>>> The format chosen is the one already being used on tests 149 and 194.
>>
>> I would just delete it...
>>
>> Paolo
>>
> 
> I'm fine with both, but I feel this is like putting info on ChangeLogs
> in the days of GIT.  Unless other people object, I'll remove them in v2.

I'm also in the 'please delete it; git history is good enough' camp.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH v8 04/14] block/dirty-bitmap: add bdrv_dirty_bitmap_set_frozen

2017-11-17 Thread Vladimir Sementsov-Ogievskiy

17.11.2017 20:20, John Snow wrote:


On 11/17/2017 09:46 AM, Vladimir Sementsov-Ogievskiy wrote:

14.11.2017 02:32, John Snow wrote:

On 10/30/2017 12:32 PM, Vladimir Sementsov-Ogievskiy wrote:

Make it possible to set bitmap 'frozen' without a successor.
This is needed to protect the bitmap during outgoing bitmap postcopy
migration.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
   include/block/dirty-bitmap.h |  1 +
   block/dirty-bitmap.c | 22 --
   2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index a9e2a92e4f..ae6d697850 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -39,6 +39,7 @@ uint32_t
bdrv_get_default_bitmap_granularity(BlockDriverState *bs);
   uint32_t bdrv_dirty_bitmap_granularity(const BdrvDirtyBitmap *bitmap);
   bool bdrv_dirty_bitmap_enabled(BdrvDirtyBitmap *bitmap);
   bool bdrv_dirty_bitmap_frozen(BdrvDirtyBitmap *bitmap);
+void bdrv_dirty_bitmap_set_frozen(BdrvDirtyBitmap *bitmap, bool
frozen);
   const char *bdrv_dirty_bitmap_name(const BdrvDirtyBitmap *bitmap);
   int64_t bdrv_dirty_bitmap_size(const BdrvDirtyBitmap *bitmap);
   DirtyBitmapStatus bdrv_dirty_bitmap_status(BdrvDirtyBitmap *bitmap);
diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 7578863aa1..67fc6bd6e0 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -40,6 +40,8 @@ struct BdrvDirtyBitmap {
   QemuMutex *mutex;
   HBitmap *bitmap;    /* Dirty bitmap implementation */
   HBitmap *meta;  /* Meta dirty bitmap */
+    bool frozen;    /* Bitmap is frozen, it can't be
modified
+   through QMP */

I hesitate, because this now means that we have two independent bits of
state we need to update and maintain consistency with.

it was your proposal)))

https://lists.gnu.org/archive/html/qemu-devel/2017-07/msg01172.html

"
Your new use case here sounds like Frozen to me, but it simply does not
have an anonymous successor to force it to be recognized as "frozen." We
can add a `bool protected` or `bool frozen` field to force recognition
of this status and adjust the json documentation accordingly.

I think then we'd have four recognized states:

FROZEN: Cannot be reset/deleted. Bitmap is in-use by a block job or
other internal process. Bitmap is otherwise ACTIVE.
DISABLED: Not recording any writes (by choice.)
READONLY: Not able to record any writes (by necessity.)
ACTIVE: Normal bitmap status.

Sound right?
"



I was afraid you'd say that :(

It's okay, anyway. I shouldn't let myself go so long between reviews
like this, because you catch me changing my mind. Anyway, please go
ahead with it. I don't want to delay you on something that works because
I can't make up *my* mind.


Hm, if you remember, reusing "frozen" state was strange for me too. And 
it's not

late to move to
1. make a new state: MIGRATION, which disallows qmp operations on bitmap
2. or just make them unnamed, so they can't be touched by qmp

anything is ok for me as well as leaving it as is. It's all little 
things, the core is patch 10.


"frozen" sounds like unchanged, but user will see dirty-count changing 
in query-block.


--
Best regards,
Vladimir




Re: [Qemu-devel] [PATCH v3 for-2.11] QAPI & interop: Clarify events emitted by 'block-job-cancel'

2017-11-17 Thread Kashyap Chamarthy
On Fri, Nov 17, 2017 at 11:07:41AM -0600, Eric Blake wrote:
> On 11/17/2017 10:25 AM, Kashyap Chamarthy wrote:

[...]

> Sorry for not wordsmithing on earlier versions:

That's fine.  Always appreciate your wordsmithing :-)

> > +'mirror' job *after* it has indicated (by emitting the event
> > +``BLOCK_JOB_READY``) that the source and target now remain
> > +synchronized, then ``block-job-cancel`` will emit the event
> 
> Might read slightly nicer as:
> 
> the source and target have reached synchronization,

Ah, that sounds nicer indeed.  Will fix.

[...]

> > +# Note that the 'block-job-cancel' command will emit the event
> > +# BLOCK_JOB_COMPLETED if you issue it ('block-job-cancel') after 
> > 'drive-mirror'
> > +# has indicated (by emitting the event BLOCK_JOB_READY) that the source and
> > +# destination remain synchronized.  In this case, the BLOCK_JOB_COMPLETED
> > +# event indicates that synchronization (from 'drive-mirror') has 
> > successfully
> > +# ended and the destination now has a point-in-time copy, which is at the 
> > time
> > +# of cancel.
> 
> 
> Accurate, but a bit hard to follow the flow of the sentence.

Yeah, I wasn't satisfied with my phrasing too.  I sat on it for nearly
half an hour, and still it came out super clunky.

> Might read nicer as:
> 
> Note that if you issue 'block-job-cancel' after 'drive-mirror' has
> indicated (via the event BLOCK_JOB_READY) that the source and
> destination are synchronized, then the event triggered by this command
> changes to BLOCK_JOB_COMPLETED, to indicate that the mirroring has
> ended and the destination now has a point-in-time copy tied to the
> time of the cancellation.

Thanks, yours reads much nicer.

> Documentation is worth of inclusion in 2.11.  Whether you keep your
> wording, or incorporate mine in a v4, you can add:
> Reviewed-by: Eric Blake 

I'll definitely incorporate in v4.  

I'll also make similar wording adjustment in live-block-operations.rst.

Thanks for the quick review.

-- 
/kashyap



Re: [Qemu-devel] [PATCH for-2.11] qcow2: fix image corruption on commit with persistent snapshot

2017-11-17 Thread Kevin Wolf
Am 17.11.2017 um 18:07 hat Vladimir Sementsov-Ogievskiy geschrieben:
> 17.11.2017 19:47, Eric Blake wrote:
> > If an image contains persistent snapshots, we cannot use the
> 
> bitmaps
> 
> > fast path of bdrv_make_empty() to clear the image during
> > qemu-img commit, because that will lose the clusters related
> > to the bitmaps.
> > 
> > Also leave a comment in qcow2_read_extensions to remind future
> > feature additions to think about fast-path removal, since we
> > just barely fixed the same bug for LUKS encryption.
> > 
> > It's a pain that qemu-img has not yet been taught to manipulate,
> > or even at a very minimum display, information about persistent
> > bitmaps; instead, we have to use QMP commands.  It's also a
> > pain that only qeury-block and x-debug-block-dirty-bitmap-sha256
> > will allow bitmap introspection; but the former requires the
> > node to be hooked to a block device, and the latter is experimental.
> 
> sorry for that pain =(.
> 
> Honestly, I don't understand why such a simple and obvious fix needs an
> additional test.

Because we could otherwise accidentally break it again in the future.

If there is a feature that you care about, make sure to have a test for
everything in it, and that you add a regression test for every bug that
you fix.

Kevin



Re: [Qemu-devel] [PATCH v8 04/14] block/dirty-bitmap: add bdrv_dirty_bitmap_set_frozen

2017-11-17 Thread John Snow


On 11/17/2017 09:46 AM, Vladimir Sementsov-Ogievskiy wrote:
> 14.11.2017 02:32, John Snow wrote:
>>
>> On 10/30/2017 12:32 PM, Vladimir Sementsov-Ogievskiy wrote:
>>> Make it possible to set bitmap 'frozen' without a successor.
>>> This is needed to protect the bitmap during outgoing bitmap postcopy
>>> migration.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy 
>>> ---
>>>   include/block/dirty-bitmap.h |  1 +
>>>   block/dirty-bitmap.c | 22 --
>>>   2 files changed, 21 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
>>> index a9e2a92e4f..ae6d697850 100644
>>> --- a/include/block/dirty-bitmap.h
>>> +++ b/include/block/dirty-bitmap.h
>>> @@ -39,6 +39,7 @@ uint32_t
>>> bdrv_get_default_bitmap_granularity(BlockDriverState *bs);
>>>   uint32_t bdrv_dirty_bitmap_granularity(const BdrvDirtyBitmap *bitmap);
>>>   bool bdrv_dirty_bitmap_enabled(BdrvDirtyBitmap *bitmap);
>>>   bool bdrv_dirty_bitmap_frozen(BdrvDirtyBitmap *bitmap);
>>> +void bdrv_dirty_bitmap_set_frozen(BdrvDirtyBitmap *bitmap, bool
>>> frozen);
>>>   const char *bdrv_dirty_bitmap_name(const BdrvDirtyBitmap *bitmap);
>>>   int64_t bdrv_dirty_bitmap_size(const BdrvDirtyBitmap *bitmap);
>>>   DirtyBitmapStatus bdrv_dirty_bitmap_status(BdrvDirtyBitmap *bitmap);
>>> diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
>>> index 7578863aa1..67fc6bd6e0 100644
>>> --- a/block/dirty-bitmap.c
>>> +++ b/block/dirty-bitmap.c
>>> @@ -40,6 +40,8 @@ struct BdrvDirtyBitmap {
>>>   QemuMutex *mutex;
>>>   HBitmap *bitmap;    /* Dirty bitmap implementation */
>>>   HBitmap *meta;  /* Meta dirty bitmap */
>>> +    bool frozen;    /* Bitmap is frozen, it can't be
>>> modified
>>> +   through QMP */
>> I hesitate, because this now means that we have two independent bits of
>> state we need to update and maintain consistency with.
> 
> it was your proposal)))
> 
> https://lists.gnu.org/archive/html/qemu-devel/2017-07/msg01172.html
> 
> "
> Your new use case here sounds like Frozen to me, but it simply does not
> have an anonymous successor to force it to be recognized as "frozen." We
> can add a `bool protected` or `bool frozen` field to force recognition
> of this status and adjust the json documentation accordingly.
> 
> I think then we'd have four recognized states:
> 
> FROZEN: Cannot be reset/deleted. Bitmap is in-use by a block job or
> other internal process. Bitmap is otherwise ACTIVE.
> DISABLED: Not recording any writes (by choice.)
> READONLY: Not able to record any writes (by necessity.)
> ACTIVE: Normal bitmap status.
> 
> Sound right?
> "
> 
> 

I was afraid you'd say that :(

It's okay, anyway. I shouldn't let myself go so long between reviews
like this, because you catch me changing my mind. Anyway, please go
ahead with it. I don't want to delay you on something that works because
I can't make up *my* mind.



Re: [Qemu-devel] [PATCH for-2.11] qcow2: fix image corruption on commit with persistent snapshot

2017-11-17 Thread Kevin Wolf
Am 17.11.2017 um 17:47 hat Eric Blake geschrieben:
> If an image contains persistent snapshots, we cannot use the
> fast path of bdrv_make_empty() to clear the image during
> qemu-img commit, because that will lose the clusters related
> to the bitmaps.
> 
> Also leave a comment in qcow2_read_extensions to remind future
> feature additions to think about fast-path removal, since we
> just barely fixed the same bug for LUKS encryption.
> 
> It's a pain that qemu-img has not yet been taught to manipulate,
> or even at a very minimum display, information about persistent
> bitmaps; instead, we have to use QMP commands.  It's also a
> pain that only qeury-block and x-debug-block-dirty-bitmap-sha256

s/qeury/query/

> will allow bitmap introspection; but the former requires the
> node to be hooked to a block device, and the latter is experimental.
> 
> Signed-off-by: Eric Blake 

Thanks, fixed the typo above and applied to the block branch.

Kevin



Re: [Qemu-devel] [PATCH for-2.11] qcow2: fix image corruption on commit with persistent snapshot

2017-11-17 Thread Eric Blake
On 11/17/2017 11:07 AM, Vladimir Sementsov-Ogievskiy wrote:
> 17.11.2017 19:47, Eric Blake wrote:
>> If an image contains persistent snapshots, we cannot use the
> 
> bitmaps

Oh, right. Maintainer can fix that, if I don't need to respin.

> 
>> fast path of bdrv_make_empty() to clear the image during
>> qemu-img commit, because that will lose the clusters related
>> to the bitmaps.
>>
>> Also leave a comment in qcow2_read_extensions to remind future
>> feature additions to think about fast-path removal, since we
>> just barely fixed the same bug for LUKS encryption.
>>
>> It's a pain that qemu-img has not yet been taught to manipulate,
>> or even at a very minimum display, information about persistent
>> bitmaps; instead, we have to use QMP commands.  It's also a
>> pain that only qeury-block and x-debug-block-dirty-bitmap-sha256
>> will allow bitmap introspection; but the former requires the
>> node to be hooked to a block device, and the latter is experimental.
> 
> sorry for that pain =(.
> 
> Honestly, I don't understand why such a simple and obvious fix needs an
> additional test.

Because the test was cool!  Here's how it fails without the qcow2 fix:

--- /home/eblake/qemu/tests/qemu-iotests/176.out2017-11-17
10:41:56.287187401 -0600
+++ /home/eblake/qemu/tests/qemu-iotests/176.out.bad2017-11-17
10:42:07.621220260 -0600
@@ -179,6 +179,8 @@
 64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 wrote 512/512 bytes at offset 2202009600
 512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+qemu-img: Bitmap '' doesn't satisfy the constraints
+qemu-img: Persistent bitmaps are lost for node '#block148'
 Image committed.
 read 196608/196608 bytes at offset 2147287040
 192 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
@@ -198,14 +200,11 @@
 0x7ffd  0x1 TEST_DIR/t.IMGFMT.base
 0x7ffe  0x2 TEST_DIR/t.IMGFMT.itmd
 0x8340  0x200   TEST_DIR/t.IMGFMT.itmd
-Offset  Length  File
-0x7ffd  0x1 TEST_DIR/t.IMGFMT.base
-0x7ffe  0x2 TEST_DIR/t.IMGFMT.itmd
-0x8340  0x200   TEST_DIR/t.IMGFMT.itmd
+qemu-img: Could not open
'/home/eblake/qemu/tests/qemu-iotests/scratch/t.qcow2': Bitmap ''
doesn't satisfy the constraints
 QMP_VERSION
 {"return": {}}
-{"return": {}}
-{"return": {"sha256":
"e12600978d86b5a453861ae5c17d275204673fef3874b7c3c5433c6153d84706"}}
+{"error": {"class": "GenericError", "desc": "Bitmap '' doesn't satisfy
the constraints"}}
+{"error": {"class": "GenericError", "desc": "Node 'drive0' not found"}}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP},
"event": "SHUTDOWN", "data": {"guest": false}}


-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH v3 for-2.11] QAPI & interop: Clarify events emitted by 'block-job-cancel'

2017-11-17 Thread Eric Blake
On 11/17/2017 10:25 AM, Kashyap Chamarthy wrote:
> When you cancel an in-progress live block operation with QMP
> `block-job-cancel`, it emits the event: BLOCK_JOB_CANCELLED.  However,
> when `block-job-cancel` is issued after `drive-mirror` has indicated (by
> emitting the event BLOCK_JOB_READY) that the source and destination
> remain synchronized:
> 

> But this is expected behaviour, where the _COMPLETED event indicates
> that synchronization has successfully ended (and the destination has a
> point-in-time copy, which is at the time of cancel).
> 
> So add a small note to this effect in 'block-core.json'.  While at it,
> also update the "Live disk synchronization -- drive-mirror and
> blockdev-mirror" section in 'live-block-operations.rst'.
> 
> (Thanks: Max Reitz for reminding me of this caveat on IRC.)

Sorry for not wordsmithing on earlier versions:

> 
> Signed-off-by: Kashyap Chamarthy 
> ---

> +.. note::
> +
> +When you cancel an in-progress 'mirror' job *before* the source and
> +target are synchronized, ``block-job-cancel`` will emit the event
> +``BLOCK_JOB_CANCELLED``.  However, note that if you cancel a
> +'mirror' job *after* it has indicated (by emitting the event
> +``BLOCK_JOB_READY``) that the source and target now remain
> +synchronized, then ``block-job-cancel`` will emit the event

Might read slightly nicer as:

the source and target have reached synchronization,


> +++ b/qapi/block-core.json
> @@ -2065,6 +2065,14 @@
>  # BLOCK_JOB_CANCELLED event.  Before that happens the job is still visible 
> when
>  # enumerated using query-block-jobs.
>  #
> +# Note that the 'block-job-cancel' command will emit the event
> +# BLOCK_JOB_COMPLETED if you issue it ('block-job-cancel') after 
> 'drive-mirror'
> +# has indicated (by emitting the event BLOCK_JOB_READY) that the source and
> +# destination remain synchronized.  In this case, the BLOCK_JOB_COMPLETED
> +# event indicates that synchronization (from 'drive-mirror') has successfully
> +# ended and the destination now has a point-in-time copy, which is at the 
> time
> +# of cancel.


Accurate, but a bit hard to follow the flow of the sentence.  Might read
nicer as:

Note that if you issue 'block-job-cancel' after 'drive-mirror' has
indicated (via the event BLOCK_JOB_READY) that the source and
destination are synchronized, then the event triggered by this command
changes to BLOCK_JOB_COMPLETED, to indicate that the mirroring has ended
and the destination now has a point-in-time copy tied to the time of the
cancellation.

Documentation is worth of inclusion in 2.11.  Whether you keep your
wording, or incorporate mine in a v4, you can add:
Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH for-2.11] qcow2: fix image corruption on commit with persistent snapshot

2017-11-17 Thread Vladimir Sementsov-Ogievskiy

17.11.2017 19:47, Eric Blake wrote:

If an image contains persistent snapshots, we cannot use the


bitmaps


fast path of bdrv_make_empty() to clear the image during
qemu-img commit, because that will lose the clusters related
to the bitmaps.

Also leave a comment in qcow2_read_extensions to remind future
feature additions to think about fast-path removal, since we
just barely fixed the same bug for LUKS encryption.

It's a pain that qemu-img has not yet been taught to manipulate,
or even at a very minimum display, information about persistent
bitmaps; instead, we have to use QMP commands.  It's also a
pain that only qeury-block and x-debug-block-dirty-bitmap-sha256
will allow bitmap introspection; but the former requires the
node to be hooked to a block device, and the latter is experimental.


sorry for that pain =(.

Honestly, I don't understand why such a simple and obvious fix needs an 
additional test.




Signed-off-by: Eric Blake 
---

As promised, based on Dan's similar patch for LUKS encryption

  block/qcow2.c  |  17 ++--
  tests/qemu-iotests/176 |  55 ++--
  tests/qemu-iotests/176.out | 216 -
  3 files changed, 270 insertions(+), 18 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index e9a86b7443..f2731a7cb5 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -376,6 +376,8 @@ static int qcow2_read_extensions(BlockDriverState *bs, 
uint64_t start_offset,

  default:
  /* unknown magic - save it in case we need to rewrite the header 
*/
+/* If you add a new feature, make sure to also update the fast
+ * path of qcow2_make_empty() to deal with it. */
  {
  Qcow2UnknownHeaderExtension *uext;

@@ -3600,15 +3602,16 @@ static int qcow2_make_empty(BlockDriverState *bs)

  l1_clusters = DIV_ROUND_UP(s->l1_size, s->cluster_size / 
sizeof(uint64_t));

-if (s->qcow_version >= 3 && !s->snapshots &&
+if (s->qcow_version >= 3 && !s->snapshots && !s->nb_bitmaps &&
  3 + l1_clusters <= s->refcount_block_size &&
  s->crypt_method_header != QCOW_CRYPT_LUKS) {
-/* The following function only works for qcow2 v3 images (it requires
- * the dirty flag) and only as long as there are no snapshots (because
- * it completely empties the image). Furthermore, the L1 table and 
three
- * additional clusters (image header, refcount table, one refcount
- * block) have to fit inside one refcount block. It cannot be used
- * for LUKS (yet) as it throws away the LUKS header cluster(s) */
+/* The following function only works for qcow2 v3 images (it
+ * requires the dirty flag) and only as long as there are no
+ * features that reserve extra clusters (such as snapshots,
+ * LUKS header, or persistent bitmaps), because it completely
+ * empties the image.  Furthermore, the L1 table and three
+ * additional clusters (image header, refcount table, one
+ * refcount block) have to fit inside one refcount block. */
  return make_completely_empty(bs);
  }

diff --git a/tests/qemu-iotests/176 b/tests/qemu-iotests/176
index 950b28720e..0f31a20294 100755
--- a/tests/qemu-iotests/176
+++ b/tests/qemu-iotests/176
@@ -3,10 +3,11 @@
  # Commit changes into backing chains and empty the top image if the
  # backing image is not explicitly specified.
  #
-# Variant of 097, which includes snapshots to test different codepath
-# in qcow2
+# Variant of 097, which includes snapshots and persistent bitmaps, to
+# tickle the slow codepath in qcow2. See also 198, for another feature
+# that tickles the slow codepath.
  #
-# Copyright (C) 2014 Red Hat, Inc.
+# Copyright (C) 2014, 2017 Red Hat, Inc.
  #
  # This program is free software; you can redistribute it and/or modify
  # it under the terms of the GNU General Public License as published by
@@ -43,11 +44,18 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
  . ./common.filter
  . ./common.pattern

-# Any format supporting backing files and bdrv_make_empty
+# This test is specific to qcow2
  _supported_fmt qcow2
  _supported_proto file
  _supported_os Linux

+function run_qemu()
+{
+$QEMU -nographic -qmp stdio -serial none "$@" 2>&1 \
+   | _filter_testdir | _filter_qmp | _filter_qemu
+}
+
+for reason in snapshot bitmap; do

  # Four passes:
  #  0: Two-layer backing chain, commit to upper backing file (implicitly)
@@ -66,14 +74,29 @@ _supported_os Linux
  for i in 0 1 2 3; do

  echo
-echo "=== Test pass $i ==="
+echo "=== Test pass $reason.$i ==="
  echo

  len=$((2100 * 1024 * 1024 + 512)) # larger than 2G, and not cluster aligned
  TEST_IMG="$TEST_IMG.base" _make_test_img $len
  TEST_IMG="$TEST_IMG.itmd" _make_test_img -b "$TEST_IMG.base" $len
  _make_test_img -b "$TEST_IMG.itmd" $len
-$QEMU_IMG snapshot -c snap "$TEST_IMG"
+# Update the top image to use a feature that is incompatible 

Re: [Qemu-devel] [PATCH] block: Close a BlockDriverState completely even when bs->drv is NULL

2017-11-17 Thread Kevin Wolf
Am 17.11.2017 um 17:19 hat Alberto Garcia geschrieben:
> On Fri 17 Nov 2017 05:14:08 PM CET, Max Reitz wrote:
> > On 2017-11-06 15:53, Alberto Garcia wrote:
> >> bdrv_close() skips much of its logic when bs->drv is NULL. This is
> >> fine when we're closing a BlockDriverState that has just been created
> >> (because e.g the initialization process failed), but it's not enough
> >> in other cases.
> >> 
> >> For example, when a valid qcow2 image is found to be corrupted then
> >> QEMU marks it as such in the file header and then sets bs->drv to
> >> NULL in order to make the BlockDriverState unusable. When that BDS is
> >> later closed then many of its data structures are not freed (leaking
> >> their memory) and none of its children are detached. This results in
> >> bdrv_close_all() failing to close all BDSs and making this assertion
> >> fail when QEMU is being shut down:
> >> 
> >>bdrv_close_all: Assertion `QTAILQ_EMPTY(_bdrv_states)' failed.
> >> 
> >> This patch makes bdrv_close() do the full uninitialization process
> >> in all cases. This fixes the problem with corrupted images and still
> >> works fine with freshly created BDSs.
> >> 
> >> Signed-off-by: Alberto Garcia 
> >> ---
> >>  block.c| 57 
> >> +++---
> >>  tests/qemu-iotests/060 | 13 +++
> >>  tests/qemu-iotests/060.out | 12 ++
> >>  3 files changed, 53 insertions(+), 29 deletions(-)
> >
> > Sooo...  What's the exact status of this patch? :-)
> 
> I can resend it rebased on top of your block branch, but I'm fine if you
> merge the iotest manually (it's a trivial merge).
> 
> I'm not sure about Kevin's comments though, it wasn't clear to me if
> he's fine if we apply this patch or not.

I'm not sure if it's enough, but I think the patch is good anyway.

Kevin



Re: [Qemu-devel] [PATCH v22 00/30] qcow2: persistent dirty bitmaps

2017-11-17 Thread Eric Blake
On 11/17/2017 10:17 AM, Vladimir Sementsov-Ogievskiy wrote:

>> Nothing in this series touched qemu-img.  At the very minimum, it would
>> be nice if 'qemu-img info' were to display a summary of all bitmaps in
>> the qcow2 file; the QMP type ImageInfoSpecificQCow2 should be updated to
>> mention this information, and qemu-img updated to output it.  Being able
>> to create and remove persistent bitmaps from an at-rest qcow2 file
>> without having to fire up qemu to issue QMP commands to do so would also
>> be nice.
>>
>> It's a bit late for getting this request into 2.12, but had it been

Make that "too late for 2.11"

>> available, it would make my work at testing that 'qemu-img commit'
>> doesn't corrupt persistent bitmaps.
>>
> 
> You are right, this is needed, but I can't promise something for the
> nearest future..

And as we both agree it is work for 2.12 or later, it is not needed
right away; so we have a couple of months.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


[Qemu-devel] [PATCH for-2.11] qcow2: fix image corruption on commit with persistent snapshot

2017-11-17 Thread Eric Blake
If an image contains persistent snapshots, we cannot use the
fast path of bdrv_make_empty() to clear the image during
qemu-img commit, because that will lose the clusters related
to the bitmaps.

Also leave a comment in qcow2_read_extensions to remind future
feature additions to think about fast-path removal, since we
just barely fixed the same bug for LUKS encryption.

It's a pain that qemu-img has not yet been taught to manipulate,
or even at a very minimum display, information about persistent
bitmaps; instead, we have to use QMP commands.  It's also a
pain that only qeury-block and x-debug-block-dirty-bitmap-sha256
will allow bitmap introspection; but the former requires the
node to be hooked to a block device, and the latter is experimental.

Signed-off-by: Eric Blake 
---

As promised, based on Dan's similar patch for LUKS encryption

 block/qcow2.c  |  17 ++--
 tests/qemu-iotests/176 |  55 ++--
 tests/qemu-iotests/176.out | 216 -
 3 files changed, 270 insertions(+), 18 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index e9a86b7443..f2731a7cb5 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -376,6 +376,8 @@ static int qcow2_read_extensions(BlockDriverState *bs, 
uint64_t start_offset,

 default:
 /* unknown magic - save it in case we need to rewrite the header */
+/* If you add a new feature, make sure to also update the fast
+ * path of qcow2_make_empty() to deal with it. */
 {
 Qcow2UnknownHeaderExtension *uext;

@@ -3600,15 +3602,16 @@ static int qcow2_make_empty(BlockDriverState *bs)

 l1_clusters = DIV_ROUND_UP(s->l1_size, s->cluster_size / sizeof(uint64_t));

-if (s->qcow_version >= 3 && !s->snapshots &&
+if (s->qcow_version >= 3 && !s->snapshots && !s->nb_bitmaps &&
 3 + l1_clusters <= s->refcount_block_size &&
 s->crypt_method_header != QCOW_CRYPT_LUKS) {
-/* The following function only works for qcow2 v3 images (it requires
- * the dirty flag) and only as long as there are no snapshots (because
- * it completely empties the image). Furthermore, the L1 table and 
three
- * additional clusters (image header, refcount table, one refcount
- * block) have to fit inside one refcount block. It cannot be used
- * for LUKS (yet) as it throws away the LUKS header cluster(s) */
+/* The following function only works for qcow2 v3 images (it
+ * requires the dirty flag) and only as long as there are no
+ * features that reserve extra clusters (such as snapshots,
+ * LUKS header, or persistent bitmaps), because it completely
+ * empties the image.  Furthermore, the L1 table and three
+ * additional clusters (image header, refcount table, one
+ * refcount block) have to fit inside one refcount block. */
 return make_completely_empty(bs);
 }

diff --git a/tests/qemu-iotests/176 b/tests/qemu-iotests/176
index 950b28720e..0f31a20294 100755
--- a/tests/qemu-iotests/176
+++ b/tests/qemu-iotests/176
@@ -3,10 +3,11 @@
 # Commit changes into backing chains and empty the top image if the
 # backing image is not explicitly specified.
 #
-# Variant of 097, which includes snapshots to test different codepath
-# in qcow2
+# Variant of 097, which includes snapshots and persistent bitmaps, to
+# tickle the slow codepath in qcow2. See also 198, for another feature
+# that tickles the slow codepath.
 #
-# Copyright (C) 2014 Red Hat, Inc.
+# Copyright (C) 2014, 2017 Red Hat, Inc.
 #
 # This program is free software; you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
@@ -43,11 +44,18 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 . ./common.filter
 . ./common.pattern

-# Any format supporting backing files and bdrv_make_empty
+# This test is specific to qcow2
 _supported_fmt qcow2
 _supported_proto file
 _supported_os Linux

+function run_qemu()
+{
+$QEMU -nographic -qmp stdio -serial none "$@" 2>&1 \
+   | _filter_testdir | _filter_qmp | _filter_qemu
+}
+
+for reason in snapshot bitmap; do

 # Four passes:
 #  0: Two-layer backing chain, commit to upper backing file (implicitly)
@@ -66,14 +74,29 @@ _supported_os Linux
 for i in 0 1 2 3; do

 echo
-echo "=== Test pass $i ==="
+echo "=== Test pass $reason.$i ==="
 echo

 len=$((2100 * 1024 * 1024 + 512)) # larger than 2G, and not cluster aligned
 TEST_IMG="$TEST_IMG.base" _make_test_img $len
 TEST_IMG="$TEST_IMG.itmd" _make_test_img -b "$TEST_IMG.base" $len
 _make_test_img -b "$TEST_IMG.itmd" $len
-$QEMU_IMG snapshot -c snap "$TEST_IMG"
+# Update the top image to use a feature that is incompatible with fast path
+case $reason in
+snapshot) $QEMU_IMG snapshot -c snap "$TEST_IMG" ;;
+bitmap)
+   run_qemu <

Re: [Qemu-devel] [PATCH 0/4] ramfb: simple boot framebuffer, no legacy vga

2017-11-17 Thread Laszlo Ersek
CC Ard & Drew

Hi,

On 11/17/17 13:04, Gerd Hoffmann wrote:
>   Hi,
> 
> Ok folks, here is a experimental patch series for a legacy free boot
> framebuffer.  If you want play with it I recommend getting the bits from
> 
>   https://www.kraxel.org/cgit/qemu/log/?h=sirius/ramfb
> 
> because they come with an updated seabios and a new vgabios rom.
> 
> The boot framebuffer is expected to be configured by the firmware, so it
> uses fw_cfg as interface.  Initialization goes as follows:
> 
>   (1) Check whenever etc/ramfb is present.
>   (2) Allocate framebuffer from RAM.
>   (3) Fill struct RAMFBCfg, write it to etc/ramfb.

Please add documentation on the "fourcc" (pixel format?) and "flags"
fields. Also please document the endianness of the integer fields.

> 
> Done.  You can write stuff to the framebuffer now, and it should appear
> automagically on the screen.
> 
> Note that this isn't very efficient because it does a full display
> update on each refresh.  No dirty tracking.  Dirty tracking would have
> to be active for the whole ram slot, so that wouldn't be very efficient
> either.  So it is *really* intended to be only active for a short time
> at boot, before the guest loaded the drivers for the real display
> hardware.
> 
> seavgabios is able to emulate vga text mode on top of a framebuffer, for
> coreboot native graphics initialialization.  Which works fine for
> everything which writes text using the vgabios interface (basically
> everyhing which works with sgabios).
> 
> So I hacked that up to work with ramfb.  Right now it's proof-of-concept
> code with too much cut+paste, so it will clearly need a bunch of
> cleanups if this approach turns out to be workable.  Look here:
>   https://www.kraxe.org/cgit/seabios/log/?h=ramfb
> 
> So, how to play?
> 
> There is ramfb-testdev.  Standalone device, for testing purposes.  Also
> listens on vga ports and logs any access, so we can see the bad boys ;)
> Use "qemu -vga none -device ramfb-testdev".
> 
> There is virtio-ramfb.  Simlar to virtio-vga, but using ramfb instead of
> adding vga compatibility.  Shows how you can wire up ramfb support to
> some display device.  Unlike virtio-vga it should work fine on arm.  Use
> "qemu -vga none -device virtio-ramfb" for this one.

(Side note: .)

We should make sure that any device model that combines ramfb with
another PCI display device is not matched by the OVMF driver for that
PCI display device. IOW, we should use separate PCI IDs or subsystem IDs
(I don't recall the details off-hand). I'd like to avoid messing with
the current device probing code in OVMF. It just wouldn't be nice if two
independent drivers (e.g. VirtioGpuDxe and a supposed "RamFbDxe") bound
the two interfaces at the same time.

This device will take a DXE driver or UEFI driver that does not follow
the UEFI driver model because there's no "controller handle" to bind.
(The thing to probe is an fw_cfg file.)

> 
> What works?
> 
> Boot loaders all use vgabios calls for text mode, so they show up just
> fine.  Also ipxe, seabios itself of course.  So you can boot up your
> linux guest.  vesafb works too.
> 
> What doesn't work?
> 
> vgacon (direct vga hardware access).
> 
> Windows (bios mode).  Boot logo shows up just fine.  But at some point
> windows does lots of vga register accesses (even though it sets the
> video mode via vesa bios interface) and appears to be unhappy that
> things don't work as expected because there is no actual vga hardware.
> 
> Future plans:
> 
> Try use that as vgpu boot display.  Requires solving that windows issue
> somehow, otherwise it's going to be a non-starter.
> 
> Trick László into writing a UEFI driver for that.

I cannot imagine when I'll get to this. :(

Feel free to file a BZ somewhere (TianoCore or RH Bugzilla) and/or poke
me occasionally. My TODO list is full of things than are *all* more
urgenterer than all the other things on the list.

I'm nearing TODO list bankruptcy (IRL too).

> Linux will use efifb
> not vgacon then, and hopefully Windows doesn't try access vga registers
> either.
> 
> Comments?
> Hints on the windows issue anyone?

We looked into those when we worked on the Int10h VBE shim in OVMF. We
decided the VGA register accesses were way too complicated and so we did
the shim for BOCHS VGA and QXL only. IIRC. (I could be conflating things.)

This looks like a great job, honestly; I'm sorry I can't say more
positive things about a GOP driver for it right now.

Thanks,
Laszlo

> 
> enjoy,
>   Gerd
> 
> Gerd Hoffmann (4):
>   [testing] update bios, add vgabios-ramfb
>   ramfb: simple boot framebuffer living in guest ram
>   add virtio-ramfb
>   add ramfb-testdev
> 
>  include/hw/display/ramfb.h |   8 +++
>  hw/display/ramfb-testdev.c |  87 ++
>  hw/display/ramfb.c |  94 
>  hw/display/virtio-ramfb.c  | 149 
> +
>  

Re: [Qemu-devel] [Qemu-ppc] [PATCH for-2.11] spapr: reset DRCs after devices

2017-11-17 Thread Greg Kurz
On Fri, 17 Nov 2017 14:21:27 -0200
Daniel Henrique Barboza  wrote:

> On 11/17/2017 10:56 AM, Greg Kurz wrote:
> > A DRC with a pending unplug request releases its associated device at
> > machine reset time.
> >
> > In the case of LMB, when all DRCs for a DIMM device have been reset,
> > the DIMM gets unplugged, causing guest memory to disappear. This may
> > be very confusing for anything still using this memory.
> >
> > This is exactly what happens with vhost backends, and QEMU aborts
> > with:
> >
> > qemu-system-ppc64: used ring relocated for ring 2
> > qemu-system-ppc64: qemu/hw/virtio/vhost.c:649: vhost_commit: Assertion
> >   `r >= 0' failed.
> >
> > The issue is that each DRC registers a QEMU reset handler, and we
> > don't control the order in which these handlers are called (ie,
> > a LMB DRC will unplug a DIMM before the virtio device using the
> > memory on this DIMM could stop its vhost backend).
> >
> > To avoid such situations, let's reset DRCs after all devices
> > have been reset.
> >
> > Reported-by: Mallesh N. Koti 
> > Signed-off-by: Greg Kurz 
> > ---  
> I believe this has to do with the problem discussed at the memory
> unplug reboot with vhost=on, right?
> 
> I don't see any problem with this patch and it's probably the best solution
> short-term for the problem (and potentially other related LMB release
> problems), but that said, how hard it is to forbid LMB unplug at all if the
> memory is being used in vhost?
> 

I don't know yet but I'll dig some more.

> The way I see it, the root problem is that a device unplug failed at the 
> guest
> side and we don't know about it, leaving drc->unplug_requested flags set
> around the code when it shouldn't. Reading that thread I see a comment from
> Michael S. Tsirkin saying that this bug is somewhat expected, that vhost 
> backend will
> not behave well if memory is going away. Unlike other LMB unplug 
> failures from
> the guest side that might happen for any other reason, this one sees 
> predicable
> and we can avoid it.
> 
> I think this is something worth considering. But for now,
> 
> 
> Reviewed-by: Daniel Henrique Barboza 
> 

Thanks!

> >   hw/ppc/spapr.c |   21 +
> >   hw/ppc/spapr_drc.c |7 ---
> >   2 files changed, 21 insertions(+), 7 deletions(-)
> >
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 6841bd294b3c..6285f7211f9a 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -1411,6 +1411,19 @@ static void find_unknown_sysbus_device(SysBusDevice 
> > *sbdev, void *opaque)
> >   }
> >   }
> >
> > +static int spapr_reset_drcs(Object *child, void *opaque)
> > +{
> > +sPAPRDRConnector *drc =
> > +(sPAPRDRConnector *) object_dynamic_cast(child,
> > + TYPE_SPAPR_DR_CONNECTOR);
> > +
> > +if (drc) {
> > +spapr_drc_reset(drc);
> > +}
> > +
> > +return 0;
> > +}
> > +
> >   static void ppc_spapr_reset(void)
> >   {
> >   MachineState *machine = MACHINE(qdev_get_machine());
> > @@ -1434,6 +1447,14 @@ static void ppc_spapr_reset(void)
> >   }
> >
> >   qemu_devices_reset();
> > +
> > +/* DRC reset may cause a device to be unplugged. This will cause 
> > troubles
> > + * if this device is used by another device (eg, a running vhost 
> > backend
> > + * will crash QEMU if the DIMM holding the vring goes away). To avoid 
> > such
> > + * situations, we reset DRCs after all devices have been reset.
> > + */
> > +object_child_foreach_recursive(object_get_root(), spapr_reset_drcs, 
> > NULL);
> > +
> >   spapr_clear_pending_events(spapr);
> >
> >   /*
> > diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
> > index 915e9b51c40c..e3b122968e89 100644
> > --- a/hw/ppc/spapr_drc.c
> > +++ b/hw/ppc/spapr_drc.c
> > @@ -455,11 +455,6 @@ void spapr_drc_reset(sPAPRDRConnector *drc)
> >   }
> >   }
> >
> > -static void drc_reset(void *opaque)
> > -{
> > -spapr_drc_reset(SPAPR_DR_CONNECTOR(opaque));
> > -}
> > -
> >   bool spapr_drc_needed(void *opaque)
> >   {
> >   sPAPRDRConnector *drc = (sPAPRDRConnector *)opaque;
> > @@ -518,7 +513,6 @@ static void realize(DeviceState *d, Error **errp)
> >   }
> >   vmstate_register(DEVICE(drc), spapr_drc_index(drc), 
> > _spapr_drc,
> >drc);
> > -qemu_register_reset(drc_reset, drc);
> >   trace_spapr_drc_realize_complete(spapr_drc_index(drc));
> >   }
> >
> > @@ -529,7 +523,6 @@ static void unrealize(DeviceState *d, Error **errp)
> >   gchar *name;
> >
> >   trace_spapr_drc_unrealize(spapr_drc_index(drc));
> > -qemu_unregister_reset(drc_reset, drc);
> >   vmstate_unregister(DEVICE(drc), _spapr_drc, drc);
> >   root_container = container_get(object_get_root(), DRC_CONTAINER_PATH);
> >   name = g_strdup_printf("%x", spapr_drc_index(drc));
> >
> >  
> 




[Qemu-devel] [PATCH v3] QAPI & interop: Clarify events emitted by 'block-job-cancel'

2017-11-17 Thread Kashyap Chamarthy
When you cancel an in-progress live block operation with QMP
`block-job-cancel`, it emits the event: BLOCK_JOB_CANCELLED.  However,
when `block-job-cancel` is issued after `drive-mirror` has indicated (by
emitting the event BLOCK_JOB_READY) that the source and destination
remain synchronized:

[...] # Snip `drive-mirror` invocation & outputs
{
  "execute":"block-job-cancel",
  "arguments":{
"device":"virtio0"
  }
}

{"return": {}}

It (`block-job-cancel`) will counterintuitively emit the event
'BLOCK_JOB_COMPLETED':

{
  "timestamp":{
"seconds":1510678024,
"microseconds":526240
  },
  "event":"BLOCK_JOB_COMPLETED",
  "data":{
"device":"virtio0",
"len":41126400,
"offset":41126400,
"speed":0,
"type":"mirror"
  }
}

But this is expected behaviour, where the _COMPLETED event indicates
that synchronization has successfully ended (and the destination has a
point-in-time copy, which is at the time of cancel).

So add a small note to this effect in 'block-core.json'.  While at it,
also update the "Live disk synchronization -- drive-mirror and
blockdev-mirror" section in 'live-block-operations.rst'.

(Thanks: Max Reitz for reminding me of this caveat on IRC.)

Signed-off-by: Kashyap Chamarthy 
---
Changes in v3:
 - Fix / rewrite the section "Live disk synchronization -- drive-mirror
   and blockdev-mirror" to note this gotcha [John Snow]

Changes in v2:
 - "Note:" seems to be a special construct in Patchew, my usage caused a
build failure.  So do: s/Note:/Note that/
 - Add the missing 'Signed-off-by'
---
 docs/interop/live-block-operations.rst | 50 ++
 qapi/block-core.json   |  8 ++
 2 files changed, 40 insertions(+), 18 deletions(-)

diff --git a/docs/interop/live-block-operations.rst 
b/docs/interop/live-block-operations.rst
index 5f01797..3cb64f6 100644
--- a/docs/interop/live-block-operations.rst
+++ b/docs/interop/live-block-operations.rst
@@ -506,26 +506,40 @@ Again, given our familiar disk image chain::
 
 [A] <-- [B] <-- [C] <-- [D]
 
-The ``drive-mirror`` (and its newer equivalent ``blockdev-mirror``) allows
-you to copy data from the entire chain into a single target image (which
-can be located on a different host).
-
-Once a 'mirror' job has started, there are two possible actions while a
-``drive-mirror`` job is active:
-
-(1) Issuing the command ``block-job-cancel`` after it emits the event
-``BLOCK_JOB_CANCELLED``: will (after completing synchronization of
-the content from the disk image chain to the target image, [E])
-create a point-in-time (which is at the time of *triggering* the
-cancel command) copy, contained in image [E], of the the entire disk
+The ``drive-mirror`` (and its newer equivalent ``blockdev-mirror``)
+allows you to copy data from the entire chain into a single target image
+(which can be located on a different host), [E].
+
+.. note::
+
+When you cancel an in-progress 'mirror' job *before* the source and
+target are synchronized, ``block-job-cancel`` will emit the event
+``BLOCK_JOB_CANCELLED``.  However, note that if you cancel a
+'mirror' job *after* it has indicated (by emitting the event
+``BLOCK_JOB_READY``) that the source and target now remain
+synchronized, then ``block-job-cancel`` will emit the event
+``BLOCK_JOB_COMPLETED``.
+
+Besides the 'mirror' job, the "active ``block-commit``" is the only
+other block device job that emits the event ``BLOCK_JOB_READY``.
+The rest of the block device jobs ('stream', "non-active
+``block-commit``", and 'backup') end automatically.
+
+So there are two possible actions, after a 'mirror' job has
+emitted the event ``BLOCK_JOB_READY``, indicating that source and target
+now remain synchronized:
+
+(1) Issuing the command ``block-job-cancel`` (after it emits the event
+``BLOCK_JOB_COMPLETED``) will create a point-in-time (which is at
+the time of *triggering* the cancel command) copy of the entire disk
 image chain (or only the top-most image, depending on the ``sync``
-mode).
+mode), contained in the target image [E].
 
-(2) Issuing the command ``block-job-complete`` after it emits the event
-``BLOCK_JOB_COMPLETED``: will, after completing synchronization of
-the content, adjust the guest device (i.e. live QEMU) to point to
-the target image, and, causing all the new writes from this point on
-to happen there.  One use case for this is live storage migration.
+(2) Issuing the command ``block-job-complete`` (after it emits the event
+``BLOCK_JOB_COMPLETED``) will adjust the guest device (i.e. live
+QEMU) to point to the target image, [E], causing all the new writes
+from this point on to happen there.  One use case for this is live
+storage migration.
 
 About synchronization modes: The synchronization mode determines
 *which* 

Re: [Qemu-devel] [Qemu-ppc] [PATCH for-2.11] spapr: reset DRCs after devices

2017-11-17 Thread Daniel Henrique Barboza



On 11/17/2017 10:56 AM, Greg Kurz wrote:

A DRC with a pending unplug request releases its associated device at
machine reset time.

In the case of LMB, when all DRCs for a DIMM device have been reset,
the DIMM gets unplugged, causing guest memory to disappear. This may
be very confusing for anything still using this memory.

This is exactly what happens with vhost backends, and QEMU aborts
with:

qemu-system-ppc64: used ring relocated for ring 2
qemu-system-ppc64: qemu/hw/virtio/vhost.c:649: vhost_commit: Assertion
  `r >= 0' failed.

The issue is that each DRC registers a QEMU reset handler, and we
don't control the order in which these handlers are called (ie,
a LMB DRC will unplug a DIMM before the virtio device using the
memory on this DIMM could stop its vhost backend).

To avoid such situations, let's reset DRCs after all devices
have been reset.

Reported-by: Mallesh N. Koti 
Signed-off-by: Greg Kurz 
---

I believe this has to do with the problem discussed at the memory
unplug reboot with vhost=on, right?

I don't see any problem with this patch and it's probably the best solution
short-term for the problem (and potentially other related LMB release
problems), but that said, how hard it is to forbid LMB unplug at all if the
memory is being used in vhost?

The way I see it, the root problem is that a device unplug failed at the 
guest

side and we don't know about it, leaving drc->unplug_requested flags set
around the code when it shouldn't. Reading that thread I see a comment from
Michael S. Tsirkin saying that this bug is somewhat expected, that vhost 
backend will
not behave well if memory is going away. Unlike other LMB unplug 
failures from
the guest side that might happen for any other reason, this one sees 
predicable

and we can avoid it.

I think this is something worth considering. But for now,


Reviewed-by: Daniel Henrique Barboza 


  hw/ppc/spapr.c |   21 +
  hw/ppc/spapr_drc.c |7 ---
  2 files changed, 21 insertions(+), 7 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 6841bd294b3c..6285f7211f9a 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1411,6 +1411,19 @@ static void find_unknown_sysbus_device(SysBusDevice 
*sbdev, void *opaque)
  }
  }

+static int spapr_reset_drcs(Object *child, void *opaque)
+{
+sPAPRDRConnector *drc =
+(sPAPRDRConnector *) object_dynamic_cast(child,
+ TYPE_SPAPR_DR_CONNECTOR);
+
+if (drc) {
+spapr_drc_reset(drc);
+}
+
+return 0;
+}
+
  static void ppc_spapr_reset(void)
  {
  MachineState *machine = MACHINE(qdev_get_machine());
@@ -1434,6 +1447,14 @@ static void ppc_spapr_reset(void)
  }

  qemu_devices_reset();
+
+/* DRC reset may cause a device to be unplugged. This will cause troubles
+ * if this device is used by another device (eg, a running vhost backend
+ * will crash QEMU if the DIMM holding the vring goes away). To avoid such
+ * situations, we reset DRCs after all devices have been reset.
+ */
+object_child_foreach_recursive(object_get_root(), spapr_reset_drcs, NULL);
+
  spapr_clear_pending_events(spapr);

  /*
diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
index 915e9b51c40c..e3b122968e89 100644
--- a/hw/ppc/spapr_drc.c
+++ b/hw/ppc/spapr_drc.c
@@ -455,11 +455,6 @@ void spapr_drc_reset(sPAPRDRConnector *drc)
  }
  }

-static void drc_reset(void *opaque)
-{
-spapr_drc_reset(SPAPR_DR_CONNECTOR(opaque));
-}
-
  bool spapr_drc_needed(void *opaque)
  {
  sPAPRDRConnector *drc = (sPAPRDRConnector *)opaque;
@@ -518,7 +513,6 @@ static void realize(DeviceState *d, Error **errp)
  }
  vmstate_register(DEVICE(drc), spapr_drc_index(drc), _spapr_drc,
   drc);
-qemu_register_reset(drc_reset, drc);
  trace_spapr_drc_realize_complete(spapr_drc_index(drc));
  }

@@ -529,7 +523,6 @@ static void unrealize(DeviceState *d, Error **errp)
  gchar *name;

  trace_spapr_drc_unrealize(spapr_drc_index(drc));
-qemu_unregister_reset(drc_reset, drc);
  vmstate_unregister(DEVICE(drc), _spapr_drc, drc);
  root_container = container_get(object_get_root(), DRC_CONTAINER_PATH);
  name = g_strdup_printf("%x", spapr_drc_index(drc));







Re: [Qemu-devel] [PATCH v22 00/30] qcow2: persistent dirty bitmaps

2017-11-17 Thread Vladimir Sementsov-Ogievskiy

17.11.2017 19:04, Eric Blake wrote:

Revisiting an old series:

On 06/28/2017 07:05 AM, Vladimir Sementsov-Ogievskiy wrote:

Hi all!

There is a new update of qcow2-bitmap series - v22.

web: 
https://src.openvz.org/users/vsementsov/repos/qemu/browse?at=qcow2-bitmap-v22
git: https://src.openvz.org/scm/~vsementsov/qemu.git (tag qcow2-bitmap-v22)

v22:

Rebase on master, so changes, mostly related to new dirty bitmaps mutex:

   qcow2: add persistent dirty bitmaps support
  block.c  |   65 +-
  block/Makefile.objs  |2 +-
  block/dirty-bitmap.c |  154 -
  block/io.c   |8 +
  block/qcow2-bitmap.c | 1481 ++
  block/qcow2-refcount.c   |   59 +-
  block/qcow2.c|  155 -
  block/qcow2.h|   43 ++
  blockdev.c   |   73 ++-
  docs/interop/qcow2.txt   |8 +-
  include/block/block.h|3 +
  include/block/block_int.h|   14 +
  include/block/dirty-bitmap.h |   22 +-
  include/qemu/hbitmap.h   |   49 +-
  qapi/block-core.json |   42 +-
  tests/Makefile.include   |2 +-
  tests/qemu-iotests/165   |  105 +++
  tests/qemu-iotests/165.out   |5 +
  tests/qemu-iotests/group |1 +
  tests/test-hbitmap.c |   19 +
  util/hbitmap.c   |   51 +-
  21 files changed, 2280 insertions(+), 81 deletions(-)

Nothing in this series touched qemu-img.  At the very minimum, it would
be nice if 'qemu-img info' were to display a summary of all bitmaps in
the qcow2 file; the QMP type ImageInfoSpecificQCow2 should be updated to
mention this information, and qemu-img updated to output it.  Being able
to create and remove persistent bitmaps from an at-rest qcow2 file
without having to fire up qemu to issue QMP commands to do so would also
be nice.

It's a bit late for getting this request into 2.12, but had it been
available, it would make my work at testing that 'qemu-img commit'
doesn't corrupt persistent bitmaps.



You are right, this is needed, but I can't promise something for the 
nearest future..


--
Best regards,
Vladimir




Re: [Qemu-devel] [PATCH] block: Close a BlockDriverState completely even when bs->drv is NULL

2017-11-17 Thread Alberto Garcia
On Fri 17 Nov 2017 05:14:08 PM CET, Max Reitz wrote:
> On 2017-11-06 15:53, Alberto Garcia wrote:
>> bdrv_close() skips much of its logic when bs->drv is NULL. This is
>> fine when we're closing a BlockDriverState that has just been created
>> (because e.g the initialization process failed), but it's not enough
>> in other cases.
>> 
>> For example, when a valid qcow2 image is found to be corrupted then
>> QEMU marks it as such in the file header and then sets bs->drv to
>> NULL in order to make the BlockDriverState unusable. When that BDS is
>> later closed then many of its data structures are not freed (leaking
>> their memory) and none of its children are detached. This results in
>> bdrv_close_all() failing to close all BDSs and making this assertion
>> fail when QEMU is being shut down:
>> 
>>bdrv_close_all: Assertion `QTAILQ_EMPTY(_bdrv_states)' failed.
>> 
>> This patch makes bdrv_close() do the full uninitialization process
>> in all cases. This fixes the problem with corrupted images and still
>> works fine with freshly created BDSs.
>> 
>> Signed-off-by: Alberto Garcia 
>> ---
>>  block.c| 57 
>> +++---
>>  tests/qemu-iotests/060 | 13 +++
>>  tests/qemu-iotests/060.out | 12 ++
>>  3 files changed, 53 insertions(+), 29 deletions(-)
>
> Sooo...  What's the exact status of this patch? :-)

I can resend it rebased on top of your block branch, but I'm fine if you
merge the iotest manually (it's a trivial merge).

I'm not sure about Kevin's comments though, it wasn't clear to me if
he's fine if we apply this patch or not.

Berto



Re: [Qemu-devel] [Qemu-block] segfault in parallel blockjobs (iotest 30)

2017-11-17 Thread Alberto Garcia
On Thu 16 Nov 2017 10:56:58 PM CET, John Snow wrote:
 I have the impression that one major source of headaches is the fact
 that the reopen queue contains nodes that don't need to be reopened at
 all. Ideally this should be detected early on in bdrv_reopen_queue(), so
 there's no chance that the queue contains nodes used by a different
 block job. If we had that then op blockers should be enough to prevent
 these things. Or am I missing something?

>>> After applying Max's patch I tried the similar approach; that is keep
>>> BDSes referenced while they are in the reopen queue.  Now I get the
>>> stream job hanging. Somehow one blk_root_drained_begin() is not paired
>>> with blk_root_drained_end(). So the job stays paused.
>> 
>> I can see this if I apply Max's patch and keep refs to BDSs in the
>> reopen queue:
>> 
>> #0  block_job_pause (...) at blockjob.c:130
>> #1  0x55c143cb586d in block_job_drained_begin (...) at blockjob.c:227
>> #2  0x55c143d08067 in blk_set_dev_ops (...) at block/block-backend.c:887
>> #3  0x55c143cb69db in block_job_create (...) at blockjob.c:678
>> #4  0x55c143d17c0c in mirror_start_job (...) at block/mirror.c:1177
>> 
>> There's a ops->drained_begin(opaque) call in blk_set_dev_ops() that
>> doesn't seem to be paired. And when you call block_job_start() then it
>> yields immediately waiting for the resume (that never arrives).
>> 
>> John, this change was yours (f4d9cc88ee69a5b04). Any idea?
>
> The idea at the time was that if you tell the BlockBackend to drain and
> then attach a job to it, once you go to *end* the drained region you'd
> have a mismatched begin/end pair.
>
> To allow for some flexibility and to make sure that you *didn't* have a
> mismatched begin/end call, what I did was that if you attach dev_ops to
> an already drained backend (i.e. we "missed our chance" to issue the
> drained_begin) we play catch-up and issue the drained call.
>
> There's no matching call here, because I anticipated whoever initially
> bumped that quiesce_counter up to be issuing the drained_end, which will
> then be propagated according to the dev_ops structure in place.

I see, thanks!

Berto



Re: [Qemu-devel] [PATCH] block: Close a BlockDriverState completely even when bs->drv is NULL

2017-11-17 Thread Max Reitz
On 2017-11-06 15:53, Alberto Garcia wrote:
> bdrv_close() skips much of its logic when bs->drv is NULL. This is
> fine when we're closing a BlockDriverState that has just been created
> (because e.g the initialization process failed), but it's not enough
> in other cases.
> 
> For example, when a valid qcow2 image is found to be corrupted then
> QEMU marks it as such in the file header and then sets bs->drv to
> NULL in order to make the BlockDriverState unusable. When that BDS is
> later closed then many of its data structures are not freed (leaking
> their memory) and none of its children are detached. This results in
> bdrv_close_all() failing to close all BDSs and making this assertion
> fail when QEMU is being shut down:
> 
>bdrv_close_all: Assertion `QTAILQ_EMPTY(_bdrv_states)' failed.
> 
> This patch makes bdrv_close() do the full uninitialization process
> in all cases. This fixes the problem with corrupted images and still
> works fine with freshly created BDSs.
> 
> Signed-off-by: Alberto Garcia 
> ---
>  block.c| 57 
> +++---
>  tests/qemu-iotests/060 | 13 +++
>  tests/qemu-iotests/060.out | 12 ++
>  3 files changed, 53 insertions(+), 29 deletions(-)

Sooo...  What's the exact status of this patch? :-)

Max



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH] iotests: Make 087 pass without AIO enabled

2017-11-17 Thread Max Reitz
On 2017-11-15 19:07, Max Reitz wrote:
> If AIO has not been enabled in the qemu build that is to be tested, we
> should skip the "aio=native without O_DIRECT" test instead of failing.
> 
> Signed-off-by: Max Reitz 
> ---
> Cleber wanted to fix this in July with his "build configuration query
> tool and conditional (qemu-io)test skip" series
> (https://lists.gnu.org/archive/html/qemu-block/2017-07/msg01303.html),
> but unfortunately there hasn't been any activity on that (as far as I
> can see), so let's just solve it the simple way.
> ---
>  tests/qemu-iotests/087 | 9 -
>  1 file changed, 8 insertions(+), 1 deletion(-)

Applied to my block branch:

https://github.com/XanClic/qemu/commits/block

Max



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH v2 for-2.11] block: Make bdrv_next() keep strong references

2017-11-17 Thread Max Reitz
On 2017-11-10 18:25, Max Reitz wrote:
> On one hand, it is a good idea for bdrv_next() to return a strong
> reference because ideally nearly every pointer should be refcounted.
> This fixes intermittent failure of iotest 194.
> 
> On the other, it is absolutely necessary for bdrv_next() itself to keep
> a strong reference to both the BB (in its first phase) and the BDS (at
> least in the second phase) because when called the next time, it will
> dereference those objects to get a link to the next one.  Therefore, it
> needs these objects to stay around until then.  Just storing the pointer
> to the next in the iterator is not really viable because that pointer
> might become invalid as well.
> 
> Both arguments taken together means we should probably just invoke
> bdrv_ref() and blk_ref() in bdrv_next().  This means we have to assert
> that bdrv_next() is always called from the main loop, but that was
> probably necessary already before this patch and judging from the
> callers, it also looks to actually be the case.
> 
> Keeping these strong references means however that callers need to give
> them up if they decide to abort the iteration early.  They can do so
> through the new bdrv_next_cleanup() function.
> 
> Suggested-by: Kevin Wolf 
> Signed-off-by: Max Reitz 
> ---
> v2: Instead of keeping the strong reference in bdrv_drain_all_*() only,
> have them for all callers of bdrv_next() [Fam, Kevin]
> (Completely different patch now, so no git-backport-diff included
>  here)
> ---
>  include/block/block.h |  1 +
>  block.c   |  3 +++
>  block/block-backend.c | 48 ++--
>  block/snapshot.c  |  6 ++
>  migration/block.c |  1 +
>  5 files changed, 57 insertions(+), 2 deletions(-)

Due to one supporter and otherwise lack of resistance: Applied to my
block branch (https://github.com/XanClic/qemu/commits/block).

Max



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 02/24] sdl: remove -alt-grab and -ctrl-grab support

2017-11-17 Thread Programmingkid

> With absolute pointer devices such as usb-tablet being widely used
> mouse grabs (for relative pointing devices) should be rarely needed
> these days.  So the benefit of the options to configure the hotkey
> modifiers for grab (and other actions) seems questionable.  Which
> is expecially true for the -ctrl-grab which isn't handled in the
> handle_keyup() code.

So does this mean you are against a patch that would allow the user to chose a 
key to act as a mouse ungrab key?


Re: [Qemu-devel] [PATCH v22 00/30] qcow2: persistent dirty bitmaps

2017-11-17 Thread Eric Blake
Revisiting an old series:

On 06/28/2017 07:05 AM, Vladimir Sementsov-Ogievskiy wrote:
> Hi all!
> 
> There is a new update of qcow2-bitmap series - v22.
> 
> web: 
> https://src.openvz.org/users/vsementsov/repos/qemu/browse?at=qcow2-bitmap-v22
> git: https://src.openvz.org/scm/~vsementsov/qemu.git (tag qcow2-bitmap-v22)
> 
> v22:
> 
> Rebase on master, so changes, mostly related to new dirty bitmaps mutex:
> 

>   qcow2: add persistent dirty bitmaps support

> 
>  block.c  |   65 +-
>  block/Makefile.objs  |2 +-
>  block/dirty-bitmap.c |  154 -
>  block/io.c   |8 +
>  block/qcow2-bitmap.c | 1481 
> ++
>  block/qcow2-refcount.c   |   59 +-
>  block/qcow2.c|  155 -
>  block/qcow2.h|   43 ++
>  blockdev.c   |   73 ++-
>  docs/interop/qcow2.txt   |8 +-
>  include/block/block.h|3 +
>  include/block/block_int.h|   14 +
>  include/block/dirty-bitmap.h |   22 +-
>  include/qemu/hbitmap.h   |   49 +-
>  qapi/block-core.json |   42 +-
>  tests/Makefile.include   |2 +-
>  tests/qemu-iotests/165   |  105 +++
>  tests/qemu-iotests/165.out   |5 +
>  tests/qemu-iotests/group |1 +
>  tests/test-hbitmap.c |   19 +
>  util/hbitmap.c   |   51 +-
>  21 files changed, 2280 insertions(+), 81 deletions(-)

Nothing in this series touched qemu-img.  At the very minimum, it would
be nice if 'qemu-img info' were to display a summary of all bitmaps in
the qcow2 file; the QMP type ImageInfoSpecificQCow2 should be updated to
mention this information, and qemu-img updated to output it.  Being able
to create and remove persistent bitmaps from an at-rest qcow2 file
without having to fire up qemu to issue QMP commands to do so would also
be nice.

It's a bit late for getting this request into 2.12, but had it been
available, it would make my work at testing that 'qemu-img commit'
doesn't corrupt persistent bitmaps.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH for-2.11 2/2] iotests: test clearing unknown autoclear_features by qcow2

2017-11-17 Thread Max Reitz
On 2017-11-17 16:47, Kevin Wolf wrote:
> From: Vladimir Sementsov-Ogievskiy 
> 
> Test clearing unknown autoclear_features by qcow2 on incoming
> migration.
> 
> [ kwolf: Fixed wait for destination VM startup ]
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> Signed-off-by: Kevin Wolf 
> ---
>  tests/qemu-iotests/196 | 66 
> ++
>  tests/qemu-iotests/196.out |  5 
>  tests/qemu-iotests/group   |  1 +
>  3 files changed, 72 insertions(+)
>  create mode 100755 tests/qemu-iotests/196
>  create mode 100644 tests/qemu-iotests/196.out
> 
> diff --git a/tests/qemu-iotests/196 b/tests/qemu-iotests/196
> new file mode 100755
> index 00..4116ebc92b
> --- /dev/null
> +++ b/tests/qemu-iotests/196
> @@ -0,0 +1,66 @@
> +#!/usr/bin/env python
> +#
> +# Test clearing unknown autoclear_features flag by qcow2 after
> +# migration. This test mimics migration to older qemu.
> +#
> +# Copyright (c) 2017 Parallels International GmbH
> +#
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 2 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see .
> +#
> +
> +import os
> +import iotests
> +from iotests import qemu_img
> +
> +disk = os.path.join(iotests.test_dir, 'disk')
> +migfile = os.path.join(iotests.test_dir, 'migfile')
> +
> +class TestInvalidateAutoclear(iotests.QMPTestCase):
> +
> +def tearDown(self):
> +self.vm_a.shutdown()
> +self.vm_b.shutdown()
> +os.remove(disk)
> +os.remove(migfile)
> +
> +def setUp(self):
> +qemu_img('create', '-f', iotests.imgfmt, disk, '1M')
> +
> +self.vm_a = iotests.VM(path_suffix='a').add_drive(disk)
> +self.vm_a.launch()
> +
> +self.vm_b = iotests.VM(path_suffix='b').add_drive(disk)
> +self.vm_b.add_incoming("exec: cat '" + migfile + "'")
> +
> +def test_migration(self):
> +result = self.vm_a.qmp('migrate', uri='exec:cat>' + migfile)
> +self.assert_qmp(result, 'return', {});
> +self.assertNotEqual(self.vm_a.event_wait("STOP"), None)
> +
> +with open(disk, 'r+b') as f:
> +f.seek(88) # first byte of autoclear_features field
> +f.write(b'\xff')
> +
> +self.vm_b.launch()
> +while True:
> +result = self.vm_b.qmp('query-status')
> +if result['return']['status'] == 'running':
> +break
> +
> +with open(disk, 'rb') as f:
> +f.seek(88)
> +self.assertEqual(f.read(1), b'\x00')

I guess I'd rather shutdown the VM first and do the check then, but
since the header is updated on-disk when opening the qcow2 file...

Reviewed-by: Max Reitz 

> +
> +if __name__ == '__main__':
> +iotests.main(supported_fmts=['qcow2'])
> diff --git a/tests/qemu-iotests/196.out b/tests/qemu-iotests/196.out
> new file mode 100644
> index 00..ae1213e6f8
> --- /dev/null
> +++ b/tests/qemu-iotests/196.out
> @@ -0,0 +1,5 @@
> +.
> +--
> +Ran 1 tests
> +
> +OK
> diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
> index 83b839bbe3..1fad602152 100644
> --- a/tests/qemu-iotests/group
> +++ b/tests/qemu-iotests/group
> @@ -193,5 +193,6 @@
>  192 rw auto quick
>  194 rw auto migration quick
>  195 rw auto quick
> +196 rw auto quick
>  197 rw auto quick
>  198 rw auto
> 




signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 19/24] console: add and use qemu_display_find_default

2017-11-17 Thread Darren Kenny

On Fri, Nov 17, 2017 at 04:03:28PM +0100, Gerd Hoffmann wrote:

 Hi,


OK, so the odd thing then is the check for !remote_display earlier
on in the function (missing from the quote above) which seems to end
up initializing VNC (albeit with localhost) when CONFIG_VNC is
defined, but no other local display type has been selected.


remote_display is set in case either -vnc or -spice was given on the
command line, and qemu will not start a local user display in that case
(unless you explicitly ask for it via -display).


Makes sense.




OK, so neither SPICE or VNC really fit into this patchset because
they are seen as additional to the display type, not the actual
display type - SPICE is even more special too given the dependency
on a guest VM driver to really function correctly.


Hmm?  spice works just fine with any display.


Sorry, mixed that up with EGL I think.




I wonder should there be a, similar, qemu_display_find_remote_default()
which would be used when no local display is defined. And that would
use a similar mechanism to permit selecting VNC or SPICE via
registration mechanisms over rather than the #ifdef style for remote
displays that will remain ever after this patchset (certainly could
be another patchset beyond this one).


Well, historically qemu tried to do alot of stuff (like picking a
display to use) automatically for the user.  And we did stick to that
for backward compatibility reasons.  But this automagic causes quite
some headache now and then, when changing things and the autmagic logic
gets into the way.  So I don't feel like adding more of this.  Also vnc


Agree, there are times when automagic works - but unfortunately for
more consistent usage, you are better being explicit.


is in pretty much any qemu build, because it is enabled by default and
doesn't need any external dependencies, so falling back to spice would
be quite rare ...


Understood - definitely VNC is a pretty good fall-back for most
people.

Thanks for outlining your thoughts on this, very helpful in
understanding the UI code and why somethings are how they are.

Thanks,

Darren.



Re: [Qemu-devel] [PATCH for-2.11 1/2] block: Fix permissions in image activation

2017-11-17 Thread Max Reitz
On 2017-11-17 16:47, Kevin Wolf wrote:
> Inactive images generally request less permissions for their image files
> than they would if they were active (in particular, write permissions).
> Activating the image involves extending the permissions, therefore.
> 
> drv->bdrv_invalidate_cache() can already require write access to the
> image file, so we have to update the permissions earlier than that.
> The current code does it only later, so we have to move up this part.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  block.c | 32 ++--
>  1 file changed, 22 insertions(+), 10 deletions(-)

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature


[Qemu-devel] [PATCH for-2.11 2/2] iotests: test clearing unknown autoclear_features by qcow2

2017-11-17 Thread Kevin Wolf
From: Vladimir Sementsov-Ogievskiy 

Test clearing unknown autoclear_features by qcow2 on incoming
migration.

[ kwolf: Fixed wait for destination VM startup ]

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Signed-off-by: Kevin Wolf 
---
 tests/qemu-iotests/196 | 66 ++
 tests/qemu-iotests/196.out |  5 
 tests/qemu-iotests/group   |  1 +
 3 files changed, 72 insertions(+)
 create mode 100755 tests/qemu-iotests/196
 create mode 100644 tests/qemu-iotests/196.out

diff --git a/tests/qemu-iotests/196 b/tests/qemu-iotests/196
new file mode 100755
index 00..4116ebc92b
--- /dev/null
+++ b/tests/qemu-iotests/196
@@ -0,0 +1,66 @@
+#!/usr/bin/env python
+#
+# Test clearing unknown autoclear_features flag by qcow2 after
+# migration. This test mimics migration to older qemu.
+#
+# Copyright (c) 2017 Parallels International GmbH
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+#
+
+import os
+import iotests
+from iotests import qemu_img
+
+disk = os.path.join(iotests.test_dir, 'disk')
+migfile = os.path.join(iotests.test_dir, 'migfile')
+
+class TestInvalidateAutoclear(iotests.QMPTestCase):
+
+def tearDown(self):
+self.vm_a.shutdown()
+self.vm_b.shutdown()
+os.remove(disk)
+os.remove(migfile)
+
+def setUp(self):
+qemu_img('create', '-f', iotests.imgfmt, disk, '1M')
+
+self.vm_a = iotests.VM(path_suffix='a').add_drive(disk)
+self.vm_a.launch()
+
+self.vm_b = iotests.VM(path_suffix='b').add_drive(disk)
+self.vm_b.add_incoming("exec: cat '" + migfile + "'")
+
+def test_migration(self):
+result = self.vm_a.qmp('migrate', uri='exec:cat>' + migfile)
+self.assert_qmp(result, 'return', {});
+self.assertNotEqual(self.vm_a.event_wait("STOP"), None)
+
+with open(disk, 'r+b') as f:
+f.seek(88) # first byte of autoclear_features field
+f.write(b'\xff')
+
+self.vm_b.launch()
+while True:
+result = self.vm_b.qmp('query-status')
+if result['return']['status'] == 'running':
+break
+
+with open(disk, 'rb') as f:
+f.seek(88)
+self.assertEqual(f.read(1), b'\x00')
+
+if __name__ == '__main__':
+iotests.main(supported_fmts=['qcow2'])
diff --git a/tests/qemu-iotests/196.out b/tests/qemu-iotests/196.out
new file mode 100644
index 00..ae1213e6f8
--- /dev/null
+++ b/tests/qemu-iotests/196.out
@@ -0,0 +1,5 @@
+.
+--
+Ran 1 tests
+
+OK
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index 83b839bbe3..1fad602152 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -193,5 +193,6 @@
 192 rw auto quick
 194 rw auto migration quick
 195 rw auto quick
+196 rw auto quick
 197 rw auto quick
 198 rw auto
-- 
2.13.6




[Qemu-devel] [PATCH for-2.11 1/2] block: Fix permissions in image activation

2017-11-17 Thread Kevin Wolf
Inactive images generally request less permissions for their image files
than they would if they were active (in particular, write permissions).
Activating the image involves extending the permissions, therefore.

drv->bdrv_invalidate_cache() can already require write access to the
image file, so we have to update the permissions earlier than that.
The current code does it only later, so we have to move up this part.

Signed-off-by: Kevin Wolf 
---
 block.c | 32 ++--
 1 file changed, 22 insertions(+), 10 deletions(-)

diff --git a/block.c b/block.c
index edc5bb9f9b..6fe5b91260 100644
--- a/block.c
+++ b/block.c
@@ -4178,7 +4178,29 @@ void bdrv_invalidate_cache(BlockDriverState *bs, Error 
**errp)
 }
 }
 
+/*
+ * Update permissions, they may differ for inactive nodes.
+ *
+ * Note that the required permissions of inactive images are always a
+ * subset of the permissions required after activating the image. This
+ * allows us to just get the permissions upfront without restricting
+ * drv->bdrv_invalidate_cache().
+ *
+ * It also means that in error cases, we don't have to try and revert to
+ * the old permissions (which is an operation that could fail, too). We can
+ * just keep the extended permissions for the next time that an activation
+ * of the image is tried.
+ */
 bs->open_flags &= ~BDRV_O_INACTIVE;
+bdrv_get_cumulative_perm(bs, , _perm);
+ret = bdrv_check_perm(bs, NULL, perm, shared_perm, NULL, _err);
+if (ret < 0) {
+bs->open_flags |= BDRV_O_INACTIVE;
+error_propagate(errp, local_err);
+return;
+}
+bdrv_set_perm(bs, perm, shared_perm);
+
 if (bs->drv->bdrv_invalidate_cache) {
 bs->drv->bdrv_invalidate_cache(bs, _err);
 if (local_err) {
@@ -4195,16 +4217,6 @@ void bdrv_invalidate_cache(BlockDriverState *bs, Error 
**errp)
 return;
 }
 
-/* Update permissions, they may differ for inactive nodes */
-bdrv_get_cumulative_perm(bs, , _perm);
-ret = bdrv_check_perm(bs, NULL, perm, shared_perm, NULL, _err);
-if (ret < 0) {
-bs->open_flags |= BDRV_O_INACTIVE;
-error_propagate(errp, local_err);
-return;
-}
-bdrv_set_perm(bs, perm, shared_perm);
-
 QLIST_FOREACH(parent, >parents, next_parent) {
 if (parent->role->activate) {
 parent->role->activate(parent, _err);
-- 
2.13.6




[Qemu-devel] [PATCH for-2.11 0/2] block: Fix permissions in image activation

2017-11-17 Thread Kevin Wolf
Kevin Wolf (1):
  block: Fix permissions in image activation

Vladimir Sementsov-Ogievskiy (1):
  iotests: test clearing unknown autoclear_features by qcow2

 block.c| 32 +++---
 tests/qemu-iotests/196 | 66 ++
 tests/qemu-iotests/196.out |  5 
 tests/qemu-iotests/group   |  1 +
 4 files changed, 94 insertions(+), 10 deletions(-)
 create mode 100755 tests/qemu-iotests/196
 create mode 100644 tests/qemu-iotests/196.out

-- 
2.13.6




Re: [Qemu-devel] [PATCH 03/15] sun4u: move ISABus inside of EBusState

2017-11-17 Thread Mark Cave-Ayland
On 17/11/17 14:53, Artyom Tarasenko wrote:

> On Fri, Nov 17, 2017 at 2:42 PM, Mark Cave-Ayland
>  wrote:
>> Since the EBus is effectively a PCI-ISA bridge then the underlying ISA bus
>> should be contained within the PCI bridge itself.
> 
> While it's like that on the Sabre chipset, the Spitfire chipset (which
> I hope to add at some point) has the EBus, but no PCI, so maybe it's
> better to model it separately.
> On the other hand, the Spitfire has different EBus devices
> (particularly different type of the serial ports), so I'm not sure.

Oh I didn't realise you had more plans in this area :)  Any idea when
you'll be able to work on the them? TBH as you probably already know,
even the patchset in its current form with the ISA bus encapsulation is
so much better than what is already there, so I'd prefer to merge it and
help you work through any problems later unless you feel particularly
strongly?


ATB,

Mark.



Re: [Qemu-devel] [PATCH 01/15] apb: move QOM macros and typedefs from apb.c to apb.h

2017-11-17 Thread Mark Cave-Ayland
On 17/11/17 14:24, Artyom Tarasenko wrote:

> Hi Mark,
> 
> On Fri, Nov 17, 2017 at 2:42 PM, Mark Cave-Ayland
>  wrote:
>> This also includes the related IOMMUState typedef and defines.
>>
>> Signed-off-by: Mark Cave-Ayland 
>> ---
>>  hw/pci-host/apb.c |   85 
>> 
>>  include/hw/pci-host/apb.h |   86 
>> +
>>  2 files changed, 86 insertions(+), 85 deletions(-)
>>
>> diff --git a/hw/pci-host/apb.c b/hw/pci-host/apb.c
>> index 64025cd..f743a4e 100644
>> --- a/hw/pci-host/apb.c
>> +++ b/hw/pci-host/apb.c
>> @@ -82,91 +82,6 @@ do { printf("IOMMU: " fmt , ## __VA_ARGS__); } while (0)
>>  #define MAX_IVEC 0x40
>>  #define NO_IRQ_REQUEST (MAX_IVEC + 1)
>>
>> -#define IOMMU_PAGE_SIZE_8K  (1ULL << 13)
>> -#define IOMMU_PAGE_MASK_8K  (~(IOMMU_PAGE_SIZE_8K - 1))
>> -#define IOMMU_PAGE_SIZE_64K (1ULL << 16)
>> -#define IOMMU_PAGE_MASK_64K (~(IOMMU_PAGE_SIZE_64K - 1))
>> -
>> -#define IOMMU_NREGS 3
>> -
>> -#define IOMMU_CTRL  0x0
>> -#define IOMMU_CTRL_TBW_SIZE (1ULL << 2)
>> -#define IOMMU_CTRL_MMU_EN   (1ULL)
>> -
>> -#define IOMMU_CTRL_TSB_SHIFT16
>> -
>> -#define IOMMU_BASE  0x8
>> -#define IOMMU_FLUSH 0x10
>> -
>> -#define IOMMU_TTE_DATA_V(1ULL << 63)
>> -#define IOMMU_TTE_DATA_SIZE (1ULL << 61)
>> -#define IOMMU_TTE_DATA_W(1ULL << 1)
>> -
>> -#define IOMMU_TTE_PHYS_MASK_8K  0x1ffe000ULL
>> -#define IOMMU_TTE_PHYS_MASK_64K 0x1ff8000ULL
>> -
>> -#define IOMMU_TSB_8K_OFFSET_MASK_8M0x007fe000ULL
>> -#define IOMMU_TSB_8K_OFFSET_MASK_16M   0x00ffe000ULL
>> -#define IOMMU_TSB_8K_OFFSET_MASK_32M   0x01ffe000ULL
>> -#define IOMMU_TSB_8K_OFFSET_MASK_64M   0x03ffe000ULL
>> -#define IOMMU_TSB_8K_OFFSET_MASK_128M  0x07ffe000ULL
>> -#define IOMMU_TSB_8K_OFFSET_MASK_256M  0x0fffe000ULL
>> -#define IOMMU_TSB_8K_OFFSET_MASK_512M  0x1fffe000ULL
>> -#define IOMMU_TSB_8K_OFFSET_MASK_1G0x3fffe000ULL
>> -
>> -#define IOMMU_TSB_64K_OFFSET_MASK_64M  0x03ffULL
>> -#define IOMMU_TSB_64K_OFFSET_MASK_128M 0x07ffULL
>> -#define IOMMU_TSB_64K_OFFSET_MASK_256M 0x0fffULL
>> -#define IOMMU_TSB_64K_OFFSET_MASK_512M 0x1fffULL
>> -#define IOMMU_TSB_64K_OFFSET_MASK_1G   0x3fffULL
>> -#define IOMMU_TSB_64K_OFFSET_MASK_2G   0x7fffULL
>> -
>> -typedef struct IOMMUState {
>> -AddressSpace iommu_as;
>> -IOMMUMemoryRegion iommu;
>> -
>> -uint64_t regs[IOMMU_NREGS];
>> -} IOMMUState;
>> -
>> -#define TYPE_APB "pbm"
>> -
>> -#define APB_DEVICE(obj) \
>> -OBJECT_CHECK(APBState, (obj), TYPE_APB)
>> -
>> -#define TYPE_APB_IOMMU_MEMORY_REGION "pbm-iommu-memory-region"
>> -
>> -typedef struct APBState {
>> -PCIHostState parent_obj;
>> -
>> -MemoryRegion apb_config;
>> -MemoryRegion pci_config;
>> -MemoryRegion pci_mmio;
>> -MemoryRegion pci_ioport;
>> -uint64_t pci_irq_in;
>> -IOMMUState iommu;
>> -uint32_t pci_control[16];
>> -uint32_t pci_irq_map[8];
>> -uint32_t pci_err_irq_map[4];
>> -uint32_t obio_irq_map[32];
>> -qemu_irq *pbm_irqs;
>> -qemu_irq *ivec_irqs;
>> -unsigned int irq_request;
>> -uint32_t reset_control;
>> -unsigned int nr_resets;
>> -} APBState;
>> -
>> -#define TYPE_PBM_PCI_BRIDGE "pbm-bridge"
>> -#define PBM_PCI_BRIDGE(obj) \
>> -OBJECT_CHECK(PBMPCIBridge, (obj), TYPE_PBM_PCI_BRIDGE)
>> -
>> -typedef struct PBMPCIBridge {
>> -/*< private >*/
>> -PCIBridge parent_obj;
>> -
>> -/* Is this busA with in-built devices (ebus)? */
>> -bool busA;
>> -} PBMPCIBridge;
>> -
>>  static inline void pbm_set_request(APBState *s, unsigned int irq_num)
>>  {
>>  APB_DPRINTF("%s: request irq %d\n", __func__, irq_num);
>> diff --git a/include/hw/pci-host/apb.h b/include/hw/pci-host/apb.h
>> index b19bd55..5d39c03 100644
>> --- a/include/hw/pci-host/apb.h
>> +++ b/include/hw/pci-host/apb.h
>> @@ -2,6 +2,92 @@
>>  #define PCI_HOST_APB_H
>>
>>  #include "qemu-common.h"
>> +#include "hw/pci/pci_host.h"
>> +
>> +#define IOMMU_NREGS 3
>> +
>> +#define IOMMU_PAGE_SIZE_8K  (1ULL << 13)
>> +#define IOMMU_PAGE_MASK_8K  (~(IOMMU_PAGE_SIZE_8K - 1))
>> +#define IOMMU_PAGE_SIZE_64K (1ULL << 16)
>> +#define IOMMU_PAGE_MASK_64K (~(IOMMU_PAGE_SIZE_64K - 1))
>> +
>> +#define IOMMU_CTRL  0x0
>> +#define IOMMU_CTRL_TBW_SIZE (1ULL << 2)
>> +#define IOMMU_CTRL_MMU_EN   (1ULL)
> 
> While at it, I think the naming for the model-specific constants
> should be more explicit.
> How about US2I_IOMMU_ or SABRE_IOMMU_?

I'm actually planning to follow this up with another patchset to split
out the IOMMU into a separate file (and QOM object) in a similar way to
sun4m_iommu.c so I can do it there rather than in this patchset if you
agree?

>> 

Re: [Qemu-devel] [PATCH v2 1/1] migration/ram.c: do not set 'postcopy_running' in POSTCOPY_INCOMING_END

2017-11-17 Thread Dr. David Alan Gilbert
* Daniel Henrique Barboza (danie...@linux.vnet.ibm.com) wrote:
> When migrating a VM with 'migrate_set_capability postcopy-ram on'
> a postcopy_state is set during the process, ending up with the
> state POSTCOPY_INCOMING_END when the migration is over. This
> postcopy_state is taken into account inside ram_load to check
> how it will load the memory pages. This same ram_load is called when
> in a loadvm command.
> 
> Inside ram_load, the logic to see if we're at postcopy_running state
> is:
> 
> postcopy_running = postcopy_state_get() >= POSTCOPY_INCOMING_LISTENING
> 
> postcopy_state_get() returns this enum type:
> 
> typedef enum {
> POSTCOPY_INCOMING_NONE = 0,
> POSTCOPY_INCOMING_ADVISE,
> POSTCOPY_INCOMING_DISCARD,
> POSTCOPY_INCOMING_LISTENING,
> POSTCOPY_INCOMING_RUNNING,
> POSTCOPY_INCOMING_END
> } PostcopyState;
> 
> In the case where ram_load is executed and postcopy_state is
> POSTCOPY_INCOMING_END, postcopy_running will be set to 'true' and
> ram_load will behave like a postcopy is in progress. This scenario isn't
> achievable in a migration but it is reproducible when executing
> savevm/loadvm after migrating with 'postcopy-ram on', causing loadvm
> to fail with Error -22:
> 
> Source:
> 
> (qemu) migrate_set_capability postcopy-ram on
> (qemu) migrate tcp:127.0.0.1:
> 
> Dest:
> 
> (qemu) migrate_set_capability postcopy-ram on
> (qemu)
> ubuntu1704-intel login:
> Ubuntu 17.04 ubuntu1704-intel ttyS0
> 
> ubuntu1704-intel login: (qemu)
> (qemu) savevm test1
> (qemu) loadvm test1
> Unknown combination of migration flags: 0x4 (postcopy mode)
> error while loading state for instance 0x0 of device 'ram'
> Error -22 while loading VM state
> (qemu)
> 
> This patch fixes this problem by changing the existing logic for
> postcopy_advised and postcopy_running in ram_load, making them
> 'false' if we're at POSTCOPY_INCOMING_END state.
> 
> Signed-off-by: Daniel Henrique Barboza 

Thank you for spotting that.

As Peter says those two functions might be better elsewhere,
but they're OK there; so:

Reviewed-by: Dr. David Alan Gilbert 

> ---
>  migration/ram.c | 16 ++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index 8620aa400a..021d583b9b 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -2798,6 +2798,18 @@ static int ram_load_postcopy(QEMUFile *f)
>  return ret;
>  }
>  
> +static bool postcopy_is_advised(void)
> +{
> +PostcopyState ps = postcopy_state_get();
> +return ps >= POSTCOPY_INCOMING_ADVISE && ps < POSTCOPY_INCOMING_END;
> +}
> +
> +static bool postcopy_is_running(void)
> +{
> +PostcopyState ps = postcopy_state_get();
> +return ps >= POSTCOPY_INCOMING_LISTENING && ps < POSTCOPY_INCOMING_END;
> +}
> +
>  static int ram_load(QEMUFile *f, void *opaque, int version_id)
>  {
>  int flags = 0, ret = 0, invalid_flags = 0;
> @@ -2807,9 +2819,9 @@ static int ram_load(QEMUFile *f, void *opaque, int 
> version_id)
>   * If system is running in postcopy mode, page inserts to host memory 
> must
>   * be atomic
>   */
> -bool postcopy_running = postcopy_state_get() >= 
> POSTCOPY_INCOMING_LISTENING;
> +bool postcopy_running = postcopy_is_running();
>  /* ADVISE is earlier, it shows the source has the postcopy capability on 
> */
> -bool postcopy_advised = postcopy_state_get() >= POSTCOPY_INCOMING_ADVISE;
> +bool postcopy_advised = postcopy_is_advised();
>  
>  seq_iter++;
>  
> -- 
> 2.13.6
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



[Qemu-devel] [PULL 2/4] nbd/client: Use error_prepend() correctly

2017-11-17 Thread Eric Blake
When using error prepend(), it is necessary to end with a space
in the format string; otherwise, messages come out incorrectly,
such as when connecting to a socket that hangs up immediately:

can't open device nbd://localhost:10809/: Failed to read dataUnexpected 
end-of-file before all bytes were read

Originally botched in commit e44ed99d, then several more instances
added in the meantime.

Pre-existing and not fixed here: we are inconsistent on capitalization;
some of our messages start with lower case, and others start with upper,
although the use of error_prepend() is much nicer to read when all
fragments consistently start with lower.

CC: qemu-sta...@nongnu.org
Signed-off-by: Eric Blake 
Message-Id: <20171113152424.25381-1-ebl...@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Markus Armbruster 
---
 nbd/client.c | 50 ++
 1 file changed, 26 insertions(+), 24 deletions(-)

diff --git a/nbd/client.c b/nbd/client.c
index 1880103d2a..4e15fc484d 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -79,12 +79,12 @@ static int nbd_send_option_request(QIOChannel *ioc, 
uint32_t opt,
 stl_be_p(, len);

 if (nbd_write(ioc, , sizeof(req), errp) < 0) {
-error_prepend(errp, "Failed to send option request header");
+error_prepend(errp, "Failed to send option request header: ");
 return -1;
 }

 if (len && nbd_write(ioc, (char *) data, len, errp) < 0) {
-error_prepend(errp, "Failed to send option request data");
+error_prepend(errp, "Failed to send option request data: ");
 return -1;
 }

@@ -113,7 +113,7 @@ static int nbd_receive_option_reply(QIOChannel *ioc, 
uint32_t opt,
 {
 QEMU_BUILD_BUG_ON(sizeof(*reply) != 20);
 if (nbd_read(ioc, reply, sizeof(*reply), errp) < 0) {
-error_prepend(errp, "failed to read option reply");
+error_prepend(errp, "failed to read option reply: ");
 nbd_send_opt_abort(ioc);
 return -1;
 }
@@ -166,7 +166,7 @@ static int nbd_handle_reply_err(QIOChannel *ioc, 
nbd_opt_reply *reply,
 msg = g_malloc(reply->length + 1);
 if (nbd_read(ioc, msg, reply->length, errp) < 0) {
 error_prepend(errp, "failed to read option error 0x%" PRIx32
-  " (%s) message",
+  " (%s) message: ",
   reply->type, nbd_rep_lookup(reply->type));
 goto cleanup;
 }
@@ -277,7 +277,7 @@ static int nbd_receive_list(QIOChannel *ioc, const char 
*want, bool *match,
 return -1;
 }
 if (nbd_read(ioc, , sizeof(namelen), errp) < 0) {
-error_prepend(errp, "failed to read option name length");
+error_prepend(errp, "failed to read option name length: ");
 nbd_send_opt_abort(ioc);
 return -1;
 }
@@ -290,7 +290,8 @@ static int nbd_receive_list(QIOChannel *ioc, const char 
*want, bool *match,
 }
 if (namelen != strlen(want)) {
 if (nbd_drop(ioc, len, errp) < 0) {
-error_prepend(errp, "failed to skip export name with wrong 
length");
+error_prepend(errp,
+  "failed to skip export name with wrong length: ");
 nbd_send_opt_abort(ioc);
 return -1;
 }
@@ -299,14 +300,14 @@ static int nbd_receive_list(QIOChannel *ioc, const char 
*want, bool *match,

 assert(namelen < sizeof(name));
 if (nbd_read(ioc, name, namelen, errp) < 0) {
-error_prepend(errp, "failed to read export name");
+error_prepend(errp, "failed to read export name: ");
 nbd_send_opt_abort(ioc);
 return -1;
 }
 name[namelen] = '\0';
 len -= namelen;
 if (nbd_drop(ioc, len, errp) < 0) {
-error_prepend(errp, "failed to read export description");
+error_prepend(errp, "failed to read export description: ");
 nbd_send_opt_abort(ioc);
 return -1;
 }
@@ -390,7 +391,7 @@ static int nbd_opt_go(QIOChannel *ioc, const char *wantname,
 return -1;
 }
 if (nbd_read(ioc, , sizeof(type), errp) < 0) {
-error_prepend(errp, "failed to read info type");
+error_prepend(errp, "failed to read info type: ");
 nbd_send_opt_abort(ioc);
 return -1;
 }
@@ -405,13 +406,13 @@ static int nbd_opt_go(QIOChannel *ioc, const char 
*wantname,
 return -1;
 }
 if (nbd_read(ioc, >size, sizeof(info->size), errp) < 0) {
-error_prepend(errp, "failed to read info size");
+error_prepend(errp, "failed to read info size: ");
 nbd_send_opt_abort(ioc);
 return -1;
 }
 be64_to_cpus(>size);
 if (nbd_read(ioc, >flags, sizeof(info->flags), errp) < 0) {
-error_prepend(errp, "failed to read info flags");
+ 

[Qemu-devel] [PULL 1/4] nbd: Don't crash when server reports NBD_CMD_READ failure

2017-11-17 Thread Eric Blake
If a server fails a read, for example with EIO, but the connection
is still live, then we would crash trying to print a non-existent
error message in nbd_client_co_preadv().  For consistency, also
change the error printout in nbd_read_reply_entry(), although that
instance does not crash.  Bug introduced in commit f140e300.

Signed-off-by: Eric Blake 
Message-Id: <20171112013936.5942-1-ebl...@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
 block/nbd-client.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/nbd-client.c b/block/nbd-client.c
index bcfed0133d..9206652e45 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -78,7 +78,7 @@ static coroutine_fn void nbd_read_reply_entry(void *opaque)
 while (!s->quit) {
 assert(s->reply.handle == 0);
 ret = nbd_receive_reply(s->ioc, >reply, _err);
-if (ret < 0) {
+if (local_err) {
 error_report_err(local_err);
 }
 if (ret <= 0) {
@@ -691,7 +691,7 @@ int nbd_client_co_preadv(BlockDriverState *bs, uint64_t 
offset,

 ret = nbd_co_receive_cmdread_reply(client, request.handle, offset, qiov,
_err);
-if (ret < 0) {
+if (local_err) {
 error_report_err(local_err);
 }
 return ret;
-- 
2.13.6




Re: [Qemu-devel] [PATCH 01/24] sdl: remove -no-frame support

2017-11-17 Thread Daniel P. Berrange
On Fri, Nov 17, 2017 at 02:59:54PM +, Daniel P. Berrange wrote:
> On Fri, Nov 17, 2017 at 03:49:06PM +0100, Gerd Hoffmann wrote:
> > On Fri, Nov 17, 2017 at 02:21:30PM +, Daniel P. Berrange wrote:
> > > On Fri, Nov 17, 2017 at 11:30:23AM +0100, Gerd Hoffmann wrote:
> > > > SDL2 doesn't support this any more, the SDL_NOFRAME window flag is gone.
> > > > Drop the code, print a notice when the option is still used.
> > > > 
> > > > Signed-off-by: Gerd Hoffmann 
> > > > ---
> > > >  include/ui/console.h |  5 ++---
> > > >  ui/sdl.c |  8 +---
> > > >  ui/sdl2.c|  7 +--
> > > >  vl.c | 15 ---
> > > >  4 files changed, 8 insertions(+), 27 deletions(-)
> > > 
> > > This needs to go through the deprecation process before we can
> > > drop it. There's still time to get it into 2.11 deprecated list
> > > which would cut the time needed to wait for real deletion.
> > 
> > I fail to see the point.  SDL2 simply doesn't support it, so it never
> > worked with SDL2.  We switched to SDL2 as default recently.  So it seems
> > fair to me to just say so instead of silently ignoring the option ...
> 
> We still support SDL don't we, which does support it, and which this
> option was targetted at ?

Perhaps we might even consider deprecating SDL1 as a whole, then the
deprecation of -no-frame is a natural side effect ?

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



[Qemu-devel] [PULL 4/4] nbd/server: Fix error reporting for bad requests

2017-11-17 Thread Eric Blake
The NBD spec says an attempt to NBD_CMD_TRIM on a read-only
export should fail with EPERM, as a trim has the potential
to change disk contents, but we were relying on the block
layer to catch that for us, which might not always give the
right error (and even if it does, it does not let us pass
back a sane message for structured replies).

The NBD spec says an attempt to NBD_CMD_WRITE_ZEROES out of
bounds should fail with ENOSPC, not EINVAL.

Our check for u64 offset + u32 length wraparound up front is
pointless; nothing uses offset until after the second round
of sanity checks, and we can just as easily ensure there is
no wraparound by checking whether offset is in bounds (since
a disk size cannot exceed off_t which is 63 bits, adding a
32-bit number for a valid offset can't overflow).  Bonus:
dropping the up-front check lets us keep the connection alive
after NBD_CMD_WRITE, whereas before we would drop the
connection (of course, any client sending a packet that would
trigger the failure is already buggy, so it's also okay to
drop the connection, but better quality-of-implementation
never hurts).

Solve all of these issues by some code motion and improved
request validation.

Signed-off-by: Eric Blake 
Message-Id: <20171115213557.3548-1-ebl...@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
 nbd/server.c | 36 
 1 file changed, 12 insertions(+), 24 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index df771fd42f..7d6801b427 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -1366,15 +1366,6 @@ static int nbd_co_receive_request(NBDRequestData *req, 
NBDRequest *request,
 return -EIO;
 }

-/* Check for sanity in the parameters, part 1.  Defer as many
- * checks as possible until after reading any NBD_CMD_WRITE
- * payload, so we can try and keep the connection alive.  */
-if ((request->from + request->len) < request->from) {
-error_setg(errp,
-   "integer overflow detected, you're probably being 
attacked");
-return -EINVAL;
-}
-
 if (request->type == NBD_CMD_READ || request->type == NBD_CMD_WRITE) {
 if (request->len > NBD_MAX_BUFFER_SIZE) {
 error_setg(errp, "len (%" PRIu32" ) is larger than max len (%u)",
@@ -1399,12 +1390,21 @@ static int nbd_co_receive_request(NBDRequestData *req, 
NBDRequest *request,
   request->len);
 }

-/* Sanity checks, part 2. */
-if (request->from + request->len > client->exp->size) {
+/* Sanity checks. */
+if (client->exp->nbdflags & NBD_FLAG_READ_ONLY &&
+(request->type == NBD_CMD_WRITE ||
+ request->type == NBD_CMD_WRITE_ZEROES ||
+ request->type == NBD_CMD_TRIM)) {
+error_setg(errp, "Export is read-only");
+return -EROFS;
+}
+if (request->from > client->exp->size ||
+request->from + request->len > client->exp->size) {
 error_setg(errp, "operation past EOF; From: %" PRIu64 ", Len: %" PRIu32
", Size: %" PRIu64, request->from, request->len,
(uint64_t)client->exp->size);
-return request->type == NBD_CMD_WRITE ? -ENOSPC : -EINVAL;
+return (request->type == NBD_CMD_WRITE ||
+request->type == NBD_CMD_WRITE_ZEROES) ? -ENOSPC : -EINVAL;
 }
 valid_flags = NBD_CMD_FLAG_FUA;
 if (request->type == NBD_CMD_READ && client->structured_reply) {
@@ -1482,12 +1482,6 @@ static coroutine_fn void nbd_trip(void *opaque)

 break;
 case NBD_CMD_WRITE:
-if (exp->nbdflags & NBD_FLAG_READ_ONLY) {
-error_setg(_err, "Export is read-only");
-ret = -EROFS;
-break;
-}
-
 flags = 0;
 if (request.flags & NBD_CMD_FLAG_FUA) {
 flags |= BDRV_REQ_FUA;
@@ -1500,12 +1494,6 @@ static coroutine_fn void nbd_trip(void *opaque)

 break;
 case NBD_CMD_WRITE_ZEROES:
-if (exp->nbdflags & NBD_FLAG_READ_ONLY) {
-error_setg(_err, "Export is read-only");
-ret = -EROFS;
-break;
-}
-
 flags = 0;
 if (request.flags & NBD_CMD_FLAG_FUA) {
 flags |= BDRV_REQ_FUA;
-- 
2.13.6




  1   2   3   >