[Qemu-block] [PATCH] hw/ide/ahci.c: Fix shift left into sign bit

2015-10-16 Thread Peter Maydell
Avoid undefined behaviour from shifting left into the sign bit:

hw/ide/ahci.c:551:36: runtime error: left shift of 255 by 24 places cannot be 
represented in type 'int'

(Unfortunately C's promotion rules mean that in the expression
"some_uint8_t_variable << 24" the LHS gets promoted to signed
int before shifting.)

Signed-off-by: Peter Maydell 
---
clang's undefined sanitizer produces a lot of copies of this warning during
'make check'...

 hw/ide/ahci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
index 796be15..21f76ed 100644
--- a/hw/ide/ahci.c
+++ b/hw/ide/ahci.c
@@ -548,7 +548,7 @@ static void ahci_init_d2h(AHCIDevice *ad)
 ad->init_d2h_sent = true;
 /* We're emulating receiving the first Reg H2D Fis from the device;
  * Update the SIG register, but otherwise proceed as normal. */
-pr->sig = (ide_state->hcyl << 24) |
+pr->sig = ((uint32_t)ide_state->hcyl << 24) |
 (ide_state->lcyl << 16) |
 (ide_state->sector << 8) |
 (ide_state->nsector & 0xFF);
-- 
1.9.1




Re: [Qemu-block] [Qemu-devel] Question about xen disk unplug support for ahci missed in qemu

2015-10-16 Thread Laszlo Ersek
On 10/16/15 13:34, Fabio Fantoni wrote:
> Il 16/10/2015 12:47, Stefano Stabellini ha scritto:
>> On Fri, 16 Oct 2015, Fabio Fantoni wrote:
>>> Il 16/10/2015 12:13, Anthony PERARD ha scritto:
 On Fri, Oct 16, 2015 at 10:32:44AM +0200, Fabio Fantoni wrote:
> Il 15/10/2015 20:02, Anthony PERARD ha scritto:
>> On Thu, Oct 15, 2015 at 06:27:17PM +0200, Fabio Fantoni wrote:
>>> Il 14/10/2015 13:06, Stefano Stabellini ha scritto:
 I would suggest Fabio to avoid AHCI disks altogether and just use
 OVMF
 with PV disks only and Anthony's patch to libxl to avoid creating
 any
 IDE disks: http://marc.info/?l=xen-devel=144482080812353.

 Would that work for you?
>>> Thanks for the advice, I tried it:
>>> https://github.com/Fantu/Xen/commits/rebase/m2r-testing-4.6
>>>
>>> I installed W10 pro 64 bit with ide disk, installed the win pv
>>> drivers
>>> and
>>> after changed to xvdX instead hdX, is the only change needed, right?
>>> Initial boot is ok (ovmf part about pv disks seems ok) but windows
>>> boot
>>> fails with problem with pv drivers.
>>> In attachment full qemu log with xen_platform trace and domU's xl
>>> cfg.
>>>
>>> Someone have windows domUs with ovmf and pv disks only working?
>>> If yes
>>> can
>>> tell me the difference to understand what can be the problem please?
>> When I worked on the PV disk implementation in OVMF, I was able to
>> boot
>> a Windows 8 with pv disk only.
>>
>> I don't have access to the guest configuration I was using, but I
>> think
>> one
>> difference would be the viridian setting, I'm pretty sure I did
>> not set
>> it.
>>
> I tried with viridian disabled but did the same thing, looking
> cdrom as
> latest thing before xenbug trace in qemu log I tried also to remove
> it but
> also in this case don't boot correctly, full qemu log in attachment.
> I don't know if is a ovmf thing to improve (like what seems in
> Laszlo and
> Kevin mails) or xen winpv drivers unexpected case, have you tried also
> with
> latest winpv builds? (for exclude regression)
 No, I did not tried the latest winpv drivers.

 Sorry I can help much more that that. When I install this win8 guest
 tried
 to boot it with pv drivers only, that was more than a year ago. I
 have not
 check if it's still working. (Also I can not try anything more recent,
 right now.)

>>> I did many other tests, retrying with ide first boot working but show pv
>>> devices not working, I did another reboot (with ide) and pv devices was
>>> working, after I retried with pv (xvdX) and boot correctly.
>>> After other tests I found that with empty cdrom device (required for xl
>>> cd-insert/cd-eject) boot stop at start (tianocore image), same result
>>> with ide
>>> instead.
>>>  From xl cfg:
>>> disk=['/mnt/vm/disks/W10UEFI.disk1.cow-sn1,qcow2,xvda,rw',',raw,xvdb,ro,cdrom']
>>>
>>> With seabios domU boot also with empty cdrom.
>>> In qemu log I found only these that can be related:
 xen be: qdisk-51728: error: Could not open image: No such file or
 directory
 xen be: qdisk-51728: initialise() failed
>>> And latest xl dmesg line is:
 (d1) Invoking OVMF ...
>>> If you need more informations/test tell me and I'll post them.
>> Are you saying that without any cdrom drives, it works correctly?
> Yes, I did also another test to be sure, starting with ide, installing
> windows, the pv drivers, rebooting 2 times (with one at boot of time
> boot with ide only and without net and disks pv drivers working) and
> after rebooting with pv disks (xvdX) works.
> With cdrom not empty (with iso) works, with empty not, tried with both
> ide (hdX) and pv (xvdX).
> Empty cdrom not working with ovmf I suppose is ovmf bug or inexpected case.
> About major of winpv drivers problem at boot I suppose can be solved
> improving ovmf and winpv driver removing bad hybrid thing actually, but
> I have too low knowledge to be sure.
> About the problem of pv start after install that requiring at least 2
> reboot can be also a windows 10 problem (only a suppose).
> 
> About empty cdrom with ovmf can be solved please?
> 

Sorry, I find your problem report impenetrable. :( Please slow down and
try to spend time on punctuation at least.

For me to make heads or tails of this, I'll need the following:

- The debug output of an OVMF binary built with the DEBUG_VERBOSE bit
(0x0040) enabled in PcdDebugPrintErrorLevel, in addition to the
default setting.

- Preferably, I'll need two logs, one for the "working" case, and
another for the "non-working" case.

- A description of the virtual hardware (disks etc) that is
understandable to someone who hasn't booted Xen in several years; for
both cases above.

- Please try to make an exact, itemized list of the steps you perform
before executing the successful vs. 

Re: [Qemu-block] [Qemu-devel] Question about xen disk unplug support for ahci missed in qemu

2015-10-16 Thread Kevin Wolf
Am 16.10.2015 um 18:53 hat Paul Durrant geschrieben:
> > > > > > Just tell your admin what virtual hardware you really need. (Or tell
> > > > > > them to give you a proper interface to configure your VMs yourself.)
> > > > > >
> > > > >
> > > > > My point is that the virtual hardware that the OS user wants will
> > > > > change. Before they install PV drivers, they will need emulated
> > > > > device. After installing PV drivers they will want PV devices. Should
> > > > > they really have to contact their cloud provider to make the switch,
> > > > > when at the moment it happens automatically and transparently (the
> > > > > AHCI problem aside)?
> > > >
> > > > My point is that such a magic change shouldn't happen. It doesn't happen
> > > > on real hardware either and people still get things installed to non-IDE
> > > > disks.
> > > >
> > > > There is no reason to install the OS onto a different device than will
> > > > be used later. With Linux, it's no problem at all because the PV drivers
> > > > are already included on the installation media anyway, and on Windows
> > or
> > > > presumably any other OS you can load and install the drivers right from
> > > > the beginning.
> > > >
> > > > In fact, I would be surprised if using xendisk instead of IDE for
> > > > installing Windows didn't result in a noticably faster installation.
> > > >
> > >
> > > It most certainly would, but requiring users do it this way is likely to 
> > > meet
> > some resistance I suspect.
> > 
> > Why do you think so? Installing the PV drivers afterwards doesn't seem
> > easier than just providing them during the installation.
> > 
> 
> My experience of XenServer customers tells me that any form of manual
> intervention during guest install is likely to meet with resistance,
> unfortunately.

Do they consider the guest install complete before they manually
intervene for installing the PV drivers?

I'm no Windows expert, but I'm sure there is a way to automate
installation even when a driver disk is needed.

> > > > Now, if you really insist on providing a legacy interface even to guests
> > > > that eventually use PV drivers, there actually are sane ways to
> > > > implement this. It will be tricky to make that transition now without
> > > > breaking compatibility, but it could have been done from the start.
> > > >
> > > > Sane means for example that you don't open the same image twice (and
> > > > even read-write!) at the same time. This is a recipe for disaster and
> > > > it's surprising that you don't see corrupted images more often.
> > > >
> > >
> > > We don't because unplug is supposed to ensure the emulated device is
> > > gone before the PV frontend is started
> > 
> > The important part is the backend, but it seems that you open the second
> > instance of the image only when starting the PV frontend?
> 
> I believe this is the case, yes.
> 
> > 
> > As long as you don't enable the user to use most of qemu's functionality
> > like starting block jobs (which would keep the IDE instance around even
> > after unplugging the disk), it might actually be safe assuming that the
> > guest cooperates. Not sure what a malicious guest could do, though, as
> > nobody seems to check whether IDE is really unplugged before the second
> > instance is opened.
> 
> The Windows drivers do check. After the unplug Windows is asked to
> re-enumerate the IDE buses and we make sure the disks we expect to be
> gone are really gone.

You can't use guest code to protect against malicious guests.

> > raw and qcow2 should be safe these days, but in
> > earlier times it would probably have been possible for the guest to
> > overwrite the image header and access arbitrary files on the host as
> > backing file. It might still be true for other image formats.
> > 
> > > > So if you wanted to have a clean solution, try to think how real
> > > > hardware would solve the problem. If you want me to suggest something
> > > > off the top of my head, I would come up with an extended IDE device
> > (one
> > > > single device!) that provides the IDE I/O ports and additionally some
> > > > MMIO BAR that enables access to PV functionality.
> > > >
> > > > Once you enable PV functionality, the IDE ports stop working; device
> > > > reset disables the PV ring and goes back to IDE mode. No hard disk
> > > > suddenly disappearing from the machine, no image corruption if the IDE
> > > > device is written to before enabling PV, etc.
> > > >
> > >
> > > That's not sufficient though. The IDE device must not be enumerated by
> > > the OS and, in Windows at least, that enumeration occurs before the PV
> > > frontend has started up.
> > 
> > The trick is that it's only a single device, so there is no second
> > device that must be prevented from being enumerated. You provide a
> > driver for this specific IDE controller, so Windows wouldn't even try
> > the generic IDE driver when your driver is available.
> > 
> 
> But the whole point is that we want Windows to use the 

Re: [Qemu-block] [Qemu-devel] [PATCH] hw/ide/ahci.c: Fix shift left into sign bit

2015-10-16 Thread John Snow


On 10/16/2015 01:48 PM, Peter Maydell wrote:
> Avoid undefined behaviour from shifting left into the sign bit:
> 
> hw/ide/ahci.c:551:36: runtime error: left shift of 255 by 24 places cannot be 
> represented in type 'int'
> 
> (Unfortunately C's promotion rules mean that in the expression
> "some_uint8_t_variable << 24" the LHS gets promoted to signed
> int before shifting.)
> 
> Signed-off-by: Peter Maydell 
> ---
> clang's undefined sanitizer produces a lot of copies of this warning during
> 'make check'...
> 
>  hw/ide/ahci.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
> index 796be15..21f76ed 100644
> --- a/hw/ide/ahci.c
> +++ b/hw/ide/ahci.c
> @@ -548,7 +548,7 @@ static void ahci_init_d2h(AHCIDevice *ad)
>  ad->init_d2h_sent = true;
>  /* We're emulating receiving the first Reg H2D Fis from the device;
>   * Update the SIG register, but otherwise proceed as normal. */
> -pr->sig = (ide_state->hcyl << 24) |
> +pr->sig = ((uint32_t)ide_state->hcyl << 24) |
>  (ide_state->lcyl << 16) |
>  (ide_state->sector << 8) |
>  (ide_state->nsector & 0xFF);
> 

Reviewed-by: John Snow 

Since the "patches" tool seems to still be hiccuping, do you want to
just apply this directly?

--js



Re: [Qemu-block] [PATCH] block: fix memory leak in early exit

2015-10-16 Thread Alberto Garcia
On Thu 15 Oct 2015 05:54:27 PM CEST, Stefan Hajnoczi  
wrote:
> The stream block job has two early exit code paths.  They do not free
> s->backing_file_str.
>
> Also, the early exits rely on the fact that the coroutine hasn't yielded
> yet and was launched from the main thread.  Therefore the coroutine is
> guaranteed to be running in the main thread where block_job_completed()
> may be called safely.  This is very subtle so it's nice to eliminate the
> assumption by unifying the early exit with the normal exit code path.
>
> Cc: Fam Zheng 
> Cc: Jeff Cody 
> Signed-off-by: Stefan Hajnoczi 

I had a slightly simpler version of this in my intermediate block
streaming series in case you're interested:

   https://patchwork.ozlabs.org/patch/471881/

But this one looks good to me too, so:

Reviewed-by: Alberto Garcia 

Berto



Re: [Qemu-block] [PATCH v8 03/11] block: never cancel a streaming job without running stream_complete()

2015-10-16 Thread Stefan Hajnoczi
On Tue, Jun 23, 2015 at 12:32:18AM +0300, Alberto Garcia wrote:
> We need to call stream_complete() in order to do all the necessary
> clean-ups, even if there's an early failure. At the moment it's only
> useful to make sure that s->backing_file_str is not leaked, but it
> will become more important as we introduce support for streaming to
> any intermediate node.
> 
> Signed-off-by: Alberto Garcia 
> Reviewed-by: Max Reitz 
> ---
>  block/stream.c | 11 ++-
>  1 file changed, 6 insertions(+), 5 deletions(-)

Reviewed-by: Stefan Hajnoczi 



Re: [Qemu-block] [Qemu-devel] [Xen-devel] Question about xen disk unplug support for ahci missed in qemu

2015-10-16 Thread Laszlo Ersek
On 10/16/15 04:38, Kevin O'Connor wrote:
> On Fri, Oct 16, 2015 at 01:10:54AM +0200, Laszlo Ersek wrote:
>> On 10/14/15 13:27, Ian Campbell wrote:
>>> On Wed, 2015-10-14 at 12:06 +0100, Stefano Stabellini wrote:
> Can't you just teach SeaBIOS how to deal with your PV disks and then
> only add that to your VM and forget about IDE/AHCI? I mean, that's how
> it's done for virtio-blk, and it doesn't involve any insanities like
> ripping out non-hotpluggable devices.

 Teaching SeaBIOS to deal with PV disks can be done, in fact we already
 support PV disks in OVMF. It is possible to boot Windows with OVMF
 without any IDE disks (patch pending for libxl to create a VM without
 emulated IDE disks).
>>>
>>> One stumbling block in the past has been how to know when the PV drivers in
>>> the BIOS are no longer required, such that the ring can be torn down and/or
>>> the connection etc handed over to the OS driver.
> [...]
>>> AFAIK the BIOS interfaces do not have anything as reliable as that.
>>>
>>> How does virtio deal with this in the BIOS case?
>>
>> It doesn't, as far as I can tell.
>>
>> I don't think it has to, though! On a BIOS box, you can always boot DOS,
>> or another operating system that continues to use the BIOS interfaces
>> forever. (Same as if you never call ExitBootServices() in UEFI.)
>>
>> Given that no starter pistol gets fired between the firmware and the OS
>> on such a platform, they must always respect each other. I guess this
>> could occur through the E820 map, or some such.
> 
> One can use the "ACPI enable" SMI event to detect this if they really
> wanted to.  In SeaBIOS one could do this from
> src/fw/smm.c:handle_smi() - however, no other drivers need this
> notification today and it would be a bit ugly to have to handle it
> from an SMI.  (Assuming Xen were to support SMIs.)
> 
>> No clue in what kind of E820 memory SeaBIOS allocates the virtio rings,
>> but I guess the Linux kernel stays away from those areas until it's past
>> device probing and binding.
> 
> In SeaBIOS, the virtio memory is allocated from reserved memory.

Perfect! That gives Xen drivers precedence to do the same.

>  (See
> the memalign_high() call in src/hw/virtio-pci.c - the "high" memory
> zone is taken from reserved memory:
> http://seabios.org/Memory_Model#Memory_available_during_initialization
> )
> 
> What's the reason for the "stumbling block" that requires the BIOS to
> tear down the Xen ring prior to the OS being able to replace it?  The
> BIOS disk calls are all synchronous, so the ring wont be active when
> the OS brings up its own ring.

Yes, that's an argument that works well in practice. However...

> Is there some low-level interaction
> that prevents the OS from just resetting the ring prior to enabling
> it?

the assumption was that the ring would be placed into normal memory. If
GRUB or the kernel overwrote the memory (reallocating the same pages for
completely unrelated purposes) that used to contain the ring while
SeaBIOS was serving requests, the hypervisor would be allowed to notice
and act upon writes to those pages *without* any explicit "kick" (=
guest-to-host notification). The hypervisor is allowed to look at the
ring any time it wishes, so guest code uses barriers while populating
the ring, and kicks the hypervisor "just in case it's not looking right
now".

But if the firmware's ring is in reserved memory, then the OS will stay
away forever. That's great -- it answers the question for virtio, and
should also guide a Xen PV driver implementation.

Thanks!
Laszlo

> 
> -Kevin
> 




[Qemu-block] [PATCH v6 2/4] quorum: implement bdrv_add_child() and bdrv_del_child()

2015-10-16 Thread Wen Congyang
Signed-off-by: Wen Congyang 
Signed-off-by: zhanghailiang 
Signed-off-by: Gonglei 
---
 block.c   |  6 +++---
 block/quorum.c| 59 +--
 include/block/block.h |  3 +++
 3 files changed, 63 insertions(+), 5 deletions(-)

diff --git a/block.c b/block.c
index bcba22f..d96d2cc 100644
--- a/block.c
+++ b/block.c
@@ -1079,9 +1079,9 @@ static int bdrv_fill_options(QDict **options, const char 
**pfilename,
 return 0;
 }
 
-static BdrvChild *bdrv_attach_child(BlockDriverState *parent_bs,
-BlockDriverState *child_bs,
-const BdrvChildRole *child_role)
+BdrvChild *bdrv_attach_child(BlockDriverState *parent_bs,
+ BlockDriverState *child_bs,
+ const BdrvChildRole *child_role)
 {
 BdrvChild *child = g_new(BdrvChild, 1);
 *child = (BdrvChild) {
diff --git a/block/quorum.c b/block/quorum.c
index c4cda32..a9e499c 100644
--- a/block/quorum.c
+++ b/block/quorum.c
@@ -875,9 +875,9 @@ static int quorum_open(BlockDriverState *bs, QDict 
*options, int flags,
 ret = -EINVAL;
 goto exit;
 }
-if (s->num_children < 2) {
+if (s->num_children < 1) {
 error_setg(_err,
-   "Number of provided children must be greater than 1");
+   "Number of provided children must be 1 or more");
 ret = -EINVAL;
 goto exit;
 }
@@ -997,6 +997,58 @@ static void quorum_attach_aio_context(BlockDriverState *bs,
 }
 }
 
+static void quorum_add_child(BlockDriverState *bs, BlockDriverState *child_bs,
+ Error **errp)
+{
+BDRVQuorumState *s = bs->opaque;
+BdrvChild *child;
+
+bdrv_drain(bs);
+
+assert(s->num_children <= INT_MAX / sizeof(BdrvChild *));
+if (s->num_children == INT_MAX / sizeof(BdrvChild *)) {
+error_setg(errp, "Too many children");
+return;
+}
+s->children = g_renew(BdrvChild *, s->children, s->num_children + 1);
+
+bdrv_ref(child_bs);
+child = bdrv_attach_child(bs, child_bs, _format);
+s->children[s->num_children++] = child;
+}
+
+static void quorum_del_child(BlockDriverState *bs, BlockDriverState *child_bs,
+ Error **errp)
+{
+BDRVQuorumState *s = bs->opaque;
+BdrvChild *child;
+int i;
+
+for (i = 0; i < s->num_children; i++) {
+if (s->children[i]->bs == child_bs) {
+break;
+}
+}
+
+/* we have checked it in bdrv_del_child() */
+assert(i < s->num_children);
+child = s->children[i];
+
+if (s->num_children <= s->threshold) {
+error_setg(errp,
+"The number of children cannot be lower than the vote threshold 
%d",
+s->threshold);
+return;
+}
+
+bdrv_drain(bs);
+/* We can safely remove this child now */
+memmove(>children[i], >children[i + 1],
+(s->num_children - i - 1) * sizeof(void *));
+s->children = g_renew(BdrvChild *, s->children, --s->num_children);
+bdrv_unref_child(bs, child);
+}
+
 static void quorum_refresh_filename(BlockDriverState *bs)
 {
 BDRVQuorumState *s = bs->opaque;
@@ -1052,6 +1104,9 @@ static BlockDriver bdrv_quorum = {
 .bdrv_detach_aio_context= quorum_detach_aio_context,
 .bdrv_attach_aio_context= quorum_attach_aio_context,
 
+.bdrv_add_child = quorum_add_child,
+.bdrv_del_child = quorum_del_child,
+
 .is_filter  = true,
 .bdrv_recurse_is_first_non_filter   = quorum_recurse_is_first_non_filter,
 };
diff --git a/include/block/block.h b/include/block/block.h
index ef84c87..f5bfb6b 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -516,6 +516,9 @@ void bdrv_disable_copy_on_read(BlockDriverState *bs);
 void bdrv_ref(BlockDriverState *bs);
 void bdrv_unref(BlockDriverState *bs);
 void bdrv_unref_child(BlockDriverState *parent, BdrvChild *child);
+BdrvChild *bdrv_attach_child(BlockDriverState *parent_bs,
+ BlockDriverState *child_bs,
+ const BdrvChildRole *child_role);
 
 bool bdrv_op_is_blocked(BlockDriverState *bs, BlockOpType op, Error **errp);
 void bdrv_op_block(BlockDriverState *bs, BlockOpType op, Error *reason);
-- 
2.4.3




Re: [Qemu-block] [Qemu-devel] [Xen-devel] Question about xen disk unplug support for ahci missed in qemu

2015-10-16 Thread Stefano Stabellini
On Fri, 16 Oct 2015, Laszlo Ersek wrote:
> On 10/16/15 11:06, Stefano Stabellini wrote:
> > On Thu, 15 Oct 2015, Kevin O'Connor wrote:
> >> On Fri, Oct 16, 2015 at 01:10:54AM +0200, Laszlo Ersek wrote:
> >>> On 10/14/15 13:27, Ian Campbell wrote:
>  On Wed, 2015-10-14 at 12:06 +0100, Stefano Stabellini wrote:
> >> Can't you just teach SeaBIOS how to deal with your PV disks and then
> >> only add that to your VM and forget about IDE/AHCI? I mean, that's how
> >> it's done for virtio-blk, and it doesn't involve any insanities like
> >> ripping out non-hotpluggable devices.
> >
> > Teaching SeaBIOS to deal with PV disks can be done, in fact we already
> > support PV disks in OVMF. It is possible to boot Windows with OVMF
> > without any IDE disks (patch pending for libxl to create a VM without
> > emulated IDE disks).
> 
>  One stumbling block in the past has been how to know when the PV drivers 
>  in
>  the BIOS are no longer required, such that the ring can be torn down 
>  and/or
>  the connection etc handed over to the OS driver.
> >> [...]
>  AFAIK the BIOS interfaces do not have anything as reliable as that.
> 
>  How does virtio deal with this in the BIOS case?
> >>>
> >>> It doesn't, as far as I can tell.
> >>>
> >>> I don't think it has to, though! On a BIOS box, you can always boot DOS,
> >>> or another operating system that continues to use the BIOS interfaces
> >>> forever. (Same as if you never call ExitBootServices() in UEFI.)
> >>>
> >>> Given that no starter pistol gets fired between the firmware and the OS
> >>> on such a platform, they must always respect each other. I guess this
> >>> could occur through the E820 map, or some such.
> >>
> >> One can use the "ACPI enable" SMI event to detect this if they really
> >> wanted to.  In SeaBIOS one could do this from
> >> src/fw/smm.c:handle_smi() - however, no other drivers need this
> >> notification today and it would be a bit ugly to have to handle it
> >> from an SMI.  (Assuming Xen were to support SMIs.)
> >>
> >>> No clue in what kind of E820 memory SeaBIOS allocates the virtio rings,
> >>> but I guess the Linux kernel stays away from those areas until it's past
> >>> device probing and binding.
> >>
> >> In SeaBIOS, the virtio memory is allocated from reserved memory.  (See
> >> the memalign_high() call in src/hw/virtio-pci.c - the "high" memory
> >> zone is taken from reserved memory:
> >> http://seabios.org/Memory_Model#Memory_available_during_initialization
> >> )
> >>
> >> What's the reason for the "stumbling block" that requires the BIOS to
> >> tear down the Xen ring prior to the OS being able to replace it?  The
> >> BIOS disk calls are all synchronous, so the ring wont be active when
> >> the OS brings up its own ring.  Is there some low-level interaction
> >> that prevents the OS from just resetting the ring prior to enabling
> >> it?
> > 
> > Xen only exports one PV disk interface for each disk to the guest, and
> > each PV interface only supports one frontend -- only SeaBIOS or the OS
> > can be connected to one PV disk, not both. In the case of OVMF, we
> > handle that by disconnecting the PV frontend in OVMF when
> > ExitBootServices is called, so that the OS driver can reconnect later.
> 
> Does the XenBus protocol support a device reset operation, regardless of
> what state the device is currently in? (I don't remember all the state
> transitions any longer, sorry.)

The PV block protocol doesn't unfortunately. At least the block backend
in QEMU doesn't support it.



[Qemu-block] [PATCH v3 06/12] block: Add "drained begin/end" for transactional external snapshot

2015-10-16 Thread Fam Zheng
This ensures the atomicity of the transaction by avoiding processing of
external requests such as those from ioeventfd.

Signed-off-by: Fam Zheng 
---
 blockdev.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 32b04b4..90f1e15 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1479,6 +1479,7 @@ static void external_snapshot_prepare(BlkTransactionState 
*common,
 /* Acquire AioContext now so any threads operating on old_bs stop */
 state->aio_context = bdrv_get_aio_context(state->old_bs);
 aio_context_acquire(state->aio_context);
+bdrv_drained_begin(state->old_bs);
 
 if (!bdrv_is_inserted(state->old_bs)) {
 error_setg(errp, QERR_DEVICE_HAS_NO_MEDIUM, device);
@@ -1548,8 +1549,6 @@ static void external_snapshot_commit(BlkTransactionState 
*common)
  * don't want to abort all of them if one of them fails the reopen */
 bdrv_reopen(state->new_bs, state->new_bs->open_flags & ~BDRV_O_RDWR,
 NULL);
-
-aio_context_release(state->aio_context);
 }
 
 static void external_snapshot_abort(BlkTransactionState *common)
@@ -1559,7 +1558,14 @@ static void external_snapshot_abort(BlkTransactionState 
*common)
 if (state->new_bs) {
 bdrv_unref(state->new_bs);
 }
+}
+
+static void external_snapshot_clean(BlkTransactionState *common)
+{
+ExternalSnapshotState *state =
+ DO_UPCAST(ExternalSnapshotState, common, common);
 if (state->aio_context) {
+bdrv_drained_end(state->old_bs);
 aio_context_release(state->aio_context);
 }
 }
@@ -1724,6 +1730,7 @@ static const BdrvActionOps actions[] = {
 .prepare  = external_snapshot_prepare,
 .commit   = external_snapshot_commit,
 .abort = external_snapshot_abort,
+.clean = external_snapshot_clean,
 },
 [TRANSACTION_ACTION_KIND_DRIVE_BACKUP] = {
 .instance_size = sizeof(DriveBackupState),
-- 
2.4.3




[Qemu-block] [PATCH v3 05/12] block: Introduce "drained begin/end" API

2015-10-16 Thread Fam Zheng
The semantics is that after bdrv_drained_begin(bs), bs will not get new external
requests until the matching bdrv_drained_end(bs).

Signed-off-by: Fam Zheng 
---
 block.c   |  2 ++
 block/io.c| 17 +
 include/block/block.h | 19 +++
 include/block/block_int.h |  2 ++
 4 files changed, 40 insertions(+)

diff --git a/block.c b/block.c
index 1f90b47..9b28a07 100644
--- a/block.c
+++ b/block.c
@@ -2058,6 +2058,8 @@ static void bdrv_move_feature_fields(BlockDriverState 
*bs_dest,
 bs_dest->device_list = bs_src->device_list;
 bs_dest->blk = bs_src->blk;
 
+bs_dest->quiesce_counter = bs_src->quiesce_counter;
+
 memcpy(bs_dest->op_blockers, bs_src->op_blockers,
sizeof(bs_dest->op_blockers));
 }
diff --git a/block/io.c b/block/io.c
index 17293c3..a331a19 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2618,3 +2618,20 @@ void bdrv_flush_io_queue(BlockDriverState *bs)
 }
 bdrv_start_throttled_reqs(bs);
 }
+
+void bdrv_drained_begin(BlockDriverState *bs)
+{
+if (!bs->quiesce_counter++) {
+aio_disable_external(bdrv_get_aio_context(bs));
+}
+bdrv_drain(bs);
+}
+
+void bdrv_drained_end(BlockDriverState *bs)
+{
+assert(bs->quiesce_counter > 0);
+if (--bs->quiesce_counter > 0) {
+return;
+}
+aio_enable_external(bdrv_get_aio_context(bs));
+}
diff --git a/include/block/block.h b/include/block/block.h
index 2dd6630..c4f6eef 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -619,4 +619,23 @@ void bdrv_flush_io_queue(BlockDriverState *bs);
 
 BlockAcctStats *bdrv_get_stats(BlockDriverState *bs);
 
+/**
+ * bdrv_drained_begin:
+ *
+ * Begin a quiesced section for exclusive access to the BDS, by disabling
+ * external request sources including NBD server and device model. Note that
+ * this doesn't block timers or coroutines from submitting more requests, which
+ * means block_job_pause is still necessary.
+ *
+ * This function can be recursive.
+ */
+void bdrv_drained_begin(BlockDriverState *bs);
+
+/**
+ * bdrv_drained_end:
+ *
+ * End a quiescent section started by bdrv_drained_begin().
+ */
+void bdrv_drained_end(BlockDriverState *bs);
+
 #endif
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 14ad4c3..7c58221 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -456,6 +456,8 @@ struct BlockDriverState {
 /* threshold limit for writes, in bytes. "High water mark". */
 uint64_t write_threshold_offset;
 NotifierWithReturn write_threshold_notifier;
+
+int quiesce_counter;
 };
 
 
-- 
2.4.3




Re: [Qemu-block] [PATCH v2 3/3] virtio-blk: switch off scsi-passthrough by default

2015-10-16 Thread Christian Borntraeger
Am 16.10.2015 um 12:25 schrieb Cornelia Huck:
> Devices that are compliant with virtio-1 do not support scsi
> passthrough any more (and it has not been a recommended setup
> anyway for quite some time). To avoid having to switch it off
> explicitly in newer qemus that turn on virtio-1 by default, let's
> switch the default to scsi=false for 2.5.
> 
> Signed-off-by: Cornelia Huck 
> ---
>  hw/block/virtio-blk.c | 2 +-
>  include/hw/compat.h   | 6 +-
>  2 files changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> index 8beb26b..999dbd7 100644
> --- a/hw/block/virtio-blk.c
> +++ b/hw/block/virtio-blk.c
> @@ -975,7 +975,7 @@ static Property virtio_blk_properties[] = {
>  DEFINE_PROP_STRING("serial", VirtIOBlock, conf.serial),
>  DEFINE_PROP_BIT("config-wce", VirtIOBlock, conf.config_wce, 0, true),
>  #ifdef __linux__
> -DEFINE_PROP_BIT("scsi", VirtIOBlock, conf.scsi, 0, true),
> +DEFINE_PROP_BIT("scsi", VirtIOBlock, conf.scsi, 0, false),
>  #endif
>  DEFINE_PROP_BIT("request-merging", VirtIOBlock, conf.request_merging, 0,
>  true),
> diff --git a/include/hw/compat.h b/include/hw/compat.h
> index 095de5d..93e71af 100644
> --- a/include/hw/compat.h
> +++ b/include/hw/compat.h
> @@ -2,7 +2,11 @@
>  #define HW_COMPAT_H
> 
>  #define HW_COMPAT_2_4 \
> -/* empty */
> +{\
> +.driver   = "virtio-blk-device",\
> +.property = "scsi",\
> +.value= "true",\

does that work?

If yes, would it make sense to convert the things in HW_COMPAT_2_3 from
pci to device, e.g.


{\
-   .driver   = "virtio-blk-pci",\
+   .driver   = "virtio-blk-device",\
.property = "any_layout",\
.value= "off",\
...




[Qemu-block] [PATCH v6 4/4] hmp: add monitor command to add/remove a child

2015-10-16 Thread Wen Congyang
The new command is blockdev_change. It does the same
thing as the QMP command x-blockdev-change.

Signed-off-by: Wen Congyang 
Signed-off-by: zhanghailiang 
Signed-off-by: Gonglei 
Cc: Luiz Capitulino 
---
 hmp-commands.hx | 17 +
 hmp.c   | 38 ++
 hmp.h   |  1 +
 3 files changed, 56 insertions(+)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index 3a4ae39..57475cc 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -193,6 +193,23 @@ actions (drive options rerror, werror).
 ETEXI
 
 {
+.name   = "blockdev_change",
+.args_type  = "op:s,parent:B,child:B?,node:?",
+.params = "operation parent [child] [node]",
+.help   = "Dynamic reconfigure the block driver state graph",
+.mhandler.cmd = hmp_blockdev_change,
+},
+
+STEXI
+@item blockdev_change @var{operation} @var{parent} [@var{child}] [@var{node}]
+@findex blockdev_change
+Dynamic reconfigure the block driver state graph. It can be used to
+add, remove, insert, replace a block driver state. Currently only
+the Quorum driver implements this feature to add and remove its child.
+This is useful to fix a broken quorum child.
+ETEXI
+
+{
 .name   = "change",
 .args_type  = "device:B,target:F,arg:s?",
 .params = "device filename [format]",
diff --git a/hmp.c b/hmp.c
index 5048eee..fc58ae2 100644
--- a/hmp.c
+++ b/hmp.c
@@ -2346,3 +2346,41 @@ void hmp_rocker_of_dpa_groups(Monitor *mon, const QDict 
*qdict)
 
 qapi_free_RockerOfDpaGroupList(list);
 }
+
+void hmp_blockdev_change(Monitor *mon, const QDict *qdict)
+{
+const char *operation = qdict_get_str(qdict, "op");
+const char *parent = qdict_get_str(qdict, "parent");
+const char *child = qdict_get_try_str(qdict, "child");
+const char *node = qdict_get_try_str(qdict, "node");
+ChangeOperation op = CHANGE_OPERATION_ADD;
+Error *local_err = NULL;
+bool has_child = !!child;
+bool has_node = !!node;
+
+while (ChangeOperation_lookup[op] != NULL) {
+if (strcmp(ChangeOperation_lookup[op], operation) == 0) {
+break;
+}
+op++;
+}
+
+if (ChangeOperation_lookup[op] == NULL) {
+error_setg(_err, "Invalid parameter '%s'", "operation");
+goto out;
+}
+
+/*
+ * FIXME: we must specify the parameter child, otherwise,
+ * we can't specify the parameter node.
+ */
+if (op == CHANGE_OPERATION_ADD) {
+has_child = false;
+}
+
+qmp_x_blockdev_change(op, parent, has_child, child,
+  has_node, node, _err);
+
+out:
+hmp_handle_error(mon, _err);
+}
diff --git a/hmp.h b/hmp.h
index 81656c3..80a1faf 100644
--- a/hmp.h
+++ b/hmp.h
@@ -130,5 +130,6 @@ void hmp_rocker(Monitor *mon, const QDict *qdict);
 void hmp_rocker_ports(Monitor *mon, const QDict *qdict);
 void hmp_rocker_of_dpa_flows(Monitor *mon, const QDict *qdict);
 void hmp_rocker_of_dpa_groups(Monitor *mon, const QDict *qdict);
+void hmp_blockdev_change(Monitor *mon, const QDict *qdict);
 
 #endif
-- 
2.4.3




[Qemu-block] [PATCH v6 0/4] qapi: child add/delete support

2015-10-16 Thread Wen Congyang
If quorum's child is broken, we can use mirror job to replace it.
But sometimes, the user only need to remove the broken child, and
add it later when the problem is fixed.

It is based on the Kevin's bdrv_swap() related patch:
http://lists.nongnu.org/archive/html/qemu-devel/2015-10/msg02152.html

ChangLog:
v6:
1. Use a single qmp command x-blockdev-change to replace x-blockdev-child-add
   and x-blockdev-child-delete
v5:
1. Address Eric Blake's comments
v4:
1. drop nbd driver's implementation. We can use human-monitor-command
   to do it.
2. Rename the command name.
v3:
1. Don't open BDS in bdrv_add_child(). Use the existing BDS which is
   created by the QMP command blockdev-add.
2. The driver NBD can support filename, path, host:port now.
v2:
1. Use bdrv_get_device_or_node_name() instead of new function
   bdrv_get_id_or_node_name()
2. Update the error message
3. Update the documents in block-core.json


Wen Congyang (4):
  Add new block driver interface to add/delete a BDS's child
  quorum: implement bdrv_add_child() and bdrv_del_child()
  qmp: add monitor command to add/remove a child
  hmp: add monitor command to add/remove a child

 block.c   | 56 --
 block/quorum.c| 59 ++--
 blockdev.c| 76 +++
 hmp-commands.hx   | 17 +++
 hmp.c | 38 
 hmp.h |  1 +
 include/block/block.h |  8 +
 include/block/block_int.h |  5 
 qapi/block-core.json  | 40 +
 qmp-commands.hx   | 50 +++
 10 files changed, 345 insertions(+), 5 deletions(-)

-- 
2.4.3




[Qemu-block] [PATCH v3 07/12] block: Add "drained begin/end" for transactional backup

2015-10-16 Thread Fam Zheng
This ensures the atomicity of the transaction by avoiding processing of
external requests such as those from ioeventfd.

Move the assignment to state->bs up right after bdrv_drained_begin, so
that we can use it in the clean callback. The abort callback will still
check bs->job and state->job, so it's OK.

Signed-off-by: Fam Zheng 
---
 blockdev.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/blockdev.c b/blockdev.c
index 90f1e15..232bc21 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1599,6 +1599,8 @@ static void drive_backup_prepare(BlkTransactionState 
*common, Error **errp)
 /* AioContext is released in .clean() */
 state->aio_context = bdrv_get_aio_context(bs);
 aio_context_acquire(state->aio_context);
+bdrv_drained_begin(bs);
+state->bs = bs;
 
 qmp_drive_backup(backup->device, backup->target,
  backup->has_format, backup->format,
@@ -1614,7 +1616,6 @@ static void drive_backup_prepare(BlkTransactionState 
*common, Error **errp)
 return;
 }
 
-state->bs = bs;
 state->job = state->bs->job;
 }
 
@@ -1634,6 +1635,7 @@ static void drive_backup_clean(BlkTransactionState 
*common)
 DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
 
 if (state->aio_context) {
+bdrv_drained_end(state->bs);
 aio_context_release(state->aio_context);
 }
 }
-- 
2.4.3




[Qemu-block] [PATCH v3 02/12] nbd: Mark fd handlers client type as "external"

2015-10-16 Thread Fam Zheng
So we could distinguish it from internal used fds, thus avoid handling
unwanted events in nested aio polls.

Signed-off-by: Fam Zheng 
---
 nbd.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/nbd.c b/nbd.c
index 32a1f66..b599e62 100644
--- a/nbd.c
+++ b/nbd.c
@@ -1446,7 +1446,7 @@ static void nbd_set_handlers(NBDClient *client)
 {
 if (client->exp && client->exp->ctx) {
 aio_set_fd_handler(client->exp->ctx, client->sock,
-   false,
+   true,
client->can_read ? nbd_read : NULL,
client->send_coroutine ? nbd_restart_write : NULL,
client);
@@ -1457,7 +1457,7 @@ static void nbd_unset_handlers(NBDClient *client)
 {
 if (client->exp && client->exp->ctx) {
 aio_set_fd_handler(client->exp->ctx, client->sock,
-   false, NULL, NULL, NULL);
+   true, NULL, NULL, NULL);
 }
 }
 
-- 
2.4.3




[Qemu-block] [PATCH] virtio-blk: switch off scsi-passthrough by default

2015-10-16 Thread Cornelia Huck
Devices that are compliant with virtio-1 do not support scsi
passthrough any more (and it has not been a recommended setup
anyway for quite some time). To avoid having to switch it off
explicitly in newer qemus that turn on virtio-1 by default, let's
switch the default to scsi=false for 2.5.

Signed-off-by: Cornelia Huck 
---

Note: this is based upon my s390x series from 10/14, as I otherwise
could not test compat on s390x.

---
 hw/block/virtio-blk.c  | 2 +-
 hw/s390x/s390-virtio-ccw.c | 4 
 include/hw/compat.h| 6 +-
 3 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 8beb26b..999dbd7 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -975,7 +975,7 @@ static Property virtio_blk_properties[] = {
 DEFINE_PROP_STRING("serial", VirtIOBlock, conf.serial),
 DEFINE_PROP_BIT("config-wce", VirtIOBlock, conf.config_wce, 0, true),
 #ifdef __linux__
-DEFINE_PROP_BIT("scsi", VirtIOBlock, conf.scsi, 0, true),
+DEFINE_PROP_BIT("scsi", VirtIOBlock, conf.scsi, 0, false),
 #endif
 DEFINE_PROP_BIT("request-merging", VirtIOBlock, conf.request_merging, 0,
 true),
diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index 20883ff..2c72358 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -272,6 +272,10 @@ static const TypeInfo ccw_machine_info = {
 .driver   = "vhost-scsi-ccw",\
 .property = "max_revision",\
 .value= "0",\
+},{\
+.driver   = "virtio-blk-ccw",\
+.property = "scsi",\
+.value= "true",\
 },
 
 static void ccw_machine_2_4_class_init(ObjectClass *oc, void *data)
diff --git a/include/hw/compat.h b/include/hw/compat.h
index 095de5d..bbf1ab2 100644
--- a/include/hw/compat.h
+++ b/include/hw/compat.h
@@ -2,7 +2,11 @@
 #define HW_COMPAT_H
 
 #define HW_COMPAT_2_4 \
-/* empty */
+{\
+.driver   = "virtio-blk-pci",\
+.property = "scsi",\
+.value= "true",\
+},
 
 #define HW_COMPAT_2_3 \
 {\
-- 
2.3.9




Re: [Qemu-block] [PATCH] virtio-blk: switch off scsi-passthrough by default

2015-10-16 Thread Paolo Bonzini


On 16/10/2015 10:54, Cornelia Huck wrote:
> On Fri, 16 Oct 2015 10:46:31 +0200
> Paolo Bonzini  wrote:
> 
>>
>>
>> On 16/10/2015 10:41, Paolo Bonzini wrote:
>>>
>>>
>>> On 16/10/2015 10:40, Cornelia Huck wrote:
 --- a/hw/s390x/s390-virtio-ccw.c
 +++ b/hw/s390x/s390-virtio-ccw.c
 @@ -272,6 +272,10 @@ static const TypeInfo ccw_machine_info = {
  .driver   = "vhost-scsi-ccw",\
  .property = "max_revision",\
  .value= "0",\
 +},{\
 +.driver   = "virtio-blk-ccw",\
 +.property = "scsi",\
 +.value= "true",\
  },
  
  static void ccw_machine_2_4_class_init(ObjectClass *oc, void *data)
 diff --git a/include/hw/compat.h b/include/hw/compat.h
 index 095de5d..bbf1ab2 100644
 --- a/include/hw/compat.h
 +++ b/include/hw/compat.h
 @@ -2,7 +2,11 @@
  #define HW_COMPAT_H
  
  #define HW_COMPAT_2_4 \
 -/* empty */
 +{\
 +.driver   = "virtio-blk-pci",\
 +.property = "scsi",\
 +.value= "true",\
 +},
  
  #define HW_COMPAT_2_3 \
  {\

>>>
>>> s390 should use HW_COMPAT_2_4 as well.  Otherwise looks good.
>>
>> Hmm, ECONCISE probably.  Sorry.
>>
>> I mean that virtio-blk-ccw's scsi property should IMO go in
>> HW_COMPAT_2_4 as well.
> 
> I was wondering about the semantics of HW_COMPAT_*: Does any hw-related
> compat stuff go in there, even if it is architecture specific (like
> ccw)?

It depends.  For stuff like your max_revision I guess it makes sense to
keep it in the board.  Similarly, x86 CPU flags go in PC_COMPAT_*.  But
for stuff like virtio-blk-ccw, it makes some sense to keep it close to
virtio-blk-pci.

>> But I noticed now that:
>>
>> * if it works it would be even better if the compat property used
>> virtio-blk-device;
> 
> Hm. Previous virtio-compat always treated -pci explicitly, but we only
> gained s390x compat handling with 2.4, so it didn't really matter. But
> if it works, this is the saner approach.

Yes, I agree.

Paolo



Re: [Qemu-block] [PATCH v2 2/2] aio: Introduce aio-epoll.c

2015-10-16 Thread Stefan Hajnoczi
On Tue, Oct 13, 2015 at 07:10:55PM +0800, Fam Zheng wrote:
> +static bool aio_epoll_try_enable(AioContext *ctx)
> +{
> +AioHandler *node;
> +struct epoll_event event;
> +if (!ctx->epoll_available) {
> +return false;
> +}

Why check this here since aio_epoll_check_poll() already checks it?

> +static int aio_epoll(AioContext *ctx, GPollFD *pfds,
> + unsigned npfd, int64_t timeout)
> +{
> +AioHandler *node;
> +int i, ret = 0;
> +struct epoll_event events[128];

The strategy is to support up to 128 events per epoll_wait(2) call and
then wait for the next event loop iteration to harvest any remaining
events?

I just want to make sure I understand how this constant affects epoll
behavior.



Re: [Qemu-block] [Qemu-devel] [Xen-devel] Question about xen disk unplug support for ahci missed in qemu

2015-10-16 Thread Laszlo Ersek
On 10/16/15 11:06, Stefano Stabellini wrote:
> On Thu, 15 Oct 2015, Kevin O'Connor wrote:
>> On Fri, Oct 16, 2015 at 01:10:54AM +0200, Laszlo Ersek wrote:
>>> On 10/14/15 13:27, Ian Campbell wrote:
 On Wed, 2015-10-14 at 12:06 +0100, Stefano Stabellini wrote:
>> Can't you just teach SeaBIOS how to deal with your PV disks and then
>> only add that to your VM and forget about IDE/AHCI? I mean, that's how
>> it's done for virtio-blk, and it doesn't involve any insanities like
>> ripping out non-hotpluggable devices.
>
> Teaching SeaBIOS to deal with PV disks can be done, in fact we already
> support PV disks in OVMF. It is possible to boot Windows with OVMF
> without any IDE disks (patch pending for libxl to create a VM without
> emulated IDE disks).

 One stumbling block in the past has been how to know when the PV drivers in
 the BIOS are no longer required, such that the ring can be torn down and/or
 the connection etc handed over to the OS driver.
>> [...]
 AFAIK the BIOS interfaces do not have anything as reliable as that.

 How does virtio deal with this in the BIOS case?
>>>
>>> It doesn't, as far as I can tell.
>>>
>>> I don't think it has to, though! On a BIOS box, you can always boot DOS,
>>> or another operating system that continues to use the BIOS interfaces
>>> forever. (Same as if you never call ExitBootServices() in UEFI.)
>>>
>>> Given that no starter pistol gets fired between the firmware and the OS
>>> on such a platform, they must always respect each other. I guess this
>>> could occur through the E820 map, or some such.
>>
>> One can use the "ACPI enable" SMI event to detect this if they really
>> wanted to.  In SeaBIOS one could do this from
>> src/fw/smm.c:handle_smi() - however, no other drivers need this
>> notification today and it would be a bit ugly to have to handle it
>> from an SMI.  (Assuming Xen were to support SMIs.)
>>
>>> No clue in what kind of E820 memory SeaBIOS allocates the virtio rings,
>>> but I guess the Linux kernel stays away from those areas until it's past
>>> device probing and binding.
>>
>> In SeaBIOS, the virtio memory is allocated from reserved memory.  (See
>> the memalign_high() call in src/hw/virtio-pci.c - the "high" memory
>> zone is taken from reserved memory:
>> http://seabios.org/Memory_Model#Memory_available_during_initialization
>> )
>>
>> What's the reason for the "stumbling block" that requires the BIOS to
>> tear down the Xen ring prior to the OS being able to replace it?  The
>> BIOS disk calls are all synchronous, so the ring wont be active when
>> the OS brings up its own ring.  Is there some low-level interaction
>> that prevents the OS from just resetting the ring prior to enabling
>> it?
> 
> Xen only exports one PV disk interface for each disk to the guest, and
> each PV interface only supports one frontend -- only SeaBIOS or the OS
> can be connected to one PV disk, not both. In the case of OVMF, we
> handle that by disconnecting the PV frontend in OVMF when
> ExitBootServices is called, so that the OS driver can reconnect later.

Does the XenBus protocol support a device reset operation, regardless of
what state the device is currently in? (I don't remember all the state
transitions any longer, sorry.)

Thanks
Laszlo




Re: [Qemu-block] [PATCH v2 06/22] block: Add "supports_stats" field to BlockStats

2015-10-16 Thread Alberto Garcia
On Thu 15 Oct 2015 04:58:22 PM CEST, Stefan Hajnoczi wrote:
>> > If I/O accounting isn't being used then all fields will be 0?
>> 
>> Yes, but there's no way to tell if that happens because I/O
>> accounting is not supported or because there hasn't been any I/O yet.
>> 
>> There's one additional problem: this patch assumes that accounting is
>> supported if this BDS is attached to a BlockBackend. But we don't
>> know if the device model supports accounting or not, I still need to
>> figure out what's the best way to do it.
>
> Is there a corresponding libvirt patch or why does it matter whether
> the QMP client can detect whether blockstats are available?

I'm thinking that keeping this patch as it is now is probably not very
useful.

Block statistics are kept in the BlockBackend, so the only BDS that is
going to have data != 0 when you call query-blockstats is the topmost
one. There's probably no need to have an additional flag for this.

If you disconnect a BlockBackend from a device model that implements
accounting and then connect it to one that does not, there's no way for
the client to know that. That's probably worth exposing in the API, but
this patch does not do that yet, so I think we can skip it for now.

Berto



[Qemu-block] [PATCH v3 04/12] aio: introduce aio_{disable, enable}_external

2015-10-16 Thread Fam Zheng
Signed-off-by: Fam Zheng 
---
 aio-posix.c |  3 ++-
 aio-win32.c |  3 ++-
 include/block/aio.h | 37 +
 3 files changed, 41 insertions(+), 2 deletions(-)

diff --git a/aio-posix.c b/aio-posix.c
index f0f9122..0467f23 100644
--- a/aio-posix.c
+++ b/aio-posix.c
@@ -261,7 +261,8 @@ bool aio_poll(AioContext *ctx, bool blocking)
 
 /* fill pollfds */
 QLIST_FOREACH(node, >aio_handlers, node) {
-if (!node->deleted && node->pfd.events) {
+if (!node->deleted && node->pfd.events
+&& aio_node_check(ctx, node->is_external)) {
 add_pollfd(node);
 }
 }
diff --git a/aio-win32.c b/aio-win32.c
index 3110d85..43c4c79 100644
--- a/aio-win32.c
+++ b/aio-win32.c
@@ -309,7 +309,8 @@ bool aio_poll(AioContext *ctx, bool blocking)
 /* fill fd sets */
 count = 0;
 QLIST_FOREACH(node, >aio_handlers, node) {
-if (!node->deleted && node->io_notify) {
+if (!node->deleted && node->io_notify
+&& aio_node_check(ctx, node->is_external)) {
 events[count++] = event_notifier_get_handle(node->e);
 }
 }
diff --git a/include/block/aio.h b/include/block/aio.h
index 12f1141..80151d1 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -122,6 +122,8 @@ struct AioContext {
 
 /* TimerLists for calling timers - one per clock type */
 QEMUTimerListGroup tlg;
+
+int external_disable_cnt;
 };
 
 /**
@@ -375,4 +377,39 @@ static inline void aio_timer_init(AioContext *ctx,
  */
 int64_t aio_compute_timeout(AioContext *ctx);
 
+/**
+ * aio_disable_external:
+ * @ctx: the aio context
+ *
+ * Disable the furthur processing of clients.
+ */
+static inline void aio_disable_external(AioContext *ctx)
+{
+atomic_inc(>external_disable_cnt);
+}
+
+/**
+ * aio_enable_external:
+ * @ctx: the aio context
+ *
+ * Disable the processing of external clients.
+ */
+static inline void aio_enable_external(AioContext *ctx)
+{
+atomic_dec(>external_disable_cnt);
+}
+
+/**
+ * aio_node_check:
+ * @ctx: the aio context
+ * @is_external: Whether or not the checked node is an external event source.
+ *
+ * Check if the node's is_external flag is okey to be polled by the ctx at this
+ * moment. True means green light.
+ */
+static inline bool aio_node_check(AioContext *ctx, bool is_external)
+{
+return !is_external || !atomic_read(>external_disable_cnt);
+}
+
 #endif
-- 
2.4.3




Re: [Qemu-block] [PATCH v2 2/2] aio: Introduce aio-epoll.c

2015-10-16 Thread Fam Zheng
On Fri, 10/16 11:32, Stefan Hajnoczi wrote:
> On Tue, Oct 13, 2015 at 07:10:55PM +0800, Fam Zheng wrote:
> > +static bool aio_epoll_try_enable(AioContext *ctx)
> > +{
> > +AioHandler *node;
> > +struct epoll_event event;
> > +if (!ctx->epoll_available) {
> > +return false;
> > +}
> 
> Why check this here since aio_epoll_check_poll() already checks it?

You're right, it's redundant. I will remove it.

> 
> > +static int aio_epoll(AioContext *ctx, GPollFD *pfds,
> > + unsigned npfd, int64_t timeout)
> > +{
> > +AioHandler *node;
> > +int i, ret = 0;
> > +struct epoll_event events[128];
> 
> The strategy is to support up to 128 events per epoll_wait(2) call and
> then wait for the next event loop iteration to harvest any remaining
> events?

Yes.




Re: [Qemu-block] [PATCH v2 3/3] virtio-blk: switch off scsi-passthrough by default

2015-10-16 Thread Cornelia Huck
On Fri, 16 Oct 2015 12:32:52 +0200
Christian Borntraeger  wrote:

> Am 16.10.2015 um 12:25 schrieb Cornelia Huck:
> > Devices that are compliant with virtio-1 do not support scsi
> > passthrough any more (and it has not been a recommended setup
> > anyway for quite some time). To avoid having to switch it off
> > explicitly in newer qemus that turn on virtio-1 by default, let's
> > switch the default to scsi=false for 2.5.
> > 
> > Signed-off-by: Cornelia Huck 
> > ---
> >  hw/block/virtio-blk.c | 2 +-
> >  include/hw/compat.h   | 6 +-
> >  2 files changed, 6 insertions(+), 2 deletions(-)
> > 
> > diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> > index 8beb26b..999dbd7 100644
> > --- a/hw/block/virtio-blk.c
> > +++ b/hw/block/virtio-blk.c
> > @@ -975,7 +975,7 @@ static Property virtio_blk_properties[] = {
> >  DEFINE_PROP_STRING("serial", VirtIOBlock, conf.serial),
> >  DEFINE_PROP_BIT("config-wce", VirtIOBlock, conf.config_wce, 0, true),
> >  #ifdef __linux__
> > -DEFINE_PROP_BIT("scsi", VirtIOBlock, conf.scsi, 0, true),
> > +DEFINE_PROP_BIT("scsi", VirtIOBlock, conf.scsi, 0, false),
> >  #endif
> >  DEFINE_PROP_BIT("request-merging", VirtIOBlock, conf.request_merging, 
> > 0,
> >  true),
> > diff --git a/include/hw/compat.h b/include/hw/compat.h
> > index 095de5d..93e71af 100644
> > --- a/include/hw/compat.h
> > +++ b/include/hw/compat.h
> > @@ -2,7 +2,11 @@
> >  #define HW_COMPAT_H
> > 
> >  #define HW_COMPAT_2_4 \
> > -/* empty */
> > +{\
> > +.driver   = "virtio-blk-device",\
> > +.property = "scsi",\
> > +.value= "true",\
> 
> does that work?

It did for me :)

> 
> If yes, would it make sense to convert the things in HW_COMPAT_2_3 from
> pci to device, e.g.
> 
> 
> {\
> -   .driver   = "virtio-blk-pci",\
> +   .driver   = "virtio-blk-device",\
> .property = "any_layout",\
> .value= "off",\
> ...

Not sure: We don't have 2.3 compat for ccw... but would give a better
template for later changes.




Re: [Qemu-block] [PATCH] block: fix memory leak in early exit

2015-10-16 Thread Stefan Hajnoczi
On Fri, Oct 16, 2015 at 08:58:12AM +0200, Alberto Garcia wrote:
> On Thu 15 Oct 2015 05:54:27 PM CEST, Stefan Hajnoczi  
> wrote:
> > The stream block job has two early exit code paths.  They do not free
> > s->backing_file_str.
> >
> > Also, the early exits rely on the fact that the coroutine hasn't yielded
> > yet and was launched from the main thread.  Therefore the coroutine is
> > guaranteed to be running in the main thread where block_job_completed()
> > may be called safely.  This is very subtle so it's nice to eliminate the
> > assumption by unifying the early exit with the normal exit code path.
> >
> > Cc: Fam Zheng 
> > Cc: Jeff Cody 
> > Signed-off-by: Stefan Hajnoczi 
> 
> I had a slightly simpler version of this in my intermediate block
> streaming series in case you're interested:
> 
>https://patchwork.ozlabs.org/patch/471881/
> 
> But this one looks good to me too, so:
> 
> Reviewed-by: Alberto Garcia 

Kevin, please take Alberto's patch instead of mine.  The Message-ID is:

  d575a576c18d8972ac1a200c4022b39cbbce2507.1435008395.git.be...@igalia.com

BTW, I notice that Jeff isn't listed as maintainer for block/stream.c.
So according to MAINTAINERS this patch goes through you.

Stefan



[Qemu-block] [PATCH v6 3/4] qmp: add monitor command to add/remove a child

2015-10-16 Thread Wen Congyang
The new QMP command name is x-blockdev-change. It justs for adding/removing
quorum's child now, and don't support all kinds of children, all kinds of
operations, nor all block drivers. So it is experimental now.

Signed-off-by: Wen Congyang 
Signed-off-by: zhanghailiang 
Signed-off-by: Gonglei 
---
 blockdev.c   | 76 
 qapi/block-core.json | 40 +++
 qmp-commands.hx  | 50 ++
 3 files changed, 166 insertions(+)

diff --git a/blockdev.c b/blockdev.c
index 6c8cce4..72efe5d 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3086,6 +3086,82 @@ fail:
 qmp_output_visitor_cleanup(ov);
 }
 
+void qmp_x_blockdev_change(ChangeOperation op, const char *parent,
+   bool has_child, const char *child,
+   bool has_new_node, const char *new_node,
+   Error **errp)
+{
+BlockDriverState *parent_bs, *child_bs, *new_bs;
+Error *local_err = NULL;
+
+parent_bs = bdrv_lookup_bs(parent, parent, _err);
+if (!parent_bs) {
+error_propagate(errp, local_err);
+return;
+}
+
+switch(op) {
+case CHANGE_OPERATION_ADD:
+if (has_child) {
+error_setg(errp, "The operation %s doesn't support the parameter 
child",
+   ChangeOperation_lookup[op]);
+return;
+}
+if (!has_new_node) {
+error_setg(errp, "The operation %s needs the parameter new_node",
+   ChangeOperation_lookup[op]);
+return;
+}
+break;
+case CHANGE_OPERATION_DELETE:
+if (has_new_node) {
+error_setg(errp, "The operation %s doesn't support the parameter 
node",
+   ChangeOperation_lookup[op]);
+return;
+}
+if (!has_child) {
+error_setg(errp, "The operation %s needs the parameter child",
+   ChangeOperation_lookup[op]);
+return;
+}
+default:
+break;
+}
+
+if (has_child) {
+child_bs = bdrv_find_node(child);
+if (!child_bs) {
+error_setg(errp, "Node '%s' not found", child);
+return;
+}
+}
+
+if (has_new_node) {
+new_bs = bdrv_find_node(new_node);
+if (!new_bs) {
+error_setg(errp, "Node '%s' not found", new_node);
+return;
+}
+}
+
+switch(op) {
+case CHANGE_OPERATION_ADD:
+bdrv_add_child(parent_bs, new_bs, _err);
+if (local_err) {
+error_propagate(errp, local_err);
+}
+break;
+case CHANGE_OPERATION_DELETE:
+bdrv_del_child(parent_bs, child_bs, _err);
+if (local_err) {
+error_propagate(errp, local_err);
+}
+break;
+default:
+break;
+}
+}
+
 BlockJobInfoList *qmp_query_block_jobs(Error **errp)
 {
 BlockJobInfoList *head = NULL, **p_next = 
diff --git a/qapi/block-core.json b/qapi/block-core.json
index bb2189e..361588f 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2114,3 +2114,43 @@
 ##
 { 'command': 'block-set-write-threshold',
   'data': { 'node-name': 'str', 'write-threshold': 'uint64' } }
+
+##
+# @ChangeOperation:
+#
+# An enumeration of block device change operation.
+#
+# @add: Add a new block driver state to a existed block driver state.
+#
+# @delete: Delete a block driver state's child.
+#
+# Since: 2.5
+##
+{ 'enum': 'ChangeOperation',
+  'data': [ 'add', 'delete' ] }
+
+##
+# @x-blockdev-change
+#
+# Dynamic reconfigure the block driver state graph. It can be used to
+# add, remove, insert, replace a block driver state. Currently only
+# the Quorum driver implements this feature to add and remove its child.
+# This is useful to fix a broken quorum child.
+#
+# @operation: the chanage operation. It can be add, delete.
+#
+# @parent: the id or node name of which node will be changed.
+#
+# @child: the child node-name which will be deleted.
+#
+# @node: the new node-name which will be added.
+#
+# Note: this command is experimental, and not a stable API.
+#
+# Since: 2.5
+##
+{ 'command': 'x-blockdev-change',
+  'data' : { 'operation': 'ChangeOperation',
+ 'parent': 'str',
+ '*child': 'str',
+ '*node': 'str' } }
diff --git a/qmp-commands.hx b/qmp-commands.hx
index d2ba800..ede7b71 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -3921,6 +3921,56 @@ Example (2):
 EQMP
 
 {
+.name   = "x-blockdev-change",
+.args_type  = "operation:s,parent:B,child:B?,node:B?",
+.mhandler.cmd_new = qmp_marshal_x_blockdev_change,
+},
+
+SQMP
+x-blockdev-change
+
+
+Dynamic reconfigure the block driver state graph. It can be used to
+add, remove, insert, replace a block driver state. Currently only

Re: [Qemu-block] [Qemu-devel] [Xen-devel] Question about xen disk unplug support for ahci missed in qemu

2015-10-16 Thread Stefano Stabellini
On Thu, 15 Oct 2015, Kevin O'Connor wrote:
> On Fri, Oct 16, 2015 at 01:10:54AM +0200, Laszlo Ersek wrote:
> > On 10/14/15 13:27, Ian Campbell wrote:
> > > On Wed, 2015-10-14 at 12:06 +0100, Stefano Stabellini wrote:
> > >>> Can't you just teach SeaBIOS how to deal with your PV disks and then
> > >>> only add that to your VM and forget about IDE/AHCI? I mean, that's how
> > >>> it's done for virtio-blk, and it doesn't involve any insanities like
> > >>> ripping out non-hotpluggable devices.
> > >>
> > >> Teaching SeaBIOS to deal with PV disks can be done, in fact we already
> > >> support PV disks in OVMF. It is possible to boot Windows with OVMF
> > >> without any IDE disks (patch pending for libxl to create a VM without
> > >> emulated IDE disks).
> > > 
> > > One stumbling block in the past has been how to know when the PV drivers 
> > > in
> > > the BIOS are no longer required, such that the ring can be torn down 
> > > and/or
> > > the connection etc handed over to the OS driver.
> [...]
> > > AFAIK the BIOS interfaces do not have anything as reliable as that.
> > > 
> > > How does virtio deal with this in the BIOS case?
> > 
> > It doesn't, as far as I can tell.
> > 
> > I don't think it has to, though! On a BIOS box, you can always boot DOS,
> > or another operating system that continues to use the BIOS interfaces
> > forever. (Same as if you never call ExitBootServices() in UEFI.)
> > 
> > Given that no starter pistol gets fired between the firmware and the OS
> > on such a platform, they must always respect each other. I guess this
> > could occur through the E820 map, or some such.
> 
> One can use the "ACPI enable" SMI event to detect this if they really
> wanted to.  In SeaBIOS one could do this from
> src/fw/smm.c:handle_smi() - however, no other drivers need this
> notification today and it would be a bit ugly to have to handle it
> from an SMI.  (Assuming Xen were to support SMIs.)
> 
> > No clue in what kind of E820 memory SeaBIOS allocates the virtio rings,
> > but I guess the Linux kernel stays away from those areas until it's past
> > device probing and binding.
> 
> In SeaBIOS, the virtio memory is allocated from reserved memory.  (See
> the memalign_high() call in src/hw/virtio-pci.c - the "high" memory
> zone is taken from reserved memory:
> http://seabios.org/Memory_Model#Memory_available_during_initialization
> )
> 
> What's the reason for the "stumbling block" that requires the BIOS to
> tear down the Xen ring prior to the OS being able to replace it?  The
> BIOS disk calls are all synchronous, so the ring wont be active when
> the OS brings up its own ring.  Is there some low-level interaction
> that prevents the OS from just resetting the ring prior to enabling
> it?

Xen only exports one PV disk interface for each disk to the guest, and
each PV interface only supports one frontend -- only SeaBIOS or the OS
can be connected to one PV disk, not both. In the case of OVMF, we
handle that by disconnecting the PV frontend in OVMF when
ExitBootServices is called, so that the OS driver can reconnect later.



Re: [Qemu-block] [Qemu-devel] [Xen-devel] Question about xen disk unplug support for ahci missed in qemu

2015-10-16 Thread Ian Campbell
On Fri, 2015-10-16 at 10:06 +0100, Stefano Stabellini wrote:

> > What's the reason for the "stumbling block" that requires the BIOS to
> > tear down the Xen ring prior to the OS being able to replace it?  The
> > BIOS disk calls are all synchronous, so the ring wont be active when
> > the OS brings up its own ring.  Is there some low-level interaction
> > that prevents the OS from just resetting the ring prior to enabling
> > it?
> 
> Xen only exports one PV disk interface for each disk to the guest, and
> each PV interface only supports one frontend -- only SeaBIOS or the OS
> can be connected to one PV disk, not both.

Which I think is just another way of saying that the Xen PV protocol
currently lacks an explicit requirement for the OS to reset the device (or
indeed the general PV infrastructure, grant tables etc) before use.

Retrofitting that requirement is of course a little tricky.

The unplug protocol might be extensible neough though. IIRC it does include
provisions for the OS to specify a version and the reject the unplug, so
upreving that to include a reset requirement _might_ be possible. At which
point it can at least be made a config option which can be switch on for
new enough guests.

i.e. if the guest is configured to use PV drivers from SeaBIOS the unplug
protocol would reject the attempt to unplug the (non-existent) IDE devices
and the guest therefore should fail to bind to the PV devices, while a
newer guest which knows it has to do a  reset would declare itself to be
newer and succeed in the unplug.

(NB: details of the protocol are sketchy in my memory, and the above may
need actual though applied to make it practical, but you get the gist I
hope).

Then you are just into some sort of multiyear transition/deprecation
sequence before you make it the default.

>  In the case of OVMF, we
> handle that by disconnecting the PV frontend in OVMF when
> ExitBootServices is called, so that the OS driver can reconnect later.




Re: [Qemu-block] [PATCH] virtio-blk: switch off scsi-passthrough by default

2015-10-16 Thread Paolo Bonzini


On 16/10/2015 10:40, Cornelia Huck wrote:
> --- a/hw/s390x/s390-virtio-ccw.c
> +++ b/hw/s390x/s390-virtio-ccw.c
> @@ -272,6 +272,10 @@ static const TypeInfo ccw_machine_info = {
>  .driver   = "vhost-scsi-ccw",\
>  .property = "max_revision",\
>  .value= "0",\
> +},{\
> +.driver   = "virtio-blk-ccw",\
> +.property = "scsi",\
> +.value= "true",\
>  },
>  
>  static void ccw_machine_2_4_class_init(ObjectClass *oc, void *data)
> diff --git a/include/hw/compat.h b/include/hw/compat.h
> index 095de5d..bbf1ab2 100644
> --- a/include/hw/compat.h
> +++ b/include/hw/compat.h
> @@ -2,7 +2,11 @@
>  #define HW_COMPAT_H
>  
>  #define HW_COMPAT_2_4 \
> -/* empty */
> +{\
> +.driver   = "virtio-blk-pci",\
> +.property = "scsi",\
> +.value= "true",\
> +},
>  
>  #define HW_COMPAT_2_3 \
>  {\
> 

s390 should use HW_COMPAT_2_4 as well.  Otherwise looks good.

Paolo



Re: [Qemu-block] [PATCH] virtio-blk: switch off scsi-passthrough by default

2015-10-16 Thread Cornelia Huck
On Fri, 16 Oct 2015 10:46:31 +0200
Paolo Bonzini  wrote:

> 
> 
> On 16/10/2015 10:41, Paolo Bonzini wrote:
> > 
> > 
> > On 16/10/2015 10:40, Cornelia Huck wrote:
> >> --- a/hw/s390x/s390-virtio-ccw.c
> >> +++ b/hw/s390x/s390-virtio-ccw.c
> >> @@ -272,6 +272,10 @@ static const TypeInfo ccw_machine_info = {
> >>  .driver   = "vhost-scsi-ccw",\
> >>  .property = "max_revision",\
> >>  .value= "0",\
> >> +},{\
> >> +.driver   = "virtio-blk-ccw",\
> >> +.property = "scsi",\
> >> +.value= "true",\
> >>  },
> >>  
> >>  static void ccw_machine_2_4_class_init(ObjectClass *oc, void *data)
> >> diff --git a/include/hw/compat.h b/include/hw/compat.h
> >> index 095de5d..bbf1ab2 100644
> >> --- a/include/hw/compat.h
> >> +++ b/include/hw/compat.h
> >> @@ -2,7 +2,11 @@
> >>  #define HW_COMPAT_H
> >>  
> >>  #define HW_COMPAT_2_4 \
> >> -/* empty */
> >> +{\
> >> +.driver   = "virtio-blk-pci",\
> >> +.property = "scsi",\
> >> +.value= "true",\
> >> +},
> >>  
> >>  #define HW_COMPAT_2_3 \
> >>  {\
> >>
> > 
> > s390 should use HW_COMPAT_2_4 as well.  Otherwise looks good.
> 
> Hmm, ECONCISE probably.  Sorry.
> 
> I mean that virtio-blk-ccw's scsi property should IMO go in
> HW_COMPAT_2_4 as well.

I was wondering about the semantics of HW_COMPAT_*: Does any hw-related
compat stuff go in there, even if it is architecture specific (like
ccw)?
> 
> But I noticed now that:
> 
> * if it works it would be even better if the compat property used
> virtio-blk-device;

Hm. Previous virtio-compat always treated -pci explicitly, but we only
gained s390x compat handling with 2.4, so it didn't really matter. But
if it works, this is the saner approach.

> 
> * a new pseries-2.4 machine also needs to be created.

Ah, wasn't aware they do compat as well.




Re: [Qemu-block] [PATCH v2 1/3] s390x: include HW_COMPAT_* props

2015-10-16 Thread Christian Borntraeger
Am 16.10.2015 um 12:25 schrieb Cornelia Huck:
> We want to inherit generic hw compat as well.
> 
> Signed-off-by: Cornelia Huck 
> ---
>  hw/s390x/s390-virtio-ccw.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
> index 20883ff..d09d867 100644
> --- a/hw/s390x/s390-virtio-ccw.c
> +++ b/hw/s390x/s390-virtio-ccw.c
> @@ -20,6 +20,7 @@
>  #include "qemu/config-file.h"
>  #include "s390-pci-bus.h"
>  #include "hw/s390x/storage-keys.h"
> +#include "hw/compat.h"
> 
>  #define TYPE_S390_CCW_MACHINE   "s390-ccw-machine"
> 
> @@ -236,6 +237,7 @@ static const TypeInfo ccw_machine_info = {
>  };
> 
>  #define CCW_COMPAT_2_4 \
> +HW_COMPAT_2_4 \
>  {\
>  .driver   = TYPE_S390_SKEYS,\
>  .property = "migration-enabled",\
> 

Acked-by: Christian Borntraeger 





Re: [Qemu-block] [Qemu-devel] [PATCH 1/5] ide/atapi: make PIO read requests async

2015-10-16 Thread Peter Lieven
Am 14.10.2015 um 20:21 schrieb John Snow:
>
> On 10/14/2015 02:19 PM, Peter Lieven wrote:
>> Am 08.10.2015 um 18:44 schrieb John Snow:
>>> On 10/08/2015 08:06 AM, Peter Lieven wrote:
 Hi all,

 short summary from my side. The whole thing seems to get complicated,
 let me explain why:

 1) During review I found that the code in ide_atapi_cmd_reply_end can't
 work correctly if the
 byte_count_limit is not a divider or a multiple of cd_sector_size. The
 reason is that as soon
 as we load the next sector we start at io_buffer offset 0 overwriting
 whatever is left in there
 for transfer. We also reset the io_buffer_index to 0 which means if we
 continue with the
 elementary transfer we always transfer a whole sector (of corrupt data)
 regardless if we
 are allowed to transfer that much data. Before we consider fixing this I
 wonder if it
 is legal at all to have an unaligned byte_count_limit. It obviously has
 never caused trouble in
 practice so maybe its not happening in real life.

>>> I had overlooked that part. Good catch. I do suspect that in practice
>>> nobody will be asking for bizarre values.
>>>
>>> There's no rule against an unaligned byte_count_limit as far as I have
>>> read, but suspect nobody would have a reason to use it in practice.
>>>
 2) I found that whatever cool optimization I put in to buffer multiple
 sectors at once I end
 up with code that breaks migration because older versions would either
 not fill the io_buffer
 as expected or we introduce variables that older versions do not
 understand. This will
 lead to problems if we migrate in the middle of a transfer.

>>> Ech. This sounds like a bit of a problem. I'll need to think about this
>>> one...
>>>
 3) My current plan to get this patch to a useful state would be to use
 my initial patch and just
 change the code to use a sync request if we need to buffer additional
 sectors in an elementary
 transfer. I found that in real world operating systems the
 byte_count_limit seems to be equal to
 the cd_sector_size. After all its just a PIO transfer an operating
 system will likely switch to DMA
 as soon as the kernel ist loaded.

 Thanks,
 Peter

>>> It sounds like that might be "good enough" for now, and won't make
>>> behavior *worse* than it currently is. You can adjust the test I had
>>> checked in to not use a "tricky" value and we can amend support for this
>>> later if desired.
>> Have you had a chance to look at the series with the "good enough" fix?
>>
>> Thanks,
>> Peter
>>
> Will do so Friday, thanks!

Thank you,
Peter




Re: [Qemu-block] [PATCH] virtio-blk: switch off scsi-passthrough by default

2015-10-16 Thread Paolo Bonzini


On 16/10/2015 10:41, Paolo Bonzini wrote:
> 
> 
> On 16/10/2015 10:40, Cornelia Huck wrote:
>> --- a/hw/s390x/s390-virtio-ccw.c
>> +++ b/hw/s390x/s390-virtio-ccw.c
>> @@ -272,6 +272,10 @@ static const TypeInfo ccw_machine_info = {
>>  .driver   = "vhost-scsi-ccw",\
>>  .property = "max_revision",\
>>  .value= "0",\
>> +},{\
>> +.driver   = "virtio-blk-ccw",\
>> +.property = "scsi",\
>> +.value= "true",\
>>  },
>>  
>>  static void ccw_machine_2_4_class_init(ObjectClass *oc, void *data)
>> diff --git a/include/hw/compat.h b/include/hw/compat.h
>> index 095de5d..bbf1ab2 100644
>> --- a/include/hw/compat.h
>> +++ b/include/hw/compat.h
>> @@ -2,7 +2,11 @@
>>  #define HW_COMPAT_H
>>  
>>  #define HW_COMPAT_2_4 \
>> -/* empty */
>> +{\
>> +.driver   = "virtio-blk-pci",\
>> +.property = "scsi",\
>> +.value= "true",\
>> +},
>>  
>>  #define HW_COMPAT_2_3 \
>>  {\
>>
> 
> s390 should use HW_COMPAT_2_4 as well.  Otherwise looks good.

Hmm, ECONCISE probably.  Sorry.

I mean that virtio-blk-ccw's scsi property should IMO go in
HW_COMPAT_2_4 as well.

But I noticed now that:

* if it works it would be even better if the compat property used
virtio-blk-device;

* a new pseries-2.4 machine also needs to be created.

Thanks,

Paolo



Re: [Qemu-block] [Qemu-devel] Question about xen disk unplug support for ahci missed in qemu

2015-10-16 Thread Fabio Fantoni

Il 16/10/2015 12:13, Anthony PERARD ha scritto:

On Fri, Oct 16, 2015 at 10:32:44AM +0200, Fabio Fantoni wrote:

Il 15/10/2015 20:02, Anthony PERARD ha scritto:

On Thu, Oct 15, 2015 at 06:27:17PM +0200, Fabio Fantoni wrote:

Il 14/10/2015 13:06, Stefano Stabellini ha scritto:

I would suggest Fabio to avoid AHCI disks altogether and just use OVMF
with PV disks only and Anthony's patch to libxl to avoid creating any
IDE disks: http://marc.info/?l=xen-devel=144482080812353.

Would that work for you?

Thanks for the advice, I tried it:
https://github.com/Fantu/Xen/commits/rebase/m2r-testing-4.6

I installed W10 pro 64 bit with ide disk, installed the win pv drivers and
after changed to xvdX instead hdX, is the only change needed, right?
Initial boot is ok (ovmf part about pv disks seems ok) but windows boot
fails with problem with pv drivers.
In attachment full qemu log with xen_platform trace and domU's xl cfg.

Someone have windows domUs with ovmf and pv disks only working? If yes can
tell me the difference to understand what can be the problem please?

When I worked on the PV disk implementation in OVMF, I was able to boot
a Windows 8 with pv disk only.

I don't have access to the guest configuration I was using, but I think one
difference would be the viridian setting, I'm pretty sure I did not set it.


I tried with viridian disabled but did the same thing, looking cdrom as
latest thing before xenbug trace in qemu log I tried also to remove it but
also in this case don't boot correctly, full qemu log in attachment.
I don't know if is a ovmf thing to improve (like what seems in Laszlo and
Kevin mails) or xen winpv drivers unexpected case, have you tried also with
latest winpv builds? (for exclude regression)

No, I did not tried the latest winpv drivers.

Sorry I can help much more that that. When I install this win8 guest tried
to boot it with pv drivers only, that was more than a year ago. I have not
check if it's still working. (Also I can not try anything more recent,
right now.)



I did many other tests, retrying with ide first boot working but show pv 
devices not working, I did another reboot (with ide) and pv devices was 
working, after I retried with pv (xvdX) and boot correctly.
After other tests I found that with empty cdrom device (required for xl 
cd-insert/cd-eject) boot stop at start (tianocore image), same result 
with ide instead.
From xl cfg: 
disk=['/mnt/vm/disks/W10UEFI.disk1.cow-sn1,qcow2,xvda,rw',',raw,xvdb,ro,cdrom']

With seabios domU boot also with empty cdrom.
In qemu log I found only these that can be related:
xen be: qdisk-51728: error: Could not open image: No such file or 
directory

xen be: qdisk-51728: initialise() failed

And latest xl dmesg line is:

(d1) Invoking OVMF ...


If you need more informations/test tell me and I'll post them.

Thanks for any reply and sorry for my bad english.



[Qemu-block] [PATCH v2 3/3] virtio-blk: switch off scsi-passthrough by default

2015-10-16 Thread Cornelia Huck
Devices that are compliant with virtio-1 do not support scsi
passthrough any more (and it has not been a recommended setup
anyway for quite some time). To avoid having to switch it off
explicitly in newer qemus that turn on virtio-1 by default, let's
switch the default to scsi=false for 2.5.

Signed-off-by: Cornelia Huck 
---
 hw/block/virtio-blk.c | 2 +-
 include/hw/compat.h   | 6 +-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 8beb26b..999dbd7 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -975,7 +975,7 @@ static Property virtio_blk_properties[] = {
 DEFINE_PROP_STRING("serial", VirtIOBlock, conf.serial),
 DEFINE_PROP_BIT("config-wce", VirtIOBlock, conf.config_wce, 0, true),
 #ifdef __linux__
-DEFINE_PROP_BIT("scsi", VirtIOBlock, conf.scsi, 0, true),
+DEFINE_PROP_BIT("scsi", VirtIOBlock, conf.scsi, 0, false),
 #endif
 DEFINE_PROP_BIT("request-merging", VirtIOBlock, conf.request_merging, 0,
 true),
diff --git a/include/hw/compat.h b/include/hw/compat.h
index 095de5d..93e71af 100644
--- a/include/hw/compat.h
+++ b/include/hw/compat.h
@@ -2,7 +2,11 @@
 #define HW_COMPAT_H
 
 #define HW_COMPAT_2_4 \
-/* empty */
+{\
+.driver   = "virtio-blk-device",\
+.property = "scsi",\
+.value= "true",\
+},
 
 #define HW_COMPAT_2_3 \
 {\
-- 
2.3.9




[Qemu-block] [PATCH v2 0/3] virtio-blk: no scsi-passthrough by default

2015-10-16 Thread Cornelia Huck
Lightly tested on s390x.

Changes v1->v2:
- have the s390x compat hander include HW_COMPAT
- prepare the pseries 2.4 compat handler
- switch compat property at virtio-blk-device instead of the
  transport level

Cornelia Huck (3):
  s390x: include HW_COMPAT_* props
  ppc/spapr: add 2.4 compat props
  virtio-blk: switch off scsi-passthrough by default

 hw/block/virtio-blk.c  | 2 +-
 hw/ppc/spapr.c | 9 +
 hw/s390x/s390-virtio-ccw.c | 2 ++
 include/hw/compat.h| 6 +-
 4 files changed, 17 insertions(+), 2 deletions(-)

-- 
2.3.9




[Qemu-block] [PATCH v2 2/3] ppc/spapr: add 2.4 compat props

2015-10-16 Thread Cornelia Huck
HW_COMPAT_2_4 will become non-empty: prepare for it.

Signed-off-by: Cornelia Huck 
---
 hw/ppc/spapr.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index d1b0e53..c216e2c 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2224,7 +2224,11 @@ static const TypeInfo spapr_machine_info = {
 },
 };
 
+#define SPAPR_COMPAT_2_4 \
+HW_COMPAT_2_4
+
 #define SPAPR_COMPAT_2_3 \
+SPAPR_COMPAT_2_4 \
 HW_COMPAT_2_3 \
 {\
 .driver   = "spapr-pci-host-bridge",\
@@ -2338,11 +2342,16 @@ static const TypeInfo spapr_machine_2_3_info = {
 
 static void spapr_machine_2_4_class_init(ObjectClass *oc, void *data)
 {
+static GlobalProperty compat_props[] = {
+SPAPR_COMPAT_2_4
+{ /* end of list */ }
+};
 MachineClass *mc = MACHINE_CLASS(oc);
 
 mc->desc = "pSeries Logical Partition (PAPR compliant) v2.4";
 mc->alias = "pseries";
 mc->is_default = 0;
+mc->compat_props = compat_props;
 }
 
 static const TypeInfo spapr_machine_2_4_info = {
-- 
2.3.9




[Qemu-block] [PATCH] blkdebug: Don't confuse image as backing file

2015-10-16 Thread Fam Zheng
The word "backing file" nowadays refers to the backing_hd in the
external snapshot sense (i.e. bs->backing_hd), instead of the file sense
(bs->file). Correct the comment to use the right term.

Signed-off-by: Fam Zheng 
---
 block/blkdebug.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/blkdebug.c b/block/blkdebug.c
index bc247f4..54650a7 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -426,7 +426,7 @@ static int blkdebug_open(BlockDriverState *bs, QDict 
*options, int flags,
 /* Set initial state */
 s->state = 1;
 
-/* Open the backing file */
+/* Open the image file */
 assert(bs->file == NULL);
 ret = bdrv_open_image(>file, qemu_opt_get(opts, "x-image"), options, 
"image",
   bs, _file, false, _err);
-- 
2.4.3




Re: [Qemu-block] [Qemu-devel] Question about xen disk unplug support for ahci missed in qemu

2015-10-16 Thread Stefano Stabellini
On Fri, 16 Oct 2015, Fabio Fantoni wrote:
> Il 16/10/2015 12:13, Anthony PERARD ha scritto:
> > On Fri, Oct 16, 2015 at 10:32:44AM +0200, Fabio Fantoni wrote:
> > > Il 15/10/2015 20:02, Anthony PERARD ha scritto:
> > > > On Thu, Oct 15, 2015 at 06:27:17PM +0200, Fabio Fantoni wrote:
> > > > > Il 14/10/2015 13:06, Stefano Stabellini ha scritto:
> > > > > > I would suggest Fabio to avoid AHCI disks altogether and just use
> > > > > > OVMF
> > > > > > with PV disks only and Anthony's patch to libxl to avoid creating
> > > > > > any
> > > > > > IDE disks: http://marc.info/?l=xen-devel=144482080812353.
> > > > > > 
> > > > > > Would that work for you?
> > > > > Thanks for the advice, I tried it:
> > > > > https://github.com/Fantu/Xen/commits/rebase/m2r-testing-4.6
> > > > > 
> > > > > I installed W10 pro 64 bit with ide disk, installed the win pv drivers
> > > > > and
> > > > > after changed to xvdX instead hdX, is the only change needed, right?
> > > > > Initial boot is ok (ovmf part about pv disks seems ok) but windows
> > > > > boot
> > > > > fails with problem with pv drivers.
> > > > > In attachment full qemu log with xen_platform trace and domU's xl cfg.
> > > > > 
> > > > > Someone have windows domUs with ovmf and pv disks only working? If yes
> > > > > can
> > > > > tell me the difference to understand what can be the problem please?
> > > > When I worked on the PV disk implementation in OVMF, I was able to boot
> > > > a Windows 8 with pv disk only.
> > > > 
> > > > I don't have access to the guest configuration I was using, but I think
> > > > one
> > > > difference would be the viridian setting, I'm pretty sure I did not set
> > > > it.
> > > > 
> > > I tried with viridian disabled but did the same thing, looking cdrom as
> > > latest thing before xenbug trace in qemu log I tried also to remove it but
> > > also in this case don't boot correctly, full qemu log in attachment.
> > > I don't know if is a ovmf thing to improve (like what seems in Laszlo and
> > > Kevin mails) or xen winpv drivers unexpected case, have you tried also
> > > with
> > > latest winpv builds? (for exclude regression)
> > No, I did not tried the latest winpv drivers.
> > 
> > Sorry I can help much more that that. When I install this win8 guest tried
> > to boot it with pv drivers only, that was more than a year ago. I have not
> > check if it's still working. (Also I can not try anything more recent,
> > right now.)
> > 
> 
> I did many other tests, retrying with ide first boot working but show pv
> devices not working, I did another reboot (with ide) and pv devices was
> working, after I retried with pv (xvdX) and boot correctly.
> After other tests I found that with empty cdrom device (required for xl
> cd-insert/cd-eject) boot stop at start (tianocore image), same result with ide
> instead.
> From xl cfg:
> disk=['/mnt/vm/disks/W10UEFI.disk1.cow-sn1,qcow2,xvda,rw',',raw,xvdb,ro,cdrom']
> With seabios domU boot also with empty cdrom.
> In qemu log I found only these that can be related:
> > xen be: qdisk-51728: error: Could not open image: No such file or directory
> > xen be: qdisk-51728: initialise() failed
> And latest xl dmesg line is:
> > (d1) Invoking OVMF ...
> 
> If you need more informations/test tell me and I'll post them.

Are you saying that without any cdrom drives, it works correctly?



[Qemu-block] [PATCH v3 09/12] block: Add "drained begin/end" for internal snapshot

2015-10-16 Thread Fam Zheng
This ensures the atomicity of the transaction by avoiding processing of
external requests such as those from ioeventfd.

state->bs is assigned right after bdrv_drained_begin. Because it was
used as the flag for deletion or not in abort, now we need a separate
flag - InternalSnapshotState.created.

Signed-off-by: Fam Zheng 
---
 blockdev.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 015afbf..c3da2c6 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1280,6 +1280,7 @@ typedef struct InternalSnapshotState {
 BlockDriverState *bs;
 AioContext *aio_context;
 QEMUSnapshotInfo sn;
+bool created;
 } InternalSnapshotState;
 
 static void internal_snapshot_prepare(BlkTransactionState *common,
@@ -1318,6 +1319,8 @@ static void internal_snapshot_prepare(BlkTransactionState 
*common,
 /* AioContext is released in .clean() */
 state->aio_context = bdrv_get_aio_context(bs);
 aio_context_acquire(state->aio_context);
+bdrv_drained_begin(bs);
+state->bs = bs;
 
 if (!bdrv_is_inserted(bs)) {
 error_setg(errp, QERR_DEVICE_HAS_NO_MEDIUM, device);
@@ -1375,7 +1378,7 @@ static void internal_snapshot_prepare(BlkTransactionState 
*common,
 }
 
 /* 4. succeed, mark a snapshot is created */
-state->bs = bs;
+state->created = true;
 }
 
 static void internal_snapshot_abort(BlkTransactionState *common)
@@ -1386,7 +1389,7 @@ static void internal_snapshot_abort(BlkTransactionState 
*common)
 QEMUSnapshotInfo *sn = >sn;
 Error *local_error = NULL;
 
-if (!bs) {
+if (!state->created) {
 return;
 }
 
@@ -1407,6 +1410,7 @@ static void internal_snapshot_clean(BlkTransactionState 
*common)
  common, common);
 
 if (state->aio_context) {
+bdrv_drained_end(state->bs);
 aio_context_release(state->aio_context);
 }
 }
-- 
2.4.3




Re: [Qemu-block] [PATCH v2 0/3] virtio-blk: no scsi-passthrough by default

2015-10-16 Thread Paolo Bonzini


On 16/10/2015 12:25, Cornelia Huck wrote:
> Lightly tested on s390x.
> 
> Changes v1->v2:
> - have the s390x compat hander include HW_COMPAT
> - prepare the pseries 2.4 compat handler
> - switch compat property at virtio-blk-device instead of the
>   transport level
> 
> Cornelia Huck (3):
>   s390x: include HW_COMPAT_* props
>   ppc/spapr: add 2.4 compat props
>   virtio-blk: switch off scsi-passthrough by default
> 
>  hw/block/virtio-blk.c  | 2 +-
>  hw/ppc/spapr.c | 9 +
>  hw/s390x/s390-virtio-ccw.c | 2 ++
>  include/hw/compat.h| 6 +-
>  4 files changed, 17 insertions(+), 2 deletions(-)
> 

Thanks!  It all looks good.

Paolo



Re: [Qemu-block] [PATCH] block: fix memory leak in early exit

2015-10-16 Thread Jeff Cody
On Fri, Oct 16, 2015 at 10:37:17AM +0200, Stefan Hajnoczi wrote:
> On Fri, Oct 16, 2015 at 08:58:12AM +0200, Alberto Garcia wrote:
> > On Thu 15 Oct 2015 05:54:27 PM CEST, Stefan Hajnoczi  
> > wrote:
> > > The stream block job has two early exit code paths.  They do not free
> > > s->backing_file_str.
> > >
> > > Also, the early exits rely on the fact that the coroutine hasn't yielded
> > > yet and was launched from the main thread.  Therefore the coroutine is
> > > guaranteed to be running in the main thread where block_job_completed()
> > > may be called safely.  This is very subtle so it's nice to eliminate the
> > > assumption by unifying the early exit with the normal exit code path.
> > >
> > > Cc: Fam Zheng 
> > > Cc: Jeff Cody 
> > > Signed-off-by: Stefan Hajnoczi 
> > 
> > I had a slightly simpler version of this in my intermediate block
> > streaming series in case you're interested:
> > 
> >https://patchwork.ozlabs.org/patch/471881/
> > 
> > But this one looks good to me too, so:
> > 
> > Reviewed-by: Alberto Garcia 
> 
> Kevin, please take Alberto's patch instead of mine.  The Message-ID is:
> 
>   d575a576c18d8972ac1a200c4022b39cbbce2507.1435008395.git.be...@igalia.com
> 
> BTW, I notice that Jeff isn't listed as maintainer for block/stream.c.
> So according to MAINTAINERS this patch goes through you.
> 
> Stefan

I think that is probably just a typo - it lists block/stream.h in
MAINTAINERS, not block/stream.c.



[Qemu-block] [PATCH v3 12/12] tests: Add test case for aio_disable_external

2015-10-16 Thread Fam Zheng
Signed-off-by: Fam Zheng 
---
 tests/test-aio.c | 24 
 1 file changed, 24 insertions(+)

diff --git a/tests/test-aio.c b/tests/test-aio.c
index 03cd45d..1623803 100644
--- a/tests/test-aio.c
+++ b/tests/test-aio.c
@@ -374,6 +374,29 @@ static void test_flush_event_notifier(void)
 event_notifier_cleanup();
 }
 
+static void test_aio_external_client(void)
+{
+int i, j;
+
+for (i = 1; i < 3; i++) {
+EventNotifierTestData data = { .n = 0, .active = 10, .auto_set = true 
};
+event_notifier_init(, false);
+aio_set_event_notifier(ctx, , true, event_ready_cb);
+event_notifier_set();
+for (j = 0; j < i; j++) {
+aio_disable_external(ctx);
+}
+for (j = 0; j < i; j++) {
+assert(!aio_poll(ctx, false));
+assert(event_notifier_test_and_clear());
+event_notifier_set();
+aio_enable_external(ctx);
+}
+assert(aio_poll(ctx, false));
+event_notifier_cleanup();
+}
+}
+
 static void test_wait_event_notifier_noflush(void)
 {
 EventNotifierTestData data = { .n = 0 };
@@ -832,6 +855,7 @@ int main(int argc, char **argv)
 g_test_add_func("/aio/event/wait",  test_wait_event_notifier);
 g_test_add_func("/aio/event/wait/no-flush-cb",  
test_wait_event_notifier_noflush);
 g_test_add_func("/aio/event/flush", test_flush_event_notifier);
+g_test_add_func("/aio/external-client", test_aio_external_client);
 g_test_add_func("/aio/timer/schedule",  test_timer_schedule);
 
 g_test_add_func("/aio-gsource/flush",   test_source_flush);
-- 
2.4.3




[Qemu-block] [PATCH v3 11/12] qed: Implement .bdrv_drain

2015-10-16 Thread Fam Zheng
The "need_check_timer" is used to clear the "NEED_CHECK" flag in the
image header after a grace period once metadata update has finished. In
compliance to the bdrv_drain semantics we should make sure it remains
deleted once .bdrv_drain is called.

Call the qed_need_check_timer_cb manually to update the header
immediately.

Signed-off-by: Fam Zheng 
---
 block/qed.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/block/qed.c b/block/qed.c
index a7ff1d9..23bd273 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -381,6 +381,12 @@ static void bdrv_qed_attach_aio_context(BlockDriverState 
*bs,
 }
 }
 
+static void bdrv_qed_drain(BlockDriverState *bs)
+{
+qed_cancel_need_check_timer(bs->opaque);
+qed_need_check_timer_cb(bs->opaque);
+}
+
 static int bdrv_qed_open(BlockDriverState *bs, QDict *options, int flags,
  Error **errp)
 {
@@ -1683,6 +1689,7 @@ static BlockDriver bdrv_qed = {
 .bdrv_check   = bdrv_qed_check,
 .bdrv_detach_aio_context  = bdrv_qed_detach_aio_context,
 .bdrv_attach_aio_context  = bdrv_qed_attach_aio_context,
+.bdrv_drain   = bdrv_qed_drain,
 };
 
 static void bdrv_qed_init(void)
-- 
2.4.3




[Qemu-block] [PATCH v3 08/12] block: Add "drained begin/end" for transactional blockdev-backup

2015-10-16 Thread Fam Zheng
Similar to the previous patch, make sure that external events are not
dispatched during transaction operations.

Signed-off-by: Fam Zheng 
---
 blockdev.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/blockdev.c b/blockdev.c
index 232bc21..015afbf 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1680,6 +1680,8 @@ static void blockdev_backup_prepare(BlkTransactionState 
*common, Error **errp)
 return;
 }
 aio_context_acquire(state->aio_context);
+state->bs = bs;
+bdrv_drained_begin(bs);
 
 qmp_blockdev_backup(backup->device, backup->target,
 backup->sync,
@@ -1692,7 +1694,6 @@ static void blockdev_backup_prepare(BlkTransactionState 
*common, Error **errp)
 return;
 }
 
-state->bs = bs;
 state->job = state->bs->job;
 }
 
@@ -1712,6 +1713,7 @@ static void blockdev_backup_clean(BlkTransactionState 
*common)
 BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, 
common);
 
 if (state->aio_context) {
+bdrv_drained_end(state->bs);
 aio_context_release(state->aio_context);
 }
 }
-- 
2.4.3




[Qemu-block] [PATCH v2 1/3] s390x: include HW_COMPAT_* props

2015-10-16 Thread Cornelia Huck
We want to inherit generic hw compat as well.

Signed-off-by: Cornelia Huck 
---
 hw/s390x/s390-virtio-ccw.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index 20883ff..d09d867 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -20,6 +20,7 @@
 #include "qemu/config-file.h"
 #include "s390-pci-bus.h"
 #include "hw/s390x/storage-keys.h"
+#include "hw/compat.h"
 
 #define TYPE_S390_CCW_MACHINE   "s390-ccw-machine"
 
@@ -236,6 +237,7 @@ static const TypeInfo ccw_machine_info = {
 };
 
 #define CCW_COMPAT_2_4 \
+HW_COMPAT_2_4 \
 {\
 .driver   = TYPE_S390_SKEYS,\
 .property = "migration-enabled",\
-- 
2.3.9




Re: [Qemu-block] [Qemu-devel] [PATCH] blkdebug: Don't confuse image as backing file

2015-10-16 Thread Eric Blake
On 10/16/2015 04:46 AM, Fam Zheng wrote:
> The word "backing file" nowadays refers to the backing_hd in the
> external snapshot sense (i.e. bs->backing_hd), instead of the file sense
> (bs->file). Correct the comment to use the right term.
> 
> Signed-off-by: Fam Zheng 
> ---
>  block/blkdebug.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Eric Blake 

> 
> diff --git a/block/blkdebug.c b/block/blkdebug.c
> index bc247f4..54650a7 100644
> --- a/block/blkdebug.c
> +++ b/block/blkdebug.c
> @@ -426,7 +426,7 @@ static int blkdebug_open(BlockDriverState *bs, QDict 
> *options, int flags,
>  /* Set initial state */
>  s->state = 1;
>  
> -/* Open the backing file */
> +/* Open the image file */
>  assert(bs->file == NULL);
>  ret = bdrv_open_image(>file, qemu_opt_get(opts, "x-image"), options, 
> "image",
>bs, _file, false, _err);
> 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


[Qemu-block] [PATCH v3 00/12] block: Protect nested event loop with bdrv_drained_begin and bdrv_drained_end

2015-10-16 Thread Fam Zheng
v3: Call bdrv_drain unconditionally in bdrv_drained_begin.
Document the internal I/O implications between bdrv_drain_begin and end.

The nested aio_poll()'s in block layer has a bug that new r/w requests from
ioeventfds and nbd exports are processed, which might break the caller's
semantics (qmp_transaction) or even pointers (bdrv_reopen).

Fam Zheng (12):
  aio: Add "is_external" flag for event handlers
  nbd: Mark fd handlers client type as "external"
  dataplane: Mark host notifiers' client type as "external"
  aio: introduce aio_{disable,enable}_external
  block: Introduce "drained begin/end" API
  block: Add "drained begin/end" for transactional external snapshot
  block: Add "drained begin/end" for transactional backup
  block: Add "drained begin/end" for transactional blockdev-backup
  block: Add "drained begin/end" for internal snapshot
  block: Introduce BlockDriver.bdrv_drain callback
  qed: Implement .bdrv_drain
  tests: Add test case for aio_disable_external

 aio-posix.c |  9 -
 aio-win32.c |  8 +++-
 async.c |  3 +-
 block.c |  2 +
 block/curl.c| 14 ---
 block/io.c  | 23 +++-
 block/iscsi.c   |  9 ++---
 block/linux-aio.c   |  5 ++-
 block/nbd-client.c  | 10 +++--
 block/nfs.c | 17 -
 block/qed.c |  7 
 block/sheepdog.c| 38 ---
 block/ssh.c |  5 ++-
 block/win32-aio.c   |  5 ++-
 blockdev.c  | 27 +++---
 hw/block/dataplane/virtio-blk.c |  5 ++-
 hw/scsi/virtio-scsi-dataplane.c | 22 +++
 include/block/aio.h | 39 
 include/block/block.h   | 24 
 include/block/block_int.h   |  8 
 iohandler.c |  3 +-
 nbd.c   |  4 +-
 tests/test-aio.c| 82 -
 23 files changed, 276 insertions(+), 93 deletions(-)

-- 
2.4.3




Re: [Qemu-block] [PATCH v10 08/10] Implement new driver for block replication

2015-10-16 Thread Stefan Hajnoczi
On Fri, Oct 16, 2015 at 10:22:05AM +0800, Wen Congyang wrote:
> On 10/15/2015 10:55 PM, Stefan Hajnoczi wrote:
> > On Thu, Oct 15, 2015 at 10:19:17AM +0800, Wen Congyang wrote:
> >> On 10/14/2015 10:27 PM, Stefan Hajnoczi wrote:
> >>> On Tue, Oct 13, 2015 at 05:08:17PM +0800, Wen Congyang wrote:
>  On 10/13/2015 12:27 AM, Stefan Hajnoczi wrote:
> > On Fri, Sep 25, 2015 at 02:17:36PM +0800, Wen Congyang wrote:
> >> +/* start backup job now */
> >> +bdrv_op_unblock(s->hidden_disk, BLOCK_OP_TYPE_BACKUP_TARGET,
> >> +s->active_disk->backing_blocker);
> >> +bdrv_op_unblock(s->secondary_disk, 
> >> BLOCK_OP_TYPE_BACKUP_SOURCE,
> >> +s->hidden_disk->backing_blocker);
> >
> > Why is it safe to unblock these operations?
> >
> > Why do they have to be blocked for non-replication users?
> 
>  hidden_disk and secondary disk are opened as backing file, so it is 
>  blocked for
>  non-replication users.
>  What can I do if I don't unblock it and want to do backup?
> >>>
> >>> CCing Jeff Cody, block jobs maintainer
> >>>
> >>> You need to explain why it is safe remove this protection.  We can't
> >>> merge code that may be unsafe.
> >>>
> >>> I think we can investigate further by asking: when does QEMU code assume
> >>> the backing file is read-only?
> >>
> >> The backing file is opened in read-only mode. I want to reopen it in 
> >> read-write
> >> mode here in the next version(So the patch 1 will be dropped)
> >>
> >>>
> >>> I haven't checked but these cases come to mind:
> >>>
> >>> Operations that move data between BDS in the backing chain (e.g. commit
> >>> and stream block jobs) will lose or overwrite data if the backing file
> >>> is being written to by another coroutine.
> >>>
> >>> We need to prevent users from running these operations at the same time.
> >>
> >> Yes, but qemu doesn't provide such API.
> > 
> > This series can't be merged unless it is safe.
> > 
> > Have you looked at op blockers and thought about how to prevent unsafe
> > operations?
> 
> What about this solution:
> 1. unblock it in bdrv_set_backing_hd()
> 2. block it in qmp_block_commit(), qmp_block_stream(), qmp_block_backup()..., 
> to
>prevent unsafe operations

Come to think of it, currently QEMU only supports 1 block job per BDS.

This means that as long as COLO has a backup job running, no other block
jobs can interfere.

There still might be a risk with monitor commands like 'commit'.

Stefan



Re: [Qemu-block] [Qemu-devel] [RFC] transactions: add transaction-wide property

2015-10-16 Thread Stefan Hajnoczi
On Mon, Oct 12, 2015 at 12:50:20PM -0400, John Snow wrote:
> Ping -- any consensus on how we should implement the "do-or-die"
> argument for transactions that start block jobs? :)
> 
> This patch may look a little hokey in how it boxes arguments, but I can
> re-do it on top of Eric Blake's very official way of boxing arguments,
> when the QAPI dust settles.

I don't understand what you are trying to do after staring at the email
for 5 minutes.  Maybe the other reviewers hit the same problem and
haven't responded.

What is the problem you're trying to solve?

Stefan



Re: [Qemu-block] [PATCH v2 3/3] virtio-blk: switch off scsi-passthrough by default

2015-10-16 Thread Christian Borntraeger
Am 16.10.2015 um 12:44 schrieb Cornelia Huck:
> On Fri, 16 Oct 2015 12:32:52 +0200
> Christian Borntraeger  wrote:
> 
>> Am 16.10.2015 um 12:25 schrieb Cornelia Huck:
>>> Devices that are compliant with virtio-1 do not support scsi
>>> passthrough any more (and it has not been a recommended setup
>>> anyway for quite some time). To avoid having to switch it off
>>> explicitly in newer qemus that turn on virtio-1 by default, let's
>>> switch the default to scsi=false for 2.5.
>>>
>>> Signed-off-by: Cornelia Huck 
>>> ---
>>>  hw/block/virtio-blk.c | 2 +-
>>>  include/hw/compat.h   | 6 +-
>>>  2 files changed, 6 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
>>> index 8beb26b..999dbd7 100644
>>> --- a/hw/block/virtio-blk.c
>>> +++ b/hw/block/virtio-blk.c
>>> @@ -975,7 +975,7 @@ static Property virtio_blk_properties[] = {
>>>  DEFINE_PROP_STRING("serial", VirtIOBlock, conf.serial),
>>>  DEFINE_PROP_BIT("config-wce", VirtIOBlock, conf.config_wce, 0, true),
>>>  #ifdef __linux__
>>> -DEFINE_PROP_BIT("scsi", VirtIOBlock, conf.scsi, 0, true),
>>> +DEFINE_PROP_BIT("scsi", VirtIOBlock, conf.scsi, 0, false),
>>>  #endif
>>>  DEFINE_PROP_BIT("request-merging", VirtIOBlock, conf.request_merging, 
>>> 0,
>>>  true),
>>> diff --git a/include/hw/compat.h b/include/hw/compat.h
>>> index 095de5d..93e71af 100644
>>> --- a/include/hw/compat.h
>>> +++ b/include/hw/compat.h
>>> @@ -2,7 +2,11 @@
>>>  #define HW_COMPAT_H
>>>
>>>  #define HW_COMPAT_2_4 \
>>> -/* empty */
>>> +{\
>>> +.driver   = "virtio-blk-device",\
>>> +.property = "scsi",\
>>> +.value= "true",\
>>
>> does that work?
> 
> It did for me :)
> 
>>
>> If yes, would it make sense to convert the things in HW_COMPAT_2_3 from
>> pci to device, e.g.
>>
>>
>> {\
>> -   .driver   = "virtio-blk-pci",\
>> +   .driver   = "virtio-blk-device",\
>> .property = "any_layout",\
>> .value= "off",\
>> ...
> 
> Not sure: We don't have 2.3 compat for ccw... but would give a better
> template for later changes.

Yes. But this can be an addon patch. 

Lets keep this patch as is to have scsi=off as default for virtio 1.0

(some iotests do fail because of this)

Christian





[Qemu-block] [PATCH v3 01/12] aio: Add "is_external" flag for event handlers

2015-10-16 Thread Fam Zheng
All callers pass in false, and the real external ones will switch to
true in coming patches.

Signed-off-by: Fam Zheng 
---
 aio-posix.c |  6 -
 aio-win32.c |  5 
 async.c |  3 ++-
 block/curl.c| 14 +-
 block/iscsi.c   |  9 +++
 block/linux-aio.c   |  5 ++--
 block/nbd-client.c  | 10 ---
 block/nfs.c | 17 +---
 block/sheepdog.c| 38 ++-
 block/ssh.c |  5 ++--
 block/win32-aio.c   |  5 ++--
 hw/block/dataplane/virtio-blk.c |  6 +++--
 hw/scsi/virtio-scsi-dataplane.c | 24 +++--
 include/block/aio.h |  2 ++
 iohandler.c |  3 ++-
 nbd.c   |  4 ++-
 tests/test-aio.c| 58 +++--
 17 files changed, 130 insertions(+), 84 deletions(-)

diff --git a/aio-posix.c b/aio-posix.c
index d477033..f0f9122 100644
--- a/aio-posix.c
+++ b/aio-posix.c
@@ -25,6 +25,7 @@ struct AioHandler
 IOHandler *io_write;
 int deleted;
 void *opaque;
+bool is_external;
 QLIST_ENTRY(AioHandler) node;
 };
 
@@ -43,6 +44,7 @@ static AioHandler *find_aio_handler(AioContext *ctx, int fd)
 
 void aio_set_fd_handler(AioContext *ctx,
 int fd,
+bool is_external,
 IOHandler *io_read,
 IOHandler *io_write,
 void *opaque)
@@ -82,6 +84,7 @@ void aio_set_fd_handler(AioContext *ctx,
 node->io_read = io_read;
 node->io_write = io_write;
 node->opaque = opaque;
+node->is_external = is_external;
 
 node->pfd.events = (io_read ? G_IO_IN | G_IO_HUP | G_IO_ERR : 0);
 node->pfd.events |= (io_write ? G_IO_OUT | G_IO_ERR : 0);
@@ -92,10 +95,11 @@ void aio_set_fd_handler(AioContext *ctx,
 
 void aio_set_event_notifier(AioContext *ctx,
 EventNotifier *notifier,
+bool is_external,
 EventNotifierHandler *io_read)
 {
 aio_set_fd_handler(ctx, event_notifier_get_fd(notifier),
-   (IOHandler *)io_read, NULL, notifier);
+   is_external, (IOHandler *)io_read, NULL, notifier);
 }
 
 bool aio_prepare(AioContext *ctx)
diff --git a/aio-win32.c b/aio-win32.c
index 50a6867..3110d85 100644
--- a/aio-win32.c
+++ b/aio-win32.c
@@ -28,11 +28,13 @@ struct AioHandler {
 GPollFD pfd;
 int deleted;
 void *opaque;
+bool is_external;
 QLIST_ENTRY(AioHandler) node;
 };
 
 void aio_set_fd_handler(AioContext *ctx,
 int fd,
+bool is_external,
 IOHandler *io_read,
 IOHandler *io_write,
 void *opaque)
@@ -86,6 +88,7 @@ void aio_set_fd_handler(AioContext *ctx,
 node->opaque = opaque;
 node->io_read = io_read;
 node->io_write = io_write;
+node->is_external = is_external;
 
 event = event_notifier_get_handle(>notifier);
 WSAEventSelect(node->pfd.fd, event,
@@ -98,6 +101,7 @@ void aio_set_fd_handler(AioContext *ctx,
 
 void aio_set_event_notifier(AioContext *ctx,
 EventNotifier *e,
+bool is_external,
 EventNotifierHandler *io_notify)
 {
 AioHandler *node;
@@ -133,6 +137,7 @@ void aio_set_event_notifier(AioContext *ctx,
 node->e = e;
 node->pfd.fd = (uintptr_t)event_notifier_get_handle(e);
 node->pfd.events = G_IO_IN;
+node->is_external = is_external;
 QLIST_INSERT_HEAD(>aio_handlers, node, node);
 
 g_source_add_poll(>source, >pfd);
diff --git a/async.c b/async.c
index efce14b..bdc64a3 100644
--- a/async.c
+++ b/async.c
@@ -247,7 +247,7 @@ aio_ctx_finalize(GSource *source)
 }
 qemu_mutex_unlock(>bh_lock);
 
-aio_set_event_notifier(ctx, >notifier, NULL);
+aio_set_event_notifier(ctx, >notifier, false, NULL);
 event_notifier_cleanup(>notifier);
 rfifolock_destroy(>lock);
 qemu_mutex_destroy(>bh_lock);
@@ -329,6 +329,7 @@ AioContext *aio_context_new(Error **errp)
 }
 g_source_set_can_recurse(>source, true);
 aio_set_event_notifier(ctx, >notifier,
+   false,
(EventNotifierHandler *)
event_notifier_dummy_cb);
 ctx->thread_pool = NULL;
diff --git a/block/curl.c b/block/curl.c
index 032cc8a..8994182 100644
--- a/block/curl.c
+++ b/block/curl.c
@@ -154,18 +154,20 @@ static int curl_sock_cb(CURL *curl, curl_socket_t fd, int 
action,
 DPRINTF("CURL (AIO): Sock action %d on fd %d\n", action, fd);
 switch (action) {
 case CURL_POLL_IN:
- 

[Qemu-block] [PATCH v3 10/12] block: Introduce BlockDriver.bdrv_drain callback

2015-10-16 Thread Fam Zheng
Drivers can have internal request sources that generate IO, like the
need_check_timer in QED. Since we want quiesced periods that contain
nested event loops in block layer, we need to have a way to disable such
event sources.

Block drivers must implement the "bdrv_drain" callback if it has any
internal sources that can generate I/O activity, like a timer or a
worker thread (even in a library) that can schedule QEMUBH in an
asynchronous callback.

Update the comments of bdrv_drain and bdrv_drained_begin accordingly.

Signed-off-by: Fam Zheng 
---
 block/io.c| 6 +-
 include/block/block.h | 9 +++--
 include/block/block_int.h | 6 ++
 3 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/block/io.c b/block/io.c
index a331a19..ef8f9cc 100644
--- a/block/io.c
+++ b/block/io.c
@@ -234,7 +234,8 @@ static bool bdrv_requests_pending(BlockDriverState *bs)
 }
 
 /*
- * Wait for pending requests to complete on a single BlockDriverState subtree
+ * Wait for pending requests to complete on a single BlockDriverState subtree,
+ * and suspend block driver's internal I/O until next request arrives.
  *
  * Note that unlike bdrv_drain_all(), the caller must hold the BlockDriverState
  * AioContext.
@@ -247,6 +248,9 @@ void bdrv_drain(BlockDriverState *bs)
 {
 bool busy = true;
 
+if (bs->drv && bs->drv->bdrv_drain) {
+bs->drv->bdrv_drain(bs);
+}
 while (busy) {
 /* Keep iterating */
  bdrv_flush_io_queue(bs);
diff --git a/include/block/block.h b/include/block/block.h
index c4f6eef..ff29133 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -624,8 +624,13 @@ BlockAcctStats *bdrv_get_stats(BlockDriverState *bs);
  *
  * Begin a quiesced section for exclusive access to the BDS, by disabling
  * external request sources including NBD server and device model. Note that
- * this doesn't block timers or coroutines from submitting more requests, which
- * means block_job_pause is still necessary.
+ * this doesn't prevent timers or coroutines from submitting more requests,
+ * which means block_job_pause is still necessary.
+ *
+ * If new I/O requests are submitted after bdrv_drained_begin is called before
+ * bdrv_drained_end, more internal I/O might be going on after the request has
+ * been completed. If you don't want this, you have to issue another bdrv_drain
+ * or use a nested bdrv_drained_begin/end section.
  *
  * This function can be recursive.
  */
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 7c58221..99359b2 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -288,6 +288,12 @@ struct BlockDriver {
  */
 int (*bdrv_probe_geometry)(BlockDriverState *bs, HDGeometry *geo);
 
+/**
+ * Drain and stop any internal sources of requests in the driver, and
+ * remain so until next I/O callback (e.g. bdrv_co_writev) is called.
+ */
+void (*bdrv_drain)(BlockDriverState *bs);
+
 QLIST_ENTRY(BlockDriver) list;
 };
 
-- 
2.4.3




[Qemu-block] [PATCH v3 03/12] dataplane: Mark host notifiers' client type as "external"

2015-10-16 Thread Fam Zheng
They will be excluded by type in the nested event loops in block layer,
so that unwanted events won't be processed there.

Signed-off-by: Fam Zheng 
---
 hw/block/dataplane/virtio-blk.c |  5 ++---
 hw/scsi/virtio-scsi-dataplane.c | 18 --
 2 files changed, 10 insertions(+), 13 deletions(-)

diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index f8716bc..c42ddeb 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -283,7 +283,7 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
 
 /* Get this show started by hooking up our callbacks */
 aio_context_acquire(s->ctx);
-aio_set_event_notifier(s->ctx, >host_notifier, false,
+aio_set_event_notifier(s->ctx, >host_notifier, true,
handle_notify);
 aio_context_release(s->ctx);
 return;
@@ -320,8 +320,7 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
 aio_context_acquire(s->ctx);
 
 /* Stop notifications for new requests from guest */
-aio_set_event_notifier(s->ctx, >host_notifier, false,
-   NULL);
+aio_set_event_notifier(s->ctx, >host_notifier, true, NULL);
 
 /* Drain and switch bs back to the QEMU main loop */
 blk_set_aio_context(s->conf->conf.blk, qemu_get_aio_context());
diff --git a/hw/scsi/virtio-scsi-dataplane.c b/hw/scsi/virtio-scsi-dataplane.c
index d149418..1c188f0 100644
--- a/hw/scsi/virtio-scsi-dataplane.c
+++ b/hw/scsi/virtio-scsi-dataplane.c
@@ -60,8 +60,7 @@ static VirtIOSCSIVring *virtio_scsi_vring_init(VirtIOSCSI *s,
 r = g_slice_new(VirtIOSCSIVring);
 r->host_notifier = *virtio_queue_get_host_notifier(vq);
 r->guest_notifier = *virtio_queue_get_guest_notifier(vq);
-aio_set_event_notifier(s->ctx, >host_notifier, false,
-   handler);
+aio_set_event_notifier(s->ctx, >host_notifier, true, handler);
 
 r->parent = s;
 
@@ -72,8 +71,7 @@ static VirtIOSCSIVring *virtio_scsi_vring_init(VirtIOSCSI *s,
 return r;
 
 fail_vring:
-aio_set_event_notifier(s->ctx, >host_notifier, false,
-   NULL);
+aio_set_event_notifier(s->ctx, >host_notifier, true, NULL);
 k->set_host_notifier(qbus->parent, n, false);
 g_slice_free(VirtIOSCSIVring, r);
 return NULL;
@@ -165,16 +163,16 @@ static void virtio_scsi_clear_aio(VirtIOSCSI *s)
 
 if (s->ctrl_vring) {
 aio_set_event_notifier(s->ctx, >ctrl_vring->host_notifier,
-   false, NULL);
+   true, NULL);
 }
 if (s->event_vring) {
 aio_set_event_notifier(s->ctx, >event_vring->host_notifier,
-   false, NULL);
+   true, NULL);
 }
 if (s->cmd_vrings) {
 for (i = 0; i < vs->conf.num_queues && s->cmd_vrings[i]; i++) {
 aio_set_event_notifier(s->ctx, >cmd_vrings[i]->host_notifier,
-   false, NULL);
+   true, NULL);
 }
 }
 }
@@ -296,12 +294,12 @@ void virtio_scsi_dataplane_stop(VirtIOSCSI *s)
 aio_context_acquire(s->ctx);
 
 aio_set_event_notifier(s->ctx, >ctrl_vring->host_notifier,
-   false, NULL);
+   true, NULL);
 aio_set_event_notifier(s->ctx, >event_vring->host_notifier,
-   false, NULL);
+   true, NULL);
 for (i = 0; i < vs->conf.num_queues; i++) {
 aio_set_event_notifier(s->ctx, >cmd_vrings[i]->host_notifier,
-   false, NULL);
+   true, NULL);
 }
 
 blk_drain_all(); /* ensure there are no in-flight requests */
-- 
2.4.3




Re: [Qemu-block] [PATCH] blkdebug: Don't confuse image as backing file

2015-10-16 Thread Alberto Garcia
On Fri 16 Oct 2015 12:46:04 PM CEST, Fam Zheng wrote:
> The word "backing file" nowadays refers to the backing_hd in the
> external snapshot sense (i.e. bs->backing_hd), instead of the file sense
> (bs->file). Correct the comment to use the right term.
>
> Signed-off-by: Fam Zheng 

Reviewed-by: Alberto Garcia 

Berto



Re: [Qemu-block] [Qemu-devel] [Xen-devel] Question about xen disk unplug support for ahci missed in qemu

2015-10-16 Thread Kevin O'Connor
On Fri, Oct 16, 2015 at 10:06:48AM +0100, Stefano Stabellini wrote:
> On Thu, 15 Oct 2015, Kevin O'Connor wrote:
> > What's the reason for the "stumbling block" that requires the BIOS to
> > tear down the Xen ring prior to the OS being able to replace it?  The
> > BIOS disk calls are all synchronous, so the ring wont be active when
> > the OS brings up its own ring.  Is there some low-level interaction
> > that prevents the OS from just resetting the ring prior to enabling
> > it?
> 
> Xen only exports one PV disk interface for each disk to the guest, and
> each PV interface only supports one frontend -- only SeaBIOS or the OS
> can be connected to one PV disk, not both. In the case of OVMF, we
> handle that by disconnecting the PV frontend in OVMF when
> ExitBootServices is called, so that the OS driver can reconnect later.

Well, there isn't a requirement for both SeaBIOS and the OS to be
connected at the same time - it's fine for the OS to replace SeaBIOS.
With the hardware I'm familiar with (eg, usb, ahci, virtio) the OS
just ends up replacing SeaBIOS' DMA rings when it configures its own.
I guess something in the low-level interface of Xen makes that not
work.

Is plugging/unplugging very high overhead?  Since the SeaBIOS disk
interface is fully synchronous, in theory one could have it
plug/unplug on every read request.

-Kevin



Re: [Qemu-block] [PATCH] blkdebug: Don't confuse image as backing file

2015-10-16 Thread Kevin Wolf
Am 16.10.2015 um 12:46 hat Fam Zheng geschrieben:
> The word "backing file" nowadays refers to the backing_hd in the
> external snapshot sense (i.e. bs->backing_hd), instead of the file sense
> (bs->file). Correct the comment to use the right term.
> 
> Signed-off-by: Fam Zheng 

Thanks, applied to the block branch.

Kevin



Re: [Qemu-block] [Qemu-devel] Question about xen disk unplug support for ahci missed in qemu

2015-10-16 Thread Kevin Wolf
Am 14.10.2015 um 14:48 hat Paul Durrant geschrieben:
> > -Original Message-
> > From: Fabio Fantoni [mailto:fabio.fant...@m2r.biz]
> > Sent: 14 October 2015 12:12
> > To: Kevin Wolf; Stefano Stabellini
> > Cc: John Snow; Anthony Perard; qemu-de...@nongnu.org; xen-
> > de...@lists.xen.org; qemu-block@nongnu.org; Paul Durrant
> > Subject: Re: [Qemu-devel] Question about xen disk unplug support for ahci
> > missed in qemu
> > 
> > 
> > 
> > Il 14/10/2015 11:47, Kevin Wolf ha scritto:
> > > [ CC qemu-block ]
> > >
> > > Am 13.10.2015 um 19:10 hat Stefano Stabellini geschrieben:
> > >> On Tue, 13 Oct 2015, John Snow wrote:
> > >>> On 10/13/2015 11:55 AM, Fabio Fantoni wrote:
> >  I added ahci disk support in libxl and using it for week seems that was
> >  ok, after a reply of Stefano Stabellini seems that xen disk unplug
> >  support only ide disks:
> > 
> > http://git.qemu.org/?p=qemu.git;a=commitdiff;h=679f4f8b178e7c66fbc2f39
> > c905374ee8663d5d8
> > 
> >  Today Paul Durrant told me that even if pv disk is ok also with ahci 
> >  and
> >  the emulated one is offline can be a risk:
> >  http://lists.xenproject.org/archives/html/win-pv-devel/2015-
> > 10/msg00021.html
> > 
> > 
> >  I tried to take a fast look in qemu code but I not understand the
> > needed
> >  thing for add the xen disk unplug support also for ahci, can someone do
> >  it or tell me useful information for do it please?
> > 
> >  Thanks for any reply and sorry for my bad english.
> > 
> > >>> I'm not entirely sure what features you need AHCI to support in order
> > >>> for Xen to be happy.
> > >>>
> > >>> I'd guess hotplugging, but where I get confused is that IDE disks don't
> > >>> support hotplugging either, so I guess I'm not sure sure what you need.
> > >>>
> > >>> Stefano, can you help bridge my Xen knowledge gap?
> > >>
> > >> Hi John,
> > >>
> > >> we need something like hw/i386/xen/xen_platform.c:unplug_disks but
> > that
> > >> can unplug AHCI disk. And by unplug, I mean "make disappear" like
> > >> pci_piix3_xen_ide_unplug does for ide.
> > > Maybe this would be the right time to stop the craziness with your
> > > hybrid IDE/xendisk setup. It's a horrible thing that would never happen
> > > on real hardware.
> 
> Unfortunately, it's going to be difficult to remove such 'craziness' when you 
> don't know a priori whether the VM has PV drivers or not. 

Why wouldn't you know that beforehand? I mean, even on real hardware you
can have different disk interfaces (IDE, AHCI, SCSI) and you install
the exact driver that your hardware needs. You just do the same thing on
VM: If your hardware is PV, you install a PV driver. If your hardware is
IDE, you install an IDE driver. Whether it's PV or IDE is something that
you, the user, decided when configuring the VM, so you definitely know.

Kevin



[Qemu-block] [PULL 16/29] block: Introduce parents list

2015-10-16 Thread Kevin Wolf
Signed-off-by: Kevin Wolf 
Reviewed-by: Max Reitz 
Reviewed-by: Fam Zheng 
Reviewed-by: Stefan Hajnoczi 
---
 block.c   | 3 +++
 include/block/block_int.h | 2 ++
 2 files changed, 5 insertions(+)

diff --git a/block.c b/block.c
index a2d6238..980437f 100644
--- a/block.c
+++ b/block.c
@@ -1090,6 +1090,7 @@ static BdrvChild *bdrv_attach_child(BlockDriverState 
*parent_bs,
 };
 
 QLIST_INSERT_HEAD(_bs->children, child, next);
+QLIST_INSERT_HEAD(_bs->parents, child, next_parent);
 
 return child;
 }
@@ -1097,6 +1098,7 @@ static BdrvChild *bdrv_attach_child(BlockDriverState 
*parent_bs,
 void bdrv_detach_child(BdrvChild *child)
 {
 QLIST_REMOVE(child, next);
+QLIST_REMOVE(child, next_parent);
 g_free(child);
 }
 
@@ -2038,6 +2040,7 @@ static void bdrv_move_reference_fields(BlockDriverState 
*bs_dest,
 /* keep the same entry in bdrv_states */
 bs_dest->device_list = bs_src->device_list;
 bs_dest->blk = bs_src->blk;
+bs_dest->parents = bs_src->parents;
 
 memcpy(bs_dest->op_blockers, bs_src->op_blockers,
sizeof(bs_dest->op_blockers));
diff --git a/include/block/block_int.h b/include/block/block_int.h
index cfcae52..52ea7c0 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -339,6 +339,7 @@ struct BdrvChild {
 BlockDriverState *bs;
 const BdrvChildRole *role;
 QLIST_ENTRY(BdrvChild) next;
+QLIST_ENTRY(BdrvChild) next_parent;
 };
 
 /*
@@ -445,6 +446,7 @@ struct BlockDriverState {
  * parent node of this node. */
 BlockDriverState *inherits_from;
 QLIST_HEAD(, BdrvChild) children;
+QLIST_HEAD(, BdrvChild) parents;
 
 QDict *options;
 BlockdevDetectZeroesOptions detect_zeroes;
-- 
1.8.3.1




[Qemu-block] [PULL 08/29] quorum: Convert to BdrvChild

2015-10-16 Thread Kevin Wolf
Signed-off-by: Kevin Wolf 
Reviewed-by: Max Reitz 
Reviewed-by: Alberto Garcia 
Reviewed-by: Fam Zheng 
Reviewed-by: Jeff Cody 
Reviewed-by: Stefan Hajnoczi 
---
 block/quorum.c | 65 ++
 1 file changed, 34 insertions(+), 31 deletions(-)

diff --git a/block/quorum.c b/block/quorum.c
index 8fe53b4..b9ba028 100644
--- a/block/quorum.c
+++ b/block/quorum.c
@@ -64,7 +64,7 @@ typedef struct QuorumVotes {
 
 /* the following structure holds the state of one quorum instance */
 typedef struct BDRVQuorumState {
-BlockDriverState **bs; /* children BlockDriverStates */
+BdrvChild **children;  /* children BlockDriverStates */
 int num_children;  /* children count */
 int threshold; /* if less than threshold children reads gave the
 * same result a quorum error occurs.
@@ -336,7 +336,7 @@ static void quorum_report_bad_versions(BDRVQuorumState *s,
 continue;
 }
 QLIST_FOREACH(item, >items, next) {
-quorum_report_bad(acb, s->bs[item->index]->node_name, 0);
+quorum_report_bad(acb, s->children[item->index]->bs->node_name, 0);
 }
 }
 }
@@ -369,8 +369,9 @@ static bool quorum_rewrite_bad_versions(BDRVQuorumState *s, 
QuorumAIOCB *acb,
 continue;
 }
 QLIST_FOREACH(item, >items, next) {
-bdrv_aio_writev(s->bs[item->index], acb->sector_num, acb->qiov,
-acb->nb_sectors, quorum_rewrite_aio_cb, acb);
+bdrv_aio_writev(s->children[item->index]->bs, acb->sector_num,
+acb->qiov, acb->nb_sectors, quorum_rewrite_aio_cb,
+acb);
 }
 }
 
@@ -639,13 +640,13 @@ static BlockAIOCB *read_quorum_children(QuorumAIOCB *acb)
 int i;
 
 for (i = 0; i < s->num_children; i++) {
-acb->qcrs[i].buf = qemu_blockalign(s->bs[i], acb->qiov->size);
+acb->qcrs[i].buf = qemu_blockalign(s->children[i]->bs, 
acb->qiov->size);
 qemu_iovec_init(>qcrs[i].qiov, acb->qiov->niov);
 qemu_iovec_clone(>qcrs[i].qiov, acb->qiov, acb->qcrs[i].buf);
 }
 
 for (i = 0; i < s->num_children; i++) {
-bdrv_aio_readv(s->bs[i], acb->sector_num, >qcrs[i].qiov,
+bdrv_aio_readv(s->children[i]->bs, acb->sector_num, >qcrs[i].qiov,
acb->nb_sectors, quorum_aio_cb, >qcrs[i]);
 }
 
@@ -656,12 +657,12 @@ static BlockAIOCB *read_fifo_child(QuorumAIOCB *acb)
 {
 BDRVQuorumState *s = acb->common.bs->opaque;
 
-acb->qcrs[acb->child_iter].buf = qemu_blockalign(s->bs[acb->child_iter],
- acb->qiov->size);
+acb->qcrs[acb->child_iter].buf =
+qemu_blockalign(s->children[acb->child_iter]->bs, acb->qiov->size);
 qemu_iovec_init(>qcrs[acb->child_iter].qiov, acb->qiov->niov);
 qemu_iovec_clone(>qcrs[acb->child_iter].qiov, acb->qiov,
  acb->qcrs[acb->child_iter].buf);
-bdrv_aio_readv(s->bs[acb->child_iter], acb->sector_num,
+bdrv_aio_readv(s->children[acb->child_iter]->bs, acb->sector_num,
>qcrs[acb->child_iter].qiov, acb->nb_sectors,
quorum_aio_cb, >qcrs[acb->child_iter]);
 
@@ -702,8 +703,8 @@ static BlockAIOCB *quorum_aio_writev(BlockDriverState *bs,
 int i;
 
 for (i = 0; i < s->num_children; i++) {
-acb->qcrs[i].aiocb = bdrv_aio_writev(s->bs[i], sector_num, qiov,
- nb_sectors, _aio_cb,
+acb->qcrs[i].aiocb = bdrv_aio_writev(s->children[i]->bs, sector_num,
+ qiov, nb_sectors, _aio_cb,
  >qcrs[i]);
 }
 
@@ -717,12 +718,12 @@ static int64_t quorum_getlength(BlockDriverState *bs)
 int i;
 
 /* check that all file have the same length */
-result = bdrv_getlength(s->bs[0]);
+result = bdrv_getlength(s->children[0]->bs);
 if (result < 0) {
 return result;
 }
 for (i = 1; i < s->num_children; i++) {
-int64_t value = bdrv_getlength(s->bs[i]);
+int64_t value = bdrv_getlength(s->children[i]->bs);
 if (value < 0) {
 return value;
 }
@@ -741,7 +742,7 @@ static void quorum_invalidate_cache(BlockDriverState *bs, 
Error **errp)
 int i;
 
 for (i = 0; i < s->num_children; i++) {
-bdrv_invalidate_cache(s->bs[i], _err);
+bdrv_invalidate_cache(s->children[i]->bs, _err);
 if (local_err) {
 error_propagate(errp, local_err);
 return;
@@ -762,7 +763,7 @@ static coroutine_fn int quorum_co_flush(BlockDriverState 
*bs)
 error_votes.compare = quorum_64bits_compare;
 
 for (i = 0; i < s->num_children; i++) {
-result = bdrv_co_flush(s->bs[i]);
+result = 

[Qemu-block] [PULL 13/29] block: Split bdrv_move_feature_fields()

2015-10-16 Thread Kevin Wolf
After bdrv_swap(), some fields must be moved back to their original BDS
to compensate for the effects that a swap of the contents of the objects
has while keeping the old addresses. Other fields must be moved back
because they should logically be moved and must stay on top

When replacing bdrv_swap() with operations changing the pointers in the
parents, we only need the latter and must avoid swapping the former.
Split the function accordingly.

Signed-off-by: Kevin Wolf 
Reviewed-by: Max Reitz 
Reviewed-by: Fam Zheng 
Reviewed-by: Alberto Garcia 
Reviewed-by: Stefan Hajnoczi 
---
 block.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/block.c b/block.c
index a9c7ea6..a2d6238 100644
--- a/block.c
+++ b/block.c
@@ -1985,6 +1985,8 @@ static void bdrv_rebind(BlockDriverState *bs)
 }
 }
 
+/* Fields that need to stay with the top-level BDS, no matter whether the
+ * address of the top-level BDS stays the same or not. */
 static void bdrv_move_feature_fields(BlockDriverState *bs_dest,
  BlockDriverState *bs_src)
 {
@@ -2020,7 +2022,13 @@ static void bdrv_move_feature_fields(BlockDriverState 
*bs_dest,
 
 /* dirty bitmap */
 bs_dest->dirty_bitmaps  = bs_src->dirty_bitmaps;
+}
 
+/* Fields that only need to be swapped if the contents of BDSes is swapped
+ * rather than pointers being changed in the parents. */
+static void bdrv_move_reference_fields(BlockDriverState *bs_dest,
+   BlockDriverState *bs_src)
+{
 /* reference count */
 bs_dest->refcnt = bs_src->refcnt;
 
@@ -2091,6 +2099,10 @@ void bdrv_swap(BlockDriverState *bs_new, 
BlockDriverState *bs_old)
 bdrv_move_feature_fields(bs_old, bs_new);
 bdrv_move_feature_fields(bs_new, );
 
+bdrv_move_reference_fields(, bs_old);
+bdrv_move_reference_fields(bs_old, bs_new);
+bdrv_move_reference_fields(bs_new, );
+
 /* bs_new must remain unattached */
 assert(!bs_new->blk);
 
-- 
1.8.3.1




[Qemu-block] [PULL 05/29] block: Introduce BDS.file_child

2015-10-16 Thread Kevin Wolf
Store the BdrvChild for bs->file. At this point, bs->file_child->bs just
duplicates the bs->file pointer. Later, it will completely replace it.

Signed-off-by: Kevin Wolf 
Reviewed-by: Max Reitz 
Reviewed-by: Alberto Garcia 
Reviewed-by: Fam Zheng 
Reviewed-by: Jeff Cody 
Reviewed-by: Stefan Hajnoczi 
---
 block.c   | 12 +---
 include/block/block_int.h |  1 +
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/block.c b/block.c
index 1f90b47..75afed1 100644
--- a/block.c
+++ b/block.c
@@ -1487,11 +1487,17 @@ static int bdrv_open_inherit(BlockDriverState **pbs, 
const char *filename,
 
 assert(file == NULL);
 bs->open_flags = flags;
-ret = bdrv_open_image(, filename, options, "file",
-  bs, _file, true, _err);
-if (ret < 0) {
+
+bs->file_child = bdrv_open_child(filename, options, "file", bs,
+ _file, true, _err);
+if (local_err) {
+ret = -EINVAL;
 goto fail;
 }
+
+if (bs->file_child) {
+file = bs->file_child->bs;
+}
 }
 
 /* Image format probing */
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 14ad4c3..d0dd93e 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -381,6 +381,7 @@ struct BlockDriverState {
 BlockDriverState *backing_hd;
 BdrvChild *backing_child;
 BlockDriverState *file;
+BdrvChild *file_child;
 
 NotifierList close_notifiers;
 
-- 
1.8.3.1




[Qemu-block] [PULL 17/29] block: Implement bdrv_append() without bdrv_swap()

2015-10-16 Thread Kevin Wolf
Remember all parent nodes and just change the pointers there instead of
swapping the contents of the BlockDriverState.

Handling of snapshot=on must be moved further down in bdrv_open()
because *pbs (which is the bs pointer in the BlockBackend) must already
be set before bdrv_append() is called. Otherwise bdrv_append() changes
the BB's pointer to the temporary snapshot, but bdrv_open() overwrites
it with the read-only original image.

We also need to be careful to update callers as the interface changes
(becomes less insane): Previously, the meaning of the two parameters was
inverted when bdrv_append() returns. Now any BDS pointers keep pointing
to the same node.

Signed-off-by: Kevin Wolf 
Reviewed-by: Max Reitz 
Reviewed-by: Fam Zheng 
Reviewed-by: Stefan Hajnoczi 
---
 block.c| 112 +
 blockdev.c |   2 +-
 2 files changed, 85 insertions(+), 29 deletions(-)

diff --git a/block.c b/block.c
index 980437f..b4d2313 100644
--- a/block.c
+++ b/block.c
@@ -1516,15 +1516,6 @@ static int bdrv_open_inherit(BlockDriverState **pbs, 
const char *filename,
 
 bdrv_refresh_filename(bs);
 
-/* For snapshot=on, create a temporary qcow2 overlay. bs points to the
- * temporary snapshot afterwards. */
-if (snapshot_flags) {
-ret = bdrv_append_temp_snapshot(bs, snapshot_flags, _err);
-if (local_err) {
-goto close_and_fail;
-}
-}
-
 /* Check if any unknown options were used */
 if (options && (qdict_size(options) != 0)) {
 const QDictEntry *entry = qdict_first(options);
@@ -1556,6 +1547,16 @@ static int bdrv_open_inherit(BlockDriverState **pbs, 
const char *filename,
 
 QDECREF(options);
 *pbs = bs;
+
+/* For snapshot=on, create a temporary qcow2 overlay. bs points to the
+ * temporary snapshot afterwards. */
+if (snapshot_flags) {
+ret = bdrv_append_temp_snapshot(bs, snapshot_flags, _err);
+if (local_err) {
+goto close_and_fail;
+}
+}
+
 return 0;
 
 fail:
@@ -2000,20 +2001,6 @@ static void bdrv_move_feature_fields(BlockDriverState 
*bs_dest,
 
 bs_dest->enable_write_cache = bs_src->enable_write_cache;
 
-/* i/o throttled req */
-bs_dest->throttle_state = bs_src->throttle_state,
-bs_dest->io_limits_enabled  = bs_src->io_limits_enabled;
-bs_dest->pending_reqs[0]= bs_src->pending_reqs[0];
-bs_dest->pending_reqs[1]= bs_src->pending_reqs[1];
-bs_dest->throttled_reqs[0]  = bs_src->throttled_reqs[0];
-bs_dest->throttled_reqs[1]  = bs_src->throttled_reqs[1];
-memcpy(_dest->round_robin,
-   _src->round_robin,
-   sizeof(bs_dest->round_robin));
-memcpy(_dest->throttle_timers,
-   _src->throttle_timers,
-   sizeof(ThrottleTimers));
-
 /* r/w error */
 bs_dest->on_read_error  = bs_src->on_read_error;
 bs_dest->on_write_error = bs_src->on_write_error;
@@ -2027,10 +2014,25 @@ static void bdrv_move_feature_fields(BlockDriverState 
*bs_dest,
 }
 
 /* Fields that only need to be swapped if the contents of BDSes is swapped
- * rather than pointers being changed in the parents. */
+ * rather than pointers being changed in the parents, and throttling fields
+ * because only bdrv_swap() messes with internals of throttling. */
 static void bdrv_move_reference_fields(BlockDriverState *bs_dest,
BlockDriverState *bs_src)
 {
+/* i/o throttled req */
+bs_dest->throttle_state = bs_src->throttle_state,
+bs_dest->io_limits_enabled  = bs_src->io_limits_enabled;
+bs_dest->pending_reqs[0]= bs_src->pending_reqs[0];
+bs_dest->pending_reqs[1]= bs_src->pending_reqs[1];
+bs_dest->throttled_reqs[0]  = bs_src->throttled_reqs[0];
+bs_dest->throttled_reqs[1]  = bs_src->throttled_reqs[1];
+memcpy(_dest->round_robin,
+   _src->round_robin,
+   sizeof(bs_dest->round_robin));
+memcpy(_dest->throttle_timers,
+   _src->throttle_timers,
+   sizeof(ThrottleTimers));
+
 /* reference count */
 bs_dest->refcnt = bs_src->refcnt;
 
@@ -2156,6 +2158,45 @@ void bdrv_swap(BlockDriverState *bs_new, 
BlockDriverState *bs_old)
 bdrv_rebind(bs_old);
 }
 
+static void change_parent_backing_link(BlockDriverState *from,
+   BlockDriverState *to)
+{
+BdrvChild *c, *next;
+
+QLIST_FOREACH_SAFE(c, >parents, next_parent, next) {
+assert(c->role != _backing);
+c->bs = to;
+QLIST_REMOVE(c, next_parent);
+QLIST_INSERT_HEAD(>parents, c, next_parent);
+bdrv_ref(to);
+bdrv_unref(from);
+}
+if (from->blk) {
+blk_set_bs(from->blk, to);
+if (!to->device_list.tqe_prev) {
+QTAILQ_INSERT_BEFORE(from, to, device_list);
+}
+QTAILQ_REMOVE(_states, 

[Qemu-block] [PULL 14/29] block/io: Make bdrv_requests_pending() public

2015-10-16 Thread Kevin Wolf
Signed-off-by: Kevin Wolf 
Reviewed-by: Max Reitz 
Reviewed-by: Alberto Garcia 
Reviewed-by: Fam Zheng 
Reviewed-by: Stefan Hajnoczi 
---
 block/io.c| 2 +-
 include/block/block_int.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/block/io.c b/block/io.c
index e094694..5311473 100644
--- a/block/io.c
+++ b/block/io.c
@@ -213,7 +213,7 @@ void bdrv_disable_copy_on_read(BlockDriverState *bs)
 }
 
 /* Check if any requests are in-flight (including throttled requests) */
-static bool bdrv_requests_pending(BlockDriverState *bs)
+bool bdrv_requests_pending(BlockDriverState *bs)
 {
 if (!QLIST_EMPTY(>tracked_requests)) {
 return true;
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 90971c0..4598101 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -667,5 +667,6 @@ bool blk_dev_is_medium_locked(BlockBackend *blk);
 void blk_dev_resize_cb(BlockBackend *blk);
 
 void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector, int nr_sectors);
+bool bdrv_requests_pending(BlockDriverState *bs);
 
 #endif /* BLOCK_INT_H */
-- 
1.8.3.1




[Qemu-block] [PULL 10/29] block: Remove bdrv_open_image()

2015-10-16 Thread Kevin Wolf
It is unused now.

Signed-off-by: Kevin Wolf 
Reviewed-by: Max Reitz 
Reviewed-by: Alberto Garcia 
Reviewed-by: Fam Zheng 
Reviewed-by: Jeff Cody 
Reviewed-by: Stefan Hajnoczi 
---
 block.c   | 34 --
 include/block/block.h |  4 
 2 files changed, 38 deletions(-)

diff --git a/block.c b/block.c
index 8fd345b..33ecd93 100644
--- a/block.c
+++ b/block.c
@@ -1279,40 +1279,6 @@ done:
 return c;
 }
 
-/*
- * This is a version of bdrv_open_child() that returns 0/-EINVAL instead of
- * a BdrvChild object.
- *
- * If allow_none is true, no image will be opened if filename is false and no
- * BlockdevRef is given. *pbs will remain unchanged and 0 will be returned.
- *
- * To conform with the behavior of bdrv_open(), *pbs has to be NULL.
- */
-int bdrv_open_image(BlockDriverState **pbs, const char *filename,
-QDict *options, const char *bdref_key,
-BlockDriverState* parent, const BdrvChildRole *child_role,
-bool allow_none, Error **errp)
-{
-Error *local_err = NULL;
-BdrvChild *c;
-
-assert(pbs);
-assert(*pbs == NULL);
-
-c = bdrv_open_child(filename, options, bdref_key, parent, child_role,
-allow_none, _err);
-if (local_err) {
-error_propagate(errp, local_err);
-return -EINVAL;
-}
-
-if (c != NULL) {
-*pbs = c->bs;
-}
-
-return 0;
-}
-
 int bdrv_append_temp_snapshot(BlockDriverState *bs, int flags, Error **errp)
 {
 /* TODO: extra byte is a hack to ensure MAX_PATH space on Windows. */
diff --git a/include/block/block.h b/include/block/block.h
index c5d9620..20d5810 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -205,10 +205,6 @@ void bdrv_swap(BlockDriverState *bs_new, BlockDriverState 
*bs_old);
 void bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top);
 int bdrv_parse_cache_flags(const char *mode, int *flags);
 int bdrv_parse_discard_flags(const char *mode, int *flags);
-int bdrv_open_image(BlockDriverState **pbs, const char *filename,
-QDict *options, const char *bdref_key,
-BlockDriverState* parent, const BdrvChildRole *child_role,
-bool allow_none, Error **errp);
 BdrvChild *bdrv_open_child(const char *filename,
QDict *options, const char *bdref_key,
BlockDriverState* parent,
-- 
1.8.3.1




[Qemu-block] [PULL 07/29] blkverify: Convert s->test_file to BdrvChild

2015-10-16 Thread Kevin Wolf
Signed-off-by: Kevin Wolf 
Reviewed-by: Max Reitz 
Reviewed-by: Alberto Garcia 
Reviewed-by: Fam Zheng 
Reviewed-by: Jeff Cody 
Reviewed-by: Stefan Hajnoczi 
---
 block/blkverify.c | 41 +
 1 file changed, 21 insertions(+), 20 deletions(-)

diff --git a/block/blkverify.c b/block/blkverify.c
index d277e63..6b71622 100644
--- a/block/blkverify.c
+++ b/block/blkverify.c
@@ -14,7 +14,7 @@
 #include "qapi/qmp/qstring.h"
 
 typedef struct {
-BlockDriverState *test_file;
+BdrvChild *test_file;
 } BDRVBlkverifyState;
 
 typedef struct BlkverifyAIOCB BlkverifyAIOCB;
@@ -132,12 +132,12 @@ static int blkverify_open(BlockDriverState *bs, QDict 
*options, int flags,
 }
 
 /* Open the test file */
-assert(s->test_file == NULL);
-ret = bdrv_open_image(>test_file, qemu_opt_get(opts, "x-image"), 
options,
-  "test", bs, _format, false, _err);
-if (ret < 0) {
+s->test_file = bdrv_open_child(qemu_opt_get(opts, "x-image"), options,
+   "test", bs, _format, false,
+   _err);
+if (local_err) {
+ret = -EINVAL;
 error_propagate(errp, local_err);
-s->test_file = NULL;
 goto fail;
 }
 
@@ -151,7 +151,7 @@ static void blkverify_close(BlockDriverState *bs)
 {
 BDRVBlkverifyState *s = bs->opaque;
 
-bdrv_unref(s->test_file);
+bdrv_unref_child(bs, s->test_file);
 s->test_file = NULL;
 }
 
@@ -159,7 +159,7 @@ static int64_t blkverify_getlength(BlockDriverState *bs)
 {
 BDRVBlkverifyState *s = bs->opaque;
 
-return bdrv_getlength(s->test_file);
+return bdrv_getlength(s->test_file->bs);
 }
 
 static BlkverifyAIOCB *blkverify_aio_get(BlockDriverState *bs, bool is_write,
@@ -242,7 +242,7 @@ static BlockAIOCB *blkverify_aio_readv(BlockDriverState *bs,
 qemu_iovec_init(>raw_qiov, acb->qiov->niov);
 qemu_iovec_clone(>raw_qiov, qiov, acb->buf);
 
-bdrv_aio_readv(s->test_file, sector_num, qiov, nb_sectors,
+bdrv_aio_readv(s->test_file->bs, sector_num, qiov, nb_sectors,
blkverify_aio_cb, acb);
 bdrv_aio_readv(bs->file, sector_num, >raw_qiov, nb_sectors,
blkverify_aio_cb, acb);
@@ -257,7 +257,7 @@ static BlockAIOCB *blkverify_aio_writev(BlockDriverState 
*bs,
 BlkverifyAIOCB *acb = blkverify_aio_get(bs, true, sector_num, qiov,
 nb_sectors, cb, opaque);
 
-bdrv_aio_writev(s->test_file, sector_num, qiov, nb_sectors,
+bdrv_aio_writev(s->test_file->bs, sector_num, qiov, nb_sectors,
 blkverify_aio_cb, acb);
 bdrv_aio_writev(bs->file, sector_num, qiov, nb_sectors,
 blkverify_aio_cb, acb);
@@ -271,7 +271,7 @@ static BlockAIOCB *blkverify_aio_flush(BlockDriverState *bs,
 BDRVBlkverifyState *s = bs->opaque;
 
 /* Only flush test file, the raw file is not important */
-return bdrv_aio_flush(s->test_file, cb, opaque);
+return bdrv_aio_flush(s->test_file->bs, cb, opaque);
 }
 
 static bool blkverify_recurse_is_first_non_filter(BlockDriverState *bs,
@@ -285,7 +285,7 @@ static bool 
blkverify_recurse_is_first_non_filter(BlockDriverState *bs,
 return true;
 }
 
-return bdrv_recurse_is_first_non_filter(s->test_file, candidate);
+return bdrv_recurse_is_first_non_filter(s->test_file->bs, candidate);
 }
 
 /* Propagate AioContext changes to ->test_file */
@@ -293,7 +293,7 @@ static void blkverify_detach_aio_context(BlockDriverState 
*bs)
 {
 BDRVBlkverifyState *s = bs->opaque;
 
-bdrv_detach_aio_context(s->test_file);
+bdrv_detach_aio_context(s->test_file->bs);
 }
 
 static void blkverify_attach_aio_context(BlockDriverState *bs,
@@ -301,7 +301,7 @@ static void blkverify_attach_aio_context(BlockDriverState 
*bs,
 {
 BDRVBlkverifyState *s = bs->opaque;
 
-bdrv_attach_aio_context(s->test_file, new_context);
+bdrv_attach_aio_context(s->test_file->bs, new_context);
 }
 
 static void blkverify_refresh_filename(BlockDriverState *bs)
@@ -309,24 +309,25 @@ static void blkverify_refresh_filename(BlockDriverState 
*bs)
 BDRVBlkverifyState *s = bs->opaque;
 
 /* bs->file has already been refreshed */
-bdrv_refresh_filename(s->test_file);
+bdrv_refresh_filename(s->test_file->bs);
 
-if (bs->file->full_open_options && s->test_file->full_open_options) {
+if (bs->file->full_open_options && s->test_file->bs->full_open_options) {
 QDict *opts = qdict_new();
 qdict_put_obj(opts, "driver", QOBJECT(qstring_from_str("blkverify")));
 
 QINCREF(bs->file->full_open_options);
 qdict_put_obj(opts, "raw", QOBJECT(bs->file->full_open_options));
-QINCREF(s->test_file->full_open_options);
-qdict_put_obj(opts, "test", QOBJECT(s->test_file->full_open_options));
+

[Qemu-block] [PULL 06/29] vmdk: Use BdrvChild instead of BDS for references to extents

2015-10-16 Thread Kevin Wolf
Signed-off-by: Kevin Wolf 
Reviewed-by: Max Reitz 
Reviewed-by: Alberto Garcia 
Reviewed-by: Fam Zheng 
Reviewed-by: Jeff Cody 
Reviewed-by: Stefan Hajnoczi 
---
 block/vmdk.c | 99 +++-
 1 file changed, 51 insertions(+), 48 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index be0d640..9702132 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -87,7 +87,7 @@ typedef struct {
 #define L2_CACHE_SIZE 16
 
 typedef struct VmdkExtent {
-BlockDriverState *file;
+BdrvChild *file;
 bool flat;
 bool compressed;
 bool has_marker;
@@ -221,8 +221,8 @@ static void vmdk_free_extents(BlockDriverState *bs)
 g_free(e->l2_cache);
 g_free(e->l1_backup_table);
 g_free(e->type);
-if (e->file != bs->file) {
-bdrv_unref(e->file);
+if (e->file != bs->file_child) {
+bdrv_unref_child(bs, e->file);
 }
 }
 g_free(s->extents);
@@ -367,7 +367,7 @@ static int vmdk_parent_open(BlockDriverState *bs)
 /* Create and append extent to the extent array. Return the added VmdkExtent
  * address. return NULL if allocation failed. */
 static int vmdk_add_extent(BlockDriverState *bs,
-   BlockDriverState *file, bool flat, int64_t sectors,
+   BdrvChild *file, bool flat, int64_t sectors,
int64_t l1_offset, int64_t l1_backup_offset,
uint32_t l1_size,
int l2_size, uint64_t cluster_sectors,
@@ -392,7 +392,7 @@ static int vmdk_add_extent(BlockDriverState *bs,
 return -EFBIG;
 }
 
-nb_sectors = bdrv_nb_sectors(file);
+nb_sectors = bdrv_nb_sectors(file->bs);
 if (nb_sectors < 0) {
 return nb_sectors;
 }
@@ -439,14 +439,14 @@ static int vmdk_init_tables(BlockDriverState *bs, 
VmdkExtent *extent,
 return -ENOMEM;
 }
 
-ret = bdrv_pread(extent->file,
+ret = bdrv_pread(extent->file->bs,
  extent->l1_table_offset,
  extent->l1_table,
  l1_size);
 if (ret < 0) {
 error_setg_errno(errp, -ret,
  "Could not read l1 table from extent '%s'",
- extent->file->filename);
+ extent->file->bs->filename);
 goto fail_l1;
 }
 for (i = 0; i < extent->l1_size; i++) {
@@ -459,14 +459,14 @@ static int vmdk_init_tables(BlockDriverState *bs, 
VmdkExtent *extent,
 ret = -ENOMEM;
 goto fail_l1;
 }
-ret = bdrv_pread(extent->file,
+ret = bdrv_pread(extent->file->bs,
  extent->l1_backup_table_offset,
  extent->l1_backup_table,
  l1_size);
 if (ret < 0) {
 error_setg_errno(errp, -ret,
  "Could not read l1 backup table from extent '%s'",
- extent->file->filename);
+ extent->file->bs->filename);
 goto fail_l1b;
 }
 for (i = 0; i < extent->l1_size; i++) {
@@ -485,7 +485,7 @@ static int vmdk_init_tables(BlockDriverState *bs, 
VmdkExtent *extent,
 }
 
 static int vmdk_open_vmfs_sparse(BlockDriverState *bs,
- BlockDriverState *file,
+ BdrvChild *file,
  int flags, Error **errp)
 {
 int ret;
@@ -493,11 +493,11 @@ static int vmdk_open_vmfs_sparse(BlockDriverState *bs,
 VMDK3Header header;
 VmdkExtent *extent;
 
-ret = bdrv_pread(file, sizeof(magic), , sizeof(header));
+ret = bdrv_pread(file->bs, sizeof(magic), , sizeof(header));
 if (ret < 0) {
 error_setg_errno(errp, -ret,
  "Could not read header from file '%s'",
- file->filename);
+ file->bs->filename);
 return ret;
 }
 ret = vmdk_add_extent(bs, file, false,
@@ -559,7 +559,7 @@ static char *vmdk_read_desc(BlockDriverState *file, 
uint64_t desc_offset,
 }
 
 static int vmdk_open_vmdk4(BlockDriverState *bs,
-   BlockDriverState *file,
+   BdrvChild *file,
int flags, QDict *options, Error **errp)
 {
 int ret;
@@ -570,17 +570,17 @@ static int vmdk_open_vmdk4(BlockDriverState *bs,
 BDRVVmdkState *s = bs->opaque;
 int64_t l1_backup_offset = 0;
 
-ret = bdrv_pread(file, sizeof(magic), , sizeof(header));
+ret = bdrv_pread(file->bs, sizeof(magic), , sizeof(header));
 if (ret < 0) {
 error_setg_errno(errp, -ret,
  "Could not read header from file '%s'",
- file->filename);
+ file->bs->filename);
 

[Qemu-block] [PULL 23/29] util - add automated ID generation utility

2015-10-16 Thread Kevin Wolf
From: Jeff Cody 

Multiple sub-systems in QEMU may find it useful to generate IDs
for objects that a user may reference via QMP or HMP.  This patch
presents a standardized way to do it, so that automatic ID generation
follows the same rules.

This patch enforces the following rules when generating an ID:

1.) Guarantee no collisions with a user-specified ID
2.) Identify the sub-system the ID belongs to
3.) Guarantee of uniqueness
4.) Spoiling predictability, to avoid creating an assumption
of object ordering and parsing (i.e., we don't want users to think
they can guess the next ID based on prior behavior).

The scheme for this is as follows (no spaces):

# subsys D RR
Reserved char --||   | |
Subsystem String |   | |
Unique number (64-bit) --| |
Two-digit random number ---|

For example, a generated node-name for the block sub-system may look
like this:

#block076

The caller of id_generate() is responsible for freeing the generated
node name string with g_free().

Reviewed-by: John Snow 
Reviewed-by: Eric Blake 
Reviewed-by: Alberto Garcia 
Signed-off-by: Jeff Cody 
Signed-off-by: Kevin Wolf 
---
 include/qemu-common.h |  8 
 util/id.c | 37 +
 2 files changed, 45 insertions(+)

diff --git a/include/qemu-common.h b/include/qemu-common.h
index 0bd212b..2f74540 100644
--- a/include/qemu-common.h
+++ b/include/qemu-common.h
@@ -246,6 +246,14 @@ int64_t qemu_strtosz_suffix_unit(const char *nptr, char 
**end,
 #define STR_OR_NULL(str) ((str) ? (str) : "null")
 
 /* id.c */
+
+typedef enum IdSubSystems {
+ID_QDEV,
+ID_BLOCK,
+ID_MAX  /* last element, used as array size */
+} IdSubSystems;
+
+char *id_generate(IdSubSystems id);
 bool id_wellformed(const char *id);
 
 /* path.c */
diff --git a/util/id.c b/util/id.c
index 09b22fb..bcc64d8 100644
--- a/util/id.c
+++ b/util/id.c
@@ -26,3 +26,40 @@ bool id_wellformed(const char *id)
 }
 return true;
 }
+
+#define ID_SPECIAL_CHAR '#'
+
+static const char *const id_subsys_str[] = {
+[ID_QDEV]  = "qdev",
+[ID_BLOCK] = "block",
+};
+
+/*
+ *  Generates an ID of the form PREFIX SUBSYSTEM NUMBER
+ *  where:
+ *
+ *  - PREFIX is the reserved character '#'
+ *  - SUBSYSTEM identifies the subsystem creating the ID
+ *  - NUMBER is a decimal number unique within SUBSYSTEM.
+ *
+ *Example: "#block146"
+ *
+ * Note that these IDs do not satisfy id_wellformed().
+ *
+ * The caller is responsible for freeing the returned string with g_free()
+ */
+char *id_generate(IdSubSystems id)
+{
+static uint64_t id_counters[ID_MAX];
+uint32_t rnd;
+
+assert(id < ID_MAX);
+assert(id_subsys_str[id]);
+
+rnd = g_random_int_range(0, 100);
+
+return g_strdup_printf("%c%s%" PRIu64 "%02" PRId32, ID_SPECIAL_CHAR,
+id_subsys_str[id],
+id_counters[id]++,
+rnd);
+}
-- 
1.8.3.1




[Qemu-block] [PULL 22/29] blkverify: Fix BDS leak in .bdrv_open error path

2015-10-16 Thread Kevin Wolf
Signed-off-by: Kevin Wolf 
Reviewed-by: Jeff Cody 
---
 block/blkverify.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/block/blkverify.c b/block/blkverify.c
index f8655ad..c5f8e8d 100644
--- a/block/blkverify.c
+++ b/block/blkverify.c
@@ -143,6 +143,9 @@ static int blkverify_open(BlockDriverState *bs, QDict 
*options, int flags,
 
 ret = 0;
 fail:
+if (ret < 0) {
+bdrv_unref_child(bs, bs->file);
+}
 qemu_opts_del(opts);
 return ret;
 }
-- 
1.8.3.1




[Qemu-block] [PULL 24/29] block: auto-generated node-names

2015-10-16 Thread Kevin Wolf
From: Jeff Cody 

If a node-name is not specified, automatically generate the node-name.

Generated node-names will use the "block" sub-system identifier.

Signed-off-by: Jeff Cody 
Signed-off-by: Kevin Wolf 
---
 block.c  | 19 ---
 tests/qemu-iotests/041   |  4 ++--
 tests/qemu-iotests/051   |  3 ++-
 tests/qemu-iotests/051.out   |  2 +-
 tests/qemu-iotests/067   |  3 ++-
 tests/qemu-iotests/067.out   |  5 +
 tests/qemu-iotests/081   |  3 ++-
 tests/qemu-iotests/081.out   |  2 +-
 tests/qemu-iotests/common.filter |  5 +
 9 files changed, 32 insertions(+), 14 deletions(-)

diff --git a/block.c b/block.c
index 6490040..09f2a75 100644
--- a/block.c
+++ b/block.c
@@ -763,12 +763,15 @@ static void bdrv_assign_node_name(BlockDriverState *bs,
   const char *node_name,
   Error **errp)
 {
-if (!node_name) {
-return;
-}
+char *gen_node_name = NULL;
 
-/* Check for empty string or invalid characters */
-if (!id_wellformed(node_name)) {
+if (!node_name) {
+node_name = gen_node_name = id_generate(ID_BLOCK);
+} else if (!id_wellformed(node_name)) {
+/*
+ * Check for empty string or invalid characters, but not if it is
+ * generated (generated names use characters not available to the user)
+ */
 error_setg(errp, "Invalid node name");
 return;
 }
@@ -777,18 +780,20 @@ static void bdrv_assign_node_name(BlockDriverState *bs,
 if (blk_by_name(node_name)) {
 error_setg(errp, "node-name=%s is conflicting with a device id",
node_name);
-return;
+goto out;
 }
 
 /* takes care of avoiding duplicates node names */
 if (bdrv_find_node(node_name)) {
 error_setg(errp, "Duplicate node name");
-return;
+goto out;
 }
 
 /* copy node name into the bs and insert it into the graph list */
 pstrcpy(bs->node_name, sizeof(bs->node_name), node_name);
 QTAILQ_INSERT_TAIL(_bdrv_states, bs, node_list);
+out:
+g_free(gen_node_name);
 }
 
 static QemuOptsList bdrv_runtime_opts = {
diff --git a/tests/qemu-iotests/041 b/tests/qemu-iotests/041
index 59c1a76..05b5962 100755
--- a/tests/qemu-iotests/041
+++ b/tests/qemu-iotests/041
@@ -780,7 +780,7 @@ class TestRepairQuorum(iotests.QMPTestCase):
 # here we check that the last registered quorum file has not been
 # swapped out and unref
 result = self.vm.qmp('query-named-block-nodes')
-self.assert_qmp(result, 'return[0]/file', quorum_img3)
+self.assert_qmp(result, 'return[1]/file', quorum_img3)
 self.vm.shutdown()
 
 def test_cancel_after_ready(self):
@@ -799,7 +799,7 @@ class TestRepairQuorum(iotests.QMPTestCase):
 result = self.vm.qmp('query-named-block-nodes')
 # here we check that the last registered quorum file has not been
 # swapped out and unref
-self.assert_qmp(result, 'return[0]/file', quorum_img3)
+self.assert_qmp(result, 'return[1]/file', quorum_img3)
 self.vm.shutdown()
 self.assertTrue(iotests.compare_images(quorum_img2, quorum_repair_img),
 'target image does not match source after mirroring')
diff --git a/tests/qemu-iotests/051 b/tests/qemu-iotests/051
index 4a8055b..17dbf04 100755
--- a/tests/qemu-iotests/051
+++ b/tests/qemu-iotests/051
@@ -108,7 +108,8 @@ echo
 echo === Overriding backing file ===
 echo
 
-echo "info block" | run_qemu -drive 
file="$TEST_IMG",driver=qcow2,backing.file.filename="$TEST_IMG.orig" -nodefaults
+echo "info block" | run_qemu -drive 
file="$TEST_IMG",driver=qcow2,backing.file.filename="$TEST_IMG.orig" 
-nodefaults\
+  | _filter_generated_node_ids
 
 # Drivers that don't support backing files
 run_qemu -drive 
file="$TEST_IMG",driver=raw,backing.file.filename="$TEST_IMG.orig"
diff --git a/tests/qemu-iotests/051.out b/tests/qemu-iotests/051.out
index 0429be2..7765aa0 100644
--- a/tests/qemu-iotests/051.out
+++ b/tests/qemu-iotests/051.out
@@ -59,7 +59,7 @@ QEMU X.Y.Z monitor - type 'help' for more information
 Testing: -drive 
file=TEST_DIR/t.qcow2,driver=qcow2,backing.file.filename=TEST_DIR/t.qcow2.orig 
-nodefaults
 QEMU X.Y.Z monitor - type 'help' for more information
 (qemu) iininfinfoinfo 
info binfo 
blinfo bloinfo 
blocinfo block
-ide0-hd0: TEST_DIR/t.qcow2 (qcow2)
+ide0-hd0 (NODE_NAME): TEST_DIR/t.qcow2 (qcow2)
 Cache mode:   writeback
 Backing file: TEST_DIR/t.qcow2.orig (chain depth: 1)
 (qemu) qququiquit
diff --git a/tests/qemu-iotests/067 b/tests/qemu-iotests/067
index 3e9a053..3788534 100755

[Qemu-block] [PULL 12/29] block: Manage backing file references in bdrv_set_backing_hd()

2015-10-16 Thread Kevin Wolf
This simplifies the code somewhat, especially when dropping whole
backing file subchains.

The exception is the mirroring code that does adventurous things with
bdrv_swap() and in order to keep it working, I had to duplicate most of
bdrv_set_backing_hd() locally. We'll get rid again of this ugliness
shortly.

Signed-off-by: Kevin Wolf 
Reviewed-by: Alberto Garcia 
Reviewed-by: Max Reitz 
Reviewed-by: Stefan Hajnoczi 
---
 block.c   | 68 ++-
 block/mirror.c| 16 +---
 block/stream.c| 30 +--
 block/vvfat.c |  6 -
 include/block/block.h |  1 +
 5 files changed, 37 insertions(+), 84 deletions(-)

diff --git a/block.c b/block.c
index ecc0885..a9c7ea6 100644
--- a/block.c
+++ b/block.c
@@ -1094,7 +1094,7 @@ static BdrvChild *bdrv_attach_child(BlockDriverState 
*parent_bs,
 return child;
 }
 
-static void bdrv_detach_child(BdrvChild *child)
+void bdrv_detach_child(BdrvChild *child)
 {
 QLIST_REMOVE(child, next);
 g_free(child);
@@ -1112,13 +1112,20 @@ void bdrv_unref_child(BlockDriverState *parent, 
BdrvChild *child)
 bdrv_unref(child_bs);
 }
 
+/*
+ * Sets the backing file link of a BDS. A new reference is created; callers
+ * which don't need their own reference any more must call bdrv_unref().
+ */
 void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd)
 {
+if (backing_hd) {
+bdrv_ref(backing_hd);
+}
 
 if (bs->backing) {
 assert(bs->backing_blocker);
 bdrv_op_unblock_all(bs->backing->bs, bs->backing_blocker);
-bdrv_detach_child(bs->backing);
+bdrv_unref_child(bs, bs->backing);
 } else if (backing_hd) {
 error_setg(>backing_blocker,
"node is used as backing hd of '%s'",
@@ -1214,7 +1221,10 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict 
*options, Error **errp)
 goto free_exit;
 }
 
+/* Hook up the backing file link; drop our reference, bs owns the
+ * backing_hd reference now */
 bdrv_set_backing_hd(bs, backing_hd);
+bdrv_unref(backing_hd);
 
 free_exit:
 g_free(backing_filename);
@@ -1891,11 +1901,7 @@ void bdrv_close(BlockDriverState *bs)
 bs->drv->bdrv_close(bs);
 bs->drv = NULL;
 
-if (bs->backing) {
-BlockDriverState *backing_hd = bs->backing->bs;
-bdrv_set_backing_hd(bs, NULL);
-bdrv_unref(backing_hd);
-}
+bdrv_set_backing_hd(bs, NULL);
 
 if (bs->file != NULL) {
 bdrv_unref_child(bs, bs->file);
@@ -2378,12 +2384,6 @@ BlockDriverState *bdrv_find_base(BlockDriverState *bs)
 return bdrv_find_overlay(bs, NULL);
 }
 
-typedef struct BlkIntermediateStates {
-BlockDriverState *bs;
-QSIMPLEQ_ENTRY(BlkIntermediateStates) entry;
-} BlkIntermediateStates;
-
-
 /*
  * Drops images above 'base' up to and including 'top', and sets the image
  * above 'top' to have base as its backing file.
@@ -2416,15 +2416,9 @@ typedef struct BlkIntermediateStates {
 int bdrv_drop_intermediate(BlockDriverState *active, BlockDriverState *top,
BlockDriverState *base, const char 
*backing_file_str)
 {
-BlockDriverState *intermediate;
-BlockDriverState *base_bs = NULL;
 BlockDriverState *new_top_bs = NULL;
-BlkIntermediateStates *intermediate_state, *next;
 int ret = -EIO;
 
-QSIMPLEQ_HEAD(states_to_delete, BlkIntermediateStates) states_to_delete;
-QSIMPLEQ_INIT(_to_delete);
-
 if (!top->drv || !base->drv) {
 goto exit;
 }
@@ -2443,48 +2437,22 @@ int bdrv_drop_intermediate(BlockDriverState *active, 
BlockDriverState *top,
 goto exit;
 }
 
-intermediate = top;
-
-/* now we will go down through the list, and add each BDS we find
- * into our deletion queue, until we hit the 'base'
- */
-while (intermediate) {
-intermediate_state = g_new0(BlkIntermediateStates, 1);
-intermediate_state->bs = intermediate;
-QSIMPLEQ_INSERT_TAIL(_to_delete, intermediate_state, entry);
-
-if (backing_bs(intermediate) == base) {
-base_bs = backing_bs(intermediate);
-break;
-}
-intermediate = backing_bs(intermediate);
-}
-if (base_bs == NULL) {
-/* something went wrong, we did not end at the base. safely
- * unravel everything, and exit with error */
+/* Make sure that base is in the backing chain of top */
+if (!bdrv_chain_contains(top, base)) {
 goto exit;
 }
 
 /* success - we can delete the intermediate states, and link top->base */
-backing_file_str = backing_file_str ? backing_file_str : base_bs->filename;
+backing_file_str = backing_file_str ? backing_file_str : base->filename;
 ret = bdrv_change_backing_file(new_top_bs, backing_file_str,
- 

[Qemu-block] [PULL 18/29] blockjob: Store device name at job creation

2015-10-16 Thread Kevin Wolf
Some block jobs change the block device graph on completion. This means
that the device that owns the job and originally was addressed with its
device name may no longer be what the corresponding BlockBackend points
to.

Previously, the effects of bdrv_swap() ensured that the job was (at
least partially) transferred to the target image. Events that contain
the device name could still use bdrv_get_device_name(job->bs) and get
the same result.

After removing bdrv_swap(), this won't work any more. Instead, save the
device name at job creation and use that copy for QMP events and
anything else identifying the job.

Signed-off-by: Kevin Wolf 
Reviewed-by: Max Reitz 
Reviewed-by: Alberto Garcia 
Reviewed-by: Fam Zheng 
Reviewed-by: Stefan Hajnoczi 
---
 block/mirror.c   |  3 +--
 blockjob.c   | 15 ---
 include/block/blockjob.h |  8 
 3 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index 6247b27..c277691 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -645,8 +645,7 @@ static void mirror_complete(BlockJob *job, Error **errp)
 return;
 }
 if (!s->synced) {
-error_setg(errp, QERR_BLOCK_JOB_NOT_READY,
-   bdrv_get_device_name(job->bs));
+error_setg(errp, QERR_BLOCK_JOB_NOT_READY, job->id);
 return;
 }
 
diff --git a/blockjob.c b/blockjob.c
index 62bb906..d87869c 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -54,6 +54,7 @@ void *block_job_create(const BlockJobDriver *driver, 
BlockDriverState *bs,
 bdrv_op_unblock(bs, BLOCK_OP_TYPE_DATAPLANE, job->blocker);
 
 job->driver= driver;
+job->id= g_strdup(bdrv_get_device_name(bs));
 job->bs= bs;
 job->cb= cb;
 job->opaque= opaque;
@@ -81,6 +82,7 @@ void block_job_release(BlockDriverState *bs)
 bs->job = NULL;
 bdrv_op_unblock_all(bs, job->blocker);
 error_free(job->blocker);
+g_free(job->id);
 g_free(job);
 }
 
@@ -113,8 +115,7 @@ void block_job_set_speed(BlockJob *job, int64_t speed, 
Error **errp)
 void block_job_complete(BlockJob *job, Error **errp)
 {
 if (job->pause_count || job->cancelled || !job->driver->complete) {
-error_setg(errp, QERR_BLOCK_JOB_NOT_READY,
-   bdrv_get_device_name(job->bs));
+error_setg(errp, QERR_BLOCK_JOB_NOT_READY, job->id);
 return;
 }
 
@@ -269,7 +270,7 @@ BlockJobInfo *block_job_query(BlockJob *job)
 {
 BlockJobInfo *info = g_new0(BlockJobInfo, 1);
 info->type  = g_strdup(BlockJobType_lookup[job->driver->job_type]);
-info->device= g_strdup(bdrv_get_device_name(job->bs));
+info->device= g_strdup(job->id);
 info->len   = job->len;
 info->busy  = job->busy;
 info->paused= job->pause_count > 0;
@@ -291,7 +292,7 @@ static void block_job_iostatus_set_err(BlockJob *job, int 
error)
 void block_job_event_cancelled(BlockJob *job)
 {
 qapi_event_send_block_job_cancelled(job->driver->job_type,
-bdrv_get_device_name(job->bs),
+job->id,
 job->len,
 job->offset,
 job->speed,
@@ -301,7 +302,7 @@ void block_job_event_cancelled(BlockJob *job)
 void block_job_event_completed(BlockJob *job, const char *msg)
 {
 qapi_event_send_block_job_completed(job->driver->job_type,
-bdrv_get_device_name(job->bs),
+job->id,
 job->len,
 job->offset,
 job->speed,
@@ -315,7 +316,7 @@ void block_job_event_ready(BlockJob *job)
 job->ready = true;
 
 qapi_event_send_block_job_ready(job->driver->job_type,
-bdrv_get_device_name(job->bs),
+job->id,
 job->len,
 job->offset,
 job->speed, _abort);
@@ -344,7 +345,7 @@ BlockErrorAction block_job_error_action(BlockJob *job, 
BlockDriverState *bs,
 default:
 abort();
 }
-qapi_event_send_block_job_error(bdrv_get_device_name(job->bs),
+qapi_event_send_block_job_error(job->id,
 is_read ? IO_OPERATION_TYPE_READ :
 IO_OPERATION_TYPE_WRITE,
 action, _abort);
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index dd9d5e6..289b13f 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -65,6 +65,14 @@ struct BlockJob {
 BlockDriverState *bs;
 
 /**
+ * The 

[Qemu-block] [PULL 21/29] block: Allow bdrv_unref_child(bs, NULL)

2015-10-16 Thread Kevin Wolf
bdrv_unref() can be called with a NULL argument and doesn't do anything
then. Make bdrv_unref_child() consistent with it.

Signed-off-by: Kevin Wolf 
Reviewed-by: Jeff Cody 
---
 block.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index f38146e..6490040 100644
--- a/block.c
+++ b/block.c
@@ -1104,12 +1104,17 @@ static void bdrv_detach_child(BdrvChild *child)
 
 void bdrv_unref_child(BlockDriverState *parent, BdrvChild *child)
 {
-BlockDriverState *child_bs = child->bs;
+BlockDriverState *child_bs;
+
+if (child == NULL) {
+return;
+}
 
 if (child->bs->inherits_from == parent) {
 child->bs->inherits_from = NULL;
 }
 
+child_bs = child->bs;
 bdrv_detach_child(child);
 bdrv_unref(child_bs);
 }
-- 
1.8.3.1




[Qemu-block] [PULL 25/29] raw-posix: warn about BDRV_O_NATIVE_AIO if libaio is unavailable

2015-10-16 Thread Kevin Wolf
From: Stefan Hajnoczi 

raw-posix.c silently ignores BDRV_O_NATIVE_AIO if libaio is unavailable.
It is confusing when aio=native performance is identical to aio=threads
because the binary was accidentally built without libaio.

Print a deprecation warning if -drive aio=native is used with a binary
that does not support libaio.  There are probably users using aio=native
who would be inconvenienced if QEMU suddenly refused to start their
guests.  In the future this will become an error.

Signed-off-by: Stefan Hajnoczi 
Signed-off-by: Kevin Wolf 
---
 block/raw-posix.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index cc1b874..3a527f0 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -519,7 +519,16 @@ static int raw_open_common(BlockDriverState *bs, QDict 
*options,
  "future QEMU versions.\n",
  bs->filename);
 }
-#endif
+#else
+if (bdrv_flags & BDRV_O_NATIVE_AIO) {
+error_printf("WARNING: aio=native was specified for '%s', but "
+ "is not supported in this build. Falling back to "
+ "aio=threads.\n"
+ " This will become an error condition in "
+ "future QEMU versions.\n",
+ bs->filename);
+}
+#endif /* !defined(CONFIG_LINUX_AIO) */
 
 s->has_discard = true;
 s->has_write_zeroes = true;
-- 
1.8.3.1




[Qemu-block] [PULL 26/29] blockdev: always compile in -drive aio= parsing

2015-10-16 Thread Kevin Wolf
From: Stefan Hajnoczi 

CONFIG_LINUX_AIO is an implementation detail of raw-posix.c.  Don't
mention CONFIG_LINUX_AIO in blockdev.c.  Let block drivers decide what
to do with BDRV_O_NATIVE_AIO.  They may print an error if it is
unsupported.

Signed-off-by: Stefan Hajnoczi 
Signed-off-by: Kevin Wolf 
---
 blockdev.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 6c8cce4..8141b6b 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -411,7 +411,6 @@ static BlockBackend *blockdev_init(const char *file, QDict 
*bs_opts,
 bdrv_flags |= BDRV_O_NO_FLUSH;
 }
 
-#ifdef CONFIG_LINUX_AIO
 if ((buf = qemu_opt_get(opts, "aio")) != NULL) {
 if (!strcmp(buf, "native")) {
 bdrv_flags |= BDRV_O_NATIVE_AIO;
@@ -422,7 +421,6 @@ static BlockBackend *blockdev_init(const char *file, QDict 
*bs_opts,
goto early_err;
 }
 }
-#endif
 
 if ((buf = qemu_opt_get(opts, "format")) != NULL) {
 if (is_help_option(buf)) {
-- 
1.8.3.1




[Qemu-block] [PULL 01/29] iotests: disable core dumps in test 061

2015-10-16 Thread Kevin Wolf
From: Alberto Garcia 

Commit 934659c460 disabled the supression of segmentation faults in
bash tests. The new output of test 061, however, assumes that a core
dump will be produced if a program aborts. This is not necessarily the
case because core dumps can be disabled using ulimit.

Since we cannot guarantee that abort() will produce a core dump, we
should use SIGKILL instead (that does not produce any) and update the
test output accordingly.

Signed-off-by: Alberto Garcia 
Reviewed-by: Max Reitz 
Signed-off-by: Kevin Wolf 
---
 tests/qemu-iotests/061 | 8 
 tests/qemu-iotests/061.out | 4 ++--
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/tests/qemu-iotests/061 b/tests/qemu-iotests/061
index 1df887a..e191e65 100755
--- a/tests/qemu-iotests/061
+++ b/tests/qemu-iotests/061
@@ -58,8 +58,8 @@ echo
 echo "=== Testing dirty version downgrade ==="
 echo
 IMGOPTS="compat=1.1,lazy_refcounts=on" _make_test_img 64M
-$QEMU_IO -c "write -P 0x2a 0 128k" -c flush -c abort "$TEST_IMG" 2>&1 \
-| _filter_qemu_io
+$QEMU_IO -c "write -P 0x2a 0 128k" -c flush \
+ -c "sigraise $(kill -l KILL)" "$TEST_IMG" 2>&1 | _filter_qemu_io
 $PYTHON qcow2.py "$TEST_IMG" dump-header
 $QEMU_IMG amend -o "compat=0.10" "$TEST_IMG"
 $PYTHON qcow2.py "$TEST_IMG" dump-header
@@ -92,8 +92,8 @@ echo
 echo "=== Testing dirty lazy_refcounts=off ==="
 echo
 IMGOPTS="compat=1.1,lazy_refcounts=on" _make_test_img 64M
-$QEMU_IO -c "write -P 0x2a 0 128k" -c flush -c abort "$TEST_IMG" 2>&1 \
-| _filter_qemu_io
+$QEMU_IO -c "write -P 0x2a 0 128k" -c flush \
+ -c "sigraise $(kill -l KILL)" "$TEST_IMG" 2>&1 | _filter_qemu_io
 $PYTHON qcow2.py "$TEST_IMG" dump-header
 $QEMU_IMG amend -o "lazy_refcounts=off" "$TEST_IMG"
 $PYTHON qcow2.py "$TEST_IMG" dump-header
diff --git a/tests/qemu-iotests/061.out b/tests/qemu-iotests/061.out
index a683f46..b16bea9 100644
--- a/tests/qemu-iotests/061.out
+++ b/tests/qemu-iotests/061.out
@@ -57,7 +57,7 @@ No errors were found on the image.
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
 wrote 131072/131072 bytes at offset 0
 128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-./common.config: Aborted (core dumped) ( exec "$QEMU_IO_PROG" 
$QEMU_IO_OPTIONS "$@" )
+./common.config: Killed  ( exec "$QEMU_IO_PROG" 
$QEMU_IO_OPTIONS "$@" )
 magic 0x514649fb
 version   3
 backing_file_offset   0x0
@@ -215,7 +215,7 @@ No errors were found on the image.
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
 wrote 131072/131072 bytes at offset 0
 128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-./common.config: Aborted (core dumped) ( exec "$QEMU_IO_PROG" 
$QEMU_IO_OPTIONS "$@" )
+./common.config: Killed  ( exec "$QEMU_IO_PROG" 
$QEMU_IO_OPTIONS "$@" )
 magic 0x514649fb
 version   3
 backing_file_offset   0x0
-- 
1.8.3.1




[Qemu-block] [PULL 03/29] qmp-commands.hx: Update the supported 'transaction' operations

2015-10-16 Thread Kevin Wolf
From: Kashyap Chamarthy 

Although the canonical source of reference for QMP commands is
qapi-schema.json, for consistency's sake, update qmp-commands.hx to
state the list of supported transactionable operations, namely:

drive-backup
blockdev-backup
blockdev-snapshot-internal-sync
abort
block-dirty-bitmap-add
block-dirty-bitmap-clear

Also update the possible values for the "type" action array.

Signed-off-by: Kashyap Chamarthy 
Reviewed-by: Eric Blake 
Signed-off-by: Kevin Wolf 
---
 qmp-commands.hx | 29 ++---
 1 file changed, 22 insertions(+), 7 deletions(-)

diff --git a/qmp-commands.hx b/qmp-commands.hx
index d2ba800..2b52980 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -1270,11 +1270,22 @@ SQMP
 transaction
 ---
 
-Atomically operate on one or more block devices.  The only supported operations
-for now are drive-backup, internal and external snapshotting.  A list of
-dictionaries is accepted, that contains the actions to be performed.
-If there is any failure performing any of the operations, all operations
-for the group are abandoned.
+Atomically operate on one or more block devices.  Operations that are
+currently supported:
+
+- drive-backup
+- blockdev-backup
+- blockdev-snapshot-sync
+- blockdev-snapshot-internal-sync
+- abort
+- block-dirty-bitmap-add
+- block-dirty-bitmap-clear
+
+Refer to the qemu/qapi-schema.json file for minimum required QEMU
+versions for these operations.  A list of dictionaries is accepted,
+that contains the actions to be performed.  If there is any failure
+performing any of the operations, all operations for the group are
+abandoned.
 
 For external snapshots, the dictionary contains the device, the file to use for
 the new snapshot, and the format.  The default format, if not specified, is
@@ -1301,8 +1312,12 @@ it later with qemu-img or other command.
 Arguments:
 
 actions array:
-- "type": the operation to perform.  The only supported
-  value is "blockdev-snapshot-sync". (json-string)
+- "type": the operation to perform (json-string).  Possible
+  values: "drive-backup", "blockdev-backup",
+  "blockdev-snapshot-sync",
+  "blockdev-snapshot-internal-sync",
+  "abort", "block-dirty-bitmap-add",
+  "block-dirty-bitmap-clear"
 - "data": a dictionary.  The contents depend on the value
   of "type".  When "type" is "blockdev-snapshot-sync":
   - "device": device name to snapshot (json-string)
-- 
1.8.3.1




[Qemu-block] [PULL 04/29] block: qemu-iotests - fix vmdk test 059.out

2015-10-16 Thread Kevin Wolf
From: Jeff Cody 

In commit fe646693acc13ac48b98435d14149ab04dc597bc, the option
printout format changed.

This updates the VMDK test 059.out to the correct output.

Signed-off-by: Jeff Cody 
Reviewed-by: Fam Zheng 
Signed-off-by: Kevin Wolf 
---
 tests/qemu-iotests/059.out | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/tests/qemu-iotests/059.out b/tests/qemu-iotests/059.out
index 67e3cf5..00057fe 100644
--- a/tests/qemu-iotests/059.out
+++ b/tests/qemu-iotests/059.out
@@ -16,17 +16,17 @@ qemu-io: can't open device TEST_DIR/t.vmdk: L1 size too big
 no file open, try 'help open'
 
 === Testing monolithicFlat creation and opening ===
-Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=2147483648
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=2147483648 
subformat=monolithicFlat
 image: TEST_DIR/t.IMGFMT
 file format: IMGFMT
 virtual size: 2.0G (2147483648 bytes)
 
 === Testing monolithicFlat with zeroed_grain ===
 qemu-img: TEST_DIR/t.IMGFMT: Flat image can't enable zeroed grain
-Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=2147483648
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=2147483648 
subformat=monolithicFlat
 
 === Testing big twoGbMaxExtentFlat ===
-Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1073741824000
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1073741824000 
subformat=twoGbMaxExtentFlat
 image: TEST_DIR/t.vmdk
 file format: vmdk
 virtual size: 1.0T (1073741824000 bytes)
@@ -2043,7 +2043,7 @@ RW 12582912 VMFS "dummy.IMGFMT" 1
 
 
 === Testing truncated sparse ===
-Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=107374182400
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=107374182400 
subformat=monolithicSparse
 qemu-img: Could not open 'TEST_DIR/t.IMGFMT': File truncated, expecting at 
least 13172736 bytes
 
 === Converting to streamOptimized from image with small cluster size===
@@ -2054,7 +2054,7 @@ wrote 512/512 bytes at offset 10240
 512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 
 === Testing monolithicFlat with internally generated JSON file name ===
-Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 
subformat=monolithicFlat
 qemu-io: can't open: Cannot use relative extent paths with VMDK descriptor 
file 'json:{"image": {"driver": "file", "filename": "TEST_DIR/t.IMGFMT"}, 
"driver": "blkdebug", "inject-error.0.event": "read_aio"}'
 
 === Testing version 3 ===
@@ -2264,7 +2264,7 @@ read 512/512 bytes at offset 64931328
 512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 
 === Testing 4TB monolithicFlat creation and IO ===
-Formatting 'TEST_DIR/iotest-version3.IMGFMT', fmt=IMGFMT size=4398046511104
+Formatting 'TEST_DIR/iotest-version3.IMGFMT', fmt=IMGFMT size=4398046511104 
subformat=monolithicFlat
 image: TEST_DIR/iotest-version3.IMGFMT
 file format: IMGFMT
 virtual size: 4.0T (4398046511104 bytes)
-- 
1.8.3.1




[Qemu-block] [PULL 00/29] Block layer patches

2015-10-16 Thread Kevin Wolf
The following changes since commit c49d3411faae8ffaab8f7e5db47405a008411c10:

  Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2015-10-12' into 
staging (2015-10-13 10:42:06 +0100)

are available in the git repository at:


  git://repo.or.cz/qemu/kevin.git tags/for-upstream

for you to fetch changes up to 6b826af7b010ed1963b1e7bfb5c389dcdbaff222:

  blkdebug: Don't confuse image as backing file (2015-10-16 15:35:48 +0200)


Block layer patches


Alberto Garcia (2):
  iotests: disable core dumps in test 061
  throttle: test that snapshots move the throttling configuration

Fam Zheng (1):
  blkdebug: Don't confuse image as backing file

Jeff Cody (3):
  block: qemu-iotests - fix vmdk test 059.out
  util - add automated ID generation utility
  block: auto-generated node-names

Kashyap Chamarthy (1):
  qmp-commands.hx: Update the supported 'transaction' operations

Kevin Wolf (19):
  block: Introduce BDS.file_child
  vmdk: Use BdrvChild instead of BDS for references to extents
  blkverify: Convert s->test_file to BdrvChild
  quorum: Convert to BdrvChild
  block: Convert bs->file to BdrvChild
  block: Remove bdrv_open_image()
  block: Convert bs->backing_hd to BdrvChild
  block: Manage backing file references in bdrv_set_backing_hd()
  block: Split bdrv_move_feature_fields()
  block/io: Make bdrv_requests_pending() public
  block-backend: Add blk_set_bs()
  block: Introduce parents list
  block: Implement bdrv_append() without bdrv_swap()
  blockjob: Store device name at job creation
  block: Add and use bdrv_replace_in_backing_chain()
  block: Remove bdrv_swap()
  block: Allow bdrv_unref_child(bs, NULL)
  blkverify: Fix BDS leak in .bdrv_open error path
  qcow2: Remove forward declaration of QCowAIOCB

Stefan Hajnoczi (3):
  raw-posix: warn about BDRV_O_NATIVE_AIO if libaio is unavailable
  blockdev: always compile in -drive aio= parsing
  qemu-nbd: always compile in --aio=MODE option

 block.c  | 512 ---
 block/blkdebug.c |  34 +--
 block/blkverify.c|  71 +++---
 block/block-backend.c|  17 ++
 block/bochs.c|   8 +-
 block/cloop.c|  10 +-
 block/dmg.c  |  20 +-
 block/io.c   |  76 +++---
 block/mirror.c   |  22 +-
 block/parallels.c|  38 +--
 block/qapi.c |  10 +-
 block/qcow.c |  47 ++--
 block/qcow2-cache.c  |  11 +-
 block/qcow2-cluster.c|  41 ++--
 block/qcow2-refcount.c   |  45 ++--
 block/qcow2-snapshot.c   |  30 +--
 block/qcow2.c|  68 +++---
 block/qcow2.h|   2 -
 block/qed-table.c|   4 +-
 block/qed.c  |  51 ++--
 block/quorum.c   |  65 ++---
 block/raw-posix.c|  11 +-
 block/raw_bsd.c  |  40 +--
 block/snapshot.c |  12 +-
 block/stream.c   |  34 +--
 block/vdi.c  |  17 +-
 block/vhdx-log.c |  25 +-
 block/vhdx.c |  36 +--
 block/vmdk.c | 133 +-
 block/vpc.c  |  34 +--
 block/vvfat.c|  19 +-
 blockdev.c   |   8 +-
 blockjob.c   |  15 +-
 include/block/block.h|  15 +-
 include/block/block_int.h|  20 +-
 include/block/blockjob.h |   8 +
 include/qemu-common.h|   8 +
 include/qemu/queue.h |   6 -
 qemu-img.c   |  20 +-
 qemu-nbd.c   |   8 -
 qmp-commands.hx  |  29 ++-
 tests/qemu-iotests/041   |   4 +-
 tests/qemu-iotests/051   |   3 +-
 tests/qemu-iotests/051.out   |   2 +-
 tests/qemu-iotests/059.out   |  12 +-
 tests/qemu-iotests/061   |   8 +-
 tests/qemu-iotests/061.out   |   4 +-
 tests/qemu-iotests/067   |   3 +-
 tests/qemu-iotests/067.out   |   5 +
 tests/qemu-iotests/081   |   3 +-
 tests/qemu-iotests/081.out   |   2 +-
 tests/qemu-iotests/096   |  69 ++
 tests/qemu-iotests/096.out   |   5 +
 tests/qemu-iotests/common.filter |   5 +
 tests/qemu-iotests/group |   1 +
 util/id.c|  37 +++
 56 files changed, 960 insertions(+), 883 deletions(-)
 create mode 100644 tests/qemu-iotests/096
 create mode 100644 tests/qemu-iotests/096.out



[Qemu-block] [PULL 09/29] block: Convert bs->file to BdrvChild

2015-10-16 Thread Kevin Wolf
This patch removes the temporary duplication between bs->file and
bs->file_child by converting everything to BdrvChild.

Signed-off-by: Kevin Wolf 
Reviewed-by: Max Reitz 
Reviewed-by: Alberto Garcia 
Reviewed-by: Fam Zheng 
Reviewed-by: Stefan Hajnoczi 
---
 block.c   | 63 ++-
 block/blkdebug.c  | 32 +---
 block/blkverify.c | 33 ++---
 block/bochs.c |  8 +++---
 block/cloop.c | 10 
 block/dmg.c   | 20 +++
 block/io.c| 50 ++---
 block/parallels.c | 38 ++--
 block/qapi.c  |  2 +-
 block/qcow.c  | 43 +---
 block/qcow2-cache.c   | 11 +
 block/qcow2-cluster.c | 37 
 block/qcow2-refcount.c| 45 +
 block/qcow2-snapshot.c| 30 +++---
 block/qcow2.c | 62 --
 block/qed-table.c |  4 +--
 block/qed.c   | 32 
 block/raw_bsd.c   | 40 +++---
 block/snapshot.c  | 12 -
 block/vdi.c   | 17 +++--
 block/vhdx-log.c  | 25 ++-
 block/vhdx.c  | 36 ++-
 block/vmdk.c  | 27 ++--
 block/vpc.c   | 34 +
 include/block/block.h |  8 +-
 include/block/block_int.h |  3 +--
 26 files changed, 378 insertions(+), 344 deletions(-)

diff --git a/block.c b/block.c
index 75afed1..8fd345b 100644
--- a/block.c
+++ b/block.c
@@ -809,7 +809,7 @@ static QemuOptsList bdrv_runtime_opts = {
  *
  * Removes all processed options from *options.
  */
-static int bdrv_open_common(BlockDriverState *bs, BlockDriverState *file,
+static int bdrv_open_common(BlockDriverState *bs, BdrvChild *file,
 QDict *options, int flags, BlockDriver *drv, Error **errp)
 {
 int ret, open_flags;
@@ -823,7 +823,7 @@ static int bdrv_open_common(BlockDriverState *bs, 
BlockDriverState *file,
 assert(options != NULL && bs->options != options);
 
 if (file != NULL) {
-filename = file->filename;
+filename = file->bs->filename;
 } else {
 filename = qdict_get_try_str(options, "filename");
 }
@@ -1401,7 +1401,8 @@ static int bdrv_open_inherit(BlockDriverState **pbs, 
const char *filename,
  const BdrvChildRole *child_role, Error **errp)
 {
 int ret;
-BlockDriverState *file = NULL, *bs;
+BdrvChild *file = NULL;
+BlockDriverState *bs;
 BlockDriver *drv = NULL;
 const char *drvname;
 Error *local_err = NULL;
@@ -1485,25 +1486,20 @@ static int bdrv_open_inherit(BlockDriverState **pbs, 
const char *filename,
 flags = bdrv_backing_flags(flags);
 }
 
-assert(file == NULL);
 bs->open_flags = flags;
 
-bs->file_child = bdrv_open_child(filename, options, "file", bs,
- _file, true, _err);
+file = bdrv_open_child(filename, options, "file", bs,
+   _file, true, _err);
 if (local_err) {
 ret = -EINVAL;
 goto fail;
 }
-
-if (bs->file_child) {
-file = bs->file_child->bs;
-}
 }
 
 /* Image format probing */
 bs->probed = !drv;
 if (!drv && file) {
-ret = find_image_format(file, filename, , _err);
+ret = find_image_format(file->bs, filename, , _err);
 if (ret < 0) {
 goto fail;
 }
@@ -1526,7 +1522,7 @@ static int bdrv_open_inherit(BlockDriverState **pbs, 
const char *filename,
 }
 
 if (file && (bs->file != file)) {
-bdrv_unref(file);
+bdrv_unref_child(bs, file);
 file = NULL;
 }
 
@@ -1587,7 +1583,7 @@ static int bdrv_open_inherit(BlockDriverState **pbs, 
const char *filename,
 
 fail:
 if (file != NULL) {
-bdrv_unref(file);
+bdrv_unref_child(bs, file);
 }
 QDECREF(bs->options);
 QDECREF(options);
@@ -1928,6 +1924,7 @@ void bdrv_close(BlockDriverState *bs)
 BdrvChild *child, *next;
 
 bs->drv->bdrv_close(bs);
+bs->drv = NULL;
 
 if (bs->backing_hd) {
 BlockDriverState *backing_hd = bs->backing_hd;
@@ -1935,6 +1932,11 @@ void bdrv_close(BlockDriverState *bs)
 bdrv_unref(backing_hd);
 }
 
+if (bs->file != NULL) {
+bdrv_unref_child(bs, bs->file);
+bs->file = NULL;
+}
+
 QLIST_FOREACH_SAFE(child, >children, next, next) {
 /* TODO Remove bdrv_unref() from drivers' close function and use
  

[Qemu-block] [PULL 11/29] block: Convert bs->backing_hd to BdrvChild

2015-10-16 Thread Kevin Wolf
This is the final step in converting all of the BlockDriverState
pointers that block drivers use to BdrvChild.

After this patch, bs->children contains the full list of child nodes
that are referenced by a given BDS, and these children are only
referenced through BdrvChild, so that updating the pointer in there is
enough for changing edges in the graph.

Signed-off-by: Kevin Wolf 
Reviewed-by: Alberto Garcia 
Reviewed-by: Max Reitz 
Reviewed-by: Stefan Hajnoczi 
---
 block.c   | 105 +++---
 block/io.c|  24 +--
 block/mirror.c|   6 +--
 block/qapi.c  |   8 ++--
 block/qcow.c  |   4 +-
 block/qcow2-cluster.c |   4 +-
 block/qcow2.c |   6 +--
 block/qed.c   |  12 +++---
 block/stream.c|   8 ++--
 block/vmdk.c  |  21 +-
 block/vvfat.c |   6 +--
 blockdev.c|   4 +-
 include/block/block_int.h |  12 --
 qemu-img.c|   4 +-
 14 files changed, 115 insertions(+), 109 deletions(-)

diff --git a/block.c b/block.c
index 33ecd93..ecc0885 100644
--- a/block.c
+++ b/block.c
@@ -721,7 +721,7 @@ const BdrvChildRole child_format = {
 };
 
 /*
- * Returns the flags that bs->backing_hd should get, based on the given flags
+ * Returns the flags that bs->backing should get, based on the given flags
  * for the parent BDS
  */
 static int bdrv_backing_flags(int flags)
@@ -1115,32 +1115,31 @@ void bdrv_unref_child(BlockDriverState *parent, 
BdrvChild *child)
 void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd)
 {
 
-if (bs->backing_hd) {
+if (bs->backing) {
 assert(bs->backing_blocker);
-bdrv_op_unblock_all(bs->backing_hd, bs->backing_blocker);
-bdrv_detach_child(bs->backing_child);
+bdrv_op_unblock_all(bs->backing->bs, bs->backing_blocker);
+bdrv_detach_child(bs->backing);
 } else if (backing_hd) {
 error_setg(>backing_blocker,
"node is used as backing hd of '%s'",
bdrv_get_device_or_node_name(bs));
 }
 
-bs->backing_hd = backing_hd;
 if (!backing_hd) {
 error_free(bs->backing_blocker);
 bs->backing_blocker = NULL;
-bs->backing_child = NULL;
+bs->backing = NULL;
 goto out;
 }
-bs->backing_child = bdrv_attach_child(bs, backing_hd, _backing);
+bs->backing = bdrv_attach_child(bs, backing_hd, _backing);
 bs->open_flags &= ~BDRV_O_NO_BACKING;
 pstrcpy(bs->backing_file, sizeof(bs->backing_file), backing_hd->filename);
 pstrcpy(bs->backing_format, sizeof(bs->backing_format),
 backing_hd->drv ? backing_hd->drv->format_name : "");
 
-bdrv_op_block_all(bs->backing_hd, bs->backing_blocker);
+bdrv_op_block_all(backing_hd, bs->backing_blocker);
 /* Otherwise we won't be able to commit due to check in bdrv_commit */
-bdrv_op_unblock(bs->backing_hd, BLOCK_OP_TYPE_COMMIT_TARGET,
+bdrv_op_unblock(backing_hd, BLOCK_OP_TYPE_COMMIT_TARGET,
 bs->backing_blocker);
 out:
 bdrv_refresh_limits(bs, NULL);
@@ -1161,7 +1160,7 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict 
*options, Error **errp)
 BlockDriverState *backing_hd;
 Error *local_err = NULL;
 
-if (bs->backing_hd != NULL) {
+if (bs->backing != NULL) {
 QDECREF(options);
 goto free_exit;
 }
@@ -1201,7 +1200,7 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict 
*options, Error **errp)
 qdict_put(options, "driver", qstring_from_str(bs->backing_format));
 }
 
-assert(bs->backing_hd == NULL);
+assert(bs->backing == NULL);
 ret = bdrv_open_inherit(_hd,
 *backing_filename ? backing_filename : NULL,
 NULL, options, 0, bs, _backing, _err);
@@ -1892,8 +1891,8 @@ void bdrv_close(BlockDriverState *bs)
 bs->drv->bdrv_close(bs);
 bs->drv = NULL;
 
-if (bs->backing_hd) {
-BlockDriverState *backing_hd = bs->backing_hd;
+if (bs->backing) {
+BlockDriverState *backing_hd = bs->backing->bs;
 bdrv_set_backing_hd(bs, NULL);
 bdrv_unref(backing_hd);
 }
@@ -2205,20 +2204,20 @@ int bdrv_commit(BlockDriverState *bs)
 if (!drv)
 return -ENOMEDIUM;
 
-if (!bs->backing_hd) {
+if (!bs->backing) {
 return -ENOTSUP;
 }
 
 if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_COMMIT_SOURCE, NULL) ||
-bdrv_op_is_blocked(bs->backing_hd, BLOCK_OP_TYPE_COMMIT_TARGET, NULL)) 
{
+bdrv_op_is_blocked(bs->backing->bs, BLOCK_OP_TYPE_COMMIT_TARGET, 
NULL)) {
 return -EBUSY;
 }
 
-ro = bs->backing_hd->read_only;
-open_flags =  bs->backing_hd->open_flags;
+ro = bs->backing->bs->read_only;
+open_flags =  

[Qemu-block] [PULL 15/29] block-backend: Add blk_set_bs()

2015-10-16 Thread Kevin Wolf
It allows changing the BlockDriverState that a BlockBackend points to.

Signed-off-by: Kevin Wolf 
Reviewed-by: Max Reitz 
Reviewed-by: Alberto Garcia 
Reviewed-by: Fam Zheng 
Reviewed-by: Stefan Hajnoczi 
---
 block/block-backend.c | 17 +
 include/block/block_int.h |  2 ++
 2 files changed, 19 insertions(+)

diff --git a/block/block-backend.c b/block/block-backend.c
index c2e8732..2256551 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -239,6 +239,23 @@ BlockDriverState *blk_bs(BlockBackend *blk)
 }
 
 /*
+ * Changes the BlockDriverState attached to @blk
+ */
+void blk_set_bs(BlockBackend *blk, BlockDriverState *bs)
+{
+bdrv_ref(bs);
+
+if (blk->bs) {
+blk->bs->blk = NULL;
+bdrv_unref(blk->bs);
+}
+assert(bs->blk == NULL);
+
+blk->bs = bs;
+bs->blk = blk;
+}
+
+/*
  * Return @blk's DriveInfo if any, else null.
  */
 DriveInfo *blk_legacy_dinfo(BlockBackend *blk)
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 4598101..cfcae52 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -659,6 +659,8 @@ void backup_start(BlockDriverState *bs, BlockDriverState 
*target,
   BlockCompletionFunc *cb, void *opaque,
   Error **errp);
 
+void blk_set_bs(BlockBackend *blk, BlockDriverState *bs);
+
 void blk_dev_change_media_cb(BlockBackend *blk, bool load);
 bool blk_dev_has_removable_media(BlockBackend *blk);
 void blk_dev_eject_request(BlockBackend *blk, bool force);
-- 
1.8.3.1




[Qemu-block] [PULL 27/29] qemu-nbd: always compile in --aio=MODE option

2015-10-16 Thread Kevin Wolf
From: Stefan Hajnoczi 

The --aio=MODE option enables Linux AIO or Windows overlapped I/O.

The #ifdef CONFIG_LINUX_AIO was a layering violation that also prevented
Windows overlapped I/O from being used.

Now that raw-posix.c prints an error when Linux AIO has not been
compiled in, we can unconditionally compile the option into qemu-nbd.

After this patch qemu-nbd --aio=native works on Windows.

Signed-off-by: Stefan Hajnoczi 
Signed-off-by: Kevin Wolf 
---
 qemu-nbd.c | 8 
 1 file changed, 8 deletions(-)

diff --git a/qemu-nbd.c b/qemu-nbd.c
index 6428c15..422a607 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -98,9 +98,7 @@ static void usage(const char *name)
 "'[ID_OR_NAME]'\n"
 "  -n, --nocache disable host cache\n"
 "  --cache=MODE  set cache mode (none, writeback, ...)\n"
-#ifdef CONFIG_LINUX_AIO
 "  --aio=MODEset AIO mode (native or threads)\n"
-#endif
 "  --discard=MODEset discard mode (ignore, unmap)\n"
 "  --detect-zeroes=MODE  set detect-zeroes mode (off, on, unmap)\n"
 "\n"
@@ -412,9 +410,7 @@ int main(int argc, char **argv)
 { "load-snapshot", 1, NULL, 'l' },
 { "nocache", 0, NULL, 'n' },
 { "cache", 1, NULL, QEMU_NBD_OPT_CACHE },
-#ifdef CONFIG_LINUX_AIO
 { "aio", 1, NULL, QEMU_NBD_OPT_AIO },
-#endif
 { "discard", 1, NULL, QEMU_NBD_OPT_DISCARD },
 { "detect-zeroes", 1, NULL, QEMU_NBD_OPT_DETECT_ZEROES },
 { "shared", 1, NULL, 'e' },
@@ -432,9 +428,7 @@ int main(int argc, char **argv)
 int fd;
 bool seen_cache = false;
 bool seen_discard = false;
-#ifdef CONFIG_LINUX_AIO
 bool seen_aio = false;
-#endif
 pthread_t client_thread;
 const char *fmt = NULL;
 Error *local_err = NULL;
@@ -467,7 +461,6 @@ int main(int argc, char **argv)
 errx(EXIT_FAILURE, "Invalid cache mode `%s'", optarg);
 }
 break;
-#ifdef CONFIG_LINUX_AIO
 case QEMU_NBD_OPT_AIO:
 if (seen_aio) {
 errx(EXIT_FAILURE, "--aio can only be specified once");
@@ -481,7 +474,6 @@ int main(int argc, char **argv)
errx(EXIT_FAILURE, "invalid aio mode `%s'", optarg);
 }
 break;
-#endif
 case QEMU_NBD_OPT_DISCARD:
 if (seen_discard) {
 errx(EXIT_FAILURE, "--discard can only be specified once");
-- 
1.8.3.1




Re: [Qemu-block] [Qemu-devel] Question about xen disk unplug support for ahci missed in qemu

2015-10-16 Thread Paul Durrant
> -Original Message-
> From: Kevin Wolf [mailto:kw...@redhat.com]
> Sent: 16 October 2015 16:02
> To: Paul Durrant
> Cc: Fabio Fantoni; Stefano Stabellini; John Snow; Anthony Perard; qemu-
> de...@nongnu.org; xen-de...@lists.xen.org; qemu-block@nongnu.org
> Subject: Re: [Qemu-devel] Question about xen disk unplug support for ahci
> missed in qemu
> 
> Am 16.10.2015 um 16:24 hat Paul Durrant geschrieben:
> > > -Original Message-
> > > From: Kevin Wolf [mailto:kw...@redhat.com]
> > > Sent: 16 October 2015 15:04
> > > To: Paul Durrant
> > > Cc: Fabio Fantoni; Stefano Stabellini; John Snow; Anthony Perard; qemu-
> > > de...@nongnu.org; xen-de...@lists.xen.org; qemu-block@nongnu.org
> > > Subject: Re: [Qemu-devel] Question about xen disk unplug support for
> ahci
> > > missed in qemu
> > >
> > > Am 14.10.2015 um 14:48 hat Paul Durrant geschrieben:
> > > > > -Original Message-
> > > > > From: Fabio Fantoni [mailto:fabio.fant...@m2r.biz]
> > > > > Sent: 14 October 2015 12:12
> > > > > To: Kevin Wolf; Stefano Stabellini
> > > > > Cc: John Snow; Anthony Perard; qemu-de...@nongnu.org; xen-
> > > > > de...@lists.xen.org; qemu-block@nongnu.org; Paul Durrant
> > > > > Subject: Re: [Qemu-devel] Question about xen disk unplug support
> for
> > > ahci
> > > > > missed in qemu
> > > > >
> > > > >
> > > > >
> > > > > Il 14/10/2015 11:47, Kevin Wolf ha scritto:
> > > > > > [ CC qemu-block ]
> > > > > >
> > > > > > Am 13.10.2015 um 19:10 hat Stefano Stabellini geschrieben:
> > > > > >> On Tue, 13 Oct 2015, John Snow wrote:
> > > > > >>> On 10/13/2015 11:55 AM, Fabio Fantoni wrote:
> > > > >  I added ahci disk support in libxl and using it for week seems
> that
> > > was
> > > > >  ok, after a reply of Stefano Stabellini seems that xen disk 
> > > > >  unplug
> > > > >  support only ide disks:
> > > > > 
> > > > >
> > >
> http://git.qemu.org/?p=qemu.git;a=commitdiff;h=679f4f8b178e7c66fbc2f39
> > > > > c905374ee8663d5d8
> > > > > 
> > > > >  Today Paul Durrant told me that even if pv disk is ok also with
> ahci
> > > and
> > > > >  the emulated one is offline can be a risk:
> > > > >  http://lists.xenproject.org/archives/html/win-pv-devel/2015-
> > > > > 10/msg00021.html
> > > > > 
> > > > > 
> > > > >  I tried to take a fast look in qemu code but I not understand the
> > > > > needed
> > > > >  thing for add the xen disk unplug support also for ahci, can
> > > someone do
> > > > >  it or tell me useful information for do it please?
> > > > > 
> > > > >  Thanks for any reply and sorry for my bad english.
> > > > > 
> > > > > >>> I'm not entirely sure what features you need AHCI to support in
> > > order
> > > > > >>> for Xen to be happy.
> > > > > >>>
> > > > > >>> I'd guess hotplugging, but where I get confused is that IDE disks
> don't
> > > > > >>> support hotplugging either, so I guess I'm not sure sure what you
> > > need.
> > > > > >>>
> > > > > >>> Stefano, can you help bridge my Xen knowledge gap?
> > > > > >>
> > > > > >> Hi John,
> > > > > >>
> > > > > >> we need something like
> hw/i386/xen/xen_platform.c:unplug_disks
> > > but
> > > > > that
> > > > > >> can unplug AHCI disk. And by unplug, I mean "make disappear" like
> > > > > >> pci_piix3_xen_ide_unplug does for ide.
> > > > > > Maybe this would be the right time to stop the craziness with your
> > > > > > hybrid IDE/xendisk setup. It's a horrible thing that would never
> happen
> > > > > > on real hardware.
> > > >
> > > > Unfortunately, it's going to be difficult to remove such 'craziness' 
> > > > when
> you
> > > don't know a priori whether the VM has PV drivers or not.
> > >
> > > Why wouldn't you know that beforehand? I mean, even on real
> hardware
> > > you
> > > can have different disk interfaces (IDE, AHCI, SCSI) and you install
> > > the exact driver that your hardware needs. You just do the same thing on
> > > VM: If your hardware is PV, you install a PV driver. If your hardware is
> > > IDE, you install an IDE driver. Whether it's PV or IDE is something that
> > > you, the user, decided when configuring the VM, so you definitely know.
> > >
> >
> > That's not necessarily true. The host admin that provisions the VM does not
> necessarily know what OS the user of that VM will install. The admin may just
> be providing a generic VM with an emulated CD drive that the user can point
> at any ISO they want.
> >
> > So, as a host admin, if you provide a VM with only PV backends and your
> user is trying to boot an OS with no PV drivers they are not going to be
> happy, so you provide emulated devices. Then, at some point later, when
> the user installs PV drivers, there really should be some way for those 
> drivers
> to start up without any need to contact the host admin and have the VM
> reconfigured.
> 
> Why only IDE and xendisk then? Maybe I have an OS that works great with
> AHCI, or virtio-blk, or an LSI SCSI controller, or a Megasas SCSI
> controller, or USB 

[Qemu-block] [PULL 29/29] blkdebug: Don't confuse image as backing file

2015-10-16 Thread Kevin Wolf
From: Fam Zheng 

The word "backing file" nowadays refers to the backing_hd in the
external snapshot sense (i.e. bs->backing_hd), instead of the file sense
(bs->file). Correct the comment to use the right term.

Signed-off-by: Fam Zheng 
Reviewed-by: Alberto Garcia 
Reviewed-by: Eric Blake 
Signed-off-by: Kevin Wolf 
---
 block/blkdebug.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/blkdebug.c b/block/blkdebug.c
index c1cb3cb..6860a2b 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -426,7 +426,7 @@ static int blkdebug_open(BlockDriverState *bs, QDict 
*options, int flags,
 /* Set initial state */
 s->state = 1;
 
-/* Open the backing file */
+/* Open the image file */
 bs->file = bdrv_open_child(qemu_opt_get(opts, "x-image"), options, "image",
bs, _file, false, _err);
 if (local_err) {
-- 
1.8.3.1




[Qemu-block] [PULL 19/29] block: Add and use bdrv_replace_in_backing_chain()

2015-10-16 Thread Kevin Wolf
This cleans up the mess we left behind in the mirror code after the
previous patch. Instead of using bdrv_swap(), just change pointers.

The interface change of the mirror job that callers must consider is
that after job completion, their local BDS pointers still point to the
same node now. qemu-img must change its code accordingly (which makes it
easier to understand); the other callers stays unchanged because after
completion they don't do anything with the BDS, but just with the job,
and the job is still owned by the source BDS.

Signed-off-by: Kevin Wolf 
Reviewed-by: Max Reitz 
Reviewed-by: Fam Zheng 
Reviewed-by: Stefan Hajnoczi 
---
 block.c   | 32 +++-
 block/mirror.c| 23 +++
 include/block/block.h |  4 +++-
 qemu-img.c| 16 
 4 files changed, 49 insertions(+), 26 deletions(-)

diff --git a/block.c b/block.c
index b4d2313..7c66d3e 100644
--- a/block.c
+++ b/block.c
@@ -1095,7 +1095,7 @@ static BdrvChild *bdrv_attach_child(BlockDriverState 
*parent_bs,
 return child;
 }
 
-void bdrv_detach_child(BdrvChild *child)
+static void bdrv_detach_child(BdrvChild *child)
 {
 QLIST_REMOVE(child, next);
 QLIST_REMOVE(child, next_parent);
@@ -2232,6 +2232,36 @@ void bdrv_append(BlockDriverState *bs_new, 
BlockDriverState *bs_top)
 bdrv_unref(bs_new);
 }
 
+void bdrv_replace_in_backing_chain(BlockDriverState *old, BlockDriverState 
*new)
+{
+assert(!bdrv_requests_pending(old));
+assert(!bdrv_requests_pending(new));
+
+bdrv_ref(old);
+
+if (old->blk) {
+/* As long as these fields aren't in BlockBackend, but in the top-level
+ * BlockDriverState, it's not possible for a BDS to have two BBs.
+ *
+ * We really want to copy the fields from old to new, but we go for a
+ * swap instead so that pointers aren't duplicated and cause trouble.
+ * (Also, bdrv_swap() used to do the same.) */
+assert(!new->blk);
+swap_feature_fields(old, new);
+}
+change_parent_backing_link(old, new);
+
+/* Change backing files if a previously independent node is added to the
+ * chain. For active commit, we replace top by its own (indirect) backing
+ * file and don't do anything here so we don't build a loop. */
+if (new->backing == NULL && !bdrv_chain_contains(backing_bs(old), new)) {
+bdrv_set_backing_hd(new, backing_bs(old));
+bdrv_set_backing_hd(old, NULL);
+}
+
+bdrv_unref(old);
+}
+
 static void bdrv_delete(BlockDriverState *bs)
 {
 assert(!bs->job);
diff --git a/block/mirror.c b/block/mirror.c
index c277691..7e43511 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -353,6 +353,11 @@ static void mirror_exit(BlockJob *job, void *opaque)
 MirrorBlockJob *s = container_of(job, MirrorBlockJob, common);
 MirrorExitData *data = opaque;
 AioContext *replace_aio_context = NULL;
+BlockDriverState *src = s->common.bs;
+
+/* Make sure that the source BDS doesn't go away before we called
+ * block_job_completed(). */
+bdrv_ref(src);
 
 if (s->to_replace) {
 replace_aio_context = bdrv_get_aio_context(s->to_replace);
@@ -367,22 +372,7 @@ static void mirror_exit(BlockJob *job, void *opaque)
 if (bdrv_get_flags(s->target) != bdrv_get_flags(to_replace)) {
 bdrv_reopen(s->target, bdrv_get_flags(to_replace), NULL);
 }
-bdrv_swap(s->target, to_replace);
-if (s->common.driver->job_type == BLOCK_JOB_TYPE_COMMIT) {
-/* drop the bs loop chain formed by the swap: break the loop then
- * trigger the unref */
-/* FIXME This duplicates bdrv_set_backing_hd(), except for the
- * actual detach/unref so that the loop can be broken. When
- * bdrv_swap() gets replaced, this will become sane again. */
-BlockDriverState *backing = s->base->backing->bs;
-assert(s->base->backing_blocker);
-bdrv_op_unblock_all(backing, s->base->backing_blocker);
-error_free(s->base->backing_blocker);
-s->base->backing_blocker = NULL;
-bdrv_detach_child(s->base->backing);
-s->base->backing = NULL;
-bdrv_unref(backing);
-}
+bdrv_replace_in_backing_chain(to_replace, s->target);
 }
 if (s->to_replace) {
 bdrv_op_unblock_all(s->to_replace, s->replace_blocker);
@@ -396,6 +386,7 @@ static void mirror_exit(BlockJob *job, void *opaque)
 bdrv_unref(s->target);
 block_job_completed(>common, data->ret);
 g_free(data);
+bdrv_unref(src);
 }
 
 static void coroutine_fn mirror_run(void *opaque)
diff --git a/include/block/block.h b/include/block/block.h
index da94bdf..6d70eb4 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -203,6 +203,9 @@ BlockDriverState *bdrv_new(void);
 void 

[Qemu-block] [PULL 20/29] block: Remove bdrv_swap()

2015-10-16 Thread Kevin Wolf
bdrv_swap() is unused now. Remove it and all functions that have
no other users than bdrv_swap(). In particular, this removes the
.bdrv_rebind callbacks from block drivers.

Signed-off-by: Kevin Wolf 
Reviewed-by: Max Reitz 
Reviewed-by: Alberto Garcia 
Reviewed-by: Fam Zheng 
Reviewed-by: Stefan Hajnoczi 
---
 block.c   | 155 +-
 block/qed.c   |   7 ---
 block/vvfat.c |   7 ---
 include/block/block_int.h |   1 -
 include/qemu/queue.h  |   6 --
 5 files changed, 1 insertion(+), 175 deletions(-)

diff --git a/block.c b/block.c
index 7c66d3e..f38146e 100644
--- a/block.c
+++ b/block.c
@@ -1981,15 +1981,7 @@ void bdrv_make_anon(BlockDriverState *bs)
 bs->node_name[0] = '\0';
 }
 
-static void bdrv_rebind(BlockDriverState *bs)
-{
-if (bs->drv && bs->drv->bdrv_rebind) {
-bs->drv->bdrv_rebind(bs);
-}
-}
-
-/* Fields that need to stay with the top-level BDS, no matter whether the
- * address of the top-level BDS stays the same or not. */
+/* Fields that need to stay with the top-level BDS */
 static void bdrv_move_feature_fields(BlockDriverState *bs_dest,
  BlockDriverState *bs_src)
 {
@@ -2013,151 +2005,6 @@ static void bdrv_move_feature_fields(BlockDriverState 
*bs_dest,
 bs_dest->dirty_bitmaps  = bs_src->dirty_bitmaps;
 }
 
-/* Fields that only need to be swapped if the contents of BDSes is swapped
- * rather than pointers being changed in the parents, and throttling fields
- * because only bdrv_swap() messes with internals of throttling. */
-static void bdrv_move_reference_fields(BlockDriverState *bs_dest,
-   BlockDriverState *bs_src)
-{
-/* i/o throttled req */
-bs_dest->throttle_state = bs_src->throttle_state,
-bs_dest->io_limits_enabled  = bs_src->io_limits_enabled;
-bs_dest->pending_reqs[0]= bs_src->pending_reqs[0];
-bs_dest->pending_reqs[1]= bs_src->pending_reqs[1];
-bs_dest->throttled_reqs[0]  = bs_src->throttled_reqs[0];
-bs_dest->throttled_reqs[1]  = bs_src->throttled_reqs[1];
-memcpy(_dest->round_robin,
-   _src->round_robin,
-   sizeof(bs_dest->round_robin));
-memcpy(_dest->throttle_timers,
-   _src->throttle_timers,
-   sizeof(ThrottleTimers));
-
-/* reference count */
-bs_dest->refcnt = bs_src->refcnt;
-
-/* job */
-bs_dest->job= bs_src->job;
-
-/* keep the same entry in bdrv_states */
-bs_dest->device_list = bs_src->device_list;
-bs_dest->blk = bs_src->blk;
-bs_dest->parents = bs_src->parents;
-
-memcpy(bs_dest->op_blockers, bs_src->op_blockers,
-   sizeof(bs_dest->op_blockers));
-}
-
-/*
- * Swap bs contents for two image chains while they are live,
- * while keeping required fields on the BlockDriverState that is
- * actually attached to a device.
- *
- * This will modify the BlockDriverState fields, and swap contents
- * between bs_new and bs_old. Both bs_new and bs_old are modified.
- *
- * bs_new must not be attached to a BlockBackend.
- *
- * This function does not create any image files.
- */
-void bdrv_swap(BlockDriverState *bs_new, BlockDriverState *bs_old)
-{
-BlockDriverState tmp;
-BdrvChild *child;
-
-bdrv_drain(bs_new);
-bdrv_drain(bs_old);
-
-/* The code needs to swap the node_name but simply swapping node_list won't
- * work so first remove the nodes from the graph list, do the swap then
- * insert them back if needed.
- */
-if (bs_new->node_name[0] != '\0') {
-QTAILQ_REMOVE(_bdrv_states, bs_new, node_list);
-}
-if (bs_old->node_name[0] != '\0') {
-QTAILQ_REMOVE(_bdrv_states, bs_old, node_list);
-}
-
-/* If the BlockDriverState is part of a throttling group acquire
- * its lock since we're going to mess with the protected fields.
- * Otherwise there's no need to worry since no one else can touch
- * them. */
-if (bs_old->throttle_state) {
-throttle_group_lock(bs_old);
-}
-
-/* bs_new must be unattached and shouldn't have anything fancy enabled */
-assert(!bs_new->blk);
-assert(QLIST_EMPTY(_new->dirty_bitmaps));
-assert(bs_new->job == NULL);
-assert(bs_new->io_limits_enabled == false);
-assert(bs_new->throttle_state == NULL);
-assert(!throttle_timers_are_initialized(_new->throttle_timers));
-
-tmp = *bs_new;
-*bs_new = *bs_old;
-*bs_old = tmp;
-
-/* there are some fields that should not be swapped, move them back */
-bdrv_move_feature_fields(, bs_old);
-bdrv_move_feature_fields(bs_old, bs_new);
-bdrv_move_feature_fields(bs_new, );
-
-bdrv_move_reference_fields(, bs_old);
-bdrv_move_reference_fields(bs_old, bs_new);
-bdrv_move_reference_fields(bs_new, );
-
-/* bs_new must remain unattached */

[Qemu-block] [PULL 28/29] qcow2: Remove forward declaration of QCowAIOCB

2015-10-16 Thread Kevin Wolf
This struct doesn't exist any more since commit 3fc48d09 in August 2011,
it's about time to remove its forward declaration.

Signed-off-by: Kevin Wolf 
Reviewed-by: Eric Blake 
---
 block/qcow2.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/block/qcow2.h b/block/qcow2.h
index d700bf1..3512263 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -295,8 +295,6 @@ typedef struct BDRVQcow2State {
 char *image_backing_format;
 } BDRVQcow2State;
 
-struct QCowAIOCB;
-
 typedef struct Qcow2COWRegion {
 /**
  * Offset of the COW region in bytes from the start of the first cluster
-- 
1.8.3.1




Re: [Qemu-block] [Qemu-devel] Question about xen disk unplug support for ahci missed in qemu

2015-10-16 Thread Kevin Wolf
Am 16.10.2015 um 16:24 hat Paul Durrant geschrieben:
> > -Original Message-
> > From: Kevin Wolf [mailto:kw...@redhat.com]
> > Sent: 16 October 2015 15:04
> > To: Paul Durrant
> > Cc: Fabio Fantoni; Stefano Stabellini; John Snow; Anthony Perard; qemu-
> > de...@nongnu.org; xen-de...@lists.xen.org; qemu-block@nongnu.org
> > Subject: Re: [Qemu-devel] Question about xen disk unplug support for ahci
> > missed in qemu
> > 
> > Am 14.10.2015 um 14:48 hat Paul Durrant geschrieben:
> > > > -Original Message-
> > > > From: Fabio Fantoni [mailto:fabio.fant...@m2r.biz]
> > > > Sent: 14 October 2015 12:12
> > > > To: Kevin Wolf; Stefano Stabellini
> > > > Cc: John Snow; Anthony Perard; qemu-de...@nongnu.org; xen-
> > > > de...@lists.xen.org; qemu-block@nongnu.org; Paul Durrant
> > > > Subject: Re: [Qemu-devel] Question about xen disk unplug support for
> > ahci
> > > > missed in qemu
> > > >
> > > >
> > > >
> > > > Il 14/10/2015 11:47, Kevin Wolf ha scritto:
> > > > > [ CC qemu-block ]
> > > > >
> > > > > Am 13.10.2015 um 19:10 hat Stefano Stabellini geschrieben:
> > > > >> On Tue, 13 Oct 2015, John Snow wrote:
> > > > >>> On 10/13/2015 11:55 AM, Fabio Fantoni wrote:
> > > >  I added ahci disk support in libxl and using it for week seems that
> > was
> > > >  ok, after a reply of Stefano Stabellini seems that xen disk unplug
> > > >  support only ide disks:
> > > > 
> > > >
> > http://git.qemu.org/?p=qemu.git;a=commitdiff;h=679f4f8b178e7c66fbc2f39
> > > > c905374ee8663d5d8
> > > > 
> > > >  Today Paul Durrant told me that even if pv disk is ok also with 
> > > >  ahci
> > and
> > > >  the emulated one is offline can be a risk:
> > > >  http://lists.xenproject.org/archives/html/win-pv-devel/2015-
> > > > 10/msg00021.html
> > > > 
> > > > 
> > > >  I tried to take a fast look in qemu code but I not understand the
> > > > needed
> > > >  thing for add the xen disk unplug support also for ahci, can
> > someone do
> > > >  it or tell me useful information for do it please?
> > > > 
> > > >  Thanks for any reply and sorry for my bad english.
> > > > 
> > > > >>> I'm not entirely sure what features you need AHCI to support in
> > order
> > > > >>> for Xen to be happy.
> > > > >>>
> > > > >>> I'd guess hotplugging, but where I get confused is that IDE disks 
> > > > >>> don't
> > > > >>> support hotplugging either, so I guess I'm not sure sure what you
> > need.
> > > > >>>
> > > > >>> Stefano, can you help bridge my Xen knowledge gap?
> > > > >>
> > > > >> Hi John,
> > > > >>
> > > > >> we need something like hw/i386/xen/xen_platform.c:unplug_disks
> > but
> > > > that
> > > > >> can unplug AHCI disk. And by unplug, I mean "make disappear" like
> > > > >> pci_piix3_xen_ide_unplug does for ide.
> > > > > Maybe this would be the right time to stop the craziness with your
> > > > > hybrid IDE/xendisk setup. It's a horrible thing that would never 
> > > > > happen
> > > > > on real hardware.
> > >
> > > Unfortunately, it's going to be difficult to remove such 'craziness' when 
> > > you
> > don't know a priori whether the VM has PV drivers or not.
> > 
> > Why wouldn't you know that beforehand? I mean, even on real hardware
> > you
> > can have different disk interfaces (IDE, AHCI, SCSI) and you install
> > the exact driver that your hardware needs. You just do the same thing on
> > VM: If your hardware is PV, you install a PV driver. If your hardware is
> > IDE, you install an IDE driver. Whether it's PV or IDE is something that
> > you, the user, decided when configuring the VM, so you definitely know.
> > 
> 
> That's not necessarily true. The host admin that provisions the VM does not 
> necessarily know what OS the user of that VM will install. The admin may just 
> be providing a generic VM with an emulated CD drive that the user can point 
> at any ISO they want.
> 
> So, as a host admin, if you provide a VM with only PV backends and your user 
> is trying to boot an OS with no PV drivers they are not going to be happy, so 
> you provide emulated devices. Then, at some point later, when the user 
> installs PV drivers, there really should be some way for those drivers to 
> start up without any need to contact the host admin and have the VM 
> reconfigured.

Why only IDE and xendisk then? Maybe I have an OS that works great with
AHCI, or virtio-blk, or an LSI SCSI controller, or a Megasas SCSI
controller, or USB sticks, or... (and IDE will hardly ever be the
optimal one)

What about network cards? My OS might support the Xen PV one, or it
might support rtl8139, or e1000, or virtio-net, or pcnet, or...

Should we always put all of the hardware that can possibly be emulated
in a VM just so that the one right device is definitely included even
though we don't know what OS will be running?

This is ridiculous.

Just tell your admin what virtual hardware you really need. (Or tell
them to give you a proper interface to configure your VMs 

Re: [Qemu-block] [Qemu-devel] Question about xen disk unplug support for ahci missed in qemu

2015-10-16 Thread Paul Durrant
> -Original Message-
> From: Kevin Wolf [mailto:kw...@redhat.com]
> Sent: 16 October 2015 15:04
> To: Paul Durrant
> Cc: Fabio Fantoni; Stefano Stabellini; John Snow; Anthony Perard; qemu-
> de...@nongnu.org; xen-de...@lists.xen.org; qemu-block@nongnu.org
> Subject: Re: [Qemu-devel] Question about xen disk unplug support for ahci
> missed in qemu
> 
> Am 14.10.2015 um 14:48 hat Paul Durrant geschrieben:
> > > -Original Message-
> > > From: Fabio Fantoni [mailto:fabio.fant...@m2r.biz]
> > > Sent: 14 October 2015 12:12
> > > To: Kevin Wolf; Stefano Stabellini
> > > Cc: John Snow; Anthony Perard; qemu-de...@nongnu.org; xen-
> > > de...@lists.xen.org; qemu-block@nongnu.org; Paul Durrant
> > > Subject: Re: [Qemu-devel] Question about xen disk unplug support for
> ahci
> > > missed in qemu
> > >
> > >
> > >
> > > Il 14/10/2015 11:47, Kevin Wolf ha scritto:
> > > > [ CC qemu-block ]
> > > >
> > > > Am 13.10.2015 um 19:10 hat Stefano Stabellini geschrieben:
> > > >> On Tue, 13 Oct 2015, John Snow wrote:
> > > >>> On 10/13/2015 11:55 AM, Fabio Fantoni wrote:
> > >  I added ahci disk support in libxl and using it for week seems that
> was
> > >  ok, after a reply of Stefano Stabellini seems that xen disk unplug
> > >  support only ide disks:
> > > 
> > >
> http://git.qemu.org/?p=qemu.git;a=commitdiff;h=679f4f8b178e7c66fbc2f39
> > > c905374ee8663d5d8
> > > 
> > >  Today Paul Durrant told me that even if pv disk is ok also with ahci
> and
> > >  the emulated one is offline can be a risk:
> > >  http://lists.xenproject.org/archives/html/win-pv-devel/2015-
> > > 10/msg00021.html
> > > 
> > > 
> > >  I tried to take a fast look in qemu code but I not understand the
> > > needed
> > >  thing for add the xen disk unplug support also for ahci, can
> someone do
> > >  it or tell me useful information for do it please?
> > > 
> > >  Thanks for any reply and sorry for my bad english.
> > > 
> > > >>> I'm not entirely sure what features you need AHCI to support in
> order
> > > >>> for Xen to be happy.
> > > >>>
> > > >>> I'd guess hotplugging, but where I get confused is that IDE disks 
> > > >>> don't
> > > >>> support hotplugging either, so I guess I'm not sure sure what you
> need.
> > > >>>
> > > >>> Stefano, can you help bridge my Xen knowledge gap?
> > > >>
> > > >> Hi John,
> > > >>
> > > >> we need something like hw/i386/xen/xen_platform.c:unplug_disks
> but
> > > that
> > > >> can unplug AHCI disk. And by unplug, I mean "make disappear" like
> > > >> pci_piix3_xen_ide_unplug does for ide.
> > > > Maybe this would be the right time to stop the craziness with your
> > > > hybrid IDE/xendisk setup. It's a horrible thing that would never happen
> > > > on real hardware.
> >
> > Unfortunately, it's going to be difficult to remove such 'craziness' when 
> > you
> don't know a priori whether the VM has PV drivers or not.
> 
> Why wouldn't you know that beforehand? I mean, even on real hardware
> you
> can have different disk interfaces (IDE, AHCI, SCSI) and you install
> the exact driver that your hardware needs. You just do the same thing on
> VM: If your hardware is PV, you install a PV driver. If your hardware is
> IDE, you install an IDE driver. Whether it's PV or IDE is something that
> you, the user, decided when configuring the VM, so you definitely know.
> 

That's not necessarily true. The host admin that provisions the VM does not 
necessarily know what OS the user of that VM will install. The admin may just 
be providing a generic VM with an emulated CD drive that the user can point at 
any ISO they want.

So, as a host admin, if you provide a VM with only PV backends and your user is 
trying to boot an OS with no PV drivers they are not going to be happy, so you 
provide emulated devices. Then, at some point later, when the user installs PV 
drivers, there really should be some way for those drivers to start up without 
any need to contact the host admin and have the VM reconfigured.

  Paul

> Kevin



Re: [Qemu-block] [Qemu-devel] Question about xen disk unplug support for ahci missed in qemu

2015-10-16 Thread Kevin Wolf
Am 16.10.2015 um 17:10 hat Paul Durrant geschrieben:
> > -Original Message-
> > From: Kevin Wolf [mailto:kw...@redhat.com]
> > Sent: 16 October 2015 16:02
> > To: Paul Durrant
> > Cc: Fabio Fantoni; Stefano Stabellini; John Snow; Anthony Perard; qemu-
> > de...@nongnu.org; xen-de...@lists.xen.org; qemu-block@nongnu.org
> > Subject: Re: [Qemu-devel] Question about xen disk unplug support for ahci
> > missed in qemu
> > 
> > Am 16.10.2015 um 16:24 hat Paul Durrant geschrieben:
> > > > -Original Message-
> > > > From: Kevin Wolf [mailto:kw...@redhat.com]
> > > > Sent: 16 October 2015 15:04
> > > > To: Paul Durrant
> > > > Cc: Fabio Fantoni; Stefano Stabellini; John Snow; Anthony Perard; qemu-
> > > > de...@nongnu.org; xen-de...@lists.xen.org; qemu-block@nongnu.org
> > > > Subject: Re: [Qemu-devel] Question about xen disk unplug support for
> > ahci
> > > > missed in qemu
> > > >
> > > > Am 14.10.2015 um 14:48 hat Paul Durrant geschrieben:
> > > > > > -Original Message-
> > > > > > From: Fabio Fantoni [mailto:fabio.fant...@m2r.biz]
> > > > > > Sent: 14 October 2015 12:12
> > > > > > To: Kevin Wolf; Stefano Stabellini
> > > > > > Cc: John Snow; Anthony Perard; qemu-de...@nongnu.org; xen-
> > > > > > de...@lists.xen.org; qemu-block@nongnu.org; Paul Durrant
> > > > > > Subject: Re: [Qemu-devel] Question about xen disk unplug support
> > for
> > > > ahci
> > > > > > missed in qemu
> > > > > >
> > > > > >
> > > > > >
> > > > > > Il 14/10/2015 11:47, Kevin Wolf ha scritto:
> > > > > > > [ CC qemu-block ]
> > > > > > >
> > > > > > > Am 13.10.2015 um 19:10 hat Stefano Stabellini geschrieben:
> > > > > > >> On Tue, 13 Oct 2015, John Snow wrote:
> > > > > > >>> On 10/13/2015 11:55 AM, Fabio Fantoni wrote:
> > > > > >  I added ahci disk support in libxl and using it for week seems
> > that
> > > > was
> > > > > >  ok, after a reply of Stefano Stabellini seems that xen disk 
> > > > > >  unplug
> > > > > >  support only ide disks:
> > > > > > 
> > > > > >
> > > >
> > http://git.qemu.org/?p=qemu.git;a=commitdiff;h=679f4f8b178e7c66fbc2f39
> > > > > > c905374ee8663d5d8
> > > > > > 
> > > > > >  Today Paul Durrant told me that even if pv disk is ok also with
> > ahci
> > > > and
> > > > > >  the emulated one is offline can be a risk:
> > > > > >  http://lists.xenproject.org/archives/html/win-pv-devel/2015-
> > > > > > 10/msg00021.html
> > > > > > 
> > > > > > 
> > > > > >  I tried to take a fast look in qemu code but I not understand 
> > > > > >  the
> > > > > > needed
> > > > > >  thing for add the xen disk unplug support also for ahci, can
> > > > someone do
> > > > > >  it or tell me useful information for do it please?
> > > > > > 
> > > > > >  Thanks for any reply and sorry for my bad english.
> > > > > > 
> > > > > > >>> I'm not entirely sure what features you need AHCI to support in
> > > > order
> > > > > > >>> for Xen to be happy.
> > > > > > >>>
> > > > > > >>> I'd guess hotplugging, but where I get confused is that IDE 
> > > > > > >>> disks
> > don't
> > > > > > >>> support hotplugging either, so I guess I'm not sure sure what 
> > > > > > >>> you
> > > > need.
> > > > > > >>>
> > > > > > >>> Stefano, can you help bridge my Xen knowledge gap?
> > > > > > >>
> > > > > > >> Hi John,
> > > > > > >>
> > > > > > >> we need something like
> > hw/i386/xen/xen_platform.c:unplug_disks
> > > > but
> > > > > > that
> > > > > > >> can unplug AHCI disk. And by unplug, I mean "make disappear" like
> > > > > > >> pci_piix3_xen_ide_unplug does for ide.
> > > > > > > Maybe this would be the right time to stop the craziness with your
> > > > > > > hybrid IDE/xendisk setup. It's a horrible thing that would never
> > happen
> > > > > > > on real hardware.
> > > > >
> > > > > Unfortunately, it's going to be difficult to remove such 'craziness' 
> > > > > when
> > you
> > > > don't know a priori whether the VM has PV drivers or not.
> > > >
> > > > Why wouldn't you know that beforehand? I mean, even on real
> > hardware
> > > > you
> > > > can have different disk interfaces (IDE, AHCI, SCSI) and you install
> > > > the exact driver that your hardware needs. You just do the same thing on
> > > > VM: If your hardware is PV, you install a PV driver. If your hardware is
> > > > IDE, you install an IDE driver. Whether it's PV or IDE is something that
> > > > you, the user, decided when configuring the VM, so you definitely know.
> > > >
> > >
> > > That's not necessarily true. The host admin that provisions the VM does 
> > > not
> > necessarily know what OS the user of that VM will install. The admin may 
> > just
> > be providing a generic VM with an emulated CD drive that the user can point
> > at any ISO they want.
> > >
> > > So, as a host admin, if you provide a VM with only PV backends and your
> > user is trying to boot an OS with no PV drivers they are not going to be
> > happy, so you provide emulated devices. Then, at some point 

Re: [Qemu-block] [Qemu-devel] Question about xen disk unplug support for ahci missed in qemu

2015-10-16 Thread Paul Durrant
> -Original Message-
> From: Kevin Wolf [mailto:kw...@redhat.com]
> Sent: 16 October 2015 17:12
> To: Paul Durrant
> Cc: Fabio Fantoni; Stefano Stabellini; John Snow; Anthony Perard; qemu-
> de...@nongnu.org; xen-de...@lists.xen.org; qemu-block@nongnu.org
> Subject: Re: [Qemu-devel] Question about xen disk unplug support for ahci
> missed in qemu
> 
> Am 16.10.2015 um 17:10 hat Paul Durrant geschrieben:
> > > -Original Message-
> > > From: Kevin Wolf [mailto:kw...@redhat.com]
> > > Sent: 16 October 2015 16:02
> > > To: Paul Durrant
> > > Cc: Fabio Fantoni; Stefano Stabellini; John Snow; Anthony Perard; qemu-
> > > de...@nongnu.org; xen-de...@lists.xen.org; qemu-block@nongnu.org
> > > Subject: Re: [Qemu-devel] Question about xen disk unplug support for
> ahci
> > > missed in qemu
> > >
> > > Am 16.10.2015 um 16:24 hat Paul Durrant geschrieben:
> > > > > -Original Message-
> > > > > From: Kevin Wolf [mailto:kw...@redhat.com]
> > > > > Sent: 16 October 2015 15:04
> > > > > To: Paul Durrant
> > > > > Cc: Fabio Fantoni; Stefano Stabellini; John Snow; Anthony Perard;
> qemu-
> > > > > de...@nongnu.org; xen-de...@lists.xen.org; qemu-
> bl...@nongnu.org
> > > > > Subject: Re: [Qemu-devel] Question about xen disk unplug support
> for
> > > ahci
> > > > > missed in qemu
> > > > >
> > > > > Am 14.10.2015 um 14:48 hat Paul Durrant geschrieben:
> > > > > > > -Original Message-
> > > > > > > From: Fabio Fantoni [mailto:fabio.fant...@m2r.biz]
> > > > > > > Sent: 14 October 2015 12:12
> > > > > > > To: Kevin Wolf; Stefano Stabellini
> > > > > > > Cc: John Snow; Anthony Perard; qemu-de...@nongnu.org; xen-
> > > > > > > de...@lists.xen.org; qemu-block@nongnu.org; Paul Durrant
> > > > > > > Subject: Re: [Qemu-devel] Question about xen disk unplug
> support
> > > for
> > > > > ahci
> > > > > > > missed in qemu
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Il 14/10/2015 11:47, Kevin Wolf ha scritto:
> > > > > > > > [ CC qemu-block ]
> > > > > > > >
> > > > > > > > Am 13.10.2015 um 19:10 hat Stefano Stabellini geschrieben:
> > > > > > > >> On Tue, 13 Oct 2015, John Snow wrote:
> > > > > > > >>> On 10/13/2015 11:55 AM, Fabio Fantoni wrote:
> > > > > > >  I added ahci disk support in libxl and using it for week 
> > > > > > >  seems
> > > that
> > > > > was
> > > > > > >  ok, after a reply of Stefano Stabellini seems that xen disk
> unplug
> > > > > > >  support only ide disks:
> > > > > > > 
> > > > > > >
> > > > >
> > >
> http://git.qemu.org/?p=qemu.git;a=commitdiff;h=679f4f8b178e7c66fbc2f39
> > > > > > > c905374ee8663d5d8
> > > > > > > 
> > > > > > >  Today Paul Durrant told me that even if pv disk is ok also
> with
> > > ahci
> > > > > and
> > > > > > >  the emulated one is offline can be a risk:
> > > > > > >  http://lists.xenproject.org/archives/html/win-pv-
> devel/2015-
> > > > > > > 10/msg00021.html
> > > > > > > 
> > > > > > > 
> > > > > > >  I tried to take a fast look in qemu code but I not understand
> the
> > > > > > > needed
> > > > > > >  thing for add the xen disk unplug support also for ahci, can
> > > > > someone do
> > > > > > >  it or tell me useful information for do it please?
> > > > > > > 
> > > > > > >  Thanks for any reply and sorry for my bad english.
> > > > > > > 
> > > > > > > >>> I'm not entirely sure what features you need AHCI to support
> in
> > > > > order
> > > > > > > >>> for Xen to be happy.
> > > > > > > >>>
> > > > > > > >>> I'd guess hotplugging, but where I get confused is that IDE
> disks
> > > don't
> > > > > > > >>> support hotplugging either, so I guess I'm not sure sure what
> you
> > > > > need.
> > > > > > > >>>
> > > > > > > >>> Stefano, can you help bridge my Xen knowledge gap?
> > > > > > > >>
> > > > > > > >> Hi John,
> > > > > > > >>
> > > > > > > >> we need something like
> > > hw/i386/xen/xen_platform.c:unplug_disks
> > > > > but
> > > > > > > that
> > > > > > > >> can unplug AHCI disk. And by unplug, I mean "make disappear"
> like
> > > > > > > >> pci_piix3_xen_ide_unplug does for ide.
> > > > > > > > Maybe this would be the right time to stop the craziness with
> your
> > > > > > > > hybrid IDE/xendisk setup. It's a horrible thing that would never
> > > happen
> > > > > > > > on real hardware.
> > > > > >
> > > > > > Unfortunately, it's going to be difficult to remove such 'craziness'
> when
> > > you
> > > > > don't know a priori whether the VM has PV drivers or not.
> > > > >
> > > > > Why wouldn't you know that beforehand? I mean, even on real
> > > hardware
> > > > > you
> > > > > can have different disk interfaces (IDE, AHCI, SCSI) and you install
> > > > > the exact driver that your hardware needs. You just do the same
> thing on
> > > > > VM: If your hardware is PV, you install a PV driver. If your hardware 
> > > > > is
> > > > > IDE, you install an IDE driver. Whether it's PV or IDE is something 
> > > > > that
> > > > > you, the user, decided when 

Re: [Qemu-block] [Qemu-devel] [RFC] transactions: add transaction-wide property

2015-10-16 Thread John Snow


On 10/16/2015 08:23 AM, Stefan Hajnoczi wrote:
> On Mon, Oct 12, 2015 at 12:50:20PM -0400, John Snow wrote:
>> Ping -- any consensus on how we should implement the "do-or-die"
>> argument for transactions that start block jobs? :)
>>
>> This patch may look a little hokey in how it boxes arguments, but I can
>> re-do it on top of Eric Blake's very official way of boxing arguments,
>> when the QAPI dust settles.
> 
> I don't understand what you are trying to do after staring at the email
> for 5 minutes.  Maybe the other reviewers hit the same problem and
> haven't responded.
> 
> What is the problem you're trying to solve?
> 
> Stefan
> 

Sorry...

What I am trying to do is to add the transactional blocker property to
the *transaction* command and not as an argument to each individual action.

There was some discussion on this so I wanted to just send an RFC to
show what I had in mind.

This series applies on top of Fam's latest series and moves the
arguments from each action to a transaction-wide property.



Re: [Qemu-block] [Qemu-devel] Question about xen disk unplug support for ahci missed in qemu

2015-10-16 Thread Kevin Wolf
Am 16.10.2015 um 18:20 hat Paul Durrant geschrieben:
> > -Original Message-
> > From: Kevin Wolf [mailto:kw...@redhat.com]
> > Sent: 16 October 2015 17:12
> > To: Paul Durrant
> > Cc: Fabio Fantoni; Stefano Stabellini; John Snow; Anthony Perard; qemu-
> > de...@nongnu.org; xen-de...@lists.xen.org; qemu-block@nongnu.org
> > Subject: Re: [Qemu-devel] Question about xen disk unplug support for ahci
> > missed in qemu
> > 
> > Am 16.10.2015 um 17:10 hat Paul Durrant geschrieben:
> > > > -Original Message-
> > > > From: Kevin Wolf [mailto:kw...@redhat.com]
> > > > Sent: 16 October 2015 16:02
> > > > To: Paul Durrant
> > > > Cc: Fabio Fantoni; Stefano Stabellini; John Snow; Anthony Perard; qemu-
> > > > de...@nongnu.org; xen-de...@lists.xen.org; qemu-block@nongnu.org
> > > > Subject: Re: [Qemu-devel] Question about xen disk unplug support for
> > ahci
> > > > missed in qemu
> > > >
> > > > Am 16.10.2015 um 16:24 hat Paul Durrant geschrieben:
> > > > > > -Original Message-
> > > > > > From: Kevin Wolf [mailto:kw...@redhat.com]
> > > > > > Sent: 16 October 2015 15:04
> > > > > > To: Paul Durrant
> > > > > > Cc: Fabio Fantoni; Stefano Stabellini; John Snow; Anthony Perard;
> > qemu-
> > > > > > de...@nongnu.org; xen-de...@lists.xen.org; qemu-
> > bl...@nongnu.org
> > > > > > Subject: Re: [Qemu-devel] Question about xen disk unplug support
> > for
> > > > ahci
> > > > > > missed in qemu
> > > > > >
> > > > > > Am 14.10.2015 um 14:48 hat Paul Durrant geschrieben:
> > > > > > > > -Original Message-
> > > > > > > > From: Fabio Fantoni [mailto:fabio.fant...@m2r.biz]
> > > > > > > > Sent: 14 October 2015 12:12
> > > > > > > > To: Kevin Wolf; Stefano Stabellini
> > > > > > > > Cc: John Snow; Anthony Perard; qemu-de...@nongnu.org; xen-
> > > > > > > > de...@lists.xen.org; qemu-block@nongnu.org; Paul Durrant
> > > > > > > > Subject: Re: [Qemu-devel] Question about xen disk unplug
> > support
> > > > for
> > > > > > ahci
> > > > > > > > missed in qemu
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > Il 14/10/2015 11:47, Kevin Wolf ha scritto:
> > > > > > > > > [ CC qemu-block ]
> > > > > > > > >
> > > > > > > > > Am 13.10.2015 um 19:10 hat Stefano Stabellini geschrieben:
> > > > > > > > >> On Tue, 13 Oct 2015, John Snow wrote:
> > > > > > > > >>> On 10/13/2015 11:55 AM, Fabio Fantoni wrote:
> > > > > > > >  I added ahci disk support in libxl and using it for week 
> > > > > > > >  seems
> > > > that
> > > > > > was
> > > > > > > >  ok, after a reply of Stefano Stabellini seems that xen disk
> > unplug
> > > > > > > >  support only ide disks:
> > > > > > > > 
> > > > > > > >
> > > > > >
> > > >
> > http://git.qemu.org/?p=qemu.git;a=commitdiff;h=679f4f8b178e7c66fbc2f39
> > > > > > > > c905374ee8663d5d8
> > > > > > > > 
> > > > > > > >  Today Paul Durrant told me that even if pv disk is ok also
> > with
> > > > ahci
> > > > > > and
> > > > > > > >  the emulated one is offline can be a risk:
> > > > > > > >  http://lists.xenproject.org/archives/html/win-pv-
> > devel/2015-
> > > > > > > > 10/msg00021.html
> > > > > > > > 
> > > > > > > > 
> > > > > > > >  I tried to take a fast look in qemu code but I not 
> > > > > > > >  understand
> > the
> > > > > > > > needed
> > > > > > > >  thing for add the xen disk unplug support also for ahci, 
> > > > > > > >  can
> > > > > > someone do
> > > > > > > >  it or tell me useful information for do it please?
> > > > > > > > 
> > > > > > > >  Thanks for any reply and sorry for my bad english.
> > > > > > > > 
> > > > > > > > >>> I'm not entirely sure what features you need AHCI to support
> > in
> > > > > > order
> > > > > > > > >>> for Xen to be happy.
> > > > > > > > >>>
> > > > > > > > >>> I'd guess hotplugging, but where I get confused is that IDE
> > disks
> > > > don't
> > > > > > > > >>> support hotplugging either, so I guess I'm not sure sure 
> > > > > > > > >>> what
> > you
> > > > > > need.
> > > > > > > > >>>
> > > > > > > > >>> Stefano, can you help bridge my Xen knowledge gap?
> > > > > > > > >>
> > > > > > > > >> Hi John,
> > > > > > > > >>
> > > > > > > > >> we need something like
> > > > hw/i386/xen/xen_platform.c:unplug_disks
> > > > > > but
> > > > > > > > that
> > > > > > > > >> can unplug AHCI disk. And by unplug, I mean "make disappear"
> > like
> > > > > > > > >> pci_piix3_xen_ide_unplug does for ide.
> > > > > > > > > Maybe this would be the right time to stop the craziness with
> > your
> > > > > > > > > hybrid IDE/xendisk setup. It's a horrible thing that would 
> > > > > > > > > never
> > > > happen
> > > > > > > > > on real hardware.
> > > > > > >
> > > > > > > Unfortunately, it's going to be difficult to remove such 
> > > > > > > 'craziness'
> > when
> > > > you
> > > > > > don't know a priori whether the VM has PV drivers or not.
> > > > > >
> > > > > > Why wouldn't you know that beforehand? I mean, even on real
> > > > hardware
> > > > > 

Re: [Qemu-block] [Qemu-devel] Question about xen disk unplug support for ahci missed in qemu

2015-10-16 Thread Paul Durrant
> -Original Message-
> From: Kevin Wolf [mailto:kw...@redhat.com]
> Sent: 16 October 2015 17:43
> To: Paul Durrant
> Cc: Fabio Fantoni; Stefano Stabellini; John Snow; Anthony Perard; qemu-
> de...@nongnu.org; xen-de...@lists.xen.org; qemu-block@nongnu.org
> Subject: Re: [Qemu-devel] Question about xen disk unplug support for ahci
> missed in qemu
> 
> Am 16.10.2015 um 18:20 hat Paul Durrant geschrieben:
> > > -Original Message-
> > > From: Kevin Wolf [mailto:kw...@redhat.com]
> > > Sent: 16 October 2015 17:12
> > > To: Paul Durrant
> > > Cc: Fabio Fantoni; Stefano Stabellini; John Snow; Anthony Perard; qemu-
> > > de...@nongnu.org; xen-de...@lists.xen.org; qemu-block@nongnu.org
> > > Subject: Re: [Qemu-devel] Question about xen disk unplug support for
> ahci
> > > missed in qemu
> > >
> > > Am 16.10.2015 um 17:10 hat Paul Durrant geschrieben:
> > > > > -Original Message-
> > > > > From: Kevin Wolf [mailto:kw...@redhat.com]
> > > > > Sent: 16 October 2015 16:02
> > > > > To: Paul Durrant
> > > > > Cc: Fabio Fantoni; Stefano Stabellini; John Snow; Anthony Perard;
> qemu-
> > > > > de...@nongnu.org; xen-de...@lists.xen.org; qemu-
> bl...@nongnu.org
> > > > > Subject: Re: [Qemu-devel] Question about xen disk unplug support
> for
> > > ahci
> > > > > missed in qemu
> > > > >
> > > > > Am 16.10.2015 um 16:24 hat Paul Durrant geschrieben:
> > > > > > > -Original Message-
> > > > > > > From: Kevin Wolf [mailto:kw...@redhat.com]
> > > > > > > Sent: 16 October 2015 15:04
> > > > > > > To: Paul Durrant
> > > > > > > Cc: Fabio Fantoni; Stefano Stabellini; John Snow; Anthony Perard;
> > > qemu-
> > > > > > > de...@nongnu.org; xen-de...@lists.xen.org; qemu-
> > > bl...@nongnu.org
> > > > > > > Subject: Re: [Qemu-devel] Question about xen disk unplug
> support
> > > for
> > > > > ahci
> > > > > > > missed in qemu
> > > > > > >
> > > > > > > Am 14.10.2015 um 14:48 hat Paul Durrant geschrieben:
> > > > > > > > > -Original Message-
> > > > > > > > > From: Fabio Fantoni [mailto:fabio.fant...@m2r.biz]
> > > > > > > > > Sent: 14 October 2015 12:12
> > > > > > > > > To: Kevin Wolf; Stefano Stabellini
> > > > > > > > > Cc: John Snow; Anthony Perard; qemu-de...@nongnu.org;
> xen-
> > > > > > > > > de...@lists.xen.org; qemu-block@nongnu.org; Paul Durrant
> > > > > > > > > Subject: Re: [Qemu-devel] Question about xen disk unplug
> > > support
> > > > > for
> > > > > > > ahci
> > > > > > > > > missed in qemu
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Il 14/10/2015 11:47, Kevin Wolf ha scritto:
> > > > > > > > > > [ CC qemu-block ]
> > > > > > > > > >
> > > > > > > > > > Am 13.10.2015 um 19:10 hat Stefano Stabellini geschrieben:
> > > > > > > > > >> On Tue, 13 Oct 2015, John Snow wrote:
> > > > > > > > > >>> On 10/13/2015 11:55 AM, Fabio Fantoni wrote:
> > > > > > > > >  I added ahci disk support in libxl and using it for week
> seems
> > > > > that
> > > > > > > was
> > > > > > > > >  ok, after a reply of Stefano Stabellini seems that xen 
> > > > > > > > >  disk
> > > unplug
> > > > > > > > >  support only ide disks:
> > > > > > > > > 
> > > > > > > > >
> > > > > > >
> > > > >
> > >
> http://git.qemu.org/?p=qemu.git;a=commitdiff;h=679f4f8b178e7c66fbc2f39
> > > > > > > > > c905374ee8663d5d8
> > > > > > > > > 
> > > > > > > > >  Today Paul Durrant told me that even if pv disk is ok 
> > > > > > > > >  also
> > > with
> > > > > ahci
> > > > > > > and
> > > > > > > > >  the emulated one is offline can be a risk:
> > > > > > > > >  http://lists.xenproject.org/archives/html/win-pv-
> > > devel/2015-
> > > > > > > > > 10/msg00021.html
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > >  I tried to take a fast look in qemu code but I not
> understand
> > > the
> > > > > > > > > needed
> > > > > > > > >  thing for add the xen disk unplug support also for ahci,
> can
> > > > > > > someone do
> > > > > > > > >  it or tell me useful information for do it please?
> > > > > > > > > 
> > > > > > > > >  Thanks for any reply and sorry for my bad english.
> > > > > > > > > 
> > > > > > > > > >>> I'm not entirely sure what features you need AHCI to
> support
> > > in
> > > > > > > order
> > > > > > > > > >>> for Xen to be happy.
> > > > > > > > > >>>
> > > > > > > > > >>> I'd guess hotplugging, but where I get confused is that 
> > > > > > > > > >>> IDE
> > > disks
> > > > > don't
> > > > > > > > > >>> support hotplugging either, so I guess I'm not sure sure
> what
> > > you
> > > > > > > need.
> > > > > > > > > >>>
> > > > > > > > > >>> Stefano, can you help bridge my Xen knowledge gap?
> > > > > > > > > >>
> > > > > > > > > >> Hi John,
> > > > > > > > > >>
> > > > > > > > > >> we need something like
> > > > > hw/i386/xen/xen_platform.c:unplug_disks
> > > > > > > but
> > > > > > > > > that
> > > > > > > > > >> can unplug AHCI disk. And by unplug, I mean "make
> disappear"
> > > like
> > > > > > > > > >>