Re: [libvirt] [Qemu-devel] qapi DEVICE_DELETED event issued *before* instance_finalize?!

2016-09-06 Thread Paolo Bonzini


On 06/09/2016 16:02, Laine Stump wrote:
>>>
>> It seems like this is just pointing out another flaw in the semantics
>> of DEVICE_DELETED, a device can linger without a device id, so there's
>> no way to reference it via QMP.
> 
> Ah, right. I hadn't caught that. Yeah, since it's the device id that's
> used to keep track of which device the event is for, then it seems
> impossible to have an event that's issued after the device id is already
> recycled.

If a device lingers for more than say a second (most likely less---what
you're looking for is one or two synchronize_rcu cycles), it would be a
bug in QEMU.

Paolo

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] qapi DEVICE_DELETED event issued *before* instance_finalize?!

2016-09-06 Thread Paolo Bonzini


On 06/09/2016 04:18, Alex Williamson wrote:
> On Mon, 5 Sep 2016 11:36:55 +0200
> Paolo Bonzini  wrote:
>> DEVICE_DELETED does have a meaning: management cannot talk to the device
>> anymore in QMP once it is raised.
> 
> It seems like this is just pointing out another flaw in the semantics
> of DEVICE_DELETED, a device can linger without a device id, so there's
> no way to reference it via QMP.  QEMU can't signal anything more about
> the device, nor can the VM admin perform any further operations on it.
> It's like detecting planets around distant stars, libvirt can't actually
> see the device, it can only monitor the effects the device has on the
> VM.  This is broken and it seems like the fix is to push both the
> release of the device id and the DEVICE_DELETED notification until
> after the instance_finalize callback.

You can't do that.  Think of it as DEVICE_DELETED being "removal" and
instance_finalize being "reference count has gone to 0".  You cannot
make the reference count go to 0 unless you have disconnected the device
from the parent, and the parent is the one that remembers the device id.
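
The ordering described here can be sketched with a toy model. This is only an illustration under stated assumptions: the `Device` and `Parent` classes below are made up and are not QEMU's QOM API. The model shows that unparenting (which forgets the id and raises the event) has to come first, and that finalization only happens once the last remaining reference is dropped.

```python
# Toy model only: these classes are illustrative and are not QEMU's QOM API.

class Device:
    def __init__(self, log):
        self.refs = 0
        self.log = log

    def ref(self):
        self.refs += 1

    def unref(self):
        self.refs -= 1
        if self.refs == 0:
            # Stand-in for instance_finalize: backends are released here.
            self.log.append("instance_finalize")


class Parent:
    """Stand-in for the container that maps device ids to children."""

    def __init__(self, log):
        self.children = {}
        self.log = log

    def add(self, dev_id, dev):
        dev.ref()                         # the parent holds one reference
        self.children[dev_id] = dev

    def unparent(self, dev_id):
        dev = self.children.pop(dev_id)   # the id is forgotten here
        self.log.append("DEVICE_DELETED " + dev_id)
        dev.unref()                       # not necessarily the last reference


def demo():
    log = []
    parent = Parent(log)
    dev = Device(log)
    parent.add("hostdev0", dev)
    dev.ref()                    # e.g. something else still pins the device
    parent.unparent("hostdev0")  # removal: the event is raised, the id is gone
    dev.unref()                  # the lingering reference finally goes away
    return log
```

In this model `demo()` records the event before the finalize step, which matches the order seen in the backtraces at the start of the thread.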

>> Technically what libvirt wants to know for VFIO is not whether the
>> device is gone; it's whether the device's _backend_ (the VFIO file
>> descriptor) is gone.  The device backend could have been a separate QOM
>> object, but it isn't.
>>
>> So perhaps we need a new event that is specific to VFIO?
> 
> This immediately sounds like the wrong path.  A) Why is this vfio
> specific?

Because VFIO doesn't have a separate backend object.  instance_finalize
is already where the backends are released.  You cannot for example
reuse a character device until instance_finalize (no event is generated,
but we could certainly add one to qemu-char.c if deemed useful).
VFIO_DELETED is just another example of the same thing.

It just happens that the host device is not a separate "-object
vfio-backend-pci,sysfs=..." but it's embedded in "-device vfio-pci" so
the code for the new event must go in hw/vfio rather than a hypothetical
backends/vfio.

I'm not saying that VFIO should have a separate backend object, that
would probably be overengineering.  But in some cases you get saner
semantics if you think of VFIO as a composition of two things.

> B) Without a device id, how are we going to signal an
> event?

For example by sysfs path or host device path---it makes sense to use
properties of the backend since this signals that the backend is now free.
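
As a rough sketch of what such a backend-keyed event could look like on the management side: the event name VFIO_DELETED is only the placeholder used earlier in this mail, and the `sysfsdev` data field and both helper functions are assumptions for illustration. No such event exists in QEMU. Only the outer shape (`event`, `data`, `timestamp`) follows the usual QMP event layout.

```python
import json

# Hypothetical event: neither VFIO_DELETED as a QMP event nor the "sysfsdev"
# data field exists in QEMU; both are assumptions for illustration.
def make_backend_released_event(sysfs_path, seconds, microseconds):
    return {
        "event": "VFIO_DELETED",
        "data": {"sysfsdev": sysfs_path},
        "timestamp": {"seconds": seconds, "microseconds": microseconds},
    }


def backend_is_free(event_line, sysfs_path):
    """Return True if the JSON line announces release of this host device."""
    event = json.loads(event_line)
    return (event.get("event") == "VFIO_DELETED"
            and event.get("data", {}).get("sysfsdev") == sysfs_path)
```

The point of the sketch is that the match is done on a property of the backend (the host path), so it does not depend on the device id still being known.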

Paolo



Re: [libvirt] [Qemu-devel] qapi DEVICE_DELETED event issued *before* instance_finalize?!

2016-09-06 Thread Laine Stump

On 09/05/2016 10:18 PM, Alex Williamson wrote:
> On Mon, 5 Sep 2016 11:36:55 +0200
> Paolo Bonzini  wrote:
>
>> On 05/09/2016 11:23, Markus Armbruster wrote:
>>>> On the other hand, it is clearly documented that the DEVICE_DELETED
>>>> event is sent as soon as guest acknowledges completion of device
>>>> removal. So libvirt's buggy if we'd follow documentation strictly. But
>>>> then again, I don't see much information value in "guest has detached
>>>> device but qemu hasn't yet" event. Libvirt would ignore such event.
>>>
>>> Unless I'm missing something, libvirt needs an event that signals "Guest
>>> and QEMU are done with this device".  Current DEVICE_DELETED isn't.
>>>
>>> Can we imagine a use for current DEVICE_DELETED, i.e. "Guest is done,
>>> but QEMU isn't"?
>>>
>>> Would anything break if we changed semantics of DEVICE_DELETED to what
>>> libvirt actually needs?
>>>
>>> If the answers are "no" and "no", let's do it.
>>
>> There is a subtle aspect of this.  After the current DEVICE_DELETED, the
>> device id is not used any more.  So technically you could have
>>
>>    device_add bar,id=foo
>>    device_del foo
>>
>>    // something in QEMU prevents the device from going away?
>>    // for example there is a storage issue that blocks completion
>>    // of a read(), and bar is a storage device
>>
>>    device_add bar,id=foo
>>    device_del foo
>>
>>    // which foo is being deleted?  The old one or the new one?
>>    event DEVICE_DELETED
>>
>> DEVICE_DELETED does have a meaning: management cannot talk to the device
>> anymore in QMP once it is raised.
>
> It seems like this is just pointing out another flaw in the semantics
> of DEVICE_DELETED, a device can linger without a device id, so there's
> no way to reference it via QMP.

Ah, right. I hadn't caught that. Yeah, since it's the device id that's 
used to keep track of which device the event is for, then it seems 
impossible to have an event that's issued after the device id is already 
recycled.



> QEMU can't signal anything more about
> the device, nor can the VM admin perform any further operations on it.
> It's like detecting planets around distant stars, libvirt can't actually
> see the device, it can only monitor the effects the device has on the
> VM.  This is broken and it seems like the fix is to push both the
> release of the device id and the DEVICE_DELETED notification until
> after the instance_finalize callback.  Doesn't that solve the nuance
> you've identified here as well?


This works perfectly for libvirt.



Re: [libvirt] [Qemu-devel] qapi DEVICE_DELETED event issued *before* instance_finalize?!

2016-09-05 Thread Alex Williamson
On Mon, 5 Sep 2016 11:36:55 +0200
Paolo Bonzini  wrote:

> On 05/09/2016 11:23, Markus Armbruster wrote:
> >> >
> >> > On the other hand, it is clearly documented that the DEVICE_DELETED
> >> > event is sent as soon as guest acknowledges completion of device
> >> > removal. So libvirt's buggy if we'd follow documentation strictly. But
> >> > then again, I don't see much information value in "guest has detached
> >> > device but qemu hasn't yet" event. Libvirt would ignore such event.  
> > Unless I'm missing something, libvirt needs an event that signals "Guest
> > and QEMU are done with this device".  Current DEVICE_DELETED isn't.
> > 
> > Can we imagine a use for current DEVICE_DELETED, i.e. "Guest is done,
> > but QEMU isn't"?
> > 
> > Would anything break if we changed semantics of DEVICE_DELETED to what
> > libvirt actually needs?
> > 
> > If the answers are "no" and "no", let's do it.  
> 
> There is a subtle aspect of this.  After the current DEVICE_DELETED, the
> device id is not used any more.  So technically you could have
> 
>    device_add bar,id=foo
>    device_del foo
> 
>    // something in QEMU prevents the device from going away?
>    // for example there is a storage issue that blocks completion
>    // of a read(), and bar is a storage device
> 
>    device_add bar,id=foo
>    device_del foo
> 
>    // which foo is being deleted?  The old one or the new one?
>    event DEVICE_DELETED
> 
> DEVICE_DELETED does have a meaning: management cannot talk to the device
> anymore in QMP once it is raised.

It seems like this is just pointing out another flaw in the semantics
of DEVICE_DELETED, a device can linger without a device id, so there's
no way to reference it via QMP.  QEMU can't signal anything more about
the device, nor can the VM admin perform any further operations on it.
It's like detecting planets around distant stars, libvirt can't actually
see the device, it can only monitor the effects the device has on the
VM.  This is broken and it seems like the fix is to push both the
release of the device id and the DEVICE_DELETED notification until
after the instance_finalize callback.  Doesn't that solve the nuance
you've identified here as well?

> Technically what libvirt wants to know for VFIO is not whether the
> device is gone; it's whether the device's _backend_ (the VFIO file
> descriptor) is gone.  The device backend could have been a separate QOM
> object, but it isn't.
> 
> So perhaps we need a new event that is specific to VFIO?

This immediately sounds like the wrong path.  A) Why is this vfio
specific?  B) Without a device id, how are we going to signal an
event?  It seems that nobody actually cares about this interim event in
QEMU and releasing the device id prior to the actual device itself is
just as problematic as the premature signal itself.  Thanks,

Alex



Re: [libvirt] [Qemu-devel] qapi DEVICE_DELETED event issued *before* instance_finalize?!

2016-09-05 Thread Laine Stump

On 09/05/2016 05:36 AM, Paolo Bonzini wrote:
> On 05/09/2016 11:23, Markus Armbruster wrote:
>>> On the other hand, it is clearly documented that the DEVICE_DELETED
>>> event is sent as soon as guest acknowledges completion of device
>>> removal. So libvirt's buggy if we'd follow documentation strictly. But
>>> then again, I don't see much information value in "guest has detached
>>> device but qemu hasn't yet" event. Libvirt would ignore such event.
>>
>> Unless I'm missing something, libvirt needs an event that signals "Guest
>> and QEMU are done with this device".  Current DEVICE_DELETED isn't.
>>
>> Can we imagine a use for current DEVICE_DELETED, i.e. "Guest is done,
>> but QEMU isn't"?
>>
>> Would anything break if we changed semantics of DEVICE_DELETED to what
>> libvirt actually needs?
>>
>> If the answers are "no" and "no", let's do it.
>
> There is a subtle aspect of this.  After the current DEVICE_DELETED, the
> device id is not used any more.  So technically you could have
>
>    device_add bar,id=foo
>    device_del foo
>
>    // something in QEMU prevents the device from going away?
>    // for example there is a storage issue that blocks completion
>    // of a read(), and bar is a storage device
>
>    device_add bar,id=foo
>    device_del foo
>
>    // which foo is being deleted?  The old one or the new one?
>    event DEVICE_DELETED
>
> DEVICE_DELETED does have a meaning: management cannot talk to the device
> anymore in QMP once it is raised.
>
> Technically what libvirt wants to know for VFIO is not whether the
> device is gone; it's whether the device's _backend_ (the VFIO file
> descriptor) is gone.  The device backend could have been a separate QOM
> object, but it isn't.
>
> So perhaps we need a new event that is specific to VFIO?


Sigh.

I always hate adding more knobs...

The original reason libvirt asked for the DEVICE_DELETED event was 
because there were cases where libvirt was attempting to re-use the 
device id when it was still in use by qemu, so attempts to attach new 
devices were failing. When it was provided we just assumed that 
"DEVICE_DELETED" meant "everybody is finished with this device, and it's 
safe to recycle all the resources now". I guess we generalized just a 
bit too much.


From libvirt's point of view, I don't see any problem with widening the 
definition of the existing DEVICE_DELETED event. But if that doesn't 
make sense from QEMU's point of view, or if anyone can come up with a 
practical reason for wanting both events, we can of course modify our 
event handling accordingly (the simplest way would be to just ignore 
DEVICE_DELETED in the case of vfio devices, and wait for the new event 
to trigger both freeing of the device ID and re-attaching the device to 
its host driver; trying to release the device ID in response to 
DEVICE_DELETED, and then re-attach the device to the host driver in 
response to a separate event would just be adding an extra layer of 
waiting for no perceptible gain).
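
To make the "simplest way" above concrete, here is a minimal sketch of that event handling. It is not libvirt code: the `Manager` class, the VFIO_DELETED event name (borrowed from the placeholder used elsewhere in this thread) and its `sysfsdev` field are all assumptions for illustration. DEVICE_DELETED is ignored for vfio devices, and the backend event triggers both freeing the id and re-attaching the host driver.

```python
# Sketch only: this is not libvirt code. The event name VFIO_DELETED, the
# "sysfsdev" field and the Manager class are assumptions for illustration.

class Manager:
    def __init__(self):
        self.vfio_ids = {}     # device id -> host sysfs path (vfio only)
        self.used_ids = set()  # ids that must not be reused yet
        self.actions = []      # what the manager decided to do, in order

    def attach(self, dev_id, sysfs_path=None):
        self.used_ids.add(dev_id)
        if sysfs_path is not None:
            self.vfio_ids[dev_id] = sysfs_path

    def handle_event(self, event):
        name, data = event["event"], event.get("data", {})
        if name == "DEVICE_DELETED":
            dev_id = data.get("device")
            if dev_id is None or dev_id in self.vfio_ids:
                return  # vfio device: wait for the backend event instead
            self.used_ids.discard(dev_id)
            self.actions.append("free-id " + dev_id)
        elif name == "VFIO_DELETED":
            path = data.get("sysfsdev")
            for dev_id, known in list(self.vfio_ids.items()):
                if known == path:
                    # One event triggers both steps, as described above.
                    del self.vfio_ids[dev_id]
                    self.used_ids.discard(dev_id)
                    self.actions.append("free-id " + dev_id)
                    self.actions.append("rebind-host-driver " + path)
```

Non-vfio devices keep the existing behaviour in this sketch, so only the vfio path would need the new event.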


Oh, or are you saying that vfio devices would get this new event
*instead of* DEVICE_DELETED? I don't really see the point of that...




> Thanks,
>
> Paolo



Re: [libvirt] [Qemu-devel] qapi DEVICE_DELETED event issued *before* instance_finalize?!

2016-09-05 Thread Paolo Bonzini


On 05/09/2016 11:23, Markus Armbruster wrote:
>> >
>> > On the other hand, it is clearly documented that the DEVICE_DELETED
>> > event is sent as soon as guest acknowledges completion of device
>> > removal. So libvirt's buggy if we'd follow documentation strictly. But
>> > then again, I don't see much information value in "guest has detached
>> > device but qemu hasn't yet" event. Libvirt would ignore such event.
> Unless I'm missing something, libvirt needs an event that signals "Guest
> and QEMU are done with this device".  Current DEVICE_DELETED isn't.
> 
> Can we imagine a use for current DEVICE_DELETED, i.e. "Guest is done,
> but QEMU isn't"?
> 
> Would anything break if we changed semantics of DEVICE_DELETED to what
> libvirt actually needs?
> 
> If the answers are "no" and "no", let's do it.

There is a subtle aspect of this.  After the current DEVICE_DELETED, the
device id is not used any more.  So technically you could have

   device_add bar,id=foo
   device_del foo

   // something in QEMU prevents the device from going away?
   // for example there is a storage issue that blocks completion
   // of a read(), and bar is a storage device

   device_add bar,id=foo
   device_del foo

   // which foo is being deleted?  The old one or the new one?
   event DEVICE_DELETED

DEVICE_DELETED does have a meaning: management cannot talk to the device
anymore in QMP once it is raised.
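
The ambiguity in the sequence above can be reproduced with a small simulation. This is a toy, not QEMU code: the `Monitor` class and its methods are invented for illustration, and it assumes the variant under discussion, where the id is released at device_del time but the event is only sent once the device is finally gone.

```python
# Toy simulation, not QEMU code: "Monitor" and its methods are made up here.

class Monitor:
    def __init__(self):
        self.by_id = {}      # ids currently visible to management
        self.lingering = []  # unparented devices that are not finalized yet
        self.events = []
        self.serial = 0      # internal counter to tell instances apart

    def device_add(self, dev_id):
        if dev_id in self.by_id:
            raise ValueError("duplicate id " + dev_id)
        self.serial += 1
        self.by_id[dev_id] = self.serial

    def device_del(self, dev_id):
        # The id is released immediately, so it can be reused right away.
        self.lingering.append((dev_id, self.by_id.pop(dev_id)))

    def finalize_one(self, index=0):
        # Assumed late event: sent only when the device is really gone.
        dev_id, serial = self.lingering.pop(index)
        self.events.append({"event": "DEVICE_DELETED",
                            "data": {"device": dev_id}})
        return serial


mon = Monitor()
mon.device_add("foo")
mon.device_del("foo")     # first foo lingers, e.g. blocked on I/O
mon.device_add("foo")     # allowed, because the id is already free
mon.device_del("foo")
gone = mon.finalize_one(1)  # the *second* foo happens to finish first
```

The event that management receives carries only the id "foo", so it cannot tell that it was the second instance that went away while the first one is still lingering.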

Technically what libvirt wants to know for VFIO is not whether the
device is gone; it's whether the device's _backend_ (the VFIO file
descriptor) is gone.  The device backend could have been a separate QOM
object, but it isn't.

So perhaps we need a new event that is specific to VFIO?

Thanks,

Paolo



Re: [libvirt] [Qemu-devel] qapi DEVICE_DELETED event issued *before* instance_finalize?!

2016-09-05 Thread Markus Armbruster
Adding Paolo.

Michal Privoznik  writes:

> On 02.09.2016 01:11, Alex Williamson wrote:
>> Hey,
>> 
>> I'm out of my QOM depth, so I'll just beg for help in advance.  I
>> noticed in testing vfio-pci hotunplug that the host seems to be trying
>> to reclaim the device before QEMU is actually done with it, there's a
>> very short race where libvirt has seen the DEVICE_DELETED event and
>> tries to unbind the physical device from vfio-pci, the use count is
>> clearly non-zero because the host driver tries to send a device
>> request, but that event channel has already been torn down.  Nearly
>> immediately after, QEMU finally releases the device, but we can't do a
>> proper reset due to some issues with device references in the kernel.
>> 
>> When I run gdb on QEMU with breakpoints at
>> qapi_event_send_device_deleted() and vfio_instance_finalize(),  the
>> QAPI event happens first.  Clearly this is horribly wrong, right?  I
>> can't unmap my references to the vfio device file until my
>> instance_finalize is called, so I'm always going to have that open when
>> libvirt takes the DEVICE_DELETED event as a cue to return the device to
>> host drivers.  The call chains look like this:
>> 
>> #0  qapi_event_send_device_deleted (has_device=true, 
>> device=0x7f5ca3e36fb0 "hostdev0", 
>> path=0x7f5c89e84fe0 "/machine/peripheral/hostdev0", 
>> errp=0x7f5ca241f9e8 ) at qapi-event.c:412
>> #1  0x7f5ca1701608 in device_unparent (obj=0x7f5ca43ffc00)
>> at hw/core/qdev.c:1115
>> #2  0x7f5ca18b7891 in object_finalize_child_property 
>> (obj=0x7f5ca380f500, 
>> name=0x7f5ca3f21da0 "hostdev0", opaque=0x7f5ca43ffc00) at 
>> qom/object.c:1362
>> #3  0x7f5ca18b56b2 in object_property_del_child (obj=0x7f5ca380f500, 
>> child=0x7f5ca43ffc00, errp=0x0) at qom/object.c:422
>> #4  0x7f5ca18b5790 in object_unparent (obj=0x7f5ca43ffc00)
>> at qom/object.c:441
>> #5  0x7f5ca16c1f31 in acpi_pcihp_eject_slot (s=0x7f5ca4c41268, bsel=0, 
>> slots=4) at hw/acpi/pcihp.c:139
>> 
>> 
>> #0  vfio_instance_finalize (obj=0x7f5ca43ffc00)
>> at /net/gimli/home/alwillia/Work/qemu.git/hw/vfio/pci.c:2731
>> #1  0x7f5ca18b57c0 in object_deinit (obj=0x7f5ca43ffc00, 
>> type=0x7f5ca376f490) at qom/object.c:448
>> #2  0x7f5ca18b5831 in object_finalize (data=0x7f5ca43ffc00)
>> at qom/object.c:462
>> #3  0x7f5ca18b6782 in object_unref (obj=0x7f5ca43ffc00) at 
>> qom/object.c:896
>> #4  0x7f5ca1550cc0 in memory_region_unref (mr=0x7f5ca43fff00)
>> at /net/gimli/home/alwillia/Work/qemu.git/memory.c:1476
>> #5  0x7f5ca1553886 in do_address_space_destroy (as=0x7f5ca43ffe10)
>> at /net/gimli/home/alwillia/Work/qemu.git/memory.c:2272
>> 
>> 
>> It appears that DEVICE_DELETED only means the VM is done with the
>> device but libvirt is interpreting it as QEMU is done with the device.
>> Which is correct?  Do we need a new event or do we need to fix the
>> ordering of this event?  An ordering fix would be more compatible with
>> existing libvirt.  Thanks,
>
> What an interesting race. I think the event should be sent only after
> both guest and qemu are done with the device. Having two events looks
> like too much granularity to me. I mean, even if libvirt learns that
> guest has detached device, it still can't do anything until qemu clears
> its internal state.
>
> On the other hand, it is clearly documented that the DEVICE_DELETED
> event is sent as soon as guest acknowledges completion of device
> removal. So libvirt's buggy if we'd follow documentation strictly. But
> then again, I don't see much information value in "guest has detached
> device but qemu hasn't yet" event. Libvirt would ignore such event.

Unless I'm missing something, libvirt needs an event that signals "Guest
and QEMU are done with this device".  Current DEVICE_DELETED isn't.

Can we imagine a use for current DEVICE_DELETED, i.e. "Guest is done,
but QEMU isn't"?

Would anything break if we changed semantics of DEVICE_DELETED to what
libvirt actually needs?

If the answers are "no" and "no", let's do it.
