Re: [Qemu-devel] [PATCH] hw/arm/bcm283x: Fix crash with device_add bcm2837 on unsupported machines
On Mon, Jul 16, 2018 at 08:43:24AM +0200, Markus Armbruster wrote: > Eduardo Habkost writes: > > > On Thu, Jul 12, 2018 at 10:05:46AM +0200, Paolo Bonzini wrote: > >> On 11/07/2018 22:23, Eduardo Habkost wrote: > >> > On Wed, Jul 11, 2018 at 10:16:42PM +0200, Paolo Bonzini wrote: > >> >> On 11/07/2018 20:30, Eduardo Habkost wrote: > >> The theoretical behavior should be: > >> >>> It's not clear below where you expect > >> >>> qdev_set_parent_bus(..., sysbus_get_default()) > >> >>> to be called (if it should be called at all). > >> >>> > >> >>> I don't know where it should be called, but I'm absolutely sure > >> >>> instance_init is not the right place. > >> >> > >> >> I think instance_init is fine to call qdev_set_parent_bus on contained > >> >> devices. Why do you say it's not? > >> > > >> > Because object_unref(object_new(...)) is not supposed to affect > >> > QEMU global state at all. > >> > >> It should not affect it. Any changes to the global state done by > >> instance_init are immediately undone when object_unref destroys the > >> child properties of the object. > > > > I would prefer if it didn't, but not a big deal as long as all > > QOM code is protected by the BQL (it is, right?). > > > > If we get rid of object_new() in qmp_device_list_properties(), > > then most of the restrictions on instance_init can go away, > > anyway. > > How could we get rid of object_new()? As long as we create properties > in code, we need to run the code to find the properties. By stopping registering properties at instance_init, and making them introspectable at the class object. -- Eduardo
Re: [Qemu-devel] [PATCH] hw/arm/bcm283x: Fix crash with device_add bcm2837 on unsupported machines
Eduardo Habkost writes: > On Thu, Jul 12, 2018 at 10:05:46AM +0200, Paolo Bonzini wrote: >> On 11/07/2018 22:23, Eduardo Habkost wrote: >> > On Wed, Jul 11, 2018 at 10:16:42PM +0200, Paolo Bonzini wrote: >> >> On 11/07/2018 20:30, Eduardo Habkost wrote: >> The theoretical behavior should be: >> >>> It's not clear below where you expect >> >>> qdev_set_parent_bus(..., sysbus_get_default()) >> >>> to be called (if it should be called at all). >> >>> >> >>> I don't know where it should be called, but I'm absolutely sure >> >>> instance_init is not the right place. >> >> >> >> I think instance_init is fine to call qdev_set_parent_bus on contained >> >> devices. Why do you say it's not? >> > >> > Because object_unref(object_new(...)) is not supposed to affect >> > QEMU global state at all. >> >> It should not affect it. Any changes to the global state done by >> instance_init are immediately undone when object_unref destroys the >> child properties of the object. > > I would prefer if it didn't, but not a big deal as long as all > QOM code is protected by the BQL (it is, right?). > > If we get rid of object_new() in qmp_device_list_properties(), > then most of the restrictions on instance_init can go away, > anyway. How could we get rid of object_new()? As long as we create properties in code, we need to run the code to find the properties.
Re: [Qemu-devel] [PATCH] hw/arm/bcm283x: Fix crash with device_add bcm2837 on unsupported machines
Thomas Huth writes: > On 12.07.2018 18:22, Peter Maydell wrote: >> On 12 July 2018 at 17:16, Markus Armbruster wrote: >>> Thomas Huth writes: >>> On 12.07.2018 14:06, Markus Armbruster wrote: > Peter Maydell writes: > >> On 11 July 2018 at 17:12, Eduardo Habkost wrote: >>> On Wed, Jul 11, 2018 at 09:21:48AM +0200, Thomas Huth wrote: Hm, ok, so how to continue here now? Shall we at least mark the bcm2836/7 devices with user_creatable=false, so that users can not crash their QEMU so easily with device_add? The problem with introspection via device-list-properties would still continue to exist, but I think that's less likely used in practice... otherwise we could still move the qdev_set_parent_bus() calls to the realize() function instead, and just add a big fat FIXME comment in front of the code block, so that we remember to clean that up one day... >>> >>> Crashing device-list-properties should be a blocker bug, IMO. > > Seconded. Well, maybe I should then not suggest to add a hmp("info qtree") below the hmp("info qom-tree") in test_one_device() of tests/device-introspect-test.c ... otherwise we'll be quite busy in the next weeks... >>> >>> If we can't fix these bugs in time, we can bring back >>> cannot_destroy_with_object_finalize_yet, as Eduardo mentioned upthread. >>> Would be sad, but sad beats crash. >> >> ...but are they actually interesting crashes? Nobody is ever >> going to actually start emulation of an integratorcp machine and >> then try to add a bcm2837 device via the QMP interface, except >> if they're deliberately doing exhaustive testing. > > It's not "device_add" that is a real problem here (otherwise we could > simply use user_creatable=false which we likely should do for this > device anyway), but rather the "device-list-properties" QMP command, > which also works for devices that are marked as user_creatable=false. > > I think it's valid that an upper layer tool scans all devices for their > properties. But since nobody complained about crashes in the past here > already, it seems like no upper layer tool is currently doing this. So I > agree with you that this should not be a blocker for the 3.0 release. Fixing the crashes by bringing back cannot_destroy_with_object_finalize_yet would be a bit of a cop out, but it would also be so easy that there's really no excuse for leaving them unfixed.
Re: [Qemu-devel] [PATCH] hw/arm/bcm283x: Fix crash with device_add bcm2837 on unsupported machines
On Thu, Jul 12, 2018 at 10:05:46AM +0200, Paolo Bonzini wrote: > On 11/07/2018 22:23, Eduardo Habkost wrote: > > On Wed, Jul 11, 2018 at 10:16:42PM +0200, Paolo Bonzini wrote: > >> On 11/07/2018 20:30, Eduardo Habkost wrote: > The theoretical behavior should be: > >>> It's not clear below where you expect > >>> qdev_set_parent_bus(..., sysbus_get_default()) > >>> to be called (if it should be called at all). > >>> > >>> I don't know where it should be called, but I'm absolutely sure > >>> instance_init is not the right place. > >> > >> I think instance_init is fine to call qdev_set_parent_bus on contained > >> devices. Why do you say it's not? > > > > Because object_unref(object_new(...)) is not supposed to affect > > QEMU global state at all. > > It should not affect it. Any changes to the global state done by > instance_init are immediately undone when object_unref destroys the > child properties of the object. I would prefer if it didn't, but not a big deal as long as all QOM code is protected by the BQL (it is, right?). If we get rid of object_new() in qmp_device_list_properties(), then most of the restrictions on instance_init can go away, anyway. -- Eduardo
Re: [Qemu-devel] [PATCH] hw/arm/bcm283x: Fix crash with device_add bcm2837 on unsupported machines
On 12.07.2018 18:22, Peter Maydell wrote: > On 12 July 2018 at 17:16, Markus Armbruster wrote: >> Thomas Huth writes: >> >>> On 12.07.2018 14:06, Markus Armbruster wrote: Peter Maydell writes: > On 11 July 2018 at 17:12, Eduardo Habkost wrote: >> On Wed, Jul 11, 2018 at 09:21:48AM +0200, Thomas Huth wrote: >>> Hm, ok, so how to continue here now? Shall we at least mark the >>> bcm2836/7 devices with user_creatable=false, so that users can not crash >>> their QEMU so easily with device_add? The problem with introspection via >>> device-list-properties would still continue to exist, but I think that's >>> less likely used in practice... otherwise we could still move the >>> qdev_set_parent_bus() calls to the realize() function instead, and just >>> add a big fat FIXME comment in front of the code block, so that we >>> remember to clean that up one day... >> >> Crashing device-list-properties should be a blocker bug, IMO. Seconded. >>> >>> Well, maybe I should then not suggest to add a hmp("info qtree") below >>> the hmp("info qom-tree") in test_one_device() of >>> tests/device-introspect-test.c ... otherwise we'll be quite busy in the >>> next weeks... >> >> If we can't fix these bugs in time, we can bring back >> cannot_destroy_with_object_finalize_yet, as Eduardo mentioned upthread. >> Would be sad, but sad beats crash. > > ...but are they actually interesting crashes? Nobody is ever > going to actually start emulation of an integratorcp machine and > then try to add a bcm2837 device via the QMP interface, except > if they're deliberately doing exhaustive testing. It's not "device_add" that is a real problem here (otherwise we could simply use user_creatable=false which we likely should do for this device anyway), but rather the "device-list-properties" QMP command, which also works for devices that are marked as user_creatable=false. I think it's valid that an upper layer tool scans all devices for their properties. But since nobody complained about crashes in the past here already, it seems like no upper layer tool is currently doing this. So I agree with you that this should not be a blocker for the 3.0 release. Thomas
Re: [Qemu-devel] [PATCH] hw/arm/bcm283x: Fix crash with device_add bcm2837 on unsupported machines
On 12 July 2018 at 17:16, Markus Armbruster wrote: > Thomas Huth writes: > >> On 12.07.2018 14:06, Markus Armbruster wrote: >>> Peter Maydell writes: >>> On 11 July 2018 at 17:12, Eduardo Habkost wrote: > On Wed, Jul 11, 2018 at 09:21:48AM +0200, Thomas Huth wrote: >> Hm, ok, so how to continue here now? Shall we at least mark the >> bcm2836/7 devices with user_creatable=false, so that users can not crash >> their QEMU so easily with device_add? The problem with introspection via >> device-list-properties would still continue to exist, but I think that's >> less likely used in practice... otherwise we could still move the >> qdev_set_parent_bus() calls to the realize() function instead, and just >> add a big fat FIXME comment in front of the code block, so that we >> remember to clean that up one day... > > Crashing device-list-properties should be a blocker bug, IMO. >>> >>> Seconded. >> >> Well, maybe I should then not suggest to add a hmp("info qtree") below >> the hmp("info qom-tree") in test_one_device() of >> tests/device-introspect-test.c ... otherwise we'll be quite busy in the >> next weeks... > > If we can't fix these bugs in time, we can bring back > cannot_destroy_with_object_finalize_yet, as Eduardo mentioned upthread. > Would be sad, but sad beats crash. ...but are they actually interesting crashes? Nobody is ever going to actually start emulation of an integratorcp machine and then try to add a bcm2837 device via the QMP interface, except if they're deliberately doing exhaustive testing. They're bugs, sure, but they're not bugs I can ever see any real user possibly tripping over, so we don't necessarily need to go to any particular lengths to fix them for 3.0 if that's painful. thanks -- PMM
Re: [Qemu-devel] [PATCH] hw/arm/bcm283x: Fix crash with device_add bcm2837 on unsupported machines
Thomas Huth writes: > On 12.07.2018 14:06, Markus Armbruster wrote: >> Peter Maydell writes: >> >>> On 11 July 2018 at 17:12, Eduardo Habkost wrote: On Wed, Jul 11, 2018 at 09:21:48AM +0200, Thomas Huth wrote: > Hm, ok, so how to continue here now? Shall we at least mark the > bcm2836/7 devices with user_creatable=false, so that users can not crash > their QEMU so easily with device_add? The problem with introspection via > device-list-properties would still continue to exist, but I think that's > less likely used in practice... otherwise we could still move the > qdev_set_parent_bus() calls to the realize() function instead, and just > add a big fat FIXME comment in front of the code block, so that we > remember to clean that up one day... Crashing device-list-properties should be a blocker bug, IMO. >> >> Seconded. > > Well, maybe I should then not suggest to add a hmp("info qtree") below > the hmp("info qom-tree") in test_one_device() of > tests/device-introspect-test.c ... otherwise we'll be quite busy in the > next weeks... If we can't fix these bugs in time, we can bring back cannot_destroy_with_object_finalize_yet, as Eduardo mentioned upthread. Would be sad, but sad beats crash.
Re: [Qemu-devel] [PATCH] hw/arm/bcm283x: Fix crash with device_add bcm2837 on unsupported machines
On 12.07.2018 14:06, Markus Armbruster wrote: > Peter Maydell writes: > >> On 11 July 2018 at 17:12, Eduardo Habkost wrote: >>> On Wed, Jul 11, 2018 at 09:21:48AM +0200, Thomas Huth wrote: Hm, ok, so how to continue here now? Shall we at least mark the bcm2836/7 devices with user_creatable=false, so that users can not crash their QEMU so easily with device_add? The problem with introspection via device-list-properties would still continue to exist, but I think that's less likely used in practice... otherwise we could still move the qdev_set_parent_bus() calls to the realize() function instead, and just add a big fat FIXME comment in front of the code block, so that we remember to clean that up one day... >>> >>> Crashing device-list-properties should be a blocker bug, IMO. > > Seconded. Well, maybe I should then not suggest to add a hmp("info qtree") below the hmp("info qom-tree") in test_one_device() of tests/device-introspect-test.c ... otherwise we'll be quite busy in the next weeks... Thomas
Re: [Qemu-devel] [PATCH] hw/arm/bcm283x: Fix crash with device_add bcm2837 on unsupported machines
Peter Maydell writes: > On 12 July 2018 at 13:06, Markus Armbruster wrote: >> Peter Maydell writes: >> >>> On 11 July 2018 at 17:12, Eduardo Habkost wrote: On Wed, Jul 11, 2018 at 09:21:48AM +0200, Thomas Huth wrote: > Hm, ok, so how to continue here now? Shall we at least mark the > bcm2836/7 devices with user_creatable=false, so that users can not crash > their QEMU so easily with device_add? The problem with introspection via > device-list-properties would still continue to exist, but I think that's > less likely used in practice... otherwise we could still move the > qdev_set_parent_bus() calls to the realize() function instead, and just > add a big fat FIXME comment in front of the code block, so that we > remember to clean that up one day... Crashing device-list-properties should be a blocker bug, IMO. >> >> Seconded. >> Moving to realize is not the best solution, but I would prefer to do that in 3.0 instead of leaving the device-list-properties crash unfixed. >>> >>> I would like to see the crash fixed too. But I'd like to >>> see it fixed: >>> (a) by having clear documentation about how the QOM >>> system works, what you should do in init and what you >>> should do in realize, when and why you need to manually >>> parent objects, etc >>> (b) as far as possible making our APIs for doing this >>> easy to use correctly and difficult to use wrongly. At >>> the moment we have APIs that are far too easy to misuse, >>> which means we will continue to get bugs like this and spend >>> a lot of time on one-off fixes for them. >>> >>> In particular I don't understand why we need to manually >>> parent these objects at all. >> >> I want both (a) and (b) as badly as anyone, but we should not hold any >> particular crash bug hostage to get them. > > Without at least (a) I can't review this patch or any other > patch that fixes this kind of crash bug... Fair point. But if Paolo can, and vouches for a fix, then we shouldn't hold the fix hostage.
Re: [Qemu-devel] [PATCH] hw/arm/bcm283x: Fix crash with device_add bcm2837 on unsupported machines
On 12 July 2018 at 13:06, Markus Armbruster wrote: > Peter Maydell writes: > >> On 11 July 2018 at 17:12, Eduardo Habkost wrote: >>> On Wed, Jul 11, 2018 at 09:21:48AM +0200, Thomas Huth wrote: Hm, ok, so how to continue here now? Shall we at least mark the bcm2836/7 devices with user_creatable=false, so that users can not crash their QEMU so easily with device_add? The problem with introspection via device-list-properties would still continue to exist, but I think that's less likely used in practice... otherwise we could still move the qdev_set_parent_bus() calls to the realize() function instead, and just add a big fat FIXME comment in front of the code block, so that we remember to clean that up one day... >>> >>> Crashing device-list-properties should be a blocker bug, IMO. > > Seconded. > >>> Moving to realize is not the best solution, but I would prefer to >>> do that in 3.0 instead of leaving the device-list-properties >>> crash unfixed. >> >> I would like to see the crash fixed too. But I'd like to >> see it fixed: >> (a) by having clear documentation about how the QOM >> system works, what you should do in init and what you >> should do in realize, when and why you need to manually >> parent objects, etc >> (b) as far as possible making our APIs for doing this >> easy to use correctly and difficult to use wrongly. At >> the moment we have APIs that are far too easy to misuse, >> which means we will continue to get bugs like this and spend >> a lot of time on one-off fixes for them. >> >> In particular I don't understand why we need to manually >> parent these objects at all. > > I want both (a) and (b) as badly as anyone, but we should not hold any > particular crash bug hostage to get them. Without at least (a) I can't review this patch or any other patch that fixes this kind of crash bug... thanks -- PMM
Re: [Qemu-devel] [PATCH] hw/arm/bcm283x: Fix crash with device_add bcm2837 on unsupported machines
Peter Maydell writes: > On 11 July 2018 at 17:12, Eduardo Habkost wrote: >> On Wed, Jul 11, 2018 at 09:21:48AM +0200, Thomas Huth wrote: >>> Hm, ok, so how to continue here now? Shall we at least mark the >>> bcm2836/7 devices with user_creatable=false, so that users can not crash >>> their QEMU so easily with device_add? The problem with introspection via >>> device-list-properties would still continue to exist, but I think that's >>> less likely used in practice... otherwise we could still move the >>> qdev_set_parent_bus() calls to the realize() function instead, and just >>> add a big fat FIXME comment in front of the code block, so that we >>> remember to clean that up one day... >> >> Crashing device-list-properties should be a blocker bug, IMO. Seconded. >> Moving to realize is not the best solution, but I would prefer to >> do that in 3.0 instead of leaving the device-list-properties >> crash unfixed. > > I would like to see the crash fixed too. But I'd like to > see it fixed: > (a) by having clear documentation about how the QOM > system works, what you should do in init and what you > should do in realize, when and why you need to manually > parent objects, etc > (b) as far as possible making our APIs for doing this > easy to use correctly and difficult to use wrongly. At > the moment we have APIs that are far too easy to misuse, > which means we will continue to get bugs like this and spend > a lot of time on one-off fixes for them. > > In particular I don't understand why we need to manually > parent these objects at all. I want both (a) and (b) as badly as anyone, but we should not hold any particular crash bug hostage to get them.
Re: [Qemu-devel] [PATCH] hw/arm/bcm283x: Fix crash with device_add bcm2837 on unsupported machines
Thomas Huth writes: > On 09.07.2018 23:42, Peter Maydell wrote: >> On 9 July 2018 at 22:03, Thomas Huth wrote: [...] > Hmm, maybe we need a qtest that first executes "info qtree", then runs > 'device-list-properties' for all devices and finally checks "info qtree" > again ... since 'device-list-properties' should not change the qtree, > the output of the "info qtree" should be the same. Lovely idea. >Currently this is not > the case :-/ Less than lovely, but I can't claim to be surprised. As Peter remarked elsewhere in this thread, QOM is underdocumented, and is too easy to misuse. On documentation: we sorely lack documentation tying everything together. GDK-Doc function comments alone can't cut it for something as complex as QOM. Even with adequate documentation, bad examples in the source will continue to multiply. We'll have to get rid of them.
Re: [Qemu-devel] [PATCH] hw/arm/bcm283x: Fix crash with device_add bcm2837 on unsupported machines
On 11/07/2018 22:23, Eduardo Habkost wrote: > On Wed, Jul 11, 2018 at 10:16:42PM +0200, Paolo Bonzini wrote: >> On 11/07/2018 20:30, Eduardo Habkost wrote: The theoretical behavior should be: >>> It's not clear below where you expect >>> qdev_set_parent_bus(..., sysbus_get_default()) >>> to be called (if it should be called at all). >>> >>> I don't know where it should be called, but I'm absolutely sure >>> instance_init is not the right place. >> >> I think instance_init is fine to call qdev_set_parent_bus on contained >> devices. Why do you say it's not? > > Because object_unref(object_new(...)) is not supposed to affect > QEMU global state at all. It should not affect it. Any changes to the global state done by instance_init are immediately undone when object_unref destroys the child properties of the object. Thanks, Paolo
Re: [Qemu-devel] [PATCH] hw/arm/bcm283x: Fix crash with device_add bcm2837 on unsupported machines
On 11/07/2018 21:59, Eduardo Habkost wrote: > > What exactly guarantees there will be no other references to > (e.g.) `&s->control` when `s` is freed? > > We know the references added by object_initialize(), > object_property_add_child() and qdev_set_parent_bus() will be > dropped, but what about other code calling object_ref()? That would be a bug. This is in fact the reason why memory_region_ref/unref exists---to take the reference on the "outer" device object rather than the contained memory region object. It's not pretty though. I've thought of generalizing the pattern to Object (object_ref adds a reference to the container rather than the contained object, and finalize takes care of finalizing the contained object too), but I'm a bit wary of doing it since it would complicate things further and (except for MemoryRegions) it hasn't been a problem in practice. Paolo
Re: [Qemu-devel] [PATCH] hw/arm/bcm283x: Fix crash with device_add bcm2837 on unsupported machines
On 11.07.2018 22:15, Paolo Bonzini wrote: [...] > I think you're on the right track, after object_property_add_child you > need to drop the reference to the object. Yes, that's the issue indeed! The child objects get properly cleaned up once I add the object_unref() after the object_property_add_child(), and then the crash does not occur anymore. I'll check the other spots, then I'll write a patch... Thomas
Re: [Qemu-devel] [PATCH] hw/arm/bcm283x: Fix crash with device_add bcm2837 on unsupported machines
On Wed, Jul 11, 2018 at 10:16:42PM +0200, Paolo Bonzini wrote: > On 11/07/2018 20:30, Eduardo Habkost wrote: > >> The theoretical behavior should be: > > It's not clear below where you expect > > qdev_set_parent_bus(..., sysbus_get_default()) > > to be called (if it should be called at all). > > > > I don't know where it should be called, but I'm absolutely sure > > instance_init is not the right place. > > I think instance_init is fine to call qdev_set_parent_bus on contained > devices. Why do you say it's not? Because object_unref(object_new(...)) is not supposed to affect QEMU global state at all. -- Eduardo
Re: [Qemu-devel] [PATCH] hw/arm/bcm283x: Fix crash with device_add bcm2837 on unsupported machines
On 11/07/2018 20:30, Eduardo Habkost wrote: >> The theoretical behavior should be: > It's not clear below where you expect > qdev_set_parent_bus(..., sysbus_get_default()) > to be called (if it should be called at all). > > I don't know where it should be called, but I'm absolutely sure > instance_init is not the right place. I think instance_init is fine to call qdev_set_parent_bus on contained devices. Why do you say it's not? Thanks, Paolo
Re: [Qemu-devel] [PATCH] hw/arm/bcm283x: Fix crash with device_add bcm2837 on unsupported machines
On 11/07/2018 20:43, Thomas Huth wrote: >> >> - realize fails > In this case, the failure is before realize is attempted, > qdev_device_add() already stop with "Device '%s' can not be hotplugged > on this machine". Still, object_unparent is called by qdev_device_add in the error path, and it should work the same way (in a nutshell, recursive unparent when child properties are deleted, and finalization of the contained objects as the last reference is dropped). >> - object_unparent is called on the device that failed to realize (see >> qdev_device_add). object_unparent calls device_unparent > Hmm, are you sure? I can see that object_unparent calls device_unparent > indirectly for the *child* nodes of the device, but not for the device > itself... object_unparent -> object_property_del_child -> object_finalize_child_property -> device_unparent I think you're on the right track, after object_property_add_child you need to drop the reference to the object. For example qmp_device_add does it after qdev_device_add returns a device successfully (just an example---I understand it is not the case with bcm283x). In that case the call to object_property_add_child is in qdev_set_id. Paolo
Re: [Qemu-devel] [PATCH] hw/arm/bcm283x: Fix crash with device_add bcm2837 on unsupported machines
On Wed, Jul 11, 2018 at 09:04:35PM +0200, Thomas Huth wrote: > On 11.07.2018 19:21, Paolo Bonzini wrote: > > On 10/07/2018 08:50, Peter Maydell wrote: > Yuck. The real problem here is that we're still requiring the > code that creates these QOM devices to manually set the parent > in the first place. It's not surprising that we don't get it right > (either parenting in the wrong place or not at all). I'd much > rather see us fix that properly than keep papering over places > where we get it wrong. > >>> Sorry, I'm still not an expert in all this QOM stuff yet ... so what do > >>> you exactly recommend to do instead? > >> I'm not clear either, but I don't think that what we're > >> currently doing can be right. > > > > Well, in theory it should work... I sent the expected flow in another > > email. > > Something that just came to my mind: > > bcm2836_init() creates the TYPE_BCM2835_PERIPHERALS object with > object_initialize(). This creates one reference to the object already. > Then the object is linked to its parent with > object_property_add_child(), which creates another reference to the > object. But where are the two references correctly destroyed again? One > is certainly destroyed by device_unparent later, but the initial one? > Could it be that we are simply lacking one object_unref() after the > object_property_add_child() here? This seems to be true, but I'm confused about the reference counting model, here: What exactly guarantees there will be no other references to (e.g.) `&s->control` when `s` is freed? We know the references added by object_initialize(), object_property_add_child() and qdev_set_parent_bus() will be dropped, but what about other code calling object_ref()? -- Eduardo
Re: [Qemu-devel] [PATCH] hw/arm/bcm283x: Fix crash with device_add bcm2837 on unsupported machines
On 11.07.2018 19:21, Paolo Bonzini wrote: > On 10/07/2018 08:50, Peter Maydell wrote: Yuck. The real problem here is that we're still requiring the code that creates these QOM devices to manually set the parent in the first place. It's not surprising that we don't get it right (either parenting in the wrong place or not at all). I'd much rather see us fix that properly than keep papering over places where we get it wrong. >>> Sorry, I'm still not an expert in all this QOM stuff yet ... so what do >>> you exactly recommend to do instead? >> I'm not clear either, but I don't think that what we're >> currently doing can be right. > > Well, in theory it should work... I sent the expected flow in another email. Something that just came to my mind: bcm2836_init() creates the TYPE_BCM2835_PERIPHERALS object with object_initialize(). This creates one reference to the object already. Then the object is linked to its parent with object_property_add_child(), which creates another reference to the object. But where are the two references correctly destroyed again? One is certainly destroyed by device_unparent later, but the initial one? Could it be that we are simply lacking one object_unref() after the object_property_add_child() here? Thomas
Re: [Qemu-devel] [PATCH] hw/arm/bcm283x: Fix crash with device_add bcm2837 on unsupported machines
On 11.07.2018 19:20, Paolo Bonzini wrote: > On 09/07/2018 23:03, Thomas Huth wrote: >> >> The problem is that qdev_set_parent_bus() from instance_init adds a link >> to the child devices which is not valid anymore after the device init >> failed. Thus the qdev_set_parent_bus() must rather be done in the realize >> function instead. > > The theoretical behavior should be: > > - realize fails In this case, the failure is before realize is attempted, qdev_device_add() already stop with "Device '%s' can not be hotplugged on this machine". > - object_unparent is called on the device that failed to realize (see > qdev_device_add). object_unparent calls device_unparent Hmm, are you sure? I can see that object_unparent calls device_unparent indirectly for the *child* nodes of the device, but not for the device itself... Thomas
Re: [Qemu-devel] [PATCH] hw/arm/bcm283x: Fix crash with device_add bcm2837 on unsupported machines
On Wed, Jul 11, 2018 at 07:20:42PM +0200, Paolo Bonzini wrote: > On 09/07/2018 23:03, Thomas Huth wrote: > > > > The problem is that qdev_set_parent_bus() from instance_init adds a link > > to the child devices which is not valid anymore after the device init > > failed. Thus the qdev_set_parent_bus() must rather be done in the realize > > function instead. > > The theoretical behavior should be: It's not clear below where you expect qdev_set_parent_bus(..., sysbus_get_default()) to be called (if it should be called at all). I don't know where it should be called, but I'm absolutely sure instance_init is not the right place. > > - realize fails > > - object_unparent is called on the device that failed to realize (see > qdev_device_add). object_unparent calls device_unparent > > - after device_unparent finishes, the last reference to the device has > been dropped and the device is freed > > - object finalization releases all properties > > - this includes child properties, so for each child device > object_unparent is called > > - again device_unparent is called (for the child) and this removes the > child from the bus. > > Paolo -- Eduardo
Re: [Qemu-devel] [PATCH] hw/arm/bcm283x: Fix crash with device_add bcm2837 on unsupported machines
On 10/07/2018 08:50, Peter Maydell wrote: >>> Yuck. The real problem here is that we're still requiring the >>> code that creates these QOM devices to manually set the parent >>> in the first place. It's not surprising that we don't get it right >>> (either parenting in the wrong place or not at all). I'd much >>> rather see us fix that properly than keep papering over places >>> where we get it wrong. >> Sorry, I'm still not an expert in all this QOM stuff yet ... so what do >> you exactly recommend to do instead? > I'm not clear either, but I don't think that what we're > currently doing can be right. Well, in theory it should work... I sent the expected flow in another email. Paolo
Re: [Qemu-devel] [PATCH] hw/arm/bcm283x: Fix crash with device_add bcm2837 on unsupported machines
On 09/07/2018 23:03, Thomas Huth wrote: > > The problem is that qdev_set_parent_bus() from instance_init adds a link > to the child devices which is not valid anymore after the device init > failed. Thus the qdev_set_parent_bus() must rather be done in the realize > function instead. The theoretical behavior should be: - realize fails - object_unparent is called on the device that failed to realize (see qdev_device_add). object_unparent calls device_unparent - after device_unparent finishes, the last reference to the device has been dropped and the device is freed - object finalization releases all properties - this includes child properties, so for each child device object_unparent is called - again device_unparent is called (for the child) and this removes the child from the bus. Paolo
Re: [Qemu-devel] [PATCH] hw/arm/bcm283x: Fix crash with device_add bcm2837 on unsupported machines
On 11 July 2018 at 17:12, Eduardo Habkost wrote: > On Wed, Jul 11, 2018 at 09:21:48AM +0200, Thomas Huth wrote: >> On 10.07.2018 08:50, Peter Maydell wrote: >> > On 9 July 2018 at 23:03, Thomas Huth wrote: >> >> On 09.07.2018 23:42, Peter Maydell wrote: >> >>> On 9 July 2018 at 22:03, Thomas Huth wrote: >> When trying to "device_add bcm2837" on a machine that is not suitable >> for >> this device, you can quickly crash QEMU afterwards, e.g. with "info >> qtree": >> >> echo "{'execute':'qmp_capabilities'} {'execute':'device_add', " \ >> "'arguments':{'driver':'bcm2837'}} {'execute': >> 'human-monitor-command', " \ >> "'arguments': {'command-line': 'info qtree'}}" | \ >> aarch64-softmmu/qemu-system-aarch64 -M integratorcp,accel=qtest -S >> -qmp stdio >> >> {"QMP": {"version": {"qemu": {"micro": 50, "minor": 12, "major": 2}, >> "package": "build-all"}, "capabilities": []}} >> {"return": {}} >> {"error": {"class": "GenericError", "desc": "Device 'bcm2837' can not be >> hotplugged on this machine"}} >> Segmentation fault (core dumped) >> >> The problem is that qdev_set_parent_bus() from instance_init adds a link >> to the child devices which is not valid anymore after the device init >> failed. Thus the qdev_set_parent_bus() must rather be done in the >> realize >> function instead. >> >>> >> >>> Yuck. The real problem here is that we're still requiring the >> >>> code that creates these QOM devices to manually set the parent >> >>> in the first place. It's not surprising that we don't get it right >> >>> (either parenting in the wrong place or not at all). I'd much >> >>> rather see us fix that properly than keep papering over places >> >>> where we get it wrong. >> >> >> >> Sorry, I'm still not an expert in all this QOM stuff yet ... so what do >> >> you exactly recommend to do instead? >> > >> > I'm not clear either, but I don't think that what we're >> > currently doing can be right. >> >> Hm, ok, so how to continue here now? Shall we at least mark the >> bcm2836/7 devices with user_creatable=false, so that users can not crash >> their QEMU so easily with device_add? The problem with introspection via >> device-list-properties would still continue to exist, but I think that's >> less likely used in practice... otherwise we could still move the >> qdev_set_parent_bus() calls to the realize() function instead, and just >> add a big fat FIXME comment in front of the code block, so that we >> remember to clean that up one day... > > Crashing device-list-properties should be a blocker bug, IMO. > > Moving to realize is not the best solution, but I would prefer to > do that in 3.0 instead of leaving the device-list-properties > crash unfixed. I would like to see the crash fixed too. But I'd like to see it fixed: (a) by having clear documentation about how the QOM system works, what you should do in init and what you should do in realize, when and why you need to manually parent objects, etc (b) as far as possible making our APIs for doing this easy to use correctly and difficult to use wrongly. At the moment we have APIs that are far too easy to misuse, which means we will continue to get bugs like this and spend a lot of time on one-off fixes for them. In particular I don't understand why we need to manually parent these objects at all. thanks -- PMM
Re: [Qemu-devel] [PATCH] hw/arm/bcm283x: Fix crash with device_add bcm2837 on unsupported machines
On Wed, Jul 11, 2018 at 09:21:48AM +0200, Thomas Huth wrote: > On 10.07.2018 08:50, Peter Maydell wrote: > > On 9 July 2018 at 23:03, Thomas Huth wrote: > >> On 09.07.2018 23:42, Peter Maydell wrote: > >>> On 9 July 2018 at 22:03, Thomas Huth wrote: > When trying to "device_add bcm2837" on a machine that is not suitable for > this device, you can quickly crash QEMU afterwards, e.g. with "info > qtree": > > echo "{'execute':'qmp_capabilities'} {'execute':'device_add', " \ > "'arguments':{'driver':'bcm2837'}} {'execute': 'human-monitor-command', > " \ > "'arguments': {'command-line': 'info qtree'}}" | \ > aarch64-softmmu/qemu-system-aarch64 -M integratorcp,accel=qtest -S -qmp > stdio > > {"QMP": {"version": {"qemu": {"micro": 50, "minor": 12, "major": 2}, > "package": "build-all"}, "capabilities": []}} > {"return": {}} > {"error": {"class": "GenericError", "desc": "Device 'bcm2837' can not be > hotplugged on this machine"}} > Segmentation fault (core dumped) > > The problem is that qdev_set_parent_bus() from instance_init adds a link > to the child devices which is not valid anymore after the device init > failed. Thus the qdev_set_parent_bus() must rather be done in the realize > function instead. > >>> > >>> Yuck. The real problem here is that we're still requiring the > >>> code that creates these QOM devices to manually set the parent > >>> in the first place. It's not surprising that we don't get it right > >>> (either parenting in the wrong place or not at all). I'd much > >>> rather see us fix that properly than keep papering over places > >>> where we get it wrong. > >> > >> Sorry, I'm still not an expert in all this QOM stuff yet ... so what do > >> you exactly recommend to do instead? > > > > I'm not clear either, but I don't think that what we're > > currently doing can be right. > > Hm, ok, so how to continue here now? Shall we at least mark the > bcm2836/7 devices with user_creatable=false, so that users can not crash > their QEMU so easily with device_add? The problem with introspection via > device-list-properties would still continue to exist, but I think that's > less likely used in practice... otherwise we could still move the > qdev_set_parent_bus() calls to the realize() function instead, and just > add a big fat FIXME comment in front of the code block, so that we > remember to clean that up one day... Crashing device-list-properties should be a blocker bug, IMO. Moving to realize is not the best solution, but I would prefer to do that in 3.0 instead of leaving the device-list-properties crash unfixed. Another solution is to reintroduce DeviceClass::cannot_destroy_with_object_finalize_yet (commit 08f00df4f4b8b4e38ad620477cc90cf5f73832d9), and set cannot_destroy_with_object_finalize_yet=true on bcm2837. -- Eduardo
Re: [Qemu-devel] [PATCH] hw/arm/bcm283x: Fix crash with device_add bcm2837 on unsupported machines
On 10.07.2018 08:50, Peter Maydell wrote: > On 9 July 2018 at 23:03, Thomas Huth wrote: >> On 09.07.2018 23:42, Peter Maydell wrote: >>> On 9 July 2018 at 22:03, Thomas Huth wrote: When trying to "device_add bcm2837" on a machine that is not suitable for this device, you can quickly crash QEMU afterwards, e.g. with "info qtree": echo "{'execute':'qmp_capabilities'} {'execute':'device_add', " \ "'arguments':{'driver':'bcm2837'}} {'execute': 'human-monitor-command', " \ "'arguments': {'command-line': 'info qtree'}}" | \ aarch64-softmmu/qemu-system-aarch64 -M integratorcp,accel=qtest -S -qmp stdio {"QMP": {"version": {"qemu": {"micro": 50, "minor": 12, "major": 2}, "package": "build-all"}, "capabilities": []}} {"return": {}} {"error": {"class": "GenericError", "desc": "Device 'bcm2837' can not be hotplugged on this machine"}} Segmentation fault (core dumped) The problem is that qdev_set_parent_bus() from instance_init adds a link to the child devices which is not valid anymore after the device init failed. Thus the qdev_set_parent_bus() must rather be done in the realize function instead. >>> >>> Yuck. The real problem here is that we're still requiring the >>> code that creates these QOM devices to manually set the parent >>> in the first place. It's not surprising that we don't get it right >>> (either parenting in the wrong place or not at all). I'd much >>> rather see us fix that properly than keep papering over places >>> where we get it wrong. >> >> Sorry, I'm still not an expert in all this QOM stuff yet ... so what do >> you exactly recommend to do instead? > > I'm not clear either, but I don't think that what we're > currently doing can be right. Hm, ok, so how to continue here now? Shall we at least mark the bcm2836/7 devices with user_creatable=false, so that users can not crash their QEMU so easily with device_add? The problem with introspection via device-list-properties would still continue to exist, but I think that's less likely used in practice... otherwise we could still move the qdev_set_parent_bus() calls to the realize() function instead, and just add a big fat FIXME comment in front of the code block, so that we remember to clean that up one day... Thomas
Re: [Qemu-devel] [PATCH] hw/arm/bcm283x: Fix crash with device_add bcm2837 on unsupported machines
On 9 July 2018 at 23:03, Thomas Huth wrote: > On 09.07.2018 23:42, Peter Maydell wrote: >> On 9 July 2018 at 22:03, Thomas Huth wrote: >>> When trying to "device_add bcm2837" on a machine that is not suitable for >>> this device, you can quickly crash QEMU afterwards, e.g. with "info qtree": >>> >>> echo "{'execute':'qmp_capabilities'} {'execute':'device_add', " \ >>> "'arguments':{'driver':'bcm2837'}} {'execute': 'human-monitor-command', " \ >>> "'arguments': {'command-line': 'info qtree'}}" | \ >>> aarch64-softmmu/qemu-system-aarch64 -M integratorcp,accel=qtest -S -qmp >>> stdio >>> >>> {"QMP": {"version": {"qemu": {"micro": 50, "minor": 12, "major": 2}, >>> "package": "build-all"}, "capabilities": []}} >>> {"return": {}} >>> {"error": {"class": "GenericError", "desc": "Device 'bcm2837' can not be >>> hotplugged on this machine"}} >>> Segmentation fault (core dumped) >>> >>> The problem is that qdev_set_parent_bus() from instance_init adds a link >>> to the child devices which is not valid anymore after the device init >>> failed. Thus the qdev_set_parent_bus() must rather be done in the realize >>> function instead. >> >> Yuck. The real problem here is that we're still requiring the >> code that creates these QOM devices to manually set the parent >> in the first place. It's not surprising that we don't get it right >> (either parenting in the wrong place or not at all). I'd much >> rather see us fix that properly than keep papering over places >> where we get it wrong. > > Sorry, I'm still not an expert in all this QOM stuff yet ... so what do > you exactly recommend to do instead? I'm not clear either, but I don't think that what we're currently doing can be right. thanks -- PMM
Re: [Qemu-devel] [PATCH] hw/arm/bcm283x: Fix crash with device_add bcm2837 on unsupported machines
On 09.07.2018 23:42, Peter Maydell wrote: > On 9 July 2018 at 22:03, Thomas Huth wrote: >> When trying to "device_add bcm2837" on a machine that is not suitable for >> this device, you can quickly crash QEMU afterwards, e.g. with "info qtree": >> >> echo "{'execute':'qmp_capabilities'} {'execute':'device_add', " \ >> "'arguments':{'driver':'bcm2837'}} {'execute': 'human-monitor-command', " \ >> "'arguments': {'command-line': 'info qtree'}}" | \ >> aarch64-softmmu/qemu-system-aarch64 -M integratorcp,accel=qtest -S -qmp >> stdio >> >> {"QMP": {"version": {"qemu": {"micro": 50, "minor": 12, "major": 2}, >> "package": "build-all"}, "capabilities": []}} >> {"return": {}} >> {"error": {"class": "GenericError", "desc": "Device 'bcm2837' can not be >> hotplugged on this machine"}} >> Segmentation fault (core dumped) >> >> The problem is that qdev_set_parent_bus() from instance_init adds a link >> to the child devices which is not valid anymore after the device init >> failed. Thus the qdev_set_parent_bus() must rather be done in the realize >> function instead. > > Yuck. The real problem here is that we're still requiring the > code that creates these QOM devices to manually set the parent > in the first place. It's not surprising that we don't get it right > (either parenting in the wrong place or not at all). I'd much > rather see us fix that properly than keep papering over places > where we get it wrong. Sorry, I'm still not an expert in all this QOM stuff yet ... so what do you exactly recommend to do instead? > Also, "device_add bcm2837" is just nonsensical: there's no > way to create an SoC device like this with device_add. > We should not allow it (for this and all the other > devices that device_add will never work with)... I'm fine if we additionally add a user_creatable = false here, but the problem then still persists, e.g. with : echo "{'execute':'qmp_capabilities'} " \ "{'execute':'device-list-properties', 'arguments': " \ "{'typename':'bcm2837'}} {'execute': 'human-monitor-command', " \ "'arguments': {'command-line': 'info qtree'}}" | valgrind -q \ aarch64-softmmu/qemu-system-aarch64 -M integratorcp -S -qmp stdio The same problem also exists for some other devices which have already been marked with user_creatable = false, e.g.: echo "{'execute':'qmp_capabilities'} " \ "{'execute':'device-list-properties', " \ "'arguments':{'typename':'macio-newworld'}} " \ "{'execute': 'human-monitor-command', 'arguments': " \ "{'command-line': 'info qtree'}}" | valgrind -q \ ppc64-softmmu/qemu-system-ppc64 -M mac99 -S -qmp stdio Hmm, maybe we need a qtest that first executes "info qtree", then runs 'device-list-properties' for all devices and finally checks "info qtree" again ... since 'device-list-properties' should not change the qtree, the output of the "info qtree" should be the same. Currently this is not the case :-/ Thomas
Re: [Qemu-devel] [PATCH] hw/arm/bcm283x: Fix crash with device_add bcm2837 on unsupported machines
On 9 July 2018 at 22:03, Thomas Huth wrote: > When trying to "device_add bcm2837" on a machine that is not suitable for > this device, you can quickly crash QEMU afterwards, e.g. with "info qtree": > > echo "{'execute':'qmp_capabilities'} {'execute':'device_add', " \ > "'arguments':{'driver':'bcm2837'}} {'execute': 'human-monitor-command', " \ > "'arguments': {'command-line': 'info qtree'}}" | \ > aarch64-softmmu/qemu-system-aarch64 -M integratorcp,accel=qtest -S -qmp stdio > > {"QMP": {"version": {"qemu": {"micro": 50, "minor": 12, "major": 2}, > "package": "build-all"}, "capabilities": []}} > {"return": {}} > {"error": {"class": "GenericError", "desc": "Device 'bcm2837' can not be > hotplugged on this machine"}} > Segmentation fault (core dumped) > > The problem is that qdev_set_parent_bus() from instance_init adds a link > to the child devices which is not valid anymore after the device init > failed. Thus the qdev_set_parent_bus() must rather be done in the realize > function instead. Yuck. The real problem here is that we're still requiring the code that creates these QOM devices to manually set the parent in the first place. It's not surprising that we don't get it right (either parenting in the wrong place or not at all). I'd much rather see us fix that properly than keep papering over places where we get it wrong. Also, "device_add bcm2837" is just nonsensical: there's no way to create an SoC device like this with device_add. We should not allow it (for this and all the other devices that device_add will never work with)... thanks -- PMM
Re: [Qemu-devel] [PATCH] hw/arm/bcm283x: Fix crash with device_add bcm2837 on unsupported machines
On 09.07.2018 23:31, Eduardo Habkost wrote: > On Mon, Jul 09, 2018 at 11:03:00PM +0200, Thomas Huth wrote: >> When trying to "device_add bcm2837" on a machine that is not suitable for >> this device, you can quickly crash QEMU afterwards, e.g. with "info qtree": >> >> echo "{'execute':'qmp_capabilities'} {'execute':'device_add', " \ >> "'arguments':{'driver':'bcm2837'}} {'execute': 'human-monitor-command', " \ >> "'arguments': {'command-line': 'info qtree'}}" | \ >> aarch64-softmmu/qemu-system-aarch64 -M integratorcp,accel=qtest -S -qmp >> stdio > > Interesting, how did you find this bug? I was running some tests with an enhanced version of this patch applied: http://lists.gnu.org/archive/html/qemu-devel/2018-03/msg05033.html Thomas
Re: [Qemu-devel] [PATCH] hw/arm/bcm283x: Fix crash with device_add bcm2837 on unsupported machines
On Mon, Jul 09, 2018 at 11:03:00PM +0200, Thomas Huth wrote: > When trying to "device_add bcm2837" on a machine that is not suitable for > this device, you can quickly crash QEMU afterwards, e.g. with "info qtree": > > echo "{'execute':'qmp_capabilities'} {'execute':'device_add', " \ > "'arguments':{'driver':'bcm2837'}} {'execute': 'human-monitor-command', " \ > "'arguments': {'command-line': 'info qtree'}}" | \ > aarch64-softmmu/qemu-system-aarch64 -M integratorcp,accel=qtest -S -qmp stdio Interesting, how did you find this bug? Running "info qtree" and other queries on device-crash-test sounds like a good idea. I will add it to my TODO list. > > {"QMP": {"version": {"qemu": {"micro": 50, "minor": 12, "major": 2}, > "package": "build-all"}, "capabilities": []}} > {"return": {}} > {"error": {"class": "GenericError", "desc": "Device 'bcm2837' can not be > hotplugged on this machine"}} > Segmentation fault (core dumped) > > The problem is that qdev_set_parent_bus() from instance_init adds a link > to the child devices which is not valid anymore after the device init > failed. Thus the qdev_set_parent_bus() must rather be done in the realize > function instead. > > Signed-off-by: Thomas Huth [...] -- Eduardo