RE: [PATCHv6 3/3] vfio/mdev: Synchronize device create/remove with parent removal
> -Original Message- > From: Alex Williamson > Sent: Tuesday, June 11, 2019 11:25 PM > To: Parav Pandit > Cc: Cornelia Huck ; k...@vger.kernel.org; linux- > ker...@vger.kernel.org; kwankh...@nvidia.com; c...@nvidia.com > Subject: Re: [PATCHv6 3/3] vfio/mdev: Synchronize device create/remove > with parent removal > > On Tue, 11 Jun 2019 03:22:37 + > Parav Pandit wrote: > > > Hi Alex, > > > [snip] > > > Now that we have all 3 patches reviewed and comments addressed, if > > there are no more comments, can you please take it forward? > > Yep, I put it in a branch rolled into linux-next for upstream testing last > week > and just sent a pull request to Linus today. Thanks, > Oh ok. Great. Thanks Alex.
Re: [PATCHv6 3/3] vfio/mdev: Synchronize device create/remove with parent removal
On Tue, 11 Jun 2019 03:22:37 + Parav Pandit wrote: > Hi Alex, > [snip] > Now that we have all 3 patches reviewed and comments addressed, if > there are no more comments, can you please take it forward? Yep, I put it in a branch rolled into linux-next for upstream testing last week and just sent a pull request to Linus today. Thanks, Alex
RE: [PATCHv6 3/3] vfio/mdev: Synchronize device create/remove with parent removal
Hi Alex, > -Original Message- > From: Cornelia Huck > Sent: Tuesday, June 4, 2019 11:18 AM > To: Parav Pandit > Cc: k...@vger.kernel.org; linux-kernel@vger.kernel.org; > kwankh...@nvidia.com; alex.william...@redhat.com; c...@nvidia.com > Subject: Re: [PATCHv6 3/3] vfio/mdev: Synchronize device create/remove > with parent removal > > On Mon, 3 Jun 2019 13:56:58 -0500 > Parav Pandit wrote: > > > In following sequences, child devices created while removing mdev > > parent device can be left out, or it may lead to race of removing half > > initialized child mdev devices. > > > > issue-1: > > > >cpu-0 cpu-1 > >- - > > mdev_unregister_device() > > device_for_each_child() > > mdev_device_remove_cb() > > mdev_device_remove() > > create_store() > > mdev_device_create() [...] > > device_add() > > parent_remove_sysfs_files() > > > > /* BUG: device added by cpu-0 > > * whose parent is getting removed > > * and it won't process this mdev. > > */ > > > > issue-2: > > > > Below crash is observed when user initiated remove is in progress and > > mdev_unregister_driver() completes parent unregistration. > > > >cpu-0 cpu-1 > >- - > > remove_store() > >mdev_device_remove() > >active = false; > > mdev_unregister_device() > > parent device removed. > >[...] > >parents->ops->remove() > > /* > > * BUG: Accessing invalid parent. > > */ > > > > This is similar race like create() racing with mdev_unregister_device(). > > > > BUG: unable to handle kernel paging request at c0585668 PGD > > e8f618067 P4D e8f618067 PUD e8f61a067 PMD 85adca067 PTE 0 > > Oops: [#1] SMP PTI > > CPU: 41 PID: 37403 Comm: bash Kdump: loaded Not tainted > > 5.1.0-rc6-vdevbus+ #6 Hardware name: Supermicro > > SYS-6028U-TR4+/X10DRU-i+, BIOS 2.0b 08/09/2016 > > RIP: 0010:mdev_device_remove+0xfa/0x140 [mdev] Call Trace: > > remove_store+0x71/0x90 [mdev] > > kernfs_fop_write+0x113/0x1a0 > > vfs_write+0xad/0x1b0 > > ksys_write+0x5a/0xe0 > > do_syscall_64+0x5a/0x210 > > entry_SYSCALL_64_after_hwframe+0x49/0xbe > > > > Therefore, mdev core is improved as below to overcome above issues. > > > > Wait for any ongoing mdev create() and remove() to finish before > > unregistering parent device. > > This continues to allow multiple create and remove to progress in > > parallel for different mdev devices as most common case. > > At the same time guard parent removal while parent is being accessed > > by > > create() and remove() callbacks. > > create()/remove() and unregister_device() are synchronized by the rwsem. > > > > Refactor device removal code to mdev_device_remove_common() to avoid > > acquiring unreg_sem of the parent. > > > > Fixes: 7b96953bc640 ("vfio: Mediated device Core driver") > > Signed-off-by: Parav Pandit > > --- > > drivers/vfio/mdev/mdev_core.c| 71 > > drivers/vfio/mdev/mdev_private.h | 2 + > > 2 files changed, 55 insertions(+), 18 deletions(-) > > > > > @@ -265,6 +299,12 @@ int mdev_device_create(struct kobject *kobj, > > > > mdev->parent = parent; > > > > Adding > > /* Check if parent unregistration has started */ > > here as well might be nice, but no need to resend the patch for that. > > > + if (!down_read_trylock(>unreg_sem)) { > > + mdev_device_free(mdev); > > + ret = -ENODEV; > > + goto mdev_fail; > > + } > > + > > device_initialize(>dev); > > mdev->dev.parent = dev; > > mdev->dev.bus = _bus_type; > > Reviewed-by: Cornelia Huck Now that we have all 3 patches reviewed and comments addressed, if there are no more comments, can you please take it forward?
Re: [PATCHv6 3/3] vfio/mdev: Synchronize device create/remove with parent removal
On Mon, 3 Jun 2019 13:56:58 -0500 Parav Pandit wrote: > In following sequences, child devices created while removing mdev parent > device can be left out, or it may lead to race of removing half > initialized child mdev devices. > > issue-1: > >cpu-0 cpu-1 >- - > mdev_unregister_device() > device_for_each_child() > mdev_device_remove_cb() > mdev_device_remove() > create_store() > mdev_device_create() [...] > device_add() > parent_remove_sysfs_files() > > /* BUG: device added by cpu-0 > * whose parent is getting removed > * and it won't process this mdev. > */ > > issue-2: > > Below crash is observed when user initiated remove is in progress > and mdev_unregister_driver() completes parent unregistration. > >cpu-0 cpu-1 >- - > remove_store() >mdev_device_remove() >active = false; > mdev_unregister_device() > parent device removed. >[...] >parents->ops->remove() > /* > * BUG: Accessing invalid parent. > */ > > This is similar race like create() racing with mdev_unregister_device(). > > BUG: unable to handle kernel paging request at c0585668 > PGD e8f618067 P4D e8f618067 PUD e8f61a067 PMD 85adca067 PTE 0 > Oops: [#1] SMP PTI > CPU: 41 PID: 37403 Comm: bash Kdump: loaded Not tainted 5.1.0-rc6-vdevbus+ #6 > Hardware name: Supermicro SYS-6028U-TR4+/X10DRU-i+, BIOS 2.0b 08/09/2016 > RIP: 0010:mdev_device_remove+0xfa/0x140 [mdev] > Call Trace: > remove_store+0x71/0x90 [mdev] > kernfs_fop_write+0x113/0x1a0 > vfs_write+0xad/0x1b0 > ksys_write+0x5a/0xe0 > do_syscall_64+0x5a/0x210 > entry_SYSCALL_64_after_hwframe+0x49/0xbe > > Therefore, mdev core is improved as below to overcome above issues. > > Wait for any ongoing mdev create() and remove() to finish before > unregistering parent device. > This continues to allow multiple create and remove to progress in > parallel for different mdev devices as most common case. > At the same time guard parent removal while parent is being accessed by > create() and remove() callbacks. > create()/remove() and unregister_device() are synchronized by the rwsem. > > Refactor device removal code to mdev_device_remove_common() to avoid > acquiring unreg_sem of the parent. > > Fixes: 7b96953bc640 ("vfio: Mediated device Core driver") > Signed-off-by: Parav Pandit > --- > drivers/vfio/mdev/mdev_core.c| 71 > drivers/vfio/mdev/mdev_private.h | 2 + > 2 files changed, 55 insertions(+), 18 deletions(-) > > @@ -265,6 +299,12 @@ int mdev_device_create(struct kobject *kobj, > > mdev->parent = parent; > Adding /* Check if parent unregistration has started */ here as well might be nice, but no need to resend the patch for that. > + if (!down_read_trylock(>unreg_sem)) { > + mdev_device_free(mdev); > + ret = -ENODEV; > + goto mdev_fail; > + } > + > device_initialize(>dev); > mdev->dev.parent = dev; > mdev->dev.bus = _bus_type; Reviewed-by: Cornelia Huck