Re: [PATCH v3] vfio/mdev: Check globally for duplicate devices
On 5/18/2018 11:00 PM, Alex Williamson wrote: > On Fri, 18 May 2018 12:34:03 +0530 > Kirti Wankhede wrote: > >> On 5/18/2018 3:07 AM, Alex Williamson wrote: >>> On Fri, 18 May 2018 01:56:50 +0530 >>> Kirti Wankhede wrote: >>> On 5/17/2018 9:50 PM, Alex Williamson wrote: > On Thu, 17 May 2018 21:25:22 +0530 > Kirti Wankhede wrote: > >> On 5/17/2018 1:39 PM, Cornelia Huck wrote: >>> On Wed, 16 May 2018 21:30:19 -0600 >>> Alex Williamson wrote: >>> When we create an mdev device, we check for duplicates against the parent device and return -EEXIST if found, but the mdev device namespace is global since we'll link all devices from the bus. We do catch this later in sysfs_do_create_link_sd() to return -EEXIST, but with it comes a kernel warning and stack trace for trying to create duplicate sysfs links, which makes it an undesirable response. Therefore we should really be looking for duplicates across all mdev parent devices, or as implemented here, against our mdev device list. Using mdev_list to prevent duplicates means that we can remove mdev_parent.lock, but in order not to serialize mdev device creation and removal globally, we add mdev_device.active which allows UUIDs to be reserved such that we can drop the mdev_list_lock before the mdev device is fully in place. NB. there was never intended to be any serialization guarantee provided by the mdev core with respect to creation and removal of mdev devices, mdev_parent.lock provided this only as a side-effect of the implementation for locking the namespace per parent. That serialization is now removed. >>> >> >> mdev_parent.lock is to serialize create and remove of that mdev device, >> that handles race condition that Cornelia mentioned below. > > Previously it was stated: > > On Thu, 17 May 2018 01:01:40 +0530 > Kirti Wankhede wrote: >> Here lock is not for create/remove routines of vendor driver, its about >> mdev device creation and device registration, which is a common code >> path, and so is part of mdev core module. > > So the race condition was handled previously, but as a side-effect of > protecting the namespace, aiui. I'm trying to state above that the > serialization of create/remove was never intended as a guarantee > provided to mdev vendor drivers. I don't see that there's a need to > protect "mdev device creation and device registration" beyond conflicts > in the UUID namespace, which is done here. Are there others? > Sorry not being elaborative in my earlier response to > If we can > show that vendor drivers handle the create/remove paths themselves, > perhaps we can refine the locking granularity. mdev_device_create() function does : - create mdev device - register device - call vendor driver->create - create sysfs files. mdev_device_remove() removes sysfs files, unregister device and delete device. There is common code in mdev_device_create() and mdev_device_remove() independent of what vendor driver does in its create()/remove() callback. Moving this code to each vendor driver to handle create/remove themselves doesn't make sense to me. >>> >>> I don't see where anyone is suggesting that, I'm not. >>> mdev_parent.lock here does take care of race conditions that could occur during mdev device creation and deletion in this common code path. >>> >>> Exactly what races in the common code path is mdev_parent.lock >>> preventing? mdev_device_create() calls: >>> >>> device_register() >>> mdev_device_create_ops() >>> parent->ops->create() >>> sysfs_create_groups() >>> mdev_create_sysfs_files() >>> sysfs_create_files() >>> sysfs_create_link() >>> sysfs_create_link() >>> >>> mdev_parent.lock is certainly not serializing all calls across the >>> entire kernel to device_register and sysfs_create_{groups,files,link} >>> so what is it protecting other than serializing parent->ops->create()? >>> Locks protect data, not code. The data we're protecting is the shared >>> mdev_list, there is no shared data once mdev_device_create() has its >>> mdev_device with uuid reservation placed into that list. >>> > > Thank you for enumerating these points below. > >> This lock prevents race condition that could occur due to sysfs entries >> 1. between write on 'create' and 'remove' sysfs file of mdev device >> As per current code without lock, mdev_create_sysfs_files() creates >> 'remove' sysfs, but before adding this mdev device to mdev_list, if >> 'remove' is called, that would return -ENODEV even if the device is seen >> in sysfs > > mdev_parent.lock doesn't play a factor here
Re: [PATCH v3] vfio/mdev: Check globally for duplicate devices
On Fri, 18 May 2018 12:34:03 +0530 Kirti Wankhede wrote: > On 5/18/2018 3:07 AM, Alex Williamson wrote: > > On Fri, 18 May 2018 01:56:50 +0530 > > Kirti Wankhede wrote: > > > >> On 5/17/2018 9:50 PM, Alex Williamson wrote: > >>> On Thu, 17 May 2018 21:25:22 +0530 > >>> Kirti Wankhede wrote: > >>> > On 5/17/2018 1:39 PM, Cornelia Huck wrote: > > On Wed, 16 May 2018 21:30:19 -0600 > > Alex Williamson wrote: > > > >> When we create an mdev device, we check for duplicates against the > >> parent device and return -EEXIST if found, but the mdev device > >> namespace is global since we'll link all devices from the bus. We do > >> catch this later in sysfs_do_create_link_sd() to return -EEXIST, but > >> with it comes a kernel warning and stack trace for trying to create > >> duplicate sysfs links, which makes it an undesirable response. > >> > >> Therefore we should really be looking for duplicates across all mdev > >> parent devices, or as implemented here, against our mdev device list. > >> Using mdev_list to prevent duplicates means that we can remove > >> mdev_parent.lock, but in order not to serialize mdev device creation > >> and removal globally, we add mdev_device.active which allows UUIDs to > >> be reserved such that we can drop the mdev_list_lock before the mdev > >> device is fully in place. > >> > >> NB. there was never intended to be any serialization guarantee > >> provided by the mdev core with respect to creation and removal of mdev > >> devices, mdev_parent.lock provided this only as a side-effect of the > >> implementation for locking the namespace per parent. That > >> serialization is now removed. > > > > mdev_parent.lock is to serialize create and remove of that mdev device, > that handles race condition that Cornelia mentioned below. > >>> > >>> Previously it was stated: > >>> > >>> On Thu, 17 May 2018 01:01:40 +0530 > >>> Kirti Wankhede wrote: > Here lock is not for create/remove routines of vendor driver, its about > mdev device creation and device registration, which is a common code > path, and so is part of mdev core module. > >>> > >>> So the race condition was handled previously, but as a side-effect of > >>> protecting the namespace, aiui. I'm trying to state above that the > >>> serialization of create/remove was never intended as a guarantee > >>> provided to mdev vendor drivers. I don't see that there's a need to > >>> protect "mdev device creation and device registration" beyond conflicts > >>> in the UUID namespace, which is done here. Are there others? > >>> > >> > >> Sorry not being elaborative in my earlier response to > >> > >>> If we can > >>> show that vendor drivers handle the create/remove paths themselves, > >>> perhaps we can refine the locking granularity. > >> > >> mdev_device_create() function does : > >> - create mdev device > >> - register device > >> - call vendor driver->create > >> - create sysfs files. > >> > >> mdev_device_remove() removes sysfs files, unregister device and delete > >> device. > >> > >> There is common code in mdev_device_create() and mdev_device_remove() > >> independent of what vendor driver does in its create()/remove() > >> callback. Moving this code to each vendor driver to handle create/remove > >> themselves doesn't make sense to me. > > > > I don't see where anyone is suggesting that, I'm not. > > > >> mdev_parent.lock here does take care of race conditions that could occur > >> during mdev device creation and deletion in this common code path. > > > > Exactly what races in the common code path is mdev_parent.lock > > preventing? mdev_device_create() calls: > > > > device_register() > > mdev_device_create_ops() > > parent->ops->create() > > sysfs_create_groups() > > mdev_create_sysfs_files() > > sysfs_create_files() > > sysfs_create_link() > > sysfs_create_link() > > > > mdev_parent.lock is certainly not serializing all calls across the > > entire kernel to device_register and sysfs_create_{groups,files,link} > > so what is it protecting other than serializing parent->ops->create()? > > Locks protect data, not code. The data we're protecting is the shared > > mdev_list, there is no shared data once mdev_device_create() has its > > mdev_device with uuid reservation placed into that list. > > Thank you for enumerating these points below. > This lock prevents race condition that could occur due to sysfs entries > 1. between write on 'create' and 'remove' sysfs file of mdev device > As per current code without lock, mdev_create_sysfs_files() creates > 'remove' sysfs, but before adding this mdev device to mdev_list, if > 'remove' is called, that would return -ENODEV even if the device is seen > in sysfs mdev_parent.lock doesn't play a factor here. As it exists today, the sysfs remove attribute is added during m
Re: [PATCH v3] vfio/mdev: Check globally for duplicate devices
On Fri, 18 May 2018 12:34:03 +0530 Kirti Wankhede wrote: > On 5/18/2018 3:07 AM, Alex Williamson wrote: > > On Fri, 18 May 2018 01:56:50 +0530 > > Kirti Wankhede wrote: > > > >> On 5/17/2018 9:50 PM, Alex Williamson wrote: > >>> On Thu, 17 May 2018 21:25:22 +0530 > >>> Kirti Wankhede wrote: > >>> > On 5/17/2018 1:39 PM, Cornelia Huck wrote: > > On Wed, 16 May 2018 21:30:19 -0600 > > Alex Williamson wrote: > > > >> When we create an mdev device, we check for duplicates against the > >> parent device and return -EEXIST if found, but the mdev device > >> namespace is global since we'll link all devices from the bus. We do > >> catch this later in sysfs_do_create_link_sd() to return -EEXIST, but > >> with it comes a kernel warning and stack trace for trying to create > >> duplicate sysfs links, which makes it an undesirable response. > >> > >> Therefore we should really be looking for duplicates across all mdev > >> parent devices, or as implemented here, against our mdev device list. > >> Using mdev_list to prevent duplicates means that we can remove > >> mdev_parent.lock, but in order not to serialize mdev device creation > >> and removal globally, we add mdev_device.active which allows UUIDs to > >> be reserved such that we can drop the mdev_list_lock before the mdev > >> device is fully in place. > >> > >> NB. there was never intended to be any serialization guarantee > >> provided by the mdev core with respect to creation and removal of mdev > >> devices, mdev_parent.lock provided this only as a side-effect of the > >> implementation for locking the namespace per parent. That > >> serialization is now removed. > > > > mdev_parent.lock is to serialize create and remove of that mdev device, > that handles race condition that Cornelia mentioned below. > >>> > >>> Previously it was stated: > >>> > >>> On Thu, 17 May 2018 01:01:40 +0530 > >>> Kirti Wankhede wrote: > Here lock is not for create/remove routines of vendor driver, its about > mdev device creation and device registration, which is a common code > path, and so is part of mdev core module. > >>> > >>> So the race condition was handled previously, but as a side-effect of > >>> protecting the namespace, aiui. I'm trying to state above that the > >>> serialization of create/remove was never intended as a guarantee > >>> provided to mdev vendor drivers. I don't see that there's a need to > >>> protect "mdev device creation and device registration" beyond conflicts > >>> in the UUID namespace, which is done here. Are there others? > >>> > >> > >> Sorry not being elaborative in my earlier response to > >> > >>> If we can > >>> show that vendor drivers handle the create/remove paths themselves, > >>> perhaps we can refine the locking granularity. > >> > >> mdev_device_create() function does : > >> - create mdev device > >> - register device > >> - call vendor driver->create > >> - create sysfs files. > >> > >> mdev_device_remove() removes sysfs files, unregister device and delete > >> device. > >> > >> There is common code in mdev_device_create() and mdev_device_remove() > >> independent of what vendor driver does in its create()/remove() > >> callback. Moving this code to each vendor driver to handle create/remove > >> themselves doesn't make sense to me. > > > > I don't see where anyone is suggesting that, I'm not. > > > >> mdev_parent.lock here does take care of race conditions that could occur > >> during mdev device creation and deletion in this common code path. > > > > Exactly what races in the common code path is mdev_parent.lock > > preventing? mdev_device_create() calls: > > > > device_register() > > mdev_device_create_ops() > > parent->ops->create() > > sysfs_create_groups() > > mdev_create_sysfs_files() > > sysfs_create_files() > > sysfs_create_link() > > sysfs_create_link() We might consider creating the 'remove' attribute only after the sysfs links have been created (IOW, when nothing else can fail anymore). Drawback: We'd need to clean up the sysfs links. Benefit: the -EAGAIN window should be closed. > > > > mdev_parent.lock is certainly not serializing all calls across the > > entire kernel to device_register and sysfs_create_{groups,files,link} > > so what is it protecting other than serializing parent->ops->create()? > > Locks protect data, not code. The data we're protecting is the shared > > mdev_list, there is no shared data once mdev_device_create() has its > > mdev_device with uuid reservation placed into that list. > > > > This lock prevents race condition that could occur due to sysfs entries > 1. between write on 'create' and 'remove' sysfs file of mdev device > As per current code without lock, mdev_create_sysfs_files() creates > 'remove' sysfs, but before adding this mdev device to mdev_list, if > 'remo
Re: [PATCH v3] vfio/mdev: Check globally for duplicate devices
On 05/17/2018 10:09 AM, Cornelia Huck wrote: [Dong Jia, Halil: Can you please take a look whether vfio-ccw is really ok? I don't think we open up any new races, but I'd appreciate a second or third opinion.] I will wait for things to settle a bit before I start reviewing the synchronization for mdev and vfio-ccw. I suspect there will be a v4. I would appreciate a cc, so I don't miss it. Halil
Re: [PATCH v3] vfio/mdev: Check globally for duplicate devices
On 5/18/2018 3:07 AM, Alex Williamson wrote: > On Fri, 18 May 2018 01:56:50 +0530 > Kirti Wankhede wrote: > >> On 5/17/2018 9:50 PM, Alex Williamson wrote: >>> On Thu, 17 May 2018 21:25:22 +0530 >>> Kirti Wankhede wrote: >>> On 5/17/2018 1:39 PM, Cornelia Huck wrote: > On Wed, 16 May 2018 21:30:19 -0600 > Alex Williamson wrote: > >> When we create an mdev device, we check for duplicates against the >> parent device and return -EEXIST if found, but the mdev device >> namespace is global since we'll link all devices from the bus. We do >> catch this later in sysfs_do_create_link_sd() to return -EEXIST, but >> with it comes a kernel warning and stack trace for trying to create >> duplicate sysfs links, which makes it an undesirable response. >> >> Therefore we should really be looking for duplicates across all mdev >> parent devices, or as implemented here, against our mdev device list. >> Using mdev_list to prevent duplicates means that we can remove >> mdev_parent.lock, but in order not to serialize mdev device creation >> and removal globally, we add mdev_device.active which allows UUIDs to >> be reserved such that we can drop the mdev_list_lock before the mdev >> device is fully in place. >> >> NB. there was never intended to be any serialization guarantee >> provided by the mdev core with respect to creation and removal of mdev >> devices, mdev_parent.lock provided this only as a side-effect of the >> implementation for locking the namespace per parent. That >> serialization is now removed. > mdev_parent.lock is to serialize create and remove of that mdev device, that handles race condition that Cornelia mentioned below. >>> >>> Previously it was stated: >>> >>> On Thu, 17 May 2018 01:01:40 +0530 >>> Kirti Wankhede wrote: Here lock is not for create/remove routines of vendor driver, its about mdev device creation and device registration, which is a common code path, and so is part of mdev core module. >>> >>> So the race condition was handled previously, but as a side-effect of >>> protecting the namespace, aiui. I'm trying to state above that the >>> serialization of create/remove was never intended as a guarantee >>> provided to mdev vendor drivers. I don't see that there's a need to >>> protect "mdev device creation and device registration" beyond conflicts >>> in the UUID namespace, which is done here. Are there others? >>> >> >> Sorry not being elaborative in my earlier response to >> >>> If we can >>> show that vendor drivers handle the create/remove paths themselves, >>> perhaps we can refine the locking granularity. >> >> mdev_device_create() function does : >> - create mdev device >> - register device >> - call vendor driver->create >> - create sysfs files. >> >> mdev_device_remove() removes sysfs files, unregister device and delete >> device. >> >> There is common code in mdev_device_create() and mdev_device_remove() >> independent of what vendor driver does in its create()/remove() >> callback. Moving this code to each vendor driver to handle create/remove >> themselves doesn't make sense to me. > > I don't see where anyone is suggesting that, I'm not. > >> mdev_parent.lock here does take care of race conditions that could occur >> during mdev device creation and deletion in this common code path. > > Exactly what races in the common code path is mdev_parent.lock > preventing? mdev_device_create() calls: > > device_register() > mdev_device_create_ops() > parent->ops->create() > sysfs_create_groups() > mdev_create_sysfs_files() > sysfs_create_files() > sysfs_create_link() > sysfs_create_link() > > mdev_parent.lock is certainly not serializing all calls across the > entire kernel to device_register and sysfs_create_{groups,files,link} > so what is it protecting other than serializing parent->ops->create()? > Locks protect data, not code. The data we're protecting is the shared > mdev_list, there is no shared data once mdev_device_create() has its > mdev_device with uuid reservation placed into that list. > This lock prevents race condition that could occur due to sysfs entries 1. between write on 'create' and 'remove' sysfs file of mdev device As per current code without lock, mdev_create_sysfs_files() creates 'remove' sysfs, but before adding this mdev device to mdev_list, if 'remove' is called, that would return -ENODEV even if the device is seen in sysfs 2. between write on 'remove' and 'create' sysfs file If 'remove' of a device is in progress (device is removed from mdev_list but sysfs entries are not yet removed) and again 'create' of same device with same parent is called, will hit duplicate entries error for sysfs. 3. between multiple writes on 'create' with same uuid current code doesn't handle the case you are fixing here, if same uuid is used to create mdev device on different parents. Y
Re: [PATCH v3] vfio/mdev: Check globally for duplicate devices
On Thu, 17 May 2018 15:37:37 -0600 Alex Williamson wrote: > On Fri, 18 May 2018 01:56:50 +0530 > Kirti Wankhede wrote: > > > On 5/17/2018 9:50 PM, Alex Williamson wrote: > > > On Thu, 17 May 2018 21:25:22 +0530 > > > Kirti Wankhede wrote: > > > > > >> On 5/17/2018 1:39 PM, Cornelia Huck wrote: > > >>> On Wed, 16 May 2018 21:30:19 -0600 > > >>> Alex Williamson wrote: > > >>> > > When we create an mdev device, we check for duplicates against the > > parent device and return -EEXIST if found, but the mdev device > > namespace is global since we'll link all devices from the bus. We do > > catch this later in sysfs_do_create_link_sd() to return -EEXIST, but > > with it comes a kernel warning and stack trace for trying to create > > duplicate sysfs links, which makes it an undesirable response. > > > > Therefore we should really be looking for duplicates across all mdev > > parent devices, or as implemented here, against our mdev device list. > > Using mdev_list to prevent duplicates means that we can remove > > mdev_parent.lock, but in order not to serialize mdev device creation > > and removal globally, we add mdev_device.active which allows UUIDs to > > be reserved such that we can drop the mdev_list_lock before the mdev > > device is fully in place. > > > > NB. there was never intended to be any serialization guarantee > > provided by the mdev core with respect to creation and removal of mdev > > devices, mdev_parent.lock provided this only as a side-effect of the > > implementation for locking the namespace per parent. That > > serialization is now removed. > > >>> > > >> > > >> mdev_parent.lock is to serialize create and remove of that mdev device, > > >> that handles race condition that Cornelia mentioned below. > > > > > > Previously it was stated: > > > > > > On Thu, 17 May 2018 01:01:40 +0530 > > > Kirti Wankhede wrote: > > >> Here lock is not for create/remove routines of vendor driver, its about > > >> mdev device creation and device registration, which is a common code > > >> path, and so is part of mdev core module. > > > > > > So the race condition was handled previously, but as a side-effect of > > > protecting the namespace, aiui. I'm trying to state above that the > > > serialization of create/remove was never intended as a guarantee > > > provided to mdev vendor drivers. I don't see that there's a need to > > > protect "mdev device creation and device registration" beyond conflicts > > > in the UUID namespace, which is done here. Are there others? > > > > > > > Sorry not being elaborative in my earlier response to > > > > > If we can > > > show that vendor drivers handle the create/remove paths themselves, > > > perhaps we can refine the locking granularity. > > > > mdev_device_create() function does : > > - create mdev device > > - register device > > - call vendor driver->create > > - create sysfs files. > > > > mdev_device_remove() removes sysfs files, unregister device and delete > > device. > > > > There is common code in mdev_device_create() and mdev_device_remove() > > independent of what vendor driver does in its create()/remove() > > callback. Moving this code to each vendor driver to handle create/remove > > themselves doesn't make sense to me. > > I don't see where anyone is suggesting that, I'm not. > > > mdev_parent.lock here does take care of race conditions that could occur > > during mdev device creation and deletion in this common code path. > > Exactly what races in the common code path is mdev_parent.lock > preventing? mdev_device_create() calls: > > device_register() > mdev_device_create_ops() > parent->ops->create() > sysfs_create_groups() > mdev_create_sysfs_files() > sysfs_create_files() > sysfs_create_link() > sysfs_create_link() > > mdev_parent.lock is certainly not serializing all calls across the > entire kernel to device_register and sysfs_create_{groups,files,link} > so what is it protecting other than serializing parent->ops->create()? > Locks protect data, not code. The data we're protecting is the shared > mdev_list, there is no shared data once mdev_device_create() has its > mdev_device with uuid reservation placed into that list. > > > What is the urge to remove mdev_parent.lock if that handles all race > > conditions without bothering user to handle -EAGAIN? > > Can you say why -EAGAIN is undesirable? Note that the user is only > going to see this error if they attempt to remove the device in the > minuscule gap between the sysfs remove file being created and the > completion of the write to the create sysfs file. It seems like you're > asking that I decrease the locking granularity, but not too much > because mdev_parent.lock protects "things". If the -EAGAIN is really > so terrible, we can avoid it by spinning until the mdev_device is > either not found in t
Re: [PATCH v3] vfio/mdev: Check globally for duplicate devices
On Fri, 18 May 2018 01:56:50 +0530 Kirti Wankhede wrote: > On 5/17/2018 9:50 PM, Alex Williamson wrote: > > On Thu, 17 May 2018 21:25:22 +0530 > > Kirti Wankhede wrote: > > > >> On 5/17/2018 1:39 PM, Cornelia Huck wrote: > >>> On Wed, 16 May 2018 21:30:19 -0600 > >>> Alex Williamson wrote: > >>> > When we create an mdev device, we check for duplicates against the > parent device and return -EEXIST if found, but the mdev device > namespace is global since we'll link all devices from the bus. We do > catch this later in sysfs_do_create_link_sd() to return -EEXIST, but > with it comes a kernel warning and stack trace for trying to create > duplicate sysfs links, which makes it an undesirable response. > > Therefore we should really be looking for duplicates across all mdev > parent devices, or as implemented here, against our mdev device list. > Using mdev_list to prevent duplicates means that we can remove > mdev_parent.lock, but in order not to serialize mdev device creation > and removal globally, we add mdev_device.active which allows UUIDs to > be reserved such that we can drop the mdev_list_lock before the mdev > device is fully in place. > > NB. there was never intended to be any serialization guarantee > provided by the mdev core with respect to creation and removal of mdev > devices, mdev_parent.lock provided this only as a side-effect of the > implementation for locking the namespace per parent. That > serialization is now removed. > >>> > >> > >> mdev_parent.lock is to serialize create and remove of that mdev device, > >> that handles race condition that Cornelia mentioned below. > > > > Previously it was stated: > > > > On Thu, 17 May 2018 01:01:40 +0530 > > Kirti Wankhede wrote: > >> Here lock is not for create/remove routines of vendor driver, its about > >> mdev device creation and device registration, which is a common code > >> path, and so is part of mdev core module. > > > > So the race condition was handled previously, but as a side-effect of > > protecting the namespace, aiui. I'm trying to state above that the > > serialization of create/remove was never intended as a guarantee > > provided to mdev vendor drivers. I don't see that there's a need to > > protect "mdev device creation and device registration" beyond conflicts > > in the UUID namespace, which is done here. Are there others? > > > > Sorry not being elaborative in my earlier response to > > > If we can > > show that vendor drivers handle the create/remove paths themselves, > > perhaps we can refine the locking granularity. > > mdev_device_create() function does : > - create mdev device > - register device > - call vendor driver->create > - create sysfs files. > > mdev_device_remove() removes sysfs files, unregister device and delete > device. > > There is common code in mdev_device_create() and mdev_device_remove() > independent of what vendor driver does in its create()/remove() > callback. Moving this code to each vendor driver to handle create/remove > themselves doesn't make sense to me. I don't see where anyone is suggesting that, I'm not. > mdev_parent.lock here does take care of race conditions that could occur > during mdev device creation and deletion in this common code path. Exactly what races in the common code path is mdev_parent.lock preventing? mdev_device_create() calls: device_register() mdev_device_create_ops() parent->ops->create() sysfs_create_groups() mdev_create_sysfs_files() sysfs_create_files() sysfs_create_link() sysfs_create_link() mdev_parent.lock is certainly not serializing all calls across the entire kernel to device_register and sysfs_create_{groups,files,link} so what is it protecting other than serializing parent->ops->create()? Locks protect data, not code. The data we're protecting is the shared mdev_list, there is no shared data once mdev_device_create() has its mdev_device with uuid reservation placed into that list. > What is the urge to remove mdev_parent.lock if that handles all race > conditions without bothering user to handle -EAGAIN? Can you say why -EAGAIN is undesirable? Note that the user is only going to see this error if they attempt to remove the device in the minuscule gap between the sysfs remove file being created and the completion of the write to the create sysfs file. It seems like you're asking that I decrease the locking granularity, but not too much because mdev_parent.lock protects "things". If the -EAGAIN is really so terrible, we can avoid it by spinning until the mdev_device is either not found in the list or becomes active, we don't need mdev_parent.lock to solve that, but I don't think that's the best solution and there's no concrete statement to back -EAGAIN being a problem. Thanks, Alex
Re: [PATCH v3] vfio/mdev: Check globally for duplicate devices
On 5/17/2018 9:50 PM, Alex Williamson wrote: > On Thu, 17 May 2018 21:25:22 +0530 > Kirti Wankhede wrote: > >> On 5/17/2018 1:39 PM, Cornelia Huck wrote: >>> On Wed, 16 May 2018 21:30:19 -0600 >>> Alex Williamson wrote: >>> When we create an mdev device, we check for duplicates against the parent device and return -EEXIST if found, but the mdev device namespace is global since we'll link all devices from the bus. We do catch this later in sysfs_do_create_link_sd() to return -EEXIST, but with it comes a kernel warning and stack trace for trying to create duplicate sysfs links, which makes it an undesirable response. Therefore we should really be looking for duplicates across all mdev parent devices, or as implemented here, against our mdev device list. Using mdev_list to prevent duplicates means that we can remove mdev_parent.lock, but in order not to serialize mdev device creation and removal globally, we add mdev_device.active which allows UUIDs to be reserved such that we can drop the mdev_list_lock before the mdev device is fully in place. NB. there was never intended to be any serialization guarantee provided by the mdev core with respect to creation and removal of mdev devices, mdev_parent.lock provided this only as a side-effect of the implementation for locking the namespace per parent. That serialization is now removed. >>> >> >> mdev_parent.lock is to serialize create and remove of that mdev device, >> that handles race condition that Cornelia mentioned below. > > Previously it was stated: > > On Thu, 17 May 2018 01:01:40 +0530 > Kirti Wankhede wrote: >> Here lock is not for create/remove routines of vendor driver, its about >> mdev device creation and device registration, which is a common code >> path, and so is part of mdev core module. > > So the race condition was handled previously, but as a side-effect of > protecting the namespace, aiui. I'm trying to state above that the > serialization of create/remove was never intended as a guarantee > provided to mdev vendor drivers. I don't see that there's a need to > protect "mdev device creation and device registration" beyond conflicts > in the UUID namespace, which is done here. Are there others? > Sorry not being elaborative in my earlier response to > If we can > show that vendor drivers handle the create/remove paths themselves, > perhaps we can refine the locking granularity. mdev_device_create() function does : - create mdev device - register device - call vendor driver->create - create sysfs files. mdev_device_remove() removes sysfs files, unregister device and delete device. There is common code in mdev_device_create() and mdev_device_remove() independent of what vendor driver does in its create()/remove() callback. Moving this code to each vendor driver to handle create/remove themselves doesn't make sense to me. mdev_parent.lock here does take care of race conditions that could occur during mdev device creation and deletion in this common code path. What is the urge to remove mdev_parent.lock if that handles all race conditions without bothering user to handle -EAGAIN? Thanks, Kirti >>> This is probably fine; but I noted that documentation on the locking >>> conventions and serialization guarantees for mdev is a bit sparse, and >>> this topic also came up during the vfio-ap review. >>> >>> We probably want to add some more concrete documentation; would the >>> kernel doc for the _ops or vfio-mediated-device.txt be a better place >>> for that? > > I'll look to see where we can add a note withing that file, I suspect > that's the right place to put it. > >>> [Dong Jia, Halil: Can you please take a look whether vfio-ccw is really >>> ok? I don't think we open up any new races, but I'd appreciate a second >>> or third opinion.] >>> Signed-off-by: Alex Williamson --- v3: Rework locking and add a field to mdev_device so we can track completed instances vs those added to reserve the namespace. drivers/vfio/mdev/mdev_core.c| 94 +- drivers/vfio/mdev/mdev_private.h |2 - 2 files changed, 34 insertions(+), 62 deletions(-) diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c index 126991046eb7..55ea9d34ec69 100644 --- a/drivers/vfio/mdev/mdev_core.c +++ b/drivers/vfio/mdev/mdev_core.c @@ -66,34 +66,6 @@ uuid_le mdev_uuid(struct mdev_device *mdev) } EXPORT_SYMBOL(mdev_uuid); -static int _find_mdev_device(struct device *dev, void *data) -{ - struct mdev_device *mdev; - - if (!dev_is_mdev(dev)) - return 0; - - mdev = to_mdev_device(dev); - - if (uuid_le_cmp(mdev->uuid, *(uuid_le *)data) == 0) - return 1; - - return 0; -} - -static bool md
Re: [PATCH v3] vfio/mdev: Check globally for duplicate devices
On Thu, 17 May 2018 21:25:22 +0530 Kirti Wankhede wrote: > On 5/17/2018 1:39 PM, Cornelia Huck wrote: > > On Wed, 16 May 2018 21:30:19 -0600 > > Alex Williamson wrote: > > > >> When we create an mdev device, we check for duplicates against the > >> parent device and return -EEXIST if found, but the mdev device > >> namespace is global since we'll link all devices from the bus. We do > >> catch this later in sysfs_do_create_link_sd() to return -EEXIST, but > >> with it comes a kernel warning and stack trace for trying to create > >> duplicate sysfs links, which makes it an undesirable response. > >> > >> Therefore we should really be looking for duplicates across all mdev > >> parent devices, or as implemented here, against our mdev device list. > >> Using mdev_list to prevent duplicates means that we can remove > >> mdev_parent.lock, but in order not to serialize mdev device creation > >> and removal globally, we add mdev_device.active which allows UUIDs to > >> be reserved such that we can drop the mdev_list_lock before the mdev > >> device is fully in place. > >> > >> NB. there was never intended to be any serialization guarantee > >> provided by the mdev core with respect to creation and removal of mdev > >> devices, mdev_parent.lock provided this only as a side-effect of the > >> implementation for locking the namespace per parent. That > >> serialization is now removed. > > > > mdev_parent.lock is to serialize create and remove of that mdev device, > that handles race condition that Cornelia mentioned below. Previously it was stated: On Thu, 17 May 2018 01:01:40 +0530 Kirti Wankhede wrote: > Here lock is not for create/remove routines of vendor driver, its about > mdev device creation and device registration, which is a common code > path, and so is part of mdev core module. So the race condition was handled previously, but as a side-effect of protecting the namespace, aiui. I'm trying to state above that the serialization of create/remove was never intended as a guarantee provided to mdev vendor drivers. I don't see that there's a need to protect "mdev device creation and device registration" beyond conflicts in the UUID namespace, which is done here. Are there others? > > This is probably fine; but I noted that documentation on the locking > > conventions and serialization guarantees for mdev is a bit sparse, and > > this topic also came up during the vfio-ap review. > > > > We probably want to add some more concrete documentation; would the > > kernel doc for the _ops or vfio-mediated-device.txt be a better place > > for that? I'll look to see where we can add a note withing that file, I suspect that's the right place to put it. > > [Dong Jia, Halil: Can you please take a look whether vfio-ccw is really > > ok? I don't think we open up any new races, but I'd appreciate a second > > or third opinion.] > > > >> > >> Signed-off-by: Alex Williamson > >> --- > >> > >> v3: Rework locking and add a field to mdev_device so we can track > >> completed instances vs those added to reserve the namespace. > >> > >> drivers/vfio/mdev/mdev_core.c| 94 > >> +- > >> drivers/vfio/mdev/mdev_private.h |2 - > >> 2 files changed, 34 insertions(+), 62 deletions(-) > >> > >> diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c > >> index 126991046eb7..55ea9d34ec69 100644 > >> --- a/drivers/vfio/mdev/mdev_core.c > >> +++ b/drivers/vfio/mdev/mdev_core.c > >> @@ -66,34 +66,6 @@ uuid_le mdev_uuid(struct mdev_device *mdev) > >> } > >> EXPORT_SYMBOL(mdev_uuid); > >> > >> -static int _find_mdev_device(struct device *dev, void *data) > >> -{ > >> - struct mdev_device *mdev; > >> - > >> - if (!dev_is_mdev(dev)) > >> - return 0; > >> - > >> - mdev = to_mdev_device(dev); > >> - > >> - if (uuid_le_cmp(mdev->uuid, *(uuid_le *)data) == 0) > >> - return 1; > >> - > >> - return 0; > >> -} > >> - > >> -static bool mdev_device_exist(struct mdev_parent *parent, uuid_le uuid) > >> -{ > >> - struct device *dev; > >> - > >> - dev = device_find_child(parent->dev, &uuid, _find_mdev_device); > >> - if (dev) { > >> - put_device(dev); > >> - return true; > >> - } > >> - > >> - return false; > >> -} > >> - > >> /* Should be called holding parent_list_lock */ > >> static struct mdev_parent *__find_parent_device(struct device *dev) > >> { > >> @@ -221,7 +193,6 @@ int mdev_register_device(struct device *dev, const > >> struct mdev_parent_ops *ops) > >>} > >> > >>kref_init(&parent->ref); > >> - mutex_init(&parent->lock); > >> > >>parent->dev = dev; > >>parent->ops = ops; > >> @@ -304,7 +275,7 @@ static void mdev_device_release(struct device *dev) > >> int mdev_device_create(struct kobject *kobj, struct device *dev, uuid_le > >> uuid) > >> { > >>int ret; > >> - struct mdev_device *mdev; > >> + struct mdev_device *mdev, *tmp; > >>struct mdev_parent *parent; > >>s
Re: [PATCH v3] vfio/mdev: Check globally for duplicate devices
On 5/17/2018 1:39 PM, Cornelia Huck wrote: > On Wed, 16 May 2018 21:30:19 -0600 > Alex Williamson wrote: > >> When we create an mdev device, we check for duplicates against the >> parent device and return -EEXIST if found, but the mdev device >> namespace is global since we'll link all devices from the bus. We do >> catch this later in sysfs_do_create_link_sd() to return -EEXIST, but >> with it comes a kernel warning and stack trace for trying to create >> duplicate sysfs links, which makes it an undesirable response. >> >> Therefore we should really be looking for duplicates across all mdev >> parent devices, or as implemented here, against our mdev device list. >> Using mdev_list to prevent duplicates means that we can remove >> mdev_parent.lock, but in order not to serialize mdev device creation >> and removal globally, we add mdev_device.active which allows UUIDs to >> be reserved such that we can drop the mdev_list_lock before the mdev >> device is fully in place. >> >> NB. there was never intended to be any serialization guarantee >> provided by the mdev core with respect to creation and removal of mdev >> devices, mdev_parent.lock provided this only as a side-effect of the >> implementation for locking the namespace per parent. That >> serialization is now removed. > mdev_parent.lock is to serialize create and remove of that mdev device, that handles race condition that Cornelia mentioned below. > This is probably fine; but I noted that documentation on the locking > conventions and serialization guarantees for mdev is a bit sparse, and > this topic also came up during the vfio-ap review. > > We probably want to add some more concrete documentation; would the > kernel doc for the _ops or vfio-mediated-device.txt be a better place > for that? > > [Dong Jia, Halil: Can you please take a look whether vfio-ccw is really > ok? I don't think we open up any new races, but I'd appreciate a second > or third opinion.] > >> >> Signed-off-by: Alex Williamson >> --- >> >> v3: Rework locking and add a field to mdev_device so we can track >> completed instances vs those added to reserve the namespace. >> >> drivers/vfio/mdev/mdev_core.c| 94 >> +- >> drivers/vfio/mdev/mdev_private.h |2 - >> 2 files changed, 34 insertions(+), 62 deletions(-) >> >> diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c >> index 126991046eb7..55ea9d34ec69 100644 >> --- a/drivers/vfio/mdev/mdev_core.c >> +++ b/drivers/vfio/mdev/mdev_core.c >> @@ -66,34 +66,6 @@ uuid_le mdev_uuid(struct mdev_device *mdev) >> } >> EXPORT_SYMBOL(mdev_uuid); >> >> -static int _find_mdev_device(struct device *dev, void *data) >> -{ >> -struct mdev_device *mdev; >> - >> -if (!dev_is_mdev(dev)) >> -return 0; >> - >> -mdev = to_mdev_device(dev); >> - >> -if (uuid_le_cmp(mdev->uuid, *(uuid_le *)data) == 0) >> -return 1; >> - >> -return 0; >> -} >> - >> -static bool mdev_device_exist(struct mdev_parent *parent, uuid_le uuid) >> -{ >> -struct device *dev; >> - >> -dev = device_find_child(parent->dev, &uuid, _find_mdev_device); >> -if (dev) { >> -put_device(dev); >> -return true; >> -} >> - >> -return false; >> -} >> - >> /* Should be called holding parent_list_lock */ >> static struct mdev_parent *__find_parent_device(struct device *dev) >> { >> @@ -221,7 +193,6 @@ int mdev_register_device(struct device *dev, const >> struct mdev_parent_ops *ops) >> } >> >> kref_init(&parent->ref); >> -mutex_init(&parent->lock); >> >> parent->dev = dev; >> parent->ops = ops; >> @@ -304,7 +275,7 @@ static void mdev_device_release(struct device *dev) >> int mdev_device_create(struct kobject *kobj, struct device *dev, uuid_le >> uuid) >> { >> int ret; >> -struct mdev_device *mdev; >> +struct mdev_device *mdev, *tmp; >> struct mdev_parent *parent; >> struct mdev_type *type = to_mdev_type(kobj); >> >> @@ -312,21 +283,26 @@ int mdev_device_create(struct kobject *kobj, struct >> device *dev, uuid_le uuid) >> if (!parent) >> return -EINVAL; >> >> -mutex_lock(&parent->lock); >> +mutex_lock(&mdev_list_lock); >> >> /* Check for duplicate */ >> -if (mdev_device_exist(parent, uuid)) { >> -ret = -EEXIST; >> -goto create_err; >> +list_for_each_entry(tmp, &mdev_list, next) { >> +if (!uuid_le_cmp(tmp->uuid, uuid)) { >> +mutex_unlock(&mdev_list_lock); >> +return -EEXIST; >> +} >> } >> mdev_put_parent(parent) missing before return. >> mdev = kzalloc(sizeof(*mdev), GFP_KERNEL); >> if (!mdev) { >> -ret = -ENOMEM; >> -goto create_err; >> +mutex_unlock(&mdev_list_lock); >> +return -ENOMEM; >> } >> mdev_put_parent(parent) missing here again. Thanks, Kirti >>
Re: [PATCH v3] vfio/mdev: Check globally for duplicate devices
On Wed, 16 May 2018 21:30:19 -0600 Alex Williamson wrote: > When we create an mdev device, we check for duplicates against the > parent device and return -EEXIST if found, but the mdev device > namespace is global since we'll link all devices from the bus. We do > catch this later in sysfs_do_create_link_sd() to return -EEXIST, but > with it comes a kernel warning and stack trace for trying to create > duplicate sysfs links, which makes it an undesirable response. > > Therefore we should really be looking for duplicates across all mdev > parent devices, or as implemented here, against our mdev device list. > Using mdev_list to prevent duplicates means that we can remove > mdev_parent.lock, but in order not to serialize mdev device creation > and removal globally, we add mdev_device.active which allows UUIDs to > be reserved such that we can drop the mdev_list_lock before the mdev > device is fully in place. > > NB. there was never intended to be any serialization guarantee > provided by the mdev core with respect to creation and removal of mdev > devices, mdev_parent.lock provided this only as a side-effect of the > implementation for locking the namespace per parent. That > serialization is now removed. This is probably fine; but I noted that documentation on the locking conventions and serialization guarantees for mdev is a bit sparse, and this topic also came up during the vfio-ap review. We probably want to add some more concrete documentation; would the kernel doc for the _ops or vfio-mediated-device.txt be a better place for that? [Dong Jia, Halil: Can you please take a look whether vfio-ccw is really ok? I don't think we open up any new races, but I'd appreciate a second or third opinion.] > > Signed-off-by: Alex Williamson > --- > > v3: Rework locking and add a field to mdev_device so we can track > completed instances vs those added to reserve the namespace. > > drivers/vfio/mdev/mdev_core.c| 94 > +- > drivers/vfio/mdev/mdev_private.h |2 - > 2 files changed, 34 insertions(+), 62 deletions(-) > > diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c > index 126991046eb7..55ea9d34ec69 100644 > --- a/drivers/vfio/mdev/mdev_core.c > +++ b/drivers/vfio/mdev/mdev_core.c > @@ -66,34 +66,6 @@ uuid_le mdev_uuid(struct mdev_device *mdev) > } > EXPORT_SYMBOL(mdev_uuid); > > -static int _find_mdev_device(struct device *dev, void *data) > -{ > - struct mdev_device *mdev; > - > - if (!dev_is_mdev(dev)) > - return 0; > - > - mdev = to_mdev_device(dev); > - > - if (uuid_le_cmp(mdev->uuid, *(uuid_le *)data) == 0) > - return 1; > - > - return 0; > -} > - > -static bool mdev_device_exist(struct mdev_parent *parent, uuid_le uuid) > -{ > - struct device *dev; > - > - dev = device_find_child(parent->dev, &uuid, _find_mdev_device); > - if (dev) { > - put_device(dev); > - return true; > - } > - > - return false; > -} > - > /* Should be called holding parent_list_lock */ > static struct mdev_parent *__find_parent_device(struct device *dev) > { > @@ -221,7 +193,6 @@ int mdev_register_device(struct device *dev, const struct > mdev_parent_ops *ops) > } > > kref_init(&parent->ref); > - mutex_init(&parent->lock); > > parent->dev = dev; > parent->ops = ops; > @@ -304,7 +275,7 @@ static void mdev_device_release(struct device *dev) > int mdev_device_create(struct kobject *kobj, struct device *dev, uuid_le > uuid) > { > int ret; > - struct mdev_device *mdev; > + struct mdev_device *mdev, *tmp; > struct mdev_parent *parent; > struct mdev_type *type = to_mdev_type(kobj); > > @@ -312,21 +283,26 @@ int mdev_device_create(struct kobject *kobj, struct > device *dev, uuid_le uuid) > if (!parent) > return -EINVAL; > > - mutex_lock(&parent->lock); > + mutex_lock(&mdev_list_lock); > > /* Check for duplicate */ > - if (mdev_device_exist(parent, uuid)) { > - ret = -EEXIST; > - goto create_err; > + list_for_each_entry(tmp, &mdev_list, next) { > + if (!uuid_le_cmp(tmp->uuid, uuid)) { > + mutex_unlock(&mdev_list_lock); > + return -EEXIST; > + } > } > > mdev = kzalloc(sizeof(*mdev), GFP_KERNEL); > if (!mdev) { > - ret = -ENOMEM; > - goto create_err; > + mutex_unlock(&mdev_list_lock); > + return -ENOMEM; > } > > memcpy(&mdev->uuid, &uuid, sizeof(uuid_le)); > + list_add(&mdev->next, &mdev_list); > + mutex_unlock(&mdev_list_lock); > + > mdev->parent = parent; > kref_init(&mdev->ref); > > @@ -352,21 +328,18 @@ int mdev_device_create(struct kobject *kobj, struct > device *dev, uuid_le uuid) > } > > mdev->type_kobj = kobj; > + mdev->active = true; >