On Tue Mar 31, 2026 at 10:06 AM CEST, Danilo Krummrich wrote:
> On Mon Mar 30, 2026 at 10:10 PM CEST, Alex Williamson wrote:
>> On Mon, 30 Mar 2026 19:38:41 +0200
>> "Danilo Krummrich" <[email protected]> wrote:
>>
>>> (Cc: Jason)
>>> 
>>> On Tue Mar 24, 2026 at 1:59 AM CET, Danilo Krummrich wrote:
>>> > diff --git a/drivers/vfio/pci/vfio_pci_core.c 
>>> > b/drivers/vfio/pci/vfio_pci_core.c
>>> > index d43745fe4c84..460852f79f29 100644
>>> > --- a/drivers/vfio/pci/vfio_pci_core.c
>>> > +++ b/drivers/vfio/pci/vfio_pci_core.c
>>> > @@ -1987,9 +1987,8 @@ static int vfio_pci_bus_notifier(struct 
>>> > notifier_block *nb,
>>> >       pdev->is_virtfn && physfn == vdev->pdev) {
>>> >           pci_info(vdev->pdev, "Captured SR-IOV VF %s driver_override\n",
>>> >                    pci_name(pdev));
>>> > -         pdev->driver_override = kasprintf(GFP_KERNEL, "%s",
>>> > -                                           vdev->vdev.ops->name);
>>> > -         WARN_ON(!pdev->driver_override);
>>> > +         WARN_ON(device_set_driver_override(&pdev->dev,
>>> > +                                            vdev->vdev.ops->name));  
>>> 
>>> Technically, this is a change in behavior. If vdev->vdev.ops->name is NULL, 
>>> it
>>> will trigger the WARN_ON(), whereas before it would have just written 
>>> "(null)"
>>> into driver_override.
>>
>> It's worse than that.  Looking at the implementation in [1], we have:
>>
>> +static inline int device_set_driver_override(struct device *dev, const char 
>> *s)
>> +{
>> +    return __device_set_driver_override(dev, s, strlen(s));
>> +}
>>
>> So if name is NULL, we oops in strlen() before we even hit the -EINVAL
>> and WARN_ON().
>
> This was changed in v2 [2] and the actual code in-tree is
>
>       static inline int device_set_driver_override(struct device *dev, const 
> char *s)
>       {
>               return __device_set_driver_override(dev, s, s ? strlen(s) : 0);
>       }
>
> so it does indeed return -EINVAL for a NULL pointer.
>
>> I don't believe we have any vfio-pci variant drivers where the name is
>> NULL, but kasprintf() handling NULL as "(null)" was a consideration in
>> this design, that even if there is no name the device is sequestered
>> with a driver_override that won't match an actual driver.
>>
>>> I assume that vfio_pci_core drivers are expected to set the name in struct
>>> vfio_device_ops in the first place and this code (silently) relies on this
>>> invariant?
>>
>> We do expect that, but it was previously safe either way to make sure
>> VFs are only bound to the same ops driver or barring that, at least
>> don't perform a standard driver match.  The last thing we want to
>> happen automatically is for a user owned PF to create SR-IOV VFs that
>> automatically bind to native kernel drivers.
>>  
>>> Alex, Jason: Should we keep this hunk above as is and check for a proper 
>>> name in
>>> struct vfio_device_ops in vfio_pci_core_register_device() with a subsequent
>>> patch?
>>
>> Given the oops, my preference would be to roll it in here.  This change
>> is what makes it a requirement that name cannot be NULL, where this was
>> safely handled with kasprintf().
>
> Again, no oops here. :)
>
> I still think it makes more sense to fail early in
> vfio_pci_core_register_device(), rather than silently accept "(null)" in
> driver_override. It also doesn't seem unreasonable with only the WARN_ON(), 
> but
> I can also just add vdev->vdev.ops->name ?: "(null)".

(Or just skip the call if !vdev->vdev.ops->name, as a user will read "(null)"
from sysfs either way.)

> Please let me know what you prefer.
>
> - Danilo
>
>> [1] https://lore.kernel.org/all/[email protected]/
>
> [2] 
> https://lore.kernel.org/driver-core/[email protected]/


Reply via email to