mpe asked me to clarify that this was correct. Turns out it's not.

Currently, cxl_probe(pdev):
 1) calls pci_dev_get(pdev)
 2) calls cxl_adapter_init
    a) init calls cxl_adapter_alloc, which creates a struct cxl, 
       conventionally called adapter. This struct contains a 
       device entry, adapter->dev.

    b) init calls cxl_configure_adapter, where we set 
       adapter->dev.parent = &dev->dev (here dev is the pci dev)

So at this point, the cxl adapter's device's parent is the pci device
that I want to be refcounted.

    c) init calls cxl_register_adapter (which inexplicably is in file.c)

       *) cxl_register_adapter calls device_register(&adapter->dev) 

So now we're in device_register, where dev is the adapter device, and we
want to know if the PCI device is safe after we return.

device_register(&adapter->dev) calls device_initialize() and then
device_add().

device_add() does a get_device(). That ends up being a kobject_get() on
the adapter device kobj, which will increment the kref on the adapter
device. The PCI device doesn't get it's kref incremented explicitly.

It looks like we're not protected against that disappearing randomly.

So thanks, mpe, that was a good catch. I'll do a v2 that keeps the get
and adds a matching put.

Regards,
Daniel

On Wed, 2015-09-09 at 15:17 +1000, Daniel Axtens wrote:
> Currently the first thing we do in cxl_probe is to grab a reference
> on the pci device. Later on, we call device_register on our adapter,
> which also holds the PCI device.
> 
> In our remove path, we call device_unregister, but we never call
> pci_dev_put. We therefore leak the device every time we do a
> reflash.
> 
> device_register/unregister is sufficient to hold the reference.
> Drop the call to pci_dev_get.
> 
> Fixes: f204e0b8cedd ("cxl: Driver code for powernv PCIe based cards for 
> userspace access")
> Cc: sta...@vger.kernel.org
> Signed-off-by: Daniel Axtens <d...@axtens.net>
> 
> ---
> 
> This is the cxl bug that caused me to catch this a few weeks back:
> e642d11bdbfe ("powerpc/eeh: Probe after unbalanced kref check")
> 
> I put an printk in the unbalanced kref path and confirmed that it
> was printed with the pci_dev_get in and went away with the
> pci_dev_get out.
> ---
>  drivers/misc/cxl/pci.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
> index 02c85160bfe9..a5e977192b61 100644
> --- a/drivers/misc/cxl/pci.c
> +++ b/drivers/misc/cxl/pci.c
> @@ -1249,8 +1249,6 @@ static int cxl_probe(struct pci_dev *dev, const struct 
> pci_device_id *id)
>       int slice;
>       int rc;
>  
> -     pci_dev_get(dev);
> -
>       if (cxl_verbose)
>               dump_cxl_config_space(dev);
>  

-- 
Regards,
Daniel

Attachment: signature.asc
Description: This is a digitally signed message part

_______________________________________________
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Reply via email to