Re: [PATCH 1/2] thunderbolt: Fix a leak in tb_retimer_add()

2021-03-30 Thread Mika Westerberg
On Mon, Mar 29, 2021 at 09:07:18AM +0300, Dan Carpenter wrote:
> After the device_register() succeeds, then the correct way to clean up
> is to call device_unregister().  The unregister calls both device_del()
> and device_put().  Since this code was only device_del() it results in
> a memory leak.
> 
> Fixes: dacb12877d92 ("thunderbolt: Add support for on-board retimers")
> Signed-off-by: Dan Carpenter 
> ---
> This is from a new static checker warning.  Not tested.  With new
> warnings it's also possible that I have misunderstood something
> fundamental so review carefully etc.
> 
>  drivers/thunderbolt/retimer.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Both patches applied to thunderbolt.git/fixes. Thanks Dan!


Re: [PATCH 1/2] thunderbolt: Fix a leak in tb_retimer_add()

2021-03-29 Thread Mika Westerberg
On Mon, Mar 29, 2021 at 11:54:05AM -0300, Jason Gunthorpe wrote:
> On Mon, Mar 29, 2021 at 05:43:23PM +0300, Mika Westerberg wrote:
> 
> > The nvm is a separate (physical Linux) device that gets added under this
> > one. It cannot be added before AFAICT.
> 
> Hum, yes, but then it is odd that a parent is holding sysfs attributes
> that refer to a child.

Well the child (NVMem) comes from completely different subsystem that
does not have a concept of "authentication" or anythin similar. This is
what we add on top. We actually exposer two NVMem devices under each
retimer: one that is the current active one, and then the one that is
used to write the new firmware image.

> > The code you refer actually looks like this:
> > 
> > static ssize_t nvm_authenticate_store(struct device *dev,
> > struct device_attribute *attr, const char *buf, size_t count)
> > {
> > ...
> > if (!mutex_trylock(>tb->lock)) {
> > ret = restart_syscall();
> > goto exit_rpm;
> > }
> 
> Is that lock held during tb_retimer_nvm_add() I looked for a bit and
> didn't find something. So someplace more than 4 call site above
> mandatory locking is being held?

Yes it is. It is called from tb_scan_port() where that lock is held.

> static void tb_retimer_remove(struct tb_retimer *rt)
> {
>   dev_info(>dev, "retimer disconnected\n");
>   tb_nvm_free(rt->nvm);
>   device_unregister(>dev);
> }
> 
> Here too?

Yes.

> And this is why it is all trylock because it deadlocks with unregister
> otherwise?

I tried to explain it in 09f11b6c99fe ("thunderbolt: Take domain lock in
switch sysfs attribute callbacks"), except that at that time we did not
have retimers exposed but the same applies here too.


Re: [PATCH 1/2] thunderbolt: Fix a leak in tb_retimer_add()

2021-03-29 Thread Jason Gunthorpe
On Mon, Mar 29, 2021 at 05:43:23PM +0300, Mika Westerberg wrote:

> The nvm is a separate (physical Linux) device that gets added under this
> one. It cannot be added before AFAICT.

Hum, yes, but then it is odd that a parent is holding sysfs attributes
that refer to a child.

> The code you refer actually looks like this:
> 
> static ssize_t nvm_authenticate_store(struct device *dev,
>   struct device_attribute *attr, const char *buf, size_t count)
> {
>   ...
> if (!mutex_trylock(>tb->lock)) {
> ret = restart_syscall();
> goto exit_rpm;
> }

Is that lock held during tb_retimer_nvm_add() I looked for a bit and
didn't find something. So someplace more than 4 call site above
mandatory locking is being held?

static void tb_retimer_remove(struct tb_retimer *rt)
{
dev_info(>dev, "retimer disconnected\n");
tb_nvm_free(rt->nvm);
device_unregister(>dev);
}

Here too?

And this is why it is all trylock because it deadlocks with unregister
otherwise?

Jason


Re: [PATCH 1/2] thunderbolt: Fix a leak in tb_retimer_add()

2021-03-29 Thread Mika Westerberg
Hi,

On Mon, Mar 29, 2021 at 10:02:20AM -0300, Jason Gunthorpe wrote:
> On Mon, Mar 29, 2021 at 09:07:18AM +0300, Dan Carpenter wrote:
> > After the device_register() succeeds, then the correct way to clean up
> > is to call device_unregister().  The unregister calls both device_del()
> > and device_put().  Since this code was only device_del() it results in
> > a memory leak.
> > 
> > Fixes: dacb12877d92 ("thunderbolt: Add support for on-board retimers")
> > Signed-off-by: Dan Carpenter 
> > ---
> > This is from a new static checker warning.  Not tested.  With new
> > warnings it's also possible that I have misunderstood something
> > fundamental so review carefully etc.
> 
> It looks OK to me

I agree too.

> Reviewed-by: Jason Gunthorpe 

Thanks for the review!

> This also highlights the code has an ordering issue too, it calls
> device_register() then goes to do tb_retimer_nvm_add() however
> device_register() makes sysfs attributes visible before the rt->nvm is
> initialized and this:
> 
> static ssize_t nvm_authenticate_store(struct device *dev,
>   struct device_attribute *attr, const char *buf, size_t count)
> {
>   if (!rt->nvm) {
> 
> Isn't strong enough to close the potential racing. The nvm should be
> setup before device_register and all the above tests in the sysfs
> deleted so we can rely on the CPU barriers built into
> device_register() for correctness.
> 
> [which is a general tip, be very suspicious if device_register() is
> being error unwound]

The nvm is a separate (physical Linux) device that gets added under this
one. It cannot be added before AFAICT.

The code you refer actually looks like this:

static ssize_t nvm_authenticate_store(struct device *dev,
struct device_attribute *attr, const char *buf, size_t count)
{
...
if (!mutex_trylock(>tb->lock)) {
ret = restart_syscall();
goto exit_rpm;
}

if (!rt->nvm) {
ret = -EAGAIN;
goto exit_unlock;
}


Idea here is that if the NVMem (nvm) is not yet registered the attribute is
there but we return -EAGAIN to the userspace.


Re: [PATCH 1/2] thunderbolt: Fix a leak in tb_retimer_add()

2021-03-29 Thread Jason Gunthorpe
On Mon, Mar 29, 2021 at 09:07:18AM +0300, Dan Carpenter wrote:
> After the device_register() succeeds, then the correct way to clean up
> is to call device_unregister().  The unregister calls both device_del()
> and device_put().  Since this code was only device_del() it results in
> a memory leak.
> 
> Fixes: dacb12877d92 ("thunderbolt: Add support for on-board retimers")
> Signed-off-by: Dan Carpenter 
> ---
> This is from a new static checker warning.  Not tested.  With new
> warnings it's also possible that I have misunderstood something
> fundamental so review carefully etc.

It looks OK to me

Reviewed-by: Jason Gunthorpe 

This also highlights the code has an ordering issue too, it calls
device_register() then goes to do tb_retimer_nvm_add() however
device_register() makes sysfs attributes visible before the rt->nvm is
initialized and this:

static ssize_t nvm_authenticate_store(struct device *dev,
struct device_attribute *attr, const char *buf, size_t count)
{
if (!rt->nvm) {

Isn't strong enough to close the potential racing. The nvm should be
setup before device_register and all the above tests in the sysfs
deleted so we can rely on the CPU barriers built into
device_register() for correctness.

[which is a general tip, be very suspicious if device_register() is
being error unwound]

Jason


[PATCH 1/2] thunderbolt: Fix a leak in tb_retimer_add()

2021-03-29 Thread Dan Carpenter
After the device_register() succeeds, then the correct way to clean up
is to call device_unregister().  The unregister calls both device_del()
and device_put().  Since this code was only device_del() it results in
a memory leak.

Fixes: dacb12877d92 ("thunderbolt: Add support for on-board retimers")
Signed-off-by: Dan Carpenter 
---
This is from a new static checker warning.  Not tested.  With new
warnings it's also possible that I have misunderstood something
fundamental so review carefully etc.

 drivers/thunderbolt/retimer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/thunderbolt/retimer.c b/drivers/thunderbolt/retimer.c
index 620bcf586ee2..7a5d61604c8b 100644
--- a/drivers/thunderbolt/retimer.c
+++ b/drivers/thunderbolt/retimer.c
@@ -347,7 +347,7 @@ static int tb_retimer_add(struct tb_port *port, u8 index, 
u32 auth_status)
ret = tb_retimer_nvm_add(rt);
if (ret) {
dev_err(>dev, "failed to add NVM devices: %d\n", ret);
-   device_del(>dev);
+   device_unregister(>dev);
return ret;
}
 
-- 
2.30.2