Re: [PATCH 1/2] thunderbolt: Fix a leak in tb_retimer_add()
On Mon, Mar 29, 2021 at 09:07:18AM +0300, Dan Carpenter wrote: > After the device_register() succeeds, then the correct way to clean up > is to call device_unregister(). The unregister calls both device_del() > and device_put(). Since this code was only device_del() it results in > a memory leak. > > Fixes: dacb12877d92 ("thunderbolt: Add support for on-board retimers") > Signed-off-by: Dan Carpenter > --- > This is from a new static checker warning. Not tested. With new > warnings it's also possible that I have misunderstood something > fundamental so review carefully etc. > > drivers/thunderbolt/retimer.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) Both patches applied to thunderbolt.git/fixes. Thanks Dan!
Re: [PATCH 1/2] thunderbolt: Fix a leak in tb_retimer_add()
On Mon, Mar 29, 2021 at 11:54:05AM -0300, Jason Gunthorpe wrote: > On Mon, Mar 29, 2021 at 05:43:23PM +0300, Mika Westerberg wrote: > > > The nvm is a separate (physical Linux) device that gets added under this > > one. It cannot be added before AFAICT. > > Hum, yes, but then it is odd that a parent is holding sysfs attributes > that refer to a child. Well the child (NVMem) comes from completely different subsystem that does not have a concept of "authentication" or anythin similar. This is what we add on top. We actually exposer two NVMem devices under each retimer: one that is the current active one, and then the one that is used to write the new firmware image. > > The code you refer actually looks like this: > > > > static ssize_t nvm_authenticate_store(struct device *dev, > > struct device_attribute *attr, const char *buf, size_t count) > > { > > ... > > if (!mutex_trylock(>tb->lock)) { > > ret = restart_syscall(); > > goto exit_rpm; > > } > > Is that lock held during tb_retimer_nvm_add() I looked for a bit and > didn't find something. So someplace more than 4 call site above > mandatory locking is being held? Yes it is. It is called from tb_scan_port() where that lock is held. > static void tb_retimer_remove(struct tb_retimer *rt) > { > dev_info(>dev, "retimer disconnected\n"); > tb_nvm_free(rt->nvm); > device_unregister(>dev); > } > > Here too? Yes. > And this is why it is all trylock because it deadlocks with unregister > otherwise? I tried to explain it in 09f11b6c99fe ("thunderbolt: Take domain lock in switch sysfs attribute callbacks"), except that at that time we did not have retimers exposed but the same applies here too.
Re: [PATCH 1/2] thunderbolt: Fix a leak in tb_retimer_add()
On Mon, Mar 29, 2021 at 05:43:23PM +0300, Mika Westerberg wrote: > The nvm is a separate (physical Linux) device that gets added under this > one. It cannot be added before AFAICT. Hum, yes, but then it is odd that a parent is holding sysfs attributes that refer to a child. > The code you refer actually looks like this: > > static ssize_t nvm_authenticate_store(struct device *dev, > struct device_attribute *attr, const char *buf, size_t count) > { > ... > if (!mutex_trylock(>tb->lock)) { > ret = restart_syscall(); > goto exit_rpm; > } Is that lock held during tb_retimer_nvm_add() I looked for a bit and didn't find something. So someplace more than 4 call site above mandatory locking is being held? static void tb_retimer_remove(struct tb_retimer *rt) { dev_info(>dev, "retimer disconnected\n"); tb_nvm_free(rt->nvm); device_unregister(>dev); } Here too? And this is why it is all trylock because it deadlocks with unregister otherwise? Jason
Re: [PATCH 1/2] thunderbolt: Fix a leak in tb_retimer_add()
Hi, On Mon, Mar 29, 2021 at 10:02:20AM -0300, Jason Gunthorpe wrote: > On Mon, Mar 29, 2021 at 09:07:18AM +0300, Dan Carpenter wrote: > > After the device_register() succeeds, then the correct way to clean up > > is to call device_unregister(). The unregister calls both device_del() > > and device_put(). Since this code was only device_del() it results in > > a memory leak. > > > > Fixes: dacb12877d92 ("thunderbolt: Add support for on-board retimers") > > Signed-off-by: Dan Carpenter > > --- > > This is from a new static checker warning. Not tested. With new > > warnings it's also possible that I have misunderstood something > > fundamental so review carefully etc. > > It looks OK to me I agree too. > Reviewed-by: Jason Gunthorpe Thanks for the review! > This also highlights the code has an ordering issue too, it calls > device_register() then goes to do tb_retimer_nvm_add() however > device_register() makes sysfs attributes visible before the rt->nvm is > initialized and this: > > static ssize_t nvm_authenticate_store(struct device *dev, > struct device_attribute *attr, const char *buf, size_t count) > { > if (!rt->nvm) { > > Isn't strong enough to close the potential racing. The nvm should be > setup before device_register and all the above tests in the sysfs > deleted so we can rely on the CPU barriers built into > device_register() for correctness. > > [which is a general tip, be very suspicious if device_register() is > being error unwound] The nvm is a separate (physical Linux) device that gets added under this one. It cannot be added before AFAICT. The code you refer actually looks like this: static ssize_t nvm_authenticate_store(struct device *dev, struct device_attribute *attr, const char *buf, size_t count) { ... if (!mutex_trylock(>tb->lock)) { ret = restart_syscall(); goto exit_rpm; } if (!rt->nvm) { ret = -EAGAIN; goto exit_unlock; } Idea here is that if the NVMem (nvm) is not yet registered the attribute is there but we return -EAGAIN to the userspace.
Re: [PATCH 1/2] thunderbolt: Fix a leak in tb_retimer_add()
On Mon, Mar 29, 2021 at 09:07:18AM +0300, Dan Carpenter wrote: > After the device_register() succeeds, then the correct way to clean up > is to call device_unregister(). The unregister calls both device_del() > and device_put(). Since this code was only device_del() it results in > a memory leak. > > Fixes: dacb12877d92 ("thunderbolt: Add support for on-board retimers") > Signed-off-by: Dan Carpenter > --- > This is from a new static checker warning. Not tested. With new > warnings it's also possible that I have misunderstood something > fundamental so review carefully etc. It looks OK to me Reviewed-by: Jason Gunthorpe This also highlights the code has an ordering issue too, it calls device_register() then goes to do tb_retimer_nvm_add() however device_register() makes sysfs attributes visible before the rt->nvm is initialized and this: static ssize_t nvm_authenticate_store(struct device *dev, struct device_attribute *attr, const char *buf, size_t count) { if (!rt->nvm) { Isn't strong enough to close the potential racing. The nvm should be setup before device_register and all the above tests in the sysfs deleted so we can rely on the CPU barriers built into device_register() for correctness. [which is a general tip, be very suspicious if device_register() is being error unwound] Jason
[PATCH 1/2] thunderbolt: Fix a leak in tb_retimer_add()
After the device_register() succeeds, then the correct way to clean up is to call device_unregister(). The unregister calls both device_del() and device_put(). Since this code was only device_del() it results in a memory leak. Fixes: dacb12877d92 ("thunderbolt: Add support for on-board retimers") Signed-off-by: Dan Carpenter --- This is from a new static checker warning. Not tested. With new warnings it's also possible that I have misunderstood something fundamental so review carefully etc. drivers/thunderbolt/retimer.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/thunderbolt/retimer.c b/drivers/thunderbolt/retimer.c index 620bcf586ee2..7a5d61604c8b 100644 --- a/drivers/thunderbolt/retimer.c +++ b/drivers/thunderbolt/retimer.c @@ -347,7 +347,7 @@ static int tb_retimer_add(struct tb_port *port, u8 index, u32 auth_status) ret = tb_retimer_nvm_add(rt); if (ret) { dev_err(>dev, "failed to add NVM devices: %d\n", ret); - device_del(>dev); + device_unregister(>dev); return ret; } -- 2.30.2