Re: [bug report] node: Add memory-side caching attributes

2021-04-01 Thread Keith Busch
On Thu, Apr 01, 2021 at 12:00:39PM +0300, Dan Carpenter wrote:
> Hi Keith,
> 
> I've been trying to figure out ways Smatch can check for device managed
> resources.  Like adding rules that if we call dev_set_name(>bar)
> then it's device managaged and if there is a kfree(foo) without calling
> device_put(>bar) then that's a resource leak.
> 
> Of course one of the rules is that if you call device_register(dev) then
> you can't kfree(dev), it has to released with device_put(dev) and that's
> true even if the register fails.  But this code here feels very
> intentional so maybe there is an exception to the rule?

Thanks for the notice. There was no intentional use of this, so I
think it was a mistake. The proposal in the follow on replies looks much
better.
 
> The patch acc02a109b04: "node: Add memory-side caching attributes"
> from Mar 11, 2019, leads to the following static checker warning:
> 
>   drivers/base/node.c:285 node_init_cache_dev()
>   error: kfree after device_register(): 'dev'
> 
> drivers/base/node.c
>263  static void node_init_cache_dev(struct node *node)
>264  {
>265  struct device *dev;
>266  
>267  dev = kzalloc(sizeof(*dev), GFP_KERNEL);
>268  if (!dev)
>269  return;
>270  
>271  dev->parent = >dev;
>272  dev->release = node_cache_release;
>273  if (dev_set_name(dev, "memory_side_cache"))
>274  goto free_dev;
>275  
>276  if (device_register(dev))
> ^^^
>277  goto free_name;
>278  
>279  pm_runtime_no_callbacks(dev);
>280  node->cache_dev = dev;
>281  return;
>282  free_name:
>283  kfree_const(dev->kobj.name);
>284  free_dev:
>285  kfree(dev);
> ^^
>286  }
> 
> regards,
> dan carpenter


Re: [bug report] node: Add memory-side caching attributes

2021-04-01 Thread Dan Carpenter
On Thu, Apr 01, 2021 at 08:25:11AM -0300, Jason Gunthorpe wrote:
> On Thu, Apr 01, 2021 at 12:00:39PM +0300, Dan Carpenter wrote:
> > Hi Keith,
> > 
> > I've been trying to figure out ways Smatch can check for device managed
> > resources.  Like adding rules that if we call dev_set_name(>bar)
> > then it's device managaged and if there is a kfree(foo) without calling
> > device_put(>bar) then that's a resource leak.
> 
> It seems to be working from what I can see

This check is actually more simple, and older.  It just looks for

device_register(dev);
...
kfree(dev);

I've written your proposed check of:

device_register(>dev);
...
kfree(foo); // warning missing device_put(>dev);

But I just did that earler today and it will probably take a couple
iterations to work out the kinks.  Plus I'm off for a small vacation so
it will be a week before I have the results from that.

> 
> Also I wasn't able to convince myself that any locking around
> node->cache_attrs exist..
> 
> > Of course one of the rules is that if you call device_register(dev) then
> > you can't kfree(dev), it has to released with device_put(dev) and that's
> > true even if the register fails.  But this code here feels very
> > intentional so maybe there is an exception to the rule?
> 
> There is no exception. Open coding this:
> 
> >282  free_name:
> >283  kfree_const(dev->kobj.name);
> 
> To avoid leaking memory from dev_set_name is a straight up layering
> violation, WTF?
> 
> node_cacheinfo_release() is just kfree(), so there is no need.
> Instead (please feel free to send this Dan):

Sure, I can send this (tomorrow).

> 
> diff --git a/drivers/base/node.c b/drivers/base/node.c
> index f449dbb2c74666..89c28952863977 100644
> --- a/drivers/base/node.c
> +++ b/drivers/base/node.c
> @@ -319,25 +319,24 @@ void node_add_cache(unsigned int nid, struct 
> node_cache_attrs *cache_attrs)
>   return;
>  
>   dev = >dev;
> + device_initialize(dev)
>   dev->parent = node->cache_dev;
>   dev->release = node_cacheinfo_release;
>   dev->groups = cache_groups;
>   if (dev_set_name(dev, "index%d", cache_attrs->level))

Is calling dev_set_name() without doing a device_initialize() a bug?  I
could write a check for that.

regards,
dan carpenter



Re: [bug report] node: Add memory-side caching attributes

2021-04-01 Thread Jason Gunthorpe
On Thu, Apr 01, 2021 at 05:06:52PM +0300, Dan Carpenter wrote:

> > diff --git a/drivers/base/node.c b/drivers/base/node.c
> > index f449dbb2c74666..89c28952863977 100644
> > +++ b/drivers/base/node.c
> > @@ -319,25 +319,24 @@ void node_add_cache(unsigned int nid, struct 
> > node_cache_attrs *cache_attrs)
> > return;
> >  
> > dev = >dev;
> > +   device_initialize(dev)
> > dev->parent = node->cache_dev;
> > dev->release = node_cacheinfo_release;
> > dev->groups = cache_groups;
> > if (dev_set_name(dev, "index%d", cache_attrs->level))
> 
> Is calling dev_set_name() without doing a device_initialize() a bug?  I
> could write a check for that.

IMHO, yes.

However, Greg may not agree as dev_set_name() with no error check
followed by device_register() is a very common pattern. If the user
omits the device_initialize() then dev_set_name() must be immediately
before only device_register() with no error unwind between them. It
must not error unwind dev_set_name() to kfree. (This is really
tricky)

Greg and I have argued on the merits of device_initialize() several
times before.

I argue the error control flow is simpler and easier to get right, he
argues the extra statement is unneeded complexity and people will get
the error unwind wrong.

Every time you find a bug like this is someone getting the complexity
around error handling and device_register() wrong, so my advices is to
stop using device_register (aka device_init_and_add) and make it
explicit so the goto unwind has logical pairing. put_device() pairs
with device_initialize().

The tricky bit is establishing the release function, as complex
release functions often have complex init sequences and you need the
setup done enough to go to release. When things become this complex I
advocate splitting alloc into a function:

/* Caller must use put_device(>dev) */
struct foo *foo_allocate()
{ 
   foo = kzalloc
   allocate release freeing thing 1;
   allocate release freeing thing 2;
   allocate release freeing thing 3;
   device_initialize();
   return foo;

err:
   free thing 3;
err:
   free thing 2;
err:
   free thing 1;
err:
   kfree(foo)
   return rc;
}

Thus there is a clear logical seperation between the world of 'unwind
to kfree' and the world of 'unwind to put_device'. dev_set_name() can
not be inside an alloc function.

Simple cases, like here, should just do device_initialize()
immediately after kzalloc() and never have a kfree() error unwind.

I saw Christoph had a similar opinion on 'init and add' patterns being
bad, maybe he has additional colour to share.

Jason


Re: [bug report] node: Add memory-side caching attributes

2021-04-01 Thread Jason Gunthorpe
On Thu, Apr 01, 2021 at 12:00:39PM +0300, Dan Carpenter wrote:
> Hi Keith,
> 
> I've been trying to figure out ways Smatch can check for device managed
> resources.  Like adding rules that if we call dev_set_name(>bar)
> then it's device managaged and if there is a kfree(foo) without calling
> device_put(>bar) then that's a resource leak.

It seems to be working from what I can see

Also I wasn't able to convince myself that any locking around
node->cache_attrs exist..

> Of course one of the rules is that if you call device_register(dev) then
> you can't kfree(dev), it has to released with device_put(dev) and that's
> true even if the register fails.  But this code here feels very
> intentional so maybe there is an exception to the rule?

There is no exception. Open coding this:

>282  free_name:
>283  kfree_const(dev->kobj.name);

To avoid leaking memory from dev_set_name is a straight up layering
violation, WTF?

node_cacheinfo_release() is just kfree(), so there is no need.
Instead (please feel free to send this Dan):

diff --git a/drivers/base/node.c b/drivers/base/node.c
index f449dbb2c74666..89c28952863977 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -319,25 +319,24 @@ void node_add_cache(unsigned int nid, struct 
node_cache_attrs *cache_attrs)
return;
 
dev = >dev;
+   device_initialize(dev)
dev->parent = node->cache_dev;
dev->release = node_cacheinfo_release;
dev->groups = cache_groups;
if (dev_set_name(dev, "index%d", cache_attrs->level))
-   goto free_cache;
+   goto put_device;
 
info->cache_attrs = *cache_attrs;
-   if (device_register(dev)) {
+   if (device_add(dev)) {
dev_warn(>dev, "failed to add cache level:%d\n",
 cache_attrs->level);
-   goto free_name;
+   goto put_device
}
pm_runtime_no_callbacks(dev);
list_add_tail(>node, >cache_attrs);
return;
-free_name:
-   kfree_const(dev->kobj.name);
-free_cache:
-   kfree(info);
+put_device:
+   put_device(dev);
 }
 
 static void node_remove_caches(struct node *node)


[bug report] node: Add memory-side caching attributes

2021-04-01 Thread Dan Carpenter
Hi Keith,

I've been trying to figure out ways Smatch can check for device managed
resources.  Like adding rules that if we call dev_set_name(>bar)
then it's device managaged and if there is a kfree(foo) without calling
device_put(>bar) then that's a resource leak.

Of course one of the rules is that if you call device_register(dev) then
you can't kfree(dev), it has to released with device_put(dev) and that's
true even if the register fails.  But this code here feels very
intentional so maybe there is an exception to the rule?

The patch acc02a109b04: "node: Add memory-side caching attributes"
from Mar 11, 2019, leads to the following static checker warning:

drivers/base/node.c:285 node_init_cache_dev()
error: kfree after device_register(): 'dev'

drivers/base/node.c
   263  static void node_init_cache_dev(struct node *node)
   264  {
   265  struct device *dev;
   266  
   267  dev = kzalloc(sizeof(*dev), GFP_KERNEL);
   268  if (!dev)
   269  return;
   270  
   271  dev->parent = >dev;
   272  dev->release = node_cache_release;
   273  if (dev_set_name(dev, "memory_side_cache"))
   274  goto free_dev;
   275  
   276  if (device_register(dev))
^^^
   277  goto free_name;
   278  
   279  pm_runtime_no_callbacks(dev);
   280  node->cache_dev = dev;
   281  return;
   282  free_name:
   283  kfree_const(dev->kobj.name);
   284  free_dev:
   285  kfree(dev);
^^
   286  }

regards,
dan carpenter