Re: [PATCH v2 2/2] drivers: core: Remove glue dirs from sysfs earlier

2018-07-10 Thread Benjamin Herrenschmidt
On Tue, 2018-07-10 at 16:55 +0200, Greg Kroah-Hartman wrote:
> 
> > +/**
> > + * kobject_has_children - Returns whether a kobject has children.
> > + * @kobj: the object to test
> > + *
> > + * This will return whether a kobject has other kobjects as children.
> > + *
> > + * It does NOT account for the presence of attribute files, only sub
> > + * directories. It also assumes there is no concurrent addition or
> > + * removal of such children, and thus relies on external locking.
> > + */
> > +static inline bool kobject_has_children(struct kobject *kobj)
> > +{
> > +	WARN_ON_ONCE(kref_read(&kobj->kref) == 0);
> 
> Why warn on?  Who is going to hit this and how are you going to fix up
> the syzbot reports?  :)

Well, that's it, the hope is nobody ever hits it ... but if one does it
would be useful to get a backtrace to figure it out. You can shoot the
reports my way I suppose :-)

> Anyway, this looks good, I can just take this and not the 1/2 patch now,
> right?  I really didn't like that patch.

Yes, it will fix the practical problem. As for patch 1, it's rather
funny, you and Linus seem to have a completely opposite idea of how
this stuff should work :-)

Cheers,
Ben.

> thanks,
> 
> greg k-h
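
For readers following the WARN_ON_ONCE() exchange above, here is a minimal userspace sketch of the "warn at most once, but still report the condition" behaviour being discussed. It is not the kernel macro: toy_warn_once(), toy_warn_count and the explicit `warned` flag are hypothetical names made up for illustration, and the kernel version keeps its per-call-site state internally and emits a backtrace rather than a plain message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical counter so callers can observe how many warnings fired. */
static int toy_warn_count;

/*
 * Toy model of a warn-once check: the condition is always returned,
 * but the diagnostic is printed only the first time it is true for a
 * given flag. The caller supplies the flag; this is an assumption of
 * the sketch, not how the kernel macro is invoked.
 */
static bool toy_warn_once(bool cond, bool *warned, const char *expr)
{
	assert(warned != NULL);

	if (cond && !*warned) {
		*warned = true;
		toy_warn_count++;
		fprintf(stderr, "WARNING: %s\n", expr);
	}
	return cond;
}
```

Calling toy_warn_once() repeatedly with a true condition and the same flag prints a single message, which is the property that keeps a check like this from flooding the log if it is ever hit.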


Re: [PATCH v2 2/2] drivers: core: Remove glue dirs from sysfs earlier

2018-07-10 Thread Greg Kroah-Hartman
On Tue, Jul 10, 2018 at 10:29:10AM +1000, Benjamin Herrenschmidt wrote:
> For devices with a class, we create a "glue" directory between
> the parent device and the new device with the class name.
> 
> This directory is never "explicitely" removed when empty however,
> this is left to the implicit sysfs removal done by kobject_release()
> when the object loses its last reference via kobject_put().
> 
> This is problematic because as long as it's not been removed from
> sysfs, it is still present in the class kset and in sysfs directory
> structure.
> 
> The presence in the class kset exposes a use after free bug fixed
> by the previous patch, but the presence in sysfs means that until
> the kobject is released, which can take a while (especially with
> kobject debugging), any attempt at re-creating it, such as binding a
> new device for that class/parent pair, will result in a sysfs
> duplicate file name error.
> 
> This fixes it by instead doing an explicit kobject_del() when
> the glue dir is empty, by keeping track of the number of
> child devices of the gluedir.
> 
> This is made easy by the fact that all glue dir operations are
> done with a global mutex, and there's already a function
> (cleanup_glue_dir) called in all the right places taking that
> mutex that can be enhanced for this. It appears that this was
> in fact the intent of the function, but the implementation was
> wrong.
> 
> Signed-off-by: Benjamin Herrenschmidt 
> ---
>  drivers/base/core.c     |  2 ++
>  include/linux/kobject.h | 17 +++++++++++++++++
>  2 files changed, 19 insertions(+)
> 
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index e9eff2099896..93c0f8d1a447 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -1572,6 +1572,8 @@ static void cleanup_glue_dir(struct device *dev, struct kobject *glue_dir)
>  		return;
>  
>  	mutex_lock(&gdp_mutex);
> +	if (!kobject_has_children(glue_dir))
> +		kobject_del(glue_dir);
>  	kobject_put(glue_dir);
>  	mutex_unlock(&gdp_mutex);
>  }
> diff --git a/include/linux/kobject.h b/include/linux/kobject.h
> index 7f6f93c3df9c..270b40515e79 100644
> --- a/include/linux/kobject.h
> +++ b/include/linux/kobject.h
> @@ -116,6 +116,23 @@ extern void kobject_put(struct kobject *kobj);
>  extern const void *kobject_namespace(struct kobject *kobj);
>  extern char *kobject_get_path(struct kobject *kobj, gfp_t flag);
>  
> +/**
> + * kobject_has_children - Returns whether a kobject has children.
> + * @kobj: the object to test
> + *
> + * This will return whether a kobject has other kobjects as children.
> + *
> + * It does NOT account for the presence of attribute files, only sub
> + * directories. It also assumes there is no concurrent addition or
> + * removal of such children, and thus relies on external locking.
> + */
> +static inline bool kobject_has_children(struct kobject *kobj)
> +{
> +	WARN_ON_ONCE(kref_read(&kobj->kref) == 0);

Why warn on?  Who is going to hit this and how are you going to fix up
the syzbot reports?  :)

Anyway, this looks good, I can just take this and not the 1/2 patch now,
right?  I really didn't like that patch.

thanks,

greg k-h


Re: [PATCH v2 2/2] drivers: core: Remove glue dirs from sysfs earlier

2018-07-09 Thread Benjamin Herrenschmidt
On Mon, 2018-07-09 at 17:33 -0700, Linus Torvalds wrote:
> On Mon, Jul 9, 2018 at 5:29 PM Benjamin Herrenschmidt
>  wrote:
> > 
> > For devices with a class, we create a "glue" directory between
> > the parent device and the new device with the class name.
> > 
> > This directory is never "explicitely" removed when empty however,
> 
> explicitly
> 
> Is the mis-spelling why you had the quotes? I do find that spelling in
> the kernel, but not in drivers/base/.

No idea :-) Just my poor English. I'm not sure why I put quotes; I
think I meant *explicitly* as emphasis, not sure.

> > This fixes it by instead doing an explicit kobject_del() when
> > the glue dir is empty, by keeping track of the number of
> > child devices of the gluedir.
> 
> Ack. This looks good to me.
> 
> I didn't see your 1/2 - you should probably re-send that one too so
> that Greg doesn't have to fish for it. But I'll Ack that one too in
> this same email regardless.

OK, I didn't re-send it. Greg just nak'ed it though :-)

Cheers,
Ben.


Re: [PATCH v2 2/2] drivers: core: Remove glue dirs from sysfs earlier

2018-07-09 Thread Linus Torvalds
On Mon, Jul 9, 2018 at 5:29 PM Benjamin Herrenschmidt
 wrote:
>
> For devices with a class, we create a "glue" directory between
> the parent device and the new device with the class name.
>
> This directory is never "explicitely" removed when empty however,

explicitly

Is the mis-spelling why you had the quotes? I do find that spelling in
the kernel, but not in drivers/base/.

> This fixes it by instead doing an explicit kobject_del() when
> the glue dir is empty, by keeping track of the number of
> child devices of the gluedir.

Ack. This looks good to me.

I didn't see your 1/2 - you should probably re-send that one too so
that Greg doesn't have to fish for it. But I'll Ack that one too in
this same email regardless.

Linus


[PATCH v2 2/2] drivers: core: Remove glue dirs from sysfs earlier

2018-07-09 Thread Benjamin Herrenschmidt
For devices with a class, we create a "glue" directory between
the parent device and the new device with the class name.

This directory is never "explicitely" removed when empty however,
this is left to the implicit sysfs removal done by kobject_release()
when the object loses its last reference via kobject_put().

This is problematic because as long as it's not been removed from
sysfs, it is still present in the class kset and in sysfs directory
structure.

The presence in the class kset exposes a use after free bug fixed
by the previous patch, but the presence in sysfs means that until
the kobject is released, which can take a while (especially with
kobject debugging), any attempt at re-creating it, such as binding a
new device for that class/parent pair, will result in a sysfs
duplicate file name error.

This fixes it by instead doing an explicit kobject_del() when
the glue dir is empty, by keeping track of the number of
child devices of the gluedir.

This is made easy by the fact that all glue dir operations are
done with a global mutex, and there's already a function
(cleanup_glue_dir) called in all the right places taking that
mutex that can be enhanced for this. It appears that this was
in fact the intent of the function, but the implementation was
wrong.

Signed-off-by: Benjamin Herrenschmidt 
---
 drivers/base/core.c     |  2 ++
 include/linux/kobject.h | 17 +++++++++++++++++
 2 files changed, 19 insertions(+)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index e9eff2099896..93c0f8d1a447 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1572,6 +1572,8 @@ static void cleanup_glue_dir(struct device *dev, struct kobject *glue_dir)
 		return;
 
 	mutex_lock(&gdp_mutex);
+	if (!kobject_has_children(glue_dir))
+		kobject_del(glue_dir);
 	kobject_put(glue_dir);
 	mutex_unlock(&gdp_mutex);
 }
diff --git a/include/linux/kobject.h b/include/linux/kobject.h
index 7f6f93c3df9c..270b40515e79 100644
--- a/include/linux/kobject.h
+++ b/include/linux/kobject.h
@@ -116,6 +116,23 @@ extern void kobject_put(struct kobject *kobj);
 extern const void *kobject_namespace(struct kobject *kobj);
 extern char *kobject_get_path(struct kobject *kobj, gfp_t flag);
 
+/**
+ * kobject_has_children - Returns whether a kobject has children.
+ * @kobj: the object to test
+ *
+ * This will return whether a kobject has other kobjects as children.
+ *
+ * It does NOT account for the presence of attribute files, only sub
+ * directories. It also assumes there is no concurrent addition or
+ * removal of such children, and thus relies on external locking.
+ */
+static inline bool kobject_has_children(struct kobject *kobj)
+{
+	WARN_ON_ONCE(kref_read(&kobj->kref) == 0);
+
+	return kobj->sd && kobj->sd->dir.subdirs;
+}
+
 struct kobj_type {
 	void (*release)(struct kobject *kobj);
 	const struct sysfs_ops *sysfs_ops;
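
To make the effect of the patch above easier to see outside the kernel, here is a rough userspace model of the patched cleanup_glue_dir() path. Everything prefixed toy_ is a hypothetical stand-in invented for this sketch: struct toy_kobj is not struct kobject, the refcount is a plain int rather than a kref, and the gdp_mutex is assumed to be held by the caller and is not modelled at all. It only illustrates the ordering: the sysfs entry is removed as soon as the glue dir has no sub-directories, even if other references keep the object alive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a kobject; not the kernel struct. */
struct toy_kobj {
	int refcount;		/* models kobj->kref */
	bool in_sysfs;		/* models kobj->sd being non-NULL */
	unsigned int subdirs;	/* models kobj->sd->dir.subdirs */
};

/* Mirrors kobject_has_children(): only sub-directories are counted. */
static bool toy_has_children(const struct toy_kobj *k)
{
	assert(k->refcount > 0);	/* stands in for the WARN_ON_ONCE() */
	return k->in_sysfs && k->subdirs;
}

/* Models kobject_del(): drop the sysfs entry, keep the object alive. */
static void toy_del(struct toy_kobj *k)
{
	k->in_sysfs = false;
	k->subdirs = 0;
}

/* Models kobject_put(); in this toy, the final put also drops sysfs. */
static void toy_put(struct toy_kobj *k)
{
	assert(k->refcount > 0);
	if (--k->refcount == 0)
		toy_del(k);
}

/* Models the patched cleanup_glue_dir(); caller is assumed to hold the lock. */
static void toy_cleanup_glue_dir(struct toy_kobj *glue)
{
	if (!toy_has_children(glue))
		toy_del(glue);
	toy_put(glue);
}
```

In this model, a glue dir with no sub-directories and one lingering extra reference ends up with in_sysfs false after toy_cleanup_glue_dir(), so the name is free for reuse before the last reference goes away. A glue dir that still has a sub-directory only loses one reference and stays visible.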