Re: [Intel-gfx] [PATCH 7/9] drm/i915/gt: Fix memory leaks in per-gt sysfs

2022-05-02 Thread Dixit, Ashutosh
On Sun, 01 May 2022 23:22:02 -0700, Andrzej Hajda wrote:
> On 29.04.2022 06:25, Dixit, Ashutosh wrote:
> > On Thu, 28 Apr 2022 07:36:14 -0700, Andrzej Hajda wrote:
> >> See [1], it is quite old, so maybe it is not valid anymore, but I see no
> >> code proving sth has changed.
> > Hi Andrzej,
> >
> > A lot has changed since that article from 2003 (for 2.5 kernel). For
> > instance there is kernfs (as I mention above):
> >
> > https://lwn.net/Articles/571590/
> >
> > A process having a sysfs file open today in my view will result in the
> > following:
> > * It will take a reference on kernfs_node (not on kobject as was the case
> >in kernel 2.5 in [1])
> > * An open file will prevent the module from being unloaded (not the kernel
> >crashing as in 2.5 in [1])
>
> Thats nice, but kernfs_node->priv still points to kobject so their
> lifetimes are bounded.

Yes, of course there has to be some connection between the kernfs and kobject.

> > So this is what I would expect with today's kernel. I am not seeing
> > anything we've done here which violates anything in [1] or [2].
> >> Also current doc says also [2] similar things, especially:
> >> "Once you registered your kobject via kobject_add(), you must never use
> >> kfree() to free it directly"
> > Correct, we are using kobject_put(), not kfree'ing the kobject.
>
> That I wouldn't agree. kobject_put is called, then the object in which
> kobject is embedded is kfree'd somewhere later on driver removal, without
> awareness of this kobject.  According to your analysis it should have 0
> refs, but this is analysis of the current code, even if it is true now it
> could change in the future.

Yes but we cannot anticipate all changes which can happen in the future,
(though we should handle and make any changes which we can anticipate at
present). This is also basically what the kernel philosophy is, don't make
unnecessary generalizations and try to handle unforseen situations which
can happen in the future.

Let me add some explanations about the patch before addressing your next
point.

1. We are adding 'struct kobject sysfs_gt' to 'struct intel_gt'. We are
   adding the kobject directly, not pointer to kobject. This allows us to
   "reach" 'struct intel_gt' from the kobject using a simple container_of:
   see kobj_to_gt().

2. Because the kobject is not kmalloc'd it cannot be kfree'd so the release
   method has to be empty (or NULL). 'struct intel_gt' is kmalloc'd
   separately elsewhere and memory for the kobject will be freed as part of
   intel_gt.

3. To provide a NULL or empty release method we need to provide a 'struct
   kobj_type kobj_gt_type' associated with the sysfs_gt kobject. This works
   nicely because we were anyway need one for .default_groups (we may add
   other attributes to 'id_groups' in the future). Note that the kobject is
   initialized and added to sysfs using kobject_init_and_add().

4. The only reason for providing an empty release method rather than a NULL
   release method is the following pr_debug in kobject_cleanup():

   if (t && !t->release)
pr_debug("kobject: '%s' (%p): does not have a release() 
function, it is broken and must be fixed. See 
Documentation/core-api/kobject.rst.\n",
 kobject_name(kobj), kobj);

  This statement could possibly be removed because the release method is
  not needed in the case I just described above, maybe I'll send a patch to
  suggest removing it. Though I think what they will say is that since NULL
  release methods are uncommon maybe just provide an empty release method
  when you need a NULL release method (which is what I have done in the
  patch).

  Also note that, as described below, there are several other cases in the
  kernel which either have NULL or an empty release methods. See below.

> And IMO it is against docs[2]:
> - "One important point cannot be overstated: every kobject must have a
> release() method, and the kobject must persist (in a consistent state)
> until that method is called. If these constraints are not met, the code is
> flawed." - empty release method means clearly it is against the docs.
> -"The end result is that a structure protected by a kobject cannot be freed
> before its reference count goes to zero. The reference count is not under
> the direct control of the code which created the kobject.".
>
> So either docs and part of kobject code were not updated to reflect
> changes you are assuming, either your assumption is incorrect.

In my view the doc is a general introduction to kobjects and simplifies
things. As shown below there are numerous examples in the kernel of both
NULL and empty release methods. I just went with the empty method because
of the reason mentioned above.

> Looking at other users of kobject it seems they follow docs, their
> release method either frees memory directly either kref_put on containing
> struct, it was just quick scan so I could overlooked sth.
>
> >> [1]: 

Re: [Intel-gfx] [PATCH 7/9] drm/i915/gt: Fix memory leaks in per-gt sysfs

2022-05-02 Thread Andrzej Hajda




On 29.04.2022 06:25, Dixit, Ashutosh wrote:

On Thu, 28 Apr 2022 07:36:14 -0700, Andrzej Hajda wrote:

On 27.04.2022 22:46, Dixit, Ashutosh wrote:

On Sun, 24 Apr 2022 15:36:23 -0700, Andi Shyti wrote:

Hi Andrzej and Ashutosh,


b/drivers/gpu/drm/i915/gt/intel_gt_types.h
index 937b2e1a305e..4c72b4f983a6 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
@@ -222,6 +222,9 @@ struct intel_gt {
} mocs;
struct intel_pxp pxp;
+
+   /* gt/gtN sysfs */
+   struct kobject sysfs_gtn;

If you put kobject as a part of intel_gt what assures you that lifetime of
kobject is shorter than intel_gt? Ie its refcounter is 0 on removal of
intel_gt?

Because we are explicitly doing a kobject_put() in
intel_gt_sysfs_unregister(). Which is exactly what we are *not* doing in
the previous code.

Let me explain a bit about the previous code (but feel free to skip since
the patch should speak for itself):
* Previously we kzalloc a 'struct kobj_gt'
* But we don't save a pointer to the 'struct kobj_gt' so we don't have the
 pointer to the kobject to be able to do a kobject_put() on it later
* Therefore we need to store the pointer in 'struct intel_gt'
* But if we have to put the pointer in 'struct intel_gt' we might as well
 put the kobject as part of 'struct intel_gt' and that also removes the
 need to have a 'struct kobj_gt' (kobj_to_gt() can just use container_of()
 to get gt from kobj).
* So I think this patch simpler/cleaner than the original code if you take
 the requirement for kobject_put() into account.

This is my oversight. This was something I completely forgot to
fix but it was my intention to do and actually I had some fixes
ongoing. But because this patch took too long to get in I
completely forgot about it (Sujaritha was actually the first who
pointed this out).

Thanks, Ashutosh for taking this.


I fully agree that previous code is incorrect but I am not convinced current
code is correct.
If some objects are kref-counted it means usually they can have multiple
concurrent users and kobject_put does not work as traditional
destructor/cleanup/unregister.
So in this particular case after calling kobject_init_and_add sysfs core can
get multiple references on the object. Later, during driver unregistration
kobject_put is called, but if the object is still in use by sysfs core, the
object will not be destroyed/released. If the driver unregistration
continues memory will be freed, leaving sysfs-core (or other users) with
dangling pointers. Unless there is some additional synchronization mechanism
I am not aware of.

Thanks Andrzej for summarizing this and what you said is actually
what happens. I had a similar solution developed and I had wrong
pointer reference happening.

Hi Andrzej/Andi,

I did do some research into kobject's and such before writing this patch
and based on that I believe the patch is correct. Presenting some evidence
below.

The patch is verified by:

a. Putting a printk in the release() method when it exists (it does for
 sysfs_gtn kobject)
b. Enabling dynamic prints for lib/kobject.c

For example, with the following:

# echo 'file kobject.c +p' > /sys/kernel/debug/dynamic_debug/control
# echo -n ":03:00.0" > /sys/bus/pci/drivers/i915/unbind

We see this in dmesg (see kobject_cleanup() called from kobject_put()):

[ 1034.930007] kobject: '.defaults' (88817130a640): kobject_cleanup, parent 
8882262b5778
[ 1034.930020] kobject: '.defaults' (88817130a640): auto cleanup kobject_del
[ 1034.930336] kobject: '.defaults' (88817130a640): calling ktype release
[ 1034.930340] kobject: (88817130a640): dynamic_kobj_release
[ 1034.930354] kobject: '.defaults': free name
[ 1034.930366] kobject: 'gt0' (8882262b5778): kobject_cleanup, parent 
88817130a240
[ 1034.930371] kobject: 'gt0' (8882262b5778): auto cleanup kobject_del
[ 1034.931930] kobject: 'gt0' (8882262b5778): calling ktype release
[ 1034.931936] kobject: 'gt0': free name
[ 1034.958004] kobject: 'i915__03_00.0' (88810e1f8800): fill_kobj_path: 
path = '/devices/i915__03_00.0'
[ 1034.958155] kobject: 'i915__03_00.0' (88810e1f8800): 
kobject_cleanup, parent 
[ 1034.958162] kobject: 'i915__03_00.0' (88810e1f8800): calling ktype 
release
[ 1034.958188] kobject: 'i915__03_00.0': free name
[ 1034.958729] kobject: 'gt' (88817130a240): kobject_cleanup, parent 
8881160c5000
[ 1034.958736] kobject: 'gt' (88817130a240): auto cleanup kobject_del
[ 1034.958762] kobject: 'gt' (88817130a240): calling ktype release
[ 1034.958767] kobject: (88817130a240): dynamic_kobj_release
[ 1034.958778] kobject: 'gt': free name

We have the following directory structure (one of the patches is creating
/sys/class/drm/card0/gt/gt0/.defaults):

/sys/class/drm/card0/gt
 |-gt0
|-.defaults

And we see from dmesg 

Re: [Intel-gfx] [PATCH 7/9] drm/i915/gt: Fix memory leaks in per-gt sysfs

2022-04-28 Thread Dixit, Ashutosh
On Thu, 28 Apr 2022 07:36:14 -0700, Andrzej Hajda wrote:
> On 27.04.2022 22:46, Dixit, Ashutosh wrote:
> > On Sun, 24 Apr 2022 15:36:23 -0700, Andi Shyti wrote:
> >> Hi Andrzej and Ashutosh,
> >>
> >> b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> >> index 937b2e1a305e..4c72b4f983a6 100644
> >> --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
> >> +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> >> @@ -222,6 +222,9 @@ struct intel_gt {
> >>} mocs;
> >>struct intel_pxp pxp;
> >> +
> >> +  /* gt/gtN sysfs */
> >> +  struct kobject sysfs_gtn;
> > If you put kobject as a part of intel_gt what assures you that lifetime 
> > of
> > kobject is shorter than intel_gt? Ie its refcounter is 0 on removal of
> > intel_gt?
>  Because we are explicitly doing a kobject_put() in
>  intel_gt_sysfs_unregister(). Which is exactly what we are *not* doing in
>  the previous code.
> 
>  Let me explain a bit about the previous code (but feel free to skip since
>  the patch should speak for itself):
>  * Previously we kzalloc a 'struct kobj_gt'
>  * But we don't save a pointer to the 'struct kobj_gt' so we don't have 
>  the
>  pointer to the kobject to be able to do a kobject_put() on it later
>  * Therefore we need to store the pointer in 'struct intel_gt'
>  * But if we have to put the pointer in 'struct intel_gt' we might as well
>  put the kobject as part of 'struct intel_gt' and that also removes 
>  the
>  need to have a 'struct kobj_gt' (kobj_to_gt() can just use 
>  container_of()
>  to get gt from kobj).
>  * So I think this patch simpler/cleaner than the original code if you 
>  take
>  the requirement for kobject_put() into account.
> >> This is my oversight. This was something I completely forgot to
> >> fix but it was my intention to do and actually I had some fixes
> >> ongoing. But because this patch took too long to get in I
> >> completely forgot about it (Sujaritha was actually the first who
> >> pointed this out).
> >>
> >> Thanks, Ashutosh for taking this.
> >>
> >>> I fully agree that previous code is incorrect but I am not convinced 
> >>> current
> >>> code is correct.
> >>> If some objects are kref-counted it means usually they can have multiple
> >>> concurrent users and kobject_put does not work as traditional
> >>> destructor/cleanup/unregister.
> >>> So in this particular case after calling kobject_init_and_add sysfs core 
> >>> can
> >>> get multiple references on the object. Later, during driver unregistration
> >>> kobject_put is called, but if the object is still in use by sysfs core, 
> >>> the
> >>> object will not be destroyed/released. If the driver unregistration
> >>> continues memory will be freed, leaving sysfs-core (or other users) with
> >>> dangling pointers. Unless there is some additional synchronization 
> >>> mechanism
> >>> I am not aware of.
> >> Thanks Andrzej for summarizing this and what you said is actually
> >> what happens. I had a similar solution developed and I had wrong
> >> pointer reference happening.
> > Hi Andrzej/Andi,
> >
> > I did do some research into kobject's and such before writing this patch
> > and based on that I believe the patch is correct. Presenting some evidence
> > below.
> >
> > The patch is verified by:
> >
> > a. Putting a printk in the release() method when it exists (it does for
> > sysfs_gtn kobject)
> > b. Enabling dynamic prints for lib/kobject.c
> >
> > For example, with the following:
> >
> > # echo 'file kobject.c +p' > /sys/kernel/debug/dynamic_debug/control
> > # echo -n ":03:00.0" > /sys/bus/pci/drivers/i915/unbind
> >
> > We see this in dmesg (see kobject_cleanup() called from kobject_put()):
> >
> > [ 1034.930007] kobject: '.defaults' (88817130a640): kobject_cleanup, 
> > parent 8882262b5778
> > [ 1034.930020] kobject: '.defaults' (88817130a640): auto cleanup 
> > kobject_del
> > [ 1034.930336] kobject: '.defaults' (88817130a640): calling ktype 
> > release
> > [ 1034.930340] kobject: (88817130a640): dynamic_kobj_release
> > [ 1034.930354] kobject: '.defaults': free name
> > [ 1034.930366] kobject: 'gt0' (8882262b5778): kobject_cleanup, parent 
> > 88817130a240
> > [ 1034.930371] kobject: 'gt0' (8882262b5778): auto cleanup kobject_del
> > [ 1034.931930] kobject: 'gt0' (8882262b5778): calling ktype release
> > [ 1034.931936] kobject: 'gt0': free name
> > [ 1034.958004] kobject: 'i915__03_00.0' (88810e1f8800): 
> > fill_kobj_path: path = '/devices/i915__03_00.0'
> > [ 1034.958155] kobject: 'i915__03_00.0' (88810e1f8800): 
> > kobject_cleanup, parent 
> > [ 1034.958162] kobject: 'i915__03_00.0' (88810e1f8800): calling 
> > ktype release
> > [ 1034.958188] kobject: 'i915__03_00.0': free name
> > [ 1034.958729] kobject: 'gt' (88817130a240): kobject_cleanup, parent 

[Intel-gfx] [PATCH 7/9] drm/i915/gt: Fix memory leaks in per-gt sysfs

2022-04-28 Thread Ashutosh Dixit
All kmalloc'd kobjects need a kobject_put() to free memory. For example in
previous code, kobj_gt_release() never gets called. The requirement of
kobject_put() now results in a slightly different code organization.

v2: s/gtn/gt/ (Andi)

Cc: Andi Shyti 
Cc: Andrzej Hajda 
Fixes: b770bcfae9ad ("drm/i915/gt: create per-tile sysfs interface")
Signed-off-by: Ashutosh Dixit 
---
 drivers/gpu/drm/i915/gt/intel_gt.c   |  1 +
 drivers/gpu/drm/i915/gt/intel_gt_sysfs.c | 29 ++--
 drivers/gpu/drm/i915/gt/intel_gt_sysfs.h |  6 +
 drivers/gpu/drm/i915/gt/intel_gt_types.h |  3 +++
 drivers/gpu/drm/i915/i915_sysfs.c|  2 ++
 5 files changed, 19 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c 
b/drivers/gpu/drm/i915/gt/intel_gt.c
index 07cfe66dd0e8..c992ad12d129 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -786,6 +786,7 @@ void intel_gt_driver_unregister(struct intel_gt *gt)
 {
intel_wakeref_t wakeref;
 
+   intel_gt_sysfs_unregister(gt);
intel_rps_driver_unregister(>rps);
intel_gsc_fini(>gsc);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_sysfs.c 
b/drivers/gpu/drm/i915/gt/intel_gt_sysfs.c
index 8ec8bc660c8c..9e4ebf53379b 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_sysfs.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_sysfs.c
@@ -24,7 +24,7 @@ bool is_object_gt(struct kobject *kobj)
 
 static struct intel_gt *kobj_to_gt(struct kobject *kobj)
 {
-   return container_of(kobj, struct kobj_gt, base)->gt;
+   return container_of(kobj, struct intel_gt, sysfs_gt);
 }
 
 struct intel_gt *intel_gt_sysfs_get_drvdata(struct device *dev,
@@ -72,9 +72,9 @@ static struct attribute *id_attrs[] = {
 };
 ATTRIBUTE_GROUPS(id);
 
+/* A kobject needs a release() method even if it does nothing */
 static void kobj_gt_release(struct kobject *kobj)
 {
-   kfree(kobj);
 }
 
 static struct kobj_type kobj_gt_type = {
@@ -85,8 +85,6 @@ static struct kobj_type kobj_gt_type = {
 
 void intel_gt_sysfs_register(struct intel_gt *gt)
 {
-   struct kobj_gt *kg;
-
/*
 * We need to make things right with the
 * ABI compatibility. The files were originally
@@ -98,25 +96,22 @@ void intel_gt_sysfs_register(struct intel_gt *gt)
if (gt_is_root(gt))
intel_gt_sysfs_pm_init(gt, gt_get_parent_obj(gt));
 
-   kg = kzalloc(sizeof(*kg), GFP_KERNEL);
-   if (!kg)
+   /* init and xfer ownership to sysfs tree */
+   if (kobject_init_and_add(>sysfs_gt, _gt_type,
+gt->i915->sysfs_gt, "gt%d", gt->info.id))
goto exit_fail;
 
-   kobject_init(>base, _gt_type);
-   kg->gt = gt;
-
-   /* xfer ownership to sysfs tree */
-   if (kobject_add(>base, gt->i915->sysfs_gt, "gt%d", gt->info.id))
-   goto exit_kobj_put;
-
-   intel_gt_sysfs_pm_init(gt, >base);
+   intel_gt_sysfs_pm_init(gt, >sysfs_gt);
 
return;
 
-exit_kobj_put:
-   kobject_put(>base);
-
 exit_fail:
+   kobject_put(>sysfs_gt);
drm_warn(>i915->drm,
 "failed to initialize gt%d sysfs root\n", gt->info.id);
 }
+
+void intel_gt_sysfs_unregister(struct intel_gt *gt)
+{
+   kobject_put(>sysfs_gt);
+}
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_sysfs.h 
b/drivers/gpu/drm/i915/gt/intel_gt_sysfs.h
index 9471b26752cf..a99aa7e8b01a 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_sysfs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_sysfs.h
@@ -13,11 +13,6 @@
 
 struct intel_gt;
 
-struct kobj_gt {
-   struct kobject base;
-   struct intel_gt *gt;
-};
-
 bool is_object_gt(struct kobject *kobj);
 
 struct drm_i915_private *kobj_to_i915(struct kobject *kobj);
@@ -28,6 +23,7 @@ intel_gt_create_kobj(struct intel_gt *gt,
 const char *name);
 
 void intel_gt_sysfs_register(struct intel_gt *gt);
+void intel_gt_sysfs_unregister(struct intel_gt *gt);
 struct intel_gt *intel_gt_sysfs_get_drvdata(struct device *dev,
const char *name);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h 
b/drivers/gpu/drm/i915/gt/intel_gt_types.h
index b06611c1d4ad..edd7a3cf5f5f 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
@@ -224,6 +224,9 @@ struct intel_gt {
} mocs;
 
struct intel_pxp pxp;
+
+   /* gt/gtN sysfs */
+   struct kobject sysfs_gt;
 };
 
 enum intel_gt_scratch_field {
diff --git a/drivers/gpu/drm/i915/i915_sysfs.c 
b/drivers/gpu/drm/i915/i915_sysfs.c
index 8521daba212a..3f06106cdcf5 100644
--- a/drivers/gpu/drm/i915/i915_sysfs.c
+++ b/drivers/gpu/drm/i915/i915_sysfs.c
@@ -259,4 +259,6 @@ void i915_teardown_sysfs(struct drm_i915_private *dev_priv)
 
device_remove_bin_file(kdev,  _attrs_1);
device_remove_bin_file(kdev,  _attrs);
+
+   kobject_put(dev_priv->sysfs_gt);
 }
-- 
2.34.1



Re: [Intel-gfx] [PATCH 7/9] drm/i915/gt: Fix memory leaks in per-gt sysfs

2022-04-28 Thread Andrzej Hajda




On 27.04.2022 22:46, Dixit, Ashutosh wrote:

On Sun, 24 Apr 2022 15:36:23 -0700, Andi Shyti wrote:

Hi Andrzej and Ashutosh,


b/drivers/gpu/drm/i915/gt/intel_gt_types.h
index 937b2e1a305e..4c72b4f983a6 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
@@ -222,6 +222,9 @@ struct intel_gt {
} mocs;
struct intel_pxp pxp;
+
+   /* gt/gtN sysfs */
+   struct kobject sysfs_gtn;

If you put kobject as a part of intel_gt what assures you that lifetime of
kobject is shorter than intel_gt? Ie its refcounter is 0 on removal of
intel_gt?

Because we are explicitly doing a kobject_put() in
intel_gt_sysfs_unregister(). Which is exactly what we are *not* doing in
the previous code.

Let me explain a bit about the previous code (but feel free to skip since
the patch should speak for itself):
* Previously we kzalloc a 'struct kobj_gt'
* But we don't save a pointer to the 'struct kobj_gt' so we don't have the
pointer to the kobject to be able to do a kobject_put() on it later
* Therefore we need to store the pointer in 'struct intel_gt'
* But if we have to put the pointer in 'struct intel_gt' we might as well
put the kobject as part of 'struct intel_gt' and that also removes the
need to have a 'struct kobj_gt' (kobj_to_gt() can just use container_of()
to get gt from kobj).
* So I think this patch simpler/cleaner than the original code if you take
the requirement for kobject_put() into account.

This is my oversight. This was something I completely forgot to
fix but it was my intention to do and actually I had some fixes
ongoing. But because this patch took too long to get in I
completely forgot about it (Sujaritha was actually the first who
pointed this out).

Thanks, Ashutosh for taking this.


I fully agree that previous code is incorrect but I am not convinced current
code is correct.
If some objects are kref-counted it means usually they can have multiple
concurrent users and kobject_put does not work as traditional
destructor/cleanup/unregister.
So in this particular case after calling kobject_init_and_add sysfs core can
get multiple references on the object. Later, during driver unregistration
kobject_put is called, but if the object is still in use by sysfs core, the
object will not be destroyed/released. If the driver unregistration
continues memory will be freed, leaving sysfs-core (or other users) with
dangling pointers. Unless there is some additional synchronization mechanism
I am not aware of.

Thanks Andrzej for summarizing this and what you said is actually
what happens. I had a similar solution developed and I had wrong
pointer reference happening.

Hi Andrzej/Andi,

I did do some research into kobject's and such before writing this patch
and based on that I believe the patch is correct. Presenting some evidence
below.

The patch is verified by:

a. Putting a printk in the release() method when it exists (it does for
sysfs_gtn kobject)
b. Enabling dynamic prints for lib/kobject.c

For example, with the following:

# echo 'file kobject.c +p' > /sys/kernel/debug/dynamic_debug/control
# echo -n ":03:00.0" > /sys/bus/pci/drivers/i915/unbind

We see this in dmesg (see kobject_cleanup() called from kobject_put()):

[ 1034.930007] kobject: '.defaults' (88817130a640): kobject_cleanup, parent 
8882262b5778
[ 1034.930020] kobject: '.defaults' (88817130a640): auto cleanup kobject_del
[ 1034.930336] kobject: '.defaults' (88817130a640): calling ktype release
[ 1034.930340] kobject: (88817130a640): dynamic_kobj_release
[ 1034.930354] kobject: '.defaults': free name
[ 1034.930366] kobject: 'gt0' (8882262b5778): kobject_cleanup, parent 
88817130a240
[ 1034.930371] kobject: 'gt0' (8882262b5778): auto cleanup kobject_del
[ 1034.931930] kobject: 'gt0' (8882262b5778): calling ktype release
[ 1034.931936] kobject: 'gt0': free name
[ 1034.958004] kobject: 'i915__03_00.0' (88810e1f8800): fill_kobj_path: 
path = '/devices/i915__03_00.0'
[ 1034.958155] kobject: 'i915__03_00.0' (88810e1f8800): 
kobject_cleanup, parent 
[ 1034.958162] kobject: 'i915__03_00.0' (88810e1f8800): calling ktype 
release
[ 1034.958188] kobject: 'i915__03_00.0': free name
[ 1034.958729] kobject: 'gt' (88817130a240): kobject_cleanup, parent 
8881160c5000
[ 1034.958736] kobject: 'gt' (88817130a240): auto cleanup kobject_del
[ 1034.958762] kobject: 'gt' (88817130a240): calling ktype release
[ 1034.958767] kobject: (88817130a240): dynamic_kobj_release
[ 1034.958778] kobject: 'gt': free name

We have the following directory structure (one of the patches is creating
/sys/class/drm/card0/gt/gt0/.defaults):

   /sys/class/drm/card0/gt
|-gt0
   |-.defaults

And we see from dmesg .defaults, gt0 and gt kobjects being cleaned up in
that order.

Looking at lib/kobject.c there are several 

Re: [Intel-gfx] [PATCH 7/9] drm/i915/gt: Fix memory leaks in per-gt sysfs

2022-04-27 Thread Dixit, Ashutosh
On Wed, 27 Apr 2022 04:45:03 -0700, Andi Shyti wrote:
>
> Hi Ashutosh,

Hi Andi,

> > > > -static struct kobj_type kobj_gt_type = {
> > > > -   .release = kobj_gt_release,
> > > > +static struct kobj_type kobj_gtn_type = {
> > >
> > > what does it mean GTN? Or is it GTn? Please use just GT, gtn is
> > > confusing.
> > >
> > > Same for all the rest of the gtn's you have used below.
> >
> > I didn't like gtn either. But a sysfs_gt kobject is already part of 'struct
> > drm_i915_private' so I thought I'll put sysfs_gtn (for gt/gtN) in 'struct
> > intel_gt'. Otherwise browsing the code etc. gets confusing.
>
> we can even use 'gt_n' if the 'n' is really necessary.

I decided to just go with sysfs_gt in v2 as you had suggested. The total
number of instances of sysfs_gt are very few so it didn't seem too bad to
have the same member name in the two struct's.

Thanks.
--
Ashutosh


Re: [Intel-gfx] [PATCH 7/9] drm/i915/gt: Fix memory leaks in per-gt sysfs

2022-04-27 Thread Dixit, Ashutosh
On Sun, 24 Apr 2022 15:36:23 -0700, Andi Shyti wrote:
>
> Hi Andrzej and Ashutosh,
>
> > > > > b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> > > > > index 937b2e1a305e..4c72b4f983a6 100644
> > > > > --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
> > > > > +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> > > > > @@ -222,6 +222,9 @@ struct intel_gt {
> > > > >   } mocs;
> > > > >   struct intel_pxp pxp;
> > > > > +
> > > > > + /* gt/gtN sysfs */
> > > > > + struct kobject sysfs_gtn;
> > > > If you put kobject as a part of intel_gt what assures you that lifetime 
> > > > of
> > > > kobject is shorter than intel_gt? Ie its refcounter is 0 on removal of
> > > > intel_gt?
> > > Because we are explicitly doing a kobject_put() in
> > > intel_gt_sysfs_unregister(). Which is exactly what we are *not* doing in
> > > the previous code.
> > >
> > > Let me explain a bit about the previous code (but feel free to skip since
> > > the patch should speak for itself):
> > > * Previously we kzalloc a 'struct kobj_gt'
> > > * But we don't save a pointer to the 'struct kobj_gt' so we don't have the
> > >pointer to the kobject to be able to do a kobject_put() on it later
> > > * Therefore we need to store the pointer in 'struct intel_gt'
> > > * But if we have to put the pointer in 'struct intel_gt' we might as well
> > >put the kobject as part of 'struct intel_gt' and that also removes the
> > >need to have a 'struct kobj_gt' (kobj_to_gt() can just use 
> > > container_of()
> > >to get gt from kobj).
> > > * So I think this patch simpler/cleaner than the original code if you take
> > >the requirement for kobject_put() into account.
>
> This is my oversight. This was something I completely forgot to
> fix but it was my intention to do and actually I had some fixes
> ongoing. But because this patch took too long to get in I
> completely forgot about it (Sujaritha was actually the first who
> pointed this out).
>
> Thanks, Ashutosh for taking this.
>
> > I fully agree that previous code is incorrect but I am not convinced current
> > code is correct.
> > If some objects are kref-counted it means usually they can have multiple
> > concurrent users and kobject_put does not work as traditional
> > destructor/cleanup/unregister.
> > So in this particular case after calling kobject_init_and_add sysfs core can
> > get multiple references on the object. Later, during driver unregistration
> > kobject_put is called, but if the object is still in use by sysfs core, the
> > object will not be destroyed/released. If the driver unregistration
> > continues memory will be freed, leaving sysfs-core (or other users) with
> > dangling pointers. Unless there is some additional synchronization mechanism
> > I am not aware of.
>
> Thanks Andrzej for summarizing this and what you said is actually
> what happens. I had a similar solution developed and I had wrong
> pointer reference happening.

Hi Andrzej/Andi,

I did do some research into kobject's and such before writing this patch
and based on that I believe the patch is correct. Presenting some evidence
below.

The patch is verified by:

a. Putting a printk in the release() method when it exists (it does for
   sysfs_gtn kobject)
b. Enabling dynamic prints for lib/kobject.c

For example, with the following:

# echo 'file kobject.c +p' > /sys/kernel/debug/dynamic_debug/control
# echo -n ":03:00.0" > /sys/bus/pci/drivers/i915/unbind

We see this in dmesg (see kobject_cleanup() called from kobject_put()):

[ 1034.930007] kobject: '.defaults' (88817130a640): kobject_cleanup, parent 
8882262b5778
[ 1034.930020] kobject: '.defaults' (88817130a640): auto cleanup kobject_del
[ 1034.930336] kobject: '.defaults' (88817130a640): calling ktype release
[ 1034.930340] kobject: (88817130a640): dynamic_kobj_release
[ 1034.930354] kobject: '.defaults': free name
[ 1034.930366] kobject: 'gt0' (8882262b5778): kobject_cleanup, parent 
88817130a240
[ 1034.930371] kobject: 'gt0' (8882262b5778): auto cleanup kobject_del
[ 1034.931930] kobject: 'gt0' (8882262b5778): calling ktype release
[ 1034.931936] kobject: 'gt0': free name
[ 1034.958004] kobject: 'i915__03_00.0' (88810e1f8800): fill_kobj_path: 
path = '/devices/i915__03_00.0'
[ 1034.958155] kobject: 'i915__03_00.0' (88810e1f8800): 
kobject_cleanup, parent 
[ 1034.958162] kobject: 'i915__03_00.0' (88810e1f8800): calling ktype 
release
[ 1034.958188] kobject: 'i915__03_00.0': free name
[ 1034.958729] kobject: 'gt' (88817130a240): kobject_cleanup, parent 
8881160c5000
[ 1034.958736] kobject: 'gt' (88817130a240): auto cleanup kobject_del
[ 1034.958762] kobject: 'gt' (88817130a240): calling ktype release
[ 1034.958767] kobject: (88817130a240): dynamic_kobj_release
[ 1034.958778] kobject: 'gt': free name

We have the following directory structure (one of the patches is creating
/sys/class/drm/card0/gt/gt0/.defaults):

 

Re: [Intel-gfx] [PATCH 7/9] drm/i915/gt: Fix memory leaks in per-gt sysfs

2022-04-27 Thread Andi Shyti
Hi Ashutosh,

> > > -static struct kobj_type kobj_gt_type = {
> > > - .release = kobj_gt_release,
> > > +static struct kobj_type kobj_gtn_type = {
> >
> > what does it mean GTN? Or is it GTn? Please use just GT, gtn is
> > confusing.
> >
> > Same for all the rest of the gtn's you have used below.
> 
> I didn't like gtn either. But a sysfs_gt kobject is already part of 'struct
> drm_i915_private' so I thought I'll put sysfs_gtn (for gt/gtN) in 'struct
> intel_gt'. Otherwise browsing the code etc. gets confusing.

we can even use 'gt_n' if the 'n' is really necessary.

Andi


Re: [Intel-gfx] [PATCH 7/9] drm/i915/gt: Fix memory leaks in per-gt sysfs

2022-04-26 Thread Dixit, Ashutosh
On Sun, 24 Apr 2022 15:30:59 -0700, Andi Shyti wrote:
>
> Hi Ashutosh,
>

Hi Andi,

> [...]
>
> > -static struct kobj_type kobj_gt_type = {
> > -   .release = kobj_gt_release,
> > +static struct kobj_type kobj_gtn_type = {
>
> what does it mean GTN? Or is it GTn? Please use just GT, gtn is
> confusing.
>
> Same for all the rest of the gtn's you have used below.

I didn't like gtn either. But a sysfs_gt kobject is already part of 'struct
drm_i915_private' so I thought I'll put sysfs_gtn (for gt/gtN) in 'struct
intel_gt'. Otherwise browsing the code etc. gets confusing.

So that was the reason, but I think I'll change the name to 'kobj_gt' (from
the current 'sysfs_gtn') in the next rev. That seems better. Thoughts?

Will respond to your other comments after posting the next rev patches (now
need some rework because Jani doesn't want a dependence between gt and
display code in pcode functions).

Thanks.
--
Ashutosh


Re: [Intel-gfx] [PATCH 7/9] drm/i915/gt: Fix memory leaks in per-gt sysfs

2022-04-24 Thread Andi Shyti
Hi Andrzej and Ashutosh,

> > > > b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> > > > index 937b2e1a305e..4c72b4f983a6 100644
> > > > --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
> > > > +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> > > > @@ -222,6 +222,9 @@ struct intel_gt {
> > > > } mocs;
> > > > struct intel_pxp pxp;
> > > > +
> > > > +   /* gt/gtN sysfs */
> > > > +   struct kobject sysfs_gtn;
> > > If you put kobject as a part of intel_gt what assures you that lifetime of
> > > kobject is shorter than intel_gt? Ie its refcounter is 0 on removal of
> > > intel_gt?
> > Because we are explicitly doing a kobject_put() in
> > intel_gt_sysfs_unregister(). Which is exactly what we are *not* doing in
> > the previous code.
> > 
> > Let me explain a bit about the previous code (but feel free to skip since
> > the patch should speak for itself):
> > * Previously we kzalloc a 'struct kobj_gt'
> > * But we don't save a pointer to the 'struct kobj_gt' so we don't have the
> >pointer to the kobject to be able to do a kobject_put() on it later
> > * Therefore we need to store the pointer in 'struct intel_gt'
> > * But if we have to put the pointer in 'struct intel_gt' we might as well
> >put the kobject as part of 'struct intel_gt' and that also removes the
> >need to have a 'struct kobj_gt' (kobj_to_gt() can just use container_of()
> >to get gt from kobj).
> > * So I think this patch simpler/cleaner than the original code if you take
> >the requirement for kobject_put() into account.

This is my oversight. This was something I completely forgot to
fix but it was my intention to do and actually I had some fixes
ongoing. But because this patch took too long to get in I
completely forgot about it (Sujaritha was actually the first who
pointed this out).

Thanks, Ashutosh for taking this.

> I fully agree that previous code is incorrect but I am not convinced current
> code is correct.
> If some objects are kref-counted it means usually they can have multiple
> concurrent users and kobject_put does not work as traditional
> destructor/cleanup/unregister.
> So in this particular case after calling kobject_init_and_add sysfs core can
> get multiple references on the object. Later, during driver unregistration
> kobject_put is called, but if the object is still in use by sysfs core, the
> object will not be destroyed/released. If the driver unregistration
> continues memory will be freed, leaving sysfs-core (or other users) with
> dangling pointers. Unless there is some additional synchronization mechanism
> I am not aware of.

Thanks Andrzej for summarizing this and what you said is actually
what happens. I had a similar solution developed and I had wrong
pointer reference happening.

Thanks,
Andi


Re: [Intel-gfx] [PATCH 7/9] drm/i915/gt: Fix memory leaks in per-gt sysfs

2022-04-24 Thread Andi Shyti
Hi Ashutosh,

[...]

> -static struct kobj_type kobj_gt_type = {
> - .release = kobj_gt_release,
> +static struct kobj_type kobj_gtn_type = {

what does it mean GTN? Or is it GTn? Please use just GT, gtn is
confusing.

Same for all the rest of the gtn's you have used below.

Thanks,
Andi


Re: [Intel-gfx] [PATCH 7/9] drm/i915/gt: Fix memory leaks in per-gt sysfs

2022-04-20 Thread Andrzej Hajda




On 20.04.2022 18:12, Dixit, Ashutosh wrote:

On Wed, 20 Apr 2022 05:17:57 -0700, Andrzej Hajda wrote:

Hi Ashutosh,

Hi Andrzej,


On 20.04.2022 07:21, Ashutosh Dixit wrote:

All kmalloc'd kobjects need a kobject_put() to free memory. For example in
previous code, kobj_gt_release() never gets called. The requirement of
kobject_put() now results in a slightly different code organization.

Cc: Andi Shyti 
Cc: Andrzej Hajda 
Cc: Rodrigo Vivi 
Fixes: b770bcfae9ad ("drm/i915/gt: create per-tile sysfs interface")

/snip/


+void intel_gt_sysfs_unregister(struct intel_gt *gt)
+{
+   kobject_put(>sysfs_gtn);
+}
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_sysfs.h 
b/drivers/gpu/drm/i915/gt/intel_gt_sysfs.h
index 9471b26752cf..a99aa7e8b01a 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_sysfs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_sysfs.h
@@ -13,11 +13,6 @@
 struct intel_gt;
   -struct kobj_gt {
-   struct kobject base;
-   struct intel_gt *gt;
-};
-
   bool is_object_gt(struct kobject *kobj);
 struct drm_i915_private *kobj_to_i915(struct kobject *kobj);
@@ -28,6 +23,7 @@ intel_gt_create_kobj(struct intel_gt *gt,
 const char *name);
 void intel_gt_sysfs_register(struct intel_gt *gt);
+void intel_gt_sysfs_unregister(struct intel_gt *gt);
   struct intel_gt *intel_gt_sysfs_get_drvdata(struct device *dev,
const char *name);
   diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h
b/drivers/gpu/drm/i915/gt/intel_gt_types.h
index 937b2e1a305e..4c72b4f983a6 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
@@ -222,6 +222,9 @@ struct intel_gt {
} mocs;
struct intel_pxp pxp;
+
+   /* gt/gtN sysfs */
+   struct kobject sysfs_gtn;

If you put kobject as a part of intel_gt what assures you that lifetime of
kobject is shorter than intel_gt? Ie its refcounter is 0 on removal of
intel_gt?

Because we are explicitly doing a kobject_put() in
intel_gt_sysfs_unregister(). Which is exactly what we are *not* doing in
the previous code.

Let me explain a bit about the previous code (but feel free to skip since
the patch should speak for itself):
* Previously we kzalloc a 'struct kobj_gt'
* But we don't save a pointer to the 'struct kobj_gt' so we don't have the
   pointer to the kobject to be able to do a kobject_put() on it later
* Therefore we need to store the pointer in 'struct intel_gt'
* But if we have to put the pointer in 'struct intel_gt' we might as well
   put the kobject as part of 'struct intel_gt' and that also removes the
   need to have a 'struct kobj_gt' (kobj_to_gt() can just use container_of()
   to get gt from kobj).
* So I think this patch simpler/cleaner than the original code if you take
   the requirement for kobject_put() into account.


I fully agree that previous code is incorrect but I am not convinced 
current code is correct.
If some objects are kref-counted it means usually they can have multiple 
concurrent users and kobject_put does not work as traditional 
destructor/cleanup/unregister.
So in this particular case after calling kobject_init_and_add sysfs core 
can get multiple references on the object. Later, during driver 
unregistration kobject_put is called, but if the object is still in use 
by sysfs core, the object will not be destroyed/released. If the driver 
unregistration continues memory will be freed, leaving sysfs-core (or 
other users) with dangling pointers. Unless there is some additional 
synchronization mechanism I am not aware of.


Regards
Andrzej


Thanks.
--
Ashutosh




Re: [Intel-gfx] [PATCH 7/9] drm/i915/gt: Fix memory leaks in per-gt sysfs

2022-04-20 Thread Dixit, Ashutosh
On Wed, 20 Apr 2022 05:17:57 -0700, Andrzej Hajda wrote:
>
> Hi Ashutosh,

Hi Andrzej,

> On 20.04.2022 07:21, Ashutosh Dixit wrote:
> > All kmalloc'd kobjects need a kobject_put() to free memory. For example in
> > previous code, kobj_gt_release() never gets called. The requirement of
> > kobject_put() now results in a slightly different code organization.
> >
> > Cc: Andi Shyti 
> > Cc: Andrzej Hajda 
> > Cc: Rodrigo Vivi 
> > Fixes: b770bcfae9ad ("drm/i915/gt: create per-tile sysfs interface")

/snip/

> > +void intel_gt_sysfs_unregister(struct intel_gt *gt)
> > +{
> > +   kobject_put(>sysfs_gtn);
> > +}
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_sysfs.h 
> > b/drivers/gpu/drm/i915/gt/intel_gt_sysfs.h
> > index 9471b26752cf..a99aa7e8b01a 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt_sysfs.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt_sysfs.h
> > @@ -13,11 +13,6 @@
> > struct intel_gt;
> >   -struct kobj_gt {
> > -   struct kobject base;
> > -   struct intel_gt *gt;
> > -};
> > -
> >   bool is_object_gt(struct kobject *kobj);
> > struct drm_i915_private *kobj_to_i915(struct kobject *kobj);
> > @@ -28,6 +23,7 @@ intel_gt_create_kobj(struct intel_gt *gt,
> >  const char *name);
> > void intel_gt_sysfs_register(struct intel_gt *gt);
> > +void intel_gt_sysfs_unregister(struct intel_gt *gt);
> >   struct intel_gt *intel_gt_sysfs_get_drvdata(struct device *dev,
> > const char *name);
> >   diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h
> > b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> > index 937b2e1a305e..4c72b4f983a6 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> > @@ -222,6 +222,9 @@ struct intel_gt {
> > } mocs;
> > struct intel_pxp pxp;
> > +
> > +   /* gt/gtN sysfs */
> > +   struct kobject sysfs_gtn;
>
> If you put kobject as a part of intel_gt what assures you that lifetime of
> kobject is shorter than intel_gt? Ie its refcounter is 0 on removal of
> intel_gt?

Because we are explicitly doing a kobject_put() in
intel_gt_sysfs_unregister(). Which is exactly what we are *not* doing in
the previous code.

Let me explain a bit about the previous code (but feel free to skip since
the patch should speak for itself):
* Previously we kzalloc a 'struct kobj_gt'
* But we don't save a pointer to the 'struct kobj_gt' so we don't have the
  pointer to the kobject to be able to do a kobject_put() on it later
* Therefore we need to store the pointer in 'struct intel_gt'
* But if we have to put the pointer in 'struct intel_gt' we might as well
  put the kobject as part of 'struct intel_gt' and that also removes the
  need to have a 'struct kobj_gt' (kobj_to_gt() can just use container_of()
  to get gt from kobj).
* So I think this patch simpler/cleaner than the original code if you take
  the requirement for kobject_put() into account.

Thanks.
--
Ashutosh


Re: [Intel-gfx] [PATCH 7/9] drm/i915/gt: Fix memory leaks in per-gt sysfs

2022-04-20 Thread Dixit, Ashutosh
On Wed, 20 Apr 2022 05:17:57 -0700, Andrzej Hajda wrote:
>
> Hi Ashutosh,

Hi Andrzej,

> On 20.04.2022 07:21, Ashutosh Dixit wrote:
> > All kmalloc'd kobjects need a kobject_put() to free memory. For example in
> > previous code, kobj_gt_release() never gets called. The requirement of
> > kobject_put() now results in a slightly different code organization.
> >
> > Cc: Andi Shyti 
> > Cc: Andrzej Hajda 
> > Cc: Rodrigo Vivi 
> > Fixes: b770bcfae9ad ("drm/i915/gt: create per-tile sysfs interface")

/snip/

> > +void intel_gt_sysfs_unregister(struct intel_gt *gt)
> > +{
> > +   kobject_put(>sysfs_gtn);
> > +}
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_sysfs.h 
> > b/drivers/gpu/drm/i915/gt/intel_gt_sysfs.h
> > index 9471b26752cf..a99aa7e8b01a 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt_sysfs.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt_sysfs.h
> > @@ -13,11 +13,6 @@
> > struct intel_gt;
> >   -struct kobj_gt {
> > -   struct kobject base;
> > -   struct intel_gt *gt;
> > -};
> > -
> >   bool is_object_gt(struct kobject *kobj);
> > struct drm_i915_private *kobj_to_i915(struct kobject *kobj);
> > @@ -28,6 +23,7 @@ intel_gt_create_kobj(struct intel_gt *gt,
> >  const char *name);
> > void intel_gt_sysfs_register(struct intel_gt *gt);
> > +void intel_gt_sysfs_unregister(struct intel_gt *gt);
> >   struct intel_gt *intel_gt_sysfs_get_drvdata(struct device *dev,
> > const char *name);
> >   diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h
> > b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> > index 937b2e1a305e..4c72b4f983a6 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> > @@ -222,6 +222,9 @@ struct intel_gt {
> > } mocs;
> > struct intel_pxp pxp;
> > +
> > +   /* gt/gtN sysfs */
> > +   struct kobject sysfs_gtn;
>
> If you put kobject as a part of intel_gt what assures you that lifetime of
> kobject is shorter than intel_gt? Ie its refcounter is 0 on removal of
> intel_gt?

Because we are explicitly doing a kobject_put() in
intel_gt_sysfs_unregister(). Which is exactly what we are *not* doing in
the previous code.

Let me explain a bit about the previous code (but feel free to skip since
the patch should speak for itself):
* Previously we kzalloc a 'struct kobj_gt'
* But we don't save a pointer to the 'struct kobj_gt' so we don't have the
  pointer to the kobject to be able to do a kobject_put() on it later
* Therefore we need to store the pointer in 'struct intel_gt'
* But if we have to put the pointer in 'struct intel_gt' we might as well
  put the kobject as part of 'struct intel_gt' and that also removes the
  need to have a 'struct kobj_gt' (kobj_to_gt() can just use container_of()
  to get gt from kobj).
* So I think this patch simpler/cleaner than the original code if you take
  the requirement for kobject_put() into account.

Thanks.
--
Ashutosh


Re: [Intel-gfx] [PATCH 7/9] drm/i915/gt: Fix memory leaks in per-gt sysfs

2022-04-20 Thread Andrzej Hajda

Hi Ashutosh,

On 20.04.2022 07:21, Ashutosh Dixit wrote:

All kmalloc'd kobjects need a kobject_put() to free memory. For example in
previous code, kobj_gt_release() never gets called. The requirement of
kobject_put() now results in a slightly different code organization.

Cc: Andi Shyti 
Cc: Andrzej Hajda 
Cc: Rodrigo Vivi 
Fixes: b770bcfae9ad ("drm/i915/gt: create per-tile sysfs interface")
Signed-off-by: Ashutosh Dixit 
---
  drivers/gpu/drm/i915/gt/intel_gt.c   |  1 +
  drivers/gpu/drm/i915/gt/intel_gt_sysfs.c | 35 ++--
  drivers/gpu/drm/i915/gt/intel_gt_sysfs.h |  6 +---
  drivers/gpu/drm/i915/gt/intel_gt_types.h |  3 ++
  drivers/gpu/drm/i915/i915_sysfs.c|  2 ++
  5 files changed, 22 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c 
b/drivers/gpu/drm/i915/gt/intel_gt.c
index f0014c5072c9..f0c56ca12c0b 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -783,6 +783,7 @@ void intel_gt_driver_unregister(struct intel_gt *gt)
  {
intel_wakeref_t wakeref;
  
+	intel_gt_sysfs_unregister(gt);

intel_rps_driver_unregister(>rps);
  
  	intel_pxp_fini(>pxp);

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_sysfs.c 
b/drivers/gpu/drm/i915/gt/intel_gt_sysfs.c
index 8ec8bc660c8c..6f1b081ca5b7 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_sysfs.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_sysfs.c
@@ -24,7 +24,7 @@ bool is_object_gt(struct kobject *kobj)
  
  static struct intel_gt *kobj_to_gt(struct kobject *kobj)

  {
-   return container_of(kobj, struct kobj_gt, base)->gt;
+   return container_of(kobj, struct intel_gt, sysfs_gtn);
  }
  
  struct intel_gt *intel_gt_sysfs_get_drvdata(struct device *dev,

@@ -72,21 +72,19 @@ static struct attribute *id_attrs[] = {
  };
  ATTRIBUTE_GROUPS(id);
  
-static void kobj_gt_release(struct kobject *kobj)

+/* A kobject needs a release() method even if it does nothing */
+static void kobj_gtn_release(struct kobject *kobj)
  {
-   kfree(kobj);
  }
  
-static struct kobj_type kobj_gt_type = {

-   .release = kobj_gt_release,
+static struct kobj_type kobj_gtn_type = {
+   .release = kobj_gtn_release,
.sysfs_ops = _sysfs_ops,
.default_groups = id_groups,
  };
  
  void intel_gt_sysfs_register(struct intel_gt *gt)

  {
-   struct kobj_gt *kg;
-
/*
 * We need to make things right with the
 * ABI compatibility. The files were originally
@@ -98,25 +96,22 @@ void intel_gt_sysfs_register(struct intel_gt *gt)
if (gt_is_root(gt))
intel_gt_sysfs_pm_init(gt, gt_get_parent_obj(gt));
  
-	kg = kzalloc(sizeof(*kg), GFP_KERNEL);

-   if (!kg)
+   /* init and xfer ownership to sysfs tree */
+   if (kobject_init_and_add(>sysfs_gtn, _gtn_type,
+gt->i915->sysfs_gt, "gt%d", gt->info.id))
goto exit_fail;
  
-	kobject_init(>base, _gt_type);

-   kg->gt = gt;
-
-   /* xfer ownership to sysfs tree */
-   if (kobject_add(>base, gt->i915->sysfs_gt, "gt%d", gt->info.id))
-   goto exit_kobj_put;
-
-   intel_gt_sysfs_pm_init(gt, >base);
+   intel_gt_sysfs_pm_init(gt, >sysfs_gtn);
  
  	return;
  
-exit_kobj_put:

-   kobject_put(>base);
-
  exit_fail:
+   kobject_put(>sysfs_gtn);
drm_warn(>i915->drm,
 "failed to initialize gt%d sysfs root\n", gt->info.id);
  }
+
+void intel_gt_sysfs_unregister(struct intel_gt *gt)
+{
+   kobject_put(>sysfs_gtn);
+}
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_sysfs.h 
b/drivers/gpu/drm/i915/gt/intel_gt_sysfs.h
index 9471b26752cf..a99aa7e8b01a 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_sysfs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_sysfs.h
@@ -13,11 +13,6 @@
  
  struct intel_gt;
  
-struct kobj_gt {

-   struct kobject base;
-   struct intel_gt *gt;
-};
-
  bool is_object_gt(struct kobject *kobj);
  
  struct drm_i915_private *kobj_to_i915(struct kobject *kobj);

@@ -28,6 +23,7 @@ intel_gt_create_kobj(struct intel_gt *gt,
 const char *name);
  
  void intel_gt_sysfs_register(struct intel_gt *gt);

+void intel_gt_sysfs_unregister(struct intel_gt *gt);
  struct intel_gt *intel_gt_sysfs_get_drvdata(struct device *dev,
const char *name);
  
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h

index 937b2e1a305e..4c72b4f983a6 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
@@ -222,6 +222,9 @@ struct intel_gt {
} mocs;
  
  	struct intel_pxp pxp;

+
+   /* gt/gtN sysfs */
+   struct kobject sysfs_gtn;


If you put kobject as a part of intel_gt what assures you that lifetime 
of kobject is shorter than intel_gt? Ie its refcounter is 0 on removal 
of intel_gt?


Regards
Andrzej


  };
  
  enum intel_gt_scratch_field {

diff --git a/drivers/gpu/drm/i915/i915_sysfs.c 

[Intel-gfx] [PATCH 7/9] drm/i915/gt: Fix memory leaks in per-gt sysfs

2022-04-20 Thread Ashutosh Dixit
All kmalloc'd kobjects need a kobject_put() to free memory. For example in
previous code, kobj_gt_release() never gets called. The requirement of
kobject_put() now results in a slightly different code organization.

Cc: Andi Shyti 
Cc: Andrzej Hajda 
Cc: Rodrigo Vivi 
Fixes: b770bcfae9ad ("drm/i915/gt: create per-tile sysfs interface")
Signed-off-by: Ashutosh Dixit 
---
 drivers/gpu/drm/i915/gt/intel_gt.c   |  1 +
 drivers/gpu/drm/i915/gt/intel_gt_sysfs.c | 35 ++--
 drivers/gpu/drm/i915/gt/intel_gt_sysfs.h |  6 +---
 drivers/gpu/drm/i915/gt/intel_gt_types.h |  3 ++
 drivers/gpu/drm/i915/i915_sysfs.c|  2 ++
 5 files changed, 22 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c 
b/drivers/gpu/drm/i915/gt/intel_gt.c
index f0014c5072c9..f0c56ca12c0b 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -783,6 +783,7 @@ void intel_gt_driver_unregister(struct intel_gt *gt)
 {
intel_wakeref_t wakeref;
 
+   intel_gt_sysfs_unregister(gt);
intel_rps_driver_unregister(>rps);
 
intel_pxp_fini(>pxp);
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_sysfs.c 
b/drivers/gpu/drm/i915/gt/intel_gt_sysfs.c
index 8ec8bc660c8c..6f1b081ca5b7 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_sysfs.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_sysfs.c
@@ -24,7 +24,7 @@ bool is_object_gt(struct kobject *kobj)
 
 static struct intel_gt *kobj_to_gt(struct kobject *kobj)
 {
-   return container_of(kobj, struct kobj_gt, base)->gt;
+   return container_of(kobj, struct intel_gt, sysfs_gtn);
 }
 
 struct intel_gt *intel_gt_sysfs_get_drvdata(struct device *dev,
@@ -72,21 +72,19 @@ static struct attribute *id_attrs[] = {
 };
 ATTRIBUTE_GROUPS(id);
 
-static void kobj_gt_release(struct kobject *kobj)
+/* A kobject needs a release() method even if it does nothing */
+static void kobj_gtn_release(struct kobject *kobj)
 {
-   kfree(kobj);
 }
 
-static struct kobj_type kobj_gt_type = {
-   .release = kobj_gt_release,
+static struct kobj_type kobj_gtn_type = {
+   .release = kobj_gtn_release,
.sysfs_ops = _sysfs_ops,
.default_groups = id_groups,
 };
 
 void intel_gt_sysfs_register(struct intel_gt *gt)
 {
-   struct kobj_gt *kg;
-
/*
 * We need to make things right with the
 * ABI compatibility. The files were originally
@@ -98,25 +96,22 @@ void intel_gt_sysfs_register(struct intel_gt *gt)
if (gt_is_root(gt))
intel_gt_sysfs_pm_init(gt, gt_get_parent_obj(gt));
 
-   kg = kzalloc(sizeof(*kg), GFP_KERNEL);
-   if (!kg)
+   /* init and xfer ownership to sysfs tree */
+   if (kobject_init_and_add(>sysfs_gtn, _gtn_type,
+gt->i915->sysfs_gt, "gt%d", gt->info.id))
goto exit_fail;
 
-   kobject_init(>base, _gt_type);
-   kg->gt = gt;
-
-   /* xfer ownership to sysfs tree */
-   if (kobject_add(>base, gt->i915->sysfs_gt, "gt%d", gt->info.id))
-   goto exit_kobj_put;
-
-   intel_gt_sysfs_pm_init(gt, >base);
+   intel_gt_sysfs_pm_init(gt, >sysfs_gtn);
 
return;
 
-exit_kobj_put:
-   kobject_put(>base);
-
 exit_fail:
+   kobject_put(>sysfs_gtn);
drm_warn(>i915->drm,
 "failed to initialize gt%d sysfs root\n", gt->info.id);
 }
+
+void intel_gt_sysfs_unregister(struct intel_gt *gt)
+{
+   kobject_put(>sysfs_gtn);
+}
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_sysfs.h 
b/drivers/gpu/drm/i915/gt/intel_gt_sysfs.h
index 9471b26752cf..a99aa7e8b01a 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_sysfs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_sysfs.h
@@ -13,11 +13,6 @@
 
 struct intel_gt;
 
-struct kobj_gt {
-   struct kobject base;
-   struct intel_gt *gt;
-};
-
 bool is_object_gt(struct kobject *kobj);
 
 struct drm_i915_private *kobj_to_i915(struct kobject *kobj);
@@ -28,6 +23,7 @@ intel_gt_create_kobj(struct intel_gt *gt,
 const char *name);
 
 void intel_gt_sysfs_register(struct intel_gt *gt);
+void intel_gt_sysfs_unregister(struct intel_gt *gt);
 struct intel_gt *intel_gt_sysfs_get_drvdata(struct device *dev,
const char *name);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h 
b/drivers/gpu/drm/i915/gt/intel_gt_types.h
index 937b2e1a305e..4c72b4f983a6 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
@@ -222,6 +222,9 @@ struct intel_gt {
} mocs;
 
struct intel_pxp pxp;
+
+   /* gt/gtN sysfs */
+   struct kobject sysfs_gtn;
 };
 
 enum intel_gt_scratch_field {
diff --git a/drivers/gpu/drm/i915/i915_sysfs.c 
b/drivers/gpu/drm/i915/i915_sysfs.c
index 8521daba212a..3f06106cdcf5 100644
--- a/drivers/gpu/drm/i915/i915_sysfs.c
+++ b/drivers/gpu/drm/i915/i915_sysfs.c
@@ -259,4 +259,6 @@ void i915_teardown_sysfs(struct drm_i915_private *dev_priv)
 

[Intel-gfx] [PATCH 7/9] drm/i915/gt: Fix memory leaks in per-gt sysfs

2022-04-19 Thread Ashutosh Dixit
All kmalloc'd kobjects need a kobject_put() to free memory. For example in
previous code, kobj_gt_release() never gets called. The requirement of
kobject_put() now results in a slightly different code organization.

Cc: Andi Shyti 
Cc: Andrzej Hajda 
Cc: Rodrigo Vivi 
Fixes: b770bcfae9ad ("drm/i915/gt: create per-tile sysfs interface")
Signed-off-by: Ashutosh Dixit 
---
 drivers/gpu/drm/i915/gt/intel_gt.c   |  1 +
 drivers/gpu/drm/i915/gt/intel_gt_sysfs.c | 35 ++--
 drivers/gpu/drm/i915/gt/intel_gt_sysfs.h |  6 +---
 drivers/gpu/drm/i915/gt/intel_gt_types.h |  3 ++
 drivers/gpu/drm/i915/i915_sysfs.c|  2 ++
 5 files changed, 22 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c 
b/drivers/gpu/drm/i915/gt/intel_gt.c
index f0014c5072c9..f0c56ca12c0b 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -783,6 +783,7 @@ void intel_gt_driver_unregister(struct intel_gt *gt)
 {
intel_wakeref_t wakeref;
 
+   intel_gt_sysfs_unregister(gt);
intel_rps_driver_unregister(>rps);
 
intel_pxp_fini(>pxp);
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_sysfs.c 
b/drivers/gpu/drm/i915/gt/intel_gt_sysfs.c
index 8ec8bc660c8c..6f1b081ca5b7 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_sysfs.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_sysfs.c
@@ -24,7 +24,7 @@ bool is_object_gt(struct kobject *kobj)
 
 static struct intel_gt *kobj_to_gt(struct kobject *kobj)
 {
-   return container_of(kobj, struct kobj_gt, base)->gt;
+   return container_of(kobj, struct intel_gt, sysfs_gtn);
 }
 
 struct intel_gt *intel_gt_sysfs_get_drvdata(struct device *dev,
@@ -72,21 +72,19 @@ static struct attribute *id_attrs[] = {
 };
 ATTRIBUTE_GROUPS(id);
 
-static void kobj_gt_release(struct kobject *kobj)
+/* A kobject needs a release() method even if it does nothing */
+static void kobj_gtn_release(struct kobject *kobj)
 {
-   kfree(kobj);
 }
 
-static struct kobj_type kobj_gt_type = {
-   .release = kobj_gt_release,
+static struct kobj_type kobj_gtn_type = {
+   .release = kobj_gtn_release,
.sysfs_ops = _sysfs_ops,
.default_groups = id_groups,
 };
 
 void intel_gt_sysfs_register(struct intel_gt *gt)
 {
-   struct kobj_gt *kg;
-
/*
 * We need to make things right with the
 * ABI compatibility. The files were originally
@@ -98,25 +96,22 @@ void intel_gt_sysfs_register(struct intel_gt *gt)
if (gt_is_root(gt))
intel_gt_sysfs_pm_init(gt, gt_get_parent_obj(gt));
 
-   kg = kzalloc(sizeof(*kg), GFP_KERNEL);
-   if (!kg)
+   /* init and xfer ownership to sysfs tree */
+   if (kobject_init_and_add(>sysfs_gtn, _gtn_type,
+gt->i915->sysfs_gt, "gt%d", gt->info.id))
goto exit_fail;
 
-   kobject_init(>base, _gt_type);
-   kg->gt = gt;
-
-   /* xfer ownership to sysfs tree */
-   if (kobject_add(>base, gt->i915->sysfs_gt, "gt%d", gt->info.id))
-   goto exit_kobj_put;
-
-   intel_gt_sysfs_pm_init(gt, >base);
+   intel_gt_sysfs_pm_init(gt, >sysfs_gtn);
 
return;
 
-exit_kobj_put:
-   kobject_put(>base);
-
 exit_fail:
+   kobject_put(>sysfs_gtn);
drm_warn(>i915->drm,
 "failed to initialize gt%d sysfs root\n", gt->info.id);
 }
+
+void intel_gt_sysfs_unregister(struct intel_gt *gt)
+{
+   kobject_put(>sysfs_gtn);
+}
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_sysfs.h 
b/drivers/gpu/drm/i915/gt/intel_gt_sysfs.h
index 9471b26752cf..a99aa7e8b01a 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_sysfs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_sysfs.h
@@ -13,11 +13,6 @@
 
 struct intel_gt;
 
-struct kobj_gt {
-   struct kobject base;
-   struct intel_gt *gt;
-};
-
 bool is_object_gt(struct kobject *kobj);
 
 struct drm_i915_private *kobj_to_i915(struct kobject *kobj);
@@ -28,6 +23,7 @@ intel_gt_create_kobj(struct intel_gt *gt,
 const char *name);
 
 void intel_gt_sysfs_register(struct intel_gt *gt);
+void intel_gt_sysfs_unregister(struct intel_gt *gt);
 struct intel_gt *intel_gt_sysfs_get_drvdata(struct device *dev,
const char *name);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h 
b/drivers/gpu/drm/i915/gt/intel_gt_types.h
index 937b2e1a305e..4c72b4f983a6 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
@@ -222,6 +222,9 @@ struct intel_gt {
} mocs;
 
struct intel_pxp pxp;
+
+   /* gt/gtN sysfs */
+   struct kobject sysfs_gtn;
 };
 
 enum intel_gt_scratch_field {
diff --git a/drivers/gpu/drm/i915/i915_sysfs.c 
b/drivers/gpu/drm/i915/i915_sysfs.c
index 8521daba212a..3f06106cdcf5 100644
--- a/drivers/gpu/drm/i915/i915_sysfs.c
+++ b/drivers/gpu/drm/i915/i915_sysfs.c
@@ -259,4 +259,6 @@ void i915_teardown_sysfs(struct drm_i915_private *dev_priv)