Re: [PATCH] drm/amdgpu: Fix the page fault issue in amdgpu_irq_fini

2021-03-19 Thread Christian König



Am 19.03.21 um 02:38 schrieb Deng, Emily:

[AMD Official Use Only - Internal Distribution Only]


-Original Message-
From: Christian König 
Sent: Thursday, March 18, 2021 7:52 PM
To: Deng, Emily ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: Fix the page fault issue in amdgpu_irq_fini

Am 18.03.21 um 12:48 schrieb Emily Deng:

For some source, it will be shared by some client ID and source ID.
To fix the page fault issue, set all those to null.

Signed-off-by: Emily Deng 
---
   drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c | 16 +---
   1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
index af026109421a..623b1ac6231d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
@@ -359,7 +359,7 @@ int amdgpu_irq_init(struct amdgpu_device *adev)
*/
   void amdgpu_irq_fini(struct amdgpu_device *adev)
   {
-unsigned i, j;
+unsigned i, j, m, n;

   if (adev->irq.installed) {
   drm_irq_uninstall(adev_to_drm(adev));
@@ -380,12 +380,22 @@ void amdgpu_irq_fini(struct amdgpu_device

*adev)

   if (!src)
   continue;

-kfree(src->enabled_types);
+if (src->enabled_types)
+kfree(src->enabled_types);

A NULL check before kfree() is unecessary and will be complained about by the
static checkers.

Sorry, will remove this.

+
   src->enabled_types = NULL;
+

Unrelated white space change.

Sorry, will remove this also.

   if (src->data) {
   kfree(src->data);
   kfree(src);
-adev->irq.client[i].sources[j] = NULL;
+}
+
+for (m = 0; m < AMDGPU_IRQ_CLIENTID_MAX; ++m) {
+if (!adev->irq.client[m].sources)
+continue;
+for (n = 0; n < AMDGPU_MAX_IRQ_SRC_ID;

++n)

+if (adev->irq.client[m].sources[n] ==

src)

+adev->irq.client[m].sources[n]

= NULL;

Hui what? The memory you set to NULL here is freed on the line below.

Accessing it after that would be illegal, so why do you want to set it to NULL?

[Emily] It is in the loop "for (j = 0; j < AMDGPU_MAX_IRQ_SRC_ID; ++j) {", shouldn't have 
been freed in this loop. Only set " adev->irq.client[i].sources[j] = NULL;" is not enough,
as it maybe have other client ID and src ID will share the same src. Also need 
to set those to NULL.


No, that can't happen.

It is illegal to use a dynamic allocated source with multiple client ID 
and src ID. Where do you see that?


We could also probably completely remove this feature since it is unused 
as far as I know.


Thanks,
Christian.


Christian.


   }
   }
   kfree(adev->irq.client[i].sources);


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] drm/amdgpu: Fix the page fault issue in amdgpu_irq_fini

2021-03-19 Thread Li, Dennis
[AMD Official Use Only - Internal Distribution Only]

Hi, Emily,
  What about refine struct amdgpu_irq_src with refcount? Your change could 
fix this issue, but it is unreadable. 

Best Regards
Dennis Li
-Original Message-
From: amd-gfx  On Behalf Of Deng, Emily
Sent: Friday, March 19, 2021 9:38 AM
To: Christian König ; 
amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/amdgpu: Fix the page fault issue in amdgpu_irq_fini

[AMD Official Use Only - Internal Distribution Only]

[AMD Official Use Only - Internal Distribution Only]

>-Original Message-
>From: Christian König 
>Sent: Thursday, March 18, 2021 7:52 PM
>To: Deng, Emily ; amd-gfx@lists.freedesktop.org
>Subject: Re: [PATCH] drm/amdgpu: Fix the page fault issue in 
>amdgpu_irq_fini
>
>Am 18.03.21 um 12:48 schrieb Emily Deng:
>> For some source, it will be shared by some client ID and source ID.
>> To fix the page fault issue, set all those to null.
>>
>> Signed-off-by: Emily Deng 
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c | 16 +---
>>   1 file changed, 13 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
>> index af026109421a..623b1ac6231d 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
>> @@ -359,7 +359,7 @@ int amdgpu_irq_init(struct amdgpu_device *adev)
>>*/
>>   void amdgpu_irq_fini(struct amdgpu_device *adev)
>>   {
>> -unsigned i, j;
>> +unsigned i, j, m, n;
>>
>>   if (adev->irq.installed) {
>>   drm_irq_uninstall(adev_to_drm(adev));
>> @@ -380,12 +380,22 @@ void amdgpu_irq_fini(struct amdgpu_device
>*adev)
>>   if (!src)
>>   continue;
>>
>> -kfree(src->enabled_types);
>> +if (src->enabled_types)
>> +kfree(src->enabled_types);
>
>A NULL check before kfree() is unecessary and will be complained about 
>by the static checkers.
Sorry, will remove this.
>
>> +
>>   src->enabled_types = NULL;
>> +
>
>Unrelated white space change.
Sorry, will remove this also.
>
>>   if (src->data) {
>>   kfree(src->data);
>>   kfree(src);
>> -adev->irq.client[i].sources[j] = NULL;
>> +}
>> +
>> +for (m = 0; m < AMDGPU_IRQ_CLIENTID_MAX; ++m) { if 
>> +(!adev->irq.client[m].sources) continue; for (n = 0; n < 
>> +AMDGPU_MAX_IRQ_SRC_ID;
>++n)
>> +if (adev->irq.client[m].sources[n] ==
>src)
>> +adev->irq.client[m].sources[n]
>= NULL;
>
>Hui what? The memory you set to NULL here is freed on the line below.
>
>Accessing it after that would be illegal, so why do you want to set it to NULL?
[Emily] It is in the loop "for (j = 0; j < AMDGPU_MAX_IRQ_SRC_ID; ++j) {", 
shouldn't have been freed in this loop. Only set " 
adev->irq.client[i].sources[j] = NULL;" is not enough, as it maybe have other 
client ID and src ID will share the same src. Also need to set those to NULL.
>
>Christian.
>
>>   }
>>   }
>>   kfree(adev->irq.client[i].sources);

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfxdata=04%7C01%7CDennis.Li%40amd.com%7C3e94ebf6f1d34e31898e08d8ea77b613%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637517147175478166%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=m2sQW4Ncv40K97wxOgC%2BSFiT8yhy6996E%2FR%2FMWLoh64%3Dreserved=0
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] drm/amdgpu: Fix the page fault issue in amdgpu_irq_fini

2021-03-18 Thread Deng, Emily
[AMD Official Use Only - Internal Distribution Only]

>-Original Message-
>From: Christian König 
>Sent: Thursday, March 18, 2021 7:52 PM
>To: Deng, Emily ; amd-gfx@lists.freedesktop.org
>Subject: Re: [PATCH] drm/amdgpu: Fix the page fault issue in amdgpu_irq_fini
>
>Am 18.03.21 um 12:48 schrieb Emily Deng:
>> For some source, it will be shared by some client ID and source ID.
>> To fix the page fault issue, set all those to null.
>>
>> Signed-off-by: Emily Deng 
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c | 16 +---
>>   1 file changed, 13 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
>> index af026109421a..623b1ac6231d 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
>> @@ -359,7 +359,7 @@ int amdgpu_irq_init(struct amdgpu_device *adev)
>>*/
>>   void amdgpu_irq_fini(struct amdgpu_device *adev)
>>   {
>> -unsigned i, j;
>> +unsigned i, j, m, n;
>>
>>   if (adev->irq.installed) {
>>   drm_irq_uninstall(adev_to_drm(adev));
>> @@ -380,12 +380,22 @@ void amdgpu_irq_fini(struct amdgpu_device
>*adev)
>>   if (!src)
>>   continue;
>>
>> -kfree(src->enabled_types);
>> +if (src->enabled_types)
>> +kfree(src->enabled_types);
>
>A NULL check before kfree() is unecessary and will be complained about by the
>static checkers.
Sorry, will remove this.
>
>> +
>>   src->enabled_types = NULL;
>> +
>
>Unrelated white space change.
Sorry, will remove this also.
>
>>   if (src->data) {
>>   kfree(src->data);
>>   kfree(src);
>> -adev->irq.client[i].sources[j] = NULL;
>> +}
>> +
>> +for (m = 0; m < AMDGPU_IRQ_CLIENTID_MAX; ++m) {
>> +if (!adev->irq.client[m].sources)
>> +continue;
>> +for (n = 0; n < AMDGPU_MAX_IRQ_SRC_ID;
>++n)
>> +if (adev->irq.client[m].sources[n] ==
>src)
>> +adev->irq.client[m].sources[n]
>= NULL;
>
>Hui what? The memory you set to NULL here is freed on the line below.
>
>Accessing it after that would be illegal, so why do you want to set it to NULL?
[Emily] It is in the loop "for (j = 0; j < AMDGPU_MAX_IRQ_SRC_ID; ++j) {", 
shouldn't have been freed in this loop. Only set " 
adev->irq.client[i].sources[j] = NULL;" is not enough,
as it maybe have other client ID and src ID will share the same src. Also need 
to set those to NULL.
>
>Christian.
>
>>   }
>>   }
>>   kfree(adev->irq.client[i].sources);

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: Fix the page fault issue in amdgpu_irq_fini

2021-03-18 Thread Christian König

Am 18.03.21 um 12:48 schrieb Emily Deng:

For some source, it will be shared by some client ID and source ID.
To fix the page fault issue, set all those to null.

Signed-off-by: Emily Deng 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c | 16 +---
  1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
index af026109421a..623b1ac6231d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
@@ -359,7 +359,7 @@ int amdgpu_irq_init(struct amdgpu_device *adev)
   */
  void amdgpu_irq_fini(struct amdgpu_device *adev)
  {
-   unsigned i, j;
+   unsigned i, j, m, n;
  
  	if (adev->irq.installed) {

drm_irq_uninstall(adev_to_drm(adev));
@@ -380,12 +380,22 @@ void amdgpu_irq_fini(struct amdgpu_device *adev)
if (!src)
continue;
  
-			kfree(src->enabled_types);

+   if (src->enabled_types)
+   kfree(src->enabled_types);


A NULL check before kfree() is unecessary and will be complained about 
by the static checkers.



+
src->enabled_types = NULL;
+


Unrelated white space change.


if (src->data) {
kfree(src->data);
kfree(src);
-   adev->irq.client[i].sources[j] = NULL;
+   }
+
+   for (m = 0; m < AMDGPU_IRQ_CLIENTID_MAX; ++m) {
+   if (!adev->irq.client[m].sources)
+   continue;
+   for (n = 0; n < AMDGPU_MAX_IRQ_SRC_ID; ++n)
+   if (adev->irq.client[m].sources[n] == 
src)
+   adev->irq.client[m].sources[n] 
= NULL;


Hui what? The memory you set to NULL here is freed on the line below.

Accessing it after that would be illegal, so why do you want to set it 
to NULL?


Christian.


}
}
kfree(adev->irq.client[i].sources);


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx