RE: [PATCH] drm/amdgpu: set default noretry=1 to fix kfd SVM issues for raven

2021-08-05 Thread Zhu, Changfeng
[AMD Official Use Only]

Hi Felix,

Can we set noretry=1 for the dgpu path (ignore_crat=1), which doesn't go through 
the iommuv2 path on raven, as below:
> + case CHIP_RAVEN:
> + /*
> +  * TODO: Raven can currently fix most SVM issues with
> +  * noretry = 1. However, it still fails two kfd migrate
> +  * tests with noretry = 1. These two migrate failures on
> +  * raven with noretry = 1 still need to be root caused.
> +  */
>   if (amdgpu_noretry == -1) {
>   if (ignore_crat)
>   gmc->noretry = 1;
>   else
>   gmc->noretry = 0;
>   }
>   else
>   gmc->noretry = amdgpu_noretry;
>   break;

BR,
Changfeng.

-Original Message-
From: Kuehling, Felix  
Sent: Wednesday, July 28, 2021 10:22 PM
To: Zhu, Changfeng ; amd-gfx@lists.freedesktop.org; 
Huang, Ray ; Zhang, Yifan 
Subject: Re: [PATCH] drm/amdgpu: set default noretry=1 to fix kfd SVM issues 
for raven

Doesn't this break IOMMUv2? Applications that run using IOMMUv2 for system 
memory access depend on correct retry handling in the SQ.
Therefore noretry must be 0 on Raven.

I believe the reason that SVM has trouble with retry enabled is that IOMMUv2 is 
catching the page faults, so the driver never gets to handle the 
page fault interrupts. That breaks page-fault based migration in the SVM code. 
I think the better solution is to disable SVM on APUs where
IOMMUv2 is enabled.

Alternatively, we could give up on IOMMUv2 entirely and always rely on SVM to 
provide that functionality. But that requires more changes in the amdgpu_vm 
code.

Regards,
  Felix


Am 2021-07-28 um 2:36 a.m. schrieb Changfeng:
> From: changzhu 
>
> From: Changfeng 
>
> No issues are found with noretry=1 except two SVM migrate issues.
> Conversely, most SVM cases fail with noretry=0.
> The two SVM migrate issues also happen with noretry=0, so default
> noretry=1 can be set for raven first to fix most SVM failures.
>
> Change-Id: Idb5cb3c1a04104013e4ab8aed2ad4751aaec4bbc
> Signed-off-by: Changfeng 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 15 ---
>  1 file changed, 8 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> index 09edfb64cce0..d7f69dbd48e6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> @@ -606,19 +606,20 @@ void amdgpu_gmc_noretry_set(struct amdgpu_device *adev)
>* noretry = 0 will cause kfd page fault tests fail
>* for some ASICs, so set default to 1 for these ASICs.
>*/
> + case CHIP_RAVEN:
> + /*
> +  * TODO: Raven can currently fix most SVM issues with
> +  * noretry = 1. However, it still fails two kfd migrate
> +  * tests with noretry = 1. These two migrate failures on
> +  * raven with noretry = 1 still need to be root caused.
> +  */
>   if (amdgpu_noretry == -1)
>   gmc->noretry = 1;
>   else
>   gmc->noretry = amdgpu_noretry;
>   break;
> - case CHIP_RAVEN:
>   default:
> - /* Raven currently has issues with noretry
> -  * regardless of what we decide for other
> -  * asics, we should leave raven with
> -  * noretry = 0 until we root cause the
> -  * issues.
> -  *
> + /*
>* default this to 0 for now, but we may want
>* to change this in the future for certain
>* GPUs as it can increase performance in


RE: [PATCH] drm/amdgpu: Update psp fw attestation support list

2021-06-07 Thread Zhu, Changfeng
Hi John,

As discussed offline, the patch is fine with APU at present.

Reviewed-by: Changfeng 


BR,
Changfeng.

From: Clements, John 
Sent: Monday, June 7, 2021 11:13 AM
To: amd-gfx@lists.freedesktop.org
Cc: Zhu, Changfeng 
Subject: [PATCH] drm/amdgpu: Update psp fw attestation support list


[AMD Official Use Only - Internal Distribution Only]

Submitting patch to disable PSP FW attestation support on APU

Thank you,
John Clements
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Recall: [PATCH] drm/amdgpu: Update psp fw attestation support list

2021-06-07 Thread Zhu, Changfeng
Zhu, Changfeng would like to recall the message, "[PATCH] drm/amdgpu: Update 
psp fw attestation support list".




RE: [PATCH] drm/amdgpu: Update psp fw attestation support list

2021-06-07 Thread Zhu, Changfeng
if (adev->asic_type == CHIP_VANGOGH)

BR,
Changfeng.


From: amd-gfx  On Behalf Of Zhu, 
Changfeng
Sent: Monday, June 7, 2021 11:24 AM
To: Clements, John ; amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/amdgpu: Update psp fw attestation support list

Hi John,

I think it's better to replace
if (adev->flags & AMD_IS_APU)
with
if (adev->asic_type >= CHIP_VANGOGH)

As you say, rembrandt should support this feature.

BR,
Changfeng.

From: Clements, John mailto:john.cleme...@amd.com>>
Sent: Monday, June 7, 2021 11:13 AM
To: amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>
Cc: Zhu, Changfeng mailto:changfeng@amd.com>>
Subject: [PATCH] drm/amdgpu: Update psp fw attestation support list


[AMD Official Use Only - Internal Distribution Only]

Submitting patch to disable PSP FW attestation support on APU

Thank you,
John Clements


RE: [PATCH] drm/amdgpu: Update psp fw attestation support list

2021-06-07 Thread Zhu, Changfeng
Hi John,

I think it's better to replace
if (adev->flags & AMD_IS_APU)
with
if (adev->asic_type >= CHIP_VANGOGH)

As you say, rembrandt should support this feature.

BR,
Changfeng.

From: Clements, John 
Sent: Monday, June 7, 2021 11:13 AM
To: amd-gfx@lists.freedesktop.org
Cc: Zhu, Changfeng 
Subject: [PATCH] drm/amdgpu: Update psp fw attestation support list


[AMD Official Use Only - Internal Distribution Only]

Submitting patch to disable PSP FW attestation support on APU

Thank you,
John Clements


RE: [PATCH] drm/amdgpu: take back kvmalloc_array for entries alloc because of kzalloc memory limit

2021-06-02 Thread Zhu, Changfeng
[AMD Official Use Only]

OK.

Thx, Chris and Das.

I'll try it and verify whether there are issues.

BR,
Changfeng.

-Original Message-
From: amd-gfx  On Behalf Of Christian 
König
Sent: Wednesday, June 2, 2021 5:41 PM
To: Zhu, Changfeng ; Koenig, Christian 
; Das, Nirmoy ; Huang, Ray 
; amd-...@freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: take back kvmalloc_array for entries alloc 
because of kzalloc memory limit

Hi Changfeng,

well that's a funny mix-up :)

The flags describe the backing store requirements, e.g. caching, contiguous, 
etc.

But the allocation is for the housekeeping structure inside the kernel and is 
not related to the backing store of this BO.

Just switching the BO structure to be allocated using kvzalloc/kvfree should be 
enough.

Thanks,
Christian.

Am 02.06.21 um 11:10 schrieb Zhu, Changfeng:
> [AMD Official Use Only]
>
> Hi Chris,
>
> Actually, I thought about switching kzalloc to kvmalloc in amdgpu_bo_create.
> However, I observe bp.flags = AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS in 
> amdgpu_vm_pt_create.
>
> Does it matter if we switch kzalloc to kvmalloc when there is a
> physically contiguous memory request while creating the bo, such as
> AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS?
>
> BR,
> Changfeng.
>
>
>
-Original Message-
> From: Koenig, Christian 
> Sent: Wednesday, June 2, 2021 4:57 PM
> To: Das, Nirmoy ; Zhu, Changfeng 
> ; Huang, Ray ; 
> amd-...@freedesktop.org
> Subject: Re: [PATCH] drm/amdgpu: take back kvmalloc_array for entries 
> alloc because of kzalloc memory limit
>
>
>
> Am 02.06.21 um 10:54 schrieb Das, Nirmoy:
>> On 6/2/2021 10:30 AM, Changfeng wrote:
>>> From: changzhu 
>>>
>>> From: Changfeng 
>>>
>>> It will cause error when alloc memory larger than 128KB in 
>>> amdgpu_bo_create->kzalloc.
>>
>> I wonder why I didn't see the error on my machine. Is there any 
>> config I might be missing?
> VM page table layout depends on hardware generation, APU vs dGPU and kernel 
> command line settings.
>
> I think we just need to switch amdgpu_bo_create() from kzalloc to kvmalloc 
> (and kfree to kvfree in amdgpu_bo_destroy of course).
>
> Shouldn't be more than a two line patch.
>
> Regards,
> Christian.
>
>>
>> Thanks,
>>
>> Nirmoy
>>
>>> Call Trace:
>>>      alloc_pages_current+0x6a/0xe0
>>>      kmalloc_order+0x32/0xb0
>>>      kmalloc_order_trace+0x1e/0x80
>>>      __kmalloc+0x249/0x2d0
>>>      amdgpu_bo_create+0x102/0x500 [amdgpu]
>>>      ? xas_create+0x264/0x3e0
>>>      amdgpu_bo_create_vm+0x32/0x60 [amdgpu]
>>>      amdgpu_vm_pt_create+0xf5/0x260 [amdgpu]
>>>      amdgpu_vm_init+0x1fd/0x4d0 [amdgpu]
>>>
>>> Change-Id: I29e479db45ead37c39449e856599fd4f6a0e34ce
>>> Signed-off-by: Changfeng 
>>> ---
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 27
>>> +++---
>>>    1 file changed, 16 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> index 1923f035713a..714d613d020b 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> @@ -894,6 +894,10 @@ static int amdgpu_vm_pt_create(struct 
>>> amdgpu_device *adev,
>>>    num_entries = 0;
>>>      bp.bo_ptr_size = struct_size((*vmbo), entries, 
>>> num_entries);
>>> +    if (bp.bo_ptr_size > 32*AMDGPU_GPU_PAGE_SIZE) {
>>> +    DRM_INFO("Can't alloc memory larger than 128KB by using
>>> kzalloc in amdgpu_bo_create\n");
>>> +    bp.bo_ptr_size = sizeof(struct amdgpu_bo_vm);
>>> +    }
>>>      if (vm->use_cpu_for_update)
>>>    bp.flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
>>> @@ -965,15 +969,19 @@ static int amdgpu_vm_alloc_pts(struct 
>>> amdgpu_device *adev,
>>>    struct amdgpu_bo_vm *pt;
>>>    int r;
>>>    -    if (entry->base.bo) {
>>> -    if (cursor->level < AMDGPU_VM_PTB)
>>> -    entry->entries =
>>> -    to_amdgpu_bo_vm(entry->base.bo)->entries;
>>> -    else
>>> -    entry->entries = NULL;
>>> -    return 0;
>>> +    if (cursor->level < AMDGPU_VM_PTB && !entry->entries) {
>>> +    unsigned num_entries;
>>> +    num_entries = amdgpu_vm_num_entries(adev, cursor->level);
>>> +    entry->entries = kvmalloc_array(num_entries,

RE: [PATCH] drm/amdgpu: take back kvmalloc_array for entries alloc because of kzalloc memory limit

2021-06-02 Thread Zhu, Changfeng
[AMD Official Use Only]

Hi Chris,

Actually, I thought about switching kzalloc to kvmalloc in amdgpu_bo_create.
However, I observe bp.flags = AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS in 
amdgpu_vm_pt_create.

Does it matter if we switch kzalloc to kvmalloc when there is a physically 
contiguous memory request while creating the bo, such as 
AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS?

BR,
Changfeng.



-Original Message-
From: Koenig, Christian  
Sent: Wednesday, June 2, 2021 4:57 PM
To: Das, Nirmoy ; Zhu, Changfeng ; 
Huang, Ray ; amd-...@freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: take back kvmalloc_array for entries alloc 
because of kzalloc memory limit



Am 02.06.21 um 10:54 schrieb Das, Nirmoy:
>
> On 6/2/2021 10:30 AM, Changfeng wrote:
>> From: changzhu 
>>
>> From: Changfeng 
>>
>> It will cause an error when allocating memory larger than 128KB in 
>> amdgpu_bo_create->kzalloc.
>
>
> I wonder why I didn't see the error on my machine. Is there any config 
> I might be missing?

VM page table layout depends on hardware generation, APU vs dGPU and kernel 
command line settings.

I think we just need to switch amdgpu_bo_create() from kzalloc to kvmalloc (and 
kfree to kvfree in amdgpu_bo_destroy of course).

Shouldn't be more than a two line patch.

Regards,
Christian.

>
>
> Thanks,
>
> Nirmoy
>
>> Call Trace:
>>     alloc_pages_current+0x6a/0xe0
>>     kmalloc_order+0x32/0xb0
>>     kmalloc_order_trace+0x1e/0x80
>>     __kmalloc+0x249/0x2d0
>>     amdgpu_bo_create+0x102/0x500 [amdgpu]
>>     ? xas_create+0x264/0x3e0
>>     amdgpu_bo_create_vm+0x32/0x60 [amdgpu]
>>     amdgpu_vm_pt_create+0xf5/0x260 [amdgpu]
>>     amdgpu_vm_init+0x1fd/0x4d0 [amdgpu]
>>
>> Change-Id: I29e479db45ead37c39449e856599fd4f6a0e34ce
>> Signed-off-by: Changfeng 
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 27 
>> +++---
>>   1 file changed, 16 insertions(+), 11 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> index 1923f035713a..714d613d020b 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> @@ -894,6 +894,10 @@ static int amdgpu_vm_pt_create(struct 
>> amdgpu_device *adev,
>>   num_entries = 0;
>>     bp.bo_ptr_size = struct_size((*vmbo), entries, num_entries);
>> +    if (bp.bo_ptr_size > 32*AMDGPU_GPU_PAGE_SIZE) {
>> +    DRM_INFO("Can't alloc memory larger than 128KB by using
>> kzalloc in amdgpu_bo_create\n");
>> +    bp.bo_ptr_size = sizeof(struct amdgpu_bo_vm);
>> +    }
>>     if (vm->use_cpu_for_update)
>>   bp.flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
>> @@ -965,15 +969,19 @@ static int amdgpu_vm_alloc_pts(struct 
>> amdgpu_device *adev,
>>   struct amdgpu_bo_vm *pt;
>>   int r;
>>   -    if (entry->base.bo) {
>> -    if (cursor->level < AMDGPU_VM_PTB)
>> -    entry->entries =
>> -    to_amdgpu_bo_vm(entry->base.bo)->entries;
>> -    else
>> -    entry->entries = NULL;
>> -    return 0;
>> +    if (cursor->level < AMDGPU_VM_PTB && !entry->entries) {
>> +    unsigned num_entries;
>> +    num_entries = amdgpu_vm_num_entries(adev, cursor->level);
>> +    entry->entries = kvmalloc_array(num_entries,
>> +    sizeof(*entry->entries),
>> +    GFP_KERNEL | __GFP_ZERO);
>> +    if (!entry->entries)
>> +    return -ENOMEM;
>>   }
>>   +    if (entry->base.bo)
>> +    return 0;
>> +
>>   r = amdgpu_vm_pt_create(adev, vm, cursor->level, immediate, 
>> );
>>   if (r)
>>   return r;
>> @@ -984,10 +992,6 @@ static int amdgpu_vm_alloc_pts(struct 
>> amdgpu_device *adev,
>>   pt_bo = >bo;
>>   pt_bo->parent = amdgpu_bo_ref(cursor->parent->base.bo);
>>   amdgpu_vm_bo_base_init(>base, vm, pt_bo);
>> -    if (cursor->level < AMDGPU_VM_PTB)
>> -    entry->entries = pt->entries;
>> -    else
>> -    entry->entries = NULL;
>>     r = amdgpu_vm_clear_bo(adev, vm, pt, immediate);
>>   if (r)
>> @@ -1017,6 +1021,7 @@ static void amdgpu_vm_free_table(struct 
>> amdgpu_vm_pt *entry)
>>   amdgpu_bo_unref();
>>   amdgpu_bo_unref(>base.bo);
>>   }
>> +    kvfree(entry->entries);
>>   entry->entries = NULL;
>>   }


RE: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang

2021-05-18 Thread Zhu, Changfeng
[Public]

Hi Alex,

This is the issue exposed by Nirmoy's patch that provided better load balancing 
across queues.

BR,
Changfeng.

From: Deucher, Alexander 
Sent: Wednesday, May 19, 2021 10:53 AM
To: Zhu, Changfeng ; Alex Deucher 
; Das, Nirmoy 
Cc: Huang, Ray ; amd-gfx list 
Subject: Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid 
compute hang


[Public]

+ Nirmoy

I thought we disabled all but one of the compute queues on raven due to this 
issue.  Maybe that patch never landed?  Wasn't this the same issue that was 
exposed by Nirmoy's patch that provided better load balancing across queues?

Alex


From: amd-gfx 
mailto:amd-gfx-boun...@lists.freedesktop.org>>
 on behalf of Zhu, Changfeng 
mailto:changfeng@amd.com>>
Sent: Tuesday, May 18, 2021 10:28 PM
To: Alex Deucher mailto:alexdeuc...@gmail.com>>
Cc: Huang, Ray mailto:ray.hu...@amd.com>>; amd-gfx list 
mailto:amd-gfx@lists.freedesktop.org>>
Subject: RE: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid 
compute hang

[AMD Official Use Only - Internal Distribution Only]

Hi Alex.

I have submitted the patch: drm/amdgpu: disable 3DCGCG on picasso/raven1 to 
avoid compute hang

Do you mean we have something else to do for re-enabling the extra compute 
queues?

BR,
Changfeng.

-Original Message-
From: Alex Deucher mailto:alexdeuc...@gmail.com>>
Sent: Wednesday, May 19, 2021 10:20 AM
To: Zhu, Changfeng mailto:changfeng@amd.com>>
Cc: Huang, Ray mailto:ray.hu...@amd.com>>; amd-gfx list 
mailto:amd-gfx@lists.freedesktop.org>>
Subject: Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid 
compute hang

Care to submit a patch to re-enable the extra compute queues?

Alex

On Mon, May 17, 2021 at 4:09 AM Zhu, Changfeng 
mailto:changfeng@amd.com>> wrote:
>
> [AMD Official Use Only - Internal Distribution Only]
>
> Hi Ray and Alex,
>
> I have confirmed it can enable the additional compute queues with this patch:
>
> [   41.823013] This is ring mec 1, pipe 0, queue 0, value 1
> [   41.823028] This is ring mec 1, pipe 1, queue 0, value 1
> [   41.823042] This is ring mec 1, pipe 2, queue 0, value 1
> [   41.823057] This is ring mec 1, pipe 3, queue 0, value 1
> [   41.823071] This is ring mec 1, pipe 0, queue 1, value 1
> [   41.823086] This is ring mec 1, pipe 1, queue 1, value 1
> [   41.823101] This is ring mec 1, pipe 2, queue 1, value 1
> [   41.823115] This is ring mec 1, pipe 3, queue 1, value 1
>
> BR,
> Changfeng.
>
>
> -Original Message-
> From: Huang, Ray mailto:ray.hu...@amd.com>>
> Sent: Monday, May 17, 2021 2:27 PM
> To: Alex Deucher mailto:alexdeuc...@gmail.com>>; Zhu, 
> Changfeng
> mailto:changfeng@amd.com>>
> Cc: amd-gfx list 
> mailto:amd-gfx@lists.freedesktop.org>>
> Subject: Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to
> avoid compute hang
>
> On Fri, May 14, 2021 at 10:13:55PM +0800, Alex Deucher wrote:
> > On Fri, May 14, 2021 at 4:20 AM 
> > mailto:changfeng@amd.com>> wrote:
> > >
> > > From: changzhu mailto:changfeng@amd.com>>
> > >
> > > From: Changfeng mailto:changfeng@amd.com>>
> > >
> > > There is a problem with the 3DCGCG firmware and it will cause a
> > > compute test hang on picasso/raven1. 3DCGCG needs to be disabled
> > > in the driver to avoid the compute hang.
> > >
> > > Change-Id: Ic7d3c7922b2b32f7ac5193d6a4869cbc5b3baa87
> > > Signed-off-by: Changfeng 
> > > mailto:changfeng@amd.com>>
> >
> > Reviewed-by: Alex Deucher 
> > mailto:alexander.deuc...@amd.com>>
> >
> > With this applied, can we re-enable the additional compute queues?
> >
>
> I think so.
>
> Changfeng, could you please confirm this on all raven series?
>
> Patch is Reviewed-by: Huang Rui mailto:ray.hu...@amd.com>>
>
> > Alex
> >
> > > ---
> > >  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 10 +++---
> > >  drivers/gpu/drm/amd/amdgpu/soc15.c|  2 --
> > >  2 files changed, 7 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> > > b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> > > index 22608c45f07c..feaa5e4a5538 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> > > @@ -4947,7 +4947,7 @@ static void gfx_v9_0_update_3d_clock_gating(struct 
> > > amdgpu_device *adev,
> > > amdgpu_gfx_rlc_enter_safe_mode(adev);
> > >
> > > /* Enable 3D CGCG/CGLS */
> > > -   if (enable &&a

RE: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang

2021-05-18 Thread Zhu, Changfeng
[AMD Official Use Only - Internal Distribution Only]

Hi Alex.

I have submitted the patch: drm/amdgpu: disable 3DCGCG on picasso/raven1 to 
avoid compute hang

Do you mean we have something else to do for re-enabling the extra compute 
queues?

BR,
Changfeng.

-Original Message-
From: Alex Deucher  
Sent: Wednesday, May 19, 2021 10:20 AM
To: Zhu, Changfeng 
Cc: Huang, Ray ; amd-gfx list 
Subject: Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid 
compute hang

Care to submit a patch to re-enable the extra compute queues?

Alex

On Mon, May 17, 2021 at 4:09 AM Zhu, Changfeng  wrote:
>
> [AMD Official Use Only - Internal Distribution Only]
>
> Hi Ray and Alex,
>
> I have confirmed it can enable the additional compute queues with this patch:
>
> [   41.823013] This is ring mec 1, pipe 0, queue 0, value 1
> [   41.823028] This is ring mec 1, pipe 1, queue 0, value 1
> [   41.823042] This is ring mec 1, pipe 2, queue 0, value 1
> [   41.823057] This is ring mec 1, pipe 3, queue 0, value 1
> [   41.823071] This is ring mec 1, pipe 0, queue 1, value 1
> [   41.823086] This is ring mec 1, pipe 1, queue 1, value 1
> [   41.823101] This is ring mec 1, pipe 2, queue 1, value 1
> [   41.823115] This is ring mec 1, pipe 3, queue 1, value 1
>
> BR,
> Changfeng.
>
>
> -Original Message-
> From: Huang, Ray 
> Sent: Monday, May 17, 2021 2:27 PM
> To: Alex Deucher ; Zhu, Changfeng 
> 
> Cc: amd-gfx list 
> Subject: Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to 
> avoid compute hang
>
> On Fri, May 14, 2021 at 10:13:55PM +0800, Alex Deucher wrote:
> > On Fri, May 14, 2021 at 4:20 AM  wrote:
> > >
> > > From: changzhu 
> > >
> > > From: Changfeng 
> > >
> > > There is a problem with the 3DCGCG firmware and it will cause a
> > > compute test hang on picasso/raven1. 3DCGCG needs to be disabled
> > > in the driver to avoid the compute hang.
> > >
> > > Change-Id: Ic7d3c7922b2b32f7ac5193d6a4869cbc5b3baa87
> > > Signed-off-by: Changfeng 
> >
> > Reviewed-by: Alex Deucher 
> >
> > With this applied, can we re-enable the additional compute queues?
> >
>
> I think so.
>
> Changfeng, could you please confirm this on all raven series?
>
> Patch is Reviewed-by: Huang Rui 
>
> > Alex
> >
> > > ---
> > >  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 10 +++---
> > >  drivers/gpu/drm/amd/amdgpu/soc15.c|  2 --
> > >  2 files changed, 7 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> > > b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> > > index 22608c45f07c..feaa5e4a5538 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> > > @@ -4947,7 +4947,7 @@ static void gfx_v9_0_update_3d_clock_gating(struct 
> > > amdgpu_device *adev,
> > > amdgpu_gfx_rlc_enter_safe_mode(adev);
> > >
> > > /* Enable 3D CGCG/CGLS */
> > > -   if (enable && (adev->cg_flags & AMD_CG_SUPPORT_GFX_3D_CGCG)) {
> > > +   if (enable) {
> > > /* write cmd to clear cgcg/cgls ov */
> > > def = data = RREG32_SOC15(GC, 0, 
> > > mmRLC_CGTT_MGCG_OVERRIDE);
> > > /* unset CGCG override */ @@ -4959,8 +4959,12 @@ 
> > > static void gfx_v9_0_update_3d_clock_gating(struct amdgpu_device *adev,
> > > /* enable 3Dcgcg FSM(0x363f) */
> > > def = RREG32_SOC15(GC, 0, 
> > > mmRLC_CGCG_CGLS_CTRL_3D);
> > >
> > > -   data = (0x36 << 
> > > RLC_CGCG_CGLS_CTRL_3D__CGCG_GFX_IDLE_THRESHOLD__SHIFT) |
> > > -   RLC_CGCG_CGLS_CTRL_3D__CGCG_EN_MASK;
> > > +   if (adev->cg_flags & AMD_CG_SUPPORT_GFX_3D_CGCG)
> > > +   data = (0x36 << 
> > > RLC_CGCG_CGLS_CTRL_3D__CGCG_GFX_IDLE_THRESHOLD__SHIFT) |
> > > +   RLC_CGCG_CGLS_CTRL_3D__CGCG_EN_MASK;
> > > +   else
> > > +   data = 0x0 << 
> > > + RLC_CGCG_CGLS_CTRL_3D__CGCG_GFX_IDLE_THRESHOLD__SHIFT;
> > > +
> > > if (adev->cg_flags & AMD_CG_SUPPORT_GFX_3D_CGLS)
> > > data |= (0x000F << 
> > > RLC_CGCG_CGLS_CTRL_3D__CGLS_REP_COMPANSAT_DELAY__SHIFT) |
> > > 
> > > RLC_CGCG_CGLS_CTRL_3D__CGLS_EN_MASK;
> > > diff --

RE: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hang

2021-05-17 Thread Zhu, Changfeng
[AMD Official Use Only - Internal Distribution Only]

Hi Ray and Alex,

I have confirmed it can enable the additional compute queues with this patch:

[   41.823013] This is ring mec 1, pipe 0, queue 0, value 1
[   41.823028] This is ring mec 1, pipe 1, queue 0, value 1
[   41.823042] This is ring mec 1, pipe 2, queue 0, value 1
[   41.823057] This is ring mec 1, pipe 3, queue 0, value 1
[   41.823071] This is ring mec 1, pipe 0, queue 1, value 1
[   41.823086] This is ring mec 1, pipe 1, queue 1, value 1
[   41.823101] This is ring mec 1, pipe 2, queue 1, value 1
[   41.823115] This is ring mec 1, pipe 3, queue 1, value 1

BR,
Changfeng.


-Original Message-
From: Huang, Ray  
Sent: Monday, May 17, 2021 2:27 PM
To: Alex Deucher ; Zhu, Changfeng 
Cc: amd-gfx list 
Subject: Re: [PATCH] drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid 
compute hang

On Fri, May 14, 2021 at 10:13:55PM +0800, Alex Deucher wrote:
> On Fri, May 14, 2021 at 4:20 AM  wrote:
> >
> > From: changzhu 
> >
> > From: Changfeng 
> >
> > There is a problem with the 3DCGCG firmware and it will cause a
> > compute test hang on picasso/raven1. 3DCGCG needs to be disabled in
> > the driver to avoid the compute hang.
> >
> > Change-Id: Ic7d3c7922b2b32f7ac5193d6a4869cbc5b3baa87
> > Signed-off-by: Changfeng 
> 
> Reviewed-by: Alex Deucher 
> 
> With this applied, can we re-enable the additional compute queues?
> 

I think so.

Changfeng, could you please confirm this on all raven series?

Patch is Reviewed-by: Huang Rui 

> Alex
> 
> > ---
> >  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 10 +++---
> >  drivers/gpu/drm/amd/amdgpu/soc15.c|  2 --
> >  2 files changed, 7 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
> > b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> > index 22608c45f07c..feaa5e4a5538 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> > @@ -4947,7 +4947,7 @@ static void gfx_v9_0_update_3d_clock_gating(struct 
> > amdgpu_device *adev,
> > amdgpu_gfx_rlc_enter_safe_mode(adev);
> >
> > /* Enable 3D CGCG/CGLS */
> > -   if (enable && (adev->cg_flags & AMD_CG_SUPPORT_GFX_3D_CGCG)) {
> > +   if (enable) {
> > /* write cmd to clear cgcg/cgls ov */
> > def = data = RREG32_SOC15(GC, 0, mmRLC_CGTT_MGCG_OVERRIDE);
> > /* unset CGCG override */ @@ -4959,8 +4959,12 @@ 
> > static void gfx_v9_0_update_3d_clock_gating(struct amdgpu_device *adev,
> > /* enable 3Dcgcg FSM(0x363f) */
> > def = RREG32_SOC15(GC, 0, mmRLC_CGCG_CGLS_CTRL_3D);
> >
> > -   data = (0x36 << 
> > RLC_CGCG_CGLS_CTRL_3D__CGCG_GFX_IDLE_THRESHOLD__SHIFT) |
> > -   RLC_CGCG_CGLS_CTRL_3D__CGCG_EN_MASK;
> > +   if (adev->cg_flags & AMD_CG_SUPPORT_GFX_3D_CGCG)
> > +   data = (0x36 << 
> > RLC_CGCG_CGLS_CTRL_3D__CGCG_GFX_IDLE_THRESHOLD__SHIFT) |
> > +   RLC_CGCG_CGLS_CTRL_3D__CGCG_EN_MASK;
> > +   else
> > +   data = 0x0 << 
> > + RLC_CGCG_CGLS_CTRL_3D__CGCG_GFX_IDLE_THRESHOLD__SHIFT;
> > +
> > if (adev->cg_flags & AMD_CG_SUPPORT_GFX_3D_CGLS)
> > data |= (0x000F << 
> > RLC_CGCG_CGLS_CTRL_3D__CGLS_REP_COMPANSAT_DELAY__SHIFT) |
> > RLC_CGCG_CGLS_CTRL_3D__CGLS_EN_MASK;
> > diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c 
> > b/drivers/gpu/drm/amd/amdgpu/soc15.c
> > index 4b660b2d1c22..080e715799d4 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/soc15.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
> > @@ -1393,7 +1393,6 @@ static int soc15_common_early_init(void *handle)
> > adev->cg_flags = AMD_CG_SUPPORT_GFX_MGCG |
> > AMD_CG_SUPPORT_GFX_MGLS |
> > AMD_CG_SUPPORT_GFX_CP_LS |
> > -   AMD_CG_SUPPORT_GFX_3D_CGCG |
> > AMD_CG_SUPPORT_GFX_3D_CGLS |
> > AMD_CG_SUPPORT_GFX_CGCG |
> > AMD_CG_SUPPORT_GFX_CGLS | @@ -1413,7 
> > +1412,6 @@ static int soc15_common_early_init(void *handle)
> > AMD_CG_SUPPORT_GFX_MGLS |
> > AMD_CG_SUPPORT_GFX_RLC_LS |
> > AMD_CG_SUPPORT_GFX_CP_LS |
> > -   AM

RE: [PATCH] drm/amdgpu: decline max_me for mec2_fw remove in renoir/arcturus

2021-02-24 Thread Zhu, Changfeng
[AMD Official Use Only - Internal Distribution Only]

Thanks,Ray.

BR,
Changfeng.

-Original Message-
From: Huang, Ray  
Sent: Thursday, February 25, 2021 1:42 PM
To: Zhu, Changfeng 
Cc: amd-gfx@lists.freedesktop.org; Clements, John 
Subject: Re: [PATCH] drm/amdgpu: decline max_me for mec2_fw remove in 
renoir/arcturus

On Wed, Feb 24, 2021 at 05:10:55PM +0800, Zhu, Changfeng wrote:
> From: changzhu 
> 
> From: Changfeng 
> 
> The value of max_me in amdgpu_gfx_rlc_setup_cp_table should be reduced
> to 4 when mec2_fw is removed on asics renoir/arcturus. Otherwise it
> will cause a kernel NULL pointer dereference when the driver is loaded
> via modprobe.
> 
> Change-Id: I268610e85f6acd9200478d0ab1518349ff81469b
> Signed-off-by: Changfeng 

Reviewed-by: Huang Rui 

> ---
>  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> index 2f56adebbb31..300a07227597 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> @@ -1890,7 +1890,10 @@ static void gfx_v9_0_enable_lbpw(struct 
> amdgpu_device *adev, bool enable)
>  
>  static int gfx_v9_0_cp_jump_table_num(struct amdgpu_device *adev)  {
> - return 5;
> + if (gfx_v9_0_load_mec2_fw_bin_support(adev))
> + return 5;
> + else
> + return 4;
>  }
>  
>  static int gfx_v9_0_rlc_init(struct amdgpu_device *adev)
> --
> 2.17.1
> 


RE: [PATCH] drm/amdkfd: fix null pointer panic while free buffer in kfd

2021-02-01 Thread Zhu, Changfeng
[AMD Official Use Only - Internal Distribution Only]

Tested-by: Changfeng

BR,
Changfeng.

-Original Message-
From: Huang, Ray  
Sent: Monday, February 1, 2021 6:39 PM
To: amd-gfx@lists.freedesktop.org
Cc: Kuehling, Felix ; Deucher, Alexander 
; Koenig, Christian ; Zhu, 
Changfeng ; Huang, Ray 
Subject: [PATCH] drm/amdkfd: fix null pointer panic while free buffer in kfd

drm_gem_object_free calls through the funcs of the drm buffer object, so 
kfd_alloc should use amdgpu_gem_object_create instead of amdgpu_bo_create 
so that funcs is initialized to amdgpu_gem_object_funcs.

[  396.231390] amdgpu: Release VA 0x7f76b4ada000 - 0x7f76b4add000
[  396.231394] amdgpu:   remove VA 0x7f76b4ada000 - 0x7f76b4add000 in entry 85c24a47
[  396.231408] BUG: kernel NULL pointer dereference, address:  
[  396.231445] #PF: supervisor read access in kernel mode
[  396.231466] #PF: error_code(0x) - not-present page
[  396.231484] PGD 0 P4D 0
[  396.231495] Oops:  [#1] SMP NOPTI
[  396.231509] CPU: 7 PID: 1352 Comm: clinfo Tainted: G   OE 5.11.0-rc2-custom #1
[  396.231537] Hardware name: AMD Celadon-RN/Celadon-RN, BIOS WCD0401N_Weekly_20_04_0 04/01/2020
[  396.231563] RIP: 0010:drm_gem_object_free+0xc/0x22 [drm]
[  396.231606] Code: eb ec 48 89 c3 eb e7 0f 1f 44 00 00 55 48 89 e5 48 8b bf 00 06 00 00 e8 72 0d 01 00 5d c3 0f 1f 44 00 00 48 8b 87 40 01 00 00 <48> 8b 00 48 85 c0 74 0b 55 48 89 e5 e8 54 37 7c db 5d c3 0f 0b c3
[  396.231666] RSP: 0018:b4704177fcf8 EFLAGS: 00010246
[  396.231686] RAX:  RBX: 993a0d0cc400 RCX: 3113
[  396.231711] RDX: 0001 RSI: e9cda7a5d0791c6d RDI: 993a333a9058
[  396.231736] RBP: b4704177fdd0 R08: 993a03855858 R09: 
[  396.231761] R10: 993a0d1f7158 R11: 0001 R12:  R13: 993a0d0cc428 R14: 3000 R15: b4704177fde0
[  396.231811] FS:  7f76b5730740() GS:993b275c() knlGS: 
[  396.231840] CS:  0010 DS:  ES:  CR0: 80050033
[  396.231860] CR2:  CR3: 00016d2e2000 CR4: 00350ee0
[  396.231885] Call Trace:
[  396.231897]  ? amdgpu_amdkfd_gpuvm_free_memory_of_gpu+0x24c/0x25f [amdgpu]
[  396.232056]  ? __dynamic_dev_dbg+0xcd/0x100
[  396.232076]  kfd_ioctl_free_memory_of_gpu+0x91/0x102 [amdgpu]
[  396.232214]  kfd_ioctl+0x211/0x35b [amdgpu]
[  396.232341]  ? kfd_ioctl_get_queue_wave_state+0x52/0x52 [amdgpu]

Signed-off-by: Huang Rui 
---

This patch is to fix the issue on latest 5.11-rc2 based amd-staging-drm-next.

---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 16 ++--
 1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 0849b68e784f..ac0a432a9bf7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -26,6 +26,7 @@
 #include 
 
 #include "amdgpu_object.h"
+#include "amdgpu_gem.h"
 #include "amdgpu_vm.h"
 #include "amdgpu_amdkfd.h"
 #include "amdgpu_dma_buf.h"
@@ -1152,7 +1153,7 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
struct sg_table *sg = NULL;
uint64_t user_addr = 0;
struct amdgpu_bo *bo;
-   struct amdgpu_bo_param bp;
+   struct drm_gem_object *gobj;
u32 domain, alloc_domain;
u64 alloc_flags;
int ret;
@@ -1220,19 +1221,14 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
pr_debug("\tcreate BO VA 0x%llx size 0x%llx domain %s\n",
va, size, domain_string(alloc_domain));
 
-   memset(, 0, sizeof(bp));
-   bp.size = size;
-   bp.byte_align = 1;
-   bp.domain = alloc_domain;
-   bp.flags = alloc_flags;
-   bp.type = bo_type;
-   bp.resv = NULL;
-   ret = amdgpu_bo_create(adev, , );
+   ret = amdgpu_gem_object_create(adev, size, 1, alloc_domain, alloc_flags,
+  bo_type, NULL, );
if (ret) {
pr_debug("Failed to create BO on domain %s. ret %d\n",
-   domain_string(alloc_domain), ret);
+domain_string(alloc_domain), ret);
goto err_bo_create;
}
+   bo = gem_to_amdgpu_bo(gobj);
if (bo_type == ttm_bo_type_sg) {
bo->tbo.sg = sg;
bo->tbo.ttm->sg = sg;
--
2.25.1
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH] drm/amd/display: fix the NULL pointer that missed set_disp_pattern_generator callback

2020-11-01 Thread Zhu, Changfeng
[AMD Official Use Only - Internal Distribution Only]

Tested-by: Changfeng 

BR,
Changfeng.

-Original Message-
From: Huang, Ray  
Sent: Monday, November 2, 2020 12:58 PM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Zhu, Changfeng 
; Li, Roman ; Huang, Ray 

Subject: [PATCH] drm/amd/display: fix the NULL pointer that missed 
set_disp_pattern_generator callback

This patch fixes the NULL pointer dereference caused by the missing
set_disp_pattern_generator callback on DCN301.

[  505.054167] BUG: kernel NULL pointer dereference, address:  
[  505.054176] #PF: supervisor instruction fetch in kernel mode
[  505.054181] #PF: error_code(0x0010) - not-present page
[  505.054185] PGD 0 P4D 0
[  505.054199] Oops: 0010 [#1] SMP NOPTI
[  505.054211] CPU: 6 PID: 1306 Comm: modprobe Tainted: GW  OE 5.9.0-rc5-custom #1
[  505.054216] Hardware name: AMD Chachani-VN/Chachani-VN, BIOS WCH0A29N_RAPV16.FD 10/29/2020
[  505.054225] RIP: 0010:0x0
[  505.054234] Code: Bad RIP value.
[  505.054239] RSP: 0018:b88541c66f60 EFLAGS: 00010206
[  505.054245] RAX:  RBX: 91283607 RCX: 0003
[  505.054248] RDX: 000c RSI: 9128365001e8 RDI: 91283607
[  505.054252] RBP: b88541c66fd8 R08: 0002 R09: b88541c66fa2
[  505.054265] R10: 9580 R11: 0008 R12: 9128365001e8
[  505.054272] R13: 000c R14: 0438 R15: 9128a48bd000
[  505.054279] FS:  7f09f999f540() GS:9128b3f8() knlGS: 
[  505.054284] CS:  0010 DS:  ES:  CR0: 80050033
[  505.054288] CR2: ffd6 CR3: 0002db98c000 CR4: 00350ee0
[  505.054291] Call Trace:
[  505.055024]  dcn20_blank_pixel_data+0x148/0x260 [amdgpu]
[  505.055730]  dcn20_enable_stream_timing+0x381/0x47c [amdgpu]
[  505.056641]  dce110_apply_ctx_to_hw+0x337/0x577 [amdgpu]
[  505.056667]  ? put_object+0x2f/0x40
[  505.057329]  dc_commit_state+0x4b3/0x9d0 [amdgpu]
[  505.058030]  amdgpu_dm_atomic_commit_tail+0x405/0x1ec6 [amdgpu]
[  505.058053]  ? update_stack_state+0x103/0x170
[  505.058071]  ? __module_text_address+0x12/0x60

Signed-off-by: Huang Rui 
---
 drivers/gpu/drm/amd/display/dc/dcn301/dcn301_init.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn301/dcn301_init.c 
b/drivers/gpu/drm/amd/display/dc/dcn301/dcn301_init.c
index 6d9587c39efd..bdad72140cbc 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn301/dcn301_init.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn301/dcn301_init.c
@@ -97,6 +97,7 @@ static const struct hw_sequencer_funcs dcn301_funcs = {
.set_backlight_level = dcn21_set_backlight_level,
.set_abm_immediate_disable = dcn21_set_abm_immediate_disable,
.set_pipe = dcn21_set_pipe,
+   .set_disp_pattern_generator = dcn30_set_disp_pattern_generator,
 };
 
 static const struct hwseq_private_funcs dcn301_private_funcs = {
--
2.25.1
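The one-line fix above wires a missing member into the hw_sequencer funcs table; any member left out of a designated initializer is implicitly NULL, and calling through it is the RIP=0x0 oops in the log. A small stand-alone model of this pattern (all names are hypothetical stand-ins, not the real DC structures):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of an ops table like hw_sequencer_funcs: members
 * not named in the initializer are implicitly NULL. */
struct hwseq_funcs {
	int (*set_pipe)(int pipe);
	int (*set_disp_pattern_generator)(int pipe, int pattern);
};

static int model_set_pipe(int pipe) { return pipe; }
static int model_set_pattern(int pipe, int pattern) { return pipe + pattern; }

/* Before the fix: set_disp_pattern_generator is left NULL. */
static const struct hwseq_funcs funcs_broken = {
	.set_pipe = model_set_pipe,
};

/* After the fix: the callback is wired up explicitly. */
static const struct hwseq_funcs funcs_fixed = {
	.set_pipe = model_set_pipe,
	.set_disp_pattern_generator = model_set_pattern,
};

/* Caller analogous to dcn20_blank_pixel_data(): without a NULL check,
 * calling through the missing member jumps to address 0. */
static int blank_pixel_data(const struct hwseq_funcs *f, int pipe, int pattern)
{
	if (!f->set_disp_pattern_generator)
		return -1;
	return f->set_disp_pattern_generator(pipe, pattern);
}
```

Filling in the table entry (as the patch does) is the idiomatic kernel fix; adding NULL checks at every call site would hide the misconfiguration instead.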


RE: [PATCH] drm/amdgpu: add ta firmware load in psp_v12_0 for renoir

2020-09-03 Thread Zhu, Changfeng
[AMD Public Use]


Thanks, Lakha.

BR,
Changfeng.

From: Lakha, Bhawanpreet 
Sent: Thursday, September 3, 2020 11:07 PM
To: Deucher, Alexander ; Zhu, Changfeng 
; amd-gfx@lists.freedesktop.org; Huang, Ray 

Subject: Re: [PATCH] drm/amdgpu: add ta firmware load in psp_v12_0 for renoir


[AMD Public Use]

Hi Alex,

psp_sw_fini() releases the ta firmware



Reviewed-by: Bhawanpreet Lakha <bhawanpreet.la...@amd.com>







From: Deucher, Alexander <alexander.deuc...@amd.com>
Sent: September 2, 2020 10:18 AM
To: Zhu, Changfeng <changfeng@amd.com>; amd-gfx@lists.freedesktop.org; Huang, Ray <ray.hu...@amd.com>; Lakha, Bhawanpreet <bhawanpreet.la...@amd.com>
Subject: Re: [PATCH] drm/amdgpu: add ta firmware load in psp_v12_0 for renoir


[AMD Public Use]

We also need to release the firmware when the driver unloads or is that already 
handled in some common path?

Alex


From: amd-gfx <amd-gfx-boun...@lists.freedesktop.org> on behalf of Changfeng.Zhu <changfeng@amd.com>
Sent: Tuesday, September 1, 2020 10:25 PM
To: amd-gfx@lists.freedesktop.org; Huang, Ray <ray.hu...@amd.com>; Lakha, Bhawanpreet <bhawanpreet.la...@amd.com>
Cc: Zhu, Changfeng <changfeng@amd.com>
Subject: [PATCH] drm/amdgpu: add ta firmware load in psp_v12_0 for renoir

From: changzhu <changfeng@amd.com>

From: Changfeng <changfeng@amd.com>

It needs to load the renoir_ta firmware because HDCP is enabled by default
for Renoir now. This avoids the error "DTM TA is not initialized".

Change-Id: Ib2f03a531013e4b432c2e9d4ec3dc021b4f8da7d
Signed-off-by: Changfeng <changfeng@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/psp_v12_0.c | 54 ++
 1 file changed, 54 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c 
b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
index 6c9614f77d33..75489313dbad 100644
--- a/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/psp_v12_0.c
@@ -38,6 +38,8 @@
 #include "oss/osssys_4_0_sh_mask.h"

 MODULE_FIRMWARE("amdgpu/renoir_asd.bin");
+MODULE_FIRMWARE("amdgpu/renoir_ta.bin");
+
 /* address block */
 #define smnMP1_FIRMWARE_FLAGS   0x3010024

@@ -45,7 +47,10 @@ static int psp_v12_0_init_microcode(struct psp_context *psp)
 {
 struct amdgpu_device *adev = psp->adev;
 const char *chip_name;
+   char fw_name[30];
 int err = 0;
+   const struct ta_firmware_header_v1_0 *ta_hdr;
+   DRM_DEBUG("\n");

 switch (adev->asic_type) {
 case CHIP_RENOIR:
@@ -56,6 +61,55 @@ static int psp_v12_0_init_microcode(struct psp_context *psp)
 }

 err = psp_init_asd_microcode(psp, chip_name);
+   if (err)
+   goto out;
+
+   snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_ta.bin", chip_name);
+   err = request_firmware(>psp.ta_fw, fw_name, adev->dev);
+   if (err) {
+   release_firmware(adev->psp.ta_fw);
+   adev->psp.ta_fw = NULL;
+   dev_info(adev->dev,
+"psp v12.0: Failed to load firmware \"%s\"\n",
+fw_name);
+   } else {
+   err = amdgpu_ucode_validate(adev->psp.ta_fw);
+   if (err)
+   goto out2;
+
+   ta_hdr = (const struct ta_firmware_header_v1_0 *)
+adev->psp.ta_fw->data;
+   adev->psp.ta_hdcp_ucode_version =
+   le32_to_cpu(ta_hdr->ta_hdcp_ucode_version);
+   adev->psp.ta_hdcp_ucode_size =
+   le32_to_cpu(ta_hdr->ta_hdcp_size_bytes);
+   adev->psp.ta_hdcp_start_addr =
+   (uint8_t *)ta_hdr +
+   le32_to_cpu(ta_hdr->header.ucode_array_offset_bytes);
+
+   adev->psp.ta_fw_version = 
le32_to_cpu(ta_hdr->header.ucode_version);
+
+   adev->psp.ta_dtm_ucode_version =
+   le32_to_cpu(ta_hdr->ta_dtm_ucode_version);
+   adev->psp.ta_dtm_ucode_size =
+   le32_to_cpu(ta_hdr->ta_dtm_size_bytes);
+   adev->psp.ta_dtm_start_addr =
+   (uint8_t *)adev->psp.ta_hdcp_start_addr +
+   le32_to_cpu(ta_hdr->ta_dtm_offset_bytes);
+   }
+
+   return 0;
+
+out2:
+   release_firmware(adev->psp.ta_fw);
+   adev->psp.ta_fw = NULL;
+out:
+   if (err) {
+   dev_err(adev->dev,
+ 
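The hunk above follows the usual kernel firmware-load shape: request the blob, validate it, parse its header, and release it on any failure, with a missing TA blob treated as non-fatal. A stand-alone model of that control flow, with stub functions in place of request_firmware()/amdgpu_ucode_validate() (all names and return values here are illustrative, not the real kernel API):

```c
#include <assert.h>
#include <stddef.h>

/* Stub model of the firmware lifecycle used by the hunk above. */
struct firmware { int valid; };

static int request_firmware_stub(struct firmware **fw, int exists)
{
	static struct firmware blob;

	if (!exists) {
		*fw = NULL;
		return -2;	/* -ENOENT-like */
	}
	blob.valid = 1;
	*fw = &blob;
	return 0;
}

static void release_firmware_stub(struct firmware *fw)
{
	if (fw)
		fw->valid = 0;
}

static int ucode_validate_stub(const struct firmware *fw)
{
	return fw->valid ? 0 : -22;	/* -EINVAL-like */
}

/* Mirrors the patch flow: a missing TA blob is tolerated (logged via
 * dev_info in the real code, not fatal), but a blob that fails
 * validation is released before the error is returned. */
static int init_ta_microcode(struct firmware **ta_fw, int exists)
{
	int err = request_firmware_stub(ta_fw, exists);

	if (err) {
		release_firmware_stub(*ta_fw);
		*ta_fw = NULL;
		return 0;	/* optional firmware: continue without TA */
	}
	err = ucode_validate_stub(*ta_fw);
	if (err)
		goto out;
	return 0;		/* would parse ta_firmware_header_v1_0 here */
out:
	release_firmware_stub(*ta_fw);
	*ta_fw = NULL;
	return err;
}
```

The goto-based unwind matches the out/out2 labels in the patch: every failure path releases the firmware and NULLs the pointer, which also answers Alex's question — psp_sw_fini() only has to release what is still non-NULL.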

RE: [PATCH] drm/amdgpu/vcn: fix gfxoff issue

2020-04-15 Thread Zhu, Changfeng
[AMD Official Use Only - Internal Distribution Only]

After dropping this WA, Raven2 can't enter GFXOFF,
and S3 no longer works on Picasso and Raven1.

I suggest checking the chip type and dropping this WA only on Renoir.

BR,
Changfeng

-Original Message-
From: Zhang, Hawking  
Sent: Wednesday, April 15, 2020 11:05 AM
To: Zhu, James ; Alex Deucher ; Zhu, 
Changfeng ; amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking 
Subject: RE: [PATCH] drm/amdgpu/vcn: fix gfxoff issue

[AMD Official Use Only - Internal Distribution Only]

This was actually introduced at a very early stage, when we enabled GFXOFF for the
first time on the Raven platform. At that time gfxoff couldn't work with video
playback (this was a general issue across all OSes), so we disabled gfxoff whenever
there was a workload on VCN.

For most ASICs it shall be removed. The only concern is some old Raven platforms
where the rlc fw fixes are not available.

I had a quick chat with @Zhu, Changfeng, who will run a quick validation on his
old Raven platform so that we can safely remove this workaround for good.

Regards,
Hawking
-Original Message-
From: Zhu, James 
Sent: Tuesday, April 14, 2020 23:00
To: Alex Deucher ; Zhu, James ; 
Zhang, Hawking 
Cc: amd-gfx list ; Zhu, Changfeng 

Subject: Re: [PATCH] drm/amdgpu/vcn: fix gfxoff issue

+Hawking

Hi Hawking,

can we drop this WA?

Thanks!

James

On 2020-04-14 10:52 a.m., James Zhu wrote:
> +Rex
>
> This was introduced by the patch below.
>
> commit 3fded222f4bf7f4c56ef4854872a39a4de08f7a8
> Author: Rex Zhu 
> Date:   Fri Jul 27 17:00:02 2018 +0800
>
>     drm/amdgpu: Disable gfx off if VCN is busy
>
>     this patch is a workaround for the gpu hang
>     at video begin/end time if gfx off is enabled.
>
>     Reviewed-by: Hawking Zhang 
>     Signed-off-by: Rex Zhu 
>     Signed-off-by: Alex Deucher 
>
>
> On 2020-04-14 10:22 a.m., Alex Deucher wrote:
>> On Tue, Apr 14, 2020 at 8:05 AM James Zhu  wrote:
>>> Turn off gfxoff control when vcn is gated.
>>>
>>> Signed-off-by: James Zhu 
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 8 +---
>>>   1 file changed, 5 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>> index dab34f6..aa9a7a5 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>> @@ -369,9 +369,11 @@ void amdgpu_vcn_ring_begin_use(struct 
>>> amdgpu_ring *ring) cancel_delayed_work_sync(>vcn.idle_work);
>>>
>>>  mutex_lock(>vcn.vcn_pg_lock);
>>> -   amdgpu_gfx_off_ctrl(adev, false);
>>> -   amdgpu_device_ip_set_powergating_state(adev,
>>> AMD_IP_BLOCK_TYPE_VCN,
>>> -  AMD_PG_STATE_UNGATE);
>>> +   if (adev->vcn.cur_state == AMD_PG_STATE_GATE) {
>>> +   amdgpu_gfx_off_ctrl(adev, false);
>>> +   amdgpu_device_ip_set_powergating_state(adev,
>>> AMD_IP_BLOCK_TYPE_VCN,
>>> +  AMD_PG_STATE_UNGATE);
>>> +   }
>>>
>> Why are we touching gfxoff with VCN?  Was this a leftover from bring 
>> up?  Can we just drop all of this gfxoff stuff from VCN handling?  I 
>> don't see why there would be a dependency.
>>
>> Alex
>>
>>>  if (adev->pg_flags & AMD_PG_SUPPORT_VCN_DPG)    {
>>>  struct dpg_pause_state new_state;
>>> --
>>> 2.7.4
>>>


RE: [PATCH] drm/amdgpu/vcn: fix gfxoff issue

2020-04-14 Thread Zhu, Changfeng
[AMD Official Use Only - Internal Distribution Only]

+Ray

BR,
Changfeng.

-Original Message-
From: Zhu, James  
Sent: Tuesday, April 14, 2020 11:00 PM
To: Alex Deucher ; Zhu, James ; 
Zhang, Hawking 
Cc: amd-gfx list ; Zhu, Changfeng 

Subject: Re: [PATCH] drm/amdgpu/vcn: fix gfxoff issue

+Hawking

Hi Hawking,

can we drop this WA?

Thanks!

James

On 2020-04-14 10:52 a.m., James Zhu wrote:
> +Rex
>
> This was introduced by the patch below.
>
> commit 3fded222f4bf7f4c56ef4854872a39a4de08f7a8
> Author: Rex Zhu 
> Date:   Fri Jul 27 17:00:02 2018 +0800
>
>     drm/amdgpu: Disable gfx off if VCN is busy
>
>     this patch is a workaround for the gpu hang
>     at video begin/end time if gfx off is enabled.
>
>     Reviewed-by: Hawking Zhang 
>     Signed-off-by: Rex Zhu 
>     Signed-off-by: Alex Deucher 
>
>
> On 2020-04-14 10:22 a.m., Alex Deucher wrote:
>> On Tue, Apr 14, 2020 at 8:05 AM James Zhu  wrote:
>>> Turn off gfxoff control when vcn is gated.
>>>
>>> Signed-off-by: James Zhu 
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 8 +---
>>>   1 file changed, 5 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>> index dab34f6..aa9a7a5 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>>> @@ -369,9 +369,11 @@ void amdgpu_vcn_ring_begin_use(struct 
>>> amdgpu_ring *ring) cancel_delayed_work_sync(>vcn.idle_work);
>>>
>>>  mutex_lock(>vcn.vcn_pg_lock);
>>> -   amdgpu_gfx_off_ctrl(adev, false);
>>> -   amdgpu_device_ip_set_powergating_state(adev,
>>> AMD_IP_BLOCK_TYPE_VCN,
>>> -  AMD_PG_STATE_UNGATE);
>>> +   if (adev->vcn.cur_state == AMD_PG_STATE_GATE) {
>>> +   amdgpu_gfx_off_ctrl(adev, false);
>>> +   amdgpu_device_ip_set_powergating_state(adev,
>>> AMD_IP_BLOCK_TYPE_VCN,
>>> +  AMD_PG_STATE_UNGATE);
>>> +   }
>>>
>> Why are we touching gfxoff with VCN?  Was this a leftover from bring 
>> up?  Can we just drop all of this gfxoff stuff from VCN handling?  I 
>> don't see why there would be a dependency.
>>
>> Alex
>>
>>>  if (adev->pg_flags & AMD_PG_SUPPORT_VCN_DPG)    {
>>>  struct dpg_pause_state new_state;
>>> --
>>> 2.7.4
>>>


RE: [PATCH] drm/amdgpu/vcn: fix gfxoff issue

2020-04-14 Thread Zhu, Changfeng
[AMD Official Use Only - Internal Distribution Only]


Tested-by: changzhu 

BR,
Changfeng.

-Original Message-
From: Zhu, James  
Sent: Tuesday, April 14, 2020 8:05 PM
To: amd-gfx@lists.freedesktop.org
Cc: Zhu, James ; Zhu, Changfeng 
Subject: [PATCH] drm/amdgpu/vcn: fix gfxoff issue

Turn off gfxoff control when vcn is gated.

Signed-off-by: James Zhu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
index dab34f6..aa9a7a5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
@@ -369,9 +369,11 @@ void amdgpu_vcn_ring_begin_use(struct amdgpu_ring *ring)
cancel_delayed_work_sync(>vcn.idle_work);
 
mutex_lock(>vcn.vcn_pg_lock);
-   amdgpu_gfx_off_ctrl(adev, false);
-   amdgpu_device_ip_set_powergating_state(adev, AMD_IP_BLOCK_TYPE_VCN,
-  AMD_PG_STATE_UNGATE);
+   if (adev->vcn.cur_state == AMD_PG_STATE_GATE) {
+   amdgpu_gfx_off_ctrl(adev, false);
+   amdgpu_device_ip_set_powergating_state(adev, 
AMD_IP_BLOCK_TYPE_VCN,
+  AMD_PG_STATE_UNGATE);
+   }
 
if (adev->pg_flags & AMD_PG_SUPPORT_VCN_DPG){
struct dpg_pause_state new_state;
-- 
2.7.4
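The guard added above only disables gfxoff on the gated-to-ungated transition, which keeps the amdgpu_gfx_off_ctrl() enable/disable calls balanced. A hypothetical refcount model of why the unconditional call leaked disable counts (this is a sketch of the balancing argument, not the real driver state machine):

```c
#include <assert.h>

/* Hypothetical model: amdgpu_gfx_off_ctrl() keeps a disable refcount,
 * and gfxoff is only permitted while that count is zero. */
static int gfx_off_disable_count;
static int vcn_gated;	/* 1 = AMD_PG_STATE_GATE analog, 0 = ungated */

static void gfx_off_ctrl(int enable)
{
	gfx_off_disable_count += enable ? -1 : 1;
}

/* Pre-patch behaviour: every ring_begin_use disabled gfxoff again,
 * even when VCN was already ungated, so counts could leak. */
static void ring_begin_use_old(void)
{
	gfx_off_ctrl(0);
	vcn_gated = 0;
}

/* Patched behaviour: disable gfxoff only on the gated->ungated edge. */
static void ring_begin_use_new(void)
{
	if (vcn_gated) {
		gfx_off_ctrl(0);
		vcn_gated = 0;
	}
}

/* Idle work: gate VCN again and re-enable gfxoff exactly once. */
static void idle_work(void)
{
	if (!vcn_gated) {
		gfx_off_ctrl(1);
		vcn_gated = 1;
	}
}
```

In this model, two begin_use calls before one idle cycle leave the old code with a leaked disable count (gfxoff stays off forever), while the patched code returns to zero.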


RE: [PATCH] drm/amdgpu: avoid using invalidate semaphore for picasso(v2)

2019-12-10 Thread Zhu, Changfeng
[AMD Official Use Only - Internal Distribution Only]

OK, Chris.

What about SRIOV?

Should we skip using the semaphore registers for SRIOV now?

I added WREG32_SOC15_OFFSET(MMHUB, 0, mmVM_INVALIDATE_ENG0_SEM, i, 0x0);
in mmhub_v1_0_program_invalidation().

However, the problem still happens.
BR,
Changfeng,

-Original Message-
From: Koenig, Christian  
Sent: Tuesday, December 10, 2019 6:55 PM
To: Zhu, Changfeng ; amd-gfx@lists.freedesktop.org; 
Huang, Ray ; Huang, Shimmer ; Deucher, 
Alexander 
Subject: Re: [PATCH] drm/amdgpu: avoid using invalidate semaphore for 
picasso(v2)

Am 10.12.19 um 03:55 schrieb Changfeng.Zhu:
> From: changzhu 
>
> It may cause timeout waiting for sem acquire in VM flush when using 
> invalidate semaphore for picasso. So it needs to avoid using 
> invalidate semaphore for picasso.

It would probably be better to add a small helper function to decide if the 
semaphore registers should be used or not.

E.g. something like "bool gmc_v9_0_use_semaphore(adev, vmhub...)"

Apart from that looks good to me,
Christian.
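A minimal sketch of the helper Christian suggests, folding the checks scattered through the patch below into one predicate. The function name is only his example, the struct here is a trimmed-down stand-in for amdgpu_device, and the device-ID/revision details are taken from the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical trimmed-down adev: just enough fields for the predicate. */
enum { CHIP_RAVEN = 1, CHIP_OTHER = 2 };
enum { AMDGPU_GFXHUB_0 = 0, AMDGPU_MMHUB_0 = 1, AMDGPU_MMHUB_1 = 2 };

struct adev_model {
	int asic_type;
	uint32_t rev_id;
	uint16_t pdev_device;
};

/* Sketch of "bool gmc_v9_0_use_semaphore(adev, vmhub)": use the
 * invalidate semaphore only on the MMHUBs, and never on early Picasso
 * (device 0x15d8, rev < 0x8), matching the open-coded checks below. */
static bool gmc_v9_0_use_semaphore(const struct adev_model *adev, int vmhub)
{
	if (vmhub != AMDGPU_MMHUB_0 && vmhub != AMDGPU_MMHUB_1)
		return false;	/* GFXHUB semaphore still under debug */
	if (adev->asic_type == CHIP_RAVEN &&
	    adev->rev_id < 0x8 &&
	    adev->pdev_device == 0x15d8)
		return false;	/* Picasso: sem acquire times out */
	return true;
}
```

Each acquire/release site then collapses to `if (gmc_v9_0_use_semaphore(adev, vmhub)) { ... }`, so future exceptions (e.g. SRIOV) only need one line changed.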

>
> Change-Id: I6dc552bde180919cd5ba6c81c6d9e3f800043b03
> Signed-off-by: changzhu 
> ---
>   drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 28 +++
>   1 file changed, 20 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> index 231ea9762cb5..601667246a1c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> @@ -464,8 +464,11 @@ static void gmc_v9_0_flush_gpu_tlb(struct amdgpu_device 
> *adev, uint32_t vmid,
>*/
>   
>   /* TODO: It needs to continue working on debugging with semaphore for 
> GFXHUB as well. */
> - if (vmhub == AMDGPU_MMHUB_0 ||
> - vmhub == AMDGPU_MMHUB_1) {
> + if ((vmhub == AMDGPU_MMHUB_0 ||
> +  vmhub == AMDGPU_MMHUB_1) &&
> + (!(adev->asic_type == CHIP_RAVEN &&
> +adev->rev_id < 0x8 &&
> +adev->pdev->device == 0x15d8))) {
>   for (j = 0; j < adev->usec_timeout; j++) {
>   /* a read return value of 1 means semaphore acuqire */
>   tmp = RREG32_NO_KIQ(hub->vm_inv_eng0_sem + eng); @@ 
> -495,8 
> +498,11 @@ static void gmc_v9_0_flush_gpu_tlb(struct amdgpu_device *adev, 
> uint32_t vmid,
>   }
>   
>   /* TODO: It needs to continue working on debugging with semaphore for 
> GFXHUB as well. */
> - if (vmhub == AMDGPU_MMHUB_0 ||
> - vmhub == AMDGPU_MMHUB_1)
> + if ((vmhub == AMDGPU_MMHUB_0 ||
> +  vmhub == AMDGPU_MMHUB_1) &&
> + (!(adev->asic_type == CHIP_RAVEN &&
> +adev->rev_id < 0x8 &&
> +adev->pdev->device == 0x15d8)))
>   /*
>* add semaphore release after invalidation,
>* write with 0 means semaphore release @@ -527,8 +533,11 @@ 
> static uint64_t gmc_v9_0_emit_flush_gpu_tlb(struct amdgpu_ring *ring,
>*/
>   
>   /* TODO: It needs to continue working on debugging with semaphore for 
> GFXHUB as well. */
> - if (ring->funcs->vmhub == AMDGPU_MMHUB_0 ||
> - ring->funcs->vmhub == AMDGPU_MMHUB_1)
> + if ((ring->funcs->vmhub == AMDGPU_MMHUB_0 ||
> +  ring->funcs->vmhub == AMDGPU_MMHUB_1) &&
> + (!(adev->asic_type == CHIP_RAVEN &&
> +adev->rev_id < 0x8 &&
> +adev->pdev->device == 0x15d8)))
>   /* a read return value of 1 means semaphore acuqire */
>   amdgpu_ring_emit_reg_wait(ring,
> hub->vm_inv_eng0_sem + eng, 0x1, 
> 0x1); @@ -544,8 +553,11 @@ 
> static uint64_t gmc_v9_0_emit_flush_gpu_tlb(struct amdgpu_ring *ring,
>   req, 1 << vmid);
>   
>   /* TODO: It needs to continue working on debugging with semaphore for 
> GFXHUB as well. */
> - if (ring->funcs->vmhub == AMDGPU_MMHUB_0 ||
> - ring->funcs->vmhub == AMDGPU_MMHUB_1)
> + if ((ring->funcs->vmhub == AMDGPU_MMHUB_0 ||
> +  ring->funcs->vmhub == AMDGPU_MMHUB_1) &&
> + (!(adev->asic_type == CHIP_RAVEN &&
> +adev->rev_id < 0x8 &&
> +adev->pdev->device == 0x15d8)))
>   /*
>* add semaphore release after invalidation,
>* write with 0 means semaphore release


RE: [PATCH 1/2] drm/amdgpu: avoid using invalidate semaphore for SRIOV

2019-12-03 Thread Zhu, Changfeng
[AMD Official Use Only - Internal Distribution Only]

OK Chris.

I'll try and test it.

BR,
Changfeng.

-Original Message-
From: Christian König  
Sent: Tuesday, December 3, 2019 8:18 PM
To: Zhu, Changfeng ; amd-gfx@lists.freedesktop.org; 
Koenig, Christian ; Huang, Ray ; 
Huang, Shimmer ; Deucher, Alexander 

Subject: Re: [PATCH 1/2] drm/amdgpu: avoid using invalidate semaphore for SRIOV

Am 03.12.19 um 09:50 schrieb Changfeng.Zhu:
> From: changzhu 
>
> It may fail to load guest driver in round 2 when using invalidate 
> semaphore for SRIOV. So it needs to avoid using invalidate semaphore 
> for SRIOV.

That sounds like the registers are just not correctly initialized when the 
driver is reloaded.

I would just add that to mmhub_*_program_invalidation(). Something like this 
should already do it:
>     for (i = 0; i < 18; ++i) {
>             WREG32_SOC15_OFFSET(MMHUB, 0, mmVM_INVALIDATE_ENG0_ADDR_RANGE_LO32,
>                                 2 * i, 0x);
>             WREG32_SOC15_OFFSET(MMHUB, 0, mmVM_INVALIDATE_ENG0_ADDR_RANGE_HI32,
>                                 2 * i, 0x1f);

WREG32_SOC15_OFFSET(MMHUB, 0, mmVM_INVALIDATE_ENG0_SEM, i, 0x0);

>     }

Regards,
Christian.

>
> Change-Id: I8db1dc6f990fd0c458953571936467551cd4102d
> Signed-off-by: changzhu 
> ---
>   drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 21 +
>   drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  | 20 
>   2 files changed, 25 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> index 381bb709f021..d4c7d0319650 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> @@ -243,8 +243,9 @@ static void gmc_v10_0_flush_vm_hub(struct amdgpu_device 
> *adev, uint32_t vmid,
>*/
>   
>   /* TODO: It needs to continue working on debugging with semaphore for 
> GFXHUB as well. */
> - if (vmhub == AMDGPU_MMHUB_0 ||
> - vmhub == AMDGPU_MMHUB_1) {
> + if ((vmhub == AMDGPU_MMHUB_0 ||
> +  vmhub == AMDGPU_MMHUB_1) &&
> + (!amdgpu_sriov_vf(adev))) {
>   for (i = 0; i < adev->usec_timeout; i++) {
>   /* a read return value of 1 means semaphore acuqire */
>   tmp = RREG32_NO_KIQ(hub->vm_inv_eng0_sem + eng); @@ 
> -277,8 +278,9 
> @@ static void gmc_v10_0_flush_vm_hub(struct amdgpu_device *adev, uint32_t 
> vmid,
>   }
>   
>   /* TODO: It needs to continue working on debugging with semaphore for 
> GFXHUB as well. */
> - if (vmhub == AMDGPU_MMHUB_0 ||
> - vmhub == AMDGPU_MMHUB_1)
> + if ((vmhub == AMDGPU_MMHUB_0 ||
> +  vmhub == AMDGPU_MMHUB_1) &&
> + (!amdgpu_sriov_vf(adev)))
>   /*
>* add semaphore release after invalidation,
>* write with 0 means semaphore release @@ -369,6 +371,7 @@ 
> static 
> void gmc_v10_0_flush_gpu_tlb(struct amdgpu_device *adev, uint32_t vmid,
>   static uint64_t gmc_v10_0_emit_flush_gpu_tlb(struct amdgpu_ring *ring,
>unsigned vmid, uint64_t pd_addr)
>   {
> + struct amdgpu_device *adev = ring->adev;
>   struct amdgpu_vmhub *hub = >adev->vmhub[ring->funcs->vmhub];
>   uint32_t req = gmc_v10_0_get_invalidate_req(vmid, 0);
>   unsigned eng = ring->vm_inv_eng;
> @@ -381,8 +384,9 @@ static uint64_t gmc_v10_0_emit_flush_gpu_tlb(struct 
> amdgpu_ring *ring,
>*/
>   
>   /* TODO: It needs to continue working on debugging with semaphore for 
> GFXHUB as well. */
> - if (ring->funcs->vmhub == AMDGPU_MMHUB_0 ||
> - ring->funcs->vmhub == AMDGPU_MMHUB_1)
> + if ((ring->funcs->vmhub == AMDGPU_MMHUB_0 ||
> +  ring->funcs->vmhub == AMDGPU_MMHUB_1) &&
> + (!amdgpu_sriov_vf(adev)))
>   /* a read return value of 1 means semaphore acuqire */
>   amdgpu_ring_emit_reg_wait(ring,
> hub->vm_inv_eng0_sem + eng, 0x1, 
> 0x1); @@ -398,8 +402,9 @@ 
> static uint64_t gmc_v10_0_emit_flush_gpu_tlb(struct amdgpu_ring *ring,
>   req, 1 << vmid);
>   
>   /* TODO: It needs to continue working on debugging with semaphore for 
> GFXHUB as well. */
> - if (ring->funcs->vmhub == AMDGPU_MMHUB_0 ||
> - ring->funcs->vmhub == AMDGPU_MMHUB_1)
> + if ((ring->funcs->vmhub == AMDGPU_MMHUB_0 ||
> +  ring->funcs->vmhub == AMDGPU_MMHUB_1) &&
> + (!amdgpu_sriov_vf(adev)))
> 

RE: [PATCH] drm/amdgpu: invalidate mmhub semphore workaround in gmc9/gmc10

2019-11-22 Thread Zhu, Changfeng
Thanks, Ray

I'll submit the patch and continue to look into the gfxhub semaphore problem.

BR,
Changfeng. 

-Original Message-
From: Huang, Ray  
Sent: Friday, November 22, 2019 5:16 PM
To: Zhu, Changfeng 
Cc: Koenig, Christian ; Xiao, Jack 
; Zhou1, Tao ; Huang, Shimmer 
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: invalidate mmhub semphore workaround in 
gmc9/gmc10

[AMD Official Use Only - Internal Distribution Only]

On Thu, Nov 21, 2019 at 11:47:15PM +0800, Zhu, Changfeng wrote:
> From: changzhu 
> 
> It may lose gpuvm invalidate acknowldege state across power-gating off 
> cycle. To avoid this issue in gmc9/gmc10 invalidation, add semaphore 
> acquire before invalidation and semaphore release after invalidation.
> 
> After adding semaphore acquire before invalidation, the semaphore 
> register become read-only if another process try to acquire semaphore.
> Then it will not be able to release this semaphore. Then it may cause 
> deadlock problem. If this deadlock problem happens, it needs a 
> semaphore firmware fix.
> 
> Change-Id: I9942a2f451265c1f1038ccfe2f70042c7c8118af

Please remove the Change-Id; we don't use Gerrit review.

> Signed-off-by: changzhu 
> ---
>  drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 52 
> ++  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  | 52 
> ++
>  drivers/gpu/drm/amd/amdgpu/soc15.h |  4 +-
>  3 files changed, 106 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> index af2615ba52aa..e0104b985c42 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> @@ -234,6 +234,27 @@ static void gmc_v10_0_flush_vm_hub(struct amdgpu_device 
> *adev, uint32_t vmid,
>   const unsigned eng = 17;
>   unsigned int i;
>  
> + spin_lock(>gmc.invalidate_lock);
> + /*
> +  * It may lose gpuvm invalidate acknowldege state across power-gating
> +  * off cycle, add semaphore acquire before invalidation and semaphore
> +  * release after invalidation to avoid entering power gated state
> +  * to WA the Issue
> +  */

Please add a TODO here, and mention that you will continue working on debugging
with the semaphore for GFXHUB as well, and that the checking can be removed once
you have addressed the issue with the CP designer.

And the comments should be added before all of the "MMHUB" checks here.

With that fixed, the patch is Acked-by: Huang Rui 

> + if (vmhub == AMDGPU_MMHUB_0 ||
> + vmhub == AMDGPU_MMHUB_1) {
> + for (i = 0; i < adev->usec_timeout; i++) {
> + /* a read return value of 1 means semaphore acuqire */
> + tmp = RREG32_NO_KIQ(hub->vm_inv_eng0_sem + eng);
> + if (tmp & 0x1)
> + break;
> + udelay(1);
> + }
> +
> + if (i >= adev->usec_timeout)
> + DRM_ERROR("Timeout waiting for sem acquire in VM 
> flush!\n");
> + }
> +
>   WREG32_NO_KIQ(hub->vm_inv_eng0_req + eng, tmp);
>  
>   /*
> @@ -253,6 +274,16 @@ static void gmc_v10_0_flush_vm_hub(struct amdgpu_device 
> *adev, uint32_t vmid,
>   udelay(1);
>   }
>  
> + /*
> +  * add semaphore release after invalidation,
> +  * write with 0 means semaphore release
> +  */
> + if (vmhub == AMDGPU_MMHUB_0 ||
> + vmhub == AMDGPU_MMHUB_1)
> + WREG32_NO_KIQ(hub->vm_inv_eng0_sem + eng, 0);
> +
> + spin_unlock(>gmc.invalidate_lock);
> +
>   if (i < adev->usec_timeout)
>   return;
>  
> @@ -338,6 +369,19 @@ static uint64_t gmc_v10_0_emit_flush_gpu_tlb(struct 
> amdgpu_ring *ring,
>   uint32_t req = gmc_v10_0_get_invalidate_req(vmid, 0);
>   unsigned eng = ring->vm_inv_eng;
>  
> + /*
> +  * It may lose gpuvm invalidate acknowldege state across power-gating
> +  * off cycle, add semaphore acquire before invalidation and semaphore
> +  * release after invalidation to avoid entering power gated state
> +  * to WA the Issue
> +  */
> +
> + /* a read return value of 1 means semaphore acuqire */
> + if (ring->funcs->vmhub == AMDGPU_MMHUB_0 ||
> + ring->funcs->vmhub == AMDGPU_MMHUB_1)
> + amdgpu_ring_emit_reg_wait(ring,
> +   hub->vm_inv_eng0_sem + eng, 0x1, 0x1);
> +
>   amdgpu_ring_emit_wreg(ring, hub->ctx0_ptb_addr_lo32 + (2 * vmid),
> lower_32_bits(pd_addr));
>  
> @@ -348,6 +392,14 @@ static uint64_t g
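The acquire added in the hunks above is a plain register poll with a timeout budget: read the SEM register until it returns 1 or adev->usec_timeout polls elapse. A stand-alone model of that loop, with the hardware register simulated by a counter (the grant behaviour here is purely illustrative):

```c
#include <assert.h>
#include <stdint.h>

#define USEC_TIMEOUT 100	/* stands in for adev->usec_timeout */

/* Simulated VM_INVALIDATE_ENG0_SEM register: reads return 0 until the
 * hardware grants the semaphore, then 1. In this model the grant
 * arrives after a fixed number of polls. */
static int grant_after;
static int polls;

static uint32_t rreg_sem(void)
{
	return ++polls >= grant_after ? 0x1 : 0x0;
}

static void wreg_sem(uint32_t v)
{
	(void)v;	/* writing 0 models the semaphore release */
}

/* Mirrors the loop added in gmc_v10_0_flush_vm_hub(): poll until a read
 * returns 1 (acquired) or the timeout budget is exhausted. */
static int acquire_inv_semaphore(void)
{
	int i;

	for (i = 0; i < USEC_TIMEOUT; i++) {
		if (rreg_sem() & 0x1)
			return 0;	/* semaphore acquired */
		/* udelay(1) in the real code */
	}
	return -1;	/* "Timeout waiting for sem acquire in VM flush!" */
}
```

This also shows the deadlock the commit message warns about: if another owner never releases, every read keeps returning 0 and the loop can only fail by timeout — hence the note that a stuck semaphore needs a firmware fix, not a longer loop.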

RE: [PATCH 2/2] drm/amdgpu: invalidate mmhub semphore workaround in gmc9/gmc10

2019-11-21 Thread Zhu, Changfeng
Hi Chris,

I have removed DRM_WARN_ONCE.

I think we can land the mmhub patch first. It's for ticket:
http://ontrack-internal.amd.com/browse/SWDEV-201459

According to Yang, Zilong's comments on this issue,
GFXHUB is not affected by the bug and thus doesn't need the w/a.

I'll look into the gfxhub hang root cause with Lisa later.

Could you please help review my new patch (with DRM_WARN_ONCE removed)?

BR,
Changfeng.


-Original Message-
From: Christian König  
Sent: Wednesday, November 20, 2019 7:27 PM
To: Zhu, Changfeng ; Koenig, Christian 
; Xiao, Jack ; Zhou1, Tao 
; Huang, Ray ; Huang, Shimmer 
; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 2/2] drm/amdgpu: invalidate mmhub semphore workaround in 
gmc9/gmc10

Am 20.11.19 um 10:44 schrieb Changfeng.Zhu:
> From: changzhu 
>
> It may lose gpuvm invalidate acknowldege state across power-gating off 
> cycle. To avoid this issue in gmc9/gmc10 invalidation, add semaphore 
> acquire before invalidation and semaphore release after invalidation.
>
> After adding semaphore acquire before invalidation, the semaphore 
> register become read-only if another process try to acquire semaphore.
> Then it will not be able to release this semaphore. Then it may cause 
> deadlock problem. If this deadlock problem happens, it needs a 
> semaphore firmware fix.

Please remove the DRM_WARN_ONCE; that looks like overkill to me.

And I'm not sure how urgent this issue is. We could also wait a few more days and
see if the hw guys figure out why this locks up on the GFX ring.

Regards,
Christian.

>
> Change-Id: I9942a2f451265c1f1038ccfe2f70042c7c8118af
> Signed-off-by: changzhu 
> ---
>   drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 49 ++
>   drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  | 49 ++
>   drivers/gpu/drm/amd/amdgpu/soc15.h |  4 +--
>   3 files changed, 100 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> index af2615ba52aa..685d0d5ef31e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> @@ -234,6 +234,24 @@ static void gmc_v10_0_flush_vm_hub(struct amdgpu_device 
> *adev, uint32_t vmid,
>   const unsigned eng = 17;
>   unsigned int i;
>   
> + spin_lock(>gmc.invalidate_lock);
> + /*
> +  * It may lose gpuvm invalidate acknowldege state across power-gating
> +  * off cycle, add semaphore acquire before invalidation and semaphore
> +  * release after invalidation to avoid entering power gated state
> +  * to WA the Issue
> +  */
> + for (i = 0; i < adev->usec_timeout; i++) {
> + /* a read return value of 1 means semaphore acuqire */
> + tmp = RREG32_NO_KIQ(hub->vm_inv_eng0_sem + eng);
> + if (tmp & 0x1)
> + break;
> + udelay(1);
> + }
> +
> + if (i >= adev->usec_timeout)
> + DRM_ERROR("Timeout waiting for sem acquire in VM flush!\n");
> +
>   WREG32_NO_KIQ(hub->vm_inv_eng0_req + eng, tmp);
>   
>   /*
> @@ -253,6 +271,14 @@ static void gmc_v10_0_flush_vm_hub(struct amdgpu_device 
> *adev, uint32_t vmid,
>   udelay(1);
>   }
>   
> + /*
> +  * add semaphore release after invalidation,
> +  * write with 0 means semaphore release
> +  */
> + WREG32_NO_KIQ(hub->vm_inv_eng0_sem + eng, 0);
> +
> + spin_unlock(&adev->gmc.invalidate_lock);
> +
>   if (i < adev->usec_timeout)
>   return;
>   
> @@ -338,6 +364,21 @@ static uint64_t gmc_v10_0_emit_flush_gpu_tlb(struct 
> amdgpu_ring *ring,
>   uint32_t req = gmc_v10_0_get_invalidate_req(vmid, 0);
>   unsigned eng = ring->vm_inv_eng;
>   
> + /*
> +  * It may lose gpuvm invalidate acknowledge state across power-gating
> +  * off cycle, add semaphore acquire before invalidation and semaphore
> +  * release after invalidation to avoid entering power gated state
> +  * to WA the issue
> +  */
> +
> + /* a read return value of 1 means semaphore acquire */
> + if (ring->funcs->vmhub == AMDGPU_MMHUB_0 ||
> + ring->funcs->vmhub == AMDGPU_MMHUB_1) {
> + amdgpu_ring_emit_reg_wait(ring,
> +   hub->vm_inv_eng0_sem + eng, 0x1, 0x1);
> + DRM_WARN_ONCE("Adding semaphore may cause deadlock and it needs 
> firmware fix\n");
> + }
> +
>   amdgpu_ring_emit_wreg(ring, hub->ctx0_ptb_addr_lo32 + (2 * vmid),
> lower_32_bits(pd_addr));
>   
> @@ -348,6 +389,14 @@ static uint6

RE: Re: Re: Re: [PATCH 1/2] drm/amdgpu: invalidate mmhub semaphore workaround in amdgpu_virt

2019-11-20 Thread Zhu, Changfeng
Well, I'll wait for help from the IPE GFX team, try to apply this to the GFXHUB as well, and 
then perfect these invalidate semaphore patches.

If the SRIOV team wants to enable the invalidate semaphore in the future, it can try to take 
this patch back at that time.

BR,
Changfeng.

-Original Message-
From: Christian König  
Sent: Wednesday, November 20, 2019 10:39 PM
To: Liu, Monk ; Zhu, Changfeng ; 
Koenig, Christian ; Xiao, Jack ; 
Zhou1, Tao ; Huang, Ray ; Huang, Shimmer 
; amd-gfx@lists.freedesktop.org
Subject: Re: Re: Re: Re: [PATCH 1/2] drm/amdgpu: invalidate mmhub semaphore 
workaround in amdgpu_virt

Hi Monk,

the KIQ is used to invalidate both the GFXHUB and the MMHUB on Vega.

> Besides, amdgpu_virt_kiq_reg_write_reg_wait() is not strictly a helper 
> function that only serves VM invalidation, so I don't think you should 
> put the semaphore read/write in this routine; instead you can put the 
> semaphore r/w outside of this routine and only put it around the VM 
> invalidate logic
Yes, agreed. But since we now know that we won't need that, we can just drop this 
patch altogether.

Regards,
Christian.

On 20.11.19 15:30, Liu, Monk wrote:
> Thanks for sharing this JIRA ticket.
>
> Now I get the picture of this issue from you and Christian.
>
> So grabbing the semaphore can prevent the RTL from powering off the MMHUB, I 
> see.
>
> The practice is that SRIOV won't enable PG at all (even our GIM driver 
> won't enable PG; maybe in the future we will enable it).
>
> I think I don't have too many concerns about your patches,
>
> But I have comments on your patch 1:
>
> void amdgpu_virt_kiq_reg_write_reg_wait(struct amdgpu_device *adev,
>   uint32_t reg0, uint32_t reg1,
> - uint32_t ref, uint32_t mask)
> + uint32_t ref, uint32_t mask,
> + uint32_t sem)
>   {
> struct amdgpu_kiq *kiq = &adev->gfx.kiq;
> struct amdgpu_ring *ring = &kiq->ring; @@ -144,9 +145,30 @@ void 
> amdgpu_virt_kiq_reg_write_reg_wait(struct amdgpu_device *adev,
>   uint32_t seq;
>   
> spin_lock_irqsave(&kiq->ring_lock, flags);
> - amdgpu_ring_alloc(ring, 32);
> + amdgpu_ring_alloc(ring, 60);
> +
> + /*
> +  * It may lose gpuvm invalidate acknowledge state across power-gating
> +  * off cycle, add semaphore acquire before invalidation and semaphore
> +  * release after invalidation to avoid entering power gated state
> +  * to WA the issue
> +  */
> +
> + /* a read return value of 1 means semaphore acquire */
> + if (ring->funcs->vmhub == AMDGPU_MMHUB_0 ||
> + ring->funcs->vmhub == AMDGPU_MMHUB_1)
> + amdgpu_ring_emit_reg_wait(ring, sem, 0x1, 0x1);
>
>
> See that in this routine, the ring is always KIQ, so below code looks 
> redundant :
>
> + /* a read return value of 1 means semaphore acquire */
> + if (ring->funcs->vmhub == AMDGPU_MMHUB_0 ||
> + ring->funcs->vmhub == AMDGPU_MMHUB_1)
> + amdgpu_ring_emit_reg_wait(ring, sem, 0x1, 0x1);
>
> Besides, amdgpu_virt_kiq_reg_write_reg_wait() is not strictly a helper 
> function that only serves VM invalidation, so I don't think you should 
> put the semaphore read/write in this routine; instead you can put the 
> semaphore r/w outside of this routine and only put it around the VM 
> invalidate logic
>
> Thanks
>
> -Original Message-
> From: Zhu, Changfeng 
> Sent: November 20, 2019 22:17
> To: Koenig, Christian ; Liu, Monk 
> ; Xiao, Jack ; Zhou1, Tao 
> ; Huang, Ray ; Huang, Shimmer 
> ; amd-gfx@lists.freedesktop.org
> Subject: RE: Re: Re: Re: [PATCH 1/2] drm/amdgpu: invalidate mmhub semaphore 
> workaround in amdgpu_virt
>
>>>> Did Changfeng already hit this issue under SRIOV ???
> I met this problem on navi14 in gmc_v10_0_emit_flush_gpu_tlb.
> The problem is also seen by Zhou, Tao.
>
> And this is ticket:
> http://ontrack-internal.amd.com/browse/SWDEV-201459
>
> After the semaphore patch, the problem can be fixed.
>
> If the SRIOV team has concerns about this problem, it should not add the semaphore in SRIOV.
>
> However, we should apply semaphore for gmc_v9_0_flush_gpu_tlb/ 
> gmc_v9_0_emit_flush_gpu_tlb/ gmc_v10_0_flush_gpu_tlb/ 
> gmc_v10_0_emit_flush_gpu_tlb
>
> Or how can we handle the ticket above?
>
> BR,
> Changfeng.
>
> -Original Message-
> From: Christian König 
> Sent: Wednesday, November 20, 2019 10:00 PM
> To: Liu, Monk ; Koenig, Christian 
> ; Zhu, Changfeng ; 
> Xiao, Jack ; Zhou1, Tao ; Huang, 
> Ray ; Huang, Shimmer ; 
> amd-gfx@lists.freedesktop.org
> Subject: Re: 答复: 答复: [PATCH 1/2] drm/amdgpu: invalidate mmhub semphore 
> workaround in amdgpu_virt
>
>> Did C

RE: Re: Re: [PATCH 1/2] drm/amdgpu: invalidate mmhub semaphore workaround in amdgpu_virt

2019-11-20 Thread Zhu, Changfeng
>>> Did Changfeng already hit this issue under SRIOV ???

I met this problem on navi14 in gmc_v10_0_emit_flush_gpu_tlb.
The problem is also seen by Zhou, Tao.

And this is ticket:
http://ontrack-internal.amd.com/browse/SWDEV-201459

After the semaphore patch, the problem can be fixed.

If the SRIOV team has concerns about this problem, it should not add the semaphore in SRIOV.

However, we should apply semaphore for gmc_v9_0_flush_gpu_tlb/ 
gmc_v9_0_emit_flush_gpu_tlb/ gmc_v10_0_flush_gpu_tlb/ 
gmc_v10_0_emit_flush_gpu_tlb

Or how can we handle the ticket above?

BR,
Changfeng.

-Original Message-
From: Christian König  
Sent: Wednesday, November 20, 2019 10:00 PM
To: Liu, Monk ; Koenig, Christian ; 
Zhu, Changfeng ; Xiao, Jack ; Zhou1, 
Tao ; Huang, Ray ; Huang, Shimmer 
; amd-gfx@lists.freedesktop.org
Subject: Re: Re: Re: [PATCH 1/2] drm/amdgpu: invalidate mmhub semaphore 
workaround in amdgpu_virt

> Did Changfeng already hit this issue under SRIOV ?
I don't think so, but Changfeng needs to answer this.

The question is: does the extra semaphore acquire have some negative effect on SRIOV?

I would like to avoid having even more SRIOV-specific handling in here, which we 
can't really test on bare metal.

Christian.

On 20.11.19 14:54, Liu, Monk wrote:
> Hah, but in the SRIOV case, our guest KMD driver is not allowed to do such 
> things (and even if there is a bug where the KMD tries to power gate, the 
> SMU firmware would not really do the job since we have a PSP L1 policy 
> to prevent those dangerous operations).
>
> Did Changfeng already hit this issue under SRIOV ???
>
> -Original Message-
> From: Koenig, Christian 
> Sent: November 20, 2019 21:21
> To: Liu, Monk ; Zhu, Changfeng 
> ; Xiao, Jack ; Zhou1, Tao 
> ; Huang, Ray ; Huang, Shimmer 
> ; amd-gfx@lists.freedesktop.org
> Subject: Re: Re: [PATCH 1/2] drm/amdgpu: invalidate mmhub semaphore 
> workaround in amdgpu_virt
>
> Hi Monk,
>
> this is a fix for power gating the MMHUB.
>
> The basic problem is that the MMHUB can power gate while an invalidation is in 
> progress, which loses all bits in the ACK register and so deadlocks the 
> engine waiting for the invalidation to finish.
>
> This bug is hit immediately when we enable power gating of the MMHUB.
>
> Regards,
> Christian.
>
> On 20.11.19 14:18, Liu, Monk wrote:
>> Hi Changfeng
>>
>> First of all, there is no power-gating off cycle involved in AMDGPU 
>> SRIOV, since we don't allow the VF/VM to do such things, so it seems strange 
>> to me that you post something like this, especially on the VEGA10 series, 
>> which doesn't look like it has any issue in those gpu_flush parts.
>>
>> Here are my questions for you:
>> 1) Can you point me to what issue you have been experiencing, and how to 
>> repro the bug?
>> 2) If you do hit some issues, did you verify that your patch can fix it?
>>
>> besides
>>
>> /Monk
>>
>> -Original Message-
>> From: amd-gfx  on behalf of Changfeng.Zhu
>> Sent: November 20, 2019 17:14
>> To: Koenig, Christian ; Xiao, Jack 
>> ; Zhou1, Tao ; Huang, Ray 
>> ; Huang, Shimmer ; 
>> amd-gfx@lists.freedesktop.org
>> Cc: Zhu, Changfeng 
>> Subject: [PATCH 1/2] drm/amdgpu: invalidate mmhub semaphore workaround in 
>> amdgpu_virt
>>
>> From: changzhu 
>>
>> It may lose gpuvm invalidate acknowledge state across power-gating off 
>> cycle. To avoid this issue in virt invalidation, add semaphore acquire 
>> before invalidation and semaphore release after invalidation.
>>
>> Change-Id: Ie98304e475166b53eed033462d76423b6b0fc25b
>> Signed-off-by: changzhu 
>> ---
>>drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 26 ++--  
>> drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h |  3 ++-
>>drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c|  3 ++-
>>3 files changed, 28 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
>> index f04eb1a64271..70ffaf91cd12 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
>> @@ -135,7 +135,8 @@ void amdgpu_virt_kiq_wreg(struct amdgpu_device 
>> *adev, uint32_t reg, uint32_t v)
>>
>>void amdgpu_virt_kiq_reg_write_reg_wait(struct amdgpu_device *adev,
>>  uint32_t reg0, uint32_t reg1,
>> -uint32_t ref, uint32_t mask)
>> +uint32_t ref, uint32_t mask,
>> +uint32_t sem)
>>{
>>  struct amdgpu_kiq *kiq = &adev->gfx.kiq;
>>  struct amdgpu_ring *ring = &kiq->ring; @@ -144,9 +145,30 @@ void 
>>

Recall: [PATCH 2/2] drm/amdgpu: invalidate mmhub semaphore workaround in gmc9/gmc10

2019-11-20 Thread Zhu, Changfeng
Zhu, Changfeng would like to recall the message, "[PATCH 2/2] drm/amdgpu: 
invalidate mmhub semaphore workaround in gmc9/gmc10".
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

RE: [PATCH 2/2] drm/amdgpu: invalidate semaphore mmhub workaround for gfx9/gfx10

2019-11-15 Thread Zhu, Changfeng

On 14.11.19 11:17, Changfeng.Zhu wrote:
> From: changzhu 
>
> MMHUB may lose GPUVM invalidate acknowledge state across power-gating 
> off cycle when it does invalidation req/ack work.
>
> So we must acquire/release one of the vm_invalidate_eng*_sem around 
> the invalidation req/ack.
>
> Besides, vm_invalidate_eng*_sem will be read-only after acquiring it. So 
> it may cause a deadlock when one process acquires 
> vm_invalidate_eng*_sem and another process acquires the same 
> vm_invalidate_eng*_sem immediately.
>
> To guard against this deadlock, it needs to add a spinlock when doing 
> invalidation req/ack.
>
> Change-Id: Ica63593e1dc26444ac9c05cced0988515082def3
> Signed-off-by: changzhu 
> ---
>   drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 60 -
>   drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  | 90 +-
>   drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c |  8 ++-
>   drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c  |  8 ++-
>   drivers/gpu/drm/amd/amdgpu/vce_v4_0.c  |  4 +-
>   drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c  | 12 +++-
>   drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c  | 12 +++-
>   drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c  | 12 +++-
>   8 files changed, 190 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> index af2615ba52aa..b7948c63ad0d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> @@ -29,6 +29,7 @@
>   #include "hdp/hdp_5_0_0_sh_mask.h"
>   #include "gc/gc_10_1_0_sh_mask.h"
>   #include "mmhub/mmhub_2_0_0_sh_mask.h"
> +#include "mmhub/mmhub_2_0_0_offset.h"
>   #include "dcn/dcn_2_0_0_offset.h"
>   #include "dcn/dcn_2_0_0_sh_mask.h"
>   #include "oss/osssys_5_0_0_offset.h"
> @@ -232,7 +233,30 @@ static void gmc_v10_0_flush_vm_hub(struct amdgpu_device 
> *adev, uint32_t vmid,
>   u32 tmp = gmc_v10_0_get_invalidate_req(vmid, flush_type);
>   /* Use register 17 for GART */
>   const unsigned eng = 17;
> - unsigned int i;
> + unsigned int i, j;
> + uint32_t vm_inv_eng0_sem = SOC15_REG_OFFSET(MMHUB, 0,
> + mmMMVM_INVALIDATE_ENG0_SEM);

Please add that register to the amdgpu_vmhub structure instead.

Thanks, Chris. It's better to change this here according to your advice.

> +
> + spin_lock(&adev->gmc.invalidate_lock);
> +
> + /*
> +  * mmhub loses gpuvm invalidate acknowledge state across power-gating
> +  * off cycle, add semaphore acquire before invalidation and semaphore
> +  * release after invalidation to avoid mmhub entering power gated
> +  * state to WA the issue
> +  */
> + if (vmhub == AMDGPU_MMHUB_0 || vmhub == AMDGPU_MMHUB_1) {

You can apply that to the GFX hub as well.

-This patch is a fix for the mmhub issue: 
http://ontrack-internal.amd.com/browse/SWDEV-201459
I don't think I should apply it to the gfxhub here as well.


> + for (j = 0; j < adev->usec_timeout; j++) {
> + /* a read return value of 1 means semaphore acquire */
> + tmp = RREG32_NO_KIQ(vm_inv_eng0_sem + eng);
> + if (tmp & 0x1)
> + break;
> + udelay(1);
> + }
> +
> + if (j >= adev->usec_timeout)
> + DRM_ERROR("Timeout waiting for sem acquire in VM flush!\n");
> + }
>   
>   WREG32_NO_KIQ(hub->vm_inv_eng0_req + eng, tmp);
>   
> @@ -253,6 +277,15 @@ static void gmc_v10_0_flush_vm_hub(struct amdgpu_device 
> *adev, uint32_t vmid,
>   udelay(1);
>   }
>   
> + /*
> +  * add semaphore release after invalidation,
> +  * write with 0 means semaphore release
> +  */
> + if (vmhub == AMDGPU_MMHUB_0 || vmhub == AMDGPU_MMHUB_1)
> + WREG32_NO_KIQ(vm_inv_eng0_sem + eng, 0);
> +
> + spin_unlock(&adev->gmc.invalidate_lock);
> +
>   if (i < adev->usec_timeout)
>   return;
>   
> @@ -334,9 +367,26 @@ static void gmc_v10_0_flush_gpu_tlb(struct amdgpu_device 
> *adev, uint32_t vmid,
>   static uint64_t gmc_v10_0_emit_flush_gpu_tlb(struct amdgpu_ring *ring,
>unsigned vmid, uint64_t pd_addr)
>   {
> + struct amdgpu_device *adev = ring->adev;
> struct amdgpu_vmhub *hub = &ring->adev->vmhub[ring->funcs->vmhub];
>   uint32_t req = gmc_v10_0_get_invalidate_req(vmid, 0);
>   unsigned eng = ring->vm_inv_eng;
> + uint32_t vm_inv_eng0_sem = SOC15_REG_OFFSET(MMHUB, 0,
> + mmMMVM_INVALIDATE_ENG0_SEM);
> +
> + spin_lock(&adev->gmc.invalidate_lock);

Taking the lock is completely superfluous here.

--The purpose of this lock is to avoid deadlock. According to @Xiao, 
Jack, vm_invalidate_eng*_sem will be read-only after acquiring it. So 
it may cause a deadlock when one process acquires vm_invalidate_eng*_sem and 
another process acquires the same 
vm_invalidate_eng*_sem immediately. Because 

[PATCH] drm/amdgpu: allow direct upload save restore list for raven2

2019-11-07 Thread Zhu, Changfeng
From: changzhu 

It will cause a modprobe atombios stuck problem on raven2 if it doesn't
allow direct upload of the save restore list from the gfx driver.
So it needs to allow direct upload of the save restore list for raven2
temporarily.

Change-Id: I1fece1b9c61f7a13eec948f34eb60a9120046bc2
Signed-off-by: changzhu 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 4ed31e9a398c..dde9713c1d67 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -2729,7 +2729,9 @@ static void gfx_v9_0_init_pg(struct amdgpu_device *adev)
 * And it's needed by gfxoff feature.
 */
if (adev->gfx.rlc.is_rlc_v2_1) {
-   if (adev->asic_type == CHIP_VEGA12)
+   if (adev->asic_type == CHIP_VEGA12 ||
+   (adev->asic_type == CHIP_RAVEN &&
+adev->rev_id >= 8))
gfx_v9_1_init_rlc_save_restore_list(adev);
gfx_v9_0_enable_save_restore_machine(adev);
}
-- 
2.17.1


RE: [PATCH 1/2] drm/amdgpu: add dummy read by engines for some GCVM status registers in gfx10

2019-11-06 Thread Zhu, Changfeng


-Original Message-
From: amd-gfx  On Behalf Of Zhu, 
Changfeng
Sent: Wednesday, November 6, 2019 8:50 PM
To: Koenig, Christian ; 
amd-gfx@lists.freedesktop.org; Tuikov, Luben ; Huang, Ray 
; Huang, Shimmer 
Subject: RE: [PATCH 1/2] drm/amdgpu: add dummy read by engines for some GCVM 
status registers in gfx10

Thanks, Chris.
You give many good suggestions and guidance for me to finish this patch.

BR,
Changfeng.

-Original Message-
From: Koenig, Christian 
Sent: Wednesday, November 6, 2019 8:48 PM
To: Zhu, Changfeng ; amd-gfx@lists.freedesktop.org; 
Tuikov, Luben ; Huang, Ray ; Huang, 
Shimmer 
Subject: Re: [PATCH 1/2] drm/amdgpu: add dummy read by engines for some GCVM 
status registers in gfx10

On 06.11.19 11:52, Zhu, Changfeng wrote:
> From: changzhu 
>
> The GRBM register interface is now capable of bursting 1 cycle per 
> register wr->wr, wr->rd, much faster than the previous multicycle per 
> transaction done interface.  This has caused a problem where status 
> registers requiring HW to update have a 1 cycle delay, due to the 
> register update having to go through GRBM.
>
> For cp ucode, it has realized the dummy read in cp firmware. It covers the 
> use of the WAIT_REG_MEM operation 1 case only, so it needs to call 
> gfx_v10_0_wait_reg_mem in gfx10. Besides, it also needs to add a warning 
> to update the firmware in case the firmware is too old to have the function to 
> realize the dummy read in cp firmware.
>
> For sdma ucode, it hasn't realized the dummy read in sdma firmware. sdma 
> is moved to the gfxhub in gfx10. So it needs to add a dummy read in the driver 
> between amdgpu_ring_emit_wreg and amdgpu_ring_emit_reg_wait for sdma_v5_0.
>
> Change-Id: Ie028f37eb789966d4593984bd661b248ebeb1ac3
> Signed-off-by: changzhu 

Reviewed-by: Christian König 

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h |  1 +
>   drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c  | 48 +
>   drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c  |  8 ++---
>   drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c  | 13 ++-
>   4 files changed, 64 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
> index 459aa9059542..a74ecd449775 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
> @@ -267,6 +267,7 @@ struct amdgpu_gfx {
>   uint32_t mec2_feature_version;
>   bool mec_fw_write_wait;
>   bool me_fw_write_wait;
> + bool cp_fw_write_wait;
>   struct amdgpu_ring  gfx_ring[AMDGPU_MAX_GFX_RINGS];
>   unsigned num_gfx_rings;
>   struct amdgpu_ring  compute_ring[AMDGPU_MAX_COMPUTE_RINGS];
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> index 17a5cbfd0024..c7a6f98bf6b8 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> @@ -561,6 +561,32 @@ static void gfx_v10_0_free_microcode(struct 
> amdgpu_device *adev)
>   kfree(adev->gfx.rlc.register_list_format);
>   }
>   
> +static void gfx_v10_0_check_fw_write_wait(struct amdgpu_device *adev) 
> +{
> + adev->gfx.cp_fw_write_wait = false;
> +
> + switch (adev->asic_type) {
> + case CHIP_NAVI10:
> + case CHIP_NAVI12:
> + case CHIP_NAVI14:
> + if ((adev->gfx.me_fw_version >= 0x0046) &&
> + (adev->gfx.me_feature_version >= 27) &&
> + (adev->gfx.pfp_fw_version >= 0x0068) &&
> + (adev->gfx.pfp_feature_version >= 27) &&
> + (adev->gfx.mec_fw_version >= 0x005b) &&
> + (adev->gfx.mec_feature_version >= 27))
> + adev->gfx.cp_fw_write_wait = true;
> + break;
> + default:
> + break;
> + }
> +
> + if (adev->gfx.cp_fw_write_wait == false)
> + DRM_WARN_ONCE("Warning: check cp_fw_version and update it to 
> realize \
> +   GRBM requires 1-cycle delay in cp firmware\n"); }
> +
> +
>   static void gfx_v10_0_init_rlc_ext_microcode(struct amdgpu_device *adev)
>   {
>   const struct rlc_firmware_header_v2_1 *rlc_hdr; @@ -829,6 +855,7 @@ 
> static int gfx_v10_0_init_microcode(struct amdgpu_device *adev)
>   }
>   }
>   
> + gfx_v10_0_check_fw_write_wait(adev);
>   out:
>   if (err) {
>   dev_err(adev->dev,
> @@ -4768,6 +4795,24 @@ static void gfx_v10_0_ring_emit_reg_wait(struct 
> amdgpu_ring *ring, uint3

RE: [PATCH 1/2] drm/amdgpu: add dummy read by engines for some GCVM status registers in gfx10

2019-11-06 Thread Zhu, Changfeng
Thanks, Chris.

BR,
Changfeng.

-Original Message-
From: Koenig, Christian  
Sent: Wednesday, November 6, 2019 8:48 PM
To: Zhu, Changfeng ; amd-gfx@lists.freedesktop.org; 
Tuikov, Luben ; Huang, Ray ; Huang, 
Shimmer 
Subject: Re: [PATCH 1/2] drm/amdgpu: add dummy read by engines for some GCVM 
status registers in gfx10

On 06.11.19 11:52, Zhu, Changfeng wrote:
> From: changzhu 
>
> The GRBM register interface is now capable of bursting 1 cycle per 
> register wr->wr, wr->rd, much faster than the previous multicycle per 
> transaction done interface.  This has caused a problem where status 
> registers requiring HW to update have a 1 cycle delay, due to the 
> register update having to go through GRBM.
>
> For cp ucode, it has realized the dummy read in cp firmware. It covers the 
> use of the WAIT_REG_MEM operation 1 case only, so it needs to call 
> gfx_v10_0_wait_reg_mem in gfx10. Besides, it also needs to add a warning 
> to update the firmware in case the firmware is too old to have the function to 
> realize the dummy read in cp firmware.
>
> For sdma ucode, it hasn't realized the dummy read in sdma firmware. sdma 
> is moved to the gfxhub in gfx10. So it needs to add a dummy read in the driver 
> between amdgpu_ring_emit_wreg and amdgpu_ring_emit_reg_wait for sdma_v5_0.
>
> Change-Id: Ie028f37eb789966d4593984bd661b248ebeb1ac3
> Signed-off-by: changzhu 

Reviewed-by: Christian König 

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h |  1 +
>   drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c  | 48 +
>   drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c  |  8 ++---
>   drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c  | 13 ++-
>   4 files changed, 64 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
> index 459aa9059542..a74ecd449775 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
> @@ -267,6 +267,7 @@ struct amdgpu_gfx {
>   uint32_t mec2_feature_version;
>   bool mec_fw_write_wait;
>   bool me_fw_write_wait;
> + bool cp_fw_write_wait;
>   struct amdgpu_ring  gfx_ring[AMDGPU_MAX_GFX_RINGS];
>   unsigned num_gfx_rings;
>   struct amdgpu_ring  compute_ring[AMDGPU_MAX_COMPUTE_RINGS];
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> index 17a5cbfd0024..c7a6f98bf6b8 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> @@ -561,6 +561,32 @@ static void gfx_v10_0_free_microcode(struct 
> amdgpu_device *adev)
>   kfree(adev->gfx.rlc.register_list_format);
>   }
>   
> +static void gfx_v10_0_check_fw_write_wait(struct amdgpu_device *adev) 
> +{
> + adev->gfx.cp_fw_write_wait = false;
> +
> + switch (adev->asic_type) {
> + case CHIP_NAVI10:
> + case CHIP_NAVI12:
> + case CHIP_NAVI14:
> + if ((adev->gfx.me_fw_version >= 0x0046) &&
> + (adev->gfx.me_feature_version >= 27) &&
> + (adev->gfx.pfp_fw_version >= 0x0068) &&
> + (adev->gfx.pfp_feature_version >= 27) &&
> + (adev->gfx.mec_fw_version >= 0x005b) &&
> + (adev->gfx.mec_feature_version >= 27))
> + adev->gfx.cp_fw_write_wait = true;
> + break;
> + default:
> + break;
> + }
> +
> + if (adev->gfx.cp_fw_write_wait == false)
> + DRM_WARN_ONCE("Warning: check cp_fw_version and update it to 
> realize \
> +   GRBM requires 1-cycle delay in cp firmware\n"); }
> +
> +
>   static void gfx_v10_0_init_rlc_ext_microcode(struct amdgpu_device *adev)
>   {
>   const struct rlc_firmware_header_v2_1 *rlc_hdr; @@ -829,6 +855,7 @@ 
> static int gfx_v10_0_init_microcode(struct amdgpu_device *adev)
>   }
>   }
>   
> + gfx_v10_0_check_fw_write_wait(adev);
>   out:
>   if (err) {
>   dev_err(adev->dev,
> @@ -4768,6 +4795,24 @@ static void gfx_v10_0_ring_emit_reg_wait(struct 
> amdgpu_ring *ring, uint32_t reg,
>   gfx_v10_0_wait_reg_mem(ring, 0, 0, 0, reg, 0, val, mask, 0x20);
>   }
>   
> +static void gfx_v10_0_ring_emit_reg_write_reg_wait(struct amdgpu_ring *ring,
> +uint32_t reg0, uint32_t reg1,
> +uint32_t ref, uin

[PATCH 1/2] drm/amdgpu: add dummy read by engines for some GCVM status registers in gfx10

2019-11-06 Thread Zhu, Changfeng
From: changzhu 

The GRBM register interface is now capable of bursting 1 cycle per
register wr->wr, wr->rd, much faster than the previous multicycle per
transaction done interface.  This has caused a problem where
status registers requiring HW to update have a 1 cycle delay, due
to the register update having to go through GRBM.

For cp ucode, it has realized the dummy read in cp firmware. It covers
the use of the WAIT_REG_MEM operation 1 case only, so it needs to call
gfx_v10_0_wait_reg_mem in gfx10. Besides, it also needs to add a warning to
update the firmware in case the firmware is too old to have the function to
realize the dummy read in cp firmware.

For sdma ucode, it hasn't realized the dummy read in sdma firmware. sdma is
moved to the gfxhub in gfx10. So it needs to add a dummy read in the driver
between amdgpu_ring_emit_wreg and amdgpu_ring_emit_reg_wait for sdma_v5_0.

Change-Id: Ie028f37eb789966d4593984bd661b248ebeb1ac3
Signed-off-by: changzhu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h |  1 +
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c  | 48 +
 drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c  |  8 ++---
 drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c  | 13 ++-
 4 files changed, 64 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
index 459aa9059542..a74ecd449775 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
@@ -267,6 +267,7 @@ struct amdgpu_gfx {
uint32_tmec2_feature_version;
boolmec_fw_write_wait;
boolme_fw_write_wait;
+   boolcp_fw_write_wait;
struct amdgpu_ring  gfx_ring[AMDGPU_MAX_GFX_RINGS];
unsignednum_gfx_rings;
struct amdgpu_ring  compute_ring[AMDGPU_MAX_COMPUTE_RINGS];
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 17a5cbfd0024..c7a6f98bf6b8 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -561,6 +561,32 @@ static void gfx_v10_0_free_microcode(struct amdgpu_device 
*adev)
kfree(adev->gfx.rlc.register_list_format);
 }
 
+static void gfx_v10_0_check_fw_write_wait(struct amdgpu_device *adev)
+{
+   adev->gfx.cp_fw_write_wait = false;
+
+   switch (adev->asic_type) {
+   case CHIP_NAVI10:
+   case CHIP_NAVI12:
+   case CHIP_NAVI14:
+   if ((adev->gfx.me_fw_version >= 0x0046) &&
+   (adev->gfx.me_feature_version >= 27) &&
+   (adev->gfx.pfp_fw_version >= 0x0068) &&
+   (adev->gfx.pfp_feature_version >= 27) &&
+   (adev->gfx.mec_fw_version >= 0x005b) &&
+   (adev->gfx.mec_feature_version >= 27))
+   adev->gfx.cp_fw_write_wait = true;
+   break;
+   default:
+   break;
+   }
+
+   if (adev->gfx.cp_fw_write_wait == false)
+   DRM_WARN_ONCE("Warning: check cp_fw_version and update it to 
realize \
+ GRBM requires 1-cycle delay in cp firmware\n");
+}
+
+
 static void gfx_v10_0_init_rlc_ext_microcode(struct amdgpu_device *adev)
 {
const struct rlc_firmware_header_v2_1 *rlc_hdr;
@@ -829,6 +855,7 @@ static int gfx_v10_0_init_microcode(struct amdgpu_device 
*adev)
}
}
 
+   gfx_v10_0_check_fw_write_wait(adev);
 out:
if (err) {
dev_err(adev->dev,
@@ -4768,6 +4795,24 @@ static void gfx_v10_0_ring_emit_reg_wait(struct 
amdgpu_ring *ring, uint32_t reg,
gfx_v10_0_wait_reg_mem(ring, 0, 0, 0, reg, 0, val, mask, 0x20);
 }
 
+static void gfx_v10_0_ring_emit_reg_write_reg_wait(struct amdgpu_ring *ring,
+  uint32_t reg0, uint32_t reg1,
+  uint32_t ref, uint32_t mask)
+{
+   int usepfp = (ring->funcs->type == AMDGPU_RING_TYPE_GFX);
+   struct amdgpu_device *adev = ring->adev;
+   bool fw_version_ok = false;
+
+   fw_version_ok = adev->gfx.cp_fw_write_wait;
+
+   if (fw_version_ok)
+   gfx_v10_0_wait_reg_mem(ring, usepfp, 0, 1, reg0, reg1,
+  ref, mask, 0x20);
+   else
+   amdgpu_ring_emit_reg_write_reg_wait_helper(ring, reg0, reg1,
+  ref, mask);
+}
+
 static void
 gfx_v10_0_set_gfx_eop_interrupt_state(struct amdgpu_device *adev,
  uint32_t me, uint32_t pipe,
@@ -5158,6 +5203,7 @@ static const struct amdgpu_ring_funcs 
gfx_v10_0_ring_funcs_gfx = {
.emit_tmz = gfx_v10_0_ring_emit_tmz,
.emit_wreg = gfx_v10_0_ring_emit_wreg,
.emit_reg_wait = gfx_v10_0_ring_emit_reg_wait,
+   .emit_reg_write_reg_wait = 

RE: [PATCH 1/2] drm/amdgpu: add dummy read by engines for some GCVM status registers in gfx10

2019-11-06 Thread Zhu, Changfeng
Hi Chris,

I moved gfx_v10_0_check_fw_write_wait(adev)
to gfx_v10_0_init_microcode.

BR,
Changfeng.

-Original Message-
From: Koenig, Christian  
Sent: Wednesday, November 6, 2019 5:26 PM
To: Zhu, Changfeng ; amd-gfx@lists.freedesktop.org; 
Tuikov, Luben ; Huang, Ray ; Huang, 
Shimmer 
Subject: Re: [PATCH 1/2] drm/amdgpu: add dummy read by engines for some GCVM 
status registers in gfx10

On 06.11.19 09:21, Zhu, Changfeng wrote:
> From: changzhu 
>
> The GRBM register interface is now capable of bursting 1 cycle per 
> register wr->wr, wr->rd, much faster than the previous multicycle per 
> transaction done interface.  This has caused a problem where status 
> registers requiring HW to update have a 1 cycle delay, due to the 
> register update having to go through GRBM.
>
> For cp ucode, it has realized the dummy read in cp firmware. It covers the 
> use of the WAIT_REG_MEM operation 1 case only, so it needs to call 
> gfx_v10_0_wait_reg_mem in gfx10. Besides, it also needs to add a warning 
> to update the firmware in case the firmware is too old to have the function to 
> realize the dummy read in cp firmware.
>
> For sdma ucode, it hasn't realized the dummy read in sdma firmware. sdma 
> is moved to the gfxhub in gfx10. So it needs to add a dummy read in the driver 
> between amdgpu_ring_emit_wreg and amdgpu_ring_emit_reg_wait for sdma_v5_0.
>
> Change-Id: Ie028f37eb789966d4593984bd661b248ebeb1ac3
> Signed-off-by: changzhu 
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h |  1 +
>   drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c  | 48 +
>   drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c  |  8 ++---
>   drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c  | 13 ++-
>   4 files changed, 64 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
> index 459aa9059542..a74ecd449775 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
> @@ -267,6 +267,7 @@ struct amdgpu_gfx {
>   uint32_t mec2_feature_version;
>   bool mec_fw_write_wait;
>   bool me_fw_write_wait;
> + bool cp_fw_write_wait;
>   struct amdgpu_ring  gfx_ring[AMDGPU_MAX_GFX_RINGS];
>   unsigned num_gfx_rings;
>   struct amdgpu_ring  compute_ring[AMDGPU_MAX_COMPUTE_RINGS];
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> index 17a5cbfd0024..acdb0e4df9b4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> @@ -561,6 +561,32 @@ static void gfx_v10_0_free_microcode(struct 
> amdgpu_device *adev)
>   kfree(adev->gfx.rlc.register_list_format);
>   }
>   
> +static void gfx_v10_0_check_fw_write_wait(struct amdgpu_device *adev) 
> +{
> + adev->gfx.cp_fw_write_wait = false;
> +
> + switch (adev->asic_type) {
> + case CHIP_NAVI10:
> + case CHIP_NAVI12:
> + case CHIP_NAVI14:
> + if ((adev->gfx.me_fw_version >= 0x0046) &&
> + (adev->gfx.me_feature_version >= 27) &&
> + (adev->gfx.pfp_fw_version >= 0x0068) &&
> + (adev->gfx.pfp_feature_version >= 27) &&
> + (adev->gfx.mec_fw_version >= 0x005b) &&
> + (adev->gfx.mec_feature_version >= 27))
> + adev->gfx.cp_fw_write_wait = true;
> + break;
> + default:
> + break;
> + }
> +
> + if (adev->gfx.cp_fw_write_wait == false)
> + DRM_WARN_ONCE("Warning: check cp_fw_version and update it to 
> realize \
> +   GRBM requires 1-cycle delay in cp 
> firmware\n"); }
> +
> +
>   static void gfx_v10_0_init_rlc_ext_microcode(struct amdgpu_device *adev)
>   {
>   const struct rlc_firmware_header_v2_1 *rlc_hdr; @@ -4768,6 +4794,25 
> @@ static void gfx_v10_0_ring_emit_reg_wait(struct amdgpu_ring *ring, 
> uint32_t reg,
>   gfx_v10_0_wait_reg_mem(ring, 0, 0, 0, reg, 0, val, mask, 0x20);
>   }
>   
> +static void gfx_v10_0_ring_emit_reg_write_reg_wait(struct amdgpu_ring *ring,
> +   uint32_t reg0, uint32_t reg1,
> +   uint32_t ref, uint32_t mask)
> +{
> + int usepfp = (ring->funcs->type == AMDGPU_RING_TYPE_GFX);
> + struct amdgpu_device *adev = ring->adev;
> + bool fw_version_ok = false;
> +
> + gfx_v10_0_check_fw_write_wait(adev);

Doing

[PATCH 1/2] drm/amdgpu: add dummy read by engines for some GCVM status registers in gfx10

2019-11-06 Thread Zhu, Changfeng
From: changzhu 

The GRBM register interface is now capable of bursting 1 cycle per
register wr->wr, wr->rd, much faster than the previous multicycle
per-transaction interface. This has caused a problem where status
registers requiring HW to update have a 1-cycle delay, due to the
register update having to go through GRBM.

For cp ucode, the dummy read has been implemented in cp firmware, but it
covers only the WAIT_REG_MEM operation 1 case. So gfx10 needs to call
gfx_v10_0_wait_reg_mem. Besides, it also needs a warning asking users to
update the firmware in case it is too old to implement the dummy read in
cp firmware.

For sdma ucode, the dummy read has not been implemented in sdma
firmware, and sdma is moved to gfxhub in gfx10. So the driver needs to
add a dummy read between amdgpu_ring_emit_wreg and
amdgpu_ring_emit_reg_wait for sdma_v5_0.
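The failure mode behind this patch can be modeled in isolation. Below is a hypothetical, self-contained C sketch (not amdgpu code): `struct delayed_reg` stands in for a status register like VM_INVALIDATE_ENG0_ACK whose visible value lags the value the hardware has already committed by one read cycle.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Toy model of the 1-cycle GRBM status delay described above.
 * Hypothetical sketch, not amdgpu code: the visible copy of the
 * register lags the committed copy by one read cycle.
 */
struct delayed_reg {
	uint32_t committed;	/* value the HW has already produced */
	uint32_t visible;	/* value a read returns this cycle */
};

/* A read returns the stale value, then the register catches up. */
static uint32_t reg_read(struct delayed_reg *r)
{
	uint32_t v = r->visible;
	r->visible = r->committed;
	return v;
}

/* HW clears the ACK when a new invalidate request is written. */
static void hw_clear_ack(struct delayed_reg *ack)
{
	ack->committed = 0;
}
```

Polling right after `hw_clear_ack()` still returns the stale 0x5a, a false ACK; inserting one dummy `reg_read()` first absorbs the stale value, which is what the driver-side dummy read does for sdma_v5_0.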

Change-Id: Ie028f37eb789966d4593984bd661b248ebeb1ac3
Signed-off-by: changzhu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h |  1 +
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c  | 48 +
 drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c  |  8 ++---
 drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c  | 13 ++-
 4 files changed, 64 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
index 459aa9059542..a74ecd449775 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
@@ -267,6 +267,7 @@ struct amdgpu_gfx {
uint32_tmec2_feature_version;
boolmec_fw_write_wait;
boolme_fw_write_wait;
+   boolcp_fw_write_wait;
struct amdgpu_ring  gfx_ring[AMDGPU_MAX_GFX_RINGS];
unsignednum_gfx_rings;
struct amdgpu_ring  compute_ring[AMDGPU_MAX_COMPUTE_RINGS];
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 17a5cbfd0024..acdb0e4df9b4 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -561,6 +561,32 @@ static void gfx_v10_0_free_microcode(struct amdgpu_device 
*adev)
kfree(adev->gfx.rlc.register_list_format);
 }
 
+static void gfx_v10_0_check_fw_write_wait(struct amdgpu_device *adev)
+{
+   adev->gfx.cp_fw_write_wait = false;
+
+   switch (adev->asic_type) {
+   case CHIP_NAVI10:
+   case CHIP_NAVI12:
+   case CHIP_NAVI14:
+   if ((adev->gfx.me_fw_version >= 0x0046) &&
+   (adev->gfx.me_feature_version >= 27) &&
+   (adev->gfx.pfp_fw_version >= 0x0068) &&
+   (adev->gfx.pfp_feature_version >= 27) &&
+   (adev->gfx.mec_fw_version >= 0x005b) &&
+   (adev->gfx.mec_feature_version >= 27))
+   adev->gfx.cp_fw_write_wait = true;
+   break;
+   default:
+   break;
+   }
+
+   if (adev->gfx.cp_fw_write_wait == false)
+   DRM_WARN_ONCE("Warning: check cp_fw_version and update it to realize \
+ GRBM requires 1-cycle delay in cp firmware\n");
+}
+
+
 static void gfx_v10_0_init_rlc_ext_microcode(struct amdgpu_device *adev)
 {
const struct rlc_firmware_header_v2_1 *rlc_hdr;
@@ -4768,6 +4794,25 @@ static void gfx_v10_0_ring_emit_reg_wait(struct 
amdgpu_ring *ring, uint32_t reg,
gfx_v10_0_wait_reg_mem(ring, 0, 0, 0, reg, 0, val, mask, 0x20);
 }
 
+static void gfx_v10_0_ring_emit_reg_write_reg_wait(struct amdgpu_ring *ring,
+ uint32_t reg0, uint32_t reg1,
+ uint32_t ref, uint32_t mask)
+{
+   int usepfp = (ring->funcs->type == AMDGPU_RING_TYPE_GFX);
+   struct amdgpu_device *adev = ring->adev;
+   bool fw_version_ok = false;
+
+   gfx_v10_0_check_fw_write_wait(adev);
+   fw_version_ok = adev->gfx.cp_fw_write_wait;
+
+   if (fw_version_ok)
+   gfx_v10_0_wait_reg_mem(ring, usepfp, 0, 1, reg0, reg1,
+ ref, mask, 0x20);
+   else
+   amdgpu_ring_emit_reg_write_reg_wait_helper(ring, reg0, reg1,
+  ref, mask);
+}
+
 static void
 gfx_v10_0_set_gfx_eop_interrupt_state(struct amdgpu_device *adev,
  uint32_t me, uint32_t pipe,
@@ -5158,6 +5203,7 @@ static const struct amdgpu_ring_funcs 
gfx_v10_0_ring_funcs_gfx = {
.emit_tmz = gfx_v10_0_ring_emit_tmz,
.emit_wreg = gfx_v10_0_ring_emit_wreg,
.emit_reg_wait = gfx_v10_0_ring_emit_reg_wait,
+   .emit_reg_write_reg_wait = gfx_v10_0_ring_emit_reg_write_reg_wait,
 };
 
 static const struct amdgpu_ring_funcs gfx_v10_0_ring_funcs_compute = {
@@ -5191,6 +5237,7 @@ static const struct amdgpu_ring_funcs 

RE: [PATCH 1/2] drm/amdgpu: add dummy read by engines for some GCVM status registers in gfx10

2019-11-06 Thread Zhu, Changfeng
Thanks, Chris.

I double-checked that all engines on gfx10 implement the
emit_reg_write_reg_wait callback.
You're right: I missed .emit_reg_write_reg_wait =
gfx_v10_0_ring_emit_reg_write_reg_wait for gfx_v10_0_ring_funcs_kiq, so
I added it.

For sdma5, there is one amdgpu_ring_funcs engine, sdma_v5_0_ring_funcs,
and it already defines:
.emit_reg_write_reg_wait = sdma_v5_0_ring_emit_reg_write_reg_wait,

For vcn/uvd/vce, their engines all already define:
.emit_reg_write_reg_wait = amdgpu_ring_emit_reg_write_reg_wait_helper;

So after adding .emit_reg_write_reg_wait =
gfx_v10_0_ring_emit_reg_write_reg_wait for gfx_v10_0_ring_funcs_kiq,
I think there should be no NULL pointer deref problem.

BR,
Changfeng.
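The NULL-deref concern above can be made mechanical. The following is an illustrative C sketch (hypothetical stand-ins, not the real amdgpu ring types): callers invoke emit_reg_write_reg_wait through a function pointer in each ring-funcs table, so any table that leaves the entry unset is a NULL pointer deref waiting to happen.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an amdgpu ring-funcs table entry. */
typedef void (*reg_write_reg_wait_fn)(void);

struct ring_funcs {
	const char *name;
	reg_write_reg_wait_fn emit_reg_write_reg_wait;
};

/* Placeholder callback for the sketch. */
static void noop_cb(void) {}

/* Returns true only when every engine's table fills in the callback. */
static bool all_rings_have_callback(const struct ring_funcs *funcs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (!funcs[i].emit_reg_write_reg_wait)
			return false;
	return true;
}
```

A check like this is essentially what the review above does by hand: walk every ring-funcs table (gfx, compute, kiq, sdma, vcn/uvd/vce) and confirm the callback is set.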

-Original Message-
From: Koenig, Christian  
Sent: Tuesday, November 5, 2019 7:58 PM
To: Zhu, Changfeng ; amd-gfx@lists.freedesktop.org; 
Tuikov, Luben ; Huang, Ray ; Huang, 
Shimmer 
Subject: Re: [PATCH 1/2] drm/amdgpu: add dummy read by engines for some GCVM 
status registers in gfx10

Am 05.11.19 um 12:42 schrieb Zhu, Changfeng:
> From: changzhu 
>
> The GRBM register interface is now capable of bursting 1 cycle per 
> register wr->wr, wr->rd much faster than previous muticycle per 
> transaction done interface.  This has caused a problem where status 
> registers requiring HW to update have a 1 cycle delay, due to the 
> register update having to go through GRBM.
>
> For cp ucode, it has realized dummy read in cp firmware.It covers the 
> use of WAIT_REG_MEM operation 1 case only.So it needs to call 
> gfx_v10_0_wait_reg_mem in gfx10. Besides it also needs to add warning 
> to update firmware in case firmware is too old to have function to 
> realize dummy read in cp firmware.
>
> For sdma ucode, it hasn't realized dummy read in sdma firmware. sdma 
> is moved to gfxhub in gfx10. So it needs to add dummy read in driver 
> between amdgpu_ring_emit_wreg and amdgpu_ring_emit_reg_wait for sdma_v5_0.
>
> Change-Id: Ie028f37eb789966d4593984bd661b248ebeb1ac3
> Signed-off-by: changzhu 

[PATCH 2/2] drm/amdgpu: add warning for GRBM 1-cycle delay issue in gfx9

2019-11-05 Thread Zhu, Changfeng
From: changzhu 

Add a warning in gfx9 asking users to update their firmware in case
it is too old to implement the dummy read in cp firmware.
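The version gate behind this warning can be sketched as a pure predicate. The thresholds mirror the hunk below (MEC >= 0x01a5 / feature 46, PFP >= 0x00b7 / feature 46); the struct is a simplified stand-in for the relevant adev->gfx fields, not real driver code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the relevant adev->gfx version fields. */
struct cp_fw_versions {
	uint32_t mec_fw_version;
	uint32_t mec_feature_version;
	uint32_t pfp_fw_version;
	uint32_t pfp_feature_version;
};

/*
 * True when the CP firmware is new enough to implement the dummy read
 * itself; otherwise the driver should print the update warning.
 */
static bool cp_fw_has_dummy_read(const struct cp_fw_versions *fw)
{
	return fw->mec_fw_version >= 0x01a5 &&
	       fw->mec_feature_version >= 46 &&
	       fw->pfp_fw_version >= 0x00b7 &&
	       fw->pfp_feature_version >= 46;
}
```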

Change-Id: I6aef94f0823138f244f1eedb62fde833dd697023
Signed-off-by: changzhu 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 9d5f900e3e1c..f2deb225c8a9 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -982,6 +982,13 @@ static void gfx_v9_0_check_fw_write_wait(struct 
amdgpu_device *adev)
adev->gfx.me_fw_write_wait = false;
adev->gfx.mec_fw_write_wait = false;
 
+   if ((adev->gfx.mec_fw_version < 0x01a5) ||
+   (adev->gfx.mec_feature_version < 46) ||
+   (adev->gfx.pfp_fw_version < 0x00b7) ||
+   (adev->gfx.pfp_feature_version < 46))
+   DRM_WARN_ONCE("Warning: check cp_fw_version and update it to realize \
+   GRBM requires 1-cycle delay in cp firmware\n");
+
switch (adev->asic_type) {
case CHIP_VEGA10:
if ((adev->gfx.me_fw_version >= 0x009c) &&
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[PATCH 1/2] drm/amdgpu: add dummy read by engines for some GCVM status registers in gfx10

2019-11-05 Thread Zhu, Changfeng
From: changzhu 

The GRBM register interface is now capable of bursting 1 cycle per
register wr->wr, wr->rd, much faster than the previous multicycle
per-transaction interface. This has caused a problem where status
registers requiring HW to update have a 1-cycle delay, due to the
register update having to go through GRBM.

For cp ucode, the dummy read has been implemented in cp firmware, but it
covers only the WAIT_REG_MEM operation 1 case. So gfx10 needs to call
gfx_v10_0_wait_reg_mem. Besides, it also needs a warning asking users to
update the firmware in case it is too old to implement the dummy read in
cp firmware.

For sdma ucode, the dummy read has not been implemented in sdma
firmware, and sdma is moved to gfxhub in gfx10. So the driver needs to
add a dummy read between amdgpu_ring_emit_wreg and
amdgpu_ring_emit_reg_wait for sdma_v5_0.

Change-Id: Ie028f37eb789966d4593984bd661b248ebeb1ac3
Signed-off-by: changzhu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h |  1 +
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c  | 47 +
 drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c  |  8 ++---
 drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c  | 13 ++-
 4 files changed, 63 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
index 459aa9059542..a74ecd449775 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
@@ -267,6 +267,7 @@ struct amdgpu_gfx {
uint32_tmec2_feature_version;
boolmec_fw_write_wait;
boolme_fw_write_wait;
+   boolcp_fw_write_wait;
struct amdgpu_ring  gfx_ring[AMDGPU_MAX_GFX_RINGS];
unsignednum_gfx_rings;
struct amdgpu_ring  compute_ring[AMDGPU_MAX_COMPUTE_RINGS];
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 17a5cbfd0024..e82b6d796b69 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -561,6 +561,32 @@ static void gfx_v10_0_free_microcode(struct amdgpu_device 
*adev)
kfree(adev->gfx.rlc.register_list_format);
 }
 
+static void gfx_v10_0_check_fw_write_wait(struct amdgpu_device *adev)
+{
+   adev->gfx.cp_fw_write_wait = false;
+
+   switch (adev->asic_type) {
+   case CHIP_NAVI10:
+   case CHIP_NAVI12:
+   case CHIP_NAVI14:
+   if ((adev->gfx.me_fw_version >= 0x0046) &&
+   (adev->gfx.me_feature_version >= 27) &&
+   (adev->gfx.pfp_fw_version >= 0x0068) &&
+   (adev->gfx.pfp_feature_version >= 27) &&
+   (adev->gfx.mec_fw_version >= 0x005b) &&
+   (adev->gfx.mec_feature_version >= 27))
+   adev->gfx.cp_fw_write_wait = true;
+   break;
+   default:
+   break;
+   }
+
+   if (adev->gfx.cp_fw_write_wait == false)
+   DRM_WARN_ONCE("Warning: check cp_fw_version and update it to realize \
+ GRBM requires 1-cycle delay in cp firmware\n");
+}
+
+
 static void gfx_v10_0_init_rlc_ext_microcode(struct amdgpu_device *adev)
 {
const struct rlc_firmware_header_v2_1 *rlc_hdr;
@@ -4768,6 +4794,25 @@ static void gfx_v10_0_ring_emit_reg_wait(struct 
amdgpu_ring *ring, uint32_t reg,
gfx_v10_0_wait_reg_mem(ring, 0, 0, 0, reg, 0, val, mask, 0x20);
 }
 
+static void gfx_v10_0_ring_emit_reg_write_reg_wait(struct amdgpu_ring *ring,
+ uint32_t reg0, uint32_t reg1,
+ uint32_t ref, uint32_t mask)
+{
+   int usepfp = (ring->funcs->type == AMDGPU_RING_TYPE_GFX);
+   struct amdgpu_device *adev = ring->adev;
+   bool fw_version_ok = false;
+
+   gfx_v10_0_check_fw_write_wait(adev);
+   fw_version_ok = adev->gfx.cp_fw_write_wait;
+
+   if (fw_version_ok)
+   gfx_v10_0_wait_reg_mem(ring, usepfp, 0, 1, reg0, reg1,
+ ref, mask, 0x20);
+   else
+   amdgpu_ring_emit_reg_write_reg_wait_helper(ring, reg0, reg1,
+  ref, mask);
+}
+
 static void
 gfx_v10_0_set_gfx_eop_interrupt_state(struct amdgpu_device *adev,
  uint32_t me, uint32_t pipe,
@@ -5158,6 +5203,7 @@ static const struct amdgpu_ring_funcs 
gfx_v10_0_ring_funcs_gfx = {
.emit_tmz = gfx_v10_0_ring_emit_tmz,
.emit_wreg = gfx_v10_0_ring_emit_wreg,
.emit_reg_wait = gfx_v10_0_ring_emit_reg_wait,
+   .emit_reg_write_reg_wait = gfx_v10_0_ring_emit_reg_write_reg_wait,
 };
 
 static const struct amdgpu_ring_funcs gfx_v10_0_ring_funcs_compute = {
@@ -5191,6 +5237,7 @@ static const struct amdgpu_ring_funcs 

[PATCH] drm/amdgpu: add dummy read by engines for some GCVM status registers

2019-11-04 Thread Zhu, Changfeng
From: changzhu 

The GRBM register interface is now capable of bursting 1 cycle per
register wr->wr, wr->rd, much faster than the previous multicycle
per-transaction interface. This has caused a problem where status
registers requiring HW to update have a 1-cycle delay, due to the
register update having to go through GRBM.

For cp ucode, the dummy read has been implemented in cp firmware, but it
covers only the WAIT_REG_MEM operation 1 case. So gfx10 needs to call
gfx_v10_0_wait_reg_mem. Besides, it also needs a warning asking users to
update the firmware in case it is too old to implement the dummy read in
cp firmware.

For sdma ucode, the dummy read has not been implemented in sdma
firmware, and sdma is moved to gfxhub in gfx10. So the driver needs to
add a dummy read between amdgpu_ring_emit_wreg and
amdgpu_ring_emit_reg_wait for sdma_v5_0.

Change-Id: Ie028f37eb789966d4593984bd661b248ebeb1ac3
Signed-off-by: changzhu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h |  1 +
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c  | 50 +
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c   |  7 
 drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c  |  8 ++--
 drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c  | 13 ++-
 5 files changed, 73 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
index 459aa9059542..a74ecd449775 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h
@@ -267,6 +267,7 @@ struct amdgpu_gfx {
uint32_tmec2_feature_version;
boolmec_fw_write_wait;
boolme_fw_write_wait;
+   boolcp_fw_write_wait;
struct amdgpu_ring  gfx_ring[AMDGPU_MAX_GFX_RINGS];
unsignednum_gfx_rings;
struct amdgpu_ring  compute_ring[AMDGPU_MAX_COMPUTE_RINGS];
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 17a5cbfd0024..814764723c26 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -561,6 +561,32 @@ static void gfx_v10_0_free_microcode(struct amdgpu_device 
*adev)
kfree(adev->gfx.rlc.register_list_format);
 }
 
+static void gfx_v10_0_check_fw_write_wait(struct amdgpu_device *adev)
+{
+   adev->gfx.cp_fw_write_wait = false;
+
+   switch (adev->asic_type) {
+   case CHIP_NAVI10:
+   case CHIP_NAVI12:
+   case CHIP_NAVI14:
+   if ((adev->gfx.me_fw_version >= 0x0046) &&
+   (adev->gfx.me_feature_version >= 27) &&
+   (adev->gfx.pfp_fw_version >= 0x0068) &&
+   (adev->gfx.pfp_feature_version >= 27) &&
+   (adev->gfx.mec_fw_version >= 0x005b) &&
+   (adev->gfx.mec_feature_version >= 27))
+   adev->gfx.cp_fw_write_wait = true;
+   break;
+   default:
+   break;
+   }
+
+   if (adev->gfx.cp_fw_write_wait == false)
+   DRM_WARN_ONCE("Warning: check cp_fw_version and update it to realize \
+ GRBM requires 1-cycle delay in cp firmware\n");
+}
+
+
 static void gfx_v10_0_init_rlc_ext_microcode(struct amdgpu_device *adev)
 {
const struct rlc_firmware_header_v2_1 *rlc_hdr;
@@ -4768,6 +4794,28 @@ static void gfx_v10_0_ring_emit_reg_wait(struct 
amdgpu_ring *ring, uint32_t reg,
gfx_v10_0_wait_reg_mem(ring, 0, 0, 0, reg, 0, val, mask, 0x20);
 }
 
+static void gfx_v10_0_ring_emit_reg_write_reg_wait(struct amdgpu_ring *ring,
+ uint32_t reg0, uint32_t reg1,
+ uint32_t ref, uint32_t mask)
+{
+   int usepfp = (ring->funcs->type == AMDGPU_RING_TYPE_GFX);
+   struct amdgpu_device *adev = ring->adev;
+   bool fw_version_ok = false;
+
+   gfx_v10_0_check_fw_write_wait(adev);
+
+   if (ring->funcs->type == AMDGPU_RING_TYPE_GFX ||
+   ring->funcs->type == AMDGPU_RING_TYPE_COMPUTE)
+   fw_version_ok = adev->gfx.cp_fw_write_wait;
+
+   if (fw_version_ok)
+   gfx_v10_0_wait_reg_mem(ring, usepfp, 0, 1, reg0, reg1,
+ ref, mask, 0x20);
+   else
+   amdgpu_ring_emit_reg_write_reg_wait_helper(ring, reg0, reg1,
+  ref, mask);
+}
+
 static void
 gfx_v10_0_set_gfx_eop_interrupt_state(struct amdgpu_device *adev,
  uint32_t me, uint32_t pipe,
@@ -5158,6 +5206,7 @@ static const struct amdgpu_ring_funcs 
gfx_v10_0_ring_funcs_gfx = {
.emit_tmz = gfx_v10_0_ring_emit_tmz,
.emit_wreg = gfx_v10_0_ring_emit_wreg,
.emit_reg_wait = gfx_v10_0_ring_emit_reg_wait,
+   .emit_reg_write_reg_wait = 

[PATCH libdrm] tests/amdgpu: enable dispatch/draw tests for Renoir

2019-11-04 Thread Zhu, Changfeng
From: changzhu 

New Renoir chips can run the dispatch/draw tests, so enable the
dispatch/draw tests for Renoir again.

Change-Id: I3a72a4bbfe0fc663ee0e3e58d8e9c304f513e568
Signed-off-by: changzhu 
Reviewed-by: Flora Cui 
Reviewed-by: Marek Olšák 
Reviewed-by: Huang Rui 
---
 tests/amdgpu/basic_tests.c | 16 +---
 1 file changed, 1 insertion(+), 15 deletions(-)

diff --git a/tests/amdgpu/basic_tests.c b/tests/amdgpu/basic_tests.c
index e75b9d0d..a57dcbb4 100644
--- a/tests/amdgpu/basic_tests.c
+++ b/tests/amdgpu/basic_tests.c
@@ -592,20 +592,6 @@ int suite_basic_tests_init(void)
 
family_id = gpu_info.family_id;
 
-   if (gpu_info.asic_id == 0x1636) {
-   if (amdgpu_set_test_active("Basic Tests",
-  "Dispatch Test",
-  CU_FALSE))
-   fprintf(stderr, "test deactivation failed - %s\n",
-   CU_get_error_msg());
-
-   if (amdgpu_set_test_active("Basic Tests",
-  "Draw Test",
-  CU_FALSE))
-   fprintf(stderr, "test deactivation failed - %s\n",
-   CU_get_error_msg());
-   }
-
return CUE_SUCCESS;
 }
 
@@ -2992,7 +2978,7 @@ void amdgpu_memset_draw(amdgpu_device_handle 
device_handle,
resources[1] = bo_shader_ps;
resources[2] = bo_shader_vs;
resources[3] = bo_cmd;
-   r = amdgpu_bo_list_create(device_handle, 3, resources, NULL, &bo_list);
+   r = amdgpu_bo_list_create(device_handle, 4, resources, NULL, &bo_list);
CU_ASSERT_EQUAL(r, 0);
 
ib_info.ib_mc_address = mc_address_cmd;
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

RE: [PATCH] drm/amdgpu: GFX9, GFX10: GRBM requires 1-cycle delay

2019-10-28 Thread Zhu, Changfeng
Hi Christian,

Should we also implement the gfx_v9_0_wait_reg_mem function in gfx10,
like gfx9, since gfx10 also implements the write/wait command in a
single packet after CL#1761300?

Or should we add the dummy read in gmc10 by using emit_wait, like
Luben's approach?

BR,
Changfeng. 

-Original Message-
From: Koenig, Christian  
Sent: Monday, October 28, 2019 6:47 PM
To: Zhu, Changfeng ; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Pelloux-prayer, Pierre-eric 
; Huang, Ray ; Tuikov, 
Luben 
Subject: Re: [PATCH] drm/amdgpu: GFX9, GFX10: GRBM requires 1-cycle delay

Hi Changfeng,

> So how can we deal with the firmware between mec version(402) and mec 
> version(421)?
Well, offhand I see only two options: either print a warning or
completely reject loading the driver.

Completely rejecting loading the driver is probably not a good idea and the 
issue is actually extremely unlikely to cause any problems.

So printing a warning that the user should update their firmware is probably 
the best approach.

Regards,
Christian.

Am 28.10.19 um 04:01 schrieb Zhu, Changfeng:
> Hi Christian,
>
> Re- that won't work, you can't add this to 
> amdgpu_ring_emit_reg_write_reg_wait_helper or break all read triggered 
> registers (like the semaphore ones).
>
> Do you mean that I should use reg_wait registers(wait_reg_mem) like Luben to 
> replace read triggered registers for adding dummy read?
>
> Re-Additional to that it will never work on GFX9, since the CP firmware there 
> uses the integrated write/wait command and you can't add an additional dummy 
> read there.
>
> Yes, I see the integrated write/wait command and they are realized in 
> gfx_v9_0_wait_reg_mem:
> Emily's patch:
> drm/amdgpu: Remove the sriov checking and add firmware checking 
> decides when to go into gfx_v9_0_wait_reg_mem and when go into 
> amdgpu_ring_emit_reg_write_reg_wait_helper.
>
> However there are two problems now.
> 1.Before the fw_version_ok fw version, the code goes into 
> amdgpu_ring_emit_reg_write_reg_wait_helper. In this case, should not we add 
> dummy read in amdgpu_ring_emit_reg_write_reg_wait_helper?
> 2.After the fw_version_ok fw version, the code goes into 
> gfx_v9_0_wait_reg_mem. However, it realizes write/wait command in firmware. 
> Then how can we add this dummy read? According to Yang,Zilong, the CP 
> firmware has realized dummy in firmware in CL:
> Vega20 CL#1762470 @3/27/2019
> Navi10 CL#1761300 @3/25/2019
> Accodring to CL#1762470,
> The firmware which realized dummy read is(Raven for example):
> Mec version:
> #define F32_MEC_UCODE_VERSION "#421"
> #define F32_MEC_FEATURE_VERSION 46
> Pfp version:
> #define F32_PFP_UCODE_VERSION "#183"
> #define F32_PFP_FEATURE_VERSION 46
> In Emily's patch:
> The CP firmware which uses the integrated write/wait command begins from 
> version:
> +   case CHIP_RAVEN:
> +   if ((adev->gfx.me_fw_version >= 0x009c) &&
> +   (adev->gfx.me_feature_version >= 42) &&
> +   (adev->gfx.pfp_fw_version >=  0x00b1(177)) &&
> +   (adev->gfx.pfp_feature_version >= 42))
> +   adev->gfx.me_fw_write_wait = true;
> +
> +   if ((adev->gfx.mec_fw_version >=  0x0192(402)) &&
> +   (adev->gfx.mec_feature_version >= 42))
> +   adev->gfx.mec_fw_write_wait = true;
> +   break;
>
> So how can we deal with the firmware between mec version(402) and mec 
> version(421)?
> It will realize write/wait command in CP firmware but it doesn't have dummy 
> read.
>
> BR,
> Changfeng.
>
> -Original Message-
> From: Koenig, Christian 
> Sent: Friday, October 25, 2019 11:54 PM
> To: Zhu, Changfeng ; 
> amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander ; Pelloux-prayer, 
> Pierre-eric ; Huang, Ray 
> ; Tuikov, Luben 
> Subject: Re: [PATCH] drm/amdgpu: GFX9, GFX10: GRBM requires 1-cycle 
> delay
>
> Hi Changfeng,
>
> that won't work, you can't add this to 
> amdgpu_ring_emit_reg_write_reg_wait_helper or break all read triggered 
> registers (like the semaphore ones).
>
> Additional to that it will never work on GFX9, since the CP firmware there 
> uses the integrated write/wait command and you can't add an additional dummy 
> read there.
>
> Regards,
> Christian.
>
> Am 25.10.19 um 16:22 schrieb Zhu, Changfeng:
>> I try to write a patch based on the patch of Tuikov,Luben.
>>
>> Inspired by Luben,here is the patch:
>>
>>   From 1980d8f1ed44fb9a84a5ea1f6e2edd2bc25c629a Mon Sep 17 00:00:00
>> 2001
>> From: changzhu 
>> Date: Thu, 10 Oct 2019 11:02:33 +0800
>> Sub

RE: [PATCH] drm/amdgpu: GFX9, GFX10: GRBM requires 1-cycle delay

2019-10-25 Thread Zhu, Changfeng
I try to write a patch based on the patch of Tuikov,Luben.

Inspired by Luben,here is the patch:

From 1980d8f1ed44fb9a84a5ea1f6e2edd2bc25c629a Mon Sep 17 00:00:00 2001
From: changzhu 
Date: Thu, 10 Oct 2019 11:02:33 +0800
Subject: [PATCH] drm/amdgpu: add dummy read by engines for some GCVM status
 registers

The GRBM register interface is now capable of bursting 1 cycle per
register wr->wr, wr->rd, much faster than the previous multicycle
per-transaction interface. This has caused a problem where status
registers requiring HW to update have a 1-cycle delay, due to the
register update having to go through GRBM.

SW may operate on an incorrect value if they write a register and
immediately check the corresponding status register.

Registers requiring HW to clear or set fields may be delayed by 1 cycle.
For example,

1. write VM_INVALIDATE_ENG0_REQ mask = 5a
2. read VM_INVALIDATE_ENG0_ACK till the ack is same as the request mask = 5a
   a. HW will reset VM_INVALIDATE_ENG0_ACK = 0 until invalidation is complete
3. write VM_INVALIDATE_ENG0_REQ mask = 5a
4. read VM_INVALIDATE_ENG0_ACK till the ack is same as the request mask = 5a
   a. First read of VM_INVALIDATE_ENG0_ACK = 5a instead of 0
   b. Second read of VM_INVALIDATE_ENG0_ACK = 0 because the remote GRBM h/w
      register takes one extra cycle to be cleared
   c. In this case, SW will see a false ACK if they exit on first read

Affected registers (only GC variant)  | Recommended Dummy Read
--+
VM_INVALIDATE_ENG*_ACK|  VM_INVALIDATE_ENG*_REQ
VM_L2_STATUS  |  VM_L2_STATUS
VM_L2_PROTECTION_FAULT_STATUS |  VM_L2_PROTECTION_FAULT_STATUS
VM_L2_PROTECTION_FAULT_ADDR_HI/LO32   |  VM_L2_PROTECTION_FAULT_ADDR_HI/LO32
VM_L2_IH_LOG_BUSY |  VM_L2_IH_LOG_BUSY
MC_VM_L2_PERFCOUNTER_HI/LO|  MC_VM_L2_PERFCOUNTER_HI/LO
ATC_L2_PERFCOUNTER_HI/LO  |  ATC_L2_PERFCOUNTER_HI/LO
ATC_L2_PERFCOUNTER2_HI/LO |  ATC_L2_PERFCOUNTER2_HI/LO

It also needs dummy read by engines for these gc registers.
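The recommended sequence can be sketched as a trace of abstract ring operations. This is a hypothetical model, not the real amdgpu_ring API, of what the patched amdgpu_ring_emit_reg_write_reg_wait_helper emits:

```c
#include <stdbool.h>

/* Abstract ring operations for the sketch; names are illustrative. */
enum ring_op { OP_WREG, OP_RREG, OP_REG_WAIT };

struct op_trace {
	enum ring_op ops[8];
	int n;
};

static void emit(struct op_trace *t, enum ring_op op)
{
	t->ops[t->n++] = op;
}

/*
 * Mirrors the patched helper: write the request register, and on the
 * GFXHUB (on_gfxhub models ring->funcs->vmhub == AMDGPU_GFXHUB_0) emit
 * one dummy read before polling the ACK register.
 */
static void emit_reg_write_reg_wait(struct op_trace *t, bool on_gfxhub)
{
	emit(t, OP_WREG);		/* e.g. VM_INVALIDATE_ENG0_REQ */
	if (on_gfxhub)
		emit(t, OP_RREG);	/* dummy read absorbs the stale ACK */
	emit(t, OP_REG_WAIT);		/* poll VM_INVALIDATE_ENG0_ACK */
}
```

Non-GFXHUB rings keep the plain write-then-wait pair; only the GC variant gets the extra read, matching the table above.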

Change-Id: Ie028f37eb789966d4593984bd661b248ebeb1ac3
Signed-off-by: changzhu 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c |  5 +
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c   |  2 ++
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c|  2 ++
 drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c   |  4 
 drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c   | 18 ++
 5 files changed, 31 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
index 4b3f58dbf36f..c2fbf6087ecf 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
@@ -392,6 +392,11 @@ void amdgpu_ring_emit_reg_write_reg_wait_helper(struct 
amdgpu_ring *ring,
uint32_t ref, uint32_t mask)
 {
amdgpu_ring_emit_wreg(ring, reg0, ref);
+
+   /* wait for a cycle to reset vm_inv_eng0_ack */
+   if (ring->funcs->vmhub == AMDGPU_GFXHUB_0)
+   amdgpu_ring_emit_rreg(ring, reg0);
+
amdgpu_ring_emit_reg_wait(ring, reg1, mask, mask);
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index ef1975a5323a..104c47734316 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -5155,6 +5155,7 @@ static const struct amdgpu_ring_funcs 
gfx_v10_0_ring_funcs_gfx = {
.patch_cond_exec = gfx_v10_0_ring_emit_patch_cond_exec,
.preempt_ib = gfx_v10_0_ring_preempt_ib,
.emit_tmz = gfx_v10_0_ring_emit_tmz,
+   .emit_rreg = gfx_v10_0_ring_emit_rreg,
.emit_wreg = gfx_v10_0_ring_emit_wreg,
.emit_reg_wait = gfx_v10_0_ring_emit_reg_wait,
 };
@@ -5188,6 +5189,7 @@ static const struct amdgpu_ring_funcs 
gfx_v10_0_ring_funcs_compute = {
.test_ib = gfx_v10_0_ring_test_ib,
.insert_nop = amdgpu_ring_insert_nop,
.pad_ib = amdgpu_ring_generic_pad_ib,
+   .emit_rreg = gfx_v10_0_ring_emit_rreg,
.emit_wreg = gfx_v10_0_ring_emit_wreg,
.emit_reg_wait = gfx_v10_0_ring_emit_reg_wait,
 };
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 2f03bf533d41..d00b53de0fdc 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -6253,6 +6253,7 @@ static const struct amdgpu_ring_funcs 
gfx_v9_0_ring_funcs_gfx = {
.init_cond_exec = gfx_v9_0_ring_emit_init_cond_exec,
.patch_cond_exec = gfx_v9_0_ring_emit_patch_cond_exec,
.emit_tmz = gfx_v9_0_ring_emit_tmz,
+   .emit_rreg = gfx_v9_0_ring_emit_rreg,
.emit_wreg = gfx_v9_0_ring_emit_wreg,
.emit_reg_wait = gfx_v9_0_ring_emit_reg_wait,
.emit_reg_write_reg_wait = gfx_v9_0_ring_emit_reg_write_reg_wait,
@@ -6289,6 +6290,7 @@ static const struct amdgpu_ring_funcs 

RE: [PATCH] drm/amdgpu: GFX9, GFX10: GRBM requires 1-cycle delay

2019-10-24 Thread Zhu, Changfeng
Inline.


-Original Message-
From: amd-gfx  On Behalf Of Tuikov, Luben
Sent: Friday, October 25, 2019 5:17 AM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Pelloux-prayer, Pierre-eric 
; Tuikov, Luben ; 
Koenig, Christian 
Subject: [PATCH] drm/amdgpu: GFX9, GFX10: GRBM requires 1-cycle delay

The GRBM interface is now capable of bursting 1-cycle op per register, a WRITE 
followed by another WRITE, or a WRITE followed by a READ, much faster than the 
previous multi-cycle per completed-transaction interface. This causes a problem 
whereby status registers requiring a read/write by hardware have a 1-cycle 
delay, due to the register update having to go through the GRBM interface.

This patch adds this delay.

A one cycle read op is added after updating the invalidate request and before 
reading the invalidate-ACK status.

See also commit
534991731cb5fa94b5519957646cf849ca10d17d.
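The emit_frame_size hunks below double only the REG_WAIT term, reserving ring space for the extra dummy read paired with each wait. A minimal sketch of that dword budgeting, where the per-op costs 5 and 7 come from the surrounding code and the *_COUNT macros are assumed placeholders, not the real SOC15 constants:

```c
/*
 * Sketch of the ring-space accounting changed below: each reg_wait now
 * reserves room for a paired dummy read, so its 7-dword cost is doubled.
 * The *_COUNT values are assumed placeholders, not the real SOC15 macros.
 */
#define FLUSH_WREG_COUNT	1
#define FLUSH_REG_WAIT_COUNT	1

static unsigned int flush_gpu_tlb_dwords(int with_dummy_read)
{
	unsigned int dw = FLUSH_WREG_COUNT * 5;		/* 5 dwords per wreg */
	unsigned int wait = FLUSH_REG_WAIT_COUNT * 7;	/* 7 dwords per wait */

	return dw + (with_dummy_read ? wait * 2 : wait);
}
```

Under-reserving here would let a frame overflow the space claimed at ring init, which is why the constant in the funcs tables must change together with the emission code.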

Signed-off-by: Luben Tuikov 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 4 ++--  
drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c  | 4 ++--  
drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 9 +  
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  | 8   
drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 2 +-
 5 files changed, 22 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index ac43b1af69e3..0042868dbd53 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -5129,7 +5129,7 @@ static const struct amdgpu_ring_funcs gfx_v10_0_ring_funcs_gfx = {
5 + /* COND_EXEC */
7 + /* PIPELINE_SYNC */
SOC15_FLUSH_GPU_TLB_NUM_WREG * 5 +
-   SOC15_FLUSH_GPU_TLB_NUM_REG_WAIT * 7 +
+   SOC15_FLUSH_GPU_TLB_NUM_REG_WAIT * 7 * 2 +
2 + /* VM_FLUSH */
8 + /* FENCE for VM_FLUSH */
20 + /* GDS switch */
@@ -5182,7 +5182,7 @@ static const struct amdgpu_ring_funcs gfx_v10_0_ring_funcs_compute = {
5 + /* hdp invalidate */
7 + /* gfx_v10_0_ring_emit_pipeline_sync */
SOC15_FLUSH_GPU_TLB_NUM_WREG * 5 +
-   SOC15_FLUSH_GPU_TLB_NUM_REG_WAIT * 7 +
+   SOC15_FLUSH_GPU_TLB_NUM_REG_WAIT * 7 * 2 +
2 + /* gfx_v10_0_ring_emit_vm_flush */
	8 + 8 + 8, /* gfx_v10_0_ring_emit_fence x3 for user fence, vm fence */
.emit_ib_size = 7, /* gfx_v10_0_ring_emit_ib_compute */
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 9fe95e7693d5..9a7a717208de 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -6218,7 +6218,7 @@ static const struct amdgpu_ring_funcs gfx_v9_0_ring_funcs_gfx = {
5 +  /* COND_EXEC */
7 +  /* PIPELINE_SYNC */
SOC15_FLUSH_GPU_TLB_NUM_WREG * 5 +
-   SOC15_FLUSH_GPU_TLB_NUM_REG_WAIT * 7 +
+   SOC15_FLUSH_GPU_TLB_NUM_REG_WAIT * 7 * 2 +
2 + /* VM_FLUSH */
8 +  /* FENCE for VM_FLUSH */
20 + /* GDS switch */
@@ -6271,7 +6271,7 @@ static const struct amdgpu_ring_funcs gfx_v9_0_ring_funcs_compute = {
5 + /* hdp invalidate */
7 + /* gfx_v9_0_ring_emit_pipeline_sync */
SOC15_FLUSH_GPU_TLB_NUM_WREG * 5 +
-   SOC15_FLUSH_GPU_TLB_NUM_REG_WAIT * 7 +
+   SOC15_FLUSH_GPU_TLB_NUM_REG_WAIT * 7 * 2 +
2 + /* gfx_v9_0_ring_emit_vm_flush */
	8 + 8 + 8, /* gfx_v9_0_ring_emit_fence x3 for user fence, vm fence */
.emit_ib_size = 7, /* gfx_v9_0_ring_emit_ib_compute */
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
index 6e1b25bd1fe7..100d526e9a42 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
@@ -346,6 +346,15 @@ static uint64_t gmc_v10_0_emit_flush_gpu_tlb(struct amdgpu_ring *ring,
 
amdgpu_ring_emit_wreg(ring, hub->vm_inv_eng0_req + eng, req);
 
+   /* Insert a dummy read to delay one cycle before the ACK
+* inquiry.
+*/
+   if (ring->funcs->type == AMDGPU_RING_TYPE_SDMA ||
+   ring->funcs->type == AMDGPU_RING_TYPE_GFX  ||
+   ring->funcs->type == AMDGPU_RING_TYPE_COMPUTE)
+   amdgpu_ring_emit_reg_wait(ring,
+ hub->vm_inv_eng0_req + eng, 0, 0);
+
/* wait for the invalidate to complete */
amdgpu_ring_emit_reg_wait(ring, hub->vm_inv_eng0_ack + eng,
  1 << vmid, 1 << vmid);
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index 9f2a893871ec..8f3097e45299 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -495,6 +495,14 @@ static uint64_t gmc_v9_0_emit_flush_gpu_tlb(struct amdgpu_ring *ring,

RE: [PATCH] drm/amdgpu: Set no-retry as default.

2019-08-15 Thread Zhu, Changfeng
Reviewed-by: changzhu 

-Original Message-
From: amd-gfx  On Behalf Of Feifei Xu
Sent: Friday, August 16, 2019 11:22 AM
To: amd-gfx@lists.freedesktop.org
Cc: Xu, Feifei ; Li, Candice 
Subject: [PATCH] drm/amdgpu: Set no-retry as default.

This is to improve performance.

Signed-off-by: Feifei Xu 
Tested-by: Candice Li 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 153705848cc8..0df54d45369c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -145,7 +145,7 @@
 int amdgpu_async_gfx_ring = 1;
 int amdgpu_mcbp = 0;
 int amdgpu_discovery = -1;
 int amdgpu_mes = 0;
-int amdgpu_noretry;
+int amdgpu_noretry = 1;
 
 struct amdgpu_mgpu_info mgpu_info = {
.mutex = __MUTEX_INITIALIZER(mgpu_info.mutex),
@@ -621,7 +621,7 @@ MODULE_PARM_DESC(mes,
 module_param_named(mes, amdgpu_mes, int, 0444);
 
 MODULE_PARM_DESC(noretry,
-	"Disable retry faults (0 = retry enabled (default), 1 = retry disabled)");
+	"Disable retry faults (0 = retry enabled, 1 = retry disabled (default))");
 module_param_named(noretry, amdgpu_noretry, int, 0644);
 
 #ifdef CONFIG_HSA_AMD
--
2.17.1
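With this default flipped, the old retry behaviour can still be selected
explicitly. A usage sketch (assumes the stock amdgpu module; the parameter name
and its 0644 permissions come from the patch above, the commands are illustrative):

```shell
# Re-enable retry faults despite the new noretry=1 default.
modprobe amdgpu noretry=0

# The 0644 permission on the parameter also allows inspecting the
# current value at runtime via sysfs.
cat /sys/module/amdgpu/parameters/noretry
```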

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx