Re: [PATCH 2/2] drm/amdgpu: Use offsets local to VCN in VF

2024-03-05 Thread Christian König

Am 05.03.24 um 10:33 schrieb Lazar, Lijo:

On 3/5/2024 2:48 PM, Christian König wrote:

Am 05.03.24 um 10:03 schrieb Lazar, Lijo:

On 3/5/2024 2:24 PM, Christian König wrote:

Am 05.03.24 um 07:40 schrieb Lijo Lazar:

For VCN 4.0.3, use only the local addressing scheme while in VF
mode. This includes addressing scheme used for HUB offsets.

Signed-off-by: Lijo Lazar 
---
    drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c | 20 +++-
    1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
index 7b5ad13b618e..a27f3f260aab 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
@@ -1381,6 +1381,24 @@ static uint64_t
vcn_v4_0_3_unified_ring_get_wptr(struct amdgpu_ring *ring)
    regUVD_RB_WPTR);
    }
    +static void vcn_v4_0_3_enc_ring_emit_vm_flush(struct amdgpu_ring
*ring,
+    unsigned int vmid, uint64_t pd_addr)
+{
+    struct amdgpu_vmhub *hub;
+
+    /* For VF, only local offsets should be used */
+    if (amdgpu_sriov_vf(ring->adev))
+    ring->vm_hub = AMDGPU_MMHUB0(0);

That is clearly a no-go since the vm_hub must be statically and can't be
changed here.


After HUB allocation, the only usage of this hub pointer is to calculate
use the right offset. We still want VCN to use the right hub, only thing
is register offsets in MMHUB(0) are equal to 'local offsets'.

The vm_hub is a static setup describing how the engine works. You
basically just insert an illegal value here to fix your register offset
calculation.

That is absolutely *not* something you can do.

What exactly is the requirement?


The requirement is this way -

We have multiple MMHUBs and each VCN can talk only to the local MMHUB.

The absolute register offsets we store for each for an example reg at
offset x is
HUB0 = x, HUB1 = x + Y (stride), HUB2 = x + 2Y and so forth.

However VCN cannot use the absolute register offset in the packet in VF
mode, instead it should use a local offset for the local HUB in VF mode.
A local offset of a register in the HUB is exactly the same as offset of
HUB0 = x.

What we do here is assign the HUB as regular here
ring->vm_hub = AMDGPU_MMHUB0(adev->vcn.inst[i].aid_id);

so that amdgpu_gmc_allocate_vm_inv_eng is allocated in the right way.

Then at the time of flush, switch it back to point to HUB0 so that it
uses the offsets of HUB0 in the packet. Actually, it's not required to
switch this every time, one-time is fine after
amdgpu_gmc_allocate_vm_inv_eng() is complete.


Yeah, but that is not something we can do. vm_hub is a constant and 
should never be manipulated to fulfill a requirement like that.


If the VCN engines requires that we don't use the aid offsets in the 
VCN_ENC_CMD_REG_WRITE and VCN_ENC_CMD_REG_WAIT packets then we need to 
adjust those packets and not the vm_hub.


Also please sync up such with with Leo, he is responsible for 
coordinating all MM work on the kernel. In general the VCN team should 
never contact your team directly with such requirements.


Regards,
Christian.



Thanks,
Lijo


Regards,
Christian.


Thanks,
Lijo


Regards,
Christian.


+    hub = >adev->vmhub[ring->vm_hub];
+
+    pd_addr = amdgpu_gmc_emit_flush_gpu_tlb(ring, vmid, pd_addr);
+
+    /* wait for reg writes */
+    vcn_v2_0_enc_ring_emit_reg_wait(ring, hub->ctx0_ptb_addr_lo32 +
+    vmid * hub->ctx_addr_distance,
+    lower_32_bits(pd_addr), 0x);
+}
+
    static void vcn_v4_0_3_ring_emit_hdp_flush(struct amdgpu_ring *ring)
    {
    /* VCN engine access for HDP flush doesn't work when RRMT is
enabled.
@@ -1443,7 +1461,7 @@ static const struct amdgpu_ring_funcs
vcn_v4_0_3_unified_ring_vm_funcs = {
    .emit_ib_size = 5, /* vcn_v2_0_enc_ring_emit_ib */
    .emit_ib = vcn_v2_0_enc_ring_emit_ib,
    .emit_fence = vcn_v2_0_enc_ring_emit_fence,
-    .emit_vm_flush = vcn_v2_0_enc_ring_emit_vm_flush,
+    .emit_vm_flush = vcn_v4_0_3_enc_ring_emit_vm_flush,
    .emit_hdp_flush = vcn_v4_0_3_ring_emit_hdp_flush,
    .test_ring = amdgpu_vcn_enc_ring_test_ring,
    .test_ib = amdgpu_vcn_unified_ring_test_ib,




Re: [PATCH 2/2] drm/amdgpu: Use offsets local to VCN in VF

2024-03-05 Thread Lazar, Lijo



On 3/5/2024 2:48 PM, Christian König wrote:
> Am 05.03.24 um 10:03 schrieb Lazar, Lijo:
>>
>> On 3/5/2024 2:24 PM, Christian König wrote:
>>>
>>> Am 05.03.24 um 07:40 schrieb Lijo Lazar:
 For VCN 4.0.3, use only the local addressing scheme while in VF
 mode. This includes addressing scheme used for HUB offsets.

 Signed-off-by: Lijo Lazar 
 ---
    drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c | 20 +++-
    1 file changed, 19 insertions(+), 1 deletion(-)

 diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
 b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
 index 7b5ad13b618e..a27f3f260aab 100644
 --- a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
 +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
 @@ -1381,6 +1381,24 @@ static uint64_t
 vcn_v4_0_3_unified_ring_get_wptr(struct amdgpu_ring *ring)
    regUVD_RB_WPTR);
    }
    +static void vcn_v4_0_3_enc_ring_emit_vm_flush(struct amdgpu_ring
 *ring,
 +    unsigned int vmid, uint64_t pd_addr)
 +{
 +    struct amdgpu_vmhub *hub;
 +
 +    /* For VF, only local offsets should be used */
 +    if (amdgpu_sriov_vf(ring->adev))
 +    ring->vm_hub = AMDGPU_MMHUB0(0);
>>> That is clearly a no-go since the vm_hub must be statically and can't be
>>> changed here.
>>>
>> After HUB allocation, the only usage of this hub pointer is to calculate
>> use the right offset. We still want VCN to use the right hub, only thing
>> is register offsets in MMHUB(0) are equal to 'local offsets'.
> 
> The vm_hub is a static setup describing how the engine works. You
> basically just insert an illegal value here to fix your register offset
> calculation.
> 
> That is absolutely *not* something you can do.
> 
> What exactly is the requirement?
> 

The requirement is this way -

We have multiple MMHUBs and each VCN can talk only to the local MMHUB.

The absolute register offsets we store for each for an example reg at
offset x is
HUB0 = x, HUB1 = x + Y (stride), HUB2 = x + 2Y and so forth.

However VCN cannot use the absolute register offset in the packet in VF
mode, instead it should use a local offset for the local HUB in VF mode.
A local offset of a register in the HUB is exactly the same as offset of
HUB0 = x.

What we do here is assign the HUB as regular here
ring->vm_hub = AMDGPU_MMHUB0(adev->vcn.inst[i].aid_id);

so that amdgpu_gmc_allocate_vm_inv_eng is allocated in the right way.

Then at the time of flush, switch it back to point to HUB0 so that it
uses the offsets of HUB0 in the packet. Actually, it's not required to
switch this every time, one-time is fine after
amdgpu_gmc_allocate_vm_inv_eng() is complete.

Thanks,
Lijo

> Regards,
> Christian.
> 
>>
>> Thanks,
>> Lijo
>>
>>> Regards,
>>> Christian.
>>>
 +    hub = >adev->vmhub[ring->vm_hub];
 +
 +    pd_addr = amdgpu_gmc_emit_flush_gpu_tlb(ring, vmid, pd_addr);
 +
 +    /* wait for reg writes */
 +    vcn_v2_0_enc_ring_emit_reg_wait(ring, hub->ctx0_ptb_addr_lo32 +
 +    vmid * hub->ctx_addr_distance,
 +    lower_32_bits(pd_addr), 0x);
 +}
 +
    static void vcn_v4_0_3_ring_emit_hdp_flush(struct amdgpu_ring *ring)
    {
    /* VCN engine access for HDP flush doesn't work when RRMT is
 enabled.
 @@ -1443,7 +1461,7 @@ static const struct amdgpu_ring_funcs
 vcn_v4_0_3_unified_ring_vm_funcs = {
    .emit_ib_size = 5, /* vcn_v2_0_enc_ring_emit_ib */
    .emit_ib = vcn_v2_0_enc_ring_emit_ib,
    .emit_fence = vcn_v2_0_enc_ring_emit_fence,
 -    .emit_vm_flush = vcn_v2_0_enc_ring_emit_vm_flush,
 +    .emit_vm_flush = vcn_v4_0_3_enc_ring_emit_vm_flush,
    .emit_hdp_flush = vcn_v4_0_3_ring_emit_hdp_flush,
    .test_ring = amdgpu_vcn_enc_ring_test_ring,
    .test_ib = amdgpu_vcn_unified_ring_test_ib,
> 


Re: [PATCH 2/2] drm/amdgpu: Use offsets local to VCN in VF

2024-03-05 Thread Christian König

Am 05.03.24 um 10:03 schrieb Lazar, Lijo:


On 3/5/2024 2:24 PM, Christian König wrote:


Am 05.03.24 um 07:40 schrieb Lijo Lazar:

For VCN 4.0.3, use only the local addressing scheme while in VF
mode. This includes addressing scheme used for HUB offsets.

Signed-off-by: Lijo Lazar 
---
   drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c | 20 +++-
   1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
index 7b5ad13b618e..a27f3f260aab 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
@@ -1381,6 +1381,24 @@ static uint64_t
vcn_v4_0_3_unified_ring_get_wptr(struct amdgpu_ring *ring)
   regUVD_RB_WPTR);
   }
   +static void vcn_v4_0_3_enc_ring_emit_vm_flush(struct amdgpu_ring
*ring,
+    unsigned int vmid, uint64_t pd_addr)
+{
+    struct amdgpu_vmhub *hub;
+
+    /* For VF, only local offsets should be used */
+    if (amdgpu_sriov_vf(ring->adev))
+    ring->vm_hub = AMDGPU_MMHUB0(0);

That is clearly a no-go since the vm_hub must be statically and can't be
changed here.


After HUB allocation, the only usage of this hub pointer is to calculate
use the right offset. We still want VCN to use the right hub, only thing
is register offsets in MMHUB(0) are equal to 'local offsets'.


The vm_hub is a static setup describing how the engine works. You 
basically just insert an illegal value here to fix your register offset 
calculation.


That is absolutely *not* something you can do.

What exactly is the requirement?

Regards,
Christian.



Thanks,
Lijo


Regards,
Christian.


+    hub = >adev->vmhub[ring->vm_hub];
+
+    pd_addr = amdgpu_gmc_emit_flush_gpu_tlb(ring, vmid, pd_addr);
+
+    /* wait for reg writes */
+    vcn_v2_0_enc_ring_emit_reg_wait(ring, hub->ctx0_ptb_addr_lo32 +
+    vmid * hub->ctx_addr_distance,
+    lower_32_bits(pd_addr), 0x);
+}
+
   static void vcn_v4_0_3_ring_emit_hdp_flush(struct amdgpu_ring *ring)
   {
   /* VCN engine access for HDP flush doesn't work when RRMT is
enabled.
@@ -1443,7 +1461,7 @@ static const struct amdgpu_ring_funcs
vcn_v4_0_3_unified_ring_vm_funcs = {
   .emit_ib_size = 5, /* vcn_v2_0_enc_ring_emit_ib */
   .emit_ib = vcn_v2_0_enc_ring_emit_ib,
   .emit_fence = vcn_v2_0_enc_ring_emit_fence,
-    .emit_vm_flush = vcn_v2_0_enc_ring_emit_vm_flush,
+    .emit_vm_flush = vcn_v4_0_3_enc_ring_emit_vm_flush,
   .emit_hdp_flush = vcn_v4_0_3_ring_emit_hdp_flush,
   .test_ring = amdgpu_vcn_enc_ring_test_ring,
   .test_ib = amdgpu_vcn_unified_ring_test_ib,




Re: [PATCH 2/2] drm/amdgpu: Use offsets local to VCN in VF

2024-03-05 Thread Lazar, Lijo



On 3/5/2024 2:24 PM, Christian König wrote:
> 
> 
> Am 05.03.24 um 07:40 schrieb Lijo Lazar:
>> For VCN 4.0.3, use only the local addressing scheme while in VF
>> mode. This includes addressing scheme used for HUB offsets.
>>
>> Signed-off-by: Lijo Lazar 
>> ---
>>   drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c | 20 +++-
>>   1 file changed, 19 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
>> b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
>> index 7b5ad13b618e..a27f3f260aab 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
>> @@ -1381,6 +1381,24 @@ static uint64_t
>> vcn_v4_0_3_unified_ring_get_wptr(struct amdgpu_ring *ring)
>>   regUVD_RB_WPTR);
>>   }
>>   +static void vcn_v4_0_3_enc_ring_emit_vm_flush(struct amdgpu_ring
>> *ring,
>> +    unsigned int vmid, uint64_t pd_addr)
>> +{
>> +    struct amdgpu_vmhub *hub;
>> +
>> +    /* For VF, only local offsets should be used */
>> +    if (amdgpu_sriov_vf(ring->adev))
>> +    ring->vm_hub = AMDGPU_MMHUB0(0);
> 
> That is clearly a no-go since the vm_hub must be statically and can't be
> changed here.
> 

After HUB allocation, the only usage of this hub pointer is to calculate
use the right offset. We still want VCN to use the right hub, only thing
is register offsets in MMHUB(0) are equal to 'local offsets'.

Thanks,
Lijo

> Regards,
> Christian.
> 
>> +    hub = >adev->vmhub[ring->vm_hub];
>> +
>> +    pd_addr = amdgpu_gmc_emit_flush_gpu_tlb(ring, vmid, pd_addr);
>> +
>> +    /* wait for reg writes */
>> +    vcn_v2_0_enc_ring_emit_reg_wait(ring, hub->ctx0_ptb_addr_lo32 +
>> +    vmid * hub->ctx_addr_distance,
>> +    lower_32_bits(pd_addr), 0x);
>> +}
>> +
>>   static void vcn_v4_0_3_ring_emit_hdp_flush(struct amdgpu_ring *ring)
>>   {
>>   /* VCN engine access for HDP flush doesn't work when RRMT is
>> enabled.
>> @@ -1443,7 +1461,7 @@ static const struct amdgpu_ring_funcs
>> vcn_v4_0_3_unified_ring_vm_funcs = {
>>   .emit_ib_size = 5, /* vcn_v2_0_enc_ring_emit_ib */
>>   .emit_ib = vcn_v2_0_enc_ring_emit_ib,
>>   .emit_fence = vcn_v2_0_enc_ring_emit_fence,
>> -    .emit_vm_flush = vcn_v2_0_enc_ring_emit_vm_flush,
>> +    .emit_vm_flush = vcn_v4_0_3_enc_ring_emit_vm_flush,
>>   .emit_hdp_flush = vcn_v4_0_3_ring_emit_hdp_flush,
>>   .test_ring = amdgpu_vcn_enc_ring_test_ring,
>>   .test_ib = amdgpu_vcn_unified_ring_test_ib,
> 


Re: [PATCH 2/2] drm/amdgpu: Use offsets local to VCN in VF

2024-03-05 Thread Christian König




Am 05.03.24 um 07:40 schrieb Lijo Lazar:

For VCN 4.0.3, use only the local addressing scheme while in VF
mode. This includes addressing scheme used for HUB offsets.

Signed-off-by: Lijo Lazar 
---
  drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c | 20 +++-
  1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
index 7b5ad13b618e..a27f3f260aab 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
@@ -1381,6 +1381,24 @@ static uint64_t vcn_v4_0_3_unified_ring_get_wptr(struct 
amdgpu_ring *ring)
regUVD_RB_WPTR);
  }
  
+static void vcn_v4_0_3_enc_ring_emit_vm_flush(struct amdgpu_ring *ring,

+   unsigned int vmid, uint64_t pd_addr)
+{
+   struct amdgpu_vmhub *hub;
+
+   /* For VF, only local offsets should be used */
+   if (amdgpu_sriov_vf(ring->adev))
+   ring->vm_hub = AMDGPU_MMHUB0(0);


That is clearly a no-go since the vm_hub must be statically and can't be 
changed here.


Regards,
Christian.


+   hub = >adev->vmhub[ring->vm_hub];
+
+   pd_addr = amdgpu_gmc_emit_flush_gpu_tlb(ring, vmid, pd_addr);
+
+   /* wait for reg writes */
+   vcn_v2_0_enc_ring_emit_reg_wait(ring, hub->ctx0_ptb_addr_lo32 +
+   vmid * hub->ctx_addr_distance,
+   lower_32_bits(pd_addr), 0x);
+}
+
  static void vcn_v4_0_3_ring_emit_hdp_flush(struct amdgpu_ring *ring)
  {
/* VCN engine access for HDP flush doesn't work when RRMT is enabled.
@@ -1443,7 +1461,7 @@ static const struct amdgpu_ring_funcs 
vcn_v4_0_3_unified_ring_vm_funcs = {
.emit_ib_size = 5, /* vcn_v2_0_enc_ring_emit_ib */
.emit_ib = vcn_v2_0_enc_ring_emit_ib,
.emit_fence = vcn_v2_0_enc_ring_emit_fence,
-   .emit_vm_flush = vcn_v2_0_enc_ring_emit_vm_flush,
+   .emit_vm_flush = vcn_v4_0_3_enc_ring_emit_vm_flush,
.emit_hdp_flush = vcn_v4_0_3_ring_emit_hdp_flush,
.test_ring = amdgpu_vcn_enc_ring_test_ring,
.test_ib = amdgpu_vcn_unified_ring_test_ib,




[PATCH 2/2] drm/amdgpu: Use offsets local to VCN in VF

2024-03-04 Thread Lijo Lazar
For VCN 4.0.3, use only the local addressing scheme while in VF
mode. This includes addressing scheme used for HUB offsets.

Signed-off-by: Lijo Lazar 
---
 drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c | 20 +++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c 
b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
index 7b5ad13b618e..a27f3f260aab 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c
@@ -1381,6 +1381,24 @@ static uint64_t vcn_v4_0_3_unified_ring_get_wptr(struct 
amdgpu_ring *ring)
regUVD_RB_WPTR);
 }
 
+static void vcn_v4_0_3_enc_ring_emit_vm_flush(struct amdgpu_ring *ring,
+   unsigned int vmid, uint64_t pd_addr)
+{
+   struct amdgpu_vmhub *hub;
+
+   /* For VF, only local offsets should be used */
+   if (amdgpu_sriov_vf(ring->adev))
+   ring->vm_hub = AMDGPU_MMHUB0(0);
+   hub = >adev->vmhub[ring->vm_hub];
+
+   pd_addr = amdgpu_gmc_emit_flush_gpu_tlb(ring, vmid, pd_addr);
+
+   /* wait for reg writes */
+   vcn_v2_0_enc_ring_emit_reg_wait(ring, hub->ctx0_ptb_addr_lo32 +
+   vmid * hub->ctx_addr_distance,
+   lower_32_bits(pd_addr), 0x);
+}
+
 static void vcn_v4_0_3_ring_emit_hdp_flush(struct amdgpu_ring *ring)
 {
/* VCN engine access for HDP flush doesn't work when RRMT is enabled.
@@ -1443,7 +1461,7 @@ static const struct amdgpu_ring_funcs 
vcn_v4_0_3_unified_ring_vm_funcs = {
.emit_ib_size = 5, /* vcn_v2_0_enc_ring_emit_ib */
.emit_ib = vcn_v2_0_enc_ring_emit_ib,
.emit_fence = vcn_v2_0_enc_ring_emit_fence,
-   .emit_vm_flush = vcn_v2_0_enc_ring_emit_vm_flush,
+   .emit_vm_flush = vcn_v4_0_3_enc_ring_emit_vm_flush,
.emit_hdp_flush = vcn_v4_0_3_ring_emit_hdp_flush,
.test_ring = amdgpu_vcn_enc_ring_test_ring,
.test_ib = amdgpu_vcn_unified_ring_test_ib,
-- 
2.25.1