Re: [PATCH 1/2] drm/ttm: fix out-of-bounds read in ttm_put_pages() v2

2019-04-09 Thread Zhang, Jerry(Junwei)

On 4/8/19 9:13 PM, Christian König wrote:

When ttm_put_pages() tries to figure out whether it's dealing with
transparent hugepages, it just reads past the bounds of the pages array
without a check.

v2: simplify the test for whether enough pages are left in the array (Christian).

Series is Reviewed-by: Junwei Zhang 

Regards,
Jerry


Signed-off-by: Jann Horn 
Signed-off-by: Christian König 
Fixes: 5c42c64f7d54 ("drm/ttm: fix the fix for huge compound pages")
Cc: sta...@vger.kernel.org
---
  drivers/gpu/drm/ttm/ttm_page_alloc.c | 5 +++--
  1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc.c 
b/drivers/gpu/drm/ttm/ttm_page_alloc.c
index f841accc2c00..f77c81db161b 100644
--- a/drivers/gpu/drm/ttm/ttm_page_alloc.c
+++ b/drivers/gpu/drm/ttm/ttm_page_alloc.c
@@ -730,7 +730,8 @@ static void ttm_put_pages(struct page **pages, unsigned npages, int flags,
}
  
  #ifdef CONFIG_TRANSPARENT_HUGEPAGE

-   if (!(flags & TTM_PAGE_FLAG_DMA32)) {
+   if (!(flags & TTM_PAGE_FLAG_DMA32) &&
+   (npages - i) >= HPAGE_PMD_NR) {
for (j = 0; j < HPAGE_PMD_NR; ++j)
if (p++ != pages[i + j])
break;
@@ -759,7 +760,7 @@ static void ttm_put_pages(struct page **pages, unsigned npages, int flags,
unsigned max_size, n2free;
  
		spin_lock_irqsave(&pool->lock, irq_flags);

-   while (i < npages) {
+   while ((npages - i) >= HPAGE_PMD_NR) {
struct page *p = pages[i];
unsigned j;
  


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
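
As a side note, here is a minimal userspace sketch of the bounds problem the patch fixes; HPAGE_PMD_NR's value and the page representation are stand-ins for the kernel's definitions, not the TTM code itself:

```c
/* Model of the huge-page detection in ttm_put_pages(): before scanning
 * HPAGE_PMD_NR consecutive entries, the loop must verify that many
 * entries actually remain, which is exactly the v2 check. */
#include <stdbool.h>
#include <stdio.h>

#define HPAGE_PMD_NR 512	/* stand-in value */

/* Returns true if pages[i..i+HPAGE_PMD_NR-1] form a contiguous run. */
static bool is_huge_run(const unsigned long *pages, unsigned npages, unsigned i)
{
	unsigned j;

	/* The v2 fix: bail out unless enough entries are left in the array. */
	if ((npages - i) < HPAGE_PMD_NR)
		return false;

	for (j = 1; j < HPAGE_PMD_NR; ++j)
		if (pages[i + j] != pages[i] + j)	/* would read OOB without the check above */
			return false;
	return true;
}

int main(void)
{
	unsigned long pages[4] = { 100, 101, 102, 103 };

	/* Only 4 entries: without the (npages - i) check, the scan would
	 * read far past the end of the array. */
	printf("huge run: %d\n", is_huge_run(pages, 4, 0));
	return 0;
}
```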

Re: [PATCH] drm/amdgpu: fix CPDMA hang in PRT mode for VEGA20

2019-01-08 Thread Zhang, Jerry(Junwei)

On 1/9/19 10:23 AM, Zhou1, Tao wrote:



-Original Message-
From: amd-gfx  On Behalf Of
Zhang, Jerry(Junwei)
Sent: January 9, 2019 9:39
To: Zhou1, Tao ; amd-gfx@lists.freedesktop.org
Cc: Li, Yukun1 
Subject: Re: [PATCH] drm/amdgpu: fix CPDMA hang in PRT mode for VEGA20

On 1/8/19 6:55 PM, Tao Zhou wrote:

Fix CPDMA hang in PRT mode for both of VEGA10 and VEGA20

Change-Id: I0e5e089d2192063c4a04fa6dbd534f25eb0177be
Signed-off-by: Tao Zhou 
Tested-by: Yukun.Li 
---
   drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 10 +-
   1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 968b127c6c8f..fbca0494f871 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -113,7 +113,10 @@ static const struct soc15_reg_golden golden_settings_gc_9_0[] =
 	SOC15_REG_GOLDEN_VALUE(GC, 0, mmTCP_CHAN_STEER_HI, 0xffffffff, 0x4a2c0e68),
 	SOC15_REG_GOLDEN_VALUE(GC, 0, mmTCP_CHAN_STEER_LO, 0xffffffff, 0xb5d3f197),
 	SOC15_REG_GOLDEN_VALUE(GC, 0, mmVGT_CACHE_INVALIDATION, 0x3fff3af3, 0x19200000),
-	SOC15_REG_GOLDEN_VALUE(GC, 0, mmVGT_GS_MAX_WAVE_ID, 0x00000fff, 0x000003ff)
+	SOC15_REG_GOLDEN_VALUE(GC, 0, mmVGT_GS_MAX_WAVE_ID, 0x00000fff, 0x000003ff),
+	SOC15_REG_GOLDEN_VALUE(GC, 0, mmCP_MEC1_F32_INT_DIS, 0xffffffff, 0x00000800),
+	SOC15_REG_GOLDEN_VALUE(GC, 0, mmCP_MEC2_F32_INT_DIS, 0xffffffff, 0x00000800),
+	SOC15_REG_GOLDEN_VALUE(GC, 0, mmCP_DEBUG, 0xffffffff, 0x80000000)
IIRC, CP_DEBUG is the key to fixing the CPDMA hang. Do we need the other
settings too, or are they just to align with the latest golden-settings status?

Jerry


Set CPF_INT_DMA in reg CP_MECx_F32_INT_DIS for Compute and set
DISABLE_GFX_HALT_ON_UTCL1_ERROR in CP_DEBUG for GFX.

All the settings are needed.


Thanks for the confirmation. I almost forgot compute; that's fine.
BTW, I'm not sure whether PRT is used for compute as well.

Jerry



Tao


   };

   static const struct soc15_reg_golden golden_settings_gc_9_0_vg10[] =
@@ -135,10 +138,7 @@ static const struct soc15_reg_golden golden_settings_gc_9_0_vg10[] =
 	SOC15_REG_GOLDEN_VALUE(GC, 0, mmRMI_UTCL1_CNTL2, 0x00030000, 0x00020000),
 	SOC15_REG_GOLDEN_VALUE(GC, 0, mmSPI_CONFIG_CNTL_1, 0x0000000f, 0x01000107),
 	SOC15_REG_GOLDEN_VALUE(GC, 0, mmTD_CNTL, 0x00001800, 0x00000800),
-	SOC15_REG_GOLDEN_VALUE(GC, 0, mmWD_UTCL1_CNTL, 0x08000000, 0x08000080),
-	SOC15_REG_GOLDEN_VALUE(GC, 0, mmCP_MEC1_F32_INT_DIS, 0xffffffff, 0x00000800),
-	SOC15_REG_GOLDEN_VALUE(GC, 0, mmCP_MEC2_F32_INT_DIS, 0xffffffff, 0x00000800),
-	SOC15_REG_GOLDEN_VALUE(GC, 0, mmCP_DEBUG, 0xffffffff, 0x80000000)
+	SOC15_REG_GOLDEN_VALUE(GC, 0, mmWD_UTCL1_CNTL, 0x08000000, 0x08000080)
   };

   static const struct soc15_reg_golden golden_settings_gc_9_0_vg20[] =



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
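
As background for the settings above, a userspace model of how I understand a soc15 golden-register entry is applied; the and/or semantics are my reading of soc15_program_register_sequence(), an assumption rather than anything quoted in this thread:

```c
/* Each golden entry carries an AND mask (bits to clear) and an OR value
 * (bits to set); an all-ones AND mask effectively replaces the register. */
#include <stdint.h>
#include <stdio.h>

struct golden_entry {
	uint32_t and_mask;	/* bits cleared from the current value */
	uint32_t or_value;	/* bits set afterwards */
};

static uint32_t apply_golden(uint32_t current, const struct golden_entry *e)
{
	return (current & ~e->and_mask) | e->or_value;
}

int main(void)
{
	/* Hypothetical CP_DEBUG update: set bit 31, which the thread
	 * identifies as DISABLE_GFX_HALT_ON_UTCL1_ERROR. */
	struct golden_entry cp_debug = { 0xffffffff, 0x80000000 };

	printf("0x%08x\n", apply_golden(0x00001234, &cp_debug));
	return 0;
}
```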


Re: [PATCH] drm/amdgpu: fix CPDMA hang in PRT mode for VEGA20

2019-01-08 Thread Zhang, Jerry(Junwei)

On 1/8/19 6:55 PM, Tao Zhou wrote:

Fix CPDMA hang in PRT mode for both of VEGA10 and VEGA20

Change-Id: I0e5e089d2192063c4a04fa6dbd534f25eb0177be
Signed-off-by: Tao Zhou 
Tested-by: Yukun.Li 
---
  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 10 +-
  1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 968b127c6c8f..fbca0494f871 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -113,7 +113,10 @@ static const struct soc15_reg_golden golden_settings_gc_9_0[] =
 	SOC15_REG_GOLDEN_VALUE(GC, 0, mmTCP_CHAN_STEER_HI, 0xffffffff, 0x4a2c0e68),
 	SOC15_REG_GOLDEN_VALUE(GC, 0, mmTCP_CHAN_STEER_LO, 0xffffffff, 0xb5d3f197),
 	SOC15_REG_GOLDEN_VALUE(GC, 0, mmVGT_CACHE_INVALIDATION, 0x3fff3af3, 0x19200000),
-	SOC15_REG_GOLDEN_VALUE(GC, 0, mmVGT_GS_MAX_WAVE_ID, 0x00000fff, 0x000003ff)
+	SOC15_REG_GOLDEN_VALUE(GC, 0, mmVGT_GS_MAX_WAVE_ID, 0x00000fff, 0x000003ff),
+	SOC15_REG_GOLDEN_VALUE(GC, 0, mmCP_MEC1_F32_INT_DIS, 0xffffffff, 0x00000800),
+	SOC15_REG_GOLDEN_VALUE(GC, 0, mmCP_MEC2_F32_INT_DIS, 0xffffffff, 0x00000800),
+	SOC15_REG_GOLDEN_VALUE(GC, 0, mmCP_DEBUG, 0xffffffff, 0x80000000)
IIRC, CP_DEBUG is the key to fixing the CPDMA hang. Do we need the other
settings too, or are they just to align with the latest golden-settings status?

Jerry


  };
  
  static const struct soc15_reg_golden golden_settings_gc_9_0_vg10[] =

@@ -135,10 +138,7 @@ static const struct soc15_reg_golden golden_settings_gc_9_0_vg10[] =
 	SOC15_REG_GOLDEN_VALUE(GC, 0, mmRMI_UTCL1_CNTL2, 0x00030000, 0x00020000),
 	SOC15_REG_GOLDEN_VALUE(GC, 0, mmSPI_CONFIG_CNTL_1, 0x0000000f, 0x01000107),
 	SOC15_REG_GOLDEN_VALUE(GC, 0, mmTD_CNTL, 0x00001800, 0x00000800),
-	SOC15_REG_GOLDEN_VALUE(GC, 0, mmWD_UTCL1_CNTL, 0x08000000, 0x08000080),
-	SOC15_REG_GOLDEN_VALUE(GC, 0, mmCP_MEC1_F32_INT_DIS, 0xffffffff, 0x00000800),
-	SOC15_REG_GOLDEN_VALUE(GC, 0, mmCP_MEC2_F32_INT_DIS, 0xffffffff, 0x00000800),
-	SOC15_REG_GOLDEN_VALUE(GC, 0, mmCP_DEBUG, 0xffffffff, 0x80000000)
+	SOC15_REG_GOLDEN_VALUE(GC, 0, mmWD_UTCL1_CNTL, 0x08000000, 0x08000080)
  };
  
  static const struct soc15_reg_golden golden_settings_gc_9_0_vg20[] =


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 2/2] drm/amdgpu: update MC firmware image for polaris10 variants

2018-12-10 Thread Zhang, Jerry(Junwei)

On 12/11/18 4:06 AM, Alex Deucher wrote:

On Fri, Dec 7, 2018 at 3:40 AM Zhang, Jerry(Junwei)  wrote:

we can drop the MC update patch, since a new firmware could fix that.

Shouldn't we apply this as well for consistency?


I did apply it for a simple test; it looks harmless.
But after checking the MC firmware version table, this variant shares the
same MC firmware as P10, so I'm dropping this patch now.

That means not every variant uses a newer MC firmware, as far as I can see.

Regards,
Jerry



Alex


Regards,
Jerry

On 12/7/18 3:19 PM, Junwei Zhang wrote:

Some new variants require different firmwares.

Signed-off-by: Junwei Zhang 
---
   drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 5 -
   1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
index 1ad7e6b8ed1d..0edb8622f666 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
@@ -244,7 +244,10 @@ static int gmc_v8_0_init_microcode(struct amdgpu_device *adev)
 	case CHIP_POLARIS10:
 		if ((adev->pdev->device == 0x67df) &&
 		    ((adev->pdev->revision == 0xe1) ||
-		     (adev->pdev->revision == 0xf7)))
+		     (adev->pdev->revision == 0xf7)) ||
+		    ((adev->pdev->device == 0x6fdf) &&
+		     ((adev->pdev->revision == 0xef) ||
+		      (adev->pdev->revision == 0xff))))
   chip_name = "polaris10_k";
   else
   chip_name = "polaris10";



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 2/2] drm/amdgpu: update MC firmware image for polaris10 variants

2018-12-07 Thread Zhang, Jerry(Junwei)

we can drop the MC update patch, since a new firmware could fix that.

Regards,
Jerry

On 12/7/18 3:19 PM, Junwei Zhang wrote:

Some new variants require different firmwares.

Signed-off-by: Junwei Zhang 
---
  drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 5 -
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
index 1ad7e6b8ed1d..0edb8622f666 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
@@ -244,7 +244,10 @@ static int gmc_v8_0_init_microcode(struct amdgpu_device *adev)
 	case CHIP_POLARIS10:
 		if ((adev->pdev->device == 0x67df) &&
 		    ((adev->pdev->revision == 0xe1) ||
-		     (adev->pdev->revision == 0xf7)))
+		     (adev->pdev->revision == 0xf7)) ||
+		    ((adev->pdev->device == 0x6fdf) &&
+		     ((adev->pdev->revision == 0xef) ||
+		      (adev->pdev->revision == 0xff))))
chip_name = "polaris10_k";
else
chip_name = "polaris10";


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
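
A hypothetical table-driven refactor of the selection above, just to illustrate that the nested &&/|| chain is really a (device, revision) to firmware-name lookup; this is not the committed code:

```c
#include <stdint.h>
#include <stdio.h>

struct fw_quirk {
	uint16_t device;
	uint8_t revision;
};

/* (device, revision) pairs from the patch that need polaris10_k firmware. */
static const struct fw_quirk polaris10_k_ids[] = {
	{ 0x67df, 0xe1 },
	{ 0x67df, 0xf7 },
	{ 0x6fdf, 0xef },
	{ 0x6fdf, 0xff },
};

static const char *polaris10_mc_name(uint16_t device, uint8_t revision)
{
	unsigned i;

	for (i = 0; i < sizeof(polaris10_k_ids) / sizeof(polaris10_k_ids[0]); ++i)
		if (polaris10_k_ids[i].device == device &&
		    polaris10_k_ids[i].revision == revision)
			return "polaris10_k";
	return "polaris10";
}

int main(void)
{
	printf("%s\n", polaris10_mc_name(0x6fdf, 0xff));	/* polaris10_k */
	return 0;
}
```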


Re: [PATCH 2/2] drm/amdgpu/powerplay: fix clock stretcher limits on polaris

2018-12-03 Thread Zhang, Jerry(Junwei)

On 12/4/18 12:21 AM, Alex Deucher wrote:

Adjust limits for newer polaris variants.

Signed-off-by: Alex Deucher 
---
  drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.c | 17 +++--
  1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.c 
b/drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.c
index 2b2c26616902..1f8736b8291d 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.c
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.c
@@ -1528,8 +1528,21 @@ static int polaris10_populate_clock_stretcher_data_table(struct pp_hwmgr *hwmgr)
efuse = efuse >> 24;
  
  	if (hwmgr->chip_id == CHIP_POLARIS10) {

-   min = 1000;
-   max = 2300;
+   if (hwmgr->is_kicker) {
+   min = 1200;
+   max = 2500;
+   } else {
+   min = 1000;
+   max = 2300;
+   }
+   } else if (hwmgr->chip_id == CHIP_POLARIS11) {
+   if (hwmgr->is_kicker) {
+   min = 900;
+   max = 2500;


the max is 2100, I think.

Apart from that, it's
Reviewed-by: Junwei Zhang 

Regards,
Jerry


+   } else {
+   min = 1100;
+   max = 2100;
+   }
} else {
min = 1100;
max = 2100;


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
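
For readability, the (chip, kicker) limits from the patch collected into one table; note that the review above disputes the POLARIS11 kicker max (the patch says 2500, the reviewer believes 2100):

```c
#include <stdbool.h>
#include <stdio.h>

struct cks_limits { const char *chip; bool kicker; int min, max; };

/* Values as posted in the patch, not independently verified. */
static const struct cks_limits limits[] = {
	{ "POLARIS10", true,  1200, 2500 },
	{ "POLARIS10", false, 1000, 2300 },
	{ "POLARIS11", true,   900, 2500 },	/* disputed: reviewer says max 2100 */
	{ "POLARIS11", false, 1100, 2100 },
	{ "other",     false, 1100, 2100 },
};

int main(void)
{
	unsigned i;

	for (i = 0; i < sizeof(limits) / sizeof(limits[0]); ++i)
		printf("%-10s kicker=%d  min=%d max=%d\n", limits[i].chip,
		       limits[i].kicker, limits[i].min, limits[i].max);
	return 0;
}
```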


Re: [PATCH 1/2] drm/amdgpu/powerplay: fix mclk switch limit on polaris

2018-12-03 Thread Zhang, Jerry(Junwei)

On 12/4/18 12:21 AM, Alex Deucher wrote:

Update switch limit on newer polaris variants.  This may fix
flickering with high refresh rates with mclk switching enabled.

Signed-off-by: Alex Deucher 

Reviewed-by: Junwei Zhang 


---
  drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c | 5 -
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
index 5dcd21d29dbf..1f12fc7ea7c9 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
@@ -2859,7 +2859,10 @@ static int smu7_vblank_too_short(struct pp_hwmgr *hwmgr,
case CHIP_POLARIS10:
case CHIP_POLARIS11:
case CHIP_POLARIS12:
-   switch_limit_us = data->is_memory_gddr5 ? 190 : 150;
+   if (hwmgr->is_kicker)
+   switch_limit_us = data->is_memory_gddr5 ? 450 : 150;
+   else
+   switch_limit_us = data->is_memory_gddr5 ? 190 : 150;
break;
case CHIP_VEGAM:
switch_limit_us = 30;


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: disable UVD/VCE for some polaris 12 variants

2018-11-26 Thread Zhang, Jerry(Junwei)

On 11/26/18 5:28 PM, Christian König wrote:

On 26.11.18 at 03:38, Zhang, Jerry(Junwei) wrote:

On 11/24/18 3:32 AM, Deucher, Alexander wrote:


Is this required? Are the harvesting fuses incorrect?  If the blocks 
are harvested, we should bail out of the blocks properly during 
init.  Also, please make this more explicit if we still need it.  E.g.,





The harvest fuse does indeed disable UVD and VCE, as it's a mining card.
Any command sent to UVD/VCE then causes a NULL pointer issue, e.g. in amdgpu_test.


In this case we should fix the NULL pointer issue instead. Do you have 
a backtrace for this?


Sorry, I missed the detail.
The NULL pointer dereference happens because UVD is never initialized, since
it's disabled in the VBIOS for this kind of card.


On CS submit, amdgpu_cs_ib_fill() checks ring->funcs->parse_cs.
However, uvd_v6_0_early_init() skips setting the ring functions, because
CC_HARVEST_FUSES reports UVD/VCE as disabled.

Then the access to UVD/VCE ring's funcs will cause NULL pointer issue.

BTW, Windows driver disables UVD/VCE for it as well.

Regards,
Jerry



Regards,
Christian.



AFAIK, Windows also disables UVD and VCE during initialization.


	if ((adev->pdev->device == 0x67df) &&
	    (adev->pdev->revision == 0xf7)) {
		/* Some polaris12 variants don't support UVD/VCE */
	} else {
		amdgpu_device_ip_block_add(adev, &uvd_v6_3_ip_block);
		amdgpu_device_ip_block_add(adev, &vce_v3_4_ip_block);
	}




OK, I will make the check explicit.

Regards,
Jerry


That way if we re-arrange the order later, it will be easier to track.


Alex


*From:* amd-gfx  on behalf of 
Junwei Zhang 

*Sent:* Friday, November 23, 2018 3:32:27 AM
*To:* amd-gfx@lists.freedesktop.org
*Cc:* Zhang, Jerry
*Subject:* [PATCH] drm/amdgpu: disable UVD/VCE for some polaris 12 
variants

Some variants don't support UVD and VCE.

Signed-off-by: Junwei Zhang 
---
 drivers/gpu/drm/amd/amdgpu/vi.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c 
b/drivers/gpu/drm/amd/amdgpu/vi.c

index f3a4cf1f013a..3338b013ded4 100644
--- a/drivers/gpu/drm/amd/amdgpu/vi.c
+++ b/drivers/gpu/drm/amd/amdgpu/vi.c
@@ -1660,6 +1660,10 @@ int vi_set_ip_blocks(struct amdgpu_device *adev)
 		amdgpu_device_ip_block_add(adev, &dce_v11_2_ip_block);
 		amdgpu_device_ip_block_add(adev, &gfx_v8_0_ip_block);
 		amdgpu_device_ip_block_add(adev, &sdma_v3_1_ip_block);
+		/* Some polaris12 variants don't support UVD/VCE */
+		if ((adev->pdev->device == 0x67df) &&
+		    (adev->pdev->revision == 0xf7))
+			break;
 		amdgpu_device_ip_block_add(adev, &uvd_v6_3_ip_block);
 		amdgpu_device_ip_block_add(adev, &vce_v3_4_ip_block);
 		break;
--
2.17.1





___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: disable UVD/VCE for some polaris 12 variants

2018-11-25 Thread Zhang, Jerry(Junwei)

On 11/24/18 3:32 AM, Deucher, Alexander wrote:


Is this required?  Are the harvesting fuses incorrect?  If the blocks 
are harvested, we should bail out of the blocks properly during init. 
 Also, please make this more explicit if we still need it.  E.g.,





The harvest fuse does indeed disable UVD and VCE, as it's a mining card.
Any command sent to UVD/VCE then causes a NULL pointer issue, e.g. in amdgpu_test.

AFAIK, Windows also disables UVD and VCE during initialization.


	if ((adev->pdev->device == 0x67df) &&
	    (adev->pdev->revision == 0xf7)) {
		/* Some polaris12 variants don't support UVD/VCE */
	} else {
		amdgpu_device_ip_block_add(adev, &uvd_v6_3_ip_block);
		amdgpu_device_ip_block_add(adev, &vce_v3_4_ip_block);
	}




OK, I will make the check explicit.

Regards,
Jerry


That way if we re-arrange the order later, it will be easier to track.


Alex


*From:* amd-gfx  on behalf of 
Junwei Zhang 

*Sent:* Friday, November 23, 2018 3:32:27 AM
*To:* amd-gfx@lists.freedesktop.org
*Cc:* Zhang, Jerry
*Subject:* [PATCH] drm/amdgpu: disable UVD/VCE for some polaris 12 
variants

Some variants don't support UVD and VCE.

Signed-off-by: Junwei Zhang 
---
 drivers/gpu/drm/amd/amdgpu/vi.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c 
b/drivers/gpu/drm/amd/amdgpu/vi.c

index f3a4cf1f013a..3338b013ded4 100644
--- a/drivers/gpu/drm/amd/amdgpu/vi.c
+++ b/drivers/gpu/drm/amd/amdgpu/vi.c
@@ -1660,6 +1660,10 @@ int vi_set_ip_blocks(struct amdgpu_device *adev)
 		amdgpu_device_ip_block_add(adev, &dce_v11_2_ip_block);
 		amdgpu_device_ip_block_add(adev, &gfx_v8_0_ip_block);
 		amdgpu_device_ip_block_add(adev, &sdma_v3_1_ip_block);
+		/* Some polaris12 variants don't support UVD/VCE */
+		if ((adev->pdev->device == 0x67df) &&
+		    (adev->pdev->revision == 0xf7))
+			break;
 		amdgpu_device_ip_block_add(adev, &uvd_v6_3_ip_block);
 		amdgpu_device_ip_block_add(adev, &vce_v3_4_ip_block);
 		break;
--
2.17.1



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: disable UVD/VCE for some polaris 12 variants

2018-11-23 Thread Zhang, Jerry(Junwei)

please ignore this patch, there is a typo in the code.

On 11/23/18 4:01 PM, Junwei Zhang wrote:

Some variants don't support UVD and VCE.

Signed-off-by: Junwei Zhang 
---
  drivers/gpu/drm/amd/amdgpu/vi.c | 5 +
  1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c b/drivers/gpu/drm/amd/amdgpu/vi.c
index f3a4cf1f013a..46a92eca831b 100644
--- a/drivers/gpu/drm/amd/amdgpu/vi.c
+++ b/drivers/gpu/drm/amd/amdgpu/vi.c
@@ -1660,6 +1660,11 @@ int vi_set_ip_blocks(struct amdgpu_device *adev)
 	amdgpu_device_ip_block_add(adev, &dce_v11_2_ip_block);
 	amdgpu_device_ip_block_add(adev, &gfx_v8_0_ip_block);
 	amdgpu_device_ip_block_add(adev, &sdma_v3_1_ip_block);
+	/* Some polaris12 variants don't support UVD/VCE */
+	if (((adev->pdev->device == 0x67df) &&
+	     ((adev->pdev->revision == 0xe1) ||
+	      (adev->pdev->revision == 0xf7))))
+		break;
 	amdgpu_device_ip_block_add(adev, &uvd_v6_3_ip_block);
 	amdgpu_device_ip_block_add(adev, &vce_v3_4_ip_block);
break;


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 1/2] drm/amdgpu/sdma4: use paging queue for buffer funcs

2018-11-07 Thread Zhang, Jerry(Junwei)

+ Curry

On 11/8/18 10:59 AM, Alex Deucher wrote:

On Wed, Nov 7, 2018 at 9:05 PM Zhang, Jerry(Junwei)  wrote:

On 11/8/18 1:29 AM, Alex Deucher wrote:

Use the paging queue for buffer functions to avoid contention
with the other queues.

Signed-off-by: Alex Deucher 

Reviewed-by: Junwei Zhang 

Can someone with a vega10 test this?

Alex


---
   drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 12 +++-
   1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c 
b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
index e39a09eb0fa1..4b5b47dd2f4c 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
@@ -662,6 +662,10 @@ static void sdma_v4_0_page_stop(struct amdgpu_device *adev)
   u32 rb_cntl, ib_cntl;
   int i;

+ if ((adev->mman.buffer_funcs_ring == sdma0) ||
+ (adev->mman.buffer_funcs_ring == sdma1))
+ amdgpu_ttm_set_buffer_funcs_status(adev, false);
+
   for (i = 0; i < adev->sdma.num_instances; i++) {
   rb_cntl = RREG32_SDMA(i, mmSDMA0_PAGE_RB_CNTL);
   rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_PAGE_RB_CNTL,
@@ -1152,6 +1156,9 @@ static int sdma_v4_0_start(struct amdgpu_device *adev)
   r = amdgpu_ring_test_helper(page);
   if (r)
   return r;
+
+ if (adev->mman.buffer_funcs_ring == page)
+ amdgpu_ttm_set_buffer_funcs_status(adev, true);
   }

   if (adev->mman.buffer_funcs_ring == ring)
@@ -2054,7 +2061,10 @@ static const struct amdgpu_buffer_funcs sdma_v4_0_buffer_funcs = {
 static void sdma_v4_0_set_buffer_funcs(struct amdgpu_device *adev)
 {
 	adev->mman.buffer_funcs = &sdma_v4_0_buffer_funcs;
-	adev->mman.buffer_funcs_ring = &adev->sdma.instance[0].ring;
+	if (adev->sdma.has_page_queue)
+		adev->mman.buffer_funcs_ring = &adev->sdma.instance[0].page;
+	else
+		adev->mman.buffer_funcs_ring = &adev->sdma.instance[0].ring;
 }

   static const struct amdgpu_vm_pte_funcs sdma_v4_0_vm_pte_funcs = {


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 2/2] drm/amdgpu/sdma4: use page queue 1 for buffer funcs

2018-11-07 Thread Zhang, Jerry(Junwei)

On 11/8/18 1:29 AM, Alex Deucher wrote:

Use page queue 1 rather than 0 to avoid contention with GPUVM
updates, which use page queue 0.

Signed-off-by: Alex Deucher 


A little confused: I thought we were going to use the page queue (on any
instance) for PT updates, and the gfx ring for general SDMA jobs.

Am I missing anything?

Regards,
Jerry

---
  drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c 
b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
index 4b5b47dd2f4c..44c16a5c5428 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
@@ -2062,7 +2062,8 @@ static void sdma_v4_0_set_buffer_funcs(struct amdgpu_device *adev)
 {
 	adev->mman.buffer_funcs = &sdma_v4_0_buffer_funcs;
 	if (adev->sdma.has_page_queue)
-		adev->mman.buffer_funcs_ring = &adev->sdma.instance[0].page;
+		/* use page queue 1 since page queue 0 will be used for VM updates */
+		adev->mman.buffer_funcs_ring = &adev->sdma.instance[1].page;
 	else
 		adev->mman.buffer_funcs_ring = &adev->sdma.instance[0].ring;
 }


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 1/2] drm/amdgpu/sdma4: use paging queue for buffer funcs

2018-11-07 Thread Zhang, Jerry(Junwei)

On 11/8/18 1:29 AM, Alex Deucher wrote:

Use the paging queue for buffer functions to avoid contention
with the other queues.

Signed-off-by: Alex Deucher 

Reviewed-by: Junwei Zhang 


---
  drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 12 +++-
  1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c 
b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
index e39a09eb0fa1..4b5b47dd2f4c 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
@@ -662,6 +662,10 @@ static void sdma_v4_0_page_stop(struct amdgpu_device *adev)
u32 rb_cntl, ib_cntl;
int i;
  
+	if ((adev->mman.buffer_funcs_ring == sdma0) ||

+   (adev->mman.buffer_funcs_ring == sdma1))
+   amdgpu_ttm_set_buffer_funcs_status(adev, false);
+
for (i = 0; i < adev->sdma.num_instances; i++) {
rb_cntl = RREG32_SDMA(i, mmSDMA0_PAGE_RB_CNTL);
rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_PAGE_RB_CNTL,
@@ -1152,6 +1156,9 @@ static int sdma_v4_0_start(struct amdgpu_device *adev)
r = amdgpu_ring_test_helper(page);
if (r)
return r;
+
+   if (adev->mman.buffer_funcs_ring == page)
+   amdgpu_ttm_set_buffer_funcs_status(adev, true);
}
  
  		if (adev->mman.buffer_funcs_ring == ring)

@@ -2054,7 +2061,10 @@ static const struct amdgpu_buffer_funcs sdma_v4_0_buffer_funcs = {
 static void sdma_v4_0_set_buffer_funcs(struct amdgpu_device *adev)
 {
 	adev->mman.buffer_funcs = &sdma_v4_0_buffer_funcs;
-	adev->mman.buffer_funcs_ring = &adev->sdma.instance[0].ring;
+	if (adev->sdma.has_page_queue)
+		adev->mman.buffer_funcs_ring = &adev->sdma.instance[0].page;
+	else
+		adev->mman.buffer_funcs_ring = &adev->sdma.instance[0].ring;
  }
  
  static const struct amdgpu_vm_pte_funcs sdma_v4_0_vm_pte_funcs = {


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: disable page queue on Vega10 SR-IOV VF

2018-11-07 Thread Zhang, Jerry(Junwei)

On 11/7/18 3:55 PM, Koenig, Christian wrote:

On 07.11.18 at 08:41, Zhang, Jerry(Junwei) wrote:

On 11/7/18 3:29 PM, Koenig, Christian wrote:

Hi guys,

this is necessary for recoverable page fault handling.

When the normal SDMA queue is blocked because of a page fault the SDMA
firmware will switch to the paging queue so that we are able to handle
the fault.

Thanks for your info.

IIRC, the page queue has higher priority than the gfx queue (which we were
using previously), so a PT update job on the page queue will always be
scheduled first in HW.

I think so, but that is not its primary purpose. The key feature is
that it still works even when the GFX or RLC queues are blocked because
of fault handling.


That sounds like good functionality.




And (not 100% sure) isn't the page queue designed for page migration?

Yes, well it is designed for page tables updates. Either while doing
migration, fault handling or whatever reason you got.


Anyway, we can disable it for SRIOV given its existing issues there.

It would be nice to have for normal PD/PT updates under SRIOV as well,
but as a short term workaround we can probably disable it.


Agree.

Regards,
Jerry



Regards,
Christian.


Regards,
Jerry


In general it should work on all Vega (but not Raven) components and we
are going to need it when we enable recoverable page faults.

The only case I can see where we don't immediately need it is SRIOV,
because the current planning is to not support recoverable page faults
there.

Christian.

On 07.11.18 at 08:21, Liu, Monk wrote:

Hi team

Why do we need this page_queue in amdgpu? Can anyone share some background
on its introduction to the KMD?
As I understand it, the gpu-scheduler already has a couple of priority
levels for contexts/entities, so the jobs the page_queue is supposed to
handle (mapping/unmapping/moving) are already taken care of by "KERNEL"
priority entities, and all other context/entity SDMA jobs are handled
after the "KERNEL" jobs ...

So there is no real benefit to introducing a page_queue (likewise an
rlc_queue) to amdgpu given the existence of the priority-aware
gpu-scheduler ... unless we are going to remove the "KERNEL" priority and
always do the mapping/unmapping on the page_queue ...

/Monk

-Original Message-
From: amd-gfx  On Behalf Of
Zhang, Jerry(Junwei)
Sent: Wednesday, November 7, 2018 1:26 PM
To: Huang, Trigger ;
amd-gfx@lists.freedesktop.org; Deucher, Alexander
; Koenig, Christian
; Kuehling, Felix 
Subject: Re: [PATCH] drm/amdgpu: disable page queue on Vega10 SR-IOV VF

On 11/7/18 1:15 PM, Trigger Huang wrote:

Currently, SDMA page queue is not used under SR-IOV VF, and this queue
will cause ring test failure in amdgpu module reload case. So just
disable it.

Signed-off-by: Trigger Huang 

Looks like we ran into several issues with it on Vega.
KFD also disabled it on Vega10 for development (but I'm not sure about
the details of their issue).

Thus, should we disable it for Vega10 as well?
Any comment, Alex, Christian, Felix?

Regards,
Jerry

---
     drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 4 +++-
     1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
index e39a09eb0f..4edc848 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
@@ -1451,7 +1451,9 @@ static int sdma_v4_0_early_init(void *handle)
     adev->sdma.has_page_queue = false;
     } else {
     adev->sdma.num_instances = 2;
-    if (adev->asic_type != CHIP_VEGA20 &&
+    if ((adev->asic_type == CHIP_VEGA10) &&
amdgpu_sriov_vf((adev)))
+    adev->sdma.has_page_queue = false;
+    else if (adev->asic_type != CHIP_VEGA20 &&
     adev->asic_type != CHIP_VEGA12)
     adev->sdma.has_page_queue = true;
     }



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: disable page queue on Vega10 SR-IOV VF

2018-11-06 Thread Zhang, Jerry(Junwei)

On 11/7/18 3:29 PM, Koenig, Christian wrote:

Hi guys,

this is necessary for recoverable page fault handling.

When the normal SDMA queue is blocked because of a page fault the SDMA
firmware will switch to the paging queue so that we are able to handle
the fault.

Thanks for your info.

IIRC, the page queue has higher priority than the gfx queue (which we were
using previously), so a PT update job on the page queue will always be
scheduled first in HW.

And (not 100% sure) isn't the page queue designed for page migration?

Anyway, we can disable it for SRIOV given its existing issues there.

Regards,
Jerry



In general it should work on all Vega (but not Raven) components and we
are going to need it when we enable recoverable page faults.

The only case I can see where we don't immediately need it is SRIOV,
because the current planning is to not support recoverable page faults
there.

Christian.

On 07.11.18 at 08:21, Liu, Monk wrote:

Hi team

Why do we need this page_queue in amdgpu? Can anyone share some background
on its introduction to the KMD?
As I understand it, the gpu-scheduler already has a couple of priority
levels for contexts/entities, so the jobs the page_queue is supposed to
handle (mapping/unmapping/moving) are already taken care of by "KERNEL"
priority entities, and all other context/entity SDMA jobs are handled
after the "KERNEL" jobs ...

So there is no real benefit to introducing a page_queue (likewise an
rlc_queue) to amdgpu given the existence of the priority-aware
gpu-scheduler ... unless we are going to remove the "KERNEL" priority and
always do the mapping/unmapping on the page_queue ...

/Monk

-Original Message-
From: amd-gfx  On Behalf Of Zhang, 
Jerry(Junwei)
Sent: Wednesday, November 7, 2018 1:26 PM
To: Huang, Trigger ; amd-gfx@lists.freedesktop.org; Deucher, Alexander 
; Koenig, Christian ; Kuehling, Felix 

Subject: Re: [PATCH] drm/amdgpu: disable page queue on Vega10 SR-IOV VF

On 11/7/18 1:15 PM, Trigger Huang wrote:

Currently, SDMA page queue is not used under SR-IOV VF, and this queue
will cause ring test failure in amdgpu module reload case. So just disable it.

Signed-off-by: Trigger Huang 

Looks like we ran into several issues with it on Vega.
KFD also disabled it on Vega10 for development (but I'm not sure about
the details of their issue).

Thus, should we disable it for Vega10 as well?
Any comment, Alex, Christian, Felix?

Regards,
Jerry

---
drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
index e39a09eb0f..4edc848 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
@@ -1451,7 +1451,9 @@ static int sdma_v4_0_early_init(void *handle)
adev->sdma.has_page_queue = false;
} else {
adev->sdma.num_instances = 2;
-   if (adev->asic_type != CHIP_VEGA20 &&
+   if ((adev->asic_type == CHIP_VEGA10) && amdgpu_sriov_vf((adev)))
+   adev->sdma.has_page_queue = false;
+   else if (adev->asic_type != CHIP_VEGA20 &&
adev->asic_type != CHIP_VEGA12)
adev->sdma.has_page_queue = true;
}



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: disable page queue on Vega10 SR-IOV VF

2018-11-06 Thread Zhang, Jerry(Junwei)

On 11/7/18 1:15 PM, Trigger Huang wrote:

Currently, SDMA page queue is not used under SR-IOV VF, and this queue will
cause ring test failure in amdgpu module reload case. So just disable it.

Signed-off-by: Trigger Huang 


Looks like we ran into several issues with it on Vega.
KFD also disabled it on Vega10 for development (but I'm not sure about
the details of their issue).

Thus, should we disable it for Vega10 as well?
Any comment, Alex, Christian, Felix?

Regards,
Jerry

---
  drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c 
b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
index e39a09eb0f..4edc848 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
@@ -1451,7 +1451,9 @@ static int sdma_v4_0_early_init(void *handle)
adev->sdma.has_page_queue = false;
} else {
adev->sdma.num_instances = 2;
-   if (adev->asic_type != CHIP_VEGA20 &&
+   if ((adev->asic_type == CHIP_VEGA10) && amdgpu_sriov_vf((adev)))
+   adev->sdma.has_page_queue = false;
+   else if (adev->asic_type != CHIP_VEGA20 &&
adev->asic_type != CHIP_VEGA12)
adev->sdma.has_page_queue = true;
}


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/ttm: Fix bo_global and mem_global kfree error

2018-11-06 Thread Zhang, Jerry(Junwei)

On 11/6/18 7:59 PM, Christian König wrote:

On 06.11.18 at 12:54, Trigger Huang wrote:

ttm_bo_glob and ttm_mem_glob are defined as static structure instances and
are not allocated by kzalloc, so kfree must not be invoked to release
them. Otherwise it causes the following kernel BUG when
unloading the amdgpu module

[   48.419294] kernel BUG at 
/build/linux-5s7Xkn/linux-4.15.0/mm/slub.c:3894!

[   48.419352] invalid opcode:  [#1] SMP PTI
[   48.419387] Modules linked in: amdgpu(OE-) amdchash(OE) amdttm(OE) 
amd_sched(OE) amdkcl(OE) amd_iommu_v2 drm_kms_helper drm i2c_algo_bit 
fb_sys_fops syscopyarea sysfillrect sysimgblt snd_hda_codec_generic 
snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep kvm_intel kvm 
irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_pcm 
snd_seq_midi snd_seq_midi_event snd_rawmidi pcbc snd_seq 
snd_seq_device snd_timer aesni_intel snd soundcore joydev aes_x86_64 
crypto_simd glue_helper cryptd input_leds mac_hid serio_raw 
binfmt_misc nfsd auth_rpcgss nfs_acl lockd grace sunrpc sch_fq_codel 
parport_pc ppdev lp parport ip_tables x_tables autofs4 8139too 
psmouse i2c_piix4 8139cp mii floppy pata_acpi
[   48.419782] CPU: 1 PID: 1281 Comm: modprobe Tainted: G   
OE    4.15.0-20-generic #21-Ubuntu
[   48.419838] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), 
BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014

[   48.419901] RIP: 0010:kfree+0x137/0x180
[   48.419934] RSP: 0018:b02101273bf8 EFLAGS: 00010246
[   48.419974] RAX: eee1418ad7e0 RBX: c075f100 RCX: 
8fed7fca7ed0
[   48.420025] RDX:  RSI: 0003440e RDI: 
2240
[   48.420073] RBP: b02101273c10 R08: 0010 R09: 
8fed7ffd3680
[   48.420121] R10: eee1418ad7c0 R11: 8fed7ffd3000 R12: 
c075e2c0
[   48.420169] R13: c074ec10 R14: 8fed73063900 R15: 
8fed737428e8
[   48.420216] FS:  7fdc912ec540() GS:8fed7fc8() 
knlGS:

[   48.420267] CS:  0010 DS:  ES:  CR0: 80050033
[   48.420308] CR2: 55fa40c30060 CR3: 00023470a006 CR4: 
003606e0
[   48.420358] DR0:  DR1:  DR2: 

[   48.420405] DR3:  DR6: fffe0ff0 DR7: 
0400

[   48.420452] Call Trace:
[   48.420485]  ttm_bo_global_kobj_release+0x20/0x30 [amdttm]
[   48.420528]  kobject_release+0x6a/0x180
[   48.420562]  kobject_put+0x28/0x50
[   48.420595]  ttm_bo_global_release+0x36/0x50 [amdttm]
[   48.420636]  amdttm_bo_device_release+0x119/0x180 [amdttm]
[   48.420678]  ? amdttm_bo_clean_mm+0xa6/0xf0 [amdttm]
[   48.420760]  amdgpu_ttm_fini+0xc9/0x180 [amdgpu]
[   48.420821]  amdgpu_bo_fini+0x12/0x40 [amdgpu]
[   48.420889]  gmc_v9_0_sw_fini+0x40/0x50 [amdgpu]
[   48.420947]  amdgpu_device_fini+0x36f/0x4c0 [amdgpu]
[   48.421007]  amdgpu_driver_unload_kms+0xb4/0x150 [amdgpu]
[   48.421058]  drm_dev_unregister+0x46/0xf0 [drm]
[   48.421102]  drm_dev_unplug+0x12/0x70 [drm]

Signed-off-by: Trigger Huang 


Reviewed-by: Christian König 


Reviewed-by: Junwei Zhang 




---
  drivers/gpu/drm/ttm/ttm_bo.c | 1 -
  drivers/gpu/drm/ttm/ttm_memory.c | 9 -
  2 files changed, 10 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index df02880..01c6d14 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -1527,7 +1527,6 @@ static void ttm_bo_global_kobj_release(struct kobject *kobj)
 		container_of(kobj, struct ttm_bo_global, kobj);
 
 	__free_page(glob->dummy_read_page);
-	kfree(glob);
  }
    static void ttm_bo_global_release(void)
diff --git a/drivers/gpu/drm/ttm/ttm_memory.c 
b/drivers/gpu/drm/ttm/ttm_memory.c

index 7704e17..f1567c3 100644
--- a/drivers/gpu/drm/ttm/ttm_memory.c
+++ b/drivers/gpu/drm/ttm/ttm_memory.c
@@ -219,14 +219,6 @@ static ssize_t ttm_mem_global_store(struct kobject *kobj,

  return size;
  }
  -static void ttm_mem_global_kobj_release(struct kobject *kobj)
-{
-    struct ttm_mem_global *glob =
-    container_of(kobj, struct ttm_mem_global, kobj);
-
-    kfree(glob);
-}
-
 static struct attribute *ttm_mem_global_attrs[] = {
 	&ttm_mem_global_lower_mem_limit,
  NULL
@@ -238,7 +230,6 @@ static const struct sysfs_ops ttm_mem_global_ops = {
  };
 static struct kobj_type ttm_mem_glob_kobj_type = {
-	.release = &ttm_mem_global_kobj_release,
 	.sysfs_ops = &ttm_mem_global_ops,
  .default_attrs = ttm_mem_global_attrs,
  };




___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
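
A userspace model of why the removed kfree() was wrong: a release callback that frees its object is only correct when the object was heap-allocated, and ttm_bo_glob/ttm_mem_glob are static. The refcounted object here is a made-up stand-in for a kobject:

```c
#include <stdlib.h>
#include <stdio.h>

struct obj {
	int refcount;
	void (*release)(struct obj *);	/* like a kobj_type release hook */
};

static void obj_put(struct obj *o)
{
	if (--o->refcount == 0 && o->release)
		o->release(o);
}

static void release_heap(struct obj *o)   { printf("freeing heap obj\n"); free(o); }
/* If this called free(&global_obj), glibc would abort, which is the
 * userspace analogue of the SLUB BUG in the kernel trace above. */
static void release_static(struct obj *o) { (void)o; printf("static obj: nothing to free\n"); }

static struct obj global_obj = { 1, release_static };	/* like ttm_bo_glob */

int main(void)
{
	struct obj *heap = malloc(sizeof(*heap));

	heap->refcount = 1;
	heap->release = release_heap;
	obj_put(heap);		/* fine: heap allocation */
	obj_put(&global_obj);	/* fine only because release_static doesn't free() */
	return 0;
}
```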


Re: [PATCH] drm/amdgpu: wait for IB test on first device open

2018-11-02 Thread Zhang, Jerry(Junwei)

On 11/2/18 5:32 PM, Christian König wrote:

On 02.11.18 at 10:19, Zhang, Jerry(Junwei) wrote:

On 11/2/18 4:44 PM, Christian König wrote:
Instead of delaying that to the first query. Otherwise we could try to use the
SDMA for VM updates before the IB tests are done.


Just curious: does that happen when an app opens the DRI node without libdrm?
Since device init always queries info first.


The problem is at this point we have already created the root PD and 
cleared it using the SDMA.


So we can end up with the sequence root PD clear -> IB test.

Not much of an issue, but I just noticed this during one of my tests.


Yeah, indeed.
Hopefully the delayed work is executed prior to the later root PD clear operation.

Anyway, feel free to add
Reviewed-by: Junwei Zhang 

Jerry



Christian.



Regards,
Jerry


Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c

index 08d04f68dfeb..f87f717cc905 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -467,9 +467,6 @@ static int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file
 	if (!info->return_size || !info->return_pointer)
 		return -EINVAL;
 
-	/* Ensure IB tests are run on ring */
-	flush_delayed_work(&adev->late_init_work);
-
 	switch (info->query) {
 	case AMDGPU_INFO_ACCEL_WORKING:
 		ui32 = adev->accel_working;
@@ -950,6 +947,9 @@ int amdgpu_driver_open_kms(struct drm_device *dev, struct drm_file *file_priv)
 	struct amdgpu_fpriv *fpriv;
 	int r, pasid;
 
+	/* Ensure IB tests are run on ring */
+	flush_delayed_work(&adev->late_init_work);
+
 	file_priv->driver_priv = NULL;
 
 	r = pm_runtime_get_sync(dev->dev);






___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: wait for IB test on first device open

2018-11-02 Thread Zhang, Jerry(Junwei)

On 11/2/18 4:44 PM, Christian König wrote:

Instead of delaying that to the first query. Otherwise we could try to use the
SDMA for VM updates before the IB tests are done.


Just curious: does that happen when an app opens the DRI node without libdrm?
Since device init always queries info first.

Regards,
Jerry


Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index 08d04f68dfeb..f87f717cc905 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -467,9 +467,6 @@ static int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file
 	if (!info->return_size || !info->return_pointer)
 		return -EINVAL;
 
-	/* Ensure IB tests are run on ring */
-	flush_delayed_work(&adev->late_init_work);
-
 	switch (info->query) {
 	case AMDGPU_INFO_ACCEL_WORKING:
 		ui32 = adev->accel_working;
@@ -950,6 +947,9 @@ int amdgpu_driver_open_kms(struct drm_device *dev, struct drm_file *file_priv)
 	struct amdgpu_fpriv *fpriv;
 	int r, pasid;
 
+	/* Ensure IB tests are run on ring */
+	flush_delayed_work(&adev->late_init_work);
+
 	file_priv->driver_priv = NULL;
 
 	r = pm_runtime_get_sync(dev->dev);


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
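
A userspace model of the ordering the patch enforces, with pthreads standing in for the kernel's delayed work; this is a sketch of the idea, not the driver code:

```c
/* Device init schedules deferred work (the IB tests); the first open must
 * wait for that work before the device is used, instead of waiting only
 * on the first info query. */
#include <pthread.h>
#include <stdio.h>

static pthread_t late_init;
static int ib_tests_done;

static void *late_init_work(void *arg)
{
	(void)arg;
	ib_tests_done = 1;	/* stand-in for running the IB tests */
	return NULL;
}

static void device_init(void)
{
	pthread_create(&late_init, NULL, late_init_work, NULL);
}

static void driver_open(void)
{
	pthread_join(late_init, NULL);	/* plays the role of flush_delayed_work() */
	printf("open: ib_tests_done=%d\n", ib_tests_done);	/* always 1 */
}

int main(void)
{
	device_init();
	driver_open();	/* safe to touch SDMA/VM paths from here on */
	return 0;
}
```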


Re: [PATCH 6/8] drm/amdgpu: always reserve two slots for the VM

2018-10-23 Thread Zhang, Jerry(Junwei)

On 10/4/18 9:12 PM, Christian König wrote:

And drop the now superflous extra reservations.

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c |  4 
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 15 ++-
  2 files changed, 6 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index b8de56d1a866..ba406bd1b08f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -964,10 +964,6 @@ static int amdgpu_cs_vm_handling(struct amdgpu_cs_parser 
*p)
if (r)
return r;
  
-	r = reservation_object_reserve_shared(vm->root.base.bo->tbo.resv, 1);

-   if (r)
-   return r;
-
p->job->vm_pd_addr = amdgpu_gmc_pd_addr(vm->root.base.bo);
  
  	if (amdgpu_vm_debug) {

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 218527bb0156..1b39b0144698 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -616,7 +616,8 @@ void amdgpu_vm_get_pd_bo(struct amdgpu_vm *vm,
  {
entry->priority = 0;
	entry->tv.bo = &vm->root.base.bo->tbo;
-   entry->tv.num_shared = 1;
+   /* One for the VM updates and one for the CS job */
+   entry->tv.num_shared = 2;
entry->user_pages = NULL;
list_add(>tv.head, validated);
  }
@@ -772,10 +773,6 @@ static int amdgpu_vm_clear_bo(struct amdgpu_device *adev,
  
  	ring = container_of(vm->entity.rq->sched, struct amdgpu_ring, sched);
  
-	r = reservation_object_reserve_shared(bo->tbo.resv, 1);

-   if (r)
-   return r;
-


A trivial thing, this change may belong to next patch.
this patch looks dropping the resv for root bo.

Regards,
Jerry


	r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
if (r)
goto error;
@@ -1839,10 +1836,6 @@ static int amdgpu_vm_bo_update_mapping(struct 
amdgpu_device *adev,
if (r)
goto error_free;
  
-	r = reservation_object_reserve_shared(vm->root.base.bo->tbo.resv, 1);

-   if (r)
-   goto error_free;
-
	r = amdgpu_vm_update_ptes(&params, start, last + 1, addr, flags);
if (r)
goto error_free;
@@ -3023,6 +3016,10 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct 
amdgpu_vm *vm,
if (r)
goto error_free_root;
  
+	r = reservation_object_reserve_shared(root->tbo.resv, 1);

+   if (r)
+   goto error_unreserve;
+
r = amdgpu_vm_clear_bo(adev, vm, root,
   adev->vm_manager.root_level,
   vm->pte_support_ats);


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 1/8] dma-buf: remove shared fence staging in reservation object

2018-10-23 Thread Zhang, Jerry(Junwei)

Patch 3, 5 is
Acked-by: Junwei Zhang 

Others are
Reviewed-by: Junwei Zhang 

On 10/4/18 9:12 PM, Christian König wrote:

No need for that any more. Just replace the list when there isn't enough
room any more for the additional fence.

Signed-off-by: Christian König 
---
  drivers/dma-buf/reservation.c | 178 ++
  include/linux/reservation.h   |   4 -
  2 files changed, 58 insertions(+), 124 deletions(-)

diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c
index 6c95f61a32e7..5825fc336a13 100644
--- a/drivers/dma-buf/reservation.c
+++ b/drivers/dma-buf/reservation.c
@@ -68,105 +68,23 @@ EXPORT_SYMBOL(reservation_seqcount_string);
   */
  int reservation_object_reserve_shared(struct reservation_object *obj)
  {
-   struct reservation_object_list *fobj, *old;
-   u32 max;
+   struct reservation_object_list *old, *new;
+   unsigned int i, j, k, max;
  
  	old = reservation_object_get_list(obj);
  
  	if (old && old->shared_max) {

-   if (old->shared_count < old->shared_max) {
-   /* perform an in-place update */
-   kfree(obj->staged);
-   obj->staged = NULL;
+   if (old->shared_count < old->shared_max)
return 0;
-   } else
+   else
max = old->shared_max * 2;
-   } else
-   max = 4;
-
-   /*
-* resize obj->staged or allocate if it doesn't exist,
-* noop if already correct size
-*/
-   fobj = krealloc(obj->staged, offsetof(typeof(*fobj), shared[max]),
-   GFP_KERNEL);
-   if (!fobj)
-   return -ENOMEM;
-
-   obj->staged = fobj;
-   fobj->shared_max = max;
-   return 0;
-}
-EXPORT_SYMBOL(reservation_object_reserve_shared);
-
-static void
-reservation_object_add_shared_inplace(struct reservation_object *obj,
- struct reservation_object_list *fobj,
- struct dma_fence *fence)
-{
-   struct dma_fence *signaled = NULL;
-   u32 i, signaled_idx;
-
-   dma_fence_get(fence);
-
-   preempt_disable();
-	write_seqcount_begin(&obj->seq);
-
-   for (i = 0; i < fobj->shared_count; ++i) {
-   struct dma_fence *old_fence;
-
-   old_fence = rcu_dereference_protected(fobj->shared[i],
-   reservation_object_held(obj));
-
-   if (old_fence->context == fence->context) {
-   /* memory barrier is added by write_seqcount_begin */
-   RCU_INIT_POINTER(fobj->shared[i], fence);
-			write_seqcount_end(&obj->seq);
-   preempt_enable();
-
-   dma_fence_put(old_fence);
-   return;
-   }
-
-   if (!signaled && dma_fence_is_signaled(old_fence)) {
-   signaled = old_fence;
-   signaled_idx = i;
-   }
-   }
-
-   /*
-* memory barrier is added by write_seqcount_begin,
-* fobj->shared_count is protected by this lock too
-*/
-   if (signaled) {
-   RCU_INIT_POINTER(fobj->shared[signaled_idx], fence);
} else {
-   BUG_ON(fobj->shared_count >= fobj->shared_max);
-   RCU_INIT_POINTER(fobj->shared[fobj->shared_count], fence);
-   fobj->shared_count++;
+   max = 4;
}
  
-	write_seqcount_end(&obj->seq);

-   preempt_enable();
-
-   dma_fence_put(signaled);
-}
-
-static void
-reservation_object_add_shared_replace(struct reservation_object *obj,
- struct reservation_object_list *old,
- struct reservation_object_list *fobj,
- struct dma_fence *fence)
-{
-   unsigned i, j, k;
-
-   dma_fence_get(fence);
-
-   if (!old) {
-   RCU_INIT_POINTER(fobj->shared[0], fence);
-   fobj->shared_count = 1;
-   goto done;
-   }
+   new = kmalloc(offsetof(typeof(*new), shared[max]), GFP_KERNEL);
+   if (!new)
+   return -ENOMEM;
  
  	/*

 * no need to bump fence refcounts, rcu_read access
@@ -174,46 +92,45 @@ reservation_object_add_shared_replace(struct reservation_object *obj,
 * references from the old struct are carried over to
 * the new.
 */
-   for (i = 0, j = 0, k = fobj->shared_max; i < old->shared_count; ++i) {
-   struct dma_fence *check;
+   for (i = 0, j = 0, k = max; i < (old ? old->shared_count : 0); ++i) {
+   struct dma_fence *fence;
  
-		check = rcu_dereference_protected(old->shared[i],

-   reservation_object_held(obj));
-
-   if (check->context == fence->context 

Re: [PATCH libdrm 2/2] amdgpu: don't track handles for non-memory allocations

2018-10-23 Thread Zhang, Jerry(Junwei)

On 10/24/18 3:07 AM, Marek Olšák wrote:

From: Marek Olšák 


A commit log and sign-off are needed here as well.
And is there any reason for this change?

Regards,
Jerry



---
  amdgpu/amdgpu_bo.c | 15 +--
  1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/amdgpu/amdgpu_bo.c b/amdgpu/amdgpu_bo.c
index 81f8a5f7..00b9b54a 100644
--- a/amdgpu/amdgpu_bo.c
+++ b/amdgpu/amdgpu_bo.c
@@ -91,26 +91,29 @@ drm_public int amdgpu_bo_alloc(amdgpu_device_handle dev,
if (r)
goto out;
  
  	r = amdgpu_bo_create(dev, alloc_buffer->alloc_size, args.out.handle,

 buf_handle);
if (r) {
amdgpu_close_kms_handle(dev, args.out.handle);
goto out;
}
  
-	pthread_mutex_lock(&dev->bo_table_mutex);
-	r = handle_table_insert(&dev->bo_handles, (*buf_handle)->handle,
-				*buf_handle);
-	pthread_mutex_unlock(&dev->bo_table_mutex);
-	if (r)
-		amdgpu_bo_free(*buf_handle);
+	if (alloc_buffer->preferred_heap &
+	    (AMDGPU_GEM_DOMAIN_VRAM | AMDGPU_GEM_DOMAIN_GTT)) {
+		pthread_mutex_lock(&dev->bo_table_mutex);
+		r = handle_table_insert(&dev->bo_handles, (*buf_handle)->handle,
+					*buf_handle);
+		pthread_mutex_unlock(&dev->bo_table_mutex);
+		if (r)
+			amdgpu_bo_free(*buf_handle);
+	}
  out:
return r;
  }
  
  drm_public int amdgpu_bo_set_metadata(amdgpu_bo_handle bo,

  struct amdgpu_bo_metadata *info)
  {
struct drm_amdgpu_gem_metadata args = {};
  
  	args.handle = bo->handle;


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH libdrm 1/2] amdgpu: prevent an integer wraparound of cpu_map_count

2018-10-23 Thread Zhang, Jerry(Junwei)

On 10/24/18 3:07 AM, Marek Olšák wrote:

From: Marek Olšák 


We need a commit log and sign-off here.

BTW, have you encountered any issue related to this?



---
  amdgpu/amdgpu_bo.c | 19 +--
  1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/amdgpu/amdgpu_bo.c b/amdgpu/amdgpu_bo.c
index c0f42e81..81f8a5f7 100644
--- a/amdgpu/amdgpu_bo.c
+++ b/amdgpu/amdgpu_bo.c
@@ -22,20 +22,21 @@
   *
   */
  
  #include 

  #include 
  #include 
  #include 
  #include 
  #include 
  #include 
+#include 
  #include 
  #include 
  #include 
  
  #include "libdrm_macros.h"

  #include "xf86drm.h"
  #include "amdgpu_drm.h"
  #include "amdgpu_internal.h"
  #include "util_math.h"
  
@@ -442,21 +443,29 @@ drm_public int amdgpu_bo_cpu_map(amdgpu_bo_handle bo, void **cpu)

  {
union drm_amdgpu_gem_mmap args;
void *ptr;
int r;
  
	pthread_mutex_lock(&bo->cpu_access_mutex);
  
  	if (bo->cpu_ptr) {

/* already mapped */
assert(bo->cpu_map_count > 0);
-   bo->cpu_map_count++;
+
+   /* If the counter has already reached INT_MAX, don't increment
+* it and assume that the buffer will be mapped indefinitely.
+* The buffer is pretty unlikely to get unmapped by the user
+* at this point.
+*/
+   if (bo->cpu_map_count != INT_MAX)
+   bo->cpu_map_count++;


If so, shall we print an error here to flag that the mapping has become indefinite?


Regards,
Jerry

+
*cpu = bo->cpu_ptr;
		pthread_mutex_unlock(&bo->cpu_access_mutex);
return 0;
}
  
  	assert(bo->cpu_map_count == 0);
  
	memset(&args, 0, sizeof(args));
  
  	/* Query the buffer address (args.addr_ptr).

@@ -492,21 +501,27 @@ drm_public int amdgpu_bo_cpu_unmap(amdgpu_bo_handle bo)
  
	pthread_mutex_lock(&bo->cpu_access_mutex);

assert(bo->cpu_map_count >= 0);
  
  	if (bo->cpu_map_count == 0) {

/* not mapped */
		pthread_mutex_unlock(&bo->cpu_access_mutex);
return -EINVAL;
}
  
-	bo->cpu_map_count--;

+   /* If the counter has already reached INT_MAX, don't decrement it.
+* This is because amdgpu_bo_cpu_map doesn't increment it past
+* INT_MAX.
+*/
+   if (bo->cpu_map_count != INT_MAX)
+   bo->cpu_map_count--;
+
if (bo->cpu_map_count > 0) {
/* mapped multiple times */
		pthread_mutex_unlock(&bo->cpu_access_mutex);
return 0;
}
  
  	r = drm_munmap(bo->cpu_ptr, bo->alloc_size) == 0 ? 0 : -errno;

bo->cpu_ptr = NULL;
	pthread_mutex_unlock(&bo->cpu_access_mutex);
return r;


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
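
A userspace model of the saturating counter the patch introduces; the function names mimic the libdrm ones but this is only a sketch of the behavior:

```c
/* Once cpu_map_count hits INT_MAX, map no longer increments and unmap no
 * longer decrements, so the mapping is treated as permanent instead of
 * letting the counter wrap to negative values. */
#include <limits.h>
#include <stdio.h>

static int cpu_map_count;

static void bo_cpu_map(void)
{
	if (cpu_map_count != INT_MAX)
		cpu_map_count++;
}

static int bo_cpu_unmap(void)
{
	if (cpu_map_count == 0)
		return -1;		/* not mapped */
	if (cpu_map_count != INT_MAX)
		cpu_map_count--;
	return cpu_map_count;		/* 0 means really unmapped */
}

int main(void)
{
	cpu_map_count = INT_MAX - 1;
	bo_cpu_map();			/* reaches INT_MAX */
	bo_cpu_map();			/* saturates: no wraparound */
	printf("after unmap: %d\n", bo_cpu_unmap());	/* stays at INT_MAX */
	return 0;
}
```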


Re: [PATCH] drm/amdgpu: Fix amdgpu_vm_alloc_pts failed

2018-10-22 Thread Zhang, Jerry(Junwei)

On 10/23/2018 01:12 PM, Zhang, Jerry(Junwei) wrote:

On 10/23/2018 11:29 AM, Rex Zhu wrote:

when the VA address is located in the last PD entries,
alloc_pts will fail.

Use the right PD mask instead of hardcoding it, as suggested
by jerry.zhang.

Signed-off-by: Rex Zhu 


Thanks for verifying that.
Feel free to add
Reviewed-by: Junwei Zhang 

I'd also like to get some background on these two functions
from Christian.

Perhaps we can make this simpler, e.g. by merging them together.


If we really needs them all, we may simplify that like:
```
amdgpu_vm_entries_mask(struct amdgpu_device *adev, unsigned int level)
{
    return amdgpu_vm_num_entries(adev, level) - 1;
}
```

Jerry


Regards,
Jerry


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 5 -
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c

index 054633b..3939013 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -202,8 +202,11 @@ static unsigned amdgpu_vm_num_entries(struct amdgpu_device *adev,
 static uint32_t amdgpu_vm_entries_mask(struct amdgpu_device *adev,
 				       unsigned int level)
 {
+	unsigned shift = amdgpu_vm_level_shift(adev,
+					       adev->vm_manager.root_level);
+
 	if (level <= adev->vm_manager.root_level)
-		return 0xffffffff;
+		return (round_up(adev->vm_manager.max_pfn, 1 << shift) >> shift) - 1;

  else if (level != AMDGPU_VM_PTB)
  return 0x1ff;
  else




___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: Fix amdgpu_vm_alloc_pts failed

2018-10-22 Thread Zhang, Jerry(Junwei)

On 10/23/2018 11:29 AM, Rex Zhu wrote:

when the VA address is located in the last PD entries,
alloc_pts will fail.

Use the right PD mask instead of hardcoding it, as suggested
by jerry.zhang.

Signed-off-by: Rex Zhu 


Thanks for verifying that.
Feel free to add
Reviewed-by: Junwei Zhang 

I'd also like to get some background on these two functions from
Christian.

Perhaps we can make this simpler, e.g. by merging them together.

Regards,
Jerry


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 5 -
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 054633b..3939013 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -202,8 +202,11 @@ static uint32_t amdgpu_vm_entries_mask(struct amdgpu_device *adev,
 				       unsigned int level)
 {
+	unsigned shift = amdgpu_vm_level_shift(adev,
+					       adev->vm_manager.root_level);
+
 	if (level <= adev->vm_manager.root_level)
-		return 0xffffffff;
+		return (round_up(adev->vm_manager.max_pfn, 1 << shift) >> shift) - 1;
shift) - 1;
else if (level != AMDGPU_VM_PTB)
return 0x1ff;
else


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
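
A worked example of the fixed root-level mask, using made-up numbers rather than values from any real ASIC:

```c
/* The root PD covers max_pfn rounded up to a multiple of (1 << shift)
 * entries, so the mask is that entry count minus one rather than a
 * hardcoded 0xffffffff. */
#include <stdint.h>
#include <stdio.h>

static uint64_t round_up_pow2(uint64_t x, uint64_t align)
{
	return (x + align - 1) & ~(align - 1);
}

int main(void)
{
	uint64_t max_pfn = 1ULL << 36;	/* hypothetical VA space size in pages */
	unsigned shift = 27;		/* hypothetical root-level shift */
	uint32_t mask;

	mask = (round_up_pow2(max_pfn, 1ULL << shift) >> shift) - 1;
	printf("root entries mask = 0x%x\n", mask);	/* 0x1ff here: 512 entries */
	return 0;
}
```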


Re: [PATCH] drm/amdgpu: Reverse the sequence of ctx_mgr_fini and vm_fini in amdgpu_driver_postclose_kms

2018-10-22 Thread Zhang, Jerry(Junwei)

On 10/22/2018 05:47 PM, Rex Zhu wrote:

The CSA buffer is created per ctx; when the ctx is torn down,
the CSA buffer and its VA are released, so ctx_mgr_fini needs to
run before vm_fini.

Signed-off-by: Rex Zhu 

Reviewed-by: Junwei Zhang 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index 27de848..f2ef9a1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -1054,8 +1054,8 @@ void amdgpu_driver_postclose_kms(struct drm_device *dev,
pasid = fpriv->vm.pasid;
pd = amdgpu_bo_ref(fpriv->vm.root.base.bo);
  
-	amdgpu_vm_fini(adev, &fpriv->vm);
 	amdgpu_ctx_mgr_fini(&fpriv->ctx_mgr);
+	amdgpu_vm_fini(adev, &fpriv->vm);
  
  	if (pasid)

amdgpu_pasid_free_delayed(pd->tbo.resv, pasid);


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: Fix amdgpu_vm_alloc_pts failed

2018-10-22 Thread Zhang, Jerry(Junwei)

On 10/23/2018 12:09 AM, Rex Zhu wrote:

When the va address is located in the last pd entry,


Do you mean the root PD?
Maybe we need to round up the root PD in amdgpu_vm_entries_mask(), like
amdgpu_vm_num_entries() does.


BTW, it looks like amdgpu_vm_entries_mask() is meant to replace
amdgpu_vm_num_entries().


Jerry

alloc_pts will fail.
This is caused by
"drm/amdgpu: add amdgpu_vm_entries_mask v2",
commit 72af632549b97ead9251bb155f08fefd1fb6f5c3.

Signed-off-by: Rex Zhu 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 34 +++---
  1 file changed, 7 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 054633b..1a3af72 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -191,26 +191,6 @@ static unsigned amdgpu_vm_num_entries(struct amdgpu_device 
*adev,
  }
  
  /**

- * amdgpu_vm_entries_mask - the mask to get the entry number of a PD/PT
- *
- * @adev: amdgpu_device pointer
- * @level: VMPT level
- *
- * Returns:
- * The mask to extract the entry number of a PD/PT from an address.
- */
-static uint32_t amdgpu_vm_entries_mask(struct amdgpu_device *adev,
-  unsigned int level)
-{
-   if (level <= adev->vm_manager.root_level)
-   return 0xffffffff;
-   else if (level != AMDGPU_VM_PTB)
-   return 0x1ff;
-   else
-   return AMDGPU_VM_PTE_COUNT(adev) - 1;
-}
-
-/**
   * amdgpu_vm_bo_size - returns the size of the BOs in bytes
   *
   * @adev: amdgpu_device pointer
@@ -419,17 +399,17 @@ static void amdgpu_vm_pt_start(struct amdgpu_device *adev,
  static bool amdgpu_vm_pt_descendant(struct amdgpu_device *adev,
struct amdgpu_vm_pt_cursor *cursor)
  {
-   unsigned mask, shift, idx;
+   unsigned num_entries, shift, idx;
  
  	if (!cursor->entry->entries)

return false;
  
  	BUG_ON(!cursor->entry->base.bo);

-   mask = amdgpu_vm_entries_mask(adev, cursor->level);
+   num_entries = amdgpu_vm_num_entries(adev, cursor->level);
shift = amdgpu_vm_level_shift(adev, cursor->level);
  
  	++cursor->level;

-   idx = (cursor->pfn >> shift) & mask;
+   idx = (cursor->pfn >> shift) % num_entries;
cursor->parent = cursor->entry;
	cursor->entry = &cursor->entry->entries[idx];
return true;
@@ -1618,7 +1598,7 @@ static int amdgpu_vm_update_ptes(struct 
amdgpu_pte_update_params *params,
	amdgpu_vm_pt_start(adev, params->vm, start, &cursor);
while (cursor.pfn < end) {
struct amdgpu_bo *pt = cursor.entry->base.bo;
-   unsigned shift, parent_shift, mask;
+   unsigned shift, parent_shift, num_entries;
uint64_t incr, entry_end, pe_start;
  
  		if (!pt)

@@ -1673,9 +1653,9 @@ static int amdgpu_vm_update_ptes(struct 
amdgpu_pte_update_params *params,
  
  		/* Looks good so far, calculate parameters for the update */

incr = AMDGPU_GPU_PAGE_SIZE << shift;
-   mask = amdgpu_vm_entries_mask(adev, cursor.level);
-   pe_start = ((cursor.pfn >> shift) & mask) * 8;
-   entry_end = (mask + 1) << shift;
+   num_entries = amdgpu_vm_num_entries(adev, cursor.level);
+   pe_start = ((cursor.pfn >> shift) & (num_entries - 1)) * 8;
+   entry_end = num_entries << shift;
entry_end += cursor.pfn & ~(entry_end - 1);
entry_end = min(entry_end, end);
  

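The reason the old "& mask" indexing cannot work for the root PD is that
its entry count is generally not a power of two, so num_entries - 1 is
not a contiguous bit mask. A small stand-alone example (the 512 and 488
entry counts are illustrative):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	uint64_t pfn = 1000;
	unsigned pot = 512;	/* power of two: mask indexing works */
	unsigned npot = 488;	/* non-power-of-two root PD: it breaks */

	/* For a power of two, (pfn & (n - 1)) == (pfn % n). */
	printf("pot:  & -> %llu, %% -> %llu\n",
	       (unsigned long long)(pfn & (pot - 1)),
	       (unsigned long long)(pfn % pot));

	/* For 488, n - 1 == 487 == 0b111100111 is not contiguous, so the
	 * two results diverge (480 vs. 24) and some entries become
	 * unreachable, which is why the patch switches to "% num_entries". */
	printf("npot: & -> %llu, %% -> %llu\n",
	       (unsigned long long)(pfn & (npot - 1)),
	       (unsigned long long)(pfn % npot));
	return 0;
}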



Re: [PATCH 4/5] drm/ttm: initialize globals during device init

2018-10-22 Thread Zhang, Jerry(Junwei)

On 10/22/2018 08:35 PM, Christian König wrote:

Am 22.10.18 um 08:45 schrieb Zhang, Jerry(Junwei):

A question in ttm_bo.c
[SNIP]

    int ttm_bo_device_release(struct ttm_bo_device *bdev)
  {
@@ -1623,18 +1620,25 @@ int ttm_bo_device_release(struct 
ttm_bo_device *bdev)

drm_vma_offset_manager_destroy(&bdev->vma_manager);
  +    if (!ret)
+    ttm_bo_global_release();


if ttm_bo_clean_mm() fails, it will skip ttm_bo_global_release().
When will it be called?


Never.



Shall add it to delayed work? or maybe we could release it directly?


No, when ttm_bo_device_release() fails somebody is trying to unload a 
driver while this driver still has memory allocated.


In this case BO accounting should not be released because we should 
make sure that all the leaked memory is still accounted.


In this case, it's rather a bug to fix then.
Thanks for explaining it.

Looks fine to me, feel free to add
Reviewed-by: Junwei Zhang 

Jerry



Christian.



Regards,
Jerry


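The pattern under discussion boils down to a refcounted global guarded by
a mutex: the first device brings the global state up, the last one tears
it down, and an init failure drops the reference again. A minimal
user-space sketch with illustrative names (not the actual TTM symbols):

#include <pthread.h>

static pthread_mutex_t global_mutex = PTHREAD_MUTEX_INITIALIZER;
static unsigned int global_refcount;

static int one_time_setup(void) { return 0; }	/* placeholder */
static void one_time_teardown(void) { }		/* placeholder */

/* First caller initializes the global state; failure undoes the ref. */
static int global_init(void)
{
	int ret = 0;

	pthread_mutex_lock(&global_mutex);
	if (global_refcount++ == 0) {
		ret = one_time_setup();
		if (ret)
			--global_refcount;
	}
	pthread_mutex_unlock(&global_mutex);
	return ret;
}

/* Last caller tears the global state down again. */
static void global_release(void)
{
	pthread_mutex_lock(&global_mutex);
	if (--global_refcount == 0)
		one_time_teardown();
	pthread_mutex_unlock(&global_mutex);
}

/* Mirrors ttm_bo_device_init()/_release(): each device holds exactly
 * one reference on the global state for its lifetime. */
int device_init(void)
{
	int ret = global_init();

	if (ret)
		return ret;
	/* ... per-device setup; call global_release() on failure ... */
	return 0;
}

void device_release(void)
{
	/* ... per-device teardown ... */
	global_release();
}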




Re: [PATCH 4/5] drm/ttm: initialize globals during device init

2018-10-22 Thread Zhang, Jerry(Junwei)

A question for ttm_bo.c

On 10/20/2018 12:41 AM, Christian König wrote:

Make sure that the global BO state is always correctly initialized.

This allows removing all the device code to initialize it.

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 59 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h |  1 -
  drivers/gpu/drm/ast/ast_drv.h   |  1 -
  drivers/gpu/drm/ast/ast_ttm.c   | 36 ---
  drivers/gpu/drm/bochs/bochs.h   |  1 -
  drivers/gpu/drm/bochs/bochs_mm.c| 35 ---
  drivers/gpu/drm/cirrus/cirrus_drv.h |  1 -
  drivers/gpu/drm/cirrus/cirrus_ttm.c | 36 ---
  drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.h |  1 -
  drivers/gpu/drm/hisilicon/hibmc/hibmc_ttm.c | 34 --
  drivers/gpu/drm/mgag200/mgag200_drv.h   |  1 -
  drivers/gpu/drm/mgag200/mgag200_ttm.c   | 36 ---
  drivers/gpu/drm/nouveau/nouveau_drv.h   |  1 -
  drivers/gpu/drm/nouveau/nouveau_ttm.c   | 39 
  drivers/gpu/drm/qxl/qxl_drv.h   |  2 -
  drivers/gpu/drm/qxl/qxl_ttm.c   | 33 --
  drivers/gpu/drm/radeon/radeon.h |  2 -
  drivers/gpu/drm/radeon/radeon_ttm.c | 39 
  drivers/gpu/drm/ttm/ttm_bo.c| 19 +---
  drivers/gpu/drm/virtio/virtgpu_drv.h|  2 -
  drivers/gpu/drm/virtio/virtgpu_ttm.c| 35 ---
  drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 11 +
  drivers/gpu/drm/vmwgfx/vmwgfx_drv.h |  3 --
  drivers/gpu/drm/vmwgfx/vmwgfx_ttm_glue.c| 27 ---
  drivers/staging/vboxvideo/vbox_ttm.c| 36 ---
  include/drm/ttm/ttm_bo_driver.h | 41 +
  26 files changed, 16 insertions(+), 516 deletions(-)

[... skip above ...]
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index d89183f95570..df028805b7e2 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -1530,7 +1530,7 @@ static void ttm_bo_global_kobj_release(struct kobject 
*kobj)
kfree(glob);
  }
  
-void ttm_bo_global_release(void)

+static void ttm_bo_global_release(void)
  {
	struct ttm_bo_global *glob = &ttm_bo_glob;
  
@@ -1544,9 +1544,8 @@ void ttm_bo_global_release(void)

  out:
	mutex_unlock(&ttm_global_mutex);
  }
-EXPORT_SYMBOL(ttm_bo_global_release);
  
-int ttm_bo_global_init(void)

+static int ttm_bo_global_init(void)
  {
	struct ttm_bo_global *glob = &ttm_bo_glob;
int ret = 0;
@@ -1583,8 +1582,6 @@ int ttm_bo_global_init(void)
	mutex_unlock(&ttm_global_mutex);
return ret;
  }
-EXPORT_SYMBOL(ttm_bo_global_init);
-
  
  int ttm_bo_device_release(struct ttm_bo_device *bdev)

  {
@@ -1623,18 +1620,25 @@ int ttm_bo_device_release(struct ttm_bo_device *bdev)
  
	drm_vma_offset_manager_destroy(&bdev->vma_manager);
  
+	if (!ret)

+   ttm_bo_global_release();
+


If ttm_bo_clean_mm() fails, it will skip ttm_bo_global_release().
When will it be called?

Shall we add it to a delayed work, or may we call it directly here?

Regards
Jerry



return ret;
  }
  EXPORT_SYMBOL(ttm_bo_device_release);
  
  int ttm_bo_device_init(struct ttm_bo_device *bdev,

-  struct ttm_bo_global *glob,
   struct ttm_bo_driver *driver,
   struct address_space *mapping,
   uint64_t file_page_offset,
   bool need_dma32)
  {
-   int ret = -EINVAL;
+	struct ttm_bo_global *glob = &ttm_bo_glob;
+   int ret;
+
+   ret = ttm_bo_global_init();
+   if (ret)
+   return ret;
  
  	bdev->driver = driver;
  
@@ -1661,6 +1665,7 @@ int ttm_bo_device_init(struct ttm_bo_device *bdev,
  
  	return 0;

  out_no_sys:
+   ttm_bo_global_release();
return ret;
  }
  EXPORT_SYMBOL(ttm_bo_device_init);
diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h 
b/drivers/gpu/drm/virtio/virtgpu_drv.h
index dec42d421e00..30caa20d9fcf 100644
--- a/drivers/gpu/drm/virtio/virtgpu_drv.h
+++ b/drivers/gpu/drm/virtio/virtgpu_drv.h
@@ -132,8 +132,6 @@ struct virtio_gpu_framebuffer {
container_of(x, struct virtio_gpu_framebuffer, base)
  
  struct virtio_gpu_mman {

-	struct ttm_bo_global_ref	bo_global_ref;
-	bool				mem_global_referenced;
	struct ttm_bo_device		bdev;
  };
  
diff --git a/drivers/gpu/drm/virtio/virtgpu_ttm.c b/drivers/gpu/drm/virtio/virtgpu_ttm.c

index b99ecc6d97d3..c1a56d640121 100644
--- a/drivers/gpu/drm/virtio/virtgpu_ttm.c
+++ b/drivers/gpu/drm/virtio/virtgpu_ttm.c
@@ -50,35 +50,6 @@ virtio_gpu_device *virtio_gpu_get_vgdev(struct ttm_bo_device 
*bdev)
return vgdev;
  }
  
-static int virtio_gpu_ttm_global_init(struct virtio_gpu_device 

Re: [PATCH 1/5] drm/ttm: use a static ttm_mem_global instance

2018-10-22 Thread Zhang, Jerry(Junwei)

On 10/20/2018 12:41 AM, Christian König wrote:

As the name says we only need one global instance of ttm_mem_global.

Drop all the driver initialization and just use a single exported
instance which is initialized during BO global initialization.

Signed-off-by: Christian König 


Patches 1, 2, 3 and 5 look good to me.
Reviewed-by: Junwei Zhang 

A question for patch 4.

Jerry

---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 44 -
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h |  1 -
  drivers/gpu/drm/ast/ast_drv.h   |  1 -
  drivers/gpu/drm/ast/ast_ttm.c   | 32 ++
  drivers/gpu/drm/bochs/bochs.h   |  1 -
  drivers/gpu/drm/bochs/bochs_mm.c| 30 ++---
  drivers/gpu/drm/cirrus/cirrus_drv.h |  1 -
  drivers/gpu/drm/cirrus/cirrus_ttm.c | 32 ++
  drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.h |  1 -
  drivers/gpu/drm/hisilicon/hibmc/hibmc_ttm.c | 31 +++--
  drivers/gpu/drm/mgag200/mgag200_drv.h   |  1 -
  drivers/gpu/drm/mgag200/mgag200_ttm.c   | 32 ++
  drivers/gpu/drm/nouveau/nouveau_drv.h   |  1 -
  drivers/gpu/drm/nouveau/nouveau_ttm.c   | 34 ++-
  drivers/gpu/drm/qxl/qxl_drv.h   |  1 -
  drivers/gpu/drm/qxl/qxl_ttm.c   | 28 
  drivers/gpu/drm/radeon/radeon.h |  1 -
  drivers/gpu/drm/radeon/radeon_ttm.c | 26 ---
  drivers/gpu/drm/ttm/ttm_bo.c| 10 --
  drivers/gpu/drm/ttm/ttm_memory.c|  5 +--
  drivers/gpu/drm/virtio/virtgpu_drv.h|  1 -
  drivers/gpu/drm/virtio/virtgpu_ttm.c| 27 ---
  drivers/gpu/drm/vmwgfx/vmwgfx_drv.c |  4 +--
  drivers/gpu/drm/vmwgfx/vmwgfx_drv.h |  3 +-
  drivers/gpu/drm/vmwgfx/vmwgfx_ttm_glue.c| 27 ---
  drivers/staging/vboxvideo/vbox_drv.h|  1 -
  drivers/staging/vboxvideo/vbox_ttm.c| 24 --
  include/drm/ttm/ttm_bo_driver.h |  8 ++---
  include/drm/ttm/ttm_memory.h|  4 +--
  29 files changed, 32 insertions(+), 380 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 3a6802846698..fda252022b15 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -65,33 +65,6 @@ static void amdgpu_ttm_debugfs_fini(struct amdgpu_device 
*adev);
   * Global memory.
   */
  
-/**

- * amdgpu_ttm_mem_global_init - Initialize and acquire reference to
- * memory object
- *
- * @ref: Object for initialization.
- *
- * This is called by drm_global_item_ref() when an object is being
- * initialized.
- */
-static int amdgpu_ttm_mem_global_init(struct drm_global_reference *ref)
-{
-   return ttm_mem_global_init(ref->object);
-}
-
-/**
- * amdgpu_ttm_mem_global_release - Drop reference to a memory object
- *
- * @ref: Object being removed
- *
- * This is called by drm_global_item_unref() when an object is being
- * released.
- */
-static void amdgpu_ttm_mem_global_release(struct drm_global_reference *ref)
-{
-   ttm_mem_global_release(ref->object);
-}
-
  /**
   * amdgpu_ttm_global_init - Initialize global TTM memory reference structures.
   *
@@ -108,20 +81,6 @@ static int amdgpu_ttm_global_init(struct amdgpu_device 
*adev)
/* ensure reference is false in case init fails */
adev->mman.mem_global_referenced = false;
  
-	global_ref = &adev->mman.mem_global_ref;

-   global_ref->global_type = DRM_GLOBAL_TTM_MEM;
-   global_ref->size = sizeof(struct ttm_mem_global);
-   global_ref->init = &amdgpu_ttm_mem_global_init;
-   global_ref->release = &amdgpu_ttm_mem_global_release;
-   r = drm_global_item_ref(global_ref);
-   if (r) {
-   DRM_ERROR("Failed setting up TTM memory accounting "
- "subsystem.\n");
-   goto error_mem;
-   }
-
-   adev->mman.bo_global_ref.mem_glob =
-   adev->mman.mem_global_ref.object;
	global_ref = &adev->mman.bo_global_ref.ref;
global_ref->global_type = DRM_GLOBAL_TTM_BO;
global_ref->size = sizeof(struct ttm_bo_global);
@@ -140,8 +99,6 @@ static int amdgpu_ttm_global_init(struct amdgpu_device *adev)
return 0;
  
  error_bo:

-   drm_global_item_unref(&adev->mman.mem_global_ref);
-error_mem:
return r;
  }
  
@@ -150,7 +107,6 @@ static void amdgpu_ttm_global_fini(struct amdgpu_device *adev)

if (adev->mman.mem_global_referenced) {
mutex_destroy(&adev->mman.gtt_window_lock);
drm_global_item_unref(&adev->mman.bo_global_ref.ref);
-   drm_global_item_unref(&adev->mman.mem_global_ref);
adev->mman.mem_global_referenced = false;
}
  }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h 

Re: [PATCH 4/5] drm/ttm: initialize globals during device init

2018-10-22 Thread Zhang, Jerry(Junwei)

A question in ttm_bo.c

On 10/20/2018 12:41 AM, Christian König wrote:

Make sure that the global BO state is always correctly initialized.

This allows removing all the device code to initialize it.

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 59 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h |  1 -
  drivers/gpu/drm/ast/ast_drv.h   |  1 -
  drivers/gpu/drm/ast/ast_ttm.c   | 36 ---
  drivers/gpu/drm/bochs/bochs.h   |  1 -
  drivers/gpu/drm/bochs/bochs_mm.c| 35 ---
  drivers/gpu/drm/cirrus/cirrus_drv.h |  1 -
  drivers/gpu/drm/cirrus/cirrus_ttm.c | 36 ---
  drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.h |  1 -
  drivers/gpu/drm/hisilicon/hibmc/hibmc_ttm.c | 34 --
  drivers/gpu/drm/mgag200/mgag200_drv.h   |  1 -
  drivers/gpu/drm/mgag200/mgag200_ttm.c   | 36 ---
  drivers/gpu/drm/nouveau/nouveau_drv.h   |  1 -
  drivers/gpu/drm/nouveau/nouveau_ttm.c   | 39 
  drivers/gpu/drm/qxl/qxl_drv.h   |  2 -
  drivers/gpu/drm/qxl/qxl_ttm.c   | 33 --
  drivers/gpu/drm/radeon/radeon.h |  2 -
  drivers/gpu/drm/radeon/radeon_ttm.c | 39 
  drivers/gpu/drm/ttm/ttm_bo.c| 19 +---
  drivers/gpu/drm/virtio/virtgpu_drv.h|  2 -
  drivers/gpu/drm/virtio/virtgpu_ttm.c| 35 ---
  drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 11 +
  drivers/gpu/drm/vmwgfx/vmwgfx_drv.h |  3 --
  drivers/gpu/drm/vmwgfx/vmwgfx_ttm_glue.c| 27 ---
  drivers/staging/vboxvideo/vbox_ttm.c| 36 ---
  include/drm/ttm/ttm_bo_driver.h | 41 +
  26 files changed, 16 insertions(+), 516 deletions(-)

[... skip above ...]
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index d89183f95570..df028805b7e2 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -1530,7 +1530,7 @@ static void ttm_bo_global_kobj_release(struct kobject 
*kobj)
kfree(glob);
  }
  
-void ttm_bo_global_release(void)

+static void ttm_bo_global_release(void)
  {
	struct ttm_bo_global *glob = &ttm_bo_glob;
  
@@ -1544,9 +1544,8 @@ void ttm_bo_global_release(void)

  out:
	mutex_unlock(&ttm_global_mutex);
  }
-EXPORT_SYMBOL(ttm_bo_global_release);
  
-int ttm_bo_global_init(void)

+static int ttm_bo_global_init(void)
  {
	struct ttm_bo_global *glob = &ttm_bo_glob;
int ret = 0;
@@ -1583,8 +1582,6 @@ int ttm_bo_global_init(void)
	mutex_unlock(&ttm_global_mutex);
return ret;
  }
-EXPORT_SYMBOL(ttm_bo_global_init);
-
  
  int ttm_bo_device_release(struct ttm_bo_device *bdev)

  {
@@ -1623,18 +1620,25 @@ int ttm_bo_device_release(struct ttm_bo_device *bdev)
  
	drm_vma_offset_manager_destroy(&bdev->vma_manager);
  
+	if (!ret)

+   ttm_bo_global_release();


if ttm_bo_clean_mm() fails, it will skip ttm_bo_global_release().
When will it be called?

Shall we add it to a delayed work? Or maybe we could release it directly?

Regards,
Jerry


+
return ret;
  }
  EXPORT_SYMBOL(ttm_bo_device_release);
  
  int ttm_bo_device_init(struct ttm_bo_device *bdev,

-  struct ttm_bo_global *glob,
   struct ttm_bo_driver *driver,
   struct address_space *mapping,
   uint64_t file_page_offset,
   bool need_dma32)
  {
-   int ret = -EINVAL;
+	struct ttm_bo_global *glob = &ttm_bo_glob;
+   int ret;
+
+   ret = ttm_bo_global_init();
+   if (ret)
+   return ret;
  
  	bdev->driver = driver;
  
@@ -1661,6 +1665,7 @@ int ttm_bo_device_init(struct ttm_bo_device *bdev,
  
  	return 0;

  out_no_sys:
+   ttm_bo_global_release();
return ret;
  }
  EXPORT_SYMBOL(ttm_bo_device_init);
diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h 
b/drivers/gpu/drm/virtio/virtgpu_drv.h
index dec42d421e00..30caa20d9fcf 100644
--- a/drivers/gpu/drm/virtio/virtgpu_drv.h
+++ b/drivers/gpu/drm/virtio/virtgpu_drv.h
@@ -132,8 +132,6 @@ struct virtio_gpu_framebuffer {
container_of(x, struct virtio_gpu_framebuffer, base)
  
  struct virtio_gpu_mman {

-	struct ttm_bo_global_ref	bo_global_ref;
-	bool				mem_global_referenced;
	struct ttm_bo_device		bdev;
  };
  
diff --git a/drivers/gpu/drm/virtio/virtgpu_ttm.c b/drivers/gpu/drm/virtio/virtgpu_ttm.c

index b99ecc6d97d3..c1a56d640121 100644
--- a/drivers/gpu/drm/virtio/virtgpu_ttm.c
+++ b/drivers/gpu/drm/virtio/virtgpu_ttm.c
@@ -50,35 +50,6 @@ virtio_gpu_device *virtio_gpu_get_vgdev(struct ttm_bo_device 
*bdev)
return vgdev;
  }
  
-static int virtio_gpu_ttm_global_init(struct virtio_gpu_device 

Re: [PATCH 2/3] drm/amdgpu: Replace TTM initialization/release with ttm_global

2018-10-19 Thread Zhang, Jerry(Junwei)

On 10/19/2018 12:27 AM, Thomas Zimmermann wrote:

Unified initialization and relesae of the global TTM state is provided
by struct ttm_global and its interfaces.

Signed-off-by: Thomas Zimmermann 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 63 ++---
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h |  4 +-
  2 files changed, 7 insertions(+), 60 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 3a6802846698..70b0e8c77bb4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -65,33 +65,6 @@ static void amdgpu_ttm_debugfs_fini(struct amdgpu_device 
*adev);
   * Global memory.
   */
  
-/**

- * amdgpu_ttm_mem_global_init - Initialize and acquire reference to
- * memory object
- *
- * @ref: Object for initialization.
- *
- * This is called by drm_global_item_ref() when an object is being
- * initialized.
- */
-static int amdgpu_ttm_mem_global_init(struct drm_global_reference *ref)
-{
-   return ttm_mem_global_init(ref->object);
-}
-
-/**
- * amdgpu_ttm_mem_global_release - Drop reference to a memory object
- *
- * @ref: Object being removed
- *
- * This is called by drm_global_item_unref() when an object is being
- * released.
- */
-static void amdgpu_ttm_mem_global_release(struct drm_global_reference *ref)
-{
-   ttm_mem_global_release(ref->object);
-}
-
  /**
   * amdgpu_ttm_global_init - Initialize global TTM memory reference structures.
   *
@@ -102,35 +75,15 @@ static void amdgpu_ttm_mem_global_release(struct 
drm_global_reference *ref)
   */
  static int amdgpu_ttm_global_init(struct amdgpu_device *adev)
  {
-   struct drm_global_reference *global_ref;
int r;
  
  	/* ensure reference is false in case init fails */

adev->mman.mem_global_referenced = false;
  
-	global_ref = &adev->mman.mem_global_ref;

-   global_ref->global_type = DRM_GLOBAL_TTM_MEM;
-   global_ref->size = sizeof(struct ttm_mem_global);
-   global_ref->init = &amdgpu_ttm_mem_global_init;
-   global_ref->release = &amdgpu_ttm_mem_global_release;
-   r = drm_global_item_ref(global_ref);
+   r = ttm_global_init(&adev->mman.glob);
if (r) {
-   DRM_ERROR("Failed setting up TTM memory accounting "
- "subsystem.\n");
-   goto error_mem;
-   }
-
-   adev->mman.bo_global_ref.mem_glob =
-   adev->mman.mem_global_ref.object;


It seems this assignment is missing.

Or are you going to replace
  struct ttm_bo_global_ref bo_global_ref
with
  struct ttm_global glob ->  struct drm_global_reference bo_ref

If so, we may need to remove ttm_bo_global_ref_init() and struct
ttm_bo_global_ref at the same time.



Regards,
Jerry


-   global_ref = &adev->mman.bo_global_ref.ref;
-   global_ref->global_type = DRM_GLOBAL_TTM_BO;
-   global_ref->size = sizeof(struct ttm_bo_global);
-   global_ref->init = &ttm_bo_global_ref_init;
-   global_ref->release = &ttm_bo_global_ref_release;
-   r = drm_global_item_ref(global_ref);
-   if (r) {
-   DRM_ERROR("Failed setting up TTM BO subsystem.\n");
-   goto error_bo;
+   DRM_ERROR("Failed setting up TTM subsystem.\n");
+   return r;
}
  
  	mutex_init(>mman.gtt_window_lock);

@@ -138,19 +91,13 @@ static int amdgpu_ttm_global_init(struct amdgpu_device 
*adev)
adev->mman.mem_global_referenced = true;
  
  	return 0;

-
-error_bo:
-   drm_global_item_unref(&adev->mman.mem_global_ref);
-error_mem:
-   return r;
  }
  
  static void amdgpu_ttm_global_fini(struct amdgpu_device *adev)

  {
if (adev->mman.mem_global_referenced) {
mutex_destroy(&adev->mman.gtt_window_lock);
-   drm_global_item_unref(&adev->mman.bo_global_ref.ref);
-   drm_global_item_unref(&adev->mman.mem_global_ref);
+   ttm_global_release(&adev->mman.glob);
adev->mman.mem_global_referenced = false;
}
  }
@@ -1765,7 +1712,7 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
}
/* No others user of address space so set it to 0 */
	r = ttm_bo_device_init(&adev->mman.bdev,
-  adev->mman.bo_global_ref.ref.object,
+  ttm_global_get_bo_global(&adev->mman.glob),
   &amdgpu_bo_driver,
   adev->ddev->anon_inode->i_mapping,
   DRM_FILE_PAGE_OFFSET,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
index fe8f276e9811..c3a7fe3ead3a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
@@ -26,6 +26,7 @@
  
  #include "amdgpu.h"

  #include 
+#include 
  
  #define AMDGPU_PL_GDS		(TTM_PL_PRIV + 0)

  #define AMDGPU_PL_GWS (TTM_PL_PRIV + 1)
@@ -39,8 +40,7 @@
  #define AMDGPU_GTT_NUM_TRANSFER_WINDOWS   2
  
  struct amdgpu_mman {

-   struct ttm_bo_global_ref

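Judging purely from the call sites quoted above (ttm_global_init(),
ttm_global_release(), ttm_global_get_bo_global()), the wrapper presumably
has roughly the following shape; this is a hedged reconstruction with
mocked-up types for illustration, not the actual proposed header:

/* Illustration only: mocked-up stand-ins for the DRM/TTM types. */
struct drm_global_reference { void *object; };
struct ttm_bo_global;

struct ttm_global {
	struct drm_global_reference mem_ref;	/* DRM_GLOBAL_TTM_MEM */
	struct drm_global_reference bo_ref;	/* DRM_GLOBAL_TTM_BO */
};

/* Acquires both global references in order; on success it would also
 * have to wire the BO global to the memory global, i.e. the equivalent
 * of the bo_global_ref.mem_glob assignment the review points out. */
int ttm_global_init(struct ttm_global *glob);

/* Drops both references in reverse order. */
void ttm_global_release(struct ttm_global *glob);

/* Returns the object for passing into ttm_bo_device_init(). */
static inline struct ttm_bo_global *
ttm_global_get_bo_global(struct ttm_global *glob)
{
	return (struct ttm_bo_global *)glob->bo_ref.object;
}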
Re: [PATCH v3] drm/amdgpu: Set the default value about gds vmid0 size

2018-10-12 Thread Zhang, Jerry(Junwei)

On 10/12/2018 06:08 PM, Emily Deng wrote:

For SR-IOV, when a Windows guest runs first and a Linux guest runs
afterwards, the GDS VMID0 size will have been reset to 0 by the Windows
guest. So if the value has been reset to 0, set it back to the default
value in the Linux guest.

v2:
Fixed value instead of reading mmGDS_VMID0_SIZE.

Signed-off-by: Emily Deng 

Reviewed-by: Junwei Zhang 

---
  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 15 ++-
  1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index ae86238..a8acdd6 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -4875,7 +4875,20 @@ static void gfx_v9_0_set_rlc_funcs(struct amdgpu_device 
*adev)
  static void gfx_v9_0_set_gds_init(struct amdgpu_device *adev)
  {
	/* init asic gds info */
-   adev->gds.mem.total_size = RREG32_SOC15(GC, 0, mmGDS_VMID0_SIZE);
+   switch (adev->asic_type) {
+   case CHIP_VEGA10:
+   case CHIP_VEGA12:
+   case CHIP_VEGA20:
+   adev->gds.mem.total_size = 0x10000;
+   break;
+   case CHIP_RAVEN:
+   adev->gds.mem.total_size = 0x1000;
+   break;
+   default:
+   adev->gds.mem.total_size = 0x10000;
+   break;
+   }
+
adev->gds.gws.total_size = 64;
adev->gds.oa.total_size = 16;
  




Re: [PATCH v2] drm/amdgpu: Set the default value about gds vmid0 size

2018-10-12 Thread Zhang, Jerry(Junwei)

On 10/12/2018 05:34 PM, Emily Deng wrote:

For SR-IOV, when a Windows guest runs first and a Linux guest runs
afterwards, the GDS VMID0 size will have been reset to 0 by the Windows
guest. So if the value has been reset to 0, set it back to the default
value in the Linux guest.

v2:
Fixed value instead of reading mmGDS_VMID0_SIZE.

Signed-off-by: Emily Deng 
---
  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 14 +-
  1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index e61f6a3..38a2ecc 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -4904,7 +4904,19 @@ static void gfx_v9_0_set_rlc_funcs(struct amdgpu_device 
*adev)
  static void gfx_v9_0_set_gds_init(struct amdgpu_device *adev)
  {
	/* init asic gds info */
-   adev->gds.mem.total_size = RREG32_SOC15(GC, 0, mmGDS_VMID0_SIZE);
+   switch (adev->asic_type) {
+   case CHIP_VEGA10:
+   case CHIP_VEGA12:
+   case CHIP_VEGA20:
+   adev->gds.mem.total_size = 0x10000;
+   break;
+   case CHIP_RAVEN:
+   adev->gds.mem.total_size = 0x1000;
+   break;
+   default:
Better to show some info here in case a new card ends up here with 0
size unexpectedly.

Or set the default value to 0x10000.

Jerry

+   break;
+   }
+
adev->gds.gws.total_size = 64;
adev->gds.oa.total_size = 16;
  




Re: [PATCH] drm/amdgpu/sriov: Set the default value about gds vmid0 size

2018-10-12 Thread Zhang, Jerry(Junwei)

On 10/12/2018 03:39 PM, Christian König wrote:

On 12.10.2018 at 05:21, Emily Deng wrote:

For SR-IOV, when a Windows guest runs first and a Linux guest runs
afterwards, the GDS VMID0 size will have been reset to 0 by the Windows
guest. So if the value has been reset to 0, set it back to the default
value in the Linux guest.


Can we just always use the fixed value instead of reading 
mmGDS_VMID0_SIZE?


We really don't want to introduce so much complexity here and another 
extra code path for SRIOV.


I suppose the VBIOS may set the size differently per SKU, so it is read from the register.
We may confirm a fixed or default value for all gfx v9.

Jerry



Christian.



Signed-off-by: Emily Deng 
---
  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 14 ++
  1 file changed, 14 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c

index ae86238..d9df3dd 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -4872,6 +4872,17 @@ static void gfx_v9_0_set_rlc_funcs(struct 
amdgpu_device *adev)

  }
  }
  +static void gfx_v9_0_set_gds_default(struct amdgpu_device *adev)
+{
+    switch (adev->asic_type) {
+    case CHIP_VEGA10:
+    adev->gds.mem.total_size = 0x10000;
+    break;
+    default:
+    break;
+    }
+}
+
  static void gfx_v9_0_set_gds_init(struct amdgpu_device *adev)
  {
  /* init asic gds info */
@@ -4879,6 +4890,9 @@ static void gfx_v9_0_set_gds_init(struct 
amdgpu_device *adev)

  adev->gds.gws.total_size = 64;
  adev->gds.oa.total_size = 16;
  +    if (adev->gds.mem.total_size == 0 && amdgpu_sriov_vf(adev))
+    gfx_v9_0_set_gds_default(adev);
+
  if (adev->gds.mem.total_size == 64 * 1024) {
  adev->gds.mem.gfx_partition_size = 4096;
  adev->gds.mem.cs_partition_size = 4096;




Re: [PATCH] drm/amdgpu/sriov: Set the default value about gds vmid0 size

2018-10-11 Thread Zhang, Jerry(Junwei)

On 10/12/2018 11:21 AM, Emily Deng wrote:

For SR-IOV, when a Windows guest runs first and a Linux guest runs
afterwards, the GDS VMID0 size will have been reset to 0 by the Windows
guest. So if the value has been reset to 0, set it back to the default
value in the Linux guest.

Signed-off-by: Emily Deng 
---
  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 14 ++
  1 file changed, 14 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index ae86238..d9df3dd 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -4872,6 +4872,17 @@ static void gfx_v9_0_set_rlc_funcs(struct amdgpu_device 
*adev)
}
  }
  
+static void gfx_v9_0_set_gds_default(struct amdgpu_device *adev)

+{
+   switch (adev->asic_type) {
+   case CHIP_VEGA10:
+   adev->gds.mem.total_size = 0x10000;
Do you mean this value is the same as the original value from
mmGDS_VMID0_SIZE before reset?
If so, we may provide a default value for recovery, e.g. in the gds.mem
structure.


Regards,
Jerry

+   break;
+   default:
+   break;
+   }
+}
+
  static void gfx_v9_0_set_gds_init(struct amdgpu_device *adev)
  {
	/* init asic gds info */
@@ -4879,6 +4890,9 @@ static void gfx_v9_0_set_gds_init(struct amdgpu_device 
*adev)
adev->gds.gws.total_size = 64;
adev->gds.oa.total_size = 16;
  
+	if (adev->gds.mem.total_size == 0 && amdgpu_sriov_vf(adev))

+   gfx_v9_0_set_gds_default(adev);
+
if (adev->gds.mem.total_size == 64 * 1024) {
adev->gds.mem.gfx_partition_size = 4096;
adev->gds.mem.cs_partition_size = 4096;




Re: [PATCH] drm/amdgpu: fix AGP location with VRAM at 0x0

2018-10-09 Thread Zhang, Jerry(Junwei)

On 10/04/2018 05:02 PM, Christian König wrote:

That also simplifies handling quite a bit.

Signed-off-by: Christian König 

Reviewed-by: Junwei Zhang 

---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 7 ++-
  1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index 9a5b252784a1..999e15945355 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -200,16 +200,13 @@ void amdgpu_gmc_agp_location(struct amdgpu_device *adev, 
struct amdgpu_gmc *mc)
}
  
  	if (size_bf > size_af) {

-   mc->agp_start = mc->fb_start > mc->gart_start ?
-   mc->gart_end + 1 : 0;
+   mc->agp_start = (mc->fb_start - size_bf) & sixteen_gb_mask;
mc->agp_size = size_bf;
} else {
-   mc->agp_start = (mc->fb_start > mc->gart_start ?
-   mc->fb_end : mc->gart_end) + 1,
+   mc->agp_start = ALIGN(mc->fb_end + 1, sixteen_gb);
mc->agp_size = size_af;
}
  
-	mc->agp_start = ALIGN(mc->agp_start, sixteen_gb);

mc->agp_end = mc->agp_start + mc->agp_size - 1;
dev_info(adev->dev, "AGP: %lluM 0x%016llX - 0x%016llX\n",
mc->agp_size >> 20, mc->agp_start, mc->agp_end);

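A quick worked example of the new placement, ignoring the GART window
for brevity and using a made-up layout (FB at 8GB-12GB inside a 48-bit
address space):

#include <stdio.h>
#include <stdint.h>

#define ALIGN_UP(x, a) (((x) + (a) - 1) & ~((a) - 1))

int main(void)
{
	const uint64_t sixteen_gb = 16ULL << 30;
	const uint64_t sixteen_gb_mask = ~(sixteen_gb - 1);

	/* Illustrative FB placement: 8GB..12GB. */
	uint64_t fb_start = 8ULL << 30;
	uint64_t fb_end = (12ULL << 30) - 1;

	uint64_t size_bf = fb_start;			/* space before FB */
	uint64_t size_af = (1ULL << 48) - fb_end - 1;	/* space after FB */
	uint64_t agp_start, agp_size;

	if (size_bf > size_af) {
		/* AGP before FB, aligned down to a 16GB boundary; this is
		 * the case the patch fixes for VRAM starting at 0x0. */
		agp_start = (fb_start - size_bf) & sixteen_gb_mask;
		agp_size = size_bf;
	} else {
		/* AGP after FB, aligned up to a 16GB boundary. */
		agp_start = ALIGN_UP(fb_end + 1, sixteen_gb);
		agp_size = size_af;
	}

	printf("AGP: %lluM at 0x%016llX\n",
	       (unsigned long long)(agp_size >> 20),
	       (unsigned long long)agp_start);
	return 0;
}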



Re: [PATCH 5/5] drm/amdgpu: fix shadow BO restoring

2018-09-18 Thread Zhang, Jerry(Junwei)

On 09/14/2018 07:54 PM, Christian König wrote:

On 13.09.2018 at 11:29, Zhang, Jerry(Junwei) wrote:

On 09/11/2018 05:56 PM, Christian König wrote:
Don't grab the reservation lock any more and simplify the handling 
quite

a bit.

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 109 
-

  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |  46 
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |   8 +--
  3 files changed, 43 insertions(+), 120 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c

index 5eba66ecf668..20bb702f5c7f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2940,54 +2940,6 @@ static int 
amdgpu_device_ip_post_soft_reset(struct amdgpu_device *adev)

  return 0;
  }
  -/**
- * amdgpu_device_recover_vram_from_shadow - restore shadowed VRAM 
buffers

- *
- * @adev: amdgpu_device pointer
- * @ring: amdgpu_ring for the engine handling the buffer operations
- * @bo: amdgpu_bo buffer whose shadow is being restored
- * @fence: dma_fence associated with the operation
- *
- * Restores the VRAM buffer contents from the shadow in GTT. Used to
- * restore things like GPUVM page tables after a GPU reset where
- * the contents of VRAM might be lost.
- * Returns 0 on success, negative error code on failure.
- */
-static int amdgpu_device_recover_vram_from_shadow(struct 
amdgpu_device *adev,

-  struct amdgpu_ring *ring,
-  struct amdgpu_bo *bo,
-  struct dma_fence **fence)
-{
-    uint32_t domain;
-    int r;
-
-    if (!bo->shadow)
-    return 0;
-
-    r = amdgpu_bo_reserve(bo, true);
-    if (r)
-    return r;
-    domain = amdgpu_mem_type_to_domain(bo->tbo.mem.mem_type);
-    /* if bo has been evicted, then no need to recover */
-    if (domain == AMDGPU_GEM_DOMAIN_VRAM) {
-    r = amdgpu_bo_validate(bo->shadow);
-    if (r) {
-    DRM_ERROR("bo validate failed!\n");
-    goto err;
-    }
-
-    r = amdgpu_bo_restore_from_shadow(adev, ring, bo,
- NULL, fence, true);
-    if (r) {
-    DRM_ERROR("recover page table failed!\n");
-    goto err;
-    }
-    }
-err:
-    amdgpu_bo_unreserve(bo);
-    return r;
-}
-
  /**
   * amdgpu_device_recover_vram - Recover some VRAM contents
   *
@@ -2996,16 +2948,15 @@ static int 
amdgpu_device_recover_vram_from_shadow(struct amdgpu_device *adev,
   * Restores the contents of VRAM buffers from the shadows in GTT.  
Used to

   * restore things like GPUVM page tables after a GPU reset where
   * the contents of VRAM might be lost.
- * Returns 0 on success, 1 on failure.
+ *
+ * Returns:
+ * 0 on success, negative error code on failure.
   */
  static int amdgpu_device_recover_vram(struct amdgpu_device *adev)
  {
-    struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
-    struct amdgpu_bo *bo, *tmp;
  struct dma_fence *fence = NULL, *next = NULL;
-    long r = 1;
-    int i = 0;
-    long tmo;
+    struct amdgpu_bo *shadow;
+    long r = 1, tmo;
    if (amdgpu_sriov_runtime(adev))
  tmo = msecs_to_jiffies(8000);
@@ -3014,44 +2965,40 @@ static int amdgpu_device_recover_vram(struct 
amdgpu_device *adev)

    DRM_INFO("recover vram bo from shadow start\n");
  mutex_lock(>shadow_list_lock);
-    list_for_each_entry_safe(bo, tmp, >shadow_list, 
shadow_list) {

-    next = NULL;
-    amdgpu_device_recover_vram_from_shadow(adev, ring, bo, );
+    list_for_each_entry(shadow, >shadow_list, shadow_list) {
+
+    /* No need to recover an evicted BO */
+    if (shadow->tbo.mem.mem_type != TTM_PL_TT ||
+    shadow->parent->tbo.mem.mem_type != TTM_PL_VRAM)

Is there a chance that the shadow bo gets evicted to another domain,
like SYSTEM?


Yes, that's why I test "!= TTM_PL_TT" here.

What can happen is that either the shadow or the page table or page 
directory is evicted.


But in this case we don't need to restore anything because of patch #1 
in this series.


Thanks, then it's
Acked-by: Junwei Zhang 

Regards,
Jerry



Regards,
Christian.



Regards,
Jerry

+    continue;
+
+    r = amdgpu_bo_restore_shadow(shadow, &next);
+    if (r)
+    break;
+
  if (fence) {
  r = dma_fence_wait_timeout(fence, false, tmo);
-    if (r == 0)
-    pr_err("wait fence %p[%d] timeout\n", fence, i);
-    else if (r < 0)
-    pr_err("wait fence %p[%d] interrupted\n", fence, i);
-    if (r < 1) {
-    dma_fence_put(fence);
-    fence = next;
+    dma_fence_put(fence);
+    fence = next;
+    if (r <= 0)
  break;
-    }
-    i++;
+    } else {

Re: [PATCH libdrm 3/3] test/amdgpu: add GDS, GWS and OA tests

2018-09-18 Thread Zhang, Jerry(Junwei)

On 09/14/2018 09:09 PM, Christian König wrote:

Add allocation tests for GDW, GWS and OA.

Signed-off-by: Christian König 
---
  tests/amdgpu/amdgpu_test.h | 48 +-
  tests/amdgpu/bo_tests.c| 21 
  2 files changed, 47 insertions(+), 22 deletions(-)

diff --git a/tests/amdgpu/amdgpu_test.h b/tests/amdgpu/amdgpu_test.h
index d1e14e23..af3041e5 100644
--- a/tests/amdgpu/amdgpu_test.h
+++ b/tests/amdgpu/amdgpu_test.h
@@ -207,11 +207,9 @@ static inline amdgpu_bo_handle gpu_mem_alloc(
amdgpu_va_handle *va_handle)
  {
struct amdgpu_bo_alloc_request req = {0};
-   amdgpu_bo_handle buf_handle;
+   amdgpu_bo_handle buf_handle = NULL;
int r;
  
-	CU_ASSERT_NOT_EQUAL(vmc_addr, NULL);

-
req.alloc_size = size;
req.phys_alignment = alignment;
req.preferred_heap = type;
@@ -222,16 +220,19 @@ static inline amdgpu_bo_handle gpu_mem_alloc(
if (r)
return NULL;
  
-	r = amdgpu_va_range_alloc(device_handle,

- amdgpu_gpu_va_range_general,
- size, alignment, 0, vmc_addr,
- va_handle, 0);
-   CU_ASSERT_EQUAL(r, 0);
-   if (r)
-   goto error_free_bo;
-
-   r = amdgpu_bo_va_op(buf_handle, 0, size, *vmc_addr, 0, 
AMDGPU_VA_OP_MAP);
-   CU_ASSERT_EQUAL(r, 0);
+   if (vmc_addr && va_handle) {
+   r = amdgpu_va_range_alloc(device_handle,
+ amdgpu_gpu_va_range_general,
+ size, alignment, 0, vmc_addr,
+ va_handle, 0);
+   CU_ASSERT_EQUAL(r, 0);
+   if (r)
+   goto error_free_bo;
+
+   r = amdgpu_bo_va_op(buf_handle, 0, size, *vmc_addr, 0,
+   AMDGPU_VA_OP_MAP);
+   CU_ASSERT_EQUAL(r, 0);


Add an error check for the bo map here as well.

Regards,
Jerry

+   }
  
  	return buf_handle;
  
@@ -256,15 +257,18 @@ static inline int gpu_mem_free(amdgpu_bo_handle bo,

if (!bo)
return 0;
  
-	r = amdgpu_bo_va_op(bo, 0, size, vmc_addr, 0, AMDGPU_VA_OP_UNMAP);

-   CU_ASSERT_EQUAL(r, 0);
-   if (r)
-   return r;
-
-   r = amdgpu_va_range_free(va_handle);
-   CU_ASSERT_EQUAL(r, 0);
-   if (r)
-   return r;
+   if (va_handle) {
+   r = amdgpu_bo_va_op(bo, 0, size, vmc_addr, 0,
+   AMDGPU_VA_OP_UNMAP);
+   CU_ASSERT_EQUAL(r, 0);
+   if (r)
+   return r;
+
+   r = amdgpu_va_range_free(va_handle);
+   CU_ASSERT_EQUAL(r, 0);
+   if (r)
+   return r;
+   }
  
  	r = amdgpu_bo_free(bo);

CU_ASSERT_EQUAL(r, 0);
diff --git a/tests/amdgpu/bo_tests.c b/tests/amdgpu/bo_tests.c
index dc2de9b7..7cff4cf7 100644
--- a/tests/amdgpu/bo_tests.c
+++ b/tests/amdgpu/bo_tests.c
@@ -242,6 +242,27 @@ static void amdgpu_memory_alloc(void)
  
  	r = gpu_mem_free(bo, va_handle, bo_mc, 4096);

CU_ASSERT_EQUAL(r, 0);
+
+   /* Test GDS */
+   bo = gpu_mem_alloc(device_handle, 1024, 0,
+   AMDGPU_GEM_DOMAIN_GDS, 0,
+   NULL, NULL);
+   r = gpu_mem_free(bo, NULL, 0, 4096);
+   CU_ASSERT_EQUAL(r, 0);
+
+   /* Test GWS */
+   bo = gpu_mem_alloc(device_handle, 1, 0,
+   AMDGPU_GEM_DOMAIN_GWS, 0,
+   NULL, NULL);
+   r = gpu_mem_free(bo, NULL, 0, 4096);
+   CU_ASSERT_EQUAL(r, 0);
+
+   /* Test OA */
+   bo = gpu_mem_alloc(device_handle, 1, 0,
+   AMDGPU_GEM_DOMAIN_OA, 0,
+   NULL, NULL);
+   r = gpu_mem_free(bo, NULL, 0, 4096);
+   CU_ASSERT_EQUAL(r, 0);
  }
  
  static void amdgpu_mem_fail_alloc(void)

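For reference, allocating one of these domains through libdrm boils down
to the following minimal sketch; it assumes a device_handle already set
up via amdgpu_device_initialize(), and a GDS/GWS/OA BO takes no GPU VA
mapping, which is exactly why the NULL va_handle path above exists:

#include <amdgpu.h>
#include <amdgpu_drm.h>
#include <string.h>

static int alloc_free_gds(amdgpu_device_handle device_handle)
{
	struct amdgpu_bo_alloc_request req;
	amdgpu_bo_handle bo;
	int r;

	memset(&req, 0, sizeof(req));
	req.alloc_size = 1024;				/* bytes of GDS */
	req.preferred_heap = AMDGPU_GEM_DOMAIN_GDS;	/* no VA mapping */

	r = amdgpu_bo_alloc(device_handle, &req, &bo);
	if (r)
		return r;

	return amdgpu_bo_free(bo);
}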



Re: [PATCH libdrm 1/3] amdgpu: remove invalid check in amdgpu_bo_alloc

2018-09-18 Thread Zhang, Jerry(Junwei)

On 09/14/2018 09:09 PM, Christian König wrote:

The heap is checked by the kernel and not libdrm, to make it even worse
it prevented allocating resources other than VRAM and GTT.

Signed-off-by: Christian König 

Reviewed-by: Junwei Zhang 


---
  amdgpu/amdgpu_bo.c | 9 ++---
  1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/amdgpu/amdgpu_bo.c b/amdgpu/amdgpu_bo.c
index 6a95929c..34904e38 100644
--- a/amdgpu/amdgpu_bo.c
+++ b/amdgpu/amdgpu_bo.c
@@ -74,19 +74,14 @@ int amdgpu_bo_alloc(amdgpu_device_handle dev,
amdgpu_bo_handle *buf_handle)
  {
union drm_amdgpu_gem_create args;
-   unsigned heap = alloc_buffer->preferred_heap;
-   int r = 0;
-
-   /* It's an error if the heap is not specified */
-   if (!(heap & (AMDGPU_GEM_DOMAIN_GTT | AMDGPU_GEM_DOMAIN_VRAM)))
-   return -EINVAL;
+   int r;
  
  	memset(, 0, sizeof(args));

args.in.bo_size = alloc_buffer->alloc_size;
args.in.alignment = alloc_buffer->phys_alignment;
  
  	/* Set the placement. */

-   args.in.domains = heap;
+   args.in.domains = alloc_buffer->preferred_heap;
args.in.domain_flags = alloc_buffer->flags;
  
  	/* Allocate the buffer with the preferred heap. */




Re: [PATCH libdrm 2/3] test/amdgpu: add proper error handling

2018-09-18 Thread Zhang, Jerry(Junwei)

On 09/14/2018 09:09 PM, Christian König wrote:

Otherwise the calling function won't notice that something is wrong.

Signed-off-by: Christian König 
---
  tests/amdgpu/amdgpu_test.h | 23 ++-
  1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/tests/amdgpu/amdgpu_test.h b/tests/amdgpu/amdgpu_test.h
index f2ece3c3..d1e14e23 100644
--- a/tests/amdgpu/amdgpu_test.h
+++ b/tests/amdgpu/amdgpu_test.h
@@ -219,17 +219,31 @@ static inline amdgpu_bo_handle gpu_mem_alloc(
  
  	r = amdgpu_bo_alloc(device_handle, , _handle);

CU_ASSERT_EQUAL(r, 0);
+   if (r)
+   return NULL;
  
  	r = amdgpu_va_range_alloc(device_handle,

  amdgpu_gpu_va_range_general,
  size, alignment, 0, vmc_addr,
  va_handle, 0);
CU_ASSERT_EQUAL(r, 0);
+   if (r)
+   goto error_free_bo;
  
  	r = amdgpu_bo_va_op(buf_handle, 0, size, *vmc_addr, 0, AMDGPU_VA_OP_MAP);

CU_ASSERT_EQUAL(r, 0);


We may also add an error check for the bo map.

Regards,
Jerry
  
  	return buf_handle;

+
+error_free_va:
+   r = amdgpu_va_range_free(*va_handle);
+   CU_ASSERT_EQUAL(r, 0);
+
+error_free_bo:
+   r = amdgpu_bo_free(buf_handle);
+   CU_ASSERT_EQUAL(r, 0);
+
+   return NULL;
  }
  
  static inline int gpu_mem_free(amdgpu_bo_handle bo,

@@ -239,16 +253,23 @@ static inline int gpu_mem_free(amdgpu_bo_handle bo,
  {
int r;
  
+	if (!bo)

+   return 0;
+
r = amdgpu_bo_va_op(bo, 0, size, vmc_addr, 0, AMDGPU_VA_OP_UNMAP);
CU_ASSERT_EQUAL(r, 0);
+   if (r)
+   return r;
  
  	r = amdgpu_va_range_free(va_handle);

CU_ASSERT_EQUAL(r, 0);
+   if (r)
+   return r;
  
  	r = amdgpu_bo_free(bo);

CU_ASSERT_EQUAL(r, 0);
  
-	return 0;

+   return r;
  }
  
  static inline int

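The structure being added is the usual kernel-style goto unwind: each
failure jumps to the label that releases everything acquired so far, in
reverse order. A generic stand-alone sketch (names illustrative):

#include <stdlib.h>

struct res { int dummy; };

static struct res *acquire(void) { return malloc(sizeof(struct res)); }
static void release(struct res *r) { free(r); }

static int setup(struct res **a_out, struct res **b_out)
{
	struct res *a, *b;

	a = acquire();
	if (!a)
		return -1;

	b = acquire();
	if (!b)
		goto error_free_a;	/* undo only what succeeded */

	*a_out = a;
	*b_out = b;
	return 0;

error_free_a:
	release(a);
	return -1;
}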



Re: [PATCH] list: introduce list_bulk_move_tail helper

2018-09-18 Thread Zhang, Jerry(Junwei)

On 09/17/2018 08:08 PM, Christian König wrote:

Move all entries between @first and including @last before @head.

This is useful for LRU lists where a whole block of entries should be
moved to the end of the list.

Used as a band aid in TTM, but better placed in the common list headers.

Signed-off-by: Christian König 

Reviewed-by: Junwei Zhang 

---
  drivers/gpu/drm/ttm/ttm_bo.c | 25 +
  include/linux/list.h | 23 +++
  2 files changed, 28 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index b2a33bf1ef10..26b889f86670 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -247,20 +247,6 @@ void ttm_bo_move_to_lru_tail(struct ttm_buffer_object *bo,
  }
  EXPORT_SYMBOL(ttm_bo_move_to_lru_tail);
  
-static void ttm_list_move_bulk_tail(struct list_head *list,

-   struct list_head *first,
-   struct list_head *last)
-{
-   first->prev->next = last->next;
-   last->next->prev = first->prev;
-
-   list->prev->next = first;
-   first->prev = list->prev;
-
-   last->next = list;
-   list->prev = last;
-}
-
  void ttm_bo_bulk_move_lru_tail(struct ttm_lru_bulk_move *bulk)
  {
unsigned i;
@@ -276,8 +262,8 @@ void ttm_bo_bulk_move_lru_tail(struct ttm_lru_bulk_move 
*bulk)
reservation_object_assert_held(pos->last->resv);
  
  		man = &pos->first->bdev->man[TTM_PL_TT];

-   ttm_list_move_bulk_tail(&man->lru[i], &pos->first->lru,
-   &pos->last->lru);
+   list_bulk_move_tail(&man->lru[i], &pos->first->lru,
+   &pos->last->lru);
}
  
  	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {

@@ -291,8 +277,8 @@ void ttm_bo_bulk_move_lru_tail(struct ttm_lru_bulk_move 
*bulk)
reservation_object_assert_held(pos->last->resv);
  
  		man = &pos->first->bdev->man[TTM_PL_VRAM];

-   ttm_list_move_bulk_tail(&man->lru[i], &pos->first->lru,
-   &pos->last->lru);
+   list_bulk_move_tail(&man->lru[i], &pos->first->lru,
+   &pos->last->lru);
}
  
  	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {

@@ -306,8 +292,7 @@ void ttm_bo_bulk_move_lru_tail(struct ttm_lru_bulk_move 
*bulk)
reservation_object_assert_held(pos->last->resv);
  
  		lru = &pos->first->bdev->glob->swap_lru[i];

-   ttm_list_move_bulk_tail(lru, &pos->first->swap,
-   &pos->last->swap);
+   list_bulk_move_tail(lru, &pos->first->swap, &pos->last->swap);
}
  }
  EXPORT_SYMBOL(ttm_bo_bulk_move_lru_tail);
diff --git a/include/linux/list.h b/include/linux/list.h
index de04cc5ed536..edb7628e46ed 100644
--- a/include/linux/list.h
+++ b/include/linux/list.h
@@ -183,6 +183,29 @@ static inline void list_move_tail(struct list_head *list,
list_add_tail(list, head);
  }
  
+/**

+ * list_bulk_move_tail - move a subsection of a list to its tail
+ * @head: the head that will follow our entry
+ * @first: first entry to move
+ * @last: last entry to move, can be the same as first
+ *
+ * Move all entries between @first and including @last before @head.
+ * All three entries must belong to the same linked list.
+ */
+static inline void list_bulk_move_tail(struct list_head *head,
+  struct list_head *first,
+  struct list_head *last)
+{
+   first->prev->next = last->next;
+   last->next->prev = first->prev;
+
+   head->prev->next = first;
+   first->prev = head->prev;
+
+   last->next = head;
+   head->prev = last;
+}
+
  /**
   * list_is_last - tests whether @list is the last entry in list @head
   * @list: the entry to test

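A stand-alone demonstration of the helper on the usual circular
doubly-linked list_head, showing that a whole [first, last] block moves
to the tail with one O(1) splice:

#include <stdio.h>

struct list_head { struct list_head *prev, *next; };

static void list_init(struct list_head *h) { h->prev = h->next = h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

static void list_bulk_move_tail(struct list_head *head,
				struct list_head *first,
				struct list_head *last)
{
	/* Unlink the whole [first, last] block from its current spot. */
	first->prev->next = last->next;
	last->next->prev = first->prev;
	/* Splice the block in before head, i.e. at the list's tail. */
	head->prev->next = first;
	first->prev = head->prev;
	last->next = head;
	head->prev = last;
}

int main(void)
{
	struct list_head lru, e[4];
	struct list_head *p;
	int i;

	list_init(&lru);
	for (i = 0; i < 4; i++)
		list_add_tail(&e[i], &lru);

	/* Move the block e[1]..e[2] to the tail in one step. */
	list_bulk_move_tail(&lru, &e[1], &e[2]);

	for (p = lru.next; p != &lru; p = p->next)
		printf("e[%d]\n", (int)(p - e));	/* prints 0 3 1 2 */
	return 0;
}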


Re: [PATCH libdrm] tests/amdgpu: add unaligned VM test

2018-09-14 Thread Zhang, Jerry(Junwei)

On 09/13/2018 08:20 PM, Christian König wrote:

On 11.09.2018 at 04:06, Zhang, Jerry (Junwei) wrote:

On 09/10/2018 05:33 PM, Christian König wrote:

On 10.09.2018 at 04:44, Zhang, Jerry (Junwei) wrote:

On 09/10/2018 02:04 AM, Christian König wrote:

Make a VM mapping which is as unaligned as possible.


Is it going to test unaligned address between BO allocation and BO 
mapping

and skip huge page mapping?


Yes and no.

Huge page handling works by mapping at least 2MB of continuous 
memory on a 2MB aligned address.


What I do here is I allocate 4GB of VRAM and try to map it to an 
address which is aligned to 1GB + 4KB.


In other words the VM subsystem will add a single PTE to align the 
entry to 8KB, then it adds two PTEs to align it to 16KB, then four to
get to 32KB and so on until we have the maximum alignment of 2GB

which Vega/Raven support in the L1.


Thanks for explaining that.

From the trace log, it will map 1*4KB, 2*4KB, ..., 256*4KB, then back 
to 1*4KB.


 amdgpu_test-1384  [005]    110.634466: amdgpu_vm_bo_update: 
soffs=11, eoffs=1f, flags=70
 amdgpu_test-1384  [005]    110.634467: amdgpu_vm_set_ptes: 
pe=f5feffd008, addr=01fec0, incr=4096, flags=71, count=1
 amdgpu_test-1384  [005]    110.634468: amdgpu_vm_set_ptes: 
pe=f5feffd010, addr=01fec01000, incr=4096, flags=f1, count=2
 amdgpu_test-1384  [005]    110.634468: amdgpu_vm_set_ptes: 
pe=f5feffd020, addr=01fec03000, incr=4096, flags=171, count=4
 amdgpu_test-1384  [005]    110.634468: amdgpu_vm_set_ptes: 
pe=f5feffd040, addr=01fec07000, incr=4096, flags=1f1, count=8
 amdgpu_test-1384  [005]    110.634468: amdgpu_vm_set_ptes: 
pe=f5feffd080, addr=01fec0f000, incr=4096, flags=271, count=16
 amdgpu_test-1384  [005]    110.634468: amdgpu_vm_set_ptes: 
pe=f5feffd100, addr=01fec1f000, incr=4096, flags=2f1, count=32
 amdgpu_test-1384  [005]    110.634469: amdgpu_vm_set_ptes: 
pe=f5feffd200, addr=01fec3f000, incr=4096, flags=371, count=64
 amdgpu_test-1384  [005]    110.634469: amdgpu_vm_set_ptes: 
pe=f5feffd400, addr=01fec7f000, incr=4096, flags=3f1, count=128
 amdgpu_test-1384  [005]    110.634469: amdgpu_vm_set_ptes: 
pe=f5feffd800, addr=01fecff000, incr=4096, flags=471, count=256
 amdgpu_test-1384  [005]    110.634469: amdgpu_vm_set_ptes: 
pe=f5feffc000, addr=01fedff000, incr=4096, flags=71, count=1
 amdgpu_test-1384  [005]    110.634470: amdgpu_vm_set_ptes: 
pe=f5feffc008, addr=01fea0, incr=4096, flags=71, count=1
 amdgpu_test-1384  [005]    110.634470: amdgpu_vm_set_ptes: 
pe=f5feffc010, addr=01fea01000, incr=4096, flags=f1, count=2


Yes, that is exactly the expected result with the old code.



And it sounds like a performance test for Vega and later.
If so, shall we add some timestamps to the log?


Well, I used it as a performance test, but the resulting numbers are not
very comparable.


It is useful to push to libdrm because it also exercises the VM code 
and makes sure that the code doesn't crash on corner cases.

Thanks for the info.
That's fine with me.

Reviewed-by: Junwei Zhang 

BTW, I still think adding a print here is a good choice.
+ /* Don't let the test fail if the device doesn't have enough VRAM */
+ if (r)
+ return;

Regards,
Jerry


Regards,
Christian.



Regards,
Jerry



Regards,
Christian.





Signed-off-by: Christian König 
---
  tests/amdgpu/vm_tests.c | 45 
-

  1 file changed, 44 insertions(+), 1 deletion(-)

diff --git a/tests/amdgpu/vm_tests.c b/tests/amdgpu/vm_tests.c
index 7b6dc5d6..fada2987 100644
--- a/tests/amdgpu/vm_tests.c
+++ b/tests/amdgpu/vm_tests.c
@@ -31,8 +31,8 @@ static  amdgpu_device_handle device_handle;
  static  uint32_t  major_version;
  static  uint32_t  minor_version;

-
  static void amdgpu_vmid_reserve_test(void);
+static void amdgpu_vm_unaligned_map(void);

  CU_BOOL suite_vm_tests_enable(void)
  {
@@ -84,6 +84,7 @@ int suite_vm_tests_clean(void)

  CU_TestInfo vm_tests[] = {
  { "resere vmid test",  amdgpu_vmid_reserve_test },
+    { "unaligned map",  amdgpu_vm_unaligned_map },
  CU_TEST_INFO_NULL,
  };

@@ -167,3 +168,45 @@ static void amdgpu_vmid_reserve_test(void)
  r = amdgpu_cs_ctx_free(context_handle);
  CU_ASSERT_EQUAL(r, 0);
  }
+
+static void amdgpu_vm_unaligned_map(void)
+{
+    const uint64_t map_size = (4ULL << 30) - (2 << 12);
+    struct amdgpu_bo_alloc_request request = {};
+    amdgpu_bo_handle buf_handle;
+    amdgpu_va_handle handle;
+    uint64_t vmc_addr;
+    int r;
+
+    request.alloc_size = 4ULL << 30;
+    request.phys_alignment = 4096;
+    request.preferred_heap = AMDGPU_GEM_DOMAIN_VRAM;
+    request.flags = AMDGPU_GEM_CREATE_NO_CPU_ACCESS;
+
+    r = amdgpu_bo_alloc(device_handle, &request, &buf_handle);
+    /* Don't let the test fail if the device doesn't have enough VRAM */


We may print some info to the console here.

Regards,
Jerry

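The 1/2/4/.../256-page ladder in the trace above follows directly from
the alignment of the start address; each mapped run doubles the
alignment of the next one. A small stand-alone model, assuming 4KB pages
and the 256-page (1MB) fragment cap seen in the trace:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	/* Start at 1GB + 4KB, i.e. aligned to only a single page. */
	uint64_t pfn = ((1ULL << 30) + 4096) >> 12;
	uint64_t remaining = (4ULL << 30) >> 12;	/* 4GB of pages */
	const uint64_t max_run = 256;			/* cap from the trace */

	while (remaining) {
		/* Largest power-of-two run the current alignment allows. */
		uint64_t run = pfn & -pfn;

		if (run > max_run)
			run = max_run;
		if (run > remaining)
			run = remaining;
		printf("map %3llu pages at pfn 0x%llx\n",
		       (unsigned long long)run, (unsigned long long)pfn);
		pfn += run;
		remaining -= run;
		if (run == max_run)
			break;	/* steady state; the pattern repeats */
	}
	return 0;
}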

Re: [PATCH 5/5] drm/amdgpu: fix shadow BO restoring

2018-09-13 Thread Zhang, Jerry(Junwei)

On 09/11/2018 05:56 PM, Christian König wrote:

Don't grab the reservation lock any more and simplify the handling quite
a bit.

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 109 -
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |  46 
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |   8 +--
  3 files changed, 43 insertions(+), 120 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 5eba66ecf668..20bb702f5c7f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2940,54 +2940,6 @@ static int amdgpu_device_ip_post_soft_reset(struct 
amdgpu_device *adev)
return 0;
  }
  
-/**

- * amdgpu_device_recover_vram_from_shadow - restore shadowed VRAM buffers
- *
- * @adev: amdgpu_device pointer
- * @ring: amdgpu_ring for the engine handling the buffer operations
- * @bo: amdgpu_bo buffer whose shadow is being restored
- * @fence: dma_fence associated with the operation
- *
- * Restores the VRAM buffer contents from the shadow in GTT.  Used to
- * restore things like GPUVM page tables after a GPU reset where
- * the contents of VRAM might be lost.
- * Returns 0 on success, negative error code on failure.
- */
-static int amdgpu_device_recover_vram_from_shadow(struct amdgpu_device *adev,
- struct amdgpu_ring *ring,
- struct amdgpu_bo *bo,
- struct dma_fence **fence)
-{
-   uint32_t domain;
-   int r;
-
-   if (!bo->shadow)
-   return 0;
-
-   r = amdgpu_bo_reserve(bo, true);
-   if (r)
-   return r;
-   domain = amdgpu_mem_type_to_domain(bo->tbo.mem.mem_type);
-   /* if bo has been evicted, then no need to recover */
-   if (domain == AMDGPU_GEM_DOMAIN_VRAM) {
-   r = amdgpu_bo_validate(bo->shadow);
-   if (r) {
-   DRM_ERROR("bo validate failed!\n");
-   goto err;
-   }
-
-   r = amdgpu_bo_restore_from_shadow(adev, ring, bo,
-NULL, fence, true);
-   if (r) {
-   DRM_ERROR("recover page table failed!\n");
-   goto err;
-   }
-   }
-err:
-   amdgpu_bo_unreserve(bo);
-   return r;
-}
-
  /**
   * amdgpu_device_recover_vram - Recover some VRAM contents
   *
@@ -2996,16 +2948,15 @@ static int 
amdgpu_device_recover_vram_from_shadow(struct amdgpu_device *adev,
   * Restores the contents of VRAM buffers from the shadows in GTT.  Used to
   * restore things like GPUVM page tables after a GPU reset where
   * the contents of VRAM might be lost.
- * Returns 0 on success, 1 on failure.
+ *
+ * Returns:
+ * 0 on success, negative error code on failure.
   */
  static int amdgpu_device_recover_vram(struct amdgpu_device *adev)
  {
-   struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
-   struct amdgpu_bo *bo, *tmp;
struct dma_fence *fence = NULL, *next = NULL;
-   long r = 1;
-   int i = 0;
-   long tmo;
+   struct amdgpu_bo *shadow;
+   long r = 1, tmo;
  
  	if (amdgpu_sriov_runtime(adev))

tmo = msecs_to_jiffies(8000);
@@ -3014,44 +2965,40 @@ static int amdgpu_device_recover_vram(struct 
amdgpu_device *adev)
  
  	DRM_INFO("recover vram bo from shadow start\n");

	mutex_lock(&adev->shadow_list_lock);
-   list_for_each_entry_safe(bo, tmp, &adev->shadow_list, shadow_list) {
-   next = NULL;
-   amdgpu_device_recover_vram_from_shadow(adev, ring, bo, &next);
+   list_for_each_entry(shadow, &adev->shadow_list, shadow_list) {
+
+   /* No need to recover an evicted BO */
+   if (shadow->tbo.mem.mem_type != TTM_PL_TT ||
+   shadow->parent->tbo.mem.mem_type != TTM_PL_VRAM)

Is there a chance that the shadow bo gets evicted to another domain,
like SYSTEM?

Regards,
Jerry

+   continue;
+
+   r = amdgpu_bo_restore_shadow(shadow, &next);
+   if (r)
+   break;
+
if (fence) {
r = dma_fence_wait_timeout(fence, false, tmo);
-   if (r == 0)
-   pr_err("wait fence %p[%d] timeout\n", fence, i);
-   else if (r < 0)
-   pr_err("wait fence %p[%d] interrupted\n", 
fence, i);
-   if (r < 1) {
-   dma_fence_put(fence);
-   fence = next;
+   dma_fence_put(fence);
+   fence = next;
+   if (r <= 0)
break;
-   }
-   i++;
+   } else {
+  

Re: [PATCH 3/5] drm/amdgpu: shadow BOs don't need any alignment

2018-09-13 Thread Zhang, Jerry(Junwei)

On 09/11/2018 05:56 PM, Christian König wrote:

They aren't directly used by the hardware.

Signed-off-by: Christian König 

Reviewed-by: Junwei Zhang 

---
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 5 ++---
  1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 7db0040ca145..3a6f92de5504 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -516,7 +516,7 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
  }
  
  static int amdgpu_bo_create_shadow(struct amdgpu_device *adev,

-  unsigned long size, int byte_align,
+  unsigned long size,
   struct amdgpu_bo *bo)
  {
struct amdgpu_bo_param bp;
@@ -527,7 +527,6 @@ static int amdgpu_bo_create_shadow(struct amdgpu_device 
*adev,
  
  	memset(, 0, sizeof(bp));

bp.size = size;
-   bp.byte_align = byte_align;
bp.domain = AMDGPU_GEM_DOMAIN_GTT;
bp.flags = AMDGPU_GEM_CREATE_CPU_GTT_USWC |
AMDGPU_GEM_CREATE_SHADOW;
@@ -576,7 +575,7 @@ int amdgpu_bo_create(struct amdgpu_device *adev,
WARN_ON(reservation_object_lock((*bo_ptr)->tbo.resv,
NULL));
  
-		r = amdgpu_bo_create_shadow(adev, bp->size, bp->byte_align, (*bo_ptr));

+   r = amdgpu_bo_create_shadow(adev, bp->size, *bo_ptr);
  
  		if (!bp->resv)

reservation_object_unlock((*bo_ptr)->tbo.resv);


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 4/5] drm/amdgpu: always recover VRAM during GPU recovery

2018-09-13 Thread Zhang, Jerry(Junwei)

On 09/11/2018 05:56 PM, Christian König wrote:

It shouldn't add much overhead and we should make sure that critical
VRAM content is always restored.

Signed-off-by: Christian König 

Acked-by: Junwei Zhang 

---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 10 +-
  1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 93476b8c2e72..5eba66ecf668 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2989,7 +2989,7 @@ static int amdgpu_device_recover_vram_from_shadow(struct 
amdgpu_device *adev,
  }
  
  /**

- * amdgpu_device_handle_vram_lost - Handle the loss of VRAM contents
+ * amdgpu_device_recover_vram - Recover some VRAM contents
   *
   * @adev: amdgpu_device pointer
   *
@@ -2998,7 +2998,7 @@ static int amdgpu_device_recover_vram_from_shadow(struct 
amdgpu_device *adev,
   * the contents of VRAM might be lost.
   * Returns 0 on success, 1 on failure.
   */
-static int amdgpu_device_handle_vram_lost(struct amdgpu_device *adev)
+static int amdgpu_device_recover_vram(struct amdgpu_device *adev)
  {
struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
struct amdgpu_bo *bo, *tmp;
@@ -3125,8 +3125,8 @@ static int amdgpu_device_reset(struct amdgpu_device *adev)
}
}
  
-	if (!r && ((need_full_reset && !(adev->flags & AMD_IS_APU)) || vram_lost))

-   r = amdgpu_device_handle_vram_lost(adev);
+   if (!r)
+   r = amdgpu_device_recover_vram(adev);
  
  	return r;

  }
@@ -3172,7 +3172,7 @@ static int amdgpu_device_reset_sriov(struct amdgpu_device 
*adev,
amdgpu_virt_release_full_gpu(adev, true);
if (!r && adev->virt.gim_feature & AMDGIM_FEATURE_GIM_FLR_VRAMLOST) {
atomic_inc(>vram_lost_counter);
-   r = amdgpu_device_handle_vram_lost(adev);
+   r = amdgpu_device_recover_vram(adev);
}
  
  	return r;


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 1/5] drm/amdgpu: stop pipelining VM PDs/PTs moves

2018-09-13 Thread Zhang, Jerry(Junwei)


On 09/11/2018 05:55 PM, Christian König wrote:

We are going to need this for recoverable page fault handling and it
makes shadow handling during GPU reset much more easier.

Signed-off-by: Christian König 

Acked-by: Junwei Zhang 

---
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 2 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c| 6 +-
  2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index b5f20b42439e..a7e39c9dd14b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -1363,7 +1363,7 @@ u64 amdgpu_bo_gpu_offset(struct amdgpu_bo *bo)
  {
WARN_ON_ONCE(bo->tbo.mem.mem_type == TTM_PL_SYSTEM);
WARN_ON_ONCE(!ww_mutex_is_locked(>tbo.resv->lock) &&
-!bo->pin_count);
+!bo->pin_count && bo->tbo.type != ttm_bo_type_kernel);
WARN_ON_ONCE(bo->tbo.mem.start == AMDGPU_BO_INVALID_OFFSET);
WARN_ON_ONCE(bo->tbo.mem.mem_type == TTM_PL_VRAM &&
 !(bo->flags & AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS));
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index d9f3201c9e5c..2f32dc692d40 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -524,7 +524,11 @@ static int amdgpu_move_blit(struct ttm_buffer_object *bo,
if (r)
goto error;
  
-	r = ttm_bo_pipeline_move(bo, fence, evict, new_mem);

+   /* Always block for VM page tables before committing the new location */
+   if (bo->type == ttm_bo_type_kernel)
+   r = ttm_bo_move_accel_cleanup(bo, fence, true, new_mem);
+   else
+   r = ttm_bo_pipeline_move(bo, fence, evict, new_mem);
dma_fence_put(fence);
return r;
  


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 1/8] drm/amdgpu: add some VM PD/PT iterators v2

2018-09-12 Thread Zhang, Jerry(Junwei)

On 09/12/2018 04:54 PM, Christian König wrote:
Both a leaf as well as dfs iterator to walk over all the PDs/PTs.

v2: update comments and fix for_each_amdgpu_vm_pt_dfs_safe

Signed-off-by: Christian König 

Reviewed-by: Junwei Zhang 

---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 224 +
  1 file changed, 224 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 136b00412dc8..787a200cf796 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -355,6 +355,230 @@ static struct amdgpu_vm_pt *amdgpu_vm_pt_parent(struct amdgpu_vm_pt *pt)
	return list_first_entry(>va, struct amdgpu_vm_pt, base.bo_list);
}

+/**
+ * amdgpu_vm_pt_cursor - state for for_each_amdgpu_vm_pt
+ */
+struct amdgpu_vm_pt_cursor {
+   uint64_t pfn;
+   struct amdgpu_vm_pt *parent;
+   struct amdgpu_vm_pt *entry;
+   unsigned level;
+};
+
+/**
+ * amdgpu_vm_pt_start - start PD/PT walk
+ *
+ * @adev: amdgpu_device pointer
+ * @vm: amdgpu_vm structure
+ * @start: start address of the walk
+ * @cursor: state to initialize
+ *
+ * Initialize a amdgpu_vm_pt_cursor to start a walk.
+ */
+static void amdgpu_vm_pt_start(struct amdgpu_device *adev,
+			       struct amdgpu_vm *vm, uint64_t start,
+			       struct amdgpu_vm_pt_cursor *cursor)
+{
+   cursor->pfn = start;
+   cursor->parent = NULL;
+   cursor->entry = >root;
+   cursor->level = adev->vm_manager.root_level;
+}
+
+/**
+ * amdgpu_vm_pt_descendant - go to child node
+ *
+ * @adev: amdgpu_device pointer
+ * @cursor: current state
+ *
+ * Walk to the child node of the current node.
+ * Returns:
+ * True if the walk was possible, false otherwise.
+ */
+static bool amdgpu_vm_pt_descendant(struct amdgpu_device *adev,
+				    struct amdgpu_vm_pt_cursor *cursor)
+{
+   unsigned num_entries, shift, idx;
+
+   if (!cursor->entry->entries)
+	   return false;
+
+   BUG_ON(!cursor->entry->base.bo);
+   num_entries = amdgpu_vm_num_entries(adev, cursor->level);
+   shift = amdgpu_vm_level_shift(adev, cursor->level);
+
+   ++cursor->level;
+   idx = (cursor->pfn >> shift) % num_entries;
+   cursor->parent = cursor->entry;
+   cursor->entry = >entry->entries[idx];
+   return true;
+}
+
+/**
+ * amdgpu_vm_pt_sibling - go to sibling node
+ *
+ * @adev: amdgpu_device pointer
+ * @cursor: current state
+ *
+ * Walk to the sibling node of the current node.
+ * Returns:
+ * True if the walk was possible, false otherwise.
+ */
+static bool amdgpu_vm_pt_sibling(struct amdgpu_device *adev,
+				 struct amdgpu_vm_pt_cursor *cursor)
+{
+   unsigned shift, num_entries;
+
+   /* Root doesn't have a sibling */
+   if (!cursor->parent)
+	   return false;
+
+   /* Go to our parents and see if we got a sibling */
+   shift = amdgpu_vm_level_shift(adev, cursor->level - 1);
+   num_entries = amdgpu_vm_num_entries(adev, cursor->level - 1);
+
+   if (cursor->entry == >parent->entries[num_entries - 1])
+	   return false;
+
+   cursor->pfn += 1ULL << shift;
+   cursor->pfn &= ~((1ULL << shift) - 1);
+   ++cursor->entry;
+   return true;
+}
+
+/**
+ * amdgpu_vm_pt_ancestor - go to parent node
+ *
+ * @cursor: current state
+ *
+ * Walk to the parent node of the current node.
+ * Returns:
+ * True if the walk was possible, false otherwise.
+ */
+static bool amdgpu_vm_pt_ancestor(struct amdgpu_vm_pt_cursor *cursor)
+{
+   if (!cursor->parent)
+	   return false;
+
+   --cursor->level;
+   cursor->entry = cursor->parent;
+   cursor->parent = amdgpu_vm_pt_parent(cursor->parent);
+   return true;
+}
+
+/**
+ * amdgpu_vm_pt_next - get next PD/PT in hieratchy
+ *
+ * @adev: amdgpu_device pointer
+ * @cursor: current state
+ *
+ * Walk the PD/PT tree to the next node.
+ */
+static void amdgpu_vm_pt_next(struct amdgpu_device *adev,
+			      struct amdgpu_vm_pt_cursor *cursor)
+{
+   /* First try a newborn child */
+   if (amdgpu_vm_pt_descendant(adev, cursor))
+	   return;
+
+   /* If that didn't worked try to find a sibling */
+   while (!amdgpu_vm_pt_sibling(adev, cursor)) {
+	   /* No sibling, go to our parents and grandparents */
+	   if (!amdgpu_vm_pt_ancestor(cursor)) {
+		   cursor->pfn = ~0ll;
+		   return;
+	   }
+   }
+}
+
+/**
+ * amdgpu_vm_pt_first_leaf - get first leaf PD/PT
+ *
+ * @adev: amdgpu_device pointer
+ * @vm: amdgpu_vm structure
+ * @start: start addr of the walk
+ * @cursor: state to initialize
+ *
+ * Start a walk and go directly to the leaf node.
+ */
+static void amdgpu_vm_pt_first_leaf(struct amdgpu_device *adev,
+				    struct amdgpu_vm *vm, uint64_t start,
+				    struct amdgpu_vm_pt_cursor *cursor)
+{
+   amdgpu_vm_pt_start(adev, vm, start, cursor);
+   while (amdgpu_vm_pt_descendant(adev, cursor));
+}
+
+/**
+ * amdgpu_vm_pt_next_leaf - get next leaf PD/PT
+ *
+ * @adev: amdgpu_device pointer
+ * @cursor: current state
+ *
+ * Walk the PD/PT tree to the next leaf node.
+ */
+static void amdgpu_vm_pt_next_leaf(struct amdgpu_device *adev,
+				   struct amdgpu_vm_pt_cursor *cursor)
+{
+   amdgpu_vm_pt_next(adev, cursor);
+   while (amdgpu_vm_pt_descendant(adev, cursor));
+}
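
For context, a walk over all leaves in a range driven by these helpers looks
roughly like the sketch below; the for_each_amdgpu_vm_pt_* macros from the full
patch are truncated away in this excerpt, so the loop is written out by hand,
and handle_leaf() plus the start/end bounds are purely illustrative:

	struct amdgpu_vm_pt_cursor cursor;

	/* visit every leaf PD/PT covering [start, end) */
	for (amdgpu_vm_pt_first_leaf(adev, vm, start, &cursor);
	     cursor.pfn < end;
	     amdgpu_vm_pt_next_leaf(adev, &cursor)) {
		/* cursor.entry is the leaf, cursor.pfn its first page */
		handle_leaf(cursor.entry, cursor.pfn);
	}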

Re: [PATCH 4/8] drm/amdgpu: use the DFS iterator in amdgpu_vm_invalidate_pds v2

2018-09-12 Thread Zhang, Jerry(Junwei)

On 09/12/2018 04:54 PM, Christian König wrote:

Less code and easier to maintain.

v2: rename the function as well

Signed-off-by: Christian König 

Reviewed-by: Junwei Zhang 

---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 32 
  1 file changed, 8 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index a0a30416a490..c0c97b1425fa 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -1370,37 +1370,22 @@ static void amdgpu_vm_update_pde(struct 
amdgpu_pte_update_params *params,
  }
  
  /*

- * amdgpu_vm_invalidate_level - mark all PD levels as invalid
+ * amdgpu_vm_invalidate_pds - mark all PDs as invalid
   *
   * @adev: amdgpu_device pointer
   * @vm: related vm
- * @parent: parent PD
- * @level: VMPT level
   *
   * Mark all PD level as invalid after an error.
   */
-static void amdgpu_vm_invalidate_level(struct amdgpu_device *adev,
-  struct amdgpu_vm *vm,
-  struct amdgpu_vm_pt *parent,
-  unsigned level)
+static void amdgpu_vm_invalidate_pds(struct amdgpu_device *adev,
+struct amdgpu_vm *vm)
  {
-   unsigned pt_idx, num_entries;
-
-   /*
-* Recurse into the subdirectories. This recursion is harmless because
-* we only have a maximum of 5 layers.
-*/
-   num_entries = amdgpu_vm_num_entries(adev, level);
-   for (pt_idx = 0; pt_idx < num_entries; ++pt_idx) {
-   struct amdgpu_vm_pt *entry = >entries[pt_idx];
-
-   if (!entry->base.bo)
-   continue;
+   struct amdgpu_vm_pt_cursor cursor;
+   struct amdgpu_vm_pt *entry;
  
-		if (!entry->base.moved)

+   for_each_amdgpu_vm_pt_dfs_safe(adev, vm, cursor, entry)
+   if (entry->base.bo && !entry->base.moved)
amdgpu_vm_bo_relocated(>base);
-   amdgpu_vm_invalidate_level(adev, vm, entry, level + 1);
-   }
  }
  
  /*

@@ -1497,8 +1482,7 @@ int amdgpu_vm_update_directories(struct amdgpu_device 
*adev,
return 0;
  
  error:

-   amdgpu_vm_invalidate_level(adev, vm, >root,
-  adev->vm_manager.root_level);
+   amdgpu_vm_invalidate_pds(adev, vm);
amdgpu_job_free(job);
return r;
  }


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: Fix the dead lock issue.

2018-09-10 Thread Zhang, Jerry (Junwei)

On 09/11/2018 10:51 AM, Emily Deng wrote:

It will randomly hit a deadlock when testing TDR:
1. amdgpu_device_handle_vram_lost gets the lock shadow_list_lock
2. amdgpu_bo_create locked the bo's resv lock
3. amdgpu_bo_create_shadow is waiting for the shadow_list_lock
4. amdgpu_device_recover_vram_from_shadow is waiting for the bo's resv
lock.

v2:
Make a local copy of the list

Signed-off-by: Emily Deng 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 21 -
  1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 2a21267..8c81404 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3105,6 +3105,9 @@ static int amdgpu_device_handle_vram_lost(struct 
amdgpu_device *adev)
long r = 1;
int i = 0;
long tmo;
+   struct list_head local_shadow_list;
+
+   INIT_LIST_HEAD(_shadow_list);

if (amdgpu_sriov_runtime(adev))
tmo = msecs_to_jiffies(8000);
@@ -3112,8 +3115,19 @@ static int amdgpu_device_handle_vram_lost(struct 
amdgpu_device *adev)
tmo = msecs_to_jiffies(100);

DRM_INFO("recover vram bo from shadow start\n");
+
+   mutex_lock(>shadow_list_lock);
+   list_splice_init(>shadow_list, _shadow_list);
+   mutex_unlock(>shadow_list_lock);
+
+
mutex_lock(>shadow_list_lock);
-   list_for_each_entry_safe(bo, tmp, >shadow_list, shadow_list) {
+   list_for_each_entry_safe(bo, tmp, _shadow_list, shadow_list) {
+   mutex_unlock(>shadow_list_lock);


We may not need to use shadow_list_lock when traversing the local shadow list.
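
E.g. a minimal sketch of that (names as in the patch; this assumes nothing else
manipulates the BOs' shadow_list entries while they sit on the local list):

	struct list_head local_shadow_list;

	INIT_LIST_HEAD(&local_shadow_list);

	/* detach the shared list under the lock... */
	mutex_lock(&adev->shadow_list_lock);
	list_splice_init(&adev->shadow_list, &local_shadow_list);
	mutex_unlock(&adev->shadow_list_lock);

	/* ...walk the private copy without holding the lock... */
	list_for_each_entry_safe(bo, tmp, &local_shadow_list, shadow_list)
		amdgpu_device_recover_vram_from_shadow(adev, ring, bo, &next);

	/* ...and splice it back when done */
	mutex_lock(&adev->shadow_list_lock);
	list_splice_init(&local_shadow_list, &adev->shadow_list);
	mutex_unlock(&adev->shadow_list_lock);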

Regards,
Jerry


+
+   if (!bo)
+   continue;
+
next = NULL;
amdgpu_device_recover_vram_from_shadow(adev, ring, bo, );
if (fence) {
@@ -3132,9 +3146,14 @@ static int amdgpu_device_handle_vram_lost(struct 
amdgpu_device *adev)

dma_fence_put(fence);
fence = next;
+   mutex_lock(>shadow_list_lock);
}
mutex_unlock(>shadow_list_lock);

+   mutex_lock(>shadow_list_lock);
+   list_splice_init(_shadow_list, >shadow_list);
+   mutex_unlock(>shadow_list_lock);
+
if (fence) {
r = dma_fence_wait_timeout(fence, false, tmo);
if (r == 0)


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: Optimize VM handling a bit more

2018-09-10 Thread Zhang, Jerry (Junwei)

Apart from Felix comments,

Looks good for me, patch 2 ~ 8 are
Reviewed-by: Junwei Zhang 

Patch 9 ~ 11 are
Acked-by: Junwei Zhang 


On 09/10/2018 02:03 AM, Christian König wrote:

Hi everyone,

Especially on Vega and Raven VM handling is rather inefficient while creating 
PTEs because we originally only supported 2 level page tables and implemented 4 
level page tables on top of that.

This patch set reworks quite a bit of that handling and adds proper iterator 
and tree walking functions which are then used to update PTEs more efficiently.

A totally constructed test case which tried to map 2GB of VRAM on an unaligned 
address is reduced from 45ms down to ~20ms on my test system.

As a very positive side effect this also adds support for 1GB giant VRAM pages 
additional to the existing 2MB huge pages on Vega/Raven and also enables all 
additional power of two values (2MB-2GB) for the L1.

This could be beneficial for applications which allocate very large amounts of 
memory, because it reduces the overhead of page table walks by 50% (huge pages 
were 25%).
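
A back-of-the-envelope reading of those percentages, assuming they count the
levels saved in the four-level Vega walk:

	4KB page:  root -> PD2 -> PD1 -> PT -> page  (4 levels walked)
	2MB page:  the walk ends one level early     (1/4 = 25% saved)
	1GB page:  the walk ends two levels early    (2/4 = 50% saved)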

Please comment and/or review,
Christian.

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 04/11] drm/amdgpu: add some VM PD/PT iterators

2018-09-10 Thread Zhang, Jerry (Junwei)

On 09/10/2018 02:03 AM, Christian König wrote:

Both a leaf as well as dfs iterator to walk over all the PDs/PTs.

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 221 +
  1 file changed, 221 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 416eccd9ea29..4007202585d4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -355,6 +355,227 @@ static struct amdgpu_vm_pt *amdgpu_vm_pt_parent(struct 
amdgpu_vm_pt *pt)
return list_first_entry(>va, struct amdgpu_vm_pt, base.bo_list);
  }

+/**
+ * amdgpu_vm_pt_cursor - state for for_each_amdgpu_vm_pt
+ */
+struct amdgpu_vm_pt_cursor {
+   uint64_t pfn;
+   struct amdgpu_vm_pt *parent;
+   struct amdgpu_vm_pt *entry;
+   unsigned level;
+};
+
+/**
+ * amdgpu_vm_pt_start - start PD/PT walk
+ *
+ * @adev: amdgpu_device pointer
+ * @vm: amdgpu_vm structure
+ * @start: start address of the walk
+ * @cursor: state to initialize
+ *
+ * Initialize a amdgpu_vm_pt_cursor to start a walk.
+ */
+static void amdgpu_vm_pt_start(struct amdgpu_device *adev,
+  struct amdgpu_vm *vm, uint64_t start,
+  struct amdgpu_vm_pt_cursor *cursor)
+{
+   cursor->pfn = start;
+   cursor->parent = NULL;
+   cursor->entry = >root;
+   cursor->level = adev->vm_manager.root_level;
+}
+
+/**
+ * amdgpu_vm_pt_descendant - got to child node


seems typo for "go to"

Jerry


+ *
+ * @adev: amdgpu_device pointer
+ * @cursor: current state
+ *
+ * Walk to the child node of the current node.
+ * Returns:
+ * True if the walk was possible, false otherwise.
+ */
+static bool amdgpu_vm_pt_descendant(struct amdgpu_device *adev,
+   struct amdgpu_vm_pt_cursor *cursor)
+{
+   unsigned num_entries, shift, idx;
+
+   if (!cursor->entry->entries)
+   return false;
+
+   BUG_ON(!cursor->entry->base.bo);
+   num_entries = amdgpu_vm_num_entries(adev, cursor->level);
+   shift = amdgpu_vm_level_shift(adev, cursor->level);
+
+   ++cursor->level;
+   idx = (cursor->pfn >> shift) % num_entries;
+   cursor->parent = cursor->entry;
+   cursor->entry = >entry->entries[idx];
+   return true;
+}
+
+/**
+ * amdgpu_vm_pt_sibling - go to sibling node
+ *
+ * @adev: amdgpu_device pointer
+ * @cursor: current state
+ *
+ * Walk to the sibling node of the current node.
+ * Returns:
+ * True if the walk was possible, false otherwise.
+ */
+static bool amdgpu_vm_pt_sibling(struct amdgpu_device *adev,
+struct amdgpu_vm_pt_cursor *cursor)
+{
+   unsigned shift, num_entries;
+
+   /* Root doesn't have a sibling */
+   if (!cursor->parent)
+   return false;
+
+   /* Go to our parents and see if we got a sibling */
+   shift = amdgpu_vm_level_shift(adev, cursor->level - 1);
+   num_entries = amdgpu_vm_num_entries(adev, cursor->level - 1);
+
+   if (cursor->entry == >parent->entries[num_entries - 1])
+   return false;
+
+   cursor->pfn += 1ULL << shift;
+   cursor->pfn &= ~((1ULL << shift) - 1);
+   ++cursor->entry;
+   return true;
+}
+
+/**
+ * amdgpu_vm_pt_ancestor - go to parent node
+ *
+ * @adev: amdgpu_device pointer
+ * @cursor: current state
+ *
+ * Walk to the parent node of the current node.
+ * Returns:
+ * True if the walk was possible, false otherwise.
+ */
+static bool amdgpu_vm_pt_ancestor(struct amdgpu_vm_pt_cursor *cursor)
+{
+   if (!cursor->parent)
+   return false;
+
+   --cursor->level;
+   cursor->entry = cursor->parent;
+   cursor->parent = amdgpu_vm_pt_parent(cursor->parent);
+   return true;
+}
+
+/**
+ * amdgpu_vm_pt_next - get next PD/PT in hieratchy
+ *
+ * @adev: amdgpu_device pointer
+ * @cursor: current state
+ *
+ * Walk the PD/PT tree to the next node.
+ */
+static void amdgpu_vm_pt_next(struct amdgpu_device *adev,
+ struct amdgpu_vm_pt_cursor *cursor)
+{
+   /* First try a newborn child */
+   if (amdgpu_vm_pt_descendant(adev, cursor))
+   return;
+
+   /* If that didn't worked try to find a sibling */
+   while (!amdgpu_vm_pt_sibling(adev, cursor)) {
+   /* No sibling, go to our parents and grandparents */
+   if (!amdgpu_vm_pt_ancestor(cursor)) {
+   cursor->pfn = ~0ll;
+   return;
+   }
+   }
+}
+
+/**
+ * amdgpu_vm_pt_first_leaf - get first leaf PD/PT
+ *
+ * @adev: amdgpu_device pointer
+ * @vm: amdgpu_vm structure
+ * @start: start addr of the walk
+ * @cursor: state to initialize
+ *
+ * Start a walk and go directly to the leaf node.
+ */
+static void amdgpu_vm_pt_first_leaf(struct amdgpu_device *adev,
+   struct amdgpu_vm *vm, uint64_t start,
+ 

Re: [PATCH libdrm] tests/amdgpu: add unaligned VM test

2018-09-10 Thread Zhang, Jerry (Junwei)

On 09/10/2018 05:33 PM, Christian König wrote:

On 10.09.2018 at 04:44, Zhang, Jerry (Junwei) wrote:

On 09/10/2018 02:04 AM, Christian König wrote:

Make a VM mapping which is as unaligned as possible.


Is it going to test an unaligned address between BO allocation and BO mapping,
and skip huge page mapping?


Yes and no.

Huge page handling works by mapping at least 2MB of contiguous memory at a 2MB 
aligned address.

What I do here is I allocate 4GB of VRAM and try to map it to an address which 
is aligned to 1GB + 4KB.

In other words, the VM subsystem will add a single PTE to align the entry to 
8KB, then it adds two PTEs to align it to 16KB, then four to get to 32KB, and so 
on until we reach the maximum alignment of 2GB
which Vega/Raven support in the L1.
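
To make the doubling concrete, a stand-alone toy sketch (assumptions: 4KB
pages, and the per-update count resets at the next 2MB page table boundary,
matching the trace further down; this is not driver code):

	#include <stdio.h>
	#include <stdint.h>

	int main(void)
	{
		uint64_t addr = (1ULL << 30) + (1ULL << 12);	/* 1GB + 4KB */

		while (addr & ((1ULL << 21) - 1)) {	/* up to the 2MB boundary */
			uint64_t run = addr & -addr;	/* largest run the alignment allows */

			printf("update %llu PTEs at 0x%llx\n",
			       (unsigned long long)(run >> 12),
			       (unsigned long long)addr);
			addr += run;	/* counts double: 1, 2, 4, ... 256 */
		}
		return 0;
	}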


Thanks for explaining that.

From the trace log, it will map 1*4KB, 2*4KB, ..., 256*4KB, then back to 1*4KB.

 amdgpu_test-1384  [005]    110.634466: amdgpu_vm_bo_update: 
soffs=11, eoffs=1f, flags=70
 amdgpu_test-1384  [005]    110.634467: amdgpu_vm_set_ptes: 
pe=f5feffd008, addr=01fec0, incr=4096, flags=71, count=1
 amdgpu_test-1384  [005]    110.634468: amdgpu_vm_set_ptes: 
pe=f5feffd010, addr=01fec01000, incr=4096, flags=f1, count=2
 amdgpu_test-1384  [005]    110.634468: amdgpu_vm_set_ptes: 
pe=f5feffd020, addr=01fec03000, incr=4096, flags=171, count=4
 amdgpu_test-1384  [005]    110.634468: amdgpu_vm_set_ptes: 
pe=f5feffd040, addr=01fec07000, incr=4096, flags=1f1, count=8
 amdgpu_test-1384  [005]    110.634468: amdgpu_vm_set_ptes: 
pe=f5feffd080, addr=01fec0f000, incr=4096, flags=271, count=16
 amdgpu_test-1384  [005]    110.634468: amdgpu_vm_set_ptes: 
pe=f5feffd100, addr=01fec1f000, incr=4096, flags=2f1, count=32
 amdgpu_test-1384  [005]    110.634469: amdgpu_vm_set_ptes: 
pe=f5feffd200, addr=01fec3f000, incr=4096, flags=371, count=64
 amdgpu_test-1384  [005]    110.634469: amdgpu_vm_set_ptes: 
pe=f5feffd400, addr=01fec7f000, incr=4096, flags=3f1, count=128
 amdgpu_test-1384  [005]    110.634469: amdgpu_vm_set_ptes: 
pe=f5feffd800, addr=01fecff000, incr=4096, flags=471, count=256
 amdgpu_test-1384  [005]    110.634469: amdgpu_vm_set_ptes: 
pe=f5feffc000, addr=01fedff000, incr=4096, flags=71, count=1
 amdgpu_test-1384  [005]    110.634470: amdgpu_vm_set_ptes: 
pe=f5feffc008, addr=01fea0, incr=4096, flags=71, count=1
 amdgpu_test-1384  [005]    110.634470: amdgpu_vm_set_ptes: 
pe=f5feffc010, addr=01fea01000, incr=4096, flags=f1, count=2

And it sounds like a performance test for Vega and later.
If so, shall we add some timestamps to the log?

Regards,
Jerry



Regards,
Christian.





Signed-off-by: Christian König 
---
  tests/amdgpu/vm_tests.c | 45 -
  1 file changed, 44 insertions(+), 1 deletion(-)

diff --git a/tests/amdgpu/vm_tests.c b/tests/amdgpu/vm_tests.c
index 7b6dc5d6..fada2987 100644
--- a/tests/amdgpu/vm_tests.c
+++ b/tests/amdgpu/vm_tests.c
@@ -31,8 +31,8 @@ static  amdgpu_device_handle device_handle;
  static  uint32_t  major_version;
  static  uint32_t  minor_version;

-
  static void amdgpu_vmid_reserve_test(void);
+static void amdgpu_vm_unaligned_map(void);

  CU_BOOL suite_vm_tests_enable(void)
  {
@@ -84,6 +84,7 @@ int suite_vm_tests_clean(void)

  CU_TestInfo vm_tests[] = {
  { "resere vmid test",  amdgpu_vmid_reserve_test },
+{ "unaligned map",  amdgpu_vm_unaligned_map },
  CU_TEST_INFO_NULL,
  };

@@ -167,3 +168,45 @@ static void amdgpu_vmid_reserve_test(void)
  r = amdgpu_cs_ctx_free(context_handle);
  CU_ASSERT_EQUAL(r, 0);
  }
+
+static void amdgpu_vm_unaligned_map(void)
+{
+const uint64_t map_size = (4ULL << 30) - (2 << 12);
+struct amdgpu_bo_alloc_request request = {};
+amdgpu_bo_handle buf_handle;
+amdgpu_va_handle handle;
+uint64_t vmc_addr;
+int r;
+
+request.alloc_size = 4ULL << 30;
+request.phys_alignment = 4096;
+request.preferred_heap = AMDGPU_GEM_DOMAIN_VRAM;
+request.flags = AMDGPU_GEM_CREATE_NO_CPU_ACCESS;
+
+r = amdgpu_bo_alloc(device_handle, , _handle);
+/* Don't let the test fail if the device doesn't have enough VRAM */


We may print some info to the console here.

Regards,
Jerry


+if (r)
+return;
+
+r = amdgpu_va_range_alloc(device_handle, amdgpu_gpu_va_range_general,
+  4ULL << 30, 1ULL << 30, 0, _addr,
+  , 0);
+CU_ASSERT_EQUAL(r, 0);
+if (r)
+goto error_va_alloc;
+
+vmc_addr += 1 << 12;
+
+r = amdgpu_bo_va_op(buf_handle, 0, map_size, vmc_addr, 0,
+AMDGPU_VA_OP_MAP);
+CU_ASSERT_EQUAL(r, 0);
+if (r)
+goto error_va_alloc;
+
+amdgpu_bo_va_op(buf_handle, 0, map_size, vmc_addr, 0,
+AMDGPU_VA_OP_UNMAP);
+
+error_va_alloc:
+amdgpu_bo_free(buf_handle);
+
+}





Re: [PATCH libdrm] tests/amdgpu: add unaligned VM test

2018-09-09 Thread Zhang, Jerry (Junwei)

On 09/10/2018 02:04 AM, Christian König wrote:

Make a VM mapping which is as unaligned as possible.


Is it going to test an unaligned address between BO allocation and BO mapping,
and skip huge page mapping?



Signed-off-by: Christian König 
---
  tests/amdgpu/vm_tests.c | 45 -
  1 file changed, 44 insertions(+), 1 deletion(-)

diff --git a/tests/amdgpu/vm_tests.c b/tests/amdgpu/vm_tests.c
index 7b6dc5d6..fada2987 100644
--- a/tests/amdgpu/vm_tests.c
+++ b/tests/amdgpu/vm_tests.c
@@ -31,8 +31,8 @@ static  amdgpu_device_handle device_handle;
  static  uint32_t  major_version;
  static  uint32_t  minor_version;

-
  static void amdgpu_vmid_reserve_test(void);
+static void amdgpu_vm_unaligned_map(void);

  CU_BOOL suite_vm_tests_enable(void)
  {
@@ -84,6 +84,7 @@ int suite_vm_tests_clean(void)

  CU_TestInfo vm_tests[] = {
{ "resere vmid test",  amdgpu_vmid_reserve_test },
+   { "unaligned map",  amdgpu_vm_unaligned_map },
CU_TEST_INFO_NULL,
  };

@@ -167,3 +168,45 @@ static void amdgpu_vmid_reserve_test(void)
r = amdgpu_cs_ctx_free(context_handle);
CU_ASSERT_EQUAL(r, 0);
  }
+
+static void amdgpu_vm_unaligned_map(void)
+{
+   const uint64_t map_size = (4ULL << 30) - (2 << 12);
+   struct amdgpu_bo_alloc_request request = {};
+   amdgpu_bo_handle buf_handle;
+   amdgpu_va_handle handle;
+   uint64_t vmc_addr;
+   int r;
+
+   request.alloc_size = 4ULL << 30;
+   request.phys_alignment = 4096;
+   request.preferred_heap = AMDGPU_GEM_DOMAIN_VRAM;
+   request.flags = AMDGPU_GEM_CREATE_NO_CPU_ACCESS;
+
+   r = amdgpu_bo_alloc(device_handle, , _handle);
+   /* Don't let the test fail if the device doesn't have enough VRAM */


We may print some info to the console here.

Regards,
Jerry


+   if (r)
+   return;
+
+   r = amdgpu_va_range_alloc(device_handle, amdgpu_gpu_va_range_general,
+ 4ULL << 30, 1ULL << 30, 0, _addr,
+ , 0);
+   CU_ASSERT_EQUAL(r, 0);
+   if (r)
+   goto error_va_alloc;
+
+   vmc_addr += 1 << 12;
+
+   r = amdgpu_bo_va_op(buf_handle, 0, map_size, vmc_addr, 0,
+   AMDGPU_VA_OP_MAP);
+   CU_ASSERT_EQUAL(r, 0);
+   if (r)
+   goto error_va_alloc;
+
+   amdgpu_bo_va_op(buf_handle, 0, map_size, vmc_addr, 0,
+   AMDGPU_VA_OP_UNMAP);
+
+error_va_alloc:
+   amdgpu_bo_free(buf_handle);
+
+}


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: Fix SDMA hang in prt mode

2018-09-07 Thread Zhang, Jerry (Junwei)

On 09/07/2018 03:41 PM, Tao Zhou wrote:

Fix SDMA hang in prt mode, clear XNACK_WATERMARK in reg SDMA0_UTCL1_WATERMK to 
avoid the issue


Which test case is this for? A new one?
Previously we passed the Vulkan CTS for that.

IIRC, a NACK is required as a reply; what does clearing it mean? No reply at all?
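
For reference, each SOC15_REG_GOLDEN_VALUE(ip, inst, reg, mask, value) entry is
applied as a masked read-modify-write, roughly like the sketch below (see
soc15_program_register_sequence() for the real implementation):

	u32 tmp = RREG32(reg);

	tmp &= ~mask;		/* clear the bits covered by the mask */
	tmp |= (value & mask);	/* install the golden value */
	WREG32(reg, tmp);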

Regards,
Jerry



Affected ASIC: VEGA10 VEGA12 RV1 RV2

Change-Id: I2261b8e753600731d0d8ee8bbdfc08d01eeb428e
Signed-off-by: Tao Zhou 
Tested-by: Yukun Li 
---
  drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c 
b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
index df13840..13bf8ea 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
@@ -70,6 +70,7 @@ static const struct soc15_reg_golden golden_settings_sdma_4[] 
= {
SOC15_REG_GOLDEN_VALUE(SDMA0, 0, mmSDMA0_RLC1_IB_CNTL, 0x800f0100, 
0x0100),
SOC15_REG_GOLDEN_VALUE(SDMA0, 0, mmSDMA0_RLC1_RB_WPTR_POLL_CNTL, 
0xfff0, 0x00403000),
SOC15_REG_GOLDEN_VALUE(SDMA0, 0, mmSDMA0_UTCL1_PAGE, 0x03ff, 
0x03c0),
+   SOC15_REG_GOLDEN_VALUE(SDMA0, 0, mmSDMA0_UTCL1_WATERMK, 0xfc00, 
0x),
SOC15_REG_GOLDEN_VALUE(SDMA1, 0, mmSDMA1_CHICKEN_BITS, 0xfe931f07, 
0x02831f07),
SOC15_REG_GOLDEN_VALUE(SDMA1, 0, mmSDMA1_CLK_CTRL, 0x, 
0x3f000100),
SOC15_REG_GOLDEN_VALUE(SDMA1, 0, mmSDMA1_GFX_IB_CNTL, 0x800f0100, 
0x0100),
@@ -108,7 +109,8 @@ static const struct soc15_reg_golden 
golden_settings_sdma_4_1[] = {
SOC15_REG_GOLDEN_VALUE(SDMA0, 0, mmSDMA0_RLC0_RB_WPTR_POLL_CNTL, 
0xfff7, 0x00403000),
SOC15_REG_GOLDEN_VALUE(SDMA0, 0, mmSDMA0_RLC1_IB_CNTL, 0x800f0111, 
0x0100),
SOC15_REG_GOLDEN_VALUE(SDMA0, 0, mmSDMA0_RLC1_RB_WPTR_POLL_CNTL, 
0xfff7, 0x00403000),
-   SOC15_REG_GOLDEN_VALUE(SDMA0, 0, mmSDMA0_UTCL1_PAGE, 0x03ff, 
0x03c0)
+   SOC15_REG_GOLDEN_VALUE(SDMA0, 0, mmSDMA0_UTCL1_PAGE, 0x03ff, 
0x03c0),
+   SOC15_REG_GOLDEN_VALUE(SDMA0, 0, mmSDMA0_UTCL1_WATERMK, 0xfc00, 
0x)
  };

  static const struct soc15_reg_golden golden_settings_sdma0_4_2_init[] = {


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 1/1] drm/amdgpu: Fix compute VM BO params after rebase

2018-09-05 Thread Zhang, Jerry (Junwei)

On 09/06/2018 08:28 AM, Felix Kuehling wrote:

The intent of two commits was lost in the last rebase:

810955b drm/amdgpu: Fix acquiring VM on large-BAR systems
b5d21aa drm/amdgpu: Don't use shadow BO for compute context

This commit restores the original behaviour:
* Don't set AMDGPU_GEM_CREATE_NO_CPU_ACCESS for page directories
   to allow them to be reused for compute VMs
* Don't create shadow BOs for page tables in compute VMs

Signed-off-by: Felix Kuehling 


Personally, making no_shadow -> shadow looks simpler.
Anyway, the patch restores what was missing.

Acked-by: Junwei Zhang 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 17 +++--
  1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index ea5e277..5e7a3de 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -577,10 +577,13 @@ static int amdgpu_vm_clear_bo(struct amdgpu_device *adev,
   *
   * @adev: amdgpu_device pointer
   * @vm: requesting vm
+ * @level: level in the page table hierarchy
+ * @no_shadow: disable creation of shadow BO for this VM
   * @bp: resulting BO allocation parameters
   */
  static void amdgpu_vm_bo_param(struct amdgpu_device *adev, struct amdgpu_vm 
*vm,
-  int level, struct amdgpu_bo_param *bp)
+  int level, bool no_shadow,
+  struct amdgpu_bo_param *bp)
  {
memset(bp, 0, sizeof(*bp));

@@ -595,9 +598,8 @@ static void amdgpu_vm_bo_param(struct amdgpu_device *adev, 
struct amdgpu_vm *vm,
AMDGPU_GEM_CREATE_CPU_GTT_USWC;
if (vm->use_cpu_for_update)
bp->flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
-   else
-   bp->flags |= AMDGPU_GEM_CREATE_SHADOW |
-   AMDGPU_GEM_CREATE_NO_CPU_ACCESS;
+   else if (!no_shadow)
+   bp->flags |= AMDGPU_GEM_CREATE_SHADOW;
bp->type = ttm_bo_type_kernel;
if (vm->root.base.bo)
bp->resv = vm->root.base.bo->tbo.resv;
@@ -626,6 +628,7 @@ static int amdgpu_vm_alloc_levels(struct amdgpu_device 
*adev,
  unsigned level, bool ats)
  {
unsigned shift = amdgpu_vm_level_shift(adev, level);
+   bool no_shadow = !vm->root.base.bo->shadow;
struct amdgpu_bo_param bp;
unsigned pt_idx, from, to;
int r;
@@ -650,7 +653,7 @@ static int amdgpu_vm_alloc_levels(struct amdgpu_device 
*adev,
saddr = saddr & ((1 << shift) - 1);
eaddr = eaddr & ((1 << shift) - 1);

-   amdgpu_vm_bo_param(adev, vm, level, );
+   amdgpu_vm_bo_param(adev, vm, level, no_shadow, );

/* walk over the address space and allocate the page tables */
for (pt_idx = from; pt_idx <= to; ++pt_idx) {
@@ -2709,6 +2712,7 @@ void amdgpu_vm_adjust_size(struct amdgpu_device *adev, 
uint32_t min_vm_size,
  int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
   int vm_context, unsigned int pasid)
  {
+   bool no_shadow = (vm_context == AMDGPU_VM_CONTEXT_COMPUTE);
struct amdgpu_bo_param bp;
struct amdgpu_bo *root;
int r, i;
@@ -2748,7 +2752,8 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct 
amdgpu_vm *vm,
  "CPU update of VM recommended only for large BAR system\n");
vm->last_update = NULL;

-   amdgpu_vm_bo_param(adev, vm, adev->vm_manager.root_level, );
+   amdgpu_vm_bo_param(adev, vm, adev->vm_manager.root_level, no_shadow,
+  );
r = amdgpu_bo_create(adev, , );
if (r)
goto error_free_sched_entity;


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: enable AGP aperture for GMC9 v2

2018-09-03 Thread Zhang, Jerry (Junwei)

On 09/03/2018 08:22 PM, Christian König wrote:

Enable the old AGP aperture to avoid GART mappings.

v2: don't enable it for SRIOV

Signed-off-by: Christian König 

Reviewed-by: Junwei Zhang 


---
  drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c | 10 +-
  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c|  2 ++
  drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c  | 10 +-
  3 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c
index 3403ded39d13..ffd0ec9586d1 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c
@@ -65,16 +65,16 @@ static void gfxhub_v1_0_init_system_aperture_regs(struct 
amdgpu_device *adev)
  {
uint64_t value;

-   /* Disable AGP. */
+   /* Program the AGP BAR */
WREG32_SOC15(GC, 0, mmMC_VM_AGP_BASE, 0);
-   WREG32_SOC15(GC, 0, mmMC_VM_AGP_TOP, 0);
-   WREG32_SOC15(GC, 0, mmMC_VM_AGP_BOT, 0x);
+   WREG32_SOC15(GC, 0, mmMC_VM_AGP_BOT, adev->gmc.agp_start >> 24);
+   WREG32_SOC15(GC, 0, mmMC_VM_AGP_TOP, adev->gmc.agp_end >> 24);

/* Program the system aperture low logical page number. */
WREG32_SOC15(GC, 0, mmMC_VM_SYSTEM_APERTURE_LOW_ADDR,
-adev->gmc.vram_start >> 18);
+min(adev->gmc.vram_start, adev->gmc.agp_start) >> 18);
WREG32_SOC15(GC, 0, mmMC_VM_SYSTEM_APERTURE_HIGH_ADDR,
-adev->gmc.vram_end >> 18);
+max(adev->gmc.vram_end, adev->gmc.agp_end) >> 18);

/* Set default page address. */
value = adev->vram_scratch.gpu_addr - adev->gmc.vram_start
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index f467638eb49d..3529c55ab52d 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -772,6 +772,8 @@ static void gmc_v9_0_vram_gtt_location(struct amdgpu_device 
*adev,
base = mmhub_v1_0_get_fb_location(adev);
amdgpu_gmc_vram_location(adev, >gmc, base);
amdgpu_gmc_gart_location(adev, mc);
+   if (!amdgpu_sriov_vf(adev))
+   amdgpu_gmc_agp_location(adev, mc);
/* base offset of vram pages */
adev->vm_manager.vram_base_offset = gfxhub_v1_0_get_mc_fb_offset(adev);
  }
diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c 
b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c
index 5f6a9c85488f..73d7c075dd33 100644
--- a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c
@@ -76,16 +76,16 @@ static void mmhub_v1_0_init_system_aperture_regs(struct 
amdgpu_device *adev)
uint64_t value;
uint32_t tmp;

-   /* Disable AGP. */
+   /* Program the AGP BAR */
WREG32_SOC15(MMHUB, 0, mmMC_VM_AGP_BASE, 0);
-   WREG32_SOC15(MMHUB, 0, mmMC_VM_AGP_TOP, 0);
-   WREG32_SOC15(MMHUB, 0, mmMC_VM_AGP_BOT, 0x00FF);
+   WREG32_SOC15(MMHUB, 0, mmMC_VM_AGP_BOT, adev->gmc.agp_start >> 24);
+   WREG32_SOC15(MMHUB, 0, mmMC_VM_AGP_TOP, adev->gmc.agp_end >> 24);

/* Program the system aperture low logical page number. */
WREG32_SOC15(MMHUB, 0, mmMC_VM_SYSTEM_APERTURE_LOW_ADDR,
-adev->gmc.vram_start >> 18);
+min(adev->gmc.vram_start, adev->gmc.agp_start) >> 18);
WREG32_SOC15(MMHUB, 0, mmMC_VM_SYSTEM_APERTURE_HIGH_ADDR,
-adev->gmc.vram_end >> 18);
+max(adev->gmc.vram_end, adev->gmc.agp_end) >> 18);

/* Set default page address. */
value = adev->vram_scratch.gpu_addr - adev->gmc.vram_start +


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH libdrm] amdgpu: fix typo in function comment

2018-09-03 Thread Zhang, Jerry (Junwei)

On 09/03/2018 06:59 PM, Qiang Yu wrote:

Signed-off-by: Qiang Yu 

Reviewed-by: Junwei Zhang 


---
  amdgpu/amdgpu.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/amdgpu/amdgpu.h b/amdgpu/amdgpu.h
index dc51659..e6ec7a8 100644
--- a/amdgpu/amdgpu.h
+++ b/amdgpu/amdgpu.h
@@ -731,7 +731,7 @@ int amdgpu_bo_free(amdgpu_bo_handle buf_handle);
  void amdgpu_bo_inc_ref(amdgpu_bo_handle bo);

  /**
- * Request CPU access to GPU accessable memory
+ * Request CPU access to GPU accessible memory
   *
   * \param   buf_handle - \c [in] Buffer handle
   * \param   cpu- \c [out] CPU address to be used for access


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: improve VM state machine documentation v2

2018-09-03 Thread Zhang, Jerry (Junwei)

On 09/03/2018 05:08 PM, Christian König wrote:

Since we have a lot of FAQ on the VM state machine try to improve the
documentation by adding functions for each state move.

v2: fix typo in amdgpu_vm_bo_invalidated, use amdgpu_vm_bo_relocated in
 one more place as well.

Signed-off-by: Christian König 

Reviewed-by: Junwei Zhang 
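
As a quick orientation before the diff, the moves these helpers encode
(summarized from the docstrings below, so approximate):

	per-VM BOs:  evicted --validate--> moved --PT update--> idle
	PDs/PTs:     evicted --validate--> relocated --parent PD update--> idle
	normal BOs:  invalidated --PT update--> done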


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 141 +
  1 file changed, 109 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 65977e7c94dc..1f79a0ddc78a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -204,6 +204,95 @@ static unsigned amdgpu_vm_bo_size(struct amdgpu_device 
*adev, unsigned level)
return AMDGPU_GPU_PAGE_ALIGN(amdgpu_vm_num_entries(adev, level) * 8);
  }

+/**
+ * amdgpu_vm_bo_evicted - vm_bo is evicted
+ *
+ * @vm_bo: vm_bo which is evicted
+ *
+ * State for PDs/PTs and per VM BOs which are not at the location they should
+ * be.
+ */
+static void amdgpu_vm_bo_evicted(struct amdgpu_vm_bo_base *vm_bo)
+{
+   struct amdgpu_vm *vm = vm_bo->vm;
+   struct amdgpu_bo *bo = vm_bo->bo;
+
+   vm_bo->moved = true;
+   if (bo->tbo.type == ttm_bo_type_kernel)
+   list_move(_bo->vm_status, >evicted);
+   else
+   list_move_tail(_bo->vm_status, >evicted);
+}
+
+/**
+ * amdgpu_vm_bo_relocated - vm_bo is reloacted
+ *
+ * @vm_bo: vm_bo which is relocated
+ *
+ * State for PDs/PTs which needs to update their parent PD.
+ */
+static void amdgpu_vm_bo_relocated(struct amdgpu_vm_bo_base *vm_bo)
+{
+   list_move(_bo->vm_status, _bo->vm->relocated);
+}
+
+/**
+ * amdgpu_vm_bo_moved - vm_bo is moved
+ *
+ * @vm_bo: vm_bo which is moved
+ *
+ * State for per VM BOs which are moved, but that change is not yet reflected
+ * in the page tables.
+ */
+static void amdgpu_vm_bo_moved(struct amdgpu_vm_bo_base *vm_bo)
+{
+   list_move(_bo->vm_status, _bo->vm->moved);
+}
+
+/**
+ * amdgpu_vm_bo_idle - vm_bo is idle
+ *
+ * @vm_bo: vm_bo which is now idle
+ *
+ * State for PDs/PTs and per VM BOs which have gone through the state machine
+ * and are now idle.
+ */
+static void amdgpu_vm_bo_idle(struct amdgpu_vm_bo_base *vm_bo)
+{
+   list_move(_bo->vm_status, _bo->vm->idle);
+   vm_bo->moved = false;
+}
+
+/**
+ * amdgpu_vm_bo_invalidated - vm_bo is invalidated
+ *
+ * @vm_bo: vm_bo which is now invalidated
+ *
+ * State for normal BOs which are invalidated and that change not yet reflected
+ * in the PTs.
+ */
+static void amdgpu_vm_bo_invalidated(struct amdgpu_vm_bo_base *vm_bo)
+{
+   spin_lock(_bo->vm->invalidated_lock);
+   list_move(_bo->vm_status, _bo->vm->invalidated);
+   spin_unlock(_bo->vm->invalidated_lock);
+}
+
+/**
+ * amdgpu_vm_bo_done - vm_bo is done
+ *
+ * @vm_bo: vm_bo which is now done
+ *
+ * State for normal BOs which are invalidated and that change has been updated
+ * in the PTs.
+ */
+static void amdgpu_vm_bo_done(struct amdgpu_vm_bo_base *vm_bo)
+{
+   spin_lock(_bo->vm->invalidated_lock);
+   list_del_init(_bo->vm_status);
+   spin_unlock(_bo->vm->invalidated_lock);
+}
+
  /**
   * amdgpu_vm_bo_base_init - Adds bo to the list of bos associated with the vm
   *
@@ -232,9 +321,9 @@ static void amdgpu_vm_bo_base_init(struct amdgpu_vm_bo_base 
*base,

vm->bulk_moveable = false;
if (bo->tbo.type == ttm_bo_type_kernel)
-   list_move(>vm_status, >relocated);
+   amdgpu_vm_bo_relocated(base);
else
-   list_move(>vm_status, >idle);
+   amdgpu_vm_bo_idle(base);

if (bo->preferred_domains &
amdgpu_mem_type_to_domain(bo->tbo.mem.mem_type))
@@ -245,8 +334,7 @@ static void amdgpu_vm_bo_base_init(struct amdgpu_vm_bo_base 
*base,
 * is currently evicted. add the bo to the evicted list to make sure it
 * is validated on next vm use to avoid fault.
 * */
-   list_move_tail(>vm_status, >evicted);
-   base->moved = true;
+   amdgpu_vm_bo_evicted(base);
  }

  /**
@@ -342,7 +430,7 @@ int amdgpu_vm_validate_pt_bos(struct amdgpu_device *adev, 
struct amdgpu_vm *vm,
break;

if (bo->tbo.type != ttm_bo_type_kernel) {
-   list_move(_base->vm_status, >moved);
+   amdgpu_vm_bo_moved(bo_base);
} else {
if (vm->use_cpu_for_update)
r = amdgpu_bo_kmap(bo, NULL);
@@ -350,7 +438,7 @@ int amdgpu_vm_validate_pt_bos(struct amdgpu_device *adev, 
struct amdgpu_vm *vm,
r = amdgpu_ttm_alloc_gart(>tbo);
if (r)
break;
-   list_move(_base->vm_status, >relocated);
+   amdgpu_vm_bo_relocated(bo_base);
}
}

@@ -1066,7 +1154,7 @@ static 

Re: [PATCH libdrm] amdgpu: When couldn't find bo, need to return error.

2018-09-03 Thread Zhang, Jerry (Junwei)

On 09/03/2018 04:44 PM, Christian König wrote:

On 03.09.2018 at 09:16, Zhang, Jerry (Junwei) wrote:

On 09/03/2018 03:11 PM, Christian König wrote:

For the master branch, I need someone's help with the correct permission.

I've already taken care of that over the weekend.


Thank you again.
BTW, how do I apply for that permission?


Previously that was done by opening a bugzilla ticket, but since the migration 
to gitlab that might be outdated now.

A good start is to go to https://gitlab.freedesktop.org and register an 
account, then somebody from the admin team needs to add that account to the 
appropriate groups.


Thanks, will give it a try.

Regards,
Jerry



Regards,
Christian.



Regards,
Jerry



Regards,
Christian.

On 03.09.2018 at 03:42, Zhang, Jerry (Junwei) wrote:

On 09/01/2018 04:58 PM, Deng, Emily wrote:

OK, then just ignore this patch. But it seems I didn't see the patch on branch 
amd-staging-hybrid-master20180315.


Thanks for taking care of this as well.

I'm waiting on some verification; the patch is now pushed to the internal staging branch,
and mainline will be pushed later after another round of verification.

For the master branch, I need someone's help with the correct permission.

Regards,
Jerry


Best wishes
Emily Deng


-Original Message-
From: Christian König 
Sent: Saturday, September 1, 2018 4:17 PM
To: Deng, Emily ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH libdrm] amdgpu: When couldn't find bo, need to return
error.

On 01.09.2018 at 06:24, Emily Deng wrote:

The startx will have a segmentation fault if we return success.

SWDEV-163962

Change-Id: I56b189fa26efdcd1d96e5100af3f3e0b1208b0c3
Signed-off-by: Emily Deng 


Jerry already sent a much better patch for this.


---
   amdgpu/amdgpu_bo.c | 4 +++-
   1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/amdgpu/amdgpu_bo.c b/amdgpu/amdgpu_bo.c index
f25cacc..7e297fa 100644
--- a/amdgpu/amdgpu_bo.c
+++ b/amdgpu/amdgpu_bo.c
@@ -760,6 +760,7 @@ int

amdgpu_find_bo_by_cpu_mapping(amdgpu_device_handle dev,

 uint64_t *offset_in_bo)
   {
   uint32_t i;
+int r = 0;
   struct amdgpu_bo *bo;

   if (cpu == NULL || size == 0)
@@ -787,10 +788,11 @@ int

amdgpu_find_bo_by_cpu_mapping(amdgpu_device_handle dev,

   } else {
   *buf_handle = NULL;
   *offset_in_bo = 0;
+r = -errno;


errno doesn't contain any error in this case.
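
I.e. the miss path needs an explicit error code instead; a sketch of that
direction (whether -ENXIO is the exact code used in Jerry's patch is an
assumption here):

	} else {
		*buf_handle = NULL;
		*offset_in_bo = 0;
		r = -ENXIO;	/* explicit "not found"; errno is unrelated */
	}
	pthread_mutex_unlock(&dev->bo_table_mutex);

	return r;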


   }
pthread_mutex_unlock(>bo_table_mutex);

-return 0;
+return r;
   }

   int amdgpu_create_bo_from_user_mem(amdgpu_device_handle dev,


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx






___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: fix amdgpu_mn_unlock() in the CS error path

2018-09-03 Thread Zhang, Jerry (Junwei)

On 09/03/2018 04:53 PM, Christian König wrote:

Avoid unlocking a lock we never locked.

Signed-off-by: Christian König 

Reviewed-by: Junwei Zhang 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 349dcc37ee64..04a2733b5ccc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1247,10 +1247,10 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
  error_abort:
dma_fence_put(>base.s_fence->finished);
job->base.s_fence = NULL;
+   amdgpu_mn_unlock(p->mn);

  error_unlock:
amdgpu_job_free(job);
-   amdgpu_mn_unlock(p->mn);
return r;
  }
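
The restored pattern is the usual cascading cleanup labels, sketched here with
hypothetical helpers: each label undoes strictly less than the one above it, so
a path that never took the lock jumps past the unlock:

	lock(mn);
	r = do_submit();	/* hypothetical */
	if (r)
		goto error_abort;	/* lock held: unlock, then free */
	...
	return 0;

error_abort:
	unlock(mn);		/* only reached by paths that locked */
error_unlock:
	free_job(job);		/* early failures jump straight here */
	return r;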



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH libdrm] amdgpu: add amdgpu_bo_inc_ref() function.

2018-09-03 Thread Zhang, Jerry (Junwei)

On 09/03/2018 02:55 PM, Qiang Yu wrote:

For Pro OGL to be able to work with upstream libdrm.

Signed-off-by: Qiang Yu 
Reviewed-by: Christian König 


I'm fine with that; not sure if Mesa is going to use it as well.

Reviewed-by: Junwei Zhang 

Regards,
Jerry


---
  amdgpu/amdgpu-symbol-check |  1 +
  amdgpu/amdgpu.h| 15 ++-
  amdgpu/amdgpu_bo.c |  6 ++
  3 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/amdgpu/amdgpu-symbol-check b/amdgpu/amdgpu-symbol-check
index b5e4fe6..487610e 100755
--- a/amdgpu/amdgpu-symbol-check
+++ b/amdgpu/amdgpu-symbol-check
@@ -15,6 +15,7 @@ amdgpu_bo_cpu_map
  amdgpu_bo_cpu_unmap
  amdgpu_bo_export
  amdgpu_bo_free
+amdgpu_bo_inc_ref
  amdgpu_bo_import
  amdgpu_bo_list_create
  amdgpu_bo_list_destroy
diff --git a/amdgpu/amdgpu.h b/amdgpu/amdgpu.h
index a8c353c..e1f93f8 100644
--- a/amdgpu/amdgpu.h
+++ b/amdgpu/amdgpu.h
@@ -721,7 +721,20 @@ int amdgpu_find_bo_by_cpu_mapping(amdgpu_device_handle dev,
  int amdgpu_bo_free(amdgpu_bo_handle buf_handle);

  /**
- * Request CPU access to GPU accessible memory
+ * Increase the reference count of a buffer object
+ *
+ * \param   bo - \c [in]  Buffer object handle to increase the reference count
+ *
+ * \return   0 on success\n
+ *  <0 - Negative POSIX Error code
+ *
+ * \sa amdgpu_bo_alloc(), amdgpu_bo_free()
+ *
+*/
+int amdgpu_bo_inc_ref(amdgpu_bo_handle bo);
+
+/**
+ * Request CPU access to GPU accessable memory
   *
   * \param   buf_handle - \c [in] Buffer handle
   * \param   cpu- \c [out] CPU address to be used for access
diff --git a/amdgpu/amdgpu_bo.c b/amdgpu/amdgpu_bo.c
index a2fc525..dceab01 100644
--- a/amdgpu/amdgpu_bo.c
+++ b/amdgpu/amdgpu_bo.c
@@ -438,6 +438,12 @@ int amdgpu_bo_free(amdgpu_bo_handle buf_handle)
return 0;
  }

+int amdgpu_bo_inc_ref(amdgpu_bo_handle bo)
+{
+   atomic_inc(>refcount);
+   return 0;
+}
+
  int amdgpu_bo_cpu_map(amdgpu_bo_handle bo, void **cpu)
  {
union drm_amdgpu_gem_mmap args;
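
A minimal usage sketch of the new call (hypothetical request values; alloc/free
semantics as declared above, where the second amdgpu_bo_free() drops the extra
reference):

	amdgpu_bo_handle bo;
	struct amdgpu_bo_alloc_request req = {
		.alloc_size = 4096,
		.phys_alignment = 4096,
		.preferred_heap = AMDGPU_GEM_DOMAIN_GTT,
	};

	if (!amdgpu_bo_alloc(dev, &req, &bo)) {
		amdgpu_bo_inc_ref(bo);	/* refcount 1 -> 2 */
		amdgpu_bo_free(bo);	/* 2 -> 1, BO stays alive */
		amdgpu_bo_free(bo);	/* 1 -> 0, BO destroyed */
	}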


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH libdrm] amdgpu: When couldn't find bo, need to return error.

2018-09-03 Thread Zhang, Jerry (Junwei)

On 09/03/2018 03:11 PM, Christian König wrote:

For the master branch, I need someone's help with the correct permission.

I've already taken care of that over the weekend.


Thank you again.
BTW, how do I apply for that permission?

Regards,
Jerry



Regards,
Christian.

On 03.09.2018 at 03:42, Zhang, Jerry (Junwei) wrote:

On 09/01/2018 04:58 PM, Deng, Emily wrote:

OK, then just ignore this patch. But it seems I didn't see the patch on branch 
amd-staging-hybrid-master20180315.


Thanks for taking care of this as well.

I'm waiting on some verification; the patch is now pushed to the internal staging branch,
and mainline will be pushed later after another round of verification.

For the master branch, I need someone's help with the correct permission.

Regards,
Jerry


Best wishes
Emily Deng


-Original Message-
From: Christian König 
Sent: Saturday, September 1, 2018 4:17 PM
To: Deng, Emily ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH libdrm] amdgpu: When couldn't find bo, need to return
error.

On 01.09.2018 at 06:24, Emily Deng wrote:

The startx will have a segmentation fault if we return success.

SWDEV-163962

Change-Id: I56b189fa26efdcd1d96e5100af3f3e0b1208b0c3
Signed-off-by: Emily Deng 


Jerry already sent a much better patch for this.


---
   amdgpu/amdgpu_bo.c | 4 +++-
   1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/amdgpu/amdgpu_bo.c b/amdgpu/amdgpu_bo.c index
f25cacc..7e297fa 100644
--- a/amdgpu/amdgpu_bo.c
+++ b/amdgpu/amdgpu_bo.c
@@ -760,6 +760,7 @@ int

amdgpu_find_bo_by_cpu_mapping(amdgpu_device_handle dev,

 uint64_t *offset_in_bo)
   {
   uint32_t i;
+int r = 0;
   struct amdgpu_bo *bo;

   if (cpu == NULL || size == 0)
@@ -787,10 +788,11 @@ int

amdgpu_find_bo_by_cpu_mapping(amdgpu_device_handle dev,

   } else {
   *buf_handle = NULL;
   *offset_in_bo = 0;
+r = -errno;


errno doesn't contain any error in this case.


   }
   pthread_mutex_unlock(>bo_table_mutex);

-return 0;
+return r;
   }

   int amdgpu_create_bo_from_user_mem(amdgpu_device_handle dev,


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx




___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 1/3] drm/amdgpu: move size calculations to the front of the file again

2018-09-02 Thread Zhang, Jerry (Junwei)

On 09/02/2018 02:05 AM, Christian König wrote:

amdgpu_vm_bo_* functions should come much later.

Signed-off-by: Christian König 

Reviewed-by: Junwei Zhang 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 90 +-
  1 file changed, 45 insertions(+), 45 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index d59222fb5931..a9275a99d793 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -133,51 +133,6 @@ struct amdgpu_prt_cb {
struct dma_fence_cb cb;
  };

-/**
- * amdgpu_vm_bo_base_init - Adds bo to the list of bos associated with the vm
- *
- * @base: base structure for tracking BO usage in a VM
- * @vm: vm to which bo is to be added
- * @bo: amdgpu buffer object
- *
- * Initialize a bo_va_base structure and add it to the appropriate lists
- *
- */
-static void amdgpu_vm_bo_base_init(struct amdgpu_vm_bo_base *base,
-  struct amdgpu_vm *vm,
-  struct amdgpu_bo *bo)
-{
-   base->vm = vm;
-   base->bo = bo;
-   INIT_LIST_HEAD(>bo_list);
-   INIT_LIST_HEAD(>vm_status);
-
-   if (!bo)
-   return;
-   list_add_tail(>bo_list, >va);
-
-   if (bo->tbo.resv != vm->root.base.bo->tbo.resv)
-   return;
-
-   vm->bulk_moveable = false;
-   if (bo->tbo.type == ttm_bo_type_kernel)
-   list_move(>vm_status, >relocated);
-   else
-   list_move(>vm_status, >idle);
-
-   if (bo->preferred_domains &
-   amdgpu_mem_type_to_domain(bo->tbo.mem.mem_type))
-   return;
-
-   /*
-* we checked all the prerequisites, but it looks like this per vm bo
-* is currently evicted. add the bo to the evicted list to make sure it
-* is validated on next vm use to avoid fault.
-* */
-   list_move_tail(>vm_status, >evicted);
-   base->moved = true;
-}
-
  /**
   * amdgpu_vm_level_shift - return the addr shift for each level
   *
@@ -249,6 +204,51 @@ static unsigned amdgpu_vm_bo_size(struct amdgpu_device 
*adev, unsigned level)
return AMDGPU_GPU_PAGE_ALIGN(amdgpu_vm_num_entries(adev, level) * 8);
  }

+/**
+ * amdgpu_vm_bo_base_init - Adds bo to the list of bos associated with the vm
+ *
+ * @base: base structure for tracking BO usage in a VM
+ * @vm: vm to which bo is to be added
+ * @bo: amdgpu buffer object
+ *
+ * Initialize a bo_va_base structure and add it to the appropriate lists
+ *
+ */
+static void amdgpu_vm_bo_base_init(struct amdgpu_vm_bo_base *base,
+  struct amdgpu_vm *vm,
+  struct amdgpu_bo *bo)
+{
+   base->vm = vm;
+   base->bo = bo;
+   INIT_LIST_HEAD(>bo_list);
+   INIT_LIST_HEAD(>vm_status);
+
+   if (!bo)
+   return;
+   list_add_tail(>bo_list, >va);
+
+   if (bo->tbo.resv != vm->root.base.bo->tbo.resv)
+   return;
+
+   vm->bulk_moveable = false;
+   if (bo->tbo.type == ttm_bo_type_kernel)
+   list_move(>vm_status, >relocated);
+   else
+   list_move(>vm_status, >idle);
+
+   if (bo->preferred_domains &
+   amdgpu_mem_type_to_domain(bo->tbo.mem.mem_type))
+   return;
+
+   /*
+* we checked all the prerequisites, but it looks like this per vm bo
+* is currently evicted. add the bo to the evicted list to make sure it
+* is validated on next vm use to avoid fault.
+* */
+   list_move_tail(>vm_status, >evicted);
+   base->moved = true;
+}
+
  /**
   * amdgpu_vm_get_pd_bo - add the VM PD to a validation list
   *


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 3/3] drm/amdgpu: improve VM state machine documentation

2018-09-02 Thread Zhang, Jerry (Junwei)

On 09/02/2018 02:05 AM, Christian König wrote:

Since we have a lot of FAQ on the VM state machine try to improve the
documentation by adding functions for each state move.

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 139 +
  1 file changed, 108 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 65977e7c94dc..ce252ead2ee4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -204,6 +204,95 @@ static unsigned amdgpu_vm_bo_size(struct amdgpu_device 
*adev, unsigned level)
return AMDGPU_GPU_PAGE_ALIGN(amdgpu_vm_num_entries(adev, level) * 8);
  }

+/**
+ * amdgpu_vm_bo_evicted - vm_bo is evicted
+ *
+ * @vm_bo: vm_bo which is evicted
+ *
+ * State for PDs/PTs and per VM BOs which are not at the location they should
+ * be.
+ */
+static void amdgpu_vm_bo_evicted(struct amdgpu_vm_bo_base *vm_bo)
+{
+   struct amdgpu_vm *vm = vm_bo->vm;
+   struct amdgpu_bo *bo = vm_bo->bo;
+
+   vm_bo->moved = true;
+   if (bo->tbo.type == ttm_bo_type_kernel)
+   list_move(_bo->vm_status, >evicted);
+   else
+   list_move_tail(_bo->vm_status, >evicted);
+}
+
+/**
+ * amdgpu_vm_bo_relocated - vm_bo is reloacted
+ *
+ * @vm_bo: vm_bo which is relocated
+ *
+ * State for PDs/PTs which needs to update their parent PD.
+ */
+static void amdgpu_vm_bo_relocated(struct amdgpu_vm_bo_base *vm_bo)
+{
+   list_move(_bo->vm_status, _bo->vm->relocated);
+}
+
+/**
+ * amdgpu_vm_bo_moved - vm_bo is moved
+ *
+ * @vm_bo: vm_bo which is moved
+ *
+ * State for per VM BOs which are moved, but that change is not yet reflected
+ * in the page tables.
+ */
+static void amdgpu_vm_bo_moved(struct amdgpu_vm_bo_base *vm_bo)
+{
+   list_move(_bo->vm_status, _bo->vm->moved);
+}
+
+/**
+ * amdgpu_vm_bo_idle - vm_bo is idle
+ *
+ * @vm_bo: vm_bo which is now idle
+ *
+ * State for PDs/PTs and per VM BOs which have gone through the state machine
+ * and are now idle.
+ */
+static void amdgpu_vm_bo_idle(struct amdgpu_vm_bo_base *vm_bo)
+{
+   list_move(_bo->vm_status, _bo->vm->idle);
+   vm_bo->moved = false;
+}
+
+/**
+ * amdgpu_vm_bo_invalidated - vm_bo is invalidated
+ *
+ * @vm_bo: vm_bo which is now invalidated
+ *
+ * State for normal BOs which are invalidated and that change not yet reflected
+ * in the PTs.
+ */
+static void amdgpu_vm_bo_invalidated(struct amdgpu_vm_bo_base *vm_bo)
+{
+   spin_lock(&vm_bo->vm->invalidated_lock);
+   list_move(&vm_bo->vm_status, &vm_bo->vm->idle);


Is that a typo? move to vm->invalidated?

Apart from that, it's
Reviewed-by: Junwei Zhang 

Regards,
Jerry


+   spin_unlock(&vm_bo->vm->invalidated_lock);
+}
+
+/**
+ * amdgpu_vm_bo_done - vm_bo is done
+ *
+ * @vm_bo: vm_bo which is now done
+ *
+ * State for normal BOs which are invalidated and that change has been updated
+ * in the PTs.
+ */
+static void amdgpu_vm_bo_done(struct amdgpu_vm_bo_base *vm_bo)
+{
+   spin_lock(&vm_bo->vm->invalidated_lock);
+   list_del_init(&vm_bo->vm_status);
+   spin_unlock(&vm_bo->vm->invalidated_lock);
+}
+
  /**
   * amdgpu_vm_bo_base_init - Adds bo to the list of bos associated with the vm
   *
@@ -232,9 +321,9 @@ static void amdgpu_vm_bo_base_init(struct amdgpu_vm_bo_base 
*base,

vm->bulk_moveable = false;
if (bo->tbo.type == ttm_bo_type_kernel)
-   list_move(&base->vm_status, &vm->relocated);
+   amdgpu_vm_bo_relocated(base);
else
-   list_move(&base->vm_status, &vm->idle);
+   amdgpu_vm_bo_idle(base);

if (bo->preferred_domains &
amdgpu_mem_type_to_domain(bo->tbo.mem.mem_type))
@@ -245,8 +334,7 @@ static void amdgpu_vm_bo_base_init(struct amdgpu_vm_bo_base 
*base,
 * is currently evicted. add the bo to the evicted list to make sure it
 * is validated on next vm use to avoid fault.
 * */
-   list_move_tail(&base->vm_status, &vm->evicted);
-   base->moved = true;
+   amdgpu_vm_bo_evicted(base);
  }

  /**
@@ -342,7 +430,7 @@ int amdgpu_vm_validate_pt_bos(struct amdgpu_device *adev, 
struct amdgpu_vm *vm,
break;

if (bo->tbo.type != ttm_bo_type_kernel) {
-   list_move(&bo_base->vm_status, &vm->moved);
+   amdgpu_vm_bo_moved(bo_base);
} else {
if (vm->use_cpu_for_update)
r = amdgpu_bo_kmap(bo, NULL);
@@ -350,7 +438,7 @@ int amdgpu_vm_validate_pt_bos(struct amdgpu_device *adev, 
struct amdgpu_vm *vm,
r = amdgpu_ttm_alloc_gart(>tbo);
if (r)
break;
-   list_move(&bo_base->vm_status, &vm->relocated);
+   amdgpu_vm_bo_relocated(bo_base);
}
}

@@ -1121,8 +1209,7 @@ int 
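
For reference, the state move Jerry flags above was presumably meant to target
the new invalidated list; a minimal sketch of the intended transition (not the
committed hunk):

static void amdgpu_vm_bo_invalidated(struct amdgpu_vm_bo_base *vm_bo)
{
	/* normal BO with a change not yet reflected in the PTs: park it on
	 * the invalidated list, which is protected by its own spinlock */
	spin_lock(&vm_bo->vm->invalidated_lock);
	list_move(&vm_bo->vm_status, &vm_bo->vm->invalidated);
	spin_unlock(&vm_bo->vm->invalidated_lock);
}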

Re: [PATCH 2/3] drm/amdgpu: separate per VM BOs from normal in the moved state

2018-09-02 Thread Zhang, Jerry (Junwei)

On 09/02/2018 02:05 AM, Christian König wrote:

Allows us to avoid taking the spinlock in more places.

Signed-off-by: Christian König 

Reviewed-by: Junwei Zhang 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 67 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |  7 +++-
  2 files changed, 38 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index a9275a99d793..65977e7c94dc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -342,9 +342,7 @@ int amdgpu_vm_validate_pt_bos(struct amdgpu_device *adev, 
struct amdgpu_vm *vm,
break;

if (bo->tbo.type != ttm_bo_type_kernel) {
-   spin_lock(&vm->moved_lock);
list_move(&bo_base->vm_status, &vm->moved);
-   spin_unlock(&vm->moved_lock);
} else {
if (vm->use_cpu_for_update)
r = amdgpu_bo_kmap(bo, NULL);
@@ -1734,10 +1732,6 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev,
amdgpu_asic_flush_hdp(adev, NULL);
}

-   spin_lock(&vm->moved_lock);
-   list_del_init(&bo_va->base.vm_status);
-   spin_unlock(&vm->moved_lock);
-
/* If the BO is not in its preferred location add it back to
 * the evicted list so that it gets validated again on the
 * next command submission.
@@ -1746,9 +1740,13 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev,
uint32_t mem_type = bo->tbo.mem.mem_type;

if (!(bo->preferred_domains & 
amdgpu_mem_type_to_domain(mem_type)))
-   list_add_tail(&bo_va->base.vm_status, &vm->evicted);
+   list_move_tail(&bo_va->base.vm_status, &vm->evicted);
else
-   list_add(&bo_va->base.vm_status, &vm->idle);
+   list_move(&bo_va->base.vm_status, &vm->idle);
+   } else {
+   spin_lock(&vm->invalidated_lock);
+   list_del_init(&bo_va->base.vm_status);
+   spin_unlock(&vm->invalidated_lock);
}

list_splice_init(&bo_va->invalids, &bo_va->valids);
@@ -1974,40 +1972,40 @@ int amdgpu_vm_handle_moved(struct amdgpu_device *adev,
   struct amdgpu_vm *vm)
  {
struct amdgpu_bo_va *bo_va, *tmp;
-   struct list_head moved;
+   struct reservation_object *resv;
bool clear;
int r;

-   INIT_LIST_HEAD(&moved);
-   spin_lock(&vm->moved_lock);
-   list_splice_init(&vm->moved, &moved);
-   spin_unlock(&vm->moved_lock);
+   list_for_each_entry_safe(bo_va, tmp, &vm->moved, base.vm_status) {
+   /* Per VM BOs never need to bo cleared in the page tables */
+   r = amdgpu_vm_bo_update(adev, bo_va, false);
+   if (r)
+   return r;
+   }

-   list_for_each_entry_safe(bo_va, tmp, &moved, base.vm_status) {
-   struct reservation_object *resv = bo_va->base.bo->tbo.resv;
+   spin_lock(&vm->invalidated_lock);
+   while (!list_empty(&vm->invalidated)) {
+   bo_va = list_first_entry(&vm->invalidated, struct amdgpu_bo_va,
+base.vm_status);
+   resv = bo_va->base.bo->tbo.resv;
+   spin_unlock(&vm->invalidated_lock);

-   /* Per VM BOs never need to bo cleared in the page tables */
-   if (resv == vm->root.base.bo->tbo.resv)
-   clear = false;
/* Try to reserve the BO to avoid clearing its ptes */
-   else if (!amdgpu_vm_debug && reservation_object_trylock(resv))
+   if (!amdgpu_vm_debug && reservation_object_trylock(resv))
clear = false;
/* Somebody else is using the BO right now */
else
clear = true;

r = amdgpu_vm_bo_update(adev, bo_va, clear);
-   if (r) {
-   spin_lock(&vm->moved_lock);
-   list_splice(&moved, &vm->moved);
-   spin_unlock(&vm->moved_lock);
+   if (r)
return r;
-   }

-   if (!clear && resv != vm->root.base.bo->tbo.resv)
+   if (!clear)
reservation_object_unlock(resv);
-
+   spin_lock(&vm->invalidated_lock);
}
+   spin_unlock(&vm->invalidated_lock);

return 0;
  }
@@ -2072,9 +2070,7 @@ static void amdgpu_vm_bo_insert_map(struct amdgpu_device 
*adev,

if (bo && bo->tbo.resv == vm->root.base.bo->tbo.resv &&
!bo_va->base.moved) {
-   spin_lock(&vm->moved_lock);
list_move(&bo_va->base.vm_status, &vm->moved);
-   spin_unlock(&vm->moved_lock);
}
trace_amdgpu_vm_bo_map(bo_va, mapping);
  }
@@ -2430,9 +2426,9 @@ void amdgpu_vm_bo_rmv(struct amdgpu_device *adev,
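
The invalidated handling above is the usual pop-under-lock pattern; condensed,
with the locking made explicit (a sketch assuming the kernel list/spinlock
primitives used in the patch):

	spin_lock(&vm->invalidated_lock);
	while (!list_empty(&vm->invalidated)) {
		bo_va = list_first_entry(&vm->invalidated,
					 struct amdgpu_bo_va, base.vm_status);
		spin_unlock(&vm->invalidated_lock);

		/* decide 'clear' by try-locking the resv, as above,
		 * then update the page tables; this may sleep */
		r = amdgpu_vm_bo_update(adev, bo_va, clear);
		if (r)
			return r;

		spin_lock(&vm->invalidated_lock);
	}
	spin_unlock(&vm->invalidated_lock);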


Re: [PATCH 1/2] drm/amdgpu: move size calculations to the front of the file again

2018-09-02 Thread Zhang, Jerry (Junwei)

On 08/31/2018 09:27 PM, Christian König wrote:

amdgpu_vm_bo_* functions should come much later.

Signed-off-by: Christian König 

Reviewed-by: Junwei Zhang 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 90 +-
  1 file changed, 45 insertions(+), 45 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index d59222fb5931..a9275a99d793 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -133,51 +133,6 @@ struct amdgpu_prt_cb {
struct dma_fence_cb cb;
  };

-/**
- * amdgpu_vm_bo_base_init - Adds bo to the list of bos associated with the vm
- *
- * @base: base structure for tracking BO usage in a VM
- * @vm: vm to which bo is to be added
- * @bo: amdgpu buffer object
- *
- * Initialize a bo_va_base structure and add it to the appropriate lists
- *
- */
-static void amdgpu_vm_bo_base_init(struct amdgpu_vm_bo_base *base,
-  struct amdgpu_vm *vm,
-  struct amdgpu_bo *bo)
-{
-   base->vm = vm;
-   base->bo = bo;
-   INIT_LIST_HEAD(&base->bo_list);
-   INIT_LIST_HEAD(&base->vm_status);
-
-   if (!bo)
-   return;
-   list_add_tail(&base->bo_list, &bo->va);
-
-   if (bo->tbo.resv != vm->root.base.bo->tbo.resv)
-   return;
-
-   vm->bulk_moveable = false;
-   if (bo->tbo.type == ttm_bo_type_kernel)
-   list_move(&base->vm_status, &vm->relocated);
-   else
-   list_move(&base->vm_status, &vm->idle);
-
-   if (bo->preferred_domains &
-   amdgpu_mem_type_to_domain(bo->tbo.mem.mem_type))
-   return;
-
-   /*
-* we checked all the prerequisites, but it looks like this per vm bo
-* is currently evicted. add the bo to the evicted list to make sure it
-* is validated on next vm use to avoid fault.
-* */
-   list_move_tail(&base->vm_status, &vm->evicted);
-   base->moved = true;
-}
-
  /**
   * amdgpu_vm_level_shift - return the addr shift for each level
   *
@@ -249,6 +204,51 @@ static unsigned amdgpu_vm_bo_size(struct amdgpu_device 
*adev, unsigned level)
return AMDGPU_GPU_PAGE_ALIGN(amdgpu_vm_num_entries(adev, level) * 8);
  }

+/**
+ * amdgpu_vm_bo_base_init - Adds bo to the list of bos associated with the vm
+ *
+ * @base: base structure for tracking BO usage in a VM
+ * @vm: vm to which bo is to be added
+ * @bo: amdgpu buffer object
+ *
+ * Initialize a bo_va_base structure and add it to the appropriate lists
+ *
+ */
+static void amdgpu_vm_bo_base_init(struct amdgpu_vm_bo_base *base,
+  struct amdgpu_vm *vm,
+  struct amdgpu_bo *bo)
+{
+   base->vm = vm;
+   base->bo = bo;
+   INIT_LIST_HEAD(&base->bo_list);
+   INIT_LIST_HEAD(&base->vm_status);
+
+   if (!bo)
+   return;
+   list_add_tail(&base->bo_list, &bo->va);
+
+   if (bo->tbo.resv != vm->root.base.bo->tbo.resv)
+   return;
+
+   vm->bulk_moveable = false;
+   if (bo->tbo.type == ttm_bo_type_kernel)
+   list_move(&base->vm_status, &vm->relocated);
+   else
+   list_move(&base->vm_status, &vm->idle);
+
+   if (bo->preferred_domains &
+   amdgpu_mem_type_to_domain(bo->tbo.mem.mem_type))
+   return;
+
+   /*
+* we checked all the prerequisites, but it looks like this per vm bo
+* is currently evicted. add the bo to the evicted list to make sure it
+* is validated on next vm use to avoid fault.
+* */
+   list_move_tail(&base->vm_status, &vm->evicted);
+   base->moved = true;
+}
+
  /**
   * amdgpu_vm_get_pd_bo - add the VM PD to a validation list
   *




Re: [PATCH 2/3] drm/amdgpu: fix "use bulk moves for efficient VM LRU handling" v2

2018-09-02 Thread Zhang, Jerry (Junwei)

On 08/31/2018 09:10 PM, Christian König wrote:

First step to fix the LRU corruption: we accidentally tried to move things
on the LRU after dropping the lock.

Signed-off-by: Christian König 

Reviewed-by: Junwei Zhang 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 5 ++---
  1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index dd734970e167..349dcc37ee64 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1237,6 +1237,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
ring = to_amdgpu_ring(entity->rq->sched);
amdgpu_ring_priority_get(ring, priority);

+   amdgpu_vm_move_to_lru_tail(p->adev, &fpriv->vm);
+
ttm_eu_fence_buffer_objects(&p->ticket, &p->validated, p->fence);
amdgpu_mn_unlock(p->mn);

@@ -1258,7 +1260,6 @@ int amdgpu_cs_ioctl(struct drm_device *dev, void *data, 
struct drm_file *filp)
union drm_amdgpu_cs *cs = data;
struct amdgpu_cs_parser parser = {};
bool reserved_buffers = false;
-   struct amdgpu_fpriv *fpriv;
int i, r;

if (!adev->accel_working)
@@ -1303,8 +1304,6 @@ int amdgpu_cs_ioctl(struct drm_device *dev, void *data, 
struct drm_file *filp)

r = amdgpu_cs_submit(&parser, cs);

-   fpriv = filp->driver_priv;
-   amdgpu_vm_move_to_lru_tail(adev, &fpriv->vm);
  out:
amdgpu_cs_parser_fini(&parser, r, reserved_buffers);
return r;




Re: [PATCH 1/3] drm/ttm: fix ttm_bo_bulk_move_helper

2018-09-02 Thread Zhang, Jerry (Junwei)

On 08/31/2018 09:10 PM, Christian König wrote:

Staring at the function for six hours, just to essentially move one line
of code.

Signed-off-by: Christian König 

Reviewed-by: Junwei Zhang 


---
  drivers/gpu/drm/ttm/ttm_bo.c | 13 -
  1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 35d53d81f486..138c98902033 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -250,15 +250,18 @@ EXPORT_SYMBOL(ttm_bo_move_to_lru_tail);
  static void ttm_bo_bulk_move_helper(struct ttm_lru_bulk_move_pos *pos,
struct list_head *lru, bool is_swap)
  {
+   struct list_head *list;
LIST_HEAD(entries);
LIST_HEAD(before);
-   struct list_head *list1, *list2;

-   list1 = is_swap ? &pos->last->swap : &pos->last->lru;
-   list2 = is_swap ? pos->first->swap.prev : pos->first->lru.prev;
+   reservation_object_assert_held(pos->last->resv);
+   list = is_swap ? &pos->last->swap : &pos->last->lru;
+   list_cut_position(&entries, lru, list);
+
+   reservation_object_assert_held(pos->first->resv);
+   list = is_swap ? pos->first->swap.prev : pos->first->lru.prev;
+   list_cut_position(&before, &entries, list);

-   list_cut_position(&entries, lru, list1);
-   list_cut_position(&before, &entries, list2);
list_splice(&before, lru);
list_splice_tail(&entries, lru);
  }
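
In case the double list_cut_position() is hard to follow: the helper isolates
the range [first, last] from the LRU and re-adds it at the tail, keeping
everything else in order. An annotated sketch (assuming <linux/list.h>, with
the is_swap selection dropped for brevity):

static void bulk_move_range_to_tail(struct list_head *lru,
				    struct list_head *first,
				    struct list_head *last)
{
	LIST_HEAD(entries);
	LIST_HEAD(before);

	/* entries = [head .. last], lru keeps everything after last */
	list_cut_position(&entries, lru, last);
	/* before = [head .. first-1], entries shrinks to [first .. last] */
	list_cut_position(&before, &entries, first->prev);

	list_splice(&before, lru);	 /* untouched prefix back to the front */
	list_splice_tail(&entries, lru); /* the bulk range goes to the tail */
}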




Re: [PATCH libdrm] amdgpu: When couldn't find bo, need to return error.

2018-09-02 Thread Zhang, Jerry (Junwei)

On 09/01/2018 04:58 PM, Deng, Emily wrote:

Ok, then just ignore this patch. But it seems I didn't see the patch on branch 
amd-staging-hybrid-master20180315.


Thanks to take care of this as well.

I'm waiting for some verification, and have now pushed the patch to the internal staging branch;
mainline will be pushed later after another verification.

As for the master branch, that needs help from someone with the correct permissions.

Regards,
Jerry


Best wishes
Emily Deng


-Original Message-
From: Christian König 
Sent: Saturday, September 1, 2018 4:17 PM
To: Deng, Emily ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH libdrm] amdgpu: When couldn't find bo, need to return
error.

Am 01.09.2018 um 06:24 schrieb Emily Deng:

The startx will have a segmentation fault if success is returned.

SWDEV-163962

Change-Id: I56b189fa26efdcd1d96e5100af3f3e0b1208b0c3
Signed-off-by: Emily Deng 


Jerry already send a much better patch for this.


---
   amdgpu/amdgpu_bo.c | 4 +++-
   1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/amdgpu/amdgpu_bo.c b/amdgpu/amdgpu_bo.c index
f25cacc..7e297fa 100644
--- a/amdgpu/amdgpu_bo.c
+++ b/amdgpu/amdgpu_bo.c
@@ -760,6 +760,7 @@ int

amdgpu_find_bo_by_cpu_mapping(amdgpu_device_handle dev,

  uint64_t *offset_in_bo)
   {
uint32_t i;
+   int r = 0;
struct amdgpu_bo *bo;

if (cpu == NULL || size == 0)
@@ -787,10 +788,11 @@ int

amdgpu_find_bo_by_cpu_mapping(amdgpu_device_handle dev,

} else {
*buf_handle = NULL;
*offset_in_bo = 0;
+   r = -errno;


errno doesn't contain any error in this case.


}
pthread_mutex_unlock(&dev->bo_table_mutex);

-   return 0;
+   return r;
   }

   int amdgpu_create_bo_from_user_mem(amdgpu_device_handle dev,




Re: [PATCH 1/4] drm/amdgpu/gmc9: rework stolen vga memory handling

2018-08-30 Thread Zhang, Jerry (Junwei)

On 08/30/2018 10:53 PM, Alex Deucher wrote:

No functional change, just rework it in order to adjust the
behavior on a per asic level.  The problem is that on vega10,
something corrupts the lower 8 MB of vram on the second
resume from S3.  This does not seem to affect Raven, other
gmc9 based asics need testing.

Signed-off-by: Alex Deucher 


Reviewed-by: Junwei Zhang 


---
  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 48 +--
  1 file changed, 29 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index 04d50893a6f2..46cff7d8b375 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -692,6 +692,28 @@ static int gmc_v9_0_ecc_available(struct amdgpu_device 
*adev)
return lost_sheep == 0;
  }

+static bool gmc_v9_0_keep_stolen_memory(struct amdgpu_device *adev)
+{
+
+   /*
+* TODO:
+* Currently there is a bug where some memory client outside
+* of the driver writes to first 8M of VRAM on S3 resume,
+* this overrides GART which by default gets placed in first 8M and
+* causes VM_FAULTS once GTT is accessed.
+* Keep the stolen memory reservation until the while this is not 
solved.
+* Also check code in gmc_v9_0_get_vbios_fb_size and gmc_v9_0_late_init
+*/
+   switch (adev->asic_type) {
+   case CHIP_RAVEN:
+   case CHIP_VEGA10:
+   case CHIP_VEGA12:
+   case CHIP_VEGA20:
+   default:
+   return true;
+   }
+}
+
  static int gmc_v9_0_late_init(void *handle)
  {
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
@@ -708,10 +730,8 @@ static int gmc_v9_0_late_init(void *handle)
unsigned i;
int r;

-   /*
-* TODO - Uncomment once GART corruption issue is fixed.
-*/
-   /* amdgpu_bo_late_init(adev); */
+   if (!gmc_v9_0_keep_stolen_memory(adev))
+   amdgpu_bo_late_init(adev);

for(i = 0; i < adev->num_rings; ++i) {
struct amdgpu_ring *ring = adev->rings[i];
@@ -848,18 +868,16 @@ static int gmc_v9_0_gart_init(struct amdgpu_device *adev)

  static unsigned gmc_v9_0_get_vbios_fb_size(struct amdgpu_device *adev)
  {
-#if 0
u32 d1vga_control = RREG32_SOC15(DCE, 0, mmD1VGA_CONTROL);
-#endif
unsigned size;

/*
 * TODO Remove once GART corruption is resolved
 * Check related code in gmc_v9_0_sw_fini
 * */
-   size = 9 * 1024 * 1024;
+   if (gmc_v9_0_keep_stolen_memory(adev))
+   return 9 * 1024 * 1024;

-#if 0
if (REG_GET_FIELD(d1vga_control, D1VGA_CONTROL, D1VGA_MODE_ENABLE)) {
size = 9 * 1024 * 1024; /* reserve 8MB for vga emulator and 1 
MB for FB */
} else {
@@ -876,6 +894,7 @@ static unsigned gmc_v9_0_get_vbios_fb_size(struct 
amdgpu_device *adev)
break;
case CHIP_VEGA10:
case CHIP_VEGA12:
+   case CHIP_VEGA20:
default:
viewport = RREG32_SOC15(DCE, 0, mmSCL0_VIEWPORT_SIZE);
size = (REG_GET_FIELD(viewport, SCL0_VIEWPORT_SIZE, 
VIEWPORT_HEIGHT) *
@@ -888,7 +907,6 @@ static unsigned gmc_v9_0_get_vbios_fb_size(struct 
amdgpu_device *adev)
if ((adev->gmc.real_vram_size - size) < (8 * 1024 * 1024))
return 0;

-#endif
return size;
  }

@@ -1000,16 +1018,8 @@ static int gmc_v9_0_sw_fini(void *handle)
amdgpu_gem_force_release(adev);
amdgpu_vm_manager_fini(adev);

-   /*
-   * TODO:
-   * Currently there is a bug where some memory client outside
-   * of the driver writes to first 8M of VRAM on S3 resume,
-   * this overrides GART which by default gets placed in first 8M and
-   * causes VM_FAULTS once GTT is accessed.
-   * Keep the stolen memory reservation until the while this is not solved.
-   * Also check code in gmc_v9_0_get_vbios_fb_size and gmc_v9_0_late_init
-   */
-   amdgpu_bo_free_kernel(&adev->stolen_vga_memory, NULL, NULL);
+   if (gmc_v9_0_keep_stolen_memory(adev))
+   amdgpu_bo_free_kernel(&adev->stolen_vga_memory, NULL, NULL);

amdgpu_gart_table_vram_free(adev);
amdgpu_bo_fini(adev);




Re: [PATCH 2/4] drm/amdgpu/gmc9: don't keep stolen memory on Raven

2018-08-30 Thread Zhang, Jerry (Junwei)

On 08/30/2018 10:53 PM, Alex Deucher wrote:

Raven does not appear to be affected by the same issue
as vega10.  Enable the full stolen memory handling on
Raven.  Reserve the appropriate size at init time to avoid
display artifacts and then free it at the end of init once
the new FB is up and running.

Bug: https://bugs.freedesktop.org/show_bug.cgi?id=106639
Signed-off-by: Alex Deucher 

Reviewed-by: Junwei Zhang 


---
  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index 46cff7d8b375..938d03593713 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -706,6 +706,7 @@ static bool gmc_v9_0_keep_stolen_memory(struct 
amdgpu_device *adev)
 */
switch (adev->asic_type) {
case CHIP_RAVEN:
+   return false;
case CHIP_VEGA10:
case CHIP_VEGA12:
case CHIP_VEGA20:




Re: [PATCH 1/6] drm/amdgpu: correctly sign extend 48bit addresses v3

2018-08-30 Thread Zhang, Jerry (Junwei)

Patch 1~5 are

Reviewed-by: Junwei Zhang 

Patch 6 is

Acked-by: Junwei Zhang 


BTW, [PATCH 4/6] drm/amdgpu: manually map the shadow BOs again

with this patch, the user cannot create a shadow bo with a gart address.
anyway, I cannot imagine that use case either.

Regards,
Jerry

On 08/30/2018 08:14 PM, Christian König wrote:

Correct sign extend the GMC addresses to 48bit.

v2: sign extending turned out easier than thought.
v3: clean up the defines and move them into amdgpu_gmc.h as well

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c |  2 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c |  2 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c| 10 -
  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h| 26 ++
  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c|  8 +++
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |  2 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c   |  6 ++---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c |  7 +++---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 13 ---
  9 files changed, 44 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 8c652ecc4f9a..bc5ccfca68c5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -135,7 +135,7 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
.num_queue_per_pipe = adev->gfx.mec.num_queue_per_pipe,
.gpuvm_size = min(adev->vm_manager.max_pfn
  << AMDGPU_GPU_PAGE_SHIFT,
- AMDGPU_VA_HOLE_START),
+ AMDGPU_GMC_HOLE_START),
.drm_render_minor = adev->ddev->render->index
};

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index dd734970e167..ef2bfc04b41c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -835,7 +835,7 @@ static int amdgpu_cs_vm_handling(struct amdgpu_cs_parser *p)
if (chunk->chunk_id != AMDGPU_CHUNK_ID_IB)
continue;

-   va_start = chunk_ib->va_start & AMDGPU_VA_HOLE_MASK;
+   va_start = chunk_ib->va_start & AMDGPU_GMC_HOLE_MASK;
r = amdgpu_cs_find_mapping(p, va_start, &aobj, &m);
if (r) {
DRM_ERROR("IB va_start is invalid\n");
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index 71792d820ae0..d30a0838851b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -572,16 +572,16 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void 
*data,
return -EINVAL;
}

-   if (args->va_address >= AMDGPU_VA_HOLE_START &&
-   args->va_address < AMDGPU_VA_HOLE_END) {
+   if (args->va_address >= AMDGPU_GMC_HOLE_START &&
+   args->va_address < AMDGPU_GMC_HOLE_END) {
dev_dbg(&dev->pdev->dev,
"va_address 0x%LX is in VA hole 0x%LX-0x%LX\n",
-   args->va_address, AMDGPU_VA_HOLE_START,
-   AMDGPU_VA_HOLE_END);
+   args->va_address, AMDGPU_GMC_HOLE_START,
+   AMDGPU_GMC_HOLE_END);
return -EINVAL;
}

-   args->va_address &= AMDGPU_VA_HOLE_MASK;
+   args->va_address &= AMDGPU_GMC_HOLE_MASK;

if ((args->flags & ~valid_flags) && (args->flags & ~prt_flags)) {
dev_dbg(>pdev->dev, "invalid flags combination 0x%08X\n",
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
index 72fcc9338f5e..48715dd5808a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
@@ -30,6 +30,19 @@

  #include "amdgpu_irq.h"

+/* VA hole for 48bit addresses on Vega10 */
+#define AMDGPU_GMC_HOLE_START  0x0000800000000000ULL
+#define AMDGPU_GMC_HOLE_END    0xffff800000000000ULL
+
+/*
+ * Hardware is programmed as if the hole doesn't exists with start and end
+ * address values.
+ *
+ * This mask is used to remove the upper 16bits of the VA and so come up with
+ * the linear addr value.
+ */
+#define AMDGPU_GMC_HOLE_MASK   0x0000ffffffffffffULL
+
  struct firmware;

  /*
@@ -131,6 +144,19 @@ static inline bool amdgpu_gmc_vram_full_visible(struct 
amdgpu_gmc *gmc)
return (gmc->real_vram_size == gmc->visible_vram_size);
  }

+/**
+ * amdgpu_gmc_sign_extend - sign extend the given gmc address
+ *
+ * @addr: address to extend
+ */
+static inline uint64_t amdgpu_gmc_sign_extend(uint64_t addr)
+{
+   if (addr >= AMDGPU_GMC_HOLE_START)
+   addr |= AMDGPU_GMC_HOLE_END;
+
+   return addr;
+}
+
  void 

Re: [PATCH 4/7] drm/amdgpu: use the AGP aperture for system memory access v2

2018-08-30 Thread Zhang, Jerry (Junwei)

On 08/30/2018 08:15 PM, Christian König wrote:

Am 30.08.2018 um 05:20 schrieb Zhang, Jerry (Junwei):

On 08/29/2018 10:08 PM, Christian König wrote:

Start to use the old AGP aperture for system memory access.

v2: Move that to amdgpu_ttm_alloc_gart

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 23 ++
  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h |  1 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 58 ++---
  3 files changed, 57 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index 1d201fd3f4af..65aee57b35fe 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -79,6 +79,29 @@ uint64_t amdgpu_gmc_pd_addr(struct amdgpu_bo *bo)
  return pd_addr;
  }

+/**
+ * amdgpu_gmc_agp_addr - return the address in the AGP address space
+ *
+ * @tbo: TTM BO which needs the address, must be in GTT domain
+ *
+ * Tries to figure out how to access the BO through the AGP aperture. Returns
+ * AMDGPU_BO_INVALID_OFFSET if that is not possible.
+ */
+uint64_t amdgpu_gmc_agp_addr(struct ttm_buffer_object *bo)
+{
+struct amdgpu_device *adev = amdgpu_ttm_adev(bo->bdev);
+struct ttm_dma_tt *ttm;
+
+if (bo->num_pages != 1 || bo->ttm->caching_state == tt_cached)
+return AMDGPU_BO_INVALID_OFFSET;


If the GTT bo size is 1 page, will it also be accessed through the AGP address space?


Yes, that is the idea here.

We basically can avoid GART mappings for BOs in the GTT domain which are only 
one page in size.


Thanks to explain that, got the intention.

Jerry



Christian.



Jerry

+
+ttm = container_of(bo->ttm, struct ttm_dma_tt, ttm);
+if (ttm->dma_address[0] + PAGE_SIZE >= adev->gmc.agp_size)
+return AMDGPU_BO_INVALID_OFFSET;
+
+return adev->gmc.agp_start + ttm->dma_address[0];
+}
+
  /**
   * amdgpu_gmc_vram_location - try to find VRAM location
   *
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
index c9985e7dc9e5..265ca415c64c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
@@ -163,6 +163,7 @@ static inline uint64_t amdgpu_gmc_sign_extend(uint64_t addr)
  void amdgpu_gmc_get_pde_for_bo(struct amdgpu_bo *bo, int level,
 uint64_t *addr, uint64_t *flags);
  uint64_t amdgpu_gmc_pd_addr(struct amdgpu_bo *bo);
+uint64_t amdgpu_gmc_agp_addr(struct ttm_buffer_object *bo);
  void amdgpu_gmc_vram_location(struct amdgpu_device *adev, struct amdgpu_gmc 
*mc,
u64 base);
  void amdgpu_gmc_gart_location(struct amdgpu_device *adev,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index d9f3201c9e5c..8a158ee922f7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1081,41 +1081,49 @@ int amdgpu_ttm_alloc_gart(struct ttm_buffer_object *bo)
  struct ttm_mem_reg tmp;
  struct ttm_placement placement;
  struct ttm_place placements;
-uint64_t flags;
+uint64_t addr, flags;
  int r;

  if (bo->mem.start != AMDGPU_BO_INVALID_OFFSET)
  return 0;

-/* allocate GART space */
-tmp = bo->mem;
-tmp.mm_node = NULL;
-placement.num_placement = 1;
-placement.placement = &placements;
-placement.num_busy_placement = 1;
-placement.busy_placement = &placements;
-placements.fpfn = 0;
-placements.lpfn = adev->gmc.gart_size >> PAGE_SHIFT;
-placements.flags = (bo->mem.placement & ~TTM_PL_MASK_MEM) |
-TTM_PL_FLAG_TT;
+addr = amdgpu_gmc_agp_addr(bo);
+if (addr != AMDGPU_BO_INVALID_OFFSET) {
+bo->mem.start = addr >> PAGE_SHIFT;
+} else {

-r = ttm_bo_mem_space(bo, &placement, &tmp, &ctx);
-if (unlikely(r))
-return r;
+/* allocate GART space */
+tmp = bo->mem;
+tmp.mm_node = NULL;
+placement.num_placement = 1;
+placement.placement = &placements;
+placement.num_busy_placement = 1;
+placement.busy_placement = &placements;
+placements.fpfn = 0;
+placements.lpfn = adev->gmc.gart_size >> PAGE_SHIFT;
+placements.flags = (bo->mem.placement & ~TTM_PL_MASK_MEM) |
+TTM_PL_FLAG_TT;
+
+r = ttm_bo_mem_space(bo, &placement, &tmp, &ctx);
+if (unlikely(r))
+return r;

-/* compute PTE flags for this buffer object */
-flags = amdgpu_ttm_tt_pte_flags(adev, bo->ttm, &tmp);
+/* compute PTE flags for this buffer object */
+flags = amdgpu_ttm_tt_pte_flags(adev, bo->ttm, &tmp);

-/* Bind pages */
-gtt->offset = ((u64)tmp.start << PAGE_SHIFT) - adev->gmc.gart_start;
-r = amdgpu_ttm_gart_bind(adev, bo, flags);
-if (unlikely(r)) {
-ttm_bo_mem_put(bo, &tmp);
-return r;
+/* Bind pages */
+gtt->offset = ((u64)tmp.start << PAGE_SHIFT) -
+  
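
Summarized, the new fast path in amdgpu_ttm_alloc_gart() then looks roughly
like this (a condensed sketch of the hunk above, not a further change):

	uint64_t addr = amdgpu_gmc_agp_addr(bo);

	if (addr != AMDGPU_BO_INVALID_OFFSET) {
		/* single page, uncached: directly reachable through the
		 * AGP window, no GART entry needed at all */
		bo->mem.start = addr >> PAGE_SHIFT;
	} else {
		/* fall back to a real GART allocation and bind */
	}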

Re: [PATCH] drm/amdgpu: Revert "kmap PDs/PTs in amdgpu_vm_update_directories"

2018-08-30 Thread Zhang, Jerry (Junwei)

On 08/30/2018 03:50 PM, Christian König wrote:

This reverts commit a7f91061c60ad9cac2e6a03b642be6a4f88b3662.

Felix pointed out that we need to have the BOs mapped even before
amdgpu_vm_update_directories is called.

Signed-off-by: Christian König 

Acked-by: Junwei Zhang 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 13 -
  1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 33d9ce229f4a..72f8c750e128 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -343,7 +343,10 @@ int amdgpu_vm_validate_pt_bos(struct amdgpu_device *adev, 
struct amdgpu_vm *vm,
list_move(&bo_base->vm_status, &vm->moved);
spin_unlock(&vm->moved_lock);
} else {
-   r = amdgpu_ttm_alloc_gart(&bo->tbo);
+   if (vm->use_cpu_for_update)
+   r = amdgpu_bo_kmap(bo, NULL);
+   else
+   r = amdgpu_ttm_alloc_gart(&bo->tbo);
if (r)
break;
list_move(&bo_base->vm_status, &vm->relocated);
@@ -1093,14 +1096,6 @@ int amdgpu_vm_update_directories(struct amdgpu_device 
*adev,
params.adev = adev;

if (vm->use_cpu_for_update) {
-   struct amdgpu_vm_bo_base *bo_base;
-
-   list_for_each_entry(bo_base, &vm->relocated, vm_status) {
-   r = amdgpu_bo_kmap(bo_base->bo, NULL);
-   if (unlikely(r))
-   return r;
-   }
-
r = amdgpu_vm_wait_pd(adev, vm, AMDGPU_FENCE_OWNER_VM);
if (unlikely(r))
return r;




Re: [PATCH libdrm] amdgpu: add error return value for finding bo by cpu mapping

2018-08-30 Thread Zhang, Jerry (Junwei)

On 08/30/2018 04:57 PM, Michel Dänzer wrote:

On 2018-08-30 10:50 a.m., Junwei Zhang wrote:

If nothing is found, error should be returned.

Signed-off-by: Junwei Zhang 

[...]

@@ -577,10 +578,11 @@ int amdgpu_find_bo_by_cpu_mapping(amdgpu_device_handle 
dev,
} else {
*buf_handle = NULL;
*offset_in_bo = 0;
+   r = -EINVAL;


I think -ENOENT would be better, to differentiate this error from
passing invalid pointer / size parameters.


Mmm, good point, perhaps ENXIO is better,
which is used in amdgpu ttm for address out of existing range.

#define ENOENT   2  /* No such file or directory */
#define ENXIO6  /* No such device or address */

Regards,
Jerry



With that,

Reviewed-by: Michel Dänzer 
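
For completeness, the agreed convention in a nutshell: -EINVAL for bad
parameters, -ENXIO when no BO covers the given address. A self-contained toy
example of that convention (hypothetical names, not the libdrm code):

#include <errno.h>
#include <stdint.h>
#include <stdio.h>

struct bo { uintptr_t cpu_ptr; uint64_t size; };

static int find_bo(const struct bo *table, int n, uintptr_t cpu,
		   uint64_t size, int *index, uint64_t *offset)
{
	int i;

	if (!cpu || !size)
		return -EINVAL;		/* invalid parameters */
	for (i = 0; i < n; i++) {
		if (cpu >= table[i].cpu_ptr &&
		    cpu + size <= table[i].cpu_ptr + table[i].size) {
			*index = i;
			*offset = cpu - table[i].cpu_ptr;
			return 0;
		}
	}
	*index = -1;
	*offset = 0;
	return -ENXIO;			/* no such address */
}

int main(void)
{
	struct bo table[] = { { 0x1000, 0x1000 } };
	int idx; uint64_t off;

	printf("%d\n", find_bo(table, 1, 0x1800, 0x100, &idx, &off)); /* 0 */
	printf("%d\n", find_bo(table, 1, 0x4000, 0x100, &idx, &off)); /* -6 */
	return 0;
}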





Re: [PATCH] drm/amdgpu: correctly sign extend 48bit addresses v3

2018-08-30 Thread Zhang, Jerry (Junwei)

On 08/30/2018 02:48 PM, Christian König wrote:

Am 30.08.2018 um 04:43 schrieb Zhang, Jerry (Junwei):

On 08/29/2018 05:39 PM, Christian König wrote:

Am 29.08.2018 um 04:03 schrieb Zhang, Jerry (Junwei):

On 08/28/2018 08:17 PM, Christian König wrote:

Correct sign extend the GMC addresses to 48bit.


Could you explain a bit more why to extend the sign?


The hardware works like this, in other words when bit 47 is set we must extend 
that into bits 48-63.


Thanks. fine.
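
Put concretely, a worked example of the extension (a standalone sketch; the
constants follow the defines in the patch):

#include <stdint.h>
#include <stdio.h>

#define AMDGPU_GMC_HOLE_START	0x0000800000000000ULL
#define AMDGPU_GMC_HOLE_END	0xffff800000000000ULL

static uint64_t sign_extend(uint64_t addr)
{
	/* bit 47 set -> replicate it into bits 48..63 */
	if (addr >= AMDGPU_GMC_HOLE_START)
		addr |= AMDGPU_GMC_HOLE_END;
	return addr;
}

int main(void)
{
	/* bit 47 clear: printed unchanged, 7fffffffffff */
	printf("%llx\n", (unsigned long long)sign_extend(0x00007fffffffffffULL));
	/* bit 47 set: printed as ffff800000000000 */
	printf("%llx\n", (unsigned long long)sign_extend(0x0000800000000000ULL));
	return 0;
}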




the address is uint64_t. is if failed in some case?


What do you mean?


Sorry for the typo without finishing the sentence before sending.

I mean even if the address is uint64_t, still needs to extend the sign?
what I was thinking is that the int64_t needs to do this.


Well, no. What we would need is an int48_t type, but such thing doesn't exists 
and isn't easily implementable in C.


If so, it would be better to understand.
Thanks.









> -/* VA hole for 48bit addresses on Vega10 */
> -#define AMDGPU_VA_HOLE_START 0x0000800000000000ULL
> -#define AMDGPU_VA_HOLE_END 0xffff800000000000ULL

BTW, the hole for 48bit is actually 47 bit left, any background for that?


Well bits start counting at zero. So the 48bit addresses have bits 0-47.


The VA hole is going to catch the VA address out of normal range, which for 
vega10 is 48-bit?


Yes, exactly.


if so, 0x8000_0000_0000 ULL holds from 0~46 bits, starting from 128TB, but 
vega10 VA is 256TB


Correct, the lower range is from 0x0-0x8000_0000_0000 and the higher range is 
from 0xFFFF_8000_0000_0000-0xFFFF_FFFF_FFFF_FFFF.



it also could be found in old code gmc_v9.c, the mc_mask holds 48-bits address, 
like below:

adev->gmc.mc_mask = 0xffff_ffff_ffff ULL; /* 48 bit MC */

But the VA hole start address is 0x8000_0000_0000 ULL, then libdrm gets 
virtual_address_max

dev_info.virtual_address_max = min(vm_size, AMDGPU_VA_HOLE_START)
// that's 0x8000_0000_0000 ULL actually


We limit the reported VA size for backward compatibility with old userspace 
here.


fine, got it.
Thanks.

Regards,
Jerry





Above all, it looks like the VA hole start should be 0x1_0000_0000_0000 UL.


Nope, that isn't correct. The hole is between 0x8000_0000_0000 and 
0xFFFF_8000_0000_0000.

Regards,
Christian.



Regards,
Jerry



Regards,
Christian.



Regards,
Jerry



v2: sign extending turned out easier than thought.
v3: clean up the defines and move them into amdgpu_gmc.h as well

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c |  2 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c |  2 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c| 10 -
  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h| 26 ++
  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c|  8 +++
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |  2 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c   |  6 ++---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c |  7 +++---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 13 ---
  9 files changed, 44 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 8c652ecc4f9a..bc5ccfca68c5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -135,7 +135,7 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
  .num_queue_per_pipe = adev->gfx.mec.num_queue_per_pipe,
  .gpuvm_size = min(adev->vm_manager.max_pfn
<< AMDGPU_GPU_PAGE_SHIFT,
-  AMDGPU_VA_HOLE_START),
+  AMDGPU_GMC_HOLE_START),
  .drm_render_minor = adev->ddev->render->index
  };

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index dd734970e167..ef2bfc04b41c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -835,7 +835,7 @@ static int amdgpu_cs_vm_handling(struct amdgpu_cs_parser *p)
  if (chunk->chunk_id != AMDGPU_CHUNK_ID_IB)
  continue;

-va_start = chunk_ib->va_start & AMDGPU_VA_HOLE_MASK;
+va_start = chunk_ib->va_start & AMDGPU_GMC_HOLE_MASK;
  r = amdgpu_cs_find_mapping(p, va_start, , );
  if (r) {
  DRM_ERROR("IB va_start is invalid\n");
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index 71792d820ae0..d30a0838851b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -572,16 +572,16 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void 
*data,
  return -EINVAL;
  }

-if (args->va_address >= AMDGPU_VA_HOLE_START &&
-args->va_address < AMDGPU_VA_HOLE_END) {
+if (args->va_address >= AMDGPU_GMC_HOLE_START &&
+args->

Re: When to kmap PT BOs?

2018-08-29 Thread Zhang, Jerry (Junwei)

On 08/30/2018 06:30 AM, Felix Kuehling wrote:

Hi,

Currently PT BOs are kmapped in amdgpu_vm_update_directories. That
means, to avoid kernel oopses after page table evictions, I need to call
amdgpu_vm_update_directories before calling amdgpu_vm_bo_update.

But amdgpu_vm_bo_update can also move PTs on the vm->relocated list
during huge page handling. That means I also need to call
amdgpu_vm_update_directories after amdgpu_vm_bo_update.


Not very familiar with huge page handling.

But from code, maybe we can kmap the PTE entry right here.
Then it will update current non-huge page PTE later in amdgpu_vm_update_ptes().

Regards,
Jerry



I think a better solution is to move kmapping out of
amdgpu_vm_update_directories. But I'm not sure what's the right place
for it. Any suggestions? For a quick fix for kernel oopses after page
table evictions in the ROCm 1.9 release I'll call
amdgpu_vm_update_directories twice. If there are no new entries on the
vm->relocated lists, the second call won't add much overhead anyway.

Thanks,
   Felix
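
A sketch of the stop-gap described above (hypothetical call sequence, not
committed code):

	/* pick up PDs/PTs already on vm->relocated after eviction */
	r = amdgpu_vm_update_directories(adev, vm);
	if (r)
		return r;

	r = amdgpu_vm_bo_update(adev, bo_va, clear);
	if (r)
		return r;

	/* huge page handling may have re-added PTs to vm->relocated;
	 * run the update again, cheap if the list stayed empty */
	r = amdgpu_vm_update_directories(adev, vm);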




Re: [PATCH 6/7] drm/amdgpu: enable AGP aperture for GMC9

2018-08-29 Thread Zhang, Jerry (Junwei)

On 08/29/2018 10:08 PM, Christian König wrote:

Enable the old AGP aperture to avoid GART mappings.

Signed-off-by: Christian König 

Reviewed-by: Junwei Zhang 


---
  drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c | 10 +-
  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c|  1 +
  drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c  | 10 +-
  3 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c
index 3403ded39d13..ffd0ec9586d1 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c
@@ -65,16 +65,16 @@ static void gfxhub_v1_0_init_system_aperture_regs(struct 
amdgpu_device *adev)
  {
uint64_t value;

-   /* Disable AGP. */
+   /* Program the AGP BAR */
WREG32_SOC15(GC, 0, mmMC_VM_AGP_BASE, 0);
-   WREG32_SOC15(GC, 0, mmMC_VM_AGP_TOP, 0);
-   WREG32_SOC15(GC, 0, mmMC_VM_AGP_BOT, 0x);
+   WREG32_SOC15(GC, 0, mmMC_VM_AGP_BOT, adev->gmc.agp_start >> 24);
+   WREG32_SOC15(GC, 0, mmMC_VM_AGP_TOP, adev->gmc.agp_end >> 24);

/* Program the system aperture low logical page number. */
WREG32_SOC15(GC, 0, mmMC_VM_SYSTEM_APERTURE_LOW_ADDR,
-adev->gmc.vram_start >> 18);
+min(adev->gmc.vram_start, adev->gmc.agp_start) >> 18);
WREG32_SOC15(GC, 0, mmMC_VM_SYSTEM_APERTURE_HIGH_ADDR,
-adev->gmc.vram_end >> 18);
+max(adev->gmc.vram_end, adev->gmc.agp_end) >> 18);

/* Set default page address. */
value = adev->vram_scratch.gpu_addr - adev->gmc.vram_start
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index 04d50893a6f2..719f45cdaf6a 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -751,6 +751,7 @@ static void gmc_v9_0_vram_gtt_location(struct amdgpu_device 
*adev,
base = mmhub_v1_0_get_fb_location(adev);
amdgpu_gmc_vram_location(adev, &adev->gmc, base);
amdgpu_gmc_gart_location(adev, mc);
+   amdgpu_gmc_agp_location(adev, mc);
/* base offset of vram pages */
adev->vm_manager.vram_base_offset = gfxhub_v1_0_get_mc_fb_offset(adev);
  }
diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c 
b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c
index 5f6a9c85488f..73d7c075dd33 100644
--- a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c
@@ -76,16 +76,16 @@ static void mmhub_v1_0_init_system_aperture_regs(struct 
amdgpu_device *adev)
uint64_t value;
uint32_t tmp;

-   /* Disable AGP. */
+   /* Program the AGP BAR */
WREG32_SOC15(MMHUB, 0, mmMC_VM_AGP_BASE, 0);
-   WREG32_SOC15(MMHUB, 0, mmMC_VM_AGP_TOP, 0);
-   WREG32_SOC15(MMHUB, 0, mmMC_VM_AGP_BOT, 0x00FF);
+   WREG32_SOC15(MMHUB, 0, mmMC_VM_AGP_BOT, adev->gmc.agp_start >> 24);
+   WREG32_SOC15(MMHUB, 0, mmMC_VM_AGP_TOP, adev->gmc.agp_end >> 24);

/* Program the system aperture low logical page number. */
WREG32_SOC15(MMHUB, 0, mmMC_VM_SYSTEM_APERTURE_LOW_ADDR,
-adev->gmc.vram_start >> 18);
+min(adev->gmc.vram_start, adev->gmc.agp_start) >> 18);
WREG32_SOC15(MMHUB, 0, mmMC_VM_SYSTEM_APERTURE_HIGH_ADDR,
-adev->gmc.vram_end >> 18);
+max(adev->gmc.vram_end, adev->gmc.agp_end) >> 18);

/* Set default page address. */
value = adev->vram_scratch.gpu_addr - adev->gmc.vram_start +




Re: [PATCH 5/7] drm/amdgpu: manually map the shadow BOs again

2018-08-29 Thread Zhang, Jerry (Junwei)

On 08/29/2018 10:08 PM, Christian König wrote:

Otherwise we won't be able to use the AGP aperture.


do you mean we use AGP for GTT shadow only now?

Jerry


Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 5 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 5 +
  2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 0cbf651a88a6..de990bdcdd6c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -163,10 +163,7 @@ void amdgpu_bo_placement_from_domain(struct amdgpu_bo 
*abo, u32 domain)

if (domain & AMDGPU_GEM_DOMAIN_GTT) {
places[c].fpfn = 0;
-   if (flags & AMDGPU_GEM_CREATE_SHADOW)
-   places[c].lpfn = adev->gmc.gart_size >> PAGE_SHIFT;
-   else
-   places[c].lpfn = 0;
+   places[c].lpfn = 0;
places[c].flags = TTM_PL_FLAG_TT;
if (flags & AMDGPU_GEM_CREATE_CPU_GTT_USWC)
places[c].flags |= TTM_PL_FLAG_WC |
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index a3675c7b6190..abe1db4c63f9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -346,6 +346,11 @@ int amdgpu_vm_validate_pt_bos(struct amdgpu_device *adev, 
struct amdgpu_vm *vm,
r = amdgpu_ttm_alloc_gart(&bo->tbo);
if (r)
break;
+   if (bo->shadow) {
+   r = amdgpu_ttm_alloc_gart(&bo->shadow->tbo);
+   if (r)
+   break;
+   }
list_move(_base->vm_status, >relocated);
}
}




Re: [PATCH 4/7] drm/amdgpu: use the AGP aperture for system memory access v2

2018-08-29 Thread Zhang, Jerry (Junwei)

On 08/29/2018 10:08 PM, Christian König wrote:

Start to use the old AGP aperture for system memory access.

v2: Move that to amdgpu_ttm_alloc_gart

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 23 ++
  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h |  1 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 58 ++---
  3 files changed, 57 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index 1d201fd3f4af..65aee57b35fe 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -79,6 +79,29 @@ uint64_t amdgpu_gmc_pd_addr(struct amdgpu_bo *bo)
return pd_addr;
  }

+/**
+ * amdgpu_gmc_agp_addr - return the address in the AGP address space
+ *
+ * @tbo: TTM BO which needs the address, must be in GTT domain
+ *
+ * Tries to figure out how to access the BO through the AGP aperture. Returns
+ * AMDGPU_BO_INVALID_OFFSET if that is not possible.
+ */
+uint64_t amdgpu_gmc_agp_addr(struct ttm_buffer_object *bo)
+{
+   struct amdgpu_device *adev = amdgpu_ttm_adev(bo->bdev);
+   struct ttm_dma_tt *ttm;
+
+   if (bo->num_pages != 1 || bo->ttm->caching_state == tt_cached)
+   return AMDGPU_BO_INVALID_OFFSET;


If the GTT bo size is 1 page, will it also be accessed through the AGP address space?

Jerry

+
+   ttm = container_of(bo->ttm, struct ttm_dma_tt, ttm);
+   if (ttm->dma_address[0] + PAGE_SIZE >= adev->gmc.agp_size)
+   return AMDGPU_BO_INVALID_OFFSET;
+
+   return adev->gmc.agp_start + ttm->dma_address[0];
+}
+
  /**
   * amdgpu_gmc_vram_location - try to find VRAM location
   *
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
index c9985e7dc9e5..265ca415c64c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
@@ -163,6 +163,7 @@ static inline uint64_t amdgpu_gmc_sign_extend(uint64_t addr)
  void amdgpu_gmc_get_pde_for_bo(struct amdgpu_bo *bo, int level,
   uint64_t *addr, uint64_t *flags);
  uint64_t amdgpu_gmc_pd_addr(struct amdgpu_bo *bo);
+uint64_t amdgpu_gmc_agp_addr(struct ttm_buffer_object *bo);
  void amdgpu_gmc_vram_location(struct amdgpu_device *adev, struct amdgpu_gmc 
*mc,
  u64 base);
  void amdgpu_gmc_gart_location(struct amdgpu_device *adev,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index d9f3201c9e5c..8a158ee922f7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1081,41 +1081,49 @@ int amdgpu_ttm_alloc_gart(struct ttm_buffer_object *bo)
struct ttm_mem_reg tmp;
struct ttm_placement placement;
struct ttm_place placements;
-   uint64_t flags;
+   uint64_t addr, flags;
int r;

if (bo->mem.start != AMDGPU_BO_INVALID_OFFSET)
return 0;

-   /* allocate GART space */
-   tmp = bo->mem;
-   tmp.mm_node = NULL;
-   placement.num_placement = 1;
-   placement.placement = &placements;
-   placement.num_busy_placement = 1;
-   placement.busy_placement = &placements;
-   placements.fpfn = 0;
-   placements.lpfn = adev->gmc.gart_size >> PAGE_SHIFT;
-   placements.flags = (bo->mem.placement & ~TTM_PL_MASK_MEM) |
-   TTM_PL_FLAG_TT;
+   addr = amdgpu_gmc_agp_addr(bo);
+   if (addr != AMDGPU_BO_INVALID_OFFSET) {
+   bo->mem.start = addr >> PAGE_SHIFT;
+   } else {

-   r = ttm_bo_mem_space(bo, &placement, &tmp, &ctx);
-   if (unlikely(r))
-   return r;
+   /* allocate GART space */
+   tmp = bo->mem;
+   tmp.mm_node = NULL;
+   placement.num_placement = 1;
+   placement.placement = &placements;
+   placement.num_busy_placement = 1;
+   placement.busy_placement = &placements;
+   placements.fpfn = 0;
+   placements.lpfn = adev->gmc.gart_size >> PAGE_SHIFT;
+   placements.flags = (bo->mem.placement & ~TTM_PL_MASK_MEM) |
+   TTM_PL_FLAG_TT;
+
+   r = ttm_bo_mem_space(bo, &placement, &tmp, &ctx);
+   if (unlikely(r))
+   return r;

-   /* compute PTE flags for this buffer object */
-   flags = amdgpu_ttm_tt_pte_flags(adev, bo->ttm, &tmp);
+   /* compute PTE flags for this buffer object */
+   flags = amdgpu_ttm_tt_pte_flags(adev, bo->ttm, &tmp);

-   /* Bind pages */
-   gtt->offset = ((u64)tmp.start << PAGE_SHIFT) - adev->gmc.gart_start;
-   r = amdgpu_ttm_gart_bind(adev, bo, flags);
-   if (unlikely(r)) {
-   ttm_bo_mem_put(bo, &tmp);
-   return r;
+   /* Bind pages */
+   gtt->offset = ((u64)tmp.start << PAGE_SHIFT) -
+   adev->gmc.gart_start;
+   r = amdgpu_ttm_gart_bind(adev, bo, flags);
+  

Re: [PATCH 3/7] drm/amdgpu: add amdgpu_gmc_agp_location v2

2018-08-29 Thread Zhang, Jerry (Junwei)

On 08/29/2018 10:08 PM, Christian König wrote:

Helper to figure out the location of the AGP BAR.

v2: fix a couple of bugs

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 43 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h |  5 +++
  2 files changed, 48 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index c6bcc4715373..1d201fd3f4af 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -143,3 +143,46 @@ void amdgpu_gmc_gart_location(struct amdgpu_device *adev, 
struct amdgpu_gmc *mc)
dev_info(adev->dev, "GART: %lluM 0x%016llX - 0x%016llX\n",
mc->gart_size >> 20, mc->gart_start, mc->gart_end);
  }
+
+/**
+ * amdgpu_gmc_agp_location - try to find AGP location
+ * @adev: amdgpu device structure holding all necessary informations
+ * @mc: memory controller structure holding memory informations
+ *
+ * Function will place try to find a place for the AGP BAR in the MC address
+ * space.
+ *
+ * AGP BAR will be assigned the largest available hole in the address space.
+ * Should be called after VRAM and GART locations are setup.
+ */
+void amdgpu_gmc_agp_location(struct amdgpu_device *adev, struct amdgpu_gmc *mc)
+{
+   const uint64_t sixteen_gb = 1ULL << 34;
+   const uint64_t sixteen_gb_mask = ~(sixteen_gb - 1);
+   u64 size_af, size_bf;
+
+   if (mc->vram_start > mc->gart_start) {
+   size_bf = (mc->vram_start & sixteen_gb_mask) -
+   ALIGN(mc->gart_end + 1, sixteen_gb);
+   size_af = mc->mc_mask + 1 - ALIGN(mc->vram_end, sixteen_gb);
+   } else {
+   size_bf = mc->vram_start & sixteen_gb_mask;
+   size_af = (mc->gart_start & sixteen_gb_mask) -
+   ALIGN(mc->vram_end, sixteen_gb);


we may need ALIGN(mc->vram_end + 1, sixteen_gb) for size_af.
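
For example, when vram_end is itself 16GB aligned the missing +1 makes the AGP
window start on top of the last VRAM byte (a standalone sketch, ALIGN as in
the kernel):

#include <stdint.h>
#include <stdio.h>

#define ALIGN(x, a)	(((x) + (a) - 1) & ~((uint64_t)(a) - 1))

int main(void)
{
	const uint64_t sixteen_gb = 1ULL << 34;
	uint64_t vram_end = sixteen_gb;	/* inclusive end, 16GB aligned */

	/* 400000000 - overlaps the last VRAM byte */
	printf("%llx\n", (unsigned long long)ALIGN(vram_end, sixteen_gb));
	/* 800000000 - first free 16GB boundary */
	printf("%llx\n", (unsigned long long)ALIGN(vram_end + 1, sixteen_gb));
	return 0;
}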


+   }
+
+   if (size_bf > size_af) {
+   mc->agp_start = mc->vram_start > mc->gart_start ?
+   mc->gart_end + 1 : 0;
+   mc->agp_size = size_bf;
+   } else {
+   mc->agp_start = (mc->vram_start > mc->gart_start ?
+   mc->vram_end : mc->gart_end) + 1,
+   mc->agp_size = size_af;
+   }
+
+   mc->agp_start = ALIGN(mc->agp_start, sixteen_gb);
+   mc->agp_end = mc->agp_start + mc->agp_size - 1;
+   dev_info(adev->dev, "AGP: %lluM 0x%016llX - 0x%016llX\n",
+   mc->agp_size >> 20, mc->agp_start, mc->agp_end);
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
index 48715dd5808a..c9985e7dc9e5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
@@ -94,6 +94,9 @@ struct amdgpu_gmc {
 * about vram size near mc fb location */
u64 mc_vram_size;
u64 visible_vram_size;
+   u64 agp_size;
+   u64 agp_start;
+   u64 agp_end;
u64 gart_size;
u64 gart_start;
u64 gart_end;
@@ -164,5 +167,7 @@ void amdgpu_gmc_vram_location(struct amdgpu_device *adev, 
struct amdgpu_gmc *mc,
  u64 base);
  void amdgpu_gmc_gart_location(struct amdgpu_device *adev,
  struct amdgpu_gmc *mc);
+void amdgpu_gmc_agp_location(struct amdgpu_device *adev,
+struct amdgpu_gmc *mc);

  #endif




Re: [PATCH 2/7] drm/amdgpu: put GART away from VRAM v2

2018-08-29 Thread Zhang, Jerry (Junwei)

On 08/29/2018 10:08 PM, Christian König wrote:

Always try to put the GART away from where VRAM is.

v2: correctly handle the 4GB limitation

Signed-off-by: Christian König 


Fix my concern :)

Reviewed-by: Junwei Zhang 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 8 +---
  1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index 265ec6807130..c6bcc4715373 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -116,6 +116,7 @@ void amdgpu_gmc_vram_location(struct amdgpu_device *adev, 
struct amdgpu_gmc *mc,
   */
  void amdgpu_gmc_gart_location(struct amdgpu_device *adev, struct amdgpu_gmc 
*mc)
  {
+   const uint64_t four_gb = 0x100000000ULL;
u64 size_af, size_bf;

mc->gart_size += adev->pm.smu_prv_buffer_size;
@@ -124,8 +125,7 @@ void amdgpu_gmc_gart_location(struct amdgpu_device *adev, 
struct amdgpu_gmc *mc)
 * the GART base on a 4GB boundary as well.
 */
size_bf = mc->vram_start;
-   size_af = adev->gmc.mc_mask + 1 -
-   ALIGN(mc->vram_end + 1, 0x100000000ULL);
+   size_af = adev->gmc.mc_mask + 1 - ALIGN(mc->vram_end + 1, four_gb);

if (mc->gart_size > max(size_bf, size_af)) {
dev_warn(adev->dev, "limiting GART\n");
@@ -136,7 +136,9 @@ void amdgpu_gmc_gart_location(struct amdgpu_device *adev, 
struct amdgpu_gmc *mc)
(size_af < mc->gart_size))
mc->gart_start = 0;
else
-   mc->gart_start = ALIGN(mc->vram_end + 1, 0x100000000ULL);
+   mc->gart_start = mc->mc_mask - mc->gart_size + 1;
+
+   mc->gart_start &= ~(four_gb - 1);
mc->gart_end = mc->gart_start + mc->gart_size - 1;
dev_info(adev->dev, "GART: %lluM 0x%016llX - 0x%016llX\n",
mc->gart_size >> 20, mc->gart_start, mc->gart_end);
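
To make the placement concrete, a standalone sketch of the resulting numbers
for a typical 48-bit setup with VRAM at the bottom (values are illustrative):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	const uint64_t four_gb = 0x100000000ULL;
	uint64_t mc_mask   = (1ULL << 48) - 1;	/* 48 bit MC address space */
	uint64_t gart_size = 512ULL << 20;	/* 512MB of GART */
	uint64_t gart_start, gart_end;

	/* put GART at the very top, aligned down to a 4GB boundary */
	gart_start = mc_mask - gart_size + 1;
	gart_start &= ~(four_gb - 1);
	gart_end = gart_start + gart_size - 1;

	/* prints: GART: 0xffff00000000 - 0xffff1fffffff */
	printf("GART: 0x%012llx - 0x%012llx\n",
	       (unsigned long long)gart_start,
	       (unsigned long long)gart_end);
	return 0;
}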




Re: [PATCH] drm/amdgpu: remove redundant memset

2018-08-29 Thread Zhang, Jerry (Junwei)

On 08/29/2018 11:17 PM, Philip Yang wrote:

kvmalloc_array uses __GFP_ZERO flag ensures that the returned address
is zeroed already, memset it to zero again afterwards is unnecessary,
and in this case buggy because we only clear the first entry.

Change-Id: If94a59d3cbf2690dd2a1e2add71bc393df6a9686
Signed-off-by: Philip Yang 


Good catch.

Reviewed-by: Junwei Zhang 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 1 -
  1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 153c9be..33d9ce2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -540,7 +540,6 @@ static int amdgpu_vm_alloc_levels(struct amdgpu_device 
*adev,
   GFP_KERNEL | __GFP_ZERO);
if (!parent->entries)
return -ENOMEM;
-   memset(parent->entries, 0 , sizeof(struct amdgpu_vm_pt));
}

from = saddr >> shift;




Re: [PATCH] drm/amdgpu: Need to set moved to true when evict bo

2018-08-29 Thread Zhang, Jerry (Junwei)

On 08/29/2018 04:53 PM, Christian König wrote:

Am 29.08.2018 um 04:52 schrieb Zhang, Jerry (Junwei):

On 08/28/2018 08:40 PM, Emily Deng wrote:

Fix the VMC page fault when the running sequence is as below:
1.amdgpu_gem_create_ioctl
2.ttm_bo_swapout->amdgpu_vm_bo_invalidate: as amdgpu_vm_bo_base_init was not
called, list_add_tail(&base->bo_list, &bo->va) was never called either. Even
if the bo was evicted, it won't set the bo_base->moved.


IMO, the evicted bo should be created previously.
On BO creation we will add it to the bo->va as below:

amdgpu_gem_create_ioctl
  drm_gem_handle_create
amdgpu_gem_object_open


And here is the problem. Between creating the BO and opening it in the client 
the BO can be evicted.


Thanks to explain that.

It's the key point, falling under Murphy's law as well.

Jerry



That's what Emily's patch is handling here.

Christian.


amdgpu_vm_bo_add
amdgpu_vm_bo_base_init
  list_add_tail(&base->bo_list, &bo->va)

Then it could be set moved in bo invalidate when evicting.

could you provide a bit more background about the issue?
looks like a per vm bo is evicted and the same bo created again.

Jerry


3.drm_gem_open_ioctl->amdgpu_vm_bo_base_init, here only called
list_move_tail(&base->vm_status, &vm->evicted), but did not set the
bo_base->moved.
4.amdgpu_vm_bo_map->amdgpu_vm_bo_insert_map, as the bo_base->moved is
not set true, the function amdgpu_vm_bo_insert_map will call
list_move(&bo_va->base.vm_status, &vm->moved)
5.amdgpu_cs_ioctl won't validate the swapout bo, as it is only in the
moved list, not in the evict list. So VMC page fault occurs.

Signed-off-by: Emily Deng 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 1f4b8df..015e20e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -172,6 +172,7 @@ static void amdgpu_vm_bo_base_init(struct amdgpu_vm_bo_base 
*base,
   * is validated on next vm use to avoid fault.
   * */
  list_move_tail(&base->vm_status, &vm->evicted);
+base->moved = true;
  }

  /**
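
Condensed, the interleaving that triggers the fault (a sketch of the race,
not driver code):

/* CPU A (userspace ioctls)            CPU B (TTM eviction)
 *
 * amdgpu_gem_create_ioctl()
 *   BO allocated, not yet linked
 *   into any VM
 *                                     ttm_bo_swapout()
 *                                       amdgpu_vm_bo_invalidate()
 *                                         no vm_bo_base exists yet,
 *                                         so "moved" is never set
 * drm_gem_open_ioctl()
 *   amdgpu_vm_bo_base_init()
 *     BO lands on vm->evicted, but
 *     base->moved stays false
 * amdgpu_vm_bo_map()
 *   sees !moved, moves the BO to
 *   vm->moved; the CS never
 *   revalidates it -> VM fault
 */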




Re: [PATCH] drm/amdgpu: Need to set moved to true when evict bo

2018-08-28 Thread Zhang, Jerry (Junwei)

On 08/28/2018 08:40 PM, Emily Deng wrote:

Fix the VMC page fault when the running sequence is as below:
1.amdgpu_gem_create_ioctl
2.ttm_bo_swapout->amdgpu_vm_bo_invalidate, as not called
amdgpu_vm_bo_base_init, so won't called
list_add_tail(>bo_list, >va). Even the bo was evicted,
it won't set the bo_base->moved.


IMO, the evicted bo should be created previously.
On BO creation we will add it to the bo->va as below:

amdgpu_gem_create_ioctl
  drm_gem_handle_create
amdgpu_gem_object_open
  amdgpu_vm_bo_add
amdgpu_vm_bo_base_init
  list_add_tail(&base->bo_list, &bo->va)

Then it could be set moved in bo invalidate when evicting.

could you provide a bit more background about the issue?
looks like a per vm bo is evicted and the same bo created again.

Jerry


3.drm_gem_open_ioctl->amdgpu_vm_bo_base_init, here only called
list_move_tail(&base->vm_status, &vm->evicted), but did not set the
bo_base->moved.
4.amdgpu_vm_bo_map->amdgpu_vm_bo_insert_map, as the bo_base->moved is
not set true, the function amdgpu_vm_bo_insert_map will call
list_move(&bo_va->base.vm_status, &vm->moved)
5.amdgpu_cs_ioctl won't validate the swapout bo, as it is only in the
moved list, not in the evict list. So VMC page fault occurs.

Signed-off-by: Emily Deng 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 1f4b8df..015e20e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -172,6 +172,7 @@ static void amdgpu_vm_bo_base_init(struct amdgpu_vm_bo_base 
*base,
 * is validated on next vm use to avoid fault.
 * */
list_move_tail(&base->vm_status, &vm->evicted);
+   base->moved = true;
  }

  /**


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: correctly sign extend 48bit addresses v3

2018-08-28 Thread Zhang, Jerry (Junwei)

On 08/28/2018 08:17 PM, Christian König wrote:

Correctly sign extend the GMC addresses to 48 bits.


Could you explain a bit more about why the sign needs to be extended?
The address is uint64_t. Does it fail in some case?

> -/* VA hole for 48bit addresses on Vega10 */
> -#define AMDGPU_VA_HOLE_START  0x0000800000000000ULL
> -#define AMDGPU_VA_HOLE_END    0xffff800000000000ULL

BTW, the hole for 48-bit addresses actually leaves only 47 bits below it; any background on that?
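
For illustration, a minimal sketch of what the 48-bit sign extension
does (hand-rolled here, not the driver's helper; it follows the usual
canonical-address convention of replicating bit 47 upwards):

    static inline uint64_t sign_extend_48(uint64_t addr)
    {
            /* replicate bit 47 into the upper 16 bits */
            if (addr & (1ULL << 47))
                    addr |= 0xffff000000000000ULL;
            return addr;
    }

So everything from 0x0000800000000000 upwards in the 48-bit range lands
in the upper half of the 64-bit space, which is also why the hole
starts at bit 47.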

Regards,
Jerry



v2: sign extending turned out easier than thought.
v3: clean up the defines and move them into amdgpu_gmc.h as well

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c |  2 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c |  2 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c| 10 -
  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h| 26 ++
  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c|  8 +++
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |  2 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c   |  6 ++---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c |  7 +++---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 13 ---
  9 files changed, 44 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 8c652ecc4f9a..bc5ccfca68c5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -135,7 +135,7 @@ void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
.num_queue_per_pipe = adev->gfx.mec.num_queue_per_pipe,
.gpuvm_size = min(adev->vm_manager.max_pfn
  << AMDGPU_GPU_PAGE_SHIFT,
- AMDGPU_VA_HOLE_START),
+ AMDGPU_GMC_HOLE_START),
.drm_render_minor = adev->ddev->render->index
};

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index dd734970e167..ef2bfc04b41c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -835,7 +835,7 @@ static int amdgpu_cs_vm_handling(struct amdgpu_cs_parser *p)
if (chunk->chunk_id != AMDGPU_CHUNK_ID_IB)
continue;

-   va_start = chunk_ib->va_start & AMDGPU_VA_HOLE_MASK;
+   va_start = chunk_ib->va_start & AMDGPU_GMC_HOLE_MASK;
r = amdgpu_cs_find_mapping(p, va_start, , );
if (r) {
DRM_ERROR("IB va_start is invalid\n");
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index 71792d820ae0..d30a0838851b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -572,16 +572,16 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void 
*data,
return -EINVAL;
}

-   if (args->va_address >= AMDGPU_VA_HOLE_START &&
-   args->va_address < AMDGPU_VA_HOLE_END) {
+   if (args->va_address >= AMDGPU_GMC_HOLE_START &&
+   args->va_address < AMDGPU_GMC_HOLE_END) {
dev_dbg(&dev->pdev->dev,
"va_address 0x%LX is in VA hole 0x%LX-0x%LX\n",
-   args->va_address, AMDGPU_VA_HOLE_START,
-   AMDGPU_VA_HOLE_END);
+   args->va_address, AMDGPU_GMC_HOLE_START,
+   AMDGPU_GMC_HOLE_END);
return -EINVAL;
}

-   args->va_address &= AMDGPU_VA_HOLE_MASK;
+   args->va_address &= AMDGPU_GMC_HOLE_MASK;

if ((args->flags & ~valid_flags) && (args->flags & ~prt_flags)) {
dev_dbg(&dev->pdev->dev, "invalid flags combination 0x%08X\n",
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
index 0d2c9f65ca13..9d9c7a9f54e4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
@@ -30,6 +30,19 @@

  #include "amdgpu_irq.h"

+/* VA hole for 48bit addresses on Vega10 */
+#define AMDGPU_GMC_HOLE_START  0x0000800000000000ULL
+#define AMDGPU_GMC_HOLE_END    0xffff800000000000ULL
+
+/*
+ * Hardware is programmed as if the hole doesn't exists with start and end
+ * address values.
+ *
+ * This mask is used to remove the upper 16bits of the VA and so come up with
+ * the linear addr value.
+ */
+#define AMDGPU_GMC_HOLE_MASK   0x0000ffffffffffffULL
+
  struct firmware;

  /*
@@ -131,6 +144,19 @@ static inline bool amdgpu_gmc_vram_full_visible(struct 
amdgpu_gmc *gmc)
return (gmc->real_vram_size == gmc->visible_vram_size);
  }

+/**
+ * amdgpu_gmc_sign_extend - sign extend the given gmc address
+ *
+ * @addr: address to extend
+ */
+static inline uint64_t amdgpu_gmc_sign_extend(uint64_t addr)
+{
+   if (addr >= 

Re: [PATCH] drm/amdgpu: Only retrieve GPU address of GART table after pinning it

2018-08-28 Thread Zhang, Jerry (Junwei)

On 08/28/2018 05:27 PM, Michel Dänzer wrote:

From: Michel Dänzer 

Doing it earlier hits a WARN_ON_ONCE in amdgpu_bo_gpu_offset.

Fixes: "drm/amdgpu: remove gart.table_addr"
Signed-off-by: Michel Dänzer 

Reviewed-by: Junwei Zhang 
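
For context, the warning being hit (quoting the check from
amdgpu_bo_gpu_offset() in the same series): before
amdgpu_gart_table_vram_pin() runs, the GART BO has no valid placement
yet, so

    WARN_ON_ONCE(bo->tbo.mem.start == AMDGPU_BO_INVALID_OFFSET);

fires when the offset is read too early.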


---
  drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c | 5 -
  drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c | 5 -
  drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 5 -
  3 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
index 543287e5d67b..9c45ea318bd6 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
@@ -494,7 +494,7 @@ static void gmc_v6_0_set_prt(struct amdgpu_device *adev, 
bool enable)

  static int gmc_v6_0_gart_enable(struct amdgpu_device *adev)
  {
-   uint64_t table_addr = amdgpu_bo_gpu_offset(adev->gart.bo);
+   uint64_t table_addr;
int r, i;
u32 field;

@@ -505,6 +505,9 @@ static int gmc_v6_0_gart_enable(struct amdgpu_device *adev)
r = amdgpu_gart_table_vram_pin(adev);
if (r)
return r;
+
+   table_addr = amdgpu_bo_gpu_offset(adev->gart.bo);
+
/* Setup TLB control */
WREG32(mmMC_VM_MX_L1_TLB_CNTL,
   (0xA << 7) |
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
index 060c79afef80..fc5fe187b614 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
@@ -604,7 +604,7 @@ static void gmc_v7_0_set_prt(struct amdgpu_device *adev, 
bool enable)
   */
  static int gmc_v7_0_gart_enable(struct amdgpu_device *adev)
  {
-   uint64_t table_addr = amdgpu_bo_gpu_offset(adev->gart.bo);
+   uint64_t table_addr;
int r, i;
u32 tmp, field;

@@ -615,6 +615,9 @@ static int gmc_v7_0_gart_enable(struct amdgpu_device *adev)
r = amdgpu_gart_table_vram_pin(adev);
if (r)
return r;
+
+   table_addr = amdgpu_bo_gpu_offset(adev->gart.bo);
+
/* Setup TLB control */
tmp = RREG32(mmMC_VM_MX_L1_TLB_CNTL);
tmp = REG_SET_FIELD(tmp, MC_VM_MX_L1_TLB_CNTL, ENABLE_L1_TLB, 1);
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
index 3fe9b9755cf7..91216cdf4d1c 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
@@ -806,7 +806,7 @@ static void gmc_v8_0_set_prt(struct amdgpu_device *adev, 
bool enable)
   */
  static int gmc_v8_0_gart_enable(struct amdgpu_device *adev)
  {
-   uint64_t table_addr = amdgpu_bo_gpu_offset(adev->gart.bo);
+   uint64_t table_addr;
int r, i;
u32 tmp, field;

@@ -817,6 +817,9 @@ static int gmc_v8_0_gart_enable(struct amdgpu_device *adev)
r = amdgpu_gart_table_vram_pin(adev);
if (r)
return r;
+
+   table_addr = amdgpu_bo_gpu_offset(adev->gart.bo);
+
/* Setup TLB control */
tmp = RREG32(mmMC_VM_MX_L1_TLB_CNTL);
tmp = REG_SET_FIELD(tmp, MC_VM_MX_L1_TLB_CNTL, ENABLE_L1_TLB, 1);


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: remove extra newline when printing VM faults

2018-08-27 Thread Zhang, Jerry (Junwei)

On 08/27/2018 10:04 PM, Alex Deucher wrote:

On Mon, Aug 27, 2018 at 9:45 AM Christian König
 wrote:


Looks like a copy error to me.

Signed-off-by: Christian König 


Reviewed-by: Alex Deucher 


Reviewed-by: Junwei Zhang 




---
  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index 66abfcf87ad0..ad40acb236bc 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -265,7 +265,7 @@ static int gmc_v9_0_process_interrupt(struct amdgpu_device 
*adev,
 amdgpu_vm_get_task_info(adev, entry->pasid, &task_info);

 dev_err(adev->dev,
-   "[%s] VMC page fault (src_id:%u ring:%u vmid:%u pasid:%u, 
for process %s pid %d thread %s pid %d\n)\n",
+   "[%s] VMC page fault (src_id:%u ring:%u vmid:%u pasid:%u, 
for process %s pid %d thread %s pid %d)\n",
 entry->vmid_src ? "mmhub" : "gfxhub",
 entry->src_id, entry->ring_id, entry->vmid,
 entry->pasid, task_info.process_name, task_info.tgid,
--
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx



Re: [PATCH 09/10] drm/amdgpu: use the AGP aperture for system memory access

2018-08-27 Thread Zhang, Jerry (Junwei)

On 08/28/2018 12:53 AM, Christian König wrote:

Start to use the old AGP aperture for system memory access.

Signed-off-by: Christian König 

Reviewed-by: Junwei Zhang 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 24 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h |  1 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c |  9 
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 21 ++
  4 files changed, 46 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index eed5352f3136..54d353951e21 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -79,6 +79,30 @@ uint64_t amdgpu_gmc_pd_addr(struct amdgpu_bo *bo)
return pd_addr;
  }

+/**
+ * amdgpu_gmc_agp_addr - return the address in the AGP address space
+ *
+ * @tbo: TTM BO which needs the address, must be in GTT domain
+ *
+ * Tries to figure out how to access the BO through the AGP aperture. Returns
+ * AMDGPU_BO_INVALID_OFFSET if that is not possible.
+ */
+uint64_t amdgpu_gmc_agp_addr(struct ttm_buffer_object *tbo)
+{
+   struct amdgpu_device *adev = amdgpu_ttm_adev(tbo->bdev);
+   struct ttm_dma_tt *ttm;
+
+   if (tbo->num_pages != 1 || !tbo->ttm ||
+   tbo->ttm->caching_state == tt_cached)
+   return AMDGPU_BO_INVALID_OFFSET;
+
+   ttm = container_of(tbo->ttm, struct ttm_dma_tt, ttm);
+   if (ttm->dma_address[0] + PAGE_SIZE >= adev->gmc.agp_size)
+   return AMDGPU_BO_INVALID_OFFSET;
+
+   return adev->gmc.agp_start + ttm->dma_address[0];
+}
+
  /**
   * amdgpu_gmc_vram_location - try to find VRAM location
   *
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
index 163110fe375d..6e8432fd3309 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
@@ -137,6 +137,7 @@ static inline bool amdgpu_gmc_vram_full_visible(struct 
amdgpu_gmc *gmc)
  void amdgpu_gmc_get_pde_for_bo(struct amdgpu_bo *bo, int level,
   uint64_t *addr, uint64_t *flags);
  uint64_t amdgpu_gmc_pd_addr(struct amdgpu_bo *bo);
+uint64_t amdgpu_gmc_agp_addr(struct ttm_buffer_object *tbo);
  void amdgpu_gmc_vram_location(struct amdgpu_device *adev, struct amdgpu_gmc 
*mc,
  u64 base);
  void amdgpu_gmc_gart_location(struct amdgpu_device *adev,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
index c2539f6821c0..deaea11eb39a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
@@ -132,6 +132,15 @@ static int amdgpu_gtt_mgr_alloc(struct 
ttm_mem_type_manager *man,
else
lpfn = adev->gart.num_cpu_pages;

+   if (fpfn == 0 && lpfn == adev->gart.num_cpu_pages) {
+   uint64_t addr = amdgpu_gmc_agp_addr(tbo);
+
+   if (addr != AMDGPU_BO_INVALID_OFFSET) {
+   mem->start = addr >> PAGE_SHIFT;
+   return 0;
+   }
+   }
+
mode = DRM_MM_INSERT_BEST;
if (place && place->flags & TTM_PL_FLAG_TOPDOWN)
mode = DRM_MM_INSERT_HIGH;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index d9f3201c9e5c..281611f6bcd4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1103,15 +1103,18 @@ int amdgpu_ttm_alloc_gart(struct ttm_buffer_object *bo)
if (unlikely(r))
return r;

-   /* compute PTE flags for this buffer object */
-   flags = amdgpu_ttm_tt_pte_flags(adev, bo->ttm, &tmp);
-
-   /* Bind pages */
-   gtt->offset = ((u64)tmp.start << PAGE_SHIFT) - adev->gmc.gart_start;
-   r = amdgpu_ttm_gart_bind(adev, bo, flags);
-   if (unlikely(r)) {
-   ttm_bo_mem_put(bo, &tmp);
-   return r;
+   if (amdgpu_gtt_mgr_has_gart_addr()) {
+   /* compute PTE flags for this buffer object */
+   flags = amdgpu_ttm_tt_pte_flags(adev, bo->ttm, &tmp);
+
+   /* Bind pages */
+   gtt->offset = ((u64)tmp.start << PAGE_SHIFT) -
+   adev->gmc.gart_start;
+   r = amdgpu_ttm_gart_bind(adev, bo, flags);
+   if (unlikely(r)) {
+   ttm_bo_mem_put(bo, &tmp);
+   return r;
+   }
}

ttm_bo_mem_put(bo, &bo->mem);
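
As a side note, the address math the AGP path relies on is simple
enough to sketch (illustrative helper, not the driver's API):

    /* A one-page, uncached GTT BO is reachable without a GART entry:
     * its GMC address is just the AGP aperture base plus the page's
     * DMA address, provided the DMA address still falls inside the
     * aperture. Everything else falls back to a real GART mapping.
     */
    uint64_t agp_gmc_addr(uint64_t agp_start, uint64_t agp_size,
                          uint64_t dma_addr)
    {
            if (dma_addr + PAGE_SIZE >= agp_size)
                    return AMDGPU_BO_INVALID_OFFSET; /* use GART */
            return agp_start + dma_addr;
    }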


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 06/10] drm/amdgpu: add amdgpu_gmc_agp_location

2018-08-27 Thread Zhang, Jerry (Junwei)

On 08/28/2018 12:53 AM, Christian König wrote:

Helper to figure out the location of the AGP BAR.

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 42 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h |  5 +++
  2 files changed, 47 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index 4331a0e25cdc..eed5352f3136 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -141,3 +141,45 @@ void amdgpu_gmc_gart_location(struct amdgpu_device *adev, 
struct amdgpu_gmc *mc)
dev_info(adev->dev, "GART: %lluM 0x%016llX - 0x%016llX\n",
mc->gart_size >> 20, mc->gart_start, mc->gart_end);
  }
+
+/**
+ * amdgpu_gmc_agp_location - try to find AGP location
+ * @adev: amdgpu device structure holding all necessary informations
+ * @mc: memory controller structure holding memory informations
+ *
+ * Function will try to find a place for the AGP BAR in the MC address
+ * space.
+ *
+ * AGP BAR will be assigned the largest available hole in the address space.
+ */
+void amdgpu_gmc_agp_location(struct amdgpu_device *adev, struct amdgpu_gmc *mc)
+{
+   const uint64_t sixteen_gb = 1ULL << 34;
+   u64 size_af, size_bf;
+
+   if (mc->vram_start > mc->gart_start) {
+   size_bf = mc->vram_start - mc->gart_end + 1;
+   size_af = mc->mc_mask - mc->vram_end;
+   } else {
+   size_bf = mc->vram_start;
+   size_af = mc->mc_mask - mc->gart_end;
+   }
+
+   size_bf &= ~(sixteen_gb - 1);
+   size_af &= ~(sixteen_gb - 1);
+
+   if (size_bf > size_af) {
+   mc->agp_start = mc->vram_start > mc->gart_start ?
+   mc->gart_start : 0;


Looks like this should be mc->gart_end?


+   mc->agp_size = size_bf;
+   } else {
+   mc->agp_start = (mc->vram_start > mc->gart_start ?
+   mc->vram_end : mc->gart_end) + 1,
+   mc->agp_size = size_af;
+   }
+
+   mc->agp_start = ALIGN(mc->agp_start, sixteen_gb);


Still needs mc->agp_start + 1 for the alignment?
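
To make the alignment concern concrete, a worked example with
hypothetical numbers (plugged into the logic above, taking the
gart_end fix from the previous comment):

    /* GART at [0, 1GB), VRAM at [47GB, 63GB), vram_start > gart_start:
     *
     *   size_bf   = vram_start - gart_end + 1  ~= 46GB
     *   size_bf  &= ~(16GB - 1)                 = 32GB
     *   agp_start = gart_end + 1                = 1GB
     *   agp_start = ALIGN(agp_start, 16GB)      = 16GB
     *   agp_end   = agp_start + 32GB - 1        = 48GB - 1
     *
     * agp_end now overlaps VRAM at 47GB, because agp_size was computed
     * before agp_start was rounded up; the rounding needs to shrink
     * the size (or be folded into the start computation) as well.
     */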


+   mc->agp_end = mc->agp_start + mc->agp_size - 1;
+   dev_info(adev->dev, "AGP: %lluM 0x%016llX - 0x%016llX\n",
+   mc->agp_size >> 20, mc->agp_start, mc->agp_end);
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
index 72fcc9338f5e..163110fe375d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
@@ -81,6 +81,9 @@ struct amdgpu_gmc {
 * about vram size near mc fb location */
u64 mc_vram_size;
u64 visible_vram_size;
+   u64 agp_size;
+   u64 agp_start;
+   u64 agp_end;
u64 gart_size;
u64 gart_start;
u64 gart_end;
@@ -138,5 +141,7 @@ void amdgpu_gmc_vram_location(struct amdgpu_device *adev, 
struct amdgpu_gmc *mc,
  u64 base);
  void amdgpu_gmc_gart_location(struct amdgpu_device *adev,
  struct amdgpu_gmc *mc);
+void amdgpu_gmc_agp_location(struct amdgpu_device *adev,
+struct amdgpu_gmc *mc);

  #endif


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 08/10] drm/amdgpu: distinct between allocated GART space and GMC addr

2018-08-27 Thread Zhang, Jerry (Junwei)

On 08/28/2018 12:53 AM, Christian König wrote:

Most of the time we only need to know if the BO has a valid GMC addr.

Signed-off-by: Christian König 


good to see this cleanup :)

Reviewed-by: Junwei Zhang 
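
The underlying idea, as I read it: "does this BO have a GMC address"
becomes a plain sentinel check instead of a GTT-manager query, e.g.

    /* valid GMC address iff the node was actually placed */
    bool has_gmc_addr = bo->tbo.mem.start != AMDGPU_BO_INVALID_OFFSET;

which works for GTT whether the backing got a GART window or, with the
previous patch, an AGP aperture address.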


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |  2 --
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.h |  2 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c| 13 +
  3 files changed, 6 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 5ddd4e87480b..b5f20b42439e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -1362,8 +1362,6 @@ void amdgpu_bo_fence(struct amdgpu_bo *bo, struct 
dma_fence *fence,
  u64 amdgpu_bo_gpu_offset(struct amdgpu_bo *bo)
  {
WARN_ON_ONCE(bo->tbo.mem.mem_type == TTM_PL_SYSTEM);
-   WARN_ON_ONCE(bo->tbo.mem.mem_type == TTM_PL_TT &&
-!amdgpu_gtt_mgr_has_gart_addr(&bo->tbo.mem));
WARN_ON_ONCE(!ww_mutex_is_locked(&bo->tbo.resv->lock) &&
 !bo->pin_count);
WARN_ON_ONCE(bo->tbo.mem.start == AMDGPU_BO_INVALID_OFFSET);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
index 18945dd6982d..37c79ae3574e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
@@ -200,7 +200,7 @@ static inline u64 amdgpu_bo_mmap_offset(struct amdgpu_bo 
*bo)
  static inline bool amdgpu_bo_gpu_accessible(struct amdgpu_bo *bo)
  {
switch (bo->tbo.mem.mem_type) {
-   case TTM_PL_TT: return amdgpu_gtt_mgr_has_gart_addr(&bo->tbo.mem);
+   case TTM_PL_TT: return bo->tbo.mem.start != AMDGPU_BO_INVALID_OFFSET;
case TTM_PL_VRAM: return true;
default: return false;
}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 5cadf4f1ee2c..d9f3201c9e5c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -345,7 +345,7 @@ static uint64_t amdgpu_mm_node_addr(struct 
ttm_buffer_object *bo,
  {
uint64_t addr = 0;

-   if (mem->mem_type != TTM_PL_TT || amdgpu_gtt_mgr_has_gart_addr(mem)) {
+   if (mm_node->start != AMDGPU_BO_INVALID_OFFSET) {
addr = mm_node->start << PAGE_SHIFT;
addr += bo->bdev->man[mem->mem_type].gpu_offset;
}
@@ -433,8 +433,7 @@ int amdgpu_ttm_copy_mem_to_mem(struct amdgpu_device *adev,
/* Map only what needs to be accessed. Map src to window 0 and
 * dst to window 1
 */
-   if (src->mem->mem_type == TTM_PL_TT &&
-   !amdgpu_gtt_mgr_has_gart_addr(src->mem)) {
+   if (src->mem->start == AMDGPU_BO_INVALID_OFFSET) {
r = amdgpu_map_buffer(src->bo, src->mem,
PFN_UP(cur_size + src_page_offset),
src_node_start, 0, ring,
@@ -447,8 +446,7 @@ int amdgpu_ttm_copy_mem_to_mem(struct amdgpu_device *adev,
from += src_page_offset;
}

-   if (dst->mem->mem_type == TTM_PL_TT &&
-   !amdgpu_gtt_mgr_has_gart_addr(dst->mem)) {
+   if (dst->mem->start == AMDGPU_BO_INVALID_OFFSET) {
r = amdgpu_map_buffer(dst->bo, dst->mem,
PFN_UP(cur_size + dst_page_offset),
dst_node_start, 1, ring,
@@ -1086,11 +1084,10 @@ int amdgpu_ttm_alloc_gart(struct ttm_buffer_object *bo)
uint64_t flags;
int r;

-   if (bo->mem.mem_type != TTM_PL_TT ||
-   amdgpu_gtt_mgr_has_gart_addr(&bo->mem))
+   if (bo->mem.start != AMDGPU_BO_INVALID_OFFSET)
return 0;

-   /* allocate GTT space */
+   /* allocate GART space */
tmp = bo->mem;
tmp.mm_node = NULL;
placement.num_placement = 1;


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 07/10] drm/amdgpu: stop using gart_start as offset for the GTT domain

2018-08-27 Thread Zhang, Jerry (Junwei)

On 08/28/2018 03:07 AM, Alex Deucher wrote:

On Mon, Aug 27, 2018 at 12:56 PM Christian König
 wrote:


Further separate GART and GTT domain.

Signed-off-by: Christian König 


Reviewed-by: Alex Deucher 


Reviewed-by: Junwei Zhang 
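
The round trip may be worth spelling out (my reading of the change):

    /* The GTT manager now returns a GMC page address:
     *   mem->start  = node.start + (gart_start >> PAGE_SHIFT)
     * and binding converts it back to a GART-relative offset:
     *   gtt->offset = (mem->start << PAGE_SHIFT) - gart_start
     *               =  node.start << PAGE_SHIFT
     * so the GART table index is unchanged; only the GMC address seen
     * by the engines now includes gart_start.
     */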




---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c | 3 ++-
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 6 +++---
  2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
index da7b1b92d9cf..c2539f6821c0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
@@ -143,7 +143,8 @@ static int amdgpu_gtt_mgr_alloc(struct ttm_mem_type_manager 
*man,
 spin_unlock(&mgr->lock);

 if (!r)
-   mem->start = node->node.start;
+   mem->start = node->node.start +
+   (adev->gmc.gart_start >> PAGE_SHIFT);

 return r;
  }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 2f304f9dd543..5cadf4f1ee2c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -188,7 +188,7 @@ static int amdgpu_init_mem_type(struct ttm_bo_device *bdev, 
uint32_t type,
 case TTM_PL_TT:
 /* GTT memory  */
 man->func = _gtt_mgr_func;
-   man->gpu_offset = adev->gmc.gart_start;
+   man->gpu_offset = 0;
 man->available_caching = TTM_PL_MASK_CACHING;
 man->default_caching = TTM_PL_FLAG_CACHED;
 man->flags = TTM_MEMTYPE_FLAG_MAPPABLE | TTM_MEMTYPE_FLAG_CMA;
@@ -1062,7 +1062,7 @@ static int amdgpu_ttm_backend_bind(struct ttm_tt *ttm,
 flags = amdgpu_ttm_tt_pte_flags(adev, ttm, bo_mem);

 /* bind pages into GART page tables */
-   gtt->offset = (u64)bo_mem->start << PAGE_SHIFT;
+   gtt->offset = ((u64)bo_mem->start << PAGE_SHIFT) - adev->gmc.gart_start;
 r = amdgpu_gart_bind(adev, gtt->offset, ttm->num_pages,
 ttm->pages, gtt->ttm.dma_address, flags);

@@ -1110,7 +1110,7 @@ int amdgpu_ttm_alloc_gart(struct ttm_buffer_object *bo)
 flags = amdgpu_ttm_tt_pte_flags(adev, bo->ttm, &tmp);

 /* Bind pages */
-   gtt->offset = (u64)tmp.start << PAGE_SHIFT;
+   gtt->offset = ((u64)tmp.start << PAGE_SHIFT) - adev->gmc.gart_start;
 r = amdgpu_ttm_gart_bind(adev, bo, flags);
 if (unlikely(r)) {
 ttm_bo_mem_put(bo, &tmp);
--
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx



Re: [PATCH 01/10] drm/amdgpu: use only the lower address space on GMC9

2018-08-27 Thread Zhang, Jerry (Junwei)

On 08/28/2018 09:56 AM, Zhang, Jerry (Junwei) wrote:

On 08/28/2018 12:53 AM, Christian König wrote:

Only use the lower address space on GMC9 for the system domain.
Otherwise we would need to sign extend GMC addresses.

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 7 +++
  1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index e44b5191735d..d982956c8329 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -938,11 +938,10 @@ static int gmc_v9_0_sw_init(void *handle)
  if (r)
  return r;

-/* Set the internal MC address mask
- * This is the max address of the GPU's
- * internal address space.
+/* Use only the lower range for the internal MC address mask. This is
+ * the max address of the GPU's internal address space.
   */
-adev->gmc.mc_mask = 0xffffffffffffULL; /* 48 bit MC */
+adev->gmc.mc_mask = 0x7fffffffffffULL;


do we need to update vm_size to 128T at the same time?


Likely no, since we use that for system domain only.

BTW, how do we decide its size limitation?
It looks like we always use less than 40 bits?
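
For reference, the bit math behind the 128T number (plain arithmetic,
not from the patch):

    0x7fffffffffffULL + 1 == 1ULL << 47   /* 128TB of MC space */

so the new mask halves the previous 48-bit (256TB) internal space.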

Jerry



Jerry



  /* set DMA mask + need_dma32 flags.
   * PCIE - can handle 44-bits.


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx



Re: [PATCH 05/10] drm/amdgpu: put GART away from VRAM

2018-08-27 Thread Zhang, Jerry (Junwei)

On 08/28/2018 12:53 AM, Christian König wrote:

Always try to put the GART away from where VRAM is.

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index 265ec6807130..4331a0e25cdc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -136,7 +136,7 @@ void amdgpu_gmc_gart_location(struct amdgpu_device *adev, 
struct amdgpu_gmc *mc)
(size_af < mc->gart_size))
mc->gart_start = 0;
else
-   mc->gart_start = ALIGN(mc->vram_end + 1, 0x100000000ULL);
+   mc->gart_start = mc->mc_mask - mc->gart_size + 1;


It seems to break the VCE limitation about 4GB alignment?
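
To spell that out with concrete numbers (hypothetical sizes):

    /* mc_mask + 1 is a power of two, e.g. 1ULL << 48. If gart_size is
     * a multiple of 4GB, the new gart_start stays 4GB-aligned:
     *   (1ULL << 48) - (512ULL << 30) = 0xff8000000000  -> 4GB-aligned
     * but an arbitrary gart_size breaks it, e.g. 3GB:
     *   (1ULL << 48) - (3ULL << 30)   = 0xffff40000000  -> 1GB-aligned
     * so the VCE 4GB-segment rule only holds as long as gart_size
     * keeps 4GB granularity.
     */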

Jerry


mc->gart_end = mc->gart_start + mc->gart_size - 1;
dev_info(adev->dev, "GART: %lluM 0x%016llX - 0x%016llX\n",
mc->gart_size >> 20, mc->gart_start, mc->gart_end);


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 04/10] drm/amdgpu: use the smaller hole for GART

2018-08-27 Thread Zhang, Jerry (Junwei)

On 08/28/2018 03:05 AM, Alex Deucher wrote:

On Mon, Aug 27, 2018 at 12:55 PM Christian König
 wrote:


Instead of the larger one use the smaller hole in the MC address
space for the GART mappings.

Signed-off-by: Christian König 


Reviewed-by: Alex Deucher 


Reviewed-by: Junwei Zhang 
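
The new condition is a little dense; enumerated (my paraphrase):

    size_bf >= gart_size && size_bf < size_af  -> GART before VRAM
                                                  (smaller hole fits,
                                                  keep the big one free)
    size_af < gart_size                        -> GART before VRAM
                                                  (after-hole too small)
    otherwise                                  -> GART after VRAM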




---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index 8269197df8e0..265ec6807130 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -132,7 +132,8 @@ void amdgpu_gmc_gart_location(struct amdgpu_device *adev, 
struct amdgpu_gmc *mc)
 mc->gart_size = max(size_bf, size_af);
 }

-   if (size_bf > size_af)
+   if ((size_bf >= mc->gart_size && size_bf < size_af) ||
+   (size_af < mc->gart_size))
 mc->gart_start = 0;
 else
 mc->gart_start = ALIGN(mc->vram_end + 1, 0x100000000ULL);
--
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx



Re: [PATCH 03/10] drm/amdgpu: fix amdgpu_gmc_gart_location a little bit

2018-08-27 Thread Zhang, Jerry (Junwei)

On 08/28/2018 12:53 AM, Christian König wrote:

Improve the VCE limitation handling.

Signed-off-by: Christian König 

Reviewed-by: Junwei Zhang 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 28 -
  1 file changed, 13 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index 72dffa3fd194..8269197df8e0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -120,24 +120,22 @@ void amdgpu_gmc_gart_location(struct amdgpu_device *adev, 
struct amdgpu_gmc *mc)

mc->gart_size += adev->pm.smu_prv_buffer_size;

-   size_af = adev->gmc.mc_mask - mc->vram_end;
+   /* VCE doesn't like it when BOs cross a 4GB segment, so align
+* the GART base on a 4GB boundary as well.
+*/
size_bf = mc->vram_start;
-   if (size_bf > size_af) {
-   if (mc->gart_size > size_bf) {
-   dev_warn(adev->dev, "limiting GART\n");
-   mc->gart_size = size_bf;
-   }
+   size_af = adev->gmc.mc_mask + 1 -
+   ALIGN(mc->vram_end + 1, 0x100000000ULL);
+
+   if (mc->gart_size > max(size_bf, size_af)) {
+   dev_warn(adev->dev, "limiting GART\n");
+   mc->gart_size = max(size_bf, size_af);
+   }
+
+   if (size_bf > size_af)
mc->gart_start = 0;
-   } else {
-   if (mc->gart_size > size_af) {
-   dev_warn(adev->dev, "limiting GART\n");
-   mc->gart_size = size_af;
-   }
-   /* VCE doesn't like it when BOs cross a 4GB segment, so align
-* the GART base on a 4GB boundary as well.
-*/
+   else
mc->gart_start = ALIGN(mc->vram_end + 1, 0x100000000ULL);
-   }
mc->gart_end = mc->gart_start + mc->gart_size - 1;
dev_info(adev->dev, "GART: %lluM 0x%016llX - 0x%016llX\n",
mc->gart_size >> 20, mc->gart_start, mc->gart_end);


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 02/10] drm/amdgpu: move amdgpu_device_(vram|gtt)_location

2018-08-27 Thread Zhang, Jerry (Junwei)

On 08/28/2018 02:59 AM, Alex Deucher wrote:

On Mon, Aug 27, 2018 at 12:56 PM Christian König
 wrote:


Move that into amdgpu_gmc.c since we are really dealing with GMC
address space here.

Signed-off-by: Christian König 


Reviewed-by: Alex Deucher 


Reviewed-by: Junwei Zhang 



---
  drivers/gpu/drm/amd/amdgpu/amdgpu.h|  4 --
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 65 --
  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c| 64 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h|  4 ++
  drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c  |  4 +-
  drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c  |  4 +-
  drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c  |  4 +-
  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  |  4 +-
  8 files changed, 76 insertions(+), 77 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 340e40d03d54..09bdedfc91c7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1166,10 +1166,6 @@ bool amdgpu_device_need_post(struct amdgpu_device *adev);

  void amdgpu_cs_report_moved_bytes(struct amdgpu_device *adev, u64 num_bytes,
   u64 num_vis_bytes);
-void amdgpu_device_vram_location(struct amdgpu_device *adev,
-struct amdgpu_gmc *mc, u64 base);
-void amdgpu_device_gart_location(struct amdgpu_device *adev,
-struct amdgpu_gmc *mc);
  int amdgpu_device_resize_fb_bar(struct amdgpu_device *adev);
  void amdgpu_device_program_register_sequence(struct amdgpu_device *adev,
  const u32 *registers,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 9f5e4be76d5e..0afc5e599683 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -651,71 +651,6 @@ void amdgpu_device_wb_free(struct amdgpu_device *adev, u32 
wb)
 __clear_bit(wb, adev->wb.used);
  }

-/**
- * amdgpu_device_vram_location - try to find VRAM location
- *
- * @adev: amdgpu device structure holding all necessary informations
- * @mc: memory controller structure holding memory informations
- * @base: base address at which to put VRAM
- *
- * Function will try to place VRAM at base address provided
- * as parameter.
- */
-void amdgpu_device_vram_location(struct amdgpu_device *adev,
-struct amdgpu_gmc *mc, u64 base)
-{
-   uint64_t limit = (uint64_t)amdgpu_vram_limit << 20;
-
-   mc->vram_start = base;
-   mc->vram_end = mc->vram_start + mc->mc_vram_size - 1;
-   if (limit && limit < mc->real_vram_size)
-   mc->real_vram_size = limit;
-   dev_info(adev->dev, "VRAM: %lluM 0x%016llX - 0x%016llX (%lluM used)\n",
-   mc->mc_vram_size >> 20, mc->vram_start,
-   mc->vram_end, mc->real_vram_size >> 20);
-}
-
-/**
- * amdgpu_device_gart_location - try to find GART location
- *
- * @adev: amdgpu device structure holding all necessary informations
- * @mc: memory controller structure holding memory informations
- *
- * Function will place try to place GART before or after VRAM.
- *
- * If GART size is bigger than space left then we ajust GART size.
- * Thus function will never fails.
- */
-void amdgpu_device_gart_location(struct amdgpu_device *adev,
-struct amdgpu_gmc *mc)
-{
-   u64 size_af, size_bf;
-
-   mc->gart_size += adev->pm.smu_prv_buffer_size;
-
-   size_af = adev->gmc.mc_mask - mc->vram_end;
-   size_bf = mc->vram_start;
-   if (size_bf > size_af) {
-   if (mc->gart_size > size_bf) {
-   dev_warn(adev->dev, "limiting GART\n");
-   mc->gart_size = size_bf;
-   }
-   mc->gart_start = 0;
-   } else {
-   if (mc->gart_size > size_af) {
-   dev_warn(adev->dev, "limiting GART\n");
-   mc->gart_size = size_af;
-   }
-   /* VCE doesn't like it when BOs cross a 4GB segment, so align
-* the GART base on a 4GB boundary as well.
-*/
-   mc->gart_start = ALIGN(mc->vram_end + 1, 0x100000000ULL);
-   }
-   mc->gart_end = mc->gart_start + mc->gart_size - 1;
-   dev_info(adev->dev, "GART: %lluM 0x%016llX - 0x%016llX\n",
-   mc->gart_size >> 20, mc->gart_start, mc->gart_end);
-}
-
  /**
   * amdgpu_device_resize_fb_bar - try to resize FB BAR
   *
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
index a249931ef512..72dffa3fd194 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
@@ -78,3 +78,67 @@ uint64_t amdgpu_gmc_pd_addr(struct amdgpu_bo *bo)
 }
 return pd_addr;
  }
+
+/**
+ * 

Re: [PATCH 01/10] drm/amdgpu: use only the lower address space on GMC9

2018-08-27 Thread Zhang, Jerry (Junwei)

On 08/28/2018 12:53 AM, Christian König wrote:

Only use the lower address space on GMC9 for the system domain.
Otherwise we would need to sign extend GMC addresses.

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 7 +++
  1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index e44b5191735d..d982956c8329 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -938,11 +938,10 @@ static int gmc_v9_0_sw_init(void *handle)
if (r)
return r;

-   /* Set the internal MC address mask
-* This is the max address of the GPU's
-* internal address space.
+   /* Use only the lower range for the internal MC address mask. This is
+* the max address of the GPU's internal address space.
 */
-   adev->gmc.mc_mask = 0xffffffffffffULL; /* 48 bit MC */
+   adev->gmc.mc_mask = 0x7fffffffffffULL;


do we need to update vm_size to 128T at the same time?

Jerry



/* set DMA mask + need_dma32 flags.
 * PCIE - can handle 44-bits.


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


  1   2   3   4   >