RE: [PATCH] drm/amd/powerplay: ACG frequency added in PPTable

2017-08-22 Thread Deucher, Alexander
> -----Original Message-----
> From: Evan Quan [mailto:evan.q...@amd.com]
> Sent: Tuesday, August 22, 2017 9:59 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander; Quan, Evan
> Subject: [PATCH] drm/amd/powerplay: ACG frequency added in PPTable
> 
> Change-Id: If1df87cf4a458d59ce2545a813c4594118fa4c3d
> Signed-off-by: Evan Quan 

Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c | 11 ---
>  drivers/gpu/drm/amd/powerplay/inc/smu9_driver_if.h |  6 --
>  2 files changed, 12 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
> b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
> index 0e7be87..1e16f84 100644
> --- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
> +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
> @@ -1557,7 +1557,8 @@ static int vega10_populate_smc_link_levels(struct
> pp_hwmgr *hwmgr)
>  */
> 
>  static int vega10_populate_single_gfx_level(struct pp_hwmgr *hwmgr,
> - uint32_t gfx_clock, PllSetting_t *current_gfxclk_level)
> + uint32_t gfx_clock, PllSetting_t *current_gfxclk_level,
> + uint32_t *acg_freq)
>  {
>   struct phm_ppt_v2_information *table_info =
>   (struct phm_ppt_v2_information *)(hwmgr-
> >pptable);
> @@ -1608,6 +1609,8 @@ static int vega10_populate_single_gfx_level(struct
> pp_hwmgr *hwmgr,
>   cpu_to_le16(dividers.usPll_ss_slew_frac);
>   current_gfxclk_level->Did = (uint8_t)(dividers.ulDid);
> 
> + *acg_freq = gfx_clock / 100; /* 100 Khz to Mhz conversion */
> +
>   return 0;
>  }
> 
> @@ -1688,7 +1691,8 @@ static int
> vega10_populate_all_graphic_levels(struct pp_hwmgr *hwmgr)
>   for (i = 0; i < dpm_table->count; i++) {
>   result = vega10_populate_single_gfx_level(hwmgr,
>   dpm_table->dpm_levels[i].value,
> - &(pp_table->GfxclkLevel[i]));
> + &(pp_table->GfxclkLevel[i]),
> + &(pp_table->AcgFreqTable[i]));
>   if (result)
>   return result;
>   }
> @@ -1697,7 +1701,8 @@ static int
> vega10_populate_all_graphic_levels(struct pp_hwmgr *hwmgr)
>   while (i < NUM_GFXCLK_DPM_LEVELS) {
>   result = vega10_populate_single_gfx_level(hwmgr,
>   dpm_table->dpm_levels[j].value,
> - &(pp_table->GfxclkLevel[i]));
> + &(pp_table->GfxclkLevel[i]),
> + &(pp_table->AcgFreqTable[i]));
>   if (result)
>   return result;
>   i++;
> diff --git a/drivers/gpu/drm/amd/powerplay/inc/smu9_driver_if.h
> b/drivers/gpu/drm/amd/powerplay/inc/smu9_driver_if.h
> index f6d6c61..2818c98 100644
> --- a/drivers/gpu/drm/amd/powerplay/inc/smu9_driver_if.h
> +++ b/drivers/gpu/drm/amd/powerplay/inc/smu9_driver_if.h
> @@ -315,10 +315,12 @@ typedef struct {
>uint8_t  AcgEnable[NUM_GFXCLK_DPM_LEVELS];
>GbVdroopTable_t AcgBtcGbVdroopTable;
>QuadraticInt_t  AcgAvfsGb;
> -  uint32_t Reserved[4];
> +
> +  /* ACG Frequency Table, in Mhz */
> +  uint32_t AcgFreqTable[NUM_GFXCLK_DPM_LEVELS];
> 
>/* Padding - ignore */
> -  uint32_t MmHubPadding[7]; /* SMU internal use */
> +  uint32_t MmHubPadding[3]; /* SMU internal use */
> 
>  } PPTable_t;
> 
> --
> 2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: set sched_hw_submission higher for KIQ (v2)

2017-08-22 Thread Felix Kuehling
This change is Reviewed-by: Felix Kuehling 


On 2017-08-22 09:52 PM, Alex Deucher wrote:
> KIQ doesn't really use the GPU scheduler.  The base
> drivers generally use the KIQ ring directly rather than
> submitting IBs.  However, amdgpu_sched_hw_submission
> (which defaults to 2) limits the number of outstanding
> fences to 2.  KFD uses the KIQ for TLB flushes and the
> 2 fence limit hurts performance when there are several KFD
> processes running.
>
> v2: move some expressions to one line
> change KIQ sched_hw_submission to at least 16
>
> Signed-off-by: Alex Deucher 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 16 
>  1 file changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> index 6c5646b..7c251ff 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> @@ -170,6 +170,16 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct 
> amdgpu_ring *ring,
>unsigned irq_type)
>  {
>   int r;
> + int sched_hw_submission = amdgpu_sched_hw_submission;
> +
> + /* Set the hw submission limit higher for KIQ because
> +  * it's used for a number of gfx/compute tasks by both
> +  * KFD and KGD which may have outstanding fences and
> +  * it doesn't really use the gpu scheduler anyway;
> +  * KIQ tasks get submitted directly to the ring.
> +  */
> + if (ring->funcs->type == AMDGPU_RING_TYPE_KIQ)
> + sched_hw_submission = max(sched_hw_submission, 16);
>  
>   if (ring->adev == NULL) {
>   if (adev->num_rings >= AMDGPU_MAX_RINGS)
> @@ -178,8 +188,7 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct 
> amdgpu_ring *ring,
>   ring->adev = adev;
>   ring->idx = adev->num_rings++;
>   adev->rings[ring->idx] = ring;
> - r = amdgpu_fence_driver_init_ring(ring,
> - amdgpu_sched_hw_submission);
> + r = amdgpu_fence_driver_init_ring(ring, sched_hw_submission);
>   if (r)
>   return r;
>   }
> @@ -218,8 +227,7 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct 
> amdgpu_ring *ring,
>   return r;
>   }
>  
> - ring->ring_size = roundup_pow_of_two(max_dw * 4 *
> -  amdgpu_sched_hw_submission);
> + ring->ring_size = roundup_pow_of_two(max_dw * 4 * sched_hw_submission);
>  
>   ring->buf_mask = (ring->ring_size / 4) - 1;
>   ring->ptr_mask = ring->funcs->support_64bit_ptrs ?

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] drm/amd/powerplay: ACG frequency added in PPTable

2017-08-22 Thread Evan Quan
Change-Id: If1df87cf4a458d59ce2545a813c4594118fa4c3d
Signed-off-by: Evan Quan 
---
 drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c | 11 ---
 drivers/gpu/drm/amd/powerplay/inc/smu9_driver_if.h |  6 --
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
index 0e7be87..1e16f84 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
@@ -1557,7 +1557,8 @@ static int vega10_populate_smc_link_levels(struct 
pp_hwmgr *hwmgr)
 */
 
 static int vega10_populate_single_gfx_level(struct pp_hwmgr *hwmgr,
-   uint32_t gfx_clock, PllSetting_t *current_gfxclk_level)
+   uint32_t gfx_clock, PllSetting_t *current_gfxclk_level,
+   uint32_t *acg_freq)
 {
struct phm_ppt_v2_information *table_info =
(struct phm_ppt_v2_information *)(hwmgr->pptable);
@@ -1608,6 +1609,8 @@ static int vega10_populate_single_gfx_level(struct 
pp_hwmgr *hwmgr,
cpu_to_le16(dividers.usPll_ss_slew_frac);
current_gfxclk_level->Did = (uint8_t)(dividers.ulDid);
 
+   *acg_freq = gfx_clock / 100; /* 100 Khz to Mhz conversion */
+
return 0;
 }
 
@@ -1688,7 +1691,8 @@ static int vega10_populate_all_graphic_levels(struct 
pp_hwmgr *hwmgr)
for (i = 0; i < dpm_table->count; i++) {
result = vega10_populate_single_gfx_level(hwmgr,
dpm_table->dpm_levels[i].value,
-   &(pp_table->GfxclkLevel[i]));
+   &(pp_table->GfxclkLevel[i]),
+   &(pp_table->AcgFreqTable[i]));
if (result)
return result;
}
@@ -1697,7 +1701,8 @@ static int vega10_populate_all_graphic_levels(struct 
pp_hwmgr *hwmgr)
while (i < NUM_GFXCLK_DPM_LEVELS) {
result = vega10_populate_single_gfx_level(hwmgr,
dpm_table->dpm_levels[j].value,
-   &(pp_table->GfxclkLevel[i]));
+   &(pp_table->GfxclkLevel[i]),
+   &(pp_table->AcgFreqTable[i]));
if (result)
return result;
i++;
diff --git a/drivers/gpu/drm/amd/powerplay/inc/smu9_driver_if.h 
b/drivers/gpu/drm/amd/powerplay/inc/smu9_driver_if.h
index f6d6c61..2818c98 100644
--- a/drivers/gpu/drm/amd/powerplay/inc/smu9_driver_if.h
+++ b/drivers/gpu/drm/amd/powerplay/inc/smu9_driver_if.h
@@ -315,10 +315,12 @@ typedef struct {
   uint8_t  AcgEnable[NUM_GFXCLK_DPM_LEVELS];
   GbVdroopTable_t AcgBtcGbVdroopTable;
   QuadraticInt_t  AcgAvfsGb;
-  uint32_t Reserved[4];
+
+  /* ACG Frequency Table, in Mhz */
+  uint32_t AcgFreqTable[NUM_GFXCLK_DPM_LEVELS];
 
   /* Padding - ignore */
-  uint32_t MmHubPadding[7]; /* SMU internal use */
+  uint32_t MmHubPadding[3]; /* SMU internal use */
 
 } PPTable_t;
 
-- 
2.7.4
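
[Editorial note: a quick sanity check of the patch above. The struct size is
preserved: dropping Reserved[4] and shrinking MmHubPadding from 7 to 3 words
frees exactly eight uint32_t slots, so NUM_GFXCLK_DPM_LEVELS must be 8 here.
The conversion itself is sketched below, assuming gfx_clock is carried in
10 kHz units (so that dividing by 100 yields MHz, as the in-code comment
intends); the level value is purely illustrative.]

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	/* hypothetical level value: 1500 MHz expressed in 10 kHz units */
	uint32_t gfx_clock = 150000;
	uint32_t acg_freq = gfx_clock / 100;

	printf("AcgFreqTable entry: %u MHz\n", acg_freq); /* prints 1500 */
	return 0;
}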

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] drm/amdgpu: set sched_hw_submission higher for KIQ (v2)

2017-08-22 Thread Alex Deucher
KIQ doesn't really use the GPU scheduler.  The base
drivers generally use the KIQ ring directly rather than
submitting IBs.  However, amdgpu_sched_hw_submission
(which defaults to 2) limits the number of outstanding
fences to 2.  KFD uses the KIQ for TLB flushes and the
2 fence limit hurts performance when there are several KFD
processes running.

v2: move some expressions to one line
change KIQ sched_hw_submission to at least 16
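
[Editorial note: to gauge the memory cost of the bump, here is a standalone
sketch of the ring-size math from amdgpu_ring_init() below; the max_dw value
is purely illustrative, not taken from the driver.]

#include <stdio.h>

/* minimal stand-in for the kernel's roundup_pow_of_two() */
static unsigned long rup2(unsigned long v)
{
	unsigned long r = 1;

	while (r < v)
		r <<= 1;
	return r;
}

int main(void)
{
	unsigned long max_dw = 1024; /* hypothetical queue depth in dwords */

	printf("submission  2: %lu bytes\n", rup2(max_dw * 4 * 2));  /*  8192 */
	printf("submission 16: %lu bytes\n", rup2(max_dw * 4 * 16)); /* 65536 */
	return 0;
}

Even at 16, the KIQ ring only grows to tens of kilobytes, consistent with
treating the bump as cheap.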

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 16 
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
index 6c5646b..7c251ff 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
@@ -170,6 +170,16 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct 
amdgpu_ring *ring,
 unsigned irq_type)
 {
int r;
+   int sched_hw_submission = amdgpu_sched_hw_submission;
+
+   /* Set the hw submission limit higher for KIQ because
+* it's used for a number of gfx/compute tasks by both
+* KFD and KGD which may have outstanding fences and
+* it doesn't really use the gpu scheduler anyway;
+* KIQ tasks get submitted directly to the ring.
+*/
+   if (ring->funcs->type == AMDGPU_RING_TYPE_KIQ)
+   sched_hw_submission = max(sched_hw_submission, 16);
 
if (ring->adev == NULL) {
if (adev->num_rings >= AMDGPU_MAX_RINGS)
@@ -178,8 +188,7 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct 
amdgpu_ring *ring,
ring->adev = adev;
ring->idx = adev->num_rings++;
adev->rings[ring->idx] = ring;
-   r = amdgpu_fence_driver_init_ring(ring,
-   amdgpu_sched_hw_submission);
+   r = amdgpu_fence_driver_init_ring(ring, sched_hw_submission);
if (r)
return r;
}
@@ -218,8 +227,7 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct 
amdgpu_ring *ring,
return r;
}
 
-   ring->ring_size = roundup_pow_of_two(max_dw * 4 *
-amdgpu_sched_hw_submission);
+   ring->ring_size = roundup_pow_of_two(max_dw * 4 * sched_hw_submission);
 
ring->buf_mask = (ring->ring_size / 4) - 1;
ring->ptr_mask = ring->funcs->support_64bit_ptrs ?
-- 
2.5.5

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: set sched_hw_submission higher for KIQ

2017-08-22 Thread Jay Cornwall
On Tue, Aug 22, 2017, at 16:17, Felix Kuehling wrote:
> Thanks Alex!
> 
> Jay, do you think this is enough? This bumps the number of concurrent
> operations on KIQ to 4 by default.

I'm not sure what the best number is. Up to 8 KFD processes is common
(beyond that, performance drops off due to VMID availability), but I'm not
sure how often they would need to submit to KIQ concurrently. If it's
not expensive, I'd just bump it up to, say, 16.

The performance problem isn't that bad since all the KIQ requests are
serialized, but the dmesg spam is not nice. Perhaps lowering the severity
of the 'rcu slot is busy' message would address that as well?
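
[Editorial note: if lowering the message severity is the route taken, the
change would be a one-liner along these lines. This is only a sketch — the
emitting function and the exact message text are assumptions, not taken from
the driver:]

-	DRM_ERROR("rcu slot is busy\n");
+	DRM_DEBUG("rcu slot is busy\n");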

> 
> Regards,
>   Felix
> 
> 
> On 2017-08-22 04:49 PM, Alex Deucher wrote:
> > KIQ doesn't really use the GPU scheduler.  The base
> > drivers generally use the KIQ ring directly rather than
> > submitting IBs.  However, amdgpu_sched_hw_submission
> > (which defaults to 2) limits the number of outstanding
> > fences to 2.  KFD uses the KIQ for TLB flushes and the
> > 2 fence limit hurts performance when there are several KFD
> > processes running.
> >
> > Signed-off-by: Alex Deucher 
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 14 --
> >  1 file changed, 12 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> > index 6c5646b..f39b851 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> > @@ -170,6 +170,16 @@ int amdgpu_ring_init(struct amdgpu_device *adev, 
> > struct amdgpu_ring *ring,
> >  unsigned irq_type)
> >  {
> > int r;
> > +   int sched_hw_submission = amdgpu_sched_hw_submission;
> > +
> > +   /* Set the hw submission limit higher for KIQ because
> > +* it's used for a number of gfx/compute tasks by both
> > +* KFD and KGD which may have outstanding fences and
> > +* it doesn't really use the gpu scheduler anyway;
> > +* KIQ tasks get submitted directly to the ring.
> > +*/
> > +   if (ring->funcs->type == AMDGPU_RING_TYPE_KIQ)
> > +   sched_hw_submission *= 2;
> >  
> > if (ring->adev == NULL) {
> > if (adev->num_rings >= AMDGPU_MAX_RINGS)
> > @@ -179,7 +189,7 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct 
> > amdgpu_ring *ring,
> > ring->idx = adev->num_rings++;
> > adev->rings[ring->idx] = ring;
> > r = amdgpu_fence_driver_init_ring(ring,
> > -   amdgpu_sched_hw_submission);
> > + sched_hw_submission);
> > if (r)
> > return r;
> > }
> > @@ -219,7 +229,7 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct 
> > amdgpu_ring *ring,
> > }
> >  
> > ring->ring_size = roundup_pow_of_two(max_dw * 4 *
> > -amdgpu_sched_hw_submission);
> > +sched_hw_submission);
> >  
> > ring->buf_mask = (ring->ring_size / 4) - 1;
> > ring->ptr_mask = ring->funcs->support_64bit_ptrs ?
> 
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: set sched_hw_submission higher for KIQ

2017-08-22 Thread Xie, AlexBin
Hi Alex,


See my comments below about the 80-characters-per-line coding style.


Thanks,

Alex Bin Xie


From: amd-gfx  on behalf of Alex Deucher 

Sent: Tuesday, August 22, 2017 4:49 PM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander
Subject: [PATCH] drm/amdgpu: set sched_hw_submission higher for KIQ

KIQ doesn't really use the GPU scheduler.  The base
drivers generally use the KIQ ring directly rather than
submitting IBs.  However, amdgpu_sched_hw_submission
(which defaults to 2) limits the number of outstanding
fences to 2.  KFD uses the KIQ for TLB flushes and the
2 fence limit hurts performance when there are several KFD
processes running.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
index 6c5646b..f39b851 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
@@ -170,6 +170,16 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct 
amdgpu_ring *ring,
  unsigned irq_type)
 {
 int r;
+   int sched_hw_submission = amdgpu_sched_hw_submission;
+
+   /* Set the hw submission limit higher for KIQ because
+* it's used for a number of gfx/compute tasks by both
+* KFD and KGD which may have outstanding fences and
+* it doesn't really use the gpu scheduler anyway;
+* KIQ tasks get submitted directly to the ring.
+*/
+   if (ring->funcs->type == AMDGPU_RING_TYPE_KIQ)
+   sched_hw_submission *= 2;

 if (ring->adev == NULL) {
 if (adev->num_rings >= AMDGPU_MAX_RINGS)
@@ -179,7 +189,7 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct 
amdgpu_ring *ring,
 ring->idx = adev->num_rings++;
 adev->rings[ring->idx] = ring;
 r = amdgpu_fence_driver_init_ring(ring,
-   amdgpu_sched_hw_submission);
+ sched_hw_submission);

Alex Bin Xie: With the shorter variable name, there is no need to wrap onto a
new line of code.

 if (r)
 return r;
 }
@@ -219,7 +229,7 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct 
amdgpu_ring *ring,
 }

 ring->ring_size = roundup_pow_of_two(max_dw * 4 *
-amdgpu_sched_hw_submission);
+sched_hw_submission);

Alex Bin Xie: With the shorter variable name, there is no need to wrap onto a
new line of code.


 ring->buf_mask = (ring->ring_size / 4) - 1;
 ring->ptr_mask = ring->funcs->support_64bit_ptrs ?
--
2.5.5

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: set sched_hw_submission higher for KIQ

2017-08-22 Thread Felix Kuehling
Thanks Alex!

Jay, do you think this is enough? This bumps the number of concurrent
operations on KIQ to 4 by default.

Regards,
  Felix


On 2017-08-22 04:49 PM, Alex Deucher wrote:
> KIQ doesn't really use the GPU scheduler.  The base
> drivers generally use the KIQ ring directly rather than
> submitting IBs.  However, amdgpu_sched_hw_submission
> (which defaults to 2) limits the number of outstanding
> fences to 2.  KFD uses the KIQ for TLB flushes and the
> 2 fence limit hurts performance when there are several KFD
> processes running.
>
> Signed-off-by: Alex Deucher 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 14 --
>  1 file changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> index 6c5646b..f39b851 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
> @@ -170,6 +170,16 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct 
> amdgpu_ring *ring,
>unsigned irq_type)
>  {
>   int r;
> + int sched_hw_submission = amdgpu_sched_hw_submission;
> +
> + /* Set the hw submission limit higher for KIQ because
> +  * it's used for a number of gfx/compute tasks by both
> +  * KFD and KGD which may have outstanding fences and
> +  * it doesn't really use the gpu scheduler anyway;
> +  * KIQ tasks get submitted directly to the ring.
> +  */
> + if (ring->funcs->type == AMDGPU_RING_TYPE_KIQ)
> + sched_hw_submission *= 2;
>  
>   if (ring->adev == NULL) {
>   if (adev->num_rings >= AMDGPU_MAX_RINGS)
> @@ -179,7 +189,7 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct 
> amdgpu_ring *ring,
>   ring->idx = adev->num_rings++;
>   adev->rings[ring->idx] = ring;
>   r = amdgpu_fence_driver_init_ring(ring,
> - amdgpu_sched_hw_submission);
> +   sched_hw_submission);
>   if (r)
>   return r;
>   }
> @@ -219,7 +229,7 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct 
> amdgpu_ring *ring,
>   }
>  
>   ring->ring_size = roundup_pow_of_two(max_dw * 4 *
> -  amdgpu_sched_hw_submission);
> +  sched_hw_submission);
>  
>   ring->buf_mask = (ring->ring_size / 4) - 1;
>   ring->ptr_mask = ring->funcs->support_64bit_ptrs ?

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] drm/amdgpu: set sched_hw_submission higher for KIQ

2017-08-22 Thread Alex Deucher
KIQ doesn't really use the GPU scheduler.  The base
drivers generally use the KIQ ring directly rather than
submitting IBs.  However, amdgpu_sched_hw_submission
(which defaults to 2) limits the number of outstanding
fences to 2.  KFD uses the KIQ for TLB flushes and the
2 fence limit hurts performance when there are several KFD
processes running.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
index 6c5646b..f39b851 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
@@ -170,6 +170,16 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct 
amdgpu_ring *ring,
 unsigned irq_type)
 {
int r;
+   int sched_hw_submission = amdgpu_sched_hw_submission;
+
+   /* Set the hw submission limit higher for KIQ because
+* it's used for a number of gfx/compute tasks by both
+* KFD and KGD which may have outstanding fences and
+* it doesn't really use the gpu scheduler anyway;
+* KIQ tasks get submitted directly to the ring.
+*/
+   if (ring->funcs->type == AMDGPU_RING_TYPE_KIQ)
+   sched_hw_submission *= 2;
 
if (ring->adev == NULL) {
if (adev->num_rings >= AMDGPU_MAX_RINGS)
@@ -179,7 +189,7 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct 
amdgpu_ring *ring,
ring->idx = adev->num_rings++;
adev->rings[ring->idx] = ring;
r = amdgpu_fence_driver_init_ring(ring,
-   amdgpu_sched_hw_submission);
+ sched_hw_submission);
if (r)
return r;
}
@@ -219,7 +229,7 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct 
amdgpu_ring *ring,
}
 
ring->ring_size = roundup_pow_of_two(max_dw * 4 *
-amdgpu_sched_hw_submission);
+sched_hw_submission);
 
ring->buf_mask = (ring->ring_size / 4) - 1;
ring->ptr_mask = ring->funcs->support_64bit_ptrs ?
-- 
2.5.5

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH 2/2] drm/amdgpu/ci: tidy up some limit checks

2017-08-22 Thread Dan Carpenter
This is partly to silence a static checker warning:

drivers/gpu/drm/amd/amdgpu/ci_dpm.c:4553 ci_set_mc_special_registers()
error: buffer overflow 'table->mc_reg_address' 16 <= 16

The story is that it's valid to exit the loop with "j" less than or equal
to SMU7_DISCRETE_MC_REGISTER_ARRAY_SIZE.  That value for "j" gets saved
as table->last.  We're not allowed to access "table->mc_reg_address[j]"
when j == SMU7_DISCRETE_MC_REGISTER_ARRAY_SIZE because that's one past
the end of the array.

The tests for "if (j > SMU7_DISCRETE_MC_REGISTER_ARRAY_SIZE)" in
ci_set_mc_special_registers() are always false because we start with
j less than the array size and we increment it once so at most it can be
equal.

Then there is a bogus check in ci_register_patching_mc_seq() where
it complains if "table->last >= SMU7_DISCRETE_MC_REGISTER_ARRAY_SIZE".
The "table->last" value can't possibly be greater than the array size
and it is allowed to be equal to it.  So let's just remove that test
entirely.
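
[Editorial note: to make the off-by-one concrete, here is a minimal
standalone sketch of the loop-exit invariant described above, with the array
size shrunk to 4 for brevity:]

#include <stdio.h>

#define ARRAY_SIZE 4 /* stands in for SMU7_DISCRETE_MC_REGISTER_ARRAY_SIZE */

int main(void)
{
	int j;

	/* j starts below ARRAY_SIZE and only ever increments, so on exit
	 * j <= ARRAY_SIZE: the check "j > ARRAY_SIZE" can never be true,
	 * yet array[j] is already out of bounds once j == ARRAY_SIZE.
	 */
	for (j = 0; j < ARRAY_SIZE; j++)
		;
	printf("j on exit: %d (== ARRAY_SIZE, one past the last index)\n", j);
	return 0;
}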

Signed-off-by: Dan Carpenter 
---
Not tested.  Please review this one carefully.

diff --git a/drivers/gpu/drm/amd/amdgpu/ci_dpm.c 
b/drivers/gpu/drm/amd/amdgpu/ci_dpm.c
index cb508a211b2f..c885388bdc3c 100644
--- a/drivers/gpu/drm/amd/amdgpu/ci_dpm.c
+++ b/drivers/gpu/drm/amd/amdgpu/ci_dpm.c
@@ -4546,10 +4546,10 @@ static int ci_set_mc_special_registers(struct 
amdgpu_device *adev,
table->mc_reg_table_entry[k].mc_data[j] 
|= 0x100;
}
j++;
-   if (j > SMU7_DISCRETE_MC_REGISTER_ARRAY_SIZE)
-   return -EINVAL;
 
if (adev->mc.vram_type != AMDGPU_VRAM_TYPE_GDDR5) {
+   if (j >= SMU7_DISCRETE_MC_REGISTER_ARRAY_SIZE)
+   return -EINVAL;
table->mc_reg_address[j].s1 = mmMC_PMG_AUTO_CMD;
table->mc_reg_address[j].s0 = mmMC_PMG_AUTO_CMD;
for (k = 0; k < table->num_entries; k++) {
@@ -4725,8 +4725,6 @@ static int ci_register_patching_mc_seq(struct 
amdgpu_device *adev,
((adev->pdev->device == 0x67B0) ||
 (adev->pdev->device == 0x67B1))) {
for (i = 0; i < table->last; i++) {
-   if (table->last >= SMU7_DISCRETE_MC_REGISTER_ARRAY_SIZE)
-   return -EINVAL;
switch (table->mc_reg_address[i].s1) {
case mmMC_SEQ_MISC1:
for (k = 0; k < table->num_entries; k++) {
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH 1/2] drm/radeon/ci: tidy up some limit checks

2017-08-22 Thread Dan Carpenter
This is partly to silence a static checker warning:

drivers/gpu/drm/amd/amdgpu/ci_dpm.c:4553 ci_set_mc_special_registers()
error: buffer overflow 'table->mc_reg_address' 16 <= 16

The story is that it's valid to exit the loop with "j" less than or equal
to SMU7_DISCRETE_MC_REGISTER_ARRAY_SIZE.  That value for "j" gets saved
as table->last.  We're not allowed to access "table->mc_reg_address[j]"
when j == SMU7_DISCRETE_MC_REGISTER_ARRAY_SIZE because that's one past
the end of the array.

The tests for "if (j > SMU7_DISCRETE_MC_REGISTER_ARRAY_SIZE)" in
ci_set_mc_special_registers() are always false because we start with
j less than the array size and we increment it once so at most it can be
equal.

Then there is a bogus check in ci_register_patching_mc_seq() where
it complains if "table->last >= SMU7_DISCRETE_MC_REGISTER_ARRAY_SIZE".
The "table->last" value can't possibly be greater than the array size
and it is allowed to be equal to it.  So let's just remove that test
entirely.

Signed-off-by: Dan Carpenter 
---
Not tested.  Please review this one carefully.

diff --git a/drivers/gpu/drm/radeon/ci_dpm.c b/drivers/gpu/drm/radeon/ci_dpm.c
index c97fbb2ab48b..d3136e020d5a 100644
--- a/drivers/gpu/drm/radeon/ci_dpm.c
+++ b/drivers/gpu/drm/radeon/ci_dpm.c
@@ -4342,10 +4342,10 @@ static int ci_set_mc_special_registers(struct 
radeon_device *rdev,
table->mc_reg_table_entry[k].mc_data[j] 
|= 0x100;
}
j++;
-   if (j > SMU7_DISCRETE_MC_REGISTER_ARRAY_SIZE)
-   return -EINVAL;
 
if (!pi->mem_gddr5) {
+   if (j >= SMU7_DISCRETE_MC_REGISTER_ARRAY_SIZE)
+   return -EINVAL;
table->mc_reg_address[j].s1 = MC_PMG_AUTO_CMD 
>> 2;
table->mc_reg_address[j].s0 = MC_PMG_AUTO_CMD 
>> 2;
for (k = 0; k < table->num_entries; k++) {
@@ -4521,8 +4521,6 @@ static int ci_register_patching_mc_seq(struct 
radeon_device *rdev,
((rdev->pdev->device == 0x67B0) ||
 (rdev->pdev->device == 0x67B1))) {
for (i = 0; i < table->last; i++) {
-   if (table->last >= SMU7_DISCRETE_MC_REGISTER_ARRAY_SIZE)
-   return -EINVAL;
switch(table->mc_reg_address[i].s1 >> 2) {
case MC_SEQ_MISC1:
for (k = 0; k < table->num_entries; k++) {
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 2/2] drm/amdgpu: move default gart size setting into gmc modules

2017-08-22 Thread Christian König
Reviewed-by: Christian König  for the series 
as well.


Am 22.08.2017 um 20:17 schrieb Felix Kuehling:

The series is Reviewed-by: Felix Kuehling 


On 2017-08-22 01:09 PM, Alex Deucher wrote:

Move the asic specific code into the IP modules.

Signed-off-by: Alex Deucher 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c | 52 
  drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h |  1 -
  drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c| 19 +++-
  drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c| 22 +-
  drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c| 21 -
  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c| 16 +-
  6 files changed, 74 insertions(+), 57 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
index 2027eb0..f437008 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
@@ -57,58 +57,6 @@
   */
  
  /**

- * amdgpu_gart_set_defaults - set the default gart_size
- *
- * @adev: amdgpu_device pointer
- *
- * Set the default gart_size based on parameters and available VRAM.
- */
-void amdgpu_gart_set_defaults(struct amdgpu_device *adev)
-{
-   u64 gart_size;
-
-   if (amdgpu_gart_size == -1) {
-   switch (adev->asic_type) {
-#ifdef CONFIG_DRM_AMDGPU_SI
-   case CHIP_HAINAN:/* no MM engines */
-#endif
-   case CHIP_TOPAZ: /* no MM engines */
-   case CHIP_POLARIS11: /* all engines support GPUVM */
-   case CHIP_POLARIS10: /* all engines support GPUVM */
-   case CHIP_POLARIS12: /* all engines support GPUVM */
-   case CHIP_VEGA10:/* all engines support GPUVM */
-   default:
-   gart_size = 256;
-   break;
-#ifdef CONFIG_DRM_AMDGPU_SI
-   case CHIP_VERDE:/* UVD, VCE do not support GPUVM */
-   case CHIP_TAHITI:   /* UVD, VCE do not support GPUVM */
-   case CHIP_PITCAIRN: /* UVD, VCE do not support GPUVM */
-   case CHIP_OLAND:/* UVD, VCE do not support GPUVM */
-#endif
-#ifdef CONFIG_DRM_AMDGPU_CIK
-   case CHIP_BONAIRE: /* UVD, VCE do not support GPUVM */
-   case CHIP_HAWAII:  /* UVD, VCE do not support GPUVM */
-   case CHIP_KAVERI:  /* UVD, VCE do not support GPUVM */
-   case CHIP_KABINI:  /* UVD, VCE do not support GPUVM */
-   case CHIP_MULLINS: /* UVD, VCE do not support GPUVM */
-#endif
-   case CHIP_TONGA:   /* UVD, VCE do not support GPUVM */
-   case CHIP_FIJI:/* UVD, VCE do not support GPUVM */
-   case CHIP_CARRIZO: /* UVD, VCE do not support GPUVM, DCE SG 
support */
-   case CHIP_STONEY:  /* UVD does not support GPUVM, DCE SG 
support */
-   case CHIP_RAVEN:   /* DCE SG support */
-   gart_size = 1024;
-   break;
-   }
-   } else {
-   gart_size = amdgpu_gart_size;
-   }
-
-   adev->mc.gart_size = gart_size << 20;
-}
-
-/**
   * amdgpu_gart_table_ram_alloc - allocate system ram for gart page table
   *
   * @adev: amdgpu_device pointer
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
index d4cce69..afbe803 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
@@ -56,7 +56,6 @@ struct amdgpu_gart {
const struct amdgpu_gart_funcs *gart_funcs;
  };
  
-void amdgpu_gart_set_defaults(struct amdgpu_device *adev);

  int amdgpu_gart_table_ram_alloc(struct amdgpu_device *adev);
  void amdgpu_gart_table_ram_free(struct amdgpu_device *adev);
  int amdgpu_gart_table_vram_alloc(struct amdgpu_device *adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
index 6e68579..58fe8cc 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
@@ -332,7 +332,24 @@ static int gmc_v6_0_mc_init(struct amdgpu_device *adev)
adev->mc.real_vram_size = RREG32(mmCONFIG_MEMSIZE) * 1024ULL * 1024ULL;
adev->mc.visible_vram_size = adev->mc.aper_size;
  
-	amdgpu_gart_set_defaults(adev);

+   /* set the gart size */
+   if (amdgpu_gart_size == -1) {
+   switch (adev->asic_type) {
+   case CHIP_HAINAN:/* no MM engines */
+   default:
+   adev->mc.gart_size = 256ULL << 20;
+   break;
+   case CHIP_VERDE:/* UVD, VCE do not support GPUVM */
+   case CHIP_TAHITI:   /* UVD, VCE do not support GPUVM */
+   case CHIP_PITCAIRN: /* UVD, VCE do not support GPUVM */
+   case CHIP_OLAND:/* UVD, VCE do not support GPUVM */
+   adev->mc.gart_size = 1024ULL << 20;

Re: [PATCH 22/24] drm/amdkfd: Adding new IOCTL for scratch memory v2

2017-08-22 Thread Felix Kuehling

On 2017-08-22 04:52 AM, Christian König wrote:
>> BTW, on Vega10 the way scratch works has been changed, and this ioctl
>> becomes basically a no-op. The SH_HIDDEN_PRIVATE_BASE_VMID register no
>> longer exists. Instead the base address is now provided to the shader
>> directly without being programmed ahead of time by KFD or HWS. Setting
>> up scratch memory per queue is entirely up to the Runtime now.
>
> So essentially this is actually more a workaround to set a privileged
> register by the hardware scheduler when it switches between tasks.

Sort of. In case we run without HWS it's programmed by the driver once
(because we assign VMIDs statically in that case).

>
> Do we have more cases like this? In other words would it make sense to
> generalize this interface to something like "Please HWS when you
> switch to one of my tasks can you set register X to value Y"?

In general, if you're looking for process-specific information that's
programmed by the HWS, the MAP_PROCESS PM4 packet definition is a good
place to look. Besides the sh_hidden_private_base_vmid, it includes
information about virtual memory address apertures. Part of this is
handled by the set_memory_policy ioctl that's already upstream. More
will be added by a dGPU-specific ioctl:

struct kfd_ioctl_set_process_dgpu_aperture_args {
uint64_t dgpu_base;
uint64_t dgpu_limit;
uint32_t gpu_id;
uint32_t pad;
};

The only other similar thing that comes to mind is the secondary trap
handler address. But that's not programmed in a register, but in a
memory buffer associated with the first level trap handler.
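
[Editorial note: a generalized interface along the lines Christian floats
might look roughly like this — purely hypothetical, neither this struct nor
a matching ioctl exists:]

struct kfd_ioctl_set_process_register_args {
	uint64_t reg_offset; /* register X, applied by HWS on process switch */
	uint64_t reg_value;  /* value Y */
	uint32_t gpu_id;
	uint32_t pad;
};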

Regards,
  Felix

>
> Regards,
> Christian. 

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 2/2] drm/amdgpu: move default gart size setting into gmc modules

2017-08-22 Thread Felix Kuehling
The series is Reviewed-by: Felix Kuehling 


On 2017-08-22 01:09 PM, Alex Deucher wrote:
> Move the asic specific code into the IP modules.
>
> Signed-off-by: Alex Deucher 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c | 52 
> 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h |  1 -
>  drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c| 19 +++-
>  drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c| 22 +-
>  drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c| 21 -
>  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c| 16 +-
>  6 files changed, 74 insertions(+), 57 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> index 2027eb0..f437008 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
> @@ -57,58 +57,6 @@
>   */
>  
>  /**
> - * amdgpu_gart_set_defaults - set the default gart_size
> - *
> - * @adev: amdgpu_device pointer
> - *
> - * Set the default gart_size based on parameters and available VRAM.
> - */
> -void amdgpu_gart_set_defaults(struct amdgpu_device *adev)
> -{
> - u64 gart_size;
> -
> - if (amdgpu_gart_size == -1) {
> - switch (adev->asic_type) {
> -#ifdef CONFIG_DRM_AMDGPU_SI
> - case CHIP_HAINAN:/* no MM engines */
> -#endif
> - case CHIP_TOPAZ: /* no MM engines */
> - case CHIP_POLARIS11: /* all engines support GPUVM */
> - case CHIP_POLARIS10: /* all engines support GPUVM */
> - case CHIP_POLARIS12: /* all engines support GPUVM */
> - case CHIP_VEGA10:/* all engines support GPUVM */
> - default:
> - gart_size = 256;
> - break;
> -#ifdef CONFIG_DRM_AMDGPU_SI
> - case CHIP_VERDE:/* UVD, VCE do not support GPUVM */
> - case CHIP_TAHITI:   /* UVD, VCE do not support GPUVM */
> - case CHIP_PITCAIRN: /* UVD, VCE do not support GPUVM */
> - case CHIP_OLAND:/* UVD, VCE do not support GPUVM */
> -#endif
> -#ifdef CONFIG_DRM_AMDGPU_CIK
> - case CHIP_BONAIRE: /* UVD, VCE do not support GPUVM */
> - case CHIP_HAWAII:  /* UVD, VCE do not support GPUVM */
> - case CHIP_KAVERI:  /* UVD, VCE do not support GPUVM */
> - case CHIP_KABINI:  /* UVD, VCE do not support GPUVM */
> - case CHIP_MULLINS: /* UVD, VCE do not support GPUVM */
> -#endif
> - case CHIP_TONGA:   /* UVD, VCE do not support GPUVM */
> - case CHIP_FIJI:/* UVD, VCE do not support GPUVM */
> - case CHIP_CARRIZO: /* UVD, VCE do not support GPUVM, DCE SG 
> support */
> - case CHIP_STONEY:  /* UVD does not support GPUVM, DCE SG 
> support */
> - case CHIP_RAVEN:   /* DCE SG support */
> - gart_size = 1024;
> - break;
> - }
> - } else {
> - gart_size = amdgpu_gart_size;
> - }
> -
> - adev->mc.gart_size = gart_size << 20;
> -}
> -
> -/**
>   * amdgpu_gart_table_ram_alloc - allocate system ram for gart page table
>   *
>   * @adev: amdgpu_device pointer
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
> index d4cce69..afbe803 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
> @@ -56,7 +56,6 @@ struct amdgpu_gart {
>   const struct amdgpu_gart_funcs *gart_funcs;
>  };
>  
> -void amdgpu_gart_set_defaults(struct amdgpu_device *adev);
>  int amdgpu_gart_table_ram_alloc(struct amdgpu_device *adev);
>  void amdgpu_gart_table_ram_free(struct amdgpu_device *adev);
>  int amdgpu_gart_table_vram_alloc(struct amdgpu_device *adev);
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
> index 6e68579..58fe8cc 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
> @@ -332,7 +332,24 @@ static int gmc_v6_0_mc_init(struct amdgpu_device *adev)
>   adev->mc.real_vram_size = RREG32(mmCONFIG_MEMSIZE) * 1024ULL * 1024ULL;
>   adev->mc.visible_vram_size = adev->mc.aper_size;
>  
> - amdgpu_gart_set_defaults(adev);
> + /* set the gart size */
> + if (amdgpu_gart_size == -1) {
> + switch (adev->asic_type) {
> + case CHIP_HAINAN:/* no MM engines */
> + default:
> + adev->mc.gart_size = 256ULL << 20;
> + break;
> + case CHIP_VERDE:/* UVD, VCE do not support GPUVM */
> + case CHIP_TAHITI:   /* UVD, VCE do not support GPUVM */
> + case CHIP_PITCAIRN: /* UVD, VCE do not support GPUVM */
> + case CHIP_OLAND:/* UVD, VCE do not support GPUVM */
> + adev->mc.gart_size = 1024ULL << 20;
> +

Re: [BUG] X broken on Carrizo when GFX_PG enabled

2017-08-22 Thread Tom St Denis

On 22/08/17 12:20 PM, Johannes Hirte wrote:

Because nobody reacted to the bug report, I'm trying this way.

As mentioned in https://bugzilla.kernel.org/show_bug.cgi?id=196337, my
system becomes unusable with GFX_PG enabled, because X doesn't start anymore.


Hi Johannes,

Sorry nobody replied.  On my A12-9800 devel machine it works fine (and 
compute rings finally work with KIQ sorted), but there are patches 
out from the KFD team wanting to disable it again as well.


You can isolate this by loading amdgpu with pg_mask=0xFFFE, which 
should disable GFX_PG.  If that solves your issue on an upstream kernel 
(or the tip of amd-staging-4.12), then GFX PG is likely the issue (or tied 
to it).


Cheers,
Tom
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH 2/2] drm/amdgpu: move default gart size setting into gmc modules

2017-08-22 Thread Alex Deucher
Move the asic specific code into the IP modules.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c | 52 
 drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h |  1 -
 drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c| 19 +++-
 drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c| 22 +-
 drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c| 21 -
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c| 16 +-
 6 files changed, 74 insertions(+), 57 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
index 2027eb0..f437008 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
@@ -57,58 +57,6 @@
  */
 
 /**
- * amdgpu_gart_set_defaults - set the default gart_size
- *
- * @adev: amdgpu_device pointer
- *
- * Set the default gart_size based on parameters and available VRAM.
- */
-void amdgpu_gart_set_defaults(struct amdgpu_device *adev)
-{
-   u64 gart_size;
-
-   if (amdgpu_gart_size == -1) {
-   switch (adev->asic_type) {
-#ifdef CONFIG_DRM_AMDGPU_SI
-   case CHIP_HAINAN:/* no MM engines */
-#endif
-   case CHIP_TOPAZ: /* no MM engines */
-   case CHIP_POLARIS11: /* all engines support GPUVM */
-   case CHIP_POLARIS10: /* all engines support GPUVM */
-   case CHIP_POLARIS12: /* all engines support GPUVM */
-   case CHIP_VEGA10:/* all engines support GPUVM */
-   default:
-   gart_size = 256;
-   break;
-#ifdef CONFIG_DRM_AMDGPU_SI
-   case CHIP_VERDE:/* UVD, VCE do not support GPUVM */
-   case CHIP_TAHITI:   /* UVD, VCE do not support GPUVM */
-   case CHIP_PITCAIRN: /* UVD, VCE do not support GPUVM */
-   case CHIP_OLAND:/* UVD, VCE do not support GPUVM */
-#endif
-#ifdef CONFIG_DRM_AMDGPU_CIK
-   case CHIP_BONAIRE: /* UVD, VCE do not support GPUVM */
-   case CHIP_HAWAII:  /* UVD, VCE do not support GPUVM */
-   case CHIP_KAVERI:  /* UVD, VCE do not support GPUVM */
-   case CHIP_KABINI:  /* UVD, VCE do not support GPUVM */
-   case CHIP_MULLINS: /* UVD, VCE do not support GPUVM */
-#endif
-   case CHIP_TONGA:   /* UVD, VCE do not support GPUVM */
-   case CHIP_FIJI:/* UVD, VCE do not support GPUVM */
-   case CHIP_CARRIZO: /* UVD, VCE do not support GPUVM, DCE SG 
support */
-   case CHIP_STONEY:  /* UVD does not support GPUVM, DCE SG 
support */
-   case CHIP_RAVEN:   /* DCE SG support */
-   gart_size = 1024;
-   break;
-   }
-   } else {
-   gart_size = amdgpu_gart_size;
-   }
-
-   adev->mc.gart_size = gart_size << 20;
-}
-
-/**
  * amdgpu_gart_table_ram_alloc - allocate system ram for gart page table
  *
  * @adev: amdgpu_device pointer
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
index d4cce69..afbe803 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
@@ -56,7 +56,6 @@ struct amdgpu_gart {
const struct amdgpu_gart_funcs *gart_funcs;
 };
 
-void amdgpu_gart_set_defaults(struct amdgpu_device *adev);
 int amdgpu_gart_table_ram_alloc(struct amdgpu_device *adev);
 void amdgpu_gart_table_ram_free(struct amdgpu_device *adev);
 int amdgpu_gart_table_vram_alloc(struct amdgpu_device *adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
index 6e68579..58fe8cc 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
@@ -332,7 +332,24 @@ static int gmc_v6_0_mc_init(struct amdgpu_device *adev)
adev->mc.real_vram_size = RREG32(mmCONFIG_MEMSIZE) * 1024ULL * 1024ULL;
adev->mc.visible_vram_size = adev->mc.aper_size;
 
-   amdgpu_gart_set_defaults(adev);
+   /* set the gart size */
+   if (amdgpu_gart_size == -1) {
+   switch (adev->asic_type) {
+   case CHIP_HAINAN:/* no MM engines */
+   default:
+   adev->mc.gart_size = 256ULL << 20;
+   break;
+   case CHIP_VERDE:/* UVD, VCE do not support GPUVM */
+   case CHIP_TAHITI:   /* UVD, VCE do not support GPUVM */
+   case CHIP_PITCAIRN: /* UVD, VCE do not support GPUVM */
+   case CHIP_OLAND:/* UVD, VCE do not support GPUVM */
+   adev->mc.gart_size = 1024ULL << 20;
+   break;
+   }
+   } else {
+   adev->mc.gart_size = (u64)amdgpu_gart_size << 20;
+   }
+
	gmc_v6_0_vram_gtt_location(adev, &adev->mc);
 
return 0;
diff --git 

[PATCH 1/2] drm/amdgpu: refine default gart size

2017-08-22 Thread Alex Deucher
Be more explicit and add comments explaining each case.
Also s/gart/GART/ in the parameter string as per Felix'
suggestion.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c  |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c | 38 +++-
 2 files changed, 33 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 7d5b008..21116fc 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -129,7 +129,7 @@ module_param_named(vramlimit, amdgpu_vram_limit, int, 0600);
 MODULE_PARM_DESC(vis_vramlimit, "Restrict visible VRAM for testing, in 
megabytes");
 module_param_named(vis_vramlimit, amdgpu_vis_vram_limit, int, 0444);
 
-MODULE_PARM_DESC(gartsize, "Size of gart to setup in megabytes (32, 64, etc., 
-1=auto)");
+MODULE_PARM_DESC(gartsize, "Size of GART to setup in megabytes (32, 64, etc., 
-1=auto)");
 module_param_named(gartsize, amdgpu_gart_size, uint, 0600);
 
 MODULE_PARM_DESC(gttsize, "Size of the GTT domain in megabytes (-1 = auto)");
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
index b9b9f68..2027eb0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
@@ -68,13 +68,39 @@ void amdgpu_gart_set_defaults(struct amdgpu_device *adev)
u64 gart_size;
 
if (amdgpu_gart_size == -1) {
-   /* make the GART larger for chips that
-* dont' support VM for all rings
-*/
-   if (adev->asic_type <= CHIP_STONEY)
-   gart_size = 1024;
-   else
+   switch (adev->asic_type) {
+#ifdef CONFIG_DRM_AMDGPU_SI
+   case CHIP_HAINAN:/* no MM engines */
+#endif
+   case CHIP_TOPAZ: /* no MM engines */
+   case CHIP_POLARIS11: /* all engines support GPUVM */
+   case CHIP_POLARIS10: /* all engines support GPUVM */
+   case CHIP_POLARIS12: /* all engines support GPUVM */
+   case CHIP_VEGA10:/* all engines support GPUVM */
+   default:
gart_size = 256;
+   break;
+#ifdef CONFIG_DRM_AMDGPU_SI
+   case CHIP_VERDE:/* UVD, VCE do not support GPUVM */
+   case CHIP_TAHITI:   /* UVD, VCE do not support GPUVM */
+   case CHIP_PITCAIRN: /* UVD, VCE do not support GPUVM */
+   case CHIP_OLAND:/* UVD, VCE do not support GPUVM */
+#endif
+#ifdef CONFIG_DRM_AMDGPU_CIK
+   case CHIP_BONAIRE: /* UVD, VCE do not support GPUVM */
+   case CHIP_HAWAII:  /* UVD, VCE do not support GPUVM */
+   case CHIP_KAVERI:  /* UVD, VCE do not support GPUVM */
+   case CHIP_KABINI:  /* UVD, VCE do not support GPUVM */
+   case CHIP_MULLINS: /* UVD, VCE do not support GPUVM */
+#endif
+   case CHIP_TONGA:   /* UVD, VCE do not support GPUVM */
+   case CHIP_FIJI:/* UVD, VCE do not support GPUVM */
+   case CHIP_CARRIZO: /* UVD, VCE do not support GPUVM, DCE SG 
support */
+   case CHIP_STONEY:  /* UVD does not support GPUVM, DCE SG 
support */
+   case CHIP_RAVEN:   /* DCE SG support */
+   gart_size = 1024;
+   break;
+   }
} else {
gart_size = amdgpu_gart_size;
}
-- 
2.5.5

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[BUG] X broken on Carrizo when GFX_PG enabled

2017-08-22 Thread Johannes Hirte
Because nobody reacted to the bug report, I'm trying this way.

As mentioned in https://bugzilla.kernel.org/show_bug.cgi?id=196337, my
system becomes unusable with GFX_PG enabled, because X doesn't start anymore.

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH xf86-video-amdgpu 04/10] Create drmmode_crtc_wait_pending_event helper macro

2017-08-22 Thread Abramov, Slava

From: Michel Dänzer 
Sent: Tuesday, August 22, 2017 1:52 AM
To: Abramov, Slava
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH xf86-video-amdgpu 04/10] Create 
drmmode_crtc_wait_pending_event helper macro

On 18/08/17 11:51 PM, Abramov, Slava wrote:
> *From:* amd-gfx  on behalf of
>
>> diff --git a/src/drmmode_display.c b/src/drmmode_display.c
>> index 1a805b82d..bdd3866b8 100644
>> --- a/src/drmmode_display.c
>> +++ b/src/drmmode_display.c
>> @@ -96,6 +96,14 @@ AMDGPUZaphodStringMatches(ScrnInfoPtr pScrn, const
>> char *s, char *output_name)
>>  return FALSE;
>>  }
>>
>> +
>> +/* Wait for the boolean condition to be FALSE */
>> +#define drmmode_crtc_wait_pending_event(drmmode_crtc, fd, condition) \
>> +   do {} while ((condition) && \
>> +drmHandleEvent(fd,
>> _crtc->drmmode->event_context) \
>> +> 0);
>> +
>> +
>
> [slava] The comment seems a little misleading to me.  The loop condition
> has actually two parts, and it's unclear what we're actually waiting
> for.  A side question: is it expected that the variable 'condition' will
> change while we're waiting?

Yes, that's the whole point. :) While (condition) evaluates to non-zero, we
call drmHandleEvent(), which blocks until at least one DRM event
arrives and then triggers processing of the arrived DRM events.

Maybe something like this would be better?

/* If condition evaluates to TRUE, process DRM events until condition
 * evaluates to FALSE.
 */

[slava] Yes, I like that :)


Anyway, this patch just ports the same change from xf86-video-ati, so
I'm not going to modify the comment in this patch. Do you want to send a
patch modifying the comment in either driver?

[slava] I will try to do this, as a military exercise, to get used to the
process.


Slava
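
[Editorial note: for context, the helper under discussion ends up being
invoked along these lines in the rest of the series (a sketch — the
flip_pending condition is borrowed from the equivalent xf86-video-ati change
and may not match this driver exactly):]

	/* block until all pending page flips on this CRTC have completed */
	drmmode_crtc_wait_pending_event(drmmode_crtc, pAMDGPUEnt->fd,
					drmmode_crtc->flip_pending);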
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH xf86-video-amdgpu] modesetting: re-set the crtc's mode when link-status goes BAD

2017-08-22 Thread Deucher, Alexander
> -----Original Message-----
> From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
> Of Michel Dänzer
> Sent: Tuesday, August 22, 2017 5:48 AM
> To: amd-gfx@lists.freedesktop.org
> Subject: [PATCH xf86-video-amdgpu] modesetting: re-set the crtc's mode
> when link-status goes BAD
> 
> From: Martin Peres 
> 
> Despite all the careful planning of the kernel, a link may become
> insufficient to handle the currently-set mode. At this point, the
> kernel should mark this particular configuration as being broken
> and potentially prune the mode before setting the offending connector's
> link-status to BAD and send the userspace a hotplug event. This may
> happen right after a modeset or later on.
> 
> Upon receiving a hot-plug event, we iterate through the connectors to
> re-apply the currently-set mode on all the connectors that have a
> link-status property set to BAD. The kernel may be able to get the
> link to work by dropping to using a lower link bpp (with the same
> display bpp). However, the modeset may fail if the kernel has pruned
> the mode, so to make users aware of this problem a warning is output
> in the logs about a potentially black display.
> 
> This patch does not modify the current behaviour of always propagating
> the events to the randr clients. This allows desktop environments to
> re-probe the connectors and select a new resolution based on the new
> (curated) mode list if a mode disappeared. This behaviour is expected in
> order to pass the Display Port compliance tests.
> 
> (Ported from xserver commit
> bcee1b76aa0db8525b491485e90b8740763d7de6)
> 
> [ Michel: Bump libdrm dependency to >= 2.4.78 for
>   DRM_MODE_LINK_STATUS_BAD ]
> (Ported from radeon commit 0472a605e0ec8fec1892bbc3a84698b7ef9c5296)
> 
> Signed-off-by: Michel Dänzer 

Reviewed-by: Alex Deucher 

> ---
>  configure.ac  |  2 +-
>  src/drmmode_display.c | 43
> +++
>  2 files changed, 44 insertions(+), 1 deletion(-)
> 
> diff --git a/configure.ac b/configure.ac
> index 0ff2bdfaa..6d7cee476 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -70,7 +70,7 @@ XORG_DRIVER_CHECK_EXT(XV, videoproto)
>  XORG_DRIVER_CHECK_EXT(DPMSExtension, xextproto)
> 
>  # Checks for libraries.
> -PKG_CHECK_MODULES(LIBDRM, [libdrm >= 2.4.72])
> +PKG_CHECK_MODULES(LIBDRM, [libdrm >= 2.4.78])
>  PKG_CHECK_MODULES(LIBDRM_AMDGPU, [libdrm_amdgpu >= 2.4.72])
>  PKG_CHECK_MODULES(GBM, [gbm])
> 
> diff --git a/src/drmmode_display.c b/src/drmmode_display.c
> index ad3325be6..17efde8e8 100644
> --- a/src/drmmode_display.c
> +++ b/src/drmmode_display.c
> @@ -2627,6 +2627,49 @@ amdgpu_mode_hotplug(ScrnInfoPtr scrn,
> drmmode_ptr drmmode)
>   Bool changed = FALSE;
>   int num_dvi = 0, num_hdmi = 0;
> 
> + /* Try to re-set the mode on all the connectors with a BAD link-state:
> +  * This may happen if a link degrades and a new modeset is
> necessary, using
> +  * different link-training parameters. If the kernel found that the
> current
> +  * mode is not achievable anymore, it should have pruned the mode
> before
> +  * sending the hotplug event. Try to re-set the currently-set mode to
> keep
> +  * the display alive, this will fail if the mode has been pruned.
> +  * In any case, we will send randr events for the Desktop
> Environment to
> +  * deal with it, if it wants to.
> +  */
> + for (i = 0; i < config->num_output; i++) {
> + xf86OutputPtr output = config->output[i];
> + drmmode_output_private_ptr drmmode_output = output-
> >driver_private;
> + uint32_t con_id = drmmode_output->mode_output-
> >connector_id;
> + drmModeConnectorPtr koutput;
> +
> + /* Get an updated view of the properties for the current
> connector and
> +  * look for the link-status property
> +  */
> + koutput = drmModeGetConnectorCurrent(pAMDGPUEnt-
> >fd, con_id);
> + for (j = 0; koutput && j < koutput->count_props; j++) {
> + drmModePropertyPtr props;
> + props = drmModeGetProperty(pAMDGPUEnt->fd,
> koutput->props[j]);
> + if (props && props->flags &
> DRM_MODE_PROP_ENUM &&
> + !strcmp(props->name, "link-status") &&
> + koutput->prop_values[j] ==
> DRM_MODE_LINK_STATUS_BAD) {
> + xf86CrtcPtr crtc = output->crtc;
> + if (!crtc)
> + continue;
> +
> + /* the connector got a link failure, re-set the current mode */
> + drmmode_set_mode_major(crtc, &crtc->mode, crtc->rotation,
> +crtc->x, crtc->y);
> +
> + xf86DrvMsg(scrn->scrnIndex, 

Re: [PATCH xf86-video-amdgpu] modesetting: re-set the crtc's mode when link-status goes BAD

2017-08-22 Thread Harry Wentland
On 2017-08-22 05:48 AM, Michel Dänzer wrote:
> From: Martin Peres 
> 
> Despite all the careful planning of the kernel, a link may become
> insufficient to handle the currently-set mode. At this point, the
> kernel should mark this particular configuration as being broken
> and potentially prune the mode before setting the offending connector's
> link-status to BAD and send the userspace a hotplug event. This may
> happen right after a modeset or later on.
> 
> Upon receiving a hot-plug event, we iterate through the connectors to
> re-apply the currently-set mode on all the connectors that have a
> link-status property set to BAD. The kernel may be able to get the
> link to work by dropping to using a lower link bpp (with the same
> display bpp). However, the modeset may fail if the kernel has pruned
> the mode, so to make users aware of this problem a warning is output
> in the logs about a potentially black display.
> 
> This patch does not modify the current behaviour of always propagating
> the events to the randr clients. This allows desktop environments to
> re-probe the connectors and select a new resolution based on the new
> (curated) mode list if a mode disappeared. This behaviour is expected in
> order to pass the Display Port compliance tests.
> 
> (Ported from xserver commit bcee1b76aa0db8525b491485e90b8740763d7de6)
> 
> [ Michel: Bump libdrm dependency to >= 2.4.78 for
>   DRM_MODE_LINK_STATUS_BAD ]
> (Ported from radeon commit 0472a605e0ec8fec1892bbc3a84698b7ef9c5296)
> 
> Signed-off-by: Michel Dänzer 

Thanks, and
Acked-by: Harry Wentland 

I'll send an amd/display kernel patch soon to set link-status to bad.

Harry
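
[Editorial note: the kernel-side half Harry mentions boils down to flagging
the connector and re-sending a hotplug event, roughly like this — a sketch;
locking and the surrounding link-training failure path are omitted:]

	/* on link-training failure, mark the link bad and notify userspace */
	drm_mode_connector_set_link_status_property(connector,
						    DRM_MODE_LINK_STATUS_BAD);
	drm_kms_helper_hotplug_event(dev);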

> ---
>  configure.ac  |  2 +-
>  src/drmmode_display.c | 43 +++
>  2 files changed, 44 insertions(+), 1 deletion(-)
> 
> diff --git a/configure.ac b/configure.ac
> index 0ff2bdfaa..6d7cee476 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -70,7 +70,7 @@ XORG_DRIVER_CHECK_EXT(XV, videoproto)
>  XORG_DRIVER_CHECK_EXT(DPMSExtension, xextproto)
>  
>  # Checks for libraries.
> -PKG_CHECK_MODULES(LIBDRM, [libdrm >= 2.4.72])
> +PKG_CHECK_MODULES(LIBDRM, [libdrm >= 2.4.78])
>  PKG_CHECK_MODULES(LIBDRM_AMDGPU, [libdrm_amdgpu >= 2.4.72])
>  PKG_CHECK_MODULES(GBM, [gbm])
>  
> diff --git a/src/drmmode_display.c b/src/drmmode_display.c
> index ad3325be6..17efde8e8 100644
> --- a/src/drmmode_display.c
> +++ b/src/drmmode_display.c
> @@ -2627,6 +2627,49 @@ amdgpu_mode_hotplug(ScrnInfoPtr scrn, drmmode_ptr 
> drmmode)
>   Bool changed = FALSE;
>   int num_dvi = 0, num_hdmi = 0;
>  
> + /* Try to re-set the mode on all the connectors with a BAD link-state:
> +  * This may happen if a link degrades and a new modeset is necessary, 
> using
> +  * different link-training parameters. If the kernel found that the 
> current
> +  * mode is not achievable anymore, it should have pruned the mode before
> +  * sending the hotplug event. Try to re-set the currently-set mode to 
> keep
> +  * the display alive, this will fail if the mode has been pruned.
> +  * In any case, we will send randr events for the Desktop Environment to
> +  * deal with it, if it wants to.
> +  */
> + for (i = 0; i < config->num_output; i++) {
> + xf86OutputPtr output = config->output[i];
> + drmmode_output_private_ptr drmmode_output = 
> output->driver_private;
> + uint32_t con_id = drmmode_output->mode_output->connector_id;
> + drmModeConnectorPtr koutput;
> +
> + /* Get an updated view of the properties for the current 
> connector and
> +  * look for the link-status property
> +  */
> + koutput = drmModeGetConnectorCurrent(pAMDGPUEnt->fd, con_id);
> + for (j = 0; koutput && j < koutput->count_props; j++) {
> + drmModePropertyPtr props;
> + props = drmModeGetProperty(pAMDGPUEnt->fd, 
> koutput->props[j]);
> + if (props && props->flags & DRM_MODE_PROP_ENUM &&
> + !strcmp(props->name, "link-status") &&
> + koutput->prop_values[j] == 
> DRM_MODE_LINK_STATUS_BAD) {
> + xf86CrtcPtr crtc = output->crtc;
> + if (!crtc)
> + continue;
> +
> + /* the connector got a link failure, re-set the 
> current mode */
> + drmmode_set_mode_major(crtc, >mode, 
> crtc->rotation,
> +crtc->x, crtc->y);
> +
> + xf86DrvMsg(scrn->scrnIndex, X_WARNING,
> +"hotplug event: connector %u's 
> link-state is BAD, "
> +"tried resetting the current 

Re: [PATCH] drm/amd/include: Add hdmi_redriver_set to atomfirmware

2017-08-22 Thread Alex Deucher
On Tue, Aug 22, 2017 at 9:35 AM, Harry Wentland  wrote:
> We'll need this for some upcoming display changes
>
> Signed-off-by: Harry Wentland 

Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/include/atomfirmware.h | 19 ++-
>  1 file changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/include/atomfirmware.h 
> b/drivers/gpu/drm/amd/include/atomfirmware.h
> index 837296db9628..7c92f4707085 100644
> --- a/drivers/gpu/drm/amd/include/atomfirmware.h
> +++ b/drivers/gpu/drm/amd/include/atomfirmware.h
> @@ -1017,6 +1017,19 @@ struct atom_14nm_combphy_tmds_vs_set
>uint8_t margin_deemph_lane0__deemph_sel_val;
>  };
>
> +struct atom_i2c_reg_info {
> +  uint8_t ucI2cRegIndex;
> +  uint8_t ucI2cRegVal;
> +};
> +
> +struct atom_hdmi_retimer_redriver_set {
> +  uint8_t HdmiSlvAddr;
> +  uint8_t HdmiRegNum;
> +  uint8_t Hdmi6GRegNum;
> +  struct atom_i2c_reg_info HdmiRegSetting[9];//For non 6G Hz use
> +  struct atom_i2c_reg_info Hdmi6GhzRegSetting[3];//For 6G Hz use.
> +};
> +
>  struct atom_integrated_system_info_v1_11
>  {
>struct  atom_common_table_header  table_header;
> @@ -1052,7 +1065,11 @@ struct atom_integrated_system_info_v1_11
>struct atom_14nm_dpphy_dp_tuningset dp_tuningset;
>struct atom_14nm_dpphy_dp_tuningset dp_hbr3_tuningset;
>struct atom_camera_data  camera_info;
> -  uint32_t  reserved[138];
> +  struct atom_hdmi_retimer_redriver_set dp0_retimer_set;   //for DP0
> +  struct atom_hdmi_retimer_redriver_set dp1_retimer_set;   //for DP1
> +  struct atom_hdmi_retimer_redriver_set dp2_retimer_set;   //for DP2
> +  struct atom_hdmi_retimer_redriver_set dp3_retimer_set;   //for DP3
> +  uint32_t  reserved[108];
>  };
>
>
> --
> 2.11.0
>
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] drm/amd/include: Add hdmi_redriver_set to atomfirmware

2017-08-22 Thread Harry Wentland
We'll need this for some upcoming display changes

Signed-off-by: Harry Wentland 
---
 drivers/gpu/drm/amd/include/atomfirmware.h | 19 ++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/include/atomfirmware.h 
b/drivers/gpu/drm/amd/include/atomfirmware.h
index 837296db9628..7c92f4707085 100644
--- a/drivers/gpu/drm/amd/include/atomfirmware.h
+++ b/drivers/gpu/drm/amd/include/atomfirmware.h
@@ -1017,6 +1017,19 @@ struct atom_14nm_combphy_tmds_vs_set
   uint8_t margin_deemph_lane0__deemph_sel_val; 
 };
 
+struct atom_i2c_reg_info {
+  uint8_t ucI2cRegIndex;
+  uint8_t ucI2cRegVal;
+};
+
+struct atom_hdmi_retimer_redriver_set {
+  uint8_t HdmiSlvAddr;
+  uint8_t HdmiRegNum;
+  uint8_t Hdmi6GRegNum;
+  struct atom_i2c_reg_info HdmiRegSetting[9];//For non 6G Hz use
+  struct atom_i2c_reg_info Hdmi6GhzRegSetting[3];//For 6G Hz use.
+};
+
 struct atom_integrated_system_info_v1_11
 {
   struct  atom_common_table_header  table_header;
@@ -1052,7 +1065,11 @@ struct atom_integrated_system_info_v1_11
   struct atom_14nm_dpphy_dp_tuningset dp_tuningset;
   struct atom_14nm_dpphy_dp_tuningset dp_hbr3_tuningset;
   struct atom_camera_data  camera_info;
-  uint32_t  reserved[138];
+  struct atom_hdmi_retimer_redriver_set dp0_retimer_set;   //for DP0
+  struct atom_hdmi_retimer_redriver_set dp1_retimer_set;   //for DP1
+  struct atom_hdmi_retimer_redriver_set dp2_retimer_set;   //for DP2
+  struct atom_hdmi_retimer_redriver_set dp3_retimer_set;   //for DP3
+  uint32_t  reserved[108];
 };
 
 
-- 
2.11.0

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/ttm: Add helper functions to populate/map in one call

2017-08-22 Thread Christian König
Reviewed-by: Christian König  for this one as 
well as the two patches for radeon and amdgpu.


Feel free to also add my Acked-by to the nouveau patch.

Regards,
Christian.

On 22.08.2017 at 13:13, Tom St Denis wrote:

ping? Haven't seen any comments on amd-gfx or dri-devel.

Cheers,
Tom


On 18/08/17 10:07 AM, Tom St Denis wrote:

These functions replace a section of common code found
in radeon/amdgpu drivers (and possibly others) as part
of the ttm_tt_*populate() callbacks.

Signed-off-by: Tom St Denis 
---
  drivers/gpu/drm/ttm/ttm_page_alloc.c | 41 


  include/drm/ttm/ttm_page_alloc.h | 11 ++
  2 files changed, 52 insertions(+)

diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc.c 
b/drivers/gpu/drm/ttm/ttm_page_alloc.c

index 871599826773..6a660d196d87 100644
--- a/drivers/gpu/drm/ttm/ttm_page_alloc.c
+++ b/drivers/gpu/drm/ttm/ttm_page_alloc.c
@@ -920,6 +920,47 @@ void ttm_pool_unpopulate(struct ttm_tt *ttm)
  }
  EXPORT_SYMBOL(ttm_pool_unpopulate);
  +int ttm_populate_and_map_pages(struct device *dev, struct 
ttm_dma_tt *tt)

+{
+unsigned i;
+int r;
+
+r = ttm_pool_populate(&tt->ttm);
+if (r)
+return r;
+
+for (i = 0; i < tt->ttm.num_pages; i++) {
+tt->dma_address[i] = dma_map_page(dev, tt->ttm.pages[i],
+  0, PAGE_SIZE,
+  DMA_BIDIRECTIONAL);
+if (dma_mapping_error(dev, tt->dma_address[i])) {
+while (i--) {
+dma_unmap_page(dev, tt->dma_address[i],
+   PAGE_SIZE, DMA_BIDIRECTIONAL);
+tt->dma_address[i] = 0;
+}
+ttm_pool_unpopulate(&tt->ttm);
+return -EFAULT;
+}
+}
+return 0;
+}
+EXPORT_SYMBOL(ttm_populate_and_map_pages);
+
+void ttm_unmap_and_unpopulate_pages(struct device *dev, struct 
ttm_dma_tt *tt)

+{
+unsigned i;
+
+for (i = 0; i < tt->ttm.num_pages; i++) {
+if (tt->dma_address[i]) {
+dma_unmap_page(dev, tt->dma_address[i],
+   PAGE_SIZE, DMA_BIDIRECTIONAL);
+}
+}
+ttm_pool_unpopulate(&tt->ttm);
+}
+EXPORT_SYMBOL(ttm_unmap_and_unpopulate_pages);
+
  int ttm_page_alloc_debugfs(struct seq_file *m, void *data)
  {
  struct ttm_page_pool *p;
diff --git a/include/drm/ttm/ttm_page_alloc.h 
b/include/drm/ttm/ttm_page_alloc.h

index 49a828425fa2..8695918ea629 100644
--- a/include/drm/ttm/ttm_page_alloc.h
+++ b/include/drm/ttm/ttm_page_alloc.h
@@ -83,6 +83,17 @@ extern int ttm_dma_page_alloc_debugfs(struct 
seq_file *m, void *data);
  extern int ttm_dma_populate(struct ttm_dma_tt *ttm_dma, struct 
device *dev);
  extern void ttm_dma_unpopulate(struct ttm_dma_tt *ttm_dma, struct 
device *dev);

  +
+/**
+ * Populates and DMA maps pages to fulfill a ttm_dma_populate() request
+ */
+int ttm_populate_and_map_pages(struct device *dev, struct ttm_dma_tt 
*tt);

+
+/**
+ * Unpopulates and DMA unmaps pages as part of a
+ * ttm_dma_unpopulate() request */
+void ttm_unmap_and_unpopulate_pages(struct device *dev, struct 
ttm_dma_tt *tt);

+
  #else
  static inline int ttm_dma_page_alloc_init(struct ttm_mem_global *glob,
unsigned max_pages)



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/ttm: Add helper functions to populate/map in one call

2017-08-22 Thread Tom St Denis

ping?  Haven't seen any comments on amd-gfx or dri-devel.

Cheers,
Tom


On 18/08/17 10:07 AM, Tom St Denis wrote:

These functions replace a section of common code found
in radeon/amdgpu drivers (and possibly others) as part
of the ttm_tt_*populate() callbacks.
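
For illustration, a driver's populate callback then collapses to roughly
the following (a sketch modelled on the amdgpu user of these helpers, in
amdgpu_ttm.c; the userptr and swiotlb special cases are elided):

static int amdgpu_ttm_tt_populate(struct ttm_tt *ttm)
{
	struct amdgpu_device *adev = amdgpu_ttm_adev(ttm->bdev);
	struct amdgpu_ttm_tt *gtt = (void *)ttm;

	/* replaces the open-coded ttm_pool_populate() call plus the
	 * dma_map_page() loop shown below */
	return ttm_populate_and_map_pages(adev->dev, &gtt->ttm);
}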

Signed-off-by: Tom St Denis 
---
  drivers/gpu/drm/ttm/ttm_page_alloc.c | 41 
  include/drm/ttm/ttm_page_alloc.h | 11 ++
  2 files changed, 52 insertions(+)

diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc.c 
b/drivers/gpu/drm/ttm/ttm_page_alloc.c
index 871599826773..6a660d196d87 100644
--- a/drivers/gpu/drm/ttm/ttm_page_alloc.c
+++ b/drivers/gpu/drm/ttm/ttm_page_alloc.c
@@ -920,6 +920,47 @@ void ttm_pool_unpopulate(struct ttm_tt *ttm)
  }
  EXPORT_SYMBOL(ttm_pool_unpopulate);
  
+int ttm_populate_and_map_pages(struct device *dev, struct ttm_dma_tt *tt)

+{
+   unsigned i;
+   int r;
+
+   r = ttm_pool_populate(&tt->ttm);
+   if (r)
+   return r;
+
+   for (i = 0; i < tt->ttm.num_pages; i++) {
+   tt->dma_address[i] = dma_map_page(dev, tt->ttm.pages[i],
+ 0, PAGE_SIZE,
+ DMA_BIDIRECTIONAL);
+   if (dma_mapping_error(dev, tt->dma_address[i])) {
+   while (i--) {
+   dma_unmap_page(dev, tt->dma_address[i],
+  PAGE_SIZE, DMA_BIDIRECTIONAL);
+   tt->dma_address[i] = 0;
+   }
+   ttm_pool_unpopulate(&tt->ttm);
+   return -EFAULT;
+   }
+   }
+   return 0;
+}
+EXPORT_SYMBOL(ttm_populate_and_map_pages);
+
+void ttm_unmap_and_unpopulate_pages(struct device *dev, struct ttm_dma_tt *tt)
+{
+   unsigned i;
+   
+   for (i = 0; i < tt->ttm.num_pages; i++) {
+   if (tt->dma_address[i]) {
+   dma_unmap_page(dev, tt->dma_address[i],
+  PAGE_SIZE, DMA_BIDIRECTIONAL);
+   }
+   }
+   ttm_pool_unpopulate(&tt->ttm);
+}
+EXPORT_SYMBOL(ttm_unmap_and_unpopulate_pages);
+
  int ttm_page_alloc_debugfs(struct seq_file *m, void *data)
  {
struct ttm_page_pool *p;
diff --git a/include/drm/ttm/ttm_page_alloc.h b/include/drm/ttm/ttm_page_alloc.h
index 49a828425fa2..8695918ea629 100644
--- a/include/drm/ttm/ttm_page_alloc.h
+++ b/include/drm/ttm/ttm_page_alloc.h
@@ -83,6 +83,17 @@ extern int ttm_dma_page_alloc_debugfs(struct seq_file *m, 
void *data);
  extern int ttm_dma_populate(struct ttm_dma_tt *ttm_dma, struct device *dev);
  extern void ttm_dma_unpopulate(struct ttm_dma_tt *ttm_dma, struct device 
*dev);
  
+

+/**
+ * Populates and DMA maps pages to fulfill a ttm_dma_populate() request
+ */
+int ttm_populate_and_map_pages(struct device *dev, struct ttm_dma_tt *tt);
+
+/**
+ * Unpopulates and DMA unmaps pages as part of a
+ * ttm_dma_unpopulate() request */
+void ttm_unmap_and_unpopulate_pages(struct device *dev, struct ttm_dma_tt *tt);
+
  #else
  static inline int ttm_dma_page_alloc_init(struct ttm_mem_global *glob,
  unsigned max_pages)



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH xf86-video-amdgpu] modesetting: re-set the crtc's mode when link-status goes BAD

2017-08-22 Thread Michel Dänzer
From: Martin Peres 

Despite all the careful planning of the kernel, a link may become
insufficient to handle the currently-set mode. At this point, the
kernel should mark this particular configuration as being broken
and potentially prune the mode before setting the offending connector's
link-status to BAD and send the userspace a hotplug event. This may
happen right after a modeset or later on.

Upon receiving a hot-plug event, we iterate through the connectors to
re-apply the currently-set mode on all the connectors that have a
link-status property set to BAD. The kernel may be able to get the
link to work by dropping to using a lower link bpp (with the same
display bpp). However, the modeset may fail if the kernel has pruned
the mode, so to make users aware of this problem a warning is output
in the logs to warn about having a potentially-black display.

This patch does not modify the current behaviour of always propagating
the events to the randr clients. This allows desktop environments to
re-probe the connectors and select a new resolution based on the new
(curated) mode list if a mode disappeared. This behaviour is expected in
order to pass the Display Port compliance tests.

(Ported from xserver commit bcee1b76aa0db8525b491485e90b8740763d7de6)

[ Michel: Bump libdrm dependency to >= 2.4.78 for
  DRM_MODE_LINK_STATUS_BAD ]
(Ported from radeon commit 0472a605e0ec8fec1892bbc3a84698b7ef9c5296)

Signed-off-by: Michel Dänzer 
---
 configure.ac  |  2 +-
 src/drmmode_display.c | 43 +++
 2 files changed, 44 insertions(+), 1 deletion(-)

diff --git a/configure.ac b/configure.ac
index 0ff2bdfaa..6d7cee476 100644
--- a/configure.ac
+++ b/configure.ac
@@ -70,7 +70,7 @@ XORG_DRIVER_CHECK_EXT(XV, videoproto)
 XORG_DRIVER_CHECK_EXT(DPMSExtension, xextproto)
 
 # Checks for libraries.
-PKG_CHECK_MODULES(LIBDRM, [libdrm >= 2.4.72])
+PKG_CHECK_MODULES(LIBDRM, [libdrm >= 2.4.78])
 PKG_CHECK_MODULES(LIBDRM_AMDGPU, [libdrm_amdgpu >= 2.4.72])
 PKG_CHECK_MODULES(GBM, [gbm])
 
diff --git a/src/drmmode_display.c b/src/drmmode_display.c
index ad3325be6..17efde8e8 100644
--- a/src/drmmode_display.c
+++ b/src/drmmode_display.c
@@ -2627,6 +2627,49 @@ amdgpu_mode_hotplug(ScrnInfoPtr scrn, drmmode_ptr 
drmmode)
Bool changed = FALSE;
int num_dvi = 0, num_hdmi = 0;
 
+   /* Try to re-set the mode on all the connectors with a BAD link-state:
+* This may happen if a link degrades and a new modeset is necessary, using
+* different link-training parameters. If the kernel found that the current
+* mode is not achievable anymore, it should have pruned the mode before
+* sending the hotplug event. Try to re-set the currently-set mode to keep
+* the display alive, this will fail if the mode has been pruned.
+* In any case, we will send randr events for the Desktop Environment to
+* deal with it, if it wants to.
+*/
+   for (i = 0; i < config->num_output; i++) {
+   xf86OutputPtr output = config->output[i];
+   drmmode_output_private_ptr drmmode_output = output->driver_private;
+   uint32_t con_id = drmmode_output->mode_output->connector_id;
+   drmModeConnectorPtr koutput;
+
+   /* Get an updated view of the properties for the current connector and
+* look for the link-status property
+*/
+   koutput = drmModeGetConnectorCurrent(pAMDGPUEnt->fd, con_id);
+   for (j = 0; koutput && j < koutput->count_props; j++) {
+   drmModePropertyPtr props;
+   props = drmModeGetProperty(pAMDGPUEnt->fd, koutput->props[j]);
+   if (props && props->flags & DRM_MODE_PROP_ENUM &&
+   !strcmp(props->name, "link-status") &&
+   koutput->prop_values[j] == DRM_MODE_LINK_STATUS_BAD) {
+   xf86CrtcPtr crtc = output->crtc;
+   if (!crtc)
+   continue;
+
+   /* the connector got a link failure, re-set the current mode */
+   drmmode_set_mode_major(crtc, &crtc->mode, crtc->rotation,
+  crtc->x, crtc->y);
+
+   xf86DrvMsg(scrn->scrnIndex, X_WARNING,
+  "hotplug event: connector %u's 
link-state is BAD, "
+  "tried resetting the current mode. 
You may be left "
+  "with a black screen if this 
fails...\n", con_id);
+   }
+   drmModeFreeProperty(props);
+   }
+   drmModeFreeConnector(koutput);
+   }
+
	mode_res = drmModeGetResources(pAMDGPUEnt->fd);

Re: [PATCH 22/24] drm/amdkfd: Adding new IOCTL for scratch memory v2

2017-08-22 Thread Christian König

On 21.08.2017 at 22:06, Felix Kuehling wrote:

On 2017-08-21 03:21 PM, Oded Gabbay wrote:

On Mon, Aug 21, 2017 at 8:39 PM, Jerome Glisse  wrote:

On Tue, Aug 15, 2017 at 11:00:20PM -0400, Felix Kuehling wrote:

From: Moses Reuben 

v2:
* Renamed ALLOC_MEMORY_OF_SCRATCH to SET_SCRATCH_BACKING_VA
* Removed size parameter from the ioctl, it was unused
* Removed hole in ioctl number space
* No more call to write_config_static_mem
* Return correct error code from ioctl

What kind of memory is supposed to back this virtual address
range? How big is the range supposed to be? Can it be any
valid virtual address?

My worry here is to ascertain that one cannot abuse this
ioctl, say, to set the virtual address to some mmapped shared
library code/data section and write something malicious
there.

I am assuming that if it has to go through ATS/PASID of the
IOMMUv2 then the write protection will be asserted and we
will see proper COW (copy on write) due to mmap PRIVATE flags.


Ideally this area should be a special vma, and the driver
should track its lifetime and cancel GPU jobs if it is
unmapped. But I am unsure how dynamic that scratch memory
is supposed to be (i.e. do you allocate new scratch memory
with every GPU job, or is it allocated once and reused for
every job).

A bigger commit message would be nice too, as I had tons
of (I believe) valid questions.

Cheers,
Jérôme

Hi Jerome,
Great points.

This is the explanation Felix gave a few days ago on this ioctl in a
different email thread:

"Scratch memory is private memory per work-item. It's used when a shader
program has too few registers available. With HSA we use flat scratch
addressing, where shaders can access private memory in a special scratch
aperture using normal memory instructions. Using the same virtual
address, each work item gets its own private piece of memory. The
hardware does the address translation from the VA in the private
aperture to a scratch-backing VA. The application is responsible for
allocating the memory to back that scratch area, and to map it somewhere
in its virtual address space.

This ioctl tells the hardware (or HWS firmware) the VA of the scratch
backing memory."

From this explanation, I think that the user-supplied VA should be
tested to be a valid writable area of the user's address space.
How do you test for that? Could you point me to such code in the kernel?
From looking at other drivers that pin host memory, like the Mellanox NICs,
they don't do any special testing of the address they receive from
the user.

Felix,
Is the scratch memory size fixed? Because I don't see a size argument
in the IOCTL.

The hardware needs a contiguous piece of virtual address space for
scratch backing. The base address of that contiguous region is
programmed into a per-VMID register (SH_HIDDEN_PRIVATE_BASE_VMID) by the
driver or the HWS (hardware scheduler) firmware. So the size is limited
by the amount of address space reserved for this purpose by user mode.
But the HW doesn't really care what that size is. The HSA runtime
manages the sub-allocation or scratch memory per queue. It knows the
size that it allocates and will need to make sure it sub-allocates
within that size.

In the queue setup data structures it must somewhere contain information
about the offset from the start of scratch memory, and some address
swizzling parameters to determine per work item or per wavefront
addresses. I'm not familiar with the details of those data structures,
and the size limits in the hardware. They would determine an upper limit
for the size of scratch backing.

BTW, on Vega10 the way scratch works has been changed, and this ioctl
becomes basically a no-op. The SH_HIDDEN_PRIVATE_BASE_VMID register no
longer exists. Instead the base address is now provided to the shader
directly without being programmed ahead of time by KFD or HWS. Setting
up scratch memory per queue is entirely up to the Runtime now.
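
In user-space terms, the runtime side of this ioctl would look roughly
like the sketch below (argument fields as defined by this patch series;
kfd_fd, gpu_id and scratch_size are assumed to come from the surrounding
runtime code, and the kfd_ioctl.h uAPI header plus <sys/ioctl.h>,
<sys/mman.h> and <stdint.h> are needed):

	struct kfd_ioctl_set_scratch_backing_va_args args = {0};
	void *scratch;

	/* reserve and back the scratch aperture range in our own
	 * address space */
	scratch = mmap(NULL, scratch_size, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (scratch == MAP_FAILED)
		return -1;

	/* tell KFD (and through it the HWS) where scratch is backed */
	args.va_addr = (uintptr_t)scratch;
	args.gpu_id = gpu_id;
	if (ioctl(kfd_fd, AMDKFD_IOC_SET_SCRATCH_BACKING_VA, &args) == -1)
		perror("AMDKFD_IOC_SET_SCRATCH_BACKING_VA");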


So essentially this is actually more of a workaround to set a privileged
register by the hardware scheduler when it switches between tasks.


Do we have more cases like this? In other words, would it make sense to
generalize this interface to something like "Please, HWS, when you switch
to one of my tasks, can you set register X to value Y"?
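
Something like this purely hypothetical interface, with field names made
up for illustration (no such uAPI exists):

	struct kfd_ioctl_set_vmid_reg_args {
		__u64 value;		/* value Y to program */
		__u32 gpu_id;
		__u32 reg_offset;	/* privileged register X, programmed by
					 * the HWS whenever it switches to one
					 * of this process's VMIDs */
	};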


Regards,
Christian.



Regards,
   Felix


If it is fixed, what is the size?

Thanks,
Oded

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: add automatic per asic settings for gart_size

2017-08-22 Thread Christian König

On 21.08.2017 at 22:48, Felix Kuehling wrote:

On 2017-08-21 12:00 PM, Alex Deucher wrote:

We need a larger gart for asics that do not support GPUVM on all
engines (e.g., MM) to make sure we have enough space for all
gtt buffers in physical mode.  Change the default size based on
the asic type.

Signed-off-by: Alex Deucher 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu.h|  2 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  4 ++--
  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  4 ++--
  drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c   | 16 +++-
  4 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index ffb98bb..97d50cd 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -77,7 +77,7 @@
  extern int amdgpu_modeset;
  extern int amdgpu_vram_limit;
  extern int amdgpu_vis_vram_limit;
-extern unsigned amdgpu_gart_size;
+extern int amdgpu_gart_size;
  extern int amdgpu_gtt_size;
  extern int amdgpu_moverate;
  extern int amdgpu_benchmarking;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 1e66eda8..3c062a7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1063,11 +1063,11 @@ static void amdgpu_check_arguments(struct amdgpu_device 
*adev)
amdgpu_sched_jobs = roundup_pow_of_two(amdgpu_sched_jobs);
}
  
-	if (amdgpu_gart_size < 32) {

+   if (amdgpu_gart_size != -1 && amdgpu_gart_size < 32) {
/* gart size must be greater or equal to 32M */
dev_warn(adev->dev, "gart size (%d) too small\n",
 amdgpu_gart_size);
-   amdgpu_gart_size = 32;
+   amdgpu_gart_size = -1;
}
  
  	if (amdgpu_gtt_size != -1 && amdgpu_gtt_size < 32) {

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 3eafe39..3f2143d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -76,7 +76,7 @@
  
  int amdgpu_vram_limit = 0;

  int amdgpu_vis_vram_limit = 0;
-unsigned amdgpu_gart_size = 256;
+int amdgpu_gart_size = -1; /* auto */
  int amdgpu_gtt_size = -1; /* auto */
  int amdgpu_moverate = -1; /* auto */
  int amdgpu_benchmarking = 0;
@@ -129,7 +129,7 @@ module_param_named(vramlimit, amdgpu_vram_limit, int, 0600);
  MODULE_PARM_DESC(vis_vramlimit, "Restrict visible VRAM for testing, in 
megabytes");
  module_param_named(vis_vramlimit, amdgpu_vis_vram_limit, int, 0444);
  
-MODULE_PARM_DESC(gartsize, "Size of PCIE/IGP gart to setup in megabytes (32, 64, etc.)");

+MODULE_PARM_DESC(gartsize, "Size of gart to setup in megabytes (32, 64, etc., 
-1=auto)");

I would spell GART in capitals here.


  module_param_named(gartsize, amdgpu_gart_size, uint, 0600);
  
  MODULE_PARM_DESC(gttsize, "Size of the GTT domain in megabytes (-1 = auto)");

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
index 94c1e2e..b9b9f68 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
@@ -65,7 +65,21 @@
   */
  void amdgpu_gart_set_defaults(struct amdgpu_device *adev)
  {
-   adev->mc.gart_size = (uint64_t)amdgpu_gart_size << 20;
+   u64 gart_size;
+
+   if (amdgpu_gart_size == -1) {
+   /* make the GART larger for chips that
+* don't support VM for all rings
+*/
+   if (adev->asic_type <= CHIP_STONEY)

I guess that means Polaris10 and later support VM for multimedia? I
needed to go look at amd_shared.h to find that out. It would be more
obvious like this:

 if (adev->asic_type < CHIP_POLARIS10) ...

Other than that, this patch is Reviewed-by: Felix Kuehling



Yeah, the code in uvd_v6_0.c to decide what to do is:

static void uvd_v6_0_set_ring_funcs(struct amdgpu_device *adev)
{
if (adev->asic_type >= CHIP_POLARIS10) {
adev->uvd.ring.funcs = &uvd_v6_0_ring_vm_funcs;
DRM_INFO("UVD is enabled in VM mode\n");
} else {
adev->uvd.ring.funcs = &uvd_v6_0_ring_phys_funcs;
DRM_INFO("UVD is enabled in physical mode\n");
}
}


Not 100% sure about VCE, but Monk also pointed out internally that we 
should put that in a single place.


A check in amdgpu_uvd.h and one in amdgpu_vce.h sound reasonable to me.
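
For example, a single static inline shared by both call sites (the helper
name is made up; the cut-off is the one from uvd_v6_0_set_ring_funcs()
above):

/* e.g. in amdgpu_uvd.h */
static inline bool amdgpu_uvd_ring_uses_vm(struct amdgpu_device *adev)
{
	/* UVD runs in VM mode on Polaris10 and later,
	 * in physical mode on older ASICs */
	return adev->asic_type >= CHIP_POLARIS10;
}

Then uvd_v6_0_set_ring_funcs() and amdgpu_gart_set_defaults() could both
test amdgpu_uvd_ring_uses_vm(adev) instead of duplicating the asic_type
comparison.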

Christian.



Regards,
   Felix


+   gart_size = 1024;
+   else
+   gart_size = 256;
+   } else {
+   gart_size = amdgpu_gart_size;
+   }
+
+   adev->mc.gart_size = gart_size << 20;
  }
  
  /**

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx




Re: [PATCH 22/24] drm/amdkfd: Adding new IOCTL for scratch memory v2

2017-08-22 Thread Oded Gabbay
On Mon, Aug 21, 2017 at 11:11 PM, Felix Kuehling  wrote:
> On 2017-08-21 04:01 PM, Jerome Glisse wrote:
>>> From this explanation, I think that the user-supplied VA should be
>>> tested to be a valid writable area of the user's address space.
>>> How do you test for that? Could you point me to such code in the kernel?
>>> From looking at other drivers that pin host memory, like the Mellanox NICs,
>>> they don't do any special testing of the address they receive from
>>> the user.
>> GUP (get_user_pages()) will perform the proper checks on your behalf.
>> I am assuming those drivers use GUP.
>
Ah, OK. I thought you were referring to some other check that needs to
be done *before* calling get_user_pages().
As Felix said, on an APU the GPU doesn't need the memory to be pinned;
it uses ATS to access it directly, so we don't call get_user_pages() or
some other function that eventually calls it.

On dGPU I guess it would be similar to what is done in most drivers
that need to pin user memory, by calling get_user_pages().
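
For reference, the usual pinning pattern is sketched below
(get_user_pages_fast()'s signature here is the ~4.13 one and has changed
across kernel versions; user_va and npages are assumed):

	struct page **pages;
	int pinned;

	pages = kcalloc(npages, sizeof(*pages), GFP_KERNEL);
	if (!pages)
		return -ENOMEM;

	/* write=1 makes GUP check that the range is writable and break
	 * COW up front */
	pinned = get_user_pages_fast(user_va, npages, 1, pages);
	if (pinned < 0) {
		kfree(pages);
		return pinned;	/* e.g. -EFAULT on an invalid or read-only range */
	}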

Oded

> On APUs, the GPU can address memory directly through ATS. So it doesn't
> use GUP. The GPU uses the same page tables as the CPU, so uses the same
> access bits to check permissions on the fly.
>
> On dGPU, when we use VRAM for scratch backing, that doesn't use GUP. In
> theory it should be possible to use GUP (using our userptr mechanism).
> In that case GUP would check the access permissions of the underlying
> mapping.
>
> Regards,
>   Felix
>
>>
>> Jérôme
>
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx