RE: [PATCH 1/6] drm/ttm: add ttm_bo_pipeline_gutting

2018-03-07 Thread He, Roger
Patch 1,4,5: Acked-by: Roger He 
Patch 2,3,6: Reviewed-by: Roger He 

-Original Message-
From: dri-devel [mailto:dri-devel-boun...@lists.freedesktop.org] On Behalf Of 
Christian König
Sent: Tuesday, March 06, 2018 10:44 PM
To: amd-gfx@lists.freedesktop.org; dri-de...@lists.freedesktop.org
Subject: [PATCH 1/6] drm/ttm: add ttm_bo_pipeline_gutting

Allows us to gut a BO of its backing store when the driver says that it isn't 
needed any more.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/ttm/ttm_bo.c  | 15 ---
 drivers/gpu/drm/ttm/ttm_bo_util.c | 24 
 include/drm/ttm/ttm_bo_driver.h   |  9 +
 3 files changed, 45 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c index 
ad142a92eb80..98e06f8bf23b 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -622,14 +622,23 @@ static int ttm_bo_evict(struct ttm_buffer_object *bo,
 
reservation_object_assert_held(bo->resv);
 
+   placement.num_placement = 0;
+   placement.num_busy_placement = 0;
+   bdev->driver->evict_flags(bo, &placement);
+
+   if (!placement.num_placement && !placement.num_busy_placement) {
+   ret = ttm_bo_pipeline_gutting(bo);
+   if (ret)
+   return ret;
+
+   return ttm_tt_create(bo, false);
+   }
+
evict_mem = bo->mem;
evict_mem.mm_node = NULL;
evict_mem.bus.io_reserved_vm = false;
evict_mem.bus.io_reserved_count = 0;
 
-   placement.num_placement = 0;
-   placement.num_busy_placement = 0;
-   bdev->driver->evict_flags(bo, &placement);
	ret = ttm_bo_mem_space(bo, &placement, &evict_mem, ctx);
if (ret) {
if (ret != -ERESTARTSYS) {
diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c 
b/drivers/gpu/drm/ttm/ttm_bo_util.c
index 6d6a3f46143b..1f730b3f18e5 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -801,3 +801,27 @@ int ttm_bo_pipeline_move(struct ttm_buffer_object *bo,
return 0;
 }
 EXPORT_SYMBOL(ttm_bo_pipeline_move);
+
+int ttm_bo_pipeline_gutting(struct ttm_buffer_object *bo)
+{
+   struct ttm_buffer_object *ghost;
+   int ret;
+
+   ret = ttm_buffer_object_transfer(bo, &ghost);
+   if (ret)
+   return ret;
+
+   ret = reservation_object_copy_fences(ghost->resv, bo->resv);
+   /* Last resort, wait for the BO to be idle when we are OOM */
+   if (ret)
+   ttm_bo_wait(bo, false, false);
+
+   memset(&bo->mem, 0, sizeof(bo->mem));
+   bo->mem.mem_type = TTM_PL_SYSTEM;
+   bo->ttm = NULL;
+
+   ttm_bo_unreserve(ghost);
+   ttm_bo_unref(&ghost);
+
+   return 0;
+}
diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h 
index f8e2515b401f..39cd6b086d3a 100644
--- a/include/drm/ttm/ttm_bo_driver.h
+++ b/include/drm/ttm/ttm_bo_driver.h
@@ -849,6 +849,15 @@ int ttm_bo_pipeline_move(struct ttm_buffer_object *bo,
 struct dma_fence *fence, bool evict,
 struct ttm_mem_reg *new_mem);
 
+/**
+ * ttm_bo_pipeline_gutting.
+ *
+ * @bo: A pointer to a struct ttm_buffer_object.
+ *
+ * Pipelined gutting a BO of its backing store.
+ */
+int ttm_bo_pipeline_gutting(struct ttm_buffer_object *bo);
+
 /**
  * ttm_io_prot
  *
--
2.14.1
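
For context on how the new function is meant to be triggered: with this change, a
driver can signal that a BO's backing store is no longer needed by reporting zero
placements from its evict_flags callback, and ttm_bo_evict() will then gut the BO
instead of moving it. A minimal sketch of such a callback (hypothetical driver
code, not part of this series; to_my_bo() and my_system_placement are assumed
driver-private helpers):

/* Hypothetical evict_flags implementation: when the BO is marked as no
 * longer needed, report no placements at all, which after this patch
 * makes TTM call ttm_bo_pipeline_gutting() instead of moving the BO. */
static void my_evict_flags(struct ttm_buffer_object *bo,
			   struct ttm_placement *placement)
{
	struct my_bo *mbo = to_my_bo(bo);	/* driver-private wrapper */

	if (mbo->discardable) {
		/* No regular and no busy placement: request gutting. */
		placement->num_placement = 0;
		placement->num_busy_placement = 0;
		return;
	}

	/* Otherwise evict to system memory as usual. */
	*placement = my_system_placement;
}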



RE: [PATCH 4/8] drm/amd/pp: Clean up the code for RV in powerplay

2018-03-07 Thread Zhu, Rex
>Is this consistent with our other smu implementations?  If we never use the 
>errors, I'm ok with removing them, but I don't want to have to add them again 
>later if we decide we 
>actually need them.

Rex: up to vega10/raven, we do not handle the errors on Windows or Linux; we just 
print out the response code for debugging. And the smu dispatch functions always 
return true, so the check code is meaningless.


There are five responses from the smu: success / not supported / busy / no 
response / prerequisite needed.

For 'not supported', the smu simply does not recognize the command and the hw 
still works normally, so printing a message is enough to debug or notify the user.
For 'busy' or 'no response', in most cases the hw is hung, and there currently 
seems to be no way to reset or recover the smu engine.
For 'prerequisite needed', I think printing the error message is enough for 
debugging.

In the future, if we can handle the errors, we can add that handling in the 
dispatch functions.
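
A minimal sketch of what such centralized, log-only handling could look like
(hypothetical response codes and helper name; the real PPSMC result values and
register reads are not spelled out here):

/* Hypothetical response codes; the real PPSMC result values differ. */
enum smu_resp { SMU_OK, SMU_UNSUPPORTED, SMU_BUSY, SMU_NO_RESPONSE, SMU_PREREQ };

/* Log-only handling as described above: nothing is treated as fatal,
 * the response is just printed so the failure can be debugged. */
static void smu_log_response(uint16_t msg, enum smu_resp r)
{
	switch (r) {
	case SMU_OK:
		break;
	case SMU_UNSUPPORTED:
		pr_info("smu: msg 0x%x not supported by firmware\n", msg);
		break;
	case SMU_BUSY:
	case SMU_NO_RESPONSE:
		/* Most likely a hung SMU; no reset/recovery path exists yet. */
		pr_err("smu: no response for msg 0x%x, hw may be hung\n", msg);
		break;
	case SMU_PREREQ:
		pr_err("smu: msg 0x%x is missing a prerequisite\n", msg);
		break;
	}
}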

Best Regards
Rex


-Original Message-
From: Alex Deucher [mailto:alexdeuc...@gmail.com] 
Sent: Thursday, March 08, 2018 1:23 AM
To: Zhu, Rex
Cc: amd-gfx list
Subject: Re: [PATCH 4/8] drm/amd/pp: Clean up the code for RV in powerplay

On Wed, Mar 7, 2018 at 5:46 AM, Rex Zhu  wrote:
> In the send_message_to_smu helper functions, print out the error code for 
> debugging if the smu failed to respond.
>
> And the helper functions always return true, so there is no need to check 
> their return value.
>
> Signed-off-by: Rex Zhu 

Is this consistent with our other smu implementations?  If we never use the 
errors, I'm ok with removing them, but I don't want to have to add them again 
later if we decide we actually need them.

Alex

>
> Change-Id: I27da8bf22d3cea0df968df7b0809dc727461762f
> ---
>  drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.c   | 72 +
>  drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.c | 98 
> 
>  2 files changed, 53 insertions(+), 117 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.c 
> b/drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.c
> index 8ddfb78..4b5c5fc 100644
> --- a/drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.c
> +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.c
> @@ -243,8 +243,7 @@ static int rv_disable_gfx_off(struct pp_hwmgr *hwmgr)
> struct rv_hwmgr *rv_data = (struct rv_hwmgr 
> *)(hwmgr->backend);
>
> if (rv_data->gfx_off_controled_by_driver)
> -   smum_send_msg_to_smc(hwmgr,
> -   PPSMC_MSG_DisableGfxOff);
> +   smum_send_msg_to_smc(hwmgr, PPSMC_MSG_DisableGfxOff);
>
> return 0;
>  }
> @@ -259,8 +258,7 @@ static int rv_enable_gfx_off(struct pp_hwmgr *hwmgr)
> struct rv_hwmgr *rv_data = (struct rv_hwmgr 
> *)(hwmgr->backend);
>
> if (rv_data->gfx_off_controled_by_driver)
> -   smum_send_msg_to_smc(hwmgr,
> -   PPSMC_MSG_EnableGfxOff);
> +   smum_send_msg_to_smc(hwmgr, PPSMC_MSG_EnableGfxOff);
>
> return 0;
>  }
> @@ -387,24 +385,12 @@ static int rv_populate_clock_table(struct pp_hwmgr 
> *hwmgr)
> rv_get_clock_voltage_dependency_table(hwmgr, &pinfo->vdd_dep_on_phyclk,
> ARRAY_SIZE(VddPhyClk), &VddPhyClk[0]);
>
> -   PP_ASSERT_WITH_CODE(!smum_send_msg_to_smc(hwmgr,
> -   PPSMC_MSG_GetMinGfxclkFrequency),
> -   "Attempt to get min GFXCLK Failed!",
> -   return -1);
> -   PP_ASSERT_WITH_CODE(!rv_read_arg_from_smc(hwmgr,
> -   &result),
> -   "Attempt to get min GFXCLK Failed!",
> -   return -1);
> +   smum_send_msg_to_smc(hwmgr, PPSMC_MSG_GetMinGfxclkFrequency);
> +   rv_read_arg_from_smc(hwmgr, &result);
> rv_data->gfx_min_freq_limit = result * 100;
>
> -   PP_ASSERT_WITH_CODE(!smum_send_msg_to_smc(hwmgr,
> -   PPSMC_MSG_GetMaxGfxclkFrequency),
> -   "Attempt to get max GFXCLK Failed!",
> -   return -1);
> -   PP_ASSERT_WITH_CODE(!rv_read_arg_from_smc(hwmgr,
> -   &result),
> -   "Attempt to get max GFXCLK Failed!",
> -   return -1);
> +   smum_send_msg_to_smc(hwmgr, PPSMC_MSG_GetMaxGfxclkFrequency);
> +   rv_read_arg_from_smc(hwmgr, &result);
> rv_data->gfx_max_freq_limit = result * 100;
>
> return 0;
> @@ -739,14 +725,8 @@ static int rv_print_clock_levels(struct pp_hwmgr 
> *hwmgr,
>
> switch (type) {
> case PP_SCLK:
> -   PP_ASSERT_WITH_CODE(!smum_send_msg_to_smc(hwmgr,
> -   PPSMC_MSG_GetGfxclkFrequency),
> -   "Attempt to get current GFXCLK Failed!",
> -   return -1);
> -   PP_ASSERT_WITH_CODE(!rv_read_arg_from_smc(hwmgr,
> -   &result),
> -   "Attempt 

RE: deprecated register issues

2018-03-07 Thread Liu, Monk
Hi Alex

While we can avoid such VM flush failures by stitching the REQ write and the ACK 
read together; at least for the compute ring this is confirmed, and I believe it 
could also be achieved for the SDMA ring (even the UVD/VCE rings).

But @Koenig, Christian insists that stitching the REQ and ACK parts together is 
not a proper fix for the issue, just a workaround, and I cannot debate that.

What worries me more is: what if there are more registers that, as Alex said, 
behave like CC_RB_BACKEND_DISABLE? Since we don't know their names (too hard to 
filter them out!) we couldn't remove them all from the SR list. So I still think 
we need a plan B to handle the above case, a.k.a. use one package for the REQ 
and ACK job.

/Monk
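
For illustration, "one package for the REQ and ACK job" means emitting the
invalidate-request register write and the ack-register poll as one uninterruptible
packet sequence on the ring, so that no other register access (e.g. a BIF read of
CC_RB_BACKEND_DISABLE) can be scheduled between them. A rough sketch under that
assumption (ring_emit_wreg/ring_emit_reg_wait and the register names here are
hypothetical, not actual amdgpu code):

/* Hypothetical: emit the VM invalidation REQ and the ACK poll back to
 * back in the same packet stream so nothing can interleave with them. */
static void emit_vm_flush_atomic(struct my_ring *ring, u32 vmid)
{
	u32 req = make_invalidate_req(vmid);	/* assumed helper */

	ring_emit_wreg(ring, mmVM_INVALIDATE_REQ, req);
	/* Poll the ACK register until the vmid bit is set; being part of
	 * the same packet sequence is what makes this one "package". */
	ring_emit_reg_wait(ring, mmVM_INVALIDATE_ACK,
			   1u << vmid /* ref */, 1u << vmid /* mask */);
}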

From: Deucher, Alexander
Sent: March 8, 2018 10:53
To: Liu, Monk; Koenig, Christian; Mao, David
Cc: amd-gfx@lists.freedesktop.org; Jin, Jian-Rong
Subject: Re: deprecated register issues


I think there are more than just CC_RB_BACKEND_DISABLE that could cause this 
problem.  IIRC, some entire class of gfx registers could cause it, it just 
happened that this was one of the only ones we readback via mmio.  Also for the 
save and restore list, I think the RLC uses a different interface to read back 
the registers so it may not be affected the same way.



Alex


From: Liu, Monk
Sent: Wednesday, March 7, 2018 9:42:41 PM
To: Deucher, Alexander; Koenig, Christian; Mao, David
Cc: amd-gfx@lists.freedesktop.org; Jin, 
Jian-Rong
Subject: RE: deprecated register issues


Hi guys



According to Christian's finding, reading this register can make the VM hub fail 
to finish the VM flush request. E.g.: SDMA is doing a VM flush, which first writes 
data to vm_invalidate_req and then reads the result from vm_invalidate_ack, but 
the driver will forever fail to get the correct value from vm_invalidate_ack if 
BIF is reading this CC_RB_BACKEND_DISABLE register in the meantime.



Now the SR-IOV world switch may also hit similar trouble; see the 
save_restore_list below (during world_switch, RLCV will save the current VF's 
registers according to this list and restore all of them when loading this VF 
back):



uint32 register_restore[] = {

   (uint32)((0x3000 << 18) | mmPA_SC_FIFO_SIZE),   /* SC   */

   0x0001,

   (uint32)((0x3000 << 18) | mmCC_RB_BACKEND_DISABLE),   /* SC SC PER_SE  */

   0x,

   (uint32)((0x3400 << 18) | mmCC_RB_BACKEND_DISABLE),   /* SC SC PER_SE  */

   0x,

   (uint32)((0x3800 << 18) | mmCC_RB_BACKEND_DISABLE),   /* SC SC PER_SE  */

   0x,

   (uint32)((0x3c00 << 18) | mmCC_RB_BACKEND_DISABLE),   /* SC SC PER_SE  */

   0x,

   (uint32)((0x3000 << 18) | mmVGT_VTX_VECT_EJECT_REG),

   0x0001,

   (uint32)((0x3000 << 18) | mmVGT_DMA_DATA_FIFO_DEPTH),   /* IA WD  */

   0x0001,

   (uint32)((0x3000 << 18) | mmVGT_DMA_REQ_FIFO_DEPTH),   /* WD   */

   0x0001,

   (uint32)((0x3000 << 18) | mmVGT_DRAW_INIT_FIFO_DEPTH),   /* WD   */

   0x0001,

   (uint32)((0x3000 << 18) | mmVGT_CACHE_INVALIDATION),   /*  IA  */

   0x0001,

   (uint32)((0x3000 << 18) | mmVGT_RESET_DEBUG),   /*  WD  */

   0x0001,

   (uint32)((0x3000 << 18) | mmVGT_FIFO_DEPTHS),



I will run some tests against this CC_RB_BACKEND_DISABLE register to see if the 
VM flush failure issue can be avoided by removing those four registers from the 
SR list.



Thanks



/Monk



From: Deucher, Alexander
Sent: March 7, 2018 23:13
To: Koenig, Christian; Mao, David; Liu, Monk
Cc: amd-gfx@lists.freedesktop.org; Jin, Jian-Rong
Subject: Re: deprecated register issues



Right.  We ran into issues with reading back that register at runtime when UMDs 
queried it when other stuff was in flight, so we just read it once at startup 
and cache the results. Now when UMDs request it, we return the cached value.



Alex
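
In other words: read the harvest configuration once during init, while no other
register traffic is in flight, and serve later queries from the cached copy. A
minimal sketch of that pattern (the device struct, field, and helper names are
hypothetical):

/* Init time: read the RB harvest configuration once and stash it. */
static void my_gfx_cache_harvest(struct my_device *adev)
{
	adev->cached_rb_disable =
		read_reg(adev, mmCC_RB_BACKEND_DISABLE) |
		read_reg(adev, mmGC_USER_RB_BACKEND_DISABLE);
}

/* UMD info query: never touches the register again, just returns the
 * value cached at startup. */
static u32 my_gfx_query_rb_disable(struct my_device *adev)
{
	return adev->cached_rb_disable;
}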



From: Koenig, Christian
Sent: Wednesday, March 7, 2018 9:31:13 AM
To: Mao, David; Liu, Monk
Cc: Deucher, Alexander; 
amd-gfx@lists.freedesktop.org; Jin, 
Jian-Rong
Subject: Re: deprecated register issues



Hi David,

well I just figured that this is a misunderstanding.

Accessing this register and some other deprecated registers can cause problems 
when invalidating VMHUBs.

This register itself isn't deprecated, the wording in a patch fixing things is 
just a bit unclear.

Question is: is that register still accessed regularly, or is its value cached 
after startup?

Regards,
Christian.

Am 07.03.2018 um 

Re: [PATCH] drm/amdgpu: Correct the amdgpu_ucode_fini_bo place for Tonga

2018-03-07 Thread Deucher, Alexander
Reviewed-by: Alex Deucher 


From: amd-gfx  on behalf of Emily Deng 

Sent: Wednesday, March 7, 2018 9:55:02 PM
To: amd-gfx@lists.freedesktop.org
Cc: Deng, Emily
Subject: [PATCH] drm/amdgpu: Correct the amdgpu_ucode_fini_bo place for Tonga

amdgpu_ucode_fini_bo should be called after gfx_v8_0_hw_fini,
otherwise KCQ disabling fails.

For Tonga, the SMC block is finished first, and the SMC hw fini calls
amdgpu_ucode_fini, which leads to amdgpu_ucode_fini_bo being called
before gfx_v8_0_hw_fini; this is incorrect.

Signed-off-by: Emily Deng 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 3 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_powerplay.c | 3 ---
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index aa1e517..ac76983 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1457,6 +1457,9 @@ static int amdgpu_device_ip_fini(struct amdgpu_device 
*adev)
 }

 for (i = adev->num_ip_blocks - 1; i >= 0; i--) {
+   if (adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_SMC &&
+   adev->firmware.load_type == AMDGPU_FW_LOAD_SMU)
+   amdgpu_ucode_fini_bo(adev);
 if (!adev->ip_blocks[i].status.hw)
 continue;

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_powerplay.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_powerplay.c
index 5c2e2d5..825c9b9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_powerplay.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_powerplay.c
@@ -162,9 +162,6 @@ static int amdgpu_pp_hw_fini(void *handle)
 ret = adev->powerplay.ip_funcs->hw_fini(
 adev->powerplay.pp_handle);

-   if (adev->firmware.load_type == AMDGPU_FW_LOAD_SMU)
-   amdgpu_ucode_fini_bo(adev);
-
 return ret;
 }

--
2.7.4



[PATCH] drm/amdgpu: Correct the amdgpu_ucode_fini_bo place for Tonga

2018-03-07 Thread Emily Deng
amdgpu_ucode_fini_bo should be called after gfx_v8_0_hw_fini,
otherwise KCQ disabling fails.

For Tonga, the SMC block is finished first, and the SMC hw fini calls
amdgpu_ucode_fini, which leads to amdgpu_ucode_fini_bo being called
before gfx_v8_0_hw_fini; this is incorrect.

Signed-off-by: Emily Deng 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 3 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_powerplay.c | 3 ---
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index aa1e517..ac76983 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1457,6 +1457,9 @@ static int amdgpu_device_ip_fini(struct amdgpu_device 
*adev)
}
 
for (i = adev->num_ip_blocks - 1; i >= 0; i--) {
+   if (adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_SMC &&
+   adev->firmware.load_type == AMDGPU_FW_LOAD_SMU)
+   amdgpu_ucode_fini_bo(adev);
if (!adev->ip_blocks[i].status.hw)
continue;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_powerplay.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_powerplay.c
index 5c2e2d5..825c9b9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_powerplay.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_powerplay.c
@@ -162,9 +162,6 @@ static int amdgpu_pp_hw_fini(void *handle)
ret = adev->powerplay.ip_funcs->hw_fini(
adev->powerplay.pp_handle);
 
-   if (adev->firmware.load_type == AMDGPU_FW_LOAD_SMU)
-   amdgpu_ucode_fini_bo(adev);
-
return ret;
 }
 
-- 
2.7.4



Re: deprecated register issues

2018-03-07 Thread Deucher, Alexander
I think there are more than just CC_RB_BACKEND_DISABLE that could cause this 
problem.  IIRC, some entire class of gfx registers could cause it, it just 
happened that this was one of the only ones we readback via mmio.  Also for the 
save and restore list, I think the RLC uses a different interface to read back 
the registers so it may not be affected the same way.


Alex


From: Liu, Monk
Sent: Wednesday, March 7, 2018 9:42:41 PM
To: Deucher, Alexander; Koenig, Christian; Mao, David
Cc: amd-gfx@lists.freedesktop.org; Jin, Jian-Rong
Subject: RE: deprecated register issues


Hi guys



According to Christian's finding, reading this register can make the VM hub fail 
to finish the VM flush request. E.g.: SDMA is doing a VM flush, which first writes 
data to vm_invalidate_req and then reads the result from vm_invalidate_ack, but 
the driver will forever fail to get the correct value from vm_invalidate_ack if 
BIF is reading this CC_RB_BACKEND_DISABLE register in the meantime.



Now the SR-IOV world switch may also hit similar trouble; see the 
save_restore_list below (during world_switch, RLCV will save the current VF's 
registers according to this list and restore all of them when loading this VF 
back):



uint32 register_restore[] = {

   (uint32)((0x3000 << 18) | mmPA_SC_FIFO_SIZE),   /* SC   */

   0x0001,

   (uint32)((0x3000 << 18) | mmCC_RB_BACKEND_DISABLE),   /* SC SC PER_SE  */

   0x,

   (uint32)((0x3400 << 18) | mmCC_RB_BACKEND_DISABLE),   /* SC SC PER_SE  */

   0x,

   (uint32)((0x3800 << 18) | mmCC_RB_BACKEND_DISABLE),   /* SC SC PER_SE  */

   0x,

   (uint32)((0x3c00 << 18) | mmCC_RB_BACKEND_DISABLE),   /* SC SC PER_SE  */

   0x,

   (uint32)((0x3000 << 18) | mmVGT_VTX_VECT_EJECT_REG),

   0x0001,

   (uint32)((0x3000 << 18) | mmVGT_DMA_DATA_FIFO_DEPTH),   /* IA WD  */

   0x0001,

   (uint32)((0x3000 << 18) | mmVGT_DMA_REQ_FIFO_DEPTH),   /* WD   */

   0x0001,

   (uint32)((0x3000 << 18) | mmVGT_DRAW_INIT_FIFO_DEPTH),   /* WD   */

   0x0001,

   (uint32)((0x3000 << 18) | mmVGT_CACHE_INVALIDATION),   /*  IA  */

   0x0001,

   (uint32)((0x3000 << 18) | mmVGT_RESET_DEBUG),   /*  WD  */

   0x0001,

   (uint32)((0x3000 << 18) | mmVGT_FIFO_DEPTHS),



I will run some tests against this CC_RB_BACKEND_DISABLE register to see if the 
VM flush failure issue can be avoided by removing those four registers from the 
SR list.



Thanks



/Monk



From: Deucher, Alexander
Sent: March 7, 2018 23:13
To: Koenig, Christian; Mao, David; Liu, Monk
Cc: amd-gfx@lists.freedesktop.org; Jin, Jian-Rong
Subject: Re: deprecated register issues



Right.  We ran into issues with reading back that register at runtime when UMDs 
queried it when other stuff was in flight, so we just read it once at startup 
and cache the results. Now when UMDs request it, we return the cached value.



Alex



From: Koenig, Christian
Sent: Wednesday, March 7, 2018 9:31:13 AM
To: Mao, David; Liu, Monk
Cc: Deucher, Alexander; 
amd-gfx@lists.freedesktop.org; Jin, 
Jian-Rong
Subject: Re: deprecated register issues



Hi David,

well I just figured that this is a misunderstanding.

Accessing this register and some other deprecated registers can cause problems 
when invalidating VMHUBs.

This register itself isn't deprecated, the wording in a patch fixing things is 
just a bit unclear.

Question is: is that register still accessed regularly, or is its value cached 
after startup?

Regards,
Christian.

On 07.03.2018 at 15:25, Mao, David wrote:

We require the base driver to provide the mask of disabled RBs.

This is why the kernel reads CC_RB_BACKEND_DISABLE to collect the harvest 
configuration.

Where did you learn that the register is deprecated?

I think it should still be there.



Best Regards,

David



On Mar 7, 2018, at 9:49 PM, Liu, Monk wrote:



+ UMD guys



Hi David



Do you know if GC_USER_RB_BACKEND_DISABLE still exists for gfx9/vega10?



We found CC_RB_BACKEND_DISABLE was deprecated, but it looks like it is still in 
use in the KMD, so

I want to check both of the above registers with you.



Thanks

/Monk



From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of 
Christian König
Sent: March 7, 2018 20:26
To: Liu, Monk; Deucher, Alexander
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: deprecated register issues



Hi Monk,

I honestly don't have the slightest idea why we are still accessing 
CC_RB_BACKEND_DISABLE. Maybe it still contains some useful values?

Key point was that we needed to stop accessing it all the time to 

RE: deprecated register issues

2018-03-07 Thread Liu, Monk
Hi guys

According to Christian's finding, reading this register can make the VM hub fail 
to finish the VM flush request. E.g.: SDMA is doing a VM flush, which first writes 
data to vm_invalidate_req and then reads the result from vm_invalidate_ack, but 
the driver will forever fail to get the correct value from vm_invalidate_ack if 
BIF is reading this CC_RB_BACKEND_DISABLE register in the meantime.

Now the SR-IOV world switch may also hit similar trouble; see the 
save_restore_list below (during world_switch, RLCV will save the current VF's 
registers according to this list and restore all of them when loading this VF 
back):

uint32 register_restore[] = {
   (uint32)((0x3000 << 18) | mmPA_SC_FIFO_SIZE),   /* SC   */
   0x0001,
   (uint32)((0x3000 << 18) | mmCC_RB_BACKEND_DISABLE),   /* SC SC PER_SE  */
   0x,
   (uint32)((0x3400 << 18) | mmCC_RB_BACKEND_DISABLE),   /* SC SC PER_SE  */
   0x,
   (uint32)((0x3800 << 18) | mmCC_RB_BACKEND_DISABLE),   /* SC SC PER_SE  */
   0x,
   (uint32)((0x3c00 << 18) | mmCC_RB_BACKEND_DISABLE),   /* SC SC PER_SE  */
   0x,
   (uint32)((0x3000 << 18) | mmVGT_VTX_VECT_EJECT_REG),
   0x0001,
   (uint32)((0x3000 << 18) | mmVGT_DMA_DATA_FIFO_DEPTH),   /* IA WD  */
   0x0001,
   (uint32)((0x3000 << 18) | mmVGT_DMA_REQ_FIFO_DEPTH),   /* WD   */
   0x0001,
   (uint32)((0x3000 << 18) | mmVGT_DRAW_INIT_FIFO_DEPTH),   /* WD   */
   0x0001,
   (uint32)((0x3000 << 18) | mmVGT_CACHE_INVALIDATION),   /*  IA  */
   0x0001,
   (uint32)((0x3000 << 18) | mmVGT_RESET_DEBUG),   /*  WD  */
   0x0001,
   (uint32)((0x3000 << 18) | mmVGT_FIFO_DEPTHS),

I will run some tests against this CC_RB_BACKEND_DISABLE register to see if the 
VM flush failure issue can be avoided by removing those four registers from the 
SR list.

Thanks

/Monk
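
The test described above amounts to dropping the four CC_RB_BACKEND_DISABLE
entries (one per shader-engine selector) from the save/restore list before it is
handed to RLCV. A sketch of that filtering, assuming the flat (reg, value) pair
layout shown above with the register offset in the low 18 bits of the first word:

/* Drop every (reg, value) pair whose register field matches
 * mmCC_RB_BACKEND_DISABLE, regardless of the SE selector in the high
 * bits; returns the new length of the list in words. */
static int filter_sr_list(uint32_t *list, int n_words)
{
	int src, dst = 0;

	for (src = 0; src + 1 < n_words; src += 2) {
		if ((list[src] & 0x3ffff) == mmCC_RB_BACKEND_DISABLE)
			continue;	/* skip the pair entirely */
		list[dst++] = list[src];
		list[dst++] = list[src + 1];
	}
	return dst;
}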

From: Deucher, Alexander
Sent: March 7, 2018 23:13
To: Koenig, Christian; Mao, David; Liu, Monk
Cc: amd-gfx@lists.freedesktop.org; Jin, Jian-Rong
Subject: Re: deprecated register issues


Right.  We ran into issues with reading back that register at runtime when UMDs 
queried it when other stuff was in flight, so we just read it once at startup 
and cache the results. Now when UMDs request it, we return the cached value.



Alex


From: Koenig, Christian
Sent: Wednesday, March 7, 2018 9:31:13 AM
To: Mao, David; Liu, Monk
Cc: Deucher, Alexander; 
amd-gfx@lists.freedesktop.org; Jin, 
Jian-Rong
Subject: Re: deprecated register issues

Hi David,

well I just figured that this is a misunderstanding.

Accessing this register and some other deprecated registers can cause problems 
when invalidating VMHUBs.

This register itself isn't deprecated, the wording in a patch fixing things is 
just a bit unclear.

Question is: is that register still accessed regularly, or is its value cached 
after startup?

Regards,
Christian.

On 07.03.2018 at 15:25, Mao, David wrote:
We require the base driver to provide the mask of disabled RBs.
This is why the kernel reads CC_RB_BACKEND_DISABLE to collect the harvest 
configuration.
Where did you learn that the register is deprecated?
I think it should still be there.

Best Regards,
David


On Mar 7, 2018, at 9:49 PM, Liu, Monk wrote:

+ UMD guys

Hi David

Do you know if GC_USER_RB_BACKEND_DISABLE still exists for gfx9/vega10?

We found CC_RB_BACKEND_DISABLE was deprecated, but it looks like it is still in 
use in the KMD, so I want to check both of the above registers with you.

Thanks
/Monk

From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of 
Christian König
Sent: March 7, 2018 20:26
To: Liu, Monk; Deucher, Alexander
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: deprecated register issues

Hi Monk,

I honestly don't have the slightest idea why we are still accessing 
CC_RB_BACKEND_DISABLE. Maybe it still contains some useful values?

Key point was that we needed to stop accessing it all the time to avoid 
triggering problems.

Regards,
Christian.

On 07.03.2018 at 13:11, Liu, Monk wrote:
Hi Christian

I remember you and AlexD mentioned that a handful of registers are deprecated 
for greenland (gfx9),
e.g. CC_RB_BACKEND_DISABLE

do you know why we still have this routine?

static u32 gfx_v9_0_get_rb_active_bitmap(struct amdgpu_device *adev)
{
	u32 data, mask;

	data = RREG32_SOC15(GC, 0, mmCC_RB_BACKEND_DISABLE);
	data |= RREG32_SOC15(GC, 0, mmGC_USER_RB_BACKEND_DISABLE);

	data &= CC_RB_BACKEND_DISABLE__BACKEND_DISABLE_MASK;
	data >>= GC_USER_RB_BACKEND_DISABLE__BACKEND_DISABLE__SHIFT;

	mask = 

[pull] radeon, amdgpu, ttm drm-next-4.17

2018-03-07 Thread Alex Deucher
Hi Dave,

More stuff for 4.17. Highlights:
- More fixes for "wattman" like functionality (fine grained clk/voltage control)
- Add more power profile infrastructure (context based dpm)
- SR-IOV fixes
- Add iomem debugging interface for use with umr
- Powerplay and cgs cleanups
- DC fixes and cleanups
- ttm improvements
- Misc cleanups all over

The following changes since commit 9aff8b2ae71dcf7f02443821a894a736f40e4919:

  Revert "drm/radeon/pm: autoswitch power state when in balanced mode" 
(2018-02-20 16:27:16 -0500)

are available in the git repository at:

  git://people.freedesktop.org/~agd5f/linux drm-next-4.17

for you to fetch changes up to f6c3b601bd490eda08c27b03607448abd4b4841b:

  drm/amdgpu:Always save uvd vcpu_bo in VM Mode (2018-03-07 16:12:18 -0500)


Alex Deucher (5):
  drm/amdgpu/powerplay/smu7: use proper dep table for mclk
  drm/amdgpu: fix module parameter descriptions
  drm/amdgpu: used cached pcie gen info for SI (v2)
  drm/radeon: fix KV harvesting
  drm/amdgpu: fix KV harvesting

Amber Lin (1):
  drm/amdgpu: Map all visible VRAM at startup

Arnd Bergmann (1):
  radeon: hide pointless #warning when compile testing

Ben Crocker (1):
  drm/radeon: insist on 32-bit DMA for Cedar on PPC64/PPC64LE

Bhawanpreet Lakha (2):
  drm/amd/display: Use MACROS instead of dm_logger
  drm/amd/display: define DC_LOGGER for logger

Christian König (25):
  drm/ttm: set page mapping during allocation
  drm/amdgpu: use the TTM dummy page instead of allocating one
  drm/ttm: add default implementations for ttm_tt_(un)populate
  drm/virtio: remove ttm_pool_* wrappers
  drm/mgag200: remove ttm_pool_* wrappers
  drm/hisilicon: remove ttm_pool_* wrappers
  drm/ast: remove ttm_pool_* wrappers
  drm/qxl: remove ttm_pool_* wrappers
  drm/cirrus: remove ttm_pool_* wrappers
  drm/bochs: remove the default ttm_tt_populate callbacks
  staging: vboxvideo: remove ttm_pool_* wrappers
  drm/ttm: drop bo->glob
  drm/ttm: drop ttm->glob
  drm/ttm: drop ttm->dummy_read_page
  drm/ttm: drop persistent_swap_storage from ttm_bo_init and co
  drm/ttm: move ttm_tt_create into ttm_tt.c v2
  drm/ttm: cleanup ttm_tt_create
  drm/amdgpu: move some functions into amdgpu_ttm.h
  drm/amdgpu: change amdgpu_ttm_set_active_vram_size
  drm/amdgpu: ignore changes of buffer function status because of GPU resets
  drm/amdgpu: use separate status for buffer funcs availability v2
  drm/amd/pp: fix "Delete the wrapper layer of smu_allocate/free_memory"
  drm/amdgpu: add amdgpu_evict_gtt debugfs entry
  drm/amdgpu: drop gtt->adev
  drm/amdgpu: further mitigate workaround for i915

Corentin Labbe (2):
  drm/amd: remove inclusion of non-existing scheduler directory
  drm/amd: Remove inclusion of non-existing include directories

Dmytro Laktyushkin (4):
  drm/amd/display: Update DCN OPTC registers
  drm/amd/display: add per pipe dppclk
  drm/amd/display: add diags clock programming
  drm/amd/display: fix dcn1 dppclk when min dispclk patch applies

Emily Deng (2):
  drm/amdgpu: Correct sdma_v4 get_wptr(v2)
  drm/amdgpu: Clean sdma wptr register when only enable wptr polling

Eric Bernstein (1):
  drm/amd/display: Fix DAL surface change test

Eric Huang (2):
  drm/amd/powerplay: fix thermal interrupts on vega10
  drm/amd/powerplay: fix power over limit on Fiji

Eric Yang (2):
  drm/amd/display: fix missing az disable in reset backend
  drm/amd/display: update infoframe after dig fe is turned on

Harry Wentland (6):
  drm/amd/display: Remove duplicate dm_pp_power_level enum
  drm/amd/display: Use crtc enable/disable_vblank hooks
  drm/amd/display: Return success when enabling interrupt
  drm/amd/display: Clean up formatting in irq_service_dce110.c
  drm/amd/display: Don't blow up if TG is NULL in dce110_vblank_set
  drm/amd/display: Default HDMI6G support to true. Log VBIOS table error.

Hersen Wu (2):
  drm/amd/display: move MST branch initialize to before link training
  drm/amd/display: Check DCN PState ASSERT failure

James Zhu (3):
  drm/amdgpu:Fixed wrong emit frame size for enc
  drm/amdgpu:Correct max uvd handles
  drm/amdgpu:Always save uvd vcpu_bo in VM Mode

John Barberiz (1):
  drm/amd/display: Add passive dongle support for HPD Rearch

Leo (Sunpeng) Li (2):
  drm/amd/display: Use 4096 lut entries
  drm/amd/display: Add regamma lut write mask to SOC base

Matthias Kaehlcke (1):
  drm/radeon/mkregtable: Delete unused list functions and macros

Michel Dänzer (1):
  drm/amdgpu/dce6: Use DRM_DEBUG instead of DRM_INFO for HPD IRQ info

Monk Liu (15):
  drm/amdgpu: fix for wb_clear
  drm/amdgpu: skip ECC for SRIOV in gmc late_init
  drm/amdgpu: only flush hotplug work without DC
  drm/amdgpu: cond_exec only for 

[pull] radeon and amdgpu drm-fixes-4.16

2018-03-07 Thread Alex Deucher
Hi Dave,

Fixes for 4.16.  A bit bigger than I would have liked, but most of that
is DC fixes which Harry helped me pull together from the past few weeks.
Highlights:
- Fix DL DVI with DC
- Various RV fixes for DC
- Overlay fixes for DC
- Fix HDMI2 handling on boards without HBR tables in the vbios
- Fix crash with pass-through on SI on amdgpu
- Fix RB harvesting on KV
- Fix hibernation failures on UVD with certain cards

The following changes since commit 93dfdf9fde9f20f1c46738bf184adeebc7d7d66e:

  Merge branch 'drm-fixes-4.16' of git://people.freedesktop.org/~agd5f/linux 
into drm-fixes (2018-03-01 14:03:14 +1000)

are available in the git repository at:

  git://people.freedesktop.org/~agd5f/linux drm-fixes-4.16

for you to fetch changes up to 4a53d9045ec31f3f97719c2e41cc8b2e7151a1fe:

  drm/amd/display: validate plane format on primary plane (2018-03-07 16:31:19 
-0500)


Alex Deucher (3):
  drm/amdgpu: used cached pcie gen info for SI (v2)
  drm/radeon: fix KV harvesting
  drm/amdgpu: fix KV harvesting

Bhawanpreet Lakha (1):
  drm/amd/display: Fix takover from VGA mode

Eric Yang (3):
  drm/amd/display: fix cursor related Pstate hang
  drm/amd/display: update infoframe after dig fe is turned on
  drm/amd/display: early return if not in vga mode in disable_vga

Harry Wentland (11):
  drm/amd/display: Don't blow up if TG is NULL in dce110_vblank_set
  drm/amd/display: Default HDMI6G support to true. Log VBIOS table error.
  drm/amd/display: Move MAX_TMDS_CLOCK define to header
  drm/amd/display: Remove unnecessary fail labels in create_stream_for_sink
  drm/amd/display: Pass signal directly to enable_tmds_output
  drm/amd/display: Don't allow dual-link DVI on all ASICs.
  drm/amd/display: Don't block dual-link DVI modes
  drm/amd/display: Make create_stream_for_sink more consistent
  drm/amd/display: Call update_stream_signal directly from amdgpu_dm
  drm/amd/display: Use crtc enable/disable_vblank hooks
  drm/amd/display: Return success when enabling interrupt

James Zhu (2):
  drm/amdgpu:Correct max uvd handles
  drm/amdgpu:Always save uvd vcpu_bo in VM Mode

Jerry (Fangzhi) Zuo (2):
  drm/amd/display: Fix topology change issue in MST rehook
  drm/amd/display: Fixed non-native modes not lighting up

Leo (Sunpeng) Li (1):
  drm/amd/display: Fix memleaks when atomic check fails.

Michel Dänzer (1):
  drm/amdgpu/dce6: Use DRM_DEBUG instead of DRM_INFO for HPD IRQ info

Mikita Lipski (1):
  drm/amd/display: Set irq state only on existing crtcs

Rex Zhu (1):
  drm/amdgpu: Notify sbios device ready before send request

Roman Li (3):
  drm/amd/display: Fix active dongle hotplug
  drm/amd/display: Fix FBC topology change
  drm/amd/display: fix boot-up on vega10

Shirish S (5):
  drm/amd/display: defer modeset check in dm_update_planes_state
  drm/amd/display: validate plane in dce110 for scaling
  drm/amd/display: update plane params before validation
  drm/amd/display: disable CRTCs with NULL FB on their primary plane (V2)
  drm/amd/display: validate plane format on primary plane

Tom St Denis (1):
  drm/amd/amdgpu: Mask rptr as well in ring debugfs

 drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c   |   3 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c   |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c|  13 +-
 drivers/gpu/drm/amd/amdgpu/dce_v6_0.c  |   2 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c  |  30 +---
 drivers/gpu/drm/amd/amdgpu/si.c|  22 ++-
 drivers/gpu/drm/amd/amdgpu/si_dpm.c|  50 ++-
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c  | 165 +++--
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_irq.c  |   6 +-
 .../amd/display/amdgpu_dm/amdgpu_dm_mst_types.c|   6 +
 drivers/gpu/drm/amd/display/dc/core/dc.c   |   6 +-
 drivers/gpu/drm/amd/display/dc/core/dc_link.c  |   3 +-
 drivers/gpu/drm/amd/display/dc/core/dc_resource.c  |   3 -
 drivers/gpu/drm/amd/display/dc/core/dc_stream.c|  76 +-
 drivers/gpu/drm/amd/display/dc/dc.h|   3 +-
 drivers/gpu/drm/amd/display/dc/dc_stream.h |   2 +
 drivers/gpu/drm/amd/display/dc/dce/dce_hwseq.h |  10 +-
 .../gpu/drm/amd/display/dc/dce/dce_link_encoder.c  |  38 +++--
 .../gpu/drm/amd/display/dc/dce/dce_link_encoder.h  |   3 +-
 .../drm/amd/display/dc/dce100/dce100_resource.c|   1 +
 .../amd/display/dc/dce110/dce110_hw_sequencer.c|  91 ++--
 .../drm/amd/display/dc/dce110/dce110_resource.c|  18 +++
 .../drm/amd/display/dc/dce112/dce112_resource.c|   2 +
 .../drm/amd/display/dc/dce120/dce120_resource.c|   2 +
 .../gpu/drm/amd/display/dc/dce80/dce80_resource.c  |   1 +
 .../drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c  |  65 +++-
 .../gpu/drm/amd/display/dc/inc/hw/link_encoder.h   |  

Re: Modesetting (amdgpudrmfb) artifacts on RAVEN APU

2018-03-07 Thread KARBOWSKI Piotr

On 2018-03-07 22:15, Harry Wentland wrote:

amd-stg has some other patches relating to DCN pipe-split that I don't fully 
understand. We might need one or more of those.

As a workaround you can probably set 'force_single_disp_pipe_split' to 'false' 
in 'debug_defaults_drv' in dc/dcn10/dcn10_resource.c.



Thank you for the information. This indeed makes it work on 
drm-fixes-4.16.



diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c 
b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c

index 44825e2c9ebb..df5eefbcc854 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c
@@ -444,7 +444,7 @@ static const struct dc_debug debug_defaults_drv = {
.disable_pplib_wm_range = false,
.pplib_wm_report_mode = WM_REPORT_DEFAULT,
.pipe_split_policy = MPC_SPLIT_AVOID_MULT_DISP,
-   .force_single_disp_pipe_split = true,
+   .force_single_disp_pipe_split = false,
.disable_dcc = DCC_ENABLE,
.voltage_align_fclk = true,
.disable_stereo_support = true,


-- Piotr.


Re: Modesetting (amdgpudrmfb) artifacts on RAVEN APU

2018-03-07 Thread Harry Wentland
On 2018-03-07 02:54 PM, KARBOWSKI Piotr wrote:
> On 2018-03-07 20:31, KARBOWSKI Piotr wrote:
>> On 2018-03-07 20:12, Harry Wentland wrote:
>>> Thanks for testing.
>>>
>>> What do you mean with broken video? I tried going back in the email thread 
>>> but I'm not 100% clear what you mean by this.
>>>
>>> BTW, drm-fixes-4.16 are fixes we intend to get into the 4.16 upstream 
>>> kernel. amd-staging-drm-next is our development tree and quite a bit 
>>> farther ahead in development. I'm trying to find fixes to backmerge to 4.16 
>>> to make everyone happy when it releases.
>>
>> Like on those screenshots
>>
>>
>>  https://i.imgur.com/qnDOKY7.jpg
>>  https://i.imgur.com/XH42zit.jpg
>>
>> To describe it better, let's split the screen into 4 segments
>>
>> +-+-+
>> |  A  |  B  |
>> +-+-+
>> |  C  |  D  |
>> +-+-+
>>
>> The screen resolution kicks in as it should, 1080p. A displays the proper 
>> part of the screen, around 25%; B contains a copy of what A displays, the 
>> same 25% top-left part of the screen; C and D remain black. If I scroll 
>> the screen a bit, C and D can get a bold vertical line of whatever color was 
>> on screen: if I had a white/gray font, it will be that; if there was some blue 
>> text, it will be blue until I scroll a bit more.
>>
>>
>> drm-fixes-4.16 yields the same effect as amd-staging-drm-next without 
>> 0001-drm-amd-display-Fix-takeover-from-VGA-mode.patch, but it seems that 
>> this patch is already in drm-fixes-4.16, so I am confused.
> 
> I was not correct.
> 
> The drm-fixes-4.16 is a bit better. It fixes half of my issue! So with this 
> handy ascii art of mine
> 
>   +-+-+
>   |  A  |  B  |
>   +-+-+
>   |  C  |  D  |
>   +-+-+
> 
> A and C work, so I have a full terminal on the left side, but B remains a copy of A 
> and D is either black or has the fancy line.
> 

amd-stg has some other patches relating to DCN pipe-split that I don't fully 
understand. We might need one or more of those.

As a workaround you can probably set 'force_single_disp_pipe_split' to 'false' 
in 'debug_defaults_drv' in dc/dcn10/dcn10_resource.c.

Harry

> I manually compared the patch that did work for me and it seems it was not 
> fully applied; the .h and .c do not look the same. Perhaps that's the 
> source here.
> 
> -- Piotr.


Re: [PATCH] drm/atomic: Add new reverse iterator over all plane state (V2)

2018-03-07 Thread Harry Wentland
On 2018-03-06 10:10 PM, Shirish S wrote:
> Add a reverse iterator, for_each_oldnew_plane_in_state_reverse, to
> complement the for_each_oldnew_plane_in_state way of reading plane
> states.
> 
> The plane states are required to be read in reverse order for
> amd drivers, because the z-order convention followed in Linux is
> opposite to how the planes are supposed to be presented to the DC
> engine, which is common to both Windows and Linux.
> 
> V2: fix compile time errors due to -Werror flag.
> 
> Signed-off-by: Shirish S 
> Signed-off-by: Pratik Vishwakarma 
> Reviewed-by: Daniel Vetter 

Merged to drm-misc-next.

Harry

> ---
>  include/drm/drm_atomic.h | 22 ++
>  1 file changed, 22 insertions(+)
> 
> diff --git a/include/drm/drm_atomic.h b/include/drm/drm_atomic.h
> index cf13842..3fe8dde 100644
> --- a/include/drm/drm_atomic.h
> +++ b/include/drm/drm_atomic.h
> @@ -754,6 +754,28 @@ void drm_state_dump(struct drm_device *dev, struct 
> drm_printer *p);
> (new_plane_state) = 
> (__state)->planes[__i].new_state, 1))
>  
>  /**
> + * for_each_oldnew_plane_in_state_reverse - iterate over all planes in an 
> atomic
> + * update in reverse order
> + * @__state:  drm_atomic_state pointer
> + * @plane:  drm_plane iteration cursor
> + * @old_plane_state:  drm_plane_state iteration cursor for the old 
> state
> + * @new_plane_state:  drm_plane_state iteration cursor for the new 
> state
> + * @__i: int iteration cursor, for macro-internal use
> + *
> + * This iterates over all planes in an atomic update in reverse order,
> + * tracking both old and  new state. This is useful in places where the
> + * state delta needs to be considered, for example in atomic check functions.
> + */
> +#define for_each_oldnew_plane_in_state_reverse(__state, plane, 
> old_plane_state, new_plane_state, __i) \
> + for ((__i) = ((__state)->dev->mode_config.num_total_plane - 1); \
> +  (__i) >= 0;\
> +  (__i)--)   \
> + for_each_if ((__state)->planes[__i].ptr &&  \
> +  ((plane) = (__state)->planes[__i].ptr, \
> +   (old_plane_state) = 
> (__state)->planes[__i].old_state,\
> +   (new_plane_state) = 
> (__state)->planes[__i].new_state, 1))
> +
> +/**
>   * for_each_old_plane_in_state - iterate over all planes in an atomic update
>   * @__state:  drm_atomic_state pointer
>   * @plane:  drm_plane iteration cursor
> 
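
A usage sketch for the new macro, mirroring how the non-reverse variant is
typically used in an atomic check (the callback and handle_plane_update() are
hypothetical, shown only for illustration):

/* Walk the planes from highest index to lowest, i.e. in the order the
 * DC engine expects them, looking at the old/new state delta. */
static int my_atomic_check(struct drm_device *dev,
			   struct drm_atomic_state *state)
{
	struct drm_plane *plane;
	struct drm_plane_state *old_state, *new_state;
	int i;

	for_each_oldnew_plane_in_state_reverse(state, plane, old_state,
					       new_state, i) {
		if (old_state->fb != new_state->fb)
			handle_plane_update(plane, new_state);	/* assumed */
	}
	return 0;
}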


Re: New KFD ioctls: taking the skeletons out of the closet

2018-03-07 Thread Felix Kuehling
Thanks for the feedback. I'm answering some of your questions inline.

On 2018-03-06 06:09 PM, Dave Airlie wrote:
> On 7 March 2018 at 08:44, Felix Kuehling  wrote:
>> Hi all,
>>
>> Christian raised two potential issues in a recent KFD upstreaming code
>> review that are related to the KFD ioctl APIs:
>>
>>  1. behaviour of -ERESTARTSYS
>>  2. transactional nature of KFD ioctl definitions, or lack thereof
>>
>> I appreciate constructive feedback, but I also want to encourage an
>> open-minded rather than a dogmatic approach to API definitions. So let
>> me take all the skeletons out of my closet and get these APIs reviewed
>> in the appropriate forum before we commit to them upstream. See the end
>> of this email for reference.
>>
>> The controversial part at this point is kfd_ioctl_map_memory_to_gpu. If
>> any of the other APIs raise concerns or questions, please ask.
>>
>> Because of the HSA programming model, KFD memory management APIs are
>> synchronous. There is no pipelining. Command submission to GPUs through
>> user mode queues does not involve KFD. This means KFD doesn't know what
>> memory is used by the GPUs and when it's used. That means, when the
>> map_memory_to_gpu ioctl returns to user mode, all memory mapping
>> operations are complete and the memory can be used by the CPUs or GPUs
>> immediately.
> I've got a few opinions, but first up I still dislike user-mode queues
> and everything
> they entail. I still feel they are solving a Windows problem and not a
> Linux problem,
> and it would be nice if we had some Linux numbers on what they gain us over
> a dispatch ioctl, because they sure bring a lot of memory management issues.
>
> That said amdkfd is here.
>
> The first question you should ask (which you haven't asked here at all) is
> what should userspace do with the ioctl result.
>
>> HSA also uses a shared virtual memory model, so typically memory gets
>> mapped on multiple GPUs and CPUs at the same virtual address.
>>
>> The point of contention seems to be the ability to map memory to
>> multiple GPUs in a single ioctl and the behaviour in failure cases. I'll
>> discuss two main failure cases:
>>
>> 1: Failure after all mappings have been dispatched via SDMA, but a
>> signal interrupts the wait for completion and we return -ERESTARTSYS.
>> Documentation/kernel-hacking/hacking.rst only says "[...] you should be
>> prepared to process the restart, e.g. if you're in the middle of
>> manipulating some data structure." I think we do that by ensuring that
>> memory that's already mapped won't be mapped again. So the restart will
>> become a no-op and just end up waiting for all the previous mappings to
>> complete.
> -ERESTARTSYS at that late stage points to a badly synchronous API,
> I'd have said you should have two ioctls, one that returns a fence after
> starting the processes, and one that waits for the fence separately.
>
> The overhead of ioctls isn't your enemy until you've measured it and
> proven it's a problem.
>
>> Christian has a stricter requirement, and I'd like to know where that
>> comes from: "An interrupted IOCTL should never have a visible effect."
> Christian might be taking things a bit further but synchronous gpu access
> APIs are bad, but I don't think undoing a bunch of work is a good plan either
> just because you got ERESTARTSYS. If you get ERESTARTSYS can you
> handle it, if I've fired off 5 SDMAs and wait for them will I fire off 5 more?
> will I wait for the original SDMAs if I reenter?

It will wait for the original SDMAs to complete.
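
Concretely, the no-op restart described above only works if the ioctl tracks
per-GPU progress, so a restarted call skips completed mappings and just re-waits
on the already submitted SDMA work. A sketch of that shape (hypothetical types
and helpers, not the actual KFD code):

/* Restart-safe mapping loop: already-mapped GPUs are skipped, so an
 * -ERESTARTSYS restart degenerates into waiting for the old fences. */
static int map_memory_to_gpus(struct kfd_buf *buf, struct kfd_gpu **gpus, int n)
{
	int i, ret;

	for (i = 0; i < n; i++) {
		if (buf->mapped_mask & (1u << i))
			continue;	/* done on a previous attempt */
		ret = dispatch_sdma_map(gpus[i], buf);	/* queues the copy */
		if (ret)
			return ret;
		buf->mapped_mask |= 1u << i;
	}

	/* Interruptible wait: a signal here returns -ERESTARTSYS, the
	 * syscall restarts, and the loop above becomes a no-op. */
	return wait_map_fences_interruptible(buf);
}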

>
>> 2: Failure to map on some but not all GPUs. This comes down to the
>> question, do all ioctl APIs or system calls in general need to be
>> transactional? As a counter example I'd give incomplete read or write
>> system calls that return how much was actually read or written. Our
>> current implementation of map_memory_to_gpu doesn't do this, but it
>> could be modified to return to user mode how many of the mappings, or
>> which mappings specifically failed or succeeded.
> What should userspace do? if it only get mappings on 3 of the gpus instead
> of say 4? Is there a sane resolution other than calling the ioctl again with
> the single GPU? Would it drop the GPU from the working set from that point on?
>
> Need more info to do what can come out of the API doing incomplete
> operations.

There are two typical use cases where this function is used.

 1. During allocation
 2. Changing access to an existing buffer

There is no retry logic in either case. And given the likely failure
conditions, a retry doesn't really make much sense.

I think the most likely failure I've seen is a failure to validate the
BO under heavy memory pressure. This will affect the first GPU trying to
map the memory. Once it's mapped on one GPU, subsequent GPUs don't need
to validate it again, so that's less likely to fail. Maybe if we're
running out of space for the SDMA command buffers. If you're under that
much memory 

Re: Modesetting (amdgpudrmfb) artifacts on RAVEN APU

2018-03-07 Thread KARBOWSKI Piotr

On 2018-03-07 20:12, Harry Wentland wrote:

Thanks for testing.

What do you mean with broken video? I tried going back in the email thread but 
I'm not 100% clear what you mean by this.

BTW, drm-fixes-4.16 are fixes we intend to get into the 4.16 upstream kernel. 
amd-staging-drm-next is our development tree and quite a bit farther ahead in 
development. I'm trying to find fixes to backmerge to 4.16 to make everyone 
happy when it releases.


Like on those screenshots


https://i.imgur.com/qnDOKY7.jpg
https://i.imgur.com/XH42zit.jpg

To describe it better, let's split the screen into 4 segments

+-+-+
|  A  |  B  |
+-+-+
|  C  |  D  |
+-+-+

The screen resolution kicks in as it should, 1080p. A displays the 
proper part of the screen, around 25%; B contains a copy of what A 
displays, the same 25% top-left part of the screen; C and D 
remain black. If I scroll the screen a bit, C and D can get a bold 
vertical line of whatever color was on screen: if I had a white/gray 
font, it will be that; if there was some blue text, it will be blue until 
I scroll a bit more.



drm-fixes-4.16 yields the same effect as amd-staging-drm-next without 
0001-drm-amd-display-Fix-takeover-from-VGA-mode.patch, but it seems that 
this patch is already in drm-fixes-4.16, so I am confused.


-- Piotr.


Re: Modesetting (amdgpudrmfb) artifacts on RAVEN APU

2018-03-07 Thread KARBOWSKI Piotr

On 2018-03-07 20:31, KARBOWSKI Piotr wrote:

On 2018-03-07 20:12, Harry Wentland wrote:

Thanks for testing.

What do you mean with broken video? I tried going back in the email 
thread but I'm not 100% clear what you mean by this.


BTW, drm-fixes-4.16 are fixes we intend to get into the 4.16 upstream 
kernel. amd-staging-drm-next is our development tree and quite a bit 
farther ahead in development. I'm trying to find fixes to backmerge to 
4.16 to make everyone happy when it releases.


Like on those screenshots


     https://i.imgur.com/qnDOKY7.jpg
     https://i.imgur.com/XH42zit.jpg

To describe it better, let's split the screen into 4 segments

+-+-+
|  A  |  B  |
+-+-+
|  C  |  D  |
+-+-+

The screen resolution kicks in as it should, 1080p. A displays the 
proper part of the screen, around 25%; B contains a copy of what A 
displays, the same 25% top-left part of the screen; C and D 
remain black. If I scroll the screen a bit, C and D can get a bold 
vertical line of whatever color was on screen: if I had a white/gray 
font, it will be that; if there was some blue text, it will be blue until 
I scroll a bit more.



drm-fixes-4.16 yields the same effect as amd-staging-drm-next without 
0001-drm-amd-display-Fix-takeover-from-VGA-mode.patch, but it seems that 
this patch is already in drm-fixes-4.16, so I am confused.


I was not correct.

The drm-fixes-4.16 is a bit better. It fixes half of my issue! So with 
this handy ascii art of mine


  +-+-+
  |  A  |  B  |
  +-+-+
  |  C  |  D  |
  +-+-+

A and C work, so I have a full terminal on the left side, but B remains a copy 
of A and D is either black or has the fancy line.


I manually compared the patch that did work for me and it seems it was 
not fully applied; the .h and .c do not look the same. Perhaps that's 
the source here.


-- Piotr.


Re: [PATCH 1/2] drm/amdgpu: Enable scatter gather display support

2018-03-07 Thread Alex Deucher
On Wed, Mar 7, 2018 at 2:02 PM, Li, Samuel  wrote:
> How about this: my previous patches actually allow three scenarios,
> 1) VRAM as display buffer
> 2) GTT as display buffer
> 3) Mixed GTT/VRAM display buffers, as you have asked.
>
> I think the first step is to make the 2nd scenario work at an optimal level. 
> After that, if anyone wants to work on the 3rd scenario, I have no objection at 
> all.

Regardless of which scenarios we need to support, I think we also need
to really plumb this through to Mesa, since user space is who
ultimately requests the location.  Overriding it in the kernel gets
tricky and can lead to ping-ponging as others have noted.  Better to
have user space know which chips support it and request display
buffers in GTT or VRAM from the start.

Alex
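
The shape being described: the kernel advertises a capability, and the UMD picks
the initial domain for scanout buffers based on it. A sketch of that flow from
the user-space side (the capability flag and query/create helpers are
hypothetical; a real interface would go through the amdgpu info ioctl):

/* UMD side: ask the kernel once whether scanout from system memory
 * (GTT) is supported on this ASIC, then request placement up front. */
uint32_t domain = DOMAIN_VRAM;			/* default: scan out of VRAM */

if (query_device_cap(fd, CAP_GTT_SCANOUT))	/* assumed query */
	domain = DOMAIN_GTT;			/* little VRAM: prefer GTT */

create_scanout_bo(fd, width, height, domain);	/* assumed helper */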

>
> Regards,
> Samuel Li
>
>
>> -Original Message-
>> From: Samuel Li [mailto:samuel...@amd.com]
>> Sent: Wednesday, March 07, 2018 1:54 PM
>> To: Alex Deucher 
>> Cc: Michel Dänzer ; Koenig, Christian
>> ; amd-gfx list 
>> Subject: Re: [PATCH 1/2] drm/amdgpu: Enable scatter gather display support
>>
>> > You might also want to prefer VRAM, but also allow buffers to fall
>> > back to GTT if necessary.
>> For display buffers, this case seems not really attractive. When the display
>> buffer changes between GTT and VRAM dynamically, our driver needs to adapt
>> too; it is hard to see the benefits, and it's not really worth the effort.
>>
>> Sam
>>
>>
>>
>> On 2018-03-07 01:27 PM, Alex Deucher wrote:
>> > On Wed, Mar 7, 2018 at 1:18 PM, Samuel Li  wrote:
>> >> I think it's not useful though. Think about it: the SG display feature is
>> intended to use as little VRAM as possible. Will someone want a display
>> buffer that is sometimes in VRAM, sometimes in GTT?
>> >> Hardly a likely case to me, and I think it's a waste of effort. That also might
>> explain why no driver does that now.
>> >
>> > You might want different strategies depending on how much VRAM you
>> > have.  If you have a bunch and it performs better, prefer it.  If you
>> > have limited VRAM, you might want to prefer GTT.  You might also want
>> > to prefer VRAM, but also allow buffers to fall back to GTT if
>> > necessary.  This would make the logic dynamic and all in one place.
>> > The kernel can advertise what asics support scanout from system ram
>> > and then UMDs can just query that to choose placements rather than
>> > adding hardcoded logic for specific asics.
>> >
>> > Alex
>> >
>> >>
>> >> Sam
>> >>
>> >>
>> >>
>> >> On 2018-03-07 12:50 PM, Michel Dänzer wrote:
>> >>> On 2018-03-07 06:38 PM, Alex Deucher wrote:
>>  On Wed, Mar 7, 2018 at 12:29 PM, Samuel Li 
>> wrote:
>> >
>> > Why so complicated? If old user space compatibility is required, just
>> use sg_display=0.
>> 
>>  It will always just work in that case and we can adjust for the
>>  optimal scenario by default in all cases without requiring the user
>>  to set misc parameters to tune their setup that may break in the
>>  future or hide bugs because a user sets it and forgets it.
>> >>>
>> >>> Right. I don't agree it's all that complicated anyway, we do this
>> >>> sort of thing all the time, it also makes our lives easier down the road.
>> >>>
>> >>>


Re: New KFD ioctls: taking the skeletons out of the closet

2018-03-07 Thread Alex Deucher
On Wed, Mar 7, 2018 at 11:38 AM, Daniel Vetter  wrote:
> On Wed, Mar 07, 2018 at 09:38:03AM +0100, Christian König wrote:
>> Am 07.03.2018 um 00:09 schrieb Dave Airlie:
>> > On 7 March 2018 at 08:44, Felix Kuehling  wrote:
>> > > Hi all,
>> > >
>> > > Christian raised two potential issues in a recent KFD upstreaming code
>> > > review that are related to the KFD ioctl APIs:
>> > >
>> > >   1. behaviour of -ERESTARTSYS
>> > >   2. transactional nature of KFD ioctl definitions, or lack thereof
>> > >
>> > > I appreciate constructive feedback, but I also want to encourage an
>> > > open-minded rather than a dogmatic approach to API definitions. So let
>> > > me take all the skeletons out of my closet and get these APIs reviewed
>> > > in the appropriate forum before we commit to them upstream. See the end
>> > > of this email for reference.
>> > >
>> > > The controversial part at this point is kfd_ioctl_map_memory_to_gpu. If
>> > > any of the other APIs raise concerns or questions, please ask.
>> > >
>> > > Because of the HSA programming model, KFD memory management APIs are
>> > > synchronous. There is no pipelining. Command submission to GPUs through
>> > > user mode queues does not involve KFD. This means KFD doesn't know what
>> > > memory is used by the GPUs and when it's used. That means, when the
>> > > map_memory_to_gpu ioctl returns to user mode, all memory mapping
>> > > operations are complete and the memory can be used by the CPUs or GPUs
>> > > immediately.
>> > I've got a few opinions, but first up I still dislike user-mode queues
>> > and everything
>> > they entail. I still feel they are solving a Windows problem and not a
>> > Linux problem,
>> > and it would be nice if we had some Linux numbers on what they gain us over
>> > a dispatch ioctl, because they sure bring a lot of memory management 
>> > issues.
>>
>> Well user-mode queues are a problem as long as you don't have recoverable
>> page faults on the GPU.
>>
>> As soon as you got recoverable page faults and push the memory management
>> towards things like HMM I don't see an advantage of using a IOCTL based
>> command submission any more.
>>
>> So I would say that this is a problem which is slowly going away as the
>> hardware improves.
>
> Yeah, but up to the point where the hw actually works (instead of promises
> that maybe it'll work next generation, trust us, for like a few
> generations) it's much easier to hack up an ioctl with workarounds than
> intercepting an mmap write fault all the time (those are slower than
> ioctls).
>
> I think userspace queues are fine once we have known-working hw. Before
> that I'm kinda agreeing with Dave and not seeing the point. At least to my
> knowledge we still haven't arrived in the wonderful promised land of hw
> recoverable (well, restartable really) page faults on any vendors platform
> ...


I think user space queues are a bit of a distraction.  The original
point of KFD and HSA was to have a consistent programming model across
CPU and other devices with relatively seamless access to the same
memory pools.  KFD was originally focused on APUs and when we have an
IOMMUv2 with ATC available, we have support for recoverable page
faults.  It's been working for 3 generations of hw and has been
expanded to GPUVM on newer hw which doesn't have the dependency on
IOMMU and also supports vram.  We added support for KFD for older dGPUs
that don't have this capability, but that is certainly not the only
use case we need to consider.

Alex

>
>> > That said amdkfd is here.
>> >
>> > The first question you should ask (which you haven't asked here at all) is
>> > what should userspace do with the ioctl result.
>> >
>> > > HSA also uses a shared virtual memory model, so typically memory gets
>> > > mapped on multiple GPUs and CPUs at the same virtual address.
>> > >
>> > > The point of contention seems to be the ability to map memory to
>> > > multiple GPUs in a single ioctl and the behaviour in failure cases. I'll
>> > > discuss two main failure cases:
>> > >
>> > > 1: Failure after all mappings have been dispatched via SDMA, but a
>> > > signal interrupts the wait for completion and we return -ERESTARTSYS.
>> > > Documentation/kernel-hacking/hacking.rst only says "[...] you should be
>> > > prepared to process the restart, e.g. if you're in the middle of
>> > > manipulating some data structure." I think we do that by ensuring that
>> > > memory that's already mapped won't be mapped again. So the restart will
>> > > become a no-op and just end up waiting for all the previous mappings to
>> > > complete.
>> > -ERESTARTSYS at that late stage points to a badly synchronous API,
>> > I'd have said you should have two ioctls, one that returns a fence after
>> > starting the processes, and one that waits for the fence separately.
>>
>> That is exactly what I suggested as well, but also exactly what Felix tries
>> to avoid :)
>>
>> > The overhead of ioctls isn't your 
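For illustration, the restart-idempotent pattern Felix describes above could
look roughly like this (a minimal sketch; every hyp_-prefixed name is
hypothetical, not the actual KFD code):

#include <stdbool.h>

struct hyp_fence;
struct hyp_buffer;

struct hyp_mapping {
	bool mapped;			/* mapping work already dispatched */
	struct hyp_fence *fence;
};

int hyp_dispatch_sdma_map(struct hyp_buffer *buf, struct hyp_mapping *m);
int hyp_fence_wait_interruptible(struct hyp_fence *f);

static int hyp_map_memory_to_gpus(struct hyp_buffer *buf,
				  struct hyp_mapping *maps, int n)
{
	int i, ret;

	for (i = 0; i < n; i++) {
		if (maps[i].mapped)
			continue;	/* restarted call: skip completed work */
		ret = hyp_dispatch_sdma_map(buf, &maps[i]);
		if (ret)
			return ret;
		maps[i].mapped = true;
	}

	/*
	 * A signal during this wait returns -ERESTARTSYS; on restart the
	 * dispatch loop above becomes a no-op and we land straight back here.
	 */
	for (i = 0; i < n; i++) {
		ret = hyp_fence_wait_interruptible(maps[i].fence);
		if (ret)
			return ret;
	}
	return 0;
}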

Re: Modesetting (amdgpudrmfb) artifacts on RAVEN APU

2018-03-07 Thread Harry Wentland
Yup, that's there as

commit e92b44fdee3ec0b679bacdb9c7e95e55699167f0
Author: Tony Cheng 
Date:   Thu Oct 5 14:38:46 2017 -0400

drm/amd/display: default force_single_disp_pipe_split = 1 on RV

...


Harry

On 2018-03-07 02:16 PM, Tom St Denis wrote:
> It's an old patch but does drm-fixes-4.16 have this patch
> 
> commit e92b44fdee3ec0b679bacdb9c7e95e55699167f0
> Refs: v4.14-rc3-1871-ge92b44fdee3e
> Author: Tony Cheng 
> AuthorDate: Thu Oct 5 14:38:46 2017 -0400
> Commit: Alex Deucher 
> CommitDate: Sat Oct 21 16:52:21 2017 -0400
> 
>     drm/amd/display: default force_single_disp_pipe_split = 1 on RV
> 
>     1080p idle, stutter efficiency goes up from 95.8% to 97.8%
>     result in 5mW saving from APU and 8mW saving from DDR4
> 
>     Signed-off-by: Tony Cheng 
>     Reviewed-by: Yongqiang Sun 
>     Acked-by: Harry Wentland 
>     Signed-off-by: Alex Deucher 
> 
> ?
> 
> Tom
> 
> On 03/07/2018 02:12 PM, Harry Wentland wrote:
>> On 2018-03-07 12:38 PM, KARBOWSKI Piotr wrote:
>>> On 2018-03-07 17:48, KARBOWSKI Piotr wrote:
 On 2018-03-07 17:14, Harry Wentland wrote:
> Pushed a drm-fixes-4.16 branch (based on Alex's) to my FDO repo 
> at https://cgit.freedesktop.org/~hwentland/linux/  but I don't have a 
> Raven system setup to test this at the moment.
>
> Tom, or Piotr, would you be able to check that the fix works as intended 
> on that branch?

Tested that branch and confirmed that the VGA patch is there, but I got the 
original result of broken video. Rebooted a few times. It seems that my 
issue was not the VGA takeover after all. I will re-test it with vanilla 
 amd-staging-drm-next of git://people.freedesktop.org/~agd5f/linux without 
 the VGA patch and update you.
>>
>> Thanks for testing.
>>
>> What do you mean by broken video? I tried going back in the email thread 
>> but I'm not 100% clear what you mean by this.
>>
>> BTW, drm-fixes-4.16 are fixes we intend to get into the 4.16 upstream 
>> kernel. amd-staging-drm-next is our development tree and quite a bit farther 
>> ahead in development. I'm trying to find fixes to backmerge to 4.16 to make 
>> everyone happy when it releases.
>>
>> Harry
>>
>>>
>>> I re-tested original tree of git://people.freedesktop.org/~agd5f/linux on 
>>> branch amd-staging-drm-next and got the 'double screen' glitches, then I 
>>> applied the patch Tom sent me as the very first response to this thread and 
>>> then it indeed fixed it. So the VGA takeover patch is needed, it seems.
>>>
>>> I triple checked and built the images, and the drm-fixes-4.16 of 
>>> git://people.freedesktop.org/~hwentland/linux does not fix the problem for 
>>> me.
>>>
>>> So far only amd-staging-drm-next with 
>>> 0001-drm-amd-display-Fix-takeover-from-VGA-mode.patch does the trick. One 
>>> thing worth noting is that amd-staging-drm-next identifies as `rc1`, whereas 
>>> your fork, Harry, appears to be rc3; no idea if it's related or not.
>>>
>>> -- Piotr.
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: Modesetting (amdgpudrmfb) artifacts on RAVEN APU

2018-03-07 Thread Tom St Denis

It's an old patch but does drm-fixes-4.16 have this patch

commit e92b44fdee3ec0b679bacdb9c7e95e55699167f0 

Refs: v4.14-rc3-1871-ge92b44fdee3e 

Author: Tony Cheng  

AuthorDate: Thu Oct 5 14:38:46 2017 -0400 

Commit: Alex Deucher  

CommitDate: Sat Oct 21 16:52:21 2017 -0400 



drm/amd/display: default force_single_disp_pipe_split = 1 on RV

1080p idle, stutter efficiency goes up from 95.8% to 97.8%
result in 5mW saving from APU and 8mW saving from DDR4

Signed-off-by: Tony Cheng  

Reviewed-by: Yongqiang Sun  

Acked-by: Harry Wentland  

Signed-off-by: Alex Deucher  



?

Tom

On 03/07/2018 02:12 PM, Harry Wentland wrote:

On 2018-03-07 12:38 PM, KARBOWSKI Piotr wrote:

On 2018-03-07 17:48, KARBOWSKI Piotr wrote:

On 2018-03-07 17:14, Harry Wentland wrote:

Pushed a drm-fixes-4.16 branch (based on Alex's) to my FDO repo 
at https://cgit.freedesktop.org/~hwentland/linux/  but I don't have a Raven 
system setup to test this at the moment.

Tom, or Piotr, would you be able to check that the fix works as intended on 
that branch?


Tested that branch and confirmed that the VGA patch is there, but I got the 
original result of broken video. Rebooted a few times. It seems that my issue was 
not the VGA takeover after all. I will re-test it with vanilla 
amd-staging-drm-next of git://people.freedesktop.org/~agd5f/linux without the 
VGA patch and update you.


Thanks for testing.

What do you mean by broken video? I tried going back in the email thread but 
I'm not 100% clear what you mean by this.

BTW, drm-fixes-4.16 are fixes we intend to get into the 4.16 upstream kernel. 
amd-staging-drm-next is our development tree and quite a bit farther ahead in 
development. I'm trying to find fixes to backmerge to 4.16 to make everyone 
happy when it releases.

Harry



I re-tested original tree of git://people.freedesktop.org/~agd5f/linux on 
branch amd-staging-drm-next and got the 'double screen' glitches, then I 
applied the patch Tom sent me as the very first response to this thread and 
then it indeed fixed it. So the VGA takeover patch is needed, it seems.

I triple checked and built the images, and the drm-fixes-4.16 of 
git://people.freedesktop.org/~hwentland/linux does not fix the problem for me.

So far only amd-staging-drm-next with 
0001-drm-amd-display-Fix-takeover-from-VGA-mode.patch does the trick. One thing 
worth noting is that amd-staging-drm-next identifies as `rc1`, whereas your fork, 
Harry, appears to be rc3; no idea if it's related or not.

-- Piotr.

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: Modesetting (amdgpudrmfb) artifacts on RAVEN APU

2018-03-07 Thread Harry Wentland
On 2018-03-07 12:38 PM, KARBOWSKI Piotr wrote:
> On 2018-03-07 17:48, KARBOWSKI Piotr wrote:
>> On 2018-03-07 17:14, Harry Wentland wrote:
>>> Pushed a drm-fixes-4.16 branch (based on Alex's) to my FDO repo 
at https://cgit.freedesktop.org/~hwentland/linux/  but I don't have a Raven 
>>> system setup to test this at the moment.
>>>
>>> Tom, or Piotr, would you be able to check that the fix works as intended on 
>>> that branch?
>>
>> Tested that branch and confirmed that the VGA patch is there, but I got the 
>> original result of broken video. Rebooted a few times. It seems that my issue 
>> was not the VGA takeover after all. I will re-test it with vanilla 
>> amd-staging-drm-next of git://people.freedesktop.org/~agd5f/linux without 
>> the VGA patch and update you.

Thanks for testing.

What do you mean by broken video? I tried going back in the email thread but 
I'm not 100% clear what you mean by this.

BTW, drm-fixes-4.16 are fixes we intend to get into the 4.16 upstream kernel. 
amd-staging-drm-next is our development tree and quite a bit farther ahead in 
development. I'm trying to find fixes to backmerge to 4.16 to make everyone 
happy when it releases.

Harry

> 
> I re-tested original tree of git://people.freedesktop.org/~agd5f/linux on 
> branch amd-staging-drm-next and got the 'double screen' glitches, then I 
> applied the patch Tom sent me as the very first response to this thread and 
> then it indeed fixed it. So the VGA takeover patch is needed, it seems.
> 
> I triple checked and built the images, and the drm-fixes-4.16 of 
> git://people.freedesktop.org/~hwentland/linux does not fix the problem for me.
> 
> So far only amd-staging-drm-next with 
> 0001-drm-amd-display-Fix-takeover-from-VGA-mode.patch does the trick. One 
> thing worth noting is that amd-staging-drm-next identifies as `rc1`, whereas 
> your fork, Harry, appears to be rc3; no idea if it's related or not.
> 
> -- Piotr.
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH 1/2] drm/amdgpu: Enable scatter gather display support

2018-03-07 Thread Li, Samuel
How about this: my previous patches actually allow three scenarios,
1) VRAM as display buffer
2) GTT as display buffer
3) Mixed GTT/VRAM as display buffer, as you have asked.

I think the first step is to make the 2nd scenario work at an optimal level. After 
that, if anyone wants to work on the 3rd scenario, I have no objection at all.

Regards,
Samuel Li


> -Original Message-
> From: Samuel Li [mailto:samuel...@amd.com]
> Sent: Wednesday, March 07, 2018 1:54 PM
> To: Alex Deucher 
> Cc: Michel Dänzer ; Koenig, Christian
> ; amd-gfx list 
> Subject: Re: [PATCH 1/2] drm/amdgpu: Enable scatter gather display support
> 
> > You might also want to prefer VRAM, but also allow buffers to fall
> > back to GTT if necessary.
> For display buffer, this case seems not really attractive. When display buffer
> changes between GTT and VRAM dynamically, our driver needs to adpat too,
> which is hard to see the benefits and not really worth the effort.
> 
> Sam
> 
> 
> 
> On 2018-03-07 01:27 PM, Alex Deucher wrote:
> > On Wed, Mar 7, 2018 at 1:18 PM, Samuel Li  wrote:
> >> I think it's not useful though. Think about it: the SG display feature is
> intended to use as little VRAM as possible. Will someone want a display
> buffer sometimes in VRAM, sometimes in GTT?
> >> Hardly a likely case to me, and I think it's a waste of effort. That also might
> explain why no driver does that now.
> >
> > You might want different strategies depending on how much VRAM you
> > have.  If you have a bunch and it performs better, prefer it.  If you
> > have limited VRAM, you might want to prefer GTT.  You might also want
> > to prefer VRAM, but also allow buffers to fall back to GTT if
> > necessary.  This would make the logic dynamic and all in one place.
> > The kernel can advertise what asics support scanout from system ram
> > and then UMDs can just query that to choose placements rather than
> > adding hardcoded logic for specific asics.
> >
> > Alex
> >
> >>
> >> Sam
> >>
> >>
> >>
> >> On 2018-03-07 12:50 PM, Michel Dänzer wrote:
> >>> On 2018-03-07 06:38 PM, Alex Deucher wrote:
>  On Wed, Mar 7, 2018 at 12:29 PM, Samuel Li 
> wrote:
> >
> > Why so complicated? If old user space compatibility is required, just
> use sg_display=0.
> 
>  It will always just work in that case and we can adjust for the
>  optimal scenario by default in all cases without requiring the user
>  to set misc parameters to tune their setup that may break in the
>  future or hide bugs because a user sets it and forgets it.
> >>>
> >>> Right. I don't agree it's all that complicated anyway, we do this
> >>> sort of thing all the time, it also makes our lives easier down the road.
> >>>
> >>>
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 1/2] drm/amdgpu: Enable scatter gather display support

2018-03-07 Thread Samuel Li
> You might also want to prefer VRAM, but also allow buffers to fall back to 
> GTT if
> necessary.
For the display buffer, this case does not seem really attractive. When the display 
buffer changes between GTT and VRAM dynamically, our driver needs to adapt too; it 
is hard to see the benefits and not really worth the effort.

Sam



On 2018-03-07 01:27 PM, Alex Deucher wrote:
> On Wed, Mar 7, 2018 at 1:18 PM, Samuel Li  wrote:
>> I think it's not useful though. Think about it: the SG display feature is 
>> intended to use as little VRAM as possible. Will someone want a display buffer 
>> sometimes in VRAM, sometimes in GTT?
>> Hardly a likely case to me, and I think it's a waste of effort. That also might 
>> explain why no driver does that now.
> 
> You might want different strategies depending on how much VRAM you
> have.  If you have a bunch and it performs better, prefer it.  If you
> have limited VRAM, you might want to prefer GTT.  You might also want
> to prefer VRAM, but also allow buffers to fall back to GTT if
> necessary.  This would make the logic dynamic and all in one place.
> The kernel can advertise what asics support scanout from system ram
> and then UMDs can just query that to choose placements rather than
> adding hardcoded logic for specific asics.
> 
> Alex
> 
>>
>> Sam
>>
>>
>>
>> On 2018-03-07 12:50 PM, Michel Dänzer wrote:
>>> On 2018-03-07 06:38 PM, Alex Deucher wrote:
 On Wed, Mar 7, 2018 at 12:29 PM, Samuel Li  wrote:
>
> Why so complicated? If old user space compatibility is required, just use 
> sg_display=0.

 It will always just work in that case and we can adjust for the
 optimal scenario by default in all cases without requiring the user to
 set misc parameters to tune their setup that may break in the future
 or hide bugs because a user sets it and forgets it.
>>>
>>> Right. I don't agree it's all that complicated anyway, we do this sort
>>> of thing all the time, it also makes our lives easier down the road.
>>>
>>>
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 1/2] drm/amdgpu: Enable scatter gather display support

2018-03-07 Thread Alex Deucher
On Wed, Mar 7, 2018 at 1:18 PM, Samuel Li  wrote:
> I think it's not useful though. Think about it: the SG display feature is 
> intended to use as little VRAM as possible. Will someone want a display buffer 
> sometimes in VRAM, sometimes in GTT?
> Hardly a likely case to me, and I think it's a waste of effort. That also might 
> explain why no driver does that now.

You might want different strategies depending on how much VRAM you
have.  If you have a bunch and it performs better, prefer it.  If you
have limited VRAM, you might want to prefer GTT.  You might also want
to prefer VRAM, but also allow buffers to fall back to GTT if
necessary.  This would make the logic dynamic and all in one place.
The kernel can advertise what asics support scanout from system ram
and then UMDs can just query that to choose placements rather than
adding hardcoded logic for specific asics.

Alex
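For illustration, a minimal user-space sketch of such a dynamic placement
policy; the names and the scarcity threshold are hypothetical, and no real
kernel query is assumed here:

#include <stdbool.h>
#include <stdint.h>

#define HYP_DOMAIN_VRAM 0x1u
#define HYP_DOMAIN_GTT  0x2u

/*
 * Hypothetical UMD-side policy: prefer VRAM when it is plentiful, prefer
 * GTT when VRAM is scarce, and allow a GTT fallback whenever the kernel
 * reports that the ASIC can scan out from system RAM. The 4x cut-off is
 * an arbitrary placeholder.
 */
static uint32_t hyp_choose_scanout_domains(bool sg_scanout_supported,
					   uint64_t vram_bytes,
					   uint64_t buffer_bytes)
{
	if (!sg_scanout_supported)
		return HYP_DOMAIN_VRAM;
	if (vram_bytes < 4 * buffer_bytes)
		return HYP_DOMAIN_GTT;
	/* Preferred domain first; GTT is the allowed fallback. */
	return HYP_DOMAIN_VRAM | HYP_DOMAIN_GTT;
}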

>
> Sam
>
>
>
> On 2018-03-07 12:50 PM, Michel Dänzer wrote:
>> On 2018-03-07 06:38 PM, Alex Deucher wrote:
>>> On Wed, Mar 7, 2018 at 12:29 PM, Samuel Li  wrote:

 Why so complicated? If old user space compatibility is required, just use 
 sg_display=0.
>>>
>>> It will always just work in that case and we can adjust for the
>>> optimal scenario by default in all cases without requiring the user to
>>> set misc parameters to tune their setup that may break in the future
>>> or hide bugs because a user sets it and forgets it.
>>
>> Right. I don't agree it's all that complicated anyway, we do this sort
>> of thing all the time, it also makes our lives easier down the road.
>>
>>
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 1/2] drm/amdgpu: Enable scatter gather display support

2018-03-07 Thread Samuel Li
I think it's not useful though. Think about it: the SG display feature is 
intended to use as little VRAM as possible. Will someone want a display buffer 
sometimes in VRAM, sometimes in GTT?
Hardly a likely case to me, and I think it's a waste of effort. That also might 
explain why no driver does that now.

Sam



On 2018-03-07 12:50 PM, Michel Dänzer wrote:
> On 2018-03-07 06:38 PM, Alex Deucher wrote:
>> On Wed, Mar 7, 2018 at 12:29 PM, Samuel Li  wrote:
>>>
>>> Why so complicated? If old user space compatibility is required, just use 
>>> sg_display=0.
>>
>> It will always just work in that case and we can adjust for the
>> optimal scenario by default in all cases without requiring the user to
>> set misc parameters to tune their setup that may break in the future
>> or hide bugs because a user sets it and forgets it.
> 
> Right. I don't agree it's all that complicated anyway, we do this sort
> of thing all the time, it also makes our lives easier down the road.
> 
> 
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 1/2] drm/amdgpu: Enable scatter gather display support

2018-03-07 Thread Michel Dänzer
On 2018-03-07 06:38 PM, Alex Deucher wrote:
> On Wed, Mar 7, 2018 at 12:29 PM, Samuel Li  wrote:
>>
>> Why so complicated? If old user space compatibility is required, just use 
>> sg_display=0.
> 
> It will always just work in that case and we can adjust for the
> optimal scenario by default in all cases without requiring the user to
> set misc parameters to tune their setup that may break in the future
> or hide bugs because a user sets it and forgets it.

Right. I don't agree it's all that complicated anyway, we do this sort
of thing all the time, it also makes our lives easier down the road.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: Modesetting (amdgpudrmfb) artifacts on RAVEN APU

2018-03-07 Thread KARBOWSKI Piotr

On 2018-03-07 17:48, KARBOWSKI Piotr wrote:

On 2018-03-07 17:14, Harry Wentland wrote:
Pushed a drm-fixes-4.16 branch (based on Alex's) to my FDO repo 
at https://cgit.freedesktop.org/~hwentland/linux/  but I don't have a 
Raven system setup to test this at the moment.


Tom, or Piotr, would you be able to check that the fix works as 
intended on that branch?


Tested that branch and confirmed that the VGA patch is there, but I got the 
original result of broken video. Rebooted a few times. It seems that my 
issue was not the VGA takeover after all. I will re-test it with vanilla 
amd-staging-drm-next of git://people.freedesktop.org/~agd5f/linux 
without the VGA patch and update you.


I re-tested original tree of git://people.freedesktop.org/~agd5f/linux 
on branch amd-staging-drm-next and got the 'double screen' glitches, 
then I applied the patch Tom sent me as the very first response to this 
thread and then it indeed fixed it. So the VGA takeover patch is needed, 
it seems.


I triple checked and built the images, and the drm-fixes-4.16 of 
git://people.freedesktop.org/~hwentland/linux does not fix the problem 
for me.


So far only amd-staging-drm-next with 
0001-drm-amd-display-Fix-takeover-from-VGA-mode.patch does the trick. 
One thing worth noting is that amd-staging-drm-next identifies as `rc1`, 
whereas your fork, Harry, appears to be rc3; no idea if it's related or not.


-- Piotr.
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH xf86-video-amdgpu] Wrap the whole miPointerScreenFuncRec, instead of only Set/MoveCursor

2018-03-07 Thread Michel Dänzer
From: Michel Dänzer 

We were clobbering entries in mi's global miSpritePointerFuncs struct,
which cannot work correctly with multiple primary screens. Instead,
assign a pointer to our own wrapper struct to PointPriv->spriteFuncs.

Fixes crashes with multiple primary screens.

Fixes: 69e20839bfeb ("Keep track of how many SW cursors are visible on
  each screen")
Reported-by: Mario Kleiner 
Signed-off-by: Michel Dänzer 
---
 src/amdgpu_drv.h  |  4 +---
 src/amdgpu_kms.c  | 14 
 src/drmmode_display.c | 61 ++-
 src/drmmode_display.h |  7 +++---
 4 files changed, 63 insertions(+), 23 deletions(-)

diff --git a/src/amdgpu_drv.h b/src/amdgpu_drv.h
index 40157262e..acf697730 100644
--- a/src/amdgpu_drv.h
+++ b/src/amdgpu_drv.h
@@ -281,9 +281,7 @@ typedef struct {
CreateScreenResourcesProcPtr CreateScreenResources;
CreateWindowProcPtr CreateWindow;
WindowExposuresProcPtr WindowExposures;
-   void (*SetCursor) (DeviceIntPtr pDev, ScreenPtr pScreen,
-  CursorPtr pCursor, int x, int y);
-   void (*MoveCursor) (DeviceIntPtr pDev, ScreenPtr pScreen, int x, int y);
+   miPointerSpriteFuncPtr SpriteFuncs;
 
/* Number of SW cursors currently visible on this screen */
int sprites_visible;
diff --git a/src/amdgpu_kms.c b/src/amdgpu_kms.c
index 90e1f2531..b287fcc6e 100644
--- a/src/amdgpu_kms.c
+++ b/src/amdgpu_kms.c
@@ -1551,12 +1551,8 @@ static Bool AMDGPUCursorInit_KMS(ScreenPtr pScreen)
return FALSE;
}
 
-   if (PointPriv->spriteFuncs->SetCursor != 
drmmode_sprite_set_cursor) {
-   info->SetCursor = PointPriv->spriteFuncs->SetCursor;
-   info->MoveCursor = PointPriv->spriteFuncs->MoveCursor;
-   PointPriv->spriteFuncs->SetCursor = 
drmmode_sprite_set_cursor;
-   PointPriv->spriteFuncs->MoveCursor = 
drmmode_sprite_move_cursor;
-   }
+   info->SpriteFuncs = PointPriv->spriteFuncs;
+   PointPriv->spriteFuncs = _sprite_funcs;
}
 
if (xf86ReturnOptValBool(info->Options, OPTION_SW_CURSOR, FALSE))
@@ -1733,10 +1729,8 @@ static Bool AMDGPUCloseScreen_KMS(ScreenPtr pScreen)
miPointerScreenPtr PointPriv =
dixLookupPrivate(>devPrivates, 
miPointerScreenKey);
 
-   if (PointPriv->spriteFuncs->SetCursor == 
drmmode_sprite_set_cursor) {
-   PointPriv->spriteFuncs->SetCursor = info->SetCursor;
-   PointPriv->spriteFuncs->MoveCursor = info->MoveCursor;
-   }
+   if (PointPriv->spriteFuncs == _sprite_funcs)
+   PointPriv->spriteFuncs = info->SpriteFuncs;
}
 
pScreen->BlockHandler = info->BlockHandler;
diff --git a/src/drmmode_display.c b/src/drmmode_display.c
index 746e52aa7..3513c1f93 100644
--- a/src/drmmode_display.c
+++ b/src/drmmode_display.c
@@ -37,6 +37,7 @@
 #include "inputstr.h"
 #include "list.h"
 #include "micmap.h"
+#include "mipointrst.h"
 #include "xf86cmap.h"
 #include "xf86Priv.h"
 #include "sarea.h"
@@ -2550,8 +2551,8 @@ static void drmmode_sprite_do_set_cursor(struct 
amdgpu_device_priv *device_priv,
info->sprites_visible += device_priv->sprite_visible - sprite_visible;
 }
 
-void drmmode_sprite_set_cursor(DeviceIntPtr pDev, ScreenPtr pScreen,
-  CursorPtr pCursor, int x, int y)
+static void drmmode_sprite_set_cursor(DeviceIntPtr pDev, ScreenPtr pScreen,
+ CursorPtr pCursor, int x, int y)
 {
ScrnInfoPtr scrn = xf86ScreenToScrn(pScreen);
AMDGPUInfoPtr info = AMDGPUPTR(scrn);
@@ -2562,11 +2563,11 @@ void drmmode_sprite_set_cursor(DeviceIntPtr pDev, 
ScreenPtr pScreen,
device_priv->cursor = pCursor;
drmmode_sprite_do_set_cursor(device_priv, scrn, x, y);
 
-   info->SetCursor(pDev, pScreen, pCursor, x, y);
+   info->SpriteFuncs->SetCursor(pDev, pScreen, pCursor, x, y);
 }
 
-void drmmode_sprite_move_cursor(DeviceIntPtr pDev, ScreenPtr pScreen, int x,
-   int y)
+static void drmmode_sprite_move_cursor(DeviceIntPtr pDev, ScreenPtr pScreen,
+  int x, int y)
 {
ScrnInfoPtr scrn = xf86ScreenToScrn(pScreen);
AMDGPUInfoPtr info = AMDGPUPTR(scrn);
@@ -2576,9 +2577,57 @@ void drmmode_sprite_move_cursor(DeviceIntPtr pDev, 
ScreenPtr pScreen, int x,
 
drmmode_sprite_do_set_cursor(device_priv, scrn, x, y);
 
-   info->MoveCursor(pDev, pScreen, x, y);
+   info->SpriteFuncs->MoveCursor(pDev, pScreen, x, y);
 }
 
+static Bool drmmode_sprite_realize_realize_cursor(DeviceIntPtr pDev,
+ ScreenPtr pScreen,
+  

[PATCH xf86-video-ati] Wrap the whole miPointerScreenFuncRec, instead of only Set/MoveCursor

2018-03-07 Thread Michel Dänzer
From: Michel Dänzer 

We were clobbering entries in mi's global miSpritePointerFuncs struct,
which cannot work correctly with multiple primary screens. Instead,
assign a pointer to our own wrapper struct to PointPriv->spriteFuncs.

Fixes crashes with multiple primary screens.

Fixes: 1fe8ca75974c ("Keep track of how many SW cursors are visible on
  each screen")
Reported-by: Mario Kleiner 
Signed-off-by: Michel Dänzer 
---
 src/drmmode_display.c | 61 ++-
 src/drmmode_display.h |  7 +++---
 src/radeon.h  |  4 +---
 src/radeon_kms.c  | 14 
 4 files changed, 63 insertions(+), 23 deletions(-)

diff --git a/src/drmmode_display.c b/src/drmmode_display.c
index b1f5c4888..93261dc8d 100644
--- a/src/drmmode_display.c
+++ b/src/drmmode_display.c
@@ -37,6 +37,7 @@
 #include "inputstr.h"
 #include "list.h"
 #include "micmap.h"
+#include "mipointrst.h"
 #include "xf86cmap.h"
 #include "xf86Priv.h"
 #include "radeon.h"
@@ -2750,8 +2751,8 @@ static void drmmode_sprite_do_set_cursor(struct 
radeon_device_priv *device_priv,
info->sprites_visible += device_priv->sprite_visible - sprite_visible;
 }
 
-void drmmode_sprite_set_cursor(DeviceIntPtr pDev, ScreenPtr pScreen,
-  CursorPtr pCursor, int x, int y)
+static void drmmode_sprite_set_cursor(DeviceIntPtr pDev, ScreenPtr pScreen,
+ CursorPtr pCursor, int x, int y)
 {
ScrnInfoPtr scrn = xf86ScreenToScrn(pScreen);
RADEONInfoPtr info = RADEONPTR(scrn);
@@ -2762,11 +2763,11 @@ void drmmode_sprite_set_cursor(DeviceIntPtr pDev, 
ScreenPtr pScreen,
device_priv->cursor = pCursor;
drmmode_sprite_do_set_cursor(device_priv, scrn, x, y);
 
-   info->SetCursor(pDev, pScreen, pCursor, x, y);
+   info->SpriteFuncs->SetCursor(pDev, pScreen, pCursor, x, y);
 }
 
-void drmmode_sprite_move_cursor(DeviceIntPtr pDev, ScreenPtr pScreen, int x,
-   int y)
+static void drmmode_sprite_move_cursor(DeviceIntPtr pDev, ScreenPtr pScreen,
+  int x, int y)
 {
ScrnInfoPtr scrn = xf86ScreenToScrn(pScreen);
RADEONInfoPtr info = RADEONPTR(scrn);
@@ -2776,9 +2777,57 @@ void drmmode_sprite_move_cursor(DeviceIntPtr pDev, 
ScreenPtr pScreen, int x,
 
drmmode_sprite_do_set_cursor(device_priv, scrn, x, y);
 
-   info->MoveCursor(pDev, pScreen, x, y);
+   info->SpriteFuncs->MoveCursor(pDev, pScreen, x, y);
 }
 
+static Bool drmmode_sprite_realize_realize_cursor(DeviceIntPtr pDev,
+ ScreenPtr pScreen,
+ CursorPtr pCursor)
+{
+   ScrnInfoPtr scrn = xf86ScreenToScrn(pScreen);
+   RADEONInfoPtr info = RADEONPTR(scrn);
+
+   return info->SpriteFuncs->RealizeCursor(pDev, pScreen, pCursor);
+}
+
+static Bool drmmode_sprite_realize_unrealize_cursor(DeviceIntPtr pDev,
+   ScreenPtr pScreen,
+   CursorPtr pCursor)
+{
+   ScrnInfoPtr scrn = xf86ScreenToScrn(pScreen);
+   RADEONInfoPtr info = RADEONPTR(scrn);
+
+   return info->SpriteFuncs->UnrealizeCursor(pDev, pScreen, pCursor);
+}
+
+static Bool drmmode_sprite_device_cursor_initialize(DeviceIntPtr pDev,
+   ScreenPtr pScreen)
+{
+   ScrnInfoPtr scrn = xf86ScreenToScrn(pScreen);
+   RADEONInfoPtr info = RADEONPTR(scrn);
+
+   return info->SpriteFuncs->DeviceCursorInitialize(pDev, pScreen);
+}
+
+static void drmmode_sprite_device_cursor_cleanup(DeviceIntPtr pDev,
+ScreenPtr pScreen)
+{
+   ScrnInfoPtr scrn = xf86ScreenToScrn(pScreen);
+   RADEONInfoPtr info = RADEONPTR(scrn);
+
+   info->SpriteFuncs->DeviceCursorCleanup(pDev, pScreen);
+}
+
+miPointerSpriteFuncRec drmmode_sprite_funcs = {
+   .RealizeCursor = drmmode_sprite_realize_realize_cursor,
+   .UnrealizeCursor = drmmode_sprite_realize_unrealize_cursor,
+   .SetCursor = drmmode_sprite_set_cursor,
+   .MoveCursor = drmmode_sprite_move_cursor,
+   .DeviceCursorInitialize = drmmode_sprite_device_cursor_initialize,
+   .DeviceCursorCleanup = drmmode_sprite_device_cursor_cleanup,
+};
+
+   
 void drmmode_set_cursor(ScrnInfoPtr scrn, drmmode_ptr drmmode, int id, struct 
radeon_bo *bo)
 {
xf86CrtcConfigPtr   xf86_config = XF86_CRTC_CONFIG_PTR(scrn);
diff --git a/src/drmmode_display.h b/src/drmmode_display.h
index 39d2d94a1..e0b97e721 100644
--- a/src/drmmode_display.h
+++ b/src/drmmode_display.h
@@ -206,10 +206,6 @@ extern Bool drmmode_pre_init(ScrnInfoPtr pScrn, 
drmmode_ptr drmmode, int cpp);
 extern void drmmode_init(ScrnInfoPtr pScrn, drmmode_ptr drmmode);
 extern void 

Re: [PATCH 1/2] drm/amdgpu: Enable scatter gather display support

2018-03-07 Thread Alex Deucher
On Wed, Mar 7, 2018 at 12:29 PM, Samuel Li  wrote:
>
> Why so complicated? If old user space compatibility is required, just use 
> sg_display=0.

It will always just work in that case and we can adjust for the
optimal scenario by default in all cases without requiring the user to
set misc parameters to tune their setup that may break in the future
or hide bugs because a user sets it and forgets it.

Alex

>
> Sam
>
>
> On 2018-03-07 05:12 AM, Michel Dänzer wrote:
>> On 2018-03-07 11:04 AM, Christian König wrote:
>>> Am 07.03.2018 um 10:53 schrieb Michel Dänzer:
 On 2018-03-07 10:47 AM, Christian König wrote:
> See, when I tested this the DDX and Mesa were unmodified, so both still
> assumed VRAM as placement for scanout BOs, but the kernel forced scanout
> BOs into GTT for testing.
>
> So what happened was that on scanout we moved the VRAM BO to GTT and
> after unpinning it on the first command submission which used the BO we
> moved it back to VRAM again.
>
> So as long as we don't want a severe performance penalty when you
> combine old userspace with new kernel using a kernel parameter to force
> scanout from GTT is not possible.
 That means we either need to come up with a different workaround for
 hardware issues transitioning between scanout from VRAM and GTT, or we
 can't scan out from GTT in that case.
>>>
>>> What exactly was the problem with the transition from VRAM to GTT?
>>>
>>> I mean what we can rather easily do is check where the existing BO is
>>> pinned and then always pin into the same domain on page flip.
>>
>> Yeah, something like that could work. In which case, all that's needed
>> is a way for userspace to know that GTT scanout can work, right? With
>> that information, it can manage the domains of scanout BOs as it wants.
>>
>>
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 1/2] drm/amdgpu: Enable scatter gather display support

2018-03-07 Thread Samuel Li

Why so complicated? If old user space compatibility is required, just use 
sg_display=0.

Sam


On 2018-03-07 05:12 AM, Michel Dänzer wrote:
> On 2018-03-07 11:04 AM, Christian König wrote:
>> Am 07.03.2018 um 10:53 schrieb Michel Dänzer:
>>> On 2018-03-07 10:47 AM, Christian König wrote:
See, when I tested this the DDX and Mesa were unmodified, so both still
 assumed VRAM as placement for scanout BOs, but the kernel forced scanout
 BOs into GTT for testing.

 So what happened was that on scanout we moved the VRAM BO to GTT and
 after unpinning it on the first command submission which used the BO we
 moved it back to VRAM again.

 So as long as we don't want a severe performance penalty when you
 combine old userspace with new kernel using a kernel parameter to force
 scanout from GTT is not possible.
>>> That means we either need to come up with a different workaround for
>>> hardware issues transitioning between scanout from VRAM and GTT, or we
>>> can't scan out from GTT in that case.
>>
>> What exactly was the problem with the transition from VRAM to GTT?
>>
>> I mean what we can rather easily do is check where the existing BO is
>> pinned and then always pin into the same domain on page flip.
> 
> Yeah, something like that could work. In which case, all that's needed
> is a way for userspace to know that GTT scanout can work, right? With
> that information, it can manage the domains of scanout BOs as it wants.
> 
> 
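A minimal sketch of that flip-time rule, with hypothetical names rather than
the actual amdgpu code:

#include <stdbool.h>

#define HYP_DOMAIN_VRAM 0x1u
#define HYP_DOMAIN_GTT  0x2u

struct hyp_bo {
	unsigned int current_domain;	/* where the BO resides right now */
};

/*
 * Hypothetical flip-time placement: if the BO already lives in a domain
 * the display hardware can scan out from, pin it there instead of forcing
 * a migration, avoiding the VRAM<->GTT ping-pong described above.
 */
static unsigned int hyp_flip_pin_domain(const struct hyp_bo *bo,
					bool gtt_scanout_ok)
{
	if (bo->current_domain == HYP_DOMAIN_GTT && gtt_scanout_ok)
		return HYP_DOMAIN_GTT;
	return HYP_DOMAIN_VRAM;
}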
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 6/8] drm/amd/pp: Add new smu backend function smc_table_manager

2018-03-07 Thread Alex Deucher
On Wed, Mar 7, 2018 at 5:46 AM, Rex Zhu  wrote:
> Change-Id: I4c68f7627387c4ae67612e09651318f5ae90162a
> Signed-off-by: Rex Zhu 

Patches 6-8:
Reviewed-by: Alex Deucher 
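For reference, a minimal sketch of how a backend-agnostic caller might use
the new dispatcher; the helper names and table ids are placeholders, not
part of the patch:

#include "smumgr.h"	/* declares smum_smc_table_manager() per this patch */

/* rw == false writes the driver-side table to SMC... */
static int hyp_upload_table(struct pp_hwmgr *hwmgr,
			    uint8_t *table, uint16_t table_id)
{
	return smum_smc_table_manager(hwmgr, table, table_id, false);
}

/* ...and rw == true reads it back, per the comment in hwmgr.h. */
static int hyp_readback_table(struct pp_hwmgr *hwmgr,
			      uint8_t *table, uint16_t table_id)
{
	return smum_smc_table_manager(hwmgr, table, table_id, true);
}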

> ---
>  drivers/gpu/drm/amd/powerplay/inc/hwmgr.h | 1 +
>  drivers/gpu/drm/amd/powerplay/inc/smumgr.h| 2 ++
>  drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c | 8 
>  3 files changed, 11 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/powerplay/inc/hwmgr.h 
> b/drivers/gpu/drm/amd/powerplay/inc/hwmgr.h
> index 312fbc3..494f891 100644
> --- a/drivers/gpu/drm/amd/powerplay/inc/hwmgr.h
> +++ b/drivers/gpu/drm/amd/powerplay/inc/hwmgr.h
> @@ -237,6 +237,7 @@ struct pp_smumgr_func {
> bool (*is_dpm_running)(struct pp_hwmgr *hwmgr);
> bool (*is_hw_avfs_present)(struct pp_hwmgr  *hwmgr);
> int (*update_dpm_settings)(struct pp_hwmgr *hwmgr, void 
> *profile_setting);
> +   int (*smc_table_manager)(struct pp_hwmgr *hwmgr, uint8_t *table, 
> uint16_t table_id, bool rw); /*rw: true for read, false for write */
>  };
>
>  struct pp_hwmgr_func {
> diff --git a/drivers/gpu/drm/amd/powerplay/inc/smumgr.h 
> b/drivers/gpu/drm/amd/powerplay/inc/smumgr.h
> index f0ef02a..cbb0166 100644
> --- a/drivers/gpu/drm/amd/powerplay/inc/smumgr.h
> +++ b/drivers/gpu/drm/amd/powerplay/inc/smumgr.h
> @@ -113,4 +113,6 @@ extern uint32_t smum_get_offsetof(struct pp_hwmgr *hwmgr,
>
>  extern int smum_update_dpm_settings(struct pp_hwmgr *hwmgr, void 
> *profile_setting);
>
> +extern int smum_smc_table_manager(struct pp_hwmgr *hwmgr, uint8_t *table, 
> uint16_t table_id, bool rw);
> +
>  #endif
> diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c 
> b/drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c
> index 68d943d..04c45c2 100644
> --- a/drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c
> +++ b/drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c
> @@ -199,3 +199,11 @@ int smum_update_dpm_settings(struct pp_hwmgr *hwmgr, 
> void *profile_setting)
>
> return -EINVAL;
>  }
> +
> +int smum_smc_table_manager(struct pp_hwmgr *hwmgr, uint8_t *table, uint16_t 
> table_id, bool rw)
> +{
> +   if (hwmgr->smumgr_funcs->smc_table_manager)
> +   return hwmgr->smumgr_funcs->smc_table_manager(hwmgr, table, 
> table_id, rw);
> +
> +   return -EINVAL;
> +}
> --
> 1.9.1
>
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 5/8] drm/amd/pp: Clean up code for RV in powerplay.

2018-03-07 Thread Alex Deucher
On Wed, Mar 7, 2018 at 5:46 AM, Rex Zhu  wrote:
> 1. Make functions static.
> 2. Add rv_read_arg_from_smc to the smu backend function table.

Please make this two patches since there are two logical changes here.
Also see my additional comment below.
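For context, the converted send/read-back pattern distilled from the hunks
below (a sketch; the wrapper function itself is hypothetical):

/* Assumes hwmgr.h/smumgr.h for the smum_* helpers and the RV ppsmc
 * header for the message id. */
static uint32_t hyp_query_gfxclk(struct pp_hwmgr *hwmgr)
{
	/* Send the query, then fetch the SMC response through the smum
	 * wrapper instead of the driver-private rv_read_arg_from_smc(). */
	smum_send_msg_to_smc(hwmgr, PPSMC_MSG_GetGfxclkFrequency);
	return smum_get_argument(hwmgr);	/* contents of mmMP1_SMN_C2PMSG_82 */
}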

>
> Change-Id: I9baecaec95553923e59d1a4cf92a67551769b755
> Signed-off-by: Rex Zhu 
> ---
>  drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.c   | 12 ++--
>  drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.c | 18 --
>  drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.h |  2 --
>  3 files changed, 14 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.c 
> b/drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.c
> index 4b5c5fc..474612f 100644
> --- a/drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.c
> +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.c
> @@ -386,11 +386,11 @@ static int rv_populate_clock_table(struct pp_hwmgr 
> *hwmgr)
> ARRAY_SIZE(VddPhyClk), [0]);
>
> smum_send_msg_to_smc(hwmgr, PPSMC_MSG_GetMinGfxclkFrequency);
> -   rv_read_arg_from_smc(hwmgr, );
> +   result = smum_get_argument(hwmgr);
> rv_data->gfx_min_freq_limit = result * 100;
>
> smum_send_msg_to_smc(hwmgr, PPSMC_MSG_GetMaxGfxclkFrequency);
> -   rv_read_arg_from_smc(hwmgr, );
> +   result = smum_get_argument(hwmgr);
> rv_data->gfx_max_freq_limit = result * 100;
>
> return 0;
> @@ -726,7 +726,7 @@ static int rv_print_clock_levels(struct pp_hwmgr *hwmgr,
> switch (type) {
> case PP_SCLK:
> smum_send_msg_to_smc(hwmgr, PPSMC_MSG_GetGfxclkFrequency);
> -   rv_read_arg_from_smc(hwmgr, );
> +   now = smum_get_argument(hwmgr);
>
> size += sprintf(buf + size, "0: %uMhz %s\n",
> data->gfx_min_freq_limit / 100,
> @@ -739,7 +739,7 @@ static int rv_print_clock_levels(struct pp_hwmgr *hwmgr,
> break;
> case PP_MCLK:
> smum_send_msg_to_smc(hwmgr, PPSMC_MSG_GetFclkFrequency);
> -   rv_read_arg_from_smc(hwmgr, );
> +   now = smum_get_argument(hwmgr);
>
> for (i = 0; i < mclk_table->count; i++)
> size += sprintf(buf + size, "%d: %uMhz %s\n",
> @@ -971,14 +971,14 @@ static int rv_read_sensor(struct pp_hwmgr *hwmgr, int 
> idx,
> switch (idx) {
> case AMDGPU_PP_SENSOR_GFX_SCLK:
> smum_send_msg_to_smc(hwmgr, PPSMC_MSG_GetGfxclkFrequency);
> -   rv_read_arg_from_smc(hwmgr, );
> +   sclk = smum_get_argument(hwmgr);
> /* in units of 10KHZ */
> *((uint32_t *)value) = sclk * 100;
> *size = 4;
> break;
> case AMDGPU_PP_SENSOR_GFX_MCLK:
> smum_send_msg_to_smc(hwmgr, PPSMC_MSG_GetFclkFrequency);
> -   rv_read_arg_from_smc(hwmgr, );
> +   mclk = smum_get_argument(hwmgr);
> /* in units of 10KHZ */
> *((uint32_t *)value) = mclk * 100;
> *size = 4;
> diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.c 
> b/drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.c
> index b64bfe3..7f2866c 100644
> --- a/drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.c
> +++ b/drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.c
> @@ -60,7 +60,7 @@ static uint32_t rv_wait_for_response(struct pp_hwmgr *hwmgr)
> return cgs_read_register(hwmgr->device, reg);
>  }
>
> -int rv_send_msg_to_smc_without_waiting(struct pp_hwmgr *hwmgr,
> +static int rv_send_msg_to_smc_without_waiting(struct pp_hwmgr *hwmgr,
> uint16_t msg)
>  {
> uint32_t reg;
> @@ -72,19 +72,17 @@ int rv_send_msg_to_smc_without_waiting(struct pp_hwmgr 
> *hwmgr,
> return 0;
>  }
>
> -int rv_read_arg_from_smc(struct pp_hwmgr *hwmgr, uint32_t *arg)
> +static int rv_read_arg_from_smc(struct pp_hwmgr *hwmgr)
>  {
> uint32_t reg;
>
> reg = soc15_get_register_offset(MP1_HWID, 0,
> mmMP1_SMN_C2PMSG_82_BASE_IDX, mmMP1_SMN_C2PMSG_82);
>
> -   *arg = cgs_read_register(hwmgr->device, reg);
> -
> -   return 0;
> +   return cgs_read_register(hwmgr->device, reg);
>  }
>
> -int rv_send_msg_to_smc(struct pp_hwmgr *hwmgr, uint16_t msg)
> +static int rv_send_msg_to_smc(struct pp_hwmgr *hwmgr, uint16_t msg)
>  {
> uint32_t reg;
>
> @@ -103,7 +101,7 @@ int rv_send_msg_to_smc(struct pp_hwmgr *hwmgr, uint16_t 
> msg)
>  }
>
>
> -int rv_send_msg_to_smc_with_parameter(struct pp_hwmgr *hwmgr,
> +static int rv_send_msg_to_smc_with_parameter(struct pp_hwmgr *hwmgr,
> uint16_t msg, uint32_t parameter)
>  {
> uint32_t reg;
> @@ -190,8 +188,7 @@ static int rv_verify_smc_interface(struct pp_hwmgr *hwmgr)
>
> rv_send_msg_to_smc(hwmgr,
> 

Re: [PATCH 4/8] drm/amd/pp: Clean up the code for RV in powerplay

2018-03-07 Thread Alex Deucher
On Wed, Mar 7, 2018 at 5:46 AM, Rex Zhu  wrote:
> In the send_message_to_smu helper functions,
> print out the error code for debugging if the smu failed to respond.
>
> And the helper functions always return true, so there is no need to
> check their return value.
>
> Signed-off-by: Rex Zhu 

Is this consistent with our other smu implementations?  If we never
use the errors, I'm ok with removing them, but I don't want to have to
add them again later if we decide we actually need them.

Alex

>
> Change-Id: I27da8bf22d3cea0df968df7b0809dc727461762f
> ---
>  drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.c   | 72 +
>  drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.c | 98 
> 
>  2 files changed, 53 insertions(+), 117 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.c 
> b/drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.c
> index 8ddfb78..4b5c5fc 100644
> --- a/drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.c
> +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.c
> @@ -243,8 +243,7 @@ static int rv_disable_gfx_off(struct pp_hwmgr *hwmgr)
> struct rv_hwmgr *rv_data = (struct rv_hwmgr *)(hwmgr->backend);
>
> if (rv_data->gfx_off_controled_by_driver)
> -   smum_send_msg_to_smc(hwmgr,
> -   PPSMC_MSG_DisableGfxOff);
> +   smum_send_msg_to_smc(hwmgr, PPSMC_MSG_DisableGfxOff);
>
> return 0;
>  }
> @@ -259,8 +258,7 @@ static int rv_enable_gfx_off(struct pp_hwmgr *hwmgr)
> struct rv_hwmgr *rv_data = (struct rv_hwmgr *)(hwmgr->backend);
>
> if (rv_data->gfx_off_controled_by_driver)
> -   smum_send_msg_to_smc(hwmgr,
> -   PPSMC_MSG_EnableGfxOff);
> +   smum_send_msg_to_smc(hwmgr, PPSMC_MSG_EnableGfxOff);
>
> return 0;
>  }
> @@ -387,24 +385,12 @@ static int rv_populate_clock_table(struct pp_hwmgr 
> *hwmgr)
> rv_get_clock_voltage_dependency_table(hwmgr, 
> >vdd_dep_on_phyclk,
> ARRAY_SIZE(VddPhyClk), [0]);
>
> -   PP_ASSERT_WITH_CODE(!smum_send_msg_to_smc(hwmgr,
> -   PPSMC_MSG_GetMinGfxclkFrequency),
> -   "Attempt to get min GFXCLK Failed!",
> -   return -1);
> -   PP_ASSERT_WITH_CODE(!rv_read_arg_from_smc(hwmgr,
> -   ),
> -   "Attempt to get min GFXCLK Failed!",
> -   return -1);
> +   smum_send_msg_to_smc(hwmgr, PPSMC_MSG_GetMinGfxclkFrequency);
> +   rv_read_arg_from_smc(hwmgr, );
> rv_data->gfx_min_freq_limit = result * 100;
>
> -   PP_ASSERT_WITH_CODE(!smum_send_msg_to_smc(hwmgr,
> -   PPSMC_MSG_GetMaxGfxclkFrequency),
> -   "Attempt to get max GFXCLK Failed!",
> -   return -1);
> -   PP_ASSERT_WITH_CODE(!rv_read_arg_from_smc(hwmgr,
> -   ),
> -   "Attempt to get max GFXCLK Failed!",
> -   return -1);
> +   smum_send_msg_to_smc(hwmgr, PPSMC_MSG_GetMaxGfxclkFrequency);
> +   rv_read_arg_from_smc(hwmgr, );
> rv_data->gfx_max_freq_limit = result * 100;
>
> return 0;
> @@ -739,14 +725,8 @@ static int rv_print_clock_levels(struct pp_hwmgr *hwmgr,
>
> switch (type) {
> case PP_SCLK:
> -   PP_ASSERT_WITH_CODE(!smum_send_msg_to_smc(hwmgr,
> -   PPSMC_MSG_GetGfxclkFrequency),
> -   "Attempt to get current GFXCLK Failed!",
> -   return -1);
> -   PP_ASSERT_WITH_CODE(!rv_read_arg_from_smc(hwmgr,
> -   ),
> -   "Attempt to get current GFXCLK Failed!",
> -   return -1);
> +   smum_send_msg_to_smc(hwmgr, PPSMC_MSG_GetGfxclkFrequency);
> +   rv_read_arg_from_smc(hwmgr, );
>
> size += sprintf(buf + size, "0: %uMhz %s\n",
> data->gfx_min_freq_limit / 100,
> @@ -758,14 +738,8 @@ static int rv_print_clock_levels(struct pp_hwmgr *hwmgr,
>  == now) ? "*" : "");
> break;
> case PP_MCLK:
> -   PP_ASSERT_WITH_CODE(!smum_send_msg_to_smc(hwmgr,
> -   PPSMC_MSG_GetFclkFrequency),
> -   "Attempt to get current MEMCLK Failed!",
> -   return -1);
> -   PP_ASSERT_WITH_CODE(!rv_read_arg_from_smc(hwmgr,
> -   ),
> -   "Attempt to get current MEMCLK Failed!",
> -   return -1);
> +   smum_send_msg_to_smc(hwmgr, PPSMC_MSG_GetFclkFrequency);
> +   rv_read_arg_from_smc(hwmgr, );
>
> for (i = 0; i < 

Re: [PATCH 3/8] drm/amd/pp: Delete is_smc_ram_running function on RV

2018-03-07 Thread Alex Deucher
On Wed, Mar 7, 2018 at 5:46 AM, Rex Zhu  wrote:
> 1. There is a race condition when another ip block also uses the same register pairs.
> 2. Checking once at boot with the GetDriverIfVersion message is sufficient
>    to verify SMU health, so delete the is_smc_ram_running check.
>
> Change-Id: I3c5f5ebac36c1ae4a7dd72f482fc27a4f362975e
> Signed-off-by: Rex Zhu 

Reviewed-by: Alex Deucher 
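The race in point 1 is the classic unsynchronized INDEX/DATA pair. A minimal
illustration with stub register helpers, not real driver code:

void write_reg(unsigned int reg, unsigned int val);
unsigned int read_reg(unsigned int reg);

/*
 * The read below is only correct if nothing else retargets the shared
 * index register between the two accesses.
 */
unsigned int hyp_racy_indirect_read(unsigned int pcie_index,
				    unsigned int pcie_data,
				    unsigned int addr)
{
	write_reg(pcie_index, addr);
	/* <-- another IP block may rewrite pcie_index right here */
	return read_reg(pcie_data);	/* may return an unrelated register */
}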

> ---
>  drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.c | 27 
> 
>  1 file changed, 27 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.c 
> b/drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.c
> index 6362d46..e6317fd 100644
> --- a/drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.c
> +++ b/drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.c
> @@ -47,34 +47,10 @@
>  #define smnMP1_FIRMWARE_FLAGS   0x3010028
>
>
> -bool rv_is_smc_ram_running(struct pp_hwmgr *hwmgr)
> -{
> -   uint32_t mp1_fw_flags, reg;
> -
> -   reg = soc15_get_register_offset(NBIF_HWID, 0,
> -   mmPCIE_INDEX2_BASE_IDX, mmPCIE_INDEX2);
> -
> -   cgs_write_register(hwmgr->device, reg,
> -   (MP1_Public | (smnMP1_FIRMWARE_FLAGS & 0x)));
> -
> -   reg = soc15_get_register_offset(NBIF_HWID, 0,
> -   mmPCIE_DATA2_BASE_IDX, mmPCIE_DATA2);
> -
> -   mp1_fw_flags = cgs_read_register(hwmgr->device, reg);
> -
> -   if (mp1_fw_flags & MP1_FIRMWARE_FLAGS__INTERRUPTS_ENABLED_MASK)
> -   return true;
> -
> -   return false;
> -}
> -
>  static uint32_t rv_wait_for_response(struct pp_hwmgr *hwmgr)
>  {
> uint32_t reg;
>
> -   if (!rv_is_smc_ram_running(hwmgr))
> -   return -EINVAL;
> -
> reg = soc15_get_register_offset(MP1_HWID, 0,
> mmMP1_SMN_C2PMSG_90_BASE_IDX, mmMP1_SMN_C2PMSG_90);
>
> @@ -89,9 +65,6 @@ int rv_send_msg_to_smc_without_waiting(struct pp_hwmgr 
> *hwmgr,
>  {
> uint32_t reg;
>
> -   if (!rv_is_smc_ram_running(hwmgr))
> -   return -EINVAL;
> -
> reg = soc15_get_register_offset(MP1_HWID, 0,
> mmMP1_SMN_C2PMSG_66_BASE_IDX, mmMP1_SMN_C2PMSG_66);
> cgs_write_register(hwmgr->device, reg, msg);
> --
> 1.9.1
>
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 2/8] drm/amd/pp: Clean up header file include

2018-03-07 Thread Alex Deucher
On Wed, Mar 7, 2018 at 5:46 AM, Rex Zhu  wrote:
> smu7_smumgr.h should not be included in rv and vega; the common functions
> for all smu7 asics are put in smu7_smumgr.c.
>
> Do not include the cgs interface in smumgr.c; the code that used the cgs
> interface has been deleted.
>
> Do not include smu_ucode_xfer_vi.h in rv/vega.
>
> Change-Id: I3c9751594919c9598aed6a518d1b02eb698de9ec
> Signed-off-by: Rex Zhu 

Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.c | 3 +--
>  drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c| 1 -
>  drivers/gpu/drm/amd/powerplay/smumgr/vega10_smumgr.c | 6 ++
>  3 files changed, 3 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.c 
> b/drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.c
> index e2ee23a..6362d46 100644
> --- a/drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.c
> +++ b/drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.c
> @@ -31,8 +31,7 @@
>  #include "smu10.h"
>  #include "ppatomctrl.h"
>  #include "pp_debug.h"
> -#include "smu_ucode_xfer_vi.h"
> -#include "smu7_smumgr.h"
> +
>
>  #define VOLTAGE_SCALE 4
>
> diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c 
> b/drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c
> index 3645127..68d943d 100644
> --- a/drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c
> +++ b/drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c
> @@ -28,7 +28,6 @@
>  #include 
>  #include 
>  #include "smumgr.h"
> -#include "cgs_common.h"
>
>  MODULE_FIRMWARE("amdgpu/topaz_smc.bin");
>  MODULE_FIRMWARE("amdgpu/topaz_k_smc.bin");
> diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/vega10_smumgr.c 
> b/drivers/gpu/drm/amd/powerplay/smumgr/vega10_smumgr.c
> index 15e1afa..ad4e8b4 100644
> --- a/drivers/gpu/drm/amd/powerplay/smumgr/vega10_smumgr.c
> +++ b/drivers/gpu/drm/amd/powerplay/smumgr/vega10_smumgr.c
> @@ -27,11 +27,9 @@
>  #include "vega10_smumgr.h"
>  #include "vega10_ppsmc.h"
>  #include "smu9_driver_if.h"
> -
>  #include "ppatomctrl.h"
>  #include "pp_debug.h"
> -#include "smu_ucode_xfer_vi.h"
> -#include "smu7_smumgr.h"
> +
>
>  #define AVFS_EN_MSB1568
>  #define AVFS_EN_LSB1568
> @@ -386,7 +384,7 @@ static int vega10_smu_init(struct pp_hwmgr *hwmgr)
> struct cgs_firmware_info info = {0};
>
> ret = cgs_get_firmware_info(hwmgr->device,
> -   smu7_convert_fw_type_to_cgs(UCODE_ID_SMU),
> +   CGS_UCODE_ID_SMU,
> );
> if (ret || !info.kptr)
> return -EINVAL;
> --
> 1.9.1
>
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 1/8] drm/amd/pp: Refine code for smu7 in powerplay

2018-03-07 Thread Alex Deucher
On Wed, Mar 7, 2018 at 5:46 AM, Rex Zhu  wrote:
> 1. Add avfs_supported in hwmgr to avoid visiting the smu backend struct in
>    hwmgr, so we do not need to include smu7_smumgr.h under the hwmgr folder.
> 2. Simplify the list of enum AVFS_BTC_STATUS.
>
> Change-Id: I04c769972deff797229339f3ccb1c442b35768a2
> Signed-off-by: Rex Zhu 
> ---
>  drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c| 14 ++
>  drivers/gpu/drm/amd/powerplay/inc/hwmgr.h   |  2 +-
>  drivers/gpu/drm/amd/powerplay/inc/smumgr.h  | 10 +-
>  drivers/gpu/drm/amd/powerplay/smumgr/fiji_smumgr.c  |  4 +++-
>  drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.c |  4 +++-
>  5 files changed, 10 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c 
> b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
> index d4d1d2e..da25e7f 100644
> --- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
> +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
> @@ -40,7 +40,6 @@
>
>  #include "hwmgr.h"
>  #include "smu7_hwmgr.h"
> -#include "smu7_smumgr.h"
>  #include "smu_ucode_xfer_vi.h"
>  #include "smu7_powertune.h"
>  #include "smu7_dyn_defaults.h"
> @@ -1353,12 +1352,7 @@ static int smu7_enable_dpm_tasks(struct pp_hwmgr 
> *hwmgr)
>
>  static int smu7_avfs_control(struct pp_hwmgr *hwmgr, bool enable)
>  {
> -   struct smu7_smumgr *smu_data = (struct smu7_smumgr 
> *)(hwmgr->smu_backend);
> -
> -   if (smu_data == NULL)
> -   return -EINVAL;
> -
> -   if (smu_data->avfs.avfs_btc_status == AVFS_BTC_NOTSUPPORTED)
> +   if (!hwmgr->avfs_supported)
> return 0;
>
> if (enable) {
> @@ -1382,13 +1376,9 @@ static int smu7_avfs_control(struct pp_hwmgr *hwmgr, 
> bool enable)
>
>  static int smu7_update_avfs(struct pp_hwmgr *hwmgr)
>  {
> -   struct smu7_smumgr *smu_data = (struct smu7_smumgr 
> *)(hwmgr->smu_backend);
> struct smu7_hwmgr *data = (struct smu7_hwmgr *)(hwmgr->backend);
>
> -   if (smu_data == NULL)
> -   return -EINVAL;
> -
> -   if (smu_data->avfs.avfs_btc_status == AVFS_BTC_NOTSUPPORTED)
> +   if (!hwmgr->avfs_supported)
> return 0;
>
> if (data->need_update_smu7_dpm_table & DPMTABLE_OD_UPDATE_VDDC) {
> diff --git a/drivers/gpu/drm/amd/powerplay/inc/hwmgr.h 
> b/drivers/gpu/drm/amd/powerplay/inc/hwmgr.h
> index b151ad85..312fbc3 100644
> --- a/drivers/gpu/drm/amd/powerplay/inc/hwmgr.h
> +++ b/drivers/gpu/drm/amd/powerplay/inc/hwmgr.h
> @@ -748,7 +748,7 @@ struct pp_hwmgr {
> struct pp_power_state*uvd_ps;
> struct amd_pp_display_configuration display_config;
> uint32_t feature_mask;
> -
> +   bool avfs_supported;
> /* UMD Pstate */
> bool en_umd_pstate;
> uint32_t power_profile_mode;
> diff --git a/drivers/gpu/drm/amd/powerplay/inc/smumgr.h 
> b/drivers/gpu/drm/amd/powerplay/inc/smumgr.h
> index 9bba0a0..f0ef02a 100644
> --- a/drivers/gpu/drm/amd/powerplay/inc/smumgr.h
> +++ b/drivers/gpu/drm/amd/powerplay/inc/smumgr.h
> @@ -28,23 +28,15 @@
>
>  enum AVFS_BTC_STATUS {
> AVFS_BTC_BOOT = 0,
> -   AVFS_BTC_BOOT_STARTEDSMU,
> AVFS_LOAD_VIRUS,
> AVFS_BTC_VIRUS_LOADED,
> AVFS_BTC_VIRUS_FAIL,
> AVFS_BTC_COMPLETED_PREVIOUSLY,
> AVFS_BTC_ENABLEAVFS,
> -   AVFS_BTC_STARTED,
> AVFS_BTC_FAILED,
> -   AVFS_BTC_RESTOREVFT_FAILED,
> -   AVFS_BTC_SAVEVFT_FAILED,
> AVFS_BTC_DPMTABLESETUP_FAILED,
> -   AVFS_BTC_COMPLETED_UNSAVED,
> -   AVFS_BTC_COMPLETED_SAVED,
> -   AVFS_BTC_COMPLETED_RESTORED,
> AVFS_BTC_DISABLED,
> -   AVFS_BTC_NOTSUPPORTED,
> -   AVFS_BTC_SMUMSG_ERROR
> +   AVFS_BTC_NOTSUPPORTED
>  };

This isn't used to interact with the hw anywhere, is it?  If so, this
will break that.  It should probably be split out as a separate patch
since it's not really directly related to the rest of this patch.
With those addressed, this patch is:
Reviewed-by: Alex Deucher 

>
>  enum SMU_TABLE {
> diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smumgr.c 
> b/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smumgr.c
> index 0b2b5d1..573c9be 100644
> --- a/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smumgr.c
> +++ b/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smumgr.c
> @@ -360,8 +360,10 @@ static bool fiji_is_hw_avfs_present(struct pp_hwmgr 
> *hwmgr)
>
> if (!atomctrl_read_efuse(hwmgr->device, AVFS_EN_LSB, AVFS_EN_MSB,
> mask, )) {
> -   if (efuse)
> +   if (efuse) {
> +   hwmgr->avfs_supported = true;
> return true;
> +   }
> }
> return false;
>  }
> diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.c 
> b/drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.c
> index 632d1ca..1075ec1 100644
> --- 

Re: Modesetting (amdgpudrmfb) artifacts on RAVEN APU

2018-03-07 Thread KARBOWSKI Piotr

On 2018-03-07 17:14, Harry Wentland wrote:

Pushed a drm-fixes-4.16 branch (based on Alex's) to my FDO repo 
at https://cgit.freedesktop.org/~hwentland/linux/  but I don't have a Raven 
system setup to test this at the moment.

Tom, or Piotr, would you be able to check that the fix works as intended on 
that branch?


Tested that branch and confirmed that the VGA patch is there, but I got the 
original result of broken video. Rebooted a few times. It seems that my 
issue was not the VGA takeover after all. I will re-test it with vanilla 
amd-staging-drm-next of git://people.freedesktop.org/~agd5f/linux 
without the VGA patch and update you.


-- Piotr.
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: New KFD ioctls: taking the skeletons out of the closet

2018-03-07 Thread Daniel Vetter
On Wed, Mar 07, 2018 at 09:38:03AM +0100, Christian König wrote:
> Am 07.03.2018 um 00:09 schrieb Dave Airlie:
> > On 7 March 2018 at 08:44, Felix Kuehling  wrote:
> > > Hi all,
> > > 
> > > Christian raised two potential issues in a recent KFD upstreaming code
> > > review that are related to the KFD ioctl APIs:
> > > 
> > >   1. behaviour of -ERESTARTSYS
> > >   2. transactional nature of KFD ioctl definitions, or lack thereof
> > > 
> > > I appreciate constructive feedback, but I also want to encourage an
> > > open-minded rather than a dogmatic approach to API definitions. So let
> > > me take all the skeletons out of my closet and get these APIs reviewed
> > > in the appropriate forum before we commit to them upstream. See the end
> > > of this email for reference.
> > > 
> > > The controversial part at this point is kfd_ioctl_map_memory_to_gpu. If
> > > any of the other APIs raise concerns or questions, please ask.
> > > 
> > > Because of the HSA programming model, KFD memory management APIs are
> > > synchronous. There is no pipelining. Command submission to GPUs through
> > > user mode queues does not involve KFD. This means KFD doesn't know what
> > > memory is used by the GPUs and when it's used. That means, when the
> > > map_memory_to_gpu ioctl returns to user mode, all memory mapping
> > > operations are complete and the memory can be used by the CPUs or GPUs
> > > immediately.
> > I've got a few opinions, but first up I still dislike user-mode queues
> > and everything
> > they entail. I still feel they are solving a Windows problem and not a
> > Linux problem,
> > and it would be nice if we had some Linux numbers on what they gain us over
> > a dispatch ioctl, because they sure bring a lot of memory management issues.
> 
> Well user-mode queues are a problem as long as you don't have recoverable
> page faults on the GPU.
> 
> As soon as you got recoverable page faults and push the memory management
> towards things like HMM I don't see an advantage of using a IOCTL based
> command submission any more.
> 
> So I would say that this is a problem which is slowly going away as the
> hardware improves.

Yeah, but up to the point where the hw actually works (instead of promises
that maybe it'll work next generation, trust us, for like a few
generations) it's much easier to hack up an ioctl with workarounds than
intercepting an mmap write fault all the time (those are slower than
ioctls).

I think userspace queues are fine once we have known-working hw. Before
that I'm kinda agreeing with Dave and not seeing the point. At least to my
knowledge we still haven't arrived in the wonderful promised land of hw
recoverable (well, restartable really) page faults on any vendors platform
...

> > That said amdkfd is here.
> > 
> > The first question you should ask (which you haven't asked here at all) is
> > what should userspace do with the ioctl result.
> > 
> > > HSA also uses a shared virtual memory model, so typically memory gets
> > > mapped on multiple GPUs and CPUs at the same virtual address.
> > > 
> > > The point of contention seems to be the ability to map memory to
> > > multiple GPUs in a single ioctl and the behaviour in failure cases. I'll
> > > discuss two main failure cases:
> > > 
> > > 1: Failure after all mappings have been dispatched via SDMA, but a
> > > signal interrupts the wait for completion and we return -ERESTARTSYS.
> > > Documentation/kernel-hacking/hacking.rst only says "[...] you should be
> > > prepared to process the restart, e.g. if you're in the middle of
> > > manipulating some data structure." I think we do that by ensuring that
> > > memory that's already mapped won't be mapped again. So the restart will
> > > become a no-op and just end up waiting for all the previous mappings to
> > > complete.
> > -ERESTARTSYS at that late stage points to a badly synchronous API,
> > I'd have said you should have two ioctls, one that returns a fence after
> > starting the processes, and one that waits for the fence separately.
> 
> That is exactly what I suggested as well, but also exactly what Felix tries
> to avoid :)
> 
> > The overhead of ioctls isn't your enemy until you've measured it and
> > proven it's a problem.
> > 
> > > Christian has a stricter requirement, and I'd like to know where that
> > > comes from: "An interrupted IOCTL should never have a visible effect."
> > Christian might be taking things a bit further but synchronous gpu access
> > APIs are bad, but I don't think undoing a bunch of work is a good plan 
> > either
> > just because you got ERESTARTSYS. If you get ERESTARTSYS can you
> > handle it, if I've fired off 5 SDMAs and wait for them will I fire off 5 
> > more?
> > will I wait for the original SDMAs if I reenter?
> 
> Well it's not only the waiting for the SDMAs. If I understood it correctly
> the IOCTL proposed by Felix allows adding multiple mappings of buffer
> objects on multiple devices with just one IOCTL.
> 
> Now the 

Re: Modesetting (amdgpudrmfb) artifacts on RAVEN APU

2018-03-07 Thread Tom St Denis

Cloning/building will take a bit.  I'll let you know later this afternoon.

Cheers,
Tom

On 03/07/2018 11:14 AM, Harry Wentland wrote:

On 2018-03-07 09:55 AM, Harry Wentland wrote:

I'm prepping some cherry-picks for drm-fixes-4.16, including this. Should've 
done that a while ago, but shouldn't be too late yet.



Pushed a drm-fixes-4.16 branch (based on Alex's) to my FDO repo at 
https://cgit.freedesktop.org/~hwentland/linux/ but I don't have a Raven system 
setup to test this at the moment.

Tom, or Piotr, would you be able to check that the fix works as intended on 
that branch?

Harry


Harry

On 2018-03-07 07:55 AM, Tom St Denis wrote:

It's in the next set of display patches for drm-next but I don't think it'll be 
in 4.16 at this point unless it gets cherry picked out.

Alex would know for sure since he co-ordinates that with Dave.

Tom

On 03/06/2018 12:38 PM, KARBOWSKI Piotr wrote:

On 2018-03-06 18:28, Tom St Denis wrote:

I routinely rebase this patch on top of our amd-staging-drm-next tree at fdo.

https://cgit.freedesktop.org/~agd5f/linux/log/?h=amd-staging-drm-next


Awesome.

So I used this tree, branch amd-staging-drm-next, and applied the patch - and 
my screen works now!

Is there a chance that those changes will still be included in 4.16, like, next 
rc?

-- Piotr.

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: Modesetting (amdgpudrmfb) artifacts on RAVEN APU

2018-03-07 Thread Harry Wentland
On 2018-03-07 09:55 AM, Harry Wentland wrote:
> I'm prepping some cherry-picks for drm-fixes-4.16, including this. Should've 
> done that a while ago, but shouldn't be too late yet.
> 

Pushed a drm-fixes-4.16 branch (based on Alex's) to my FDO repo at 
https://cgit.freedesktop.org/~hwentland/linux/ but I don't have a Raven system 
setup to test this at the moment.

Tom, or Piotr, would you be able to check that the fix works as intended on 
that branch?

Harry

> Harry
> 
> On 2018-03-07 07:55 AM, Tom St Denis wrote:
>> It's in the next set of display patches for drm-next but I don't think it'll 
>> be in 4.16 at this point unless it gets cherry picked out.
>>
>> Alex would know for sure since he co-ordinates that with Dave.
>>
>> Tom
>>
>> On 03/06/2018 12:38 PM, KARBOWSKI Piotr wrote:
>>> On 2018-03-06 18:28, Tom St Denis wrote:
 I routinely rebase this patch on top of our amd-staging-drm-next tree at 
 fdo.

 https://cgit.freedesktop.org/~agd5f/linux/log/?h=amd-staging-drm-next
>>>
>>> Awesome.
>>>
>>> So I used this tree, branch amd-staging-drm-next, and applied the patch - 
>>> and my screen works now!
>>>
>>> Is there a chance that those changes will still be included in 4.16, like, 
>>> next rc?
>>>
>>> -- Piotr.
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> 
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 6/6] drm/amdgpu: explicit give BO type to amdgpu_bo_create

2018-03-07 Thread Felix Kuehling
On 2018-03-07 04:16 AM, Christian König wrote:
> On 06.03.2018 at 22:29, Felix Kuehling wrote:
>> NAK.
>>
>> For KFD we need the ability to create a BO from an SG list that doesn't
>> come from another BO. We use this for mapping pages from the doorbell
>> aperture into GPUVM for GPU self-dispatch.
> You can still do this, see amdgpu_gem_prime_import_sg_table.
>
> Just set the BO type to SG and then setting the SG table after
> creating it:
>> ret = amdgpu_bo_create(adev, attach->dmabuf->size, PAGE_SIZE,
>>    AMDGPU_GEM_DOMAIN_CPU, 0, ttm_bo_type_sg,
>>    resv, &bo);
>>   if (ret)
>>   goto error;
>>   bo->tbo.sg = sg;
>> bo->tbo.ttm->sg = sg;
>
> Then validate the result into the GTT domain to actually use the SG
> table.

OK, I'll try implementing that before we merge your change.

>
> BTW: How do you create an SG table for the doorbell?

KFD manages the doorbell aperture and assigns one or two pages of
doorbells to each process. KFD knows the bus address of the doorbells
and passes it to amdgpu, which creates the SG like this:

static struct sg_table *create_doorbell_sg(uint64_t addr, uint32_t size)
{
struct sg_table *sg = kmalloc(sizeof(*sg), GFP_KERNEL);

if (!sg)
return NULL;
if (sg_alloc_table(sg, 1, GFP_KERNEL)) {
kfree(sg);
return NULL;
}
sg->sgl->dma_address = addr;
sg->sgl->length = size;
#ifdef CONFIG_NEED_SG_DMA_LENGTH
sg->sgl->dma_length = size;
#endif
return sg;
}
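
For context, a hedged sketch of how the resulting SG table would be attached to an SG-type BO, following the recipe quoted earlier in this thread (flow condensed, error handling trimmed):

/* Sketch only -- mirrors the amdgpu_bo_create + tbo.sg recipe above. */
struct sg_table *sg = create_doorbell_sg(doorbell_busaddr, PAGE_SIZE);
struct amdgpu_bo *bo;
int r;

r = amdgpu_bo_create(adev, PAGE_SIZE, PAGE_SIZE, AMDGPU_GEM_DOMAIN_CPU,
		     0, ttm_bo_type_sg, NULL, &bo);
if (!r) {
	bo->tbo.sg = sg;
	bo->tbo.ttm->sg = sg;
	/* then validate the BO into the GTT domain to use the SG table */
}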

Regards,
  Felix


>
> Regards,
> Christian.
>
>>
>> If you remove this now, I'll need to add it back in some form in a month
>> or two when I get to that part of upstreaming KFD.
>>
>> There may be other ways to implement this. Currently we need to create a
>> BO for anything we want to map into a GPUVM page table, because the
>> amdgpu_vm code is based around BOs. If there was a way to map physical
>> addresses into GPUVM without creating a buffer object, that would
>> work too.
>>
>> Regards,
>>    Felix
>>
>>
>> On 2018-03-06 09:43 AM, Christian König wrote:
>>> Drop the "kernel" and sg parameter and give the BO type to create
>>> explicit to amdgpu_bo_create instead of figuring it out from the
>>> parameters.
>>>
>>> Signed-off-by: Christian König 
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu.h   |  2 +-
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c    |  5 +--
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_benchmark.c |  8 ++---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c  |  7 ++--
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c   |  6 ++--
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c    | 46
>>> +++
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.h    | 11 +++
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c |  7 ++--
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_test.c  | 11 +++
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c   | 11 ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c    |  8 ++---
>>>   11 files changed, 58 insertions(+), 64 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> index 292c7e72820c..b1116b773516 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> @@ -441,7 +441,7 @@ struct amdgpu_sa_bo {
>>>   void amdgpu_gem_force_release(struct amdgpu_device *adev);
>>>   int amdgpu_gem_object_create(struct amdgpu_device *adev, unsigned
>>> long size,
>>>    int alignment, u32 initial_domain,
>>> - u64 flags, bool kernel,
>>> + u64 flags, enum ttm_bo_type type,
>>>    struct reservation_object *resv,
>>>    struct drm_gem_object **obj);
>>>   diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>> index 450426dbed92..7f096ed6e83d 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>> @@ -215,8 +215,9 @@ int alloc_gtt_mem(struct kgd_dev *kgd, size_t size,
>>>   if ((*mem) == NULL)
>>>   return -ENOMEM;
>>>   -    r = amdgpu_bo_create(adev, size, PAGE_SIZE, true,
>>> AMDGPU_GEM_DOMAIN_GTT,
>>> - AMDGPU_GEM_CREATE_CPU_GTT_USWC, NULL, NULL,
>>> &(*mem)->bo);
>>> +    r = amdgpu_bo_create(adev, size, PAGE_SIZE, AMDGPU_GEM_DOMAIN_GTT,
>>> + AMDGPU_GEM_CREATE_CPU_GTT_USWC, ttm_bo_type_kernel,
>>> + NULL, &(*mem)->bo);
>>>   if (r) {
>>>   dev_err(adev->dev,
>>>   "failed to allocate BO for amdkfd (%d)\n", r);
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_benchmark.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_benchmark.c
>>> index 2fb299afc12b..02b849be083b 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_benchmark.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_benchmark.c
>>> 

Re: deprecated register issues

2018-03-07 Thread Deucher, Alexander
Right.  We ran into issues with reading back that register at runtime when UMDs 
queried it when other stuff was in flight, so we just read it once at startup 
and cache the results. Now when UMDs request it, we return the cached value.
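
(A minimal sketch of that read-once-and-cache pattern -- the cached field and query path are assumptions for illustration:)

/* During hw init, while nothing else is in flight: */
static void gfx_v9_0_setup_rb(struct amdgpu_device *adev)
{
	/* ... */
	adev->gfx.config.backend_enable_mask =
		gfx_v9_0_get_rb_active_bitmap(adev);	/* one-time read */
}

/* UMD query path: return the cached value, never touch the register. */
static uint32_t amdgpu_query_rb_mask(struct amdgpu_device *adev)
{
	return adev->gfx.config.backend_enable_mask;
}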


Alex


From: Koenig, Christian
Sent: Wednesday, March 7, 2018 9:31:13 AM
To: Mao, David; Liu, Monk
Cc: Deucher, Alexander; amd-gfx@lists.freedesktop.org; Jin, Jian-Rong
Subject: Re: deprecated register issues

Hi David,

well I just figured that this is a misunderstanding.

Accessing this register and some other deprecated registers can cause problems
when invalidating VMHUBs.

This register itself isn't deprecated; the wording in a patch fixing things is
just a bit unclear.

The question is: is that register still accessed regularly, or is its value
cached after startup?

Regards,
Christian.

On 07.03.2018 at 15:25, Mao, David wrote:
We require the base driver to provide the mask of disabled RBs.
This is why the kernel reads CC_RB_BACKEND_DISABLE to collect the harvest
configuration.
Where did you learn that the register is deprecated?
I think it should still be there.

Best Regards,
David

On Mar 7, 2018, at 9:49 PM, Liu, Monk wrote:

+ UMD guys

Hi David

Do you know if GC_USER_RB_BACKEND_DISABLE still exists for gfx9/vega10?

We found CC_RB_BACKEND_DISABLE was deprecated, but it looks like it is still
in use in the KMD, so I want to check with you on both of the above registers.

Thanks
/Monk

From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of Christian König
Sent: March 7, 2018 20:26
To: Liu, Monk; Deucher, Alexander
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: deprecated register issues

Hi Monk,

I honestly don't have the slightest idea why we are still accessing 
CC_RB_BACKEND_DISABLE. Maybe it still contains some useful values?

Key point was that we needed to stop accessing it all the time to avoid 
triggering problems.

Regards,
Christian.

On 07.03.2018 at 13:11, Liu, Monk wrote:

Hi Christian



I remember you and AlexD mentioned that a handful of registers are deprecated
for greenland (gfx9)

e.g. CC_RB_BACKEND_DISABLE



do you know why we still have this routine ?


static u32 gfx_v9_0_get_rb_active_bitmap(struct amdgpu_device *adev)
{
	u32 data, mask;

	data = RREG32_SOC15(GC, 0, mmCC_RB_BACKEND_DISABLE);
	data |= RREG32_SOC15(GC, 0, mmGC_USER_RB_BACKEND_DISABLE);

	data &= CC_RB_BACKEND_DISABLE__BACKEND_DISABLE_MASK;
	data >>= GC_USER_RB_BACKEND_DISABLE__BACKEND_DISABLE__SHIFT;

	mask = amdgpu_gfx_create_bitmask(adev->gfx.config.max_backends_per_se /
					 adev->gfx.config.max_sh_per_se);

	return (~data) & mask;
}

See that it still reads CC_RB_BACKEND_DISABLE.



thanks



/Monk






___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: Modesetting (amdgpudrmfb) artifacts on RAVEN APU

2018-03-07 Thread Harry Wentland
I'm prepping some cherry-picks for drm-fixes-4.16, including this. Should've 
done that a while ago, but shouldn't be too late yet.

Harry

On 2018-03-07 07:55 AM, Tom St Denis wrote:
> It's in the next set of display patches for drm-next but I don't think it'll 
> be in 4.16 at this point unless it gets cherry picked out.
> 
> Alex would know for sure since he co-ordinates that with Dave.
> 
> Tom
> 
> On 03/06/2018 12:38 PM, KARBOWSKI Piotr wrote:
>> On 2018-03-06 18:28, Tom St Denis wrote:
>>> I routinely rebase this patch on top of our amd-staging-drm-next tree at 
>>> fdo.
>>>
>>> https://cgit.freedesktop.org/~agd5f/linux/log/?h=amd-staging-drm-next
>>
>> Awesome.
>>
>> So I used this tree, branch amd-staging-drm-next, and applied the patch - 
>> and my screen works now!
>>
>> Is there a chance that those changes will still be included in 4.16, like, 
>> next rc?
>>
>> -- Piotr.
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: Correct the place of amdgpu_pm_sysfs_fini

2018-03-07 Thread Alex Deucher
On Wed, Mar 7, 2018 at 5:17 AM, Emily Deng  wrote:
> amdgpu_pm_sysfs_fini should be called before amdgpu_device_ip_fini;
> otherwise adev->pm.dpm_enabled would be set to 0 and the device files
> related to pp wouldn't be removed by amdgpu_pm_sysfs_fini when
> unloading the driver.
>
> Signed-off-by: Emily Deng 

Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index f67f5bf..8a8f8f6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2096,6 +2096,7 @@ void amdgpu_device_fini(struct amdgpu_device *adev)
>
> amdgpu_ib_pool_fini(adev);
> amdgpu_fence_driver_fini(adev);
> +   amdgpu_pm_sysfs_fini(adev);
> amdgpu_fbdev_fini(adev);
> r = amdgpu_device_ip_fini(adev);
> if (adev->firmware.gpu_info_fw) {
> @@ -2124,7 +2125,6 @@ void amdgpu_device_fini(struct amdgpu_device *adev)
> iounmap(adev->rmmio);
> adev->rmmio = NULL;
> amdgpu_device_doorbell_fini(adev);
> -   amdgpu_pm_sysfs_fini(adev);
> amdgpu_debugfs_regs_cleanup(adev);
>  }
>
> --
> 2.7.4
>
> ___
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: deprecated register issues

2018-03-07 Thread Christian König

Hi David,

well I just figured that this is a misunderstanding.

Accessing this register and some other deprecated registers can cause
problems when invalidating VMHUBs.


This register itself isn't deprecated; the wording in a patch fixing
things is just a bit unclear.


The question is: is that register still accessed regularly, or is its
value cached after startup?


Regards,
Christian.

On 07.03.2018 at 15:25, Mao, David wrote:

We require the base driver to provide the mask of disabled RBs.
This is why the kernel reads CC_RB_BACKEND_DISABLE to collect the
harvest configuration.

Where did you learn that the register is deprecated?
I think it should still be there.

Best Regards,
David

On Mar 7, 2018, at 9:49 PM, Liu, Monk wrote:

+ UMD guys

Hi David

Do you know if GC_USER_RB_BACKEND_DISABLE still exists for gfx9/vega10?

We found CC_RB_BACKEND_DISABLE was deprecated, but it looks like it is
still in use in the KMD, so I want to check with you on both of the
above registers.

Thanks
/Monk

From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of Christian König
Sent: March 7, 2018 20:26
To: Liu, Monk; Deucher, Alexander
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: deprecated register issues
Hi Monk,

I honestly don't have the slightest idea why we are still accessing 
CC_RB_BACKEND_DISABLE. Maybe it still contains some useful values?


Key point was that we needed to stop accessing it all the time to 
avoid triggering problems.


Regards,
Christian.

On 07.03.2018 at 13:11, Liu, Monk wrote:

Hi Christian

I remember you and AlexD mentioned that a handful of registers are
deprecated for greenland (gfx9)

e.g. CC_RB_BACKEND_DISABLE

do you know why we still have this routine ?

static u32 gfx_v9_0_get_rb_active_bitmap(struct amdgpu_device *adev)
{
	u32 data, mask;

	data = RREG32_SOC15(GC, 0, mmCC_RB_BACKEND_DISABLE);
	data |= RREG32_SOC15(GC, 0, mmGC_USER_RB_BACKEND_DISABLE);

	data &= CC_RB_BACKEND_DISABLE__BACKEND_DISABLE_MASK;
	data >>= GC_USER_RB_BACKEND_DISABLE__BACKEND_DISABLE__SHIFT;

	mask = amdgpu_gfx_create_bitmask(adev->gfx.config.max_backends_per_se /
					 adev->gfx.config.max_sh_per_se);

	return (~data) & mask;
}

See that it still reads CC_RB_BACKEND_DISABLE.

thanks

/Monk





___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: deprecated register issues

2018-03-07 Thread Mao, David
We require the base driver to provide the mask of disabled RBs.
This is why the kernel reads CC_RB_BACKEND_DISABLE to collect the harvest
configuration.
Where did you learn that the register is deprecated?
I think it should still be there.
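
(For illustration, a worked example of that computation with made-up harvest values:)

/* Assume 4 RBs per SE and 1 SH per SE:
 *   mask = amdgpu_gfx_create_bitmask(4 / 1) = 0b1111
 * If the fuses report RB1 harvested, data = 0b0010, so
 *   active = (~0b0010) & 0b1111 = 0b1101
 * i.e. RB0, RB2 and RB3 remain usable.
 */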

Best Regards,
David

On Mar 7, 2018, at 9:49 PM, Liu, Monk wrote:

+ UMD guys

Hi David

Do you know if GC_USER_RB_BACKEND_DISABLE still exists for gfx9/vega10?

We found CC_RB_BACKEND_DISABLE was deprecated, but it looks like it is still
in use in the KMD, so I want to check with you on both of the above registers.

Thanks
/Monk

From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of Christian König
Sent: March 7, 2018 20:26
To: Liu, Monk; Deucher, Alexander
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: deprecated register issues

Hi Monk,

I honestly don't have the slightest idea why we are still accessing 
CC_RB_BACKEND_DISABLE. Maybe it still contains some useful values?

Key point was that we needed to stop accessing it all the time to avoid 
triggering problems.

Regards,
Christian.

On 07.03.2018 at 13:11, Liu, Monk wrote:

Hi Christian



I remember you and AlexD mentioned that a handful of registers are deprecated
for greenland (gfx9)

e.g. CC_RB_BACKEND_DISABLE



do you know why we still have this routine ?


static u32 gfx_v9_0_get_rb_active_bitmap(struct amdgpu_device *adev)
{
	u32 data, mask;

	data = RREG32_SOC15(GC, 0, mmCC_RB_BACKEND_DISABLE);
	data |= RREG32_SOC15(GC, 0, mmGC_USER_RB_BACKEND_DISABLE);

	data &= CC_RB_BACKEND_DISABLE__BACKEND_DISABLE_MASK;
	data >>= GC_USER_RB_BACKEND_DISABLE__BACKEND_DISABLE__SHIFT;

	mask = amdgpu_gfx_create_bitmask(adev->gfx.config.max_backends_per_se /
					 adev->gfx.config.max_sh_per_se);

	return (~data) & mask;
}

See that it still reads CC_RB_BACKEND_DISABLE.



thanks



/Monk





___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: deprecated register issues

2018-03-07 Thread Liu, Monk
+ UMD guys

Hi David

Do you know if GC_USER_RB_BACKEND_DISABLE still exists for gfx9/vega10?

We found CC_RB_BACKEND_DISABLE was deprecated, but it looks like it is still
in use in the KMD, so I want to check with you on both of the above registers.

Thanks
/Monk

From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of Christian König
Sent: March 7, 2018 20:26
To: Liu, Monk ; Deucher, Alexander 
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: deprecated register issues

Hi Monk,

I honestly don't have the slightest idea why we are still accessing 
CC_RB_BACKEND_DISABLE. Maybe it still contains some useful values?

Key point was that we needed to stop accessing it all the time to avoid 
triggering problems.

Regards,
Christian.

On 07.03.2018 at 13:11, Liu, Monk wrote:

Hi Christian



I remember you and AlexD mentioned that a handful of registers are deprecated
for greenland (gfx9)

e.g. CC_RB_BACKEND_DISABLE



do you know why we still have this routine ?

static u32 gfx_v9_0_get_rb_active_bitmap(struct amdgpu_device *adev)
{
	u32 data, mask;

	data = RREG32_SOC15(GC, 0, mmCC_RB_BACKEND_DISABLE);
	data |= RREG32_SOC15(GC, 0, mmGC_USER_RB_BACKEND_DISABLE);

	data &= CC_RB_BACKEND_DISABLE__BACKEND_DISABLE_MASK;
	data >>= GC_USER_RB_BACKEND_DISABLE__BACKEND_DISABLE__SHIFT;

	mask = amdgpu_gfx_create_bitmask(adev->gfx.config.max_backends_per_se /
					 adev->gfx.config.max_sh_per_se);

	return (~data) & mask;
}

See that it still reads CC_RB_BACKEND_DISABLE.



thanks



/Monk



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH v2 2/2] drm/amdgpu:Always save uvd vcpu_bo in VM Mode

2018-03-07 Thread James Zhu
When UVD is in VM mode, no UVD handles are exchanged and
uvd.handles are always 0, so the vcpu_bo always needs to be saved.
Otherwise the amdgpu driver will fail during suspend/resume.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105021
Signed-off-by: James Zhu 
Reviewed-by: Leo Liu 
Reviewed-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
index 9d037cb..f3c459b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
@@ -299,12 +299,15 @@ int amdgpu_uvd_suspend(struct amdgpu_device *adev)
 
cancel_delayed_work_sync(>uvd.idle_work);
 
-   for (i = 0; i < adev->uvd.max_handles; ++i)
-   if (atomic_read(&adev->uvd.handles[i]))
-   break;
+   /* only valid for physical mode */
+   if (adev->asic_type < CHIP_POLARIS10) {
+   for (i = 0; i < adev->uvd.max_handles; ++i)
+   if (atomic_read(&adev->uvd.handles[i]))
+   break;
 
-   if (i == adev->uvd.max_handles)
-   return 0;
+   if (i == adev->uvd.max_handles)
+   return 0;
+   }
 
size = amdgpu_bo_size(adev->uvd.vcpu_bo);
ptr = adev->uvd.cpu_addr;
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: Modesetting (amdgpudrmfb) artifacts on RAVEN APU

2018-03-07 Thread Tom St Denis
It's in the next set of display patches for drm-next but I don't think 
it'll be in 4.16 at this point unless it gets cherry picked out.


Alex would know for sure since he co-ordinates that with Dave.

Tom

On 03/06/2018 12:38 PM, KARBOWSKI Piotr wrote:

On 2018-03-06 18:28, Tom St Denis wrote:
I routinely rebase this patch on top of our amd-staging-drm-next tree 
at fdo.


https://cgit.freedesktop.org/~agd5f/linux/log/?h=amd-staging-drm-next


Awesome.

So I used this tree, branch amd-staging-drm-next, and applied the patch 
- and my screen works now!


Is there a chance that those changes will still be included in 4.16, 
like, next rc?


-- Piotr.

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: deprecated register issues

2018-03-07 Thread Christian König

Hi Monk,

I honestly don't have the slightest idea why we are still accessing 
CC_RB_BACKEND_DISABLE. Maybe it still contains some useful values?


Key point was that we needed to stop accessing it all the time to avoid 
triggering problems.


Regards,
Christian.

On 07.03.2018 at 13:11, Liu, Monk wrote:


Hi Christian


I remember you and AlexD mentioned that a handful of registers are
deprecated for greenland (gfx9)


e.g. CC_RB_BACKEND_DISABLE


do you know why we still have this routine ?

static u32 gfx_v9_0_get_rb_active_bitmap(struct amdgpu_device *adev)
{
u32 data, mask;
data = RREG32_SOC15(GC, 0, mmCC_RB_BACKEND_DISABLE);
data |= RREG32_SOC15(GC, 0, mmGC_USER_RB_BACKEND_DISABLE);
data &= CC_RB_BACKEND_DISABLE__BACKEND_DISABLE_MASK;
data >>= GC_USER_RB_BACKEND_DISABLE__BACKEND_DISABLE__SHIFT;
	mask = amdgpu_gfx_create_bitmask(adev->gfx.config.max_backends_per_se /
					 adev->gfx.config.max_sh_per_se);
return (~data) & mask;
}

See that it still reads CC_RB_BACKEND_DISABLE.


thanks


/Monk




___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH] drm/amdgpu: refactoring mailbox to fix TDR handshake bugs(v2)

2018-03-07 Thread Monk Liu
This patch refactors the mailbox implementation; all the changes
below are needed together to fix the mailbox handshake issues
exposed by heavy TDR testing.

1) Refactor all mailbox functions to use byte access for mb_control.
The reason is to avoid touching unrelated bits when writing the
trn/rcv part of mailbox_control; this way some incorrect INTRs sent
to the hypervisor side can be avoided, and it fixes a couple of
handshake bugs.

2) The trans_msg function is re-implemented: put an invalidate step
before transmitting the message to make sure the ACK bit is in a
cleared state, otherwise there is a chance that ACK is already
asserted before the message is transmitted, which leads to fake ACK
polling. (The hypervisor side has some tricks to work around the ACK
bit being corrupted by VF FLR, which has the side effect that the
guest-side ACK bit may be asserted wrongly.) Also clear the TRANS_MSG
words after the message is transferred.

3) mailbox_flr_work is also reworked: when invoked, it takes the
mutex lock first, to keep GPU recovery from participating too early
while the hypervisor side is doing the VF FLR. (The hypervisor sends
FLR_NOTIFY to the guest before doing the VF FLR and sends
FLR_COMPLETE after the VF FLR is done; the FLR_NOTIFY triggers an
interrupt in the guest, which leads to mailbox_flr_work being
invoked.)

This avoids the issue of the mailbox trans msg being cleared by its
VF FLR.

4) The mailbox_rcv_irq IRQ routine should only peek at the msg and
schedule mailbox_flr_work, instead of ACKing the hypervisor itself,
because the FLR_NOTIFY msg sent from the hypervisor side doesn't need
the VF's ACK (the VF's ACK would lead the hypervisor to clear its
trans_valid/msg, and this would cause a handshake bug if
trans_valid/msg were cleared not by a correct VF ACK but by a wrong
one like this "FLR_NOTIFY" case).

This fixes a handshake bug where the guest sometimes could never
receive the "READY_TO_ACCESS_GPU" msg from the hypervisor.

5) Separate the polling time limits accordingly:
POLL ACK costs no more than 500ms
POLL MSG costs no more than 12000ms
POLL FLR finish costs no more than 500ms

6) We still need to put adev into in_gpu_reset mode after we receive
FLR_NOTIFY from the host side; this prevents an innocent app from
wrongly succeeding in opening the amdgpu DRI device.

FLR_NOTIFY is received when an IDLE hang is detected on the
hypervisor side, indicating the GPU is already dead in this VF.

v2:
use a macro for the offset of the mailbox_control register
don't test for the NOTIFY_CMPL event in rcv_msg since it won't
receive that message anymore
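
For reference, a condensed sketch of the transmit ordering described in
points 1) and 2) -- the real byte-access helpers are in the diff below;
this is a simplification, not the exact code:

/*
 * Sketch of the trans_msg flow after this patch (simplified):
 *
 *   WREG8(AI_MAIBOX_CONTROL_TRN_OFFSET_BYTE, 0);  // drop TRN_MSG_VALID
 *   ... poll until the host deasserts TRN_MSG_ACK, so a stale ACK
 *       (e.g. left over from VF FLR side effects) cannot satisfy
 *       the next poll ...
 *   ... write the MSGBUF_TRN data words ...
 *   WREG8(AI_MAIBOX_CONTROL_TRN_OFFSET_BYTE, 1);  // raise TRN_MSG_VALID
 *   ... poll TRN_MSG_ACK, with the 500ms budget from point 5) ...
 *   ... clear the TRANS_MSG words once the ACK is seen ...
 */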

Change-Id: I17df8b4490a5b53a1cc2bd6c8f9bc3ee14c23f1a
Signed-off-by: Monk Liu 
---
 drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c | 196 ++
 drivers/gpu/drm/amd/amdgpu/mxgpu_ai.h |   7 +-
 2 files changed, 109 insertions(+), 94 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c 
b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
index 271452d..8b47484 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
@@ -33,56 +33,34 @@
 
 static void xgpu_ai_mailbox_send_ack(struct amdgpu_device *adev)
 {
-   u32 reg;
-   int timeout = AI_MAILBOX_TIMEDOUT;
-   u32 mask = REG_FIELD_MASK(BIF_BX_PF0_MAILBOX_CONTROL, RCV_MSG_VALID);
-
-   reg = RREG32_NO_KIQ(SOC15_REG_OFFSET(NBIO, 0,
-mmBIF_BX_PF0_MAILBOX_CONTROL));
-   reg = REG_SET_FIELD(reg, BIF_BX_PF0_MAILBOX_CONTROL, RCV_MSG_ACK, 1);
-   WREG32_NO_KIQ(SOC15_REG_OFFSET(NBIO, 0,
-  mmBIF_BX_PF0_MAILBOX_CONTROL), reg);
-
-   /*Wait for RCV_MSG_VALID to be 0*/
-   reg = RREG32_NO_KIQ(SOC15_REG_OFFSET(NBIO, 0,
-mmBIF_BX_PF0_MAILBOX_CONTROL));
-   while (reg & mask) {
-   if (timeout <= 0) {
-   pr_err("RCV_MSG_VALID is not cleared\n");
-   break;
-   }
-   mdelay(1);
-   timeout -=1;
-
-   reg = RREG32_NO_KIQ(SOC15_REG_OFFSET(NBIO, 0,
-
mmBIF_BX_PF0_MAILBOX_CONTROL));
-   }
+   WREG8(AI_MAIBOX_CONTROL_RCV_OFFSET_BYTE, 2);
 }
 
 static void xgpu_ai_mailbox_set_valid(struct amdgpu_device *adev, bool val)
 {
-   u32 reg;
+   WREG8(AI_MAIBOX_CONTROL_TRN_OFFSET_BYTE, val ? 1 : 0);
+}
 
-   reg = RREG32_NO_KIQ(SOC15_REG_OFFSET(NBIO, 0,
-mmBIF_BX_PF0_MAILBOX_CONTROL));
-   reg = REG_SET_FIELD(reg, BIF_BX_PF0_MAILBOX_CONTROL,
-   TRN_MSG_VALID, val ? 1 : 0);
-   WREG32_NO_KIQ(SOC15_REG_OFFSET(NBIO, 0, mmBIF_BX_PF0_MAILBOX_CONTROL),
- reg);
+/*
+ * this peek_msg can *only* be called in the IRQ routine, because in the IRQ
+ * routine the RCV_MSG_VALID field of BIF_BX_PF0_MAILBOX_CONTROL must already
+ * be set to 1 by the host.
+ *
+ * if not called in the IRQ routine, this peek_msg cannot be guaranteed to
+ * return the correct value, since it doesn't return the RCV_DW0 under the case 

deprecated register issues

2018-03-07 Thread Liu, Monk
Hi Christian


I remember you and AlexD mentioned that a handful of registers are deprecated
for greenland (gfx9)

e.g. CC_RB_BACKEND_DISABLE


do you know why we still have this routine ?

static u32 gfx_v9_0_get_rb_active_bitmap(struct amdgpu_device *adev)
{
u32 data, mask;

data = RREG32_SOC15(GC, 0, mmCC_RB_BACKEND_DISABLE);
data |= RREG32_SOC15(GC, 0, mmGC_USER_RB_BACKEND_DISABLE);

data &= CC_RB_BACKEND_DISABLE__BACKEND_DISABLE_MASK;
data >>= GC_USER_RB_BACKEND_DISABLE__BACKEND_DISABLE__SHIFT;

mask = amdgpu_gfx_create_bitmask(adev->gfx.config.max_backends_per_se /
 adev->gfx.config.max_sh_per_se);

return (~data) & mask;
}


See that it still reads CC_RB_BACKEND_DISABLE.


thanks


/Monk

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH 1/4] drm/amd/pp: Replace files name rv_* with smu10_*

2018-03-07 Thread Zhu, Rex
>For patch 4, I think we should stick with the asic names.  The smu7 variants 
>are too confusing.  I can't remember if CI was 7.0.1 or 7.0.2, etc.  I'd 
>prefer to align the file names to the 
>smu interface headers (e.g., smu7, smu71, smu72, smu73, smu74).  

Rex: For patch4, I am ok with the asic names for smu7. So I will drop this 
patch.

>BTW, did you use git mv to generate these patches?

Rex: No, I just renamed the files and ran git add -A; then git status showed
the renames.

Best Regards
Rex 


-Original Message-
From: Alex Deucher [mailto:alexdeuc...@gmail.com] 
Sent: Wednesday, March 07, 2018 12:42 AM
To: Zhu, Rex
Cc: amd-gfx list
Subject: Re: [PATCH 1/4] drm/amd/pp: Replace files name rv_* with smu10_*

On Tue, Mar 6, 2018 at 5:10 AM, Rex Zhu  wrote:
> Powerplay is for the hw ip smu, for RV, smu10 is used, so use smu10 as 
> the prefix of the files name.
>
> Change-Id: Icf9c8a9d4b5deccd4fbfb9ecfab443db79934c77
> Signed-off-by: Rex Zhu 
> ---
>  drivers/gpu/drm/amd/powerplay/hwmgr/Makefile   |2 +-
>  drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.c | 1075 ---
>  drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.h |  322 --
>  drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c  | 1076 
>   drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.h  |  
> 322 ++
>  drivers/gpu/drm/amd/powerplay/smumgr/Makefile  |2 +-
>  drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.c   |  399 
>  drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.h   |   61 --
>  .../gpu/drm/amd/powerplay/smumgr/smu10_smumgr.c|  399 
>  .../gpu/drm/amd/powerplay/smumgr/smu10_smumgr.h|   61 ++
>  10 files changed, 1860 insertions(+), 1859 deletions(-)  delete mode 
> 100644 drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.c
>  delete mode 100644 drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.h
>  create mode 100644 drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c
>  create mode 100644 drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.h
>  delete mode 100644 drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.c
>  delete mode 100644 drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.h
>  create mode 100644 
> drivers/gpu/drm/amd/powerplay/smumgr/smu10_smumgr.c
>  create mode 100644 
> drivers/gpu/drm/amd/powerplay/smumgr/smu10_smumgr.h


I'm ok with patches 1-3, although I think we should work on renaming the 
functions eventually too to match the file names.  For patch 4, I think we 
should stick with the asic names.  The smu7 variants are too confusing.  I 
can't remember if CI was 7.0.1 or 7.0.2, etc.  I'd prefer to align the file 
names to the smu interface headers (e.g., smu7, smu71, smu72, smu73, smu74).  
BTW, did you use git mv to generate these patches?

Alex
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH 7/8] drm/amd/pp: Add rv_copy_table_from/to_smc to smu backend function table

2018-03-07 Thread Rex Zhu
Change-Id: I67597a70008a93381e9bbe2439851e50f012d5ff
Signed-off-by: Rex Zhu 
---
 drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.c   |  3 +-
 drivers/gpu/drm/amd/powerplay/inc/smumgr.h   |  5 ++
 drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.c | 60 +++-
 drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.h | 11 +
 4 files changed, 44 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.c
index 474612f..4bdb28f 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.c
@@ -34,7 +34,6 @@
 #include "rv_ppsmc.h"
 #include "rv_hwmgr.h"
 #include "power_state.h"
-#include "rv_smumgr.h"
 #include "pp_soc15.h"
 
 #define RAVEN_MAX_DEEPSLEEP_DIVIDER_ID 5
@@ -347,7 +346,7 @@ static int rv_populate_clock_table(struct pp_hwmgr *hwmgr)
DpmClocks_t  *table = &(rv_data->clock_table);
struct rv_clock_voltage_information *pinfo = &(rv_data->clock_vol_info);
 
-   result = rv_copy_table_from_smc(hwmgr, (uint8_t *)table, CLOCKTABLE);
+   result = smum_smc_table_manager(hwmgr, (uint8_t *)table, 
SMU10_CLOCKTABLE, true);
 
PP_ASSERT_WITH_CODE((0 == result),
"Attempt to copy clock table from smc failed",
diff --git a/drivers/gpu/drm/amd/powerplay/inc/smumgr.h 
b/drivers/gpu/drm/amd/powerplay/inc/smumgr.h
index cbb0166..4764132 100644
--- a/drivers/gpu/drm/amd/powerplay/inc/smumgr.h
+++ b/drivers/gpu/drm/amd/powerplay/inc/smumgr.h
@@ -82,6 +82,11 @@ enum SMU_MAC_DEFINITION {
SMU_UVD_MCLK_HANDSHAKE_DISABLE,
 };
 
+enum SMU10_TABLE_ID {
+   SMU10_WMTABLE = 0,
+   SMU10_CLOCKTABLE,
+};
+
 extern int smum_get_argument(struct pp_hwmgr *hwmgr);
 
 extern int smum_download_powerplay_table(struct pp_hwmgr *hwmgr, void **table);
diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.c 
b/drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.c
index 7f2866c..d59ad5e 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.c
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.c
@@ -125,7 +125,7 @@ static int rv_send_msg_to_smc_with_parameter(struct 
pp_hwmgr *hwmgr,
return 0;
 }
 
-int rv_copy_table_from_smc(struct pp_hwmgr *hwmgr,
+static int rv_copy_table_from_smc(struct pp_hwmgr *hwmgr,
uint8_t *table, int16_t table_id)
 {
struct rv_smumgr *priv =
@@ -153,7 +153,7 @@ int rv_copy_table_from_smc(struct pp_hwmgr *hwmgr,
return 0;
 }
 
-int rv_copy_table_to_smc(struct pp_hwmgr *hwmgr,
+static int rv_copy_table_to_smc(struct pp_hwmgr *hwmgr,
uint8_t *table, int16_t table_id)
 {
struct rv_smumgr *priv =
@@ -232,12 +232,12 @@ static int rv_smu_fini(struct pp_hwmgr *hwmgr)
if (priv) {
rv_smc_disable_sdma(hwmgr);
rv_smc_disable_vcn(hwmgr);
-	amdgpu_bo_free_kernel(&priv->smu_tables.entry[WMTABLE].handle,
-			&priv->smu_tables.entry[WMTABLE].mc_addr,
-			priv->smu_tables.entry[WMTABLE].table);
-	amdgpu_bo_free_kernel(&priv->smu_tables.entry[CLOCKTABLE].handle,
-			&priv->smu_tables.entry[CLOCKTABLE].mc_addr,
-			priv->smu_tables.entry[CLOCKTABLE].table);
+	amdgpu_bo_free_kernel(&priv->smu_tables.entry[SMU10_WMTABLE].handle,
+			&priv->smu_tables.entry[SMU10_WMTABLE].mc_addr,
+			priv->smu_tables.entry[SMU10_WMTABLE].table);
+	amdgpu_bo_free_kernel(&priv->smu_tables.entry[SMU10_CLOCKTABLE].handle,
+			&priv->smu_tables.entry[SMU10_CLOCKTABLE].mc_addr,
+			priv->smu_tables.entry[SMU10_CLOCKTABLE].table);
kfree(hwmgr->smu_backend);
hwmgr->smu_backend = NULL;
}
@@ -289,12 +289,12 @@ static int rv_smu_init(struct pp_hwmgr *hwmgr)
if (r)
return -EINVAL;
 
-   priv->smu_tables.entry[WMTABLE].version = 0x01;
-   priv->smu_tables.entry[WMTABLE].size = sizeof(Watermarks_t);
-   priv->smu_tables.entry[WMTABLE].table_id = TABLE_WATERMARKS;
-   priv->smu_tables.entry[WMTABLE].mc_addr = mc_addr;
-   priv->smu_tables.entry[WMTABLE].table = kaddr;
-   priv->smu_tables.entry[WMTABLE].handle = handle;
+   priv->smu_tables.entry[SMU10_WMTABLE].version = 0x01;
+   priv->smu_tables.entry[SMU10_WMTABLE].size = sizeof(Watermarks_t);
+   priv->smu_tables.entry[SMU10_WMTABLE].table_id = TABLE_WATERMARKS;
+   priv->smu_tables.entry[SMU10_WMTABLE].mc_addr = mc_addr;
+   priv->smu_tables.entry[SMU10_WMTABLE].table = kaddr;
+   priv->smu_tables.entry[SMU10_WMTABLE].handle = handle;
 
/* allocate space for watermarks table */
r = amdgpu_bo_create_kernel((struct amdgpu_device *)hwmgr->adev,
@@ 

[PATCH 6/8] drm/amd/pp: Add new smu backend function smc_table_manager

2018-03-07 Thread Rex Zhu
Change-Id: I4c68f7627387c4ae67612e09651318f5ae90162a
Signed-off-by: Rex Zhu 
---
 drivers/gpu/drm/amd/powerplay/inc/hwmgr.h | 1 +
 drivers/gpu/drm/amd/powerplay/inc/smumgr.h| 2 ++
 drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c | 8 
 3 files changed, 11 insertions(+)

diff --git a/drivers/gpu/drm/amd/powerplay/inc/hwmgr.h 
b/drivers/gpu/drm/amd/powerplay/inc/hwmgr.h
index 312fbc3..494f891 100644
--- a/drivers/gpu/drm/amd/powerplay/inc/hwmgr.h
+++ b/drivers/gpu/drm/amd/powerplay/inc/hwmgr.h
@@ -237,6 +237,7 @@ struct pp_smumgr_func {
bool (*is_dpm_running)(struct pp_hwmgr *hwmgr);
bool (*is_hw_avfs_present)(struct pp_hwmgr  *hwmgr);
int (*update_dpm_settings)(struct pp_hwmgr *hwmgr, void 
*profile_setting);
+   int (*smc_table_manager)(struct pp_hwmgr *hwmgr, uint8_t *table, 
uint16_t table_id, bool rw); /*rw: true for read, false for write */
 };
 
 struct pp_hwmgr_func {
diff --git a/drivers/gpu/drm/amd/powerplay/inc/smumgr.h 
b/drivers/gpu/drm/amd/powerplay/inc/smumgr.h
index f0ef02a..cbb0166 100644
--- a/drivers/gpu/drm/amd/powerplay/inc/smumgr.h
+++ b/drivers/gpu/drm/amd/powerplay/inc/smumgr.h
@@ -113,4 +113,6 @@ extern uint32_t smum_get_offsetof(struct pp_hwmgr *hwmgr,
 
 extern int smum_update_dpm_settings(struct pp_hwmgr *hwmgr, void 
*profile_setting);
 
+extern int smum_smc_table_manager(struct pp_hwmgr *hwmgr, uint8_t *table, 
uint16_t table_id, bool rw);
+
 #endif
diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c 
b/drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c
index 68d943d..04c45c2 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c
@@ -199,3 +199,11 @@ int smum_update_dpm_settings(struct pp_hwmgr *hwmgr, void 
*profile_setting)
 
return -EINVAL;
 }
+
+int smum_smc_table_manager(struct pp_hwmgr *hwmgr, uint8_t *table, uint16_t 
table_id, bool rw)
+{
+   if (hwmgr->smumgr_funcs->smc_table_manager)
+   return hwmgr->smumgr_funcs->smc_table_manager(hwmgr, table, 
table_id, rw);
+
+   return -EINVAL;
+}
-- 
1.9.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH 4/8] drm/amd/pp: Clean up the code for RV in powerplay

2018-03-07 Thread Rex Zhu
In the send_message_to_smu helper functions, print out the error
code for debugging if the SMU fails to respond.

Also, the helper functions always return success, so there is no
need to check their return value.
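
(The hunks below don't show the smumgr side; a hedged sketch of the
logging this adds -- exact response values and format string assumed:)

/* Sketch: callers no longer check the return value, so failures are
 * made visible by logging the SMU response code instead.
 */
static int rv_send_msg_to_smc(struct pp_hwmgr *hwmgr, uint16_t msg)
{
	uint32_t resp;

	rv_wait_for_response(hwmgr);		/* drain previous exchange */
	rv_send_msg_to_smc_without_waiting(hwmgr, msg);

	resp = rv_wait_for_response(hwmgr);
	if (resp != 1)				/* 1 == success (assumption) */
		pr_err("Failed to send message %x, response %x\n", msg, resp);

	return 0;
}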

Signed-off-by: Rex Zhu 

Change-Id: I27da8bf22d3cea0df968df7b0809dc727461762f
---
 drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.c   | 72 +
 drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.c | 98 
 2 files changed, 53 insertions(+), 117 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.c
index 8ddfb78..4b5c5fc 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.c
@@ -243,8 +243,7 @@ static int rv_disable_gfx_off(struct pp_hwmgr *hwmgr)
struct rv_hwmgr *rv_data = (struct rv_hwmgr *)(hwmgr->backend);
 
if (rv_data->gfx_off_controled_by_driver)
-   smum_send_msg_to_smc(hwmgr,
-   PPSMC_MSG_DisableGfxOff);
+   smum_send_msg_to_smc(hwmgr, PPSMC_MSG_DisableGfxOff);
 
return 0;
 }
@@ -259,8 +258,7 @@ static int rv_enable_gfx_off(struct pp_hwmgr *hwmgr)
struct rv_hwmgr *rv_data = (struct rv_hwmgr *)(hwmgr->backend);
 
if (rv_data->gfx_off_controled_by_driver)
-   smum_send_msg_to_smc(hwmgr,
-   PPSMC_MSG_EnableGfxOff);
+   smum_send_msg_to_smc(hwmgr, PPSMC_MSG_EnableGfxOff);
 
return 0;
 }
@@ -387,24 +385,12 @@ static int rv_populate_clock_table(struct pp_hwmgr *hwmgr)
rv_get_clock_voltage_dependency_table(hwmgr, >vdd_dep_on_phyclk,
ARRAY_SIZE(VddPhyClk), [0]);
 
-   PP_ASSERT_WITH_CODE(!smum_send_msg_to_smc(hwmgr,
-   PPSMC_MSG_GetMinGfxclkFrequency),
-   "Attempt to get min GFXCLK Failed!",
-   return -1);
-   PP_ASSERT_WITH_CODE(!rv_read_arg_from_smc(hwmgr,
-   &result),
-   "Attempt to get min GFXCLK Failed!",
-   return -1);
+   smum_send_msg_to_smc(hwmgr, PPSMC_MSG_GetMinGfxclkFrequency);
+   rv_read_arg_from_smc(hwmgr, &result);
rv_data->gfx_min_freq_limit = result * 100;
 
-   PP_ASSERT_WITH_CODE(!smum_send_msg_to_smc(hwmgr,
-   PPSMC_MSG_GetMaxGfxclkFrequency),
-   "Attempt to get max GFXCLK Failed!",
-   return -1);
-   PP_ASSERT_WITH_CODE(!rv_read_arg_from_smc(hwmgr,
-   &result),
-   "Attempt to get max GFXCLK Failed!",
-   return -1);
+   smum_send_msg_to_smc(hwmgr, PPSMC_MSG_GetMaxGfxclkFrequency);
+   rv_read_arg_from_smc(hwmgr, &result);
rv_data->gfx_max_freq_limit = result * 100;
 
return 0;
@@ -739,14 +725,8 @@ static int rv_print_clock_levels(struct pp_hwmgr *hwmgr,
 
switch (type) {
case PP_SCLK:
-   PP_ASSERT_WITH_CODE(!smum_send_msg_to_smc(hwmgr,
-   PPSMC_MSG_GetGfxclkFrequency),
-   "Attempt to get current GFXCLK Failed!",
-   return -1);
-   PP_ASSERT_WITH_CODE(!rv_read_arg_from_smc(hwmgr,
-   &now),
-   "Attempt to get current GFXCLK Failed!",
-   return -1);
+   smum_send_msg_to_smc(hwmgr, PPSMC_MSG_GetGfxclkFrequency);
+   rv_read_arg_from_smc(hwmgr, &now);
 
size += sprintf(buf + size, "0: %uMhz %s\n",
data->gfx_min_freq_limit / 100,
@@ -758,14 +738,8 @@ static int rv_print_clock_levels(struct pp_hwmgr *hwmgr,
 == now) ? "*" : "");
break;
case PP_MCLK:
-   PP_ASSERT_WITH_CODE(!smum_send_msg_to_smc(hwmgr,
-   PPSMC_MSG_GetFclkFrequency),
-   "Attempt to get current MEMCLK Failed!",
-   return -1);
-   PP_ASSERT_WITH_CODE(!rv_read_arg_from_smc(hwmgr,
-   &now),
-   "Attempt to get current MEMCLK Failed!",
-   return -1);
+   smum_send_msg_to_smc(hwmgr, PPSMC_MSG_GetFclkFrequency);
+   rv_read_arg_from_smc(hwmgr, &now);
 
for (i = 0; i < mclk_table->count; i++)
size += sprintf(buf + size, "%d: %uMhz %s\n",
@@ -935,7 +909,6 @@ static int rv_get_clock_by_type_with_voltage(struct 
pp_hwmgr *hwmgr,
 int rv_display_clock_voltage_request(struct pp_hwmgr *hwmgr,
struct pp_display_clock_request *clock_req)
 {
-   int result = 0;
struct rv_hwmgr *rv_data = (struct rv_hwmgr *)(hwmgr->backend);
enum amd_pp_clock_type clk_type = 

[PATCH 5/8] drm/amd/pp: Clean up code for RV in powerplay.

2018-03-07 Thread Rex Zhu
1. make functions static
2. add rv_read_arg_from_smc to smu backend function table.

Change-Id: I9baecaec95553923e59d1a4cf92a67551769b755
Signed-off-by: Rex Zhu 
---
 drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.c   | 12 ++--
 drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.c | 18 --
 drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.h |  2 --
 3 files changed, 14 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.c
index 4b5c5fc..474612f 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.c
@@ -386,11 +386,11 @@ static int rv_populate_clock_table(struct pp_hwmgr *hwmgr)
ARRAY_SIZE(VddPhyClk), [0]);
 
smum_send_msg_to_smc(hwmgr, PPSMC_MSG_GetMinGfxclkFrequency);
-   rv_read_arg_from_smc(hwmgr, &result);
+   result = smum_get_argument(hwmgr);
rv_data->gfx_min_freq_limit = result * 100;
 
smum_send_msg_to_smc(hwmgr, PPSMC_MSG_GetMaxGfxclkFrequency);
-   rv_read_arg_from_smc(hwmgr, &result);
+   result = smum_get_argument(hwmgr);
rv_data->gfx_max_freq_limit = result * 100;
 
return 0;
@@ -726,7 +726,7 @@ static int rv_print_clock_levels(struct pp_hwmgr *hwmgr,
switch (type) {
case PP_SCLK:
smum_send_msg_to_smc(hwmgr, PPSMC_MSG_GetGfxclkFrequency);
-   rv_read_arg_from_smc(hwmgr, &now);
+   now = smum_get_argument(hwmgr);
 
size += sprintf(buf + size, "0: %uMhz %s\n",
data->gfx_min_freq_limit / 100,
@@ -739,7 +739,7 @@ static int rv_print_clock_levels(struct pp_hwmgr *hwmgr,
break;
case PP_MCLK:
smum_send_msg_to_smc(hwmgr, PPSMC_MSG_GetFclkFrequency);
-   rv_read_arg_from_smc(hwmgr, &now);
+   now = smum_get_argument(hwmgr);
 
for (i = 0; i < mclk_table->count; i++)
size += sprintf(buf + size, "%d: %uMhz %s\n",
@@ -971,14 +971,14 @@ static int rv_read_sensor(struct pp_hwmgr *hwmgr, int idx,
switch (idx) {
case AMDGPU_PP_SENSOR_GFX_SCLK:
smum_send_msg_to_smc(hwmgr, PPSMC_MSG_GetGfxclkFrequency);
-   rv_read_arg_from_smc(hwmgr, &sclk);
+   sclk = smum_get_argument(hwmgr);
/* in units of 10KHZ */
*((uint32_t *)value) = sclk * 100;
*size = 4;
break;
case AMDGPU_PP_SENSOR_GFX_MCLK:
smum_send_msg_to_smc(hwmgr, PPSMC_MSG_GetFclkFrequency);
-   rv_read_arg_from_smc(hwmgr, &mclk);
+   mclk = smum_get_argument(hwmgr);
/* in units of 10KHZ */
*((uint32_t *)value) = mclk * 100;
*size = 4;
diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.c 
b/drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.c
index b64bfe3..7f2866c 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.c
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.c
@@ -60,7 +60,7 @@ static uint32_t rv_wait_for_response(struct pp_hwmgr *hwmgr)
return cgs_read_register(hwmgr->device, reg);
 }
 
-int rv_send_msg_to_smc_without_waiting(struct pp_hwmgr *hwmgr,
+static int rv_send_msg_to_smc_without_waiting(struct pp_hwmgr *hwmgr,
uint16_t msg)
 {
uint32_t reg;
@@ -72,19 +72,17 @@ int rv_send_msg_to_smc_without_waiting(struct pp_hwmgr 
*hwmgr,
return 0;
 }
 
-int rv_read_arg_from_smc(struct pp_hwmgr *hwmgr, uint32_t *arg)
+static int rv_read_arg_from_smc(struct pp_hwmgr *hwmgr)
 {
uint32_t reg;
 
reg = soc15_get_register_offset(MP1_HWID, 0,
mmMP1_SMN_C2PMSG_82_BASE_IDX, mmMP1_SMN_C2PMSG_82);
 
-   *arg = cgs_read_register(hwmgr->device, reg);
-
-   return 0;
+   return cgs_read_register(hwmgr->device, reg);
 }
 
-int rv_send_msg_to_smc(struct pp_hwmgr *hwmgr, uint16_t msg)
+static int rv_send_msg_to_smc(struct pp_hwmgr *hwmgr, uint16_t msg)
 {
uint32_t reg;
 
@@ -103,7 +101,7 @@ int rv_send_msg_to_smc(struct pp_hwmgr *hwmgr, uint16_t msg)
 }
 
 
-int rv_send_msg_to_smc_with_parameter(struct pp_hwmgr *hwmgr,
+static int rv_send_msg_to_smc_with_parameter(struct pp_hwmgr *hwmgr,
uint16_t msg, uint32_t parameter)
 {
uint32_t reg;
@@ -190,8 +188,7 @@ static int rv_verify_smc_interface(struct pp_hwmgr *hwmgr)
 
rv_send_msg_to_smc(hwmgr,
PPSMC_MSG_GetDriverIfVersion);
-   rv_read_arg_from_smc(hwmgr,
-   &smc_driver_if_version);
+   smc_driver_if_version = rv_read_arg_from_smc(hwmgr);
 
if (smc_driver_if_version != SMU10_DRIVER_IF_VERSION) {
pr_err("Attempt to read SMC IF Version Number Failed!\n");
@@ -253,7 +250,7 @@ static int rv_start_smu(struct pp_hwmgr *hwmgr)
struct 

[PATCH 3/8] drm/amd/pp: Delete is_smc_ram_running function on RV

2018-03-07 Thread Rex Zhu
1. There is a race condition when another IP block also uses the same
   register pair.
2. Checking once at boot via the GetDriverIfVersion message is sufficient
   to verify SMU health, so delete the is_smc_ram_running check.
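
(For illustration, the race in point 1 -- PCIE_INDEX2/PCIE_DATA2 form an
indirect index/data pair, so two unsynchronized users can interleave:)

/*
 * Thread A (is_smc_ram_running)        Thread B (another IP block)
 *
 * write PCIE_INDEX2 = FW_FLAGS_ADDR
 *                                      write PCIE_INDEX2 = OTHER_ADDR
 * read  PCIE_DATA2  --> returns data for OTHER_ADDR, not the FW flags
 *
 * Without a lock shared by every user of the pair, the readback is
 * unreliable -- hence dropping the per-message check in favour of the
 * one-time GetDriverIfVersion handshake at boot.
 */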

Change-Id: I3c5f5ebac36c1ae4a7dd72f482fc27a4f362975e
Signed-off-by: Rex Zhu 
---
 drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.c | 27 
 1 file changed, 27 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.c 
b/drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.c
index 6362d46..e6317fd 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.c
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.c
@@ -47,34 +47,10 @@
 #define smnMP1_FIRMWARE_FLAGS   0x3010028
 
 
-bool rv_is_smc_ram_running(struct pp_hwmgr *hwmgr)
-{
-   uint32_t mp1_fw_flags, reg;
-
-   reg = soc15_get_register_offset(NBIF_HWID, 0,
-   mmPCIE_INDEX2_BASE_IDX, mmPCIE_INDEX2);
-
-   cgs_write_register(hwmgr->device, reg,
-   (MP1_Public | (smnMP1_FIRMWARE_FLAGS & 0x)));
-
-   reg = soc15_get_register_offset(NBIF_HWID, 0,
-   mmPCIE_DATA2_BASE_IDX, mmPCIE_DATA2);
-
-   mp1_fw_flags = cgs_read_register(hwmgr->device, reg);
-
-   if (mp1_fw_flags & MP1_FIRMWARE_FLAGS__INTERRUPTS_ENABLED_MASK)
-   return true;
-
-   return false;
-}
-
 static uint32_t rv_wait_for_response(struct pp_hwmgr *hwmgr)
 {
uint32_t reg;
 
-   if (!rv_is_smc_ram_running(hwmgr))
-   return -EINVAL;
-
reg = soc15_get_register_offset(MP1_HWID, 0,
mmMP1_SMN_C2PMSG_90_BASE_IDX, mmMP1_SMN_C2PMSG_90);
 
@@ -89,9 +65,6 @@ int rv_send_msg_to_smc_without_waiting(struct pp_hwmgr *hwmgr,
 {
uint32_t reg;
 
-   if (!rv_is_smc_ram_running(hwmgr))
-   return -EINVAL;
-
reg = soc15_get_register_offset(MP1_HWID, 0,
mmMP1_SMN_C2PMSG_66_BASE_IDX, mmMP1_SMN_C2PMSG_66);
cgs_write_register(hwmgr->device, reg, msg);
-- 
1.9.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH 2/8] drm/amd/pp: Clean up header file include

2018-03-07 Thread Rex Zhu
smu7_smumgr.h should not be included in rv and vega; the common functions
for all smu7 ASICs are put in smu_smumgr.c.

Don't include the cgs interface in smumgr.c; the code that used the cgs
interface has been deleted.

Don't include smu_ucode_xfer_vi.h in rv/vega.

Change-Id: I3c9751594919c9598aed6a518d1b02eb698de9ec
Signed-off-by: Rex Zhu 
---
 drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.c | 3 +--
 drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c| 1 -
 drivers/gpu/drm/amd/powerplay/smumgr/vega10_smumgr.c | 6 ++
 3 files changed, 3 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.c 
b/drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.c
index e2ee23a..6362d46 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.c
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/rv_smumgr.c
@@ -31,8 +31,7 @@
 #include "smu10.h"
 #include "ppatomctrl.h"
 #include "pp_debug.h"
-#include "smu_ucode_xfer_vi.h"
-#include "smu7_smumgr.h"
+
 
 #define VOLTAGE_SCALE 4
 
diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c 
b/drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c
index 3645127..68d943d 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c
@@ -28,7 +28,6 @@
 #include 
 #include 
 #include "smumgr.h"
-#include "cgs_common.h"
 
 MODULE_FIRMWARE("amdgpu/topaz_smc.bin");
 MODULE_FIRMWARE("amdgpu/topaz_k_smc.bin");
diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/vega10_smumgr.c 
b/drivers/gpu/drm/amd/powerplay/smumgr/vega10_smumgr.c
index 15e1afa..ad4e8b4 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/vega10_smumgr.c
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/vega10_smumgr.c
@@ -27,11 +27,9 @@
 #include "vega10_smumgr.h"
 #include "vega10_ppsmc.h"
 #include "smu9_driver_if.h"
-
 #include "ppatomctrl.h"
 #include "pp_debug.h"
-#include "smu_ucode_xfer_vi.h"
-#include "smu7_smumgr.h"
+
 
 #define AVFS_EN_MSB1568
 #define AVFS_EN_LSB1568
@@ -386,7 +384,7 @@ static int vega10_smu_init(struct pp_hwmgr *hwmgr)
struct cgs_firmware_info info = {0};
 
ret = cgs_get_firmware_info(hwmgr->device,
-   smu7_convert_fw_type_to_cgs(UCODE_ID_SMU),
+   CGS_UCODE_ID_SMU,
);
if (ret || !info.kptr)
return -EINVAL;
-- 
1.9.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[PATCH 1/8] drm/amd/pp: Refine code for smu7 in powerplay

2018-03-07 Thread Rex Zhu
1. Add avfs_supported in hwmgr to avoid accessing the smu backend struct
   from hwmgr, so smu7_smumgr.h no longer needs to be included under the
   hwmgr folder.
2. Simplify the list of enum AVFS_BTC_STATUS.

Change-Id: I04c769972deff797229339f3ccb1c442b35768a2
Signed-off-by: Rex Zhu 
---
 drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c| 14 ++
 drivers/gpu/drm/amd/powerplay/inc/hwmgr.h   |  2 +-
 drivers/gpu/drm/amd/powerplay/inc/smumgr.h  | 10 +-
 drivers/gpu/drm/amd/powerplay/smumgr/fiji_smumgr.c  |  4 +++-
 drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.c |  4 +++-
 5 files changed, 10 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
index d4d1d2e..da25e7f 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
@@ -40,7 +40,6 @@
 
 #include "hwmgr.h"
 #include "smu7_hwmgr.h"
-#include "smu7_smumgr.h"
 #include "smu_ucode_xfer_vi.h"
 #include "smu7_powertune.h"
 #include "smu7_dyn_defaults.h"
@@ -1353,12 +1352,7 @@ static int smu7_enable_dpm_tasks(struct pp_hwmgr *hwmgr)
 
 static int smu7_avfs_control(struct pp_hwmgr *hwmgr, bool enable)
 {
-   struct smu7_smumgr *smu_data = (struct smu7_smumgr 
*)(hwmgr->smu_backend);
-
-   if (smu_data == NULL)
-   return -EINVAL;
-
-   if (smu_data->avfs.avfs_btc_status == AVFS_BTC_NOTSUPPORTED)
+   if (!hwmgr->avfs_supported)
return 0;
 
if (enable) {
@@ -1382,13 +1376,9 @@ static int smu7_avfs_control(struct pp_hwmgr *hwmgr, 
bool enable)
 
 static int smu7_update_avfs(struct pp_hwmgr *hwmgr)
 {
-   struct smu7_smumgr *smu_data = (struct smu7_smumgr 
*)(hwmgr->smu_backend);
struct smu7_hwmgr *data = (struct smu7_hwmgr *)(hwmgr->backend);
 
-   if (smu_data == NULL)
-   return -EINVAL;
-
-   if (smu_data->avfs.avfs_btc_status == AVFS_BTC_NOTSUPPORTED)
+   if (!hwmgr->avfs_supported)
return 0;
 
if (data->need_update_smu7_dpm_table & DPMTABLE_OD_UPDATE_VDDC) {
diff --git a/drivers/gpu/drm/amd/powerplay/inc/hwmgr.h 
b/drivers/gpu/drm/amd/powerplay/inc/hwmgr.h
index b151ad85..312fbc3 100644
--- a/drivers/gpu/drm/amd/powerplay/inc/hwmgr.h
+++ b/drivers/gpu/drm/amd/powerplay/inc/hwmgr.h
@@ -748,7 +748,7 @@ struct pp_hwmgr {
struct pp_power_state*uvd_ps;
struct amd_pp_display_configuration display_config;
uint32_t feature_mask;
-
+   bool avfs_supported;
/* UMD Pstate */
bool en_umd_pstate;
uint32_t power_profile_mode;
diff --git a/drivers/gpu/drm/amd/powerplay/inc/smumgr.h 
b/drivers/gpu/drm/amd/powerplay/inc/smumgr.h
index 9bba0a0..f0ef02a 100644
--- a/drivers/gpu/drm/amd/powerplay/inc/smumgr.h
+++ b/drivers/gpu/drm/amd/powerplay/inc/smumgr.h
@@ -28,23 +28,15 @@
 
 enum AVFS_BTC_STATUS {
AVFS_BTC_BOOT = 0,
-   AVFS_BTC_BOOT_STARTEDSMU,
AVFS_LOAD_VIRUS,
AVFS_BTC_VIRUS_LOADED,
AVFS_BTC_VIRUS_FAIL,
AVFS_BTC_COMPLETED_PREVIOUSLY,
AVFS_BTC_ENABLEAVFS,
-   AVFS_BTC_STARTED,
AVFS_BTC_FAILED,
-   AVFS_BTC_RESTOREVFT_FAILED,
-   AVFS_BTC_SAVEVFT_FAILED,
AVFS_BTC_DPMTABLESETUP_FAILED,
-   AVFS_BTC_COMPLETED_UNSAVED,
-   AVFS_BTC_COMPLETED_SAVED,
-   AVFS_BTC_COMPLETED_RESTORED,
AVFS_BTC_DISABLED,
-   AVFS_BTC_NOTSUPPORTED,
-   AVFS_BTC_SMUMSG_ERROR
+   AVFS_BTC_NOTSUPPORTED
 };
 
 enum SMU_TABLE {
diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smumgr.c 
b/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smumgr.c
index 0b2b5d1..573c9be 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smumgr.c
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smumgr.c
@@ -360,8 +360,10 @@ static bool fiji_is_hw_avfs_present(struct pp_hwmgr *hwmgr)
 
if (!atomctrl_read_efuse(hwmgr->device, AVFS_EN_LSB, AVFS_EN_MSB,
mask, &efuse)) {
-   if (efuse)
+   if (efuse) {
+   hwmgr->avfs_supported = true;
return true;
+   }
}
return false;
 }
diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.c 
b/drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.c
index 632d1ca..1075ec1 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.c
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.c
@@ -357,8 +357,10 @@ static bool polaris10_is_hw_avfs_present(struct pp_hwmgr 
*hwmgr)
 
efuse = cgs_read_ind_register(hwmgr->device, CGS_IND_REG__SMC, 
ixSMU_EFUSE_0 + (49*4));
efuse &= 0x0001;
-   if (efuse)
+   if (efuse) {
+   hwmgr->avfs_supported = true;
return true;
+   }
 
return false;
 }
-- 
1.9.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org

[PATCH] drm/amdgpu: Correct the place of amdgpu_pm_sysfs_fini

2018-03-07 Thread Emily Deng
amdgpu_pm_sysfs_fini should be called before amdgpu_device_ip_fini;
otherwise adev->pm.dpm_enabled gets set to 0 first, and then the device
files related to pp won't be removed by amdgpu_pm_sysfs_fini when
unloading the driver.

Signed-off-by: Emily Deng 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index f67f5bf..8a8f8f6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2096,6 +2096,7 @@ void amdgpu_device_fini(struct amdgpu_device *adev)
 
amdgpu_ib_pool_fini(adev);
amdgpu_fence_driver_fini(adev);
+   amdgpu_pm_sysfs_fini(adev);
amdgpu_fbdev_fini(adev);
r = amdgpu_device_ip_fini(adev);
if (adev->firmware.gpu_info_fw) {
@@ -2124,7 +2125,6 @@ void amdgpu_device_fini(struct amdgpu_device *adev)
iounmap(adev->rmmio);
adev->rmmio = NULL;
amdgpu_device_doorbell_fini(adev);
-   amdgpu_pm_sysfs_fini(adev);
amdgpu_debugfs_regs_cleanup(adev);
 }
 
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH 2/4] drm/amdgpu: refactoring mailbox to fix TDR handshake bugs

2018-03-07 Thread Liu, Monk
>Can you do that? Looks to me like the MACRO in amdgpu only works for DW-
> aligned register names, so I'm afraid we cannot do that gracefully

[Pixel] some readable macros can help us easily understand which register field 
is accessed.

[ml] I can try it 

>[Pixel] Generally, CMPL is not handled by rcv_msg anymore but in peek_msg. We 
>can remove the logic here.

Yeah, we can remove it 

/Monk


-Original Message-
From: Ding, Pixel 
Sent: March 7, 2018 17:22
To: Liu, Monk ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 2/4] drm/amdgpu: refactoring mailbox to fix TDR handshake 
bugs

Hi Monk,

See comments inline.
— 
Sincerely Yours,
Pixel


On 07/03/2018, 4:58 PM, "Liu, Monk"  wrote:

HI Pixel

The patch had better not be split, otherwise the stress TDR test still fails, 
since there are a couple of issues and some of them are mixed together.

For your concerns:
> Any VF ACK could lead to a missing msg if the timing is bad while the hypervisor sends 
msgs quite frequently and the guest sends acks frequently (such as the case of 
FLR_NOTIFY followed by READY_TO_ACCESS_GPU immediately). Is it correct? If so, 
maybe we need further protection in GIM.

We had a lot of fixes on the GIM side along with this patch to make the stress 
TDR test finally pass; you can watch the history. Anyway, the GIM side is not 
related to this patch.

[Pixel] As talked offline, I understand your point here.

> it’s not clear how the last 2 lines of code here set the 
RCV_MSG_ACK bit: 0x10 to the +1 byte offset. Can we define some macros to make it 
clear?
Can you do that? Looks to me like the MACRO in amdgpu only works for DW-aligned 
register names, so I'm afraid we cannot do that gracefully

[Pixel] some readable macros can help us easily understand which register field 
is accessed.

> Can we also set the valid bit for the IDH_FLR_NOTIFICATION_CMPL event in GIM? By 
now the guest uses peek_msg to catch this event, so I think it’s OK to set valid for 
it. Then the code here would be clear without the CMPL condition. We can also use 
peek_ack to ack CMPL if necessary. Any problem?

I don't understand your point 
[Pixel] Generally, CMPL is not handled by rcv_msg anymore but in peek_msg. We 
can remove the logic here.



-Original Message-
From: Ding, Pixel 
Sent: March 7, 2018 16:31
To: Liu, Monk ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 2/4] drm/amdgpu: refactoring mailbox to fix TDR 
handshake bugs

Hi Monk,

It’s a long patch which could be split into up to 6 pieces… anyway I’ve got 
some questions inline to get a better understanding.

— 
Sincerely Yours,
Pixel


On 06/03/2018, 11:29 AM, "amd-gfx on behalf of Monk Liu" 
 wrote:

this patch actually refactors the mailbox implementation; all
the changes below are needed together to fix the mailbox
handshake issues exposed by heavy TDR testing.

1) refactor all mailbox functions based on byte accessing for mb_control.
The reason is to avoid touching non-related bits when writing the trn/rcv
part of mailbox_control; this way some incorrect INTRs sent to the
hypervisor side can be avoided, and it fixes a couple of handshake bugs.

2) trans_msg function re-implemented: put an invalidate step
before transmitting a message to make sure the ACK bit is in
a clear status, otherwise there is a chance that ACK is already asserted
before transmitting the message, leading to fake ACK polling.
(the hypervisor side has some tricks to work around the ACK bit being
corrupted by VF FLR, which has a side effect that may wrongly assert the
guest side ACK bit), and clear the TRANS_MSG words after the message is
transferred.

3) mailbox_flr_work is also reworked: it takes the mutex lock
first when invoked, to block gpu recovery from participating too early
while the hypervisor side is doing VF FLR. (the hypervisor sends
FLR_NOTIFY to the guest before doing VF FLR and sends FLR_COMPLETE after
VF FLR is done, and the FLR_NOTIFY triggers an interrupt to the guest
which leads to mailbox_flr_work being invoked)

This avoids the issue of the mailbox trans msg being cleared by its own
VF FLR.

4) the mailbox_rcv_irq IRQ routine should only peek the msg and schedule
mailbox_flr_work, instead of ACKing to the hypervisor itself, because the
FLR_NOTIFY msg sent from the hypervisor side doesn't need the VF's ACK
(this is because the VF's ACK would lead the hypervisor to clear its
trans_valid/msg, and this would cause a handshake bug if trans_valid/msg
is cleared not due to a correct VF ACK but due to a wrong VF ACK like
this "FLR_NOTIFY" one)

Any VF ACK could lead to a missing msg if the timing is bad while 

Re: [PATCH 1/2] drm/amdgpu: Enable scatter gather display support

2018-03-07 Thread Michel Dänzer
On 2018-03-07 11:04 AM, Christian König wrote:
> On 07.03.2018 at 10:53, Michel Dänzer wrote:
>> On 2018-03-07 10:47 AM, Christian König wrote:
>>> See when I tested this the DDX and Mesa were unmodified, so both still
>>> assumed VRAM as placement for scanout BOs, but the kernel forced scanout
>>> BOs into GTT for testing.
>>>
>>> So what happened was that on scanout we moved the VRAM BO to GTT and
>>> after unpinning it on the first command submission which used the BO we
>>> moved it back to VRAM again.
>>>
>>> So as long as we don't want a severe performance penalty when combining
>>> old userspace with a new kernel, using a kernel parameter to force
>>> scanout from GTT is not possible.
>> That means we either need to come up with a different workaround for
>> hardware issues transitioning between scanout from VRAM and GTT, or we
>> can't scan out from GTT in that case.
> 
> What exactly was the problem with the transition from VRAM to GTT?
> 
> I mean what we can rather easily do is check where the existing BO is
> pinned and then always pin into the same domain on page flip.

Yeah, something like that could work. In which case, all that's needed
is a way for userspace to know that GTT scanout can work, right? With
that information, it can manage the domains of scanout BOs as it wants.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 1/2] drm/amdgpu: Enable scatter gather display support

2018-03-07 Thread Christian König

On 07.03.2018 at 10:53, Michel Dänzer wrote:

On 2018-03-07 10:47 AM, Christian König wrote:

See when I tested this the DDX and Mesa were unmodified, so both still
assumed VRAM as placement for scanout BOs, but the kernel forced scanout
BOs into GTT for testing.

So what happened was that on scanout we moved the VRAM BO to GTT and
after unpinning it on the first command submission which used the BO we
moved it back to VRAM again.

So as long as we don't want a severe performance penalty when combining
old userspace with a new kernel, using a kernel parameter to force
scanout from GTT is not possible.

That means we either need to come up with a different workaround for
hardware issues transitioning between scanout from VRAM and GTT, or we
can't scan out from GTT in that case.


What exactly was the problem with the transition from VRAM to GTT?

I mean what we can rather easily do is check where the existing BO is 
pinned and then always pin into the same domain on page flip.
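
For illustration, a minimal sketch of that check (the helper name is
invented; it assumes the amdgpu_bo_pin() and amdgpu_mem_type_to_domain()
interfaces of this kernel generation):

static int pin_like_previous_scanout(struct amdgpu_bo *old_bo,
				     struct amdgpu_bo *new_bo)
{
	/* scan out from whatever domain the previous scanout BO is pinned
	 * in, so we never transition between VRAM and GTT scanout */
	u32 domain = AMDGPU_GEM_DOMAIN_VRAM;

	if (old_bo && old_bo->pin_count)
		domain = amdgpu_mem_type_to_domain(old_bo->tbo.mem.mem_type);

	return amdgpu_bo_pin(new_bo, domain, NULL);
}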


Christian.
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 1/2] drm/amdgpu: Enable scatter gather display support

2018-03-07 Thread Michel Dänzer
On 2018-03-07 10:47 AM, Christian König wrote:
> 
> See when I tested this the DDX and Mesa were unmodified, so both still
> assumed VRAM as placement for scanout BOs, but the kernel forced scanout
> BOs into GTT for testing.
> 
> So what happened was that on scanout we moved the VRAM BO to GTT and
> after unpinning it on the first command submission which used the BO we
> moved it back to VRAM again.
> 
> So as long as we don't want a severe performance penalty when combining
> old userspace with a new kernel, using a kernel parameter to force
> scanout from GTT is not possible.

That means we either need to come up with a different workaround for
hardware issues transitioning between scanout from VRAM and GTT, or we
can't scan out from GTT in that case.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 1/2] drm/amdgpu: Enable scatter gather display support

2018-03-07 Thread Christian König

On 07.03.2018 at 09:42, Michel Dänzer wrote:

On 2018-03-06 07:23 PM, Christian König wrote:

On 06.03.2018 at 19:15, Michel Dänzer wrote:

On 2018-03-06 07:02 PM, Christian König wrote:

On 06.03.2018 at 18:51, Michel Dänzer wrote:

On 2018-03-06 06:44 PM, Christian König wrote:

On 06.03.2018 at 18:22, Li, Samuel wrote:

addition to that I agree with Michel that the module parameter is
overkill.

That I already explained. Currently SG display feature needs to
provide options for all kinds of use cases. All amd drivers so far
>> provide options, and the default configuration is actually to
allocate everything from GTT when possible.

That isn't the job of the kernel, to have this parameter. That said, we
should probably add a DDX and/or environment option to control that.

Why would we need that? It's the kernel driver's job to handle this as
needed.

Buffer placement is specified by the DDX/Mesa, when we override that in
the kernel we just force unnecessary buffer moves.

Userspace just needs a way to know which domains can/should be used for
scanout buffers, e.g. via an info ioctl.

Yeah, but why shouldn't they then make the decision where to place it as
well?

See when we enforce a certain placement in the kernel we will just get
an unnecessary buffer move with old user space components.

In other words the kernel just needs to advertise that the hw/sw
combination allows scanout from GTT as well and then an updated
userspace can actually make use of that placement or decide to not use
it in certain situations.

I think we'll need to always pin BOs to GTT for scanout in some cases,
because some hardware has issues when transitioning between scanout from
VRAM and GTT.

How about:

Userspace can query from the kernel which domain(s) can be used for
scanout: only VRAM / VRAM or GTT / only GTT. Userspace sets the allowed
domains correspondingly for scanout BOs.


Well, that sounds close to what I had in mind as well; we just can't 
specify GTT-only when we want to keep backward compatibility with 
existing user space.




If you can think of a scenario where this won't work, please
specifically describe it and how you propose addressing it.



E.g. the last time I tested it placing things into GTT still resulted
in quite a performance penalty for rendering.

FWIW, I think the penalty is most likely IOMMU related. Last time I
tested, I couldn't measure a big difference with IOMMU disabled.


No, the penalty I'm talking about came from the ping/pong we did with 
the scanout buffers.


See when I tested this the DDX and Mesa were unmodified, so both still 
assumed VRAM as placement for scanout BOs, but the kernel forced scanout 
BOs into GTT for testing.


So what happened was that on scanout we moved the VRAM BO to GTT and 
after unpinning it on the first command submission which used the BO we 
moved it back to VRAM again.


So as long as we don't want a severe performance penalty when combining 
old userspace with a new kernel, using a kernel parameter to force 
scanout from GTT is not possible.


Christian.
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH xf86-video-ati] Only change Set/MoveCursor hooks from what we expect

2018-03-07 Thread Mario Kleiner

On 03/07/2018 09:50 AM, Michel Dänzer wrote:

On 2018-03-07 04:45 AM, Mario Kleiner wrote:

On 03/05/2018 10:56 PM, Alex Deucher wrote:

On Mon, Mar 5, 2018 at 12:44 PM, Michel Dänzer 
wrote:

From: Michel Dänzer 

Since xf86CursorCloseScreen runs after RADEONCloseScreen_KMS,
PointPriv->spriteFuncs doesn't point to the same struct in the latter as
in RADEONCursorInit_KMS. So we were restoring info->Set/MoveCursor to
the wrong struct. Then in the next server generation,
info->Set/MoveCursor would end up pointing to
drmmode_sprite_set/move_cursor, resulting in an infinite loop if one of
them was called.

To avoid this, only change the Set/MoveCursor hooks if their values
match our expectations, otherwise leave them as is. This is kind of a
hack, but the alternative would be invasive and thus risky changes to
the way we're wrapping CloseScreen, and it's not even clear that can
work without changing xserver code.

Fixes: 1fe8ca75974c ("Keep track of how many SW cursors are visible on
    each screen")
Signed-off-by: Michel Dänzer 


Acked-by: Alex Deucher 



Nope, not quite, unfortunately. Tested against x-server master, mesa
master, ati-ddx master, with sddm login manager. With a freshly started
server, now on a dual-x-screen setup, instead of an infinite loop, i get
a server crash as soon as i move the mouse cursor from X-Screen 0 to
X-Screen 1:


Well, that's not the same issue I was seeing after all. I'll take
another look.




Bedtime here, but fwiw, from some debug statements I added, it seems as 
if on dual-x-screen init, x-screen 1 somehow inherits the 
PointPriv->spriteFuncs->SetCursor = drmmode_sprite_set_cursor assignment 
done on x-screen 0, and therefore already has it == 
drmmode_sprite_set_cursor. So during RADEONCursorInit_KMS for x-screen 1 
the assignments get skipped, which leaves info->SetCursor for 
x-screen 1 == NULL. Therefore a cursor update for x-screen 1 = boom!
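
In code terms, the guarded wrapping the patch does in RADEONCursorInit_KMS
amounts to roughly this (a sketch reconstructed from the commit message,
not the exact DDX hunk), which is why x-screen 1 skips the save/wrap and
info->SetCursor stays NULL:

/* only wrap the hooks if they are not already our wrappers */
if (PointPriv->spriteFuncs->SetCursor != drmmode_sprite_set_cursor) {
	info->SetCursor = PointPriv->spriteFuncs->SetCursor;
	info->MoveCursor = PointPriv->spriteFuncs->MoveCursor;
	PointPriv->spriteFuncs->SetCursor = drmmode_sprite_set_cursor;
	PointPriv->spriteFuncs->MoveCursor = drmmode_sprite_move_cursor;
}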


-mario
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH v2] drm/amdgpu: Clean sdma wptr register when only enable wptr polling

2018-03-07 Thread Christian König

Reviewed-by: Christian König .

On 07.03.2018 at 07:17, Yu, Xiangliang wrote:

Reviewed-by: Xiangliang Yu 



-Original Message-
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
Of Emily Deng
Sent: Wednesday, March 07, 2018 9:52 AM
To: amd-gfx@lists.freedesktop.org
Cc: Deng, Emily 
Subject: [PATCH v2] drm/amdgpu: Clean sdma wptr register when only
enable wptr polling

The sdma wptr polling memory is not updated fast enough, so the sdma wptr
register can be random and not equal to the sdma rptr, which causes the
sdma engine to hang when loading the driver. Clear the sdma wptr register
directly to fix this issue.

v2: add a comment above the code and correct the coding style
Signed-off-by: Emily Deng 
---
  drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 7 +--
  1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
index 521978c..89ec17c 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
@@ -719,14 +719,17 @@ static int sdma_v3_0_gfx_resume(struct
amdgpu_device *adev)
WREG32(mmSDMA0_GFX_RB_WPTR_POLL_ADDR_HI +
sdma_offsets[i],
   upper_32_bits(wptr_gpu_addr));
wptr_poll_cntl =
RREG32(mmSDMA0_GFX_RB_WPTR_POLL_CNTL + sdma_offsets[i]);
-   if (ring->use_pollmem)
+   if (ring->use_pollmem) {
+   /* wptr polling is not fast enough, directly clear the
wptr register */
+   WREG32(mmSDMA0_GFX_RB_WPTR +
sdma_offsets[i], 0);
wptr_poll_cntl = REG_SET_FIELD(wptr_poll_cntl,

SDMA0_GFX_RB_WPTR_POLL_CNTL,
   ENABLE, 1);
-   else
+   } else {
wptr_poll_cntl = REG_SET_FIELD(wptr_poll_cntl,

SDMA0_GFX_RB_WPTR_POLL_CNTL,
   ENABLE, 0);
+   }
WREG32(mmSDMA0_GFX_RB_WPTR_POLL_CNTL +
sdma_offsets[i], wptr_poll_cntl);

/* enable DMA RB */
--
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 2/4] drm/amdgpu: refactoring mailbox to fix TDR handshake bugs

2018-03-07 Thread Ding, Pixel
Hi Monk,

See comments inline.
— 
Sincerely Yours,
Pixel


On 07/03/2018, 4:58 PM, "Liu, Monk"  wrote:

HI Pixel

The patch had better not be split, otherwise the stress TDR test still fails, 
since there are a couple of issues and some of them are mixed together.

For your concerns:
> Any VF ACK could lead to a missing msg if the timing is bad while the hypervisor sends 
msgs quite frequently and the guest sends acks frequently (such as the case of 
FLR_NOTIFY followed by READY_TO_ACCESS_GPU immediately). Is it correct? If so, 
maybe we need further protection in GIM.

We had a lot of fixes on the GIM side along with this patch to make the stress 
TDR test finally pass; you can watch the history. Anyway, the GIM side is not 
related to this patch.

[Pixel] As talked offline, I understand your point here.

> it’s not clear how the last 2 lines of code here set the 
RCV_MSG_ACK bit: 0x10 to the +1 byte offset. Can we define some macros to make it 
clear?
Can you do that? Looks to me like the MACRO in amdgpu only works for DW-aligned 
register names, so I'm afraid we cannot do that gracefully

[Pixel] some readable macros can help us easily understand which register field 
is accessed.

> Can we also set the valid bit for the IDH_FLR_NOTIFICATION_CMPL event in GIM? By 
now the guest uses peek_msg to catch this event, so I think it’s OK to set valid for 
it. Then the code here would be clear without the CMPL condition. We can also use 
peek_ack to ack CMPL if necessary. Any problem?

I don't understand your point 
[Pixel] Generally, CMPL is not handled by rcv_msg anymore but in peek_msg. We 
can remove the logic here.



-Original Message-
From: Ding, Pixel 
Sent: March 7, 2018 16:31
To: Liu, Monk ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 2/4] drm/amdgpu: refactoring mailbox to fix TDR 
handshake bugs

Hi Monk,

It’s a long patch which could be split into up to 6 pieces… anyway I’ve got 
some questions inline to get a better understanding.

— 
Sincerely Yours,
Pixel


On 06/03/2018, 11:29 AM, "amd-gfx on behalf of Monk Liu" 
 wrote:

this patch actually refactors the mailbox implementation; all
the changes below are needed together to fix the mailbox
handshake issues exposed by heavy TDR testing.

1) refactor all mailbox functions based on byte accessing for mb_control.
The reason is to avoid touching non-related bits when writing the trn/rcv
part of mailbox_control; this way some incorrect INTRs sent to the
hypervisor side can be avoided, and it fixes a couple of handshake bugs.

2) trans_msg function re-implemented: put an invalidate step
before transmitting a message to make sure the ACK bit is in
a clear status, otherwise there is a chance that ACK is already asserted
before transmitting the message, leading to fake ACK polling.
(the hypervisor side has some tricks to work around the ACK bit being
corrupted by VF FLR, which has a side effect that may wrongly assert the
guest side ACK bit), and clear the TRANS_MSG words after the message is
transferred.

3) mailbox_flr_work is also reworked: it takes the mutex lock
first when invoked, to block gpu recovery from participating too early
while the hypervisor side is doing VF FLR. (the hypervisor sends
FLR_NOTIFY to the guest before doing VF FLR and sends FLR_COMPLETE after
VF FLR is done, and the FLR_NOTIFY triggers an interrupt to the guest
which leads to mailbox_flr_work being invoked)

This avoids the issue of the mailbox trans msg being cleared by its own
VF FLR.

4) the mailbox_rcv_irq IRQ routine should only peek the msg and schedule
mailbox_flr_work, instead of ACKing to the hypervisor itself, because the
FLR_NOTIFY msg sent from the hypervisor side doesn't need the VF's ACK
(this is because the VF's ACK would lead the hypervisor to clear its
trans_valid/msg, and this would cause a handshake bug if trans_valid/msg
is cleared not due to a correct VF ACK but due to a wrong VF ACK like
this "FLR_NOTIFY" one)

Any VF ACK could lead to a missing msg if the timing is bad while the
hypervisor sends msgs quite frequently and the guest sends acks frequently
(such as the case of FLR_NOTIFY followed by READY_TO_ACCESS_GPU
immediately). Is it correct? If so, maybe we need further protection in GIM.

This fixed a handshake bug where sometimes the GUEST could never receive
the "READY_TO_ACCESS_GPU" msg from the hypervisor.

5) separate the polling time limits accordingly:
POLL ACK costs no more than 500ms
POLL MSG costs no more than 12000ms
POLL FLR finish costs no more than 500ms

6) we still need to set adev into in_gpu_reset mode after we received

Re: [PATCH 6/6] drm/amdgpu: explicit give BO type to amdgpu_bo_create

2018-03-07 Thread Christian König

On 06.03.2018 at 22:29, Felix Kuehling wrote:

NAK.

For KFD we need the ability to create a BO from an SG list that doesn't
come from another BO. We use this for mapping pages from the doorbell
aperture into GPUVM for GPU self-dispatch.

You can still do this, see amdgpu_gem_prime_import_sg_table.

Just set the BO type to SG and then set the SG table after creating it:

ret = amdgpu_bo_create(adev, attach->dmabuf->size, PAGE_SIZE,
   AMDGPU_GEM_DOMAIN_CPU, 0, ttm_bo_type_sg,
   resv, &bo);
if (ret)
goto error;
  
	bo->tbo.sg = sg;

bo->tbo.ttm->sg = sg;


Then validate the result into the GTT domain to actually use the SG table.
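
The validate step could look roughly like this (a sketch against the
ttm_bo_validate()/placement helpers of this kernel, not the exact import
code; the error label is taken from the snippet above):

	struct ttm_operation_ctx ctx = { false, false };

	amdgpu_ttm_placement_from_domain(bo, AMDGPU_GEM_DOMAIN_GTT);
	ret = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx);
	if (ret)
		goto error;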

BTW: How do you create an SG table for the doorbell?

Regards,
Christian.



If you remove this now, I'll need to add it back in some form in a month
or two when I get to that part of upstreaming KFD.

There may be other ways to implement this. Currently we need to create a
BO for anything we want to map into a GPUVM page table, because the
amdgpu_vm code is based around BOs. If there was a way to map physical
addresses into GPUVM without creating a buffer object, that would work too.

Regards,
   Felix


On 2018-03-06 09:43 AM, Christian König wrote:

Drop the "kernel" and sg parameter and give the BO type to create
explicit to amdgpu_bo_create instead of figuring it out from the
parameters.

Signed-off-by: Christian König 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu.h   |  2 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c|  5 +--
  drivers/gpu/drm/amd/amdgpu/amdgpu_benchmark.c |  8 ++---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c  |  7 ++--
  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c   |  6 ++--
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c| 46 +++
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.h| 11 +++
  drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c |  7 ++--
  drivers/gpu/drm/amd/amdgpu/amdgpu_test.c  | 11 +++
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c   | 11 ---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c|  8 ++---
  11 files changed, 58 insertions(+), 64 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 292c7e72820c..b1116b773516 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -441,7 +441,7 @@ struct amdgpu_sa_bo {
  void amdgpu_gem_force_release(struct amdgpu_device *adev);
  int amdgpu_gem_object_create(struct amdgpu_device *adev, unsigned long size,
 int alignment, u32 initial_domain,
-u64 flags, bool kernel,
+u64 flags, enum ttm_bo_type type,
 struct reservation_object *resv,
 struct drm_gem_object **obj);
  
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c

index 450426dbed92..7f096ed6e83d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -215,8 +215,9 @@ int alloc_gtt_mem(struct kgd_dev *kgd, size_t size,
if ((*mem) == NULL)
return -ENOMEM;
  
-	r = amdgpu_bo_create(adev, size, PAGE_SIZE, true, AMDGPU_GEM_DOMAIN_GTT,

-AMDGPU_GEM_CREATE_CPU_GTT_USWC, NULL, NULL, 
&(*mem)->bo);
+   r = amdgpu_bo_create(adev, size, PAGE_SIZE, AMDGPU_GEM_DOMAIN_GTT,
+AMDGPU_GEM_CREATE_CPU_GTT_USWC, ttm_bo_type_kernel,
+NULL, &(*mem)->bo);
if (r) {
dev_err(adev->dev,
"failed to allocate BO for amdkfd (%d)\n", r);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_benchmark.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_benchmark.c
index 2fb299afc12b..02b849be083b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_benchmark.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_benchmark.c
@@ -80,8 +80,8 @@ static void amdgpu_benchmark_move(struct amdgpu_device *adev, 
unsigned size,
int time;
  
  	n = AMDGPU_BENCHMARK_ITERATIONS;

-   r = amdgpu_bo_create(adev, size, PAGE_SIZE, true, sdomain, 0, NULL,
-NULL, &sobj);
+   r = amdgpu_bo_create(adev, size, PAGE_SIZE, sdomain, 0,
+ttm_bo_type_kernel, NULL, &sobj);
if (r) {
goto out_cleanup;
}
@@ -93,8 +93,8 @@ static void amdgpu_benchmark_move(struct amdgpu_device *adev, 
unsigned size,
if (r) {
goto out_cleanup;
}
-   r = amdgpu_bo_create(adev, size, PAGE_SIZE, true, ddomain, 0, NULL,
-NULL, &dobj);
+   r = amdgpu_bo_create(adev, size, PAGE_SIZE, ddomain, 0,
+ttm_bo_type_kernel, NULL, &dobj);
if (r) {
goto out_cleanup;
}
diff --git 

Re: [PATCH 2/2] drm/amdgpu:Always save uvd vcpu_bo in VM Mode

2018-03-07 Thread Christian König

On 06.03.2018 at 21:14, James Zhu wrote:

When UVD is in VM mode, no uvd handles are exchanged; uvd.handles are
always 0. So the vcpu_bo always needs to be saved. Otherwise the amdgpu
driver will fail during suspend/resume.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105021
Signed-off-by: James Zhu 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
index 9d037cb..61a31e5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
@@ -303,7 +303,8 @@ int amdgpu_uvd_suspend(struct amdgpu_device *adev)
	if (atomic_read(&adev->uvd.handles[i]))
break;
  
-	if (i == adev->uvd.max_handles)

+   /* only valid for physical mode */
+   if (i == adev->uvd.max_handles && adev->asic_type < CHIP_POLARIS10)


Would be nice to completely skip the check on Polaris and newer. In 
other words:


if (adev->asic_type < CHIP_POLARIS10) {
    for (i = 0; i < adev->uvd.max_handles; ++i)
        if (atomic_read(&adev->uvd.handles[i]))
            break;

    if (i == adev->uvd.max_handles)
        return 0;
}

But that is purely nice to have, both patches are Reviewed-by: Christian 
König .


Regards,
Christian.


return 0;
  
  	size = amdgpu_bo_size(adev->uvd.vcpu_bo);


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


RE: [PATCH 2/4] drm/amdgpu: refactoring mailbox to fix TDR handshake bugs

2018-03-07 Thread Liu, Monk
HI Pixel

The patch had better not be split, otherwise the stress TDR test still fails, 
since there are a couple of issues and some of them are mixed together.

For your concerns:
> Any VF ACK could lead to a missing msg if the timing is bad while the hypervisor sends 
> msgs quite frequently and the guest sends acks frequently (such as the case of FLR_NOTIFY 
> followed by READY_TO_ACCESS_GPU immediately). Is it correct? If so, maybe we 
> need further protection in GIM.

We had a lot of fixes on the GIM side along with this patch to make the stress TDR 
test finally pass; you can watch the history. Anyway, the GIM side is not related 
to this patch.


> it’s not clear how the last 2 lines of code here set the RCV_MSG_ACK 
> bit: 0x10 to the +1 byte offset. Can we define some macros to make it clear?
Can you do that? Looks to me like the MACRO in amdgpu only works for DW-aligned 
register names, so I'm afraid we cannot do that gracefully


> Can we also set the valid bit for the IDH_FLR_NOTIFICATION_CMPL event in GIM? By now 
> the guest uses peek_msg to catch this event, so I think it’s OK to set valid for 
> it. Then the code here would be clear without the CMPL condition. We can also use 
> peek_ack to ack CMPL if necessary. Any problem?

I don't understand your point 




-Original Message-
From: Ding, Pixel 
Sent: March 7, 2018 16:31
To: Liu, Monk ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 2/4] drm/amdgpu: refactoring mailbox to fix TDR handshake 
bugs

Hi Monk,

It’s a long patch which could be split into up to 6 pieces… anyway I’ve got some 
questions inline to get a better understanding.

— 
Sincerely Yours,
Pixel


On 06/03/2018, 11:29 AM, "amd-gfx on behalf of Monk Liu" 
 wrote:

this patch actually refactors the mailbox implementation; all
the changes below are needed together to fix the mailbox
handshake issues exposed by heavy TDR testing.

1) refactor all mailbox functions based on byte accessing for mb_control.
The reason is to avoid touching non-related bits when writing the trn/rcv
part of mailbox_control; this way some incorrect INTRs sent to the
hypervisor side can be avoided, and it fixes a couple of handshake bugs.

2) trans_msg function re-implemented: put an invalidate step
before transmitting a message to make sure the ACK bit is in
a clear status, otherwise there is a chance that ACK is already asserted
before transmitting the message, leading to fake ACK polling.
(the hypervisor side has some tricks to work around the ACK bit being
corrupted by VF FLR, which has a side effect that may wrongly assert the
guest side ACK bit), and clear the TRANS_MSG words after the message is
transferred.

3) mailbox_flr_work is also reworked: it takes the mutex lock
first when invoked, to block gpu recovery from participating too early
while the hypervisor side is doing VF FLR. (the hypervisor sends
FLR_NOTIFY to the guest before doing VF FLR and sends FLR_COMPLETE after
VF FLR is done, and the FLR_NOTIFY triggers an interrupt to the guest
which leads to mailbox_flr_work being invoked)

This avoids the issue of the mailbox trans msg being cleared by its own
VF FLR.

4) the mailbox_rcv_irq IRQ routine should only peek the msg and schedule
mailbox_flr_work, instead of ACKing to the hypervisor itself, because the
FLR_NOTIFY msg sent from the hypervisor side doesn't need the VF's ACK
(this is because the VF's ACK would lead the hypervisor to clear its
trans_valid/msg, and this would cause a handshake bug if trans_valid/msg
is cleared not due to a correct VF ACK but due to a wrong VF ACK like
this "FLR_NOTIFY" one)

Any VF ACK could lead to a missing msg if the timing is bad while the
hypervisor sends msgs quite frequently and the guest sends acks frequently
(such as the case of FLR_NOTIFY followed by READY_TO_ACCESS_GPU
immediately). Is it correct? If so, maybe we need further protection in GIM.

This fixed a handshake bug where sometimes the GUEST could never receive
the "READY_TO_ACCESS_GPU" msg from the hypervisor.

5) separate the polling time limits accordingly:
POLL ACK costs no more than 500ms
POLL MSG costs no more than 12000ms
POLL FLR finish costs no more than 500ms

6) we still need to set adev into in_gpu_reset mode after we receive
FLR_NOTIFY from the host side; this prevents innocent apps from wrongly
succeeding to open the amdgpu dri device.

FLR_NOTIFY is received due to an IDLE hang detected on the hypervisor
side, which indicates the GPU has already died in this VF.

Change-Id: I17df8b4490a5b53a1cc2bd6c8f9bc3ee14c23f1a
Signed-off-by: Monk Liu 
---
 drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c | 200 
++
 drivers/gpu/drm/amd/amdgpu/mxgpu_ai.h |   4 +-
 2 files changed, 111 insertions(+), 93 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c 
b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
index 271452d..8d09380 100644
   

Re: [PATCH xf86-video-ati] Only change Set/MoveCursor hooks from what we expect

2018-03-07 Thread Michel Dänzer
On 2018-03-07 04:45 AM, Mario Kleiner wrote:
> On 03/05/2018 10:56 PM, Alex Deucher wrote:
>> On Mon, Mar 5, 2018 at 12:44 PM, Michel Dänzer 
>> wrote:
>>> From: Michel Dänzer 
>>>
>>> Since xf86CursorCloseScreen runs after RADEONCloseScreen_KMS,
>>> PointPriv->spriteFuncs doesn't point to the same struct in the latter as
>>> in RADEONCursorInit_KMS. So we were restoring info->Set/MoveCursor to
>>> the wrong struct. Then in the next server generation,
>>> info->Set/MoveCursor would end up pointing to
>>> drmmode_sprite_set/move_cursor, resulting in an infinite loop if one of
>>> them was called.
>>>
>>> To avoid this, only change the Set/MoveCursor hooks if their values
>>> match our expectations, otherwise leave them as is. This is kind of a
>>> hack, but the alternative would be invasive and thus risky changes to
>>> the way we're wrapping CloseScreen, and it's not even clear that can
>>> work without changing xserver code.
>>>
>>> Fixes: 1fe8ca75974c ("Keep track of how many SW cursors are visible on
>>>    each screen")
>>> Signed-off-by: Michel Dänzer 
>>
>> Acked-by: Alex Deucher 
>>
> 
> Nope, not quite, unfortunately. Tested against x-server master, mesa
> master, ati-ddx master, with sddm login manager. With a freshly started
> server, now on a dual-x-screen setup, instead of an infinite loop, i get
> a server crash as soon as i move the mouse cursor from X-Screen 0 to
> X-Screen 1:

Well, that's not the same issue I was seeing after all. I'll take
another look.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: amdgpu with 8+ cards for GPU mining?

2018-03-07 Thread Christian König

Well, that is a very interesting use case you've got there.

Just out of curiosity, can we get a dmesg and the content of /proc/iomem 
from that working system?


Thanks,
Christian.

On 07.03.2018 at 02:02, Joseph Wang wrote:
Thanks for all of the help.  I've purchased a Gigabyte GA-B250-Fintech 
motherboard, and was able to get 9 cards running on the system.  So 
the 8-card limit does appear to be a BIOS issue with the ASUS B250 
Mining Expert.


Here are some of my experiences with mining motherboards.

https://bitquant.wordpress.com/2018/03/07/mining-motherboards-notes/





___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH 1/2] drm/amdgpu: Enable scatter gather display support

2018-03-07 Thread Michel Dänzer
On 2018-03-06 07:23 PM, Christian König wrote:
> On 06.03.2018 at 19:15, Michel Dänzer wrote:
>> On 2018-03-06 07:02 PM, Christian König wrote:
>>> On 06.03.2018 at 18:51, Michel Dänzer wrote:
 On 2018-03-06 06:44 PM, Christian König wrote:
> On 06.03.2018 at 18:22, Li, Samuel wrote:
>>> addition to that I agree with Michel that the module parameter is
>>> overkill.
>> That I already explained. Currently SG display feature needs to
>> provide options for all kinds of use cases. All amd drivers so far
>> provide options, and the default configuration is actually to
>> allocate everything from GTT when possible.
> That isn't the job of the kernel, to have this parameter. That said, we
> should probably add a DDX and/or environment option to control that.
 Why would we need that? It's the kernel driver's job to handle this as
 needed.
>>> Buffer placement is specified by the DDX/Mesa, when we override that in
>>> the kernel we just force unnecessary buffer moves.
>> Userspace just needs a way to know which domains can/should be used for
>> scanout buffers, e.g. via an info ioctl.
> 
> Yeah, but why shouldn't they then make the decision where to place it as
> well?
> 
> See when we enforce a certain placement in the kernel we will just get
> an unnecessary buffer move with old user space components.
> 
> In other words the kernel just needs to advertise that the hw/sw
> combination allows scanout from GTT as well and then an updated
> userspace can actually make use of that placement or decide to not use
> it in certain situations.

I think we'll need to always pin BOs to GTT for scanout in some cases,
because some hardware has issues when transitioning between scanout from
VRAM and GTT.

How about:

Userspace can query from the kernel which domain(s) can be used for
scanout: only VRAM / VRAM or GTT / only GTT. Userspace sets the allowed
domains correspondingly for scanout BOs.
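
In userspace that flow might look like this (a sketch; both helpers are
invented for illustration, only the AMDGPU_GEM_DOMAIN_* flags are real):

	uint32_t allowed = query_scanout_domains(fd); /* info ioctl wrapper */
	uint32_t domains = AMDGPU_GEM_DOMAIN_VRAM;

	/* add GTT only when the kernel advertises scanout from GTT, so an
	 * old kernel keeps the VRAM-only behaviour and nothing gets
	 * force-moved */
	if (allowed & AMDGPU_GEM_DOMAIN_GTT)
		domains |= AMDGPU_GEM_DOMAIN_GTT;

	create_scanout_bo(fd, size, domains);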

If you can think of a scenario where this won't work, please
specifically describe it and how you propose addressing it.


> E.g. the last time I tested it placing things into GTT still resulted
> in quite a performance penalty for rendering.

FWIW, I think the penalty is most likely IOMMU related. Last time I
tested, I couldn't measure a big difference with IOMMU disabled.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: New KFD ioctls: taking the skeletons out of the closet

2018-03-07 Thread Christian König

On 07.03.2018 at 00:34, Jerome Glisse wrote:

On Tue, Mar 06, 2018 at 05:44:41PM -0500, Felix Kuehling wrote:

Hi all,

Christian raised two potential issues in a recent KFD upstreaming code
review that are related to the KFD ioctl APIs:

  1. behaviour of -ERESTARTSYS
  2. transactional nature of KFD ioctl definitions, or lack thereof

I appreciate constructive feedback, but I also want to encourage an
open-minded rather than a dogmatic approach to API definitions. So let
me take all the skeletons out of my closet and get these APIs reviewed
in the appropriate forum before we commit to them upstream. See the end
of this email for reference.

The controversial part at this point is kfd_ioctl_map_memory_to_gpu. If
any of the other APIs raise concerns or questions, please ask.

Because of the HSA programming model, KFD memory management APIs are
synchronous. There is no pipelining. Command submission to GPUs through
user mode queues does not involve KFD. This means KFD doesn't know what
memory is used by the GPUs and when it's used. That means, when the
map_memory_to_gpu ioctl returns to user mode, all memory mapping
operations are complete and the memory can be used by the CPUs or GPUs
immediately.

HSA also uses a shared virtual memory model, so typically memory gets
mapped on multiple GPUs and CPUs at the same virtual address.

Does this mean that GPU memory gets pinned? Or system memory, for that matter.
This was discussed previously, but this really goes against the kernel mantra,
i.e. the kernel no longer manages resources, and userspace can hog GPU memory
or even system memory. This is bad!


Fortunately this time it is not about pinning.

All BOs which are part of the VM get an eviction fence object attached 
when a user space queue is created.


Now when TTM needs to evict one of those buffer objects it will try to wait 
for this fence object, which in turn unmaps the user space queues from the 
hardware and waits for running work to finish.


After that TTM can move the BO around just like any normal GFX BO.
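
A very rough sketch of that mechanism (the fence type, helper and work
item are assumptions for illustration, not the actual amdkfd code):

static bool kfd_eviction_fence_enable_signaling(struct dma_fence *f)
{
	struct kfd_eviction_fence *fence = to_kfd_eviction_fence(f);

	if (dma_fence_is_signaled(f))
		return false;

	/* TTM is about to wait on this fence before moving a KFD BO:
	 * schedule work that unmaps the user space queues, waits for
	 * running dispatches and then signals the fence, so the
	 * eviction can proceed */
	schedule_work(&fence->eviction_work);
	return true;
}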

Regards,
Christian.



Cheers,
Jérôme


___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: New KFD ioctls: taking the skeletons out of the closet

2018-03-07 Thread Christian König

On 07.03.2018 at 00:09, Dave Airlie wrote:

On 7 March 2018 at 08:44, Felix Kuehling  wrote:

Hi all,

Christian raised two potential issues in a recent KFD upstreaming code
review that are related to the KFD ioctl APIs:

  1. behaviour of -ERESTARTSYS
  2. transactional nature of KFD ioctl definitions, or lack thereof

I appreciate constructive feedback, but I also want to encourage an
open-minded rather than a dogmatic approach to API definitions. So let
me take all the skeletons out of my closet and get these APIs reviewed
in the appropriate forum before we commit to them upstream. See the end
of this email for reference.

The controversial part at this point is kfd_ioctl_map_memory_to_gpu. If
any of the other APIs raise concerns or questions, please ask.

Because of the HSA programming model, KFD memory management APIs are
synchronous. There is no pipelining. Command submission to GPUs through
user mode queues does not involve KFD. This means KFD doesn't know what
memory is used by the GPUs and when it's used. That means, when the
map_memory_to_gpu ioctl returns to user mode, all memory mapping
operations are complete and the memory can be used by the CPUs or GPUs
immediately.

I've got a few opinions, but first up I still dislike user-mode queues
and everything
they entail. I still feel they are solving a Windows problem and not a
Linux problem,
and it would be nice if we had some Linux numbers on what they gain us over
a dispatch ioctl, because they sure bring a lot of memory management issues.


Well user-mode queues are a problem as long as you don't have 
recoverable page faults on the GPU.


As soon as you have recoverable page faults and push the memory 
management towards things like HMM, I don't see an advantage of using 
IOCTL-based command submission any more.


So I would say that this is a problem which is slowly going away as the 
hardware improves.



That said amdkfd is here.

The first question you should ask (which you haven't asked here at all) is
what should userspace do with the ioctl result.


HSA also uses a shared virtual memory model, so typically memory gets
mapped on multiple GPUs and CPUs at the same virtual address.

The point of contention seems to be the ability to map memory to
multiple GPUs in a single ioctl and the behaviour in failure cases. I'll
discuss two main failure cases:

1: Failure after all mappings have been dispatched via SDMA, but a
signal interrupts the wait for completion and we return -ERESTARTSYS.
Documentation/kernel-hacking/hacking.rst only says "[...] you should be
prepared to process the restart, e.g. if you're in the middle of
manipulating some data structure." I think we do that by ensuring that
memory that's already mapped won't be mapped again. So the restart will
become a no-op and just end up waiting for all the previous mappings to
complete.

-ERESTARTSYS at that late stage points to a badly synchronous API,
I'd have said you should have two ioctls, one that returns a fence after
starting the processes, and one that waits for the fence separately.


That is exactly what I suggested as well, but also exactly what Felix 
tries to avoid :)



The overhead of ioctls isn't your enemy until you've measured it and
proven it's a problem.


Christian has a stricter requirement, and I'd like to know where that
comes from: "An interrupted IOCTL should never have a visible effect."

Christian might be taking things a bit further, but synchronous gpu access
APIs are bad, and I don't think undoing a bunch of work is a good plan either
just because you got ERESTARTSYS. If you get ERESTARTSYS, can you
handle it? If I've fired off 5 SDMAs and wait for them, will I fire off 5 more?
Will I wait for the original SDMAs if I reenter?


Well it's not only the waiting for the SDMAs. If I understood it 
correctly the IOCTL proposed by Felix allows adding multiple mappings of 
buffer objects on multiple devices with just one IOCTL.


Now the problem is that without a lot of redesign of the driver this can fail 
at any place in between those operations. E.g. we could run out of 
memory, hit a permission restriction, an invalid handle, etc.


What would happen is that we end up with a half-completed IOCTL.

A possible solution might be that we could maybe add some kind of 
feedback noting which operations are already complete and then only 
retrying the one which failed.
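
Hypothetically such feedback could be a cursor in the ioctl argument
block (the field names here are invented, not the real KFD ABI):

struct kfd_ioctl_map_memory_to_gpu_args {
	__u64 handle;		/* in: buffer to map */
	__u64 device_ids_ptr;	/* in: array of target GPU ids */
	__u32 n_devices;	/* in: entries in that array */
	__u32 n_success;	/* in/out: mappings completed so far; a
				 * restart after -ERESTARTSYS resumes here
				 * instead of redoing finished operations */
};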



2: Failure to map on some but not all GPUs. This comes down to the
question, do all ioctl APIs or system calls in general need to be
transactional? As a counter example I'd give incomplete read or write
system calls that return how much was actually read or written. Our
current implementation of map_memory_to_gpu doesn't do this, but it
could be modified to return to user mode how many of the mappings, or
which mappings specifically failed or succeeded.

What should userspace do if it only gets mappings on 3 of the gpus instead
of, say, 4? Is there a sane resolution other 

Re: [PATCH 2/4] drm/amdgpu: refactoring mailbox to fix TDR handshake bugs

2018-03-07 Thread Ding, Pixel
Hi Monk,

It’s a long patch which could be split into up to 6 pieces… anyway I’ve got some 
questions inline to get a better understanding.

— 
Sincerely Yours,
Pixel


On 06/03/2018, 11:29 AM, "amd-gfx on behalf of Monk Liu" 
 wrote:

this patch actually refactors the mailbox implementation; all
the changes below are needed together to fix the mailbox
handshake issues exposed by heavy TDR testing.

1) refactor all mailbox functions based on byte accessing for mb_control.
The reason is to avoid touching non-related bits when writing the trn/rcv
part of mailbox_control; this way some incorrect INTRs sent to the
hypervisor side can be avoided, and it fixes a couple of handshake bugs.

2) trans_msg function re-implemented: put an invalidate step
before transmitting a message to make sure the ACK bit is in
a clear status, otherwise there is a chance that ACK is already asserted
before transmitting the message, leading to fake ACK polling.
(the hypervisor side has some tricks to work around the ACK bit being
corrupted by VF FLR, which has a side effect that may wrongly assert the
guest side ACK bit), and clear the TRANS_MSG words after the message is
transferred.

3) mailbox_flr_work is also reworked: it takes the mutex lock
first when invoked, to block gpu recovery from participating too early
while the hypervisor side is doing VF FLR. (the hypervisor sends
FLR_NOTIFY to the guest before doing VF FLR and sends FLR_COMPLETE after
VF FLR is done, and the FLR_NOTIFY triggers an interrupt to the guest
which leads to mailbox_flr_work being invoked)

This avoids the issue of the mailbox trans msg being cleared by its own
VF FLR.

4) the mailbox_rcv_irq IRQ routine should only peek the msg and schedule
mailbox_flr_work, instead of ACKing to the hypervisor itself, because the
FLR_NOTIFY msg sent from the hypervisor side doesn't need the VF's ACK
(this is because the VF's ACK would lead the hypervisor to clear its
trans_valid/msg, and this would cause a handshake bug if trans_valid/msg
is cleared not due to a correct VF ACK but due to a wrong VF ACK like
this "FLR_NOTIFY" one)

Any VF ACK could lead to a missing msg if the timing is bad while the
hypervisor sends msgs quite frequently and the guest sends acks frequently
(such as the case of FLR_NOTIFY followed by READY_TO_ACCESS_GPU
immediately). Is it correct? If so, maybe we need further protection in GIM.

This fixed a handshake bug where sometimes the GUEST could never receive
the "READY_TO_ACCESS_GPU" msg from the hypervisor.

5) separate the polling time limits accordingly:
POLL ACK costs no more than 500ms
POLL MSG costs no more than 12000ms
POLL FLR finish costs no more than 500ms

6) we still need to set adev into in_gpu_reset mode after we receive
FLR_NOTIFY from the host side; this prevents innocent apps from wrongly
succeeding to open the amdgpu dri device.

FLR_NOTIFY is received due to an IDLE hang detected on the hypervisor
side, which indicates the GPU has already died in this VF.

Change-Id: I17df8b4490a5b53a1cc2bd6c8f9bc3ee14c23f1a
Signed-off-by: Monk Liu 
---
 drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c | 200 
++
 drivers/gpu/drm/amd/amdgpu/mxgpu_ai.h |   4 +-
 2 files changed, 111 insertions(+), 93 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c 
b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
index 271452d..8d09380 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
@@ -33,56 +33,42 @@
 
 static void xgpu_ai_mailbox_send_ack(struct amdgpu_device *adev)
 {
-   u32 reg;
-   int timeout = AI_MAILBOX_TIMEDOUT;
-   u32 mask = REG_FIELD_MASK(BIF_BX_PF0_MAILBOX_CONTROL, RCV_MSG_VALID);
-
-   reg = RREG32_NO_KIQ(SOC15_REG_OFFSET(NBIO, 0,
-mmBIF_BX_PF0_MAILBOX_CONTROL));
-   reg = REG_SET_FIELD(reg, BIF_BX_PF0_MAILBOX_CONTROL, RCV_MSG_ACK, 1);
-   WREG32_NO_KIQ(SOC15_REG_OFFSET(NBIO, 0,
-  mmBIF_BX_PF0_MAILBOX_CONTROL), reg);
-
-   /*Wait for RCV_MSG_VALID to be 0*/
-   reg = RREG32_NO_KIQ(SOC15_REG_OFFSET(NBIO, 0,
-mmBIF_BX_PF0_MAILBOX_CONTROL));
-   while (reg & mask) {
-   if (timeout <= 0) {
-   pr_err("RCV_MSG_VALID is not cleared\n");
-   break;
-   }
-   mdelay(1);
-   timeout -=1;
-
-   reg = RREG32_NO_KIQ(SOC15_REG_OFFSET(NBIO, 0,
-
mmBIF_BX_PF0_MAILBOX_CONTROL));
-   }
  + const u32 offset = SOC15_REG_OFFSET(NBIO, 0, 
mmBIF_BX_PF0_MAILBOX_CONTROL) * 4 + 1;
  + WREG8(offset, 2);
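  /* How this works, assuming the Vega10 register layout:
   * SOC15_REG_OFFSET() returns a dword register index, so "* 4" turns it
   * into a byte address and "+ 1" selects byte 1 of MAILBOX_CONTROL,
   * which holds the RCV fields; writing 2 (bit 1 of that byte) then sets
   * RCV_MSG_ACK without a read-modify-write of the TRN byte. */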

it’s not clear how the last 2 lines of code here set the RCV_MSG_ACK 
bit: 0x10 to +1