amdgpu driver, question about temporal dithering.

2018-01-01 Thread JTL
Hello

I'm inquiring about the amdgpu driver and temporal dithering. Let me
give you a quick overview of the relevant parts of my setup.

Monitor: BenQ GW2760HS (27" VA panel, native 8-bit, no FRC) connected
via DVI
GPU: ASUS R9 270X
OS: Debian 8.x Jessie, also use Ubuntu MATE 16.04

With this setup, even though the monitor is native 8-bit, I notice that
on dark grey and gradient images the pixels are unstable and appear to
be "moving" with vertical banding, a telltale sign of dithering in my
experience.

I know there is supposedly a way to disable dithering on the older fglrx
driver. Unfortunately, fglrx doesn't work with the new Xorg ABI that
newer Linux distributions use, so I am unable to test it.

https://forums.guru3d.com/threads/how-to-disable-dithering-in-linux.387362/

I cloned the kernel source from this repository and edited some of the
dce_* files under drivers/gpu/drm/amd/amdgpu, where there appear to be
case statements that control the dithering done by the GPU (just an
educated guess). I changed those case statements to do nothing instead
of setting the dithering registers, and managed to recompile the kernel.
Unfortunately that didn't seem to disable the temporal dithering, so
either I've patched the wrong functions for my GPU or something else is
going on. Frankly, I wonder if AMD cards dither by default independent
of the running graphics driver, as I think I saw the dithering on the
grey boot screen (Plymouth) when I was running Ubuntu MATE.

I'll try to get a microscope and/or hand lens to look at the LCD
subpixels later.

https://i.ytimg.com/vi/fANsyzPcXyM/maxresdefault.jpg

```
+++ b/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c
@@ -467,28 +467,6 @@ static void dce_v6_0_program_fmt(struct drm_encoder *encoder)

    switch (bpc) {
-   case 6:
-       if (dither == AMDGPU_FMT_DITHER_ENABLE)
-           /* XXX sort out optimal dither settings */
-           tmp |= (FMT_BIT_DEPTH_CONTROL__FMT_FRAME_RANDOM_ENABLE_MASK |
-                   FMT_BIT_DEPTH_CONTROL__FMT_HIGHPASS_RANDOM_ENABLE_MASK |
-                   FMT_BIT_DEPTH_CONTROL__FMT_SPATIAL_DITHER_EN_MASK);
-       else
-           tmp |= FMT_BIT_DEPTH_CONTROL__FMT_TRUNCATE_EN_MASK;
-       break;

```
[sic]

```
default:
/* not needed */
break;
```
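
To make the intent of the change above easier to follow, here is a minimal userspace sketch of the decision the removed case makes. The bit positions, the `demo_` names and the enum are made up for illustration; the real FMT_BIT_DEPTH_CONTROL masks come from the driver's register headers and are not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit positions, for illustration only. */
#define DEMO_TRUNCATE_EN        (1u << 0)
#define DEMO_SPATIAL_DITHER_EN  (1u << 8)
#define DEMO_FRAME_RANDOM_EN    (1u << 13)
#define DEMO_HIGHPASS_RANDOM_EN (1u << 15)

enum demo_dither { DEMO_DITHER_DISABLE, DEMO_DITHER_ENABLE };

/* Build a bit-depth control word for the given bits per component.
 * A return value of 0 means neither dithering nor truncation was
 * requested, which is what every depth gets once its case is removed. */
uint32_t demo_fmt_control(int bpc, enum demo_dither dither)
{
    uint32_t tmp = 0;

    assert(bpc > 0);

    switch (bpc) {
    case 6:
        if (dither == DEMO_DITHER_ENABLE)
            tmp |= DEMO_FRAME_RANDOM_EN |
                   DEMO_HIGHPASS_RANDOM_EN |
                   DEMO_SPATIAL_DITHER_EN;
        else
            tmp |= DEMO_TRUNCATE_EN;
        break;
    default:
        /* not needed */
        break;
    }
    return tmp;
}
```

Note that in this sketch, as in the removed lines, turning dithering off still sets a truncation bit; only dropping the case entirely leaves the control word untouched.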

My next goal is to figure out how to do live kernel debugging from
another physical computer to see which functions are being hit with
regard to dithering and color depth (as I could have edited a function
that isn't called for the R9 270X I have).

I'm actually working on a similar project involving MacBook Pros, OS X
and dithering. I have two 2015 MacBook Pros and am working on reverse
engineering the AMD driver kexts to try to disable dithering on OS X, as
it almost always happens with an external display. I have live debugging
set up with lldb over a LAN, with one computer being the host and another
being the target; it seems that's not trivial to do under Linux.

I know the motherboard inside my desktop has several COM port headers so
I might be able to do kernel debugging of the GPU driver with the serial
port and another computer. Something to look into.

As for why I am doing this: first, I am a photographer (one of
many hobbies) and like having accurate colors in my workflow. Second,
I used to get bad headaches from certain visual stimuli due to
nervous system issues. To keep a long story short, I've largely gotten
that taken care of, but I'm working on disabling dithering to ensure it's
not contributing to eye fatigue. Third, it's been an
interesting challenge for myself.

Many thanks


-- 
JTL

Website: https://jtl.pw
Email: j...@teamclassified.ca
(other contact methods available on request)

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: Deadlocks with multiple applications on AMD RX 460 and RX 550 - Update 2

2018-01-01 Thread Chunming Zhou

Did you try it on an x86 board? Is there the same issue there?

We should identify whether it is ARM-specific or a general issue for the
amdgpu driver.


Thanks,

David Zhou


On 2018-01-02 00:32, Luís Mendes wrote:

I am currently testing the amdgpu driver with AMD RX460 and RX550
graphics cards on an ARM Cortex-A9 with 1GB RAM and I am consistently
getting deadlocks when playing videos with Kodi or other applications.

I'm using Linux kernel from
https://cgit.freedesktop.org/~agd5f/linux/, branch drm-next-4.16 at
commit "drm/amdgpu: Correct the IB size of bo update mapping" -
104bd2ca1124dfd9aa904d5f5a96253ef2b580f6  along with libdrm-2.4.89 and
mesa-17.3.1 on an Ubuntu 17.10 with Mate desktop and Lightdm session
manager over X11.


I am consistently getting deadlocks, which are sometimes almost
immediate but sometimes take about half an hour to occur. Some of
the video files that I am using for testing are more likely to
cause a deadlock than others.

I got some kernel crash dumps, kodi process backtraces for the
offending thread and the deadlocked process tree listing, which I
attach here. The kernel seems to deadlock during a page flip,
waiting indefinitely for the DMA fence to complete; however, it
never does and the timeout doesn't expire either, so this may be a
GPU lockup.

I can provide more details, if needed, if there is interest or time to
look into this.

Regards,
Luís Mendes
Software and Hardware engineer

[  253.904103] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, last signaled seq=43831, last emitted seq=43833
[  253.915041] [drm] IP block:gmc_v8_0 is hung!
[  253.915047] [drm] IP block:gfx_v8_0 is hung!
[  253.915162] [drm] GPU recovery disabled.
[  366.541614] INFO: task kworker/u4:4:90 blocked for more than 120 seconds.
[  366.548436]   Not tainted 4.15.0-rc4-drmnext2g #1
[  366.554300] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  366.562162] kworker/u4:4D090  2 0x
[  366.562196] Workqueue: events_unbound commit_work [drm_kms_helper]
[  366.562215] [<80b8c6d4>] (__schedule) from [<80b8cdd0>] (schedule+0x4c/0xac)
[  366.562223] [<80b8cdd0>] (schedule) from [<80b91024>] (schedule_timeout+0x228/0x444)
[  366.562233] [<80b91024>] (schedule_timeout) from [<80886738>] (dma_fence_default_wait+0x2b4/0x2d8)
[  366.562241] [<80886738>] (dma_fence_default_wait) from [<80885d60>] (dma_fence_wait_timeout+0x40/0x150)
[  366.562248] [<80885d60>] (dma_fence_wait_timeout) from [<80887b1c>] (reservation_object_wait_timeout_rcu+0xfc/0x34c)
[  366.562476] [<80887b1c>] (reservation_object_wait_timeout_rcu) from [<7f2d3988>] (amdgpu_dm_do_flip+0xec/0x36c [amdgpu])
[  366.562754] [<7f2d3988>] (amdgpu_dm_do_flip [amdgpu]) from [<7f2d509c>] (amdgpu_dm_atomic_commit_tail+0xbfc/0xe58 [amdgpu])
[  366.562908] [<7f2d509c>] (amdgpu_dm_atomic_commit_tail [amdgpu]) from [<7f13e58c>] (commit_tail+0x50/0x94 [drm_kms_helper])
[  366.562931] [<7f13e58c>] (commit_tail [drm_kms_helper]) from [<7f13e5ec>] (commit_work+0x1c/0x20 [drm_kms_helper])
[  366.562948] [<7f13e5ec>] (commit_work [drm_kms_helper]) from [<8016f4c8>] (process_one_work+0x1a8/0x4ac)
[  366.562955] [<8016f4c8>] (process_one_work) from [<8017050c>] (worker_thread+0x68/0x598)
[  366.562962] [<8017050c>] (worker_thread) from [<80175e50>] (kthread+0x16c/0x174)
[  366.562970] [<80175e50>] (kthread) from [<80109de8>] (ret_from_fork+0x14/0x2c)


 From userland side:
(gdb) info thread
   Id   Target Id Frame
* 1Thread 0x6eb17c70 (LWP 2071) "kodi.bin" 0x748b2246 in ioctl ()
 at ../sysdeps/unix/syscall-template.S:84
   2Thread 0x6eb14170 (LWP 2072) "Announce" __libc_do_syscall ()
 at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
   3Thread 0x6e1ff170 (LWP 2075) "ActiveAE" __libc_do_syscall ()
 at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
   4Thread 0x6d9ff170 (LWP 2076) "AESink" __libc_do_syscall ()
 at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
   5Thread 0x6b7c9170 (LWP 2081) "amdgpu_cs:0" __libc_do_syscall ()
 at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
   6Thread 0x6ae3c170 (LWP 2082) "disk_cache:0" __libc_do_syscall
()
 at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
   7Thread 0x571df170 (LWP 2083) "si_shader:0" __libc_do_syscall ()
 at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
   8Thread 0x569df170 (LWP 2084) "si_shader_low:0"
__libc_do_syscall ()
 at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
   9Thread 0x561df170 (LWP 2085) "gallium_drv:0" __libc_do_syscall
()
 at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
   10   Thread 0x551f6170 (LWP 2086) "kodi.bin" __libc_do_syscall ()
 at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
   11   Thread 0x549f6170 (LWP 2087) "PeripBusUSBUdev"
__libc_do_syscall ()
 at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
---Type <return> to continue, or q <return> to quit---
   12   Thread 0x541f6170 

Re: [PATCH] drm/amdgpu/gfx9: only init the apertures used by KGD

2018-01-01 Thread Chunming Zhou

Reviewed-by: Chunming Zhou 


On 2018-01-02 05:17, Alex Deucher wrote:

Use adev->vm_manager.id_mgr[0].num_ids rather than hardcoded 16.

Noticed-by: Felix Kuehling 
Signed-off-by: Alex Deucher 
---
  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 55670dbacace..4abaf802a260 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -1526,7 +1526,7 @@ static void gfx_v9_0_gpu_init(struct amdgpu_device *adev)
    /* XXX SH_MEM regs */
    /* where to put LDS, scratch, GPUVM in FSA64 space */
    mutex_lock(&adev->srbm_mutex);
-   for (i = 0; i < 16; i++) {
+   for (i = 0; i < adev->vm_manager.id_mgr[0].num_ids; i++) {
        soc15_grbm_select(adev, 0, 0, 0, i);
        /* CP and shaders */
        if (i == 0) {




RE: [PATCH 2/4] drm/amd/powerplay: remove unused parameter of phm_start_thermal_controller

2018-01-01 Thread Quan, Evan
Hi Alex,

Sorry, I don't follow your point. The patch removes the 2nd parameter,
'temperature_range', which is always passed in as NULL.
Do you mean someone sent another patch which passes a non-NULL value for this parameter?

Regards,
Evan
From: Deucher, Alexander
Sent: Sunday, December 31, 2017 2:31 AM
To: Quan, Evan ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 2/4] drm/amd/powerplay: remove unused parameter of 
phm_start_thermal_controller


Should this be used to set the default critical temperatures?  IIRC, they were 
set to 0 until last week when someone sent a patch to fix them.



Alex


From: amd-gfx on behalf of Evan Quan
Sent: Friday, December 29, 2017 2:44 AM
To: amd-gfx@lists.freedesktop.org
Cc: Quan, Evan
Subject: [PATCH 2/4] drm/amd/powerplay: remove unused parameter of 
phm_start_thermal_controller

Change-Id: Id6039cb50b73bdf8a6df37e5383f4bea4ae737ed
Signed-off-by: Evan Quan
---
 drivers/gpu/drm/amd/powerplay/hwmgr/hardwaremanager.c | 14 +++---
 drivers/gpu/drm/amd/powerplay/hwmgr/hwmgr.c   |  4 ++--
 drivers/gpu/drm/amd/powerplay/inc/hardwaremanager.h   |  2 +-
 3 files changed, 6 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/hardwaremanager.c b/drivers/gpu/drm/amd/powerplay/hwmgr/hardwaremanager.c
index 623cff9..cba0aee 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/hardwaremanager.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/hardwaremanager.c
@@ -220,20 +220,12 @@ int phm_register_thermal_interrupt(struct pp_hwmgr *hwmgr, const void *info)
 * Initializes the thermal controller subsystem.
 *
 * @param    pHwMgr  the address of the powerplay hardware manager.
-* @param    pTemperatureRange the address of the structure holding the temperature range.
 * @exception PP_Result_Failed if any of the paramters is NULL, otherwise the return value from the dispatcher.
 */
-int phm_start_thermal_controller(struct pp_hwmgr *hwmgr, struct PP_TemperatureRange *temperature_range)
+int phm_start_thermal_controller(struct pp_hwmgr *hwmgr)
 {
-   struct PP_TemperatureRange range;
-
-   if (temperature_range == NULL) {
-   range.max = TEMP_RANGE_MAX;
-   range.min = TEMP_RANGE_MIN;
-   } else {
-   range.max = temperature_range->max;
-   range.min = temperature_range->min;
-   }
+   struct PP_TemperatureRange range = {{TEMP_RANGE_MIN, TEMP_RANGE_MAX}};
+
 if (phm_cap_enabled(hwmgr->platform_descriptor.platformCaps,
 PHM_PlatformCaps_ThermalController)
 && hwmgr->hwmgr_func->start_thermal_controller != NULL)
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/hwmgr.c b/drivers/gpu/drm/amd/powerplay/hwmgr/hwmgr.c
index 08b7963..38f7d0d 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/hwmgr.c
@@ -265,7 +265,7 @@ int hwmgr_hw_init(struct pp_instance *handle)
 ret = phm_enable_dynamic_state_management(hwmgr);
 if (ret)
 goto err2;
-   ret = phm_start_thermal_controller(hwmgr, NULL);
+   ret = phm_start_thermal_controller(hwmgr);
 ret |= psm_set_performance_states(hwmgr);
 if (ret)
 goto err2;
@@ -345,7 +345,7 @@ int hwmgr_hw_resume(struct pp_instance *handle)
 ret = phm_enable_dynamic_state_management(hwmgr);
 if (ret)
 return ret;
-   ret = phm_start_thermal_controller(hwmgr, NULL);
+   ret = phm_start_thermal_controller(hwmgr);
 if (ret)
 return ret;

diff --git a/drivers/gpu/drm/amd/powerplay/inc/hardwaremanager.h b/drivers/gpu/drm/amd/powerplay/inc/hardwaremanager.h
index b8bd86b..7489003 100644
--- a/drivers/gpu/drm/amd/powerplay/inc/hardwaremanager.h
+++ b/drivers/gpu/drm/amd/powerplay/inc/hardwaremanager.h
@@ -407,7 +407,7 @@ extern int phm_force_dpm_levels(struct pp_hwmgr *hwmgr, enum amd_dpm_forced_leve
 extern int phm_display_configuration_changed(struct pp_hwmgr *hwmgr);
 extern int phm_notify_smc_display_config_after_ps_adjustment(struct pp_hwmgr *hwmgr);
 extern int phm_register_thermal_interrupt(struct pp_hwmgr *hwmgr, const void *info);
-extern int phm_start_thermal_controller(struct pp_hwmgr *hwmgr, struct PP_TemperatureRange *temperature_range);
+extern int phm_start_thermal_controller(struct pp_hwmgr *hwmgr);
 extern int phm_stop_thermal_controller(struct pp_hwmgr *hwmgr);
 extern bool phm_check_smc_update_required_for_display_configuration(struct pp_hwmgr *hwmgr);

--
2.7.4


RE: [PATCH 1/4] drm/amd/powerplay: show the right unit for the temp printed out

2018-01-01 Thread Quan, Evan
Got it. I will drop this patch.

Regards,
Evan
From: Deucher, Alexander
Sent: Sunday, December 31, 2017 2:29 AM
To: Quan, Evan ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 1/4] drm/amd/powerplay: show the right unit for the temp 
printed out


I don't think the hwmon interface should expose the units in the string; this
will probably break some applications that use the hwmon interface.  The values
are assumed to be millidegrees Celsius as per the API.
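
As an illustration of the compatibility concern, here is a minimal sketch of a deliberately strict reader for a hwmon-style temperature attribute. The function name and the strictness are assumptions made for this example; real applications parse these files in various ways, and some would tolerate trailing text.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Parse text such as "45000\n" into millidegrees Celsius.
 * Returns 0 on success, -1 if the text is anything other than a bare
 * integer followed by an optional newline. */
int demo_parse_hwmon_temp(const char *text, long *millideg)
{
    char *end;
    long value;

    assert(text != NULL && millideg != NULL);

    errno = 0;
    value = strtol(text, &end, 10);
    if (end == text || errno == ERANGE)
        return -1;          /* no digits, or out of range */
    if (*end == '\n')
        end++;              /* accept one trailing newline */
    if (*end != '\0')
        return -1;          /* anything else, e.g. a unit suffix */

    *millideg = value;
    return 0;
}
```

With this reader, "45000\n" parses to 45000, while "45000 millicelsius\n" is rejected because of the text after the number.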



Alex


From: amd-gfx on behalf of Evan Quan
Sent: Friday, December 29, 2017 2:44 AM
To: amd-gfx@lists.freedesktop.org
Cc: Quan, Evan
Subject: [PATCH 1/4] drm/amd/powerplay: show the right unit for the temp 
printed out

Change-Id: I30ea29aa85ab89c0017ecb4e0ab469db5ab5c103
Signed-off-by: Evan Quan
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
index 814329b..91f809e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c
@@ -789,7 +789,7 @@ static ssize_t amdgpu_hwmon_show_temp(struct device *dev,
 else
 temp = amdgpu_dpm_get_temperature(adev);

-   return snprintf(buf, PAGE_SIZE, "%d\n", temp);
+   return snprintf(buf, PAGE_SIZE, "%d millicelsius\n", temp);
 }

 static ssize_t amdgpu_hwmon_show_temp_thresh(struct device *dev,
@@ -805,7 +805,7 @@ static ssize_t amdgpu_hwmon_show_temp_thresh(struct device *dev,
 else
 temp = adev->pm.dpm.thermal.max_temp;

-   return snprintf(buf, PAGE_SIZE, "%d\n", temp);
+   return snprintf(buf, PAGE_SIZE, "%d millicelsius\n", temp);
 }

 static ssize_t amdgpu_hwmon_get_pwm1_enable(struct device *dev,
--
2.7.4



[PATCH] drm/amdgpu/gfx9: only init the apertures used by KGD

2018-01-01 Thread Alex Deucher
Use adev->vm_manager.id_mgr[0].num_ids rather than hardcoded 16.

Noticed-by: Felix Kuehling 
Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 55670dbacace..4abaf802a260 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -1526,7 +1526,7 @@ static void gfx_v9_0_gpu_init(struct amdgpu_device *adev)
    /* XXX SH_MEM regs */
    /* where to put LDS, scratch, GPUVM in FSA64 space */
    mutex_lock(&adev->srbm_mutex);
-   for (i = 0; i < 16; i++) {
+   for (i = 0; i < adev->vm_manager.id_mgr[0].num_ids; i++) {
        soc15_grbm_select(adev, 0, 0, 0, i);
        /* CP and shaders */
        if (i == 0) {
-- 
2.13.6



Re: [PATCH] drm/radeon: use raw buffer printk specifier

2018-01-01 Thread Alex Deucher
On Thu, Dec 21, 2017 at 5:04 AM, Dmitry Rozhkov wrote:
> printk format strings accepting a single subsequent argument
> are shorter thus easier to read.

I'm not sure I agree it's easier to read.  IMHO, it's somewhat less
clear what's going on, but I don't have a particularly strong opinion
either way.  Applied.
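
For readers weighing the two forms, the following userspace sketch mimics what the kernel's "%*ph" family prints for a small buffer: two hex digits per byte, separated by single spaces. The function name is hypothetical and this is only an emulation for comparison, not the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Write the first n bytes of buf into out as space-separated
 * two-digit hex. out must hold at least 3 * n bytes (1 if n is 0).
 * Returns the length of the resulting string. */
int demo_format_ph(char *out, size_t out_size,
                   const unsigned char *buf, size_t n)
{
    size_t pos = 0;

    assert(out != NULL && buf != NULL);
    assert(out_size >= (n ? 3 * n : 1));

    out[0] = '\0';
    for (size_t i = 0; i < n; i++)
        pos += (size_t)snprintf(out + pos, out_size - pos,
                                i ? " %02x" : "%02x", buf[i]);
    return (int)strlen(out);
}
```

Called with n = 3 on a buffer starting 0x41 0x00 0x02, it yields the same text as the three-argument "%02x %02x %02x" form, which is the equivalence the patch relies on.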

Thanks,

Alex

>
> Instead of having format strings accepting 3 different arguments
> pointing to first 3 bytes of the same buffer rewrite the format
> string to accept only one argument - the buffer - with "%3ph"
> specifier.
>
> Signed-off-by: Dmitry Rozhkov 
> Suggested-by: Andy Shevchenko 
> ---
>  drivers/gpu/drm/radeon/radeon_dp_mst.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/radeon/radeon_dp_mst.c b/drivers/gpu/drm/radeon/radeon_dp_mst.c
> index 183b4b482138..ca2bcfb32935 100644
> --- a/drivers/gpu/drm/radeon/radeon_dp_mst.c
> +++ b/drivers/gpu/drm/radeon/radeon_dp_mst.c
> @@ -718,7 +718,7 @@ radeon_dp_mst_check_status(struct radeon_connector *radeon_connector)
>                                 DP_SINK_COUNT_ESI, esi, 8);
>  go_again:
>         if (dret == 8) {
> -               DRM_DEBUG_KMS("got esi %02x %02x %02x\n", esi[0], esi[1], esi[2]);
> +               DRM_DEBUG_KMS("got esi %3ph\n", esi);
>                 ret = drm_dp_mst_hpd_irq(&radeon_connector->mst_mgr, esi, &handled);
>
>                 if (handled) {
> @@ -733,7 +733,7 @@ radeon_dp_mst_check_status(struct radeon_connector *radeon_connector)
>                         dret = drm_dp_dpcd_read(&radeon_connector->ddc_bus->aux,
>                                                 DP_SINK_COUNT_ESI, esi, 8);
>                         if (dret == 8) {
> -                               DRM_DEBUG_KMS("got esi2 %02x %02x %02x\n", esi[0], esi[1], esi[2]);
> +                               DRM_DEBUG_KMS("got esi2 %3ph\n", esi);
>                                 goto go_again;
>                         }
>                 } else
> --
> 2.13.6


Re: [PATCH 11/12] drm/amd/powerplay: drop unneeded newline

2018-01-01 Thread Alex Deucher
On Wed, Dec 27, 2017 at 9:51 AM, Julia Lawall  wrote:
> PP_ASSERT_WITH_CODE prints a newline at the end of the message string,
> so the message string does not need to include a newline explicitly.
> Done using Coccinelle.
>
> Signed-off-by: Julia Lawall 

Applied.  thanks!
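
To show why the explicit "\n" is redundant, here is a small userspace stand-in for an assert-with-action macro that appends the newline itself, as the patch description says PP_ASSERT_WITH_CODE does. Everything prefixed `demo_`/`DEMO_` is hypothetical, and the message is captured in a buffer rather than logged so it can be inspected.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

static char demo_log[256];  /* holds the last message emitted */

/* On failure: record msg followed by a newline, then run `code`. */
#define DEMO_ASSERT_WITH_CODE(cond, msg, code)                      \
    do {                                                            \
        if (!(cond)) {                                              \
            snprintf(demo_log, sizeof(demo_log), "%s\n", msg);      \
            code;                                                   \
        }                                                           \
    } while (0)

/* Returns -1 (and records a message) when table is NULL, else 0. */
int demo_check_table(const void *table)
{
    DEMO_ASSERT_WITH_CODE(table != NULL,
        "dependency table is missing", return -1);
    return 0;
}

/* Count the newlines at the end of the recorded message. */
int demo_trailing_newlines(void)
{
    size_t len = strlen(demo_log);
    int count = 0;

    assert(len < sizeof(demo_log));
    while (len > 0 && demo_log[len - 1] == '\n') {
        count++;
        len--;
    }
    return count;
}
```

A message passed without "\n" ends in exactly one newline here; passing one that already ends in "\n" would produce two, i.e. a blank line in the output.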

Alex

>
> ---
>
> I couldn't figure out how to configure the kernel to get any of this code
> to compile.
>
>  drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c|   12 
>  drivers/gpu/drm/amd/powerplay/smumgr/fiji_smumgr.c  |2 +-
>  drivers/gpu/drm/amd/powerplay/smumgr/iceland_smumgr.c   |2 +-
>  drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.c |2 +-
>  drivers/gpu/drm/amd/powerplay/smumgr/tonga_smumgr.c |2 +-
>  5 files changed, 12 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
> index 40adc85..8d7fd06 100644
> --- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
> +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
> @@ -2266,14 +2266,18 @@ static int smu7_set_private_data_based_on_pptable_v0(struct pp_hwmgr *hwmgr)
>     struct phm_clock_voltage_dependency_table *allowed_mclk_vddci_table = hwmgr->dyn_state.vddci_dependency_on_mclk;
>
>     PP_ASSERT_WITH_CODE(allowed_sclk_vddc_table != NULL,
> -       "VDDC dependency on SCLK table is missing. This table is mandatory\n", return -EINVAL);
> +       "VDDC dependency on SCLK table is missing. This table is mandatory",
> +       return -EINVAL);
>     PP_ASSERT_WITH_CODE(allowed_sclk_vddc_table->count >= 1,
> -       "VDDC dependency on SCLK table has to have is missing. This table is mandatory\n", return -EINVAL);
> +       "VDDC dependency on SCLK table has to have is missing. This table is mandatory",
> +       return -EINVAL);
>
>     PP_ASSERT_WITH_CODE(allowed_mclk_vddc_table != NULL,
> -       "VDDC dependency on MCLK table is missing. This table is mandatory\n", return -EINVAL);
> +       "VDDC dependency on MCLK table is missing. This table is mandatory",
> +       return -EINVAL);
>     PP_ASSERT_WITH_CODE(allowed_mclk_vddc_table->count >= 1,
> -       "VDD dependency on MCLK table has to have is missing. This table is mandatory\n", return -EINVAL);
> +       "VDD dependency on MCLK table has to have is missing. This table is mandatory",
> +       return -EINVAL);
>
>     data->min_vddc_in_pptable = (uint16_t)allowed_sclk_vddc_table->entries[0].v;
>     data->max_vddc_in_pptable = (uint16_t)allowed_sclk_vddc_table->entries[allowed_sclk_vddc_table->count - 1].v;
> diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smumgr.c b/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smumgr.c
> index 085d81c..427daa6 100644
> --- a/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smumgr.c
> +++ b/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smumgr.c
> @@ -1799,7 +1799,7 @@ static int fiji_populate_clock_stretcher_data_table(struct pp_hwmgr *hwmgr)
>     phm_cap_unset(hwmgr->platform_descriptor.platformCaps,
>         PHM_PlatformCaps_ClockStretcher);
>     PP_ASSERT_WITH_CODE(false,
> -       "Stretch Amount in PPTable not supported\n",
> +       "Stretch Amount in PPTable not supported",
>         return -EINVAL);
>     }
>
> diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/iceland_smumgr.c b/drivers/gpu/drm/amd/powerplay/smumgr/iceland_smumgr.c
> index 1253126..6400065 100644
> --- a/drivers/gpu/drm/amd/powerplay/smumgr/iceland_smumgr.c
> +++ b/drivers/gpu/drm/amd/powerplay/smumgr/iceland_smumgr.c
> @@ -546,7 +546,7 @@ static int iceland_get_std_voltage_value_sidd(struct pp_hwmgr *hwmgr,
>
>     /* SCLK/VDDC Dependency Table has to exist. */
>     PP_ASSERT_WITH_CODE(NULL != hwmgr->dyn_state.vddc_dependency_on_sclk,
> -       "The SCLK/VDDC Dependency Table does not exist.\n",
> +       "The SCLK/VDDC Dependency Table does not exist.",
>         return -EINVAL);
>
>     if (NULL == hwmgr->dyn_state.cac_leakage_table) {
> diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.c b/drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.c
> index cdb4765..fd874f7 100644
> --- a/drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.c
> +++ b/drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.c
> @@ -1652,7 +1652,7 @@ static int polaris10_populate_clock_stretcher_data_table(struct pp_hwmgr *hwmgr)
>     phm_cap_unset(hwmgr->platform_descriptor.platformCaps,
>         PHM_PlatformCaps_ClockStretcher);
>     PP_ASSERT_WITH_CODE(false,
> -   "Stretch Amount in 

Deadlocks with multiple applications on AMD RX 460 and RX 550 - Update 2

2018-01-01 Thread Luís Mendes
I am currently testing the amdgpu driver with AMD RX460 and RX550
graphics cards on an ARM Cortex-A9 with 1GB RAM and I am consistently
getting deadlocks when playing videos with Kodi or other applications.

I'm using Linux kernel from
https://cgit.freedesktop.org/~agd5f/linux/, branch drm-next-4.16 at
commit "drm/amdgpu: Correct the IB size of bo update mapping" -
104bd2ca1124dfd9aa904d5f5a96253ef2b580f6  along with libdrm-2.4.89 and
mesa-17.3.1 on an Ubuntu 17.10 with Mate desktop and Lightdm session
manager over X11.


I am consistently getting deadlocks, which are sometimes almost
immediate but sometimes take about half an hour to occur. Some of
the video files that I am using for testing are more likely to
cause a deadlock than others.

I got some kernel crash dumps, kodi process backtraces for the
offending thread and the deadlocked process tree listing, which I
attach here. The kernel seems to deadlock during a page flip,
waiting indefinitely for the DMA fence to complete; however, it
never does and the timeout doesn't expire either, so this may be a
GPU lockup.

I can provide more details, if needed, if there is interest or time to
look into this.

Regards,
Luís Mendes
Software and Hardware engineer

[  253.904103] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, last signaled seq=43831, last emitted seq=43833
[  253.915041] [drm] IP block:gmc_v8_0 is hung!
[  253.915047] [drm] IP block:gfx_v8_0 is hung!
[  253.915162] [drm] GPU recovery disabled.
[  366.541614] INFO: task kworker/u4:4:90 blocked for more than 120 seconds.
[  366.548436]   Not tainted 4.15.0-rc4-drmnext2g #1
[  366.554300] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  366.562162] kworker/u4:4D090  2 0x
[  366.562196] Workqueue: events_unbound commit_work [drm_kms_helper]
[  366.562215] [<80b8c6d4>] (__schedule) from [<80b8cdd0>] (schedule+0x4c/0xac)
[  366.562223] [<80b8cdd0>] (schedule) from [<80b91024>] (schedule_timeout+0x228/0x444)
[  366.562233] [<80b91024>] (schedule_timeout) from [<80886738>] (dma_fence_default_wait+0x2b4/0x2d8)
[  366.562241] [<80886738>] (dma_fence_default_wait) from [<80885d60>] (dma_fence_wait_timeout+0x40/0x150)
[  366.562248] [<80885d60>] (dma_fence_wait_timeout) from [<80887b1c>] (reservation_object_wait_timeout_rcu+0xfc/0x34c)
[  366.562476] [<80887b1c>] (reservation_object_wait_timeout_rcu) from [<7f2d3988>] (amdgpu_dm_do_flip+0xec/0x36c [amdgpu])
[  366.562754] [<7f2d3988>] (amdgpu_dm_do_flip [amdgpu]) from [<7f2d509c>] (amdgpu_dm_atomic_commit_tail+0xbfc/0xe58 [amdgpu])
[  366.562908] [<7f2d509c>] (amdgpu_dm_atomic_commit_tail [amdgpu]) from [<7f13e58c>] (commit_tail+0x50/0x94 [drm_kms_helper])
[  366.562931] [<7f13e58c>] (commit_tail [drm_kms_helper]) from [<7f13e5ec>] (commit_work+0x1c/0x20 [drm_kms_helper])
[  366.562948] [<7f13e5ec>] (commit_work [drm_kms_helper]) from [<8016f4c8>] (process_one_work+0x1a8/0x4ac)
[  366.562955] [<8016f4c8>] (process_one_work) from [<8017050c>] (worker_thread+0x68/0x598)
[  366.562962] [<8017050c>] (worker_thread) from [<80175e50>] (kthread+0x16c/0x174)
[  366.562970] [<80175e50>] (kthread) from [<80109de8>] (ret_from_fork+0x14/0x2c)


From userland side:
(gdb) info thread
  Id   Target Id Frame
* 1Thread 0x6eb17c70 (LWP 2071) "kodi.bin" 0x748b2246 in ioctl ()
at ../sysdeps/unix/syscall-template.S:84
  2Thread 0x6eb14170 (LWP 2072) "Announce" __libc_do_syscall ()
at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
  3Thread 0x6e1ff170 (LWP 2075) "ActiveAE" __libc_do_syscall ()
at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
  4Thread 0x6d9ff170 (LWP 2076) "AESink" __libc_do_syscall ()
at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
  5Thread 0x6b7c9170 (LWP 2081) "amdgpu_cs:0" __libc_do_syscall ()
at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
  6Thread 0x6ae3c170 (LWP 2082) "disk_cache:0" __libc_do_syscall
()
at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
  7Thread 0x571df170 (LWP 2083) "si_shader:0" __libc_do_syscall ()
at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
  8Thread 0x569df170 (LWP 2084) "si_shader_low:0"
__libc_do_syscall ()
at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
  9Thread 0x561df170 (LWP 2085) "gallium_drv:0" __libc_do_syscall
()
at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
  10   Thread 0x551f6170 (LWP 2086) "kodi.bin" __libc_do_syscall ()
at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
  11   Thread 0x549f6170 (LWP 2087) "PeripBusUSBUdev"
__libc_do_syscall ()
at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
---Type <return> to continue, or q <return> to quit---
  12   Thread 0x541f6170 (LWP 2088) "PeripBusCEC" __libc_do_syscall ()
at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
  13   Thread 0x539f6170 (LWP 2089) "PeripBusAddon" __libc_do_syscall
()
at