[PATCH 3/7] drm/amd/powerplay: Use proper enums in smu_adjust_power_state_dynamic

2019-07-03 Thread Nathan Chancellor
clang warns:

drivers/gpu/drm/amd/amdgpu/../powerplay/amdgpu_smu.c:1374:30: warning:
implicit conversion from enumeration type 'enum pp_clock_type' to
different enumeration type 'enum smu_clk_type' [-Wenum-conversion]
smu_force_clk_levels(smu, PP_SCLK, 1 << sclk_mask);
~~^~~~
drivers/gpu/drm/amd/amdgpu/../powerplay/amdgpu_smu.c:1375:30: warning:
implicit conversion from enumeration type 'enum pp_clock_type' to
different enumeration type 'enum smu_clk_type' [-Wenum-conversion]
smu_force_clk_levels(smu, PP_MCLK, 1 << mclk_mask);
~~^~~~

This appears to be a copy and paste fail from when this was a call to
vega20_force_clk_levels.
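
For illustration, here is a minimal sketch of why this kind of mismatch matters beyond the warning itself. The enum names and values are hypothetical stand-ins, not the real pp_clock_type or smu_clk_type definitions; the sketch only assumes that the two enumerations number their members differently.

```c
#include <assert.h>

/* Hypothetical legacy enumeration (stand-in for a pp_clock_type-style enum). */
enum demo_pp_clock {
	DEMO_PP_SCLK,
	DEMO_PP_MCLK,
};

/* Hypothetical newer enumeration; note the different numbering. */
enum demo_smu_clk {
	DEMO_SMU_GFXCLK,
	DEMO_SMU_SCLK,
	DEMO_SMU_MCLK,
};

/* Accepts only the newer enumeration and reports which clock it selected. */
static const char *demo_clk_name(enum demo_smu_clk clk)
{
	switch (clk) {
	case DEMO_SMU_GFXCLK:
		return "gfxclk";
	case DEMO_SMU_SCLK:
		return "sclk";
	case DEMO_SMU_MCLK:
		return "mclk";
	}
	return "invalid";
}

/* Shows that the legacy enumerator silently selects a different clock. */
void demo_show_mismatch(void)
{
	/* Correct: the enumerator comes from the parameter's own type. */
	assert(demo_clk_name(DEMO_SMU_SCLK)[0] == 's');

	/*
	 * Wrong: DEMO_PP_SCLK has value 0, which is DEMO_SMU_GFXCLK here.
	 * The cast only silences the warning for this demonstration.
	 */
	assert(demo_clk_name((enum demo_smu_clk)DEMO_PP_SCLK)[0] == 'g');
}
```

Without the explicit cast, clang would report the same -Wenum-conversion diagnostic as above for the second call.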

Fixes: bc0fcffd36ba ("drm/amd/powerplay: Unify smu handle task function (v2)")
Link: https://github.com/ClangBuiltLinux/linux/issues/584
Signed-off-by: Nathan Chancellor 
---
 drivers/gpu/drm/amd/powerplay/amdgpu_smu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c 
b/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
index 31152d495f69..e897469f7431 100644
--- a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
+++ b/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
@@ -1371,8 +1371,8 @@ int smu_adjust_power_state_dynamic(struct smu_context 
*smu,
 &soc_mask);
if (ret)
return ret;
-   smu_force_clk_levels(smu, PP_SCLK, 1 << sclk_mask);
-   smu_force_clk_levels(smu, PP_MCLK, 1 << mclk_mask);
+   smu_force_clk_levels(smu, SMU_SCLK, 1 << sclk_mask);
+   smu_force_clk_levels(smu, SMU_MCLK, 1 << mclk_mask);
break;
 
case AMD_DPM_FORCED_LEVEL_MANUAL:
-- 
2.22.0



[PATCH 0/7] amdgpu clang warning fixes on next-20190703

2019-07-03 Thread Nathan Chancellor
Hi all,

I don't do threaded patches very often so if I have messed something up,
please forgive me :)

This series fixes all of the clang warnings that I saw added in
next-20190703. The full list is visible in the gist linked below and
each full individual warning can be seen in the GitHub link in each
patch.

https://gist.github.com/5411af08b96c99b14e60c60800e99a47

All of the warnings are fixed in what I believe is the optimal way but
the enum conversion warnings were the trickiest; please review carefully
as the code paths for some of them have changed (especially in patch 3
and 6).

Thank you!
Nathan


[PATCH 6/7] drm/amd/powerplay: Use proper enums in vega20_print_clk_levels

2019-07-03 Thread Nathan Chancellor
clang warns:

drivers/gpu/drm/amd/amdgpu/../powerplay/vega20_ppt.c:995:39: warning:
implicit conversion from enumeration type 'PPCLK_e' to different
enumeration type 'enum smu_clk_type' [-Wenum-conversion]
ret = smu_get_current_clk_freq(smu, PPCLK_SOCCLK, &now);
  ~~^~~
drivers/gpu/drm/amd/amdgpu/../powerplay/vega20_ppt.c:1016:39: warning:
implicit conversion from enumeration type 'PPCLK_e' to different
enumeration type 'enum smu_clk_type' [-Wenum-conversion]
ret = smu_get_current_clk_freq(smu, PPCLK_FCLK, &now);
  ~~^
drivers/gpu/drm/amd/amdgpu/../powerplay/vega20_ppt.c:1031:39: warning:
implicit conversion from enumeration type 'PPCLK_e' to different
enumeration type 'enum smu_clk_type' [-Wenum-conversion]
ret = smu_get_current_clk_freq(smu, PPCLK_DCEFCLK, &now);
  ~~^~~~

The values are mapped one to one in vega20_get_smu_clk_index so just use
the proper enums here.
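
As a rough sketch of what a one-to-one mapping of this kind can look like: the body of vega20_get_smu_clk_index is not shown in this thread, so every name and value below is hypothetical. The idea is that callers pass the generic type and a per-ASIC table does the translation.

```c
#include <assert.h>

/* Hypothetical generic clock identifiers used by common code. */
enum demo_clk_type {
	DEMO_SOCCLK,
	DEMO_FCLK,
	DEMO_DCEFCLK,
	DEMO_CLK_COUNT,
};

/* Hypothetical firmware-side identifiers for one ASIC. */
enum demo_fw_clk {
	DEMO_FW_DCEFCLK = 3,
	DEMO_FW_SOCCLK = 4,
	DEMO_FW_FCLK = 7,
};

/* One table entry per generic type. */
static const enum demo_fw_clk demo_clk_map[DEMO_CLK_COUNT] = {
	[DEMO_SOCCLK]  = DEMO_FW_SOCCLK,
	[DEMO_FCLK]    = DEMO_FW_FCLK,
	[DEMO_DCEFCLK] = DEMO_FW_DCEFCLK,
};

/* Translate a generic type to the firmware index; -1 if out of range. */
int demo_get_fw_clk_index(enum demo_clk_type type)
{
	if ((unsigned int)type >= DEMO_CLK_COUNT)
		return -1;
	return (int)demo_clk_map[type];
}

void demo_map_example(void)
{
	assert(demo_get_fw_clk_index(DEMO_SOCCLK) == DEMO_FW_SOCCLK);
	assert(demo_get_fw_clk_index(DEMO_FCLK) == DEMO_FW_FCLK);
	assert(demo_get_fw_clk_index(DEMO_CLK_COUNT) == -1);
}
```

With the translation kept in one place, a caller that passes a firmware-side value directly would be translated a second time, or rejected, which is why the generic enumerators belong at the call site.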

Fixes: 096761014227 ("drm/amd/powerplay: support sysfs to get socclk, fclk, 
dcefclk")
Link: https://github.com/ClangBuiltLinux/linux/issues/587
Signed-off-by: Nathan Chancellor 
---
 drivers/gpu/drm/amd/powerplay/vega20_ppt.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/vega20_ppt.c 
b/drivers/gpu/drm/amd/powerplay/vega20_ppt.c
index 0f14fe14ecd8..e62dd6919b24 100644
--- a/drivers/gpu/drm/amd/powerplay/vega20_ppt.c
+++ b/drivers/gpu/drm/amd/powerplay/vega20_ppt.c
@@ -992,7 +992,7 @@ static int vega20_print_clk_levels(struct smu_context *smu,
break;
 
case SMU_SOCCLK:
-   ret = smu_get_current_clk_freq(smu, PPCLK_SOCCLK, &now);
+   ret = smu_get_current_clk_freq(smu, SMU_SOCCLK, &now);
if (ret) {
pr_err("Attempt to get current socclk Failed!");
return ret;
@@ -1013,7 +1013,7 @@ static int vega20_print_clk_levels(struct smu_context 
*smu,
break;
 
case SMU_FCLK:
-   ret = smu_get_current_clk_freq(smu, PPCLK_FCLK, &now);
+   ret = smu_get_current_clk_freq(smu, SMU_FCLK, &now);
if (ret) {
pr_err("Attempt to get current fclk Failed!");
return ret;
@@ -1028,7 +1028,7 @@ static int vega20_print_clk_levels(struct smu_context 
*smu,
break;
 
case SMU_DCEFCLK:
-   ret = smu_get_current_clk_freq(smu, PPCLK_DCEFCLK, &now);
+   ret = smu_get_current_clk_freq(smu, SMU_DCEFCLK, &now);
if (ret) {
pr_err("Attempt to get current dcefclk Failed!");
return ret;
-- 
2.22.0



[PATCH 1/7] drm/amdgpu/mes10.1: Fix header guard

2019-07-03 Thread Nathan Chancellor
clang warns:

 In file included from drivers/gpu/drm/amd/amdgpu/nv.c:53:
 drivers/gpu/drm/amd/amdgpu/../amdgpu/mes_v10_1.h:24:9: warning:
 '__MES_V10_1_H__' is used as a header guard here, followed by #define of
 a different macro [-Wheader-guard]
 #ifndef __MES_V10_1_H__
 ^~~
 drivers/gpu/drm/amd/amdgpu/../amdgpu/mes_v10_1.h:25:9: note:
 '__MES_v10_1_H__' is defined here; did you mean '__MES_V10_1_H__'?
 #define __MES_v10_1_H__
 ^~~
 __MES_V10_1_H__
 1 warning generated.

Capitalize the V.
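
A small sketch of why the macro names have to match, using a hypothetical guard name and struct and simulating two inclusions of the same header body in one file:

```c
#include <assert.h>

/* First "inclusion" of a hypothetical header body. */
#ifndef DEMO_BLOCK_H
#define DEMO_BLOCK_H
struct demo_block {
	int version;
};
#endif

/*
 * Second "inclusion" of the same body. DEMO_BLOCK_H is already defined,
 * so the preprocessor skips it and struct demo_block is not redefined.
 * With a mismatched #define (say DEMO_block_H), the guard macro would
 * never be defined and this copy would be a redefinition error.
 */
#ifndef DEMO_BLOCK_H
#define DEMO_BLOCK_H
struct demo_block {
	int version;
};
#endif

int demo_block_version(const struct demo_block *b)
{
	return b->version;
}

void demo_guard_example(void)
{
	struct demo_block b = { .version = 10 };

	assert(demo_block_version(&b) == 10);
}
```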

Fixes: 886f82aa7a1d ("drm/amdgpu/mes10.1: add ip block mes10.1 (v2)")
Link: https://github.com/ClangBuiltLinux/linux/issues/582
Signed-off-by: Nathan Chancellor 
---
 drivers/gpu/drm/amd/amdgpu/mes_v10_1.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.h 
b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.h
index 17b9b53fa892..9afd6ddb01e9 100644
--- a/drivers/gpu/drm/amd/amdgpu/mes_v10_1.h
+++ b/drivers/gpu/drm/amd/amdgpu/mes_v10_1.h
@@ -22,7 +22,7 @@
  */
 
 #ifndef __MES_V10_1_H__
-#define __MES_v10_1_H__
+#define __MES_V10_1_H__
 
 extern const struct amdgpu_ip_block_version mes_v10_1_ip_block;
 
-- 
2.22.0



[PATCH 4/7] drm/amd/powerplay: Zero initialize freq in smu_v11_0_get_current_clk_freq

2019-07-03 Thread Nathan Chancellor
clang warns (trimmed for brevity):

drivers/gpu/drm/amd/amdgpu/../powerplay/smu_v11_0.c:1098:10: warning:
variable 'freq' is used uninitialized whenever '?:' condition is false
[-Wsometimes-uninitialized]
ret =  smu_get_current_clk_freq_by_table(smu, clk_id, &freq);
   ^

smu_get_current_clk_freq_by_table expands to a ternary operator
conditional on smu->funcs->get_current_clk_freq_by_table being not NULL.
When this is false, freq will be uninitialized. Zero initialize freq to
avoid using random stack values if that ever happens.
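
The pattern can be sketched as follows. This is a simplified stand-in, not the real macro: the struct, the hook, and the fallback value of 0 are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_funcs {
	int (*get_freq)(uint32_t *freq);	/* optional hook, may be NULL */
};

/*
 * Call the hook if present. The fallback value 0 is an assumption;
 * the point is that the out parameter is untouched on that path.
 */
#define demo_get_freq(funcs, freq) \
	((funcs)->get_freq ? (funcs)->get_freq(freq) : 0)

static int demo_hook(uint32_t *freq)
{
	*freq = 1200;
	return 0;
}

uint32_t demo_read_freq(const struct demo_funcs *funcs)
{
	uint32_t freq = 0;	/* defined value even when the hook is NULL */
	int ret = demo_get_freq(funcs, &freq);

	return ret ? 0 : freq;
}

void demo_freq_example(void)
{
	struct demo_funcs with_hook = { .get_freq = demo_hook };
	struct demo_funcs without_hook = { .get_freq = NULL };

	assert(demo_read_freq(&with_hook) == 1200);
	assert(demo_read_freq(&without_hook) == 0);
}
```

If freq were left uninitialized, the NULL-hook path would report success and then read an indeterminate value.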

Fixes: e36182490dec ("drm/amd/powerplay: fix dpm freq unit error (10KHz -> 
Mhz)")
Link: https://github.com/ClangBuiltLinux/linux/issues/585
Signed-off-by: Nathan Chancellor 
---
 drivers/gpu/drm/amd/powerplay/smu_v11_0.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c 
b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
index 632a20587c8b..a6f8cd6df7f1 100644
--- a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
+++ b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
@@ -1088,7 +1088,7 @@ static int smu_v11_0_get_current_clk_freq(struct 
smu_context *smu,
  uint32_t *value)
 {
int ret = 0;
-   uint32_t freq;
+   uint32_t freq = 0;
 
if (clk_id >= SMU_CLK_COUNT || !value)
return -EINVAL;
-- 
2.22.0



[PATCH 7/7] drm/amd/powerplay: Zero initialize current_rpm in vega20_get_fan_speed_percent

2019-07-03 Thread Nathan Chancellor
clang warns (trimmed for brevity):

drivers/gpu/drm/amd/amdgpu/../powerplay/vega20_ppt.c:3023:8: warning:
variable 'current_rpm' is used uninitialized whenever '?:' condition is
false [-Wsometimes-uninitialized]
ret = smu_get_current_rpm(smu, &current_rpm);
  ^~

smu_get_current_rpm expands to a ternary operator conditional on
smu->funcs->get_current_rpm being not NULL. When this is false,
current_rpm will be uninitialized. Zero initialize current_rpm to
avoid using random stack values if that ever happens.

Fixes: ee0db82027ee ("drm/amd/powerplay: move PPTable_t uses into asic level")
Link: https://github.com/ClangBuiltLinux/linux/issues/588
Signed-off-by: Nathan Chancellor 
---
 drivers/gpu/drm/amd/powerplay/vega20_ppt.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/vega20_ppt.c 
b/drivers/gpu/drm/amd/powerplay/vega20_ppt.c
index e62dd6919b24..e37b39987587 100644
--- a/drivers/gpu/drm/amd/powerplay/vega20_ppt.c
+++ b/drivers/gpu/drm/amd/powerplay/vega20_ppt.c
@@ -3016,8 +3016,7 @@ static int vega20_get_fan_speed_percent(struct 
smu_context *smu,
uint32_t *speed)
 {
int ret = 0;
-   uint32_t percent = 0;
-   uint32_t current_rpm;
+   uint32_t current_rpm = 0, percent = 0;
PPTable_t *pptable = smu->smu_table.driver_pptable;
 
ret = smu_get_current_rpm(smu, &current_rpm);
-- 
2.22.0



[PATCH 5/7] drm/amd/display: Use proper enum conversion functions

2019-07-03 Thread Nathan Chancellor
clang warns:

drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_pp_smu.c:336:8:
warning: implicit conversion from enumeration type 'enum smu_clk_type'
to different enumeration type 'enum amd_pp_clock_type'
[-Wenum-conversion]
dc_to_smu_clock_type(clk_type),
^~~
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_pp_smu.c:421:14:
warning: implicit conversion from enumeration type 'enum
amd_pp_clock_type' to different enumeration type 'enum smu_clk_type'
[-Wenum-conversion]
dc_to_pp_clock_type(clk_type),
^~

There are functions to properly convert between all of these types; use
them so there are no longer any warnings.

Fixes: a43913ea50a5 ("drm/amd/powerplay: add function 
get_clock_by_type_with_latency for navi10")
Fixes: e5e4e22391c2 ("drm/amd/powerplay: add interface to get clock by type 
with latency for display (v2)")
Link: https://github.com/ClangBuiltLinux/linux/issues/586
Signed-off-by: Nathan Chancellor 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c
index eac09bfe3be2..0f76cfff9d9b 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_pp_smu.c
@@ -333,7 +333,7 @@ bool dm_pp_get_clock_levels_by_type(
}
} else if (adev->smu.funcs && adev->smu.funcs->get_clock_by_type) {
if (smu_get_clock_by_type(&adev->smu,
- dc_to_smu_clock_type(clk_type),
+ dc_to_pp_clock_type(clk_type),
  &pp_clks)) {
get_default_clock_levels(clk_type, dc_clks);
return true;
@@ -418,7 +418,7 @@ bool dm_pp_get_clock_levels_by_type_with_latency(
return false;
} else if (adev->smu.ppt_funcs && 
adev->smu.ppt_funcs->get_clock_by_type_with_latency) {
if (smu_get_clock_by_type_with_latency(&adev->smu,
-  
dc_to_pp_clock_type(clk_type),
+  
dc_to_smu_clock_type(clk_type),
   &pp_clks))
return false;
}
-- 
2.22.0



[PATCH 2/7] drm/amd/powerplay: Use memset to initialize metrics structs

2019-07-03 Thread Nathan Chancellor
clang warns:

drivers/gpu/drm/amd/amdgpu/../powerplay/navi10_ppt.c:601:33: warning:
suggest braces around initialization of subobject [-Wmissing-braces]
static SmuMetrics_t metrics = {0};
   ^
   {}
drivers/gpu/drm/amd/amdgpu/../powerplay/navi10_ppt.c:905:26: warning:
suggest braces around initialization of subobject [-Wmissing-braces]
SmuMetrics_t metrics = {0};
^
{}
2 warnings generated.

One way to fix these warnings is to add additional braces like clang
suggests; however, there has been a bit of push back from some
maintainers[1][2], who just prefer memset as it is unambiguous, doesn't
depend on a particular compiler version[3], and properly initializes all
subobjects. Do that here so there are no more warnings.

[1]: https://lore.kernel.org/lkml/022e41c0-8465-dc7a-a45c-64187ecd9...@amd.com/
[2]: 
https://lore.kernel.org/lkml/20181128.215241.702406654469517539.da...@davemloft.net/
[3]: https://lore.kernel.org/lkml/20181116150432.2408a...@redhat.com/
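
A minimal sketch of the memset approach, using a hypothetical struct rather than the real SmuMetrics_t. It assumes only that the first member is itself an aggregate, which is what makes "= {0}" trigger -Wmissing-braces.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical metrics layout whose first member is an array. */
struct demo_metrics {
	uint16_t clocks[4];
	uint16_t fan_rpm;
};

/* Clear every byte of the struct, including all nested members. */
void demo_metrics_clear(struct demo_metrics *m)
{
	memset(m, 0, sizeof(*m));
}

/* Compare against an object with static storage, which is all zeroes. */
int demo_metrics_is_zero(const struct demo_metrics *m)
{
	static const struct demo_metrics zero;

	return memcmp(m, &zero, sizeof(zero)) == 0;
}

void demo_metrics_example(void)
{
	struct demo_metrics m;

	memset(&m, 0xff, sizeof(m));	/* simulate stale contents */
	assert(!demo_metrics_is_zero(&m));

	demo_metrics_clear(&m);
	assert(demo_metrics_is_zero(&m));
	assert(m.fan_rpm == 0 && m.clocks[3] == 0);
}
```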

Fixes: 98e1a543c7b1 ("drm/amd/powerplay: add function get current clock freq 
interface for navi10")
Fixes: ab43c4bf1cc8 ("drm/amd/powerplay: fix fan speed show error (for hwmon 
pwm)")
Link: https://github.com/ClangBuiltLinux/linux/issues/583
Signed-off-by: Nathan Chancellor 
---
 drivers/gpu/drm/amd/powerplay/navi10_ppt.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/navi10_ppt.c 
b/drivers/gpu/drm/amd/powerplay/navi10_ppt.c
index e00397f84b2f..f5d2ada05bc6 100644
--- a/drivers/gpu/drm/amd/powerplay/navi10_ppt.c
+++ b/drivers/gpu/drm/amd/powerplay/navi10_ppt.c
@@ -598,12 +598,14 @@ static int navi10_get_current_clk_freq_by_table(struct 
smu_context *smu,
   enum smu_clk_type clk_type,
   uint32_t *value)
 {
-   static SmuMetrics_t metrics = {0};
+   static SmuMetrics_t metrics;
int ret = 0, clk_id = 0;
 
if (!value)
return -EINVAL;
 
+   memset(&metrics, 0, sizeof(metrics));
+
ret = smu_update_table(smu, SMU_TABLE_SMU_METRICS, (void *)&metrics, 
false);
if (ret)
return ret;
@@ -902,12 +904,14 @@ static bool navi10_is_dpm_running(struct smu_context *smu)
 
 static int navi10_get_fan_speed(struct smu_context *smu, uint16_t *value)
 {
-   SmuMetrics_t metrics = {0};
+   SmuMetrics_t metrics;
int ret = 0;
 
if (!value)
return -EINVAL;
 
+   memset(&metrics, 0, sizeof(metrics));
+
ret = smu_update_table(smu, SMU_TABLE_SMU_METRICS,
   (void *)&metrics, false);
if (ret)
-- 
2.22.0



RE: [PATCH] drm/amd/powerplay: add baco smu reset function for smu11

2019-07-03 Thread Zhang, Hawking
Reviewed-by: Hawking Zhang 

From: Wang, Kevin(Yang) 
Sent: July 4, 2019 11:13
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking; Xiao, Jack; Huang, Ray; Feng, Kenneth; Quan, Evan
Subject: Re: [PATCH] drm/amd/powerplay: add baco smu reset function for smu11


Ping...

Could someone help review this patch?

Thanks.



Best Regards,

Kevin


From: Wang, Kevin(Yang)
Sent: Wednesday, July 3, 2019 11:09:45 AM
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking; Xiao, Jack; Huang, Ray; Wang, Kevin(Yang)
Subject: [PATCH] drm/amd/powerplay: add baco smu reset function for smu11

Add BACO reset support for smu11.
It lets the GPU perform an ASIC reset during GPU recovery.

Change-Id: I7714ed03ad87c13e93ca1a7e6aef81eba14667c8
Signed-off-by: Kevin Wang
---
 drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c|  6 +-
 drivers/gpu/drm/amd/amdgpu/nv.c   |  9 +-
 drivers/gpu/drm/amd/powerplay/amdgpu_smu.c| 14 +++
 .../gpu/drm/amd/powerplay/inc/amdgpu_smu.h| 26 ++
 drivers/gpu/drm/amd/powerplay/inc/smu_v11_0.h |  8 ++
 drivers/gpu/drm/amd/powerplay/navi10_ppt.c|  8 ++
 drivers/gpu/drm/amd/powerplay/smu_v11_0.c | 91 +++
 7 files changed, 159 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
index b41169261f7d..45dd22a1ef77 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
@@ -244,8 +244,10 @@ static void gmc_v10_0_flush_gpu_tlb(struct amdgpu_device 
*adev,
 mutex_lock(&adev->mman.gtt_window_lock);

 gmc_v10_0_flush_vm_hub(adev, vmid, AMDGPU_MMHUB, 0);
-   if (!adev->mman.buffer_funcs_enabled || !adev->ib_pool_ready ||
-   adev->asic_type != CHIP_NAVI10) {
+   if (!adev->mman.buffer_funcs_enabled ||
+   !adev->ib_pool_ready ||
+   adev->asic_type != CHIP_NAVI10 ||
+   adev->in_gpu_reset) {
 gmc_v10_0_flush_vm_hub(adev, vmid, AMDGPU_GFXHUB, 0);
 mutex_unlock(&adev->mman.gtt_window_lock);
 return;
diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c
index 8f605417b40a..cc5d06718e4c 100644
--- a/drivers/gpu/drm/amd/amdgpu/nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/nv.c
@@ -31,6 +31,7 @@
 #include "amdgpu_vce.h"
 #include "amdgpu_ucode.h"
 #include "amdgpu_psp.h"
+#include "amdgpu_smu.h"
 #include "atom.h"
 #include "amd_pcie.h"

@@ -266,8 +267,14 @@ static int nv_asic_reset(struct amdgpu_device *adev)

 amdgpu_atombios_scratch_regs_engine_hung(adev, false);
 #endif
+   int ret = 0;
+   struct smu_context *smu = &adev->smu;

-   return 0;
+   if (smu_baco_is_support(smu)) {
+   ret = smu_baco_reset(smu);
+   }
+
+   return ret;
 }

 static int nv_set_uvd_clocks(struct amdgpu_device *adev, u32 vclk, u32 dclk)
diff --git a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c 
b/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
index b28a923f998d..fc416c686151 100644
--- a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
+++ b/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
@@ -633,6 +633,11 @@ static int smu_sw_init(void *handle)
 bitmap_zero(smu->smu_feature.supported, SMU_FEATURE_MAX);
 bitmap_zero(smu->smu_feature.enabled, SMU_FEATURE_MAX);
 bitmap_zero(smu->smu_feature.allowed, SMU_FEATURE_MAX);
+
+   mutex_init(&smu->smu_baco.mutex);
+   smu->smu_baco.state = SMU_BACO_STATE_EXIT;
+   smu->smu_baco.platform_support = false;
+
 smu->watermarks_bitmap = 0;
 smu->power_profile_mode = PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT;
 smu->default_power_profile_mode = PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT;
@@ -1057,11 +1062,20 @@ static int smu_suspend(void *handle)
 int ret;
 struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 struct smu_context *smu = &adev->smu;
+   bool baco_feature_is_enabled = smu_feature_is_enabled(smu, 
SMU_FEATURE_BACO_BIT);

 ret = smu_system_features_control(smu, false);
 if (ret)
 return ret;

+   if (adev->in_gpu_reset && baco_feature_is_enabled) {
+   ret = smu_feature_set_enabled(smu, SMU_FEATURE_BACO_BIT, true);
+   if (ret) {
+   pr_warn("set BACO feature enabled failed, return %d\n", 
ret);
+   return ret;
+   }
+   }
+
 smu->watermarks_bitmap &= ~(WATERMARKS_LOADED);

 if (adev->asic_type >= CHIP_NAVI10 &&
diff --git a/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h 
b/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h
index 2818df46481c..c97324ef7db2 100644
--- a/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h
+++ b/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h
@@ -241,6 +241,7 @@ enum smu_message_type
 SMU_MSG_PowerUpJpeg,
 SMU_MSG_PowerDownJpeg,
 SMU_MSG_BacoAudioD

Re: [PATCH] drm/amd/powerplay: add baco smu reset function for smu11

2019-07-03 Thread Wang, Kevin(Yang)
Ping...

Could someone help review this patch?

Thanks.


Best Regards,

Kevin


From: Wang, Kevin(Yang)
Sent: Wednesday, July 3, 2019 11:09:45 AM
To: amd-gfx@lists.freedesktop.org
Cc: Zhang, Hawking; Xiao, Jack; Huang, Ray; Wang, Kevin(Yang)
Subject: [PATCH] drm/amd/powerplay: add baco smu reset function for smu11

Add BACO reset support for smu11.
It lets the GPU perform an ASIC reset during GPU recovery.

Change-Id: I7714ed03ad87c13e93ca1a7e6aef81eba14667c8
Signed-off-by: Kevin Wang 
---
 drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c|  6 +-
 drivers/gpu/drm/amd/amdgpu/nv.c   |  9 +-
 drivers/gpu/drm/amd/powerplay/amdgpu_smu.c| 14 +++
 .../gpu/drm/amd/powerplay/inc/amdgpu_smu.h| 26 ++
 drivers/gpu/drm/amd/powerplay/inc/smu_v11_0.h |  8 ++
 drivers/gpu/drm/amd/powerplay/navi10_ppt.c|  8 ++
 drivers/gpu/drm/amd/powerplay/smu_v11_0.c | 91 +++
 7 files changed, 159 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
index b41169261f7d..45dd22a1ef77 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
@@ -244,8 +244,10 @@ static void gmc_v10_0_flush_gpu_tlb(struct amdgpu_device 
*adev,
 mutex_lock(&adev->mman.gtt_window_lock);

 gmc_v10_0_flush_vm_hub(adev, vmid, AMDGPU_MMHUB, 0);
-   if (!adev->mman.buffer_funcs_enabled || !adev->ib_pool_ready ||
-   adev->asic_type != CHIP_NAVI10) {
+   if (!adev->mman.buffer_funcs_enabled ||
+   !adev->ib_pool_ready ||
+   adev->asic_type != CHIP_NAVI10 ||
+   adev->in_gpu_reset) {
 gmc_v10_0_flush_vm_hub(adev, vmid, AMDGPU_GFXHUB, 0);
 mutex_unlock(&adev->mman.gtt_window_lock);
 return;
diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c
index 8f605417b40a..cc5d06718e4c 100644
--- a/drivers/gpu/drm/amd/amdgpu/nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/nv.c
@@ -31,6 +31,7 @@
 #include "amdgpu_vce.h"
 #include "amdgpu_ucode.h"
 #include "amdgpu_psp.h"
+#include "amdgpu_smu.h"
 #include "atom.h"
 #include "amd_pcie.h"

@@ -266,8 +267,14 @@ static int nv_asic_reset(struct amdgpu_device *adev)

 amdgpu_atombios_scratch_regs_engine_hung(adev, false);
 #endif
+   int ret = 0;
+   struct smu_context *smu = &adev->smu;

-   return 0;
+   if (smu_baco_is_support(smu)) {
+   ret = smu_baco_reset(smu);
+   }
+
+   return ret;
 }

 static int nv_set_uvd_clocks(struct amdgpu_device *adev, u32 vclk, u32 dclk)
diff --git a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c 
b/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
index b28a923f998d..fc416c686151 100644
--- a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
+++ b/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
@@ -633,6 +633,11 @@ static int smu_sw_init(void *handle)
 bitmap_zero(smu->smu_feature.supported, SMU_FEATURE_MAX);
 bitmap_zero(smu->smu_feature.enabled, SMU_FEATURE_MAX);
 bitmap_zero(smu->smu_feature.allowed, SMU_FEATURE_MAX);
+
+   mutex_init(&smu->smu_baco.mutex);
+   smu->smu_baco.state = SMU_BACO_STATE_EXIT;
+   smu->smu_baco.platform_support = false;
+
 smu->watermarks_bitmap = 0;
 smu->power_profile_mode = PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT;
 smu->default_power_profile_mode = PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT;
@@ -1057,11 +1062,20 @@ static int smu_suspend(void *handle)
 int ret;
 struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 struct smu_context *smu = &adev->smu;
+   bool baco_feature_is_enabled = smu_feature_is_enabled(smu, 
SMU_FEATURE_BACO_BIT);

 ret = smu_system_features_control(smu, false);
 if (ret)
 return ret;

+   if (adev->in_gpu_reset && baco_feature_is_enabled) {
+   ret = smu_feature_set_enabled(smu, SMU_FEATURE_BACO_BIT, true);
+   if (ret) {
+   pr_warn("set BACO feature enabled failed, return %d\n", 
ret);
+   return ret;
+   }
+   }
+
 smu->watermarks_bitmap &= ~(WATERMARKS_LOADED);

 if (adev->asic_type >= CHIP_NAVI10 &&
diff --git a/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h 
b/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h
index 2818df46481c..c97324ef7db2 100644
--- a/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h
+++ b/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h
@@ -241,6 +241,7 @@ enum smu_message_type
 SMU_MSG_PowerUpJpeg,
 SMU_MSG_PowerDownJpeg,
 SMU_MSG_BacoAudioD3PME,
+   SMU_MSG_ArmD3,
 SMU_MSG_MAX_COUNT,
 };

@@ -489,6 +490,19 @@ struct mclock_latency_table {
 struct mclk_latency_entries  entries[MAX_REGULAR_DPM_NUM];
 };

+enum smu_baco_state
+{
+   SMU_BACO_STATE_ENTER = 0,
+   SMU_BACO_STATE_EXIT,
+};
+
+struct smu_baco_context
+{
+   struct mutex mutex;

[PATCH] drm/amdgpu: Disable ras features on all IPs before gpu reset

2019-07-03 Thread Pan, Xinhui
Perform a ras_suspend to disable ras on all IPs to work around
some ROCm stability issues.

Signed-off-by: xinhui pan 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 5132c59b4397..99208fe684aa 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3759,6 +3759,10 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
 
/* block all schedulers and reset given job's ring */
list_for_each_entry(tmp_adev, device_list_handle, gmc.xgmi.head) {
+   /* disable ras on ALL IPs */
+   if (amdgpu_device_ip_need_full_reset(tmp_adev))
+   amdgpu_ras_suspend(tmp_adev);
+
for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
struct amdgpu_ring *ring = tmp_adev->rings[i];
 
-- 
2.17.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH 2/2] drm/amdgpu/navi10: add thermal sensor support for navi10

2019-07-03 Thread Wang, Kevin(Yang)
From: amd-gfx  on behalf of Alex Deucher 

Sent: Thursday, July 4, 2019 10:58:22 AM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander
Subject: [PATCH 2/2] drm/amdgpu/navi10: add thermal sensor support for navi10

This was dropped when the code was refactored.  Re-add it
for navi10.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/powerplay/navi10_ppt.c | 42 ++
 1 file changed, 42 insertions(+)

diff --git a/drivers/gpu/drm/amd/powerplay/navi10_ppt.c 
b/drivers/gpu/drm/amd/powerplay/navi10_ppt.c
index 5794f7cef1c8..34fbc4be224c 100644
--- a/drivers/gpu/drm/amd/powerplay/navi10_ppt.c
+++ b/drivers/gpu/drm/amd/powerplay/navi10_ppt.c
@@ -900,6 +900,42 @@ static int navi10_get_current_activity_percent(struct 
smu_context *smu,
 return 0;
 }

+static int navi10_thermal_get_temperature(struct smu_context *smu,
+ enum amd_pp_sensors sensor,
+ uint32_t *value)
+{
+   int ret = 0;
+   SmuMetrics_t metrics;
+
+   if (!value)
+   return -EINVAL;
+
+   ret = smu_update_table(smu, SMU_TABLE_SMU_METRICS, (void *)&metrics,
+  false);
+   if (ret)
+   return ret;
+
+   switch (sensor) {
+   case AMDGPU_PP_SENSOR_HOTSPOT_TEMP:
+   *value = metrics.TemperatureHotspot *
+   SMU_TEMPERATURE_UNITS_PER_CENTIGRADES;
+   break;
+   case AMDGPU_PP_SENSOR_EDGE_TEMP:
+   *value = metrics.TemperatureEdge *
+   SMU_TEMPERATURE_UNITS_PER_CENTIGRADES;
+   break;
+   case AMDGPU_PP_SENSOR_MEM_TEMP:
+   *value = metrics.TemperatureMem *
+   SMU_TEMPERATURE_UNITS_PER_CENTIGRADES;
[kevin]:
TemperatureMem is not valid for navi10: it always returns 0. It may work with
HBM-type memory, but navi10 uses GDDR6. We can use TemperatureVrMem0 instead;
I have verified this locally.

Reviewed-by: Kevin Wang 

+   break;
+   default:
+   pr_err("Invalid sensor for retrieving temp\n");
+   return -EINVAL;
+   }
+
+   return 0;
+}
+
 static bool navi10_is_dpm_running(struct smu_context *smu)
 {
 int ret = 0;
@@ -1280,6 +1316,12 @@ static int navi10_read_sensor(struct smu_context *smu,
 ret = navi10_get_gpu_power(smu, (uint32_t *)data);
 *size = 4;
 break;
+   case AMDGPU_PP_SENSOR_HOTSPOT_TEMP:
+   case AMDGPU_PP_SENSOR_EDGE_TEMP:
+   case AMDGPU_PP_SENSOR_MEM_TEMP:
+   ret = navi10_thermal_get_temperature(smu, sensor, (uint32_t 
*)data);
+   *size = 4;
+   break;
 default:
 return -EINVAL;
 }
--
2.20.1


RE: [PATCH v4] drm/amdgpu: fix scheduler timeout calc

2019-07-03 Thread Cui, Flora
Ping...

-Original Message-
From: Cui, Flora  
Sent: Monday, July 1, 2019 11:37 AM
To: amd-gfx@lists.freedesktop.org
Cc: Cui, Flora 
Subject: [PATCH v4] drm/amdgpu: fix scheduler timeout calc

The scheduler timeout is in jiffies.
v2: move timeout check to amdgpu_device_get_job_timeout_settings after parsing
the value
v3: add lockup_timeout param check. 0: keep default value. negative:
infinite timeout.
v4: refactor code.
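
The parameter handling described above can be sketched in plain C. This is an illustration only: DEMO_HZ is a made-up tick rate, demo_msecs_to_ticks is a simplified stand-in for msecs_to_jiffies, and LONG_MAX stands in for MAX_SCHEDULE_TIMEOUT.

```c
#include <assert.h>
#include <limits.h>

#define DEMO_HZ 250L			/* hypothetical tick rate */
#define DEMO_MAX_TIMEOUT LONG_MAX	/* stand-in for MAX_SCHEDULE_TIMEOUT */

/* Simplified stand-in for msecs_to_jiffies(): round up to whole ticks. */
static long demo_msecs_to_ticks(long ms)
{
	return (ms * DEMO_HZ + 999) / 1000;
}

/*
 * 0 keeps the current value, a negative value means no timeout,
 * and a positive value is converted from milliseconds to ticks.
 */
long demo_apply_timeout(long current_ticks, long requested_ms)
{
	if (requested_ms == 0)
		return current_ticks;
	if (requested_ms < 0)
		return DEMO_MAX_TIMEOUT;
	return demo_msecs_to_ticks(requested_ms);
}

void demo_timeout_example(void)
{
	assert(demo_apply_timeout(2500, 0) == 2500);
	assert(demo_apply_timeout(2500, -1) == DEMO_MAX_TIMEOUT);
	assert(demo_apply_timeout(2500, 1000) == 250);
	assert(demo_apply_timeout(2500, 1) == 1);
}
```

The sketch ignores overflow for very large millisecond values, which a real conversion helper has to handle.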

Change-Id: I26708c163db943ff8d930dd81bcab4b4b9d84eb2
Signed-off-by: Flora Cui 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index e74a175..e448f8e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -245,7 +245,8 @@ module_param_named(msi, amdgpu_msi, int, 0444);
  * By default(with no lockup_timeout settings), the timeout for all 
non-compute(GFX, SDMA and Video)
  * jobs is 1. And there is no timeout enforced on compute jobs.
  */
-MODULE_PARM_DESC(lockup_timeout, "GPU lockup timeout in ms (default: 1 for 
non-compute jobs and no timeout for compute jobs), "
+MODULE_PARM_DESC(lockup_timeout, "GPU lockup timeout in ms (default: 1 for 
non-compute jobs and infinity timeout for compute jobs."
+   " 0: keep default value. negative: infinity timeout), "
"format is [Non-Compute] or [GFX,Compute,SDMA,Video]");  
module_param_string(lockup_timeout, amdgpu_lockup_timeout, 
sizeof(amdgpu_lockup_timeout), 0444);
 
@@ -1300,7 +1301,8 @@ int amdgpu_device_get_job_timeout_settings(struct 
amdgpu_device *adev)
 * By default timeout for non compute jobs is 1.
 * And there is no timeout enforced on compute jobs.
 */
-   adev->gfx_timeout = adev->sdma_timeout = adev->video_timeout = 1;
+   adev->gfx_timeout = msecs_to_jiffies(1);
+   adev->sdma_timeout = adev->video_timeout = adev->gfx_timeout;
adev->compute_timeout = MAX_SCHEDULE_TIMEOUT;
 
if (strnlen(input, AMDGPU_MAX_TIMEOUT_PARAM_LENTH)) {
@@ -1310,10 +1312,13 @@ int amdgpu_device_get_job_timeout_settings(struct amdgpu_device *adev)
if (ret)
return ret;
 
-   /* Invalidate 0 and negative values */
-   if (timeout <= 0) {
+   if (timeout == 0) {
index++;
continue;
+   } else if (timeout < 0) {
+   timeout = MAX_SCHEDULE_TIMEOUT;
+   } else {
+   timeout = msecs_to_jiffies(timeout);
}
 
switch (index++) {
--
2.7.4


[PATCH] drm/amd/powerplay: add temperature sensor support for navi10

2019-07-03 Thread Wang, Kevin(Yang)
The hwmon interface needs support for the following temperature sensor types:
1. SENSOR_HOTSPOT_TEMP
2. SENSOR_EDGE_TEMP(SENSOR_GPU_TEMP)
3. SENSOR_MEM_TEMP

Change-Id: I3db762e4032072fae67c95b7ba6d62e20ae5bead
Signed-off-by: Kevin Wang 
---
 drivers/gpu/drm/amd/powerplay/navi10_ppt.c | 42 ++
 1 file changed, 42 insertions(+)

diff --git a/drivers/gpu/drm/amd/powerplay/navi10_ppt.c 
b/drivers/gpu/drm/amd/powerplay/navi10_ppt.c
index 7574a02350c6..d5876c2393e7 100644
--- a/drivers/gpu/drm/amd/powerplay/navi10_ppt.c
+++ b/drivers/gpu/drm/amd/powerplay/navi10_ppt.c
@@ -1255,6 +1255,42 @@ static int navi10_set_watermarks_table(struct 
smu_context *smu,
return 0;
 }
 
+static int navi10_thermal_get_temperature(struct smu_context *smu,
+enum amd_pp_sensors sensor,
+uint32_t *value)
+{
+   SmuMetrics_t metrics;
+   int ret = 0;
+
+   if (!value)
+   return -EINVAL;
+
+   ret = smu_update_table(smu, SMU_TABLE_SMU_METRICS, (void *)&metrics, 
false);
+   if (ret)
+   return ret;
+
+   switch (sensor) {
+   case AMDGPU_PP_SENSOR_HOTSPOT_TEMP:
+   *value = metrics.TemperatureHotspot *
+   SMU_TEMPERATURE_UNITS_PER_CENTIGRADES;
+   break;
+   case AMDGPU_PP_SENSOR_GPU_TEMP:
+   case AMDGPU_PP_SENSOR_EDGE_TEMP:
+   *value = metrics.TemperatureEdge *
+   SMU_TEMPERATURE_UNITS_PER_CENTIGRADES;
+   break;
+   case AMDGPU_PP_SENSOR_MEM_TEMP:
+   *value = metrics.TemperatureVrMem0 *
+   SMU_TEMPERATURE_UNITS_PER_CENTIGRADES;
+   break;
+   default:
+   pr_err("Invalid sensor for retrieving temp\n");
+   return -EINVAL;
+   }
+
+   return 0;
+}
+
 static int navi10_read_sensor(struct smu_context *smu,
 enum amd_pp_sensors sensor,
 void *data, uint32_t *size)
@@ -1276,6 +1312,12 @@ static int navi10_read_sensor(struct smu_context *smu,
ret = navi10_get_gpu_power(smu, (uint32_t *)data);
*size = 4;
break;
+   case AMDGPU_PP_SENSOR_HOTSPOT_TEMP:
+   case AMDGPU_PP_SENSOR_EDGE_TEMP:
+   case AMDGPU_PP_SENSOR_MEM_TEMP:
+   ret = navi10_thermal_get_temperature(smu, sensor, (uint32_t 
*)data);
+   *size = 4;
+   break;
default:
return -EINVAL;
}
-- 
2.22.0


[PATCH 2/2] drm/amdgpu/navi10: add thermal sensor support for navi10

2019-07-03 Thread Alex Deucher
This was dropped when the code was refactored.  Re-add it
for navi10.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/powerplay/navi10_ppt.c | 42 ++
 1 file changed, 42 insertions(+)

diff --git a/drivers/gpu/drm/amd/powerplay/navi10_ppt.c b/drivers/gpu/drm/amd/powerplay/navi10_ppt.c
index 5794f7cef1c8..34fbc4be224c 100644
--- a/drivers/gpu/drm/amd/powerplay/navi10_ppt.c
+++ b/drivers/gpu/drm/amd/powerplay/navi10_ppt.c
@@ -900,6 +900,42 @@ static int navi10_get_current_activity_percent(struct smu_context *smu,
return 0;
 }
 
+static int navi10_thermal_get_temperature(struct smu_context *smu,
+ enum amd_pp_sensors sensor,
+ uint32_t *value)
+{
+   int ret = 0;
+   SmuMetrics_t metrics;
+
+   if (!value)
+   return -EINVAL;
+
+   ret = smu_update_table(smu, SMU_TABLE_SMU_METRICS, (void *)&metrics,
+  false);
+   if (ret)
+   return ret;
+
+   switch (sensor) {
+   case AMDGPU_PP_SENSOR_HOTSPOT_TEMP:
+   *value = metrics.TemperatureHotspot *
+   SMU_TEMPERATURE_UNITS_PER_CENTIGRADES;
+   break;
+   case AMDGPU_PP_SENSOR_EDGE_TEMP:
+   *value = metrics.TemperatureEdge *
+   SMU_TEMPERATURE_UNITS_PER_CENTIGRADES;
+   break;
+   case AMDGPU_PP_SENSOR_MEM_TEMP:
+   *value = metrics.TemperatureMem *
+   SMU_TEMPERATURE_UNITS_PER_CENTIGRADES;
+   break;
+   default:
+   pr_err("Invalid sensor for retrieving temp\n");
+   return -EINVAL;
+   }
+
+   return 0;
+}
+
 static bool navi10_is_dpm_running(struct smu_context *smu)
 {
int ret = 0;
@@ -1280,6 +1316,12 @@ static int navi10_read_sensor(struct smu_context *smu,
ret = navi10_get_gpu_power(smu, (uint32_t *)data);
*size = 4;
break;
+   case AMDGPU_PP_SENSOR_HOTSPOT_TEMP:
+   case AMDGPU_PP_SENSOR_EDGE_TEMP:
+   case AMDGPU_PP_SENSOR_MEM_TEMP:
+   ret = navi10_thermal_get_temperature(smu, sensor, (uint32_t *)data);
+   *size = 4;
+   break;
default:
return -EINVAL;
}
-- 
2.20.1
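
The select-and-scale pattern used by navi10_thermal_get_temperature can be sketched in standalone C. This is only an illustration, not the driver code: the enum, struct, and function names are hypothetical stand-ins, and the scale factor of 1000 (millidegrees per degree Celsius) is an assumption about SMU_TEMPERATURE_UNITS_PER_CENTIGRADES rather than something the patch states.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed scale factor: millidegrees per degree Celsius. */
#define TEMP_UNITS_PER_CENTIGRADE 1000

/* Hypothetical stand-ins for the driver's sensor IDs and metrics table. */
enum sensor_id {
	SENSOR_HOTSPOT_TEMP,
	SENSOR_EDGE_TEMP,
	SENSOR_MEM_TEMP,
	SENSOR_GPU_LOAD /* not a temperature sensor */
};

struct metrics {
	uint16_t temperature_hotspot; /* degrees Celsius */
	uint16_t temperature_edge;
	uint16_t temperature_mem;
};

/* Pick one temperature field, scale it, and reject anything else. */
int get_temperature(const struct metrics *m, enum sensor_id sensor,
		    uint32_t *value)
{
	if (!m || !value)
		return -EINVAL;

	switch (sensor) {
	case SENSOR_HOTSPOT_TEMP:
		*value = (uint32_t)m->temperature_hotspot * TEMP_UNITS_PER_CENTIGRADE;
		break;
	case SENSOR_EDGE_TEMP:
		*value = (uint32_t)m->temperature_edge * TEMP_UNITS_PER_CENTIGRADE;
		break;
	case SENSOR_MEM_TEMP:
		*value = (uint32_t)m->temperature_mem * TEMP_UNITS_PER_CENTIGRADE;
		break;
	default:
		return -EINVAL; /* unknown sensor: leave *value untouched */
	}

	return 0;
}
```

Under these assumptions a hotspot reading of 70 comes back as 70000, and a non-temperature sensor ID is rejected with -EINVAL without modifying the output.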


[PATCH 1/2] drm/amdgpu/navi10: add uclk activity sensor

2019-07-03 Thread Alex Deucher
Query the metrics table for the current uclk activity.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/powerplay/navi10_ppt.c | 16 ++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/navi10_ppt.c b/drivers/gpu/drm/amd/powerplay/navi10_ppt.c
index e00397f84b2f..5794f7cef1c8 100644
--- a/drivers/gpu/drm/amd/powerplay/navi10_ppt.c
+++ b/drivers/gpu/drm/amd/powerplay/navi10_ppt.c
@@ -869,6 +869,7 @@ static int navi10_get_gpu_power(struct smu_context *smu, uint32_t *value)
 }
 
 static int navi10_get_current_activity_percent(struct smu_context *smu,
+  enum amd_pp_sensors sensor,
   uint32_t *value)
 {
int ret = 0;
@@ -884,7 +885,17 @@ static int navi10_get_current_activity_percent(struct smu_context *smu,
if (ret)
return ret;
 
-   *value = metrics.AverageGfxActivity;
+   switch (sensor) {
+   case AMDGPU_PP_SENSOR_GPU_LOAD:
+   *value = metrics.AverageGfxActivity;
+   break;
+   case AMDGPU_PP_SENSOR_MEM_LOAD:
+   *value = metrics.AverageUclkActivity;
+   break;
+   default:
+   pr_err("Invalid sensor for retrieving clock activity\n");
+   return -EINVAL;
+   }
 
return 0;
 }
@@ -1260,8 +1271,9 @@ static int navi10_read_sensor(struct smu_context *smu,
*(uint32_t *)data = pptable->FanMaximumRpm;
*size = 4;
break;
+   case AMDGPU_PP_SENSOR_MEM_LOAD:
case AMDGPU_PP_SENSOR_GPU_LOAD:
-   ret = navi10_get_current_activity_percent(smu, (uint32_t *)data);
+   ret = navi10_get_current_activity_percent(smu, sensor, (uint32_t *)data);
*size = 4;
break;
case AMDGPU_PP_SENSOR_GPU_POWER:
-- 
2.20.1


RE: [PATCH] drm/amd/powerplay: increase waiting time for smu response

2019-07-03 Thread Zhang, Hawking
Reviewed-by: Hawking Zhang 

Regards,
Hawking
-----Original Message-----
From: Xiao, Jack
Sent: July 3, 2019 12:18
To: amd-gfx@lists.freedesktop.org; Deucher, Alexander; Zhang, Hawking
Cc: Xiao, Jack
Subject: [PATCH] drm/amd/powerplay: increase waiting time for smu response

We observed that some SMU commands take longer to execute, so increase the
waiting time for a response.

Signed-off-by: Jack Xiao 
---
 drivers/gpu/drm/amd/powerplay/smu_v11_0.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
index bc39690..88d3127 100644
--- a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
+++ b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
@@ -64,9 +64,9 @@ static int smu_v11_0_read_arg(struct smu_context *smu, uint32_t *arg)
 static int smu_v11_0_wait_for_response(struct smu_context *smu)
 {
struct amdgpu_device *adev = smu->adev;
-   uint32_t cur_value, i;
+   uint32_t cur_value, i, timeout = adev->usec_timeout * 10;
 
-   for (i = 0; i < adev->usec_timeout; i++) {
+   for (i = 0; i < timeout; i++) {
cur_value = RREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_90);
if ((cur_value & MP1_C2PMSG_90__CONTENT_MASK) != 0)
break;
@@ -74,7 +74,7 @@ static int smu_v11_0_wait_for_response(struct smu_context *smu)
}
 
/* timeout means wrong logic */
-   if (i == adev->usec_timeout)
+   if (i == timeout)
return -ETIME;
 
return RREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_90) == 0x1 ? 0 : -EIO;
--
1.9.1
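
The bounded polling loop that this patch adjusts can be modelled in userspace as a minimal sketch. The register reader is passed in as a callback and faked by a small test double; both are hypothetical and stand in for the RREG32_SOC15 access. The sketch also omits the content mask and whatever delay the driver inserts between polls, so it illustrates only the control flow and the tenfold iteration budget.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical register reader: returns 0 until the firmware responds. */
typedef uint32_t (*read_reg_fn)(void *ctx);

/* Poll up to usec_timeout * 10 times, mirroring the enlarged budget. */
int wait_for_response(read_reg_fn read_reg, void *ctx, uint32_t usec_timeout)
{
	uint32_t timeout = usec_timeout * 10;
	uint32_t i;

	for (i = 0; i < timeout; i++) {
		if (read_reg(ctx) != 0)
			break;
		/* a real driver would delay here between polls */
	}

	if (i == timeout)
		return -ETIME; /* no response within the budget */

	/* 0x1 is the success code; anything else is reported as an I/O error */
	return read_reg(ctx) == 0x1 ? 0 : -EIO;
}

/* Test double: reads as 0 until ready_after reads have happened. */
struct fake_reg {
	uint32_t reads;
	uint32_t ready_after;
	uint32_t response;
};

uint32_t fake_read(void *ctx)
{
	struct fake_reg *r = ctx;

	return r->reads++ >= r->ready_after ? r->response : 0;
}
```

With a base timeout of 100, a response that arrives after 150 polls would have been missed by the old loop but is caught by the enlarged one.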


Re: [PATCH 1/1] drm/amdkfd: Consistently apply noretry setting

2019-07-03 Thread Yang, Philip
The default value of amdgpu_noretry is 0, which will generate a VM fault storm
because the VM fault is not recovered. This may slow down the machine and
require a reboot after an application VM fault. Maybe change the default value to 1?

Other than that, this is reviewed by Philip Yang

On 2019-07-02 3:05 p.m., Kuehling, Felix wrote:
> Ping.
> 
> Christian, Philip, any opinion about this patch?
> 
> On 2019-06-21 8:20 p.m., Kuehling, Felix wrote:
>> Apply the same setting to SH_MEM_CONFIG and VM_CONTEXT1_CNTL. This
>> makes the noretry param no longer KFD-specific. On GFX10 I'm not
>> changing SH_MEM_CONFIG in this commit because GFX10 has different
>> retry behaviour in the SQ and I don't have a way to test it at the
>> moment.
>>
>> Suggested-by: Christian König 
>> CC: Philip Yang 
>> Signed-off-by: Felix Kuehling 
>> ---
>>drivers/gpu/drm/amd/amdgpu/amdgpu.h  |  1 +
>>drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c  | 16 +---
>>drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c|  4 
>>drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c |  3 ++-
>>drivers/gpu/drm/amd/amdgpu/gfxhub_v2_0.c |  3 ++-
>>drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c  |  3 ++-
>>drivers/gpu/drm/amd/amdgpu/mmhub_v2_0.c  |  3 ++-
>>.../drm/amd/amdkfd/kfd_device_queue_manager_v9.c |  2 +-
>>drivers/gpu/drm/amd/amdkfd/kfd_priv.h|  2 +-
>>9 files changed, 20 insertions(+), 17 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> index 9b1efdf94bdf..05875279c09e 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> @@ -164,6 +164,7 @@ extern int amdgpu_async_gfx_ring;
>>extern int amdgpu_mcbp;
>>extern int amdgpu_discovery;
>>extern int amdgpu_mes;
>> +extern int amdgpu_noretry;
>>
>>#ifdef CONFIG_DRM_AMDGPU_SI
>>extern int amdgpu_si_support;
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> index 7cf6ab07b113..0d578d95be93 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> @@ -140,6 +140,7 @@ int amdgpu_async_gfx_ring = 1;
>>int amdgpu_mcbp = 0;
>>int amdgpu_discovery = 0;
>>int amdgpu_mes = 0;
>> +int amdgpu_noretry;
>>
>>struct amdgpu_mgpu_info mgpu_info = {
>>  .mutex = __MUTEX_INITIALIZER(mgpu_info.mutex),
>> @@ -591,6 +592,10 @@ MODULE_PARM_DESC(mes,
>>  "Enable Micro Engine Scheduler (0 = disabled (default), 1 = enabled)");
>>module_param_named(mes, amdgpu_mes, int, 0444);
>>
>> +MODULE_PARM_DESC(noretry,
>> +"Disable retry faults (0 = retry enabled (default), 1 = retry disabled)");
>> +module_param_named(noretry, amdgpu_noretry, int, 0644);
>> +
>>#ifdef CONFIG_HSA_AMD
>>/**
>> * DOC: sched_policy (int)
>> @@ -666,17 +671,6 @@ module_param(ignore_crat, int, 0444);
>>MODULE_PARM_DESC(ignore_crat,
>>  "Ignore CRAT table during KFD initialization (0 = use CRAT (default), 1 = ignore CRAT)");
>>
>> -/**
>> - * DOC: noretry (int)
>> - * This parameter sets sh_mem_config.retry_disable. Default value, 0, enables retry.
>> - * Setting 1 disables retry.
>> - * Retry is needed for recoverable page faults.
>> - */
>> -int noretry;
>> -module_param(noretry, int, 0644);
>> -MODULE_PARM_DESC(noretry,
>> -"Set sh_mem_config.retry_disable on Vega10 (0 = retry enabled (default), 1 = retry disabled)");
>> -
>>/**
>> * DOC: halt_if_hws_hang (int)
>> * Halt if HWS hang is detected. Default value, 0, disables the halt on hang.
>> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>> index e0f3014e76ea..c4e715170bfe 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>> @@ -1938,11 +1938,15 @@ static void gfx_v9_0_constants_init(struct amdgpu_device *adev)
>>  if (i == 0) {
>>  tmp = REG_SET_FIELD(0, SH_MEM_CONFIG, ALIGNMENT_MODE,
>>  SH_MEM_ALIGNMENT_MODE_UNALIGNED);
>> +tmp = REG_SET_FIELD(tmp, SH_MEM_CONFIG, RETRY_DISABLE,
>> +!!amdgpu_noretry);
>>  WREG32_SOC15_RLC(GC, 0, mmSH_MEM_CONFIG, tmp);
>>  WREG32_SOC15_RLC(GC, 0, mmSH_MEM_BASES, 0);
>>  } else {
>>  tmp = REG_SET_FIELD(0, SH_MEM_CONFIG, ALIGNMENT_MODE,
>>  SH_MEM_ALIGNMENT_MODE_UNALIGNED);
>> +tmp = REG_SET_FIELD(tmp, SH_MEM_CONFIG, RETRY_DISABLE,
>> +!!amdgpu_noretry);
>>  WREG32_SOC15_RLC(GC, 0, mmSH_MEM_CONFIG, tmp);
>>  tmp = REG_SET_FIELD(0, SH_MEM_BASES, PRIVATE_BASE,
>>  (adev->gmc.private_aperture_start >> 48)
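
The quoted hunk writes !!amdgpu_noretry into a one-bit register field. A minimal sketch of why the double negation matters follows; the bit position and helper names are hypothetical (the real SH_MEM_CONFIG layout comes from the register headers), and set_field only approximates what a REG_SET_FIELD-style macro does.

```c
#include <stdint.h>

/* Hypothetical field layout, chosen only for illustration. */
#define RETRY_DISABLE_SHIFT 12
#define RETRY_DISABLE_MASK  (1u << RETRY_DISABLE_SHIFT)

/* Clear the field, then insert val shifted into place and masked. */
uint32_t set_field(uint32_t reg, uint32_t mask, unsigned int shift,
		   uint32_t val)
{
	return (reg & ~mask) | ((val << shift) & mask);
}

/* Apply a module-parameter style int to the one-bit field. */
uint32_t apply_noretry(uint32_t reg, int noretry)
{
	/* !! collapses any non-zero value to exactly 1 */
	return set_field(reg, RETRY_DISABLE_MASK, RETRY_DISABLE_SHIFT,
			 (uint32_t)!!noretry);
}
```

Without the normalization, a parameter value such as 2 would be shifted past the one-bit field and masked away, leaving retry enabled even though the user asked to disable it.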

Re: The problem "ring gfx timeout" are experienced yet another AMD GPU Vega 8 user

2019-07-03 Thread Marek Olšák
It looks like memory corruption. You can try to disable IOMMU in the BIOS.

Marek

On Tue, Jul 2, 2019 at 12:07 AM Mikhail Gavrilov <
mikhail.v.gavri...@gmail.com> wrote:

> On Wed, 27 Feb 2019 at 00:57, Marek Olšák  wrote:
> >
> > Sadly, the logs don't contain any clue as to why it hangs.
> >
> > It would be helpful to check if the hang can be reproduced on Vega 56 or 64 as well.
> >
> > Marek
> >
>
> Hi, Marek.
>
> I'm sorry to trouble you.
> But today the user of the Vega 8 graphics described above sent me fresh logs.
>
> Actual versions: kernel 5.1.15 / DRM 3.30.0 / Mesa 19.0. / LLVM 8.0.0
>
> I uploaded all logs to mega cloud storage.
> Can you look at these logs, please?
>
> https://mega.nz/#F!Mt5mhKiI!8Sv2T5a6yTxBqVknhH1NjA
>
>
> --
> Best Regards,
> Mike Gavrilov.
>

[PATCH v2 07/35] drm/amdgpu: Use kmemdup rather than duplicating its implementation

2019-07-03 Thread Fuqian Huang
kmemdup is introduced to duplicate a region of memory in a neat way.
With kmalloc/kzalloc + memcpy, the programmer has to write the size
twice, which sometimes leads to mistakes. kmemdup improves readability,
leads to smaller code and also reduces the chance of mistakes.
Use kmemdup rather than kmalloc/kzalloc + memcpy.

Reviewed-by: Emil Velikov 
Signed-off-by: Fuqian Huang 
---
Changes in v2:
  - Fix a typo in commit message (memset -> memcpy)

 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c   | 5 ++---
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c   | 5 ++---
 drivers/gpu/drm/amd/display/dc/core/dc.c| 6 ++
 drivers/gpu/drm/amd/display/dc/core/dc_stream.c | 4 +---
 4 files changed, 7 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
index 02955e6e9dd9..48e38479d634 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
@@ -3925,11 +3925,10 @@ static int gfx_v8_0_init_save_restore_list(struct amdgpu_device *adev)
 
int list_size;
unsigned int *register_list_format =
-   kmalloc(adev->gfx.rlc.reg_list_format_size_bytes, GFP_KERNEL);
+   kmemdup(adev->gfx.rlc.register_list_format,
+   adev->gfx.rlc.reg_list_format_size_bytes, GFP_KERNEL);
if (!register_list_format)
return -ENOMEM;
-   memcpy(register_list_format, adev->gfx.rlc.register_list_format,
-   adev->gfx.rlc.reg_list_format_size_bytes);
 
gfx_v8_0_parse_ind_reg_list(register_list_format,
RLC_FormatDirectRegListLength,
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index b610e3b30d95..09d901ef216d 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -2092,11 +2092,10 @@ static int gfx_v9_1_init_rlc_save_restore_list(struct amdgpu_device *adev)
u32 tmp = 0;
 
u32 *register_list_format =
-   kmalloc(adev->gfx.rlc.reg_list_format_size_bytes, GFP_KERNEL);
+   kmemdup(adev->gfx.rlc.register_list_format,
+   adev->gfx.rlc.reg_list_format_size_bytes, GFP_KERNEL);
if (!register_list_format)
return -ENOMEM;
-   memcpy(register_list_format, adev->gfx.rlc.register_list_format,
-   adev->gfx.rlc.reg_list_format_size_bytes);
 
/* setup unique_indirect_regs array and indirect_start_offsets array */
unique_indirect_reg_count = ARRAY_SIZE(unique_indirect_regs);
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/drivers/gpu/drm/amd/display/dc/core/dc.c
index 18c775a950cc..6ced3b9cdce2 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -1263,14 +1263,12 @@ struct dc_state *dc_create_state(struct dc *dc)
 struct dc_state *dc_copy_state(struct dc_state *src_ctx)
 {
int i, j;
-   struct dc_state *new_ctx = kzalloc(sizeof(struct dc_state),
-  GFP_KERNEL);
+   struct dc_state *new_ctx = kmemdup(src_ctx,
+   sizeof(struct dc_state), GFP_KERNEL);
 
if (!new_ctx)
return NULL;
 
-   memcpy(new_ctx, src_ctx, sizeof(struct dc_state));
-
for (i = 0; i < MAX_PIPES; i++) {
struct pipe_ctx *cur_pipe = &new_ctx->res_ctx.pipe_ctx[i];
 
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_stream.c b/drivers/gpu/drm/amd/display/dc/core/dc_stream.c
index 96e97d25d639..d4b563a2e220 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_stream.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_stream.c
@@ -167,12 +167,10 @@ struct dc_stream_state *dc_copy_stream(const struct dc_stream_state *stream)
 {
struct dc_stream_state *new_stream;
 
-   new_stream = kzalloc(sizeof(struct dc_stream_state), GFP_KERNEL);
+   new_stream = kmemdup(stream, sizeof(struct dc_stream_state), GFP_KERNEL);
if (!new_stream)
return NULL;
 
-   memcpy(new_stream, stream, sizeof(struct dc_stream_state));
-
if (new_stream->sink)
dc_sink_retain(new_stream->sink);
 
-- 
2.11.0
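
As a rough illustration of what the conversion buys, here is a userspace analogue of the allocate-and-copy helper. mem_dup is a hypothetical name built on malloc and memcpy; it is not the kernel's kmemdup, which additionally takes GFP flags. The point is that the size is written once at the call site instead of once for the allocation and again for the copy.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Userspace analogue of an allocate-and-copy helper: returns a fresh
 * copy of len bytes from src, or NULL if the allocation fails.
 */
void *mem_dup(const void *src, size_t len)
{
	void *p = malloc(len);

	if (p)
		memcpy(p, src, len);
	return p;
}
```

A caller writes `copy = mem_dup(src, sizeof(*src));` followed by a NULL check. Zeroing the buffer first, as kzalloc does, is redundant when every byte is overwritten by the copy immediately afterwards.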


Re: [PATCH v2 07/35] drm/amdgpu: Use kmemdup rather than duplicating its implementation

2019-07-03 Thread Koenig, Christian


On 03.07.2019 18:27, Fuqian Huang wrote:
kmemdup is introduced to duplicate a region of memory in a neat way.
With kmalloc/kzalloc + memcpy, the programmer has to write the size
twice, which sometimes leads to mistakes. kmemdup improves readability,
leads to smaller code and also reduces the chance of mistakes.
Use kmemdup rather than kmalloc/kzalloc + memcpy.

Reviewed-by: Emil Velikov 
Signed-off-by: Fuqian Huang 

Reviewed-by: Christian König 


Re: [PATCH 06/30] drm/amdgpu: Use kmemdup rather than duplicating its implementation

2019-07-03 Thread Emil Velikov
On Wed, 3 Jul 2019 at 14:15, Fuqian Huang  wrote:
>
> kmemdup is introduced to duplicate a region of memory in a neat way.
> Rather than kmalloc/kzalloc + memset, which the programmer needs to
> write the size twice (sometimes lead to mistakes), kmemdup improves
> readability, leads to smaller code and also reduce the chances of mistakes.
> Suggestion to use kmemdup rather than using kmalloc/kzalloc + memset.
>
> Signed-off-by: Fuqian Huang 
Fuqian, please add Reviewed-by and other tags when sending new revisions.

Fwiw the patch is:
Reviewed-by: Emil Velikov 

-Emil

Re: [RFC] mm/hmm: pass mmu_notifier_range to sync_cpu_device_pagetables

2019-07-03 Thread Jason Gunthorpe
On Wed, Jul 03, 2019 at 02:27:22AM +, Kuehling, Felix wrote:
> On 2019-07-02 6:59 p.m., Jason Gunthorpe wrote:
> > On Wed, Jul 03, 2019 at 12:49:12AM +0200, Christoph Hellwig wrote:
> >> On Tue, Jul 02, 2019 at 07:53:23PM +, Jason Gunthorpe wrote:
>  I'm sending this out now since we are updating many of the HMM APIs
>  and I think it will be useful.
> >>> This makes so much sense, I'd like to apply this in hmm.git, is there
> >>> any objection?
> >> As this creates a somewhat hairy conflict for amdgpu, wouldn't it be
> >> a better idea to wait a bit and apply it first thing for next merge
> >> window?
> > My thinking is that AMD GPU already has a monster conflict from this:
> >
> >   int hmm_range_register(struct hmm_range *range,
> > -  struct mm_struct *mm,
> > +  struct hmm_mirror *mirror,
> > unsigned long start,
> > unsigned long end,
> > unsigned page_shift);
> >
> > So, depending on how that is resolved we might want to do both API
> > changes at once.
> 
> I just sent out a fix for the hmm_mirror API change.

I think if you follow my suggestion to apply a prep patch to AMD GPU
to make the conflict resolution simple, we should defer this patch
until next kernel for the reasons CH gave.

> > Or we may have to revert the above change at this late date.
> >
> > Waiting for AMDGPU team to discuss what process they want to use.
> 
> Yeah, I'm wondering what the process is myself. With HMM and driver 
> development happening on different branches these kinds of API changes 
> are painful. There seems to be a built-in assumption in the current 
> process, that code flows mostly in one direction amd-staging-drm-next -> 
> drm-next -> linux-next -> linux. That assumption is broken with HMM code 
> evolving rapidly in both amdgpu and mm.

It looks to me like AMD GPU uses a pull request model. So a goal as a
tree runner should be to work with the other trees (ie hmm.git, etc)
to minimize conflicts between the PR you will send and the PR other
trees will send.

Do not focus on linux-next, that is just an 'early warning system'
that conflicts are on the horizon, we knew about this one :) (well,
mostly, I was surprised how big it was, my bad)

So we must stay in co-ordination with patches in-flight on the list
and make the right decision, depending on the situation. Communication
here is key :)

We have lots of strategies available to deal with these situations.

> If we want to continue developing HMM driver changes in
> amd-staging-drm-next, we'll need to synchronize with hmm.git more 
> frequently, both ways.

It can't really go both ways. hmm.git has to be only the hmm topic,
otherwise it doesn't really work.

> I believe part of the problem is, that there is a fairly long
> lead-time from getting changes from amd-staging-drm-next into
> linux-next, as they are held for one release cycle in drm-next.
> Pushing HMM-related changes through drm-fixes may offer a kind of
> shortcut. Philip and my latest fixup is just bypassing drm-next
> completely and going straight into linux-next, though.

I'm not familiar enough with the DRM workflow to give you advice on this.

Jason

Re: [PATCH 2/3] drm: introduce DRIVER_FORCE_AUTH

2019-07-03 Thread Emil Velikov
On Wed, 3 Jul 2019 at 15:58, Koenig, Christian wrote:
> On 03.07.2019 16:51, Emil Velikov wrote:
>
> On Wed, 3 Jul 2019 at 15:33, Koenig, Christian wrote:
> > On 03.07.2019 16:00, Emil Velikov wrote:
> >
> > On Wed, 3 Jul 2019 at 14:48, Koenig, Christian wrote:
> > >
> > > Well this is still a NAK.
> > >
> > > As stated previously please just don't remove DRM_AUTH and keep the 
> > > functionality as it is.
> > >
> > AFAICT nobody was in favour of your suggestion to remove DRM_AUTH from
> > the handle to/from fd ioctls.
> > Thus this seems like the second best option.
> >
> >
> > Well just keep it. As I said please don't change anything here.
> >
> > Dropping DRM_AUTH from the driver IOCTLs was sufficient to work around the 
> > problems at hand as far as I know.
> >
> We also need the DRM_AUTH for handle to/from fd ones. Mesa drivers use
> those ioctls.
>
>
> Yeah, but only for importing/exporting things.
>
> And in those cases we either already gave render nodes or correctly 
> authenticated primary nodes.
>
> So no need to change anything here as far as I see.
>
Not quite. When working with the primary node we have the following scenarios:
 - handle to fd -> pass fd to other APIs - gbm, opencl, vdpau, etc
 - handle to fd -> fd to handle - use it internally

-Emil

[PATCH 1/2] drm/amdgpu: add missing documentation on new module parameters

2019-07-03 Thread Alex Deucher
New parameters added for navi lack documentation.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 3913a75924c6..7941a5368fb5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -581,14 +581,27 @@ MODULE_PARM_DESC(async_gfx_ring,
"Asynchronous GFX rings that could be configured with either different priorities (HP3D ring and LP3D ring), or equal priorities (0 = disabled, 1 = enabled (default))");
 module_param_named(async_gfx_ring, amdgpu_async_gfx_ring, int, 0444);
 
+/**
+ * DOC: mcbp (int)
+ * It is used to enable mid command buffer preemption. (0 = disabled (default), 1 = enabled)
+ */
 MODULE_PARM_DESC(mcbp,
"Enable Mid-command buffer preemption (0 = disabled (default), 1 = enabled)");
 module_param_named(mcbp, amdgpu_mcbp, int, 0444);
 
+/**
+ * DOC: discovery (int)
+ * Allow driver to discover hardware IP information from IP Discovery table at the top of VRAM.
+ */
 MODULE_PARM_DESC(discovery,
"Allow driver to discover hardware IPs from IP Discovery table at the top of VRAM");
 module_param_named(discovery, amdgpu_discovery, int, 0444);
 
+/**
+ * DOC: mes (int)
+ * Enable Micro Engine Scheduler. This is a new hw scheduling engine for gfx, sdma, and compute.
+ * (0 = disabled (default), 1 = enabled)
+ */
 MODULE_PARM_DESC(mes,
"Enable Micro Engine Scheduler (0 = disabled (default), 1 = enabled)");
 module_param_named(mes, amdgpu_mes, int, 0444);
-- 
2.20.1


[PATCH 2/2] drm/amdgpu: enable IP discovery by default on navi

2019-07-03 Thread Alex Deucher
Use the IP discovery table rather than hardcoding the
settings in the driver.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 7941a5368fb5..6f7772eeeb78 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -140,7 +140,7 @@ uint amdgpu_smu_memory_pool_size = 0;
 uint amdgpu_dc_feature_mask = 0;
 int amdgpu_async_gfx_ring = 1;
 int amdgpu_mcbp = 0;
-int amdgpu_discovery = 0;
+int amdgpu_discovery = -1;
 int amdgpu_mes = 0;
 
 struct amdgpu_mgpu_info mgpu_info = {
@@ -592,6 +592,7 @@ module_param_named(mcbp, amdgpu_mcbp, int, 0444);
 /**
  * DOC: discovery (int)
  * Allow driver to discover hardware IP information from IP Discovery table at the top of VRAM.
+ * (-1 = auto (default), 0 = disabled, 1 = enabled)
  */
 MODULE_PARM_DESC(discovery,
"Allow driver to discover hardware IPs from IP Discovery table at the top of VRAM");
-- 
2.20.1
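
The patch only changes the default to -1 and documents the three values; the code that resolves "auto" is not part of this diff. A minimal sketch of how such a tri-state parameter is commonly resolved is shown below. The function name and the is_navi flag are hypothetical, and the rule that "auto" means "enabled on navi only" is an assumption drawn from the subject line.

```c
#include <stdbool.h>

/*
 * Resolve a tri-state module parameter:
 *   -1 = auto (decide per ASIC), 0 = disabled, 1 = enabled.
 * is_navi stands in for whatever ASIC check the driver performs.
 */
bool use_ip_discovery(int param, bool is_navi)
{
	if (param < 0)
		return is_navi; /* auto: enabled only where it is the default */

	return param != 0; /* an explicit user setting wins */
}
```

Keeping "auto" distinct from 0 and 1 lets a user force discovery on or off on any ASIC while the default still tracks the hardware.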


Re: [PATCH 2/3] drm: introduce DRIVER_FORCE_AUTH

2019-07-03 Thread Koenig, Christian


On 03.07.2019 16:51, Emil Velikov wrote:
On Wed, 3 Jul 2019 at 15:33, Koenig, Christian wrote:
> On 03.07.2019 16:00, Emil Velikov wrote:
>
> On Wed, 3 Jul 2019 at 14:48, Koenig, Christian wrote:
> >
> > Well this is still a NAK.
> >
> > As stated previously please just don't remove DRM_AUTH and keep the 
> > functionality as it is.
> >
> AFAICT nobody was in favour of your suggestion to remove DRM_AUTH from
> the handle to/from fd ioctls.
> Thus this seems like the second best option.
>
>
> Well just keep it. As I said please don't change anything here.
>
> Dropping DRM_AUTH from the driver IOCTLs was sufficient to work around the 
> problems at hand as far as I know.
>
We also need the DRM_AUTH for handle to/from fd ones. Mesa drivers use
those ioctls.

Yeah, but only for importing/exporting things.

And in those cases we either already gave render nodes or correctly 
authenticated primary nodes.

So no need to change anything here as far as I see.

I simply want to prevent userspace from getting the same functionality from the 
primary node that it gets from the render node. And that actually seems to be a 
good way to keep the restriction and still work around the userspace problems.

Christian.



-Emil


Re: [PATCH 2/3] drm: introduce DRIVER_FORCE_AUTH

2019-07-03 Thread Emil Velikov
On Wed, 3 Jul 2019 at 15:33, Koenig, Christian wrote:
> On 03.07.2019 16:00, Emil Velikov wrote:
>
> On Wed, 3 Jul 2019 at 14:48, Koenig, Christian wrote:
> >
> > Well this is still a NAK.
> >
> > As stated previously please just don't remove DRM_AUTH and keep the 
> > functionality as it is.
> >
> AFAICT nobody was in favour of your suggestion to remove DRM_AUTH from
> the handle to/from fd ioctls.
> Thus this seems like the second best option.
>
>
> Well just keep it. As I said please don't change anything here.
>
> Dropping DRM_AUTH from the driver IOCTLs was sufficient to work around the 
> problems at hand as far as I know.
>
We also need the DRM_AUTH for handle to/from fd ones. Mesa drivers use
those ioctls.

-Emil

Re: [PATCH 2/3] drm: introduce DRIVER_FORCE_AUTH

2019-07-03 Thread Koenig, Christian


On 03.07.2019 16:00, Emil Velikov wrote:
On Wed, 3 Jul 2019 at 14:48, Koenig, Christian  wrote:
>
> Well this is still a NAK.
>
> As stated previously please just don't remove DRM_AUTH and keep the 
> functionality as it is.
>
AFAICT nobody was in favour of your suggestion to remove DRM_AUTH from
the handle to/from fd ioctls.
Thus this seems like the second best option.

Well just keep it. As I said please don't change anything here.

Dropping DRM_AUTH from the driver IOCTLs was sufficient to work around the 
problems at hand as far as I know.

And stopping those two at least prevents userspace from abusing this even more.

On the other hand I haven't seen any NAK on dropping DRM_AUTH from them.

Christian.



Third route that I see is doing driver_name == "amdgpu" || driver_name
== "radeon" in core.
If you have alternative solution I'm all ears.

-Emil


Re: [PATCH 2/3] drm: introduce DRIVER_FORCE_AUTH

2019-07-03 Thread Emil Velikov
On Wed, 3 Jul 2019 at 14:48, Koenig, Christian  wrote:
>
> Well this is still a NAK.
>
> As stated previously please just don't remove DRM_AUTH and keep the 
> functionality as it is.
>
AFAICT nobody was in favour of your suggestion to remove DRM_AUTH from
the handle to/from fd ioctls.
Thus this seems like the second best option.

Third route that I see is doing driver_name == "amdgpu" || driver_name
== "radeon" in core.
If you have an alternative solution I'm all ears.

-Emil

Re: [PATCH 2/3] drm: introduce DRIVER_FORCE_AUTH

2019-07-03 Thread Koenig, Christian
Well this is still a NAK.

As stated previously please just don't remove DRM_AUTH and keep the 
functionality as it is.

I absolutely don't see the point of adding a new flag to remove the same
functionality a different flag provides.

Christian.

On 03.07.2019 15:30, Emil Velikov wrote:
From: Emil Velikov 

With earlier commits we've removed DRM_AUTH for driver ioctls annotated
with DRM_AUTH | DRM_RENDER_ALLOW, as the protection it introduces is
effectively nonexistent.

With the next commit, we'll effectively do the same for DRM core.

Yet the AMD developers have voiced concerns that by doing so, developers
working on the closed source user-space driver might remove render node
support.

Since we do _not_ want that to happen, add a workaround for those two
drivers.

Cc: Alex Deucher 
Cc: Christian König 
Cc: amd-gfx@lists.freedesktop.org
Cc: Daniel Vetter 
Signed-off-by: Emil Velikov 
---
Christian, Alex this is the cleaner way to handle AMDGPU/radeon although
if you prefer alternative methods let me know.

Review, acks and others are appreciated, since I'd like to get this
through the drm-misc tree.

Thanks
Emil

Unrelated:
The USE_AGP flag in AMDGPU should be nuked. While for radeon, one can
copy in the driver the 10-20 lines worth of agp_init/release and also
drop the flag.

Bonus points if the agp_init code gets a LEGACY check alongside the USE_AGP
one.
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c |  2 +-
 drivers/gpu/drm/radeon/radeon_drv.c |  2 +-
 include/drm/drm_drv.h   | 10 ++
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 8e1b269351e8..cfc2ef11330c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1307,7 +1307,7 @@ amdgpu_get_crtc_scanout_position(struct drm_device *dev, 
unsigned int pipe,

 static struct drm_driver kms_driver = {
 .driver_features =
-   DRIVER_USE_AGP | DRIVER_ATOMIC |
+   DRIVER_USE_AGP | DRIVER_ATOMIC | DRIVER_FORCE_AUTH |
 DRIVER_GEM |
 DRIVER_RENDER | DRIVER_MODESET | DRIVER_SYNCOBJ,
 .load = amdgpu_driver_load_kms,
diff --git a/drivers/gpu/drm/radeon/radeon_drv.c 
b/drivers/gpu/drm/radeon/radeon_drv.c
index 4403e76e1ae0..5a1bfad1ad5e 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -538,7 +538,7 @@ radeon_get_crtc_scanout_position(struct drm_device *dev, 
unsigned int pipe,

 static struct drm_driver kms_driver = {
 .driver_features =
-   DRIVER_USE_AGP | DRIVER_GEM | DRIVER_RENDER,
+   DRIVER_USE_AGP | DRIVER_GEM | DRIVER_RENDER | DRIVER_FORCE_AUTH,
 .load = radeon_driver_load_kms,
 .open = radeon_driver_open_kms,
 .postclose = radeon_driver_postclose_kms,
diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
index b33f2cee2099..5fb2846396bc 100644
--- a/include/drm/drm_drv.h
+++ b/include/drm/drm_drv.h
@@ -92,6 +92,16 @@ enum drm_driver_feature {
  * synchronization of command submission.
  */
 DRIVER_SYNCOBJ_TIMELINE = BIT(6),
+   /**
+* @DRIVER_FORCE_AUTH:
+*
+* Driver mandates that DRM_AUTH is honoured, even if the same ioctl
+* is exposed via the render node - aka any sort of "authentication" is
+* a fallacy.
+*
+* Used only by amdgpu and radeon. Do not use.
+*/
+   DRIVER_FORCE_AUTH   = BIT(7),

 /* IMPORTANT: Below are all the legacy flags, add new ones above. */

--
2.21.0


[PATCH 2/3] drm: introduce DRIVER_FORCE_AUTH

2019-07-03 Thread Emil Velikov
From: Emil Velikov 

With earlier commits we've removed DRM_AUTH for driver ioctls annotated
with DRM_AUTH | DRM_RENDER_ALLOW, as the protection it introduces is
effectively nonexistent.

With the next commit, we'll effectively do the same for DRM core.

Yet the AMD developers have voiced concerns that by doing so, developers
working on the closed source user-space driver might remove render node
support.

Since we do _not_ want that to happen, add a workaround for those two
drivers.

Cc: Alex Deucher 
Cc: Christian König 
Cc: amd-gfx@lists.freedesktop.org
Cc: Daniel Vetter 
Signed-off-by: Emil Velikov 
---
Christian, Alex this is the cleaner way to handle AMDGPU/radeon although
if you prefer alternative methods let me know.

Review, acks and others are appreciated, since I'd like to get this
through the drm-misc tree.

Thanks
Emil

Unrelated:
The USE_AGP flag in AMDGPU should be nuked. While for radeon, one can
copy in the driver the 10-20 lines worth of agp_init/release and also
drop the flag.

Bonus points if the agp_init code gets a LEGACY check alongside the USE_AGP
one.
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c |  2 +-
 drivers/gpu/drm/radeon/radeon_drv.c |  2 +-
 include/drm/drm_drv.h   | 10 ++
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 8e1b269351e8..cfc2ef11330c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1307,7 +1307,7 @@ amdgpu_get_crtc_scanout_position(struct drm_device *dev, 
unsigned int pipe,
 
 static struct drm_driver kms_driver = {
.driver_features =
-   DRIVER_USE_AGP | DRIVER_ATOMIC |
+   DRIVER_USE_AGP | DRIVER_ATOMIC | DRIVER_FORCE_AUTH |
DRIVER_GEM |
DRIVER_RENDER | DRIVER_MODESET | DRIVER_SYNCOBJ,
.load = amdgpu_driver_load_kms,
diff --git a/drivers/gpu/drm/radeon/radeon_drv.c 
b/drivers/gpu/drm/radeon/radeon_drv.c
index 4403e76e1ae0..5a1bfad1ad5e 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -538,7 +538,7 @@ radeon_get_crtc_scanout_position(struct drm_device *dev, 
unsigned int pipe,
 
 static struct drm_driver kms_driver = {
.driver_features =
-   DRIVER_USE_AGP | DRIVER_GEM | DRIVER_RENDER,
+   DRIVER_USE_AGP | DRIVER_GEM | DRIVER_RENDER | DRIVER_FORCE_AUTH,
.load = radeon_driver_load_kms,
.open = radeon_driver_open_kms,
.postclose = radeon_driver_postclose_kms,
diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
index b33f2cee2099..5fb2846396bc 100644
--- a/include/drm/drm_drv.h
+++ b/include/drm/drm_drv.h
@@ -92,6 +92,16 @@ enum drm_driver_feature {
 * synchronization of command submission.
 */
DRIVER_SYNCOBJ_TIMELINE = BIT(6),
+   /**
+* @DRIVER_FORCE_AUTH:
+*
+* Driver mandates that DRM_AUTH is honoured, even if the same ioctl
+* is exposed via the render node - aka any sort of "authentication" is
+* a fallacy.
+*
+* Used only by amdgpu and radeon. Do not use.
+*/
+   DRIVER_FORCE_AUTH   = BIT(7),
 
/* IMPORTANT: Below are all the legacy flags, add new ones above. */
 
-- 
2.21.0
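
The doc comment in the patch above describes what DRIVER_FORCE_AUTH is meant to do, but the core-side check lives in a later patch of the series that is not shown here. As a rough, user-space sketch of the described semantics only: the helper name `ioctl_requires_auth` is hypothetical, and the values chosen for `DRM_AUTH` and `DRM_RENDER_ALLOW` are placeholders for illustration; only `DRIVER_FORCE_AUTH = BIT(7)` is taken from the patch.

```c
#include <stdbool.h>

#define BIT(n) (1u << (n))

/* DRIVER_FORCE_AUTH = BIT(7) comes from the patch above. The two ioctl
 * flag values are placeholders picked for this sketch. */
#define DRIVER_FORCE_AUTH BIT(7)
#define DRM_AUTH          BIT(0)
#define DRM_RENDER_ALLOW  BIT(5)

/* Hypothetical helper, not a function from the DRM core. */
static bool ioctl_requires_auth(unsigned int driver_features,
                                unsigned int ioctl_flags)
{
	/* Ioctls that were never marked DRM_AUTH need no authentication. */
	if (!(ioctl_flags & DRM_AUTH))
		return false;

	/* The ioctl is also reachable through the render node, so DRM_AUTH
	 * is only honoured when the driver explicitly asks for it. */
	if (ioctl_flags & DRM_RENDER_ALLOW)
		return (driver_features & DRIVER_FORCE_AUTH) != 0;

	/* Primary-node-only ioctl: DRM_AUTH applies as usual. */
	return true;
}
```

Under these assumptions, an ioctl flagged `DRM_AUTH | DRM_RENDER_ALLOW` would skip authentication for most drivers, while amdgpu and radeon, which set DRIVER_FORCE_AUTH, would keep it.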


Re: [pull] amdgpu drm-fixes-5.2

2019-07-03 Thread Daniel Vetter
On Wed, Jul 3, 2019 at 3:10 PM Alex Deucher  wrote:
>
> On Wed, Jul 3, 2019 at 5:54 AM Daniel Vetter  wrote:
> >
> > On Tue, Jul 02, 2019 at 08:57:05PM -0500, Alex Deucher wrote:
> > > Hi Dave, Daniel,
> > >
> > > 3 fixes all cc'ed to stable.  Note that dim complains about the Fixes tag
> > > in one of the patches.  The patch has:
> > > Fixes: 921935dc6404 ("drm/amd/powerplay: enforce display related settings 
> > > only on needed")
> > > while dim recommends:
> >
> > This is the right format.
> >
> > > Fixes: commit 921935dc6404 ("drm/amd/powerplay: enforce display related 
> > > settings only on needed")
> >
> > Adding a "commit" is wrong. And at least my dim here doesn't complain
> > about your pull. How did you test this?
>
> dim checkpatch 665d6d4e32313a7952bb3339647f74c3a6b0d78a

Hah, another one fell into the trap :-/

dim checkpatch is just a wrapper around scripts/checkpatch.pl, it's
not what's used when you push to a dim managed branch, or what we use
when processing a pull request.

> -:8: ERROR:GIT_COMMIT_ID: Please use git commit description style
> 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit
> 921935dc6404 ("drm/amd/powerplay: enforce display related settings
> only on needed")'
> #8:
> 921935dc6404 ("drm/amd/powerplay: enforce display related settings
> only on needed")

Because checkpatch is garbage. The real dim checks only get used for
dim push and dim apply-pull. There was a half-baked patch somewhere to
integrate that into dim checkpatch, but that still leaves the problem
that checkpatch.pl is useless. Other option is if you use dim push to
push to the drm-amd.git tree (it's still all set up from years back
when at least Harry seemed somewhat enthusiastic about
group-maintaining amd.git outside of the amd firewall instead of
inside).




> Alex
>
> > -Daniel
> >
> > > I feel like the former is the more common nomenclature (at least 
> > > historically),
> > > but I'm happy to respin if you'd prefer.
> > >
> > > The following changes since commit 
> > > 665d6d4e32313a7952bb3339647f74c3a6b0d78a:
> > >
> > >   Merge tag 'drm-misc-fixes-2019-06-26' of 
> > > git://anongit.freedesktop.org/drm/drm-misc into drm-fixes (2019-06-27 
> > > 11:34:52 +1000)
> > >
> > > are available in the Git repository at:
> > >
> > >   git://people.freedesktop.org/~agd5f/linux tags/drm-fixes-5.2-2019-07-02
> > >
> > > for you to fetch changes up to 25f09f858835b0e9a06213811031190a17d8ab78:
> > >
> > >   drm/amdgpu/gfx9: use reset default for PA_SC_FIFO_SIZE (2019-07-01 
> > > 12:16:26 -0500)
> > >
> > > 
> > > drm-fixes-5.2-2019-07-02:
> > >
> > > Fixes for stable
> > >
> > > amdgpu:
> > > - stability fix for gfx9
> > > - regression fix for HG on some polaris boards
> > > - crash fix for some new OEM boards
> > >
> > > 
> > > Alex Deucher (1):
> > >   drm/amdgpu/gfx9: use reset default for PA_SC_FIFO_SIZE
> > >
> > > Evan Quan (1):
> > >   drm/amd/powerplay: use hardware fan control if no powerplay fan 
> > > table
> > >
> > > Lyude Paul (1):
> > >   drm/amdgpu: Don't skip display settings in hwmgr_resume()
> > >
> > >  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 19 
> > > ---
> > >  drivers/gpu/drm/amd/powerplay/hwmgr/hwmgr.c   |  2 +-
> > >  .../drm/amd/powerplay/hwmgr/process_pptables_v1_0.c   |  4 +++-
> > >  drivers/gpu/drm/amd/powerplay/inc/hwmgr.h |  1 +
> > >  .../gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.c   |  4 
> > >  5 files changed, 9 insertions(+), 21 deletions(-)
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
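
The two tag styles debated in this thread differ only in the extra "commit" word. A minimal sketch of a checker for the form Daniel calls correct, `Fixes: <hash> ("<title>")`, might look like the following. The function name `fixes_tag_ok` is hypothetical, and the 12-character minimum for the abbreviated hash is taken from the checkpatch message quoted above; this is not how dim or checkpatch.pl actually implement their checks.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical checker: accepts 'Fixes: <12+ hex chars> ("<title>")'. */
static bool fixes_tag_ok(const char *line)
{
	static const char prefix[] = "Fixes: ";
	size_t n = 0, len;

	if (strncmp(line, prefix, sizeof(prefix) - 1) != 0)
		return false;
	line += sizeof(prefix) - 1;

	/* Count the abbreviated hash. A leading "commit " is rejected here
	 * because it is not a run of at least 12 hex digits. */
	while (isxdigit((unsigned char)line[n]))
		n++;
	if (n < 12)
		return false;
	line += n;

	/* Expect a space, an opening (", a non-empty title and a closing ")
	 * at the very end of the line. */
	len = strlen(line);
	return len > 5 && strncmp(line, " (\"", 3) == 0 &&
	       strcmp(line + len - 2, "\")") == 0;
}
```

The tag from Alex's pull request passes this check as written, while the variant with the "commit" prefix that checkpatch suggested does not.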

Which changes in graphic stack could cause eye strain problems

2019-07-03 Thread Михаил Богданов
I've been using Ubuntu 14.4.0 (with updates) for many years and can't switch
to any newer Linux distribution.
Whenever I try to upgrade to anything newer, I get problems with my eyes and
head: the discomfort starts in a muscle on the right side of my head, then
moves to my eyes (irritation, burning, discomfort and so on), and then I feel
weakness in my whole body.
My hardware configuration stays the same across the upgrade attempts, and I
get similar feelings regardless of the driver used (open source or
proprietary). The symptoms don't depend on the graphics card used: AMD/ATI,
Nvidia or Intel.
Their severity depends only on the distribution: the worst on Ubuntu 19.04 and
milder on Mint 19.1.

I think that some changes relevant to my story were added to the graphics
stack around 2014, or maybe even before 2014, and the default behaviour was
switched to them around 2014.
There are several reports of the same problem with Intel graphics from 2014:
https://lists.freedesktop.org/archives/intel-gfx/2014-January/038104.html
https://lists.freedesktop.org/archives/intel-gfx/2014-March/042689.html
In fact, I can't use any display with an LED backlight and am still using one
with CCFL. Maybe these changes are related to LED display support.

There is also a report on the ledstrain site:
https://ledstrain.org/d/384-linux-users-any-known-good-distro-de/15

Could you suggest which changes could be related to the described problem?
Could it be some hidden dithering in the graphics stack, or something else?
My attempts to play with different configuration options (including disabling
dithering in the graphics card driver) haven't given any result.

Best regards,
Mike






[PATCH 06/30] drm/amdgpu: Use kmemdup rather than duplicating its implementation

2019-07-03 Thread Fuqian Huang
kmemdup was introduced to duplicate a region of memory in a neat way.
With kmalloc/kzalloc + memcpy, the programmer needs to write the size
twice (which sometimes leads to mistakes); kmemdup improves readability,
leads to smaller code and also reduces the chance of mistakes.
Use kmemdup rather than kmalloc/kzalloc + memcpy.

Signed-off-by: Fuqian Huang 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c   | 5 ++---
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c   | 5 ++---
 drivers/gpu/drm/amd/display/dc/core/dc.c| 6 ++
 drivers/gpu/drm/amd/display/dc/core/dc_stream.c | 4 +---
 4 files changed, 7 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
index 02955e6e9dd9..48e38479d634 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
@@ -3925,11 +3925,10 @@ static int gfx_v8_0_init_save_restore_list(struct 
amdgpu_device *adev)
 
int list_size;
unsigned int *register_list_format =
-   kmalloc(adev->gfx.rlc.reg_list_format_size_bytes, GFP_KERNEL);
+   kmemdup(adev->gfx.rlc.register_list_format,
+   adev->gfx.rlc.reg_list_format_size_bytes, GFP_KERNEL);
if (!register_list_format)
return -ENOMEM;
-   memcpy(register_list_format, adev->gfx.rlc.register_list_format,
-   adev->gfx.rlc.reg_list_format_size_bytes);
 
gfx_v8_0_parse_ind_reg_list(register_list_format,
RLC_FormatDirectRegListLength,
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index b610e3b30d95..09d901ef216d 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -2092,11 +2092,10 @@ static int gfx_v9_1_init_rlc_save_restore_list(struct 
amdgpu_device *adev)
u32 tmp = 0;
 
u32 *register_list_format =
-   kmalloc(adev->gfx.rlc.reg_list_format_size_bytes, GFP_KERNEL);
+   kmemdup(adev->gfx.rlc.register_list_format,
+   adev->gfx.rlc.reg_list_format_size_bytes, GFP_KERNEL);
if (!register_list_format)
return -ENOMEM;
-   memcpy(register_list_format, adev->gfx.rlc.register_list_format,
-   adev->gfx.rlc.reg_list_format_size_bytes);
 
/* setup unique_indirect_regs array and indirect_start_offsets array */
unique_indirect_reg_count = ARRAY_SIZE(unique_indirect_regs);
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index 18c775a950cc..6ced3b9cdce2 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -1263,14 +1263,12 @@ struct dc_state *dc_create_state(struct dc *dc)
 struct dc_state *dc_copy_state(struct dc_state *src_ctx)
 {
int i, j;
-   struct dc_state *new_ctx = kzalloc(sizeof(struct dc_state),
-  GFP_KERNEL);
+   struct dc_state *new_ctx = kmemdup(src_ctx,
+   sizeof(struct dc_state), GFP_KERNEL);
 
if (!new_ctx)
return NULL;
 
-   memcpy(new_ctx, src_ctx, sizeof(struct dc_state));
-
for (i = 0; i < MAX_PIPES; i++) {
struct pipe_ctx *cur_pipe = 
&new_ctx->res_ctx.pipe_ctx[i];
 
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_stream.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_stream.c
index 96e97d25d639..d4b563a2e220 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_stream.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_stream.c
@@ -167,12 +167,10 @@ struct dc_stream_state *dc_copy_stream(const struct 
dc_stream_state *stream)
 {
struct dc_stream_state *new_stream;
 
-   new_stream = kzalloc(sizeof(struct dc_stream_state), GFP_KERNEL);
+   new_stream = kmemdup(stream, sizeof(struct dc_stream_state), GFP_KERNEL);
if (!new_stream)
return NULL;
 
-   memcpy(new_stream, stream, sizeof(struct dc_stream_state));
-
if (new_stream->sink)
dc_sink_retain(new_stream->sink);
 
-- 
2.11.0
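
To illustrate the pattern this patch replaces, here is a small user-space sketch. It assumes a stand-in `memdup_sketch()` built on malloc(), since the kernel's kmemdup() and its GFP flags are not available outside the kernel; `struct state_sketch` and both copy functions are hypothetical names made up for the example, not code from the driver.

```c
#include <stdlib.h>
#include <string.h>

/* User-space stand-in for the kernel's kmemdup(). The gfp argument is
 * dropped because malloc() has no equivalent. */
static void *memdup_sketch(const void *src, size_t len)
{
	void *dst = malloc(len);

	if (dst)
		memcpy(dst, src, len);
	return dst;
}

/* Hypothetical structure standing in for something like dc_state. */
struct state_sketch {
	int pipes[4];
	int refcount;
};

/* Before: allocate, check, then copy. The size is written twice. */
static struct state_sketch *copy_state_open_coded(const struct state_sketch *src)
{
	struct state_sketch *dst = malloc(sizeof(*dst));

	if (!dst)
		return NULL;
	memcpy(dst, src, sizeof(*dst));
	return dst;
}

/* After: one call, and the size is written once. */
static struct state_sketch *copy_state_memdup(const struct state_sketch *src)
{
	return memdup_sketch(src, sizeof(*src));
}
```

Both functions return an independent copy with the same contents, or NULL on allocation failure. The second form leaves no room for the allocation size and the copy size to drift apart.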



Re: [pull] amdgpu drm-fixes-5.2

2019-07-03 Thread Alex Deucher
On Wed, Jul 3, 2019 at 5:54 AM Daniel Vetter  wrote:
>
> On Tue, Jul 02, 2019 at 08:57:05PM -0500, Alex Deucher wrote:
> > Hi Dave, Daniel,
> >
> > 3 fixes all cc'ed to stable.  Note that dim complains about the Fixes tag
> > in one of the patches.  The patch has:
> > Fixes: 921935dc6404 ("drm/amd/powerplay: enforce display related settings 
> > only on needed")
> > while dim recommends:
>
> This is the right format.
>
> > Fixes: commit 921935dc6404 ("drm/amd/powerplay: enforce display related 
> > settings only on needed")
>
> Adding a "commit" is wrong. And at least my dim here doesn't complain
> about your pull. How did you test this?

dim checkpatch 665d6d4e32313a7952bb3339647f74c3a6b0d78a

-:8: ERROR:GIT_COMMIT_ID: Please use git commit description style
'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit
921935dc6404 ("drm/amd/powerplay: enforce display related settings
only on needed")'
#8:
921935dc6404 ("drm/amd/powerplay: enforce display related settings
only on needed")

Alex

> -Daniel
>
> > I feel like the former is the more common nomenclature (at least 
> > historically),
> > but I'm happy to respin if you'd prefer.
> >
> > The following changes since commit 665d6d4e32313a7952bb3339647f74c3a6b0d78a:
> >
> >   Merge tag 'drm-misc-fixes-2019-06-26' of 
> > git://anongit.freedesktop.org/drm/drm-misc into drm-fixes (2019-06-27 
> > 11:34:52 +1000)
> >
> > are available in the Git repository at:
> >
> >   git://people.freedesktop.org/~agd5f/linux tags/drm-fixes-5.2-2019-07-02
> >
> > for you to fetch changes up to 25f09f858835b0e9a06213811031190a17d8ab78:
> >
> >   drm/amdgpu/gfx9: use reset default for PA_SC_FIFO_SIZE (2019-07-01 
> > 12:16:26 -0500)
> >
> > 
> > drm-fixes-5.2-2019-07-02:
> >
> > Fixes for stable
> >
> > amdgpu:
> > - stability fix for gfx9
> > - regression fix for HG on some polaris boards
> > - crash fix for some new OEM boards
> >
> > 
> > Alex Deucher (1):
> >   drm/amdgpu/gfx9: use reset default for PA_SC_FIFO_SIZE
> >
> > Evan Quan (1):
> >   drm/amd/powerplay: use hardware fan control if no powerplay fan table
> >
> > Lyude Paul (1):
> >   drm/amdgpu: Don't skip display settings in hwmgr_resume()
> >
> >  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 19 
> > ---
> >  drivers/gpu/drm/amd/powerplay/hwmgr/hwmgr.c   |  2 +-
> >  .../drm/amd/powerplay/hwmgr/process_pptables_v1_0.c   |  4 +++-
> >  drivers/gpu/drm/amd/powerplay/inc/hwmgr.h |  1 +
> >  .../gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.c   |  4 
> >  5 files changed, 9 insertions(+), 21 deletions(-)
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

Re: [pull] amdgpu drm-fixes-5.2

2019-07-03 Thread Daniel Vetter
On Tue, Jul 02, 2019 at 08:57:05PM -0500, Alex Deucher wrote:
> Hi Dave, Daniel,
> 
> 3 fixes all cc'ed to stable.  Note that dim complains about the Fixes tag
> in one of the patches.  The patch has:
> Fixes: 921935dc6404 ("drm/amd/powerplay: enforce display related settings 
> only on needed")
> while dim recommends:

This is the right format.

> Fixes: commit 921935dc6404 ("drm/amd/powerplay: enforce display related 
> settings only on needed")

Adding a "commit" is wrong. And at least my dim here doesn't complain
about your pull. How did you test this?
-Daniel

> I feel like the former is the more common nomenclature (at least 
> historically),
> but I'm happy to respin if you'd prefer.
> 
> The following changes since commit 665d6d4e32313a7952bb3339647f74c3a6b0d78a:
> 
>   Merge tag 'drm-misc-fixes-2019-06-26' of 
> git://anongit.freedesktop.org/drm/drm-misc into drm-fixes (2019-06-27 
> 11:34:52 +1000)
> 
> are available in the Git repository at:
> 
>   git://people.freedesktop.org/~agd5f/linux tags/drm-fixes-5.2-2019-07-02
> 
> for you to fetch changes up to 25f09f858835b0e9a06213811031190a17d8ab78:
> 
>   drm/amdgpu/gfx9: use reset default for PA_SC_FIFO_SIZE (2019-07-01 12:16:26 
> -0500)
> 
> 
> drm-fixes-5.2-2019-07-02:
> 
> Fixes for stable
> 
> amdgpu:
> - stability fix for gfx9
> - regression fix for HG on some polaris boards
> - crash fix for some new OEM boards
> 
> 
> Alex Deucher (1):
>   drm/amdgpu/gfx9: use reset default for PA_SC_FIFO_SIZE
> 
> Evan Quan (1):
>   drm/amd/powerplay: use hardware fan control if no powerplay fan table
> 
> Lyude Paul (1):
>   drm/amdgpu: Don't skip display settings in hwmgr_resume()
> 
>  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 19 
> ---
>  drivers/gpu/drm/amd/powerplay/hwmgr/hwmgr.c   |  2 +-
>  .../drm/amd/powerplay/hwmgr/process_pptables_v1_0.c   |  4 +++-
>  drivers/gpu/drm/amd/powerplay/inc/hwmgr.h |  1 +
>  .../gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.c   |  4 
>  5 files changed, 9 insertions(+), 21 deletions(-)

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch