Re: [PATCH] amd/display/debugfs: add sysfs entry to read PSR residency from firmware

2023-03-09 Thread S, Shirish



On 3/10/2023 12:00 PM, S, Shirish wrote:


On 3/8/2023 11:52 PM, Hamza Mahfooz wrote:


On 3/8/23 02:10, Shirish S wrote:

[Why]
Currently there aren't any methods to determine PSR state residency.

[How]
create a sysfs entry for reading residency and internally hook it up
to existing functionality of reading PSR residency from firmware.

Signed-off-by: Shirish S 
---
  .../amd/display/amdgpu_dm/amdgpu_dm_debugfs.c | 19 +++

  1 file changed, 19 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c

index abf7895d1608..8ff2802db5b5 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
@@ -27,6 +27,7 @@
  #include 
    #include "dc.h"
+#include "dc_link.h"


Please drop this include, the relevant function should already be
accessible from dc.h.


Good catch. Removed and respun the patch 
(https://patchwork.freedesktop.org/patch/526211/)


Pls review.

Regards,

Shirish S


Well, the code structure has changed now, since commit 0078c924e733 
("drm/amd/display: move eDP panel control logic to link_edp_panel_control").


Now "dc.h" no longer includes "link.h".





  #include "amdgpu.h"
  #include "amdgpu_dm.h"
  #include "amdgpu_dm_debugfs.h"
@@ -2793,6 +2794,22 @@ static int psr_get(void *data, u64 *val)
  return 0;
  }
  +/*
+ *  Read PSR state residency
+ */
+static int psr_read_residency(void *data, u64 *val)
+{
+    struct amdgpu_dm_connector *connector = data;
+    struct dc_link *link = connector->dc_link;
+    u32 residency;
+
+    dc_link_get_psr_residency(link, &residency);


Did you mean to use link_get_psr_residency() here?


Yes, my code was a bit older, have incorporated final changes in new 
patch: https://patchwork.freedesktop.org/patch/526213/


Pls review.

Regards,

Shirish S




+
+    *val = (u64)residency;
+
+    return 0;
+}
+
  /*
   * Set dmcub trace event IRQ enable or disable.
   * Usage to enable dmcub trace event IRQ: echo 1 > /sys/kernel/debug/dri/0/amdgpu_dm_dmcub_trace_event_en
@@ -2828,6 +2845,7 @@ DEFINE_DEBUGFS_ATTRIBUTE(dmcub_trace_event_state_fops, dmcub_trace_event_state_g
    dmcub_trace_event_state_set, "%llu\n");
  DEFINE_DEBUGFS_ATTRIBUTE(psr_fops, psr_get, NULL, "%llu\n");
+DEFINE_DEBUGFS_ATTRIBUTE(psr_residency_fops, psr_read_residency, NULL, "%llu\n");

    DEFINE_SHOW_ATTRIBUTE(current_backlight);
  DEFINE_SHOW_ATTRIBUTE(target_backlight);
@@ -2991,6 +3009,7 @@ void connector_debugfs_init(struct amdgpu_dm_connector *connector)

  if (connector->base.connector_type == DRM_MODE_CONNECTOR_eDP) {
  debugfs_create_file_unsafe("psr_capability", 0444, dir, connector, &psr_capability_fops);
  debugfs_create_file_unsafe("psr_state", 0444, dir, connector, &psr_fops);
+    debugfs_create_file_unsafe("psr_residency", 0444, dir, connector, &psr_residency_fops);
  debugfs_create_file("amdgpu_current_backlight_pwm", 0444, dir, connector,
      &current_backlight_fops);
  debugfs_create_file("amdgpu_target_backlight_pwm", 0444, dir, connector,






Re: [PATCH] drm/amd/display: disable psr whenever applicable

2022-10-06 Thread S, Shirish



On 10/6/2022 10:51 PM, Leo Li wrote:




On 2022-10-06 03:46, S, Shirish wrote:


On 10/6/2022 4:33 AM, Leo Li wrote:



On 2022-10-03 11:26, S, Shirish wrote:

Ping!

Regards,

Shirish S

On 9/30/2022 7:17 PM, S, Shirish wrote:



On 9/30/2022 6:59 PM, Harry Wentland wrote:

+Leo

On 9/30/22 06:27, Shirish S wrote:

[Why]
psr feature continues to be enabled for non-capable links.


Do you have more info on what issues you're seeing with this?


Code-wise, without this change we end up setting the 
"vblank_disable_immediate" parameter to false for the failing 
links as well.


Issue-wise, there is a remote chance of this leading to the 
eDP/connected monitor not lighting up.


I'm surprised psr_settings.psr_feature_enabled can be 'true' before
amdgpu_dm_set_psr_caps() runs. it should default to 'false', and it's
set early on during amdgpu_dm_initialize_drm_device() before any other
psr-related code runs.

In other words, I don't expect psr_settings.psr_feature_enabled to be
'true' on early return of dm_set_psr_caps().

What are the sequence of events that causes an issue for you?


psr_feature_enabled is set to true by default in 
https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c#L4264 
for DCN 3.0 onwards


(Also, in ChromeOS wherein KMS driver is statically built in kernel, 
we set PSR feature  as enabled as command-line argument via 
amdgpu_dc_feature_mask.)


Hence, the variable is set to true while entering 
amdgpu_dm_set_psr_caps().


Hmm, that is a local variable in the function, not the same as
link->psr_settings.psr_feature_enabled. Unless I'm missing something, it
looks like link->psr_settings.psr_feature_enabled is never set to true.
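Leo's point can be illustrated with a stand-in sketch (simplified types, not the actual driver structures from dc.h): a local `psr_feature_enabled` that defaults to true never reaches the per-link field unless it is copied there explicitly.

```c
#include <stdbool.h>

/* Stand-in types for illustration only; the real ones live in dc.h. */
struct psr_settings { bool psr_feature_enabled; };
struct dc_link { struct psr_settings psr_settings; };

/* Mimics the shape of amdgpu_dm_initialize_drm_device(): the local
 * default is true for newer ASICs, but the link field keeps its zero
 * value (false) unless some later code assigns it explicitly. */
static void initialize_drm_device(struct dc_link *link)
{
    bool psr_feature_enabled = true;   /* local variable, same name */

    (void)psr_feature_enabled;         /* never copied into the link */
    (void)link;
}
```

So the early returns in amdgpu_dm_set_psr_caps() leave the field false either way; assigning false explicitly is harmless, but as Leo notes, it reads more like a readability change than a bug fix.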

More below,










[How]
disable the feature on links that are not capable of the same.

Signed-off-by: Shirish S
---
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_psr.c | 10 
--

  1 file changed, 8 insertions(+), 2 deletions(-)

diff --git 
a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_psr.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_psr.c

index 8ca10ab3dfc1..f73af028f312 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_psr.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_psr.c
@@ -60,11 +60,17 @@ static bool link_supports_psrsu(struct 
dc_link *link)

   */
  void amdgpu_dm_set_psr_caps(struct dc_link *link)
  {
-    if (!(link->connector_signal & SIGNAL_TYPE_EDP))
+    if (!(link->connector_signal & SIGNAL_TYPE_EDP)) {
+    DRM_ERROR("Disabling PSR as connector is not eDP\n");

I don't think we should log an error here.


My objective of logging an error was to inform user/developer that 
this boot PSR enablement had issues.


It's not really an issue, PSR simply cannot be enabled on non-eDP or
disconnected links. 


Agree, the idea here is to avoid decisions being taken presuming 
psr_feature_enabled is set on such links, like disabling 
vblank_disable_immediate 
(https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c#L4330) etc.


Regards,

Shirish S


However, it is concerning if we enter this function
with psr_feature_enabled == true.

Thanks,
Leo



Am fine with moving it to INFO or remove it, if you insist.

Thanks for your comments.

Regards,

Shirish S


+ link->psr_settings.psr_feature_enabled = false;


Never the less, explicitly setting to false rather than leaving it as
default sounds like a good idea to me.

But I don't see how this fixes an issue.

If this is a readability fix, I suggest changing commit title and
description to reflect that.


Done.

Patch here: https://patchwork.freedesktop.org/patch/506242/

Regards,

Shirish S



Thanks,
Leo


  return;
+    }
  -    if (link->type == dc_connection_none)
+    if (link->type == dc_connection_none) {
+    DRM_ERROR("Disabling PSR as eDP connection type is invalid\n");

Same here, this doesn't warrant an error log.

Harry


+ link->psr_settings.psr_feature_enabled = false;
  return;
+    }
    if (link->dpcd_caps.psr_info.psr_version == 0) {
  link->psr_settings.psr_version = 
DC_PSR_VERSION_UNSUPPORTED;



RE: [PATCH] amd/display: set backlight only if required

2022-03-21 Thread S, Shirish
[AMD Official Use Only]

Ping!




Regards,
Shirish S


Re: [PATCH] amd/display: set backlight only if required

2022-03-14 Thread S, Shirish



On 3/11/2022 9:11 PM, Harry Wentland wrote:


On 3/11/22 10:33, Shirish S wrote:

[Why]
comparing pwm bl values (converted) with user brightness (converted)
levels in commit_tail leads to continuous setting of backlight via dmub
as they don't match.

Why do the values not match?


Here is a sample of values:

dmub_abm_get_current_backlight() reads backlight value as 11526 => 
convert_to_user() as 45.


user_brightness value to be set at this point is 159 => 
convert_from_user() gives 40863.


Now, we are continuously comparing 45 (current backlight) with 159 (to 
be set from user space) in every commit tail till any actual changes 
happen to brightness.


Ideally, current brightness/backlight value read from pwm register, when 
converted should yield 159 but it returns 45.


Hence, I believe, there's a bug either in conversion back and forth of 
user space levels or pwm register is not the right way to arrive at 
current brightness values.
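The numbers quoted above are consistent with a plain linear rescale between an 8-bit user range and a 16-bit PWM range. The sketch below is an assumption about the shape of the conversion, not the driver's exact convert_brightness_to_user()/convert_brightness_from_user() code (which also accounts for min/max caps):

```c
/* Linear rescale with round-to-nearest between two 0-based ranges. */
static unsigned int rescale(unsigned int v, unsigned int from_max,
                            unsigned int to_max)
{
    return (unsigned int)(((unsigned long long)v * to_max + from_max / 2)
                          / from_max);
}
```

Under these assumptions, user level 159 maps to rescale(159, 255, 65535) = 40863, matching the value above, while the PWM readback 11526 maps back to rescale(11526, 65535, 255) = 45, not 159 — exactly the mismatch that keeps retriggering the backlight write in every commit_tail.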



  It looks like the value mismatch
is our root cause.
Yes, apparently I could not find any other register read that could bail 
us out here and provide actual/proper values, hence this patch.

I remember a while back looking at an issue
where the readback was from DMCU while we were setting BL
directly via PWM. I wonder if the opposite is happening now.

See this for the previous fix:
2bf3d62dabcc drm/amd/display: Get backlight from PWM if DMCU is not initialized


The sample values mentioned above are with this patch applied.

Is there a better way of reading current backlight levels, that reflect 
user space ones?




This leads to an overdrive in queuing of commands to DMCU that sometimes,
depending on the load on the DMCU fw, leads to:

"[drm:dc_dmub_srv_wait_idle] *ERROR* Error waiting for DMUB idle: status=3"

[How]
Store the last successfully set backlight value and compare with it instead
of pwm reads, which are not what we should compare with.


Does BL work reliably after S3 or S4 with your change? I wonder if
there are use-cases that might break because we're no longer comparing
against the actual BL value but against a stored variable.
I've verified this patch for boot, S0i3 and the GUI method of changing 
brightness on ChromeOS.



Signed-off-by: Shirish S 
---
  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 7 ---
  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h | 6 ++
  2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index df0980ff9a63..2b8337e47861 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -3972,7 +3972,7 @@ static u32 convert_brightness_to_user(const struct 
amdgpu_dm_backlight_caps *cap
 max - min);
  }
  
-static int amdgpu_dm_backlight_set_level(struct amdgpu_display_manager *dm,

+static void amdgpu_dm_backlight_set_level(struct amdgpu_display_manager *dm,
 int bl_idx,
 u32 user_brightness)
  {
@@ -4003,7 +4003,8 @@ static int amdgpu_dm_backlight_set_level(struct 
amdgpu_display_manager *dm,
DRM_DEBUG("DM: Failed to update backlight on 
eDP[%d]\n", bl_idx);
}
  
-	return rc ? 0 : 1;

+   if (rc)
+   dm->actual_brightness[bl_idx] = user_brightness;
  }
  
  static int amdgpu_dm_backlight_update_status(struct backlight_device *bd)

@@ -9944,7 +9945,7 @@ static void amdgpu_dm_atomic_commit_tail(struct 
drm_atomic_state *state)
/* restore the backlight level */
for (i = 0; i < dm->num_of_edps; i++) {
if (dm->backlight_dev[i] &&
-   (amdgpu_dm_backlight_get_level(dm, i) != dm->brightness[i]))
+   (dm->actual_brightness[i] != dm->brightness[i]))
amdgpu_dm_backlight_set_level(dm, i, dm->brightness[i]);
}
  #endif
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
index 372f9adf091a..321279bc877b 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
@@ -540,6 +540,12 @@ struct amdgpu_display_manager {
 * cached backlight values.
 */
u32 brightness[AMDGPU_DM_MAX_NUM_EDP];
+   /**
+* @actual_brightness:

"actual" seems misleading here. We might want to call this
"last" or something along those lines.

But let's first see if we can fix the mismatch of BL reads
and writes.


Yes, let's thoroughly evaluate if there is any other way.

Regards,

Shirish S



Harry


+*
+* last successfully applied backlight values.
+*/
+   u32 actual_brightness[AMDGPU_DM_MAX_NUM_EDP];
  };
  
  enum dsc_clock_force_state {


Re: [PATCH 3/3] drm/amdgpu: add AMDGPURESET uevent on AMD GPU reset

2022-01-18 Thread S, Shirish

Hi Shashank,


On 1/12/2022 6:30 PM, Sharma, Shashank wrote:



On 1/11/2022 12:26 PM, Christian König wrote:

Am 11.01.22 um 08:12 schrieb Somalapuram Amaranath:

AMDGPURESET uevent added to notify userspace,
collect dump_stack and amdgpu_reset_reg_dumps

Signed-off-by: Somalapuram Amaranath 
---
  drivers/gpu/drm/amd/amdgpu/nv.c | 31 +++
  1 file changed, 31 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c

index 2ec1ffb36b1f..41a2c37e825f 100644
--- a/drivers/gpu/drm/amd/amdgpu/nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/nv.c
@@ -529,10 +529,41 @@ nv_asic_reset_method(struct amdgpu_device *adev)
  }
  }
+/**
+ * drm_sysfs_reset_event - generate a DRM uevent
+ * @dev: DRM device
+ *
+ * Send a uevent for the DRM device specified by @dev. Currently we only
+ * set AMDGPURESET=1 in the uevent environment, but this could be expanded to
+ * deal with other types of events.
+ *
+ * Any new uapi should be using the drm_sysfs_connector_status_event()
+ * for uevents on connector status change.
+ */
+void drm_sysfs_reset_event(struct drm_device *dev)
+{
+    char *event_string = "AMDGPURESET=1";
+    char *envp[2] = { event_string, NULL };
+
+    kobject_uevent_env(&dev->primary->kdev->kobj, KOBJ_CHANGE, envp);


That won't work like this.

kobject_uevent_env() needs to allocate memory to send the event to 
userspace and that is not allowed while we do a reset. The Intel 
guys fell into the same trap already.


What we could maybe do is to teach kobject_uevent_env() gfp flags and 
make all allocations from the atomic pool.


Regards,
Christian.


Hi Amar,

I see another problem here,

We are sending the event at the GPU reset, but we are collecting the 
register values only when the corresponding userspace agent calls a 
read() on the respective sysfs entry.


Is the presumption here that a gpu reset is always triggered within the 
kernel and user space has to be made aware of it?


From what I know, OSes/apps use GL extensions like robustness and other 
ways to detect hangs/gpu resets and flush out guilty contexts or take 
appropriate next steps.


BTW, is there any userspace infra already in place that has a 
task/thread listening for reset events, similar to hpd?


I believe there are several ways to make user space aware of a reset, via 
gpu_reset_counter etc. Also, if the objective is to have a call trace upon 
reset or to dump registers, you can do it in amdgpu_device_gpu_recover(), 
but guard it with a proper CONFIG


that can be enabled in kernel debug builds only, like tagging along with 
KASAN etc.


This way there will be lesser dependency on userspace.


Regards,

Shirish S
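On the userspace-listener question raised above: a client on a NETLINK_KOBJECT_UEVENT socket (or a udev rule) would receive the proposed event as a NUL-separated KEY=VALUE payload. The sketch below covers only the payload scan; the socket setup is omitted, and AMDGPURESET=1 is the key proposed in this patch, not an established uapi.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Scan a uevent payload (NUL-separated KEY=VALUE pairs, as built by
 * kobject_uevent_env()) for the proposed AMDGPURESET=1 marker. */
static bool uevent_has_amdgpu_reset(const char *buf, size_t len)
{
    size_t off = 0;

    while (off < len) {
        if (strcmp(buf + off, "AMDGPURESET=1") == 0)
            return true;
        off += strlen(buf + off) + 1;   /* step to the next pair */
    }
    return false;
}
```

A real listener would bind an AF_NETLINK/NETLINK_KOBJECT_UEVENT socket, recv() into a buffer, and pass that buffer and length here.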



There is a very fair possibility that the register values are reset by 
the HW by then, and we are reading re-programmed values. At the least, 
there will be a race.


I think we should change this design in such a way:
1. Get into gpu_reset()
2. collect the register values and save this context into a separate 
file/node. Probably sending a trace_event here would be the easiest way.

3. Send the drm event to the userspace client
4. The client reads from the trace file, and gets the data.

- Shashank




+}
+
+void amdgpu_reset_dumps(struct amdgpu_device *adev)
+{
+    struct drm_device *ddev = adev_to_drm(adev);
+    /* original raven doesn't have full asic reset */
+    if ((adev->apu_flags & AMD_APU_IS_RAVEN) &&
+    !(adev->apu_flags & AMD_APU_IS_RAVEN2))
+    return;
+    drm_sysfs_reset_event(ddev);
+    dump_stack();
+}
+
  static int nv_asic_reset(struct amdgpu_device *adev)
  {
  int ret = 0;
+    amdgpu_reset_dumps(adev);
  switch (nv_asic_reset_method(adev)) {
  case AMD_RESET_METHOD_PCI:
  dev_info(adev->dev, "PCI reset\n");




Re: [PATCH] drm/amd/display: log amdgpu_dm_atomic_check() failure cause

2021-11-08 Thread S, Shirish



On 11/8/2021 8:27 PM, Harry Wentland wrote:


On 2021-11-08 06:23, Christian König wrote:


Am 08.11.21 um 12:13 schrieb S, Shirish:

Hi Paul,

On 11/8/2021 2:27 PM, Paul Menzel wrote:

Dear Shirish,


Am 08.11.21 um 09:40 schrieb Shirish S:

update user with next level of info about which condition led to
atomic check failure.

Signed-off-by: Shirish S 
---
   .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 70 ++-
   1 file changed, 52 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 1e26d9be8993..37ea8a76fa09 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -10746,8 +10746,10 @@ static int amdgpu_dm_atomic_check(struct drm_device 
*dev,
   trace_amdgpu_dm_atomic_check_begin(state);
     ret = drm_atomic_helper_check_modeset(dev, state);
-    if (ret)
+    if (ret) {
+    DRM_DEV_ERROR(adev->dev, "drm_atomic_helper_check_modeset() failed\n");

Does the Linux kernel provide means (for example ftrace) to trace such things, 
so the (generic) debug lines don’t have to be added? Or is it to better debug 
user bug reports?


ftrace requires additional tooling; I am trying to avoid it and make the error 
reporting more obvious to developers in case there is a failure in 
atomic_check.

Yeah, but Paul is right; this looks like total overkill to me as well.

And especially calls to functions like drm_atomic_helper_check_modeset() sound 
like parameter validation to me which the kernel should absolute NOT report 
about on default severity level.


Atomic_check is also expected to fail as userspace might want to just query 
whether an atomic_state can be applied.

Debug messages might make sense here and would help with debug. These shouldn't 
be error prints, though.


Thanks Harry, have updated the prints to debug from error.

Regards,

Shirish S


Harry


Otherwise you allow userspace to flood the logs with trivial error messages.

Regards,
Christian.


Regards,

Shirish S


Kind regards,

Paul



   goto fail;
+    }
     /* Check connector changes */
   for_each_oldnew_connector_in_state(state, connector, old_con_state, 
new_con_state, i) {
@@ -10763,6 +10765,7 @@ static int amdgpu_dm_atomic_check(struct drm_device 
*dev,
     new_crtc_state = drm_atomic_get_crtc_state(state, 
new_con_state->crtc);
   if (IS_ERR(new_crtc_state)) {
+    DRM_DEV_ERROR(adev->dev, "drm_atomic_get_crtc_state() failed\n");
   ret = PTR_ERR(new_crtc_state);
   goto fail;
   }
@@ -10777,8 +10780,10 @@ static int amdgpu_dm_atomic_check(struct drm_device 
*dev,
   for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state, 
new_crtc_state, i) {
   if (drm_atomic_crtc_needs_modeset(new_crtc_state)) {
   ret = add_affected_mst_dsc_crtcs(state, crtc);
-    if (ret)
+    if (ret) {
+    DRM_DEV_ERROR(adev->dev, "add_affected_mst_dsc_crtcs() 
failed\n");
   goto fail;
+    }
   }
   }
   }
@@ -10793,19 +10798,25 @@ static int amdgpu_dm_atomic_check(struct drm_device 
*dev,
   continue;
     ret = amdgpu_dm_verify_lut_sizes(new_crtc_state);
-    if (ret)
+    if (ret) {
+    DRM_DEV_ERROR(adev->dev, "amdgpu_dm_verify_lut_sizes() failed\n");
   goto fail;
+    }
     if (!new_crtc_state->enable)
   continue;
     ret = drm_atomic_add_affected_connectors(state, crtc);
-    if (ret)
-    return ret;
+    if (ret) {
+    DRM_DEV_ERROR(adev->dev, "drm_atomic_add_affected_connectors() 
failed\n");
+    goto fail;
+    }
     ret = drm_atomic_add_affected_planes(state, crtc);
-    if (ret)
+    if (ret) {
+    DRM_DEV_ERROR(adev->dev, "drm_atomic_add_affected_planes() 
failed\n");
   goto fail;
+    }
     if (dm_old_crtc_state->dsc_force_changed)
   new_crtc_state->mode_changed = true;
@@ -10842,6 +10853,7 @@ static int amdgpu_dm_atomic_check(struct drm_device 
*dev,
     if (IS_ERR(new_plane_state)) {
   ret = PTR_ERR(new_plane_state);
+    DRM_DEV_ERROR(adev->dev, "new_plane_state is BAD\n");
   goto fail;
   }
   }
@@ -10854,8 +10866,10 @@ static int amdgpu_dm_atomic_check(struct drm_device 
*dev,
   new_plane_state,
   false,
&lock_and_validation_needed);
-    if (ret)
+    if (ret) {
+    DRM_DEV_ERROR(adev->dev, "dm_update_plane_state() failed\n");
   goto 

Re: [PATCH] drm/amd/display: reject both non-zero src_x and src_y only for DCN1x

2021-11-08 Thread S, Shirish

Hi Paul,

On 11/8/2021 7:51 PM, Paul Menzel wrote:

[Which address should be used: sshan...@amd.com or shiris...@amd.com?]

"shiris...@amd.com"


Dear Shirish,


Am 08.11.21 um 12:11 schrieb S, Shirish:


On 11/8/2021 2:25 PM, Paul Menzel wrote:



Am 08.11.21 um 09:15 schrieb Shirish S:

limit the MPO rejection only for DCN1x as its not required on later


it’s


versions.


Where is it documented, that it’s not required for later versions?


This is a workaround to avoid a system hang & I've verified it's not 
required on DCN2.0.


Please extend the commit message with that information, and also add 
how you verified, that it’s not required for DCN2.0 exactly. (Just 
test one system?)



We generally don't have documentation for WA's.


WA is workaround?


yes.

Regards,

Shirish S




Kind regards,

Paul


Shortly describing the implementation is also useful. Something 
like: Require `fill_dc_scaling_info()` to receive the device to be 
able to check the version.


Fixes: d89f6048bdcb ("drm/amd/display: Reject non-zero src_y and 
src_x for video planes")




I’d remove the blank line.


Signed-off-by: Shirish S 
---
  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 20 
++-

  1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c

index 1e26d9be8993..26b29d561919 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4572,7 +4572,8 @@ static void 
get_min_max_dc_plane_scaling(struct drm_device *dev,

  }
    -static int fill_dc_scaling_info(const struct drm_plane_state 
*state,

+static int fill_dc_scaling_info(struct amdgpu_device *adev,
+    const struct drm_plane_state *state,
  struct dc_scaling_info *scaling_info)
  {
  int scale_w, scale_h, min_downscale, max_upscale;
@@ -4586,7 +4587,8 @@ static int fill_dc_scaling_info(const struct 
drm_plane_state *state,

  /*
   * For reasons we don't (yet) fully understand a non-zero
   * src_y coordinate into an NV12 buffer can cause a
- * system hang. To avoid hangs (and maybe be overly cautious)
+ * system hang on DCN1x.
+ * To avoid hangs (and maybe be overly cautious)


I’d remove the added line break.


   * let's reject both non-zero src_x and src_y.
   *
   * We currently know of only one use-case to reproduce a
@@ -4594,10 +4596,10 @@ static int fill_dc_scaling_info(const 
struct drm_plane_state *state,

   * is to gesture the YouTube Android app into full screen
   * on ChromeOS.
   */
-    if (state->fb &&
-    state->fb->format->format == DRM_FORMAT_NV12 &&
-    (scaling_info->src_rect.x != 0 ||
- scaling_info->src_rect.y != 0))
+    if (((adev->ip_versions[DCE_HWIP][0] == IP_VERSION(1, 0, 0)) ||
+    (adev->ip_versions[DCE_HWIP][0] == IP_VERSION(1, 0, 1))) &&
+    (state->fb && state->fb->format->format == DRM_FORMAT_NV12 &&
+    (scaling_info->src_rect.x != 0 || scaling_info->src_rect.y 
!= 0)))

  return -EINVAL;
    scaling_info->src_rect.width = state->src_w >> 16;
@@ -5503,7 +5505,7 @@ static int fill_dc_plane_attributes(struct 
amdgpu_device *adev,

  int ret;
  bool force_disable_dcc = false;
  -    ret = fill_dc_scaling_info(plane_state, &scaling_info);
+    ret = fill_dc_scaling_info(adev, plane_state, &scaling_info);
  if (ret)
  return ret;
  @@ -7566,7 +7568,7 @@ static int dm_plane_atomic_check(struct 
drm_plane *plane,

  if (ret)
  return ret;
  -    ret = fill_dc_scaling_info(new_plane_state, &scaling_info);
+    ret = fill_dc_scaling_info(adev, new_plane_state, &scaling_info);
  if (ret)
  return ret;
  @@ -9014,7 +9016,7 @@ static void amdgpu_dm_commit_planes(struct 
drm_atomic_state *state,
bundle->surface_updates[planes_count].gamut_remap_matrix = 
_plane->gamut_remap_matrix;

  }
  -    fill_dc_scaling_info(new_plane_state,
+    fill_dc_scaling_info(dm->adev, new_plane_state,
&bundle->scaling_infos[planes_count]);
bundle->surface_updates[planes_count].scaling_info =



Re: [PATCH] drm/amd/display: log amdgpu_dm_atomic_check() failure cause

2021-11-08 Thread S, Shirish

Hi Paul,

On 11/8/2021 2:27 PM, Paul Menzel wrote:

Dear Shrish,


Am 08.11.21 um 09:40 schrieb Shirish S:

update user with next level of info about which condition led to
atomic check failure.

Signed-off-by: Shirish S 
---
  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 70 ++-
  1 file changed, 52 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c

index 1e26d9be8993..37ea8a76fa09 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -10746,8 +10746,10 @@ static int amdgpu_dm_atomic_check(struct 
drm_device *dev,

  trace_amdgpu_dm_atomic_check_begin(state);
    ret = drm_atomic_helper_check_modeset(dev, state);
-    if (ret)
+    if (ret) {
+    DRM_DEV_ERROR(adev->dev, "drm_atomic_helper_check_modeset() 
failed\n");


Does the Linux kernel provide means (for example ftrace) to trace such 
things, so the (generic) debug lines don’t have to be added? Or is it 
to better debug user bug reports?


ftrace requires additional tooling; I am trying to avoid it and make the 
error reporting more obvious to developers in case there is a 
failure in atomic_check.


Regards,

Shirish S



Kind regards,

Paul



  goto fail;
+    }
    /* Check connector changes */
  for_each_oldnew_connector_in_state(state, connector, 
old_con_state, new_con_state, i) {
@@ -10763,6 +10765,7 @@ static int amdgpu_dm_atomic_check(struct 
drm_device *dev,
    new_crtc_state = drm_atomic_get_crtc_state(state, 
new_con_state->crtc);

  if (IS_ERR(new_crtc_state)) {
+    DRM_DEV_ERROR(adev->dev, "drm_atomic_get_crtc_state() 
failed\n");

  ret = PTR_ERR(new_crtc_state);
  goto fail;
  }
@@ -10777,8 +10780,10 @@ static int amdgpu_dm_atomic_check(struct 
drm_device *dev,
  for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state, 
new_crtc_state, i) {

  if (drm_atomic_crtc_needs_modeset(new_crtc_state)) {
  ret = add_affected_mst_dsc_crtcs(state, crtc);
-    if (ret)
+    if (ret) {
+    DRM_DEV_ERROR(adev->dev, 
"add_affected_mst_dsc_crtcs() failed\n");

  goto fail;
+    }
  }
  }
  }
@@ -10793,19 +10798,25 @@ static int amdgpu_dm_atomic_check(struct 
drm_device *dev,

  continue;
    ret = amdgpu_dm_verify_lut_sizes(new_crtc_state);
-    if (ret)
+    if (ret) {
+    DRM_DEV_ERROR(adev->dev, "amdgpu_dm_verify_lut_sizes() 
failed\n");

  goto fail;
+    }
    if (!new_crtc_state->enable)
  continue;
    ret = drm_atomic_add_affected_connectors(state, crtc);
-    if (ret)
-    return ret;
+    if (ret) {
+    DRM_DEV_ERROR(adev->dev, 
"drm_atomic_add_affected_connectors() failed\n");

+    goto fail;
+    }
    ret = drm_atomic_add_affected_planes(state, crtc);
-    if (ret)
+    if (ret) {
+    DRM_DEV_ERROR(adev->dev, 
"drm_atomic_add_affected_planes() failed\n");

  goto fail;
+    }
    if (dm_old_crtc_state->dsc_force_changed)
  new_crtc_state->mode_changed = true;
@@ -10842,6 +10853,7 @@ static int amdgpu_dm_atomic_check(struct 
drm_device *dev,

    if (IS_ERR(new_plane_state)) {
  ret = PTR_ERR(new_plane_state);
+    DRM_DEV_ERROR(adev->dev, "new_plane_state is BAD\n");
  goto fail;
  }
  }
@@ -10854,8 +10866,10 @@ static int amdgpu_dm_atomic_check(struct 
drm_device *dev,

  new_plane_state,
  false,
&lock_and_validation_needed);
-    if (ret)
+    if (ret) {
+    DRM_DEV_ERROR(adev->dev, "dm_update_plane_state() 
failed\n");

  goto fail;
+    }
  }
    /* Disable all crtcs which require disable */
@@ -10865,8 +10879,10 @@ static int amdgpu_dm_atomic_check(struct 
drm_device *dev,

 new_crtc_state,
 false,
 &lock_and_validation_needed);
-    if (ret)
+    if (ret) {
+    DRM_DEV_ERROR(adev->dev, "DISABLE: 
dm_update_crtc_state() failed\n");

  goto fail;
+    }
  }
    /* Enable all crtcs which require enable */
@@ -10876,8 +10892,10 @@ static int amdgpu_dm_atomic_check(struct 
drm_device *dev,

 new_crtc_state,
 true,
 &lock_and_validation_needed);
-    if (ret)
+    if (ret) {
+    DRM_DEV_ERROR(adev->dev, "ENABLE: dm_update_crtc_state() 
failed\n");

  goto fail;
+    }
  }
    /* Add new/modified planes */
@@ -10887,20 +10905,26 @@ static int 

Re: [PATCH] drm/amd/display: reject both non-zero src_x and src_y only for DCN1x

2021-11-08 Thread S, Shirish

Hi Paul,

On 11/8/2021 2:25 PM, Paul Menzel wrote:

Dear Shirish,


Am 08.11.21 um 09:15 schrieb Shirish S:

limit the MPO rejection only for DCN1x as its not required on later


it’s


versions.


Where is it documented, that it’s not required for later versions?


This is a workaround to avoid a system hang & I've verified it's not 
required on DCN2.0.


We generally don't have documentation for WA's.

Regards,

Shirish S



Shortly describing the implementation is also useful. Something like: 
Require `fill_dc_scaling_info()` to receive the device to be able to 
check the version.


Fixes: d89f6048bdcb ("drm/amd/display: Reject non-zero src_y and 
src_x for video planes")




I’d remove the blank line.


Signed-off-by: Shirish S 
---
  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 20 ++-
  1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c

index 1e26d9be8993..26b29d561919 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4572,7 +4572,8 @@ static void get_min_max_dc_plane_scaling(struct 
drm_device *dev,

  }
    -static int fill_dc_scaling_info(const struct drm_plane_state 
*state,

+static int fill_dc_scaling_info(struct amdgpu_device *adev,
+    const struct drm_plane_state *state,
  struct dc_scaling_info *scaling_info)
  {
  int scale_w, scale_h, min_downscale, max_upscale;
@@ -4586,7 +4587,8 @@ static int fill_dc_scaling_info(const struct 
drm_plane_state *state,

  /*
   * For reasons we don't (yet) fully understand a non-zero
   * src_y coordinate into an NV12 buffer can cause a
- * system hang. To avoid hangs (and maybe be overly cautious)
+ * system hang on DCN1x.
+ * To avoid hangs (and maybe be overly cautious)


I’d remove the added line break.


   * let's reject both non-zero src_x and src_y.
   *
   * We currently know of only one use-case to reproduce a
@@ -4594,10 +4596,10 @@ static int fill_dc_scaling_info(const struct 
drm_plane_state *state,

   * is to gesture the YouTube Android app into full screen
   * on ChromeOS.
   */
-    if (state->fb &&
-    state->fb->format->format == DRM_FORMAT_NV12 &&
-    (scaling_info->src_rect.x != 0 ||
- scaling_info->src_rect.y != 0))
+    if (((adev->ip_versions[DCE_HWIP][0] == IP_VERSION(1, 0, 0)) ||
+    (adev->ip_versions[DCE_HWIP][0] == IP_VERSION(1, 0, 1))) &&
+    (state->fb && state->fb->format->format == DRM_FORMAT_NV12 &&
+    (scaling_info->src_rect.x != 0 || scaling_info->src_rect.y 
!= 0)))

  return -EINVAL;
    scaling_info->src_rect.width = state->src_w >> 16;
@@ -5503,7 +5505,7 @@ static int fill_dc_plane_attributes(struct 
amdgpu_device *adev,

  int ret;
  bool force_disable_dcc = false;
  -    ret = fill_dc_scaling_info(plane_state, &scaling_info);
+    ret = fill_dc_scaling_info(adev, plane_state, &scaling_info);
  if (ret)
  return ret;
  @@ -7566,7 +7568,7 @@ static int dm_plane_atomic_check(struct 
drm_plane *plane,

  if (ret)
  return ret;
  -    ret = fill_dc_scaling_info(new_plane_state, &scaling_info);
+    ret = fill_dc_scaling_info(adev, new_plane_state, &scaling_info);
  if (ret)
  return ret;
  @@ -9014,7 +9016,7 @@ static void amdgpu_dm_commit_planes(struct 
drm_atomic_state *state,
bundle->surface_updates[planes_count].gamut_remap_matrix = 
_plane->gamut_remap_matrix;

  }
  -    fill_dc_scaling_info(new_plane_state,
+    fill_dc_scaling_info(dm->adev, new_plane_state,
&bundle->scaling_infos[planes_count]);
bundle->surface_updates[planes_count].scaling_info =



RE: [PATCH] drm/amdgpu/powerplay/smu10: add support for gpu busy query

2021-03-10 Thread S, Shirish
Tested-by: Shirish S 



Regards,
Shirish S

-Original Message-
From: amd-gfx  On Behalf Of Quan, Evan
Sent: Wednesday, March 10, 2021 1:11 PM
To: Deucher, Alexander ; 
amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander 
Subject: RE: [PATCH] drm/amdgpu/powerplay/smu10: add support for gpu busy query

[AMD Public Use]

Reviewed-by: Evan Quan 

-Original Message-
From: amd-gfx  On Behalf Of Alex Deucher
Sent: Wednesday, March 10, 2021 12:12 PM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander 
Subject: [PATCH] drm/amdgpu/powerplay/smu10: add support for gpu busy query

Was added in newer versions of the firmware.  Add support for it.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/pm/inc/rv_ppsmc.h |  1 +
 .../drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.c  | 30 ++-
 2 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/pm/inc/rv_ppsmc.h 
b/drivers/gpu/drm/amd/pm/inc/rv_ppsmc.h
index 4c7e08ba5fa4..171f12b82716 100644
--- a/drivers/gpu/drm/amd/pm/inc/rv_ppsmc.h
+++ b/drivers/gpu/drm/amd/pm/inc/rv_ppsmc.h
@@ -84,6 +84,7 @@
 #define PPSMC_MSG_PowerGateMmHub0x35
 #define PPSMC_MSG_SetRccPfcPmeRestoreRegister   0x36
 #define PPSMC_MSG_GpuChangeState0x37
+#define PPSMC_MSG_GetGfxBusy0x3D
 #define PPSMC_Message_Count 0x42
 
 typedef uint16_t PPSMC_Result;
diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.c 
b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.c
index c932b632ddd4..52fcdec738e9 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.c
@@ -1261,9 +1261,21 @@ static int smu10_read_sensor(struct pp_hwmgr *hwmgr, int 
idx,
  void *value, int *size)
 {
struct smu10_hwmgr *smu10_data = (struct smu10_hwmgr *)(hwmgr->backend);
-   uint32_t sclk, mclk;
+   struct amdgpu_device *adev = hwmgr->adev;
+   uint32_t sclk, mclk, activity_percent;
+   bool has_gfx_busy;
int ret = 0;
 
+   /* GetGfxBusy support was added on RV SMU FW 30.85.00 and PCO 4.30.59 */
+   if ((adev->apu_flags & AMD_APU_IS_PICASSO) &&
+   (hwmgr->smu_version >= 0x41e3b))
+   has_gfx_busy = true;
+   else if ((adev->apu_flags & AMD_APU_IS_RAVEN) &&
+(hwmgr->smu_version >= 0x1e5500))
+   has_gfx_busy = true;
+   else
+   has_gfx_busy = false;
+
switch (idx) {
case AMDGPU_PP_SENSOR_GFX_SCLK:
	smum_send_msg_to_smc(hwmgr, PPSMC_MSG_GetGfxclkFrequency, &sclk);
@@ -1284,6 +1296,22 @@ static int smu10_read_sensor(struct pp_hwmgr *hwmgr, int idx,
*(uint32_t *)value =  smu10_data->vcn_power_gated ? 0 : 1;
*size = 4;
break;
+   case AMDGPU_PP_SENSOR_GPU_LOAD:
+   if (has_gfx_busy) {
+   ret = smum_send_msg_to_smc(hwmgr,
+  PPSMC_MSG_GetGfxBusy,
+  &activity_percent);
+   if (!ret) {
+   activity_percent = activity_percent > 100 ? 100 
: activity_percent;
+   } else {
+   activity_percent = 50;
+   }
+   *((uint32_t *)value) = activity_percent;
+   return 0;
+   } else {
+   return -EOPNOTSUPP;
+   }
+   break;
default:
ret = -EOPNOTSUPP;
break;
--
2.29.2

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: backlight rescaling to original range

2020-06-22 Thread S, Shirish

Some basic nit-picks inline.

On 6/22/2020 7:34 AM, Chauhan, Ikshwaku wrote:

[AMD Official Use Only - Internal Distribution Only]

Hello All,

Could you please provide your feedback for this patch?

Regards,
Ikshwaku

-Original Message-
From: Chauhan, Ikshwaku 
Sent: Wednesday, June 17, 2020 3:20 AM
To: Wentland, Harry ; Li, Sun peng (Leo) 
; Deucher, Alexander 
Cc: amd-gfx@lists.freedesktop.org; Chauhan, Ikshwaku 
Subject: [PATCH] drm/amdgpu: backlight rescaling to original range


Since you are touching the display folder, the commit message should point 
to "drm/amd/display".


Also, the commit message is not clear; can you correct it to reflect what 
this patch is doing?




[why]
The brightness input is in the range 0-255. It is scaled between the 
requested min and max input signal and also scaled up by 0x101 to match the DC 
interface, which has a range of 0 to 0xffff. This scaled brightness value is not 
rescaled back to the original range (0-255) when we read it back; it returns the 
brightness value in the range of 0-65535 instead of 0-255.

[how]
Rescale the brightness value from the scaled brightness range 0-65535 to the input 
brightness range 0-255.


Please provide sample output of backlight set & get values with and 
without your patch, to better understand what this patch is fixing.


Regards,

Shirish S



Signed-off-by: Ikshwaku Chauhan 
---
  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 40 ++-  
.../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h |  5 +++
  2 files changed, 34 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 9ab0d8521576..73b0a084e893 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -2881,7 +2881,8 @@ static int set_backlight_via_aux(struct dc_link *link, 
uint32_t brightness)  }
  
  static u32 convert_brightness(const struct amdgpu_dm_backlight_caps *caps,

- const uint32_t user_brightness)
+ const uint32_t user_brightness,
+ enum convert_backlight flag)
  {
u32 min, max, conversion_pace;
u32 brightness = user_brightness;
@@ -2901,12 +2902,18 @@ static u32 convert_brightness(const struct 
amdgpu_dm_backlight_caps *caps,
 * 0 to 0x
 */
conversion_pace = 0x101;
-   brightness =
-   user_brightness
-   * conversion_pace
-   * (max - min)
-   / AMDGPU_MAX_BL_LEVEL
-   + min * conversion_pace;
+   if (flag == set_backlight)
+   brightness =
+   user_brightness
+   * conversion_pace
+   * (max - min)
+   / AMDGPU_MAX_BL_LEVEL
+   + min * conversion_pace;
+   else
+   brightness =
+   ((user_brightness - min * conversion_pace)
+* AMDGPU_MAX_BL_LEVEL)
+/ (conversion_pace * (max - min));
} else {
/* TODO
 * We are doing a linear interpolation here, which is OK but @@ 
-2940,24 +2947,35 @@ static int amdgpu_dm_backlight_update_status(struct 
backlight_device *bd)
  
  	link = (struct dc_link *)dm->backlight_link;
  
-	brightness = convert_brightness(&caps, bd->props.brightness);

+   brightness = convert_brightness(&caps, bd->props.brightness,
+   set_backlight);
// Change brightness based on AUX property
if (caps.aux_support)
return set_backlight_via_aux(link, brightness);
  
  	rc = dc_link_set_backlight_level(dm->backlight_link, brightness, 0);

-
return rc ? 0 : 1;
  }
  
  static int amdgpu_dm_backlight_get_brightness(struct backlight_device *bd)  {

struct amdgpu_display_manager *dm = bl_get_data(bd);
-   int ret = dc_link_get_backlight_level(dm->backlight_link);
+   struct amdgpu_dm_backlight_caps caps;
+   int ret;
+
+   amdgpu_dm_update_backlight_caps(dm);
+   caps = dm->backlight_caps;
+
+   ret = dc_link_get_backlight_level(dm->backlight_link);
+   ret = (int)convert_brightness(&caps, (uint32_t)ret, get_backlight);
  
  	if (ret == DC_ERROR_UNEXPECTED)

return bd->props.brightness;
-   return ret;
+
+   if (ret == AMDGPU_MAX_BL_LEVEL || ret == 0)
+   return ret;
+   else
+   return ret+1;
  }
  
  static const struct backlight_ops amdgpu_dm_backlight_ops = { diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h

index 1df0ce047e1c..d54fc00148f9 100644
--- 

Re: [PATCH] drm/amdgpu: dont schedule jobs while in reset

2019-10-30 Thread S, Shirish

On 10/30/2019 3:50 PM, Koenig, Christian wrote:
> Am 30.10.19 um 10:13 schrieb S, Shirish:
>> [Why]
>>
>> doing kthread_park()/unpark() from drm_sched_entity_fini
>> while a GPU reset is in progress defeats the whole purpose of
>> drm_sched_stop->kthread_park.
>> If drm_sched_entity_fini->kthread_unpark() happens AFTER
>> drm_sched_stop->kthread_park, nothing prevents another
>> (third) thread from continuing to submit jobs to the HW, which will be
>> picked up by the unparked scheduler thread and submitted
>> to the HW, only to fail because the HW ring is deactivated.
>>
>> [How]
>> grab the reset lock before calling drm_sched_entity_fini()
>>
>> Signed-off-by: Shirish S 
>> Suggested-by: Christian König 
> Patch itself is Reviewed-by: Christian König 
>
> Does that also fix the problems you have been seeing?

Yes Christian.

Regards,

Shirish S

>
> Thanks,
> Christian.
>
>> ---
>>drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 5 -
>>1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>> index 6614d8a..2cdaf3b 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>> @@ -604,8 +604,11 @@ void amdgpu_ctx_mgr_entity_fini(struct amdgpu_ctx_mgr 
>> *mgr)
>>  continue;
>>  }
>>
>> -for (i = 0; i < num_entities; i++)
>> +for (i = 0; i < num_entities; i++) {
>> +mutex_lock(&ctx->adev->lock_reset);
>>  drm_sched_entity_fini(&ctx->entities[0][i].entity);
>> +mutex_unlock(&ctx->adev->lock_reset);
>> +}
>>  }
>>}
>>

-- 
Regards,
Shirish S


[PATCH] drm/amdgpu: dont schedule jobs while in reset

2019-10-30 Thread S, Shirish
[Why]

doing kthread_park()/unpark() from drm_sched_entity_fini
while a GPU reset is in progress defeats the whole purpose of
drm_sched_stop->kthread_park.
If drm_sched_entity_fini->kthread_unpark() happens AFTER
drm_sched_stop->kthread_park, nothing prevents another
(third) thread from continuing to submit jobs to the HW, which will be
picked up by the unparked scheduler thread and submitted
to the HW, only to fail because the HW ring is deactivated.

[How]
grab the reset lock before calling drm_sched_entity_fini()

Signed-off-by: Shirish S 
Suggested-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
index 6614d8a..2cdaf3b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
@@ -604,8 +604,11 @@ void amdgpu_ctx_mgr_entity_fini(struct amdgpu_ctx_mgr *mgr)
continue;
}
 
-   for (i = 0; i < num_entities; i++)
+   for (i = 0; i < num_entities; i++) {
+   mutex_lock(&ctx->adev->lock_reset);
	drm_sched_entity_fini(&ctx->entities[0][i].entity);
+   mutex_unlock(&ctx->adev->lock_reset);
+   }
}
 }
 
-- 
2.7.4


Re: [PATCH] drm/amdgpu: guard ib scheduling while in reset

2019-10-30 Thread S, Shirish

On 10/25/2019 9:32 PM, Grodzovsky, Andrey wrote:


On 10/25/19 11:57 AM, Koenig, Christian wrote:
Am 25.10.19 um 17:35 schrieb Grodzovsky, Andrey:


On 10/25/19 5:26 AM, Koenig, Christian wrote:
Am 25.10.19 um 11:22 schrieb S, Shirish:


On 10/25/2019 2:23 PM, Koenig, Christian wrote:

amdgpu_do_asic_reset starting to resume blocks

...

amdgpu :03:00.0: couldn't schedule ib on ring 
[drm:amdgpu_job_run] *ERROR* Error scheduling IBs (-22)
...

amdgpu_device_ip_resume_phase2 resumed gfx_v9_0

amdgpu_device_ip_resume_phase2 resumed sdma_v4_0

amdgpu_device_ip_resume_phase2 resumed powerplay

This is the root of the problem.

The scheduler should never be resumed before we are done bringing the 
hardware back into a usable state.

I don't see the scheduler being resumed when the IB is scheduled; it's done well 
after the hardware is ready in the reset code path.

Below are the logs:

amdgpu :03:00.0: GPU reset begin!
amdgpu_device_gpu_recover calling drm_sched_stop <==
...
amdgpu :03:00.0: couldn't schedule ib on ring 
[drm:amdgpu_job_run] *ERROR* Error scheduling IBs (-22)
...
amdgpu_device_ip_resume_phase2 resumed sdma_v4_0
amdgpu_device_ip_resume_phase2 resumed powerplay
amdgpu_device_ip_resume_phase2 resumed dm
...
[drm] recover vram bo from shadow done
amdgpu_device_gpu_recover calling  drm_sched_start <==
...

As mentioned in the call trace, drm_sched_main() is responsible for this 
job_run which seems to be called during cleanup.

Then the scheduler isn't stopped for some reason and we need to investigate why.

We used to have another kthread_park()/unpark() in drm_sched_entity_fini(), 
maybe an application is crashing while we are trying to reset the GPU?


We still have it, and doesn't doing kthread_park()/unpark() from 
drm_sched_entity_fini while a GPU reset is in progress defeat the whole purpose 
of drm_sched_stop->kthread_park? If drm_sched_entity_fini->kthread_unpark 
happens AFTER drm_sched_stop->kthread_park, nothing prevents another 
(third) thread from continuing to submit jobs to the HW, which will be picked up 
by the unparked scheduler thread and submitted to the HW, only to fail because 
the HW ring is deactivated.

If so maybe we should serialize calls to kthread_park/unpark(sched->thread) ?

Yeah, that was my thinking as well. Probably best to just grab the reset lock 
before calling drm_sched_entity_fini().


Shirish - please try locking &adev->lock_reset around calls to 
drm_sched_entity_fini as Christian suggests and see if this actually helps the 
issue.

Yes that also works.

Regards,

Shirish S

Andrey


Alternative I think we could change the kthread_park/unpark to a 
wait_event_ in drm_sched_entity_fini().

Regards,
Christian.


Andrey


Would be rather unlikely, and especially hard to reproduce, but that's currently 
my best bet for what's going wrong here.

Regards,
Christian.


Regards,

Shirish S

Regards,
Christian.

Am 25.10.19 um 10:50 schrieb S, Shirish:

Here is the call trace:

Call Trace:
 dump_stack+0x4d/0x63
 amdgpu_ib_schedule+0x86/0x4b7
 ? __mod_timer+0x21e/0x244
 amdgpu_job_run+0x108/0x178
 drm_sched_main+0x253/0x2fa
 ? remove_wait_queue+0x51/0x51
 ? drm_sched_cleanup_jobs.part.12+0xda/0xda
 kthread+0x14f/0x157
 ? kthread_park+0x86/0x86
 ret_from_fork+0x22/0x40
amdgpu :03:00.0: couldn't schedule ib on ring 
[drm:amdgpu_job_run] *ERROR* Error scheduling IBs (-22)

printed via below change:

@@ -151,6 +152,10 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned 
num_ibs,
}

if (!ring->sched.ready) {
+  dump_stack();
dev_err(adev->dev, "couldn't schedule ib on ring <%s>\n", 
ring->name);
return -EINVAL;

On 10/24/2019 10:00 PM, Christian König wrote:
Am 24.10.19 um 17:06 schrieb Grodzovsky, Andrey:


On 10/24/19 7:01 AM, Christian König wrote:
Am 24.10.19 um 12:58 schrieb S, Shirish:
[Why]
Upon GPU reset, kernel cleans up already submitted jobs
via drm_sched_cleanup_jobs.
This schedules ib's via drm_sched_main()->run_job, leading to
race condition of rings being ready or not, since during reset
rings may be suspended.

NAK, exactly that's what should not happen.

The scheduler should be suspend while a GPU reset is in progress.

So you are running into a completely different race here.

Below is the series of events when the issue occurs.

(Note that as you & Andrey mentioned the scheduler has been suspended but the 
job is scheduled nonetheless.)

amdgpu :03:00.0: GPU reset begin!

...

amdgpu_device_gpu_recover stopping ring sdma0 via drm_sched_stop

...

amdgpu :03:00.0: GPU reset succeeded, trying to resume

amdgpu_do_asic_reset starting to resume blocks

...

amdgpu :03:00.0: couldn't schedule ib on ring 
[drm:amdgpu_job_run] *ERROR* Error scheduling IBs (-22)
...

amdgpu_device_ip_resume_phase2 resumed gfx_v9_0

amdgpu_device_ip_resume_phase2 resumed sdma_v4_0

amdgpu_device_ip_resume_phase2 resumed po

Re: [PATCH] drm/amdgpu: guard ib scheduling while in reset

2019-10-25 Thread S, Shirish

On 10/25/2019 2:56 PM, Koenig, Christian wrote:
Am 25.10.19 um 11:22 schrieb S, Shirish:


On 10/25/2019 2:23 PM, Koenig, Christian wrote:

amdgpu_do_asic_reset starting to resume blocks

...

amdgpu :03:00.0: couldn't schedule ib on ring 
[drm:amdgpu_job_run] *ERROR* Error scheduling IBs (-22)
...

amdgpu_device_ip_resume_phase2 resumed gfx_v9_0

amdgpu_device_ip_resume_phase2 resumed sdma_v4_0

amdgpu_device_ip_resume_phase2 resumed powerplay

This is the root of the problem.

The scheduler should never be resumed before we are done bringing the 
hardware back into a usable state.

I don't see the scheduler being resumed when the IB is scheduled; it's done 
well after the hardware is ready in the reset code path.

Below are the logs:

amdgpu :03:00.0: GPU reset begin!
amdgpu_device_gpu_recover calling drm_sched_stop <==
...
amdgpu :03:00.0: couldn't schedule ib on ring 
[drm:amdgpu_job_run] *ERROR* Error scheduling IBs (-22)
...
amdgpu_device_ip_resume_phase2 resumed sdma_v4_0
amdgpu_device_ip_resume_phase2 resumed powerplay
amdgpu_device_ip_resume_phase2 resumed dm
...
[drm] recover vram bo from shadow done
amdgpu_device_gpu_recover calling  drm_sched_start <==
...

As shown in the call trace, drm_sched_main() is responsible for this 
job_run, which seems to be called during cleanup.

Then the scheduler isn't stopped for some reason and we need to investigate why.


As I mentioned, drm_sched_stop() only parks the thread and cancels work, 
nothing else; I am not sure why you think it hasn't stopped or done what it is 
supposed to do.

Since it works 3/5 times.

We used to have another kthread_park()/unpark() in drm_sched_entity_fini(), 
maybe an application is crashing while we are trying to reset the GPU?

That would be rather unlikely, and especially hard to reproduce, but it is 
currently my best bet for what's going wrong here.

It's sometimes triggered from drm_sched_entity_fini(), as I can see in the 
prints, but not always.

I believe an application crashing while the GPU resets is anticipated, 
depending on the workload and the state of the gfx renderer when the reset 
happened.

Since a reset is not a usual/routine/regular event, such anomalies are to be 
expected when it happens,

so we need failsafe methods like this patch, and perhaps more, based on 
system behavior upon reset.

Regards,

Shirish S

Regards,
Christian.


Regards,

Shirish S

Regards,
Christian.

On 25.10.19 at 10:50, S, Shirish wrote:

Here is the call trace:

Call Trace:
 dump_stack+0x4d/0x63
 amdgpu_ib_schedule+0x86/0x4b7
 ? __mod_timer+0x21e/0x244
 amdgpu_job_run+0x108/0x178
 drm_sched_main+0x253/0x2fa
 ? remove_wait_queue+0x51/0x51
 ? drm_sched_cleanup_jobs.part.12+0xda/0xda
 kthread+0x14f/0x157
 ? kthread_park+0x86/0x86
 ret_from_fork+0x22/0x40
amdgpu :03:00.0: couldn't schedule ib on ring 
[drm:amdgpu_job_run] *ERROR* Error scheduling IBs (-22)

printed via below change:

@@ -151,6 +152,10 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned 
num_ibs,
}

if (!ring->sched.ready) {
+  dump_stack();
dev_err(adev->dev, "couldn't schedule ib on ring <%s>\n", 
ring->name);
return -EINVAL;

On 10/24/2019 10:00 PM, Christian König wrote:
On 24.10.19 at 17:06, Grodzovsky, Andrey wrote:


On 10/24/19 7:01 AM, Christian König wrote:
On 24.10.19 at 12:58, S, Shirish wrote:
[Why]
Upon GPU reset, the kernel cleans up already-submitted jobs
via drm_sched_cleanup_jobs. This schedules IBs via
drm_sched_main()->run_job, leading to a race on ring
readiness, since rings may be suspended during reset.

NAK, exactly that's what should not happen.

The scheduler should be suspended while a GPU reset is in progress.

So you are running into a completely different race here.

Below is the series of events when the issue occurs.

(Note that, as you & Andrey mentioned, the scheduler has been suspended but the 
job is scheduled nonetheless.)

amdgpu :03:00.0: GPU reset begin!

...

amdgpu_device_gpu_recover stopping ring sdma0 via drm_sched_stop

...

amdgpu :03:00.0: GPU reset succeeded, trying to resume

amdgpu_do_asic_reset starting to resume blocks

...

amdgpu :03:00.0: couldn't schedule ib on ring 
[drm:amdgpu_job_run] *ERROR* Error scheduling IBs (-22)
...

amdgpu_device_ip_resume_phase2 resumed gfx_v9_0

amdgpu_device_ip_resume_phase2 resumed sdma_v4_0

amdgpu_device_ip_resume_phase2 resumed powerplay

...

FWIW, since the job is always NULL in "drm_sched_stop(&ring->sched, job ? 
&job->base : NULL);" when called during reset, all drm_sched_stop() does 
is cancel the delayed work and park the sched->thread. There is no job list 
to be iterated to deactivate, remove, or update fences.

Based on all this analysis, adding a mutex is more failsafe and less intrusive 
in the current code flow, and it seems logical as well, hence I devised this 
approach.

Re: [PATCH] drm/amdgpu: guard ib scheduling while in reset

2019-10-25 Thread S, Shirish

On 10/25/2019 2:23 PM, Koenig, Christian wrote:

amdgpu_do_asic_reset starting to resume blocks

...

amdgpu :03:00.0: couldn't schedule ib on ring 
[drm:amdgpu_job_run] *ERROR* Error scheduling IBs (-22)
...

amdgpu_device_ip_resume_phase2 resumed gfx_v9_0

amdgpu_device_ip_resume_phase2 resumed sdma_v4_0

amdgpu_device_ip_resume_phase2 resumed powerplay

This is the root of the problem.

The scheduler should never be resumed before we are done bringing the 
hardware back into a usable state.

I don't see the scheduler being resumed when the IB is scheduled; it's done 
well after the hardware is ready in the reset code path.

Below are the logs:

amdgpu :03:00.0: GPU reset begin!
amdgpu_device_gpu_recover calling drm_sched_stop <==
...
amdgpu :03:00.0: couldn't schedule ib on ring 
[drm:amdgpu_job_run] *ERROR* Error scheduling IBs (-22)
...
amdgpu_device_ip_resume_phase2 resumed sdma_v4_0
amdgpu_device_ip_resume_phase2 resumed powerplay
amdgpu_device_ip_resume_phase2 resumed dm
...
[drm] recover vram bo from shadow done
amdgpu_device_gpu_recover calling  drm_sched_start <==
...

As shown in the call trace, drm_sched_main() is responsible for this 
job_run, which seems to be called during cleanup.

Regards,

Shirish S

Regards,
Christian.

On 25.10.19 at 10:50, S, Shirish wrote:

Here is the call trace:

Call Trace:
 dump_stack+0x4d/0x63
 amdgpu_ib_schedule+0x86/0x4b7
 ? __mod_timer+0x21e/0x244
 amdgpu_job_run+0x108/0x178
 drm_sched_main+0x253/0x2fa
 ? remove_wait_queue+0x51/0x51
 ? drm_sched_cleanup_jobs.part.12+0xda/0xda
 kthread+0x14f/0x157
 ? kthread_park+0x86/0x86
 ret_from_fork+0x22/0x40
amdgpu :03:00.0: couldn't schedule ib on ring 
[drm:amdgpu_job_run] *ERROR* Error scheduling IBs (-22)

printed via below change:

@@ -151,6 +152,10 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned 
num_ibs,
}

if (!ring->sched.ready) {
+  dump_stack();
dev_err(adev->dev, "couldn't schedule ib on ring <%s>\n", 
ring->name);
return -EINVAL;

On 10/24/2019 10:00 PM, Christian König wrote:
On 24.10.19 at 17:06, Grodzovsky, Andrey wrote:


On 10/24/19 7:01 AM, Christian König wrote:
On 24.10.19 at 12:58, S, Shirish wrote:
[Why]
Upon GPU reset, the kernel cleans up already-submitted jobs
via drm_sched_cleanup_jobs. This schedules IBs via
drm_sched_main()->run_job, leading to a race on ring
readiness, since rings may be suspended during reset.

NAK, exactly that's what should not happen.

The scheduler should be suspended while a GPU reset is in progress.

So you are running into a completely different race here.

Below is the series of events when the issue occurs.

(Note that, as you & Andrey mentioned, the scheduler has been suspended but the 
job is scheduled nonetheless.)

amdgpu :03:00.0: GPU reset begin!

...

amdgpu_device_gpu_recover stopping ring sdma0 via drm_sched_stop

...

amdgpu :03:00.0: GPU reset succeeded, trying to resume

amdgpu_do_asic_reset starting to resume blocks

...

amdgpu :03:00.0: couldn't schedule ib on ring 
[drm:amdgpu_job_run] *ERROR* Error scheduling IBs (-22)
...

amdgpu_device_ip_resume_phase2 resumed gfx_v9_0

amdgpu_device_ip_resume_phase2 resumed sdma_v4_0

amdgpu_device_ip_resume_phase2 resumed powerplay

...

FWIW, since the job is always NULL in "drm_sched_stop(&ring->sched, job ? 
&job->base : NULL);" when called during reset, all drm_sched_stop() does 
is cancel the delayed work and park the sched->thread. There is no job list 
to be iterated to deactivate, remove, or update fences.

Based on all this analysis, adding a mutex is more failsafe and less intrusive 
in the current code flow, and it seems logical as well, hence I devised this 
approach.


Please sync up with Andrey on how this was able to happen.

Regards,
Christian.


Shirish - Christian makes a good point - note that in amdgpu_device_gpu_recover, 
drm_sched_stop, which stops all the scheduler threads, is called well before we 
suspend the HW in amdgpu_device_pre_asic_reset->amdgpu_device_ip_suspend, where 
SDMA suspension happens and where the HW ring is marked as not ready. Please 
provide a call stack for when you hit "[drm:amdgpu_job_run] *ERROR* Error 
scheduling IBs (-22)" to identify the code path that tried to submit the SDMA IB.

Well the most likely cause of this is that the hardware failed to resume after 
the reset.

In fact, hardware resume has not yet started when the job is scheduled, which 
is the race I am trying to address with this patch.

Regards,

Shirish S

Christian.


Andrey



[How]
make GPU reset's amdgpu_device_ip_resume_phase2() &
amdgpu_ib_schedule() in amdgpu_job_run() mutually exclusive.

Signed-off-by: Shirish S <mailto:shiris...@amd.com>
---
  drivers/gpu/drm/amd/amdgpu/amdgpu.h| 1 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 +++
  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 

Re: [PATCH] drm/amdgpu: guard ib scheduling while in reset

2019-10-25 Thread S, Shirish
Here is the call trace:

Call Trace:
 dump_stack+0x4d/0x63
 amdgpu_ib_schedule+0x86/0x4b7
 ? __mod_timer+0x21e/0x244
 amdgpu_job_run+0x108/0x178
 drm_sched_main+0x253/0x2fa
 ? remove_wait_queue+0x51/0x51
 ? drm_sched_cleanup_jobs.part.12+0xda/0xda
 kthread+0x14f/0x157
 ? kthread_park+0x86/0x86
 ret_from_fork+0x22/0x40
amdgpu :03:00.0: couldn't schedule ib on ring 
[drm:amdgpu_job_run] *ERROR* Error scheduling IBs (-22)

printed via below change:

@@ -151,6 +152,10 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned 
num_ibs,
}

if (!ring->sched.ready) {
+  dump_stack();
dev_err(adev->dev, "couldn't schedule ib on ring <%s>\n", 
ring->name);
return -EINVAL;

On 10/24/2019 10:00 PM, Christian König wrote:
On 24.10.19 at 17:06, Grodzovsky, Andrey wrote:


On 10/24/19 7:01 AM, Christian König wrote:
On 24.10.19 at 12:58, S, Shirish wrote:
[Why]
Upon GPU reset, the kernel cleans up already-submitted jobs
via drm_sched_cleanup_jobs. This schedules IBs via
drm_sched_main()->run_job, leading to a race on ring
readiness, since rings may be suspended during reset.

NAK, exactly that's what should not happen.

The scheduler should be suspended while a GPU reset is in progress.

So you are running into a completely different race here.

Below is the series of events when the issue occurs.

(Note that, as you & Andrey mentioned, the scheduler has been suspended but the 
job is scheduled nonetheless.)

amdgpu :03:00.0: GPU reset begin!

...

amdgpu_device_gpu_recover stopping ring sdma0 via drm_sched_stop

...

amdgpu :03:00.0: GPU reset succeeded, trying to resume

amdgpu_do_asic_reset starting to resume blocks

...

amdgpu :03:00.0: couldn't schedule ib on ring 
[drm:amdgpu_job_run] *ERROR* Error scheduling IBs (-22)
...

amdgpu_device_ip_resume_phase2 resumed gfx_v9_0

amdgpu_device_ip_resume_phase2 resumed sdma_v4_0

amdgpu_device_ip_resume_phase2 resumed powerplay

...

FWIW, since the job is always NULL in "drm_sched_stop(&ring->sched, job ? 
&job->base : NULL);" when called during reset, all drm_sched_stop() does 
is cancel the delayed work and park the sched->thread. There is no job list 
to be iterated to deactivate, remove, or update fences.

Based on all this analysis, adding a mutex is more failsafe and less intrusive 
in the current code flow, and it seems logical as well, hence I devised this 
approach.


Please sync up with Andrey on how this was able to happen.

Regards,
Christian.


Shirish - Christian makes a good point - note that in amdgpu_device_gpu_recover, 
drm_sched_stop, which stops all the scheduler threads, is called well before we 
suspend the HW in amdgpu_device_pre_asic_reset->amdgpu_device_ip_suspend, where 
SDMA suspension happens and where the HW ring is marked as not ready. Please 
provide a call stack for when you hit "[drm:amdgpu_job_run] *ERROR* Error 
scheduling IBs (-22)" to identify the code path that tried to submit the SDMA IB.

Well the most likely cause of this is that the hardware failed to resume after 
the reset.

In fact, hardware resume has not yet started when the job is scheduled, which 
is the race I am trying to address with this patch.

Regards,

Shirish S

Christian.


Andrey



[How]
make GPU reset's amdgpu_device_ip_resume_phase2() &
amdgpu_ib_schedule() in amdgpu_job_run() mutually exclusive.

Signed-off-by: Shirish S <mailto:shiris...@amd.com>
---
  drivers/gpu/drm/amd/amdgpu/amdgpu.h| 1 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 +++
  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c| 2 ++
  3 files changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index f4d9041..7b07a47b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -973,6 +973,7 @@ struct amdgpu_device {
  boolin_gpu_reset;
  enum pp_mp1_state   mp1_state;
  struct mutex  lock_reset;
+struct mutex  lock_ib_sched;
  struct amdgpu_doorbell_index doorbell_index;
int asic_reset_res;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 676cad1..63cad74 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2759,6 +2759,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
  mutex_init(>virt.vf_errors.lock);
  hash_init(adev->mn_hash);
  mutex_init(>lock_reset);
+mutex_init(>lock_ib_sched);
  mutex_init(>virt.dpm_mutex);
  mutex_init(>psp.mutex);
  @@ -3795,7 +3796,9 @@ static int amdgpu_do_asic_reset(struct amdgpu_hive_info 
*hive,
  if (r)
  return r;
  +mutex_lock(&tmp_adev->lock_ib_sched);
  r = amdgpu_device_ip_resume_phase2(tmp_adev);
+mutex_unlock(&tmp_adev->lock_ib_sched);

[PATCH] drm/amdgpu: guard ib scheduling while in reset

2019-10-24 Thread S, Shirish
[Why]
Upon GPU reset, the kernel cleans up already-submitted jobs
via drm_sched_cleanup_jobs. This schedules IBs via
drm_sched_main()->run_job, leading to a race on ring
readiness, since rings may be suspended during reset.

[How]
make GPU reset's amdgpu_device_ip_resume_phase2() &
amdgpu_ib_schedule() in amdgpu_job_run() mutually exclusive.

Signed-off-by: Shirish S 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h| 1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c| 2 ++
 3 files changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index f4d9041..7b07a47b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -973,6 +973,7 @@ struct amdgpu_device {
boolin_gpu_reset;
enum pp_mp1_state   mp1_state;
struct mutex  lock_reset;
+   struct mutex  lock_ib_sched;
struct amdgpu_doorbell_index doorbell_index;
 
int asic_reset_res;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 676cad1..63cad74 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2759,6 +2759,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
mutex_init(>virt.vf_errors.lock);
hash_init(adev->mn_hash);
mutex_init(>lock_reset);
+   mutex_init(>lock_ib_sched);
mutex_init(>virt.dpm_mutex);
mutex_init(>psp.mutex);
 
@@ -3795,7 +3796,9 @@ static int amdgpu_do_asic_reset(struct amdgpu_hive_info 
*hive,
if (r)
return r;
 
+   mutex_lock(&tmp_adev->lock_ib_sched);
	r = amdgpu_device_ip_resume_phase2(tmp_adev);
+   mutex_unlock(&tmp_adev->lock_ib_sched);
if (r)
goto out;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index e1bad99..cd6082d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -233,8 +233,10 @@ static struct dma_fence *amdgpu_job_run(struct 
drm_sched_job *sched_job)
if (finished->error < 0) {
DRM_INFO("Skip scheduling IBs!\n");
} else {
+   mutex_lock(&ring->adev->lock_ib_sched);
		r = amdgpu_ib_schedule(ring, job->num_ibs, job->ibs, job,
				       &fence);
+   mutex_unlock(&ring->adev->lock_ib_sched);
if (r)
DRM_ERROR("Error scheduling IBs (%d)\n", r);
}
-- 
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH] UPSTREAM: drm/amd/display: Fix Apple dongle cannot be successfully detected

2019-10-23 Thread S, Shirish
The UPSTREAM tag in the commit message needs to be removed.

On 10/21/2019 1:24 PM, Louis Li wrote:
> [Why]
> External monitor cannot be displayed consistently, if connecting
> via this Apple dongle (A1621, USB Type-C to HDMI).
> By experiments, it is confirmed that the dongle needs 200ms at least
> to be ready for communication, after it sets HPD signal high.
>
> [How]
> When receiving HPD IRQ, delay 500ms at the beginning of handle_hpd_irq().

I am not sure how this delay will impact dongles that don't need it;

ideally it should be added as a quirk, or at least restricted to these 
specific vendors.

Instead of a fixed delay, can you find a parameter to wait on for the 
communication to become ready?

That way it would be failsafe.

> Then run the original procedure.
> With this patch applied, the problem cannot be reproduced.
> With other dongles, test results are PASS.
> Test result is PASS after system resumes from suspend.
>
> Signed-off-by: Louis Li 
> ---
>   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 5 +
>   1 file changed, 5 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 0aef92b7c037..043ddac73862 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -1025,6 +1025,11 @@ static void handle_hpd_irq(void *param)
>   struct drm_device *dev = connector->dev;
>   enum dc_connection_type new_connection_type = dc_connection_none;
>   
> +/* Some monitors/dongles need around 200ms to be ready for communication
> + * after they drive HPD signal high.
> + */
> +mdelay(500);
> +
>   /* In case of failure or MST no need to update connector status or 
> notify the OS
>* since (for MST case) MST does this in it's own context.
>*/

-- 
Regards,
Shirish S


RE: [PATCH 1/3] drm/amdgpu: fix stack alignment ABI mismatch for Clang

2019-10-17 Thread S, Shirish
Tested-by: Shirish S  



Regards,
Shirish S

-Original Message-
From: Nick Desaulniers  
Sent: Thursday, October 17, 2019 4:32 AM
To: Wentland, Harry ; Deucher, Alexander 

Cc: yshu...@gmail.com; andrew.coop...@citrix.com; a...@arndb.de; 
clang-built-li...@googlegroups.com; m...@google.com; S, Shirish 
; Zhou, David(ChunMing) ; Koenig, 
Christian ; amd-gfx@lists.freedesktop.org; 
linux-ker...@vger.kernel.org; Nick Desaulniers 
Subject: [PATCH 1/3] drm/amdgpu: fix stack alignment ABI mismatch for Clang

The x86 kernel is compiled with an 8B stack alignment via 
`-mpreferred-stack-boundary=3` for GCC since 3.6-rc1 via commit d9b0cde91c60 
("x86-64, gcc: Use -mpreferred-stack-boundary=3 if supported") or 
`-mstack-alignment=8` for Clang. Parts of the AMDGPU driver are compiled with 
16B stack alignment.

Generally, the stack alignment is part of the ABI. Linking together two 
different translation units with differing stack alignment is dangerous, 
particularly when the translation unit with the smaller stack alignment makes 
calls into the translation unit with the larger stack alignment.
While 8B aligned stacks are sometimes also 16B aligned, they are not always.

Multiple users have reported General Protection Faults (GPF) when using the 
AMDGPU driver compiled with Clang. Clang is placing objects in stack slots 
assuming the stack is 16B aligned, and selecting instructions that require 16B 
aligned memory operands.

At runtime, syscall handlers with 8B aligned stack call into code that assumes 
16B stack alignment.  When the stack is a multiple of 8B but not 16B, these 
instructions result in a GPF.

Remove the code that added compatibility between the differing compiler flags, 
as it will result in runtime GPFs when built with Clang. Cleanups for GCC will 
be sent in later patches in the series.

Link: https://github.com/ClangBuiltLinux/linux/issues/735
Debugged-by: Yuxuan Shui 
Reported-by: Shirish S 
Reported-by: Yuxuan Shui 
Suggested-by: Andrew Cooper 
Signed-off-by: Nick Desaulniers 
---
 drivers/gpu/drm/amd/display/dc/calcs/Makefile | 10 --  
drivers/gpu/drm/amd/display/dc/dcn20/Makefile | 10 --  
drivers/gpu/drm/amd/display/dc/dcn21/Makefile | 10 --
 drivers/gpu/drm/amd/display/dc/dml/Makefile   | 10 --
 drivers/gpu/drm/amd/display/dc/dsc/Makefile   | 10 --
 5 files changed, 20 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/calcs/Makefile 
b/drivers/gpu/drm/amd/display/dc/calcs/Makefile
index 985633c08a26..4b1a8a08a5de 100644
--- a/drivers/gpu/drm/amd/display/dc/calcs/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/calcs/Makefile
@@ -24,13 +24,11 @@
 # It calculates Bandwidth and Watermarks values for HW programming  #
 
-ifneq ($(call cc-option, -mpreferred-stack-boundary=4),)
-   cc_stack_align := -mpreferred-stack-boundary=4
-else ifneq ($(call cc-option, -mstack-alignment=16),)
-   cc_stack_align := -mstack-alignment=16
-endif
+calcs_ccflags := -mhard-float -msse
 
-calcs_ccflags := -mhard-float -msse $(cc_stack_align)
+ifdef CONFIG_CC_IS_GCC
+calcs_ccflags += -mpreferred-stack-boundary=4
+endif
 
 ifdef CONFIG_CC_IS_CLANG
 calcs_ccflags += -msse2
diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/Makefile 
b/drivers/gpu/drm/amd/display/dc/dcn20/Makefile
index ddb8d5649e79..5fe3eb80075d 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/Makefile
@@ -10,13 +10,11 @@ ifdef CONFIG_DRM_AMD_DC_DSC_SUPPORT
 DCN20 += dcn20_dsc.o
 endif
 
-ifneq ($(call cc-option, -mpreferred-stack-boundary=4),)
-   cc_stack_align := -mpreferred-stack-boundary=4
-else ifneq ($(call cc-option, -mstack-alignment=16),)
-   cc_stack_align := -mstack-alignment=16
-endif
+CFLAGS_$(AMDDALPATH)/dc/dcn20/dcn20_resource.o := -mhard-float -msse
 
-CFLAGS_$(AMDDALPATH)/dc/dcn20/dcn20_resource.o := -mhard-float -msse 
$(cc_stack_align)
+ifdef CONFIG_CC_IS_GCC
+CFLAGS_$(AMDDALPATH)/dc/dcn20/dcn20_resource.o += -mpreferred-stack-boundary=4
+endif
 
 ifdef CONFIG_CC_IS_CLANG
 CFLAGS_$(AMDDALPATH)/dc/dcn20/dcn20_resource.o += -msse2 diff --git 
a/drivers/gpu/drm/amd/display/dc/dcn21/Makefile 
b/drivers/gpu/drm/amd/display/dc/dcn21/Makefile
index ef673bffc241..7057e20748b9 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn21/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/dcn21/Makefile
@@ -3,13 +3,11 @@
 
 DCN21 = dcn21_hubp.o dcn21_hubbub.o dcn21_resource.o
 
-ifneq ($(call cc-option, -mpreferred-stack-boundary=4),)
-   cc_stack_align := -mpreferred-stack-boundary=4
-else ifneq ($(call cc-option, -mstack-alignment=16),)
-   cc_stack_align := -mstack-alignment=16
-endif
+CFLAGS_$(AMDDALPATH)/dc/dcn21/dcn21_resource.o := -mhard-float -msse
 
-CFLAGS_$(AMDDALPATH)/dc/dcn21/dcn21_resource.o := -mhard-float -msse 
$(cc_stack_align)
+ifdef CONFIG_CC_IS_GCC
+CFLAGS_$(AMDDALPATH)/dc/dcn21/dcn21_resource.o += -mpreferred-stack-boundary=4
+endif
 
 ifdef CONFIG_CC_IS_CLANG
 CFLAGS_$(AMDD

Re: AMDGPU and 16B stack alignment

2019-10-15 Thread S, Shirish
Hi Nick,

On 10/15/2019 3:52 AM, Nick Desaulniers wrote:

Hello!

The x86 kernel is compiled with an 8B stack alignment via
`-mpreferred-stack-boundary=3` for GCC since 3.6-rc1 via
commit d9b0cde91c60 ("x86-64, gcc: Use
-mpreferred-stack-boundary=3 if supported")
or `-mstack-alignment=8` for Clang. Parts of the AMDGPU driver are
compiled with 16B stack alignment.

Generally, the stack alignment is part of the ABI. Linking together two
different translation units with differing stack alignment is dangerous,
particularly when the translation unit with the smaller stack alignment
makes calls into the translation unit with the larger stack alignment.
While 8B aligned stacks are sometimes also 16B aligned, they are not
always.

Multiple users have reported General Protection Faults (GPF) when using
the AMDGPU driver compiled with Clang. Clang is placing objects in stack
slots assuming the stack is 16B aligned, and selecting instructions that
require 16B aligned memory operands. At runtime, syscalls handling 8B
stack aligned code calls into code that assumes 16B stack alignment.
When the stack is a multiple of 8B but not 16B, these instructions
result in a GPF.

GCC doesn't select instructions with alignment requirements, so the GPFs
aren't observed, but it is still considered an ABI breakage to mix and
match stack alignment.

I have patches that basically remove -mpreferred-stack-boundary=4 and
-mstack-alignment=16 from AMDGPU:
https://github.com/ClangBuiltLinux/linux/issues/735#issuecomment-541247601
Yuxuan has tested with Clang and GCC and reported it fixes the GPF's observed.

My GCC build fails with the errors below:

dcn_calcs.c:1:0: error: -mpreferred-stack-boundary=3 is not between 4 and 12

dcn_calc_math.c:1:0: error: -mpreferred-stack-boundary=3 is not between 4 and 12

Meanwhile, the GPFs observed in Clang builds seem to be fixed.

--
Regards,
Shirish S



I've split the patch into 4; same commit message but different Fixes
tags so that they backport to stable on finer granularity. 2 questions
BEFORE I send the series:

1. Would you prefer 4 patches with unique `fixes` tags, or 1 patch?
2. Was there or is there still a good reason for the stack alignment mismatch?

(Further, I think we can use -msse2 for BOTH clang+gcc after my patch,
but I don't have hardware to test on. I'm happy to write/send the
follow up patch, but I'd need help testing).





[PATCH] drm/amdgpu/psp: silence response status warning

2019-09-24 Thread S, Shirish
Log the response-status error to the driver's debug log,
since the PSP response status can be non-zero even though
there was no problem when the command was submitted.

The warning is misleading, hence this change.

Signed-off-by: Shirish S 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index 76c59d5..37ffed5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -170,7 +170,7 @@ psp_cmd_submit_buf(struct psp_context *psp,
if (ucode)
DRM_WARN("failed to load ucode id (%d) ",
  ucode->ucode_id);
-   DRM_WARN("psp command (0x%X) failed and response status is 
(0x%X)\n",
+   DRM_DEBUG_DRIVER("psp command (0x%X) failed and response status 
is (0x%X)\n",
 psp->cmd_buf_mem->cmd_id,
 psp->cmd_buf_mem->resp.status & GFX_CMD_STATUS_MASK);
if (!timeout) {
-- 
2.7.4


Re: [PATCH] drm/amdgpu: fix build error without CONFIG_HSA_AMD (V2)

2019-09-12 Thread S, Shirish

On 9/12/2019 3:29 AM, Kuehling, Felix wrote:
> On 2019-09-11 2:52 a.m., S, Shirish wrote:
>> If CONFIG_HSA_AMD is not set, build fails:
>>
>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.o: In function 
>> `amdgpu_device_ip_early_init':
>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1626: undefined reference to 
>> `sched_policy'
>>
>> Use CONFIG_HSA_AMD to guard this.
>>
>> Fixes: 1abb680ad371 ("drm/amdgpu: disable gfxoff while use no H/W scheduling 
>> policy")
>>
>> V2: declare sched_policy in amdgpu.h and remove changes in amdgpu_device.c
> Which branch is this for. V1 of this patch was already submitted to
> amd-staging-drm-next. So unless you're planning to revert v1 and submit
> v2, I was expecting to see a change that fixes up the previous patch,
> rather than a patch that replaces it.

I have sent a patch that fixes up the previous patch as well.

Apparently I did not send the revert, but my plan was to revert and only 
then submit V2.

Anyway, both approaches work for me as long as the kernel builds.

Regards,

Shirish S

> Regards,
>     Felix
>
>
>> Signed-off-by: Shirish S 
>> ---
>>drivers/gpu/drm/amd/amdgpu/amdgpu.h | 4 
>>1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> index 1030cb3..6ff02bb 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> @@ -169,7 +169,11 @@ extern int amdgpu_discovery;
>>extern int amdgpu_mes;
>>extern int amdgpu_noretry;
>>extern int amdgpu_force_asic_type;
>> +#ifdef CONFIG_HSA_AMD
>>extern int sched_policy;
>> +#else
>> +static const int sched_policy = KFD_SCHED_POLICY_HWS;
>> +#endif
>>
>>#ifdef CONFIG_DRM_AMDGPU_SI
>>extern int amdgpu_si_support;

-- 
Regards,
Shirish S


[PATCH] drm/amdgpu: remove needless usage of #ifdef

2019-09-12 Thread S, Shirish
Define sched_policy in case CONFIG_HSA_AMD is not
enabled; with this, there is no need to check for
CONFIG_HSA_AMD elsewhere in the driver code.

Suggested-by: Felix Kuehling 
Signed-off-by: Shirish S 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h| 2 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 +-
 2 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index a1516a3..6ff02bb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -171,6 +171,8 @@ extern int amdgpu_noretry;
 extern int amdgpu_force_asic_type;
 #ifdef CONFIG_HSA_AMD
 extern int sched_policy;
+#else
+static const int sched_policy = KFD_SCHED_POLICY_HWS;
 #endif
 
 #ifdef CONFIG_DRM_AMDGPU_SI
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 740638e..3b5282b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1623,11 +1623,7 @@ static int amdgpu_device_ip_early_init(struct 
amdgpu_device *adev)
}
 
adev->pm.pp_feature = amdgpu_pp_feature_mask;
-   if (amdgpu_sriov_vf(adev)
-   #ifdef CONFIG_HSA_AMD
-   || sched_policy == KFD_SCHED_POLICY_NO_HWS
-   #endif
-   )
+   if (amdgpu_sriov_vf(adev) || sched_policy == KFD_SCHED_POLICY_NO_HWS)
adev->pm.pp_feature &= ~PP_GFXOFF_MASK;
 
for (i = 0; i < adev->num_ip_blocks; i++) {
-- 
2.7.4


RE: [PATCH] drm/amdgpu: fix build error without CONFIG_HSA_AMD

2019-09-11 Thread S, Shirish
Agreed, I have sent V2.
My patch was actually in line with the already-upstreamed patch:
https://lkml.org/lkml/2019/8/26/201



Regards,
Shirish S

-Original Message-
From: Kuehling, Felix  
Sent: Wednesday, September 11, 2019 9:09 AM
To: Huang, Ray ; S, Shirish ; Deucher, 
Alexander ; Koenig, Christian 

Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: fix build error without CONFIG_HSA_AMD

This is pretty ugly. See a suggestion inline.

On 2019-09-10 4:12 a.m., Huang, Ray wrote:
>> -Original Message-
>> From: S, Shirish 
>> Sent: Tuesday, September 10, 2019 3:54 PM
>> To: Deucher, Alexander ; Koenig, Christian 
>> ; Huang, Ray 
>> Cc: amd-gfx@lists.freedesktop.org; S, Shirish 
>> Subject: [PATCH] drm/amdgpu: fix build error without CONFIG_HSA_AMD
>>
>> If CONFIG_HSA_AMD is not set, build fails:
>>
>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.o: In function
>> `amdgpu_device_ip_early_init':
>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1626: undefined reference 
>> to `sched_policy'
>>
>> Use CONFIG_HSA_AMD to guard this.
>>
>> Fixes: 1abb680ad371 ("drm/amdgpu: disable gfxoff while use no H/W 
>> scheduling policy")
>>
>> Signed-off-by: Shirish S 
> + Felix for his awareness.
>
> Reviewed-by: Huang Rui 
>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu.h| 2 ++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 +-
>>   2 files changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> index 1030cb30720c..a1516a3ae9a8 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>> @@ -169,7 +169,9 @@ extern int amdgpu_discovery;  extern int 
>> amdgpu_mes;  extern int amdgpu_noretry;  extern int 
>> amdgpu_force_asic_type;
>> +#ifdef CONFIG_HSA_AMD
>>   extern int sched_policy;

#else
static const int sched_policy = KFD_SCHED_POLICY_HWS;
#endif

This way you don't need another set of ugly #ifdefs in amdgpu_device.c.

Regards,
   Felix


>> +#endif
>>
>>   #ifdef CONFIG_DRM_AMDGPU_SI
>>   extern int amdgpu_si_support;
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index bd423dd64e18..2535db27f821 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -1623,7 +1623,11 @@ static int amdgpu_device_ip_early_init(struct
>> amdgpu_device *adev)
>>  }
>>
>>  adev->pm.pp_feature = amdgpu_pp_feature_mask;
>> -if (amdgpu_sriov_vf(adev) || sched_policy ==
>> KFD_SCHED_POLICY_NO_HWS)
>> +if (amdgpu_sriov_vf(adev)
>> +#ifdef CONFIG_HSA_AMD
>> +|| sched_policy == KFD_SCHED_POLICY_NO_HWS
>> +#endif
>> +)
>>  adev->pm.pp_feature &= ~PP_GFXOFF_MASK;
>>
>>  for (i = 0; i < adev->num_ip_blocks; i++) {
>> --
>> 2.20.1

[PATCH] drm/amdgpu: fix build error without CONFIG_HSA_AMD (V2)

2019-09-11 Thread S, Shirish
If CONFIG_HSA_AMD is not set, build fails:

drivers/gpu/drm/amd/amdgpu/amdgpu_device.o: In function 
`amdgpu_device_ip_early_init':
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1626: undefined reference to 
`sched_policy'

Use CONFIG_HSA_AMD to guard this.

Fixes: 1abb680ad371 ("drm/amdgpu: disable gfxoff while use no H/W scheduling 
policy")

V2: declare sched_policy in amdgpu.h and remove changes in amdgpu_device.c

Signed-off-by: Shirish S 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 1030cb3..6ff02bb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -169,7 +169,11 @@ extern int amdgpu_discovery;
 extern int amdgpu_mes;
 extern int amdgpu_noretry;
 extern int amdgpu_force_asic_type;
+#ifdef CONFIG_HSA_AMD
 extern int sched_policy;
+#else
+static const int sched_policy = KFD_SCHED_POLICY_HWS;
+#endif
 
 #ifdef CONFIG_DRM_AMDGPU_SI
 extern int amdgpu_si_support;
-- 
2.7.4


[PATCH] drm/amdgpu: fix build error without CONFIG_HSA_AMD

2019-09-10 Thread S, Shirish
If CONFIG_HSA_AMD is not set, build fails:

drivers/gpu/drm/amd/amdgpu/amdgpu_device.o: In function 
`amdgpu_device_ip_early_init':
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1626: undefined reference to 
`sched_policy'

Use CONFIG_HSA_AMD to guard this.

Fixes: 1abb680ad371 ("drm/amdgpu: disable gfxoff while use no H/W scheduling 
policy")

Signed-off-by: Shirish S 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h| 2 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 +-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 1030cb30720c..a1516a3ae9a8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -169,7 +169,9 @@ extern int amdgpu_discovery;
 extern int amdgpu_mes;
 extern int amdgpu_noretry;
 extern int amdgpu_force_asic_type;
+#ifdef CONFIG_HSA_AMD
 extern int sched_policy;
+#endif
 
 #ifdef CONFIG_DRM_AMDGPU_SI
 extern int amdgpu_si_support;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index bd423dd64e18..2535db27f821 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1623,7 +1623,11 @@ static int amdgpu_device_ip_early_init(struct 
amdgpu_device *adev)
}
 
adev->pm.pp_feature = amdgpu_pp_feature_mask;
-   if (amdgpu_sriov_vf(adev) || sched_policy == KFD_SCHED_POLICY_NO_HWS)
+   if (amdgpu_sriov_vf(adev)
+   #ifdef CONFIG_HSA_AMD
+   || sched_policy == KFD_SCHED_POLICY_NO_HWS
+   #endif
+   )
adev->pm.pp_feature &= ~PP_GFXOFF_MASK;
 
for (i = 0; i < adev->num_ip_blocks; i++) {
-- 
2.20.1


Re: [PATCH] drm/amdgpu/{uvd,vcn}: fetch ring's read_ptr after alloc

2019-06-04 Thread S, Shirish
Hi Alex,

On 6/4/2019 9:43 PM, Alex Deucher wrote:
> On Tue, Jun 4, 2019 at 12:07 PM S, Shirish  wrote:
>> [What]
>> readptr read always returns zero, since most likely
>> UVD block is either power or clock gated.
>>
>> [How]
>> fetch rptr after amdgpu_ring_alloc() which informs
>> the power management code that the block is about to be
>> used and hence the gating is turned off.
>>
>> Signed-off-by: Louis Li 
>> Signed-off-by: Shirish S 
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 4 +++-
>>   drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c   | 5 -
>>   drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c   | 5 -
> What about uvd 4.2, 5.0 and VCE 2.0, 3.0, 4.0?
amdgpu_vce_ring_test_ring() is the common function for VCE 2.0, 3.0 & 4.0,
and the patch that fixes it is already reviewed.

UVD 4.2 & 5.0 use mmUVD_CONTEXT_ID instead of the read pointer,
so I believe this fix is not applicable to them.
Regards,
Shirish S
>
> Alex
>
>>   3 files changed, 11 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>> index 118451f..d786098 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
>> @@ -468,7 +468,7 @@ int amdgpu_vcn_dec_ring_test_ib(struct amdgpu_ring 
>> *ring, long timeout)
>>   int amdgpu_vcn_enc_ring_test_ring(struct amdgpu_ring *ring)
>>   {
>>  struct amdgpu_device *adev = ring->adev;
>> -   uint32_t rptr = amdgpu_ring_get_rptr(ring);
>> +   uint32_t rptr;
>>  unsigned i;
>>  int r;
>>
>> @@ -476,6 +476,8 @@ int amdgpu_vcn_enc_ring_test_ring(struct amdgpu_ring 
>> *ring)
>>  if (r)
>>  return r;
>>
>> +   rptr = amdgpu_ring_get_rptr(ring);
>> +
>>  amdgpu_ring_write(ring, VCN_ENC_CMD_END);
>>  amdgpu_ring_commit(ring);
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c 
>> b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
>> index c61a314..16682b7 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
>> @@ -170,13 +170,16 @@ static void uvd_v6_0_enc_ring_set_wptr(struct 
>> amdgpu_ring *ring)
>>   static int uvd_v6_0_enc_ring_test_ring(struct amdgpu_ring *ring)
>>   {
>>  struct amdgpu_device *adev = ring->adev;
>> -   uint32_t rptr = amdgpu_ring_get_rptr(ring);
>> +   uint32_t rptr;
>>  unsigned i;
>>  int r;
>>
>>  r = amdgpu_ring_alloc(ring, 16);
>>  if (r)
>>  return r;
>> +
>> +   rptr = amdgpu_ring_get_rptr(ring);
>> +
>>  amdgpu_ring_write(ring, HEVC_ENC_CMD_END);
>>  amdgpu_ring_commit(ring);
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c 
>> b/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
>> index cdb96d4..74811b2 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
>> @@ -175,7 +175,7 @@ static void uvd_v7_0_enc_ring_set_wptr(struct 
>> amdgpu_ring *ring)
>>   static int uvd_v7_0_enc_ring_test_ring(struct amdgpu_ring *ring)
>>   {
>>  struct amdgpu_device *adev = ring->adev;
>> -   uint32_t rptr = amdgpu_ring_get_rptr(ring);
>> +   uint32_t rptr;
>>  unsigned i;
>>  int r;
>>
>> @@ -185,6 +185,9 @@ static int uvd_v7_0_enc_ring_test_ring(struct 
>> amdgpu_ring *ring)
>>  r = amdgpu_ring_alloc(ring, 16);
>>  if (r)
>>  return r;
>> +
>> +   rptr = amdgpu_ring_get_rptr(ring);
>> +
>>  amdgpu_ring_write(ring, HEVC_ENC_CMD_END);
>>  amdgpu_ring_commit(ring);
>>
>> --
>> 2.7.4
>>

-- 
Regards,
Shirish S


RE: [PATCH] drm/amdgpu: fix ring test failure issue during s3 in vce 3.0 (V2)

2019-06-04 Thread S, Shirish
Thanks Christian.
Have sent the patch for uvd & vcn.
(https://patchwork.freedesktop.org/patch/308575/)


Regards,
Shirish S

-Original Message-
From: Christian König  
Sent: Tuesday, June 4, 2019 4:38 PM
To: S, Shirish ; Deucher, Alexander 
; Koenig, Christian ; 
jerry.zh...@amd.com; Deng, Emily ; Liu, Leo 

Cc: amd-gfx@lists.freedesktop.org; Li, Ching-shih (Louis) 

Subject: Re: [PATCH] drm/amdgpu: fix ring test failure issue during s3 in vce 
3.0 (V2)

Am 04.06.19 um 10:36 schrieb S, Shirish:
> From: Louis Li 
>
> [What]
> vce ring test fails consistently during resume in s3 cycle, due to 
> mismatch read & write pointers.
> On debug/analysis its found that rptr to be compared is not being 
> correctly updated/read, which leads to this failure.
> Below is the failure signature:
>   [drm:amdgpu_vce_ring_test_ring] *ERROR* amdgpu: ring 12 test failed
>   [drm:amdgpu_device_ip_resume_phase2] *ERROR* resume of IP block 
>  failed -110
>   [drm:amdgpu_device_resume] *ERROR* amdgpu_device_ip_resume failed 
> (-110).
>
> [How]
> fetch rptr appropriately, meaning move its read location further down 
> in the code flow.
> With this patch applied the s3 failure is no more seen for >5k s3 
> cycles, which otherwise is pretty consistent.
>
> V2: remove redundant fetch of rptr
>
> Signed-off-by: Louis Li 

Reviewed-by: Christian König 
CC: stable...

Who does the same patch for UVD and VCN? Exactly the same thing is wrong there 
as well.

Christian.

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c | 4 +++-
>   1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
> index c021b11..f7189e2 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
> @@ -1072,7 +1072,7 @@ void amdgpu_vce_ring_emit_fence(struct amdgpu_ring 
> *ring, u64 addr, u64 seq,
>   int amdgpu_vce_ring_test_ring(struct amdgpu_ring *ring)
>   {
>   struct amdgpu_device *adev = ring->adev;
> - uint32_t rptr = amdgpu_ring_get_rptr(ring);
> + uint32_t rptr;
>   unsigned i;
>   int r, timeout = adev->usec_timeout;
>   
> @@ -1084,6 +1084,8 @@ int amdgpu_vce_ring_test_ring(struct amdgpu_ring *ring)
>   if (r)
>   return r;
>   
> + rptr = amdgpu_ring_get_rptr(ring);
> +
>   amdgpu_ring_write(ring, VCE_CMD_END);
>   amdgpu_ring_commit(ring);
>   


[PATCH] drm/amdgpu/{uvd,vcn}: fetch ring's read_ptr after alloc

2019-06-04 Thread S, Shirish
[What]
readptr read always returns zero, since most likely
UVD block is either power or clock gated.

[How]
fetch rptr after amdgpu_ring_alloc() which informs
the power management code that the block is about to be
used and hence the gating is turned off.

Signed-off-by: Louis Li 
Signed-off-by: Shirish S 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 4 +++-
 drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c   | 5 -
 drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c   | 5 -
 3 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
index 118451f..d786098 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
@@ -468,7 +468,7 @@ int amdgpu_vcn_dec_ring_test_ib(struct amdgpu_ring *ring, 
long timeout)
 int amdgpu_vcn_enc_ring_test_ring(struct amdgpu_ring *ring)
 {
struct amdgpu_device *adev = ring->adev;
-   uint32_t rptr = amdgpu_ring_get_rptr(ring);
+   uint32_t rptr;
unsigned i;
int r;
 
@@ -476,6 +476,8 @@ int amdgpu_vcn_enc_ring_test_ring(struct amdgpu_ring *ring)
if (r)
return r;
 
+   rptr = amdgpu_ring_get_rptr(ring);
+
amdgpu_ring_write(ring, VCN_ENC_CMD_END);
amdgpu_ring_commit(ring);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c 
b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
index c61a314..16682b7 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
@@ -170,13 +170,16 @@ static void uvd_v6_0_enc_ring_set_wptr(struct amdgpu_ring 
*ring)
 static int uvd_v6_0_enc_ring_test_ring(struct amdgpu_ring *ring)
 {
struct amdgpu_device *adev = ring->adev;
-   uint32_t rptr = amdgpu_ring_get_rptr(ring);
+   uint32_t rptr;
unsigned i;
int r;
 
r = amdgpu_ring_alloc(ring, 16);
if (r)
return r;
+
+   rptr = amdgpu_ring_get_rptr(ring);
+
amdgpu_ring_write(ring, HEVC_ENC_CMD_END);
amdgpu_ring_commit(ring);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c 
b/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
index cdb96d4..74811b2 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
@@ -175,7 +175,7 @@ static void uvd_v7_0_enc_ring_set_wptr(struct amdgpu_ring 
*ring)
 static int uvd_v7_0_enc_ring_test_ring(struct amdgpu_ring *ring)
 {
struct amdgpu_device *adev = ring->adev;
-   uint32_t rptr = amdgpu_ring_get_rptr(ring);
+   uint32_t rptr;
unsigned i;
int r;
 
@@ -185,6 +185,9 @@ static int uvd_v7_0_enc_ring_test_ring(struct amdgpu_ring 
*ring)
r = amdgpu_ring_alloc(ring, 16);
if (r)
return r;
+
+   rptr = amdgpu_ring_get_rptr(ring);
+
amdgpu_ring_write(ring, HEVC_ENC_CMD_END);
amdgpu_ring_commit(ring);
 
-- 
2.7.4


[PATCH] drm/amdgpu: fix ring test failure issue during s3 in vce 3.0 (V2)

2019-06-04 Thread S, Shirish
From: Louis Li 

[What]
The vce ring test fails consistently during resume in the s3 cycle, due to
mismatched read & write pointers.
On debug/analysis it was found that the rptr to be compared is not being
correctly updated/read, which leads to this failure.
Below is the failure signature:
[drm:amdgpu_vce_ring_test_ring] *ERROR* amdgpu: ring 12 test failed
[drm:amdgpu_device_ip_resume_phase2] *ERROR* resume of IP block 
 failed -110
[drm:amdgpu_device_resume] *ERROR* amdgpu_device_ip_resume failed 
(-110).

[How]
Fetch the rptr appropriately, i.e., move its read location further down
in the code flow.
With this patch applied, the s3 failure is no longer seen for >5k s3 cycles,
whereas otherwise it is pretty consistent.

V2: remove redundant fetch of rptr

Signed-off-by: Louis Li 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
index c021b11..f7189e2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
@@ -1072,7 +1072,7 @@ void amdgpu_vce_ring_emit_fence(struct amdgpu_ring *ring, 
u64 addr, u64 seq,
 int amdgpu_vce_ring_test_ring(struct amdgpu_ring *ring)
 {
struct amdgpu_device *adev = ring->adev;
-   uint32_t rptr = amdgpu_ring_get_rptr(ring);
+   uint32_t rptr;
unsigned i;
int r, timeout = adev->usec_timeout;
 
@@ -1084,6 +1084,8 @@ int amdgpu_vce_ring_test_ring(struct amdgpu_ring *ring)
if (r)
return r;
 
+   rptr = amdgpu_ring_get_rptr(ring);
+
amdgpu_ring_write(ring, VCE_CMD_END);
amdgpu_ring_commit(ring);
 
-- 
2.7.4


[PATCH] drm/amdgpu: fix ring test failure issue during s3 in vce 3.0

2019-05-27 Thread S, Shirish
From: Louis Li 

[What]
The vce ring test fails consistently during resume in the s3 cycle, due to
mismatched read & write pointers.
On debug/analysis it was found that the rptr to be compared is not being
correctly updated/read, which leads to this failure.
Below is the failure signature:
[drm:amdgpu_vce_ring_test_ring] *ERROR* amdgpu: ring 12 test failed
[drm:amdgpu_device_ip_resume_phase2] *ERROR* resume of IP block 
 failed -110
[drm:amdgpu_device_resume] *ERROR* amdgpu_device_ip_resume failed 
(-110).

[How]
Fetch the rptr appropriately, i.e., move its read location further down
in the code flow.
With this patch applied, the s3 failure is no longer seen for >5k s3 cycles,
whereas otherwise it is pretty consistent.

Signed-off-by: Louis Li 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
index c021b11..92f9d46 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
@@ -1084,6 +1084,8 @@ int amdgpu_vce_ring_test_ring(struct amdgpu_ring *ring)
if (r)
return r;
 
+   rptr = amdgpu_ring_get_rptr(ring);
+
amdgpu_ring_write(ring, VCE_CMD_END);
amdgpu_ring_commit(ring);
 
-- 
2.7.4


Re: [PATCH] drm/amd/pp/smu10: log smu version and modify error logging in send_msg

2019-04-04 Thread S, Shirish

On 4/4/2019 8:58 PM, Alex Deucher wrote:
> On Thu, Apr 4, 2019 at 6:38 AM S, Shirish  wrote:
>> Signed-off-by: Shirish S 
> Please include a patch description.  Why are you you making this change?

I was not aware of the debugfs entry; I wish to abandon this patch.

I shall get back with a patch in case we need more info from send_msg
failures.

>
>> ---
>>   drivers/gpu/drm/amd/powerplay/smumgr/smu10_smumgr.c | 9 +++--
>>   1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/smu10_smumgr.c 
>> b/drivers/gpu/drm/amd/powerplay/smumgr/smu10_smumgr.c
>> index 6d11076a..373f384 100644
>> --- a/drivers/gpu/drm/amd/powerplay/smumgr/smu10_smumgr.c
>> +++ b/drivers/gpu/drm/amd/powerplay/smumgr/smu10_smumgr.c
>> @@ -85,7 +85,7 @@ static int smu10_send_msg_to_smc(struct pp_hwmgr *hwmgr, 
>> uint16_t msg)
>>  smu10_send_msg_to_smc_without_waiting(hwmgr, msg);
>>
>>  if (smu10_wait_for_response(hwmgr) == 0)
>> -   printk("Failed to send Message %x.\n", msg);
>> +   pr_err("%s Failed to send Message (0x%04x)\n", __func__, 
>> msg);
>>
>>  return 0;
>>   }
>> @@ -106,7 +106,7 @@ static int smu10_send_msg_to_smc_with_parameter(struct 
>> pp_hwmgr *hwmgr,
>>
>>
>>  if (smu10_wait_for_response(hwmgr) == 0)
>> -   printk("Failed to send Message %x.\n", msg);
>> +   pr_err("%s Failed to send Message (0x%04x)\n", __func__, 
>> msg);
>>
>>  return 0;
>>   }
> Are there any cases where these are harmless and can be ignored?
>
>> @@ -210,6 +210,11 @@ static int smu10_start_smu(struct pp_hwmgr *hwmgr)
>>
>>  smum_send_msg_to_smc(hwmgr, PPSMC_MSG_GetSmuVersion);
>>  hwmgr->smu_version = smu10_read_arg_from_smc(hwmgr);
>> +   pr_info("smu version %02d.%02d.%02d.%02d\n",
>> +   ((hwmgr->smu_version >> 24) & 0xFF),
>> +   ((hwmgr->smu_version >> 16) & 0xFF),
>> +   ((hwmgr->smu_version >> 8) & 0xFF),
>> +   (hwmgr->smu_version & 0xFF));
> Do we need to print this here?  Would it be better as a debug output?
> We already expose the smu firmware version via debugfs along with all
> of the other firmware versions.

Thanks Alex, its very useful information.

Regards,

Shirish S

>
> Alex
>
>>  adev->pm.fw_version = hwmgr->smu_version >> 8;
>>
>>  if (adev->rev_id < 0x8 && adev->pdev->device != 0x15d8 &&
>> --
>> 2.7.4
>>

-- 
Regards,
Shirish S


[PATCH] drm/amd/pp/smu10: log smu version and modify error logging in send_msg

2019-04-04 Thread S, Shirish
Signed-off-by: Shirish S 
---
 drivers/gpu/drm/amd/powerplay/smumgr/smu10_smumgr.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/smu10_smumgr.c 
b/drivers/gpu/drm/amd/powerplay/smumgr/smu10_smumgr.c
index 6d11076a..373f384 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/smu10_smumgr.c
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/smu10_smumgr.c
@@ -85,7 +85,7 @@ static int smu10_send_msg_to_smc(struct pp_hwmgr *hwmgr, 
uint16_t msg)
smu10_send_msg_to_smc_without_waiting(hwmgr, msg);
 
if (smu10_wait_for_response(hwmgr) == 0)
-   printk("Failed to send Message %x.\n", msg);
+   pr_err("%s Failed to send Message (0x%04x)\n", __func__, msg);
 
return 0;
 }
@@ -106,7 +106,7 @@ static int smu10_send_msg_to_smc_with_parameter(struct 
pp_hwmgr *hwmgr,
 
 
if (smu10_wait_for_response(hwmgr) == 0)
-   printk("Failed to send Message %x.\n", msg);
+   pr_err("%s Failed to send Message (0x%04x)\n", __func__, msg);
 
return 0;
 }
@@ -210,6 +210,11 @@ static int smu10_start_smu(struct pp_hwmgr *hwmgr)
 
smum_send_msg_to_smc(hwmgr, PPSMC_MSG_GetSmuVersion);
hwmgr->smu_version = smu10_read_arg_from_smc(hwmgr);
+   pr_info("smu version %02d.%02d.%02d.%02d\n",
+   ((hwmgr->smu_version >> 24) & 0xFF),
+   ((hwmgr->smu_version >> 16) & 0xFF),
+   ((hwmgr->smu_version >> 8) & 0xFF),
+   (hwmgr->smu_version & 0xFF));
adev->pm.fw_version = hwmgr->smu_version >> 8;
 
if (adev->rev_id < 0x8 && adev->pdev->device != 0x15d8 &&
-- 
2.7.4


RE: s2idle not working

2019-02-20 Thread S, Shirish
Please mail your query to amd-gfx@lists.freedesktop.org.



Regards,
Shirish S

From: shahul hameed 
Sent: Wednesday, February 20, 2019 3:14 PM
To: S, Shirish 
Subject: s2idle not working

Hi Shirish

I am porting Android N and kernel 4.19.2 to a Ryzen platform. The graphics
card is gfx9.
Porting is done successfully. Now I am working on suspend/resume.
Suspend/resume is working fine in deep mode,
but it is not working in s2idle (suspend-to-idle) mode.

During resume I found an error in the GPU driver:

[  105.862161] [drm:gfx_v9_0_hw_init [amdgpu]] *ERROR* KCQ enable failed 
(scratch(0xC040)=0xCAFEDEAD)
[  105.862187] [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of 
IP block  failed -22
[  105.862210] [drm:amdgpu_device_resume [amdgpu]] *ERROR* 
amdgpu_device_ip_resume failed (-22).
[  105.862215] dpm_run_callback(): pci_pm_resume+0x0/0xd6 returns -22
[  105.862345] PM: Device :03:00.0 failed to resume async: error -22

The same issue is reproduced with Ubuntu 16.04.

Can you please help me resolve this issue?
Thanks in Advance,
Regards,
Sk shahul.

Re: [PATCH] drm/amdgu/vce_v3: start vce block before ring test

2019-02-05 Thread S, Shirish

On 2/4/2019 9:00 PM, Liu, Leo wrote:
> On 2/4/19 7:49 AM, Koenig, Christian wrote:
>> Am 04.02.19 um 13:44 schrieb S, Shirish:
>>> vce ring test fails during resume since mmVCE_RB_RPTR*
>>> is not intitalized/updated.
>>>
>>> Hence start vce block before ring test.
>> Mhm, I wonder why this ever worked. But yeah, same problem seems to
>> exits for VCE 2 as well.
>>
>> Leo any comment on this?
> The UVD and VCE start functions were in hw_init originally, from bring-up,
> on all the HW. Later the DPM developer moved them to
> set_powergating_state() for some reason.
>
> @Shirish, are you sure the vce_v3_0_start() is not there?
>
> Just simply adding it back to hw_init, might break the DPM logic, so
> please make sure.

Sure Leo, I will check and get back.

Regards,

Shirish S

>
> Thanks,
>
> Leo
>
>
>> Thanks,
>> Christian.
>>
>>> Signed-off-by: Shirish S 
>>> ---
>>> * vce_v4_0.c's hw_init sequence already has this change.
>>>
>>> drivers/gpu/drm/amd/amdgpu/vce_v3_0.c | 4 
>>> 1 file changed, 4 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c 
>>> b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
>>> index 6ec65cf1..d809c10 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
>>> @@ -469,6 +469,10 @@ static int vce_v3_0_hw_init(void *handle)
>>> int r, i;
>>> struct amdgpu_device *adev = (struct amdgpu_device *)handle;
>>> 
>>> +   r = vce_v3_0_start(adev);
>>> +   if (r)
>>> +   return r;
>>> +
>>> vce_v3_0_override_vce_clock_gating(adev, true);
>>> 
>>> amdgpu_asic_set_vce_clocks(adev, 1, 1);

-- 
Regards,
Shirish S



[PATCH] drm/amdgu/vce_v3: start vce block before ring test

2019-02-04 Thread S, Shirish
The vce ring test fails during resume since mmVCE_RB_RPTR*
is not initialized/updated.

Hence, start the vce block before the ring test.

Signed-off-by: Shirish S 
---
* vce_v4_0.c's hw_init sequence already has this change.

 drivers/gpu/drm/amd/amdgpu/vce_v3_0.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c 
b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
index 6ec65cf1..d809c10 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
@@ -469,6 +469,10 @@ static int vce_v3_0_hw_init(void *handle)
int r, i;
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
+   r = vce_v3_0_start(adev);
+   if (r)
+   return r;
+
vce_v3_0_override_vce_clock_gating(adev, true);
 
amdgpu_asic_set_vce_clocks(adev, 1, 1);
-- 
2.7.4



[PATCH] drm/amd/display: Use context parameters to enable FBC

2019-02-04 Thread S, Shirish
[What]
FBC fails to get enabled when switching between LINEAR (console/VT)
and non-LINEAR (GUI) based rendering, due to the default value of the
tiling info stored in current_state, which is used to decide
whether or not to turn FBC on or off.

[How]
Use the context structure's tiling information, which is coherent with
the screen updates.

Signed-off-by: Shirish S 
---
 drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c 
b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
index db0ef41..fd7cd5b 100644
--- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
@@ -2535,7 +2535,7 @@ static void dce110_apply_ctx_for_surface(
}
 
if (dc->fbc_compressor)
-   enable_fbc(dc, dc->current_state);
+   enable_fbc(dc, context);
 }
 
 static void dce110_power_down_fe(struct dc *dc, struct pipe_ctx *pipe_ctx)
-- 
2.7.4



[PATCH] drm/amd/display: fix compiler errors [-Werror,-Wmissing-braces]

2018-12-20 Thread S, Shirish
Initializing structures with { } is known to be problematic since
it doesn't necessarily initialize all bytes in the presence of padding,
causing random failures when structures are compared with memcmp().

This patch fixes the structure-initialization compiler error by memsetting
the entire structure instead of initializing only its first element.

Signed-off-by: Shirish S 
---
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c   | 3 ++-
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer_debug.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c 
b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
index 0bd33a7..1b5630f 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
@@ -92,9 +92,10 @@ static void log_mpc_crc(struct dc *dc,
 void dcn10_log_hubbub_state(struct dc *dc, struct dc_log_buffer_ctx *log_ctx)
 {
struct dc_context *dc_ctx = dc->ctx;
-   struct dcn_hubbub_wm wm = {0};
+   struct dcn_hubbub_wm wm;
int i;
 
+   memset(, 0, sizeof(struct dcn_hubbub_wm));
hubbub1_wm_read_state(dc->res_pool->hubbub, );
 
DTN_INFO("HUBBUB WM:  data_urgent  pte_meta_urgent"
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer_debug.c 
b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer_debug.c
index cd46901..3fccec2 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer_debug.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer_debug.c
@@ -72,7 +72,7 @@ static unsigned int snprintf_count(char *pBuf, unsigned int 
bufSize, char *fmt,
 static unsigned int dcn10_get_hubbub_state(struct dc *dc, char *pBuf, unsigned 
int bufSize)
 {
struct dc_context *dc_ctx = dc->ctx;
-   struct dcn_hubbub_wm wm = {0};
+   struct dcn_hubbub_wm wm;
int i;
 
unsigned int chars_printed = 0;
@@ -81,6 +81,7 @@ static unsigned int dcn10_get_hubbub_state(struct dc *dc, 
char *pBuf, unsigned i
const uint32_t ref_clk_mhz = dc_ctx->dc->res_pool->ref_clock_inKhz / 
1000;
static const unsigned int frac = 1000;
 
+   memset(, 0, sizeof(struct dcn_hubbub_wm));
hubbub1_wm_read_state(dc->res_pool->hubbub, );
 
chars_printed = snprintf_count(pBuf, remaining_buffer, 
"wm_set_index,data_urgent,pte_meta_urgent,sr_enter,sr_exit,dram_clk_chanage\n");
-- 
2.7.4



Re: [PATCH] drm/amd/display: limit high pixel clock modes on ST/CZ

2018-11-27 Thread S, Shirish
This is for devices with Type-C ports.

In that case the signal type is the same, 32 (DISPLAY_PORT), for both HDMI and
DP monitors connected to the system via Type-C dongles/converters.

Regards,
Shirish S



From: Deucher, Alexander
Sent: Wednesday, November 28, 2018 12:06:26 AM
To: S, Shirish; Li, Sun peng (Leo); Wentland, Harry
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amd/display: limit high pixel clock modes on ST/CZ


Is this a DP to HDMI adapter?  I think 4k@60 should be valid on DP in general 
on ST/CZ, but Harry or Leo should comment.


Alex


From: S, Shirish
Sent: Tuesday, November 27, 2018 3:58:12 AM
To: Deucher, Alexander; Li, Sun peng (Leo); Wentland, Harry
Cc: amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/amd/display: limit high pixel clock modes on ST/CZ


However, while using Type-C connectors I noted that the signal type is actually
SIGNAL_TYPE_DISPLAY_PORT and found that this check was missing.

Hence have added the same in https://patchwork.freedesktop.org/patch/264033/



Regards,

Shirish S



From: S, Shirish
Sent: Tuesday, November 27, 2018 9:54 AM
To: Deucher, Alexander ; Li, Sun peng (Leo) 
; Wentland, Harry 
Cc: amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/amd/display: limit high pixel clock modes on ST/CZ



Thanks Alex, found that patch.

My patch is no more required.





Regards,

Shirish S



From: Deucher, Alexander 
Sent: Monday, November 26, 2018 7:46 PM
To: S, Shirish; Li, Sun peng (Leo); Wentland, Harry
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amd/display: limit high pixel clock modes on ST/CZ



I thought there was a patch to do this already that got sent out a few weeks 
ago.  Basically limit ST/CZ to modes that do not require a retimer.  Is an 
additional patch needed?



Alex



From: amd-gfx on behalf of S, Shirish
Sent: Monday, November 26, 2018 1:36:30 AM
To: Li, Sun peng (Leo); Wentland, Harry
Cc: amd-gfx@lists.freedesktop.org; S, Shirish
Subject: [PATCH] drm/amd/display: limit high pixel clock modes on ST/CZ



[Why]
ST/CZ (dce110) advertises modes, such as 4k@60Hz, that it cannot
handle correctly, resulting in several issues like flickering,
black lines/flashes and so on.

[How]
These are all high pixel clock modes, so stop advertising them
to avoid a bad user experience.

Signed-off-by: Shirish S mailto:shiris...@amd.com>>
Suggested-by: Harry Wentland 
mailto:harry.wentl...@amd.com>>
---
 .../gpu/drm/amd/display/dc/dce110/dce110_timing_generator.c| 10 ++
 1 file changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_timing_generator.c 
b/drivers/gpu/drm/amd/display/dc/dce110/dce110_timing_generator.c
index 1b2fe0d..1b8fe99 100644
--- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_timing_generator.c
+++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_timing_generator.c
@@ -1121,6 +1121,16 @@ bool dce110_timing_generator_validate_timing(
 if (!timing)
 return false;

+   /* Limit all modes that have a high pixel clock
+* which seems to be problematic on dce110
+* These include: 4k@60Hz, 1080p@144Hz,1440p@120Hz
+* based on the below formula:
+* refresh rate = pixel clock / (htotal * vtotal)
+*/
+   if (timing->pix_clk_khz > 300000)
+   return false;
+
+
 hsync_offset = timing->h_border_right + timing->h_front_porch;
 h_sync_start = timing->h_addressable + hsync_offset;

--
2.7.4

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
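The pixel-clock cutoff above can be sanity-checked against the formula quoted in 
the patch comment (refresh rate = pixel clock / (htotal * vtotal)). A small 
userspace sketch of that arithmetic; the CTA-861 timing totals mentioned in the 
note below are illustrative assumptions, not values taken from the driver:

```c
#include <stdint.h>

/*
 * refresh rate (Hz) = pixel clock (kHz) * 1000 / (htotal * vtotal),
 * mirroring the formula in the dce110 patch comment above.
 */
unsigned int refresh_hz(unsigned int pix_clk_khz,
                        unsigned int htotal, unsigned int vtotal)
{
    return (unsigned int)(((uint64_t)pix_clk_khz * 1000) /
                          ((uint64_t)htotal * vtotal));
}
```

With the CTA-861 totals for 4k@60 (594,000 kHz on a 4400x2250 total raster) this 
gives 60 Hz, and 1080p@120 (297,000 kHz, 2200x1125) gives 120 Hz. The 4k@60 
clock of 594,000 kHz is well above a 300,000 kHz cutoff, which is why that mode 
stops being advertised.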


RE: [PATCH] drm/amd/display: limit high pixel clock modes on ST/CZ

2018-11-27 Thread S, Shirish
However, while using Type-C connectors I noted that the signal type is actually 
SIGNAL_TYPE_DISPLAY_PORT and found that the check was missing.
Hence have added the same in https://patchwork.freedesktop.org/patch/264033/

Regards,
Shirish S

From: S, Shirish
Sent: Tuesday, November 27, 2018 9:54 AM
To: Deucher, Alexander ; Li, Sun peng (Leo) 
; Wentland, Harry 
Cc: amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/amd/display: limit high pixel clock modes on ST/CZ

Thanks Alex, found that patch.
My patch is no longer required.


Regards,
Shirish S

From: Deucher, Alexander 
mailto:alexander.deuc...@amd.com>>
Sent: Monday, November 26, 2018 7:46 PM
To: S, Shirish mailto:shiris...@amd.com>>; Li, Sun peng 
(Leo) mailto:sunpeng...@amd.com>>; Wentland, Harry 
mailto:harry.wentl...@amd.com>>
Cc: amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH] drm/amd/display: limit high pixel clock modes on ST/CZ


I thought there was a patch to do this already that got sent out a few weeks 
ago.  Basically limit ST/CZ to modes that do not require a retimer.  Is an 
additional patch needed?



Alex


From: amd-gfx 
mailto:amd-gfx-boun...@lists.freedesktop.org>>
 on behalf of S, Shirish mailto:shiris...@amd.com>>
Sent: Monday, November 26, 2018 1:36:30 AM
To: Li, Sun peng (Leo); Wentland, Harry
Cc: amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>; S, 
Shirish
Subject: [PATCH] drm/amd/display: limit high pixel clock modes on ST/CZ

[Why]
ST/CZ (dce110) advertises modes, such as 4k@60Hz, that it cannot
handle correctly, resulting in several issues like flickering,
black lines/flashes and so on.

[How]
These are all high pixel clock modes, so stop advertising them
to avoid a bad user experience.

Signed-off-by: Shirish S mailto:shiris...@amd.com>>
Suggested-by: Harry Wentland 
mailto:harry.wentl...@amd.com>>
---
 .../gpu/drm/amd/display/dc/dce110/dce110_timing_generator.c| 10 ++
 1 file changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_timing_generator.c 
b/drivers/gpu/drm/amd/display/dc/dce110/dce110_timing_generator.c
index 1b2fe0d..1b8fe99 100644
--- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_timing_generator.c
+++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_timing_generator.c
@@ -1121,6 +1121,16 @@ bool dce110_timing_generator_validate_timing(
 if (!timing)
 return false;

+   /* Limit all modes that have a high pixel clock
+* which seems to be problematic on dce110
+* These include: 4k@60Hz, 1080p@144Hz,1440p@120Hz
+* based on the below formula:
+* refresh rate = pixel clock / (htotal * vtotal)
+*/
+   if (timing->pix_clk_khz > 300000)
+   return false;
+
+
 hsync_offset = timing->h_border_right + timing->h_front_porch;
 h_sync_start = timing->h_addressable + hsync_offset;

--
2.7.4



[PATCH] drm/amd/display: Disable 4k@60 on DP as well for DCE11

2018-11-27 Thread S, Shirish
This patch extends the patch below to apply the same limit to the DP
signal type, for exactly the same reasons it was disabled for HDMI.

"1a0e348 drm/amd/display: Disable 4k 60 HDMI on DCE11"

Signed-off-by: Shirish S 
---
 drivers/gpu/drm/amd/display/dc/dce/dce_link_encoder.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_link_encoder.c 
b/drivers/gpu/drm/amd/display/dc/dce/dce_link_encoder.c
index 3e18ea8..d578828 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dce_link_encoder.c
+++ b/drivers/gpu/drm/amd/display/dc/dce/dce_link_encoder.c
@@ -662,6 +662,10 @@ bool dce110_link_encoder_validate_dp_output(
const struct dce110_link_encoder *enc110,
const struct dc_crtc_timing *crtc_timing)
 {
+   if (crtc_timing->pix_clk_khz >
+   enc110->base.features.max_hdmi_pixel_clock)
+   return false;
+
if (crtc_timing->pixel_encoding == PIXEL_ENCODING_YCBCR420)
return false;
 
-- 
2.7.4



RE: [PATCH] drm/amd/display: limit high pixel clock modes on ST/CZ

2018-11-26 Thread S, Shirish
Thanks Alex, found that patch.
My patch is no longer required.


Regards,
Shirish S

From: Deucher, Alexander 
Sent: Monday, November 26, 2018 7:46 PM
To: S, Shirish ; Li, Sun peng (Leo) ; 
Wentland, Harry 
Cc: amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amd/display: limit high pixel clock modes on ST/CZ


I thought there was a patch to do this already that got sent out a few weeks 
ago.  Basically limit ST/CZ to modes that do not require a retimer.  Is an 
additional patch needed?



Alex


From: amd-gfx 
mailto:amd-gfx-boun...@lists.freedesktop.org>>
 on behalf of S, Shirish mailto:shiris...@amd.com>>
Sent: Monday, November 26, 2018 1:36:30 AM
To: Li, Sun peng (Leo); Wentland, Harry
Cc: amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>; S, 
Shirish
Subject: [PATCH] drm/amd/display: limit high pixel clock modes on ST/CZ

[Why]
ST/CZ (dce110) advertises modes, such as 4k@60Hz, that it cannot
handle correctly, resulting in several issues like flickering,
black lines/flashes and so on.

[How]
These are all high pixel clock modes, so stop advertising them
to avoid a bad user experience.

Signed-off-by: Shirish S mailto:shiris...@amd.com>>
Suggested-by: Harry Wentland 
mailto:harry.wentl...@amd.com>>
---
 .../gpu/drm/amd/display/dc/dce110/dce110_timing_generator.c| 10 ++
 1 file changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_timing_generator.c 
b/drivers/gpu/drm/amd/display/dc/dce110/dce110_timing_generator.c
index 1b2fe0d..1b8fe99 100644
--- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_timing_generator.c
+++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_timing_generator.c
@@ -1121,6 +1121,16 @@ bool dce110_timing_generator_validate_timing(
 if (!timing)
 return false;

+   /* Limit all modes that have a high pixel clock
+* which seems to be problematic on dce110
+* These include: 4k@60Hz, 1080p@144Hz,1440p@120Hz
+* based on the below formula:
+* refresh rate = pixel clock / (htotal * vtotal)
+*/
+   if (timing->pix_clk_khz > 300000)
+   return false;
+
+
 hsync_offset = timing->h_border_right + timing->h_front_porch;
 h_sync_start = timing->h_addressable + hsync_offset;

--
2.7.4



[PATCH] drm/amdgpu: refactor smu8_send_msg_to_smc and WARN_ON time out

2018-11-12 Thread S, Shirish
From: Daniel Kurtz 

This patch refactors smu8_send_msg_to_smc_with_parameter() to include
smu8_send_msg_to_smc_async() so that all the messages sent to SMU can be
profiled and appropriately reported if they fail.

Signed-off-by: Daniel Kurtz 
Signed-off-by: Shirish S 
---
 drivers/gpu/drm/amd/powerplay/smumgr/smu8_smumgr.c | 45 ++
 1 file changed, 21 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/smu8_smumgr.c 
b/drivers/gpu/drm/amd/powerplay/smumgr/smu8_smumgr.c
index 09b844e..b6e8c89 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/smu8_smumgr.c
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/smu8_smumgr.c
@@ -24,6 +24,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -61,9 +62,13 @@ static uint32_t smu8_get_argument(struct pp_hwmgr *hwmgr)
mmSMU_MP1_SRBM2P_ARG_0);
 }
 
-static int smu8_send_msg_to_smc_async(struct pp_hwmgr *hwmgr, uint16_t msg)
+/* Send a message to the SMC, and wait for its response.*/
+static int smu8_send_msg_to_smc_with_parameter(struct pp_hwmgr *hwmgr,
+   uint16_t msg, uint32_t parameter)
 {
int result = 0;
+   ktime_t t_start;
+   s64 elapsed_us;
 
if (hwmgr == NULL || hwmgr->device == NULL)
return -EINVAL;
@@ -74,28 +79,31 @@ static int smu8_send_msg_to_smc_async(struct pp_hwmgr 
*hwmgr, uint16_t msg)
/* Read the last message to SMU, to report actual cause */
uint32_t val = cgs_read_register(hwmgr->device,
 mmSMU_MP1_SRBM2P_MSG_0);
-   pr_err("smu8_send_msg_to_smc_async (0x%04x) failed\n", msg);
-   pr_err("SMU still servicing msg (0x%04x)\n", val);
+   pr_err("%s(0x%04x) aborted; SMU still servicing msg (0x%04x)\n",
+   __func__, msg, val);
return result;
}
+   t_start = ktime_get();
+
+   cgs_write_register(hwmgr->device, mmSMU_MP1_SRBM2P_ARG_0, parameter);
 
cgs_write_register(hwmgr->device, mmSMU_MP1_SRBM2P_RESP_0, 0);
cgs_write_register(hwmgr->device, mmSMU_MP1_SRBM2P_MSG_0, msg);
 
-   return 0;
+   result = PHM_WAIT_FIELD_UNEQUAL(hwmgr,
+   SMU_MP1_SRBM2P_RESP_0, CONTENT, 0);
+
+   elapsed_us = ktime_us_delta(ktime_get(), t_start);
+
+   WARN(result, "%s(0x%04x, %#x) timed out after %lld us\n",
+   __func__, msg, parameter, elapsed_us);
+
+   return result;
 }
 
-/* Send a message to the SMC, and wait for its response.*/
 static int smu8_send_msg_to_smc(struct pp_hwmgr *hwmgr, uint16_t msg)
 {
-   int result = 0;
-
-   result = smu8_send_msg_to_smc_async(hwmgr, msg);
-   if (result != 0)
-   return result;
-
-   return PHM_WAIT_FIELD_UNEQUAL(hwmgr,
-   SMU_MP1_SRBM2P_RESP_0, CONTENT, 0);
+   return smu8_send_msg_to_smc_with_parameter(hwmgr, msg, 0);
 }
 
 static int smu8_set_smc_sram_address(struct pp_hwmgr *hwmgr,
@@ -135,17 +143,6 @@ static int smu8_write_smc_sram_dword(struct pp_hwmgr 
*hwmgr,
return result;
 }
 
-static int smu8_send_msg_to_smc_with_parameter(struct pp_hwmgr *hwmgr,
- uint16_t msg, uint32_t parameter)
-{
-   if (hwmgr == NULL || hwmgr->device == NULL)
-   return -EINVAL;
-
-   cgs_write_register(hwmgr->device, mmSMU_MP1_SRBM2P_ARG_0, parameter);
-
-   return smu8_send_msg_to_smc(hwmgr, msg);
-}
-
 static int smu8_check_fw_load_finish(struct pp_hwmgr *hwmgr,
   uint32_t firmware)
 {
-- 
2.7.4



[PATCH] drm/amdgpu: refactor smu8_send_msg_to_smc and WARN_ON time out

2018-11-12 Thread S, Shirish
From: Daniel Kurtz 

This patch refactors smu8_send_msg_to_smc_with_parameter() to include
smu8_send_msg_to_smc_async() so that all the messages sent to SMU can be
profiled and appropriately reported if they fail.

Signed-off-by: Daniel Kurtz 
Signed-off-by: Shirish S 
---
 drivers/gpu/drm/amd/powerplay/smumgr/smu8_smumgr.c | 45 ++
 1 file changed, 21 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/smu8_smumgr.c 
b/drivers/gpu/drm/amd/powerplay/smumgr/smu8_smumgr.c
index 09b844e..bf97abc 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/smu8_smumgr.c
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/smu8_smumgr.c
@@ -24,6 +24,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -61,9 +62,13 @@ static uint32_t smu8_get_argument(struct pp_hwmgr *hwmgr)
mmSMU_MP1_SRBM2P_ARG_0);
 }
 
-static int smu8_send_msg_to_smc_async(struct pp_hwmgr *hwmgr, uint16_t msg)
+/* Send a message to the SMC, and wait for its response.*/
+static int smu8_send_msg_to_smc_with_parameter(struct pp_hwmgr *hwmgr,
+   uint16_t msg, uint32_t parameter)
 {
int result = 0;
+   ktime_t t_start;
+   s64 elapsed_us;
 
if (hwmgr == NULL || hwmgr->device == NULL)
return -EINVAL;
@@ -74,28 +79,31 @@ static int smu8_send_msg_to_smc_async(struct pp_hwmgr 
*hwmgr, uint16_t msg)
/* Read the last message to SMU, to report actual cause */
uint32_t val = cgs_read_register(hwmgr->device,
 mmSMU_MP1_SRBM2P_MSG_0);
-   pr_err("smu8_send_msg_to_smc_async (0x%04x) failed\n", msg);
-   pr_err("SMU still servicing msg (0x%04x)\n", val);
+   pr_err("%s(0x%04x) aborted; SMU still servicing msg (0x%04x)\n",
+   __func__, msg, val);
return result;
}
+   t_start = ktime_get();
+
+   cgs_write_register(hwmgr->device, mmSMU_MP1_SRBM2P_ARG_0, parameter);
 
cgs_write_register(hwmgr->device, mmSMU_MP1_SRBM2P_RESP_0, 0);
cgs_write_register(hwmgr->device, mmSMU_MP1_SRBM2P_MSG_0, msg);
 
-   return 0;
+   result = PHM_WAIT_FIELD_UNEQUAL(hwmgr,
+   SMU_MP1_SRBM2P_RESP_0, CONTENT, 0);
+
+   elapsed_us = ktime_us_delta(ktime_get(), t_start);
+
+   WARN(result, "%s(0x%04x, %#x) timed out after %lld us\n",
+   __func__, msg, parameter, elapsed_us);
+
+   return result;
 }
 
-/* Send a message to the SMC, and wait for its response.*/
 static int smu8_send_msg_to_smc(struct pp_hwmgr *hwmgr, uint16_t msg)
 {
-   int result = 0;
-
-   result = smu8_send_msg_to_smc_async(hwmgr, msg);
-   if (result != 0)
-   return result;
-
-   return PHM_WAIT_FIELD_UNEQUAL(hwmgr,
-   SMU_MP1_SRBM2P_RESP_0, CONTENT, 0);
+   return smu8_send_msg_to_smc_with_parameter(hwmgr, msg, 0);
 }
 
 static int smu8_set_smc_sram_address(struct pp_hwmgr *hwmgr,
@@ -135,17 +143,6 @@ static int smu8_write_smc_sram_dword(struct pp_hwmgr 
*hwmgr,
return result;
 }
 
-static int smu8_send_msg_to_smc_with_parameter(struct pp_hwmgr *hwmgr,
- uint16_t msg, uint32_t parameter)
-{
-   if (hwmgr == NULL || hwmgr->device == NULL)
-   return -EINVAL;
-
-   cgs_write_register(hwmgr->device, mmSMU_MP1_SRBM2P_ARG_0, parameter);
-
-   return smu8_send_msg_to_smc(hwmgr, msg);
-}
-
 static int smu8_check_fw_load_finish(struct pp_hwmgr *hwmgr,
   uint32_t firmware)
 {
-- 
2.7.4



[PATCH] drm/amdgpu: log smu version

2018-11-11 Thread S, Shirish
This patch prints the version of SMU firmware.

Signed-off-by: Shirish S 
---
 drivers/gpu/drm/amd/powerplay/smumgr/smu8_smumgr.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/smu8_smumgr.c 
b/drivers/gpu/drm/amd/powerplay/smumgr/smu8_smumgr.c
index 09b844e..1439835 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/smu8_smumgr.c
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/smu8_smumgr.c
@@ -737,6 +737,10 @@ static int smu8_start_smu(struct pp_hwmgr *hwmgr)
 
cgs_write_register(hwmgr->device, mmMP0PUB_IND_INDEX, index);
hwmgr->smu_version = cgs_read_register(hwmgr->device, 
mmMP0PUB_IND_DATA);
+   pr_info("smu version %02d.%02d.%02d\n",
+   ((hwmgr->smu_version >> 16) & 0xFF),
+   ((hwmgr->smu_version >> 8) & 0xFF),
+   (hwmgr->smu_version & 0xFF));
adev->pm.fw_version = hwmgr->smu_version >> 8;
 
return smu8_request_smu_load_fw(hwmgr);
-- 
2.7.4

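The pr_info() added above unpacks the raw version word into major.minor.patch by 
byte. A quick userspace mirror of that decoding; the sample register value used 
below is hypothetical:

```c
#include <stdint.h>

/* Byte fields match the shifts in smu8_start_smu():
 * major = bits [23:16], minor = bits [15:8], patch = bits [7:0]. */
unsigned int smu_version_major(uint32_t v) { return (v >> 16) & 0xFF; }
unsigned int smu_version_minor(uint32_t v) { return (v >> 8) & 0xFF; }
unsigned int smu_version_patch(uint32_t v) { return v & 0xFF; }
```

For instance, a raw word of 0x00251A03 decodes to 37.26.03. Note that the 
existing assignment adev->pm.fw_version = hwmgr->smu_version >> 8 keeps only the 
major and minor fields.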


[PATCH] drm/amdgpu: fix reporting of failed msg sent to SMU

2018-10-25 Thread S, Shirish
Currently send_msg_to_smc_async() only reports which message
failed, but the actual failing message is the previous one,
which the SMU is unable to service.

This patch reads the contents of the register where the SMU is
stuck and reports it appropriately.

Signed-off-by: Shirish S 
---
 drivers/gpu/drm/amd/powerplay/smumgr/smu8_smumgr.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/smu8_smumgr.c 
b/drivers/gpu/drm/amd/powerplay/smumgr/smu8_smumgr.c
index f836d30..b1007b8 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/smu8_smumgr.c
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/smu8_smumgr.c
@@ -64,6 +64,7 @@ static uint32_t smu8_get_argument(struct pp_hwmgr *hwmgr)
 static int smu8_send_msg_to_smc_async(struct pp_hwmgr *hwmgr, uint16_t msg)
 {
int result = 0;
+   uint32_t val;
 
if (hwmgr == NULL || hwmgr->device == NULL)
return -EINVAL;
@@ -71,7 +72,11 @@ static int smu8_send_msg_to_smc_async(struct pp_hwmgr 
*hwmgr, uint16_t msg)
result = PHM_WAIT_FIELD_UNEQUAL(hwmgr,
SMU_MP1_SRBM2P_RESP_0, CONTENT, 0);
if (result != 0) {
+   /* Read the last message to SMU, to report actual cause */
+   val = cgs_read_register(hwmgr->device,
+   mmSMU_MP1_SRBM2P_MSG_0);
pr_err("smu8_send_msg_to_smc_async (0x%04x) failed\n", msg);
+   pr_err("SMU still servicing msg (0x%04x)\n", val);
return result;
}
 
-- 
2.7.4



[PATCH] Revert "drm/amd/powerplay: Enable/Disable NBPSTATE on On/OFF of UVD"

2018-10-25 Thread S, Shirish
This reverts commit dbd8299c32f6f413f6cfe322fe0308f3cfc577e8.

Reason for revert:
This patch sends the msg PPSMC_MSG_DisableLowMemoryPstate (0x002e)
to the SMU out of sequence, i.e. before PPSMC_MSG_UVDPowerON (0x0008).
This leads to the SMU failing to service the request, as it depends
on UVD being powered ON, since the message handler accesses UVD
registers.

This msg should ideally be sent only when the UVD is about to decode
a 4k video.

Signed-off-by: Shirish S 
---
 drivers/gpu/drm/amd/powerplay/hwmgr/smu8_hwmgr.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu8_hwmgr.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/smu8_hwmgr.c
index fef111d..53cf787 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu8_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu8_hwmgr.c
@@ -1228,17 +1228,14 @@ static int smu8_dpm_force_dpm_level(struct pp_hwmgr 
*hwmgr,
 
 static int smu8_dpm_powerdown_uvd(struct pp_hwmgr *hwmgr)
 {
-   if (PP_CAP(PHM_PlatformCaps_UVDPowerGating)) {
-   smu8_nbdpm_pstate_enable_disable(hwmgr, true, true);
+   if (PP_CAP(PHM_PlatformCaps_UVDPowerGating))
return smum_send_msg_to_smc(hwmgr, PPSMC_MSG_UVDPowerOFF);
-   }
return 0;
 }
 
 static int smu8_dpm_powerup_uvd(struct pp_hwmgr *hwmgr)
 {
if (PP_CAP(PHM_PlatformCaps_UVDPowerGating)) {
-   smu8_nbdpm_pstate_enable_disable(hwmgr, false, true);
return smum_send_msg_to_smc_with_parameter(
hwmgr,
PPSMC_MSG_UVDPowerON,
-- 
2.7.4



RE: [PATCH] drm/amdgpu: Poweroff uvd/vce/vcn/acp block if they were disabled by user

2018-10-16 Thread S, Shirish
Yes.
Are there any dependent patches?

Regards,
Shirish S

From: Zhu, Rex
Sent: Tuesday, October 16, 2018 12:29 PM
To: S, Shirish ; Deucher, Alexander 
; amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/amdgpu: Poweroff uvd/vce/vcn/acp block if they were 
disabled by user

The code base is different.
And do you set ip_block_mask = 0xeff to disable vce?

Best Regards
Rex

From: S, Shirish
Sent: Wednesday, October 17, 2018 1:25 AM
To: Zhu, Rex mailto:rex@amd.com>>; Deucher, Alexander 
mailto:alexander.deuc...@amd.com>>; 
amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>
Subject: RE: [PATCH] drm/amdgpu: Poweroff uvd/vce/vcn/acp block if they were 
disabled by user

Nope, I have just cherry-picked this patch.
Is there any dependency?

Regards,
Shirish S

From: Zhu, Rex
Sent: Tuesday, October 16, 2018 12:15 PM
To: S, Shirish mailto:shiris...@amd.com>>; Deucher, 
Alexander mailto:alexander.deuc...@amd.com>>; 
amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>
Subject: RE: [PATCH] drm/amdgpu: Poweroff uvd/vce/vcn/acp block if they were 
disabled by user

Do you test based on drm-next branch?

Best Regards
Rex

From: S, Shirish
Sent: Wednesday, October 17, 2018 1:12 AM
To: Deucher, Alexander 
mailto:alexander.deuc...@amd.com>>; Zhu, Rex 
mailto:rex@amd.com>>; 
amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>
Subject: RE: [PATCH] drm/amdgpu: Poweroff uvd/vce/vcn/acp block if they were 
disabled by user

This patch fails on the very first resume as below:
[   53.632732] [drm:vce_v3_0_set_powergating_state] *ERROR* VCE not responding, 
trying to reset the ECPU!!!
[   54.653212] [drm:vce_v3_0_set_powergating_state] *ERROR* VCE not responding, 
trying to reset the ECPU!!!
[   55.673692] [drm:vce_v3_0_set_powergating_state] *ERROR* VCE not responding, 
trying to reset the ECPU!!!
[   56.694203] [drm:vce_v3_0_set_powergating_state] *ERROR* VCE not responding, 
trying to reset the ECPU!!!
[   57.714683] [drm:vce_v3_0_set_powergating_state] *ERROR* VCE not responding, 
trying to reset the ECPU!!!
[   58.735164] [drm:vce_v3_0_set_powergating_state] *ERROR* VCE not responding, 
trying to reset the ECPU!!!
[   59.755643] [drm:vce_v3_0_set_powergating_state] *ERROR* VCE not responding, 
trying to reset the ECPU!!!
[   60.776124] [drm:vce_v3_0_set_powergating_state] *ERROR* VCE not responding, 
trying to reset the ECPU!!!
[   61.796608] [drm:vce_v3_0_set_powergating_state] *ERROR* VCE not responding, 
trying to reset the ECPU!!!
[   62.817092] [drm:vce_v3_0_set_powergating_state] *ERROR* VCE not responding, 
trying to reset the ECPU!!!
[   62.837108] [drm:vce_v3_0_set_powergating_state] *ERROR* VCE not responding, 
giving up!!!
[   62.837112] Power gating vce_v3_0 failed
[   62.837118] [drm:amdgpu_device_ip_suspend_phase1] *ERROR* 
set_powergating_state(gate) of IP block  failed -110

I believe there is some more work left to be done with it.

Regards,
Shirish S

From: amd-gfx 
mailto:amd-gfx-boun...@lists.freedesktop.org>>
 On Behalf Of Deucher, Alexander
Sent: Tuesday, October 16, 2018 9:44 AM
To: Zhu, Rex mailto:rex@amd.com>>; 
amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH] drm/amdgpu: Poweroff uvd/vce/vcn/acp block if they were 
disabled by user


Reviewed-by: Alex Deucher 
mailto:alexander.deuc...@amd.com>>


From: amd-gfx 
mailto:amd-gfx-boun...@lists.freedesktop.org>>
 on behalf of Rex Zhu mailto:rex@amd.com>>
Sent: Tuesday, October 16, 2018 1:33:46 AM
To: amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>; 
Deucher, Alexander
Cc: Zhu, Rex
Subject: [PATCH] drm/amdgpu: Poweroff uvd/vce/vcn/acp block if they were 
disabled by user

If the user disables the uvd/vce/vcn/acp blocks via the module
parameter ip_block_mask, the driver powers off those blocks to
save power.

Signed-off-by: Rex Zhu mailto:rex@amd.com>>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 1e4dd09..3ffee08 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1774,6 +1774,24 @@ static int amdgpu_device_set_pg_state(struct 
amdgpu_device *adev, enum amd_power

 for (j = 0; j < adev->num_ip_blocks; j++) {
 i = state == AMD_PG_STATE_GATE ? j : adev->num_ip_blocks - j - 
1;
+
+   /* try to power off VCE/UVD/VCN/ACP if they were disabled by 
user */
+   if ((adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_UVD 
||
+   adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_VCE ||
+   adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_VCN ||
+   adev->i
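On Rex's question about ip_block_mask = 0xeff: the module parameter is treated 
as a per-IP-block bitfield, one bit per block index, so 0xeff clears bit 8 and 
disables whichever block registered at that index. A sketch of the check; that 
VCE sits at index 8 is an assumption specific to this thread's ASIC, not a 
stable mapping:

```c
#include <stdint.h>

/* amdgpu enables IP block i only if bit i of the mask is set;
 * index 8 being VCE is an assumption for this example. */
int ip_block_enabled(uint32_t ip_block_mask, unsigned int index)
{
    return (ip_block_mask >> index) & 1;
}
```

So with mask 0xeff, block 8 reports disabled while block 0 stays enabled.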

RE: [PATCH] drm/amdgpu: Poweroff uvd/vce/vcn/acp block if they were disabled by user

2018-10-16 Thread S, Shirish
Nope, I have just cherry-picked this patch.
Is there any dependency?

Regards,
Shirish S

From: Zhu, Rex
Sent: Tuesday, October 16, 2018 12:15 PM
To: S, Shirish ; Deucher, Alexander 
; amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] drm/amdgpu: Poweroff uvd/vce/vcn/acp block if they were 
disabled by user

Do you test based on drm-next branch?

Best Regards
Rex

From: S, Shirish
Sent: Wednesday, October 17, 2018 1:12 AM
To: Deucher, Alexander 
mailto:alexander.deuc...@amd.com>>; Zhu, Rex 
mailto:rex@amd.com>>; 
amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>
Subject: RE: [PATCH] drm/amdgpu: Poweroff uvd/vce/vcn/acp block if they were 
disabled by user

This patch fails on the very first resume as below:
[   53.632732] [drm:vce_v3_0_set_powergating_state] *ERROR* VCE not responding, 
trying to reset the ECPU!!!
[   54.653212] [drm:vce_v3_0_set_powergating_state] *ERROR* VCE not responding, 
trying to reset the ECPU!!!
[   55.673692] [drm:vce_v3_0_set_powergating_state] *ERROR* VCE not responding, 
trying to reset the ECPU!!!
[   56.694203] [drm:vce_v3_0_set_powergating_state] *ERROR* VCE not responding, 
trying to reset the ECPU!!!
[   57.714683] [drm:vce_v3_0_set_powergating_state] *ERROR* VCE not responding, 
trying to reset the ECPU!!!
[   58.735164] [drm:vce_v3_0_set_powergating_state] *ERROR* VCE not responding, 
trying to reset the ECPU!!!
[   59.755643] [drm:vce_v3_0_set_powergating_state] *ERROR* VCE not responding, 
trying to reset the ECPU!!!
[   60.776124] [drm:vce_v3_0_set_powergating_state] *ERROR* VCE not responding, 
trying to reset the ECPU!!!
[   61.796608] [drm:vce_v3_0_set_powergating_state] *ERROR* VCE not responding, 
trying to reset the ECPU!!!
[   62.817092] [drm:vce_v3_0_set_powergating_state] *ERROR* VCE not responding, 
trying to reset the ECPU!!!
[   62.837108] [drm:vce_v3_0_set_powergating_state] *ERROR* VCE not responding, 
giving up!!!
[   62.837112] Power gating vce_v3_0 failed
[   62.837118] [drm:amdgpu_device_ip_suspend_phase1] *ERROR* 
set_powergating_state(gate) of IP block  failed -110

I believe there is some more work left to be done with it.

Regards,
Shirish S

From: amd-gfx 
mailto:amd-gfx-boun...@lists.freedesktop.org>>
 On Behalf Of Deucher, Alexander
Sent: Tuesday, October 16, 2018 9:44 AM
To: Zhu, Rex mailto:rex@amd.com>>; 
amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH] drm/amdgpu: Poweroff uvd/vce/vcn/acp block if they were 
disabled by user


Reviewed-by: Alex Deucher 
mailto:alexander.deuc...@amd.com>>


From: amd-gfx 
mailto:amd-gfx-boun...@lists.freedesktop.org>>
 on behalf of Rex Zhu mailto:rex@amd.com>>
Sent: Tuesday, October 16, 2018 1:33:46 AM
To: amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>; 
Deucher, Alexander
Cc: Zhu, Rex
Subject: [PATCH] drm/amdgpu: Poweroff uvd/vce/vcn/acp block if they were 
disabled by user

If the user disables the uvd/vce/vcn/acp blocks via the module
parameter ip_block_mask, the driver powers off those blocks to
save power.

Signed-off-by: Rex Zhu mailto:rex@amd.com>>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 1e4dd09..3ffee08 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1774,6 +1774,24 @@ static int amdgpu_device_set_pg_state(struct 
amdgpu_device *adev, enum amd_power

 for (j = 0; j < adev->num_ip_blocks; j++) {
 i = state == AMD_PG_STATE_GATE ? j : adev->num_ip_blocks - j - 
1;
+
+   /* try to power off VCE/UVD/VCN/ACP if they were disabled by 
user */
+   if ((adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_UVD 
||
+   adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_VCE ||
+   adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_VCN ||
+   adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_ACP) 
&&
+   adev->ip_blocks[i].version->funcs->set_powergating_state) {
+   if (!adev->ip_blocks[i].status.valid) {
+   r = 
adev->ip_blocks[i].version->funcs->set_powergating_state((void *)adev,
+   
state);
+   if (r) {
+   DRM_ERROR("set_powergating_state(gate) 
of IP block <%s> failed %d\n",
+ 
adev->ip_blocks[i].version->funcs->name, r);
+   return r;
+   }
+   }
+   }
+
 

RE: [PATCH] drm/amdgpu: Poweroff uvd/vce/vcn/acp block if they were disabled by user

2018-10-16 Thread S, Shirish
This patch fails on the very first resume as below:
[   53.632732] [drm:vce_v3_0_set_powergating_state] *ERROR* VCE not responding, 
trying to reset the ECPU!!!
[   54.653212] [drm:vce_v3_0_set_powergating_state] *ERROR* VCE not responding, 
trying to reset the ECPU!!!
[   55.673692] [drm:vce_v3_0_set_powergating_state] *ERROR* VCE not responding, 
trying to reset the ECPU!!!
[   56.694203] [drm:vce_v3_0_set_powergating_state] *ERROR* VCE not responding, 
trying to reset the ECPU!!!
[   57.714683] [drm:vce_v3_0_set_powergating_state] *ERROR* VCE not responding, 
trying to reset the ECPU!!!
[   58.735164] [drm:vce_v3_0_set_powergating_state] *ERROR* VCE not responding, 
trying to reset the ECPU!!!
[   59.755643] [drm:vce_v3_0_set_powergating_state] *ERROR* VCE not responding, 
trying to reset the ECPU!!!
[   60.776124] [drm:vce_v3_0_set_powergating_state] *ERROR* VCE not responding, 
trying to reset the ECPU!!!
[   61.796608] [drm:vce_v3_0_set_powergating_state] *ERROR* VCE not responding, 
trying to reset the ECPU!!!
[   62.817092] [drm:vce_v3_0_set_powergating_state] *ERROR* VCE not responding, 
trying to reset the ECPU!!!
[   62.837108] [drm:vce_v3_0_set_powergating_state] *ERROR* VCE not responding, 
giving up!!!
[   62.837112] Power gating vce_v3_0 failed
[   62.837118] [drm:amdgpu_device_ip_suspend_phase1] *ERROR* 
set_powergating_state(gate) of IP block  failed -110

I believe there is some more work left to be done with it.

Regards,
Shirish S

From: amd-gfx  On Behalf Of Deucher, 
Alexander
Sent: Tuesday, October 16, 2018 9:44 AM
To: Zhu, Rex ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: Poweroff uvd/vce/vcn/acp block if they were 
disabled by user


Reviewed-by: Alex Deucher 
mailto:alexander.deuc...@amd.com>>


From: amd-gfx 
mailto:amd-gfx-boun...@lists.freedesktop.org>>
 on behalf of Rex Zhu mailto:rex@amd.com>>
Sent: Tuesday, October 16, 2018 1:33:46 AM
To: amd-gfx@lists.freedesktop.org; 
Deucher, Alexander
Cc: Zhu, Rex
Subject: [PATCH] drm/amdgpu: Poweroff uvd/vce/vcn/acp block if they were 
disabled by user

If the user disables the uvd/vce/vcn/acp blocks via the module
parameter ip_block_mask, the driver powers off those blocks to
save power.

Signed-off-by: Rex Zhu mailto:rex@amd.com>>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 1e4dd09..3ffee08 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1774,6 +1774,24 @@ static int amdgpu_device_set_pg_state(struct 
amdgpu_device *adev, enum amd_power

 for (j = 0; j < adev->num_ip_blocks; j++) {
 i = state == AMD_PG_STATE_GATE ? j : adev->num_ip_blocks - j - 
1;
+
+   /* try to power off VCE/UVD/VCN/ACP if they were disabled by 
user */
+   if ((adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_UVD 
||
+   adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_VCE ||
+   adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_VCN ||
+   adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_ACP) 
&&
+   adev->ip_blocks[i].version->funcs->set_powergating_state) {
+   if (!adev->ip_blocks[i].status.valid) {
+   r = 
adev->ip_blocks[i].version->funcs->set_powergating_state((void *)adev,
+   
state);
+   if (r) {
+   DRM_ERROR("set_powergating_state(gate) 
of IP block <%s> failed %d\n",
+ 
adev->ip_blocks[i].version->funcs->name, r);
+   return r;
+   }
+   }
+   }
+
 if (!adev->ip_blocks[i].status.late_initialized)
 continue;
 /* skip CG for VCE/UVD, it's handled specially */
@@ -1791,6 +1809,7 @@ static int amdgpu_device_set_pg_state(struct 
amdgpu_device *adev, enum amd_power
 }
 }
 }
+
 return 0;
 }

--
1.9.1

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amdgpu: skip IB tests for KIQ in general

2018-10-04 Thread S, Shirish



On 10/4/2018 12:41 PM, Christian König wrote:

Am 03.10.2018 um 17:15 schrieb Shirish S:

From: Pratik Vishwakarma 

[Why]
1. We never submit IBs to KIQ.
2. Ring test pass without KIQ's ring also.
3. By skipping we see an improvement of around 500ms
    in the amdgpu's resume time.

[How]
skip IB tests for KIQ ring type.

Signed-off-by: Shirish S 
Signed-off-by: Pratik Vishwakarma 


Well I'm not sure if that is a good idea or not.

On the one hand it is true that we never submit IBs to the KIQ, so 
testing that doesn't make much sense actually.


But on the other hand the 500ms delay during resume points out a 
problem with the KIQ, e.g. interrupts are not working correctly!


Question is now if we should ignore that problem because we never use 
interrupts on the KIQ?


Yes Christian, that's the approach, as there's no point in fixing something we 
never use.
If the answer is to keep it as it is we should remove the interrupt 
handling for the KIQ as well.


I have sent a patch that shall remove interrupt handling for KIQ, please 
review.


Regards,
Shirish S
Otherwise I would say we should fix interrupts on the KIQ and then we 
also don't need this change any more.


Regards,
Christian.


---

This patch is a follow-up to the suggestion given by Alex,
while reviewing the patch: 
https://patchwork.freedesktop.org/patch/250912/


-Shirish S

  drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 8 
  1 file changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c

index 47817e0..b8963b7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
@@ -354,6 +354,14 @@ int amdgpu_ib_ring_tests(struct amdgpu_device 
*adev)

  if (!ring || !ring->ready)
  continue;
  +    /* skip IB tests for KIQ in general for the below reasons:
+ * 1. We never submit IBs to the KIQ
+ * 2. KIQ doesn't use the EOP interrupts,
+ *    we use some other CP interrupt.
+ */
+    if (ring->funcs->type == AMDGPU_RING_TYPE_KIQ)
+    continue;
+
  /* MM engine need more time */
  if (ring->funcs->type == AMDGPU_RING_TYPE_UVD ||
  ring->funcs->type == AMDGPU_RING_TYPE_VCE ||




--
Regards,
Shirish S



RE: [PATCH] drm/amd/display: Work around race between hw_done and wait_for_flip

2018-10-01 Thread S, Shirish
This workaround is not fixing the issue.

Regards,
Shirish S

-Original Message-
From: amd-gfx  On Behalf Of Harry 
Wentland
Sent: Friday, September 28, 2018 6:46 PM
To: amd-gfx@lists.freedesktop.org; S, Shirish ; Li, Sun peng 
(Leo) 
Cc: Wentland, Harry 
Subject: [PATCH] drm/amd/display: Work around race between hw_done and 
wait_for_flip

[Why]
There is a race between drm_crtc_commit_hw_done and 
drm_atomic_helper_wait_for_flip where the possibility exists for the
crtc->commit to be cleared after the NULL check in wait_for_flip_done
but before the call to wait_for_completion_timeout on commit->flip_done.

[How]
Take a reference to all commits in the state before drm_crtc_commit_hw_done is 
called and release those after drm_atomic_helper_wait_for_flip has finished.

Signed-off-by: Harry Wentland 
---

Would something like this work? I get the strong sense that this happens 
because Intel and IMX use the helpers in the other order, hence the dependency 
between hw_done and wait_for_flip was missed.

I'd rather make it obvious that there's (a) no reason to reorder these two 
calls on AMD HW (other than this unexpected dependency) and (b) this is 
something we'll probably want to fix in DRM.

Sorry it took me a while to understand what was happening here. Been busy at 
XDC until Jordan and Leo reminded me to take another look.

Harry

 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 0f10d920a785..ed9a7d680b63 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4626,12 +4626,27 @@ static void amdgpu_dm_atomic_commit_tail(struct 
drm_atomic_state *state)
}
spin_unlock_irqrestore(&adev->ddev->event_lock, flags);
 
+   /*
+* WORKAROUND: Take a ref for all crtc_state commits to avoid
+* a race where the commit gets freed before commit_hw_done had
+* a chance to look for commit->flip_done
+*/
+   for_each_new_crtc_in_state(state, crtc, new_crtc_state, i)
+   drm_crtc_commit_get(new_crtc_state->commit);
+
/* Signal HW programming completion */
drm_atomic_helper_commit_hw_done(state);
 
if (wait_for_vblank)
drm_atomic_helper_wait_for_flip_done(dev, state);
 
+   /*
+* WORKAROUND: put the commit refs from above (see comment on
+* the drm_crtc_commit_get call above)
+*/
+   for_each_new_crtc_in_state(state, crtc, new_crtc_state, i)
+   drm_crtc_commit_put(new_crtc_state->commit);
+
drm_atomic_helper_cleanup_planes(dev, state);
 
/*
--
2.17.1



Re: [PATCH] drm/amdgpu/vce_v3: skip suspend and resume if powergated

2018-08-10 Thread S, Shirish



On 8/10/2018 12:02 PM, Zhu, Rex wrote:


I am Ok with the check when call vce_v3_0_hw_fini.


But we may still need to call amdgpu_vce_suspend/resume.


Done in V2. Have moved the check such that both are executed.
Regards,
Shirish S



And I am not sure whether we need to do a ring test when resuming back.


Best Regards

Rex


*From:* S, Shirish
*Sent:* Friday, August 10, 2018 2:15 PM
*To:* Deucher, Alexander; Zhu, Rex; Liu, Leo
*Cc:* amd-gfx@lists.freedesktop.org; S, Shirish
*Subject:* [PATCH] drm/amdgpu/vce_v3: skip suspend and resume if 
powergated

This patch adds a mechanism by which the VCE 3.0 block
shall check if it was enabled or in use before suspending:
if it was powergated while entering suspend, then there
is no need to repeat it in vce_v3_0_suspend().
Similarly, if the block was powergated while entering suspend
itself then there is no need to resume it.

By this we not only make the suspend and resume sequence
more efficient, but also optimize the overall amdgpu suspend
and resume time by reducing the ring initialization and tests
for unused IP blocks.

Signed-off-by: Shirish S 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h   |  2 ++
 drivers/gpu/drm/amd/amdgpu/vce_v3_0.c | 21 +
 2 files changed, 23 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h

index 07924d4..aa85063 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1035,6 +1035,8 @@ struct amdgpu_device {

 /* vce */
 struct amdgpu_vce   vce;
+   bool    is_vce_pg;
+   bool is_vce_disabled;

 /* vcn */
 struct amdgpu_vcn   vcn;
diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c 
b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c

index cc6ce6c..822cfd6 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v3_0.c
@@ -326,6 +326,7 @@ static int vce_v3_0_start(struct amdgpu_device *adev)
 WREG32(mmGRBM_GFX_INDEX, mmGRBM_GFX_INDEX_DEFAULT);
 mutex_unlock(&adev->grbm_idx_mutex);

+   adev->is_vce_pg = false;
 return 0;
 }

@@ -355,6 +356,7 @@ static int vce_v3_0_stop(struct amdgpu_device *adev)
 WREG32(mmGRBM_GFX_INDEX, mmGRBM_GFX_INDEX_DEFAULT);
 mutex_unlock(&adev->grbm_idx_mutex);

+   adev->is_vce_pg = true;
 return 0;
 }

@@ -506,6 +508,17 @@ static int vce_v3_0_suspend(void *handle)
 int r;
 struct amdgpu_device *adev = (struct amdgpu_device *)handle;

+   /* Proceed with the suspend sequence only if VCE is started.
+    * Mark the block as being disabled if it's stopped.
+    */
+   if (adev->is_vce_pg) {
+   DRM_DEBUG("VCE is already powergated, not suspending\n");
+   adev->is_vce_disabled = true;
+   return 0;
+   }
+
+   adev->is_vce_disabled = false;
+
 r = vce_v3_0_hw_fini(adev);
 if (r)
 return r;
@@ -518,6 +531,14 @@ static int vce_v3_0_resume(void *handle)
 int r;
 struct amdgpu_device *adev = (struct amdgpu_device *)handle;

+   /* Proceed with resume sequence if VCE was enabled
+    * while suspending.
+    */
+   if (adev->is_vce_disabled) {
+   DRM_DEBUG("VCE is powergated, not resuming the block\n");
+   return 0;
+   }
+
 r = amdgpu_vce_resume(adev);
 if (r)
 return r;
--
2.7.4



--
Regards,
Shirish S



Re: [PATCH] drm/amdgpu: reciprocate amdgpu.._resume sequence to match amdgpu.._suspend

2018-07-18 Thread S, Shirish



On 7/18/2018 3:30 PM, Michel Dänzer wrote:

On 2018-07-18 11:40 AM, Shirish S wrote:

[Why]
1. To ensure that resume path reciprocates the sequence followed during
suspend.
2. While the console_lock is held, console output will be buffered; until
it's unlocked it won't be emitted, hence it's ideal to unlock sooner to enable
debugging/detecting/fixing of any issue in the remaining sequence of events
in the resume path.

[How]
This patch restructures the console_lock, console_unlock around
amdgpu_fbdev_set_suspend() and moves this new block to the very beginning
of the resume sequence.

Signed-off-by: Shirish S 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 12 
  1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 709e4a3..fc4c517 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2720,8 +2720,11 @@ int amdgpu_device_resume(struct drm_device *dev, bool 
resume, bool fbcon)
if (dev->switch_power_state == DRM_SWITCH_POWER_OFF)
return 0;
  
-	if (fbcon)

+   if (fbcon) {
console_lock();
+   amdgpu_fbdev_set_suspend(adev, 0);
+   console_unlock();
+   }
  
  	if (resume) {

pci_set_power_state(dev->pdev, PCI_D0);

I don't think the amdgpu_fbdev_set_suspend call can be moved before the
pci_set_power_state call, because fbcon may presumably try writing to
the device's VRAM at any point after amdgpu_fbdev_set_suspend.

There might be other things that need to happen before
amdgpu_fbdev_set_suspend.

OK, in the next patch set I have not moved amdgpu_fbdev_set_suspend().

Regards,
Shirish S

--
Regards,
Shirish S



Re: [PATCH] drm/amdgpu: replace mutex with spin_lock (V2)

2018-06-21 Thread S, Shirish



On 6/22/2018 9:19 AM, S, Shirish wrote:



On 6/22/2018 9:00 AM, Dave Airlie wrote:

On 31 May 2018 at 20:02, Shirish S  wrote:

Mutexes lead to sleeps, which should be avoided in
atomic context.
Hence this patch replaces them with spinlocks.

Below is the stack trace:

I'm not sure I really like this series of patches, going around
replacing ATOMIC and mutex with spinlocks
isn't something that should be done lightly,

In all the patches I haven't seen what spin lock or what causes us to
be in an atomic state in the first
place and why it is necessary that we are in an atomic state for such
long sequences of code.
We have root-caused the issue and reverted all the patches in this 
patch series that tend to replace mutexes with spinlocks.

I somehow noticed that this patch did not get reverted.
Please don't merge this patch.
Regards,
Shirish S

Regards,
Shirish S

Thanks,
Dave.

BUG: sleeping function called from invalid context at 
kernel/locking/mutex.c:**

in_atomic(): 1, irqs_disabled(): 1, pid: 89, name: kworker/u4:3
CPU: 1 PID: 89 Comm: kworker/u4:3 Tainted: G    W 4.14.43 #8
Workqueue: events_unbound commit_work
Call Trace:
  dump_stack+0x4d/0x63
  ___might_sleep+0x11f/0x12e
  mutex_lock+0x20/0x42
  amdgpu_atom_execute_table+0x26/0x72
  enable_disp_power_gating_v2_1+0x85/0xae
  dce110_enable_display_power_gating+0x83/0x1b1
  dce110_power_down_fe+0x4a/0x6d
  dc_post_update_surfaces_to_stream+0x59/0x87
  amdgpu_dm_do_flip+0x239/0x298
  amdgpu_dm_commit_planes.isra.23+0x379/0x54b
  ? drm_calc_timestamping_constants+0x14b/0x15c
  amdgpu_dm_atomic_commit_tail+0x4fc/0x5d2
  ? wait_for_common+0x5b/0x69
  commit_tail+0x42/0x64
  process_one_work+0x1b0/0x314
  worker_thread+0x1cb/0x2c1
  ? create_worker+0x1da/0x1da
  kthread+0x156/0x15e
  ? kthread_flush_work+0xea/0xea
  ret_from_fork+0x22/0x40

V2: Added stack trace in commit message.

Signed-off-by: Shirish S 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c | 2 +-
  drivers/gpu/drm/amd/amdgpu/atom.c    | 4 ++--
  drivers/gpu/drm/amd/amdgpu/atom.h    | 3 ++-
  3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c

index bf872f6..ba3d4b9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c
@@ -2033,7 +2033,7 @@ int amdgpu_atombios_init(struct amdgpu_device 
*adev)

 return -ENOMEM;
 }

- mutex_init(&adev->mode_info.atom_context->mutex);
+ spin_lock_init(&adev->mode_info.atom_context->lock);
 if (adev->is_atom_fw) {
 amdgpu_atomfirmware_scratch_regs_init(adev);
amdgpu_atomfirmware_allocate_fb_scratch(adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/atom.c 
b/drivers/gpu/drm/amd/amdgpu/atom.c

index 69500a8..bfd98f0 100644
--- a/drivers/gpu/drm/amd/amdgpu/atom.c
+++ b/drivers/gpu/drm/amd/amdgpu/atom.c
@@ -1261,7 +1261,7 @@ int amdgpu_atom_execute_table(struct 
atom_context *ctx, int index, uint32_t * pa

  {
 int r;

-   mutex_lock(&ctx->mutex);
+   spin_lock(&ctx->lock);
 /* reset data block */
 ctx->data_block = 0;
 /* reset reg block */
@@ -1274,7 +1274,7 @@ int amdgpu_atom_execute_table(struct 
atom_context *ctx, int index, uint32_t * pa

 ctx->divmul[0] = 0;
 ctx->divmul[1] = 0;
 r = amdgpu_atom_execute_table_locked(ctx, index, params);
-   mutex_unlock(&ctx->mutex);
+   spin_unlock(&ctx->lock);
 return r;
  }

diff --git a/drivers/gpu/drm/amd/amdgpu/atom.h 
b/drivers/gpu/drm/amd/amdgpu/atom.h

index a391709..54063e2 100644
--- a/drivers/gpu/drm/amd/amdgpu/atom.h
+++ b/drivers/gpu/drm/amd/amdgpu/atom.h
@@ -26,6 +26,7 @@
  #define ATOM_H

  #include 
+#include <linux/spinlock.h>
  #include 

  #define ATOM_BIOS_MAGIC    0xAA55
@@ -125,7 +126,7 @@ struct card_info {

  struct atom_context {
 struct card_info *card;
-   struct mutex mutex;
+   spinlock_t lock;
 void *bios;
 uint32_t cmd_table, data_table;
 uint16_t *iio;
--
2.7.4





--
Regards,
Shirish S



Re: [PATCH] drm/amdgpu: replace mutex with spin_lock (V2)

2018-06-21 Thread S, Shirish



On 6/22/2018 9:00 AM, Dave Airlie wrote:

On 31 May 2018 at 20:02, Shirish S  wrote:

Mutexes lead to sleeps, which should be avoided in
atomic context.
Hence this patch replaces them with spinlocks.

Below is the stack trace:

I'm not sure I really like this series of patches, going around
replacing ATOMIC and mutex with spinlocks
isn't something that should be done lightly,

In all the patches I haven't seen what spin lock or what causes us to
be in an atomic state in the first
place and why it is necessary that we are in an atomic state for such
long sequences of code.
We have root-caused the issue and reverted all the patches in this patch 
series that tend to replace mutexes with spinlocks.
Regards,
Shirish S

Thanks,
Dave.


BUG: sleeping function called from invalid context at kernel/locking/mutex.c:**
in_atomic(): 1, irqs_disabled(): 1, pid: 89, name: kworker/u4:3
CPU: 1 PID: 89 Comm: kworker/u4:3 Tainted: GW   4.14.43 #8
Workqueue: events_unbound commit_work
Call Trace:
  dump_stack+0x4d/0x63
  ___might_sleep+0x11f/0x12e
  mutex_lock+0x20/0x42
  amdgpu_atom_execute_table+0x26/0x72
  enable_disp_power_gating_v2_1+0x85/0xae
  dce110_enable_display_power_gating+0x83/0x1b1
  dce110_power_down_fe+0x4a/0x6d
  dc_post_update_surfaces_to_stream+0x59/0x87
  amdgpu_dm_do_flip+0x239/0x298
  amdgpu_dm_commit_planes.isra.23+0x379/0x54b
  ? drm_calc_timestamping_constants+0x14b/0x15c
  amdgpu_dm_atomic_commit_tail+0x4fc/0x5d2
  ? wait_for_common+0x5b/0x69
  commit_tail+0x42/0x64
  process_one_work+0x1b0/0x314
  worker_thread+0x1cb/0x2c1
  ? create_worker+0x1da/0x1da
  kthread+0x156/0x15e
  ? kthread_flush_work+0xea/0xea
  ret_from_fork+0x22/0x40

V2: Added stack trace in commit message.

Signed-off-by: Shirish S 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c | 2 +-
  drivers/gpu/drm/amd/amdgpu/atom.c| 4 ++--
  drivers/gpu/drm/amd/amdgpu/atom.h| 3 ++-
  3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c
index bf872f6..ba3d4b9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c
@@ -2033,7 +2033,7 @@ int amdgpu_atombios_init(struct amdgpu_device *adev)
 return -ENOMEM;
 }

-   mutex_init(&adev->mode_info.atom_context->mutex);
+   spin_lock_init(&adev->mode_info.atom_context->lock);
 if (adev->is_atom_fw) {
 amdgpu_atomfirmware_scratch_regs_init(adev);
 amdgpu_atomfirmware_allocate_fb_scratch(adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/atom.c 
b/drivers/gpu/drm/amd/amdgpu/atom.c
index 69500a8..bfd98f0 100644
--- a/drivers/gpu/drm/amd/amdgpu/atom.c
+++ b/drivers/gpu/drm/amd/amdgpu/atom.c
@@ -1261,7 +1261,7 @@ int amdgpu_atom_execute_table(struct atom_context *ctx, 
int index, uint32_t * pa
  {
 int r;

-   mutex_lock(&ctx->mutex);
+   spin_lock(&ctx->lock);
 /* reset data block */
 ctx->data_block = 0;
 /* reset reg block */
@@ -1274,7 +1274,7 @@ int amdgpu_atom_execute_table(struct atom_context *ctx, 
int index, uint32_t * pa
 ctx->divmul[0] = 0;
 ctx->divmul[1] = 0;
 r = amdgpu_atom_execute_table_locked(ctx, index, params);
-   mutex_unlock(&ctx->mutex);
+   spin_unlock(&ctx->lock);
 return r;
  }

diff --git a/drivers/gpu/drm/amd/amdgpu/atom.h 
b/drivers/gpu/drm/amd/amdgpu/atom.h
index a391709..54063e2 100644
--- a/drivers/gpu/drm/amd/amdgpu/atom.h
+++ b/drivers/gpu/drm/amd/amdgpu/atom.h
@@ -26,6 +26,7 @@
  #define ATOM_H

  #include 
+#include <linux/spinlock.h>
  #include 

  #define ATOM_BIOS_MAGIC0xAA55
@@ -125,7 +126,7 @@ struct card_info {

  struct atom_context {
 struct card_info *card;
-   struct mutex mutex;
+   spinlock_t lock;
 void *bios;
 uint32_t cmd_table, data_table;
 uint16_t *iio;
--
2.7.4



--
Regards,
Shirish S



Re: [PATCH] drm/amdgpu: change gfx8 ib test to use WB

2018-06-08 Thread S, Shirish



On 6/8/2018 1:08 PM, Christian König wrote:

Am 08.06.2018 um 07:23 schrieb zhoucm1:



On 2018年06月08日 12:54, Shirish S wrote:

This patch extends the usage of WB to
gfx8's IB test, which was originally
implemented in the upstream patch below:
"ed9324a drm/amdgpu: change gfx9 ib test to use WB"


You could copy the commit message from ed9324a to better explain why 
we do it, but that is only nice to have.




Signed-off-by: Shirish S 

Reviewed-by: Chunming Zhou 


Reviewed-by: Christian König 

Thanks Christian & David, I shall re-send with the updated commit message 
and RBs.


Regards,
Shirish S



---
  drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 35 
+--

  1 file changed, 21 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c

index 818874b..61452c7 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
@@ -866,26 +866,32 @@ static int gfx_v8_0_ring_test_ib(struct 
amdgpu_ring *ring, long timeout)

  struct amdgpu_device *adev = ring->adev;
  struct amdgpu_ib ib;
  struct dma_fence *f = NULL;
-    uint32_t scratch;
-    uint32_t tmp = 0;
+
+    unsigned int index;
+    uint64_t gpu_addr;
+    uint32_t tmp;
  long r;
  -    r = amdgpu_gfx_scratch_get(adev, &scratch);
+    r = amdgpu_device_wb_get(adev, &index);
  if (r) {
-    DRM_ERROR("amdgpu: failed to get scratch reg (%ld).\n", r);
+    dev_err(adev->dev, "(%ld) failed to allocate wb slot\n", r);
  return r;
  }
-    WREG32(scratch, 0xCAFEDEAD);
+
+    gpu_addr = adev->wb.gpu_addr + (index * 4);
+    adev->wb.wb[index] = cpu_to_le32(0xCAFEDEAD);
  memset(&ib, 0, sizeof(ib));
-    r = amdgpu_ib_get(adev, NULL, 256, &ib);
+    r = amdgpu_ib_get(adev, NULL, 16, &ib);
  if (r) {
  DRM_ERROR("amdgpu: failed to get ib (%ld).\n", r);
  goto err1;
  }
-    ib.ptr[0] = PACKET3(PACKET3_SET_UCONFIG_REG, 1);
-    ib.ptr[1] = ((scratch - PACKET3_SET_UCONFIG_REG_START));
-    ib.ptr[2] = 0xDEADBEEF;
-    ib.length_dw = 3;
+    ib.ptr[0] = PACKET3(PACKET3_WRITE_DATA, 3);
+    ib.ptr[1] = WRITE_DATA_DST_SEL(5) | WR_CONFIRM;
+    ib.ptr[2] = lower_32_bits(gpu_addr);
+    ib.ptr[3] = upper_32_bits(gpu_addr);
+    ib.ptr[4] = 0xDEADBEEF;
+    ib.length_dw = 5;
    r = amdgpu_ib_schedule(ring, 1, &ib, NULL, &f);
  if (r)
@@ -900,20 +906,21 @@ static int gfx_v8_0_ring_test_ib(struct 
amdgpu_ring *ring, long timeout)

  DRM_ERROR("amdgpu: fence wait failed (%ld).\n", r);
  goto err2;
  }
-    tmp = RREG32(scratch);
+
+    tmp = adev->wb.wb[index];
  if (tmp == 0xDEADBEEF) {
  DRM_DEBUG("ib test on ring %d succeeded\n", ring->idx);
  r = 0;
  } else {
-    DRM_ERROR("amdgpu: ib test failed (scratch(0x%04X)=0x%08X)\n",
-  scratch, tmp);
+    DRM_ERROR("ib test on ring %d failed\n", ring->idx);
  r = -EINVAL;
  }
+
  err2:
  amdgpu_ib_free(adev, &ib, NULL);
  dma_fence_put(f);
  err1:
-    amdgpu_gfx_scratch_free(adev, scratch);
+    amdgpu_device_wb_free(adev, index);
  return r;
  }






--
Regards,
Shirish S



Re: [PATCH] drm/amd/display: avoid sleeping in atomic context while creating new context or state

2018-06-01 Thread S, Shirish


The V2 of this patch has already been reviewed by Harry.
The change I have made in dc_create() is no longer applicable.

Regards,
Shirish S
On 5/31/2018 11:35 PM, Christian König wrote:

Am 30.05.2018 um 18:03 schrieb Harry Wentland:

On 2018-05-30 06:17 AM, Shirish S wrote:

This patch fixes the warning messages that are caused by calling
sleep in atomic context as below:

BUG: sleeping function called from invalid context at mm/slab.h:419
in_atomic(): 1, irqs_disabled(): 1, pid: 5, name: kworker/u4:0
CPU: 1 PID: 5 Comm: kworker/u4:0 Tainted: G    W 4.14.35 #941
Workqueue: events_unbound commit_work
Call Trace:
  dump_stack+0x4d/0x63
  ___might_sleep+0x11f/0x12e
  kmem_cache_alloc_trace+0x41/0xea
  dc_create_state+0x1f/0x30
  dc_commit_updates_for_stream+0x73/0x4cf
  ? amdgpu_get_crtc_scanoutpos+0x82/0x16b
  amdgpu_dm_do_flip+0x239/0x298
  amdgpu_dm_commit_planes.isra.23+0x379/0x54b
  ? dc_commit_state+0x3da/0x404
  amdgpu_dm_atomic_commit_tail+0x4fc/0x5d2
  ? wait_for_common+0x5b/0x69
  commit_tail+0x42/0x64
  process_one_work+0x1b0/0x314
  worker_thread+0x1cb/0x2c1
  ? create_worker+0x1da/0x1da
  kthread+0x156/0x15e
  ? kthread_flush_work+0xea/0xea
  ret_from_fork+0x22/0x40

Signed-off-by: Shirish S 
---
  drivers/gpu/drm/amd/display/dc/core/dc.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c

index 33149ed..d62206f 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -588,7 +588,7 @@ static void disable_dangling_plane(struct dc 
*dc, struct dc_state *context)

    struct dc *dc_create(const struct dc_init_data *init_params)
   {
-    struct dc *dc = kzalloc(sizeof(*dc), GFP_KERNEL);
+    struct dc *dc = kzalloc(sizeof(*dc), GFP_ATOMIC);

Are you sure this one can be called in atomic_context?

If so then everything in construct() would also need GFP_ATOMIC.


Well the backtrace is quite obvious, but I agree that change still 
looks fishy to me as well.


Using GFP_ATOMIC should only be a last resort when nothing else helps, 
but here it looks more like we misuse a spinlock where a mutex or 
semaphore would be more appropriate.


Where exactly does the context become atomic in the call trace?

Christian.



Harry


  unsigned int full_pipe_count;
    if (NULL == dc)
@@ -937,7 +937,7 @@ bool dc_post_update_surfaces_to_stream(struct dc 
*dc)

  struct dc_state *dc_create_state(void)
  {
  struct dc_state *context = kzalloc(sizeof(struct dc_state),
-   GFP_KERNEL);
+   GFP_ATOMIC);
    if (!context)
  return NULL;






--
Regards,
Shirish S



Re: [PATCH] drm/amd/display: avoid sleeping in atomic context while creating new context or state

2018-05-31 Thread S, Shirish



On 5/30/2018 9:33 PM, Harry Wentland wrote:

On 2018-05-30 06:17 AM, Shirish S wrote:

This patch fixes the warning messages that are caused by calling
sleep in atomic context as below:

BUG: sleeping function called from invalid context at mm/slab.h:419
in_atomic(): 1, irqs_disabled(): 1, pid: 5, name: kworker/u4:0
CPU: 1 PID: 5 Comm: kworker/u4:0 Tainted: GW   4.14.35 #941
Workqueue: events_unbound commit_work
Call Trace:
  dump_stack+0x4d/0x63
  ___might_sleep+0x11f/0x12e
  kmem_cache_alloc_trace+0x41/0xea
  dc_create_state+0x1f/0x30
  dc_commit_updates_for_stream+0x73/0x4cf
  ? amdgpu_get_crtc_scanoutpos+0x82/0x16b
  amdgpu_dm_do_flip+0x239/0x298
  amdgpu_dm_commit_planes.isra.23+0x379/0x54b
  ? dc_commit_state+0x3da/0x404
  amdgpu_dm_atomic_commit_tail+0x4fc/0x5d2
  ? wait_for_common+0x5b/0x69
  commit_tail+0x42/0x64
  process_one_work+0x1b0/0x314
  worker_thread+0x1cb/0x2c1
  ? create_worker+0x1da/0x1da
  kthread+0x156/0x15e
  ? kthread_flush_work+0xea/0xea
  ret_from_fork+0x22/0x40

Signed-off-by: Shirish S 
---
  drivers/gpu/drm/amd/display/dc/core/dc.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index 33149ed..d62206f 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -588,7 +588,7 @@ static void disable_dangling_plane(struct dc *dc, struct 
dc_state *context)
  
  struct dc *dc_create(const struct dc_init_data *init_params)

   {
-   struct dc *dc = kzalloc(sizeof(*dc), GFP_KERNEL);
+   struct dc *dc = kzalloc(sizeof(*dc), GFP_ATOMIC);

Are you sure this one can be called in atomic_context?

My bad, you are right, this is not required.
I have re-spun the patch as V2 with the GFP_ATOMIC applied only to 
dc_create_state.

Thanks & Regards,
Shirish S

If so then everything in construct() would also need GFP_ATOMIC.

Harry


unsigned int full_pipe_count;
  
  	if (NULL == dc)

@@ -937,7 +937,7 @@ bool dc_post_update_surfaces_to_stream(struct dc *dc)
  struct dc_state *dc_create_state(void)
  {
struct dc_state *context = kzalloc(sizeof(struct dc_state),
-  GFP_KERNEL);
+  GFP_ATOMIC);
  
  	if (!context)

return NULL;



--
Regards,
Shirish S



Re: [PATCH] drm/amdgpu: replace mutex with spin_lock

2018-05-31 Thread S, Shirish



On 5/30/2018 9:10 PM, Christian König wrote:
Keep in mind that under SRIOV you can read registers while in atomic 
context, e.g. while holding a spinlock.


Please double check if that won't bite us.

Apart from that the change looks good to me,

Thanks Christian, I verified boot, S3, and S5 on Stoney with this patch.
I have re-spun V2, which has the exact trace of the scenario in which this BUG is hit.
Regards,
Shirish S

Christian.

Am 30.05.2018 um 12:19 schrieb Shirish S:

Mutexes lead to sleeps, which should be avoided in
atomic context.
Hence this patch replaces them with spinlocks.

Signed-off-by: Shirish S 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c | 2 +-
  drivers/gpu/drm/amd/amdgpu/atom.c    | 4 ++--
  drivers/gpu/drm/amd/amdgpu/atom.h    | 3 ++-
  3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c

index bf872f6..ba3d4b9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c
@@ -2033,7 +2033,7 @@ int amdgpu_atombios_init(struct amdgpu_device 
*adev)

  return -ENOMEM;
  }
- mutex_init(&adev->mode_info.atom_context->mutex);
+ spin_lock_init(&adev->mode_info.atom_context->lock);
  if (adev->is_atom_fw) {
  amdgpu_atomfirmware_scratch_regs_init(adev);
  amdgpu_atomfirmware_allocate_fb_scratch(adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/atom.c 
b/drivers/gpu/drm/amd/amdgpu/atom.c

index 69500a8..bfd98f0 100644
--- a/drivers/gpu/drm/amd/amdgpu/atom.c
+++ b/drivers/gpu/drm/amd/amdgpu/atom.c
@@ -1261,7 +1261,7 @@ int amdgpu_atom_execute_table(struct 
atom_context *ctx, int index, uint32_t * pa

  {
  int r;
  -    mutex_lock(&ctx->mutex);
+    spin_lock(&ctx->lock);
  /* reset data block */
  ctx->data_block = 0;
  /* reset reg block */
@@ -1274,7 +1274,7 @@ int amdgpu_atom_execute_table(struct 
atom_context *ctx, int index, uint32_t * pa

  ctx->divmul[0] = 0;
  ctx->divmul[1] = 0;
  r = amdgpu_atom_execute_table_locked(ctx, index, params);
-    mutex_unlock(&ctx->mutex);
+    spin_unlock(&ctx->lock);
  return r;
  }
  diff --git a/drivers/gpu/drm/amd/amdgpu/atom.h 
b/drivers/gpu/drm/amd/amdgpu/atom.h

index a391709..cdfb0d0 100644
--- a/drivers/gpu/drm/amd/amdgpu/atom.h
+++ b/drivers/gpu/drm/amd/amdgpu/atom.h
@@ -26,6 +26,7 @@
  #define ATOM_H
    #include 
+#include <linux/spinlock.h>
  #include 
    #define ATOM_BIOS_MAGIC    0xAA55
@@ -125,7 +126,7 @@ struct card_info {
    struct atom_context {
  struct card_info *card;
-    struct mutex mutex;
+    spinlock_t lock;
  void *bios;
  uint32_t cmd_table, data_table;
  uint16_t *iio;




--
Regards,
Shirish S



Re: [PATCH] drm/amdgpu: replace mutex with spin_lock

2018-05-31 Thread S, Shirish



On 5/30/2018 8:51 PM, Deucher, Alexander wrote:

-Original Message-
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
Of Shirish S
Sent: Wednesday, May 30, 2018 6:20 AM
To: amd-gfx@lists.freedesktop.org; Wentland, Harry
; Zhu, Rex 
Cc: S, Shirish 
Subject: [PATCH] drm/amdgpu: replace mutex with spin_lock

Mutexes lead to sleeps, which should be avoided in atomic context.
Hence this patch replaces them with spinlocks.

Signed-off-by: Shirish S 

Does this actually fix a bug or is it just to be safe?  Do we actually call atom 
command tables in an atomic context?
I have re-spun the V2 patch with the stack trace so that you can see the 
path taken.


For reference below is the trace:

BUG: sleeping function called from invalid context at 
kernel/locking/mutex.c:238

in_atomic(): 1, irqs_disabled(): 1, pid: 89, name: kworker/u4:3
CPU: 1 PID: 89 Comm: kworker/u4:3 Tainted: G    W 4.14.43 #8
Workqueue: events_unbound commit_work
Call Trace:
 dump_stack+0x4d/0x63
 ___might_sleep+0x11f/0x12e
 mutex_lock+0x20/0x42
 amdgpu_atom_execute_table+0x26/0x72
 enable_disp_power_gating_v2_1+0x85/0xae
 dce110_enable_display_power_gating+0x83/0x1b1
 dce110_power_down_fe+0x4a/0x6d
 dc_post_update_surfaces_to_stream+0x59/0x87
 amdgpu_dm_do_flip+0x239/0x298
 amdgpu_dm_commit_planes.isra.23+0x379/0x54b
 ? drm_calc_timestamping_constants+0x14b/0x15c
 amdgpu_dm_atomic_commit_tail+0x4fc/0x5d2
 ? wait_for_common+0x5b/0x69
 commit_tail+0x42/0x64
 process_one_work+0x1b0/0x314
 worker_thread+0x1cb/0x2c1
 ? create_worker+0x1da/0x1da
 kthread+0x156/0x15e
 ? kthread_flush_work+0xea/0xea
 ret_from_fork+0x22/0x40

It's caused when a BUG is hit in the kernel.

Thanks & Regards,
Shirish S

Alex


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c | 2 +-
  drivers/gpu/drm/amd/amdgpu/atom.c| 4 ++--
  drivers/gpu/drm/amd/amdgpu/atom.h| 3 ++-
  3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c
index bf872f6..ba3d4b9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c
@@ -2033,7 +2033,7 @@ int amdgpu_atombios_init(struct amdgpu_device
*adev)
return -ENOMEM;
}

-   mutex_init(&adev->mode_info.atom_context->mutex);
+   spin_lock_init(&adev->mode_info.atom_context->lock);
if (adev->is_atom_fw) {
amdgpu_atomfirmware_scratch_regs_init(adev);
amdgpu_atomfirmware_allocate_fb_scratch(adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/atom.c
b/drivers/gpu/drm/amd/amdgpu/atom.c
index 69500a8..bfd98f0 100644
--- a/drivers/gpu/drm/amd/amdgpu/atom.c
+++ b/drivers/gpu/drm/amd/amdgpu/atom.c
@@ -1261,7 +1261,7 @@ int amdgpu_atom_execute_table(struct
atom_context *ctx, int index, uint32_t * pa  {
int r;

-   mutex_lock(&ctx->mutex);
+   spin_lock(&ctx->lock);
/* reset data block */
ctx->data_block = 0;
/* reset reg block */
@@ -1274,7 +1274,7 @@ int amdgpu_atom_execute_table(struct
atom_context *ctx, int index, uint32_t * pa
ctx->divmul[0] = 0;
ctx->divmul[1] = 0;
r = amdgpu_atom_execute_table_locked(ctx, index, params);
-   mutex_unlock(&ctx->mutex);
+   spin_unlock(&ctx->lock);
return r;
  }

diff --git a/drivers/gpu/drm/amd/amdgpu/atom.h
b/drivers/gpu/drm/amd/amdgpu/atom.h
index a391709..cdfb0d0 100644
--- a/drivers/gpu/drm/amd/amdgpu/atom.h
+++ b/drivers/gpu/drm/amd/amdgpu/atom.h
@@ -26,6 +26,7 @@
  #define ATOM_H

  #include 
+#include 
  #include 

  #define ATOM_BIOS_MAGIC   0xAA55
@@ -125,7 +126,7 @@ struct card_info {

  struct atom_context {
struct card_info *card;
-   struct mutex mutex;
+   spinlock_t lock;
void *bios;
uint32_t cmd_table, data_table;
uint16_t *iio;
--
2.7.4



--
Regards,
Shirish S



Re: [PATCH] drm/amdgpu: avoid sleep while executing atombios table

2018-05-31 Thread S, Shirish



On 5/30/2018 8:51 PM, Deucher, Alexander wrote:

-Original Message-
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
Of Shirish S
Sent: Wednesday, May 30, 2018 6:19 AM
To: amd-gfx@lists.freedesktop.org; Wentland, Harry
; Zhu, Rex 
Cc: S, Shirish 
Subject: [PATCH] drm/amdgpu: avoid sleep while executing atombios table

This patch replaces kzalloc's flag from GFP_KERNEL to GFP_ATOMIC to avoid
sleeping in atomic context.

Signed-off-by: Shirish S 

Does this actually fix a bug or is it just to be safe?  Do we actually call atom 
command tables in an atomic context?
I have re-spun the V2 patch with the stack trace so that you can see the 
path taken.


For reference below is the trace:

BUG: sleeping function called from invalid context at mm/slab.h:419
in_atomic(): 1, irqs_disabled(): 0, pid: 1137, name: DrmThread
CPU: 1 PID: 1137 Comm: DrmThread Tainted: G    W   4.14.43 #10
Call Trace:
 dump_stack+0x4d/0x63
 ___might_sleep+0x11f/0x12e
 __kmalloc+0x76/0x126
 amdgpu_atom_execute_table_locked+0xfc/0x285
 amdgpu_atom_execute_table+0x5d/0x72
 transmitter_control_v1_5+0xef/0x11a
 hwss_edp_backlight_control+0x132/0x151
 dce110_disable_stream+0x133/0x16e
 core_link_disable_stream+0x1c5/0x23b
 dce110_reset_hw_ctx_wrap+0xb4/0x1aa
 dce110_apply_ctx_to_hw+0x4e/0x6da
 ? generic_reg_get+0x1f/0x33
 dc_commit_state+0x33f/0x3d2
 amdgpu_dm_atomic_commit_tail+0x2cf/0x5d2
 ? wait_for_common+0x5b/0x69
 commit_tail+0x42/0x64
 drm_atomic_helper_commit+0xdc/0xf9
 drm_atomic_helper_set_config+0x5c/0x76
 __drm_mode_set_config_internal+0x64/0x105

I see it on Stoney when the system's display turns off while entering 
idle-suspend.


It's caused when a BUG is hit in the kernel.
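The GFP_ATOMIC idea — fail fast instead of sleeping when memory is tight — can be sketched in plain C with a hypothetical preallocated pool. The pool size, slot size, and function names are illustration values only, not the kernel allocator.

```c
#include <stddef.h>
#include <stdint.h>

enum { POOL_SLOTS = 4, SLOT_BYTES = 64 };

static uint8_t pool[POOL_SLOTS][SLOT_BYTES];
static int slot_used[POOL_SLOTS];

/* Non-blocking allocation: returns a slot or NULL, never waits. */
static void *atomic_alloc(void)
{
	for (int i = 0; i < POOL_SLOTS; i++) {
		if (!slot_used[i]) {
			slot_used[i] = 1;
			return pool[i];
		}
	}
	return NULL; /* exhausted: report failure instead of sleeping */
}

static void atomic_free(void *p)
{
	for (int i = 0; i < POOL_SLOTS; i++)
		if ((void *)pool[i] == p)
			slot_used[i] = 0;
}
```

The key property is the same one GFP_ATOMIC gives the caller: the allocation path never blocks, so it is usable from a context where sleeping would trip the "sleeping function called from invalid context" check.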

Thanks & Regards,
Shirish S

Alex


---
  drivers/gpu/drm/amd/amdgpu/atom.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/atom.c
b/drivers/gpu/drm/amd/amdgpu/atom.c
index bfd98f0..da4558c 100644
--- a/drivers/gpu/drm/amd/amdgpu/atom.c
+++ b/drivers/gpu/drm/amd/amdgpu/atom.c
@@ -1221,7 +1221,7 @@ static int
amdgpu_atom_execute_table_locked(struct atom_context *ctx, int index,
ectx.abort = false;
ectx.last_jump = 0;
if (ws)
-   ectx.ws = kzalloc(4 * ws, GFP_KERNEL);
+   ectx.ws = kzalloc(4 * ws, GFP_ATOMIC);
else
ectx.ws = NULL;

--
2.7.4



--
Regards,
Shirish S



Re: [PATCH] drm/amd/display: remove need of modeset flag for overlay planes

2018-05-07 Thread S, Shirish



On 5/4/2018 6:27 PM, Andrey Grodzovsky wrote:



On 05/03/2018 02:11 PM, Harry Wentland wrote:

On 2018-05-03 04:00 AM, S, Shirish wrote:


On 5/2/2018 7:21 PM, Harry Wentland wrote:

On 2018-04-27 06:27 AM, Shirish S wrote:

This patch is in continuation to the
"843e3c7 drm/amd/display: defer modeset check in 
dm_update_planes_state"

where we started to eliminate the dependency on
DRM_MODE_ATOMIC_ALLOW_MODESET to be set by the user space,
which as such is not mandatory.

After deferring, this patch eliminates the dependency on the flag
for overlay planes.

Apologies for the late response. I had to think about this patch 
for a long time since I'm not quite comfortable with it.


This has to be done in stages as it is pretty complex and requires thorough
testing before we free primary planes as well from the dependency on the
modeset flag.

Signed-off-by: Shirish S <shiris...@amd.com>
---
   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 8 +---
   1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c

index 1a63c04..87b661d 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4174,7 +4174,7 @@ static void amdgpu_dm_commit_planes(struct 
drm_atomic_state *state,

   }
   spin_unlock_irqrestore(&crtc->dev->event_lock, flags);
   -    if (!pflip_needed) {
+    if (!pflip_needed || plane->type == 
DRM_PLANE_TYPE_OVERLAY) {
Does this mean that whenever we have an overlay plane we won't do 
amdgpu_dm_do_flip but commit_planes_to_stream instead? Is this 
really the behavior we want?


commit_planes_to_stream was intended to program a new surface on a 
modeset whereas amdgpu_dm_do_flip was intended for pageflips.
The need for the "modeset" flag to program a new surface is what we want 
to fix in this patch for the underlay plane; in the next stages we will fix 
manifestations caused by this approach as and when they are seen.
Since user space doesn't send the modeset flag for a new surface, this 
patch checks the plane type to construct planes_count before calling 
commit_planes_to_stream().


Looking at the allow_modeset flag was never quite right and we 
anticipated having to rework this when having to deal with things 
like multiple planes. What really has to happen is that we determine 
the surface_update_type in atomic_check and then use that in 
atomic_commit to either program the surface only (UPDATE_TYPE_FAST) 
without having to lock all pipes or to lock all pipes (see 
lock_and_validation_needed in amdgpu_dm_atomic_check) if we need to 
reprogram mode (UPDATE_TYPE_FULL). I don't remember exactly what 
UPDATE_TYPE_MED is used for.


True, any suggestions on what needs to be checked to decide if the surface 
update has to be FAST?

Like plane_count, format or dc_state, etc.?
I don't feel comfortable taking a shortcut for DRM_PLANE_TYPE_OVERLAY 
without first having a plan and patches for how to deal with the 
above-mentioned.


Bhawan and Andrey had a look at this before but it was never quite 
ready. The work was non-trivial and potentially impacts lots of 
configurations and scenarios if we don't get it right. If you're 
curious you can look at this change (apologies to everyone else for 
posting AMD-internal link): http://git.amd.com:8080/#/c/103931/11


  If we use commit_planes_to_stream we end up losing things like the 
immediate_flip flag, as well as the wait for the right moment to 
program the flip that amdgpu_dm_do_flip does.


From the code, amdgpu_dm_do_flip does what you mentioned only for the 
primary plane, and hence either way it's not set for underlay.

The code wasn't designed with underlay in mind, so it will need work.

Harry


I support Harry's comments, we definitely need to strive to remove the 
dependency on the page_flip_needed flag. AFAIK we are the only atomic KMS 
driver which makes a distinction between page flips and other
surface updates, but it's better to sit and create a general plan of 
how to address it for all types of planes instead of patching for 
overlay only.


Andrey


Is anybody working on removing the dependency on the page_flip_needed flag?
Since I am working on Stoney for Chrome OS, I have limited visibility and 
knowledge of its implications on other ASICs, hence to avoid regressions 
for other users I patched only the overlay path of it.


Someone who knows its inter-dependency on DC would be the right person to do it.
If you think it's a long shot, please re-consider my proposal of doing it 
in stages, i.e., removing overlay planes from the dependency and then the 
primary, since cursor updates are a bit tricky and tied to the primary plane.

Regards,
Shirish S



Regards,
Shirish S
   Even more importantly we won't wait for fences 
(reservation_object_wait_timeout_rcu).


Harry


WARN_ON(!dm_new_plane_state->dc_state);
     plane_states_constructed[planes_count] = 
dm

Re: [PATCH] drm/amd/display: remove need of modeset flag for overlay planes

2018-05-03 Thread S, Shirish



On 5/2/2018 7:21 PM, Harry Wentland wrote:

On 2018-04-27 06:27 AM, Shirish S wrote:

This patch is in continuation to the
"843e3c7 drm/amd/display: defer modeset check in dm_update_planes_state"
where we started to eliminate the dependency on
DRM_MODE_ATOMIC_ALLOW_MODESET to be set by the user space,
which as such is not mandatory.

After deferring, this patch eliminates the dependency on the flag
for overlay planes.


Apologies for the late response. I had to think about this patch for a long 
time since I'm not quite comfortable with it.


This has to be done in stages as it is pretty complex and requires thorough
testing before we free primary planes as well from the dependency on the
modeset flag.

Signed-off-by: Shirish S 
---
  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 8 +---
  1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 1a63c04..87b661d 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4174,7 +4174,7 @@ static void amdgpu_dm_commit_planes(struct 
drm_atomic_state *state,
}
		spin_unlock_irqrestore(&crtc->dev->event_lock, flags);
  
-		if (!pflip_needed) {

+   if (!pflip_needed || plane->type == DRM_PLANE_TYPE_OVERLAY) {

Does this mean that whenever we have an overlay plane we won't do 
amdgpu_dm_do_flip but commit_planes_to_stream instead? Is this really the 
behavior we want?

commit_planes_to_stream was intended to program a new surface on a modeset 
whereas amdgpu_dm_do_flip was intended for pageflips.
The need for the "modeset" flag to program a new surface is what we want to 
fix in this patch for the underlay plane; in the next stages we will fix 
manifestations caused by this approach as and when they are seen.
Since user space doesn't send the modeset flag for a new surface, this 
patch checks the plane type to construct planes_count before calling 
commit_planes_to_stream().


 If we use commit_planes_to_stream we end up losing things like the 
immediate_flip flag, as well as the wait for the right moment to program the 
flip that amdgpu_dm_do_flip does.

From the code, amdgpu_dm_do_flip does what you mentioned only for the 
primary plane, and hence either way it's not set for underlay.


Regards,
Shirish S

  Even more importantly we won't wait for fences 
(reservation_object_wait_timeout_rcu).

Harry


WARN_ON(!dm_new_plane_state->dc_state);
  
  			plane_states_constructed[planes_count] = dm_new_plane_state->dc_state;

@@ -4884,7 +4884,8 @@ static int dm_update_planes_state(struct dc *dc,
  
  		/* Remove any changed/removed planes */

if (!enable) {
-   if (pflip_needed)
+   if (pflip_needed &&
+   plane && plane->type != DRM_PLANE_TYPE_OVERLAY)
continue;
  
  			if (!old_plane_crtc)

@@ -4931,7 +4932,8 @@ static int dm_update_planes_state(struct dc *dc,
if (!dm_new_crtc_state->stream)
continue;
  
-			if (pflip_needed)

+   if (pflip_needed &&
+   plane && plane->type != DRM_PLANE_TYPE_OVERLAY)
continue;
  
  			WARN_ON(dm_new_plane_state->dc_state);




--
Regards,
Shirish S



Re: [PATCH] drm/amd/display: remove need of modeset flag for overlay planes

2018-05-01 Thread S, Shirish



On 5/2/2018 12:53 AM, Stéphane Marchesin wrote:

On Fri, Apr 27, 2018 at 3:27 AM Shirish S  wrote:


This patch is in continuation to the
"843e3c7 drm/amd/display: defer modeset check in dm_update_planes_state"
where we started to eliminate the dependency on
DRM_MODE_ATOMIC_ALLOW_MODESET to be set by the user space,
which as such is not mandatory.
After deferring, this patch eliminates the dependency on the flag
for overlay planes.
This has to be done in stages as it is pretty complex and requires thorough
testing before we free primary planes as well from the dependency on the
modeset flag.
Signed-off-by: Shirish S 
---
   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 8 +---
   1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c

b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c

index 1a63c04..87b661d 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4174,7 +4174,7 @@ static void amdgpu_dm_commit_planes(struct

drm_atomic_state *state,

  }
  spin_unlock_irqrestore(&crtc->dev->event_lock, flags);
-   if (!pflip_needed) {
+   if (!pflip_needed || plane->type ==

DRM_PLANE_TYPE_OVERLAY) {

  WARN_ON(!dm_new_plane_state->dc_state);
  plane_states_constructed[planes_count] =

dm_new_plane_state->dc_state;

@@ -4884,7 +4884,8 @@ static int dm_update_planes_state(struct dc *dc,
  /* Remove any changed/removed planes */
  if (!enable) {
-   if (pflip_needed)
+   if (pflip_needed &&
+   plane && plane->type !=

DRM_PLANE_TYPE_OVERLAY)

nit: I don't think we need to check that plane is non-NULL

Agree, I was a bit over-cautious.
I have removed it in V2.
Thanks.
Regards,
Shirish S

Stéphane


  continue;
  if (!old_plane_crtc)
@@ -4931,7 +4932,8 @@ static int dm_update_planes_state(struct dc *dc,
  if (!dm_new_crtc_state->stream)
  continue;
-   if (pflip_needed)
+   if (pflip_needed &&
+   plane && plane->type !=

DRM_PLANE_TYPE_OVERLAY)

  continue;
  WARN_ON(dm_new_plane_state->dc_state);
--
2.7.4




RE: [PATCH] drm/amd/display: Don't return ddc result and read_bytes in same return value

2018-04-27 Thread S, Shirish
Thanks Harry, it works.

Patch is Reviewed-by: Shirish S <shiris...@amd.com>

Regards,
Shirish S

-Original Message-
From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf Of Harry 
Wentland
Sent: Tuesday, April 24, 2018 8:57 PM
To: amd-gfx@lists.freedesktop.org; S, Shirish <shiris...@amd.com>; Deucher, 
Alexander <alexander.deuc...@amd.com>; S, Shirish <shiris...@amd.com>
Cc: Wentland, Harry <harry.wentl...@amd.com>
Subject: [PATCH] drm/amd/display: Don't return ddc result and read_bytes in 
same return value

The two ranges overlap.

Signed-off-by: Harry Wentland <harry.wentl...@amd.com>
---

Thinking of something like this if this works for you.

Harry

 .../display/amdgpu_dm/amdgpu_dm_mst_types.c   | 20 +++
 .../gpu/drm/amd/display/dc/core/dc_link_ddc.c | 10 +++---
 .../gpu/drm/amd/display/dc/inc/dc_link_ddc.h  |  5 +++--
 3 files changed, 22 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
index c3f3028253c3..b8dd7496b7bc 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
@@ -83,21 +83,22 @@ static ssize_t dm_dp_aux_transfer(struct drm_dp_aux *aux,
enum i2c_mot_mode mot = (msg->request & DP_AUX_I2C_MOT) ?
I2C_MOT_TRUE : I2C_MOT_FALSE;
enum ddc_result res;
-   ssize_t read_bytes;
+   uint32_t read_bytes = msg->size;
 
if (WARN_ON(msg->size > 16))
return -E2BIG;
 
switch (msg->request & ~DP_AUX_I2C_MOT) {
case DP_AUX_NATIVE_READ:
-   read_bytes = dal_ddc_service_read_dpcd_data(
+   res = dal_ddc_service_read_dpcd_data(
TO_DM_AUX(aux)->ddc_service,
false,
I2C_MOT_UNDEF,
msg->address,
msg->buffer,
-   msg->size);
-   return read_bytes;
+   msg->size,
+   &read_bytes);
+   break;
case DP_AUX_NATIVE_WRITE:
res = dal_ddc_service_write_dpcd_data(
TO_DM_AUX(aux)->ddc_service,
@@ -108,14 +109,15 @@ static ssize_t dm_dp_aux_transfer(struct drm_dp_aux *aux,
msg->size);
break;
case DP_AUX_I2C_READ:
-   read_bytes = dal_ddc_service_read_dpcd_data(
+   res = dal_ddc_service_read_dpcd_data(
TO_DM_AUX(aux)->ddc_service,
true,
mot,
msg->address,
msg->buffer,
-   msg->size);
-   return read_bytes;
+   msg->size,
+   &read_bytes);
+   break;
case DP_AUX_I2C_WRITE:
res = dal_ddc_service_write_dpcd_data(
TO_DM_AUX(aux)->ddc_service,
@@ -137,7 +139,9 @@ static ssize_t dm_dp_aux_transfer(struct drm_dp_aux *aux,
 r == DDC_RESULT_SUCESSFULL);
 #endif
 
-   return msg->size;
+   if (res != DDC_RESULT_SUCESSFULL)
+   return -EIO;
+   return read_bytes;
 }
 
 static enum drm_connector_status
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c
index 49c2face1e7a..ae48d603ebd6 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c
@@ -629,13 +629,14 @@ bool dal_ddc_service_query_ddc_data(
return ret;
 }
 
-ssize_t dal_ddc_service_read_dpcd_data(
+enum ddc_result dal_ddc_service_read_dpcd_data(
struct ddc_service *ddc,
bool i2c,
enum i2c_mot_mode mot,
uint32_t address,
uint8_t *data,
-   uint32_t len)
+   uint32_t len,
+   uint32_t *read)
 {
struct aux_payload read_payload = {
.i2c_over_aux = i2c,
@@ -652,6 +653,8 @@ ssize_t dal_ddc_service_read_dpcd_data(
.mot = mot
};
 
+   *read = 0;
+
if (len > DEFAULT_AUX_MAX_DATA_SIZE) {
BREAK_TO_DEBUGGER();
return DDC_RESULT_FAILED_INVALID_OPERATION;
@@ -661,7 +664,8 @@ ssize_t dal_ddc_service_read_dpcd_data(
ddc->ctx->i2caux,
ddc->ddc_pin,
			&command)) {
-   return (ssize_t)command.payloads->length;
+   *read = command.payloads->length;
+   return DDC_RESULT_SUCESSFULL;
}
 
return DDC_RESULT_FAILED_OPERATION;
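The pattern in the patch above — return only a status code and report the byte count through an out-parameter, so the two value ranges can never collide — can be sketched in plain C. All names and the enum values below are illustrative, not the DC API.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum demo_result {
	DEMO_OK = 0,
	DEMO_FAILED = 1, /* small positive codes like this would collide with byte counts */
};

/*
 * Copy up to 'want' bytes from a source buffer. The status goes in the
 * return value; the number of bytes actually read goes in *read, so a
 * count of 7 can never be mistaken for error code 7 (or vice versa).
 */
static enum demo_result demo_read(const uint8_t *src, size_t src_len,
				  uint8_t *dst, size_t want,
				  uint32_t *read)
{
	*read = 0;
	if (want > src_len)
		return DEMO_FAILED;
	memcpy(dst, src, want);
	*read = (uint32_t)want;
	return DEMO_OK;
}
```

The caller then maps the status to an errno-style value (as the patch does with -EIO) and uses the out-parameter only on success.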

Re: [PATCH] drm/amd/display: fix return value of dm_dp_aux_transfer() (V2)

2018-04-24 Thread S, Shirish



On 4/23/2018 9:53 AM, S, Shirish wrote:



On 4/20/2018 11:52 PM, Harry Wentland wrote:

On 2018-04-17 10:56 PM, Shirish S wrote:

Currently the dm_dp_aux_transfer() does not parse
the return value of dal_ddc_service_read_dpcd_data(), which also
has a failure case.
This patch captures the same and ensures the i2c operation status is
sent appropriately to the drm framework.

V2: Updated commit message.

Signed-off-by: Shirish S <shiris...@amd.com>
---
  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c | 6 
+-

  1 file changed, 5 insertions(+), 1 deletion(-)

diff --git 
a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c

index 782491e..7ac124d 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
@@ -115,7 +115,11 @@ static ssize_t dm_dp_aux_transfer(struct 
drm_dp_aux *aux,

  msg->address,
  msg->buffer,
  msg->size);
-    return read_bytes;
+    if (read_bytes != msg->size &&
+    read_bytes >= DDC_RESULT_FAILED_OPERATION)
This doesn't look right. We shouldn't be returning the size or error 
code from the same function. This will not work if we submit a read 7 
or 8 bytes and get an error.
Agree, but I hope you understood the issue; to re-iterate, in case of 
failure we return a positive number (7 or 8) from dm_dp_aux_transfer(), 
leading to EDID read failures later.

I have 2 suggestions:
1. change the "enum ddc_result {" values to #defines of negative error codes, or
2. make the enum start at 129 (greater than the maximum read_bytes possible)?
Let me know which one would be better.
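For reference, the first suggestion corresponds to the common errno convention: with strictly negative error codes, a single ssize_t return can carry either an error or a byte count without the two ranges overlapping. A hypothetical sketch (not the actual ddc_result definition):

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h> /* ssize_t */

/*
 * Illustrative transfer function: on failure it returns a negative
 * errno value; on success it returns the (non-negative) byte count.
 * The sign alone disambiguates the two cases, so a successful 7- or
 * 8-byte read can never look like an error code.
 */
static ssize_t demo_transfer(size_t requested, int fail)
{
	if (fail)
		return -EIO;           /* always < 0 */
	return (ssize_t)requested;     /* always >= 0 */
}
```

Callers then only need the idiomatic `if (ret < 0)` check, which is exactly what the drm framework expects from an aux transfer hook.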

Any suggestions?

Thanks,
Regards,
Shirish S


Harry


+    return -EIO;
+    else
+    return read_bytes;
  case DP_AUX_I2C_WRITE:
  res = dal_ddc_service_write_dpcd_data(
  TO_DM_AUX(aux)->ddc_service,







Re: [PATCH] drm/amd/display: Disallow enabling CRTC without primary plane with FB

2018-04-24 Thread S, Shirish



On 4/19/2018 8:08 PM, Harry Wentland wrote:

On 2018-04-19 03:43 AM, Michel Dänzer wrote:

[ Dropping stable@ (fixes with Cc: stable are picked up for stable
branches once they land in Linus' tree, there's no point sending them to
stable@ during review), adding dri-devel ]

On 2018-04-18 10:26 PM, Harry Wentland wrote:

The below commit

 "drm/atomic: Try to preserve the crtc enabled state in drm_atomic_remove_fb, 
v2"

introduces a slight behavioral change to rmfb. Instead of disabling a crtc
when the primary plane is disabled, it now preserves it.

Since DC is currently not equipped to handle this we need to fail such
a commit, otherwise we might see a corrupted screen.

How does the caller react to failing such a commit?

The caller (drm_atomic_remove_fb in this case) will retry with the old behavior 
and disable the CRTC.

Harry

That's the fallback logic suggested in the patch that caused this issue.

This patch is Reviewed-by: Shirish S


Regards,

Shirish S




This is based on Shirish's previous approach but avoids adding all
planes to the new atomic state which leads to a full update in DC for
any commit, and is not what we intend.

Theoretically DM should be able to deal with states with fully populated planes,
even for simple updates, such as cursor updates. This should still be
addressed in the future.

Signed-off-by: Harry Wentland 
Cc: sta...@vger.kernel.org
---
  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 10 +-
  1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 6f92a19bebd6..0bdc6b484bad 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4683,6 +4683,7 @@ static int dm_update_crtcs_state(struct 
amdgpu_display_manager *dm,
struct amdgpu_dm_connector *aconnector = NULL;
struct drm_connector_state *new_con_state = NULL;
struct dm_connector_state *dm_conn_state = NULL;
+   struct drm_plane_state *new_plane_state = NULL;
  
  		new_stream = NULL;
  
@@ -4690,6 +4691,13 @@ static int dm_update_crtcs_state(struct amdgpu_display_manager *dm,

dm_new_crtc_state = to_dm_crtc_state(new_crtc_state);
acrtc = to_amdgpu_crtc(crtc);
  
+		new_plane_state = drm_atomic_get_new_plane_state(state, new_crtc_state->crtc->primary);

+
+   if (new_crtc_state->enable && new_plane_state && 
!new_plane_state->fb) {
+   ret = -EINVAL;
+   goto fail;
+   }
+
aconnector = 
amdgpu_dm_find_first_crtc_matching_connector(state, crtc);
  
  		/* TODO This hack should go away */

@@ -4894,7 +4902,7 @@ static int dm_update_planes_state(struct dc *dc,
if (!dm_old_crtc_state->stream)
continue;
  
-			DRM_DEBUG_DRIVER("Disabling DRM plane: %d on DRM crtc %d\n",

+   DRM_DEBUG_ATOMIC("Disabling DRM plane: %d on DRM crtc 
%d\n",
plane->base.id, 
old_plane_crtc->base.id);
  
  			if (!dc_remove_plane_from_context(








Re: [PATCH v2] drm/amdgpu: fix the ib test hang when gfx is in "idle" state

2018-04-23 Thread S, Shirish



On 4/24/2018 8:19 AM, Huang Rui wrote:

"aaabaf4   drm/amdgpu: defer test IBs on the rings at boot (V3)"
The above patch defers the execution of the gfx/compute IB tests. However, by
that time the gfx block may already have gone into the idle state. If the "idle"
gfx block receives a command submission, the system will hang. And it still has
an issue with dynamically enabling/disabling gfxoff at runtime, so we have to
use a workaround to skip the gfx/compute IB tests when gfx is already in the
"idle" state.

Signed-off-by: Huang Rui 
Cc: Shirish S 
---

Changes from V1 -> V2:
- Remove unused definitions of smu10_hwmgr.
- Add WA descriptions into commit log and comments.

---
  drivers/gpu/drm/amd/amdgpu/amdgpu.h   |  2 ++
  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 23 ++-

Is it not required for older ASICs like CZ/ST?
If not, then please update the commit message accordingly.
Regards,
Shirish S

  drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c | 22 ++
  3 files changed, 26 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 59df4b7..a0263b9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -905,6 +905,7 @@ struct amdgpu_gfx_funcs {
void (*read_wave_vgprs)(struct amdgpu_device *adev, uint32_t simd, 
uint32_t wave, uint32_t thread, uint32_t start, uint32_t size, uint32_t *dst);
void (*read_wave_sgprs)(struct amdgpu_device *adev, uint32_t simd, 
uint32_t wave, uint32_t start, uint32_t size, uint32_t *dst);
void (*select_me_pipe_q)(struct amdgpu_device *adev, u32 me, u32 pipe, 
u32 queue);
+   bool (*is_gfx_on)(struct amdgpu_device *adev);
  };
  
  struct amdgpu_ngg_buf {

@@ -1855,6 +1856,7 @@ amdgpu_get_sdma_instance(struct amdgpu_ring *ring)
  #define amdgpu_gds_switch(adev, r, v, d, w, a) 
(adev)->gds.funcs->patch_gds_switch((r), (v), (d), (w), (a))
  #define amdgpu_psp_check_fw_loading_status(adev, i) 
(adev)->firmware.funcs->check_fw_loading_status((adev), (i))
  #define amdgpu_gfx_select_me_pipe_q(adev, me, pipe, q) 
(adev)->gfx.funcs->select_me_pipe_q((adev), (me), (pipe), (q))
+#define amdgpu_gfx_is_gfx_on(adev) (adev)->gfx.funcs->is_gfx_on((adev))
  
  /* Common functions */

  int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 2c5e2a4..b8bd194 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -342,6 +342,18 @@ static int gfx_v9_0_ring_test_ring(struct amdgpu_ring 
*ring)
return r;
  }
  
+static bool gfx_v9_0_is_gfx_on(struct amdgpu_device *adev)

+{
+   uint32_t reg;
+
+   reg = RREG32_SOC15(PWR, 0, mmPWR_MISC_CNTL_STATUS);
+   if ((reg & PWR_MISC_CNTL_STATUS__PWR_GFXOFF_STATUS_MASK) ==
+   (0x2 << PWR_MISC_CNTL_STATUS__PWR_GFXOFF_STATUS__SHIFT))
+   return true;
+
+   return false;
+}
+
  static int gfx_v9_0_ring_test_ib(struct amdgpu_ring *ring, long timeout)
  {
struct amdgpu_device *adev = ring->adev;
@@ -353,6 +365,14 @@ static int gfx_v9_0_ring_test_ib(struct amdgpu_ring *ring, 
long timeout)
uint32_t tmp;
long r;
  
+	/*

+* FIXME: It still has issue to dynamically enable/disable gfxoff at
+* runtime. So it has to skip the gfx/compute ib test when gfx is
+* already in "idle" state.
+*/
+   if (!amdgpu_gfx_is_gfx_on(adev))
+   return 0;
+
	r = amdgpu_device_wb_get(adev, &index);
if (r) {
dev_err(adev->dev, "(%ld) failed to allocate wb slot\n", r);
@@ -1085,7 +1105,8 @@ static const struct amdgpu_gfx_funcs gfx_v9_0_gfx_funcs = 
{
	.read_wave_data = &gfx_v9_0_read_wave_data,
	.read_wave_sgprs = &gfx_v9_0_read_wave_sgprs,
	.read_wave_vgprs = &gfx_v9_0_read_wave_vgprs,
-	.select_me_pipe_q = &gfx_v9_0_select_me_pipe_q
+	.select_me_pipe_q = &gfx_v9_0_select_me_pipe_q,
+	.is_gfx_on = &gfx_v9_0_is_gfx_on
  };
  
  static void gfx_v9_0_gpu_early_init(struct amdgpu_device *adev)

diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c 
b/drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c
index 7712eb6..48c17fb 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c
@@ -42,12 +42,6 @@
  #define SMU10_DISPCLK_BYPASS_THRESHOLD 1 /* 100Mhz */
  #define SMC_RAM_END 0x4
  
-#define mmPWR_MISC_CNTL_STATUS	0x0183

-#define mmPWR_MISC_CNTL_STATUS_BASE_IDX 0
-#define PWR_MISC_CNTL_STATUS__PWR_GFX_RLC_CGPG_EN__SHIFT   0x0
-#define PWR_MISC_CNTL_STATUS__PWR_GFXOFF_STATUS__SHIFT 0x1
-#define PWR_MISC_CNTL_STATUS__PWR_GFX_RLC_CGPG_EN_MASK 0x0001L
-#define PWR_MISC_CNTL_STATUS__PWR_GFXOFF_STATUS_MASK   0x0006L
  
  static const 
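The status test in gfx_v9_0_is_gfx_on() above is a plain mask-and-compare on a register value. A standalone sketch of that check, with the mask and shift values copied from the quoted patch (the register-read itself is replaced by a plain parameter for illustration):

```c
#include <stdbool.h>
#include <stdint.h>

/* Field encoding from the quoted patch: a 2-bit GFXOFF status field. */
#define PWR_GFXOFF_STATUS_SHIFT 0x1
#define PWR_GFXOFF_STATUS_MASK  0x0006u

/*
 * Mask out the status field and compare it against the encoded value
 * 0x2, which the patch treats as "the GFX block is powered on".
 */
static bool gfx_is_on(uint32_t reg)
{
	return (reg & PWR_GFXOFF_STATUS_MASK) ==
	       (0x2u << PWR_GFXOFF_STATUS_SHIFT);
}
```

Shifting the expected field value by the field's bit offset before comparing is the standard way to test a multi-bit register field without disturbing neighboring bits.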

Re: [PATCH] drm/amd/display: introduce quirks for i2c adaptor

2018-04-22 Thread S, Shirish



On 4/20/2018 11:54 PM, Harry Wentland wrote:

On 2018-04-17 02:57 AM, Shirish S wrote:

The DP AUX channel cannot read messages of size greater
than 16 bytes; this patch adds the quirks field accordingly
at the initialization of the adapter.

Is this in response to a bug?
Yes, it's in continuation of the dm_dp_aux_transfer() return bug, but also 
with an intention to clean up.
Currently we are in a more reactive mode: in dm_dp_aux_transfer() 
we have a WARN_ON for the message size,
by which time the i2c framework has already created a list of 
msg->size's. By adding a quirk we can not only get rid of the 
WARN_ON
but also ensure that the i2c framework knows about the limit and does not 
form message reads larger than 16 bytes.
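What a max_read_len quirk buys the caller can be sketched in plain C: a large read is split into transfers no bigger than the per-transfer limit. The 16-byte cap mirrors the DP AUX limit discussed here; the "device" is just a memory buffer and the function names are illustrative.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { AUX_MAX_XFER = 16 }; /* per-transfer hardware limit */

/*
 * Read 'len' bytes from a simulated device into 'dst', issuing
 * transfers of at most AUX_MAX_XFER bytes each. Returns the number of
 * bytes actually read (short if the device runs out of data).
 */
static size_t chunked_read(const uint8_t *dev, size_t dev_len,
			   uint8_t *dst, size_t len)
{
	size_t done = 0;

	while (done < len && done < dev_len) {
		size_t n = len - done;

		if (n > AUX_MAX_XFER)
			n = AUX_MAX_XFER;        /* respect the quirk */
		if (n > dev_len - done)
			n = dev_len - done;      /* don't run past the device */
		memcpy(dst + done, dev + done, n);
		done += n;
	}
	return done;
}
```

With the quirk registered, this chunking happens inside the i2c core, so the driver's transfer hook only ever sees messages within the limit and the WARN_ON becomes unreachable.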


I don't see any other DRM driver using quirks like this, even though they also 
wouldn't be able to transfer more than 16 bytes when using i2c-over-aux. This 
makes me wonder why we need it.

Harry


Signed-off-by: Shirish S 
---
  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c | 5 +
  1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
index 782491e..f7d6d9a 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
@@ -464,10 +464,15 @@ static const struct drm_dp_mst_topology_cbs dm_mst_cbs = {
.register_connector = dm_dp_mst_register_connector
  };
  
+/* I2C adapter quirks, max read len is 16 bytes. */
+static const struct i2c_adapter_quirks dm_dp_aux_quirks = {
+   .max_read_len = 128,
+};
  void amdgpu_dm_initialize_dp_connector(struct amdgpu_display_manager *dm,
   struct amdgpu_dm_connector *aconnector)
  {
aconnector->dm_dp_aux.aux.name = "dmdc";
+   aconnector->dm_dp_aux.aux.ddc.quirks = &dm_dp_aux_quirks;
aconnector->dm_dp_aux.aux.dev = dm->adev->dev;
aconnector->dm_dp_aux.aux.transfer = dm_dp_aux_transfer;
aconnector->dm_dp_aux.ddc_service = aconnector->dc_link->ddc;



___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


Re: [PATCH] drm/amd/display: fix return value of dm_dp_aux_transfer() (V2)

2018-04-22 Thread S, Shirish



On 4/20/2018 11:52 PM, Harry Wentland wrote:

On 2018-04-17 10:56 PM, Shirish S wrote:

Currently dm_dp_aux_transfer() does not check the return value
of dal_ddc_service_read_dpcd_data(), which also has a failure case.
This patch captures that and ensures the i2c operation status is
reported appropriately to the drm framework.

V2: Updated commit message.

Signed-off-by: Shirish S 
---
  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c | 6 +-
  1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
index 782491e..7ac124d 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
@@ -115,7 +115,11 @@ static ssize_t dm_dp_aux_transfer(struct drm_dp_aux *aux,
msg->address,
msg->buffer,
msg->size);
-   return read_bytes;
+   if (read_bytes != msg->size &&
+   read_bytes >= DDC_RESULT_FAILED_OPERATION)

This doesn't look right. We shouldn't be returning the size or error code from 
the same function. This will not work if we submit a read 7 or 8 bytes and get 
an error.
Agree, but hope you understood the issue; to re-iterate, in case of 
failure we return a +ve number (7 or 8) back from dm_dp_aux_transfer(), 
leading to edid read failures later.

I have 2 suggestions:
1. change the "enum ddc_result {" values to #defines of -ve error codes, or
2. make the enum start at 129 (greater than the max read_bytes possible)?
Let me know which one would be better.
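The first suggestion (negative error codes) can be illustrated with a hedged sketch; `ddc_read_model()` is a hypothetical helper, not DC code, showing why a negative errno can never collide with a valid byte count such as 7 or 8:

```c
#include <errno.h>

/* Hypothetical replacement for the positive ddc_result codes:
 * on success return the number of bytes read (0..16 for a DP AUX
 * transaction), on failure return a negative errno.  A caller can
 * then test the sign alone, which is impossible when error codes
 * share the positive range with valid byte counts. */
int ddc_read_model(int ok, int bytes)
{
	if (!ok)
		return -EIO;	/* failure can never collide with a length */
	return bytes;		/* always non-negative on success */
}
```

A caller such as the drm aux layer can then use `ret < 0` as the single failure test, regardless of how many bytes were requested.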
Thanks,
Regards,
Shirish S


Harry


+   return -EIO;
+   else
+   return read_bytes;
case DP_AUX_I2C_WRITE:
res = dal_ddc_service_write_dpcd_data(
TO_DM_AUX(aux)->ddc_service,





Re: [PATCH] drm/amdgpu: defer test IBs on the rings at boot (V2)

2018-04-16 Thread S, Shirish



On 4/13/2018 10:20 PM, Alex Deucher wrote:

On Fri, Apr 13, 2018 at 9:25 AM, Christian König
 wrote:

Am 13.04.2018 um 10:31 schrieb Shirish S:

amdgpu_ib_ring_tests() runs test IBs on the rings at boot and
contributes to ~500 ms of the amdgpu driver's boot time.

This patch defers it and ensures that it is executed
in amdgpu_info_ioctl() if it wasn't scheduled.

V2: Use queue_delayed_work() & flush_delayed_work().

Signed-off-by: Shirish S 
---
   drivers/gpu/drm/amd/amdgpu/amdgpu.h|  2 ++
   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 20 +---
   drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c|  4 
   3 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 5734871..ae8f722 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1611,6 +1611,8 @@ struct amdgpu_device {
 /* delayed work_func for deferring clockgating during resume */
 struct delayed_work late_init_work;
+   /* delayed work_func to defer testing IB's on rings during boot */
+   struct delayed_work late_init_test_ib_work;


That still has the chance of running the late init in parallel with the IB
tests and that really doesn't looks like a good idea to me.

Yeah, at least on older chips we run into problems if we power or
clock gate some engines while they are in use.  Even on engines that
support dynamic gating, you usually have to set it up while the engine
is idle.  Make sure the IB tests run before we enable gating.

Ok Alex.
I have re-spun V3 with only one delayed work, ensuring the IB tests are 
run before enabling clock gating.

Regards,
Shirish S

Alex


Is there any issue with putting the IB test into the late init work handler
as well?



 struct amdgpu_virt  virt;
 /* firmware VRAM reservation */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 1762eb4..ee84058 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -63,6 +63,7 @@ MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
   MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
 #define AMDGPU_RESUME_MS2000
+#define AMDGPU_IB_TEST_SCHED_MS2000
 static const char *amdgpu_asic_name[] = {
 "TAHITI",
@@ -2105,6 +2106,16 @@ bool amdgpu_device_asic_has_dc_support(enum
amd_asic_type asic_type)
 }
   }
   +static void amdgpu_device_late_init_test_ib_func_handler(struct
work_struct *work)
+{
+   struct amdgpu_device *adev =
+   container_of(work, struct amdgpu_device,
late_init_test_ib_work.work);
+   int r = amdgpu_ib_ring_tests(adev);
+
+   if (r)
+   DRM_ERROR("ib ring test failed (%d).\n", r);
+}
+
   /**
* amdgpu_device_has_dc_support - check if dc is supported
*
@@ -2212,6 +2223,8 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 	INIT_LIST_HEAD(&adev->ring_lru_list);
 	spin_lock_init(&adev->ring_lru_list_lock);
+	INIT_DELAYED_WORK(&adev->late_init_test_ib_work,
+			  amdgpu_device_late_init_test_ib_func_handler);
 	INIT_DELAYED_WORK(&adev->late_init_work,
   amdgpu_device_ip_late_init_func_handler);
   @@ -2374,9 +2387,9 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 goto failed;
 }
   - r = amdgpu_ib_ring_tests(adev);
-   if (r)
-   DRM_ERROR("ib ring test failed (%d).\n", r);
+   /* Schedule amdgpu_ib_ring_tests() */
+   queue_delayed_work(system_wq, &adev->late_init_test_ib_work,
+   msecs_to_jiffies(AMDGPU_IB_TEST_SCHED_MS));
 if (amdgpu_sriov_vf(adev))
 amdgpu_virt_init_data_exchange(adev);
@@ -2469,6 +2482,7 @@ void amdgpu_device_fini(struct amdgpu_device *adev)
 }
 adev->accel_working = false;
 	cancel_delayed_work_sync(&adev->late_init_work);
+   cancel_delayed_work_sync(&adev->late_init_test_ib_work);
 /* free i2c buses */
 if (!amdgpu_device_has_dc_support(adev))
 amdgpu_i2c_fini(adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index 487d39e..6fa326b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -279,6 +279,10 @@ static int amdgpu_info_ioctl(struct drm_device *dev,
void *data, struct drm_file
 if (!info->return_size || !info->return_pointer)
 return -EINVAL;
   + /* Ensure IB tests on ring are executed */
+   if (delayed_work_pending(&adev->late_init_test_ib_work))
+   flush_delayed_work(&adev->late_init_test_ib_work);
+


You just need to call flush_delayed_work() here without the if.

Regards,
Christian.


 switch (info->query) {
 case AMDGPU_INFO_ACCEL_WORKING:
 ui32 = 

Re: [PATCH] drm/amdgpu: defer test IBs on the rings at boot

2018-04-13 Thread S, Shirish



On 4/13/2018 12:38 PM, Christian König wrote:

Am 13.04.2018 um 09:01 schrieb S, Shirish:



On 4/13/2018 11:53 AM, Christian König wrote:

Am 13.04.2018 um 06:07 schrieb Shirish S:

amdgpu_ib_ring_tests() runs test IBs on the rings at boot and
contributes to ~500 ms of the amdgpu driver's boot time.

This patch defers it and adds a check to report
in amdgpu_info_ioctl() if it was scheduled or not.


That is rather suboptimal, but see below.

Which part is sub-optimal, deferring or checking if the work is 
scheduled?


That was about the check. We should wait for the test to finish 
instead of printing an error and continuing.



Done. Have made this change in V2.


Signed-off-by: Shirish S <shiris...@amd.com>
---
  drivers/gpu/drm/amd/amdgpu/amdgpu.h    |  2 ++
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 20 +---
  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c    |  3 +++
  3 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h

index 5734871..ae8f722 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1611,6 +1611,8 @@ struct amdgpu_device {
    /* delayed work_func for deferring clockgating during 
resume */

  struct delayed_work late_init_work;
+    /* delayed work_func to defer testing IB's on rings during 
boot */

+    struct delayed_work late_init_test_ib_work;


You must put the IB test into the late_init_work as well, otherwise 
the two delayed workers can race with each other.


I thought from the comment above the declaration it's clear why I am 
creating 2 work structures.
late_init_work is to optimize resume time and late_init_test_ib_work 
is to optimize the boot time.
There can't be a race, as the contexts in which they are called are 
totally different.


Late init enables power and clock gating. If I'm not completely 
mistaken we don't do the power/clock gating earlier because we had to 
wait for the IB test to finish.


Could be that modern ASICs have additional logic to prevent that, but 
the last time I worked on this power gating a block while you run 
something on it could even crash the whole system.



    struct amdgpu_virt    virt;
  /* firmware VRAM reservation */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c

index 1762eb4..e65a5e6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -63,6 +63,7 @@ MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
  MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
    #define AMDGPU_RESUME_MS    2000
+#define AMDGPU_IB_TEST_SCHED_MS    2000
    static const char *amdgpu_asic_name[] = {
  "TAHITI",
@@ -2105,6 +2106,16 @@ bool amdgpu_device_asic_has_dc_support(enum 
amd_asic_type asic_type)

  }
  }
  +static void amdgpu_device_late_init_test_ib_func_handler(struct 
work_struct *work)

+{
+    struct amdgpu_device *adev =
+    container_of(work, struct amdgpu_device, 
late_init_test_ib_work.work);

+    int r = amdgpu_ib_ring_tests(adev);
+
+    if (r)
+    DRM_ERROR("ib ring test failed (%d).\n", r);
+}
+
  /**
   * amdgpu_device_has_dc_support - check if dc is supported
   *
@@ -2212,6 +2223,8 @@ int amdgpu_device_init(struct amdgpu_device 
*adev,

  INIT_LIST_HEAD(&adev->ring_lru_list);
  spin_lock_init(&adev->ring_lru_list_lock);
  + INIT_DELAYED_WORK(&adev->late_init_test_ib_work,
+ amdgpu_device_late_init_test_ib_func_handler);
  INIT_DELAYED_WORK(&adev->late_init_work,
    amdgpu_device_ip_late_init_func_handler);
  @@ -2374,9 +2387,9 @@ int amdgpu_device_init(struct amdgpu_device 
*adev,

  goto failed;
  }
  -    r = amdgpu_ib_ring_tests(adev);
-    if (r)
-    DRM_ERROR("ib ring test failed (%d).\n", r);
+    /* Schedule amdgpu_ib_ring_tests() */
+    mod_delayed_work(system_wq, &adev->late_init_test_ib_work,
+    msecs_to_jiffies(AMDGPU_IB_TEST_SCHED_MS));


That doesn't work like you intended. mod_delayed_work() overrides 
the existing handler.


What you wanted to use is queue_delayed_work(), but as I said we 
should only have one delayed worker.
mod_delayed_work() is a safer and more optimal method that replaces 
cancel_delayed_work() followed by queue_delayed_work().

(https://lkml.org/lkml/2011/2/3/175)
But if you strongly insist I don't mind changing it.


Well, mod_delayed_work() does NOT replace queue_delayed_work(). Those 
two functions are for different use cases.


The link you posted actually explains it quite well:

So, cancel_delayed_work() followed by queue_delayed_work() schedules
the work to be executed at the specified time regardless of the
current pending state while queue_delayed_work() takes effect iff
currently the work item is not pending.


queue_delayed_work() takes only effect if the work item is not already 
pending/executing.


In other word
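The semantics quoted above can be modelled in a few lines of plain C. This is a toy scheduler, not the kernel workqueue API; `toy_queue()` and `toy_mod()` are hypothetical names mirroring the documented behaviour of queue_delayed_work() and mod_delayed_work():

```c
#include <stdbool.h>

/* Toy model of a delayed work item's pending state and expiry time. */
struct toy_dwork {
	bool pending;
	long expires;
};

/* queue_delayed_work() semantics: takes effect only if the work item
 * is not already pending; an already-armed timer is left untouched. */
bool toy_queue(struct toy_dwork *w, long expires)
{
	if (w->pending)
		return false;
	w->pending = true;
	w->expires = expires;
	return true;
}

/* mod_delayed_work() semantics: always (re)arms with the new expiry,
 * regardless of the current pending state. */
void toy_mod(struct toy_dwork *w, long expires)
{
	w->pending = true;
	w->expires = expires;
}
```

So for a one-shot boot-time deferral, where nothing else can have armed the work yet, the queue-style call is the natural fit; the mod-style call is for rescheduling an already pending item.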

Re: [PATCH] drm/amdgpu: defer test IBs on the rings at boot

2018-04-13 Thread S, Shirish



On 4/13/2018 11:53 AM, Christian König wrote:

Am 13.04.2018 um 06:07 schrieb Shirish S:

amdgpu_ib_ring_tests() runs test IBs on the rings at boot and
contributes to ~500 ms of the amdgpu driver's boot time.

This patch defers it and adds a check to report
in amdgpu_info_ioctl() if it was scheduled or not.


That is rather suboptimal, but see below.


Which part is sub-optimal, deferring or checking if the work is scheduled?


Signed-off-by: Shirish S 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu.h    |  2 ++
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 20 +---
  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c    |  3 +++
  3 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h

index 5734871..ae8f722 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1611,6 +1611,8 @@ struct amdgpu_device {
    /* delayed work_func for deferring clockgating during resume */
  struct delayed_work late_init_work;
+    /* delayed work_func to defer testing IB's on rings during boot */
+    struct delayed_work late_init_test_ib_work;


You must put the IB test into the late_init_work as well, otherwise 
the two delayed workers can race with each other.


I thought from the comment above the declaration it's clear why I am 
creating 2 work structures.
late_init_work is to optimize resume time and late_init_test_ib_work is 
to optimize the boot time.
There can't be a race, as the contexts in which they are called are 
totally different.

    struct amdgpu_virt    virt;
  /* firmware VRAM reservation */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c

index 1762eb4..e65a5e6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -63,6 +63,7 @@ MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
  MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
    #define AMDGPU_RESUME_MS    2000
+#define AMDGPU_IB_TEST_SCHED_MS    2000
    static const char *amdgpu_asic_name[] = {
  "TAHITI",
@@ -2105,6 +2106,16 @@ bool amdgpu_device_asic_has_dc_support(enum 
amd_asic_type asic_type)

  }
  }
  +static void amdgpu_device_late_init_test_ib_func_handler(struct 
work_struct *work)

+{
+    struct amdgpu_device *adev =
+    container_of(work, struct amdgpu_device, 
late_init_test_ib_work.work);

+    int r = amdgpu_ib_ring_tests(adev);
+
+    if (r)
+    DRM_ERROR("ib ring test failed (%d).\n", r);
+}
+
  /**
   * amdgpu_device_has_dc_support - check if dc is supported
   *
@@ -2212,6 +2223,8 @@ int amdgpu_device_init(struct amdgpu_device *adev,
  INIT_LIST_HEAD(&adev->ring_lru_list);
  spin_lock_init(&adev->ring_lru_list_lock);
  +    INIT_DELAYED_WORK(&adev->late_init_test_ib_work,
+  amdgpu_device_late_init_test_ib_func_handler);
  INIT_DELAYED_WORK(&adev->late_init_work,
    amdgpu_device_ip_late_init_func_handler);
  @@ -2374,9 +2387,9 @@ int amdgpu_device_init(struct amdgpu_device 
*adev,

  goto failed;
  }
  -    r = amdgpu_ib_ring_tests(adev);
-    if (r)
-    DRM_ERROR("ib ring test failed (%d).\n", r);
+    /* Schedule amdgpu_ib_ring_tests() */
+    mod_delayed_work(system_wq, &adev->late_init_test_ib_work,
+    msecs_to_jiffies(AMDGPU_IB_TEST_SCHED_MS));


That doesn't work like you intended. mod_delayed_work() overrides the 
existing handler.


What you wanted to use is queue_delayed_work(), but as I said we 
should only have one delayed worker.
mod_delayed_work() is a safer and more optimal method that replaces 
cancel_delayed_work() followed by queue_delayed_work().

(https://lkml.org/lkml/2011/2/3/175)
But if you strongly insist I don't mind changing it.



    if (amdgpu_sriov_vf(adev))
  amdgpu_virt_init_data_exchange(adev);
@@ -2469,6 +2482,7 @@ void amdgpu_device_fini(struct amdgpu_device 
*adev)

  }
  adev->accel_working = false;
  cancel_delayed_work_sync(&adev->late_init_work);
+ cancel_delayed_work_sync(&adev->late_init_test_ib_work);
  /* free i2c buses */
  if (!amdgpu_device_has_dc_support(adev))
  amdgpu_i2c_fini(adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c

index 487d39e..057bd9a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -279,6 +279,9 @@ static int amdgpu_info_ioctl(struct drm_device 
*dev, void *data, struct drm_file

  if (!info->return_size || !info->return_pointer)
  return -EINVAL;
  +    if (delayed_work_pending(&adev->late_init_test_ib_work))
+    DRM_ERROR("IB test on ring not executed\n");
+


Please use flush_delayed_work() instead of issuing and error here.


Agree, I wasn't sure what to do here :).
So I will re-spin with the flush part added. Hope this reply clarifies 
your comments.

Thanks.
Regards,
Shirish S

Regards,
Christian.


  

RE: [PATCH 2/2] Revert "drm/amd/display: disable CRTCs with NULL FB on their primary plane (V2)"

2018-04-12 Thread S, Shirish
Hi Harry, Alex,

The solution given while reviewing my patch was that "DC should support 
enabling a CRTC without a framebuffer."

Since the revert is a temporary workaround to address the issue at hand, and 
considering the bigger regression it will cause on ChromeOS (explained below),
I would strongly recommend that the revert not be mainlined (to Linus' tree) 
until a proper fix for both issues, i.e., the flickering and the BUG hit on 
atomic commit, is found.

For the sake of everyone's understanding, below is a brief background.

Mainline patch from intel folks, "846c7df drm/atomic: Try to preserve the crtc 
enabled state in drm_atomic_remove_fb, v2." 
introduces a slight behavioral change to rmfb. Instead of disabling a crtc when 
the primary plane is disabled, it now preserves it.

This change leads to a BUG hit while performing an atomic commit in the amd 
driver, causing reboot/system instability on ChromeOS, which has enabled the
drm atomic way of rendering. I also remember it causing issues on other OSes 
as well.

Thanks & Regards,
Shirish S

-Original Message-
From: Michel Dänzer [mailto:mic...@daenzer.net] 
Sent: Thursday, April 12, 2018 8:39 PM
To: Wentland, Harry <harry.wentl...@amd.com>
Cc: amd-gfx@lists.freedesktop.org; Deucher, Alexander 
<alexander.deuc...@amd.com>; S, Shirish <shiris...@amd.com>
Subject: Re: [PATCH 2/2] Revert "drm/amd/display: disable CRTCs with NULL FB on 
their primary plane (V2)"

On 2018-04-12 04:51 PM, Harry Wentland wrote:
> This seems to cause flickering and lock-ups for a wide range of users.
> Revert until we've found a proper fix for the flickering and lock-ups.
> 
> This reverts commit 36cc549d59864b7161f0e23d710c1c4d1b9cf022.
> 
> Cc: Shirish S <shiris...@amd.com>
> Cc: Alex Deucher <alexander.deuc...@amd.com>
> Cc: sta...@vger.kernel.org
> Signed-off-by: Harry Wentland <harry.wentl...@amd.com>

Thanks Harry, both patches are

Reviewed-by: Michel Dänzer <michel.daen...@amd.com>


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Recall: [PATCH] drm/amdgpu: defer initing UVD & VCE IP blocks

2018-04-10 Thread S, Shirish
S, Shirish would like to recall the message, "[PATCH] drm/amdgpu: defer initing 
UVD & VCE IP blocks".


RE: [PATCH] drm/atomic: Add new reverse iterator over all plane state

2018-03-06 Thread S, Shirish
Hi Alex,

Have resent the V2 with R-B of Daniel.

Regards,
Shirish S


-Original Message-
From: Alex Deucher [mailto:alexdeuc...@gmail.com] 
Sent: Tuesday, March 6, 2018 11:01 PM
To: Vishwakarma, Pratik <pratik.vishwaka...@amd.com>
Cc: Daniel Vetter <dan...@ffwll.ch>; Deucher, Alexander 
<alexander.deuc...@amd.com>; amd-gfx@lists.freedesktop.org; Maling list - DRI 
developers <dri-de...@lists.freedesktop.org>; S, Shirish <shiris...@amd.com>
Subject: Re: [PATCH] drm/atomic: Add new reverse iterator over all plane state

On Tue, Mar 6, 2018 at 5:52 AM, Vishwakarma, Pratik 
<pratik.vishwaka...@amd.com> wrote:
> Hi Daniel,
>
> I have checked make htmldocs on v2 of this patch. I have attached output 
> drm-kms.html on that thread.
> No indentation issue is observed. Attached again for reference.
> Can you please provide RB on that?

How did you send the patch?  I can't get V2 to apply.  The patch is mangled.  
Please use git-send-email if you didn't before.

Alex

>
> Regards
> Pratik
>
> -Original Message-
> From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf 
> Of Daniel Vetter
> Sent: Tuesday, March 6, 2018 3:36 PM
> To: Alex Deucher <alexdeuc...@gmail.com>
> Cc: Deucher, Alexander <alexander.deuc...@amd.com>; 
> amd-gfx@lists.freedesktop.org; Maling list - DRI developers 
> <dri-de...@lists.freedesktop.org>; S, Shirish <shiris...@amd.com>
> Subject: Re: [PATCH] drm/atomic: Add new reverse iterator over all 
> plane state
>
> On Wed, Feb 28, 2018 at 09:26:26AM -0500, Alex Deucher wrote:
>> + dri-devel
>>
>>
>> On Wed, Feb 28, 2018 at 4:33 AM, S, Shirish <shiris...@amd.com> wrote:
>> > From: Shirish S <shiris...@amd.com>
>> >
>> > Add reverse iterator "for_each_oldnew_plane_in_state_reverse" to 
>> > complement "for_each_oldnew_plane_in_state" way of reading plane 
>> > states.
>> >
>> > The plane states are required to be read in reverse order for 
>> > amdgpu, as the z order convention followed in linux is opposite to 
>> > how the planes are supposed to be presented to DC engine, which is 
>> > in common to both windows and linux.
>> >
>> > Signed-off-by: Shirish S <shiris...@amd.com>
>> > Signed-off-by: Pratik Vishwakarma <pratik.vishwaka...@amd.com>
>
> Makes sense.
>> > ---
>> >  include/drm/drm_atomic.h | 22 ++
>> >  1 file changed, 22 insertions(+)
>> >
>> > diff --git a/include/drm/drm_atomic.h b/include/drm/drm_atomic.h 
>> > index cf13842..b947930 100644
>> > --- a/include/drm/drm_atomic.h
>> > +++ b/include/drm/drm_atomic.h
>> > @@ -754,6 +754,28 @@ void drm_state_dump(struct drm_device *dev, struct 
>> > drm_printer *p);
>> >   (new_plane_state) = 
>> > (__state)->planes[__i].new_state, 1))
>> >
>> >  /**
>> > + * for_each_oldnew_plane_in_state_reverse - iterate over all 
>> > + planes in an atomic
>> > + * update in reverse order
>
> Are you sure this renders correctly in kernel-doc? Iirc you have to indent 
> the continuation line.
>
> Assuming this is fixed:
>
> Reviewed-by: Daniel Vetter <daniel.vet...@ffwll.ch>
>
>> > + * @__state:  drm_atomic_state pointer
>> > + * @plane:  drm_plane iteration cursor
>> > + * @old_plane_state:  drm_plane_state iteration cursor for 
>> > +the old state
>> > + * @new_plane_state:  drm_plane_state iteration cursor for 
>> > +the new state
>> > + * @__i: int iteration cursor, for macro-internal use
>> > + *
>> > + * This iterates over all planes in an atomic update in reverse 
>> > +order,
>> > + * tracking both old and  new state. This is useful in places 
>> > +where the
>> > + * state delta needs to be considered, for example in atomic check 
>> > functions.
>> > + */
>> > +#define for_each_oldnew_plane_in_state_reverse(__state, plane, 
>> > old_plane_state, new_plane_state, __i) \
>> > +   (for ((__i) = ((__state)->dev->mode_config.num_total_plane - 1);   
>> >  \
>> > +(__i) >= 0;\
>> > +(__i)--)   \
>> > +   for_each_if ((__state)->planes[__i].ptr &&  \
>> > +((plane) = (__state)->planes[__i].ptr, \
>> > + (old_plane_state) = 
>>

[PATCH] drm/atomic: Add new reverse iterator over all plane state (V2)

2018-02-28 Thread S, Shirish
From: Shirish S 

Add reverse iterator for_each_oldnew_plane_in_state_reverse to complement the 
for_each_oldnew_plane_in_state way of reading plane states.

The plane states are required to be read in reverse order for amd drivers, 
because the z order convention followed in linux is opposite to how the planes 
are supposed to be presented to the DC engine, which is common to both windows 
and linux.

V2: fix compile time errors due to -Werror flag.

Signed-off-by: Shirish S 
Signed-off-by: Pratik Vishwakarma 
---
 include/drm/drm_atomic.h | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/include/drm/drm_atomic.h b/include/drm/drm_atomic.h
index cf13842..3fe8dde 100644
--- a/include/drm/drm_atomic.h
+++ b/include/drm/drm_atomic.h
@@ -754,6 +754,28 @@ void drm_state_dump(struct drm_device *dev, struct 
drm_printer *p);
  (new_plane_state) = 
(__state)->planes[__i].new_state, 1))
 
 /**
+ * for_each_oldnew_plane_in_state_reverse - iterate over all planes in 
+an atomic
+ * update in reverse order
+ * @__state:  drm_atomic_state pointer
+ * @plane:  drm_plane iteration cursor
+ * @old_plane_state:  drm_plane_state iteration cursor for the 
+old state
+ * @new_plane_state:  drm_plane_state iteration cursor for the 
+new state
+ * @__i: int iteration cursor, for macro-internal use
+ *
+ * This iterates over all planes in an atomic update in reverse order,
+ * tracking both old and  new state. This is useful in places where the
+ * state delta needs to be considered, for example in atomic check functions.
+ */
+#define for_each_oldnew_plane_in_state_reverse(__state, plane, 
old_plane_state, new_plane_state, __i) \
+   for ((__i) = ((__state)->dev->mode_config.num_total_plane - 1); \
+(__i) >= 0;\
+(__i)--)   \
+   for_each_if ((__state)->planes[__i].ptr &&  \
+((plane) = (__state)->planes[__i].ptr, \
+ (old_plane_state) = 
(__state)->planes[__i].old_state,\
+ (new_plane_state) = 
(__state)->planes[__i].new_state, 1))
+
+/**
  * for_each_old_plane_in_state - iterate over all planes in an atomic update
  * @__state:  drm_atomic_state pointer
  * @plane:  drm_plane iteration cursor
--
2.7.4
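The shape of the macro is easier to see on a plain array of pointers. The following is an illustrative analogue, not DRM code: it counts the index down and skips empty slots with a guard, mirroring the for_each_if() condition in the patch above:

```c
#include <stddef.h>

/* Illustrative analogue of for_each_oldnew_plane_in_state_reverse:
 * walk an array of pointers from the last index down to 0 and run
 * the loop body only for non-NULL entries, mirroring the
 * for_each_if((__state)->planes[__i].ptr && ...) guard. */
#define for_each_entry_reverse(arr, n, item, i)		\
	for ((i) = (int)(n) - 1; (i) >= 0; (i)--)	\
		if (((item) = (arr)[(i)]) != NULL)

/* Sum non-NULL entries in reverse, recording the visit order. */
int sum_reverse(int **arr, size_t n, int *order, int *count)
{
	int *item;
	int i, sum = 0;

	*count = 0;
	for_each_entry_reverse(arr, n, item, i) {
		order[(*count)++] = i;	/* highest index visited first */
		sum += *item;
	}
	return sum;
}
```

For an array `{ &a, NULL, &b }` the visit order is index 2 then index 0, which is the bottom-to-top ordering the DC engine expects from the plane list.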



[PATCH] drm/amd/display: disable CRTCs with NULL FB on their primary plane (V2)

2018-02-28 Thread S, Shirish

From: Shirish S 

The below commit

"drm/atomic: Try to preserve the crtc enabled state in drm_atomic_remove_fb, v2"

introduces a slight behavioral change to rmfb. Instead of disabling a crtc when 
the primary plane is disabled, it now preserves it.

This change leads to BUG hit while performing atomic commit on amd driver.

As a fix this patch ensures that we disable the CRTC's with NULL FB by 
returning -EINVAL and hence triggering fall back to the old behavior and 
turning off the crtc in atomic_remove_fb().

V2: Added error check for plane_state and removed sanity check for crtc.

Signed-off-by: Shirish S 
Signed-off-by: Pratik Vishwakarma 
Reviewed-by: Harry Wentland 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 28 +++
 1 file changed, 28 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 511cd58..e0c02c3 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4803,6 +4803,30 @@ static int dm_update_planes_state(struct dc *dc,
return ret;
 }
 
+static int dm_atomic_check_plane_state_fb(struct drm_atomic_state *state,
+ struct drm_crtc *crtc)
+{
+   struct drm_plane *plane;
+   struct drm_crtc_state *crtc_state;
+
+   WARN_ON(!drm_atomic_get_new_crtc_state(state, crtc));
+
+   drm_for_each_plane_mask(plane, state->dev, crtc->state->plane_mask) {
+   struct drm_plane_state *plane_state =
+   drm_atomic_get_plane_state(state, plane);
+
+   if (IS_ERR(plane_state))
+   return -EDEADLK;
+
+   crtc_state = drm_atomic_get_crtc_state(plane_state->state, 
crtc);
+   if (crtc->primary == plane && crtc_state->active) {
+   if (!plane_state->fb)
+   return -EINVAL;
+   }
+   }
+   return 0;
+}
+
 static int amdgpu_dm_atomic_check(struct drm_device *dev,
  struct drm_atomic_state *state)
 {
@@ -4826,6 +4850,10 @@ static int amdgpu_dm_atomic_check(struct drm_device *dev,
goto fail;
 
for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state, 
new_crtc_state, i) {
+   ret = dm_atomic_check_plane_state_fb(state, crtc);
+   if (ret)
+   goto fail;
+
if (!drm_atomic_crtc_needs_modeset(new_crtc_state) &&
!new_crtc_state->color_mgmt_changed)
continue;
--
2.7.4
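The check added by the patch reduces to a small predicate. This is a toy model in plain C with hypothetical names, not the DRM types: an active CRTC whose primary plane has no framebuffer is rejected with -EINVAL, triggering the fallback path that disables the CRTC in atomic_remove_fb():

```c
#include <stdbool.h>
#include <errno.h>

/* Toy model of the rejection rule in dm_atomic_check_plane_state_fb():
 * reject (-EINVAL) only the combination "CRTC active, primary plane
 * without a framebuffer"; every other state is acceptable. */
int check_primary_fb(bool crtc_active, bool primary_has_fb)
{
	if (crtc_active && !primary_has_fb)
		return -EINVAL;
	return 0;
}
```

Returning -EINVAL from atomic check here is what restores the old rmfb behaviour: the drm core falls back to turning the CRTC off instead of committing a NULL framebuffer.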



[PATCH] drm/atomic: Add new reverse iterator over all plane state

2018-02-28 Thread S, Shirish
From: Shirish S 

Add reverse iterator "for_each_oldnew_plane_in_state_reverse" to
complement "for_each_oldnew_plane_in_state" way of reading plane
states.

The plane states are required to be read in reverse order for
amdgpu, as the z order convention followed in linux is
opposite to how the planes are supposed to be presented to DC
engine, which is in common to both windows and linux.

Signed-off-by: Shirish S 
Signed-off-by: Pratik Vishwakarma 
---
 include/drm/drm_atomic.h | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/include/drm/drm_atomic.h b/include/drm/drm_atomic.h
index cf13842..b947930 100644
--- a/include/drm/drm_atomic.h
+++ b/include/drm/drm_atomic.h
@@ -754,6 +754,28 @@ void drm_state_dump(struct drm_device *dev, struct 
drm_printer *p);
  (new_plane_state) = 
(__state)->planes[__i].new_state, 1))
 
 /**
+ * for_each_oldnew_plane_in_state_reverse - iterate over all planes in an 
atomic
+ * update in reverse order
+ * @__state:  drm_atomic_state pointer
+ * @plane:  drm_plane iteration cursor
+ * @old_plane_state:  drm_plane_state iteration cursor for the old state
+ * @new_plane_state:  drm_plane_state iteration cursor for the new state
+ * @__i: int iteration cursor, for macro-internal use
+ *
+ * This iterates over all planes in an atomic update in reverse order,
+ * tracking both old and  new state. This is useful in places where the
+ * state delta needs to be considered, for example in atomic check functions.
+ */
+#define for_each_oldnew_plane_in_state_reverse(__state, plane, 
old_plane_state, new_plane_state, __i) \
+   (for ((__i) = ((__state)->dev->mode_config.num_total_plane - 1);
\
+(__i) >= 0;\
+(__i)--)   \
+   for_each_if ((__state)->planes[__i].ptr &&  \
+((plane) = (__state)->planes[__i].ptr, \
+ (old_plane_state) = 
(__state)->planes[__i].old_state,\
+ (new_plane_state) = 
(__state)->planes[__i].new_state, 1)))
+
+/**
  * for_each_old_plane_in_state - iterate over all planes in an atomic update
  * @__state:  drm_atomic_state pointer
  * @plane:  drm_plane iteration cursor
-- 
2.7.4



FW: [PATCH] drm/amd/display: check for ipp before calling cursor operations

2018-02-26 Thread S, Shirish

From: Shirish S 

Currently all cursor related calls are made to all pipes that are attached 
to a particular stream.
This is not applicable to pipes that do not have a cursor plane initialised, 
like underlay.
Hence this patch allows cursor related operations on a pipe only if ipp is 
available on that particular pipe.

The check is added to set_cursor_position & set_cursor_attribute.

Signed-off-by: Shirish S 
Reviewed-by: Harry Wentland 
---
 drivers/gpu/drm/amd/display/dc/core/dc_stream.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_stream.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_stream.c
index 87a193a..cd58197 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_stream.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_stream.c
@@ -198,7 +198,8 @@ bool dc_stream_set_cursor_attributes(
for (i = 0; i < MAX_PIPES; i++) {
		struct pipe_ctx *pipe_ctx = &res_ctx->pipe_ctx[i];
 
-   if (pipe_ctx->stream != stream || (!pipe_ctx->plane_res.xfm && 
!pipe_ctx->plane_res.dpp))
+   if (pipe_ctx->stream != stream || (!pipe_ctx->plane_res.xfm &&
+   !pipe_ctx->plane_res.dpp) || !pipe_ctx->plane_res.ipp)
continue;
if (pipe_ctx->top_pipe && pipe_ctx->plane_state != 
pipe_ctx->top_pipe->plane_state)
continue;
@@ -237,7 +238,8 @@ bool dc_stream_set_cursor_position(
if (pipe_ctx->stream != stream ||
(!pipe_ctx->plane_res.mi  && 
!pipe_ctx->plane_res.hubp) ||
!pipe_ctx->plane_state ||
-   (!pipe_ctx->plane_res.xfm && 
!pipe_ctx->plane_res.dpp))
+   (!pipe_ctx->plane_res.xfm && 
!pipe_ctx->plane_res.dpp) ||
+   !pipe_ctx->plane_res.ipp)
continue;
 
core_dc->hwss.set_cursor_position(pipe_ctx);
--
2.7.4



[PATCH] drm/amd/display: make dm_dp_aux_transfer return payload bytes instead of size

2018-02-26 Thread S, Shirish

From: Shirish S 

The drm layer expects aux->transfer() to return the number of payload bytes read.
Currently dm_dp_aux_transfer() returns the payload size, which does not get
updated during the read, hence not giving the drm layer the right data to parse
the EDID. This leads the drm layer to conclude that the EDID is bad, so some
monitors/devices don't get detected properly.

This patch changes the return value of dm_dp_aux_transfer() to the actual bytes
read during DP_AUX_NATIVE_READ & DP_AUX_I2C_READ.

Signed-off-by: Shirish S 
Reviewed-by: Harry Wentland 
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c   |  9 +
 drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c |  7 ---
 drivers/gpu/drm/amd/display/dc/i2caux/aux_engine.c| 15 ++-
 drivers/gpu/drm/amd/display/dc/i2caux/i2caux.c|  1 +
 drivers/gpu/drm/amd/display/dc/inc/dc_link_ddc.h  |  2 +-
 5 files changed, 13 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
index 1e8a21b..39cfe0f 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
@@ -83,17 +83,18 @@ static ssize_t dm_dp_aux_transfer(struct drm_dp_aux *aux,
enum i2c_mot_mode mot = (msg->request & DP_AUX_I2C_MOT) ?
I2C_MOT_TRUE : I2C_MOT_FALSE;
enum ddc_result res;
+   ssize_t read_bytes;
 
switch (msg->request & ~DP_AUX_I2C_MOT) {
case DP_AUX_NATIVE_READ:
-   res = dal_ddc_service_read_dpcd_data(
+   read_bytes = dal_ddc_service_read_dpcd_data(
TO_DM_AUX(aux)->ddc_service,
false,
I2C_MOT_UNDEF,
msg->address,
msg->buffer,
msg->size);
-   break;
+   return read_bytes;
case DP_AUX_NATIVE_WRITE:
res = dal_ddc_service_write_dpcd_data(
TO_DM_AUX(aux)->ddc_service,
@@ -104,14 +105,14 @@ static ssize_t dm_dp_aux_transfer(struct drm_dp_aux *aux,
msg->size);
break;
case DP_AUX_I2C_READ:
-   res = dal_ddc_service_read_dpcd_data(
+   read_bytes = dal_ddc_service_read_dpcd_data(
TO_DM_AUX(aux)->ddc_service,
true,
mot,
msg->address,
msg->buffer,
msg->size);
-   break;
+   return read_bytes;
case DP_AUX_I2C_WRITE:
res = dal_ddc_service_write_dpcd_data(
TO_DM_AUX(aux)->ddc_service,
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c
index d5294798b..49c2fac 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c
@@ -629,7 +629,7 @@ bool dal_ddc_service_query_ddc_data(
return ret;
 }
 
-enum ddc_result dal_ddc_service_read_dpcd_data(
+ssize_t dal_ddc_service_read_dpcd_data(
struct ddc_service *ddc,
bool i2c,
enum i2c_mot_mode mot,
@@ -660,8 +660,9 @@ enum ddc_result dal_ddc_service_read_dpcd_data(
if (dal_i2caux_submit_aux_command(
ddc->ctx->i2caux,
ddc->ddc_pin,
-		&command))
-		return DDC_RESULT_SUCESSFULL;
+		&command)) {
+		return (ssize_t)command.payloads->length;
+	}
 
return DDC_RESULT_FAILED_OPERATION;
 }
diff --git a/drivers/gpu/drm/amd/display/dc/i2caux/aux_engine.c 
b/drivers/gpu/drm/amd/display/dc/i2caux/aux_engine.c
index 0b1db48..9c42fe5 100644
--- a/drivers/gpu/drm/amd/display/dc/i2caux/aux_engine.c
+++ b/drivers/gpu/drm/amd/display/dc/i2caux/aux_engine.c
@@ -126,20 +126,8 @@ static void process_read_reply(
ctx->status =
I2CAUX_TRANSACTION_STATUS_FAILED_PROTOCOL_ERROR;
ctx->operation_succeeded = false;
-   } else if (ctx->returned_byte < ctx->current_read_length) {
-   ctx->current_read_length -= ctx->returned_byte;
-
-   ctx->offset += ctx->returned_byte;
-
-   ++ctx->invalid_reply_retry_aux_on_ack;
-
-   if (ctx->invalid_reply_retry_aux_on_ack >
-   AUX_INVALID_REPLY_RETRY_COUNTER) {
-   ctx->status =
-   I2CAUX_TRANSACTION_STATUS_FAILED_PROTOCOL_ERROR;
-   

Re: [PATCH] drm/amdgpu: disable coarse grain clockgating for ST

2018-01-29 Thread S, Shirish
CC:arindam.n...@amd.com


[PATCH] drm/amdgpu: disable coarse grain clockgating for ST

2018-01-24 Thread S, Shirish

From: Shirish S 

The CGCG feature on Stoney is causing GFX related issues such as freezes
and blank outs.

Signed-off-by: Shirish S 
---
 drivers/gpu/drm/amd/amdgpu/vi.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c
b/drivers/gpu/drm/amd/amdgpu/vi.c index 3b66e1a..ebfac67 100644
--- a/drivers/gpu/drm/amd/amdgpu/vi.c
+++ b/drivers/gpu/drm/amd/amdgpu/vi.c
@@ -1045,7 +1045,6 @@ static int vi_common_early_init(void *handle)
AMD_CG_SUPPORT_GFX_CP_LS |
AMD_CG_SUPPORT_GFX_CGTS |
AMD_CG_SUPPORT_GFX_CGTS_LS |
-   AMD_CG_SUPPORT_GFX_CGCG |
AMD_CG_SUPPORT_GFX_CGLS |
AMD_CG_SUPPORT_BIF_LS |
AMD_CG_SUPPORT_HDP_MGCG |
--
2.7.4



[PATCH] drm/amd/display: remove usage of legacy_cursor_update

2017-11-22 Thread S, Shirish

From: Shirish S 

Currently the atomic check code uses legacy_cursor_update to differentiate
whether the cursor plane is being requested by user space, which is not
required as we shall be updating a plane only if a modeset is
requested/required.

Have tested that the cursor plane and underlay get updated seamlessly, without
any lag or frame drops.

Signed-off-by: Shirish S 
Reviewed-by: Harry Wentland 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 41 +++
 1 file changed, 12 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 8638f1c..2df2e32 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4752,8 +4752,6 @@ static int dm_update_planes_state(struct dc *dc,  static 
int amdgpu_dm_atomic_check(struct drm_device *dev,
  struct drm_atomic_state *state)
 {
-   int i;
-   int ret;
struct amdgpu_device *adev = dev->dev_private;
struct dc *dc = adev->dm.dc;
struct dm_atomic_state *dm_state = to_dm_atomic_state(state); @@ 
-4761,6 +4759,7 @@ static int amdgpu_dm_atomic_check(struct drm_device *dev,
struct drm_connector_state *old_con_state, *new_con_state;
struct drm_crtc *crtc;
struct drm_crtc_state *old_crtc_state, *new_crtc_state;
+   int ret, i;
 
/*
 * This bool will be set for true for any modeset/reset @@ -4772,36 
+4771,20 @@ static int amdgpu_dm_atomic_check(struct drm_device *dev,
if (ret)
goto fail;
 
-   /*
-* legacy_cursor_update should be made false for SoC's having
-* a dedicated hardware plane for cursor in amdgpu_dm_atomic_commit(),
-* otherwise for software cursor plane,
-* we should not add it to list of affected planes.
-*/
-   if (state->legacy_cursor_update) {
-   for_each_new_crtc_in_state(state, crtc, new_crtc_state, i) {
-   if (new_crtc_state->color_mgmt_changed) {
-   ret = drm_atomic_add_affected_planes(state, 
crtc);
-   if (ret)
-   goto fail;
-   }
-   }
-   } else {
-   for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state, 
new_crtc_state, i) {
-   if (!drm_atomic_crtc_needs_modeset(new_crtc_state))
-   continue;
+   for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state, 
new_crtc_state, i) {
+   if (!drm_atomic_crtc_needs_modeset(new_crtc_state))
+   continue;
 
-   if (!new_crtc_state->enable)
-   continue;
+   if (!new_crtc_state->enable)
+   continue;
 
-   ret = drm_atomic_add_affected_connectors(state, crtc);
-   if (ret)
-   return ret;
+   ret = drm_atomic_add_affected_connectors(state, crtc);
+   if (ret)
+   return ret;
 
-   ret = drm_atomic_add_affected_planes(state, crtc);
-   if (ret)
-   goto fail;
-   }
+   ret = drm_atomic_add_affected_planes(state, crtc);
+   if (ret)
+   goto fail;
}
 
dm_state->context = dc_create_state();
--
2.7.4



[PATCH] drm/amd/display: check plane state before validating fbc

2017-11-21 Thread S, Shirish
From: Shirish S 

While validating fbc, the array_mode of the pipe is accessed without checking
that a plane_state exists for it, causing a null pointer dereference followed
by a reboot when a crtc associated with an external display (not connected) is
page flipped.

This patch adds a check for plane_state before using it to validate fbc.

Signed-off-by: Shirish S 
Reviewed-by: Roman Li 
---
 drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c 
b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
index ee3b944..a6cd63a 100644
--- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_hw_sequencer.c
@@ -1724,6 +1724,10 @@ static enum dc_status validate_fbc(struct dc *dc,
if (pipe_ctx->stream->sink->link->psr_enabled)
return DC_ERROR_UNEXPECTED;
 
+   /* Nothing to compress */
+   if (!pipe_ctx->plane_state)
+   return DC_ERROR_UNEXPECTED;
+
/* Only for non-linear tiling */
if (pipe_ctx->plane_state->tiling_info.gfx8.array_mode == 
DC_ARRAY_LINEAR_GENERAL)
return DC_ERROR_UNEXPECTED;
--
2.7.4



[PATCH] drm/amd/display: no distinct handling of cursor required

2017-11-16 Thread S, Shirish
From: Shirish S 

Currently the atomic check code uses legacy_cursor_update to differentiate
whether the cursor plane is being requested by user space, which is not
required as we shall be updating a plane only if a modeset is
requested/required.

Have tested that the cursor plane and underlay get updated seamlessly, without
any lag or frame drops.

Signed-off-by: Shirish S 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 41 +++
 1 file changed, 12 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 8638f1c..2df2e32 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4752,8 +4752,6 @@ static int dm_update_planes_state(struct dc *dc,  static 
int amdgpu_dm_atomic_check(struct drm_device *dev,
  struct drm_atomic_state *state)
 {
-   int i;
-   int ret;
struct amdgpu_device *adev = dev->dev_private;
struct dc *dc = adev->dm.dc;
struct dm_atomic_state *dm_state = to_dm_atomic_state(state); @@ 
-4761,6 +4759,7 @@ static int amdgpu_dm_atomic_check(struct drm_device *dev,
struct drm_connector_state *old_con_state, *new_con_state;
struct drm_crtc *crtc;
struct drm_crtc_state *old_crtc_state, *new_crtc_state;
+   int ret, i;
 
/*
 * This bool will be set for true for any modeset/reset @@ -4772,36 
+4771,20 @@ static int amdgpu_dm_atomic_check(struct drm_device *dev,
if (ret)
goto fail;
 
-   /*
-* legacy_cursor_update should be made false for SoC's having
-* a dedicated hardware plane for cursor in amdgpu_dm_atomic_commit(),
-* otherwise for software cursor plane,
-* we should not add it to list of affected planes.
-*/
-   if (state->legacy_cursor_update) {
-   for_each_new_crtc_in_state(state, crtc, new_crtc_state, i) {
-   if (new_crtc_state->color_mgmt_changed) {
-   ret = drm_atomic_add_affected_planes(state, 
crtc);
-   if (ret)
-   goto fail;
-   }
-   }
-   } else {
-   for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state, 
new_crtc_state, i) {
-   if (!drm_atomic_crtc_needs_modeset(new_crtc_state))
-   continue;
+   for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state, 
new_crtc_state, i) {
+   if (!drm_atomic_crtc_needs_modeset(new_crtc_state))
+   continue;
 
-   if (!new_crtc_state->enable)
-   continue;
+   if (!new_crtc_state->enable)
+   continue;
 
-   ret = drm_atomic_add_affected_connectors(state, crtc);
-   if (ret)
-   return ret;
+   ret = drm_atomic_add_affected_connectors(state, crtc);
+   if (ret)
+   return ret;
 
-   ret = drm_atomic_add_affected_planes(state, crtc);
-   if (ret)
-   goto fail;
-   }
+   ret = drm_atomic_add_affected_planes(state, crtc);
+   if (ret)
+   goto fail;
}
 
dm_state->context = dc_create_state();
--
2.7.4



RE: [PATCH] drm/amd/display: fix static checker warning

2017-11-16 Thread S, Shirish
Done, applied to amd-staging-drm-next.

Thanks.

Regards,
Shirish S


-Original Message-
From: Wentland, Harry 
Sent: Tuesday, November 14, 2017 8:56 PM
To: S, Shirish <shiris...@amd.com>; amd-gfx@lists.freedesktop.org
Cc: Dan Carpenter <dan.carpen...@oracle.com>
Subject: Re: [PATCH] drm/amd/display: fix static checker warning

On 2017-11-10 04:14 AM, S, Shirish wrote:
> From: Shirish S <shiris...@amd.com>
> 
> This patch fixes static checker warning of
> "warn: cast after binop" introduced by
> 56087b31 drm/amd/display: fix high part address in 
> dm_plane_helper_prepare_fb()
> 
> Signed-off-by: Shirish S <shiris...@amd.com>

Reviewed-by: Harry Wentland <harry.wentl...@amd.com>

Feel free to push to amd-staging-drm-next at your leisure.

Harry

> ---
>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index ed8b7524..0537523e 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -2955,7 +2955,7 @@ static int dm_plane_helper_prepare_fb(struct drm_plane 
> *plane,
>   = 
> lower_32_bits(afb->address);
>   
> plane_state->address.video_progressive.luma_addr.high_part
>   = 
> upper_32_bits(afb->address);
> - chroma_addr = afb->address + (u64)(awidth * 
> new_state->fb->height);
> + chroma_addr = afb->address + (u64)awidth * 
> new_state->fb->height;
>   
> plane_state->address.video_progressive.chroma_addr.low_part
>   = 
> lower_32_bits(chroma_addr);
>   
> plane_state->address.video_progressive.chroma_addr.high_part
> 


WARN_ON() on every commit

2017-11-13 Thread S, Shirish
Hi All,

I see the below WARN_ON() while rendering on eDP on the ST platform with the
https://cgit.freedesktop.org/~airlied/linux/log/?h=drm-next-amd-dc-staging
branch:

WARNING kernel: [  344.159552] WARNING: CPU: 1 PID: 899 at 
/mnt/host/source/src/third_party/kernel/v4.12/drivers/gpu/drm/ttm/ttm_bo_
vm.c:287 ttm_bo_vm_open+0x2b/0x38
[  344.159552] Modules linked in: ccm cmac rfcomm uinput rtsx_pci_sdmmc xt_nat 
lzo lzo_compress ath10k_pci ath10k_co
re i2c_piix4 mac80211 zram bridge designware_i2s stp ath rtsx_pci llc acpi_als 
ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat fuse 
xt_mark cfg8
0211 ip6table_filter iio_trig_sysfs cros_ec_sensors 
industrialio_triggered_buffer kfifo_buf cros_ec_sensors_core industrialio 
ax88179_178a usbnet mii btusb btrtl btb
cm btintel bluetooth ecdh_generic joydev uvcvideo videobuf2_vmalloc 
videobuf2_memops videobuf2_v4l2 videobuf2_core
WARNING kernel: [  344.159594] CPU: 1 PID: 899 Comm: TaskSchedulerSi Tainted: G 
   W   4.14.0-00021-g4448b9a68413 #467
WARNING kernel: [  344.159595] Hardware name: Google Kahlee/Kahlee, BIOS 
Google_Kahlee.10122.0.0 11/10/2017
WARNING kernel: [  344.159597] task: 967e1f75 task.stack: 
a4aa00d78000
WARNING kernel: [  344.159600] RIP: 0010:ttm_bo_vm_open+0x2b/0x38
WARNING kernel: [  344.159602] RSP: 0018:a4aa00d7bdb0 EFLAGS: 00010202
WARNING kernel: [  344.159605] RAX: 967d6505e858 RBX: 967da2415100 RCX: 
967e2a3b2830
WARNING kernel: [  344.159606] RDX: 967da679bb40 RSI: 967e29f39510 RDI: 
967d653682e0
WARNING kernel: [  344.159608] RBP: a4aa00d7bdb0 R08: 87932194 R09: 

WARNING kernel: [  344.159610] R10: 0001e1b1 R11: 0001e168 R12: 
967da6789d00
WARNING kernel: [  344.159611] R13: 967e274e3b80 R14: 967d653682e0 R15: 

WARNING kernel: [  344.159614] FS:  7ddd0559a700() 
GS:967e2ed0() knlGS:
WARNING kernel: [  344.159616] CS:  0010 DS:  ES:  CR0: 80050033
WARNING kernel: [  344.159618] CR2: 162eb0bebb18 CR3: 00012133e000 CR4: 
001406e0
WARNING kernel: [  344.159619] Call Trace:
WARNING kernel: [  344.159623]  copy_process.part.49+0xf47/0x186e
WARNING kernel: [  344.159627]  _do_fork+0xcd/0x2d1
WARNING kernel: [  344.159631]  ? __might_fault+0x35/0x37
WARNING kernel: [  344.159634]  ? _copy_to_user+0x5e/0x6b
WARNING kernel: [  344.159637]  SyS_clone+0x19/0x1b
WARNING kernel: [  344.159639]  do_syscall_64+0x52/0x61
WARNING kernel: [  344.159642]  entry_SYSCALL64_slow_path+0x25/0x25
WARNING kernel: [  344.159644] RIP: 0033:0x7ddd1224be3a
WARNING kernel: [  344.159646] RSP: 002b:7ddd05598ce0 EFLAGS: 0246 
ORIG_RAX: 0038
WARNING kernel: [  344.159649] RAX: ffda RBX: 7ddd05598ce0 RCX: 
7ddd1224be3a
WARNING kernel: [  344.159651] RDX:  RSI:  RDI: 
01200011
WARNING kernel: [  344.159652] RBP: 7ddd05598d50 R08: 02e0 R09: 
0383
WARNING kernel: [  344.159654] R10: 7ddd0559a9d0 R11: 0246 R12: 

WARNING kernel: [  344.159656] R13: 7ddd05598d00 R14: 7ddd05598f01 R15: 
162eb3b438a0
WARNING kernel: [  344.159658] Code: 0f 1f 44 00 00 55 48 8b 87 a8 00 00 00 48 
8b 97 a0 00 00 00 48 89 e5 48 8b 48 08 48 8b b2 30 01
00 00 48 39 b1 68 08 00 00 74 02 <0f> ff 48 8d 78 30 e8 59 ff ff ff 5d c3 0f 1f 
44 00 00 55 48 89


Does anybody know the fix for it?

Regards,
Shirish S



FW: [PATCH] drm/amd/display: fix static checker warning

2017-11-09 Thread S, Shirish

On 11/7/2017 2:06 PM, Michel Dänzer wrote:
> On 07/11/17 04:29 AM, S, Shirish wrote:
>> From: Shirish S <shiris...@amd.com>
>>
>> This patch fixes static checker warning of
>> "warn: cast after binop" introduced by
>> 4d3e00dad80a: "drm/amd/display : add high part address calculation for 
>> underlay"
>>
>> Signed-off-by: Shirish S <shiris...@amd.com>
>> ---
>>   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
>> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>> index a87e5ac..e1bdf5e 100644
>> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>> @@ -1827,7 +1827,7 @@ static int fill_plane_attributes_from_fb(struct 
>> amdgpu_device *adev,
>>  = lower_32_bits(fb_location);
>>  plane_state->address.video_progressive.luma_addr.high_part
>>  = upper_32_bits(fb_location);
>> -chroma_addr = fb_location + (u64)(awidth * fb->height);
>> +chroma_addr = fb_location + (u64)awidth * fb->height;
>>  plane_state->address.video_progressive.chroma_addr.low_part
>>  = lower_32_bits(chroma_addr);
>>  plane_state->address.video_progressive.chroma_addr.high_part
>> @@ -2959,7 +2959,7 @@ static int dm_plane_helper_prepare_fb(struct drm_plane 
>> *plane,
>>  = 
>> lower_32_bits(afb->address);
>>  
>> plane_state->address.video_progressive.luma_addr.high_part
>>  = 
>> upper_32_bits(afb->address);
>> -chroma_addr = afb->address + (u64)(awidth * 
>> new_state->fb->height);
>> +chroma_addr = afb->address + (u64)awidth * 
>> new_state->fb->height;
>>  
>> plane_state->address.video_progressive.chroma_addr.low_part
>>  = 
>> lower_32_bits(chroma_addr);
>>  
>> plane_state->address.video_progressive.chroma_addr.high_part
>>
> This code should really be removed, since fb_location is always 0 now 
> in this function, so the values derived from it cannot be used for 
> anything anyway.
I remember Andrey had some concerns with it; if he is OK with it, I can move it
as a separate patch, for future bisect-ability.

Regards,
Shirish S
>
>



[PATCH] drm/amd/display: fix static checker warning

2017-11-06 Thread S, Shirish
From: Shirish S 

This patch fixes static checker warning of
"warn: cast after binop" introduced by
4d3e00dad80a: "drm/amd/display : add high part address calculation for underlay"

Signed-off-by: Shirish S 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index a87e5ac..e1bdf5e 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -1827,7 +1827,7 @@ static int fill_plane_attributes_from_fb(struct 
amdgpu_device *adev,
= lower_32_bits(fb_location);
plane_state->address.video_progressive.luma_addr.high_part
= upper_32_bits(fb_location);
-   chroma_addr = fb_location + (u64)(awidth * fb->height);
+   chroma_addr = fb_location + (u64)awidth * fb->height;
plane_state->address.video_progressive.chroma_addr.low_part
= lower_32_bits(chroma_addr);
plane_state->address.video_progressive.chroma_addr.high_part
@@ -2959,7 +2959,7 @@ static int dm_plane_helper_prepare_fb(struct drm_plane 
*plane,
= 
lower_32_bits(afb->address);

plane_state->address.video_progressive.luma_addr.high_part
= 
upper_32_bits(afb->address);
-   chroma_addr = afb->address + (u64)(awidth * 
new_state->fb->height);
+   chroma_addr = afb->address + (u64)awidth * 
new_state->fb->height;

plane_state->address.video_progressive.chroma_addr.low_part
= 
lower_32_bits(chroma_addr);

plane_state->address.video_progressive.chroma_addr.high_part
-- 
2.7.4



[PATCH] amdgpu/dc: Avoid dereferencing NULL pointer

2017-10-27 Thread S, Shirish
> -Original Message-
> From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf 
> Of Drew Davenport
> Sent: Saturday, October 28, 2017 12:05 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander ; Drew Davenport 
> 
> Subject: [PATCH] amdgpu/dc: Avoid dereferencing NULL pointer
>
> crtc is dereferenced from within drm_atomic_get_new_crtc_state, so 
> check for NULL before initializing new_crtc_state.
>
> Signed-off-by: Drew Davenport 
Reviewed-by: Shirish S 
> ---
>   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 9 ++---
>   1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index d0ee1b3b8b5c..5a440fadbe18 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -3874,8 +3874,7 @@ static void amdgpu_dm_commit_planes(struct 
> drm_atomic_state *state,
>   /* update planes when needed */
>   for_each_oldnew_plane_in_state(state, plane, old_plane_state, 
> new_plane_state, i) {
>   struct drm_crtc *crtc = new_plane_state->crtc;
> - struct drm_crtc_state *new_crtc_state =
> - drm_atomic_get_new_crtc_state(state, crtc);
> + struct drm_crtc_state *new_crtc_state;
>   struct drm_framebuffer *fb = new_plane_state->fb;
>   bool pflip_needed;
>   struct dm_plane_state *dm_new_plane_state = 
> to_dm_plane_state(new_plane_state);
> @@ -3885,7 +3884,11 @@ static void amdgpu_dm_commit_planes(struct 
> drm_atomic_state *state,
>   continue;
>   }
>   
> - if (!fb || !crtc || pcrtc != crtc || !new_crtc_state->active)
> + if (!fb || !crtc || pcrtc != crtc)
> + continue;
> +
> + new_crtc_state = drm_atomic_get_new_crtc_state(state, crtc);
> + if (!new_crtc_state->active)
>   continue;
>   
>   pflip_needed = !state->allow_modeset;



[PATCH] drm/amd/display: fix null pointer dereference

2017-10-27 Thread S, Shirish

From: Shirish S 

While setting the cursor position in the MPO case, the input_pixel_processor
is not available for underlay, hence add a check for it to avoid a null
pointer access.

Signed-off-by: Shirish S 
Reviewed-by: Harry Wentland 
Reviewed-by: Tony Cheng 
---
 drivers/gpu/drm/amd/display/dc/core/dc_stream.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_stream.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_stream.c
index 5cf69af..572b885 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_stream.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_stream.c
@@ -288,7 +288,7 @@ bool dc_stream_set_cursor_position(
pos_cpy.enable = false;
 
 
-   if (ipp->funcs->ipp_cursor_set_position != NULL)
+	if (ipp != NULL && ipp->funcs->ipp_cursor_set_position != NULL)
		ipp->funcs->ipp_cursor_set_position(ipp, &pos_cpy, &param);
 
if (mi != NULL && mi->funcs->set_cursor_position != NULL)
--
2.7.4



[PATCH] drm/amd/display: check if modeset is required before adding plane

2017-10-26 Thread S, Shirish

From: Shirish S 

Adding affected planes without checking if a modeset is requested from user
space causes a performance regression in video playback scenarios when
full-screen playback is not composited.

Hence add a check before adding a plane as affected.

bug: https://bugs.freedesktop.org/show_bug.cgi?id=103408

Signed-off-by: Shirish S 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index f0b50d9..e6ec130 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4727,6 +4727,9 @@ static int amdgpu_dm_atomic_check(struct drm_device *dev,
}
} else {
for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state, 
new_crtc_state, i) {
+   if (!drm_atomic_crtc_needs_modeset(new_crtc_state))
+   continue;
+
if (!new_crtc_state->enable)
continue;
 
--
2.7.4



RE: [PATCH] drm/amd/display: assign fb_location only if bo is pinned

2017-10-26 Thread S, Shirish

I have reverted 
[PATCH 2/2] drm/amd/display: cleanup addReq and fix fb_location
and applied 
[PATCH] drm/amd/display: fix high part address in dm_plane_helper_prepare_fb()
onto amd-drm-staging kernel.


Regards,
Shirish S


-Original Message-
From: Michel Dänzer [mailto:mic...@daenzer.net] 
Sent: Wednesday, October 25, 2017 3:54 PM
To: S, Shirish <shiris...@amd.com>; Grodzovsky, Andrey 
<andrey.grodzov...@amd.com>
Cc: Deucher, Alexander <alexander.deuc...@amd.com>; 
dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amd/display: assign fb_location only if bo is pinned

On 25/10/17 12:05 PM, S, Shirish wrote:
> Hi Alex, Michel & Andrey,
> 
>  [PATCH] drm/amd/display: assign fb_location only if bo is pinned  
> [PATCH 2/2] drm/amd/display: cleanup addReq and fix fb_location
> 
> should be dropped and instead:

Since you pushed the latter to amd-staging-drm-next, please revert it there, or 
maybe submit another patch removing all fb_location related code from 
get_fb_info and fill_plane_attributes_from_fb.


> [PATCH] drm/amd/display: fix high part address in 
> dm_plane_helper_prepare_fb()
> 
> should be reviewed .

Reviewed-by: Michel Dänzer <michel.daen...@amd.com>

But please wait for review from DC folks.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer

