Re: [Intel-gfx] [PATCH] drm/i915: Retry reading DPCD when bogus values are read

2014-01-22 Thread Todd Previte

On 11/22/13 1:36 AM, Takashi Iwai wrote:

I got kernel WARNINGs frequently on Haswell laptops complaining about
invalid max DP link bw.  With drm.debug=0x0e, it turned out that the
obtained DPCD is utterly bogus when it happens:
   [drm:intel_dp_get_dpcd], DPCD: 4d 4d 4d 4d 4d 4d 4d 4d 4d 4d 4d 4d 4d 4d 4d

This seems happening intermittently without plug state changes and
even after the first few reads succeeded.  As a workaround, add a
sanity check for such bogus values and retry reading if it hits.

Signed-off-by: Takashi Iwai ti...@suse.de
---

  drivers/gpu/drm/i915/intel_dp.c | 15 ++-
  1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index eb8139da9763..f596353da557 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -2792,13 +2792,26 @@ intel_dp_get_dpcd(struct intel_dp *intel_dp)
struct intel_digital_port *dig_port = dp_to_dig_port(intel_dp);
struct drm_device *dev = dig_port-base.base.dev;
struct drm_i915_private *dev_priv = dev-dev_private;
-
+   int retry = 0;
char dpcd_hex_dump[sizeof(intel_dp-dpcd) * 3];

+ again:
if (intel_dp_aux_native_read_retry(intel_dp, 0x000, intel_dp-dpcd,
   sizeof(intel_dp-dpcd)) == 0)
return false; /* aux transfer failed */

+   /* sanity check: sometimes DPCD contains a series of same bogus value
+* even though the above returned success.  Retry reading in that case.
+*/
+   if (intel_dp-dpcd[0]) {
+   int i;
+   for (i = 1; i  sizeof(intel_dp-dpcd); i++)
+   if (intel_dp-dpcd[i] != intel_dp-dpcd[0])
+   break;
+   if (i = sizeof(intel_dp-dpcd)  retry++  2)
+   goto again;
+   }
+
hex_dump_to_buffer(intel_dp-dpcd, sizeof(intel_dp-dpcd),
   32, 1, dpcd_hex_dump, sizeof(dpcd_hex_dump), false);
DRM_DEBUG_KMS(DPCD: %s\n, dpcd_hex_dump);
I'm in agreement with Jani's assessment. While it does address the 
particular problem at hand, it just looks for duplicate data and issues 
a retry if that specific condition is detected. It seems more 
appropriate that instead of checking for specific failure modes, we 
check to ensure that we have useable values for the basic, critical 
fields required for Displayport.


I would suggest that we validate the first 3 bytes of the DPCD. Those 
fields are as follows:

0x00 - DPCD revision level
Currently can only be 0x10, 0x11 or 0x12
0x01 - Link rate
Also only 3 values currently - 0x06, 0x0A and 0x14
0x02 - Max lane count
For validation purposes, we just need to look at 4:0 and verify 
it's 0x01, 0x02 or 0x04
While bits 6 and 7 are used, they're just feature bits and we 
check them later anyways.


This is kind of what I was thinking:

#define MAX_DPCD_REVISION 0x12
u8 lane_count = DPCD[2]  0x1F;

// Verify DPCD revision is in a valid range. MAX_DPCD_REVISION 
should be updated as necessasry

// to reflect the current supported DPCD version.
if (DPCD[0]  0x10 || DPCD  MAX_DPCD_REVISION)
goto retry;
// These are the only valid values for link rate. More will surely 
be added as Displayport moves forward.

if  !(DPCD[1] == 0x06 || DPCD[1] == 0x0A || DPCD[1] == 0x14)
goto retry;
// Lane count must be 1, 2 or 4. 3 shalt thou not count, and 5 is 
right out...

if !(lane_count == 0x01 || lane_count == 0x02 || lane_count == 0x04)
goto retry;

retry:
retry_count++;
if retry_count  max_retries
goto error
else
goto read_dpcd


-T

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Retry reading DPCD when bogus values are read

2013-12-04 Thread Jani Nikula
On Fri, 22 Nov 2013, Takashi Iwai ti...@suse.de wrote:
 I got kernel WARNINGs frequently on Haswell laptops complaining about
 invalid max DP link bw.  With drm.debug=0x0e, it turned out that the
 obtained DPCD is utterly bogus when it happens:
   [drm:intel_dp_get_dpcd], DPCD: 4d 4d 4d 4d 4d 4d 4d 4d 4d 4d 4d 4d 4d 4d 4d

 This seems happening intermittently without plug state changes and
 even after the first few reads succeeded.  As a workaround, add a
 sanity check for such bogus values and retry reading if it hits.

If it's not related to crtc disable/enable state changes, then I assume
this might happen with any native aux reads at any time, and your patch
would just paper over one instance of it.

Does this occur with several different displays? With the displays this
occurs with, does it also occur on non-Haswell?

Have you observed that you'd ever need to try again more than once?
I.e. does the first retry sometimes return garbage too?

Instinctively I'd say we need to resist the urge to apply this
workaround, and try to find the root cause.

BR,
Jani.


 Signed-off-by: Takashi Iwai ti...@suse.de
 ---

  drivers/gpu/drm/i915/intel_dp.c | 15 ++-
  1 file changed, 14 insertions(+), 1 deletion(-)

 diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
 index eb8139da9763..f596353da557 100644
 --- a/drivers/gpu/drm/i915/intel_dp.c
 +++ b/drivers/gpu/drm/i915/intel_dp.c
 @@ -2792,13 +2792,26 @@ intel_dp_get_dpcd(struct intel_dp *intel_dp)
   struct intel_digital_port *dig_port = dp_to_dig_port(intel_dp);
   struct drm_device *dev = dig_port-base.base.dev;
   struct drm_i915_private *dev_priv = dev-dev_private;
 -
 + int retry = 0;
   char dpcd_hex_dump[sizeof(intel_dp-dpcd) * 3];
  
 + again:
   if (intel_dp_aux_native_read_retry(intel_dp, 0x000, intel_dp-dpcd,
  sizeof(intel_dp-dpcd)) == 0)
   return false; /* aux transfer failed */
  
 + /* sanity check: sometimes DPCD contains a series of same bogus value
 +  * even though the above returned success.  Retry reading in that case.
 +  */
 + if (intel_dp-dpcd[0]) {
 + int i;
 + for (i = 1; i  sizeof(intel_dp-dpcd); i++)
 + if (intel_dp-dpcd[i] != intel_dp-dpcd[0])
 + break;
 + if (i = sizeof(intel_dp-dpcd)  retry++  2)
 + goto again;
 + }
 +
   hex_dump_to_buffer(intel_dp-dpcd, sizeof(intel_dp-dpcd),
  32, 1, dpcd_hex_dump, sizeof(dpcd_hex_dump), false);
   DRM_DEBUG_KMS(DPCD: %s\n, dpcd_hex_dump);
 -- 
 1.8.4.3

 ___
 Intel-gfx mailing list
 Intel-gfx@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Jani Nikula, Intel Open Source Technology Center
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Retry reading DPCD when bogus values are read

2013-12-04 Thread Takashi Iwai
At Wed, 04 Dec 2013 12:17:03 +0200,
Jani Nikula wrote:
 
 On Fri, 22 Nov 2013, Takashi Iwai ti...@suse.de wrote:
  I got kernel WARNINGs frequently on Haswell laptops complaining about
  invalid max DP link bw.  With drm.debug=0x0e, it turned out that the
  obtained DPCD is utterly bogus when it happens:
[drm:intel_dp_get_dpcd], DPCD: 4d 4d 4d 4d 4d 4d 4d 4d 4d 4d 4d 4d 4d 4d 
  4d
 
  This seems happening intermittently without plug state changes and
  even after the first few reads succeeded.  As a workaround, add a
  sanity check for such bogus values and retry reading if it hits.
 
 If it's not related to crtc disable/enable state changes, then I assume
 this might happen with any native aux reads at any time, and your patch
 would just paper over one instance of it.

Yes, exactly, that's the purpose :)

 Does this occur with several different displays? With the displays this
 occurs with, does it also occur on non-Haswell?

I haven't seen on IvyBridge machines, at least.
And I have no other DP monitor, so cannot test right now.

 Have you observed that you'd ever need to try again more than once?
 I.e. does the first retry sometimes return garbage too?

I don't remember whether more than once were needed.


Takashi

 Instinctively I'd say we need to resist the urge to apply this
 workaround, and try to find the root cause.
 
 BR,
 Jani.
 
 
  Signed-off-by: Takashi Iwai ti...@suse.de
  ---
 
   drivers/gpu/drm/i915/intel_dp.c | 15 ++-
   1 file changed, 14 insertions(+), 1 deletion(-)
 
  diff --git a/drivers/gpu/drm/i915/intel_dp.c 
  b/drivers/gpu/drm/i915/intel_dp.c
  index eb8139da9763..f596353da557 100644
  --- a/drivers/gpu/drm/i915/intel_dp.c
  +++ b/drivers/gpu/drm/i915/intel_dp.c
  @@ -2792,13 +2792,26 @@ intel_dp_get_dpcd(struct intel_dp *intel_dp)
  struct intel_digital_port *dig_port = dp_to_dig_port(intel_dp);
  struct drm_device *dev = dig_port-base.base.dev;
  struct drm_i915_private *dev_priv = dev-dev_private;
  -
  +   int retry = 0;
  char dpcd_hex_dump[sizeof(intel_dp-dpcd) * 3];
   
  + again:
  if (intel_dp_aux_native_read_retry(intel_dp, 0x000, intel_dp-dpcd,
 sizeof(intel_dp-dpcd)) == 0)
  return false; /* aux transfer failed */
   
  +   /* sanity check: sometimes DPCD contains a series of same bogus value
  +* even though the above returned success.  Retry reading in that case.
  +*/
  +   if (intel_dp-dpcd[0]) {
  +   int i;
  +   for (i = 1; i  sizeof(intel_dp-dpcd); i++)
  +   if (intel_dp-dpcd[i] != intel_dp-dpcd[0])
  +   break;
  +   if (i = sizeof(intel_dp-dpcd)  retry++  2)
  +   goto again;
  +   }
  +
  hex_dump_to_buffer(intel_dp-dpcd, sizeof(intel_dp-dpcd),
 32, 1, dpcd_hex_dump, sizeof(dpcd_hex_dump), false);
  DRM_DEBUG_KMS(DPCD: %s\n, dpcd_hex_dump);
  -- 
  1.8.4.3
 
  ___
  Intel-gfx mailing list
  Intel-gfx@lists.freedesktop.org
  http://lists.freedesktop.org/mailman/listinfo/intel-gfx
 
 -- 
 Jani Nikula, Intel Open Source Technology Center
 
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Retry reading DPCD when bogus values are read

2013-11-25 Thread Daniel Vetter
On Fri, Nov 22, 2013 at 09:36:09AM +0100, Takashi Iwai wrote:
 I got kernel WARNINGs frequently on Haswell laptops complaining about
 invalid max DP link bw.  With drm.debug=0x0e, it turned out that the
 obtained DPCD is utterly bogus when it happens:
   [drm:intel_dp_get_dpcd], DPCD: 4d 4d 4d 4d 4d 4d 4d 4d 4d 4d 4d 4d 4d 4d 4d
 
 This seems happening intermittently without plug state changes and
 even after the first few reads succeeded.  As a workaround, add a
 sanity check for such bogus values and retry reading if it hits.
 
 Signed-off-by: Takashi Iwai ti...@suse.de

*insert lament about how hw never fails to disappoint*

A few questions:
- Has anyone checked whether the same panel (I guess this is about edp)
  also fails in the same way on other chips? Maybe with a dual gpu
  solution?

- Jesse has a patch somewhere, please test the diff embedded in this bz
  comment:

https://bugs.freedesktop.org/show_bug.cgi?id=67245#c29

  Could be that we have an issue with the panel power timings.

- Shouldn't we instead clamp the values read from dpcd to a sane range?
  Worst case we'll do a few bogus link training rounds ...

Also we really should be extracting this stuff from drivers into common
code. Somewhen next century maybe ;-)

Cheers, Daniel

 ---
 
  drivers/gpu/drm/i915/intel_dp.c | 15 ++-
  1 file changed, 14 insertions(+), 1 deletion(-)
 
 diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
 index eb8139da9763..f596353da557 100644
 --- a/drivers/gpu/drm/i915/intel_dp.c
 +++ b/drivers/gpu/drm/i915/intel_dp.c
 @@ -2792,13 +2792,26 @@ intel_dp_get_dpcd(struct intel_dp *intel_dp)
   struct intel_digital_port *dig_port = dp_to_dig_port(intel_dp);
   struct drm_device *dev = dig_port-base.base.dev;
   struct drm_i915_private *dev_priv = dev-dev_private;
 -
 + int retry = 0;
   char dpcd_hex_dump[sizeof(intel_dp-dpcd) * 3];
  
 + again:
   if (intel_dp_aux_native_read_retry(intel_dp, 0x000, intel_dp-dpcd,
  sizeof(intel_dp-dpcd)) == 0)
   return false; /* aux transfer failed */
  
 + /* sanity check: sometimes DPCD contains a series of same bogus value
 +  * even though the above returned success.  Retry reading in that case.
 +  */
 + if (intel_dp-dpcd[0]) {
 + int i;
 + for (i = 1; i  sizeof(intel_dp-dpcd); i++)
 + if (intel_dp-dpcd[i] != intel_dp-dpcd[0])
 + break;
 + if (i = sizeof(intel_dp-dpcd)  retry++  2)
 + goto again;
 + }
 +
   hex_dump_to_buffer(intel_dp-dpcd, sizeof(intel_dp-dpcd),
  32, 1, dpcd_hex_dump, sizeof(dpcd_hex_dump), false);
   DRM_DEBUG_KMS(DPCD: %s\n, dpcd_hex_dump);
 -- 
 1.8.4.3
 
 ___
 Intel-gfx mailing list
 Intel-gfx@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Retry reading DPCD when bogus values are read

2013-11-25 Thread Takashi Iwai
At Mon, 25 Nov 2013 16:47:25 +0100,
Daniel Vetter wrote:
 
 On Fri, Nov 22, 2013 at 09:36:09AM +0100, Takashi Iwai wrote:
  I got kernel WARNINGs frequently on Haswell laptops complaining about
  invalid max DP link bw.  With drm.debug=0x0e, it turned out that the
  obtained DPCD is utterly bogus when it happens:
[drm:intel_dp_get_dpcd], DPCD: 4d 4d 4d 4d 4d 4d 4d 4d 4d 4d 4d 4d 4d 4d 
  4d
  
  This seems happening intermittently without plug state changes and
  even after the first few reads succeeded.  As a workaround, add a
  sanity check for such bogus values and retry reading if it hits.
  
  Signed-off-by: Takashi Iwai ti...@suse.de
 
 *insert lament about how hw never fails to disappoint*
 
 A few questions:
 - Has anyone checked whether the same panel (I guess this is about edp)
   also fails in the same way on other chips? Maybe with a dual gpu
   solution?

I should have mentioned more clearly that this was about the real DP
monitor, not eDP.  I haven't seen this problem with eDP.

 - Jesse has a patch somewhere, please test the diff embedded in this bz
   comment:
 
 https://bugs.freedesktop.org/show_bug.cgi?id=67245#c29
 
   Could be that we have an issue with the panel power timings.

I guess this is only about eDP?


Takashi

 - Shouldn't we instead clamp the values read from dpcd to a sane range?
   Worst case we'll do a few bogus link training rounds ...
 
 Also we really should be extracting this stuff from drivers into common
 code. Somewhen next century maybe ;-)
 
 Cheers, Daniel
 
  ---
  
   drivers/gpu/drm/i915/intel_dp.c | 15 ++-
   1 file changed, 14 insertions(+), 1 deletion(-)
  
  diff --git a/drivers/gpu/drm/i915/intel_dp.c 
  b/drivers/gpu/drm/i915/intel_dp.c
  index eb8139da9763..f596353da557 100644
  --- a/drivers/gpu/drm/i915/intel_dp.c
  +++ b/drivers/gpu/drm/i915/intel_dp.c
  @@ -2792,13 +2792,26 @@ intel_dp_get_dpcd(struct intel_dp *intel_dp)
  struct intel_digital_port *dig_port = dp_to_dig_port(intel_dp);
  struct drm_device *dev = dig_port-base.base.dev;
  struct drm_i915_private *dev_priv = dev-dev_private;
  -
  +   int retry = 0;
  char dpcd_hex_dump[sizeof(intel_dp-dpcd) * 3];
   
  + again:
  if (intel_dp_aux_native_read_retry(intel_dp, 0x000, intel_dp-dpcd,
 sizeof(intel_dp-dpcd)) == 0)
  return false; /* aux transfer failed */
   
  +   /* sanity check: sometimes DPCD contains a series of same bogus value
  +* even though the above returned success.  Retry reading in that case.
  +*/
  +   if (intel_dp-dpcd[0]) {
  +   int i;
  +   for (i = 1; i  sizeof(intel_dp-dpcd); i++)
  +   if (intel_dp-dpcd[i] != intel_dp-dpcd[0])
  +   break;
  +   if (i = sizeof(intel_dp-dpcd)  retry++  2)
  +   goto again;
  +   }
  +
  hex_dump_to_buffer(intel_dp-dpcd, sizeof(intel_dp-dpcd),
 32, 1, dpcd_hex_dump, sizeof(dpcd_hex_dump), false);
  DRM_DEBUG_KMS(DPCD: %s\n, dpcd_hex_dump);
  -- 
  1.8.4.3
  
  ___
  Intel-gfx mailing list
  Intel-gfx@lists.freedesktop.org
  http://lists.freedesktop.org/mailman/listinfo/intel-gfx
 
 -- 
 Daniel Vetter
 Software Engineer, Intel Corporation
 +41 (0) 79 365 57 48 - http://blog.ffwll.ch
 
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Retry reading DPCD when bogus values are read

2013-11-25 Thread Daniel Vetter
On Mon, Nov 25, 2013 at 4:52 PM, Takashi Iwai ti...@suse.de wrote:
 A few questions:
 - Has anyone checked whether the same panel (I guess this is about edp)
   also fails in the same way on other chips? Maybe with a dual gpu
   solution?

 I should have mentioned more clearly that this was about the real DP
 monitor, not eDP.  I haven't seen this problem with eDP.

Then I'd really like to know what happens on other platforms, both
intel and non-intel.

 - Jesse has a patch somewhere, please test the diff embedded in this bz
   comment:

 https://bugs.freedesktop.org/show_bug.cgi?id=67245#c29

   Could be that we have an issue with the panel power timings.

 I guess this is only about eDP?

Yeah, this is just an eDP-only issue that Jesse's patch addresses.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx