On Sun, May 20, 2012 at 11:07:46AM -0700, Daniel Kurtz wrote:
> On Sun, May 20, 2012 at 8:19 AM, Daniel Vetter <[email protected]> wrote:
> >
> > On Sat, May 19, 2012 at 10:10:12PM +0200, Daniel Vetter wrote:
> > > ... too much risk for flaky edid transfers.
> > >
> > > This regression has been introduced in
> > >
> > > commit e646d5773572bf52017983d758bdf05777dc5600
> > > Author: Daniel Kurtz <[email protected]>
> > > Date:   Fri Mar 30 19:46:38 2012 +0800
> > >
> > >     drm/i915/intel_i2c: always wait for IDLE before clearing NAK
> > >
> > > This patch keeps the improved NAK handling on the hw side, but reverts
> > > the change to return -ENXIO in case the gmbus controller reports a
> > > NAK.
> > >
> > > Cc: Daniel Kurtz <[email protected]>
> >
> > Hi Daniel,
> >
> > Can you please take a look at this one and smash your r-b onto it if you
> > agree?
> >
> > Thanks, Daniel
> >
> > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=49518
> > > Reported-and-Tested-by: Julian Simioni <[email protected]>
> > > Signed-Off-by: Daniel Vetter <[email protected]>
> > > ---
> > >  drivers/gpu/drm/i915/intel_i2c.c |    7 ++++---
> > >  1 file changed, 4 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/intel_i2c.c 
> > > b/drivers/gpu/drm/i915/intel_i2c.c
> > > index e04255e..0588d8e 100644
> > > --- a/drivers/gpu/drm/i915/intel_i2c.c
> > > +++ b/drivers/gpu/drm/i915/intel_i2c.c
> > > @@ -418,10 +418,11 @@ clear_err:
> > >        * If no ACK is received during the address phase of a transaction,
> > >        * the adapter must report -ENXIO.
> > >        * It is not clear what to return if no ACK is received at other 
> > > times.
> > > -      * So, we always return -ENXIO in all NAK cases, to ensure we send
> > > -      * it at least during the one case that is specified.
> > > +      *
> > > +      * Unfortunately we can't afford false positives in returning 
> > > -ENXIO,
> > > +      * hence never return -ENXIO.
> > >        */
> > > -     ret = -ENXIO;
> > > +     ret = i;
> 
> 
> The bugzilla report shows that in the old case, the gmbus failure was:
> [    2.528812] vgaarb: device changed decodes:
> PCI:0000:00:02.0,olddecodes=io+mem,decodes=io+mem:owns=io+mem
> [    2.540753] [drm] GMBUS [i915 gmbus panel] timed out waiting for idle
> [    2.580813] [drm:intel_panel_get_backlight], get backlight PWM = 0
> 
> And now the failure is:
> [    2.523015] vgaarb: device changed decodes:
> PCI:0000:00:02.0,olddecodes=io+mem,decodes=io+mem:owns=io+mem
> [    2.534770] [drm] GMBUS [i915 gmbus panel] timed out after NAK
> [    2.534797] [drm:gmbus_xfer], GMBUS [i915 gmbus panel] NAK for
> addr: 0050 w(1)
> [    2.534883] [drm:intel_panel_get_backlight], get backlight PWM = 0
> 
> From these logs, it looks like in both cases, there is an i2c
> communication error between host and panel.  The difference is that:
>   * in the old case, the return value was 0, which triggered a silent
> retry (the 40ms gap between GMBUS and PWM messages)
>   * but, now that the return value is -ENXIO, the caller does not retry.
> 
> Actually, in the 'new' case, there are two errors happening: a NAK and
> then a timeout after the NAK, while waiting for the controller to
> clear the ACTIVE bit.  I'm not actually sure what causes "timeout
> after NAK", but I think it means the controller is waiting for entire
> transaction to complete (perhaps the STOP bit after NAK?).
> 
> Since this reported issue is happening in this double error path, I'd
> rather a patch that fixes it without disabling the more generic NAK
> path.  Maybe something like this:

Hm, good suggestion. This way, if we get a NAK but no issues later on,
we'll still fail faster if gmbus detected that no device is responding.
Which should speed up boot a bit. I'll run this past the bug reporter.

Thanks, Daniel

> 
> clear_err:
>       /*
>        * Wait for bus to IDLE before clearing NAK.
>        * If we clear the NAK while bus is still active, then it will stay
>        * active and the next transaction may fail.
>        */
> +      ret = -ENXIO;
>       if (wait_for((I915_READ(GMBUS2 + reg_offset) & GMBUS_ACTIVE) == 0,
> -                  10))
> +                  10)) {
>               DRM_DEBUG_KMS("GMBUS [%s] timed out after NAK\n",
>                             adapter->name);
> +             ret = -ETIMEDOUT;  // Or 0 ?
> +     }
> 
>       /* Toggle the Software Clear Interrupt bit. This has the effect
>        * of resetting the GMBUS controller and so clearing the
>        * BUS_ERROR raised by the slave's NAK.
>        */
>       I915_WRITE(GMBUS1 + reg_offset, GMBUS_SW_CLR_INT);
>       I915_WRITE(GMBUS1 + reg_offset, 0);
>       I915_WRITE(GMBUS0 + reg_offset, 0);
> 
>       DRM_DEBUG_KMS("GMBUS [%s] NAK for addr: %04x %c(%d)\n",
>                        adapter->name, msgs[i].addr,
>                        (msgs[i].flags & I2C_M_RD) ? 'r' : 'w', msgs[i].len);
> 
>       /*
>        * If no ACK is received during the address phase of a transaction,
>        * the adapter must report -ENXIO.
>        * It is not clear what to return if no ACK is received at other times.
> -      * So, we always return -ENXIO in all NAK cases, to ensure we send
> -      * it at least during the one case that is specified.
> +      * So, return -ENXIO for NAK after any byte, unless there was a timeout
> +      * while waiting for IDLE after NAK.
>        */
> -     ret = -ENXIO;
>       goto out;
> 
> -Daniel
> 
> >
> > >       goto out;
> > >
> > >  timeout:
> > > --
> > > 1.7.10
> > >
> >
> > --
> > Daniel Vetter
> > Mail: [email protected]
> > Mobile: +41 (0)79 365 57 48

-- 
Daniel Vetter
Mail: [email protected]
Mobile: +41 (0)79 365 57 48
_______________________________________________
Intel-gfx mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Reply via email to