[Intel-gfx] [PATCH 1/2] Give up on edid retries when i2c tells us that bus is not there

2011-10-31 Thread Chris Wilson
On Mon, 17 Oct 2011 11:12:29 -0200, Eugeni Dodonov  wrote:
> This allows to avoid talking to a non-existent bus repeatedly until we
> finally timeout. The non-existent bus is signalled by -ENXIO error,
> provided by i2c_algo_bit:bit_doAddress call.
> 
> As the advantage of such change, all the other routines which use
> drm_get_edid would benefit for this timeout.
> 
> This change should fix
> https://bugs.freedesktop.org/show_bug.cgi?id=41059 and improve overall
> edid detection timing by 10-30% in most cases, and by a much larger margin
> in case of phantom outputs.
> 
> Signed-off-by: Eugeni Dodonov 

Looks like we have reached the conclusion that this simple patch is the
least likely to cause problems and easiest to fix if it does. :)
Reviewed-by: Chris Wilson 
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [Intel-gfx] [PATCH 1/2] Give up on edid retries when i2c tells us that bus is not there

2011-10-31 Thread Chris Wilson
On Mon, 17 Oct 2011 11:12:29 -0200, Eugeni Dodonov eugeni.dodo...@intel.com 
wrote:
 This allows to avoid talking to a non-existent bus repeatedly until we
 finally timeout. The non-existent bus is signalled by -ENXIO error,
 provided by i2c_algo_bit:bit_doAddress call.
 
 As the advantage of such change, all the other routines which use
 drm_get_edid would benefit for this timeout.
 
 This change should fix
 https://bugs.freedesktop.org/show_bug.cgi?id=41059 and improve overall
 edid detection timing by 10-30% in most cases, and by a much larger margin
 in case of phantom outputs.
 
 Signed-off-by: Eugeni Dodonov eugeni.dodo...@intel.com

Looks like we have reached the conclusion that this simple patch is the
least likely to cause problems and easiest to fix if it does. :)
Reviewed-by: Chris Wilson ch...@hchris-wilson.co.uk
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[Intel-gfx] [PATCH 1/2] Give up on edid retries when i2c tells us that bus is not there

2011-10-30 Thread Jean Delvare
Hi Eugeni,

On Mon, 24 Oct 2011 12:40:14 -0200, Eugeni Dodonov wrote:
> On Thu, Oct 20, 2011 at 10:33, Jean Delvare  wrote:
> 
> > Just to clarify: by "connectivity is setup", do you mean code in the
> > driver setting the GPIO pin direction etc., or a display being
> > connected to the graphics card?
> >
> > I admit I am a little surprised. I2C buses should have their lines up
> > by default, thanks to pull-up resistors, and masters and slaves should
> > merely drive the lines low as needed. The absence of slaves should have
> > no impact on the line behavior. If the Intel graphics chips (or the
> > driver) rely on the display to hold the lines high, I'd say this is a
> > seriously broken design.
> >
> > As a side note, I thought that HDMI had the capability of presence
> > detection, so there may be a better way to know if a display is
> > connected or not, and this information could used to dynamically
> > instantiate and delete the I2C buses? That way we could skip probing
> > for EDID when there is no chance of success.
> 
> Yes, I think so too.
> 
> I admit I haven't got to the root of the problem here. My test was really
> simple, borrowed from the test_bus() at i2c-algo-bit.c - without HDMI cable
> plugged in, I was getting the "SDA stuck high" messages; with the cable
> plugged in, I wasn't getting any of those.

Just the HDMI cable, or with a screen at the other end?

Either way, this smells like bad hardware design. The graphics card
itself should pull the I2C bus lines up by default, it shouldn't rely
on a cable (I'm not familiar with HDMI but I'm not sure if that makes
sense at all) or external display (more likely) to do it. But I can
also imagine that this could be a driver issue, maybe the GPIO lines
aren't setup properly by the driver? I'm not familiar enough with the
Intel graphics hardware and driver to tell, you'll know better.

> But in any case, I haven't investigated it deeper in the hardware direction
> after figuring out that drm_get_edid would retry its attempts for retreiving
> the edid for 15 times, getting the -ENXIO error for all of them.
> 
> 
> > Well, you could always do manual line testing of the I2C bus _before_
> > calling drm_do_probe_ddc_edid()? And skip the call if the test fails
> > (i.e. the bus isn't ready for use.) As said before, I am willing to
> > export bit_test if it helps any. I've attached a patch doing exactly
> > this. Let me know if you want me to commit it.
> 
> Yes, surely, I can do this. I did a similar test in the i915-specific patch,
> checking if we can talk to i2c adapter pior to call drm_do_probe_ddc_edid,
> but I thought that perhaps it would be easier for all the cards relying on
> drm_get_edid to have a faster return path in case of problems.
> 
> I am fine with any solution, if this problem is happening to appear on i915
> cards only, we could do this in our driver. It is that 15 attempts looks a
> bit overkill.

Yes, I agree, 15 retries makes no sense. And I like your original
patch, it makes a lot of sense.

> > Your proposed patch is better than I first thought. I had forgotten
> > that i2c-algo-bit tries up to 3 times to get a first ack from a slave.
> > So if i2c_transfer returns -ENXIO, this means that i2c-algo-bit has
> > already attempted 3 times to contact the slave, with no reply, so
> > there's probably no point going on. A communication problem with a
> > present slave would have returned a different error code.
> >
> > Without your patch, a missing slave would cause 3 * 5 = 15 retries,
> > which is definitely too much.
> >
> > That being said, even then the whole probe sequence shouldn't exceed
> > 10 ms, which I wouldn't expect a user to notice. The user-reported 4
> > second delay when running xrandr can't be caused by this. 4 seconds for
> > 15 attempts is 250 ms per attempt, this looks like a timeout due to
> > non-functional bus, not a nack. Not that 250 ms is 1 jiffy for HZ=250,
> > which I guess was what the reporting user was running. So even with
> > your patch, there's still 750 ms to wait, I don't think this is
> > acceptable. You really have to fix that I2C bus or skip probing it.
> 
> Yep, true. I've followed the easiest route so far - find out where the
> initial problem happens, and attempt at solving it. This change in
> drm_get_edid solves the delay at its origin, but we certainly could have
> additional delays propagated somewhere else. I couldn't reproduce the
> original reporter's huge delay, so I looked at what was within my reach.

Your fix is not really "at the origin" of the problem. Fixing it at the
origin would be not creating the I2C bus if it won't work (or marking
it as unavailable in some way, if the drm framework allows this.) If
that is not possible, testing the lines before accessing the bus would
be closer to the origin, however it might be argued that it adds delays
in the working case, and also somehow violates the I2C specification
(the line testing is not part of the specification and could 

Re: [Intel-gfx] [PATCH 1/2] Give up on edid retries when i2c tells us that bus is not there

2011-10-30 Thread Jean Delvare
Hi Eugeni,

On Mon, 24 Oct 2011 12:40:14 -0200, Eugeni Dodonov wrote:
 On Thu, Oct 20, 2011 at 10:33, Jean Delvare kh...@linux-fr.org wrote:
 
  Just to clarify: by connectivity is setup, do you mean code in the
  driver setting the GPIO pin direction etc., or a display being
  connected to the graphics card?
 
  I admit I am a little surprised. I2C buses should have their lines up
  by default, thanks to pull-up resistors, and masters and slaves should
  merely drive the lines low as needed. The absence of slaves should have
  no impact on the line behavior. If the Intel graphics chips (or the
  driver) rely on the display to hold the lines high, I'd say this is a
  seriously broken design.
 
  As a side note, I thought that HDMI had the capability of presence
  detection, so there may be a better way to know if a display is
  connected or not, and this information could used to dynamically
  instantiate and delete the I2C buses? That way we could skip probing
  for EDID when there is no chance of success.
 
 Yes, I think so too.
 
 I admit I haven't got to the root of the problem here. My test was really
 simple, borrowed from the test_bus() at i2c-algo-bit.c - without HDMI cable
 plugged in, I was getting the SDA stuck high messages; with the cable
 plugged in, I wasn't getting any of those.

Just the HDMI cable, or with a screen at the other end?

Either way, this smells like bad hardware design. The graphics card
itself should pull the I2C bus lines up by default, it shouldn't rely
on a cable (I'm not familiar with HDMI but I'm not sure if that makes
sense at all) or external display (more likely) to do it. But I can
also imagine that this could be a driver issue, maybe the GPIO lines
aren't setup properly by the driver? I'm not familiar enough with the
Intel graphics hardware and driver to tell, you'll know better.

 But in any case, I haven't investigated it deeper in the hardware direction
 after figuring out that drm_get_edid would retry its attempts for retreiving
 the edid for 15 times, getting the -ENXIO error for all of them.
 
 
  Well, you could always do manual line testing of the I2C bus _before_
  calling drm_do_probe_ddc_edid()? And skip the call if the test fails
  (i.e. the bus isn't ready for use.) As said before, I am willing to
  export bit_test if it helps any. I've attached a patch doing exactly
  this. Let me know if you want me to commit it.
 
 Yes, surely, I can do this. I did a similar test in the i915-specific patch,
 checking if we can talk to i2c adapter pior to call drm_do_probe_ddc_edid,
 but I thought that perhaps it would be easier for all the cards relying on
 drm_get_edid to have a faster return path in case of problems.
 
 I am fine with any solution, if this problem is happening to appear on i915
 cards only, we could do this in our driver. It is that 15 attempts looks a
 bit overkill.

Yes, I agree, 15 retries makes no sense. And I like your original
patch, it makes a lot of sense.

  Your proposed patch is better than I first thought. I had forgotten
  that i2c-algo-bit tries up to 3 times to get a first ack from a slave.
  So if i2c_transfer returns -ENXIO, this means that i2c-algo-bit has
  already attempted 3 times to contact the slave, with no reply, so
  there's probably no point going on. A communication problem with a
  present slave would have returned a different error code.
 
  Without your patch, a missing slave would cause 3 * 5 = 15 retries,
  which is definitely too much.
 
  That being said, even then the whole probe sequence shouldn't exceed
  10 ms, which I wouldn't expect a user to notice. The user-reported 4
  second delay when running xrandr can't be caused by this. 4 seconds for
  15 attempts is 250 ms per attempt, this looks like a timeout due to
  non-functional bus, not a nack. Not that 250 ms is 1 jiffy for HZ=250,
  which I guess was what the reporting user was running. So even with
  your patch, there's still 750 ms to wait, I don't think this is
  acceptable. You really have to fix that I2C bus or skip probing it.
 
 Yep, true. I've followed the easiest route so far - find out where the
 initial problem happens, and attempt at solving it. This change in
 drm_get_edid solves the delay at its origin, but we certainly could have
 additional delays propagated somewhere else. I couldn't reproduce the
 original reporter's huge delay, so I looked at what was within my reach.

Your fix is not really at the origin of the problem. Fixing it at the
origin would be not creating the I2C bus if it won't work (or marking
it as unavailable in some way, if the drm framework allows this.) If
that is not possible, testing the lines before accessing the bus would
be closer to the origin, however it might be argued that it adds delays
in the working case, and also somehow violates the I2C specification
(the line testing is not part of the specification and could confuse
some chips, in theory at least.)

 In any case - looking at a faster way to leave the 

[Intel-gfx] [PATCH 1/2] Give up on edid retries when i2c tells us that bus is not there

2011-10-24 Thread Eugeni Dodonov
On Thu, Oct 20, 2011 at 10:33, Jean Delvare  wrote:

> Just to clarify: by "connectivity is setup", do you mean code in the
> driver setting the GPIO pin direction etc., or a display being
> connected to the graphics card?
>
> I admit I am a little surprised. I2C buses should have their lines up
> by default, thanks to pull-up resistors, and masters and slaves should
> merely drive the lines low as needed. The absence of slaves should have
> no impact on the line behavior. If the Intel graphics chips (or the
> driver) rely on the display to hold the lines high, I'd say this is a
> seriously broken design.
>
> As a side note, I thought that HDMI had the capability of presence
> detection, so there may be a better way to know if a display is
> connected or not, and this information could used to dynamically
> instantiate and delete the I2C buses? That way we could skip probing
> for EDID when there is no chance of success.
>

Yes, I think so too.

I admit I haven't got to the root of the problem here. My test was really
simple, borrowed from the test_bus() at i2c-algo-bit.c - without HDMI cable
plugged in, I was getting the "SDA stuck high" messages; with the cable
plugged in, I wasn't getting any of those.

But in any case, I haven't investigated it deeper in the hardware direction
after figuring out that drm_get_edid would retry its attempts for retreiving
the edid for 15 times, getting the -ENXIO error for all of them.


> Well, you could always do manual line testing of the I2C bus _before_
> calling drm_do_probe_ddc_edid()? And skip the call if the test fails
> (i.e. the bus isn't ready for use.) As said before, I am willing to
> export bit_test if it helps any. I've attached a patch doing exactly
> this. Let me know if you want me to commit it.
>

Yes, surely, I can do this. I did a similar test in the i915-specific patch,
checking if we can talk to i2c adapter pior to call drm_do_probe_ddc_edid,
but I thought that perhaps it would be easier for all the cards relying on
drm_get_edid to have a faster return path in case of problems.

I am fine with any solution, if this problem is happening to appear on i915
cards only, we could do this in our driver. It is that 15 attempts looks a
bit overkill.


> Your proposed patch is better than I first thought. I had forgotten
> that i2c-algo-bit tries up to 3 times to get a first ack from a slave.
> So if i2c_transfer returns -ENXIO, this means that i2c-algo-bit has
> already attempted 3 times to contact the slave, with no reply, so
> there's probably no point going on. A communication problem with a
> present slave would have returned a different error code.
>
> Without your patch, a missing slave would cause 3 * 5 = 15 retries,
> which is definitely too much.
>
> That being said, even then the whole probe sequence shouldn't exceed
> 10 ms, which I wouldn't expect a user to notice. The user-reported 4
> second delay when running xrandr can't be caused by this. 4 seconds for
> 15 attempts is 250 ms per attempt, this looks like a timeout due to
> non-functional bus, not a nack. Not that 250 ms is 1 jiffy for HZ=250,
> which I guess was what the reporting user was running. So even with
> your patch, there's still 750 ms to wait, I don't think this is
> acceptable. You really have to fix that I2C bus or skip probing it.
>

Yep, true. I've followed the easiest route so far - find out where the
initial problem happens, and attempt at solving it. This change in
drm_get_edid solves the delay at its origin, but we certainly could have
additional delays propagated somewhere else. I couldn't reproduce the
original reporter's huge delay, so I looked at what was within my reach.

In any case - looking at a faster way to leave the drm_do_probe_ddc_edid,
while also allowing a way for having a more detailed probing - would it be
an acceptable solution to add a:

MODULE_PARM(skip_unresponsive_edid, "Ignore outputs which do not provide
edid on first attempt");

and update the patch to use this option?

-- 
Eugeni Dodonov

-- next part --
An HTML attachment was scrubbed...
URL: 



Re: [Intel-gfx] [PATCH 1/2] Give up on edid retries when i2c tells us that bus is not there

2011-10-24 Thread Eugeni Dodonov
On Thu, Oct 20, 2011 at 10:33, Jean Delvare kh...@linux-fr.org wrote:

 Just to clarify: by connectivity is setup, do you mean code in the
 driver setting the GPIO pin direction etc., or a display being
 connected to the graphics card?

 I admit I am a little surprised. I2C buses should have their lines up
 by default, thanks to pull-up resistors, and masters and slaves should
 merely drive the lines low as needed. The absence of slaves should have
 no impact on the line behavior. If the Intel graphics chips (or the
 driver) rely on the display to hold the lines high, I'd say this is a
 seriously broken design.

 As a side note, I thought that HDMI had the capability of presence
 detection, so there may be a better way to know if a display is
 connected or not, and this information could used to dynamically
 instantiate and delete the I2C buses? That way we could skip probing
 for EDID when there is no chance of success.


Yes, I think so too.

I admit I haven't got to the root of the problem here. My test was really
simple, borrowed from the test_bus() at i2c-algo-bit.c - without HDMI cable
plugged in, I was getting the SDA stuck high messages; with the cable
plugged in, I wasn't getting any of those.

But in any case, I haven't investigated it deeper in the hardware direction
after figuring out that drm_get_edid would retry its attempts for retreiving
the edid for 15 times, getting the -ENXIO error for all of them.


 Well, you could always do manual line testing of the I2C bus _before_
 calling drm_do_probe_ddc_edid()? And skip the call if the test fails
 (i.e. the bus isn't ready for use.) As said before, I am willing to
 export bit_test if it helps any. I've attached a patch doing exactly
 this. Let me know if you want me to commit it.


Yes, surely, I can do this. I did a similar test in the i915-specific patch,
checking if we can talk to i2c adapter pior to call drm_do_probe_ddc_edid,
but I thought that perhaps it would be easier for all the cards relying on
drm_get_edid to have a faster return path in case of problems.

I am fine with any solution, if this problem is happening to appear on i915
cards only, we could do this in our driver. It is that 15 attempts looks a
bit overkill.


 Your proposed patch is better than I first thought. I had forgotten
 that i2c-algo-bit tries up to 3 times to get a first ack from a slave.
 So if i2c_transfer returns -ENXIO, this means that i2c-algo-bit has
 already attempted 3 times to contact the slave, with no reply, so
 there's probably no point going on. A communication problem with a
 present slave would have returned a different error code.

 Without your patch, a missing slave would cause 3 * 5 = 15 retries,
 which is definitely too much.

 That being said, even then the whole probe sequence shouldn't exceed
 10 ms, which I wouldn't expect a user to notice. The user-reported 4
 second delay when running xrandr can't be caused by this. 4 seconds for
 15 attempts is 250 ms per attempt, this looks like a timeout due to
 non-functional bus, not a nack. Not that 250 ms is 1 jiffy for HZ=250,
 which I guess was what the reporting user was running. So even with
 your patch, there's still 750 ms to wait, I don't think this is
 acceptable. You really have to fix that I2C bus or skip probing it.


Yep, true. I've followed the easiest route so far - find out where the
initial problem happens, and attempt at solving it. This change in
drm_get_edid solves the delay at its origin, but we certainly could have
additional delays propagated somewhere else. I couldn't reproduce the
original reporter's huge delay, so I looked at what was within my reach.

In any case - looking at a faster way to leave the drm_do_probe_ddc_edid,
while also allowing a way for having a more detailed probing - would it be
an acceptable solution to add a:

MODULE_PARM(skip_unresponsive_edid, Ignore outputs which do not provide
edid on first attempt);

and update the patch to use this option?

-- 
Eugeni Dodonov
http://eugeni.dodonov.net/
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[Intel-gfx] [PATCH 1/2] Give up on edid retries when i2c tells us that bus is not there

2011-10-20 Thread Michael Büsch
On Thu, 20 Oct 2011 14:33:39 +0200
Jean Delvare  wrote:

> retry mechanism: Chris Wilson and Michael Buesch (both Cc'd.) Chris,
> Michael, do you know of ways to reproduce the issue?

The error could easily reproduced by loading the machine heavily.
So my guess is that it is caused by electrical noise injected into
the i2c bus. Probably due to bad hardware design. But that doesn't matter.
I have to live with that. The error did not trigger again with the workaround
in place, though.

> Can you please
> also confirm that the error code you were receiving was not -ENXIO?

I really don't remember what error code it was.

> Note that the problem is more likely to happen with slow masters,
> because (1) every transaction takes longer and thus has a higher
> probability to be hit by interrupts and (2) long delays mean smaller
> margins to the spec limits, so interrupts are more likely to cause
> trouble. I see that both radeon_i2c and intel_i2c set udelay to 20 ?s,
> which means a 25 kbps I2C bus. In general it is recommended to not
> drive the I2C bus below 10 kbps (that's even a hard limit for SMBus
> compliance.) nouveau_i2c is even worse with udelay = 40 ?s which is
> equivalent to a 12.5 kbps I2C bus, very close to the low limit.

Hm. Not sure. I don't think the machine had heavy IRQ load.
Just high CPU and memory load (compile the kernel or something like that).

-- 
Greetings, Michael.


[Intel-gfx] [PATCH 1/2] Give up on edid retries when i2c tells us that bus is not there

2011-10-20 Thread Jean Delvare
On Thu, 20 Oct 2011 14:33:39 +0200, Jean Delvare wrote:
> That being said, even then the whole probe sequence shouldn't exceed
> 10 ms, which I wouldn't expect a user to notice. The user-reported 4
> second delay when running xrandr can't be caused by this. 4 seconds for
> 15 attempts is 250 ms per attempt, this looks like a timeout due to
> non-functional bus, not a nack. Not that 250 ms is 1 jiffy for HZ=250,

D'oh. Today is not my day, I can't believe I wrote this :/ 1 jiffy is
obviously 4 ms only at HZ=250. So I can't explain why it is taking so
much time on the reporter's machine.

> which I guess was what the reporting user was running. So even with
> your patch, there's still 750 ms to wait, I don't think this is
> acceptable. You really have to fix that I2C bus or skip probing it.

This conclusion probably still holds.

-- 
Jean Delvare


[Intel-gfx] [PATCH 1/2] Give up on edid retries when i2c tells us that bus is not there

2011-10-20 Thread Jean Delvare
Forgot to attach the patch, sorry. Here it is.
-- 
Jean Delvare
-- next part --
A non-text attachment was scrubbed...
Name: i2c-algo-bit-export-test.patch
Type: text/x-patch
Size: 1702 bytes
Desc: not available
URL: 



[Intel-gfx] [PATCH 1/2] Give up on edid retries when i2c tells us that bus is not there

2011-10-20 Thread Jean Delvare
On Tue, 18 Oct 2011 11:37:38 -0200, Eugeni Dodonov wrote:
> On 10/18/2011 11:14, Jean Delvare wrote:
> > Hi Dave, Eugeni, Alex,
> >
> > On Tue, 18 Oct 2011 11:02:00 +0100, Dave Airlie wrote:
> >>> This allows to avoid talking to a non-existent bus repeatedly until we
> >>> finally timeout. The non-existent bus is signalled by -ENXIO error,
> >>> provided by i2c_algo_bit:bit_doAddress call.
> >>>
> >>> As the advantage of such change, all the other routines which use
> >>> drm_get_edid would benefit for this timeout.
> >>>
> >>> This change should fix
> >>> https://bugs.freedesktop.org/show_bug.cgi?id=41059 and improve overall
> >>> edid detection timing by 10-30% in most cases, and by a much larger margin
> >>> in case of phantom outputs.
> >>
> >> Jean, Alex,
> >>
> >> I'm thinking of thowing this into -next, any objections?
> >
> > This seems to be the wrong place for the test. The code below is really
> > testing for non-responsive (possibly not present) EDID, NOT
> > "non-existent adapter". Non-existent adapter should be checked in the
> > firmware if possible, or failing that, by testing the bus lines at bus
> > creation time, as i2c_algo_bit.bit_test=1 is doing. While this setting
> > has been known to cause trouble recently (not per se but because it was
> > triggering a bug somewhere else in the radeon driver), it might still
> > have value, and could be changed to a per-adapter setting by exporting
> > the test function (I have a patch ready doing exactly this) and letting
> > video drivers test their I2C buses and discard the non-working ones if
> > they want.
> >
> > I am worried that the patch below will cause regressions on other
> > machines. There's a comment right before the loop which reads:
> >
> > /* The core i2c driver will automatically retry the transfer if the
> >  * adapter reports EAGAIN. However, we find that bit-banging transfers
> >  * are susceptible to errors under a heavily loaded machine and
> >  * generate spurious NAKs and timeouts. Retrying the transfer
> >  * of the individual block a few times seems to overcome this.
> >  */
> >
> > So the retries are there for a reason, and -ENXIO is exactly what you
> > get on spurious NAKs.
> 
> You are right about the bit_test=1 testing, my initial attempt at the
> fix did exactly that
> (https://bugs.freedesktop.org/show_bug.cgi?id=41059, comments 14 and 15).
> 
> The problem is that for some buses, namely HDMI ones in my testing,
> bit_test-like tests always consider them as non-existent when the
> connectivity is not setup; and as working when it is.

Just to clarify: by "connectivity is setup", do you mean code in the
driver setting the GPIO pin direction etc., or a display being
connected to the graphics card?

I admit I am a little surprised. I2C buses should have their lines up
by default, thanks to pull-up resistors, and masters and slaves should
merely drive the lines low as needed. The absence of slaves should have
no impact on the line behavior. If the Intel graphics chips (or the
driver) rely on the display to hold the lines high, I'd say this is a
seriously broken design.

As a side note, I thought that HDMI had the capability of presence
detection, so there may be a better way to know if a display is
connected or not, and this information could used to dynamically
instantiate and delete the I2C buses? That way we could skip probing
for EDID when there is no chance of success.

In the specific case of the user report that started this discussion,
the card has no HDMI port to start with, so it seems weird to even
attempt to create a DDC bus for HDMI. If there no way the i915 driver
can detect this case and skip the whole HDMI setup? As I understand it
the radeon driver is able to do that. If the user report is an isolated
case of faulty BIOS/firmware, you could consider handling it
specifically (as the radeon driver does, too.)

> So in any case, we
> could not just blacklist them - when they do, they are gone for good,
> and won't work anymore, and we need to keep re-trying them at each attempt.
> 
> And in case of continuous pre-testing for the stuck bits and like when
> driver attempts to get the edid (for example, when xrandr is run), we
> still hit the same issue - the drm_do_probe_ddc_edid will continue to
> retry the attempts until it reaches the maximum number of retries. E.g., there
> is no way to tell drm_do_probe_ddc_edid to treat any return code as a
> permanent failure and make it give up faster.

Well, you could always do manual line testing of the I2C bus _before_
calling drm_do_probe_ddc_edid()? And skip the call if the test fails
(i.e. the bus isn't ready for use.) As said before, I am willing to
export bit_test if it helps any. I've attached a patch doing exactly
this. Let me know if you want me to commit it.

Your proposed patch is better than I first thought. I had forgotten
that i2c-algo-bit tries up to 3 times to get a first ack from a slave.
So if i2c_transfer returns 

Re: [Intel-gfx] [PATCH 1/2] Give up on edid retries when i2c tells us that bus is not there

2011-10-20 Thread Jean Delvare
On Tue, 18 Oct 2011 11:37:38 -0200, Eugeni Dodonov wrote:
 On 10/18/2011 11:14, Jean Delvare wrote:
  Hi Dave, Eugeni, Alex,
 
  On Tue, 18 Oct 2011 11:02:00 +0100, Dave Airlie wrote:
  This allows to avoid talking to a non-existent bus repeatedly until we
  finally timeout. The non-existent bus is signalled by -ENXIO error,
  provided by i2c_algo_bit:bit_doAddress call.
 
  As the advantage of such change, all the other routines which use
  drm_get_edid would benefit for this timeout.
 
  This change should fix
  https://bugs.freedesktop.org/show_bug.cgi?id=41059 and improve overall
  edid detection timing by 10-30% in most cases, and by a much larger margin
  in case of phantom outputs.
 
  Jean, Alex,
 
  I'm thinking of thowing this into -next, any objections?
 
  This seems to be the wrong place for the test. The code below is really
  testing for non-responsive (possibly not present) EDID, NOT
  non-existent adapter. Non-existent adapter should be checked in the
  firmware if possible, or failing that, by testing the bus lines at bus
  creation time, as i2c_algo_bit.bit_test=1 is doing. While this setting
  has been known to cause trouble recently (not per se but because it was
  triggering a bug somewhere else in the radeon driver), it might still
  have value, and could be changed to a per-adapter setting by exporting
  the test function (I have a patch ready doing exactly this) and letting
  video drivers test their I2C buses and discard the non-working ones if
  they want.
 
  I am worried that the patch below will cause regressions on other
  machines. There's a comment right before the loop which reads:
 
  /* The core i2c driver will automatically retry the transfer if the
   * adapter reports EAGAIN. However, we find that bit-banging transfers
   * are susceptible to errors under a heavily loaded machine and
   * generate spurious NAKs and timeouts. Retrying the transfer
   * of the individual block a few times seems to overcome this.
   */
 
  So the retries are there for a reason, and -ENXIO is exactly what you
  get on spurious NAKs.
 
 You are right about the bit_test=1 testing, my initial attempt at the
 fix did exactly that
 (https://bugs.freedesktop.org/show_bug.cgi?id=41059, comments 14 and 15).
 
 The problem is that for some buses, namely HDMI ones in my testing,
 bit_test-like tests always consider them as non-existent when the
 connectivity is not setup; and as working when it is.

Just to clarify: by connectivity is setup, do you mean code in the
driver setting the GPIO pin direction etc., or a display being
connected to the graphics card?

I admit I am a little surprised. I2C buses should have their lines up
by default, thanks to pull-up resistors, and masters and slaves should
merely drive the lines low as needed. The absence of slaves should have
no impact on the line behavior. If the Intel graphics chips (or the
driver) rely on the display to hold the lines high, I'd say this is a
seriously broken design.

As a side note, I thought that HDMI had the capability of presence
detection, so there may be a better way to know if a display is
connected or not, and this information could used to dynamically
instantiate and delete the I2C buses? That way we could skip probing
for EDID when there is no chance of success.

In the specific case of the user report that started this discussion,
the card has no HDMI port to start with, so it seems weird to even
attempt to create a DDC bus for HDMI. If there no way the i915 driver
can detect this case and skip the whole HDMI setup? As I understand it
the radeon driver is able to do that. If the user report is an isolated
case of faulty BIOS/firmware, you could consider handling it
specifically (as the radeon driver does, too.)

 So in any case, we
 could not just blacklist them - when they do, they are gone for good,
 and won't work anymore, and we need to keep re-trying them at each attempt.
 
 And in case of continuous pre-testing for the stuck bits and like when
 driver attempts to get the edid (for example, when xrandr is run), we
 still hit the same issue - the drm_do_probe_ddc_edid will continue to
 retry the attempts until it reaches the maximum number of retries. E.g., there
 is no way to tell drm_do_probe_ddc_edid to treat any return code as a
 permanent failure and make it give up faster.

Well, you could always do manual line testing of the I2C bus _before_
calling drm_do_probe_ddc_edid()? And skip the call if the test fails
(i.e. the bus isn't ready for use.) As said before, I am willing to
export bit_test if it helps any. I've attached a patch doing exactly
this. Let me know if you want me to commit it.

Your proposed patch is better than I first thought. I had forgotten
that i2c-algo-bit tries up to 3 times to get a first ack from a slave.
So if i2c_transfer returns -ENXIO, this means that i2c-algo-bit has
already attempted 3 times to contact the slave, with no reply, so
there's probably no point going 

Re: [Intel-gfx] [PATCH 1/2] Give up on edid retries when i2c tells us that bus is not there

2011-10-20 Thread Jean Delvare
Forgot to attach the patch, sorry. Here it is.
-- 
Jean Delvare
---
 drivers/i2c/algos/i2c-algo-bit.c |8 ++--
 include/linux/i2c-algo-bit.h |3 +++
 2 files changed, 9 insertions(+), 2 deletions(-)

--- linux-3.1-rc10.orig/drivers/i2c/algos/i2c-algo-bit.c	2011-10-20 12:03:05.0 +0200
+++ linux-3.1-rc10/drivers/i2c/algos/i2c-algo-bit.c	2011-10-20 12:57:20.0 +0200
@@ -231,8 +231,11 @@ static int i2c_inb(struct i2c_adapter *i
 /*
  * Sanity check for the adapter hardware - check the reaction of
  * the bus lines only if it seems to be idle.
+ * Must be called with i2c_adap-bus_lock held if adapter is registered.
+ * This is done by surrounding the call to i2c_bit_test_bus() with
+ * i2c_lock_adapter(i2c_adap) and i2c_unlock_adapter(i2c_adap).
  */
-static int test_bus(struct i2c_adapter *i2c_adap)
+int i2c_bit_test_bus(struct i2c_adapter *i2c_adap)
 {
 	struct i2c_algo_bit_data *adap = i2c_adap-algo_data;
 	const char *name = i2c_adap-name;
@@ -320,6 +323,7 @@ bailout:
 
 	return -ENODEV;
 }
+EXPORT_SYMBOL_GPL(i2c_bit_test_bus);
 
 /* - Utility functions
  */
@@ -623,7 +627,7 @@ static int __i2c_bit_add_bus(struct i2c_
 	int ret;
 
 	if (bit_test) {
-		ret = test_bus(adap);
+		ret = i2c_bit_test_bus(adap);
 		if (ret  0)
 			return -ENODEV;
 	}
--- linux-3.1-rc10.orig/include/linux/i2c-algo-bit.h	2011-07-22 04:17:23.0 +0200
+++ linux-3.1-rc10/include/linux/i2c-algo-bit.h	2011-10-20 12:54:33.0 +0200
@@ -50,4 +50,7 @@ struct i2c_algo_bit_data {
 int i2c_bit_add_bus(struct i2c_adapter *);
 int i2c_bit_add_numbered_bus(struct i2c_adapter *);
 
+/* Must be called with bus_lock held if adapter is registered */
+int i2c_bit_test_bus(struct i2c_adapter *);
+
 #endif /* _LINUX_I2C_ALGO_BIT_H */
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [Intel-gfx] [PATCH 1/2] Give up on edid retries when i2c tells us that bus is not there

2011-10-20 Thread Jean Delvare
On Thu, 20 Oct 2011 14:33:39 +0200, Jean Delvare wrote:
 That being said, even then the whole probe sequence shouldn't exceed
 10 ms, which I wouldn't expect a user to notice. The user-reported 4
 second delay when running xrandr can't be caused by this. 4 seconds for
 15 attempts is 250 ms per attempt, this looks like a timeout due to
 non-functional bus, not a nack. Not that 250 ms is 1 jiffy for HZ=250,

D'oh. Today is not my day, I can't believe I wrote this :/ 1 jiffy is
obviously 4 ms only at HZ=250. So I can't explain why it is taking so
much time on the reporter's machine.

 which I guess was what the reporting user was running. So even with
 your patch, there's still 750 ms to wait, I don't think this is
 acceptable. You really have to fix that I2C bus or skip probing it.

This conclusion probably still holds.

-- 
Jean Delvare
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [Intel-gfx] [PATCH 1/2] Give up on edid retries when i2c tells us that bus is not there

2011-10-20 Thread Michael Büsch
On Thu, 20 Oct 2011 14:33:39 +0200
Jean Delvare kh...@linux-fr.org wrote:

 retry mechanism: Chris Wilson and Michael Buesch (both Cc'd.) Chris,
 Michael, do you know of ways to reproduce the issue?

The error could easily reproduced by loading the machine heavily.
So my guess is that it is caused by electrical noise injected into
the i2c bus. Probably due to bad hardware design. But that doesn't matter.
I have to live with that. The error did not trigger again with the workaround
in place, though.

 Can you please
 also confirm that the error code you were receiving was not -ENXIO?

I really don't remember what error code it was.

 Note that the problem is more likely to happen with slow masters,
 because (1) every transaction takes longer and thus has a higher
 probability to be hit by interrupts and (2) long delays mean smaller
 margins to the spec limits, so interrupts are more likely to cause
 trouble. I see that both radeon_i2c and intel_i2c set udelay to 20 µs,
 which means a 25 kbps I2C bus. In general it is recommended to not
 drive the I2C bus below 10 kbps (that's even a hard limit for SMBus
 compliance.) nouveau_i2c is even worse with udelay = 40 µs which is
 equivalent to a 12.5 kbps I2C bus, very close to the low limit.

Hm. Not sure. I don't think the machine had heavy IRQ load.
Just high CPU and memory load (compile the kernel or something like that).

-- 
Greetings, Michael.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[Intel-gfx] [PATCH 1/2] Give up on edid retries when i2c tells us that bus is not there

2011-10-18 Thread Jean Delvare
Hi Dave, Eugeni, Alex,

On Tue, 18 Oct 2011 11:02:00 +0100, Dave Airlie wrote:
> > This allows to avoid talking to a non-existent bus repeatedly until we
> > finally timeout. The non-existent bus is signalled by -ENXIO error,
> > provided by i2c_algo_bit:bit_doAddress call.
> >
> > As the advantage of such change, all the other routines which use
> > drm_get_edid would benefit for this timeout.
> >
> > This change should fix
> > https://bugs.freedesktop.org/show_bug.cgi?id=41059 and improve overall
> > edid detection timing by 10-30% in most cases, and by a much larger margin
> > in case of phantom outputs.
> 
> Jean, Alex,
> 
> I'm thinking of thowing this into -next, any objections?

This seems to be the wrong place for the test. The code below is really
testing for non-responsive (possibly not present) EDID, NOT
"non-existent adapter". Non-existent adapter should be checked in the
firmware if possible, or failing that, by testing the bus lines at bus
creation time, as i2c_algo_bit.bit_test=1 is doing. While this setting
has been known to cause trouble recently (not per se but because it was
triggering a bug somewhere else in the radeon driver), it might still
have value, and could be changed to a per-adapter setting by exporting
the test function (I have a patch ready doing exactly this) and letting
video drivers test their I2C buses and discard the non-working ones if
they want.

I am worried that the patch below will cause regressions on other
machines. There's a comment right before the loop which reads:

/* The core i2c driver will automatically retry the transfer if the
 * adapter reports EAGAIN. However, we find that bit-banging transfers
 * are susceptible to errors under a heavily loaded machine and
 * generate spurious NAKs and timeouts. Retrying the transfer
 * of the individual block a few times seems to overcome this.
 */

So the retries are there for a reason, and -ENXIO is exactly what you
get on spurious NAKs.

One thing which may make sense would be to make the retry count a
module parameter, instead of a hard-coded value, so that users can
lower the value if they care more about boot time than reliability. But
again, ideally problematic buses would not have been created in the
first place.

> >
> > Signed-off-by: Eugeni Dodonov 
> > ---
> > ?drivers/gpu/drm/drm_edid.c | ? ?5 +
> > ?1 files changed, 5 insertions(+), 0 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
> > index 7425e5c..1bca6d7 100644
> > --- a/drivers/gpu/drm/drm_edid.c
> > +++ b/drivers/gpu/drm/drm_edid.c
> > @@ -265,6 +265,11 @@ drm_do_probe_ddc_edid(struct i2c_adapter *adapter, 
> > unsigned char *buf,
> > ? ? ? ? ? ? ? ? ? ? ? ?}
> > ? ? ? ? ? ? ? ?};
> > ? ? ? ? ? ? ? ?ret = i2c_transfer(adapter, msgs, 2);
> > + ? ? ? ? ? ? ? if (ret == -ENXIO) {
> > + ? ? ? ? ? ? ? ? ? ? ? DRM_DEBUG_KMS("drm: skipping non-existent adapter 
> > %s\n",
> > + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? adapter->name);
> > + ? ? ? ? ? ? ? ? ? ? ? break;
> > + ? ? ? ? ? ? ? }
> > ? ? ? ?} while (ret != 2 && --retries);
> >
> > ? ? ? ?return ret == 2 ? 0 : -1;

-- 
Jean Delvare


[Intel-gfx] [PATCH 1/2] Give up on edid retries when i2c tells us that bus is not there

2011-10-18 Thread Eugeni Dodonov
On 10/18/2011 11:14, Jean Delvare wrote:
> Hi Dave, Eugeni, Alex,
>
> On Tue, 18 Oct 2011 11:02:00 +0100, Dave Airlie wrote:
>>> This allows to avoid talking to a non-existent bus repeatedly until we
>>> finally timeout. The non-existent bus is signalled by -ENXIO error,
>>> provided by i2c_algo_bit:bit_doAddress call.
>>>
>>> As the advantage of such change, all the other routines which use
>>> drm_get_edid would benefit for this timeout.
>>>
>>> This change should fix
>>> https://bugs.freedesktop.org/show_bug.cgi?id=41059 and improve overall
>>> edid detection timing by 10-30% in most cases, and by a much larger margin
>>> in case of phantom outputs.
>>
>> Jean, Alex,
>>
>> I'm thinking of thowing this into -next, any objections?
>
> This seems to be the wrong place for the test. The code below is really
> testing for non-responsive (possibly not present) EDID, NOT
> "non-existent adapter". Non-existent adapter should be checked in the
> firmware if possible, or failing that, by testing the bus lines at bus
> creation time, as i2c_algo_bit.bit_test=1 is doing. While this setting
> has been known to cause trouble recently (not per se but because it was
> triggering a bug somewhere else in the radeon driver), it might still
> have value, and could be changed to a per-adapter setting by exporting
> the test function (I have a patch ready doing exactly this) and letting
> video drivers test their I2C buses and discard the non-working ones if
> they want.
>
> I am worried that the patch below will cause regressions on other
> machines. There's a comment right before the loop which reads:
>
>   /* The core i2c driver will automatically retry the transfer if the
>* adapter reports EAGAIN. However, we find that bit-banging transfers
>* are susceptible to errors under a heavily loaded machine and
>* generate spurious NAKs and timeouts. Retrying the transfer
>* of the individual block a few times seems to overcome this.
>*/
>
> So the retries are there for a reason, and -ENXIO is exactly what you
> get on spurious NAKs.
>

Hi Jean,

You are right about the bit_test=1 testing, my initial attempt at the
fix did exactly that
(https://bugs.freedesktop.org/show_bug.cgi?id=41059, comments 14 and 15).

The problem is that for some buses, namely HDMI ones in my testing,
bit_test-like tests always consider them as non-existent when the
connectivity is not setup; and as working when it is. So in any case, we
could not just blacklist them - when they do, they are gone for good,
and won't work anymore, and we need to keep re-trying them at each attempt.

And in case of continuous pre-testing for the stuck bits and like when
driver attempts to get the edid (for example, when xrandr is run), we
still hit the same issue - the drm_do_probe_ddc_edid will continue to
retry the attempts until it reaches the maximum number of retries. E.g., there
is no way to tell drm_do_probe_ddc_edid to treat any return code as a
permanent failure and make it give up faster.

It could be fixed either in per-driver fashion, like I did with the other patch
(which also tests for -ENXIO); or in a generic way, in DRM. I couldn't reproduce
any false positives coming from -ENXIO on i915 driver, but perhaps it is
easier with radeon? Do you have any specific workload which trick the
driver into generating spurious NAKs by a chance?

> One thing which may make sense would be to make the retry count a
> module parameter, instead of a hard-coded value, so that users can
> lower the value if they care more about boot time than reliability. But
> again, ideally problematic buses would not have been created in the
> first place.

Or perhaps it would be possible to have any error code coming from
i2c_transfer to signal a permanent error, which should not be retried..
what do you think?

--
Eugeni Dodonov


[Intel-gfx] [PATCH 1/2] Give up on edid retries when i2c tells us that bus is not there

2011-10-18 Thread Dave Airlie
> This allows to avoid talking to a non-existent bus repeatedly until we
> finally timeout. The non-existent bus is signalled by -ENXIO error,
> provided by i2c_algo_bit:bit_doAddress call.
>
> As the advantage of such change, all the other routines which use
> drm_get_edid would benefit for this timeout.
>
> This change should fix
> https://bugs.freedesktop.org/show_bug.cgi?id=41059 and improve overall
> edid detection timing by 10-30% in most cases, and by a much larger margin
> in case of phantom outputs.

Jean, Alex,

I'm thinking of thowing this into -next, any objections?

Dave.

>
> Signed-off-by: Eugeni Dodonov 
> ---
> ?drivers/gpu/drm/drm_edid.c | ? ?5 +
> ?1 files changed, 5 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
> index 7425e5c..1bca6d7 100644
> --- a/drivers/gpu/drm/drm_edid.c
> +++ b/drivers/gpu/drm/drm_edid.c
> @@ -265,6 +265,11 @@ drm_do_probe_ddc_edid(struct i2c_adapter *adapter, 
> unsigned char *buf,
> ? ? ? ? ? ? ? ? ? ? ? ?}
> ? ? ? ? ? ? ? ?};
> ? ? ? ? ? ? ? ?ret = i2c_transfer(adapter, msgs, 2);
> + ? ? ? ? ? ? ? if (ret == -ENXIO) {
> + ? ? ? ? ? ? ? ? ? ? ? DRM_DEBUG_KMS("drm: skipping non-existent adapter 
> %s\n",
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? adapter->name);
> + ? ? ? ? ? ? ? ? ? ? ? break;
> + ? ? ? ? ? ? ? }
> ? ? ? ?} while (ret != 2 && --retries);
>
> ? ? ? ?return ret == 2 ? 0 : -1;
> --
> 1.7.7
>
> ___
> Intel-gfx mailing list
> Intel-gfx at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>


[Intel-gfx] [PATCH 1/2] Give up on edid retries when i2c tells us that bus is not there

2011-10-18 Thread Alex Deucher
On Tue, Oct 18, 2011 at 6:02 AM, Dave Airlie  wrote:
>> This allows to avoid talking to a non-existent bus repeatedly until we
>> finally timeout. The non-existent bus is signalled by -ENXIO error,
>> provided by i2c_algo_bit:bit_doAddress call.
>>
>> As the advantage of such change, all the other routines which use
>> drm_get_edid would benefit for this timeout.
>>
>> This change should fix
>> https://bugs.freedesktop.org/show_bug.cgi?id=41059 and improve overall
>> edid detection timing by 10-30% in most cases, and by a much larger margin
>> in case of phantom outputs.
>
> Jean, Alex,
>
> I'm thinking of thowing this into -next, any objections?
>

I haven't really hit the issue this patch attempts to fix on any
cards, but on radeon at least, we know which i2c buses are used for
ddc and which are not, so barring the occasional oem vbios bug, we
won't be trying ddc on arbitrary i2c buses.

Alex

> Dave.
>
>>
>> Signed-off-by: Eugeni Dodonov 
>> ---
>> ?drivers/gpu/drm/drm_edid.c | ? ?5 +
>> ?1 files changed, 5 insertions(+), 0 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
>> index 7425e5c..1bca6d7 100644
>> --- a/drivers/gpu/drm/drm_edid.c
>> +++ b/drivers/gpu/drm/drm_edid.c
>> @@ -265,6 +265,11 @@ drm_do_probe_ddc_edid(struct i2c_adapter *adapter, 
>> unsigned char *buf,
>> ? ? ? ? ? ? ? ? ? ? ? ?}
>> ? ? ? ? ? ? ? ?};
>> ? ? ? ? ? ? ? ?ret = i2c_transfer(adapter, msgs, 2);
>> + ? ? ? ? ? ? ? if (ret == -ENXIO) {
>> + ? ? ? ? ? ? ? ? ? ? ? DRM_DEBUG_KMS("drm: skipping non-existent adapter 
>> %s\n",
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? adapter->name);
>> + ? ? ? ? ? ? ? ? ? ? ? break;
>> + ? ? ? ? ? ? ? }
>> ? ? ? ?} while (ret != 2 && --retries);
>>
>> ? ? ? ?return ret == 2 ? 0 : -1;
>> --
>> 1.7.7
>>
>> ___
>> Intel-gfx mailing list
>> Intel-gfx at lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>>
>


Re: [Intel-gfx] [PATCH 1/2] Give up on edid retries when i2c tells us that bus is not there

2011-10-18 Thread Dave Airlie
 This allows to avoid talking to a non-existent bus repeatedly until we
 finally timeout. The non-existent bus is signalled by -ENXIO error,
 provided by i2c_algo_bit:bit_doAddress call.

 As the advantage of such change, all the other routines which use
 drm_get_edid would benefit for this timeout.

 This change should fix
 https://bugs.freedesktop.org/show_bug.cgi?id=41059 and improve overall
 edid detection timing by 10-30% in most cases, and by a much larger margin
 in case of phantom outputs.

Jean, Alex,

I'm thinking of thowing this into -next, any objections?

Dave.


 Signed-off-by: Eugeni Dodonov eugeni.dodo...@intel.com
 ---
  drivers/gpu/drm/drm_edid.c |    5 +
  1 files changed, 5 insertions(+), 0 deletions(-)

 diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
 index 7425e5c..1bca6d7 100644
 --- a/drivers/gpu/drm/drm_edid.c
 +++ b/drivers/gpu/drm/drm_edid.c
 @@ -265,6 +265,11 @@ drm_do_probe_ddc_edid(struct i2c_adapter *adapter, 
 unsigned char *buf,
                        }
                };
                ret = i2c_transfer(adapter, msgs, 2);
 +               if (ret == -ENXIO) {
 +                       DRM_DEBUG_KMS(drm: skipping non-existent adapter 
 %s\n,
 +                                       adapter-name);
 +                       break;
 +               }
        } while (ret != 2  --retries);

        return ret == 2 ? 0 : -1;
 --
 1.7.7

 ___
 Intel-gfx mailing list
 intel-...@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/intel-gfx

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [Intel-gfx] [PATCH 1/2] Give up on edid retries when i2c tells us that bus is not there

2011-10-18 Thread Jean Delvare
Hi Dave, Eugeni, Alex,

On Tue, 18 Oct 2011 11:02:00 +0100, Dave Airlie wrote:
  This allows to avoid talking to a non-existent bus repeatedly until we
  finally timeout. The non-existent bus is signalled by -ENXIO error,
  provided by i2c_algo_bit:bit_doAddress call.
 
  As the advantage of such change, all the other routines which use
  drm_get_edid would benefit for this timeout.
 
  This change should fix
  https://bugs.freedesktop.org/show_bug.cgi?id=41059 and improve overall
  edid detection timing by 10-30% in most cases, and by a much larger margin
  in case of phantom outputs.
 
 Jean, Alex,
 
 I'm thinking of thowing this into -next, any objections?

This seems to be the wrong place for the test. The code below is really
testing for non-responsive (possibly not present) EDID, NOT
non-existent adapter. Non-existent adapter should be checked in the
firmware if possible, or failing that, by testing the bus lines at bus
creation time, as i2c_algo_bit.bit_test=1 is doing. While this setting
has been known to cause trouble recently (not per se but because it was
triggering a bug somewhere else in the radeon driver), it might still
have value, and could be changed to a per-adapter setting by exporting
the test function (I have a patch ready doing exactly this) and letting
video drivers test their I2C buses and discard the non-working ones if
they want.

I am worried that the patch below will cause regressions on other
machines. There's a comment right before the loop which reads:

/* The core i2c driver will automatically retry the transfer if the
 * adapter reports EAGAIN. However, we find that bit-banging transfers
 * are susceptible to errors under a heavily loaded machine and
 * generate spurious NAKs and timeouts. Retrying the transfer
 * of the individual block a few times seems to overcome this.
 */

So the retries are there for a reason, and -ENXIO is exactly what you
get on spurious NAKs.

One thing which may make sense would be to make the retry count a
module parameter, instead of a hard-coded value, so that users can
lower the value if they care more about boot time than reliability. But
again, ideally problematic buses would not have been created in the
first place.

 
  Signed-off-by: Eugeni Dodonov eugeni.dodo...@intel.com
  ---
   drivers/gpu/drm/drm_edid.c |    5 +
   1 files changed, 5 insertions(+), 0 deletions(-)
 
  diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
  index 7425e5c..1bca6d7 100644
  --- a/drivers/gpu/drm/drm_edid.c
  +++ b/drivers/gpu/drm/drm_edid.c
  @@ -265,6 +265,11 @@ drm_do_probe_ddc_edid(struct i2c_adapter *adapter, 
  unsigned char *buf,
                         }
                 };
                 ret = i2c_transfer(adapter, msgs, 2);
  +               if (ret == -ENXIO) {
  +                       DRM_DEBUG_KMS(drm: skipping non-existent adapter 
  %s\n,
  +                                       adapter-name);
  +                       break;
  +               }
         } while (ret != 2  --retries);
 
         return ret == 2 ? 0 : -1;

-- 
Jean Delvare
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [Intel-gfx] [PATCH 1/2] Give up on edid retries when i2c tells us that bus is not there

2011-10-18 Thread Eugeni Dodonov
On 10/18/2011 11:14, Jean Delvare wrote:
 Hi Dave, Eugeni, Alex,

 On Tue, 18 Oct 2011 11:02:00 +0100, Dave Airlie wrote:
 This allows to avoid talking to a non-existent bus repeatedly until we
 finally timeout. The non-existent bus is signalled by -ENXIO error,
 provided by i2c_algo_bit:bit_doAddress call.

 As the advantage of such change, all the other routines which use
 drm_get_edid would benefit for this timeout.

 This change should fix
 https://bugs.freedesktop.org/show_bug.cgi?id=41059 and improve overall
 edid detection timing by 10-30% in most cases, and by a much larger margin
 in case of phantom outputs.

 Jean, Alex,

 I'm thinking of thowing this into -next, any objections?

 This seems to be the wrong place for the test. The code below is really
 testing for non-responsive (possibly not present) EDID, NOT
 non-existent adapter. Non-existent adapter should be checked in the
 firmware if possible, or failing that, by testing the bus lines at bus
 creation time, as i2c_algo_bit.bit_test=1 is doing. While this setting
 has been known to cause trouble recently (not per se but because it was
 triggering a bug somewhere else in the radeon driver), it might still
 have value, and could be changed to a per-adapter setting by exporting
 the test function (I have a patch ready doing exactly this) and letting
 video drivers test their I2C buses and discard the non-working ones if
 they want.

 I am worried that the patch below will cause regressions on other
 machines. There's a comment right before the loop which reads:

   /* The core i2c driver will automatically retry the transfer if the
* adapter reports EAGAIN. However, we find that bit-banging transfers
* are susceptible to errors under a heavily loaded machine and
* generate spurious NAKs and timeouts. Retrying the transfer
* of the individual block a few times seems to overcome this.
*/

 So the retries are there for a reason, and -ENXIO is exactly what you
 get on spurious NAKs.


Hi Jean,

You are right about the bit_test=1 testing, my initial attempt at the
fix did exactly that
(https://bugs.freedesktop.org/show_bug.cgi?id=41059, comments 14 and 15).

The problem is that for some buses, namely HDMI ones in my testing,
bit_test-like tests always consider them as non-existent when the
connectivity is not setup; and as working when it is. So in any case, we
could not just blacklist them - when they do, they are gone for good,
and won't work anymore, and we need to keep re-trying them at each attempt.

And in case of continuous pre-testing for the stuck bits and like when
driver attempts to get the edid (for example, when xrandr is run), we
still hit the same issue - the drm_do_probe_ddc_edid will continue to
retry the attempts until it reaches the maximum number of retries. E.g., there
is no way to tell drm_do_probe_ddc_edid to treat any return code as a
permanent failure and make it give up faster.

It could be fixed either in per-driver fashion, like I did with the other patch
(which also tests for -ENXIO); or in a generic way, in DRM. I couldn't reproduce
any false positives coming from -ENXIO on i915 driver, but perhaps it is
easier with radeon? Do you have any specific workload which trick the
driver into generating spurious NAKs by a chance?

 One thing which may make sense would be to make the retry count a
 module parameter, instead of a hard-coded value, so that users can
 lower the value if they care more about boot time than reliability. But
 again, ideally problematic buses would not have been created in the
 first place.

Or perhaps it would be possible to have any error code coming from
i2c_transfer to signal a permanent error, which should not be retried..
what do you think?

--
Eugeni Dodonov
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [Intel-gfx] [PATCH 1/2] Give up on edid retries when i2c tells us that bus is not there

2011-10-18 Thread Alex Deucher
On Tue, Oct 18, 2011 at 6:02 AM, Dave Airlie airl...@gmail.com wrote:
 This allows to avoid talking to a non-existent bus repeatedly until we
 finally timeout. The non-existent bus is signalled by -ENXIO error,
 provided by i2c_algo_bit:bit_doAddress call.

 As the advantage of such change, all the other routines which use
 drm_get_edid would benefit for this timeout.

 This change should fix
 https://bugs.freedesktop.org/show_bug.cgi?id=41059 and improve overall
 edid detection timing by 10-30% in most cases, and by a much larger margin
 in case of phantom outputs.

 Jean, Alex,

 I'm thinking of thowing this into -next, any objections?


I haven't really hit the issue this patch attempts to fix on any
cards, but on radeon at least, we know which i2c buses are used for
ddc and which are not, so barring the occasional oem vbios bug, we
won't be trying ddc on arbitrary i2c buses.

Alex

 Dave.


 Signed-off-by: Eugeni Dodonov eugeni.dodo...@intel.com
 ---
  drivers/gpu/drm/drm_edid.c |    5 +
  1 files changed, 5 insertions(+), 0 deletions(-)

 diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
 index 7425e5c..1bca6d7 100644
 --- a/drivers/gpu/drm/drm_edid.c
 +++ b/drivers/gpu/drm/drm_edid.c
 @@ -265,6 +265,11 @@ drm_do_probe_ddc_edid(struct i2c_adapter *adapter, 
 unsigned char *buf,
                        }
                };
                ret = i2c_transfer(adapter, msgs, 2);
 +               if (ret == -ENXIO) {
 +                       DRM_DEBUG_KMS(drm: skipping non-existent adapter 
 %s\n,
 +                                       adapter-name);
 +                       break;
 +               }
        } while (ret != 2  --retries);

        return ret == 2 ? 0 : -1;
 --
 1.7.7

 ___
 Intel-gfx mailing list
 intel-...@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/intel-gfx


___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH 1/2] Give up on edid retries when i2c tells us that bus is not there

2011-10-17 Thread Eugeni Dodonov
On Mon, Oct 17, 2011 at 20:41, Keith Packard  wrote:

> On Mon, 17 Oct 2011 19:07:51 -0200, Eugeni Dodonov 
> wrote:
>
> > From what I've checked, the other return error value in this context
> could
> > be -EREMOTEIO, which could be caused by transmission error so it should
> be
> > retried.
>
> Oh, there's -ENOMEM, -EINVAL and probably a few others down in the
> bowels of the kernel i2c bits. Starting with the obvious (ENXIO) seems
> safest to me.
>

Yes, of course, but I was referring to the values which could be returned
through the i2c-algo-bit call used in this edid detection call.

-- 
Eugeni Dodonov
 
-- next part --
An HTML attachment was scrubbed...
URL: 



[PATCH 1/2] Give up on edid retries when i2c tells us that bus is not there

2011-10-17 Thread Eugeni Dodonov
On Mon, Oct 17, 2011 at 18:41, Keith Packard  wrote:

> On Mon, 17 Oct 2011 11:12:29 -0200, Eugeni Dodonov <
> eugeni.dodonov at intel.com> wrote:
>
> > + if (ret == -ENXIO) {
> > + DRM_DEBUG_KMS("drm: skipping non-existent adapter
> %s\n",
> > + adapter->name);
> > + break;
> > + }
>
> This seems good to me; are there additional error values which should
> also be considered fatal and not subject to retry?
>



[PATCH 1/2] Give up on edid retries when i2c tells us that bus is not there

2011-10-17 Thread Keith Packard
On Mon, 17 Oct 2011 19:07:51 -0200, Eugeni Dodonov  
wrote:

> From what I've checked, the other return error value in this context could
> be -EREMOTEIO, which could be caused by transmission error so it should be
> retried.

Oh, there's -ENOMEM, -EINVAL and probably a few others down in the
bowels of the kernel i2c bits. Starting with the obvious (ENXIO) seems
safest to me.

-- 
keith.packard at intel.com
-- next part --
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 827 bytes
Desc: not available
URL: 



[PATCH 1/2] Give up on edid retries when i2c tells us that bus is not there

2011-10-17 Thread Keith Packard
On Mon, 17 Oct 2011 11:12:29 -0200, Eugeni Dodonov  wrote:

> + if (ret == -ENXIO) {
> + DRM_DEBUG_KMS("drm: skipping non-existent adapter %s\n",
> + adapter->name);
> + break;
> + }

This seems good to me; are there additional error values which should
also be considered fatal and not subject to retry?

Reviewed-by: Keith Packard 

-keith
-- next part --
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 827 bytes
Desc: not available
URL: 



[PATCH 1/2] Give up on edid retries when i2c tells us that bus is not there

2011-10-17 Thread Eugeni Dodonov
This allows to avoid talking to a non-existent bus repeatedly until we
finally timeout. The non-existent bus is signalled by -ENXIO error,
provided by i2c_algo_bit:bit_doAddress call.

As the advantage of such change, all the other routines which use
drm_get_edid would benefit for this timeout.

This change should fix
https://bugs.freedesktop.org/show_bug.cgi?id=41059 and improve overall
edid detection timing by 10-30% in most cases, and by a much larger margin
in case of phantom outputs.

Signed-off-by: Eugeni Dodonov 
---
 drivers/gpu/drm/drm_edid.c |5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index 7425e5c..1bca6d7 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -265,6 +265,11 @@ drm_do_probe_ddc_edid(struct i2c_adapter *adapter, 
unsigned char *buf,
}
};
ret = i2c_transfer(adapter, msgs, 2);
+   if (ret == -ENXIO) {
+   DRM_DEBUG_KMS("drm: skipping non-existent adapter %s\n",
+   adapter->name);
+   break;
+   }
} while (ret != 2 && --retries);

return ret == 2 ? 0 : -1;
-- 
1.7.7



[PATCH 1/2] Give up on edid retries when i2c tells us that bus is not there

2011-10-17 Thread Eugeni Dodonov
This allows to avoid talking to a non-existent bus repeatedly until we
finally timeout. The non-existent bus is signalled by -ENXIO error,
provided by i2c_algo_bit:bit_doAddress call.

As the advantage of such change, all the other routines which use
drm_get_edid would benefit for this timeout.

This change should fix
https://bugs.freedesktop.org/show_bug.cgi?id=41059 and improve overall
edid detection timing by 10-30% in most cases, and by a much larger margin
in case of phantom outputs.

Signed-off-by: Eugeni Dodonov eugeni.dodo...@intel.com
---
 drivers/gpu/drm/drm_edid.c |5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index 7425e5c..1bca6d7 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -265,6 +265,11 @@ drm_do_probe_ddc_edid(struct i2c_adapter *adapter, 
unsigned char *buf,
}
};
ret = i2c_transfer(adapter, msgs, 2);
+   if (ret == -ENXIO) {
+   DRM_DEBUG_KMS(drm: skipping non-existent adapter %s\n,
+   adapter-name);
+   break;
+   }
} while (ret != 2  --retries);
 
return ret == 2 ? 0 : -1;
-- 
1.7.7

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH 1/2] Give up on edid retries when i2c tells us that bus is not there

2011-10-17 Thread Keith Packard
On Mon, 17 Oct 2011 11:12:29 -0200, Eugeni Dodonov eugeni.dodo...@intel.com 
wrote:

 + if (ret == -ENXIO) {
 + DRM_DEBUG_KMS(drm: skipping non-existent adapter %s\n,
 + adapter-name);
 + break;
 + }

This seems good to me; are there additional error values which should
also be considered fatal and not subject to retry?

Reviewed-by: Keith Packard kei...@keithp.com

-keith


pgpFsUdLzQUmW.pgp
Description: PGP signature
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH 1/2] Give up on edid retries when i2c tells us that bus is not there

2011-10-17 Thread Eugeni Dodonov
On Mon, Oct 17, 2011 at 18:41, Keith Packard kei...@keithp.com wrote:

 On Mon, 17 Oct 2011 11:12:29 -0200, Eugeni Dodonov 
 eugeni.dodo...@intel.com wrote:

  + if (ret == -ENXIO) {
  + DRM_DEBUG_KMS(drm: skipping non-existent adapter
 %s\n,
  + adapter-name);
  + break;
  + }

 This seems good to me; are there additional error values which should
 also be considered fatal and not subject to retry?


From what I've checked, the other return error value in this context could
be -EREMOTEIO, which could be caused by transmission error so it should be
retried.

-- 
Eugeni Dodonov
http://eugeni.dodonov.net/
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH 1/2] Give up on edid retries when i2c tells us that bus is not there

2011-10-17 Thread Keith Packard
On Mon, 17 Oct 2011 19:07:51 -0200, Eugeni Dodonov eug...@dodonov.net wrote:

 From what I've checked, the other return error value in this context could
 be -EREMOTEIO, which could be caused by transmission error so it should be
 retried.

Oh, there's -ENOMEM, -EINVAL and probably a few others down in the
bowels of the kernel i2c bits. Starting with the obvious (ENXIO) seems
safest to me.

-- 
keith.pack...@intel.com


pgpBgH8gSeXmr.pgp
Description: PGP signature
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH 1/2] Give up on edid retries when i2c tells us that bus is not there

2011-10-17 Thread Eugeni Dodonov
On Mon, Oct 17, 2011 at 20:41, Keith Packard kei...@keithp.com wrote:

 On Mon, 17 Oct 2011 19:07:51 -0200, Eugeni Dodonov eug...@dodonov.net
 wrote:

  From what I've checked, the other return error value in this context
 could
  be -EREMOTEIO, which could be caused by transmission error so it should
 be
  retried.

 Oh, there's -ENOMEM, -EINVAL and probably a few others down in the
 bowels of the kernel i2c bits. Starting with the obvious (ENXIO) seems
 safest to me.


Yes, of course, but I was referring to the values which could be returned
through the i2c-algo-bit call used in this edid detection call.

-- 
Eugeni Dodonov
 http://eugeni.dodonov.net/
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH 1/2] Give up on edid retries when i2c tells us that bus is not there

2011-10-07 Thread Eugeni Dodonov
This allows to avoid talking to a non-existent bus repeatedly until we
finally timeout. The non-existent bus is signalled by -ENXIO error,
provided by i2c_algo_bit:bit_doAddress call.

As the advantage of such change, all the other routines which use
drm_get_edid would benefit for this timeout.

As the disadvantage comes the fact that I only tested it on Intel cards,
so I am not sure whether it would work on nouveau and radeon.

This change should potentially fix
https://bugs.freedesktop.org/show_bug.cgi?id=41059

v2: change printk level to KERN_DEBUG to avoid filling up dmesg

Signed-off-by: Eugeni Dodonov 
---
 drivers/gpu/drm/drm_edid.c |5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index 7425e5c..5ed34f2 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -265,6 +265,11 @@ drm_do_probe_ddc_edid(struct i2c_adapter *adapter, 
unsigned char *buf,
}
};
ret = i2c_transfer(adapter, msgs, 2);
+   if (ret == -ENXIO) {
+   printk(KERN_DEBUG "drm: skipping non-existent adapter 
%s\n",
+   adapter->name);
+   break;
+   }
} while (ret != 2 && --retries);

return ret == 2 ? 0 : -1;
-- 
1.7.6.4



[PATCH 1/2] Give up on edid retries when i2c tells us that bus is not there

2011-10-07 Thread Eugeni Dodonov
This allows to avoid talking to a non-existent bus repeatedly until we
finally timeout. The non-existent bus is signalled by -ENXIO error,
provided by i2c_algo_bit:bit_doAddress call.

As the advantage of such change, all the other routines which use
drm_get_edid would benefit for this timeout.

As the disadvantage comes the fact that I only tested it on Intel cards,
so I am not sure whether it would work on nouveau and radeon.

This change should potentially fix
https://bugs.freedesktop.org/show_bug.cgi?id=41059

v2: change printk level to KERN_DEBUG to avoid filling up dmesg

Signed-off-by: Eugeni Dodonov eugeni.dodo...@intel.com
---
 drivers/gpu/drm/drm_edid.c |5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index 7425e5c..5ed34f2 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -265,6 +265,11 @@ drm_do_probe_ddc_edid(struct i2c_adapter *adapter, 
unsigned char *buf,
}
};
ret = i2c_transfer(adapter, msgs, 2);
+   if (ret == -ENXIO) {
+   printk(KERN_DEBUG drm: skipping non-existent adapter 
%s\n,
+   adapter-name);
+   break;
+   }
} while (ret != 2  --retries);
 
return ret == 2 ? 0 : -1;
-- 
1.7.6.4

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel