On Fri, 02 May 2008 16:25:36 -0500, David Milburn wrote:
> 
> Hi Jean,
> 
> Jean Delvare wrote:
> > Hi David,
> > 
> > On Tue, 29 Apr 2008 12:00:53 -0500, David Milburn wrote:
> > 
> >>Per the PIIX4 errata, there maybe a delay between setting the
> >>start bit in the Smbus Host Controller Register and the transaction
> >>actually starting. If the driver doesn't delay long enough, it
> >>may appear that the transaction is complete when actually it
> >>hasn't started, this may lead to bus collisions.
> > 
> > 
> > The driver was already waiting at least 1 ms before checking the busy
> > bit. I don't see any value mentioned in the PIIX4 specification update,
> > so what makes you think that 2 ms is required? And what makes you think
> > it's sufficient?
> 
> Yes, you are correct it doesn't specify a value, I was trying to
> make a minimal increase, but as you state below it probably would
> be better to wait intially instead of impacting the polling loop.
> 
> > 
> > I've never seen any problem with the i2c-piix4 driver on my PIIX4,
> > while I tested it hard at HZ=1000. On which chip did you hit the
> > problem? On what machine? Which transaction? How frequently? And how
> > did you notice? Details please.
> > 
> 
> It is always reproducible when running "sensors" on a Tyan
> Trinity GC-SL (s2707) with a Broadcom ServerWorks Grand Champion
> SL chipset and a Winbond W83782D.

OK, so not an original Intel PIIX4. Which south bridge is on the system
exactly, ServerWorks CSB5? Maybe the PIIX4 itself is happy with 1 ms
and that's why I don't see the problem.

Do you have a datasheet for this chip?

I think I have a old SuperMicro board with a ServerWorks OSB4, I'll do
some tests with it. I think it has only seen 2.4 kernels so far, so
HZ=100, where the initial delay was at least 10 ms. I'll test it at
HZ=1000 and see if I can reproduce the problem. If I can, then I guess
we are better increasing the delay for all the old ServerWorks chips
(OSB4, CSB5 and CSB6.) If I can't then maybe just increase the delay
for the CSB5?

> 
> With debugging turned on, here is the dmesg leading up to the
> timeout (more details at bugzilla.redhat.com BZ 182687).

Thanks. I asked the reporter for an lspci.

> 
> i2c_adapter i2c-0: Transaction (post): CNT=0c, CMD=aa, ADD=91, DAT0=4b, 
> DAT1=00
> i2c_adapter i2c-0: Transaction (pre): CNT=08, CMD=02, ADD=91, DAT0=4b, DAT1=00
> i2c_adapter i2c-0: temp 02, timeout 0 MAX_TIMEOUT 500
> i2c_adapter i2c-0: Transaction (post): CNT=08, CMD=02, ADD=91, DAT0=4b, 
> DAT1=00
> i2c_adapter i2c-0: Transaction (pre): CNT=08, CMD=16, ADD=91, DAT0=4b, DAT1=00
> i2c_adapter i2c-0: temp 09, timeout 501 MAX_TIMEOUT 500
> i2c_adapter i2c-0: SMBus Timeout!
> i2c_adapter i2c-0: Bus collision! SMBus may be locked until next hard reset. 
> (sorry!)
> i2c_adapter i2c-0: Failed reset at end of transaction (01)
> i2c_adapter i2c-0: Transaction (post): CNT=08, CMD=16, ADD=91, DAT0=4b, 
> DAT1=00
> i2c_adapter i2c-0: Transaction (pre): CNT=0c, CMD=00, ADD=91, DAT0=4b, DAT1=00
> i2c_adapter i2c-0: SMBus busy (01). Resetting...
> i2c_adapter i2c-0: Failed! (01)

> 
> >>Signed-off-by: David Milburn <[EMAIL PROTECTED]>
> >>---
> >> drivers/ata/libata-core.c      |    0 
> >> drivers/i2c/busses/i2c-piix4.c |    2 +-
> >> 2 files changed, 1 insertion(+), 1 deletion(-)
> >>
> >>diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
> >>diff --git a/drivers/i2c/busses/i2c-piix4.c b/drivers/i2c/busses/i2c-piix4.c
> >>index 9bbe96c..84a70ac 100644
> >>--- a/drivers/i2c/busses/i2c-piix4.c
> >>+++ b/drivers/i2c/busses/i2c-piix4.c
> >>@@ -232,7 +232,7 @@ static int piix4_transaction(void)
> >> 
> >>    /* We will always wait for a fraction of a second! (See PIIX4 docs 
> >> errata) */
> >>    do {
> >>-           msleep(1);
> >>+           msleep(2);
> >>            temp = inb_p(SMBHSTSTS);
> >>    } while ((temp & 0x01) && (timeout++ < MAX_TIMEOUT));
> >> 
> > 
> > 
> > This is not only increasing the delay before checking the busy bit
> > right after starting a transaction. This also slows down the polling
> > loop. And this also has the side effect of doubling the timeout - not
> > really your fault, count-based timeouts are broken by design.
> > 
> > I am additionally worried that you are changing this for all devices,
> > while presumably only the PIIX4 (and maybe the 82443MX? not sure) are
> > affected. If seems unfair to slow down the more recent devices.
> > 
> > I'd like you to update your patch to only change the initial wait time
> > and not the polling loop interval. It should be fairly easy. I also
> > would like you to make this change device-dependent, so that newer
> > devices aren't slowed down.
> 
> Ok, I will re-submit an updated patch after testing.
> 
> Thanks for reviewing,
> David

You're welcome. I'll help you with testing on PIIX4 and OSB4 (if I can
get my hands on it again.)

-- 
Jean Delvare

_______________________________________________
i2c mailing list
[email protected]
http://lists.lm-sensors.org/mailman/listinfo/i2c

Reply via email to