Anders Grafström wrote:
Auke Kok wrote:
Although the spec itself doesn't talk about PHY reset times, I ran this
patch with some debugging output on a few boxes and changed some
speed/duplex settings, and the PHY reset returned successfully after the
very first mdio_read, which is before any msleep(10) is executed. That is
also the expected behaviour.

I think you might be confusing this with a MAC reset, which has a
documented 10 usec timeout (see the 8255x developer's manual). The driver
already adheres to this by doing a 20 usec delay after software/selective
resets.

Which gets us back to the original problem: how did your driver end up in
loopback mode? (And how did you figure out that it did?)


This is what the 2.4.33.3 driver does:

void
e100_phy_reset(struct e100_private *bdp)
{
        u16 ctrl_reg;
        ctrl_reg = BMCR_RESET;
        e100_mdi_write(bdp, MII_BMCR, bdp->phy_addr, ctrl_reg);
        /* ieee 802.3 : The reset process shall be completed       */
        /* within 0.5 seconds from the setting of PHY reset bit.   */
        set_current_state(TASK_UNINTERRUPTIBLE);
        schedule_timeout(HZ / 2);
}

And here
http://www.cs.helsinki.fi/linux/linux-kernel/2003-23/1245.html
I found this entry:

<[EMAIL PROTECTED]> (03/06/08 1.1218)
[e100] misc
<...>
* Add 1/2 second delay after PHY reset to allow link partner to
see and respond to reset, per IEEE 802.3.


I ran mii-diag when the LEDs went out and the register dump
said it was in loopback. It is somewhat difficult to reproduce.
It seems to be timing dependent; something else has to occur
at the same time.
I must confess I have only seen it with the 2.6.13 kernel.
I have not been able to reproduce it with 2.6.18.
But I have found no change in the driver that would fix it so
I suspect the problem is still there.
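
For readers following along, the loopback state reported by the register dump corresponds to one bit in the MII Basic Mode Control Register (register 0). The sketch below decodes a raw BMCR value; the bit masks match the names in the kernel's <linux/mii.h>, while the helper functions themselves are hypothetical and only meant for illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* Bits of the MII Basic Mode Control Register (register 0),
 * named as in the kernel's <linux/mii.h>. */
#define BMCR_RESET    0x8000u  /* PHY reset requested; self-clearing */
#define BMCR_LOOPBACK 0x4000u  /* transmit data looped back to receive */
#define BMCR_ISOLATE  0x0400u  /* PHY isolated from the MII */

/* Hypothetical helper: non-zero if the dumped BMCR has loopback set. */
static int bmcr_in_loopback(uint16_t bmcr)
{
    return (bmcr & BMCR_LOOPBACK) != 0;
}

/* Hypothetical helper: non-zero if the reset bit has not cleared yet. */
static int bmcr_reset_pending(uint16_t bmcr)
{
    return (bmcr & BMCR_RESET) != 0;
}

/* Hypothetical helper: print the flags relevant to this thread. */
static void bmcr_describe(uint16_t bmcr)
{
    printf("BMCR=0x%04x reset=%d loopback=%d isolate=%d\n",
           (unsigned)bmcr,
           bmcr_reset_pending(bmcr),
           bmcr_in_loopback(bmcr),
           (bmcr & BMCR_ISOLATE) != 0);
}
```

Feeding the register 0 value from a dump into bmcr_describe() shows at a glance whether the PHY was left in loopback or still had the reset bit set.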

I have tried adding debug output to see if I can read back the
RESET bit in set state, but then the problem refuses to show
so I don't think I can rule out an unfinished PHY reset.

Theoretically, yes, the IEEE spec PHY reset timeout is kind of silly: in no way do we assume that we have renegotiated link after half a second! Other code in the driver should take care of that, and since it works I'll assume it does ;)

The mdio_read probably acts as a flush to the hardware too, masking problems: more goodness. Perhaps we should do a single read in all cases and forget about the timeout (is there an mdio_write_flush?)

Basically the timeout is wrong: a LINK reset is not a PHY reset. The PHY is back online and ready to respond in (probably) a single clock cycle. The link can take up to 3 seconds in normal cases. Waiting for 1/2 a second does not fix anything there. Here's where the 8255x (PHY part) spec abandons us: I don't read anything about PHY reset timeouts in it.

Can you try to debug whether your while () timeout loop is actually waiting for a significant amount of time? Something like adding a printk(KERN_ERR "counted down to %d0 msec\n", counter); after the entire while {} loop should show you whether the time the PHY needs to come back online after a reset varies.
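
A minimal userspace sketch of that countdown check, for illustration only. It is not the patch under discussion: struct fake_phy, fake_mdio_read() and phy_reset_wait() are hypothetical stand-ins, printf replaces printk, and the place where the real loop would call msleep(10) is only marked with a comment.

```c
#include <stdint.h>
#include <stdio.h>

#define MII_BMCR   0x00     /* Basic Mode Control Register */
#define BMCR_RESET 0x8000u  /* self-clearing PHY reset bit */

/* Hypothetical stand-in for the PHY: the reset bit clears on the
 * read that brings reads_until_ready down to zero. */
struct fake_phy {
    uint16_t bmcr;
    int reads_until_ready;
};

/* Hypothetical replacement for the driver's mdio_read(). */
static uint16_t fake_mdio_read(struct fake_phy *phy, int reg)
{
    (void)reg;  /* only BMCR is modelled */
    if (phy->reads_until_ready > 0)
        phy->reads_until_ready--;
    if (phy->reads_until_ready == 0)
        phy->bmcr &= (uint16_t)~BMCR_RESET;
    return phy->bmcr;
}

/* Set the reset bit, then poll until it clears or the budget runs out.
 * Each unit of counter stands for one 10 msec sleep in a real driver.
 * Returns the remaining counter so the caller can see how long the
 * loop actually waited. */
static int phy_reset_wait(struct fake_phy *phy, int counter)
{
    phy->bmcr |= BMCR_RESET;  /* stands in for the mdio_write of BMCR_RESET */

    while ((fake_mdio_read(phy, MII_BMCR) & BMCR_RESET) && counter > 0) {
        counter--;            /* a real driver would msleep(10) here */
    }

    /* The debug line suggested above, placed after the whole loop. */
    printf("counted down to %d0 msec\n", counter);
    return counter;
}
```

With a budget of 50 (half a second in 10 msec steps), a PHY that is ready on the first read leaves the counter untouched, which is the behaviour described earlier in this thread. Any variation in the printed value across runs would point at a reset that takes measurable time.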

Running mii-diag before the link comes back up might be causing the issue in the first place, and certainly suggests a small race.

Have you tried to run the e100-sbit branch from jgarzik's netdev-2.6 tree? We're still looking into merging this and I guess I should push it to -mm to have it receive some testing....

Cheers,

Auke