On Sat, Dec 08, 2018 at 05:32:41PM +0000, Miod Vallat wrote:
> I have noticed, for a while, that my O2 systems were horribly slow
> during installs or upgrades, when fetching sets from the network (28
> *minutes* to fetch base64.tgz).
>
> At first, I thought this was a bsd.rd specific bug, but couldn't find
> anything obvious. After gathering enough data, I found out that the
> problem only occurs on a cold boot. After a reboot, the network
> performance is as good as it can be. That would explain why I would only
> notice it during upgrades.
>
> I also noticed that, on a warm boot, the dmesg would show:
>
> mec0 at macebus0 base 0x00280000 irq 3: MAC-110 rev 1, address
> 08:00:69:0e:bf:a1
> nsphy0 at mec0 phy 8: DP83840 10/100 PHY, rev. 1
>
> but on cold boots, it would show:
>
> mec0 at macebus0 base 0x00280000 irq 3: MAC-110 rev 1, address
> 08:00:69:0e:bf:a1
> nsphy0 at mec0 phy 10: DP83840 10/100 PHY, rev. 1
>
> Note that, in these cases, the phy seems to attach to a different
> address. In these cases, after booting, "ifconfig mec0" would show:
>
> mec0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
>         lladdr 08:00:69:0e:bf:a1
>         llprio 3
>         media: Ethernet autoselect
>         status: active
>         inet 10.0.1.193 netmask 0xff000000 broadcast 10.255.255.255
>
> while one would expect the "media" line to be similar to:
>
>         media: Ethernet autoselect (100baseTX full-duplex)
>
> Investigating further, it seems that, after a cold boot, the MII bus
> takes some time to initialize; the phy does not answer to address 8 but
> to a larger address (10 or 11), then, after being reset, to its correct
> address of 8.
>
> So the kernel would discover the phy at a wrong address, attach it, and
> after it gets reset, reading from the phy at the wrong address would
> return either all bits clear or all bits set, confusing the link speed
> logic without any way to recover.
>
> What I tried but did not work:
> - invoking mii_attach() twice in the mec driver. This would attach nsphy
>   twice, once at the wrong address, then once at the correct address,
>   but the first (wrong) attachment would be preferred.
> - adding a one second delay between the Ethernet interface reset and
>   mii_attach(). This would work most of the time, but not always.
>
> What I tried and works:
> - the first time the interface is reset, the mii bus is walked and all
>   phys found on it are reset. Thus, by the time mii_attach() runs and
>   walks the bus again, the phy will answer at the right address.
>
> The diff below implements this (last chunk of if_mec.c), and also cleans
> the mii read/write routines a bit (all the other chunks).
>
> Tested on three different R5K family O2 systems, which have all been
> exposing that problem on cold boot.

Committed. Thank you!
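
For readers without the diff at hand, here is a rough, standalone sketch of the idea described in the quoted mail: walk the MII bus once, reset every phy that answers, and only then probe. This is not the committed if_mec.c code. It is a userland simulation, and every name in it (sim_bus, sim_read, sim_write, sim_probe, sim_reset_all, the SIM_* constants) is hypothetical. The only things taken as given are the standard MII layout (32 management addresses, control register 0 with reset in bit 15, status register 1) and the behaviour reported above: the phy answers at 10 after a cold boot, at 8 once reset, and an unanswered read comes back as all bits set or all bits clear.

```c
#include <assert.h>
#include <stdint.h>

#define SIM_NPHY        32      /* MII management addresses 0..31 */
#define SIM_BMCR        0x00    /* control register */
#define SIM_BMSR        0x01    /* status register */
#define SIM_BMCR_RESET  0x8000  /* software reset bit in the control register */
#define SIM_BMSR_VALUE  0x7809  /* arbitrary "looks like a real phy" status */

/* Simulated bus holding a single phy (hypothetical model). */
struct sim_bus {
	int phy_addr;       /* address the phy currently answers at */
	int strapped_addr;  /* address it answers at once it has been reset */
};

/* Read a register; addresses nobody answers at read back as all ones. */
static uint16_t
sim_read(const struct sim_bus *bus, int phy, int reg)
{
	assert(phy >= 0 && phy < SIM_NPHY);
	if (phy != bus->phy_addr)
		return 0xffff;
	return (reg == SIM_BMSR) ? SIM_BMSR_VALUE : 0x0000;
}

/* Write a register; a reset moves the phy to its strapped address. */
static void
sim_write(struct sim_bus *bus, int phy, int reg, uint16_t val)
{
	assert(phy >= 0 && phy < SIM_NPHY);
	if (phy != bus->phy_addr)
		return;
	if (reg == SIM_BMCR && (val & SIM_BMCR_RESET))
		bus->phy_addr = bus->strapped_addr;
}

/* Stand-in for the bus walk done at attach time: first address that answers. */
static int
sim_probe(const struct sim_bus *bus)
{
	int phy;

	for (phy = 0; phy < SIM_NPHY; phy++) {
		uint16_t bmsr = sim_read(bus, phy, SIM_BMSR);
		if (bmsr != 0x0000 && bmsr != 0xffff)
			return phy;
	}
	return -1;
}

/*
 * The workaround: reset every phy found on the bus before probing.
 * A real driver would also wait for the reset bit to clear; the
 * simulation resets instantly, so that step is omitted here.
 */
static int
sim_reset_all(struct sim_bus *bus)
{
	int phy, nreset = 0;

	for (phy = 0; phy < SIM_NPHY; phy++) {
		uint16_t bmsr = sim_read(bus, phy, SIM_BMSR);
		if (bmsr == 0x0000 || bmsr == 0xffff)
			continue;
		sim_write(bus, phy, SIM_BMCR, SIM_BMCR_RESET);
		nreset++;
	}
	return nreset;
}
```

With a bus initialised as { 10, 8 } (the cold-boot case), sim_probe() alone finds the phy at 10, which is the address that goes dead after the first reset. Calling sim_reset_all() first makes the subsequent sim_probe() return 8, and reads at 10 then return 0xffff, matching the symptom described in the quoted mail.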