Mike Belopuhov wrote on 2013-05-23 21:55:
On Thu, May 23, 2013 at 11:04 +0200, David Imhoff wrote:
The problem seems to be in the Auto polling of the mac link state.
ifconfig shows the correct link state, but the BGE_STS_LINK bit in
the driver is never set. Debugging the interrupt routine I found the
driver never receives a link state changed attention, instead the
last interrupt indicates a auto-poll error and bit 0 of register
BGE_AUTOPOLL_STS is set.
Looking at the FreeBSD and Linux drivers I found that they don't use
auto-polling. Also the programmers guide of the BCM5718 family(page
204) doesn't describe the auto-polling method anymore, as was done
in the PG of other BCM57xx.
I agree that this might be a better solution than what I had:
setting the BGE_STS_LINK_EVT flag back in bge_link_upd to force
the bge_intr into running it every time untill it gets the link;
and it seems to work.
I also tried fixing auto-polling mode, but it kept giving me a lot
of bogus interrupts(few hundred) after bringing the link up. Using
the LNKRDY signal just gives 2 interrupts.
So I disabled Auto-polling for the BCM5718 family. This made all
ports work on my Dell R320(BCM5720) and BCM5719. My BCM5719 still
doesn't fully work because now i run into bge_watchdog reset's when
sending big packet, the same issue reported by Robert Young on 5
April.
Right, it seems like the TX completion interrupts never arrive
for large packets. Forcing txeof doesn't help. In fact I can
see some funny things going on here: for small ping packets 1
out of 1 tx buffer descriptors contain an mbuf to scratch, for
170 byte packets 1 out of 2 tx bd's contain an mbuf, for 250
byte packets 1 out of 3 tx bd's contain an mbuf, for 500 byte
packets TX completion interrupt never arrives.
In my setup the problem seems RX related. Sending a 1450
byte ping to a non-existing host using a forced arp entry works.
But when i do the same on the remote host and run tcpdump on
the BCM5719 interface, the controller crashes and the
watchdog will trigger when trying to send a (small) packet.
My steps to reproduce:
- connect bge host to a remote host using a hub or direct link
- on bge host:
# ifconfig bge0 192.168.1.1
# tcpdump -n -i bge0
- on remote host:
# ifconfig ??? 192.168.1.2
# arp -s 192.168.1.4 02:11:11:11:11:14
# ping -c 1 192.168.1.4
# ping -s 1450 -c 5 192.168.1.4
- on bge host, tcpdump shows only 1 packet received
- on bge host: ping -c 1 192.168.1.2
- wait a bit, and notice the watchdog reset
I had a suspicion that this is because ASF mode is turned on
and we fail to share a port with IPMI firmware, but right now
I'm more inclined to believe that this is just another TX bug.
Both of my BCM5719 cards are PCI/E expansion cards, so no port
sharing on them.
I also noticed if_bge.c line 2293:
BGE_SETBIT(sc, BGE_MI_MODE, BGE_MIMODE_AUTOPOLL|10<<16);
Does any one have a clue why the (10<<16) is there? it changes the
MI clock. But why? and why OR 0xA with the default 0xC, instead of
0x2? FreeBSD and Linux use the default of 0xC for the MI Clock.
I think it's a remnant of the original code and it should be
BGE_MIMODE_BASE (as in "the other frequency").
But since BGE_MIMODE_BASE is the default value after reset and
OR'ing the value wouldn't be correct, I suggest to just remove
the '|(10<<16)'.
Index: sys/dev/pci/if_bge.c
===================================================================
RCS file: /cvs/src/sys/dev/pci/if_bge.c,v
retrieving revision 1.320
diff -u -r1.320 if_bge.c
--- sys/dev/pci/if_bge.c 4 Mar 2013 01:33:18 -0000 1.320
+++ sys/dev/pci/if_bge.c 21 May 2013 13:18:36 -0000
@@ -2288,7 +2288,9 @@
/* Enable PHY auto polling (for MII/GMII only) */
if (sc->bge_flags & BGE_PHY_FIBER_TBI) {
CSR_WRITE_4(sc, BGE_MI_STS, BGE_MISTS_LINK);
- } else {
+ } else if (!(BGE_ASICREV(sc->bge_chipid) == BGE_ASICREV_BCM5717 ||
+ BGE_ASICREV(sc->bge_chipid) == BGE_ASICREV_BCM5719 ||
+ BGE_ASICREV(sc->bge_chipid) == BGE_ASICREV_BCM5720)) {
BGE_STS_SETBIT(sc, BGE_STS_AUTOPOLL);
BGE_SETBIT(sc, BGE_MI_MODE, BGE_MIMODE_AUTOPOLL|10<<16);
if (BGE_ASICREV(sc->bge_chipid) == BGE_ASICREV_BCM5700)
@@ -4386,18 +4388,13 @@
if_link_state_change(ifp);
ifp->if_baudrate = 0;
}
- /*
- * Discard link events for MII/GMII cards if MI auto-polling
disabled.
- * This should not happen since mii callouts are locked now, but
- * we keep this check for debug.
- */
- } else if (BGE_STS_BIT(sc, BGE_STS_AUTOPOLL)) {
+ } else {
/*
I think you've missed some important bits here.
I tried to keep the functional changes to a minimum. As far as I
could see the new code didn't do anything less then the original.
* Some broken BCM chips have BGE_STATFLAG_LINKSTATE_CHANGED bit
* in status word always set. Workaround this bug by reading
* PHY link status directly.
*/
- link = (CSR_READ_4(sc, BGE_MI_STS) & BGE_MISTS_LINK)?
+ link = (CSR_READ_4(sc, BGE_TX_STS) & BGE_TXSTAT_LINK_UP)?
BGE_STS_LINK : 0;
if (BGE_STS_BIT(sc, BGE_STS_LINK) != link) {
Why did you change BGE_MI_STS to BGE_TX_STS?
BGE_MI_STS didn't update the link flag when I disabled auto-polling.
The programmer guide 57xx-PG105-R.pdf page 242 says that BGE_TX_STS
must be used when Not using auto-polling, and can be used when using
auto-polling.
So in 'theory' this should work for all BGE chips in both auto-
polling and LNKRDY mode.
Please inspect my diff below; I believe it's more complete.
diff --git sys/dev/pci/if_bge.c sys/dev/pci/if_bge.c
index 1d37192..d357443 100644
--- sys/dev/pci/if_bge.c
+++ sys/dev/pci/if_bge.c
@@ -1055,10 +1055,22 @@ bge_miibus_statchg(struct device *dev)
(mii->mii_media_active & IFM_ETH_FMASK) != sc->bge_flowflags) {
sc->bge_flowflags = mii->mii_media_active & IFM_ETH_FMASK;
mii->mii_media_active &= ~IFM_ETH_FMASK;
}
+ if (!BGE_STS_BIT(sc, BGE_STS_LINK) &&
+ mii->mii_media_status & IFM_ACTIVE &&
+ IFM_SUBTYPE(mii->mii_media_active) != IFM_NONE)
+ BGE_STS_SETBIT(sc, BGE_STS_LINK);
+ else if (BGE_STS_BIT(sc, BGE_STS_LINK) &&
+ (!(mii->mii_media_status & IFM_ACTIVE) ||
+ IFM_SUBTYPE(mii->mii_media_active) == IFM_NONE))
+ BGE_STS_CLRBIT(sc, BGE_STS_LINK);
+
+ if (!BGE_STS_BIT(sc, BGE_STS_LINK))
+ return;
+
/* Set the port mode (MII/GMII) to match the link speed. */
mac_mode = CSR_READ_4(sc, BGE_MAC_MODE) &
~(BGE_MACMODE_PORTMODE | BGE_MACMODE_HALF_DUPLEX);
tx_mode = CSR_READ_4(sc, BGE_TX_MODE);
rx_mode = CSR_READ_4(sc, BGE_RX_MODE);
@@ -1773,11 +1785,11 @@ int
bge_blockinit(struct bge_softc *sc)
{
volatile struct bge_rcb *rcb;
vaddr_t rcb_addr;
bge_hostaddr taddr;
- u_int32_t dmactl, val;
+ u_int32_t dmactl, mimode, val;
int i, limit;
/*
* Initialize the memory window pointer register so that
* we can access the first 32K of internal NIC RAM. This will
@@ -2369,12 +2381,20 @@ bge_blockinit(struct bge_softc *sc)
/* Enable PHY auto polling (for MII/GMII only) */
if (sc->bge_flags & BGE_PHY_FIBER_TBI) {
CSR_WRITE_4(sc, BGE_MI_STS, BGE_MISTS_LINK);
} else {
- BGE_STS_SETBIT(sc, BGE_STS_AUTOPOLL);
- BGE_SETBIT(sc, BGE_MI_MODE, BGE_MIMODE_AUTOPOLL|10<<16);
+ if ((sc->bge_flags & BGE_CPMU_PRESENT) != 0)
+ mimode = BGE_MIMODE_500KHZ_CONST;
+ else
+ mimode = BGE_MIMODE_BASE;
Line 2643 already sets the 500KHZ Const clock.
This also changes the "Use Short Preamble" flag and "PHY Address"
which are not 0 by default.
+ if (BGE_IS_5700_FAMILY(sc) ||
+ BGE_ASICREV(sc->bge_chipid) == BGE_ASICREV_BCM5705) {
+ mimode |= BGE_MIMODE_AUTOPOLL;
+ BGE_STS_SETBIT(sc, BGE_STS_AUTOPOLL);
+ }
I didn't dare to change the behaviour for non BCM5718 family
cards since this might break compatibility with cards that are
currently working. I don't know what the official policy of
OpenBSD is for these kind of changes.
+ CSR_WRITE_4(sc, BGE_MI_MODE, mimode);
if (BGE_ASICREV(sc->bge_chipid) == BGE_ASICREV_BCM5700)
FreeBSD excludes the BGE_CHIPID_BCM5700_B2 chip here, so
maybe we should also do so.
CSR_WRITE_4(sc, BGE_MAC_EVT_ENB,
BGE_EVTENB_MI_INTERRUPT);
}
@@ -4515,10 +4535,17 @@ bge_link_upd(struct bge_softc *sc)
else if (BGE_STS_BIT(sc, BGE_STS_LINK) &&
(!(mii->mii_media_status & IFM_ACTIVE) ||
IFM_SUBTYPE(mii->mii_media_active) == IFM_NONE))
BGE_STS_CLRBIT(sc, BGE_STS_LINK);
}
+ } else {
+ /*
+ * For controllers that call mii_tick, we have to poll
+ * link status.
+ */
I wonder if it's possible that a interrupt is triggered during a
bge_miibus_writereg/bge_miibus_readreg on a non-APE firmware? In that
case this code path would also be taken on cards that use auto-polling.
Also do we still have to check for broken chips that have the
BGE_STATFLAG_LINKSTATE_CHANGED always set?
+ mii_pollstat(mii);
+ bge_miibus_statchg(&sc->bge_dev);
}
/* Clear the attention */
CSR_WRITE_4(sc, BGE_MAC_STS, BGE_MACSTAT_SYNC_CHANGED|
BGE_MACSTAT_CFG_CHANGED|BGE_MACSTAT_MI_COMPLETE|
Attached below is a new version of your patch with my comments
applied.
Kind regards,
David
Index: sys/dev/pci/if_bge.c
===================================================================
RCS file: /cvs/src/sys/dev/pci/if_bge.c,v
retrieving revision 1.328
diff -u -p -r1.328 if_bge.c
--- sys/dev/pci/if_bge.c 22 May 2013 16:02:31 -0000 1.328
+++ sys/dev/pci/if_bge.c 24 May 2013 13:40:33 -0000
@@ -1057,6 +1057,18 @@ bge_miibus_statchg(struct device *dev)
mii->mii_media_active &= ~IFM_ETH_FMASK;
}
+ if (!BGE_STS_BIT(sc, BGE_STS_LINK) &&
+ mii->mii_media_status & IFM_ACTIVE &&
+ IFM_SUBTYPE(mii->mii_media_active) != IFM_NONE)
+ BGE_STS_SETBIT(sc, BGE_STS_LINK);
+ else if (BGE_STS_BIT(sc, BGE_STS_LINK) &&
+ (!(mii->mii_media_status & IFM_ACTIVE) ||
+ IFM_SUBTYPE(mii->mii_media_active) == IFM_NONE))
+ BGE_STS_CLRBIT(sc, BGE_STS_LINK);
+
+ if (!BGE_STS_BIT(sc, BGE_STS_LINK))
+ return;
+
/* Set the port mode (MII/GMII) to match the link speed. */
mac_mode = CSR_READ_4(sc, BGE_MAC_MODE) &
~(BGE_MACMODE_PORTMODE | BGE_MACMODE_HALF_DUPLEX);
@@ -2371,9 +2383,13 @@ bge_blockinit(struct bge_softc *sc)
if (sc->bge_flags & BGE_PHY_FIBER_TBI) {
CSR_WRITE_4(sc, BGE_MI_STS, BGE_MISTS_LINK);
} else {
- BGE_STS_SETBIT(sc, BGE_STS_AUTOPOLL);
- BGE_SETBIT(sc, BGE_MI_MODE, BGE_MIMODE_AUTOPOLL|10<<16);
- if (BGE_ASICREV(sc->bge_chipid) == BGE_ASICREV_BCM5700)
+ if (BGE_IS_5700_FAMILY(sc) ||
+ BGE_ASICREV(sc->bge_chipid) == BGE_ASICREV_BCM5705) {
+ BGE_STS_SETBIT(sc, BGE_STS_AUTOPOLL);
+ BGE_SETBIT(sc, BGE_MI_MODE, BGE_MIMODE_AUTOPOLL);
+ }
+ if (BGE_ASICREV(sc->bge_chipid) == BGE_ASICREV_BCM5700 &&
+ BGE_CHIPREV(sc->bge_chipid) != BGE_CHIPID_BCM5700_B2)
CSR_WRITE_4(sc, BGE_MAC_EVT_ENB,
BGE_EVTENB_MI_INTERRUPT);
}
@@ -4491,11 +4507,6 @@ bge_link_upd(struct bge_softc *sc)
if_link_state_change(ifp);
ifp->if_baudrate = 0;
}
- /*
- * Discard link events for MII/GMII cards if MI auto-polling
disabled.
- * This should not happen since mii callouts are locked now, but
- * we keep this check for debug.
- */
} else if (BGE_STS_BIT(sc, BGE_STS_AUTOPOLL)) {
/*
* Some broken BCM chips have BGE_STATFLAG_LINKSTATE_CHANGED bit
@@ -4517,6 +4528,13 @@ bge_link_upd(struct bge_softc *sc)
IFM_SUBTYPE(mii->mii_media_active) == IFM_NONE))
BGE_STS_CLRBIT(sc, BGE_STS_LINK);
}
+ } else {
+ /*
+ * For controllers that call mii_tick, we have to poll
+ * link status.
+ */
+ mii_pollstat(mii);
+ bge_miibus_statchg(&sc->bge_dev);
}
/* Clear the attention */