Mike Belopuhov wrote on 2013-05-23 21:55:
On Thu, May 23, 2013 at 11:04 +0200, David Imhoff wrote:
The problem seems to be in the Auto polling of the mac link state.
ifconfig shows the correct link state, but the BGE_STS_LINK bit in
the driver is never set. Debugging the interrupt routine I found the
driver never receives a link state changed attention, instead the
last interrupt indicates a auto-poll error and bit 0 of register
BGE_AUTOPOLL_STS is set.
Looking at the FreeBSD and Linux drivers I found that they don't use
auto-polling. Also the programmers guide of the BCM5718 family(page
204) doesn't describe the auto-polling method anymore, as was done
in the PG of other BCM57xx.
I agree that this might be a better solution than what I had:
setting the BGE_STS_LINK_EVT flag back in bge_link_upd to force
the bge_intr into running it every time untill it gets the link;
and it seems to work.

I also tried fixing auto-polling mode, but it kept giving me a lot
of bogus interrupts(few hundred) after bringing the link up. Using
the LNKRDY signal just gives 2 interrupts.

So I disabled Auto-polling for the BCM5718 family. This made all
ports work on my Dell R320(BCM5720) and BCM5719. My BCM5719 still
doesn't fully work because now i run into bge_watchdog reset's when
sending big packet, the same issue reported by Robert Young on 5
April.

Right, it seems like the TX completion interrupts never arrive
for large packets.  Forcing txeof doesn't help.  In fact I can
see some funny things going on here: for small ping packets 1
out of 1 tx buffer descriptors contain an mbuf to scratch, for
170 byte packets 1 out of 2 tx bd's contain an mbuf, for 250
byte packets 1 out of 3 tx bd's contain an mbuf, for 500 byte
packets TX completion interrupt never arrives.

In my setup the problem seems RX related. Sending a 1450
byte ping to a non-existing host using a forced arp entry works.
But when i do the same on the remote host and run tcpdump on
the BCM5719 interface, the controller crashes and the
watchdog will trigger when trying to send a (small) packet.

My steps to reproduce:
 - connect bge host to a remote host using a hub or direct link
 - on bge host:
   # ifconfig bge0 192.168.1.1
   # tcpdump -n -i bge0
 - on remote host:
   # ifconfig ??? 192.168.1.2
   # arp -s 192.168.1.4 02:11:11:11:11:14
   # ping -c 1 192.168.1.4
   # ping -s 1450 -c 5 192.168.1.4
 - on bge host, tcpdump shows only 1 packet received
 - on bge host: ping -c 1 192.168.1.2
 - wait a bit, and notice the watchdog reset

I had a suspicion that this is because ASF mode is turned on
and we fail to share a port with IPMI firmware, but right now
I'm more inclined to believe that this is just another TX bug.

Both of my BCM5719 cards are PCI/E expansion cards, so no port
sharing on them.

I also noticed if_bge.c line 2293:
BGE_SETBIT(sc, BGE_MI_MODE, BGE_MIMODE_AUTOPOLL|10<<16);
Does any one have a clue why the (10<<16) is there? it changes the
MI clock. But why? and why OR 0xA with the default 0xC, instead of
0x2? FreeBSD and Linux use the default of 0xC for the MI Clock.

I think it's a remnant of the original code and it should be
BGE_MIMODE_BASE (as in "the other frequency").

But since BGE_MIMODE_BASE is the default value after reset and
OR'ing the value wouldn't be correct, I suggest to just remove
the '|(10<<16)'.

Index: sys/dev/pci/if_bge.c
===================================================================
RCS file: /cvs/src/sys/dev/pci/if_bge.c,v
retrieving revision 1.320
diff -u -r1.320 if_bge.c
--- sys/dev/pci/if_bge.c        4 Mar 2013 01:33:18 -0000       1.320
+++ sys/dev/pci/if_bge.c        21 May 2013 13:18:36 -0000
@@ -2288,7 +2288,9 @@
        /* Enable PHY auto polling (for MII/GMII only) */
        if (sc->bge_flags & BGE_PHY_FIBER_TBI) {
                CSR_WRITE_4(sc, BGE_MI_STS, BGE_MISTS_LINK);
-       } else {
+       } else if (!(BGE_ASICREV(sc->bge_chipid) == BGE_ASICREV_BCM5717 ||
+           BGE_ASICREV(sc->bge_chipid) == BGE_ASICREV_BCM5719 ||
+           BGE_ASICREV(sc->bge_chipid) == BGE_ASICREV_BCM5720)) {
                BGE_STS_SETBIT(sc, BGE_STS_AUTOPOLL);
                BGE_SETBIT(sc, BGE_MI_MODE, BGE_MIMODE_AUTOPOLL|10<<16);
                if (BGE_ASICREV(sc->bge_chipid) == BGE_ASICREV_BCM5700)
@@ -4386,18 +4388,13 @@
                        if_link_state_change(ifp);
                        ifp->if_baudrate = 0;
                }
-       /*
-        * Discard link events for MII/GMII cards if MI auto-polling
disabled.
-        * This should not happen since mii callouts are locked now, but
-        * we keep this check for debug.
-        */
-       } else if (BGE_STS_BIT(sc, BGE_STS_AUTOPOLL)) {
+       } else {
                /*
I think you've missed some important bits here.

I tried to keep the functional changes to a minimum. As far as I
could see the new code didn't do anything less then the original.


                 * Some broken BCM chips have BGE_STATFLAG_LINKSTATE_CHANGED bit
                 * in status word always set. Workaround this bug by reading
                 * PHY link status directly.
                 */
-               link = (CSR_READ_4(sc, BGE_MI_STS) & BGE_MISTS_LINK)?
+               link = (CSR_READ_4(sc, BGE_TX_STS) & BGE_TXSTAT_LINK_UP)?
                    BGE_STS_LINK : 0;

                if (BGE_STS_BIT(sc, BGE_STS_LINK) != link) {


Why did you change BGE_MI_STS to BGE_TX_STS?

BGE_MI_STS didn't update the link flag when I disabled auto-polling.
The programmer guide 57xx-PG105-R.pdf page 242 says that BGE_TX_STS
must be used when Not using auto-polling, and can be used when using
auto-polling.

So in 'theory' this should work for all BGE chips in both auto-
polling and LNKRDY mode.

Please inspect my diff below; I believe it's more complete.
diff --git sys/dev/pci/if_bge.c sys/dev/pci/if_bge.c
index 1d37192..d357443 100644
--- sys/dev/pci/if_bge.c
+++ sys/dev/pci/if_bge.c
@@ -1055,10 +1055,22 @@ bge_miibus_statchg(struct device *dev)
            (mii->mii_media_active & IFM_ETH_FMASK) != sc->bge_flowflags) {
                sc->bge_flowflags = mii->mii_media_active & IFM_ETH_FMASK;
                mii->mii_media_active &= ~IFM_ETH_FMASK;
        }
+       if (!BGE_STS_BIT(sc, BGE_STS_LINK) &&
+           mii->mii_media_status & IFM_ACTIVE &&
+           IFM_SUBTYPE(mii->mii_media_active) != IFM_NONE)
+               BGE_STS_SETBIT(sc, BGE_STS_LINK);
+       else if (BGE_STS_BIT(sc, BGE_STS_LINK) &&
+           (!(mii->mii_media_status & IFM_ACTIVE) ||
+           IFM_SUBTYPE(mii->mii_media_active) == IFM_NONE))
+               BGE_STS_CLRBIT(sc, BGE_STS_LINK);
+
+       if (!BGE_STS_BIT(sc, BGE_STS_LINK))
+               return;
+
        /* Set the port mode (MII/GMII) to match the link speed. */
        mac_mode = CSR_READ_4(sc, BGE_MAC_MODE) &
            ~(BGE_MACMODE_PORTMODE | BGE_MACMODE_HALF_DUPLEX);
        tx_mode = CSR_READ_4(sc, BGE_TX_MODE);
        rx_mode = CSR_READ_4(sc, BGE_RX_MODE);
@@ -1773,11 +1785,11 @@ int
 bge_blockinit(struct bge_softc *sc)
 {
        volatile struct bge_rcb         *rcb;
        vaddr_t                 rcb_addr;
        bge_hostaddr            taddr;
-       u_int32_t               dmactl, val;
+       u_int32_t               dmactl, mimode, val;
        int                     i, limit;

        /*
         * Initialize the memory window pointer register so that
         * we can access the first 32K of internal NIC RAM. This will
@@ -2369,12 +2381,20 @@ bge_blockinit(struct bge_softc *sc)

        /* Enable PHY auto polling (for MII/GMII only) */
        if (sc->bge_flags & BGE_PHY_FIBER_TBI) {
                CSR_WRITE_4(sc, BGE_MI_STS, BGE_MISTS_LINK);
        } else {
-               BGE_STS_SETBIT(sc, BGE_STS_AUTOPOLL);
-               BGE_SETBIT(sc, BGE_MI_MODE, BGE_MIMODE_AUTOPOLL|10<<16);
+               if ((sc->bge_flags & BGE_CPMU_PRESENT) != 0)
+                       mimode = BGE_MIMODE_500KHZ_CONST;
+               else
+                       mimode = BGE_MIMODE_BASE;

Line 2643 already sets the 500KHZ Const clock.
This also changes the "Use Short Preamble" flag and "PHY Address"
which are not 0 by default.

+               if (BGE_IS_5700_FAMILY(sc) ||
+                   BGE_ASICREV(sc->bge_chipid) == BGE_ASICREV_BCM5705) {
+                       mimode |= BGE_MIMODE_AUTOPOLL;
+                       BGE_STS_SETBIT(sc, BGE_STS_AUTOPOLL);
+               }

I didn't dare to change the behaviour for non BCM5718 family
cards since this might break compatibility with cards that are
currently working. I don't know what the official policy of
OpenBSD is for these kind of changes.

+               CSR_WRITE_4(sc, BGE_MI_MODE, mimode);
                if (BGE_ASICREV(sc->bge_chipid) == BGE_ASICREV_BCM5700)

FreeBSD excludes the BGE_CHIPID_BCM5700_B2 chip here, so
maybe we should also do so.

                        CSR_WRITE_4(sc, BGE_MAC_EVT_ENB,
                            BGE_EVTENB_MI_INTERRUPT);
        }
@@ -4515,10 +4535,17 @@ bge_link_upd(struct bge_softc *sc)
                        else if (BGE_STS_BIT(sc, BGE_STS_LINK) &&
                            (!(mii->mii_media_status & IFM_ACTIVE) ||
                            IFM_SUBTYPE(mii->mii_media_active) == IFM_NONE))
                                BGE_STS_CLRBIT(sc, BGE_STS_LINK);
                }
+       } else {
+               /*
+                * For controllers that call mii_tick, we have to poll
+                * link status.
+                */

I wonder if it's possible that a interrupt is triggered during a
bge_miibus_writereg/bge_miibus_readreg on a non-APE firmware? In that
case this code path would also be taken on cards that use auto-polling.

Also do we still have to check for broken chips that have the
BGE_STATFLAG_LINKSTATE_CHANGED always set?

+               mii_pollstat(mii);
+               bge_miibus_statchg(&sc->bge_dev);
        }

        /* Clear the attention */
        CSR_WRITE_4(sc, BGE_MAC_STS, BGE_MACSTAT_SYNC_CHANGED|
            BGE_MACSTAT_CFG_CHANGED|BGE_MACSTAT_MI_COMPLETE|

Attached below is a new version of your patch with my comments
applied.

Kind regards,

David

Index: sys/dev/pci/if_bge.c
===================================================================
RCS file: /cvs/src/sys/dev/pci/if_bge.c,v
retrieving revision 1.328
diff -u -p -r1.328 if_bge.c
--- sys/dev/pci/if_bge.c        22 May 2013 16:02:31 -0000      1.328
+++ sys/dev/pci/if_bge.c        24 May 2013 13:40:33 -0000
@@ -1057,6 +1057,18 @@ bge_miibus_statchg(struct device *dev)
                mii->mii_media_active &= ~IFM_ETH_FMASK;
        }

+       if (!BGE_STS_BIT(sc, BGE_STS_LINK) &&
+           mii->mii_media_status & IFM_ACTIVE &&
+           IFM_SUBTYPE(mii->mii_media_active) != IFM_NONE)
+               BGE_STS_SETBIT(sc, BGE_STS_LINK);
+       else if (BGE_STS_BIT(sc, BGE_STS_LINK) &&
+           (!(mii->mii_media_status & IFM_ACTIVE) ||
+           IFM_SUBTYPE(mii->mii_media_active) == IFM_NONE))
+               BGE_STS_CLRBIT(sc, BGE_STS_LINK);
+
+       if (!BGE_STS_BIT(sc, BGE_STS_LINK))
+               return;
+
        /* Set the port mode (MII/GMII) to match the link speed. */
        mac_mode = CSR_READ_4(sc, BGE_MAC_MODE) &
            ~(BGE_MACMODE_PORTMODE | BGE_MACMODE_HALF_DUPLEX);
@@ -2371,9 +2383,13 @@ bge_blockinit(struct bge_softc *sc)
        if (sc->bge_flags & BGE_PHY_FIBER_TBI) {
                CSR_WRITE_4(sc, BGE_MI_STS, BGE_MISTS_LINK);
        } else {
-               BGE_STS_SETBIT(sc, BGE_STS_AUTOPOLL);
-               BGE_SETBIT(sc, BGE_MI_MODE, BGE_MIMODE_AUTOPOLL|10<<16);
-               if (BGE_ASICREV(sc->bge_chipid) == BGE_ASICREV_BCM5700)
+               if (BGE_IS_5700_FAMILY(sc) ||
+                   BGE_ASICREV(sc->bge_chipid) == BGE_ASICREV_BCM5705) {
+                       BGE_STS_SETBIT(sc, BGE_STS_AUTOPOLL);
+                       BGE_SETBIT(sc, BGE_MI_MODE, BGE_MIMODE_AUTOPOLL);
+               }
+               if (BGE_ASICREV(sc->bge_chipid) == BGE_ASICREV_BCM5700 &&
+                   BGE_CHIPREV(sc->bge_chipid) != BGE_CHIPID_BCM5700_B2)
                        CSR_WRITE_4(sc, BGE_MAC_EVT_ENB,
                            BGE_EVTENB_MI_INTERRUPT);
        }
@@ -4491,11 +4507,6 @@ bge_link_upd(struct bge_softc *sc)
                        if_link_state_change(ifp);
                        ifp->if_baudrate = 0;
                }
-       /*
- * Discard link events for MII/GMII cards if MI auto-polling disabled.
-        * This should not happen since mii callouts are locked now, but
-        * we keep this check for debug.
-        */
        } else if (BGE_STS_BIT(sc, BGE_STS_AUTOPOLL)) {
                /*
                 * Some broken BCM chips have BGE_STATFLAG_LINKSTATE_CHANGED bit
@@ -4517,6 +4528,13 @@ bge_link_upd(struct bge_softc *sc)
                            IFM_SUBTYPE(mii->mii_media_active) == IFM_NONE))
                                BGE_STS_CLRBIT(sc, BGE_STS_LINK);
                }
+       } else {
+               /*
+                * For controllers that call mii_tick, we have to poll
+                * link status.
+                */
+               mii_pollstat(mii);
+               bge_miibus_statchg(&sc->bge_dev);
        }

        /* Clear the attention */

Reply via email to