Hi,

Reviving this thread to let you know that the patches did not solve the issue.
I still get network disconnects, although this time it took 10 days instead of
the usual 7.

I currently use a workaround that brings the network back up as soon as
possible. It doesn’t fix the bug, but since the bug is hellish to debug (it only
happens once a week or so), at least others who are affected can mitigate the
consequences:

Run this every minute from the root crontab:

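#!/usr/local/bin/bash
# If there is no network during three consecutive runs, restart the interface
LOG=/tmp/restart-gem0.log
ROUTER=192.168.1.1

# Send a single ping to the router; -w 5 bounds how long ping waits
if ! ping -c 1 -q -w 5 "$ROUTER" > /dev/null; then
        # Each failed run appends one line, so the line count is the
        # number of consecutive failures
        echo "nonet" >> "$LOG"
        if [[ $(wc -l < "$LOG") -ge 3 ]]; then
                ifconfig gem0 down
                sleep 2
                ifconfig gem0 up
                rm -f "$LOG"
        fi
else
        # The router answered: reset the failure counter
        rm -f "$LOG"
fi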
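
For reference, a minimal sketch of the matching crontab entry, added with
"crontab -e" as root. The path /root/restart-gem0.sh is only a placeholder I
picked for illustration; use wherever you saved the script, and make sure it
is executable:

```
# Hypothetical path; adjust to where the script actually lives.
# The five asterisks mean "every minute of every hour of every day".
* * * * * /root/restart-gem0.sh
```

With three failed runs required before the restart, the interface should come
back roughly three minutes after the link dies.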

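If you want to check whether your own box negotiates flow control (Stuart's
question in the quoted text below), here is a rough sketch. The has_pause
function is a hypothetical helper name of mine, not an existing tool; it only
looks for rxpause or txpause on the "media:" line of ifconfig output:

```shell
# Hypothetical helper: reads `ifconfig <interface>` output on stdin and
# returns success if the media line lists rxpause or txpause.
has_pause() {
        grep -Eq 'media:.*(rxpause|txpause)'
}

# Usage, with gem0 being the interface discussed in this thread:
#   ifconfig gem0 | has_pause && echo "flow control negotiated"
```
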
> On 08 Sep 2015, at 18:04, Carlos Fenollosa <[email protected]> wrote:
> 
> Thanks Mark and Stuart. I’ll try Mark’s patch first since it’s shorter, then 
> Stuart’s if it’s not enough.
> 
> I’m not keeping my hopes high since Landry mentions it doesn’t affect his 
> box, but we’ll see.
> 
> I will report back in a few days with the results.
> 
> Thanks,
> Carlos
> 
> 
>> On 08 Sep 2015, at 17:44, Stuart Henderson <[email protected]> wrote:
>> 
>> On 2015/09/08 17:28, Carlos Fenollosa wrote:
>>> 
>>>> On 07 Sep 2015, at 20:40, Stuart Henderson <[email protected]> wrote:
>>>> 
>>>> On 2015/09/07 20:26, Landry Breuil wrote:
>>>>> I can't help you with the issue itself, but I can confirm that I've
>>>>> been seeing the exact same issue with gem0 on my G4 Mac mini here, for
>>>>> several releases now. Randomly, gem0 just stops receiving/sending packets
>>>>> and needs to be downed/upped.
>>>> 
>>>> Interesting - I don't see that on mine.
>>>> 
>>>> Out of interest does your switch have flow control enabled? (you will
>>>> see rxpause and/or txpause in the ifconfig output). If it does, is there
>>>> any change if you disable it on the switch (if you can do so)?
>>>> 
>>>> gem0: flags=8863<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST> mtu 1500
>>>>      lladdr 00:0d:93:63:da:5a
>>>>      priority: 0
>>>>      groups: egress
>>>>      media: Ethernet autoselect (100baseTX full-duplex)
>>>>      status: active
>>> 
>>> Yes, it seems to be the case:
>>> 
>>> gem0: flags=8863<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST> mtu 1500
>>>       lladdr 00:11:24:87:a7:64
>>>       priority: 0
>>>       groups: egress
>>>       media: Ethernet autoselect (100baseTX full-duplex,rxpause,txpause)
>>>       status: active
>>>       inet 192.168.1.199 netmask 0xffffff00 broadcast 192.168.1.255
>>> 
>>> 
>>> I have a crappy telco router, and I’m actually not sure if I can disable it
>>> there. There is a section on QoS, but the option is disabled.
>>> Could the driver be forced to disable flow control? At least I could try
>>> running it for a couple of weeks to see if the bug is triggered again.
>>> 
>>> 
>>> Thanks a lot,
>>> Carlos
>>> 
>> 
>> Flow control was a complete guess btw and might be unconnected.
>> This diff ought to disable it but my mac is 1500km away at the moment
>> so untested!
>> 
>> Landry, does yours show rxpause/txpause on this line?
>> 
>> 
>> Index: gem.c
>> ===================================================================
>> RCS file: /cvs/src/sys/dev/ic/gem.c,v
>> retrieving revision 1.112
>> diff -u -p -r1.112 gem.c
>> --- gem.c    24 Jun 2015 09:40:54 -0000      1.112
>> +++ gem.c    8 Sep 2015 15:43:51 -0000
>> @@ -240,7 +240,7 @@ gem_config(struct gem_softc *sc)
>> 
>>      gem_mifinit(sc);
>> 
>> -    mii_flags = MIIF_DOPAUSE;
>> +    mii_flags = 0;
>> 
>>      /* 
>>       * Look for an external PHY.
>> @@ -905,7 +905,7 @@ gem_init_regs(struct gem_softc *sc)
>>      bus_space_write_4(t, h, GEM_MAC_RX_CODE_VIOL, 0);
>> 
>>      /* Set XOFF PAUSE time */
>> -    bus_space_write_4(t, h, GEM_MAC_SEND_PAUSE_CMD, 0x1bf0);
>> +    bus_space_write_4(t, h, GEM_MAC_SEND_PAUSE_CMD, 0);
>> 
>>      /*
>>       * Set the internal arbitration to "infinite" bursts of the
>> @@ -1357,17 +1357,6 @@ gem_mii_statchg(struct device *dev)
>>              v &= ~GEM_MAC_XIF_GMII_MODE;
>>      }
>>      bus_space_write_4(t, mac, GEM_MAC_XIF_CONFIG, v);
>> -
>> -    /*
>> -     * 802.3x flow control
>> -     */
>> -    v = bus_space_read_4(t, mac, GEM_MAC_CONTROL_CONFIG);
>> -    v &= ~(GEM_MAC_CC_RX_PAUSE | GEM_MAC_CC_TX_PAUSE);
>> -    if ((IFM_OPTIONS(sc->sc_mii.mii_media_active) & IFM_ETH_RXPAUSE) != 0)
>> -            v |= GEM_MAC_CC_RX_PAUSE;
>> -    if ((IFM_OPTIONS(sc->sc_mii.mii_media_active) & IFM_ETH_TXPAUSE) != 0)
>> -            v |= GEM_MAC_CC_TX_PAUSE;
>> -    bus_space_write_4(t, mac, GEM_MAC_CONTROL_CONFIG, v);
>> }
>> 
>> int
>> 
>> 
> 

