>Synopsis: Weekly network disconnect with G4 Mac Mini (gem0)
>Category: powerpc
>Environment:
System : OpenBSD 5.7
Details : OpenBSD 5.7-stable (GENERIC) #2: Wed Aug 12 23:45:47 CEST 2015
root@mini:/usr/src/sys/arch/macppc/compile/GENERIC
Architecture: OpenBSD.macppc
Machine : macppc
>Description:
Hello,
I'm seeing a very strange bug on a headless G4 Mac Mini that uses the gem0
network driver. The network connection drops by itself and the machine loses
all internet connectivity. It doesn't respond to ping or ssh even from inside
the local network. The other machines on my network seem unaffected, so the
router does not appear to be the problem.
>How-To-Repeat:
I've narrowed it down to the following conditions:
- It usually happens after about a week of regular usage. My G4 has a fairly
consistent usage pattern, so it makes sense that the bug also appears in a
regular pattern.
Here are some sample dates where the bug was triggered:
- Restart on 12/Aug 04:15, happens again on 19/Aug 15:15
- Restart on 22/Aug 23:10, happens again on 31/Aug 12:46
- Restart on 31/Aug 15:10, happens again on 5/Sep 16:11
- It once happened after just a couple of hours of heavy downloading
(BitTorrent, so the trigger could be either the number of connections or the
absolute amount of tx/rx data)
- It can be fixed with "ifconfig gem0 down && ifconfig gem0 up", but not by
unplugging and replugging the cable. A system restart also solves the issue.
There are no error logs. The closest I can get to an error log is that
afpd times out, and I used this timestamp to establish the exact time of the
issue.
I also run an internet-dependent cron job which starts to fail consistently
from the time of the afpd error message, so I'm confident that the bug trigger
time is correct.
Here is what I can see on /var/log/messages for the time when the bug is
triggered:
Aug 22 23:09:57 mini afpd[8461]: afp_alarm: child timed out, entering disconnected state
Aug 22 23:09:57 mini afpd[8461]: dsi_disconnect: entering disconnected state
Aug 22 23:09:57 mini afpd[8461]: dsi_disconnect: entering disconnected state
Another one:
Aug 31 12:46:19 mini afpd[24528]: afp_alarm: child timed out, entering disconnected state
Aug 31 12:46:19 mini afpd[24528]: dsi_disconnect: entering disconnected state
Aug 31 12:46:19 mini afpd[24528]: dsi_wrtreply: Bad file descriptor
Aug 31 12:46:19 mini afpd[24528]: dsi_disconnect: entering disconnected state
This one is from yesterday:
Sep 5 16:10:50 mini ntpd[6258]: 2 out of 4 peers valid
Sep 5 16:10:50 mini ntpd[6258]: bad peer from pool pool.ntp.org (46.17.142.10)
Sep 5 16:10:50 mini ntpd[6258]: bad peer from pool pool.ntp.org (194.140.131.21)
I then grepped /var/log for timestamps close to those times, but
there are no other error messages.
The machine is running headless so I can't see if there are any error messages
on screen.
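Since nothing gets logged on its own, one thing I could do is record the
interface state periodically, so that the last entries before a hang can be
compared with earlier ones. The following is only a sketch: the log path and
the function name are placeholders I made up, and I'm assuming the usual
OpenBSD netstat(1) and vmstat(8) flags.

```sh
#!/bin/sh
# Sketch: append a timestamped snapshot of interface state to a log.
# Assumption: SNAPLOG is a placeholder path, not an existing file.
IFACE="${IFACE:-gem0}"
SNAPLOG="${SNAPLOG:-/var/log/gem0-snapshots.log}"   # hypothetical path

snapshot() {
    {
        echo "=== $(date '+%Y-%m-%d %H:%M:%S') ==="
        ifconfig "$IFACE"            # flags, media and link status
        netstat -n -I "$IFACE"       # packet and error counters
        netstat -n -b -I "$IFACE"    # byte counters
        netstat -m                   # mbuf usage
        vmstat -i                    # interrupt counts per device
    } >> "$SNAPLOG" 2>&1
}
```

Calling snapshot from cron every few minutes should leave a record of the
counters right up to the point where the network stops responding.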
>Fix:
ifconfig gem0 down && ifconfig gem0 up
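Until the cause is found, this workaround could be automated with a small
watchdog. This is a minimal sketch, not something I'm running yet: the gateway
address 192.168.1.1, the "gemwatch" log tag and the function names are
assumptions for illustration only.

```sh
#!/bin/sh
# Sketch of a watchdog around the manual workaround.
# Assumptions: GATEWAY is a placeholder address, "gemwatch" is a made-up tag.
IFACE="${IFACE:-gem0}"
GATEWAY="${GATEWAY:-192.168.1.1}"    # hypothetical LAN gateway

# Succeeds if the gateway answers at least one of three pings.
link_ok() {
    ping -c 3 "$GATEWAY" >/dev/null 2>&1
}

# Bounce the interface only when the gateway is unreachable.
check_and_bounce() {
    if link_ok; then
        echo "ok"
        return 0
    fi
    logger -t gemwatch "$GATEWAY unreachable, bouncing $IFACE"
    if ifconfig "$IFACE" down && ifconfig "$IFACE" up; then
        echo "bounced"
    else
        echo "bounce failed"
        return 1
    fi
}
```

With a final check_and_bounce line added, the script could be run from root's
crontab every few minutes. The logger entry would also give a second timestamp
for each occurrence, independent of afpd.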
As to a permanent fix, here are some hypotheses:
- It is clearly a network issue, since it's solved by an ifconfig down+up
- It is probably something driver-related, since I googled and looked at the
mailing lists, and there is nobody experiencing the same issue. I guess there
are few people using OpenBSD on a G4 with the gem0 driver, so this may be an
untested corner case of the driver. If it were a system-wide issue, somebody
else would probably have noticed it.
- This may be a data overflow. It could be in a counter of absolute tx/rx
data, or in the number of connections. The weird weekly periodicity probably
has something to do with it. Or maybe connections aren't properly cleaned up
and eventually fill up some buffer? This is my best guess.
- It does not seem to affect the kernel/other processes since there are no
dmesg messages and the system doesn't require a restart.
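As a rough plausibility check of the overflow guess, the arithmetic below
shows how fast a counter would have to grow to wrap in a given time. It assumes
a 32-bit counter, which is purely my assumption; I don't know whether the
driver has one. The function names are made up for this sketch.

```sh
#!/bin/sh
# Sketch: wrap-around arithmetic for a hypothetical 32-bit counter.
# awk does the math in floating point, so this does not depend on
# 64-bit shell arithmetic.

# wrap_rate SECONDS: average units per second needed to reach 2^32
wrap_rate() {
    awk -v secs="$1" 'BEGIN { printf "%.0f\n", 2^32 / secs }'
}

# wrap_seconds RATE: seconds until the counter wraps at RATE units/s
wrap_seconds() {
    awk -v rate="$1" 'BEGIN { printf "%.0f\n", 2^32 / rate }'
}

wrap_rate $((7 * 24 * 3600))    # one week
wrap_seconds 12500000           # 100 Mbit/s expressed in bytes/s
```

This prints 7101 and 344. So a wrap once a week would need an average of
about 7 KB/s for a byte counter (or about 7100 packets/s for a packet counter),
while at full 100baseTX line rate a byte counter would wrap in under six
minutes. That doesn't prove or rule out anything, but it gives numbers to
compare against the actual counters.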
Can anybody give me more pointers to further narrow down the issue?
Thanks a lot,
Carlos
dmesg:
OpenBSD 5.7-stable (GENERIC) #2: Wed Aug 12 23:45:47 CEST 2015
root@mini:/usr/src/sys/arch/macppc/compile/GENERIC
real mem = 1073741824 (1024MB)
avail mem = 1030991872 (983MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root: model PowerMac10,1
cpu0 at mainbus0: 7447A (Revision 0x102): 1416 MHz: 512KB L2 cache
mem0 at mainbus0
spdmem0 at mem0: 1GB DDR SDRAM non-parity PC3200CL3.0
memc0 at mainbus0: uni-n rev 0xd2
"hw-clock" at memc0 not configured
kiic0 at memc0 offset 0xf8001000
iic0 at kiic0
mpcpcibr0 at mainbus0 pci: uni-north
pci0 at mpcpcibr0 bus 0
pchb0 at pci0 dev 11 function 0 "Apple UniNorth AGP" rev 0x00
radeondrm0 at pci0 dev 16 function 0 "ATI Radeon 9200" rev 0x01
drm0 at radeondrm0
radeondrm0: irq 48
mpcpcibr1 at mainbus0 pci: uni-north
pci1 at mpcpcibr1 bus 0
macobio0 at pci1 dev 23 function 0 "Apple Intrepid" rev 0x00
openpic0 at macobio0 offset 0x40000: version 0x4614 feature 3f0302 LE
macgpio0 at macobio0 offset 0x50
"modem-reset" at macgpio0 offset 0x1d not configured
"modem-power" at macgpio0 offset 0x1c not configured
macgpio1 at macgpio0 offset 0x9: irq 47
"programmer-switch" at macgpio0 offset 0x11 not configured
"gpio5" at macgpio0 offset 0x6f not configured
"gpio6" at macgpio0 offset 0x70 not configured
"extint-gpio15" at macgpio0 offset 0x67 not configured
"escc-legacy" at macobio0 offset 0x12000 not configured
zsc0 at macobio0 offset 0x13000: irq 22,23
zstty0 at zsc0 channel 0
zstty1 at zsc0 channel 1
aoa0 at macobio0 offset 0x10000: irq 30,1,2
audio0 at aoa0
"timer" at macobio0 offset 0x15000 not configured
adb0 at macobio0 offset 0x16000
apm0 at adb0: battery flags 0x0, 0% charged
piic0 at adb0
iic1 at piic0
maxtmp0 at iic1 addr 0xc8: max6642
kiic1 at macobio0 offset 0x18000
iic2 at kiic1
wdc0 at macobio0 offset 0x20000 irq 24: DMA
ohci0 at pci1 dev 26 function 0 "Apple Intrepid USB" rev 0x00: irq 29, version 1.0, legacy support
ohci1 at pci1 dev 27 function 0 "NEC USB" rev 0x43: irq 63, version 1.0
ohci2 at pci1 dev 27 function 1 "NEC USB" rev 0x43: irq 63, version 1.0
ehci0 at pci1 dev 27 function 2 "NEC USB" rev 0x04: irq 63
usb0 at ehci0: USB revision 2.0
uhub0 at usb0 "NEC EHCI root hub" rev 2.00/1.00 addr 1
usb1 at ohci0: USB revision 1.0
uhub1 at usb1 "Apple OHCI root hub" rev 1.00/1.00 addr 1
usb2 at ohci1: USB revision 1.0
uhub2 at usb2 "NEC OHCI root hub" rev 1.00/1.00 addr 1
usb3 at ohci2: USB revision 1.0
uhub3 at usb3 "NEC OHCI root hub" rev 1.00/1.00 addr 1
mpcpcibr2 at mainbus0 pci: uni-north
pci2 at mpcpcibr2 bus 0
kauaiata0 at pci2 dev 13 function 0 "Apple Intrepid ATA" rev 0x00
wdc1 at kauaiata0 irq 39: DMA
wd0 at wdc1 channel 0 drive 0: <ST9808210A>
wd0: 16-sector PIO, LBA48, 76319MB, 156301488 sectors
atapiscsi0 at wdc1 channel 0 drive 1
scsibus1 at atapiscsi0: 2 targets
cd0 at scsibus1 targ 0 lun 0: <MATSHITA, CD-RW CW-8124, DACD> ATAPI 5/cdrom
removable
wd0(wdc1:0:0): using PIO mode 4, DMA mode 2, Ultra-DMA mode 2
cd0(wdc1:0:1): using PIO mode 4, DMA mode 2, Ultra-DMA mode 2
"Apple UniNorth Firewire" rev 0x81 at pci2 dev 14 function 0 not configured
gem0 at pci2 dev 15 function 0 "Apple Uni-N2 GMAC" rev 0x80: irq 41, address 00:11:24:87:a7:64
bmtphy0 at gem0 phy 0: BCM5221 100baseTX PHY, rev. 4
umass0 at uhub0 port 2 configuration 1 interface 0 "Seagate Expansion Desk" rev 2.10/1.00 addr 2
umass0: using SCSI over Bulk-Only
scsibus2 at umass0: 2 targets, initiator 0
sd0 at scsibus2 targ 1 lun 0: <Seagate, Expansion Desk, 0604> SCSI4 0/direct
fixed
sd0: 1907729MB, 4096 bytes/sector, 488378645 sectors
vscsi0 at root
scsibus3 at vscsi0: 256 targets
softraid0 at root
scsibus4 at softraid0: 256 targets
bootpath: /pci@f4000000/ata-6@d/disk@0:/bsd
root on wd0a (54b6720d276cdb19.a) swap on wd0b dump on wd0b
error: [drm:pid0:radeon_get_bios] *ERROR* Unable to locate a BIOS ROM
trying to bind memory to uninitialized GART !
error: [drm:pid0:radeon_ttm_backend_bind] *ERROR* failed to bind 1 pages at 0x00000000
drm:pid0:radeon_wb_init *WARNING* (-22) create WB bo failed
drm:pid0:r100_init *ERROR* Disabling GPU acceleration
ttm_pool_mm_shrink_fini stub
error: [drm:pid0:radeon_get_bios] *ERROR* Unable to locate a BIOS ROM
error: [drm:pid0:r100_cp_init_microcode] *ERROR* radeon_cp: Failed to load firmware "radeon-r200_cp"
error: [drm:pid0:r100_cp_init] *ERROR* Failed to load firmware!
drm:pid0:r100_startup *ERROR* failed initializing CP (-2).
drm:pid0:r100_init *ERROR* Disabling GPU acceleration
radeondrm0: 1024x768
wsdisplay0 at radeondrm0 mux 1: console (std, vt100 emulation)
wsdisplay0: screen 1-5 added (std, vt100 emulation)
usbdevs:
Controller /dev/usb0:
addr 1: high speed, self powered, config 1, EHCI root hub(0x0000), NEC(0x1033), rev 1.00
port 1 powered
port 2 addr 2: high speed, self powered, config 1, Expansion Desk(0x3321), Seagate(0x0bc2), rev 1.00, iSerialNumber NA4NBLPJ
port 3 powered
port 4 powered
port 5 powered
Controller /dev/usb1:
addr 1: full speed, self powered, config 1, OHCI root hub(0x0000), Apple(0x106b), rev 1.00
port 1 powered
port 2 powered
Controller /dev/usb2:
addr 1: full speed, self powered, config 1, OHCI root hub(0x0000), NEC(0x1033), rev 1.00
port 1 powered
port 2 powered
port 3 powered
Controller /dev/usb3:
addr 1: full speed, self powered, config 1, OHCI root hub(0x0000), NEC(0x1033), rev 1.00
port 1 powered
port 2 powered