So, I've spent all weekend looking into this, and I'm still no closer to solving it.
I've tried replacing the NICs, swapping the switches, removing the switches, isolating the machines, replacing the wiring, and logging the iptables traffic.

That last one was quite interesting, actually. I added LOG rules for ICMP traffic to the nat table's PREROUTING and POSTROUTING chains, and to the filter table's INPUT, FORWARD, and OUTPUT chains. While an outage is occurring, pings from my workstation to the internal NIC show up in the log, but pings from my workstation to the external NIC don't show up at all. The packets don't even seem to be reaching the PREROUTING chain. Once the outage finishes, they start appearing as normal.

The server is able to ping both its interfaces at all times, as you'd expect. A laptop on the external network is able to ping the external NIC (but obviously not the internal one). I get no reported dropped packets anywhere.

I did notice some rx_crc_errors on the internal NIC (using ethtool), which is why I tried replacing the wiring, but the counter didn't increment at all during outages, so I'm going to hazard that they're another issue entirely. ~1000 errors out of ~100000000 good receives (about 0.001%) doesn't suggest anything major :)

NIC-wise, I tried swapping to both being r8169, and both being e1000. Identical results regardless of the hardware involved.

My next port of call, I guess, has to be trying older kernels and seeing if I get the same symptoms.
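For anyone wanting to try the same kind of logging, here is a minimal sketch of the ICMP LOG rules described above. It only builds and prints the iptables commands rather than running them. The log prefixes, the function name icmp_log_rules, and the choice of -I (insert at the top of each chain) are illustrative assumptions, not necessarily what was actually used.

```python
import shlex

# (table, chain) pairs named in the post.
CHAINS = [
    ("nat", "PREROUTING"),
    ("nat", "POSTROUTING"),
    ("filter", "INPUT"),
    ("filter", "FORWARD"),
    ("filter", "OUTPUT"),
]


def icmp_log_rules(chains=CHAINS):
    """Return one iptables argv list per chain that logs ICMP packets.

    Hypothetical helper: the prefix format is an assumption, kept short
    because the LOG target limits --log-prefix to 29 characters.
    """
    rules = []
    for table, chain in chains:
        prefix = f"ICMP {table}/{chain}: "
        rules.append([
            "iptables", "-t", table,
            "-I", chain,            # assumption: insert ahead of existing rules
            "-p", "icmp",
            "-j", "LOG",
            "--log-prefix", prefix,
        ])
    return rules


if __name__ == "__main__":
    # Print shell-quoted commands for review; run them as root if they look right.
    for rule in icmp_log_rules():
        print(shlex.join(rule))
```

With rules like these in place, a ping that produces no log line with any of the prefixes never reached netfilter on that machine, which is the symptom seen on the external NIC during an outage.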