Not sure if it will help, but it might be useful to show the output of 'systat mb' and 'sysctl kern.netlivelocks'. You mention updating packages; I've definitely had systems that were pretty much flattened by netlivelocks/mitigation while doing this. Perhaps some em(4) chips don't react very well to it...?
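
If it is easier than catching the hang by hand, something along these lines could log both while you reproduce it. This is only a rough sketch and untested on your hardware: the log path, the interval and the helper names are made up for illustration, and I am assuming systat's -b batch flag is available on 5.7.

```shell
#!/bin/sh
# Rough sketch: log the livelock counter and mbuf usage while reproducing
# the em0 hang. LOG and INTERVAL are arbitrary choices, not from the report.
LOG=${LOG:-/tmp/em0-diag.log}
INTERVAL=${INTERVAL:-5}

# Strip the "name=" prefix from a sysctl line such as "kern.netlivelocks=3".
sysctl_value() {
	echo "${1#*=}"
}

# Append one timestamped sample to $LOG.
sample() {
	line=$(sysctl kern.netlivelocks 2>/dev/null)
	echo "$(date '+%F %T') netlivelocks=$(sysctl_value "$line")" >> "$LOG"
	# Assumption: -b makes systat print one dump of the view and exit.
	systat -b mbufs >> "$LOG" 2>&1
}

# Loop only when invoked with "run", so the functions can also be sourced.
if [ "${1:-}" = "run" ]; then
	while :; do
		sample
		sleep "$INTERVAL"
	done
fi
```

Run it as 'sh em0-diag.sh run' (hypothetical file name) in one terminal, then start the package update in another. A netlivelocks value that climbs just before em0 wedges would point at the mitigation code.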

On 3 May 2015 01:13:19 BST, Bryan Linton <[email protected]> wrote:
>On 2015-05-02 00:16:43, Christian Schulte <[email protected]> wrote:
>> >Synopsis:     After some time (minutes or seconds) the em0 interface stops working
>> >Category:     system
>> >Environment:
>>      System      : OpenBSD 5.7
>>      Details     : OpenBSD 5.7-stable (GENERIC.MP) #0: Fri May 1 23:59:46 CEST 2015
>>                    [email protected]:/usr/src/sys/arch/amd64/compile/GENERIC.MP
>>
>>      Architecture: OpenBSD.amd64
>>      Machine     : amd64
>> >Description:
>>      Following is the contents of the /etc/hostname.em0 file:
>>
>>      inet 192.168.10.50 255.255.255.0 192.168.10.255
>>
>>      The em0 interface works as expected. After some time it stops
>>      working. Processes currently transmitting data will show a
>>      (Broken pipe) error. Doing ifconfig em0 down && sh /etc/netstart
>>      the interface starts working again for some time and then hangs
>>      again.
>>
>> >How-To-Repeat:
>>      The issue is reproducible by simply using the em0 interface.
>> >Fix:
>>      ifconfig em0 down && sh /etc/netstart
>>
>> [...]
>>
>
>I wonder if this is the same thing that I mentioned here:
>    http://marc.info/?l=openbsd-misc&m=141007061612003&w=2
>Though I don't get a "(Broken pipe)" error, em0 just hangs and won't
>transmit packets.
>
>In a nutshell, CPU activity causes em0 to hang on both my T60 and X61t
>laptops.
>
>Provided I don't do CPU intensive activities, it can run fine for days,
>or potentially weeks at a time. However, running bogofilter (which I've
>stopped using since it wedged the network reliably every time I'd fetch
>my email), compiling a large port (smaller ones don't tend to wedge it,
>only larger ones that take at least several minutes to compile seem to)
>or just trying to do something like fetch and upgrade packages, or just
>going to a CPU intensive webpage like google maps will invariably cause
>the network to wedge.
>
>Neither machine will send or receive packets, including responding to
>ping requests when this happens.
>
>If I need to update packages or compile something, I'll usually just
>run a script like the following so I don't need to babysit the computer:
>    while true; do sleep 25 && ifconfig em0 down up && sleep 1 && ifconfig em0 down up && echo ping; done
>
>I was contacted privately by a developer and sent a few patches to the
>UVM and softraid systems which caused them to KERNEL_LOCK() and
>KERNEL_UNLOCK() in a few key places which has helped a lot, but the
>network will still lock up reliably with heavy CPU activity.
>
>All I can say is that I first noticed this when I upgraded from a
>mid-July 2014 snap to an early-September 2014 snap, which I know is a
>very large window. I've unfortunately been too busy to try to go back
>and figure out what change caused this, and have been getting by with
>the above script when necessary.
>
>I've seen a lot of work done on the networking code and in UVM over the
>last year or so, so I've been upgrading to newer snaps as they've been
>released hoping that they'd fix it, but it seems like the problem may
>lie somewhere else or be obscure or specific to my setup.
>
>I've just assumed that since no one else has reported this problem,
>that there was something unique about my system or setup that was
>causing this, which would tend to lower the severity of this bug since
>as far as I knew up until now, I was the only one affected by it.
>
>I realize this is a rather poor bug report, but hopefully by at least
>mentioning the fact that a few key KERNEL_LOCK()/KERNEL_UNLOCK() calls
>sprinkled around some UVM and softraid code reduce the occurrence of
>this bug, it at least gives someone a somewhat smaller target to look
>at.
>
>I can provide more information if necessary. A dmesg from the T60
>(with the above mentioned patches applied) follows:

--
Sent from a phone, please excuse the formatting.
