Daniel Ouellet wrote:
Bryan S. Leaman wrote:
Hi All,
I have a production firewall on a Sun V120 running OpenBSD 4.5 sparc64,
with 2 active interfaces. Two weeks ago, the gem1 interface suddenly hung
and I was able to revive it using "ifconfig gem1 down; ifconfig gem1 up".
I found the following m...@openbsd thread from March 2009:
http://www.mail-archive.com/[email protected]/msg73257.html
Did you try the mp kernel to see if that makes a difference for you?
Out of curiosity, what effect would this have on a single CPU box?
Also, don't forget that the fix here is not in 4.5, but past 4.5.
And is there anything in your logs, timeout messages maybe?
And 4.6 is really around the corner now. Might be best to run it and see.
I know the fix for gem is in 4.6, but does the same problem affect hme?
Since I'm having the problem with both drivers, I'm not sure if the 4.6
fix is related to the problem I'm seeing. Unlike your experience, I'm
not getting any error messages in any logs or on the console. The only
clue is the ierrs/oerrs and some error counts on the switch.
I was able to kill the interface several times by pushing data through
the firewall (into hme0 and out hme1) at around 70Mbps for 5-10
minutes. Same result--hme1 stopped responding but I could ping hosts on
the hme0 side. I'm fairly sure (it was a long night...) that one time I
did the ifconfig down/up on *hme0* and that revived hme1, which seemed odd.
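As a stopgap until the cause is found, the down/up workaround could be scripted. This is only a rough sketch: it assumes some host behind hme1 reliably answers pings (192.0.2.1 below is a placeholder), and the function names and the PING/IFCONFIG override variables are made up for illustration, not taken from any existing tool.

```sh
#!/bin/sh
# Sketch only: bounce an interface when a peer behind it stops answering.
# PING and IFCONFIG can be overridden (hypothetical knobs, handy for dry runs).

# bounce_iface IFACE: the manual "down; up" workaround as a function.
bounce_iface() {
    ${IFCONFIG:-ifconfig} "$1" down && ${IFCONFIG:-ifconfig} "$1" up
}

# check_and_bounce IFACE PEER: send 3 pings to PEER and bounce IFACE
# only if none of them is answered. Prints "ok" or "bounced".
check_and_bounce() {
    if ${PING:-ping} -c 3 "$2" >/dev/null 2>&1; then
        echo "ok"
    else
        bounce_iface "$1" && echo "bounced"
    fi
}

# Example (run as root, e.g. from cron once a minute):
#   check_and_bounce hme1 192.0.2.1
```

This only hides the symptom, and it would not catch the case where bouncing hme0 was what revived hme1.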
I ran "systat ifstat" during the failure, and it showed data flowing
inbound through the firewall into hme0 and out hme1, but nothing in the
other direction. So hme1 seems to be half working. Not sure if it
matters, but I'm using altq with hfsc.
IFACE   STATE  IPKTS  IBYTES  IERRS  OPKTS  OBYTES  OERRS  COLLS
hme0    up:U       2     599      0      0       0      0      0
hme1    up:U       0       0      0      2     599      0      0
Totals             2     599      0      2     599      0      0
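To spot the "half working" state without watching systat, the per-interval packet counts could be classified mechanically. A minimal sketch, assuming the two arguments are the IPKTS and OPKTS counts for one interface over a sampling interval, as in the ifstat output above; the function name and the labels are invented here:

```sh
# classify_iface IPKTS OPKTS: label one interface's traffic for an interval.
classify_iface() {
    if [ "$1" -eq 0 ] && [ "$2" -eq 0 ]; then
        echo "idle"
    elif [ "$1" -eq 0 ]; then
        echo "tx-only"      # sending, but nothing received
    elif [ "$2" -eq 0 ]; then
        echo "rx-only"      # receiving, but nothing sent
    else
        echo "ok"
    fi
}

classify_iface 2 0   # hme0 counts from the output above: prints rx-only
classify_iface 0 2   # hme1 counts from the output above: prints tx-only
```

A one-way result is only a hint, since a link that is legitimately quiet in one direction looks the same.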
Any other suggestions?
Bryan