Apologies if the formatting on this email is a bit off - I'm stuck with our
webmail interface...

We were having problems with kernel crashes on our 3-4 year old pfsense
1.2.3 boxes and as part of the search for a solution we tried upgrading to
pfsense 2.0.1, which didn't seem to solve the problem and also brought
along a possibly unrelated new issue whereby our WAN interface would
appear to stop passing traffic but not in such a way that the CARP
interfaces would fail over to the secondary node.

Today we replaced the primary node with alternative hardware which we trust
to be reliable (it had been performing other, quite intensive, functions). We just
experienced the problem again and I managed to disable carp on the primary
node, which switched traffic to our secondary whilst I investigated what was
going wrong.

It turns out that the WAN interface (bge1) was transmitting traffic just fine,
but unable to receive responses. I could see arp requests going out, and
see the requests and replies on another machine using tcpdump.

I ran:
# netstat -I bge1

which showed Ierrs increasing while Ipkts remained near-static (Ipkts went
up by only 5 in about an hour).
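
In case anyone wants to watch for the same pattern, here is a rough sketch of
how the two counters could be compared between snapshots. The function names
(counters, delta) are made up for illustration. It assumes the netstat output
has a header row naming the Ipkts and Ierrs columns and that every column is
populated on the <Link#N> row, so treat it as a starting point rather than
something tested on these boxes.

```sh
#!/bin/sh
# Hypothetical helpers, not part of pfSense or FreeBSD.

# Read `netstat -n -I <iface>` output on stdin and print "Ipkts Ierrs"
# for the link-level row. Column positions are looked up from the header
# row, which assumes no column on the link row is blank.
counters() {
    awk '
        NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
        $3 ~ /^<Link/ { print $col["Ipkts"], $col["Ierrs"]; exit }
    '
}

# Print how far each counter moved between two "Ipkts Ierrs" snapshots.
# $1 is the earlier snapshot, $2 the later one.
delta() {
    set -- $1 $2
    echo "Ipkts +$(( $3 - $1 )) Ierrs +$(( $4 - $2 ))"
}
```

To use it, take one snapshot with `before=$(netstat -n -I bge1 | counters)`,
another some time later into `after`, and run `delta "$before" "$after"`.
Ierrs climbing while Ipkts barely moves would match what I saw here.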

There was nothing in dmesg apart from the carp failovers.

More interestingly, bouncing the interface with:

# ifconfig bge1 down
# ifconfig bge1 up

brought the interface back to life - things were back to normal.

Is this something which appears familiar? We didn't seem to experience
the interface misbehaving like this prior to upgrading to 2.0.1.

I've got the output of all the various netstat invocations, logs and tcpdump
info from both nodes. Let me know if any of it will be helpful. We would
have been pushing ~200Mbps out of the interface with ~50Mbps inbound.

At least we haven't had a kernel crash on the replacement box yet, which is
something to be glad of.

-- 
Russell Howe
[email protected]
