On 28.09.2010, at 10:54, Jurgen Weber <[email protected]> wrote:
> Hello List
>
> We have been having issues with some firewall machines of ours using pfSense.
>
> FreeBSD smash01.ish.com.au 7.2-RELEASE-p5 FreeBSD 7.2-RELEASE-p5 #0: Sun Dec
> 6 23:20:31 EST 2009
> sullr...@freebsd_7.2_pfsense_1.2.3_snaps.pfsense.org:/usr/obj.pfSense/usr/pfSensesrc/src/sys/pfSense_SMP.7
> i386
>
> MotherBoard:
> http://www.supermicro.com/products/motherboard/Xeon3000/3200/X7SBi-LN4.cfm
>
> Originally the systems started out by showing a lot of packet loss, the
> system time would fall behind, and the value of "#vmstat -i | grep timer" was
> dropping below 2000. I was led to believe by the guys at pfSense that this
> is where the value should sit. I would also receive errors in messages that
> looked like "kernel: calcru: runtime went backwards from 244314 usec to
> 236341".
>
> We tried a variety of things: disabling USB, turning off Intel SpeedStep
> in the BIOS, disabling ACPI, etc. All had little to no effect. The
> only thing that would right it was restarting the box, but over time it would
> degrade again. I talked to SuperMicro and they said that this is a
> FreeBSD issue and pretty much washed their hands of it.
>
> After a couple of months of dealing with this and just rebooting the systems
> regularly, the symptoms slowly but surely disappeared, e.g. the kernel messages
> went away, the system time was not falling behind and I was experiencing no
> packet loss, but the "#vmstat -i | grep timer" value would continue to
> decrease over time. Eventually, I think, when it finally got to 0 the machine
> restarted (I am only guessing here).
>
> After this restart it worked again for a couple of hours and then it
> restarted again.
>
> After the second time the system has not missed a beat, it has been fine and
> the "#vmstat -i | grep timer" value remained near the 2000 mark... We set up
> some zabbix monitoring to watch it. As mentioned, it was fine for about a
> month. Until today. Today the value has dropped to 0, but the system has not
> restarted, and over the last couple of hours the value has increased to 47.
>
> This machine is mission critical, we have two in a failover scenario (using
> pfSense's CARP features) and it seems unfortunate that we have an issue with
> two brand new SuperMicro boxes that affects both machines. While at the moment
> everything seems fine, I want to ensure that I have no further issues. Does
> anyone have any suggestions?
>
> Lastly, I have double-checked both of the below:
> http://www.freebsd.org/doc/en_US.ISO8859-1/books/faq/troubleshoot.html#CALCRU-NEGATIVE-RUNTIME
> We disabled EIST.
>
> http://www.freebsd.org/doc/en_US.ISO8859-1/books/faq/troubleshoot.html#COMPUTER-CLOCK-SKEW
>
> # dmesg | grep Timecounter
> Timecounter "i8254" frequency 1193182 Hz quality 0
> Timecounters tick every 1.000 msec
> # sysctl kern.timecounter.hardware
> kern.timecounter.hardware: i8254
>
> Only have one timer to choose from.
>
> Thanks
>
> Jurgen
>
> _______________________________________________
> [email protected] mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "[email protected]"

Hello,

vmstat -i calculates the interrupt rate as interrupt count / uptime, and the
interrupt count is a 32-bit integer. With high values of kern.hz it will
overflow in a few days (with kern.hz=4000 it will happen every 12 days or so).
If that is the case, use systat -vmstat 1 to get an accurate interrupt rate.

That is just FYI, because I was confused by this once and it scared me a bit,
and I started changing counters until I noticed it.

P.S. Please forgive my poor English.
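
To make the arithmetic concrete, here is a rough sketch in Python. It is only an illustration under stated assumptions, not what vmstat actually runs: it assumes an unsigned 32-bit counter, a timer that fires exactly kern.hz times per second, and a rate reported as the integer quotient count / uptime. The function names are made up for this example.

```python
# Sketch only: models a wrapping 32-bit interrupt counter.
# Assumptions (not taken from the vmstat source): the counter is unsigned
# 32-bit, the timer fires exactly `hz` times per second, and the reported
# rate is the integer quotient count / uptime.

COUNTER_BITS = 32
SECONDS_PER_DAY = 86400


def seconds_until_wrap(hz, bits=COUNTER_BITS):
    """Seconds of uptime before a counter incremented hz times/s wraps."""
    return 2 ** bits / hz


def reported_rate(hz, uptime_seconds, bits=COUNTER_BITS):
    """Rate that count / uptime would show once the counter has wrapped."""
    true_count = hz * uptime_seconds
    wrapped_count = true_count % (2 ** bits)  # what a 32-bit counter holds
    return wrapped_count // uptime_seconds


if __name__ == "__main__":
    for hz in (1000, 2000, 4000):
        days = seconds_until_wrap(hz) / SECONDS_PER_DAY
        print(f"kern.hz={hz}: counter wraps after about {days:.1f} days")

    # Shortly after a wrap the computed rate is near zero, although the
    # timer is still firing at the full rate.
    uptime = 25 * SECONDS_PER_DAY
    print("reported rate at 25 days, hz=2000:", reported_rate(2000, uptime))
```

Under these assumptions a timer at 4000 Hz wraps after roughly 12.4 days, and one at 2000 Hz after roughly 24.9 days, which would be consistent with a value that sat near 2000 for about a month and then dropped to 0 and started creeping up again.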
