On Fri, 30 Dec 2011, Maxim Sobolev wrote:

On 12/30/2011 4:46 PM, Maxim Sobolev wrote:
I see. Would you guys mind if I put that NULL pointer check into the code for the time being and turn it into some kind of big nasty warning in the 8-stable branch only?

I could also open a ticket and put all the debug information collected to date in there, and encourage people to report to it once they see this warning on their system. That would provide more information about the exposure. It definitely looks like a locking issue somewhere, not just bad luck or flaky hardware, as we see it happening consistently on the top 4 most UDP-loaded systems here, and it correlates well with the load. With my small NULL catch the machines have been running happily for a month now, so there are no visible side effects.

Please do file the PR so that all the information is in one place -- this is a network stack hacking week for me, so I should be able to take a closer look.

Could you characterise the traffic load on these boxes a bit more? Also, is there regular monitoring using netstat/bsnmp/etc going on? I'd like to try and identify ways in which this workload differs from other common high-UDP workloads being used on 8.x that aren't seeing this problem...

Robert
_______________________________________________
[email protected] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "[email protected]"
