On Fri, 30 Dec 2011, Maxim Sobolev wrote:
On 12/30/2011 4:46 PM, Maxim Sobolev wrote:
I see. Would you guys mind if I put that NULL pointer check into the code
for the time being and turn it into some kind of big nasty warning in
the 8-stable branch only?
I could also open a ticket and put all the debug information collected to date
in there, and encourage people to report to it once they see this warning on
their system. That would provide more information about the exposure. It
definitely looks like a locking issue somewhere, not just bad luck or flaky
hardware, as we see it happening consistently on the 4 most UDP-loaded
systems here, and it correlates well with the load. With my small NULL catch
the machines have been running happily for a month now, so there are no
visible side effects.
Please do file the PR so that all the information is in one place -- this is a
network stack hacking week for me, so I should be able to take a closer look.
Could you characterise the traffic load on these boxes a bit more? Also, is
there regular monitoring using netstat/bsnmp/etc going on? I'd like to try
and identify ways in which this workload differs from other common high-UDP
workloads being used on 8.x that aren't seeing this problem...
Robert
_______________________________________________
[email protected] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net