On Wed, Jun 28, 2006, Robert Watson wrote:
>>>> 6.1-STABLE crashed on me. I'm providing a backtrace. Could any of you,
>>>> experienced people, suggest me if it's a hardware problem or is it an
>>>> error inside the OS?
>>> This is a known bug in the TCP code; a large set of outstanding changes
>>> is present in 7.x that will fix the problem when merged. However, I
>>> recently had push-back on merging the larger batch of changes, so am
>>> looking at merging a workaround that will also correct the problem
>>> without the larger set of architectural changes. I hope to have a chance
>>> to look at that in detail this weekend.
>> I'm glad to know that it isn't either unknown or hardware-related. Thank
>> you for your prompt reply!
> Per my earlier e-mail, I had hoped to merge a larger set of changes from
> HEAD that resolve the underlying problem here (that inpcb's can be detached
> from a socket while the socket is still in use), but right now I'm
> deferring merging those changes as they are somewhat risky (as they are
> large). Instead, I've produced a candidate work-around patch, now attached
> to kern/97095. This does not fix the underlying problem, but seeks to
> narrow the window for the race to be exercised by avoiding caching a
> volatile pointer across user memory copying, which under load can result in
> blocking I/O. I would be quite interested in knowing if this resolves the
> problem in practice -- if so, it's a definite short-term merge candidate to
> reduce the symptoms of this problem until the proper fix can be merged.
> http://www.watson.org/~robert/freebsd/netperf/20060628-ip_ctloutput.diff

Thank you for the patch. I'll let you know in a few days if the crash occurs
again. It's quite reproducible (it crashed yesterday in the same code path).
