On Wed, 6 Sep 2006, Gleb Smirnoff wrote:

> glebius     2006-09-06 13:56:35 UTC
>
>  FreeBSD src repository
>
>  Modified files:
>    sys/netinet          in_pcb.c tcp_subr.c tcp_timer.c tcp_var.h
>  Log:
>  o Backout rev. 1.125 of in_pcb.c. It appeared to behave extremely
>    bad under high load. For example with 40k sockets and 25k tcptw
>    entries, connect() syscall can run for seconds. Debugging showed
>    that it iterates the cycle millions times and purges thousands of
>    tcptw entries at a time.
>    Besides practical unusability this change is architecturally
>    wrong. First, in_pcblookup_local() is used in connect() and bind()
>    syscalls. No stale entries purging shouldn't be done here. Second,
>    it is a layering violation.

So you're returning to the behavior where the system chokes and stops all outbound TCP connections because everything is in the timewait state? There has to be a way to fix the problem without removing this heuristic entirely.

How did you run your tests?

>  o Return back the tcptw purging cycle to tcp_timer_2msl_tw(),
>    that was removed in rev. 1.78 by rwatson. The commit log of this
>    revision tells nothing about the reason cycle was removed. Now
>    we need this cycle, since major cleaner of stale tcptw structures
>    is removed.

Looks good; the missing purging cycle is probably the reason the code in in_pcb behaved so poorly. Did you test this change alone to see whether it solved the problem you were seeing?
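
For reference, a timer-driven purging cycle of the kind discussed here could look roughly like the sketch below. It is a simplified standalone model, not the actual tcp_timer_2msl_tw() code: struct tw_entry, struct tw_queue, tw_enqueue() and tw_purge_expired() are hypothetical names, and it assumes entries are queued in order of expiry so the loop can stop at the first entry that is still live.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a tcptw entry; not the FreeBSD structure. */
struct tw_entry {
    long expire;                /* tick at which the 2MSL wait ends */
    struct tw_entry *next;
};

/* FIFO queue; assumed to be filled with non-decreasing expire values. */
struct tw_queue {
    struct tw_entry *head;
    struct tw_entry *tail;
};

/* Append an entry at the tail. Returns 0 on success, -1 if out of memory. */
static int
tw_enqueue(struct tw_queue *q, long expire)
{
    struct tw_entry *e = malloc(sizeof(*e));

    if (e == NULL)
        return (-1);
    e->expire = expire;
    e->next = NULL;
    if (q->tail != NULL)
        q->tail->next = e;
    else
        q->head = e;
    q->tail = e;
    return (0);
}

/*
 * Purging cycle: free every entry whose wait has ended, starting at the
 * head, and stop at the first entry that has not expired yet. Returns
 * the number of entries freed.
 */
static int
tw_purge_expired(struct tw_queue *q, long now)
{
    int purged = 0;

    while (q->head != NULL && q->head->expire <= now) {
        struct tw_entry *e = q->head;

        q->head = e->next;
        if (q->head == NULL)
            q->tail = NULL;
        free(e);
        purged++;
    }
    return (purged);
}
```

Because the loop only ever touches entries that have already expired, the work per timer run is bounded by the number of stale entries, and connect() or bind() never has to pay for the cleanup.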

Mike "Silby" Silbersack