Hi all, I'm a developer working on software that opens a fairly large number of concurrent outbound TCP/IP connections, about 15,000 or so.
I'm running build 57 on a few different machines. The problem is that after my software has been running for ~15 minutes, the machine can no longer connect to anything past our router. It can still talk to anything on the local network, just nothing past the router (this includes ping and any other program). This only seems to occur once I get to 15k+ connections. The CPU load is about 20%, and the actual bandwidth usage is minimal. Other machines on the local network continue to function normally.

Also keep in mind that this software works fine on FreeBSD and Linux. I might have botched my 'event ports' implementation for Solaris, but that still shouldn't make the whole machine go dark. Indeed, the program seems to run nicely until the OS starts dropping every outbound packet destined for anything past the router.

Other symptoms:
- netstat -r hangs after the problem occurs (probably because it can't talk to a DNS server)
- The system doesn't recover unless I do 'svcadm restart network/service or network/physical'
- The problem still occurs even after I run this script: http://everythingsolaris.org/software/tune_tcp

Does anyone have any idea what's causing it to basically stop talking to our router? I'm free to try anything and give feedback. I'll be checking back here frequently as well.

Thanks

This message posted from opensolaris.org
_______________________________________________
networking-discuss mailing list
