>  I'm a developer working on software that does a fairly large amount of
> concurrent outbound tcp/ip connections.  About 15000 connections or so.

I hope it's not a mass mailer... ;)

> I'm running build 57 on a few different machines, and the problem I am
> having is that after running my software for ~15 minutes, I can no longer
> connect to anything past our router.  It can still talk to anything on the
> local network, just nothing past the router (this includes ping or any
> other program).

You should run lockstat for a minute or two during this time:

        lockstat sleep <#-of-secs>

and see what kernel locks are being banged upon/held.

> This only seems to occur when I get upwards of 15k+ connections.  The CPU
> load is about 20%, and the actual bandwidth usage is minimal.  Other
> machines on the local network continue to function normally.  Also keep in
> mind that this software works fine on freebsd and linux, though I might
> have botched my 'event ports' implementation for solaris, but that still
> shouldn't make the whole machine go dark.  Indeed the program seems to run
> nicely until the OS starts dropping any outbound packet destined for the
> intranets.

I'm guessing you're flooding your "IRE cache entries": each new off-link
destination gets its own cached route entry.  See ip_newroute() in the
source for where these come from.

While you're ramping up, but before you get wedged, utter:

        netstat -rna | grep UHA | wc -l

And I'll betcha this number is large.  As it gets larger, you'll probably
notice your symptoms.
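
If you'd rather watch that number climb than sample it by hand, a loop along
these lines might do.  This is only a sketch: count_uha and watch_uha are names
I made up, and the 10-second default interval is arbitrary.

```shell
# count_uha: count lines containing "UHA" in netstat -rna style output
# read from stdin (same match as the grep in the pipe above).
count_uha() {
    # grep -c exits non-zero when the count is 0; "|| :" hides that so the
    # function is safe to use under "set -e".
    grep -c UHA || :
}

# watch_uha: print a timestamped UHA count every N seconds (default 10).
# Runs until interrupted.
watch_uha() {
    interval=${1:-10}
    while :; do
        n=`netstat -rna | count_uha`
        echo "`date '+%H:%M:%S'` $n"
        sleep "$interval"
    done
}
```

Source that into your shell and run "watch_uha 5" in a second window while
your program ramps up; redirect it to a file if you want to line the counts up
against the moment things wedge.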

> Other symptoms:
> - netstat -r hangs after problem occurs (probably because it can't talk to a 
> dns server)

Use -n (numeric addresses, no name lookups) to avoid that problem.

> - The system doesn't recover unless I do 'svcadm restart' on network/service
> or network/physical

This causes your routes to get flushed, including the IRE cache entries.

> Does anyone have any idea what's causing it to basically stop talking to our
> router?  I'm free to try anything and give feedback.  I'll be checking back
> here frequently as well.

Utter that netstat pipe again while you're wedged; I'll betcha the number is
huge.

Dan
_______________________________________________
networking-discuss mailing list