Danny,
> In my opinion James socket problems would be greatly reduced in impact if
> James behaviour was as follows..
>
>   connections are accepted
>   -> resources are consumed
>   -> limits are approached
>   -> connections are refused
>   -> resources are freed
>   -> connections are accepted

Some of this capability is already present; it simply requires correct configuration. The <connections> sub-element <maxconnections> (newly introduced with the ConnectionManager change of a few weeks ago) lets you throttle the number of concurrent connections each server will accept. The current problem is that the misuse of the Scheduler requires that the maxconnections value be kept artificially low: with only five concurrent connections you can easily kill the Scheduler implementation under consistent load. (There's a rough sketch of the config block below.)

> In addition it concerns me that we can't run James under the -server JVM
> option on linux because Avalon causes a failure (attached message)
> Tomcat 3 under heavy and sustained load ends up with an out of memory
> exception, -server cures it, largely because of the more aggressive
> garbage collection.

It concerns me too. We should push the Avalon folks to figure out what the problem is. Possibly this would fix the Scheduler crash, possibly not. It seems doubtful to me, because the problem is that the global scheduler/timer keeps references to events that have already expired, so the garbage collector can never reclaim them. As far as I can tell the exact same problem exists with Harmeet's scheduler as with the previous one: the priority queue holds on to the events, eventually causing out of memory errors. This is one reason why I believe the scheduler is the wrong approach.

> In my opinion it is right for us to optimise our use of resources, but
> impossible to create a server that will sustain any load applied, what we
> need to do is ensure that the server will continue to function, even if
> this means rejecting connections.
> This route will provide a scalable and robust solution.

I don't disagree with this point, and a correctly configured server (after the watchdog fix) does this properly. Specifically, each service requires a base number of threads (~2) to function, plus either 1 or 2 threads per handler, depending on whether we're using the old code or the new code. The SpoolManager consumes the number of spool threads plus one. The NNTP repository consumes the number of spooler threads plus one. Fetchpop consumes a single thread. Sum all of that up based on your configuration and set the maximum size of your thread pool to that total. If you do, no problem. This is basically what I've been trying to work towards. (A worked example of the tally is below.)

Obviously James can't take arbitrarily high loads. But the current maximum load is well below what a real production system should be able to take, and the current behaviour of the server under overload is clearly not acceptable. The server needs to be robust.

How do we solve this problem? Proper configuration, and a source base that doesn't tip over from OutOfMemoryErrors. I believe the current patch helps alleviate this situation.

I understand you're having issues, and can only tell you that I am not. I'm happy to work with you to get through those issues, but I need more info on your configuration and assembly.
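For concreteness, the throttling block I mean looks roughly like the following. Treat it as a sketch rather than something to paste in: the limit of 30 is just an example, and exactly where the <connections> element sits depends on your config.xml and assembly.

    <connections>
      <!-- maximum simultaneous connections for this service; -->
      <!-- further connections are refused once this is reached -->
      <maxconnections>30</maxconnections>
    </connections>

And here is the kind of thread tally I mean, assuming the new code (2 threads per handler). The handler and spool-thread counts below are invented for illustration; plug in the numbers from your own configuration:

    SMTP:            2 base + 10 handlers x 2  = 22
    POP3:            2 base + 10 handlers x 2  = 22
    NNTP:            2 base +  5 handlers x 2  = 12
    SpoolManager:    5 spool threads + 1       =  6
    NNTP repository: 1 spooler thread + 1      =  2
    Fetchpop:                                     1
                                         total = 65

With that configuration the thread pool maximum should be set to at least 65; anything lower and the services will starve each other for threads under load.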
--Peter

P.S.: The problem from last night's test has been identified. Basically the problem lay in the spool: spool processing fell woefully behind the rate at which mail was coming in. This led to a multi-GB backlog in the spool of hundreds of thousands of files of ~1 KB each, which in turn caused O/S-level problems, as Win2k doesn't handle this very well. It's taken me well over an hour to attempt to delete these files, and I'm not done yet. But there is no indication of a problem with the handler, just a problem with the underlying O/S.
