Harmeet,

> > Basically this code is designed to address a bunch of 
> > memory/robustness/scalability issues.  These issues were, at least 
> > in part, due to problems with the Cornerstone TimeScheduler.  
> > Dependence on the global TimeScheduler, previously used to handle 
> > idle timeouts, has been removed.  Instead, James now uses the 
> > classes of the new org.apache.james.util.Watchdog package to provide

> > this functionality.
> 
> I have been using a scheduler implementation for 1+ year without any 
> issues. If you don't trust a scheduler implementation why not replace 
> that with another one ? At least one has already been posted.
> One of the advantages of Avalon is that you can replace an impl. I thought
> the scheduler is a good abstraction and widely used.
> 
> What are the advantages of picking another abstraction to solve the 
> same problem ?

Because the scheduler is the wrong abstraction.  This is not a
scheduling problem - it's a timeout problem.  A timeout is far simpler
than a scheduler, and that's the core issue.

A scheduler is required to keep track of a number of possible events,
each of which may trigger at an arbitrary time.  It is centralized, and
has complex synchronization issues when large numbers of events are
added or deleted frequently.  Schedulers are generally intended for
events that are long lived and are expected to actually occur.

A timeout is the opposite problem.  The connections and the
corresponding handlers are extremely short lived - on average they
live for milliseconds.  Under load hundreds of events will be added or
deleted per second.  In the overwhelming majority of cases the timeout
should not trigger.  It is designed to handle the exceptional case, not
the general one.

Thus a timeout should be cheap to invoke (lightweight, involving just a
couple of small objects and drawing from an efficient thread pool) and
it should be optimized for the general case (the timeout thread waits
for the timeout period, rather than consuming resources by context
switches and unnecessary loops).  There should be minimal contention
between different handlers, as their timeout events are independent.  So
rather than using a globally shared scheduler resource with frequent
invocations, you grab a thread from a shared pool and then face no
potential contention with other handlers for the life of the watchdog.
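
To make that concrete, here is a rough sketch of the idea (the names
IdleWatchdog and WatchdogTarget are made up for this sketch - this is
not the actual org.apache.james.util.Watchdog code):

    // One watchdog per connection, run on a thread drawn from the service's pool.
    interface WatchdogTarget {
        void execute();   // the exceptional case: the timeout actually fired
    }

    class IdleWatchdog implements Runnable {
        private final long timeoutMillis;
        private final WatchdogTarget target;
        private boolean cancelled = false;
        private long lastReset;

        IdleWatchdog(long timeoutMillis, WatchdogTarget target) {
            this.timeoutMillis = timeoutMillis;
            this.target = target;
        }

        // Called by the handler after each complete command.
        public synchronized void reset()  { lastReset = System.currentTimeMillis(); notifyAll(); }

        // Called by the handler when the connection closes normally.
        public synchronized void cancel() { cancelled = true; notifyAll(); }

        public synchronized void run() {
            lastReset = System.currentTimeMillis();
            long remaining;
            while (!cancelled
                   && (remaining = lastReset + timeoutMillis - System.currentTimeMillis()) > 0) {
                try {
                    // The thread simply waits out the timeout period; it is woken
                    // early only by reset() or cancel().  No polling, no global queue.
                    wait(remaining);
                } catch (InterruptedException e) {
                    return;   // treat interruption as cancellation
                }
            }
            if (!cancelled) {
                target.execute();   // e.g. close the idle socket
            }
        }
    }

There is no shared data structure here - each connection gets its own
tiny object and its own pooled thread, so handlers never contend with
one another the way they do on a centralized scheduler's event queue.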

The scheduler model fails on all of the above.  

> >
> > In addition, there was substantial rearchitecting of the service 
> > structure.  James services are now subclasses of the new 
> > AbstractJamesService class.  This class serves to unify the common 
> > functionality of the services.  BaseConnectionHandler is removed, 
> > because the functionality once provided by that class has been moved
> > into AbstractJamesService.  The Configurable interface (and the
> 
> Earlier the intent was to have thin server and move towards a base 
> handler class. I don't see the benefit in rearchitecture. Did the 
> current system not meet a requirement ?
> In the past there has been conversation about removing derived Server
> classes and making do with handlers.

There were a number of very serious problems with the handler heavy
approach.

First, there was the simple expense.  Server objects are created once
per service; handler objects are created, fed through the lifecycle, and
destroyed multiple times a second.  Thus it is advantageous to move as
much of the lifecycle expense as possible into the server code.

Second, there's a bizarre division of configuration information.  Not
only was configuration split between handler and server in an arbitrary
fashion (why does helloName belong to the handler and not to the
server?), but common configuration between the servers was duplicated
(and, may I add, duplicated incorrectly, leading to yet another
long-standing NNTP bug).  The addition of the
AbstractJamesService class correctly handled that common server
configuration (extracting socket type, port, enabled, etc. in a uniform
fashion) while allowing custom per-server type configuration.  The
BaseConnectionHandler class attempted to encapsulate some of that common
behavior, but by the very fact it was a handler class rather than a
server class it couldn't encapsulate the port #, socket type, connection
type, and thread pool used for connection.  Those have to be in the
class that provides the service.  So why should the helloName and the
timeout be in the BaseConnectionHandler as opposed to being in the
aforementioned server class?
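
The shape of that common extraction is roughly the following (a sketch,
not the actual AbstractJamesService source - the class name and the
configuration element names are illustrative, though the Avalon
Configuration calls are the real API):

    import org.apache.avalon.framework.configuration.Configuration;
    import org.apache.avalon.framework.configuration.ConfigurationException;

    public abstract class AbstractService {

        protected boolean enabled;
        protected int port;
        protected String serverSocketType;
        protected String helloName;
        protected int timeout;

        // Uniform handling of the configuration every service shares.
        public void configure(Configuration conf) throws ConfigurationException {
            enabled          = conf.getAttributeAsBoolean("enabled", true);
            port             = conf.getChild("port").getValueAsInteger(getDefaultPort());
            serverSocketType = conf.getChild("serverSocketType").getValue("plain");
            timeout          = conf.getChild("connectiontimeout").getValueAsInteger(300000);
            helloName        = conf.getChild("helloName").getValue("localhost");
            configureService(conf);   // hook for per-server type configuration
        }

        // A service specific detail the handler cannot supply - it is needed
        // before any handler exists.
        protected abstract int getDefaultPort();

        // Subclasses override this to read their own configuration children.
        protected void configureService(Configuration conf) throws ConfigurationException {}
    }

    // A concrete service states only what is genuinely different about it.
    class SketchSmtpService extends AbstractService {
        protected int getDefaultPort() { return 25; }
    }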

Third, the sheer conceptual weight.  For the reasons described above,
the BaseConnectionHandler class cannot be the end-all/be-all class.
Thus, why is it necessary?  The answer - it isn't.  It just further
confuses the architecture.  And it perpetuates the idea that you can
mate any ConnectionHandler to any service.  As that idea is false (see
below), it just serves as a point of confusion in the code base.
 
> > corresponding behavior) has been removed from all the 
> > ConnectionHandler children, in recognition that configuration is 
> > really on a per-server, not per-handler, basis.
> 
> A far simpler solution would have been to have a single params object 
> per handler or optimize configuration code. This would have delivered 
> the speed gain without a new architecture. Each philosophy would have 
> holes and I don't see a big enough hole in the handler mechanism to
> require a redo.

No, that wouldn't work.  The solution described is not simple at all.

Adding a "params" object would either require re-validation and
defaulting in each handler invocation (resulting in the original
expense), or would basically be a non-type-safe (and hence inferior)
version of the changes already introduced.  The servers are different
(and the handlers are not interchangeable) for precisely this reason.
Different handlers require different param values/services.
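
To illustrate the difference between the two approaches (the names here
are made up for illustration, not code from the patch):

    import java.util.Map;

    // (a) The generic "params" approach: every handler invocation re-parses,
    //     re-validates and re-defaults the same values, and the compiler cannot
    //     tell whether an SMTP handler was handed SMTP parameters.
    class GenericHandler {
        void handleConnection(java.net.Socket socket, Map params) {
            String helloName = (String) params.get("helloName");        // unchecked cast
            if (helloName == null) helloName = "localhost";             // re-defaulting per connection
            int timeout = params.containsKey("timeout")
                    ? Integer.parseInt((String) params.get("timeout"))  // re-parsing per connection
                    : 300000;
            // ... protocol work ...
        }
    }

    // (b) Typed configuration held by the server: validated and defaulted once,
    //     then handed to each handler the server creates.
    class SmtpConfiguration {
        final String helloName;
        final int timeout;
        SmtpConfiguration(String helloName, int timeout) {
            this.helloName = helloName;
            this.timeout = timeout;
        }
    }

    class SmtpHandler {
        private final SmtpConfiguration conf;   // the compiler guarantees these are SMTP settings
        SmtpHandler(SmtpConfiguration conf) { this.conf = conf; }
        void handleConnection(java.net.Socket socket) {
            // conf.helloName and conf.timeout are already validated - no per-connection cost
        }
    }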

As for optimizing the configuration code, there is nothing to optimize.
The pre-existing Configuration code was basically as optimized as
possible.  It was just getting invoked hundreds of times per second
rather than once per the lifetime of the server.  There's no way to beat
that math.

> > This leads to tighter coupling between the
> > Handlers and the corresponding server classes, but that's simply 
> > recognition of a pre-existing reality.  These classes are extremely 
> > tightly coupled.
> 
> Server is a listener. Handler contains protocol specific stuff. Why 
> would there be a (tight) coupling ? It is a listen-dispatch mechanism.

> Shouldn't it be as loosely coupled as possible ?

This is a somewhat naïve view of a service architecture.  Servers aren't
generic.  If they were, we'd need only one server class period.  But
servers are different from one another, so we need separate server
classes.  You can see how the servers differ by examining the diffs for
the assorted server classes that were submitted with this patch.

In addition, a service is not simply the protocol.  Specifically, a
handler that implements the SMTP protocol does not an SMTP service make.
There are two additional constraints.

First, there are service specific details that are not described by the
protocol.  These generally must be known and acted on before any
handler is created.  The obvious example is the default port for the
service.  You need the default port when you're getting the config to
open the socket.  

Second, there are resources provided by the server that depend on the
configuration.  Yes, it's possible to re-read the configuration on a per
handler invocation basis.  But that's a bad idea for the reasons
described above.
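
Put in code form, the ordering constraint looks like this (a sketch with
made-up names, not the patch itself):

    import java.net.ServerSocket;
    import java.net.Socket;

    class SketchSmtpServer {
        private ServerSocket serverSocket;
        private int timeout;

        // Configuration is read and the listening socket opened long before any
        // handler object exists, so the default port and the other
        // configuration-derived resources have to live at the server level.
        void initialize() throws Exception {
            int port = 25;        // the SMTP default port, known only to the server class
            timeout = 300000;     // a configured resource shared by every handler
            serverSocket = new ServerSocket(port);
        }

        void serve() throws Exception {
            while (true) {
                final Socket socket = serverSocket.accept();
                // Only now is a handler created, once per connection.  It is handed
                // the server's already-validated resources instead of re-reading
                // the configuration itself.
                new Thread(new Runnable() {
                    public void run() { handleConnection(socket, timeout); }
                }).start();
            }
        }

        static void handleConnection(Socket socket, int timeout) {
            // protocol-specific work goes here
        }
    }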

Also, the very design of the Phoenix server makes this complicated.  Try
changing the handler so that it has a new dependency.  You'll find that
you need to alter the .xinfo for the server class as well as its
assembly descriptor.  That's part of what I mean by a tight coupling.
Phoenix knows about, loads, and determines dependencies from the server
class (which is a Phoenix block).  The handler class is unknown to the
underlying server.  It can't be a block, because its invocation and
lifecycle are not controlled by Phoenix.  So it can't publish
dependencies to Phoenix.

> >
> > The config.xml entries were not modified to remove the 
> > <handler></handler> tags, to maximize backwards compatibility.  
> > These tags don't really make sense, even before this set of changes,
> > and should be removed in a future version.
> >
> > Some enhancements, including the ability to specify a thread group 
> > for use by a particular service and the ability to do more 
> > fine-grained connection management (i.e. to have more than one set 
> > of SSL connections, each for use by different services) have been 
> > added.  These will be further documented, but are basically 
> > encapsulated in the AbstractJamesService class.
> 
> I thought the capability to create n listeners for each service 
> already existed.

No, it doesn't.  And never did.  There is no capacity, either in the
implemented services or their abstract base class, to have multiple
handlers for a given service.  Even if the code were there, the
configuration and implementation would preclude it as there is only a
single port entry for each service (so you can't distinguish between
handlers based on port) and the connection handler is created and
invoked by the connection manager sans parameters.

As for the ability to handle multiple socket types, that wasn't there
either.  The socket type was one of two hard-coded strings, and the SSL
selection was a boolean.  So if you wanted to have two different SSL
socket types, one for use by the NNTP server and another for use by the
SMTP server (a very common configuration), then you would have to run
two separate servers.  Very bad.
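
The fix is to key socket types by name rather than by a boolean flag,
along these lines (again just a sketch of the idea, not the actual
Cornerstone/James interfaces):

    import java.net.ServerSocket;
    import java.util.HashMap;
    import java.util.Map;

    interface ServerSocketFactory {
        ServerSocket createServerSocket(int port) throws Exception;
    }

    class SocketTypeRegistry {
        private final Map factories = new HashMap();   // socket type name -> factory

        void register(String name, ServerSocketFactory factory) {
            factories.put(name, factory);
        }

        ServerSocket open(String socketType, int port) throws Exception {
            ServerSocketFactory factory = (ServerSocketFactory) factories.get(socketType);
            if (factory == null) {
                throw new IllegalArgumentException("Unknown socket type: " + socketType);
            }
            return factory.createServerSocket(port);
        }
    }

With something like that in place, the NNTP and SMTP services can each
name their own SSL configuration (say "ssl-news" and "ssl-mail") in
their own config entries, instead of sharing a single flag.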

> >
> > There was some change to the behavior of the SMTP server - 
> > specifically, the number of bytes required to reset the idle timer 
> > is now measured against the total number of bytes read off the 
> > socket, not simply those read in the DATA command.  This 
> > substantially neatens the code and in real life scenarios doesn't 
> > make a whole lot of difference.
> >
> > There is also a partial redress of bug #13316 in the NNTP code, 
> > which will reset the watchdog every 100 headers it transmits.
> 
> 
> observer-observable model would be a good fix.

I don't think so.  This is not really an observer/observable problem.
I mean, do you want the handler to register itself as an observer of the
stream?  The stream contains the information of how many bytes it has
transmitted.  The handler knows how many headers it has written.  The
handler knows how long its idle timeout is.  The handler knows when it
has written a complete command.  The handler has the socket which is to
be closed on timeout.  So which is the observer and which is the
observable?  Both the handler and the stream have data that would be
associated with the observable (i.e. how long the timeout is and when to
reset it).

The problem here is really architectural - the NNTP server should solve
this problem in the same way the POP3 and SMTP servers solve the
problem.  But I really don't want to spend any more time doing NNTP
reworking.  The partial redress is just a quick hack because there is no
Writer that handles counting bytes and timing out.
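
What that missing piece would look like, roughly (an illustrative
sketch, not code from the patch):

    import java.io.FilterWriter;
    import java.io.IOException;
    import java.io.Writer;

    // A Writer that counts what it writes and pokes a watchdog-style callback
    // every so often, so a long response keeps the idle timer honest.
    class WatchdogResetWriter extends FilterWriter {
        private final Runnable resetCallback;   // e.g. a call to the watchdog's reset()
        private final long resetInterval;       // reset after this many chars written
        private long written = 0;

        WatchdogResetWriter(Writer out, Runnable resetCallback, long resetInterval) {
            super(out);
            this.resetCallback = resetCallback;
            this.resetInterval = resetInterval;
        }

        public void write(int c) throws IOException {
            super.write(c);
            count(1);
        }

        public void write(char[] cbuf, int off, int len) throws IOException {
            super.write(cbuf, off, len);
            count(len);
        }

        public void write(String str, int off, int len) throws IOException {
            super.write(str, off, len);
            count(len);
        }

        private void count(int len) {
            written += len;
            if (written >= resetInterval) {
                written = 0;
                resetCallback.run();   // tell the watchdog the connection is still alive
            }
        }
    }

Wrap the NNTP response writer in something like this and the
reset-every-100-headers hack goes away.  That's the proper fix, but it
is exactly the kind of POP3/SMTP-style rework I'd rather not take on
right now.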

> >
> > I've done basic testing on this code, and I am now doing further 
> > load testing.  I will continue to do testing before submission.
> >
> > The code is attached as a set of diffs, and a collection of newly 
> > added files.
> >
> > As always, comments, critiques, and questions are welcome.
> 
> I don't follow the need for a new scheduler abstraction or redesign
> of server-handler interaction. Maybe a bit of pro-con or a discussion 
> on different options and what makes one better than other would help.

See the above.

--Peter


