RE: [PATCH] Removing Scheduler dependency, refactoring service code

Peter M. Goldstein Thu, 10 Oct 2002 02:27:40 -0700


Harmeet et al,


> Because the scheduler is the wrong abstraction.  This is not a
scheduler
> problem - it's a timeout.  A timeout is far simpler than a scheduler.
> And that's the core issue.
> 
> 
> [Harmeet]
> One time timer task rather than repeated timer tasks is the same thing
as
> Watchdog timeout. It is property of implementation.
> 

See Noel's comments on this.

I also want to add that this situation is directly analogous to the
AuthService issue.  The interface itself defines a contract.  In the
case of the Scheduler, that contract requires support for multi-threaded
lookup of targets on each reset.  This requires lock contention
(assuming the implementation properly implements the contract) on each
reset.  Lock contention which is both unnecessary, since no handler is
interested in the target of any other timeout and potentially expensive
(multi-way lock contentions cost ~50x uncontested locks/volatiles in
modern JVMs).  The interface contract puts restrictions on the
implementation.

I'm not going to respond to the rest of the specific scheduler comments,
because they all labor under the same confusion of interface contract
and implementation.  Noel nailed the crucial point.

> [Harmeet]
> A Config object can take care of this. Handlers don't need to
> expose themselves to lifecycle. What you are describing is
configuration
> per
> handler and as you pointed out that is not needed. You are also saying
> configuration is spread out, that can be centralized too in a config
> object.
> 

This is not correct.

There are a couple of real problems with this proposal, some of them
highlighted in Noel's reply, some not.

Let's start with the situation.  We have a block definition in Phoenix
that represents a service.  This class representing this block
(NNTPServer, SMTPServer, etc.) is instantiated by Phoenix (or any other
Avalon container) and is passed a read-only Configuration object (see
the comments for the Configuration interface).  What happens now?

In the current code base the following behavior occurs.  Some parameters
(port, isSSL, enabled) are read out of the configuration, validated, and
set to defaults if they are not present.  The parameters that are read
and validated are logged, the rest are not.  Note that even this
functionality requires customized Server classes for each service.

A child of the Configuration is then passed to the superclass
AbstractService.  There is no clear distinction between the elements
that are in the top level Configuration object and its child class.
When the first connection (and every subsequent connection) is made, the
remainder of the Configuration parameters are read and validated.  This
expense is incurred on each connection.  Note that the server may fail
because of bad configuration upon first connection, which may be hours
or even days after the server was started.  

Changes to the handler which add additional component dependencies now
require alteration to the supposedly independent server configuration,
both in the .xinfo and in the assembly.xml.  This is totally non-obvious
from the "decoupled" listener/handler model.

Now consider the situation with the changed code.  The server starts and
all service configuration parameters are read, validated, and set to
defaults if they are not present.  Any fatal misconfiguration will be
detected here and cause server failure.  All service configuration
information is logged, so an administrator can simply look at the
startup log and see how the service was configured.  There is no extra
per-connection logging to that prints a config parameter that doesn't
change over the life of the server.  It's stated just once in the log.

Since the handler is viewed as tightly coupled to the listener, and
dependency checking is understood to be a function of the listener
block, it is obvious that changes to the handler that introduce new
dependencies will require modification of the listener configuration
files.

Personally I much prefer the second scenario.  It's cleaner from both an
administrator and a developer point of view.  See below for reasons why
the potential third scenario - the suggestion to "cache" configuration
information - is badly flawed.

> [Harmeet]
> Listener-dispactch(handler) is the pattern followed in current code.
It is
> more flexible than  server class hierarchy. The missing piece is
confusing
> and repeated configuration of the handler not a problem with the
pattern
> iteself. Most other servers try to separate the listeners from
dispatch
> workers/handlers.

The current code is a listener-dispatch pattern.  It is a
listener-dispatch pattern, and would remain so after the changes I
propose.

I don't think you quite understand what a listener-dispatch pattern
implies.  There is absolutely no requirement in that pattern (and
certainly it doesn't hold true in generic implementations) that the
listener be generic.  In general, the listener is specific to the
particular service being implemented.  Usually there is a common
interface that describes the listener-handler (in this case
ConnectionHandlerFactory / ConnectionHandler) but there is certainly no
requirement (nor is it desirable) that either part of the pattern be
generic.  This pattern is all about a division of responsibility, not
about the ability to swap in and out different partners.  The division
of responsibility is clarified in my proposal, as all the configuration
is moved to the listener and the handler can focus on its purpose -
describing the logic for a particular transaction type (NNTP, SMTP,
etc.).

As I stated before, neither side is generic in the current code base.
The server or listener side inherits from a reasonably heavyweight
AbstractService class - about the same weight as the
AbstractJamesService class, but with fewer features.

> [Harmeet]
> Params/Config need not require revalidation and if they do there will
be
> the
> overhead that you are avoiding by server class hierarchy.
> 

Again, this is not correct.  You need to validate your config
parameters.  You need to do it somewhere.  If you load and parse the
parameters in the handler, you do it in the handler.  If you load and
parse the parameters in the server class, you do it in the server class.

> [Harmeet]
> Cache the config. Precalculation for handlers, isn't that the only
gain in
> having a server class hierarchy. Maybe that idea can be abstracted and
> used
> with current model.
> 

No, this isn't the only gain in having a server class hierarchy.  See
this email and the previous one for a discussion of gains.  They
include, but aren't limited to, better error handling, clearer logging,
simpler configuration, and simpler code.

Moreover this caching idea you suggest is even worse than the current
code.  Consider the situation.  Since you object to distinguishing
between the server objects, presumably you're going to have the "first"
handler for the service initialize the configuration and all others will
access it.  That's because someone has to initialize the config.  If
you've got a generic server with no ability to interpret the content of
the configuration, it has to be the handler.  This is a classic server
design anti-pattern.  Because what you've done is introduced a mutable
(because the handler has to do the initial parsing and setting to
defaults), shared configuration object.  Access to this object will
require a lock on every call, because the handler has to check if the
config is initialized and, if not, initialize it.  This becomes a
heavily contested lock.  Very expensive, destroys performance.

Now, let's imagine that we admit that the natural place for the
configuration parsing is in the server class.  But we still want to
stick with this notion of "generic" arguments passed to the handler.
What's wrong with this?  The basic problem with it is that you don't
gain anything over the approach I suggest and you lose type safety.  The
server class still needs to know what type of service it represents in
order to parse and validate the configuration parameters.  In addition,
if we restrict ourselves to the Avalon configuration rather than some
Properties object, we lose even more. Configuration parameters that
aren't easily represented as Strings become difficult to pass to the
handler.  We have to put some String key in place, and give the handler
the Configuration.  It pulls out the String, and looks up the non-String
thing.  Ugh.

This is simple?

> 
> > > This leads to tighter coupling between the
> > > Handlers and the corresponding server classes, but that's simply
> > > recognition of a pre-existing reality.  These classes are
extremely
> > > tightly coupled.
> >
> > Server is a listener. Handler contains protocol specific stuff. Why
> would
> > there be a (tight) coupling ?
> > It is a listen-dispatch mechanism. Shouldn't it be as loosely
coupled
> as
> > possible ?
> 
> This is a somewhat na�ve view of a service architecture.  Servers
aren't
> generic.  If they were, we'd need only one server class period.  But
> servers are different from one another, so we need separate server
> classes.  You can see how the servers differ by examining the diffs
for
> the assorted server classes that were submitted with this patch.
> 
> 
> 
> [Harmeet]
> No I am saying listeners are generic. Handlers/Workers are specific. A
mix
> of two makes service and servers.
> 

Listeners are not generic.  Not even close.  They aren't now, they can't
be for the reasons I outlined in my last email, and there's been no
compelling reason presented for them to be generic.


> [Harmeet]
> But a socket listener associated with SMTP handler makes an SMTP
Service.
> That is the current state.
> An incremental and better system would be Socket Listener with Config
> Object
> and Handler makes a Service.
> 
> A Bad State would be a server class hierarchy that couples Listener,
> Config
> and Handler.
> 
> -------------------------------

I couldn't disagree more with this statement.

It's a typical expression of extensibility at any cost.  The trick to
writing extensible code is to pick your extension points.  The
Configuration object is generic, so the class itself is not coupled to
the server or the handler.  Of course the specific values stored in that
Configuration object ARE tightly coupled to the SMTP server.  They have
to be, as they're the values that configure how the SMTP server runs.

The Handler and Listener should be tightly coupled.  There is no reason
why an NNTPServer should work with a SMTPHandler.  That wouldn't make
any sense.  Even in the proposed solution they wouldn't work together -
they'd just fail miserably on each connection because the SMTPHandler
wouldn't have the parameters it needed.  The handler and listener are
tightly coupled precisely because they are implementations of a service
- a service that has configuration, server specific details, and a
protocol.

> [Harmeet]
> Port, queus size etc  are the only listener concerns.
> 

Port, queue, etc.?  What's hidden in the etc.?

Trivializing the listener concerns like this is just hiding the issue.
The listener is responsible for socket and connection management, so it
needs to be able to specify the ConnectionManager and socket type.  As
the ConnectionManager needs a thread group to invoke a server
connection, it needs a thread group.  For certain services there will be
configuration restriction logic related to the socket types (imagine
publishing an HTTPS service with TCP/IP sockets).  This all belongs in
the listener.   The handler can't deal with any of it, because by the
time the handler is invoked this needs to have been already dealt with.
So the listener has to have a very non-trivial configuration.  

> -------------------------------
> 
> Second, there are resources provided by the server that depend on the
> configuration.  Yes, it's possible to re-read the configuration on a
per
> handler invocation basis.  But that's a bad idea for the reasons
> described above.
> 
> [Harmeet]
> That is why I am suggesting params or configuration object, cached for
a
> class of handlers.

See above why a param (lose type safety) or configuration object (can
only hold Stringifiable objects) is a bad idea.

> [Harmeet]
> Are you saying Avalon is too complicated. Avalon complexity or user
> friendliness is not a reason to change a fairly standard
listener-dispatch
> model. If there are problems with Phoenix, would it be better to not
use
> it
> ? That is a different topic. Complexity in one part should ideally not
be
> reason a rearchitect another part.

This is just silly.  What's being critiqued here as over-complex is a
vast simplification of one of the most difficult aspects of server
design.

Let me be clear - this is one of the best parts of Phoenix.  It's
incredibly well designed and well thought out and it takes one of the
most annoying parts of server design (getting all your dependency
checking right) and makes it trivial. 

Avalon in general, and Phoenix as a specific container is based around
the idea of component oriented programming.  Each of the services we
define - the listener classes - comprises a component.  It's a component
because it can be uniquely defined by a name, provides one or more
interfaces, exists for the life of the container, and can be looked up
by the other components running in the container.  These characteristics
are all necessary to establish a dependency chain and ensure that each
component can both declare what it needs to function and be evaluated as
part of a dependency chain.

The handler instances cannot be components.  There are some arbitrary
number of them, they don't have unique names, they can't be accessed by
other components, and they have very limited lifetimes.  Thus they don't
meet the minimum requirements.

Avalon is great, and Phoenix is a great container.  Chucking it is
really out of the question.  This is not a fault in the Avalon
framework, but rather a fault in your proposal.  Your proposal violates
basic principles of component oriented design.

 
> -------------------------------
> 
> [Harmeet]
> You can have different services. POP3 and POP3 Secure. Doesn't that
achive
> what you are describing. If you POP3 is a service, POP3 (plain) and
POP3
> Secure are 2 instances of it. It depends on what you mean by a
service.
> Effectiely if you want different listereners/threadgroups for a
protocol,
> that exists.

No, it doesn't.  You are misunderstanding the problem.  The ability to
configure different socket types and different thread groups is not
present in the current code.  Period.  The configuration doesn't look
for it.  All it can do is "plain" sockets vs. "ssl".  That's it.  And
the thread pool is always the default thread pool.  There is no way to
configure the server to use different thread pools for different
services.

 
> -------------------------------
> 
> As far as the ability to handle multiple socket types, that wasn'tb
> there.  The socket type was one of two hard coded strings, and the SSL
> selection was Boolean.  So if you wanted to have two different SSL
> socket types, one for use by the NNTP server and another for use by
the
> SMTP server (a very common configuration), then you would have to run
> two separate servers.  Very bad.
> 
> [Harmeet]
> It is pretty standard. Take a look at tomcat. They do the same thing.
> 

No, it isn't.  Almost nobody writes production servers that handle SSL
for multiple services without providing the ability to have multiple SSL
connection types.

This is also a misunderstanding of the standard recommended production
deployment of Tomcat.

Tomcat is a reference implementation of the Servlet/JSP specs.  As such
it was an explicit decision to focus on the Servlet/JSP processing in
Tomcat.  The network interface was left uber-simple on purpose.  The
static page delivery wasn't tuned.  Why?  Because the developers wanted
to focus on the core functionality.

And that's why the recommended deployment of Tomcat is in a reverse
proxy environment.  Apache in front to handle complex connection
management and configuration (and it certainly does allow multiple SSL
connection types) and to handle static page delivery.  And then Apache
proxies Servlet and JSP requests to the Tomcat server.  That allows the
production environment to take advantage of Apache's strong connection
management.

This is a recognized weakness of the Tomcat implementation.  That's why
there is always a lot of discussion about embedding Tomcat in Avalon
containers, etc. on the Avalon mailing lists.  Because a container that
implements the Avalon framework would provide all of the complex
connection management features that the Tomcat implementation lacks and
allow you to run more complex systems in non-reverse proxy environments.

I don't see any reason to hobble ourselves with an unnecessary flaw.

> [Harmeet]
> Basically if n lines have been written out, observer resets the timer.
> rather than only after all the lines have been written out.
> 

And this won't work.  Why?  Because it ignores the distinction between
observer and observable as well as the conditions under which we want a
reset.  The purpose of the reset here is to prevent transaction hangs,
as much from server processing as from client processing.  So the
primary trigger reset is not the amount of data written, but rather the
trigger that occurs every time a command is processed.  A command being
processed indicates that the transaction is proceeding.  As this
information is not available to the stream (it has no idea when a
command has completed), the stream cannot be the observable in this
situation.  The stream data is crucial for commands that potentially
involve large transfers of data, so the handler can't be the observable.
So the pattern doesn't fit.


> [Harmeet]
> The observer-observable is a clean and standard pattern way. It allows
an
> observer to monitor and decide on timeout reset.
> 

It is a clean and standard pattern.  It's just the wrong one for this
situation.  See above.

--Peter




--
To unsubscribe, e-mail:   <mailto:[EMAIL PROTECTED]>
For additional commands, e-mail: <mailto:[EMAIL PROTECTED]>

RE: [PATCH] Removing Scheduler dependency, refactoring service code

Reply via email to