Peter et al,

1) This thread is quite interesting, because I've asked many of these
same questions in beginning to look at the IMAP stuff, which in many
cases (at least the listener and handler code) seems to be based on old
versions of the POP equivalents (i.e., they are missing the August and
September changes by Danny and Peter to the POP code).
2) In your initial post, you mention doing some "basic" and "load"
testing to further quantify the memory/robustness/scalability issues and
show how your patch solves them. What about (and I'd be happy to donate
time for this):

a) Using a profiling tool to see how much time is actually being wasted
by configuring each Handler instead of more tightly coupling them to the
server listener.

b) Do you have before/after metrics for your load tests? A 2x
improvement would be a strong argument in itself. JMeter doesn't seem to
support email server load testing - is there anything better (e.g.
PushToTest, http://www.etestinglabs.com/benchmarks/svrtools/email/t1intro.asp)?

Again, I'd be more than happy to help in testing, because this
significantly affects what I'm doing in IMAP, and I'm going to have to
perform this kind of testing on IMAP due to its complexity and higher
overhead.

Nathan Carter

Peter M. Goldstein wrote:

> Harmeet et al,
>
>> Because the scheduler is the wrong abstraction. This is not a
>> scheduler problem - it's a timeout. A timeout is far simpler than a
>> scheduler. And that's the core issue.
>>
>> [Harmeet]
>> A one-time timer task rather than repeated timer tasks is the same
>> thing as a Watchdog timeout. It is a property of the implementation.
>
> See Noel's comments on this.
>
> I also want to add that this situation is directly analogous to the
> AuthService issue. The interface itself defines a contract. In the
> case of the Scheduler, that contract requires support for
> multi-threaded lookup of targets on each reset. This requires lock
> contention (assuming the implementation properly implements the
> contract) on each reset - lock contention which is both unnecessary,
> since no handler is interested in the target of any other timeout, and
> potentially expensive (multi-way lock contention costs ~50x uncontested
> locks/volatiles in modern JVMs). The interface contract puts
> restrictions on the implementation.
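The contention point above is easier to see in code. Below is a minimal
sketch (illustrative names only - this is not the actual James Watchdog
API) of a one-shot, per-connection watchdog: each handler owns its own
instance, so a reset cancels and re-arms only that handler's task and
never takes a lock shared with other connections.

```java
import java.util.Timer;
import java.util.TimerTask;

// Minimal per-connection watchdog sketch. Each handler owns one
// instance, so reset() synchronizes only on this object - there is no
// shared scheduler map that every connection must lock on every reset.
class ConnectionWatchdog {
    private final Timer timer = new Timer(true); // daemon timer thread
    private final long timeoutMillis;
    private final Runnable onTimeout;
    private TimerTask pending;

    ConnectionWatchdog(long timeoutMillis, Runnable onTimeout) {
        this.timeoutMillis = timeoutMillis;
        this.onTimeout = onTimeout;
    }

    // Called whenever the handler makes progress (e.g. a command
    // completes): cancel the old one-shot task and arm a fresh one.
    synchronized void reset() {
        if (pending != null) {
            pending.cancel();
        }
        pending = new TimerTask() {
            public void run() { onTimeout.run(); }
        };
        timer.schedule(pending, timeoutMillis);
    }

    // Called when the connection closes.
    synchronized void stop() {
        if (pending != null) {
            pending.cancel();
        }
        timer.cancel();
    }
}
```

A production version would likely share one timer thread rather than
spawning one per connection; the point here is only that the reset path
touches no cross-handler state.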
> I'm not going to respond to the rest of the specific scheduler
> comments, because they all labor under the same confusion of interface
> contract and implementation. Noel nailed the crucial point.
>
>> [Harmeet]
>> A Config object can take care of this. Handlers don't need to expose
>> themselves to the lifecycle. What you are describing is configuration
>> per handler, and as you pointed out that is not needed. You are also
>> saying configuration is spread out; that can be centralized too in a
>> config object.
>
> This is not correct.
>
> There are a couple of real problems with this proposal, some of them
> highlighted in Noel's reply, some not.
>
> Let's start with the situation. We have a block definition in Phoenix
> that represents a service. The class representing this block
> (NNTPServer, SMTPServer, etc.) is instantiated by Phoenix (or any
> other Avalon container) and is passed a read-only Configuration object
> (see the comments for the Configuration interface). What happens now?
>
> In the current code base the following behavior occurs. Some
> parameters (port, isSSL, enabled) are read out of the configuration,
> validated, and set to defaults if they are not present. The parameters
> that are read and validated are logged; the rest are not. Note that
> even this functionality requires customized Server classes for each
> service.
>
> A child of the Configuration is then passed to the superclass
> AbstractService. There is no clear distinction between the elements
> that are in the top-level Configuration object and its child. When the
> first connection (and every subsequent connection) is made, the
> remainder of the Configuration parameters are read and validated. This
> expense is incurred on each connection. Note that the server may fail
> because of bad configuration upon first connection, which may be hours
> or even days after the server was started.
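The parse-once alternative described in this thread can be sketched as
follows (hypothetical class and parameter names; James uses the Avalon
Configuration interface rather than Properties): the server reads,
validates, and defaults everything during initialization, so bad
configuration fails the server at startup instead of on the first
connection hours or days later.

```java
// Sketch of "validate once at startup" with illustrative names. An
// immutable, fully validated snapshot is built before any connection
// is accepted, and handlers receive typed values, not raw strings.
final class SmtpServerConfig {
    final int port;
    final int timeoutMillis;

    private SmtpServerConfig(int port, int timeoutMillis) {
        this.port = port;
        this.timeoutMillis = timeoutMillis;
    }

    // Throws at server startup on misconfiguration - not on the first
    // connection. Defaults are applied here, and logged once.
    static SmtpServerConfig parse(java.util.Properties raw) {
        int port = Integer.parseInt(raw.getProperty("port", "25"));
        int timeout =
            Integer.parseInt(raw.getProperty("timeout", "360000"));
        if (port <= 0 || port > 65535) {
            throw new IllegalArgumentException("bad port: " + port);
        }
        if (timeout <= 0) {
            throw new IllegalArgumentException("bad timeout: " + timeout);
        }
        return new SmtpServerConfig(port, timeout);
    }
}
```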
> Changes to the handler which add additional component dependencies now
> require alteration to the supposedly independent server configuration,
> both in the .xinfo and in the assembly.xml. This is totally
> non-obvious from the "decoupled" listener/handler model.
>
> Now consider the situation with the changed code. The server starts
> and all service configuration parameters are read, validated, and set
> to defaults if they are not present. Any fatal misconfiguration will
> be detected here and cause server failure. All service configuration
> information is logged, so an administrator can simply look at the
> startup log and see how the service was configured. There is no extra
> per-connection logging that prints a config parameter that doesn't
> change over the life of the server. It's stated just once in the log.
>
> Since the handler is viewed as tightly coupled to the listener, and
> dependency checking is understood to be a function of the listener
> block, it is obvious that changes to the handler that introduce new
> dependencies will require modification of the listener configuration
> files.
>
> Personally, I much prefer the second scenario. It's cleaner from both
> an administrator and a developer point of view. See below for reasons
> why the potential third scenario - the suggestion to "cache"
> configuration information - is badly flawed.
>
>> [Harmeet]
>> Listener-dispatch(handler) is the pattern followed in the current
>> code. It is more flexible than a server class hierarchy. The missing
>> piece is confusing and repeated configuration of the handler, not a
>> problem with the pattern itself. Most other servers try to separate
>> the listeners from dispatch workers/handlers.
>
> The current code is a listener-dispatch pattern. It is a
> listener-dispatch pattern, and would remain so after the changes I
> propose.
>
> I don't think you quite understand what a listener-dispatch pattern
> implies.
> There is absolutely no requirement in that pattern (and certainly it
> doesn't hold true in generic implementations) that the listener be
> generic. In general, the listener is specific to the particular
> service being implemented. Usually there is a common interface that
> describes the listener-handler pair (in this case
> ConnectionHandlerFactory / ConnectionHandler), but there is certainly
> no requirement (nor is it desirable) that either part of the pattern
> be generic. This pattern is all about a division of responsibility,
> not about the ability to swap in and out different partners. The
> division of responsibility is clarified in my proposal, as all the
> configuration is moved to the listener and the handler can focus on
> its purpose - describing the logic for a particular transaction type
> (NNTP, SMTP, etc.).
>
> As I stated before, neither side is generic in the current code base.
> The server or listener side inherits from a reasonably heavyweight
> AbstractService class - about the same weight as the
> AbstractJamesService class, but with fewer features.
>
>> [Harmeet]
>> Params/Config need not require revalidation, and if they do, there
>> will be the overhead that you are avoiding by the server class
>> hierarchy.
>
> Again, this is not correct. You need to validate your config
> parameters. You need to do it somewhere. If you load and parse the
> parameters in the handler, you do it in the handler. If you load and
> parse the parameters in the server class, you do it in the server
> class.
>
>> [Harmeet]
>> Cache the config. Precalculation for handlers - isn't that the only
>> gain in having a server class hierarchy? Maybe that idea can be
>> abstracted and used with the current model.
>
> No, this isn't the only gain in having a server class hierarchy. See
> this email and the previous one for a discussion of gains. They
> include, but aren't limited to, better error handling, clearer
> logging, simpler configuration, and simpler code.
> Moreover, this caching idea you suggest is even worse than the current
> code. Consider the situation. Since you object to distinguishing
> between the server objects, presumably you're going to have the
> "first" handler for the service initialize the configuration, and all
> others will access it. That's because someone has to initialize the
> config. If you've got a generic server with no ability to interpret
> the content of the configuration, it has to be the handler. This is a
> classic server design anti-pattern, because what you've done is
> introduce a mutable (because the handler has to do the initial parsing
> and setting to defaults), shared configuration object. Access to this
> object will require a lock on every call, because the handler has to
> check if the config is initialized and, if not, initialize it. This
> becomes a heavily contended lock. Very expensive; it destroys
> performance.
>
> Now, let's imagine that we admit that the natural place for the
> configuration parsing is in the server class, but we still want to
> stick with this notion of "generic" arguments passed to the handler.
> What's wrong with this? The basic problem is that you don't gain
> anything over the approach I suggest, and you lose type safety. The
> server class still needs to know what type of service it represents in
> order to parse and validate the configuration parameters. In addition,
> if we restrict ourselves to the Avalon Configuration rather than some
> Properties object, we lose even more. Configuration parameters that
> aren't easily represented as Strings become difficult to pass to the
> handler. We have to put some String key in place and give the handler
> the Configuration. It pulls out the String and looks up the non-String
> thing. Ugh.
>
> This is simple?
>
>>>> This leads to tighter coupling between the Handlers and the
>>>> corresponding server classes, but that's simply recognition of a
>>>> pre-existing reality.
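The shared-config anti-pattern described above looks roughly like this
(hypothetical names): because a generic server cannot interpret the
configuration, the first handler must initialize it, and every
subsequent handler must acquire the same lock merely to check whether
initialization has already happened.

```java
// Anti-pattern sketch: a mutable configuration object shared by all
// handlers of a service. Initialization happens exactly once, yet the
// initialized-check forces every connection through the same monitor,
// making this a heavily contended lock under load.
final class SharedConfigHolder {
    private java.util.Properties parsed; // mutable, shared state

    synchronized java.util.Properties get(java.util.Properties raw) {
        if (parsed == null) {          // every connection contends here,
            parsed = validate(raw);    // though init happens only once
        }
        return parsed;
    }

    private java.util.Properties validate(java.util.Properties raw) {
        // defaulting and validation would go here in a real handler
        return raw;
    }
}
```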
>>>> These classes are extremely tightly coupled.
>>>
>>> Server is a listener. Handler contains protocol-specific stuff. Why
>>> would there be a (tight) coupling? It is a listen-dispatch
>>> mechanism. Shouldn't it be as loosely coupled as possible?
>>
>> This is a somewhat naïve view of a service architecture. Servers
>> aren't generic. If they were, we'd need only one server class,
>> period. But servers are different from one another, so we need
>> separate server classes. You can see how the servers differ by
>> examining the diffs for the assorted server classes that were
>> submitted with this patch.
>>
>> [Harmeet]
>> No, I am saying listeners are generic. Handlers/Workers are
>> specific. A mix of the two makes services and servers.
>
> Listeners are not generic. Not even close. They aren't now, they
> can't be for the reasons I outlined in my last email, and there's been
> no compelling reason presented for them to be generic.
>
>> [Harmeet]
>> But a socket listener associated with an SMTP handler makes an SMTP
>> Service. That is the current state.
>> An incremental and better system would be: Socket Listener with
>> Config Object and Handler makes a Service.
>>
>> A bad state would be a server class hierarchy that couples Listener,
>> Config and Handler.
>>
>> -------------------------------
>
> I couldn't disagree more with this statement.
>
> It's a typical expression of extensibility at any cost. The trick to
> writing extensible code is to pick your extension points. The
> Configuration object is generic, so the class itself is not coupled to
> the server or the handler. Of course the specific values stored in
> that Configuration object ARE tightly coupled to the SMTP server.
> They have to be, as they're the values that configure how the SMTP
> server runs.
>
> The Handler and Listener should be tightly coupled. There is no
> reason why an NNTPServer should work with an SMTPHandler.
> That wouldn't make any sense. Even in the proposed solution they
> wouldn't work together - they'd just fail miserably on each connection
> because the SMTPHandler wouldn't have the parameters it needed. The
> handler and listener are tightly coupled precisely because they are
> implementations of a service - a service that has configuration,
> server-specific details, and a protocol.
>
>> [Harmeet]
>> Port, queue size, etc. are the only listener concerns.
>
> Port, queue, etc.? What's hidden in the etc.?
>
> Trivializing the listener concerns like this is just hiding the issue.
> The listener is responsible for socket and connection management, so
> it needs to be able to specify the ConnectionManager and socket type.
> As the ConnectionManager needs a thread group to invoke a server
> connection, it needs a thread group. For certain services there will
> be configuration restriction logic related to the socket types
> (imagine publishing an HTTPS service with TCP/IP sockets). This all
> belongs in the listener. The handler can't deal with any of it,
> because by the time the handler is invoked this needs to have already
> been dealt with. So the listener has to have a very non-trivial
> configuration.
>
>> -------------------------------
>>
>> Second, there are resources provided by the server that depend on the
>> configuration. Yes, it's possible to re-read the configuration on a
>> per-handler-invocation basis. But that's a bad idea for the reasons
>> described above.
>>
>> [Harmeet]
>> That is why I am suggesting a params or configuration object, cached
>> for a class of handlers.
>
> See above for why a param (lose type safety) or configuration object
> (can only hold Stringifiable objects) is a bad idea.
>
>> [Harmeet]
>> Are you saying Avalon is too complicated? Avalon complexity or user
>> friendliness is not a reason to change a fairly standard
>> listener-dispatch model.
>> If there are problems with Phoenix, would it be better to not use it?
>> That is a different topic. Complexity in one part should ideally not
>> be a reason to rearchitect another part.
>
> This is just silly. What's being critiqued here as over-complex is a
> vast simplification of one of the most difficult aspects of server
> design.
>
> Let me be clear - this is one of the best parts of Phoenix. It's
> incredibly well designed and well thought out, and it takes one of the
> most annoying parts of server design (getting all your dependency
> checking right) and makes it trivial.
>
> Avalon in general, and Phoenix as a specific container, is based
> around the idea of component-oriented programming. Each of the
> services we define - the listener classes - comprises a component.
> It's a component because it can be uniquely defined by a name,
> provides one or more interfaces, exists for the life of the container,
> and can be looked up by the other components running in the container.
> These characteristics are all necessary to establish a dependency
> chain and ensure that each component can both declare what it needs to
> function and be evaluated as part of a dependency chain.
>
> The handler instances cannot be components. There are some arbitrary
> number of them, they don't have unique names, they can't be accessed
> by other components, and they have very limited lifetimes. Thus they
> don't meet the minimum requirements.
>
> Avalon is great, and Phoenix is a great container. Chucking it is
> really out of the question. This is not a fault in the Avalon
> framework, but rather a fault in your proposal. Your proposal violates
> basic principles of component-oriented design.
>
>> -------------------------------
>>
>> [Harmeet]
>> You can have different services: POP3 and POP3 Secure. Doesn't that
>> achieve what you are describing? If POP3 is a service, POP3 (plain)
>> and POP3 Secure are 2 instances of it.
>> It depends on what you mean by a service. Effectively, if you want
>> different listeners/thread groups for a protocol, that exists.
>
> No, it doesn't. You are misunderstanding the problem. The ability to
> configure different socket types and different thread groups is not
> present in the current code. Period. The configuration doesn't look
> for it. All it can do is "plain" sockets vs. "ssl". That's it. And
> the thread pool is always the default thread pool. There is no way to
> configure the server to use different thread pools for different
> services.
>
>> -------------------------------
>>
>> As far as the ability to handle multiple socket types, that wasn't
>> there. The socket type was one of two hard-coded strings, and the
>> SSL selection was Boolean. So if you wanted to have two different
>> SSL socket types, one for use by the NNTP server and another for use
>> by the SMTP server (a very common configuration), then you would have
>> to run two separate servers. Very bad.
>>
>> [Harmeet]
>> It is pretty standard. Take a look at Tomcat. They do the same
>> thing.
>
> No, it isn't. Almost nobody writes production servers that handle SSL
> for multiple services without providing the ability to have multiple
> SSL connection types.
>
> This is also a misunderstanding of the standard recommended production
> deployment of Tomcat.
>
> Tomcat is a reference implementation of the Servlet/JSP specs. As
> such, it was an explicit decision to focus on the Servlet/JSP
> processing in Tomcat. The network interface was left uber-simple on
> purpose. The static page delivery wasn't tuned. Why? Because the
> developers wanted to focus on the core functionality.
>
> And that's why the recommended deployment of Tomcat is in a reverse
> proxy environment: Apache in front to handle complex connection
> management and configuration (and it certainly does allow multiple SSL
> connection types) and to handle static page delivery.
> And then Apache proxies Servlet and JSP requests to the Tomcat server.
> That allows the production environment to take advantage of Apache's
> strong connection management.
>
> This is a recognized weakness of the Tomcat implementation. That's
> why there is always a lot of discussion on the Avalon mailing lists
> about embedding Tomcat in Avalon containers, etc. A container that
> implements the Avalon framework would provide all of the complex
> connection management features that the Tomcat implementation lacks
> and allow you to run more complex systems in non-reverse-proxy
> environments.
>
> I don't see any reason to hobble ourselves with an unnecessary flaw.
>
>> [Harmeet]
>> Basically, if n lines have been written out, the observer resets the
>> timer, rather than only after all the lines have been written out.
>
> And this won't work. Why? Because it ignores the distinction between
> observer and observable, as well as the conditions under which we want
> a reset. The purpose of the reset here is to prevent transaction
> hangs, as much from server processing as from client processing. So
> the primary reset trigger is not the amount of data written, but
> rather the trigger that occurs every time a command is processed. A
> command being processed indicates that the transaction is proceeding.
> As this information is not available to the stream (it has no idea
> when a command has completed), the stream cannot be the observable in
> this situation. The stream data is crucial for commands that
> potentially involve large transfers of data, so the handler can't be
> the observable. So the pattern doesn't fit.
>
>> [Harmeet]
>> Observer-observable is a clean and standard pattern. It allows an
>> observer to monitor and decide on timeout reset.
>
> It is a clean and standard pattern. It's just the wrong one for this
> situation. See above.
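The two reset triggers discussed above can be sketched like this
(illustrative names, not the actual James stream classes): the handler
resets the watchdog after each processed command, and a wrapping output
stream resets it as data is written, so a long transfer doesn't trip
the timeout even though the stream never knows when a command completes.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Anything with a resettable timeout. A handler would call reset()
// after each processed command, covering the command-level trigger.
interface TimeoutResettable {
    void reset();
}

// Wraps the connection's output stream so that progress on a large
// data transfer also re-arms the timeout (the stream-level trigger).
// Only the bulk write is instrumented in this sketch.
class WatchdogOutputStream extends FilterOutputStream {
    private final TimeoutResettable watchdog;

    WatchdogOutputStream(OutputStream out, TimeoutResettable watchdog) {
        super(out);
        this.watchdog = watchdog;
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        out.write(b, off, len);
        watchdog.reset(); // data is flowing, so the transaction is alive
    }
}
```

Neither trigger alone suffices: without the stream trigger a large DATA
transfer would time out mid-transfer, and without the command trigger a
hung command that writes nothing would never be detected.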
> --Peter
>
> --
> To unsubscribe, e-mail: <mailto:[EMAIL PROTECTED]>
> For additional commands, e-mail: <mailto:[EMAIL PROTECTED]>
