Re: [java] 0-10 proposal

Rupert Smith Thu, 19 Jul 2007 04:14:52 -0700

Hi,

When I say continuation, I don't necessarily mean a procedure that will be
run asynchronously, I just mean a procedure that can be passed as an
argument, or a 'first class function'. Such things can be run
asynchronously, but that is not a necessity for something to be called a
continuation.


Your example code finishes with calls to delegates. If asynchronous handling
were to be introduced at a more coarse-grained level, that might be the
place to do it, as the events leave one stage of processing and trigger
activity in another. I do say _if_ though: just because we can have
asynchronous continuations and thread pools etc. doesn't mean that it's the
best thing to do. What I'm really trying to point out is that the concept of
continuations is being re-invented many times in the Java code base, so watch
out for it when it happens, and it might be worth marking all cases where it
does happen with a common interface.

Also, I think the event concept is a re-usable idea that could be extended
further out than just this processing stage. This stage takes individual
frames as events and composes them into larger segments, representing
method calls, messages and so on. These could still be modeled as events
that are passed into the routing/delivery stages.

Java does not support first class functions, that is, you cannot pass
methods as arguments. However, there is a way to work around this. For
example, if I have a function f that takes argument x of type X and returns y
of type Y, in a language that supports first class functions I might have:

fun f x:X = some computation resulting in a value in Y

and f has type X -> Y

To encode this in Java, or any OO language for that matter, you can do:

class F<X, Y>
{
 Y compute(X x) { ... }
}

or a variation on this that splits it up a bit, might be:

interface F<X, Y>
{
 void setArg(X x);
 void compute();
 Y getResult();
}

Then I can write funky stuff, like a map function, that applies another
function to every element of a collection, or a filterator that applies a
filter function to an iterator to produce another iterator and so on. All
the sorts of thing you could happily do in Python or Ruby or Erlang and so
on, to write more reusable, and concise code.
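To make the map idea concrete, here is a minimal sketch. The names Fn and MapSketch are made up for illustration and are not from the Qpid code base; the anonymous inner class is the usual Java stand-in for passing a function as an argument.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical single-method interface standing in for a function of type X -> Y.
interface Fn<X, Y>
{
    Y compute(X x);
}

class MapSketch
{
    // Applies f to every element of the input and returns the results in order.
    static <X, Y> List<Y> map(Fn<X, Y> f, Iterable<X> input)
    {
        List<Y> result = new ArrayList<Y>();
        for (X x : input)
        {
            result.add(f.compute(x));
        }
        return result;
    }

    public static void main(String[] args)
    {
        // The "function" being passed: String -> Integer.
        Fn<String, Integer> length = new Fn<String, Integer>()
        {
            public Integer compute(String s)
            {
                return s.length();
            }
        };

        System.out.println(map(length, Arrays.asList("queue", "exchange")));
    }
}
```

A filtering iterator could be written the same way, with an Fn<X, Boolean> deciding which elements to pass through.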

So whenever I see an interface in Java that has just one method in it, or
looks suspiciously like a variation of that, I think aha! that might be a
continuation!

I can spot a couple straight away in your Stub code, Handler and Delegator,
and transitively all the classes that extend or implement them, which is
pretty much the entire thing. If I look at something like Switch, I can see
that it is a continuation, that takes a set of continuations, and an event,
and chains a call down onto one of the set of continuations, depending on
the type of the event, in what is called a 'continuation passing style'.
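As a rough illustration of that shape, here is a minimal sketch of a type-based switch. It is not the Switch class from the Stub code: EventHandler, TypeSwitch and route are hypothetical names, and dispatching on the exact runtime class through a HashMap is an assumption on my part.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a Handler-style continuation: one method, one event.
interface EventHandler<E>
{
    void handle(E event);
}

// Picks the next continuation by the runtime class of the event and chains the call to it.
class TypeSwitch implements EventHandler<Object>
{
    private final Map<Class<?>, EventHandler<Object>> routes =
        new HashMap<Class<?>, EventHandler<Object>>();

    // Registers the continuation to use for events of exactly this class.
    void route(Class<?> type, EventHandler<Object> next)
    {
        routes.put(type, next);
    }

    public void handle(Object event)
    {
        EventHandler<Object> next = routes.get(event.getClass());
        if (next == null)
        {
            throw new IllegalStateException("no handler for " + event.getClass());
        }

        // The chained call: this frame stays on the stack until next.handle returns.
        next.handle(event);
    }
}
```

Because TypeSwitch is itself an EventHandler, one switch can be registered as a route of another, which is how the chain builds up.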

Whenever I see the same code pattern being used over and again, I think, I
could put an interface around that, and then write other utility code that
works with that concept in terms of the interface. There are the
continuations that I mentioned already in the code base, but we have not
named their 'compute' methods consistently or put a common interface around
all of them, and there would be some advantage to be gained in doing so. The
names we have used so far are process, processAll, execute, or
processMethod, and these can be found in FailoverHandler, FailoverSupport,
Event, Job, PoolingFilter, BlockingMethodFrameListener, and so on. So you
are introducing two more method names for the same basic concept: handle and
delegate. As an example, I can't remember the exact details, but I think
FailoverProtectedOperation is a continuation that is wrapped in a utility
executor (itself a continuation), FailoverRetrySupport, that repeatedly runs
it until it does not fail to execute. To my thinking, this retry support is
a piece of general purpose utility code, that can run any continuation until
it succeeds, and does not necessarily have to be presented in terms of
failover, but can be presented as a more abstract concept that might find
re-use in other parts of the code.
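A minimal sketch of what such a general-purpose retry utility might look like, built on java.util.concurrent.Callable. RetrySketch and the maxAttempts bound are my own assumptions for illustration; this is not the existing failover code, which as far as I remember simply keeps retrying.

```java
import java.util.concurrent.Callable;

class RetrySketch
{
    // Runs the task until it returns without throwing, or until maxAttempts is used up.
    // If every attempt fails, the exception from the last attempt is rethrown.
    static <V> V retry(Callable<V> task, int maxAttempts) throws Exception
    {
        if (maxAttempts < 1)
        {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }

        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            try
            {
                return task.call();
            }
            catch (Exception e)
            {
                // Remember the failure and try again.
                last = e;
            }
        }

        throw last;
    }
}
```

Nothing in it mentions failover, so any Callable can be wrapped this way.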

The reason I chose the name 'run' and the interfaces Runnable, Callable and
Future to base my Continuation on, is simply because the existing code in
the Java library makes use of these concepts already, and I would be able to
make use of Executors and so on as a result, rather than because I thought
that that was the best way to represent continuations in Java.

Another reason that a Continuation interface might be a good idea, is that
it will mean that we explicitly mark all parts of the code where
continuations are used. Which leads me on to the next thing; a warning!
Continuations are badly supported in Java compared with more
dynamic/functional languages. As I already mentioned, Java does not natively
support continuations. You can find ways of doing it, but there are some
serious drawbacks that you need to be aware of.

If I chain a load of functions together in a continuation passing style, in
a functional language the compiler will be smart enough to know that when it
creates a continuation, it can eliminate from its environment (the call
stack), all variables that are not referenced in the continuation's body.
This means that stack frames no longer referenced can be popped off the
stack and reclaimed as the continuation chain progresses. And I bet it
really made someone's head hurt to figure out how to do that! If it can, it
will use tail recursion, to avoid creating new stack frames altogether. The
Java compiler and JVM do not do this, so continuation passing always creates
full stack frames, and as the chain progresses the stack, and consequently
everything it references on the heap, continues to build up, even though
most of it is never used again. In Java, continuation passing eats RAM, both
stack and heap.
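A small sketch that makes the stack growth visible by counting frames at the end of a chain of handlers. StackDepthSketch, Stage and chain are hypothetical names, and it assumes that Throwable.getStackTrace() reports every frame, which the JVMs I know of do in practice although the API does not strictly promise it.

```java
class StackDepthSketch
{
    // Hypothetical stage of processing that reports the stack depth seen at the end of the chain.
    interface Stage
    {
        int handle();
    }

    // Number of frames currently on this thread's stack.
    static int depth()
    {
        return new Throwable().getStackTrace().length;
    }

    // Builds a chain of the given number of stages, each handing on to the next,
    // finishing with a stage that measures the stack depth.
    static Stage chain(int stages)
    {
        Stage current = new Stage()
        {
            public int handle()
            {
                return depth();
            }
        };

        for (int i = 0; i < stages; i++)
        {
            final Stage next = current;
            current = new Stage()
            {
                public int handle()
                {
                    // No tail call elimination: this frame is kept until next returns.
                    return next.handle();
                }
            };
        }

        return current;
    }

    public static void main(String[] args)
    {
        System.out.println("0 stages:  " + chain(0).handle() + " frames");
        System.out.println("10 stages: " + chain(10).handle() + " frames");
    }
}
```

Each extra stage should show up as an extra frame in the second figure, even though none of the intermediate stages has any work left to do.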

I don't know if you know, but when Python was implemented in Java as Jython,
there were considerable difficulties in dealing with continuations. The JVM
stack could be used, but tail recursion and environment trimming are not
supported, so continuation passing is severely limited. Alternatively the
stack could be explicitly modelled on the heap, using for example, a
java.util.Stack, but this would be slow. So Jython is slow, contrasted with
ports to the .Net machine for comparison (some of my old university
professors designed the .Net machine and they were all functional
programmers, hence its support for doing this).

As an example, I ran your Stub code and put a stack dump in the
SessionDelegate, just before the System.out.println:

Thread.dumpStack();
System.out.println("got a queue declare");

then looked at the call stack when this point is reached. Here it is:

   at org.apache.qpid.commlayer.SessionDelegate.queue_declare(Stub.java:524)
   at org.apache.qpid.commlayer.SessionDelegate.queue_declare(Stub.java:518)
   at org.apache.qpid.commlayer.QueueDeclare_v0_10.delegate(Stub.java:671)
   at org.apache.qpid.commlayer.MethodDispatcher.handle(Stub.java:476)
   at org.apache.qpid.commlayer.MethodDispatcher.handle(Stub.java:462)
   at org.apache.qpid.commlayer.SegmentAssembler.handle(Stub.java:395)
   at org.apache.qpid.commlayer.SegmentAssembler.handle(Stub.java:372)
   at org.apache.qpid.commlayer.Switch.handle(Stub.java:126)
   at org.apache.qpid.commlayer.SessionResolver.handle(Stub.java:419)
   at org.apache.qpid.commlayer.SessionResolver.handle(Stub.java:407)
   at org.apache.qpid.commlayer.Switch.handle(Stub.java:126)
   at org.apache.qpid.commlayer.Channel.handle(Stub.java:174)
   at org.apache.qpid.commlayer.Connection.handle(Stub.java:149)
   at org.apache.qpid.commlayer.Stub.frame(Stub.java:50)
   at org.apache.qpid.commlayer.Stub.frame(Stub.java:34)
   at org.apache.qpid.commlayer.Stub.main(Stub.java:59)

Here is a very approximate outline for a piece of code that does this
processing (I missed out a lot, but I think you get the idea), but in a more
conventional flat 'C' coding style:

class Connection
{
 public void handle(Frame frame)
 {
   Channel channel = getChannelForFrame(frame);

   Session session = resolveSessionForFrame(frame);

   Segment segment = null;

   if (frame.isFirst && frame.isLast)
   {
     segment = new Segment(frame);
   }
   else
   {
     // Get null, or the completed segment if this is the last frame of it.
     segment = addFrameToPendingSegments(frame);
   }

   if (segment == null)
   {
      // Processing complete for now, need more frames to complete segment.
      return;
   }

   if (segment.isMethod)
   {
     short methodCode = segment.getMethodCode();

     switch(methodCode)
     {
       case OPEN:
       ...
       case DECLARE_QUEUE:
       ...
     }
   }

   ...
 }
}

All the intermediate calls to process the frame will have created stack
frames, with local variables, carried out their work, and then cleaned up
the stack. So the stack and heap will be cleaned up before I call into the
next stage of processing. If the broker has 100,000 messages sitting inside
it, pending routing and delivery, this could make a huge difference.

There is one clever thing that could be achieved by using continuations with
intermediate processing state pending on the stack. Supposing I have a
message to process, and I create an 'event' for every queue that I deliver
that message to, and the processing of each of these events is a
continuation of the routing process. I could write it in such a way that
when each of the delivery continuations complete, the call returns to just
beyond the routing call that created them, at this point the message has
been delivered and can be safely cleaned up. This takes advantage of the
symmetry of stack based processing, for every push there is a pop, to do
away with the reference counting that we use at present.
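A minimal sketch of that idea, assuming every delivery runs synchronously on the routing thread (if a delivery were handed to another thread, the cleanup point would no longer be safe). RoutingSketch, Delivery and release are hypothetical names, and the log only exists so the ordering can be seen.

```java
import java.util.ArrayList;
import java.util.List;

class RoutingSketch
{
    // Hypothetical delivery continuation: one per destination queue.
    interface Delivery
    {
        void deliver(String message);
    }

    // Records what happened, in order, so the sequence can be inspected.
    final List<String> log = new ArrayList<String>();

    // Delivers the message to every queue, then cleans it up.
    void route(String message, List<Delivery> queues)
    {
        for (Delivery queue : queues)
        {
            queue.deliver(message);
        }

        // Every delivery call has returned by this point, so the message can be
        // released here without counting references.
        release(message);
    }

    void release(String message)
    {
        log.add("released " + message);
    }
}
```

The cleanup sits just beyond the calls that created the deliveries, so the pairing of call and return does the job that the reference count does today.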

Now, I'm not saying that your code is 'wrong', quite the contrary, I really
enjoyed reading it, and seeing the clever things that you can do with
continuations; it certainly is a neat way to do things. It may well be that
it is fast enough for our purposes, and the RAM overheads are acceptable.
But it is worth remembering the price you pay for doing clever stuff in
Java, and that Java runs fastest when it looks like straight line C code. I
think I would put an interface around this 'layer', so that as always with
optimizations, an optimized version can be written at a later date on an
as-needed basis.

Whilst I'm serious about re-using code across common concepts, I'm not
really too serious about suggesting that you use my ideas in your code
example. I just wanted to point a few things out, and keep the flow of ideas
alive and possibly trigger off any good ideas that anyone else may have.

Rupert


On 18/07/07, Rafael Schloming <[EMAIL PROTECTED]> wrote:


Rupert Smith wrote:
> At a guess, I'd say you are a python programmer? and missing its more
> dynamic capabilities...

I guess I'm outed. ;)

> I do wonder if running every event through around 2 dynamic switches,
> dispatching to handlers looked up in a hashtable, might be a little slow?
> Although, I admire the cleverness and neatness of the solution. It is
> perfectly possible that it will be fast enough. I might put a little timing
> test around one of those switches and find out just how fast it will run
> compared with a 'switch' statement.
>
> Of course, some of your switches dispatch based on a short constant, so you
> could replace the hash table lookups with real 'switch' statements if need
> be.

The Switch class is really just there to keep the stubs concise. As you
say if necessary both uses could easily be replaced with a manual switch
statement.

That said, I don't believe this would be necessary as currently every
incoming frame gets routed through AMQStateManager which itself does two
hashtable lookups to find the eventual handler of the frame. The
proposed design uses a delegation pattern that accomplishes the same
thing as AMQStateManager based purely on method dispatch. This makes the
number of hashtable lookups equal in both designs, with the potential to
optimize it down to zero in the proposed design.

> One idea that springs to mind, looking at this code: Could you make events
> self handling?
>
> For example, instead of doing:
>
> handler.handle(event);
>
> what about:
>
> event.setHandler(handler);
> event.run();
>
> or:
>
> event.setHandler(handler);
> executor.execute(event);
>
> The only reason I suggest this, is so that events become continuations. For
> example:
>
> public abstract class Continuation<V> implements Runnable, Callable<V>,
> Future<V>
> {
>    /**
>     * Applies the delayed procedure.
>     */
>    public abstract void run();
>
>    /**
>     * Applies the delayed procedure, or throws an exception if unable to do
>     * so.
>     *
>     * @return The computed result.
>     *
>     * @throws Exception If unable to compute a result.
>     */
>    public V call() throws Exception
>    {
>        execute();
>
>        return get();
>    }
>
>    ...
>
> public class Event extends Continuation
> ...
>
> As events are Runnable, they can make use of
> java.util.concurrent.Executors to run them. A simple executor to do
> this immediately is:
>
>    /**
>     * A simple executor. Runs the task at hand straight away.
>     */
>    class ImmediateExecutor implements Executor
>    {
>        /**
>         * Runs the task straight away.
>         *
>         * @param r The task to run.
>         */
>        public void execute(Runnable r)
>        {
>            r.run();
>        }
>    }
>
> This opens up the possibility of writing some utility code based around
> continuations. Some examples:
>
> Many events could be batched together into a single containing event that
> executes all of its contained events one after the other. Advantage: less
> context switching when running a lot of asynchronous events. See Job and
> Event in the existing code.
>
> Writing cancellable/interruptible tasks. For example, when a synchronous
> request needs to be cancelled and re-sent in the event of failover.
>
> Events, or batches of events can be handled by thread pools. We can start
> with one single thread pool, to handle all asynchronous events, then
> consider whether splitting into staged pools might confer any advantages.
>
> Asynchronous Executors that take account of priority could be written.
>
> The concept of continuations has been reinvented several times in the
> existing code base. It would make sense to refactor and share common code.
> Some examples are: FailoverHandler, FailoverSupport, Event, Job,
> PoolingFilter, BlockingMethodFrameListener, and I'm sure there are more.

It would definitely make sense to consolidate such things into a single
pattern, however I'm not sure it makes sense to introduce continuations
at such a low level. My conception of the responsibility of this layer
is to accept incoming I/O events and aggregate, decode, and translate
into higher level events that are meaningful to the upper domain layers
(either client or broker) that use this code.

I would therefore expect the domain layers that use this code to
determine the threading model and introduce continuations at that point
if it is appropriate for the given event.

That said I'm not sure I fully understand what you're describing, so
there may be ways this layer could make it easier for the domain layers
that use this code to introduce continuations should they wish to.

--Rafael

>
> Rupert
>
>
> On 18/07/07, Rafael Schloming <[EMAIL PROTECTED] > wrote:
>>
>> Here are some stubs I've been working on that describe most of the
>> communication layer. For those who like dealing directly with code,
>> please dig in. I will be following up with some UML and a higher level
>> description tomorrow.
>>
>> --Rafael
>>
>> Arnaud Simon wrote:
>> > Hi,
>> >
>> > I have attached a document describing my view on the new 0-10
>> > implementation. I would suggest that we first implement a 0.10 client
>> > that we will test against the 0.10 C++ broker. We will then have a
>> > chance to discuss all together the Java broker design during our Java
>> > face to face (Rob should organize it in Glasgow later this year).
>> >
>> > Basically we have identified three main components:
>> > - the communication layer that is common to broker and client
>> > - the Qpid API that is client specific and plugged on the communication
>> > layer
>> > - The JMS API that comes on top of the Qpid API
>> >
>> > The plan is to provide support for 0.8 and 0.10 by first distinguishing
>> > the name spaces. Once the 0.10 client is stable we will then be able to
>> > provide a 0.8 implementation of the Qpid API (based on the existing code
>> > obviously). This will have the advantage of only supporting a single JMS
>> > implementation.
>> >
>> > I will send in another thread the Qpid API as Rajith and I see it right
>> > now. Rafael should send more info about the communication layer.
>> >
>> > Regards
>> >
>> > Arnaud
>> >
>>
>>
>
