Hi,
When I say continuation, I don't necessarily mean a procedure that will be
run asynchronously; I just mean a procedure that can be passed as an
argument, or a 'first class function'. Such things can be run
asynchronously, but that is not a necessity for something to be called a
continuation.
Your example code finishes with calls to delegates. If asynchronous
handling were to be introduced at a more coarse-grained level, that might be
the place to do it, as the events leave one stage of processing and trigger
activity in another. I do say _if_ though; just because we can have
asynchronous continuations and thread pools etc. doesn't mean that it's the
best thing to do. What I'm really trying to point out is that the concept of
continuations is being re-invented many times in the Java code base, so watch
out for where this happens, and it might be worth marking all such cases
with a common interface.
Also, I think the event concept is a re-usable idea that could be
extended further than just this processing stage. This stage takes
individual frames as events, and composes them into coarser-grained segments,
representing method calls and messages and so on. These could still be
modeled as events that are passed into the routing/delivery stages.
Java does not support first class functions; that is, you cannot pass
methods as arguments. However, there is a way to work around this. For
example, if I have a function f that takes an argument x of type X and
returns a y of type Y, in a language that supports first class functions I
might have:

fun f x:X = some computation resulting in a value in Y

and f has type X -> Y.

To encode this in Java, or any OO language for that matter, you can do:
class F<X, Y>
{
    Y compute(X x) { ... }
}
or a variation on this that splits it up a bit might be:

class F<X, Y>
{
    void setArg(X x);
    void compute();
    Y getResult();
}
Then I can write funky stuff, like a map function that applies another
function to every element of a collection, or a 'filterator' that applies a
filter function to an iterator to produce another iterator, and so on. All
the sorts of things you could happily do in Python or Ruby or Erlang, to
write more reusable and concise code.
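To make that concrete, here is a minimal sketch of the encoding. The
single-method interface Fn and the utility class Functional are hypothetical
names of my own, not anything in the Stub code:

```java
import java.util.ArrayList;
import java.util.List;

/** A one-method interface standing in for a first class function X -> Y. */
interface Fn<X, Y> {
    Y compute(X x);
}

/** Hypothetical utility class showing map and filter over the encoding. */
class Functional {
    /** Applies f to every element of xs, collecting the results. */
    static <X, Y> List<Y> map(Fn<X, Y> f, List<X> xs) {
        List<Y> ys = new ArrayList<Y>();
        for (X x : xs) {
            ys.add(f.compute(x));
        }
        return ys;
    }

    /** Keeps only the elements for which the predicate returns true. */
    static <X> List<X> filter(Fn<X, Boolean> p, List<X> xs) {
        List<X> kept = new ArrayList<X>();
        for (X x : xs) {
            if (p.compute(x)) {
                kept.add(x);
            }
        }
        return kept;
    }
}
```

The anonymous-class syntax needed to pass a function in is verbose compared
with Python or Erlang, but the shape is the same.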
So whenever I see an interface in Java that has just one method in it, or
looks suspiciously like a variation of that, I think: aha! that might be a
continuation!
I can spot a couple straight away in your Stub code, Handler and
Delegator, and transitively all the classes that extend or implement them,
which is pretty much the entire thing. If I look at something like Switch, I
can see that it is a continuation, that takes a set of continuations, and an
event, and chains a call down onto one of the set of continuations,
depending on the type of the event, in what is called a 'continuation
passing style'.
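Roughly what I mean, sketched with illustrative stand-ins (the real Handler
and Switch in the Stub code differ in their details):

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative stand-in for the one-method handler interface. */
interface Handler<E> {
    void handle(E event);
}

/** A continuation that holds a set of continuations keyed by event type,
    and chains the call down onto one of them: continuation passing style. */
class TypeSwitch implements Handler<Object> {
    private final Map<Class<?>, Handler<Object>> handlers =
        new HashMap<Class<?>, Handler<Object>>();

    void register(Class<?> type, Handler<Object> handler) {
        handlers.put(type, handler);
    }

    public void handle(Object event) {
        Handler<Object> next = handlers.get(event.getClass());
        if (next != null) {
            next.handle(event); // control flows onward to the chosen continuation
        }
    }
}
```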
Whenever I see the same code pattern being used over and again, I think, I
could put an interface around that, and then write other utility code that
works with that concept in terms of the interface. There are the
continuations that I mentioned already in the code base, but we have not
named their 'compute' methods consistently or put a common interface around
all of them, and there would be some advantage to be gained in doing so. The
names we have used so far are process, processAll, execute, or
processMethod, and these can be found in FailoverHandler, FailoverSupport,
Event, Job, PoolingFilter, BlockingMethodFrameListener, and so on. So you
are introducing two more method names for the same basic concept: handle and
delegate. As an example, I can't remember the exact details, but I think
FailoverProtectedOperation is a continuation that is wrapped in a utility
executor (itself a continuation), FailoverRetrySupport, that repeatedly runs
it until it executes without failing. To my thinking, this retry support is
a piece of general-purpose utility code that can run any continuation until
it succeeds; it does not necessarily have to be presented in terms of
failover, but can be presented as a more abstract concept that might find
re-use in other parts of the code.
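The general-purpose shape I have in mind for that retry support is
something like the following sketch, built on java.util.concurrent.Callable.
The class and method names are mine, not the existing code's:

```java
import java.util.concurrent.Callable;

/** Sketch of a general-purpose retry utility: runs any continuation
    until it succeeds, up to a limit, with no mention of failover. */
class RetrySupport {
    static <V> V runUntilSuccess(Callable<V> continuation, int maxAttempts) throws Exception {
        Exception last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return continuation.call(); // success: hand the result straight back
            } catch (Exception e) {
                last = e; // failure: remember it and go round again
            }
        }
        if (last == null) {
            throw new IllegalArgumentException("maxAttempts must be positive");
        }
        throw last; // gave up: re-throw the last failure
    }
}
```

FailoverProtectedOperation would then just be one particular Callable
passed into it.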
The reason I chose the name 'run' and the interfaces Runnable, Callable
and Future to base my Continuation on, is simply because the existing code
in the Java library makes use of these concepts already, and I would be able
to make use of Executors and so on as a result, rather than because I
thought that that was the best way to represent continuations in Java.
Another reason that a Continuation interface might be a good idea is that
it will mean we explicitly mark all parts of the code where
continuations are used. Which leads me on to the next thing; a warning!
Continuations are badly supported in Java compared with more
dynamic/functional languages. As I already mentioned, Java does not natively
support continuations; you can find ways of doing it, but there are some
serious drawbacks that you need to be aware of.
If I chain a load of functions together in a continuation passing style,
in a functional language the compiler will be smart enough to know that when
it creates a continuation, it can eliminate from its environment (the call
stack) all variables that are not referenced in the continuation's body.
This means that stack frames no longer referenced can be popped off the
stack and reclaimed as the continuation chain progresses. And I bet it
really made someone's head hurt to figure out how to do that! If it can, it
will use tail recursion to avoid creating new stack frames altogether. The
Java compiler and JVM do not do this, so continuation passing always creates
full stack frames, and as the chain progresses the stack, and consequently
everything it references on the heap, continues to build up, even though
most of it is never used again. In Java, continuation passing eats RAM, both
stack and heap.
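A tiny demonstration of the problem (a hypothetical sketch, not Stub code):
chain continuations by having each link call the next, and because the JVM
performs no tail-call elimination, every frame stays live until the whole
chain unwinds, so a deep enough chain blows the stack where a plain loop
would not.

```java
/** Shows that continuation passing builds up JVM stack frames: each link
    in the chain calls the next, and nothing is popped until the whole
    chain completes. */
class ChainDemo {
    interface Cont {
        void run(int remaining);
    }

    static final Cont LINK = new Cont() {
        public void run(int remaining) {
            if (remaining > 0) {
                LINK.run(remaining - 1); // a new stack frame per link in the chain
            }
        }
    };

    /** Returns true if chaining to the given depth blows the JVM stack. */
    static boolean overflows(int depth) {
        try {
            LINK.run(depth);
            return false;
        } catch (StackOverflowError e) {
            return true;
        }
    }
}
```

A shallow chain is fine; a chain of millions of links throws
StackOverflowError, where the equivalent flat loop would run in constant
stack space.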
I don't know if you know, but when Python was implemented in Java as
Jython, there were considerable difficulties in dealing with continuations.
The JVM stack could be used, but tail recursion and environment trimming are
not supported, so continuation passing is severely limited. Alternatively
the stack could be explicitly modelled on the heap, using, for example, a
java.util.Stack, but this would be slow. So Jython is slow, contrasted
with ports to the .Net machine for comparison (some of my old university
professors designed the .Net machine, and they were all functional
programmers, hence its support for doing this).
As an example, I ran your Stub code and put a stack dump in the
SessionDelegate, just before the System.out.println:
Thread.dumpStack();
System.out.println("got a queue declare");
then looked at the call stack when this point is reached. Here it is:
at org.apache.qpid.commlayer.SessionDelegate.queue_declare(Stub.java:524)
at org.apache.qpid.commlayer.SessionDelegate.queue_declare(Stub.java:518)
at org.apache.qpid.commlayer.QueueDeclare_v0_10.delegate(Stub.java:671)
at org.apache.qpid.commlayer.MethodDispatcher.handle(Stub.java:476)
at org.apache.qpid.commlayer.MethodDispatcher.handle(Stub.java:462)
at org.apache.qpid.commlayer.SegmentAssembler.handle(Stub.java:395)
at org.apache.qpid.commlayer.SegmentAssembler.handle(Stub.java:372)
at org.apache.qpid.commlayer.Switch.handle(Stub.java:126)
at org.apache.qpid.commlayer.SessionResolver.handle(Stub.java:419)
at org.apache.qpid.commlayer.SessionResolver.handle(Stub.java:407)
at org.apache.qpid.commlayer.Switch.handle(Stub.java:126)
at org.apache.qpid.commlayer.Channel.handle(Stub.java:174)
at org.apache.qpid.commlayer.Connection.handle(Stub.java:149)
at org.apache.qpid.commlayer.Stub.frame(Stub.java:50)
at org.apache.qpid.commlayer.Stub.frame(Stub.java:34)
at org.apache.qpid.commlayer.Stub.main(Stub.java:59)
Here is a very approximate outline for a piece of code that does this
processing (I missed out a lot, but I think you get the idea), but in a more
conventional flat 'C' coding style:

class Connection
{
    public void handle(Frame frame)
    {
        Channel channel = getChannelForFrame(frame);
        Session session = resolveSessionForFrame(frame);
        Segment segment = null;

        if (frame.isFirst && frame.isLast)
        {
            segment = new Segment(frame);
        }
        else
        {
            // Get null, or the completed segment if this is the last frame of it.
            segment = addFrameToPendingSegments(frame);
        }

        if (segment == null)
        {
            // Processing complete for now; need more frames to complete segment.
            return;
        }

        if (segment.isMethod)
        {
            short methodCode = segment.getMethodCode();

            switch (methodCode)
            {
                case OPEN:
                    ...
                case DECLARE_QUEUE:
                    ...
            }
            ...
        }
    }
}
All the intermediate calls to process the frame will have created stack
frames, with local variables, carried out their work, and then cleaned up
the stack. So the stack and heap will be cleaned up before I call into the
next stage of processing. If the broker has 100,000 messages sitting inside
it, pending routing and delivery, this could make a huge difference.
There is one clever thing that could be achieved by using continuations
with intermediate processing state pending on the stack. Supposing I have a
message to process, and I create an 'event' for every queue that I deliver
that message to, and the processing of each of these events is a
continuation of the routing process. I could write it in such a way that
when each of the delivery continuations completes, the call returns to just
beyond the routing call that created them; at this point the message has
been delivered and can be safely cleaned up. This takes advantage of the
symmetry of stack-based processing (for every push there is a pop) to do
away with the reference counting that we use at present.
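A sketch of what I mean, with illustrative names (none of this is existing
code): the routing call pushes one frame per delivery continuation, each pop
returns control to the router, and clean-up happens exactly once when the
last one returns:

```java
import java.util.List;

/** Stack symmetry: the message outlives all of its delivery continuations
    without any reference counting. */
class Router {
    interface Delivery {
        void deliver(String message, String queue);
    }

    static int cleanedUp = 0; // counts clean-ups, for illustration only

    static void route(String message, List<String> queues, Delivery delivery) {
        for (String queue : queues) {
            delivery.deliver(message, queue); // push a frame for each delivery...
        }                                     // ...and each pop returns control here
        // Every delivery continuation has completed; the message can now be
        // safely cleaned up: one push, one pop, no reference counts.
        cleanedUp++;
    }
}
```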
Now, I'm not saying that your code is 'wrong', quite the contrary; I really
enjoyed reading it, and seeing the clever things that you can do with
continuations; it certainly is a neat way to do things. It may well be that
it is fast enough for our purposes, and the RAM overheads are acceptable.
But it is worth remembering the price you pay for doing clever stuff in
Java, and that Java runs fastest when it looks like straight-line C code. I
think I would put an interface around this 'layer', so that, as always with
optimizations, an optimized version can be written at a later date on an
as-needed basis.
Whilst I'm serious about re-using code across common concepts, I'm not
really too serious about suggesting that you use my ideas in your code
example. I just wanted to point a few things out, keep the flow of ideas
alive, and possibly trigger off any good ideas that anyone else may have.
Rupert
On 18/07/07, Rafael Schloming <[EMAIL PROTECTED]> wrote:
>
> Rupert Smith wrote:
> > At a guess, I'd say you are a python programmer? and missing its more
> > dynamic capabilities...
>
> I guess I'm outed. ;)
>
> > I do wonder if running every event through around 2 dynamic switches,
> > dispatching to handlers looked up in a hashtable, might be a little
> slow?
> > Although, I admire the cleverness and neatness of the solution. It is
> > perfectly possible that it will be fast enough. I might put a little
> timing
> > test around one of those switches and find out just how fast it will
> run
> > compared with a 'switch' statement.
> >
> > Of course, some of your switches dispatch based on a short constant,
> so you
> > could replace the hash table lookups with real 'switch' statements if
> need
> > be.
>
> The Switch class is really just there to keep the stubs concise. As you
> say if necessary both uses could easily be replaced with a manual switch
>
> statement.
>
> That said, I don't believe this would be necessary as currently every
> incoming frame gets routed through AMQStateManager which itself does two
> hashtable lookups to find the eventual handler of the frame. The
> proposed design uses a delegation pattern that accomplishes the same
> thing as AMQStateManager based purely on method dispatch. This makes the
> number of hashtable lookups equal in both designs, with the potential to
>
> optimize it down to zero in the proposed design.
>
> > One idea that springs to mind, looking at this code: Could you make
> events
> > self handling?
> >
> > For example, instead of doing:
> >
> > handler.handle(event);
> >
> > what about:
> >
> > event.setHandler(handler);
> > event.run();
> >
> > or:
> >
> > event.setHandler(handler);
> > executor.execute(event);
> >
> > The only reason I suggest this, is so that events become
> continuations. For
> > example:
> >
> > public abstract class Continuation<V> implements Runnable,
> Callable<V>,
> > Future<V>
> > {
> > /**
> > * Applies the delayed procedure.
> > */
> > public abstract void run();
> >
> > /**
> > * Applies the delayed procedure, or throws an exception if unable
> to do
> > so.
> > *
> > * @return The computed result.
> > *
> > * @throws Exception If unable to compute a result.
> > */
> > public V call() throws Exception
> > {
> > execute();
> >
> > return get();
> > }
> >
> > ...
> >
> > public class Event extends Continuation
> > ...
> >
> > As events are Runnable, they can make use of
> > java.util.concurrent.Executorsto run them. A simple executor to do
> > this immediately is:
> >
> > /**
> > * A simple executor. Runs the task at hand straight away.
> > */
> > class ImmediateExecutor implements Executor
> > {
> > /**
> > * Runs the task straight away.
> > *
> > * @param r The task to run.
> > */
> > public void execute(Runnable r)
> > {
> > r.run();
> > }
> > }
> >
> > This opens up the possibility of writing some utility code based
> around
> > continuations. Some example:
> >
> > Many events could be batched together into a single containing event
> that
> > executes all of its contained events one after the other. Advantage:
> less
> > context switching when running a lot of asynchronous events. See Job
> and
> > Event in the existing code.
> >
> > Writing cancellable/interruptible tasks. For example, when a
> synchronous
> > request needs to be cancelled and re-sent in the event of failover.
> >
> > Events, or batches of events can be handled by thread pools. We can
> start
> > with one single thread pool, to handle all asynchronous events, then
> > consider whether splitting into staged pools might confer any
> advantages.
> >
> > Asynchronous Executors that take account of priority could be written.
> >
> > The concept of continuations has been reinvented several times in the
> > existing code base. It would make sense to refactor and share common
> code.
> > Some examples are: FailoverHandler, FailoverSupport, Event, Job,
> > PoolingFilter, BlockingMethodFrameListener, and I'm sure there are
> more.
>
> It would definitely make sense to consolidate such things into a single
> pattern, however I'm not sure it makes sense to introduce continuations
> at such a low level. My conception of the responsibility of this layer
> is to accept incoming I/O events and aggregate, decode, and translate
> into higher level events that are meaningful to the upper domain layers
> (either client or broker) that use this code.
>
> I would therefore expect the domain layers that use this code to
> determine the threading model and introduce continuations at that point
> if it is appropriate for the given event.
>
> That said I'm not sure I fully understand what you're describing, so
> there may be ways this layer could make it easier for the domain layers
> that use this code to introduce continuations should they wish to.
>
> --Rafael
>
> >
> > Rupert
> >
> >
> > On 18/07/07, Rafael Schloming <[EMAIL PROTECTED] > wrote:
> >>
> >> Here are some stubs I've been working on that describe most of the
> >> communication layer. For those who like dealing directly with code,
> >> please dig in. I will be following up with some UML and a higher
> level
> >> description tomorrow.
> >>
> >> --Rafael
> >>
> >> Arnaud Simon wrote:
> >> > Hi,
> >> >
> >> > I have attached a document describing my view on the new 0-10
> >> > implementation. I would suggest that we first implement a 0.10 client
> >> > that we will test against the 0.10 C++ broker. We will then have a
> >> > chance to discuss all together the Java broker design during our
> Java
> >> > face to face (Rob should organize it in Glasgow later this year).
> >> >
> >> > Basically we have identified three main components:
> >> > - the communication layer that is common to broker and client
> >> > - the Qpid API that is client specific and plugged on the
> communication
> >> > layer
> >> > - The JMS API that comes on top of the Qpid API
> >> >
> >> > The plan is to provide support for 0.8 and 0.10 by first
> distinguishing
> >> > the name spaces. Once the 0.10 client is stable we will then be
> able to
> >> > provide a 0.8 implementation of the Qpid API (based on the existing
> >> code
> >>
> >> > obviously). This will have the advantage to only support a single
> JMS
> >> > implementation.
> >> >
> >> > I will send in another thread the Qpid API as Rajith and I see it
> right
> >> > now. Rafael should send more info about the communication layer.
> >> >
> >> > Regards
> >> >
> >> > Arnaud
> >> >
> >>
> >>
> >
>