Hi,

On Monday 08 June 2009 20:11:02 Nathan Davis wrote:
> I recently came across the PyCon 2009 video of the Kamaelia presentation.
> I'm not too familiar with either Twisted or Kamaelia.  Kamaelia seems to be
> an "improvement" over twisted, but I just want to see if my analysis is
> correct.

As some background to Kamaelia, about 7 years ago, I used to work with a 
company producing large scale internet software (Inktomi), and had worked in 
that field for about 5 years or so. That software was proprietary and built 
using a reactor based model essentially. Also, that software was written in 
C++, which adds an extra layer of complexity on top, but for obvious reasons.

Whilst I was there, due to large scale real world deployments being the best 
test of any network system, I saw many real scenarios which made it clear 
that whilst you can get a really good developer to come along and work with 
that model, dealing with subtleties can be much much harder than people would 
anticipate.

Also, you had this issue, which I suspect occurs in the vast majority of 
companies:
   * Company decides it wants to work on Project X
   * Company assigns "best" developers[1] to work on Project X
       [1] Best is a social metric in most companies, driven as much by
            politics as anything else after all.
   * Project X is delivered, Company decides to work on Project Y, "best"
      developers reassigned.
   * Developers picking up the pieces of Project X are left with an incomplete
      understanding of the code, because they didn't write it. (Code being an
      expression of thought of a problem solution that lacks the higher
      overview)
   * Code remains mainly in this maintenance phase. The simple fact that a
      reactor model can be complex to work with can mean accretion of
      misunderstandings and pain. This happens in any model, unless you focus
      on trying to make maintenance easier.

      However, the reality is this issue affects any concurrent system, unless
      you try to make it hard to make certain classes of mistakes. ("Oh, I
      have to lock that data structure before I use it? I have to set that
      flag? Even for reading I need a lock? I didn't realise that?")

The key aspect is that many reactor based systems effectively boil down to
being:
    * Do this, and if this happens do that later or do that later.

Which is of course the model you have with state machines. Some reactor models
deal with this better than others, but fundamentally, state machines represent
a model of thinking that captures only the low level details of the thinking,
not the high level.

For the reactor model of programming, I view twisted as best of breed, having 
seen a fair few (proprietary and open). The fact it's in python also helps 
because it eliminates certain classes of pain. (Aside from anything else, by 
trying to make some aspects of the higher level clearer/more obvious)

However, having seen real world pain caused by this model, and the fact
you do have to have a certain level of ability to use it, I hypothesised that
you could take a different approach to make it simpler.

I've also had a long term view that programming in general will be able to 
attack harder problems if we have better tools. 

> It seems to me that the major "contribution" of twisted is that it lets you
> do I/O asynchronously.  More specifically, it provides a structured way to
> be notified of asynchronous I/O event via callbacks.  

This is true, though I think it's easy to underplay what that gives you. The
reason it matters is that with network systems where you wish to deal with
large numbers of users, avoiding the context switching that comes with using
threads has been considered a good idea for a long while. (It's why
stackless's tasklets are popular, and why concurrency rich languages tend to
implement lightweight green threads as well as using OS threads & processes).
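
As a rough illustration of "being notified of I/O events via callbacks", here
is a minimal sketch using only the Python 3 standard library (selectors and
socket). It is not Twisted's API; the callback name on_readable and the
single-turn loop are assumptions made purely for illustration.

```python
import selectors
import socket

received = []


def on_readable(conn):
    """Hypothetical callback: called when `conn` has data waiting."""
    received.append(conn.recv(1024))


selector = selectors.DefaultSelector()
left, right = socket.socketpair()
right.setblocking(False)

# Register interest in "readable" events and attach the callback as data.
selector.register(right, selectors.EVENT_READ, data=on_readable)

left.sendall(b"hello")

# One turn of a reactor-style loop: wait for events, dispatch callbacks.
for key, _events in selector.select(timeout=1):
    key.data(key.fileobj)

selector.unregister(right)
selector.close()
left.close()
right.close()
```

A single thread can watch many sockets this way, which is where the saving
over one-thread-per-connection comes from.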

> Kamaelia builds on the ideas of twisted, but:
> 1.  It uses (a somewhat limited form of) coroutines instead of events.  In
> this way, it's similar to Stackless or Greenlets except it uses standard
> Python.  The net result is an event-driven program that looks and feels
> like it's threaded (at least a little bit).

Correct. The recognition is that a limited coroutine in the form of a
generator *is* a state machine, in the same way as a collection of functions
that call each other as deferreds is a state machine. This link can be found
in the (naive) C++ "mini-axon" version of Kamaelia's core. (That demo code
creates macros which allow you to create C++ classes which look like
python generators, but when passed through cpp become state machines).
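
To make the "generator is a state machine" point concrete, here is a small
sketch in Python 3 syntax. The names countdown_gen, CountdownMachine and
drain are made up for this example and are not taken from Kamaelia or the
C++ mini-axon code; the only claim is that both forms produce the same
sequence of values.

```python
def countdown_gen(n):
    """Generator form: the 'state' is just where execution is paused."""
    while n > 0:
        yield n
        n -= 1
    yield "done"


class CountdownMachine:
    """Hand-written state machine doing the same job, one step per call."""

    def __init__(self, n):
        self.n = n
        self.state = "COUNTING"

    def step(self):
        if self.state == "COUNTING":
            if self.n > 0:
                value = self.n
                self.n -= 1
                return value
            self.state = "FINISHED"
            return "done"
        # Already finished: signal exhaustion the way an iterator would.
        raise StopIteration


def drain(machine):
    """Call step() until the machine reports that it has finished."""
    results = []
    while True:
        try:
            results.append(machine.step())
        except StopIteration:
            return results


gen_results = list(countdown_gen(3))
machine_results = drain(CountdownMachine(3))
```

The generator keeps the control flow in one readable place, while the class
spreads the same logic across explicit state variables.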

The choice was taken to specifically use standard python, because the research
question was "can we make concurrency easier for the vast bulk of us to work
with?" (having built multiple systems, I think we've now answered that to our
satisfaction: "yes").


> 2.  It has a strong component model. Anyway, Kamaelia makes it relatively
> easy to create components and connect them together.

Yes, this is a core aim in Kamaelia. Kamaelia's component model is based on 3 
different approaches to componentisation:
    * Unix pipelines
    * Electronic systems & hardware description languages
    * Occam (something I played with many moons ago)

Specifically all 3 of these make it simple (or simpler) to manage concurrency, 
and all tend to end up with a focus on "reusable chunks of code" that happen 
to communicate over "named things", and don't necessarily know who the 
recipient of a message is. 

This is of course what enables ls, cat, sed, awk, etc to remain composable in
interesting ways 30 years after they were made, with little change, since
this approach allows testing in complete isolation, as well as unit &
integration testing.
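
As a sketch of what "reusable chunks of code communicating over named
things" can look like, here is a toy version in Python 3 syntax. This is NOT
Kamaelia's real API: the Component class, its method names, the box names
and run_pipeline are all simplified, hypothetical stand-ins written for this
illustration.

```python
from collections import deque


class Component:
    """Hypothetical stand-in: talks only to its own named boxes."""

    def __init__(self):
        self.inboxes = {"inbox": deque()}
        self.outboxes = {"outbox": deque()}

    def send(self, value, boxname="outbox"):
        self.outboxes[boxname].append(value)

    def recv(self, boxname="inbox"):
        return self.inboxes[boxname].popleft()

    def data_ready(self, boxname="inbox"):
        return bool(self.inboxes[boxname])


class Producer(Component):
    def __init__(self, items):
        super().__init__()
        self.items = items

    def main(self):
        for item in self.items:
            self.send(item)
            yield  # hand control back to the scheduler


class Upper(Component):
    def main(self):
        while True:
            while self.data_ready():
                self.send(self.recv().upper())
            yield


class Collector(Component):
    def __init__(self):
        super().__init__()
        self.seen = []

    def main(self):
        while True:
            while self.data_ready():
                self.seen.append(self.recv())
            yield


def run_pipeline(components, steps):
    """Round-robin the generators, then move outbox data to the next inbox."""
    tasks = [c.main() for c in components]
    for _ in range(steps):
        for task in tasks:
            next(task, None)  # a finished generator is simply skipped
        for src, dst in zip(components, components[1:]):
            while src.outboxes["outbox"]:
                dst.inboxes["inbox"].append(src.outboxes["outbox"].popleft())


sink = Collector()
run_pipeline([Producer(["ls", "cat", "sed"]), Upper(), sink], steps=6)
```

None of the three components knows who it is connected to, so each can be
tested by filling its inbox and inspecting its outbox.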

The hypothesis here of course is this:
   * Many reactor based systems tend to end up having "a chunk of code
      controlling file reading", "a chunk of code handling select", "a chunk of
      code handling new connections", "a chunk of code watching the GUI", which
      communicate through buffers that are not always explicit, and which can
      know about each other.
   * Recognising that generators can be viewed as equivalent to a collection
      of deferreds, and choosing to make those buffers explicit...
   * ... the hypothesis is that you could make something that could
      *potentially* be as efficient, but be a model that is more amenable
      to maintenance and development by a wider set of people.
      (I believe in making my life easier :-)

Having tested this model now for a while and built a number of different
systems with it, specifically with the involvement of (relatively) novice
developers as well as more experienced developers in programmes like Google
Summer of Code, I believe this hypothesis to be relatively proven, so
the current project focus is really on:
   * Consolidation of core systems, components and applications
   * Optimisation
   * Clean up

And just use, rather than research. (It really wasn't clear if it would be 
a "more" accessible model when we started)

> This is something lacking in twisted?

I don't believe so. Twisted has evolved over time and does have a component 
model. However, if you look in the twisted book, you won't find an explicit 
reference to the model - the API docs for it are here:

 http://twistedmatrix.com/documents/8.2.0/api/twisted.python.components.html

Specifically they build on Zope3's component model, which is a different sort 
of component model. (Once you know this you'll find uses of the component 
model in the book with classes named I<Something> )

From a software/non-concurrent perspective, theirs is the more traditional
one; Kamaelia's is closer to the traditional Unix model.

(The two models aren't mutually exclusive by the way)

> So, is this analysis generally correct?  Kamaelia is similar to twisted in
> that it is an asynchronous, event-driven framework at its core; but it
> hides the details a little better (and provides a more formal component
> model)?

Yes, I believe your analysis is correct.

Regarding "better" or "worse" I view that as a largely subjective term and 
prefer "this works better for me" over any other judgement. I do notice we 
tend to get more people saying this than not though :-)

As Gloria mentioned in her talk though, the reaction "it can't be that
simple" comes up, which is why the mini-axon tutorial exists. Also, having
done some recent tests, it should be possible to make some rather
substantial optimisations to Kamaelia's core (Axon), without changes
to Kamaelia applications.

My personal way of viewing things as a result is that twisted was very
clearly designed with performance in mind from the outset, in a way that
matches the original developer's thinking model, with tools then added to
make it easier to work with.

Kamaelia's model is more based on enabling maintenance of code, to make it 
easier to pick up something you've never seen before and to have a high level 
roadmap to the code, and then follow through it to understand it, so that you 
understand the impact of changes you make.

For example, whilst it's now more complex, the core of this code:

http://code.google.com/p/kamaelia/source/browse/trunk/Code/Python/Kamaelia/Kamaelia/Experimental/PythonInterpreter.py#132

Should be still recognisable as being equivalent to this code:

    import sys, traceback
    
    def run_user_code(envdir):
        source = raw_input(">>> ")
        try:
            exec source in envdir
        except:
            print "Exception in user code:"
            print '-'*60
            traceback.print_exc(file=sys.stdout)
            print '-'*60
    
    envdir = {}
    while 1:
        run_user_code(envdir)

And these examples:
http://code.google.com/p/kamaelia/source/browse/trunk/Code/Python/Kamaelia/Examples/PythonInterpreter

such as this:

    Pipeline(
        Textbox(size = (800, 300), position = (100,380)),
        InterpreterTransformer(),
        TextDisplayer(size = (800, 300), position = (100,40)),
    ).run()

Give a clear idea of how the code is expected to be used - something doable 
through the explicit nature of the decoupling and linkage rather than 
implicit.

Having this formal component model therefore makes composition of interesting 
systems simpler, and more explicit. Using python generators as our core unit 
of concurrency and core component type encourages generally simpler/smaller 
components which in turn lend themselves to reuse.

In retrospect, I think full blown co-routines would actually have hampered
the project in the first place, because they would have encouraged the use of
larger components, which would perhaps have led to less reuse.

There is one other difference between Twisted and Kamaelia though. Twisted's
core focus has generally been network systems. Kamaelia's original use case
was network systems, but its core focus is on general systems to be
implemented concurrently.

This means tools relating to transcoding TV or user uploaded images & videos 
are as valid/appropriate tasks for Kamaelia as a greylisting mail server, an 
IRC bot, a collaborative whiteboard, or a tool based on gesture recognition & 
speech synthesis designed for teaching a child to read and write, along with 
database modelling tools. (all things built with Kamaelia)

At the end of the day though, they're both just programming models. They 
aren't mutually exclusive, and picking the right model for the right job is 
more important than any other consideration :-)

Regards,


Michael.
-- 
http://yeoldeclue.com/blog
http://twitter.com/kamaelian
http://www.kamaelia.org/Home
