Hi Matthew,

On Fri, 2009-11-20 at 10:46 -0500, Matthew Barnes wrote:
> There may be isolated cases internally to Camel where it can exploit
> parallelism in CPU-intensive tasks with threading or where threads are
> necessary for interacting with synchronous-only libraries, but it should
> be used sparingly and hidden behind a fully asynchronous API.

        So - I'm well up for hiding complexity behind an asynchronous API in
general; that's a great goal. I guess there is also the mail-to-e-d-s
red herring to consider in the mix - that (potentially) adds a layer of
asynchronicity to the equation in the form of remote dbus calls; perhaps
worth considering that in parallel - though it would be cut at a
different place (potentially).

>   It should not be central to the design of the entire mail application,
> as it is currently.  Basically I want the mail front-end in Evolution 3
> to be single threaded, or as close to that as possible.

        Sounds reasonable.

> The first is what I think is a very insightful paper on the inherent
> problems with threads:
> http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-1.pdf

        Sure - I read it carefully many moons ago; and it's good - though
IMHO it over-states the case somewhat, or at least, it seems to me that
sometimes the alternative is worse.

        The problem IMHO comes when there is a multi-step process; eg.

        DNS lookup
        socket connect
        N way ssh handshake 

        none of which the application really cares about; the 'async' way is
to whack all of this, as atomised pieces of code, into some state machine:

        switch (state) {
        case DNS_CONNECTING:
                if (!failed) {
                        state = DNS_CONNECTED;
                        pollstate = IO_OUT | IO_ERR;
                }
                break;
        case DNS_CONNECTED:
                send (lookup_msg);
                pollstate = IO_IN | IO_ERR;
                state = DNS_WAIT_REPLY;
                break;
        case DNS_WAIT_REPLY:
                read (...);
                if (!short_read)
                        state = SOCKET_NEW;
                /* else - stay in this state */
                break;
        case SOCKET_NEW:
        case SOCKET_CONNECTED:
                pollstate = IO_OUT | IO_ERR;
                state = SOCKET_CAN_WRITE;
                break;
        case SOCKET_CAN_WRITE:
                pollstate = IO_IN | IO_ERR;
                state = SOCKET_WAIT_RESULT;
                break;
        case SOCKET_WAIT_RESULT:
                ...
        }

        etc. etc. etc. This is basically what ORBit2 / linc does - although, of
course we get lazy when eg. Windows demands more round-trips, and we
probably do DNS synchronously (that stuff never worked well anyway), and
so on. It is not particularly awful - though some tricks, such as
checking for 'IO_IN' before HUP etc. to avoid losing the end of a
message, are worth not forgetting ;-)

        Of course - as the number of steps in the handshake grows, the scope
for error and confusion grows - never mind the debugging problem: when it
locks up, what went wrong ? :-) how do you even see the state of the
umpteen state machines that are ticking away behind the scenes ?

        Of course - some large 'state' structure is required - replicating an
equivalent thread's stack (but on the heap), and that has to be
lifecycle managed and so on in a similar way to threads I guess - with
some extra function overhead.

        The threaded version with an async callback API has, I guess, the
same initial closure-creation overhead; but then the code is fairly easy
to follow:
        host_addr = do_blocking_lookup (name);
        if (cancelled || !host_addr)
                goto emit_error;
        fd = socket (...);
        connect_error = connect (fd, host_addr, ...);
        if (cancelled || connect_error)
                goto emit_error;
        write_request (fd);
        read_reply (fd);
        emit_success_callback (); /* etc. */

        It is also rather easy to debug - as soon as anything fails, with
'bug-buddy' or other conventional debugging tools it is simple to see who
was causing the blocking / dead-locking, or which synchronous calls were
not responsive.
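
        The shape of that - again only a sketch, with invented helper names,
and using raw pthreads where the real thing would use camel's own
operation machinery - is roughly:

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical task: the blocking sequence runs on a worker thread,
 * and the done callback fires when it has finished. */
typedef struct {
        const char *name;
        int         result;          /* 0 on success */
        void      (*done) (int result);
} FetchTask;

static int
do_blocking_lookup (const char *name)
{
        /* stand-in for a real (blocking) DNS lookup */
        return name != NULL ? 0 : -1;
}

static void *
fetch_thread (void *data)
{
        FetchTask *task = data;

        /* Each step simply blocks; cancellation is a flag checked
         * between steps, exactly as in the listing above. */
        task->result = do_blocking_lookup (task->name);
        /* ... socket / connect / write / read would follow here ... */

        task->done (task->result);   /* emit_success / emit_error */
        return NULL;
}

static void
run_fetch (FetchTask *task)
{
        pthread_t tid;

        pthread_create (&tid, NULL, fetch_thread, task);
        pthread_join (tid, NULL);    /* a real UI would not join, of course */
}
```

        - the whole multi-step sequence reads top to bottom, and a stack
trace of the worker thread tells you exactly where it is stuck.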

        It is also far easier and clearer to re-use code via blocking calls (I
suspect) - than to cobble other sets of states into your state machine.

        And of course, none of this is news to anyone I'm sure. Clearly though
- the simpler the locking, and the closer to clean & simple message
passing - the easier and safer the threading becomes.
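
        By "clean & simple message passing" I mean something of this shape - a
deliberately trivial one-slot queue in raw pthreads, just to illustrate,
not production code:

```c
#include <pthread.h>

/* A minimal one-slot message queue: the only shared state, guarded by
 * one lock - threads interact solely by pushing / popping messages. */
typedef struct {
        pthread_mutex_t lock;
        pthread_cond_t  cond;
        int             has_msg;
        int             msg;
} MsgQueue;

static void
msg_queue_init (MsgQueue *q)
{
        pthread_mutex_init (&q->lock, NULL);
        pthread_cond_init (&q->cond, NULL);
        q->has_msg = 0;
}

static void
msg_queue_push (MsgQueue *q, int msg)
{
        pthread_mutex_lock (&q->lock);
        while (q->has_msg)              /* wait for the slot to drain */
                pthread_cond_wait (&q->cond, &q->lock);
        q->msg = msg;
        q->has_msg = 1;
        pthread_cond_signal (&q->cond);
        pthread_mutex_unlock (&q->lock);
}

static int
msg_queue_pop (MsgQueue *q)
{
        int msg;

        pthread_mutex_lock (&q->lock);
        while (!q->has_msg)             /* wait for something to arrive */
                pthread_cond_wait (&q->cond, &q->lock);
        msg = q->msg;
        q->has_msg = 0;
        pthread_cond_signal (&q->cond);
        pthread_mutex_unlock (&q->lock);
        return msg;
}
```

        With that discipline there is precisely one lock to reason about, and
no shared mutable state wandering between threads.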

        Anyhow - to me at least ( thankfully shielded from the pain that is
suffered by users of camel ) it seems like a big push to re-write
everything as async is unlikely to give substantial reliability wins
(beyond those intrinsic to having a great hacker re-read and test the
code as it is re-written).
        But - since I'm not doing it, I can only write long & silly mails to
try to persuade :-)



 michael.me...@novell.com  <><, Pseudo Engineer, itinerant idiot

Evolution-hackers mailing list
