On Thu, Aug 14, 2008 at 12:14:49AM +0100, Dirk-Willem van Gulik wrote:
<snip>
> The reason I was hoping to have the cake and eat it is that I am  
> inside a REST call - i.e. a nice evhttp handler (which needs to wait  
> for a spread.org sp_receive() to confirm that its decoupled sp_sent()  
> has stuck before confirming a 200 OK or 500 Fail).
> 
> And hence if I tame this with a state-engine - or try to postpone
> sending the reply by using a queue - as I am within the evhttp handler
> - I fear then having to rewrite a fair chunk of what evhttp does so
> nicely.
> 

Oh. I've never used evhttp. But, after the first time I wrote an SMTP server
using libevent (including MIME parser with callbacks as each logical
component--header, boundary, entity, decoded text, etc--was parsed), I
learned the hard way that _only_ providing a callback event interface was a
bad idea. If evhttp suffers from this, it would be something to address in
2.0.

Ideally, the underlying event processing should be accessible procedurally
(i.e. addData, processData, getEvents, no recursion), and then a callback
interface built on top of those. 80% of the time an application just wants to
register a callback for a logical message. But it's the 20% that kills you,
especially in circumstances like these where you have no alternatives, or
horrendously ugly alternatives, because of the tradeoffs already made.
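
As a rough illustration of that layering (not the actual library; every name
below is made up for the example), here is a minimal sketch of a line parser
that exposes add/process/get calls, with a callback wrapper built on top of
them. It assumes lines shorter than a fixed buffer and silently truncates
longer ones.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical line-oriented parser; all names are illustrative only. */
enum { LP_BUFSZ = 256 };

struct lineparser {
        const char *p, *pe;     /* unconsumed input window */
        char line[LP_BUFSZ];    /* line accumulated across calls */
        size_t len;
        int ready;              /* nonzero when a complete line is pending */
};

void lp_init(struct lineparser *lp) {
        memset(lp, 0, sizeof *lp);
}

/* "addData": hand the parser a new input window; nothing is parsed yet. */
void lp_add_data(struct lineparser *lp, const void *src, size_t n) {
        assert(lp != NULL && src != NULL);
        lp->p   = src;
        lp->pe  = lp->p + n;
}

/* "processData": consume input until one line is complete or the window
 * is exhausted. Returns 1 if an event is pending, 0 otherwise. */
int lp_process(struct lineparser *lp) {
        while (!lp->ready && lp->p < lp->pe) {
                char c = *lp->p++;

                if (c == '\n') {
                        if (lp->len > 0 && lp->line[lp->len - 1] == '\r')
                                lp->len--;
                        lp->line[lp->len] = '\0';
                        lp->ready = 1;
                } else if (lp->len < LP_BUFSZ - 1) {
                        lp->line[lp->len++] = c;
                } /* else: overlong line, truncated in this sketch */
        }

        return lp->ready;
}

/* "getEvents": fetch the pending line, or NULL. The returned pointer is
 * valid until the next call to lp_process(). */
const char *lp_get_event(struct lineparser *lp) {
        if (!lp->ready)
                return NULL;

        lp->ready = 0;
        lp->len   = 0;

        return lp->line;
}

/* The callback interface is a thin loop over the procedural one. */
typedef void lp_callback(const char *line, void *arg);

size_t lp_feed(struct lineparser *lp, const void *src, size_t n,
               lp_callback *cb, void *arg) {
        size_t count = 0;

        lp_add_data(lp, src, n);

        while (lp_process(lp)) {
                cb(lp_get_event(lp), arg);
                count++;
        }

        return count;
}
```

An application that only wants per-line callbacks calls lp_feed() from its
read handler. One that needs to stop after an event, as in the case quoted
above, calls lp_process() and lp_get_event() itself and simply stops calling
them until it is ready to continue.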

Last month I wrote, in two weeks, a full-blown RTSP/HTTP parser (including
processing interleaved RTP packets, but no HTTP chunking, plus-or-minus
bugs) using Ragel, a regular expression state machine compiler. The two weeks
include stumbling upon and learning how to use Ragel. The library supports
the add+process+get interface, and the application code marries the library
w/ libevent I/O handlers. Not a bad route to try, if you're inclined. Ragel
is incredibly, incredibly awesome, and a perfect fit for applications using
libevent. It's mindbogglingly trivial to write unbuffered, restartable
protocol parsers.

It's not open source (yet), but here's the relevant Ragel bits for perusal.
This is literally the entire parser (well, one iteration lying around on my
e-mail server), sans the utility routines used for building up the event
message structure. It's restartable. I can feed this thing one byte at a
time (i.e. only a single byte read after every libevent callback).
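
To make "restartable" concrete, here is a hand-written sketch of the same
idea for the interleaved packet framing alone ("$", a channel byte, a
two-byte big-endian length, then the payload). It is not Ragel output and
not the code from my library; the names are invented for the example, and
the payload is only counted rather than stored. The point is that all
control state lives in one integer, so the function can return at any byte
boundary and pick up again on the next call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical states for the sketch; Ragel would generate its own. */
enum { ST_DOLLAR, ST_CHANNEL, ST_LEN_HI, ST_LEN_LO, ST_PAYLOAD, ST_ERROR };

struct framer {
        int cs;                 /* current state: survives between calls */
        unsigned channel;
        size_t plen, ppos;      /* payload length and bytes seen so far */
};

void framer_init(struct framer *f) {
        f->cs       = ST_DOLLAR;
        f->channel  = 0;
        f->plen     = f->ppos = 0;
}

/* Consume up to len bytes. Returns the number of bytes consumed. *done is
 * set to 1 once a whole packet has been seen, -1 on a syntax error, and 0
 * if more input is needed. */
size_t framer_exec(struct framer *f, const unsigned char *src, size_t len,
                   int *done) {
        const unsigned char *p = src, *pe = src + len;

        assert(f != NULL && src != NULL && done != NULL);

        *done = 0;

        if (f->cs == ST_ERROR) {
                *done = -1;
                return 0;
        }

        while (p < pe && !*done) {
                unsigned char c = *p++;

                switch (f->cs) {
                case ST_DOLLAR:
                        if (c == '$') {
                                f->cs = ST_CHANNEL;
                        } else {
                                f->cs = ST_ERROR;
                                *done = -1;
                        }
                        break;
                case ST_CHANNEL:
                        f->channel = c;
                        f->cs = ST_LEN_HI;
                        break;
                case ST_LEN_HI:
                        f->plen = (size_t)c << 8;
                        f->cs = ST_LEN_LO;
                        break;
                case ST_LEN_LO:
                        f->plen |= c;
                        f->ppos = 0;
                        if (f->plen == 0) {
                                f->cs = ST_DOLLAR;
                                *done = 1;
                        } else {
                                f->cs = ST_PAYLOAD;
                        }
                        break;
                case ST_PAYLOAD:
                        if (++f->ppos == f->plen) {
                                f->cs = ST_DOLLAR;
                                *done = 1;
                        }
                        break;
                }
        }

        return (size_t)(p - src);
}
```

Feeding a packet one byte per call and feeding it in a single call leave the
framer in the same state, which is the property that makes this style fit so
well with non-blocking reads.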

I actually parse individual header bodies separately, because parsing
MIME-structured headers introduces other difficulties easier solved
separately (like expanding the input code space so a Ragel parser sees
tagged commented and quoted characters; regular expressions can't parse
recursive structures). But, as you see below, it's entirely possible to
parse header bodies inline as well, as is done for Content-Length, for
instance.
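
A sketch of what I mean by expanding the input code space, under some
simplifying assumptions: comments are parenthesized and may nest, quoted
strings use double quotes, and a backslash escapes the next byte inside
either. A small pre-pass tracks the nesting depth (the part a regular
expression can't do) and widens each byte to a 16-bit code with tag bits, so
a downstream regular parser can tell a commented or quoted character from a
bare one. The names and tag values are invented for the example; this is not
the code from my library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tag bits OR'd onto each input byte. */
enum { TAG_COMMENT = 0x100, TAG_QUOTED = 0x200 };

struct tagger {
        unsigned depth;         /* current comment nesting depth */
        int quoted;             /* inside a quoted string */
        int escaped;            /* previous byte was a backslash */
};

/* Widen len bytes from src into dst, tagging bytes that sit inside a
 * comment or quoted string. State is kept in *t, so input may be split
 * across calls at any byte. Returns the number of codes written. */
size_t tag_bytes(struct tagger *t, const unsigned char *src, size_t len,
                 unsigned short *dst) {
        size_t i;

        assert(t != NULL && src != NULL && dst != NULL);

        for (i = 0; i < len; i++) {
                unsigned char c = src[i];
                unsigned short code = c;

                if (t->escaped) {
                        /* Escaped byte keeps the enclosing context's tag. */
                        t->escaped = 0;
                        code |= (t->quoted)? TAG_QUOTED : TAG_COMMENT;
                } else if (t->quoted) {
                        if (c == '"') {
                                t->quoted = 0;  /* closing quote: untagged */
                        } else {
                                code |= TAG_QUOTED;
                                if (c == '\\')
                                        t->escaped = 1;
                        }
                } else if (t->depth > 0) {
                        code |= TAG_COMMENT;
                        if (c == '(')
                                t->depth++;
                        else if (c == ')')
                                t->depth--;
                        else if (c == '\\')
                                t->escaped = 1;
                } else if (c == '(') {
                        t->depth = 1;
                        code |= TAG_COMMENT;
                } else if (c == '"') {
                        t->quoted = 1;          /* opening quote: untagged */
                }

                dst[i] = code;
        }

        return len;
}
```

The tagged codes are wider than a byte, so a parser consuming them would
need a correspondingly wider alphabet type than the unsigned char used
below.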

To read, start at the bottom--the "main" expression--and work your way up.
Protocol data is parsed by successive calls to rtsp_p_parse, shown at the
end. The only state Ragel needs is a single integer, stored in the parser
structure.

%%{
        machine rtsp_parser;

        alphtype unsigned char;

        access P->fsm.;

        variable p P->fsm.p;
        variable pe P->fsm.pe;
        variable eof P->fsm.eof;

        action string_start {
                if ((error = rtsp_p_newstr(P)))
                        { P->fsm.error = error; goto error; }
        }

        action string_write {
                if ((error = rtsp_p_putchar(P, fc)))
                        { P->fsm.error = error; goto error; }
        }

        action string_end {
                if ((error = rtsp_p_endstr(P)))
                        { P->fsm.error = error; goto error; }
        }

        action number_start {
                rtsp_p_newnum(P);
        }

        action integer_write {
                if ((error = rtsp_p_addint(P, fc - '0')))
                        { P->fsm.error = error; goto error; }
        }

        action float_write {
                if ((error = rtsp_p_addflt(P, fc - '0')))
                        { P->fsm.error = error; goto error; }
        }

        action method_end { P->msg.request.method = P->acc.txt; }
        action resource_end { P->msg.request.resource = P->acc.txt; }
        action protocol_end { *rtsp_p_protocol(P) = P->acc.txt; }
        action version_end { *rtsp_p_version(P) = P->acc.num.i + P->acc.num.f; }
        action status_end { P->msg.response.status = P->acc.num.i; }
        action reason_end { P->msg.response.reason = P->acc.txt; }

        action request_end { P->msg.type = RTSP_P_REQUEST; fpause; }
        action response_end { P->msg.type = RTSP_P_RESPONSE; fpause; }

        action oops { P->fsm.error = RTSP_E_SYNTAX; goto error; }

        crlf            = '\r'? '\n';
        blank           = [ \t];
        text            = any+ - (cntrl | 0x7f); # del

        GET             = /GET/i %{ rtsp_p_tagstr(P, RTSP_M_GET); };
        POST            = /POST/i %{ rtsp_p_tagstr(P, RTSP_M_POST); };
        OPTIONS         = /OPTIONS/i %{ rtsp_p_tagstr(P, RTSP_M_OPTIONS); };
        DESCRIBE        = /DESCRIBE/i %{ rtsp_p_tagstr(P, RTSP_M_DESCRIBE); };
        SETUP           = /SETUP/i %{ rtsp_p_tagstr(P, RTSP_M_SETUP); };
        PLAY            = /PLAY/i %{ rtsp_p_tagstr(P, RTSP_M_PLAY); };
        PAUSE           = /PAUSE/i %{ rtsp_p_tagstr(P, RTSP_M_PAUSE); };
        TEARDOWN        = /TEARDOWN/i %{ rtsp_p_tagstr(P, RTSP_M_TEARDOWN); };

#       methods         = alpha+;
        methods         = GET | POST | OPTIONS | DESCRIBE | SETUP | PLAY
                        | PAUSE | TEARDOWN | alpha+;

        HTTP            = /HTTP/i %{ rtsp_p_tagstr(P, RTSP_P_HTTP); };
        RTSP            = /RTSP/i %{ rtsp_p_tagstr(P, RTSP_P_RTSP); };

#       protocols       = alpha+;
        protocols       = HTTP | RTSP | alpha+;

        method          = methods >string_start $string_write %string_end
                          %method_end;
        resource        = (any - space)+ >string_start @string_write
                          %string_end %resource_end;
        protocol        = protocols >string_start $string_write %string_end
                          %protocol_end;
        version         = digit+ >number_start $integer_write '.' digit+
                          $float_write %version_end;
        status          = digit{3} >number_start $integer_write %status_end;
        reason          = (text - [\r\n]) >string_start $string_write
                          %string_end %reason_end;

        request         = (method blank+ resource blank+ protocol '/' version)
                          %request_end;
        response        = (protocol '/' version blank+ status blank+ <: reason)
                          %response_end;

        start_line      = (request when { P->mode == RTSP_O_SERVER }
                        | response when { P->mode == RTSP_O_CLIENT })
                          blank* crlf;

        content_length  = digit+ >number_start $integer_write
                          %{ P->clen = P->acc.num.i; P->cpos = 0; };

        action header_name_end { P->msg.header.name = P->acc.txt; }
        action header_body_end { P->msg.header.body = P->acc.txt; }
        action header_end { P->msg.type = RTSP_P_HEADER; fpause; }

        Content_Type    = /Content\-Type/i
                          %{ rtsp_p_tagstr(P, RTSP_H_CONTENT_TYPE); };
        Content_Length  = /Content\-Length/i
                          %{ rtsp_p_tagstr(P, RTSP_H_CONTENT_LENGTH); };
        Accept          = /Accept/i %{ rtsp_p_tagstr(P, RTSP_H_ACCEPT); };
        CSeq            = /CSeq/i %{ rtsp_p_tagstr(P, RTSP_H_CSEQ); };
        Transport       = /Transport/i %{ rtsp_p_tagstr(P, RTSP_H_TRANSPORT); };
        Session         = /Session/i %{ rtsp_p_tagstr(P, RTSP_H_SESSION); };
        X_SessionCookie = /X\-SessionCookie/i
                          %{ rtsp_p_tagstr(P, RTSP_H_X_SESSIONCOOKIE); };
        Host            = /Host/i %{ rtsp_p_tagstr(P, RTSP_H_HOST); };

        headers         = Content_Type | Content_Length | Accept | CSeq
                        | Transport | Session | X_SessionCookie | Host
                        | (alpha (alnum | '-' | '_')*);

        header_text     = content_length
                          when { P->msg.header.name.tag == RTSP_H_CONTENT_LENGTH }
                        | (any* - [\r\n]);
        header_cont     = blank header_text :> crlf;
        header_body     = (header_text :> crlf header_cont*) >string_start
                          $string_write %string_end %header_body_end;
        header_name     = headers >string_start $string_write %string_end
                          %header_name_end;

        header          = (header_name blank* ':' blank* <: header_body)
                          %header_end;

        #
        # FIXME: Support simply passing back a series of vectors into input
        # buffers, rather than copying all that data. Needed for chunking
        # support.
        #
        action body_end {
                if (P->cpos >= P->clen) {
                        if ((error = rtsp_p_endstr(P)))
                                { P->fsm.error = error; goto error; }

                        P->msg.type             = RTSP_P_BODY;
                        P->msg.body.data        = (void *)P->acc.txt.str;
                        P->msg.body.length      = P->acc.txt.len;

                        P->cpos         = 0;
                        P->clen         = 0;

                        fpause;
                }
        } # body_end

        body    = (any when { P->cpos++ < P->clen })** >string_start
                  $string_write @body_end;

        action crlf_end {
                if (P->clen == 0) {
                        P->msg.type             = RTSP_P_BODY;
                        P->msg.body.data        = 0;
                        P->msg.body.length      = 0;
                }
        }

        message = start_line header* crlf @crlf_end body;

        #
        # FIXME: We need to parse the Transport: header to get the correct
        # channel identifiers. Strictly speaking, they needn't "MUST" adhere to
        # the even-odd rule.
        #
        action payload_end {
                if (P->ppos >= P->plen) {
                        if ((error = rtsp_p_endstr(P)))
                                { P->fsm.error = error; goto error; }

                        P->msg.type             = RTSP_P_PACKET;
                        P->msg.packet.channel   = P->channel;
                        P->msg.packet.data      = (void *)P->acc.txt.str;
                        P->msg.packet.length    = P->acc.txt.len;

                        fpause;
                }
        }

        payload         = (any when { P->ppos++ < P->plen })** >string_start
                          $string_write @payload_end;

        payload_length  = any{2} >{ P->plen = P->ppos = 0; }
                          ${ P->plen <<= 8; P->plen |= fc; };

        channel         = any ${ P->channel = fc; };

        packet          = "$" channel payload_length payload <: "";

        main            := (message | packet)** $!oops;

}%%

%% write data;

static enum rtsp_errno rtsp_p_exec(struct rtsp_parser *P) {
        enum rtsp_errno error;

        %% write exec;

        return 0;
error:
        P->fsm.cs       = rtsp_parser_error;

        return P->fsm.error;
} /* rtsp_p_exec() */


size_t rtsp_p_parse(struct rtsp_parser *P, struct rtsp_p_event **msg,
                    const void *src, size_t len, enum rtsp_errno *error) {
        rtsp_p_setp(P, src, len);

        P->msg.type     = 0;
        *error          = rtsp_p_exec(P);
        *msg            = (P->msg.type)? &P->msg : 0;

        return (P->fsm.p - (unsigned char *)src);
} /* rtsp_p_parse() */


struct rtsp_parser *rtsp_p_open(enum rtsp_mode mode, enum rtsp_errno *error) {
        struct rtsp_parser *P   = 0;
        struct obstack *obs     = 0;

        if (!(obs = obs_open(RTSP_OBSOPTS)))
                goto syserr;

        if (!(P = obs_malloc(obs, sizeof *P)))
                goto syserr;

        *P              = rtsp_p_initializer;
        P->mode         = mode;
        P->obstack      = obs;

        obs_mark(P->obstack, &P->obsmark);

        %% write init;

        return P;
syserr:
        *error  = errno;

        obs_close(obs);

        return 0;
} /* rtsp_p_open() */


void rtsp_p_reset(struct rtsp_parser *P) {
        obs_reset(P->obstack, &P->obsmark);

        return (void)0;
} /* rtsp_p_reset() */


void rtsp_p_close(struct rtsp_parser *P) {
        if (P == 0)
                return (void)0;

        obs_close(P->obstack);

        return (void)0;
} /* rtsp_p_close() */
