Thanks, Adrian. I think that technique will make it possible to solve my problem.
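
For the archive, here is roughly how I expect to apply it to my earlier HELLO WORLD example. This is an untested sketch of my understanding, not a confirmed recipe: the machine name hello_prefix is made up for illustration, and parse_error is the empty placeholder action from my first message.

```ragel
# Placeholder action; a real grammar would report the parse error here.
action parse_error {}

# Assumption: because the local error action is embedded inside its own
# named machine, it is local to hello_prefix rather than to main
# (this follows Adrian's explanation quoted below).
hello_prefix = 'HELLO ' $^parse_error;

main := (
    any* |
    hello_prefix 'WORLD'
);
```

The any* alternative is what I am relying on to keep the machine out of the error state on malformed input. I still need to check the FSM diagram to confirm that parse_error fires on the transitions I expect.
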
On Thu, Feb 3, 2011 at 9:32 AM, Adrian Thurston <[email protected]> wrote:
> Apparently I don't know how to use my own tool! Let's try this again, this
> time not so rushed on my part :)
>
> action le {}
> foo = 'hello' $^le;
> main := (
>     any* |
>     foo
> );
>
> Local error actions are local to the named machine they are in, not the
> enclosing (), which is the rushed mistake I made.
>
> Thanks,
> Adrian
>
> On 11-02-02 02:17 PM, Murray Henderson wrote:
>>
>> Hi Adrian,
>>
>> Thanks for taking an interest :-).
>>
>> As far as I can tell,
>>
>> main = (
>>     ('HELLO ' $^parse_error) 'WORLD' |
>>     any*
>> );
>>
>> and
>>
>> main = (
>>     ('HELLO ' $!parse_error) 'WORLD' |
>>     any*
>> );
>>
>> are equivalent to
>>
>> main = any*;
>>
>> Anyway, the real machine I am trying to build currently looks like this:
>>
>> doctype_single_quoted_value = (
>>     "'" ([^>]*)
>>     >start_token_value
>>     %end_token
>>     :>> "'"
>> );
>>
>> doctype_double_quoted_value = (
>>     '"' ([^>]*)
>>     >start_token_value
>>     %end_token
>>     :>> '"'
>> );
>>
>> doctype_quoted_value = (doctype_single_quoted_value |
>>     doctype_double_quoted_value);
>>
>> doctype_name = (
>>     space+ (any - ('>' | space))+
>>     >start_token_doctype_name
>>     %end_token
>> );
>>
>> doctype_public = space+ 'PUBLIC' %token_doctype_public space+
>>     doctype_quoted_value;
>>
>> doctype_system = space+ 'SYSTEM' %token_doctype_system space+
>>     doctype_quoted_value;
>>
>> doctype = (
>>     '<!DOCTYPE' %token_doctype space* (doctype_name doctype_public?
>>     doctype_system?)? space* '>'
>> );
>>
>> This machine looks about right (in the FSM diagram) except that it
>> doesn't handle malformed doctypes.
>>
>> With the $^^ operator I described, I imagine the machine would look
>> like this (given a parse error action, pe):
>>
>> doctype = (
>>     '<!DOCTYPE' %token_doctype space* ((doctype_name doctype_public?
>>     doctype_system?) $^^pe)? space* <: ([^>]+ >pe)?
>>     '>'
>> );
>>
>> Additionally, I think I might be able to use that imaginary operator
>> to make whitespace optional (though with a parse error if the
>> whitespace is omitted):
>>
>> e.g.:
>>
>> omittable_space = space+ >^^pe;
>> doctype_public = omittable_space 'PUBLIC' %token_doctype_public
>>     omittable_space doctype_quoted_value;
>>
>> I will be using this machine inside multiple scanners, so goto-based
>> error recovery would be a pain. Default actions that transition to the
>> final state seem like a handy feature for any permissive parser
>> (although I realize I am doing something extreme).
>>
>> I am still thinking about attempting to patch ragel. Much more
>> complicated than I thought it would be, but can't hurt for me to give
>> it a crack.
>>
>> Still absolutely nowhere near finished, but my work is progressing
>> slowly ;-).
>> https://github.com/murrayh/html5rl/blob/master/html5_grammar.rl
>>
>> Cheers,
>> Murray
>>
>> On Tue, Feb 1, 2011 at 5:16 PM, Adrian Thurston <[email protected]>
>> wrote:
>>>
>>> Hi, does this do what you want?
>>>
>>> main = (
>>>     ('HELLO ' $^parse_error) 'WORLD' |
>>>     any*
>>> );
>>>
>>> I'm not sure how that fits into your overall plan. Try it out and we'll
>>> discuss further.
>>>
>>> Regards,
>>> Adrian
>>>
>>> On 11-01-31 03:50 PM, Murray Henderson wrote:
>>>>
>>>> Hello,
>>>>
>>>> Both local and global error actions transition to the error state. I
>>>> am using Ragel 6.5. I can try with 6.6 when I get home.
>>>>
>>>> I made a quick example (based off S. Geist's example):
>>>>
>>>> http://pastebin.com/06ihRxQg
>>>>
>>>> Example output:
>>>>
>>>> HELLO WORLD
>>>> read: HELLO WORLD
>>>> len: 12, state: 12
>>>> HELWORLD
>>>> parse error
>>>> read: HEL
>>>> len: 3, state: 0
>>>>
>>>> Cheers,
>>>> Murray
>>>>
>>>> On Tue, Feb 1, 2011 at 10:02 AM, Adrian Thurston <[email protected]>
>>>> wrote:
>>>>>
>>>>> Local error actions don't. Sorry I should have suggested just those.
>>>>>
>>>>> On 11-01-31 02:58 PM, Murray Henderson wrote:
>>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> Local and global error actions transition to the error state.
>>>>>>
>>>>>> I want DEF to transition to the next machine (i.e. behave like a
>>>>>> final state), not the error state.
>>>>>>
>>>>>> The parser I am writing is permissive: all input must be accepted (I
>>>>>> never want to go to the error state).
>>>>>>
>>>>>> I do not wish to use manual goto recovery, because the parser is
>>>>>> large and complex; such manual tracking is a lot of work and
>>>>>> error-prone.
>>>>>>
>>>>>> Cheers,
>>>>>> Murray
>>>>>>
>>>>>> On Tue, Feb 1, 2011 at 4:58 AM, Adrian Thurston
>>>>>> <[email protected]> wrote:
>>>>>>>
>>>>>>> Hi, have you looked at ragel's local and global error actions yet?
>>>>>>> These may do what you want.
>>>>>>>
>>>>>>> -Adrian
>>>>>>>
>>>>>>> On 11-01-26 08:08 PM, Murray Henderson wrote:
>>>>>>>>
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> I want to embed a default action into a machine that leaves the
>>>>>>>> machine (without using a manual jump inside the action).
>>>>>>>>
>>>>>>>> For simplicity's sake, I will call this operator $^^ (since it is
>>>>>>>> similar to the Local Error operator).
>>>>>>>>
>>>>>>>> Example:
>>>>>>>>
>>>>>>>> action parse_error {}
>>>>>>>> helloworld = ('HELLO ' %^^parse_error) 'WORLD';
>>>>>>>>
>>>>>>>> Non-error inputs include:
>>>>>>>> HELLO WORLD
>>>>>>>> HELLOWORLD (parse_error action occurs on 'O' -> 'W' transition)
>>>>>>>> HELLWORLD  (parse_error action occurs on 'L' -> 'W' transition)
>>>>>>>> HELWORLD   (parse_error action occurs on 'L' -> 'W' transition)
>>>>>>>> HEWORLD    (parse_error action occurs on 'E' -> 'W' transition)
>>>>>>>> HWORLD     (parse_error action occurs on 'H' -> 'W' transition)
>>>>>>>> WORLD      (parse_error action occurs on -> 'W' transition)
>>>>>>>>
>>>>>>>> I can simulate the above behavior with the '?' operator, but that
>>>>>>>> is laborious, and there are other ways of using $^^ that I suspect
>>>>>>>> cannot be simulated.
>>>>>>>>
>>>>>>>> I want this operator because I am trying to make a liberal parser
>>>>>>>> that accepts all possible input. (Every state must have a default
>>>>>>>> action.) I am creating an HTML5 parser that uses regular machines
>>>>>>>> for tokenizing, and scanners built from the regular machines for
>>>>>>>> parsing. Yes, I am mad.
>>>>>>>>
>>>>>>>> I cannot use manual jumps, because I don't want to jump out of the
>>>>>>>> scanners mid-token.
>>>>>>>>
>>>>>>>> I am willing to try and add this operator into Ragel myself. I have
>>>>>>>> grabbed the source code and tracked my way to fsmap.cpp, where the
>>>>>>>> new operator would be added.
>>>>>>>>
>>>>>>>> Before I continue...
>>>>>>>> Is there already a way to achieve my desired behavior that I am not
>>>>>>>> aware of?
>>>>>>>> Would such an operator be worthwhile? Is it even possible?
>>>>>>>> Is there any knowledge that could be imparted that would help me
>>>>>>>> make a patch?
>>>>>>>>
>>>>>>>> If I do end up making a patch, for symmetry purposes I will make
>>>>>>>> global/local and start/any/final etc. versions of the operator.
>>>>>>>>
>>>>>>>> After a brief look through the source, it looks like I would need to
>>>>>>>> mod the FsmAp::fillGaps() function, passing in a (separate object
>>>>>>>> for each?) final state into the FsmAp::attachNewTrans() instead of
>>>>>>>> NULL.
>>>>>>>>
>>>>>>>> Ragel is a wonderful program by the way, thank you for creating it.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Murray
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> ragel-users mailing list
>>>>>>>> [email protected]
>>>>>>>> http://www.complang.org/mailman/listinfo/ragel-users
