Apparently I don't know how to use my own tool! Let's try this again,
this time not so rushed on my part :)
action le {}
foo = 'hello' $^le;
main := (
any* |
foo
);
Local error actions are local to the named machine they are in, not the
enclosing (), which is the rushed mistake I made.
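As a rough illustration of what the corrected example is meant to do, here is a hand-written Python model. It is not Ragel output, and the names `run_main` and `le` as a callback are made up for this sketch. It assumes that `$^le` covers every state of foo (including the one reached after the whole word), that the action runs once when the input leaves foo, and that the any* alternative keeps the overall machine running afterwards. Behavior at end of input is not modeled.

```python
def run_main(text, le):
    """Toy model of: foo = 'hello' $^le;  main := ( any* | foo );"""
    word = "hello"
    pos = 0            # how far the foo alternative has matched
    foo_alive = True   # foo is still a live alternative
    for ch in text:
        if foo_alive:
            if pos < len(word) and ch == word[pos]:
                pos += 1
            else:
                # The input left foo: run its local error action once.
                # The any* alternative keeps the machine as a whole going.
                le(ch)
                foo_alive = False
    return True  # any* accepts every input (assumption of this model)


if __name__ == "__main__":
    for sample in ["hello", "help me", "hello!"]:
        seen = []
        run_main(sample, seen.append)
        print(repr(sample), "-> le called with", seen)
```

In this model the action is a notification only: the input is never rejected, which is the permissive behavior asked for earlier in the thread.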
Thanks,
Adrian
On 11-02-02 02:17 PM, Murray Henderson wrote:
Hi Adrian,
Thanks for taking an interest :-).
As far as I can tell,
main = (
('HELLO ' $^parse_error) 'WORLD' |
any*
);
and
main = (
('HELLO ' $!parse_error) 'WORLD' |
any*
);
are equivalent to
main = any*;
Anyway, the real machine I am trying to build currently looks like this:
doctype_single_quoted_value = (
"'" ([^>]*)
>start_token_value
%end_token
:>> "'"
);
doctype_double_quoted_value = (
'"' ([^>]*)
>start_token_value
%end_token
:>> '"'
);
doctype_quoted_value = (doctype_single_quoted_value |
doctype_double_quoted_value);
doctype_name = (
space+ (any - ('>' | space))+
>start_token_doctype_name
%end_token
);
doctype_public = space+ 'PUBLIC' %token_doctype_public space+
doctype_quoted_value;
doctype_system = space+ 'SYSTEM' %token_doctype_system space+
doctype_quoted_value;
doctype = (
'<!DOCTYPE' %token_doctype space* (doctype_name doctype_public?
doctype_system?)? space* '>'
);
This machine looks about right (in the FSM diagram) except that it
doesn't handle malformed doctypes.
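For comparison, the strict grammar above corresponds roughly to the regular expression below, written in Python purely as an illustration. It assumes that `:>>` ends a quoted value at the first matching quote and that Ragel's `space` class is the set of C `isspace` characters. The token actions are left out, and the helper name `is_strict_doctype` is made up for this sketch.

```python
import re

SP = r"[ \t\n\v\f\r]"            # Ragel's `space` class (assumed)
SQ = r"'[^>']*'"                 # doctype_single_quoted_value
DQ = r'"[^>"]*"'                 # doctype_double_quoted_value
QUOTED = rf"(?:{SQ}|{DQ})"       # doctype_quoted_value
NAME = rf"{SP}+[^> \t\n\v\f\r]+" # doctype_name
PUBLIC = rf"{SP}+PUBLIC{SP}+{QUOTED}"
SYSTEM = rf"{SP}+SYSTEM{SP}+{QUOTED}"
DOCTYPE = re.compile(
    rf"<!DOCTYPE{SP}*(?:{NAME}(?:{PUBLIC})?(?:{SYSTEM})?)?{SP}*>"
)


def is_strict_doctype(text):
    """True if the whole string matches the strict doctype grammar."""
    return DOCTYPE.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["<!DOCTYPE html>", "<!DOCTYPE html PUBLIC>"]:
        print(repr(sample), is_strict_doctype(sample))
```

A malformed doctype such as one with `PUBLIC` but no quoted value simply fails to match, which is the gap the rest of this message is about.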
With the $^^ operator I described, I imagine the machine would look
like this (given a parse error action, pe):
doctype = (
'<!DOCTYPE' %token_doctype space* ((doctype_name doctype_public?
doctype_system?) $^^pe)? space*<: ([^>]+>pe)? '>'
);
Additionally, I think I might be able to use that imaginary operator
to make whitespace optional (though with a parse error if the
whitespace is omitted):
eg:
omittable_space = space+>^^pe;
doctype_public = omittable_space 'PUBLIC' %token_doctype_public
omittable_space doctype_quoted_value;
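The intent of omittable_space can be sketched in Python as a helper that skips whitespace and reports a parse error when there is none, instead of failing. This is only a model of the behavior described here, not something Ragel provides. The function signature and the `pe` callback are assumptions for illustration.

```python
SPACE = " \t\n\v\f\r"  # same characters as C isspace (assumed to match Ragel's `space`)


def omittable_space(text, pos, pe):
    """Skip whitespace starting at pos and return the new position.

    If no whitespace is present, call pe(pos) and carry on anyway.
    """
    end = pos
    while end < len(text) and text[end] in SPACE:
        end += 1
    if end == pos:
        pe(pos)  # whitespace was omitted: report it, but do not stop
    return end


if __name__ == "__main__":
    errors = []
    print(omittable_space("  PUBLIC", 0, errors.append), errors)
    print(omittable_space("PUBLIC", 0, errors.append), errors)
```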
I will be using this machine inside multiple scanners, so goto based
error recovery would be a pain. Default actions that transition to the
final state seem like a handy feature for any permissive parser
(although I realize I am doing something extreme).
I am still thinking about attempting to patch Ragel. It is much more
complicated than I thought it would be, but it can't hurt for me to give
it a crack.
Still absolutely nowhere near finished, but my work is progressing slowly ;-).
https://github.com/murrayh/html5rl/blob/master/html5_grammar.rl
Cheers,
Murray
On Tue, Feb 1, 2011 at 5:16 PM, Adrian Thurston<[email protected]> wrote:
Hi, does this do what you want?
main = (
('HELLO ' $^parse_error) 'WORLD' |
any*
);
I'm not sure how that fits into your overall plan. Try it out and we'll
discuss further.
Regards,
Adrian
On 11-01-31 03:50 PM, Murray Henderson wrote:
Hello,
Both local and global error actions transition to the error state. I
am using Ragel 6.5. I can try with 6.6 when I get home.
I made a quick example (based off S. Geist's example):
http://pastebin.com/06ihRxQg
Example output:
HELLO WORLD
read: HELLO WORLD
len: 12, state: 12
HELWORLD
parse error
read: HEL
len: 3, state: 0
Cheers,
Murray
On Tue, Feb 1, 2011 at 10:02 AM, Adrian Thurston <[email protected]> wrote:
Local error actions don't. Sorry I should have suggested just those.
On 11-01-31 02:58 PM, Murray Henderson wrote:
Hello,
Local and global error actions transition to the error state.
I want DEF to transition to the next machine (ie. behave like a final
state), not the error state.
The parser I am writing is permissive: all input must be accepted (I
never want to go to the error state).
I do not wish to use manual goto recovery. The parser is large and
complex, and such manual tracking is a lot of work and error-prone.
Cheers,
Murray
On Tue, Feb 1, 2011 at 4:58 AM, Adrian Thurston
<[email protected]> wrote:
Hi, have you looked at ragel's local and global error actions yet?
These may do what you want.
-Adrian
On 11-01-26 08:08 PM, Murray Henderson wrote:
Hello,
I want to embed a default action into a machine that leaves the
machine (without using a manual jump inside the action).
For simplicity's sake, I will call this operator $^^ (since it is
similar to the Local Error operator).
Example:
action parse_error {}
helloworld = ('HELLO ' %^^parse_error) 'WORLD';
Non-error inputs include:
HELLO WORLD
HELLOWORLD (parse_error action occurs on 'O' -> 'W' transition)
HELLWORLD (parse_error action occurs on 'L' -> 'W' transition)
HELWORLD (parse_error action occurs on 'L' -> 'W' transition)
HEWORLD (parse_error action occurs on 'E' -> 'W' transition)
HWORLD (parse_error action occurs on 'H' -> 'W' transition)
WORLD (parse_error action occurs on -> 'W' transition)
I can simulate the above behavior with the '?' operator, but that is
laborious, and there are other ways of using $^^ that I suspect cannot
be simulated.
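The listed inputs can be reproduced with a small Python model of the proposed behavior. This is only a sketch of the semantics described in the list above, not Ragel code. The function name `match_helloworld` and the `parse_error` callback are made up for illustration.

```python
def match_helloworld(text, parse_error):
    """Model of ('HELLO ' %^^parse_error) 'WORLD' as described in the list above."""
    prefix, suffix = "HELLO ", "WORLD"
    i = 0
    # Follow 'HELLO ' for as long as the input agrees with it.
    while i < len(prefix) and i < len(text) and text[i] == prefix[i]:
        i += 1
    # Leaving 'HELLO ' early: run the default action and carry on
    # as if the sub-machine had reached its final state.
    if i < len(prefix):
        parse_error(i)
    # The rest of the input must still be exactly 'WORLD'.
    return text[i:] == suffix


if __name__ == "__main__":
    for sample in ["HELLO WORLD", "HELLOWORLD", "HELWORLD", "WORLD"]:
        errors = []
        ok = match_helloworld(sample, errors.append)
        print(repr(sample), "accepted:", ok, "parse_error at:", errors)
```

The callback receives the offset at which the input left 'HELLO ', so a caller can report where the omission happened without aborting the match.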
I want this operator because I am trying to make a liberal parser that
accepts all possible input. (Every state must have a default action.)
I am creating an HTML5 parser that uses regular machines for
tokenizing, and scanners built from the regular machines for parsing.
Yes, I am mad.
I cannot use manual jumps, because I don't want to jump out of the
scanners mid-token.
I am willing to try and add this operator into Ragel myself. I have
grabbed the source code and tracked my way to fsmap.cpp, where the new
operator would be added.
Before I continue...
Is there already a way to achieve my desired behavior that I am not
aware of?
Would such an operator be worthwhile? Is it even possible?
Is there any knowledge that could be imparted that would help me make
a patch?
If I do end up making a patch, for symmetry purposes I will make
global/local and start/any/final etc versions of the operator.
After a brief look through the source, it looks like I would need to
mod the FsmAp::fillGaps() function, passing in a (separate object for
each?) final state into the FsmAp::attachNewTrans() instead of NULL.
Ragel is a wonderful program by the way, thank you for creating it.
Cheers,
Murray
_______________________________________________
ragel-users mailing list
[email protected]
http://www.complang.org/mailman/listinfo/ragel-users