2009/9/24 Sam Kuper <[email protected]>:
> Antlr is stumping me. Perhaps it's because I'm trying to use it to
> parse a pretty gnarly markup language, but maybe not...
>
> Between the [BEGIN] and [END] markers below is a sample of the markup
> I'm dealing with.
>
> [BEGIN]
> .Extract field for no. 16 can be added as the last line of an entry,
> in the form: *e 16
> FOR NO. 17, ADD 17: NO COMMA. E.G. *e 16 17 or *e 17 (DO NOT USE THE
> *v FIELD OR THE *d FIELD)
>
>
> !!!!!!!!!!!REPEAT, NO COMMA!!!!!!!!!!!!!!
>
>
> *a \\Albert, John\\ (b. 1912/13). Gardener. George Robert Fox's
> gardener at The Vicarage, \circa\ 1941. (Census returns 1841 (Public Record
> Office HO137/916/8).)
> *b expanded by EL
> *c
> *v 2, 3
> *e 18 19
> [END]
>
> I'll use regex notation below to help me describe what the above markup means.
>
> Everything up to the first
>
> ^\*a
>
> is a note and needs to be translated such that single newlines are
> ignored but two or more newlines are translated to a pair of newlines.

Even this seems to be incredibly difficult with ANTLR. Essentially, I
want to "stop on match" where the match is (in regex notation): ^\*a

But even leaving aside the apparent impossibility of recognising
start-of-line in ANTLR, various approaches I've tried, such as this:

grammar name_reg;
options {
        language=Java;
}
name_reg        : notes? entry* EOF;
notes           : (~'*a')*;
entry           : '*a';
ASCII           : ' '..'~';

fail to stop on encountering '*a' and instead just strip the '*' and
put the rest into 'notes'.
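
To make the goal concrete, here is a minimal sketch in plain Java (not ANTLR) of the behaviour I am after. The class and method names (NotesSplitter, split, normalizeNotes) are made up for illustration. It assumes the input uses "\n" line endings, and that "ignoring" a single newline means replacing it with a space so that wrapped lines are joined.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NotesSplitter {
    // "*a" at the start of a line; MULTILINE lets ^ match after each newline.
    private static final Pattern ENTRY_START =
            Pattern.compile("^\\*a", Pattern.MULTILINE);

    // Runs of one or more newlines, handled one run at a time.
    private static final Pattern NEWLINES = Pattern.compile("\n+");

    /**
     * Returns {notes, rest}: everything before the first line-initial "*a",
     * and everything from that "*a" onward (empty if there is no entry).
     */
    static String[] split(String input) {
        Matcher m = ENTRY_START.matcher(input);
        if (!m.find()) {
            return new String[] { input, "" };
        }
        return new String[] {
            input.substring(0, m.start()),
            input.substring(m.start())
        };
    }

    /**
     * Assumption: a single newline becomes a space; a run of two or more
     * newlines becomes exactly two newlines.
     */
    static String normalizeNotes(String notes) {
        Matcher m = NEWLINES.matcher(notes);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = m.group().length() == 1 ? " " : "\n\n";
            m.appendReplacement(out, replacement);
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Calling split on the whole file and then normalizeNotes on the first element gives the "notes" text I want; a "*a" that appears in the middle of a line does not end the notes.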

So please could someone help by telling me how I can make ANTLR
capture everything up to a given sequence of characters?

Many thanks,

Sam
