I'm very new to ANTLR and am investigating how I might use the tool for an
upcoming project.
I need to be able to recognize and parse a language that is similar to X12
syntax in that it contains delimited segments each containing delimited data
elements.
The first segment specifies both the element delimiter and the segment
delimiter used for the rest of the input. The delimiters must differ from one
another and must not occur in the element data, and the segment delimiter can
contain multiple characters.
Below is a very simple test grammar that I want to change so that the element
delimiter (ED, always the first character after 'STA') and the segment
delimiter (SD) are determined at runtime. I suspect I can't do this entirely in
the grammar and may need to subclass/override some core ANTLR classes or maybe
even scan the input buffer.
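To make the "scan the input buffer" idea concrete, here is a minimal sketch in
plain Java that reads the delimiters from the header before any lexing happens.
It is not ANTLR API; the names DelimiterScan, Delimiters and scan are
hypothetical. It also assumes what the test grammar assumes: element data is
only 'A'..'Z', the header carries exactly two elements, and the SD is whatever
run of non-data characters follows the second element.

```java
// Hypothetical helper, not part of ANTLR: pre-scans the header segment.
final class DelimiterScan {

    /** Delimiters announced by the header segment. */
    static final class Delimiters {
        final char element;    // ED: first character after "STA"
        final String segment;  // SD: may be more than one character

        Delimiters(char element, String segment) {
            this.element = element;
            this.segment = segment;
        }
    }

    // Assumption taken from the DATA rule: data is upper-case A..Z only.
    private static boolean isData(char c) {
        return c >= 'A' && c <= 'Z';
    }

    static Delimiters scan(String input) {
        final String tag = "STA";
        if (!input.startsWith(tag) || input.length() == tag.length()) {
            throw new IllegalArgumentException("input must start with STA plus a delimiter");
        }
        char ed = input.charAt(tag.length());
        if (isData(ed)) {
            throw new IllegalArgumentException("element delimiter must not be a data character");
        }
        int pos = tag.length();
        // Assumption: the header is exactly ED DATA ED DATA, as in segment_body.
        for (int i = 0; i < 2; i++) {
            if (pos >= input.length() || input.charAt(pos) != ed) {
                throw new IllegalArgumentException("expected element delimiter at offset " + pos);
            }
            pos++;
            int start = pos;
            while (pos < input.length() && isData(input.charAt(pos))) {
                pos++;
            }
            if (pos == start) {
                throw new IllegalArgumentException("empty header element at offset " + pos);
            }
        }
        // Everything up to the next data character is taken to be the SD.
        int sdStart = pos;
        while (pos < input.length() && !isData(input.charAt(pos))) {
            pos++;
        }
        String sd = input.substring(sdStart, pos);
        if (sd.isEmpty() || sd.indexOf(ed) >= 0) {
            throw new IllegalArgumentException("segment delimiter missing or contains the element delimiter");
        }
        return new Delimiters(ed, sd);
    }
}
```

The result could then be used to rewrite the input to the fixed '*' and newline
that the grammar expects, or be handed to a custom lexer. Which of those fits
ANTLR best is exactly the part I am unsure about.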
I'm not sure where to go from here and haven't yet found anything that appears
useful either in The Definitive ANTLR Reference or via Google. I'd appreciate
any RTFM links I missed if this has already been discussed many times before,
or any pointers on where to look in the source for extending existing ANTLR
behavior.
Thanks, Jon
// Simple.g
grammar Simple;
tokens {
STA = 'STA';
BEG = 'BEG';
END = 'END';
}
transaction : header beg_segment footer;
header : STA segment_body;
beg_segment : BEG segment_body;
footer : END segment_body;
segment_body : ED DATA ED DATA SD;
DATA : 'A'..'Z'+;
ED : '*';
SD : '\r'? '\n';
// test data
STA*HEADER*SEGMENT
BEG*TRANSACTION*HEADER
END*FOOTER*SEGMENT