Hi,
For quite some time I have been working on this home project, trying to parse a very
complex grammar. I had a brain wave yesterday on how to skip the difficult
bits, at least for the moment. If I were to handle the difficult bits now, I
would end up producing a parser for a complete programming language, almost.
Below is an example of the sort of thing I am trying to skip - i.e. the 'where'
statement. Now, because the grammar is such that the 'where' statement, or
statements, immediately precede the 'end_type' keyword, I thought I'd gobble to
'end_type'.
In the example below, at the moment, 'typeid' and 'underlyingtype' eventually
come down to a simple string identifier of the form 'a'..'z' ('a'..'z' | '_')*

    type dayinmonth = integer;
    where
        validrange : {1 <= self <= 31};
    end_type;

The modified grammar for this is...

    typedecl
        : 'type' typeid '=' underlyingtype ';' (options {greedy=false;} : .* )
          'end_type' ';'
        ;

I can't just skip to the next ';' because there may be several statements, e.g.

    where
        Label1 : stuff1;
        Label2 : stuff2;

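To make the intent concrete, here is a rough Python sketch (not ANTLR) of the skipping I have in mind. The tokenizer regex and the names tokenize and skip_to_end_type are made up for illustration. It assumes the wildcard works on tokens rather than raw characters, so every character, including '{' and '}', must first come out of the lexer as some token.

```python
import re

# Hypothetical tokenizer: identifiers/keywords, integers, or any single
# non-whitespace character, so '{', '}' and '|' each become a token.
TOKEN_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*|\d+|\S")


def tokenize(text):
    """Split the text into tokens, dropping whitespace."""
    return TOKEN_RE.findall(text)


def skip_to_end_type(tokens, start):
    """Return (skipped_tokens, index_of_end_type), scanning from start."""
    i = start
    while i < len(tokens) and tokens[i] != "end_type":
        i += 1
    if i == len(tokens):
        raise ValueError("no 'end_type' found")
    return tokens[start:i], i


source = """type dayinmonth = integer;
where
validrange : {1 <= self <= 31};
end_type;"""

tokens = tokenize(source)
# Gobbling starts right after the first ';', which ends the declaration header.
header_end = tokens.index(";") + 1
skipped, end_index = skip_to_end_type(tokens, header_end)
print(skipped)
```

The skip runs over any number of ';'-terminated statements and stops only at 'end_type'. It works here only because the catch-all branch of the regex turns '{' and '}' into tokens.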
What I find is that the '{' and '}' within the gobble process become
significant. In other similar cases I find a '|', or even a carriage return
'\r', is significant. Using the Eclipse add-in, testing just this sub-graph
produces different (although successful in both cases) results depending on
whether the '{' or '}' is surrounded by whitespace or not. Somehow, if it is
surrounded by whitespace, the '{' token disappears from the parse tree. But
when trying to parse the text properly in context, it throws up an error. I
also found that changing the '{' to '(' removed the error.
In the end, I managed to parse a 12000-line file with only this type of error.
This was a long introduction for just a couple of short questions. Are there
significant characters that can affect the gobble process? Do I need other
options to be able to skip everything to 'end_type'?
Thanks.
Dugald Wilson