Hello,

I'm learning to use parsers, trying pyParsing, construct and simpleparse to get a better overview. I know a bit about regular expressions and am fairly used to BNF-like formats such as those used to specify languages. But I have never really used them myself, so the following may be trivial. Below I use a BNF dialect that I think is clear and unambiguous.

format_code     := '+' | '-' | '*' | '#'
I need to specify that a single format_code may be repeated, always the same one. Not that several different ones may appear in a sequence.
format          := (format_code)+
would catch '+-', which is wrong. I want only patterns such as '--', '+++',...
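To make the intended behaviour concrete, here is a minimal sketch using Python's re module rather than any of the parser libraries. It relies on a backreference to say "the same code again". The names FORMAT_CODES, FORMAT_RE and match_format are only illustrative, and whether the parser libraries offer an equivalent construct is exactly what I am asking.

```python
import re

# Illustrative names; the character set mirrors format_code above.
FORMAT_CODES = "+-*#"

# Capture one format code, then allow only that same code to repeat (\1).
FORMAT_RE = re.compile(r"([%s])\1*" % re.escape(FORMAT_CODES))

def match_format(text):
    """Return text if it is a run of one repeated format code, else None."""
    m = FORMAT_RE.fullmatch(text)
    return m.group(0) if m else None
```

With this, match_format('+++') and match_format('--') succeed, while match_format('+-') returns None.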

style_code      := '/' | '!' | '_'
Similar case, but different. I want patterns like:
styled_text     := style plain_text style
where both style instances are identical. As the number of styles may grow (and even be unpredictable: the style_code line will actually be written at runtime from a config file), I don't want to, and anyway can't, spell out every possible kind of styled_text. Even if it were possible, it would be ugly!
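Again as a sketch only, this is the behaviour I am after, expressed with a named backreference in Python's re module. The function build_styled_text_re is a hypothetical helper, and I assume here that plain_text simply means "one or more characters that are not style codes", which the grammar above does not actually say.

```python
import re

def build_styled_text_re(style_codes):
    """Build a pattern for: style plain_text style, with both styles identical.

    style_codes is an iterable of single characters, e.g. read from a
    config file at runtime (assumption: each code is one character).
    """
    char_class = "".join(re.escape(c) for c in style_codes)
    return re.compile(
        r"(?P<style>[%s])"      # opening style code
        r"(?P<text>[^%s]+)"     # plain text: anything but a style code
        r"(?P=style)"           # closing code must equal the opening one
        % (char_class, char_class)
    )

STYLED_TEXT_RE = build_styled_text_re("/!_")
```

Here STYLED_TEXT_RE.fullmatch('/hello/') matches with style '/' and text 'hello', while '/hello!' does not match. Adding a style is just a matter of passing a longer string of codes.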

I would like to specify a "side-condition" for a pattern, meaning that it should only match when a specific token lies next to it. For instance:
A       := A_pattern {X}
X is not part of the pattern, and thus should not be extracted. If X is just "garbage", I can write an enlarged pattern, then drop X later:
A       := A_pattern
A_X     := A X
If X itself is a token, I can write a super-pattern, then extract both items from the combination, and discard the As that occur alone:
X       := X_pattern
A       := A_pattern
A_X     := A X
But what if X is part of another production? For example:
B       := X B_end_pattern
A_X     := A X
I tried it, but I can't get X in both productions. So I catch either B or A_X, according to mysterious priority rules I don't fully understand (in pyParsing it seems to be neither the longest string nor the first pattern written).
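In regular-expression terms, what I describe looks like a lookahead: A matches only if X follows, but X is not consumed and so stays available for B. Below is a rough sketch with Python's re module. The three toy patterns and the tokenize function are invented for the example, and I have not verified how each parser library would express the same thing.

```python
import re

# Toy patterns, invented for illustration only.
A_PATTERN = r"[a-z]+"
X_PATTERN = r":"
B_END_PATTERN = r"\d+"

# A matches only when X follows; (?=...) checks X without consuming it.
A_RE = re.compile(r"(?P<A>%s)(?=%s)" % (A_PATTERN, X_PATTERN))
# B := X B_end_pattern, so it consumes the X that A left in place.
B_RE = re.compile(r"(?P<B>%s%s)" % (X_PATTERN, B_END_PATTERN))

def tokenize(text):
    """Scan text left to right, trying A then B at each position."""
    tokens = []
    pos = 0
    while pos < len(text):
        for name, rx in (("A", A_RE), ("B", B_RE)):
            m = rx.match(text, pos)
            if m:
                tokens.append((name, m.group(name)))
                pos = m.end()
                break
        else:
            pos += 1  # nothing matched here; skip one character
    return tokens
```

For the input 'abc:42 def', this yields A = 'abc' and B = ':42'; the trailing 'def' is not an A because no X follows it.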

Now, precisely, what about priority? I mean ambiguous cases, where a given piece of input can match several patterns. Parsers have tricks, rules, or explicit features to cope with such cases, but, as I understand it, these apply during or after the parsing process, as additional treatment. Is there a way to specify priority in the grammar itself?
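To show the two policies I have in mind, here is a small sketch in plain Python: "first alternative written wins" versus "longest match wins". The alternatives and function names are made up for the example and do not describe how any particular parser library decides.

```python
import re

# Made-up alternatives, listed in the order they would appear in a grammar.
ALTERNATIVES = [
    ("keyword", re.compile(r"if")),
    ("identifier", re.compile(r"[a-z]+")),
]

def first_match(text, pos=0):
    """Ordered choice: the first alternative that matches wins."""
    for name, rx in ALTERNATIVES:
        m = rx.match(text, pos)
        if m:
            return name, m.group(0)
    return None

def longest_match(text, pos=0):
    """Longest match wins; on a tie, the earlier alternative is kept."""
    best = None
    for name, rx in ALTERNATIVES:
        m = rx.match(text, pos)
        if m and (best is None or len(m.group(0)) > len(best[1])):
            best = (name, m.group(0))
    return best
```

On the input 'iffy', first_match returns the keyword 'if' while longest_match returns the identifier 'iffy'. With the first policy, priority is expressed in the grammar itself simply by the order of the alternatives.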

Denis

_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
