On 28.11.2011 22:58, Aaron Meurer wrote:
> The main issue here, as I see it, is not so much the speed, as the
> extensibility.  We want to be able to add in bunches of new syntaxes
> without the parser becoming too unwieldy.

You need to define a syntactic framework for that, i.e. additional syntaxes must follow some general restrictions; otherwise you'll end up with ambiguous syntax.

You can get by with some generalized operator-precedence syntax, for example, but that will disallow stuff like A * -B; you have to parenthesize it as A * (-B). Going more powerful becomes unmanageable quite quickly, unless you have somebody on the team who is an expert with parsing tools.
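To make the A * -B restriction concrete, here is a rough sketch of one such restricted scheme. It is only an illustration under assumptions of my own: the operator table, the tuple-based tree and all function names are hypothetical, and a sign is accepted only at the start of an expression or right after an opening parenthesis.

```python
import re

# Assumed binary operators and their precedences (higher binds tighter).
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}
TOKEN = re.compile(r"\s*([A-Za-z_]\w*|[-+*/()])")


def tokenize(text):
    """Split the input into names, operators and parentheses."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError("unexpected character at %d" % pos)
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse(text):
    """Parse the whole input into a nested tuple tree."""
    tokens = tokenize(text)
    tree, pos = parse_expr(tokens, 0)
    if pos != len(tokens):
        raise SyntaxError("unexpected token %r" % tokens[pos])
    return tree


def parse_expr(tokens, pos):
    # Start of an expression: the only place where a sign is allowed.
    return parse_binary(tokens, pos, 1, True)


def parse_binary(tokens, pos, min_prec, allow_sign):
    if allow_sign and pos < len(tokens) and tokens[pos] == "-":
        operand, pos = parse_operand(tokens, pos + 1)
        left = ("neg", operand)
    else:
        left, pos = parse_operand(tokens, pos)
    # Consume binary operators that bind at least as tightly as min_prec.
    while pos < len(tokens) and PRECEDENCE.get(tokens[pos], 0) >= min_prec:
        op = tokens[pos]
        right, pos = parse_binary(tokens, pos + 1, PRECEDENCE[op] + 1, False)
        left = (op, left, right)
    return left, pos


def parse_operand(tokens, pos):
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[pos]
    if tok == "(":
        tree, pos = parse_expr(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("expected ')'")
        return tree, pos + 1
    if tok in PRECEDENCE or tok == ")":
        # An operator where an operand should be, e.g. the "-" in "A * -B".
        raise SyntaxError("operand expected, got %r" % tok)
    return tok, pos + 1
```

With these rules, parse("A * (-B)") gives a tree, while parse("A * -B") raises SyntaxError because a sign directly after an operator is not an operand.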

So... if you want something modular where you can add syntax on the fly, without nasty surprises, you need to define what kinds of syntaxes you want to allow. I'm not a parsing-tool expert, but I know quite well what you can do if you do not want to become one, so I can advise.

> If you just have a list of
> regular expression search and replaces, this could easily happen.

It depends.
If you let each regex just do a search-and-replace, the later regexes will see the outputs of the earlier ones and start to interact; this will produce extremely fragile parsers that will break at the drop of a hat. If you let each regex replace its match with a token that's guaranteed not to be matched by any regex further down the line, there will be no such interaction. You then need a dictionary that maps each token back to its text, so that the tokens can be expanded again at the end.
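The difference between the two variants can be sketched in a few lines of Python. This is a minimal sketch, not PmWiki's actual code: the two markup rules, the NUL-delimited token format and all names are made up for illustration, and it assumes that the input contains no NUL characters and that no rule is written to match the token format itself.

```python
import re

# Hypothetical rule list: (pattern, function that builds the replacement).
RULES = [
    (re.compile(r"`([^`]*)`"), lambda m: "<code>" + m.group(1) + "</code>"),
    (re.compile(r"\*([^*]+)\*"), lambda m: "<b>" + m.group(1) + "</b>"),
]

# Assumed token format: a number between two NUL characters.
TOKEN = re.compile(r"\x00(\d+)\x00")


def naive_render(text, rules=RULES):
    """Plain chained search-and-replace: later rules see earlier output."""
    for pattern, build in rules:
        text = pattern.sub(build, text)
    return text


def protected_render(text, rules=RULES):
    """Replace each match with an opaque token, expand tokens at the end."""
    if "\x00" in text:
        raise ValueError("input must not contain NUL characters")
    store = {}  # token number -> replacement text

    def protect(replacement):
        key = len(store)
        store[key] = replacement
        return "\x00%d\x00" % key

    for pattern, build in rules:
        text = pattern.sub(lambda m, build=build: protect(build(m)), text)

    # A stored replacement may itself contain tokens, so expand until done.
    while TOKEN.search(text):
        text = TOKEN.sub(lambda m: store[int(m.group(1))], text)
    return text
```

For the input "`a*b*` and *bold*", naive_render lets the second rule rewrite the asterisks inside the code span, while protected_render leaves the code span's content alone.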

I have seen this approach work extremely well in Patrick Michaud's PmWiki. People with little experience in syntax were creating new markup syntax in their plug-ins all the time, and while there were conflicts, they were relatively easy to diagnose and didn't happen very often.

>> Way 2: Use a Perl-compatible regular expression engine. PCRE engines do have
>> support for parsing nested parentheses, see http://www.pcre.org/pcre.txt and
>> search for "recursive back references".
>> That support is limited both in what can be parsed and what you can do with
>> the parse, so it may or may not help in this particular case.

> Is there a way to do PCRE in Python?

I had expected PCRE to be built in already.
But I see that Python has its own engine, with many ideas going from Python to Perl to PCRE (including the (?P<name>...) named-group syntax, which originated in Python - that's some awesomeness there).
Unfortunately, Python's built-in re module does not have the recursive pattern hacks.
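For illustration, here is a small sketch of what the standard re module does offer, plus a plain-Python way to find a matching parenthesis without recursive patterns. The CALL pattern and the matching_paren helper are hypothetical names chosen for this example, not anything from sympy.

```python
import re

# Named groups via the (?P<name>...) syntax; this pattern only handles
# an argument list without nested parentheses.
CALL = re.compile(r"(?P<func>[A-Za-z_]\w*)\((?P<arg>[^()]*)\)")


def split_call(text):
    """Return (function name, argument text) for the first simple call."""
    m = CALL.search(text)
    if m is None:
        return None
    return m.group("func"), m.group("arg")


def matching_paren(text, start):
    """Return the index of the ')' that closes the '(' at text[start]."""
    if text[start] != "(":
        raise ValueError("text[start] must be '('")
    depth = 0
    for i in range(start, len(text)):
        if text[i] == "(":
            depth += 1
        elif text[i] == ")":
            depth -= 1
            if depth == 0:
                return i
    raise ValueError("unbalanced parentheses")
```

The depth counter does the part a recursive pattern would do in PCRE; it is longer than a regex, but it stays within a single engine.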

As regular expressions are error-prone enough as it is, I'd avoid using two engines (the regular Python one and PCRE). Programmers writing regular expressions might confuse which of the two engines is in use, or mix up their specific features and quirks, and end up introducing even more errors into the patterns.

Regards,
Jo
