On 28.11.2011 at 22:58, Aaron Meurer wrote:
> The main issue here, as I see it, is not so much the speed, as the
> extensibility. We want to be able to add in bunches of new syntaxes
> without the parser becoming too unwieldy.
You need to define a syntactic framework for that. That is, additional
syntaxes must follow some general restrictions, or else you'll end up
with ambiguous syntax.
You can get by with some generalized operator-precedence syntax, for
example, but that will disallow things like A * -B; you would have to
parenthesize it as A * (-B).
Anything more powerful becomes unmanageable quickly, unless you have
somebody on the team who is an expert with parsing tools.
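To make the A * -B restriction concrete, here is a minimal sketch in
Python. It uses precedence climbing as a stand-in for a generalized
operator-precedence scheme. The precedence table, the tuple-based tree
and all function names are assumptions made for illustration, and the
rule that a prefix minus is only accepted at the start of an expression
is built in deliberately to show the effect.

```python
import re

# Hypothetical precedence table; operators and levels are assumptions.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}
TOKEN = re.compile(r"\s*([A-Za-z_]\w*|\d+|[-+*/()])")


def tokenize(text):
    """Split the input into names, numbers, operators and parentheses."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError("unexpected character at %d" % pos)
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse_operand(tokens, at_start):
    if not tokens:
        raise SyntaxError("operand expected")
    tok, rest = tokens[0], tokens[1:]
    if tok == "-" and at_start:
        # Prefix minus is only accepted at the start of an expression.
        operand, rest = parse_operand(rest, at_start=False)
        return ("neg", operand), rest
    if tok == "(":
        tree, rest = parse_expr(rest, 0, at_start=True)
        if not rest or rest[0] != ")":
            raise SyntaxError("')' expected")
        return tree, rest[1:]
    if tok in PRECEDENCE or tok == ")":
        raise SyntaxError("operand expected, got %r" % tok)
    return tok, rest


def parse_expr(tokens, min_prec, at_start=False):
    """Precedence climbing over left-associative binary operators."""
    left, rest = parse_operand(tokens, at_start)
    while rest and rest[0] in PRECEDENCE and PRECEDENCE[rest[0]] >= min_prec:
        op = rest[0]
        right, rest = parse_expr(rest[1:], PRECEDENCE[op] + 1)
        left = (op, left, right)
    return left, rest


def parse(text):
    tree, rest = parse_expr(tokenize(text), 0, at_start=True)
    if rest:
        raise SyntaxError("unexpected token %r" % rest[0])
    return tree
```

With this sketch, parse("A * (-B)") yields ('*', 'A', ('neg', 'B')),
while parse("A * -B") raises SyntaxError, because an operator directly
after another operator is never read as a prefix. New binary operators
can be added by extending the table alone, which is the kind of
restricted extensibility meant here.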
So... if you want something modular where you can add syntax on the fly,
without nasty surprises, you need to define what kinds of syntaxes you
want to allow.
I'm not a parsing-tool expert, but I know quite well what you can do if
you do not want to become one, so I can advise.
> If you just have a list of
> regular expression search and replaces, this could easily happen.
This depends.
If you let each regex just do a search-and-replace, the later regexes
will see the outputs of the former ones and start to interact; this will
produce extremely fragile parsers that will break at the drop of a hat.
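A small Python sketch of that interaction, under stated assumptions:
the two wiki-style rules and the function name naive are made up for
illustration and are not taken from any real markup engine.

```python
import re

# Hypothetical rules, applied one after another to the same text.
RULES = [
    # [[Page]] becomes a link
    (re.compile(r"\[\[(\w+)\]\]"), r'<a href="/wiki/\1">\1</a>'),
    # /word/ becomes italics
    (re.compile(r"/(\w+)/"), r"<i>\1</i>"),
]


def naive(text):
    """Plain chained search-and-replace: later rules see earlier output."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text
```

Here naive("see [[Parser]]") returns
'see <a href="<i>wiki</i>Parser">Parser</a>': the italics rule has
matched the "/wiki/" that the link rule had just generated, although
the input contained no italics markup at all.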
If you let each regex replace with a token that's guaranteed not to be
matched by any regex further down the line, there will be no such
interaction. However, you need to set up a dictionary so that the tokens
can be replaced with their texts again.
I have seen this approach work extremely well in Patrick Michaud's
PmWiki. People with little experience in syntax were creating new markup
syntax in their plug-ins all the time, and while there were conflicts,
they were relatively easy to diagnose and didn't happen very often.
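The token approach can be sketched in Python roughly as follows. This
is not PmWiki's actual code; the rules, the NUL-delimited token format
and the name render are assumptions for illustration. The sketch also
assumes that no rule pattern can match a NUL character and that the
input contains none.

```python
import re

# Hypothetical rules: (pattern, function that builds the final text).
RULES = [
    (re.compile(r"\[\[(\w+)\]\]"),
     lambda m: '<a href="/wiki/%s">%s</a>' % (m.group(1), m.group(1))),
    (re.compile(r"/(\w+)/"),
     lambda m: "<i>%s</i>" % m.group(1)),
]

# Tokens look like NUL + number + NUL (an assumed format).
TOKEN_RE = re.compile(r"\x00\d+\x00")


def render(text):
    if "\x00" in text:
        raise ValueError("input must not contain NUL characters")
    saved = {}  # token -> text it stands for

    def protect(build):
        def replace(match):
            token = "\x00%d\x00" % len(saved)
            saved[token] = build(match)
            return token
        return replace

    # Each rule leaves only opaque tokens behind, so later rules
    # never see the output of earlier ones.
    for pattern, build in RULES:
        text = pattern.sub(protect(build), text)

    # Finally, put the saved texts back in place of their tokens.
    return TOKEN_RE.sub(lambda m: saved[m.group(0)], text)
```

With these rules, render("see [[Parser]] in /italics/") returns
'see <a href="/wiki/Parser">Parser</a> in <i>italics</i>'. The "/wiki/"
inside the generated link is never offered to the italics rule, because
at that point the link exists only as a token in the text.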
> Way 2: Use a Perl-compatible regular expression engine. PCRE engines do have
> support for parsing nested parentheses, see http://www.pcre.org/pcre.txt and
> search for "recursive back references".
> That support is limited both in what can be parsed and what you can do with
> the parse, so it may or may not help in this particular case.
> Is there a way to do PCRE in Python?
I had expected that PCRE is already built-in.
But I see that Python has its own, with many ideas going from Python to
Perl to PCRE (including the named-group syntax (?P<name>...), which
originated in Python; that's some awesomeness there).
Unfortunately, Python does not have the recursive pattern hacks.
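As a rough illustration with the standard re module: named groups work
as described, a PCRE-style recursive pattern such as (?R) is rejected
when compiled, and balanced parentheses can instead be found with a
small hand-written scanner. The helper balanced_span and the sample
strings are assumptions for this sketch, not part of any library.

```python
import re

# Named groups: the (?P<name>...) syntax.
m = re.match(r"(?P<func>\w+)\((?P<arg>\w+)\)", "sin(x)")
named = m.groupdict()

# PCRE-style recursion, (?R), is not understood by the re module.
try:
    re.compile(r"\((?:[^()]|(?R))*\)")
    recursion_supported = True
except re.error:
    recursion_supported = False


def balanced_span(text, start):
    """Return the index just past the ')' that closes the '(' at
    text[start], or -1 if the parentheses never balance."""
    depth = 0
    for i in range(start, len(text)):
        if text[i] == "(":
            depth += 1
        elif text[i] == ")":
            depth -= 1
            if depth == 0:
                return i + 1
    return -1
```

Here named is {'func': 'sin', 'arg': 'x'}, recursion_supported ends up
False, and balanced_span("f((a)b)c", 1) returns 7, so the slice
"f((a)b)c"[1:7] is the balanced group "((a)b)".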
Since regular expressions are error-prone enough already, I'd avoid
using two engines (the regular Python one and PCRE). Programmers writing
regular expressions might lose track of which of the two engines is in
use, or mix up their specific features and quirks, and end up
introducing even more errors into the patterns.
Regards,
Jo