Re: [sympy] Parsing issue with mathematica.py

Joachim Durchholz Tue, 29 Nov 2011 01:13:47 -0800

On 28.11.2011 22:58, Aaron Meurer wrote:

> The main issue here, as I see it, is not so much the speed, as the
> extensibility.  We want to be able to add in bunches of new syntaxes
> without the parser becoming too unwieldy.

You need to define a syntactic frame for that. I.e. additional syntaxes
must follow some general restrictions, else you'll end up with ambiguous
syntax.

You can get by with some generalized operator-precedence syntax, for
example, but that will disallow stuff like A * -B; you have to
parenthesize it as A * (-B).
Going more powerful becomes quite unmanageable quickly, unless you have
somebody on the team who is an expert with parsing tools.
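A minimal sketch of what such an operator-precedence scheme could look like, to make the A * -B restriction concrete. Everything here is an assumption for illustration (the BINARY_OPS table, the parse function, the tuple-based tree); it is not code from mathematica.py. New binary operators are added by extending the table, and a minus sign is accepted as a prefix only at the start of an expression or right after an opening parenthesis.

```python
import re

# Hypothetical operator table: symbol -> (precedence, right_associative).
# Adding a new binary operator means adding one entry here.
BINARY_OPS = {
    "+": (1, False), "-": (1, False),
    "*": (2, False), "/": (2, False),
    "^": (3, True),
}

TOKEN_RE = re.compile(r"\s*(\d+|[A-Za-z_]\w*|[-+*/^()])")


def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse(text):
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def operand():
        tok = peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok == "(":
            advance()
            inner = expression(0, at_start=True)
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return inner
        if tok in BINARY_OPS or tok == ")":
            # An operator where an operand is required: this rejects A * -B.
            raise SyntaxError(f"operand expected, got {tok!r}")
        return advance()

    def expression(min_prec, at_start=False):
        # A prefix minus is only legal where an expression starts.
        if at_start and peek() == "-":
            advance()
            left = ("neg", operand())
        else:
            left = operand()
        while peek() in BINARY_OPS and BINARY_OPS[peek()][0] >= min_prec:
            op = advance()
            prec, right_assoc = BINARY_OPS[op]
            right = expression(prec if right_assoc else prec + 1)
            left = (op, left, right)
        return left

    tree = expression(0, at_start=True)
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


print(parse("A * (-B)"))   # ('*', 'A', ('neg', 'B'))
print(parse("1 + 2 * 3"))  # ('+', '1', ('*', '2', '3'))
```

With this sketch, parse("A * -B") raises SyntaxError, because the operand after * may not begin with an operator. The prefix minus binds to the single operand that follows it, which is a simplification.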

So... if you want something modular where you can add syntax on the fly,
without nasty surprises, you need to define what kinds of syntaxes you
want to allow.
I'm not a parsing tool expert, but I know quite well what you can do if
you do not want to become one, so I can advise.


> If you just have a list of regular expression search and replaces,
> this could easily happen.


This depends.

If you let each regex just do a search-and-replace, the later regexes
will see the outputs of the former ones and start to interact; this will
produce extremely fragile parsers that will break at the drop of a hat.

If you let each regex replace with a token that's guaranteed not to be
matched by any regex further down the line, there will be no such
interaction. However, you need to set up a dictionary so that the tokens
can be replaced with their texts again.

I have seen this approach work extremely well in Patrick Michaud's
PmWiki. People with little experience in syntax were creating new markup
syntax in their plug-ins all the time, and while there were conflicts,
they were relatively easy to diagnose and didn't happen very often.
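A minimal sketch of the placeholder-token idea, assuming a placeholder made of NUL characters around a running index (a format chosen here for illustration, on the assumption that the input never contains NUL and no rule matches it). The two rules are made up to provoke an interaction; they are not the actual rules in mathematica.py, and the names naive, translate and RULES are hypothetical.

```python
import re

# Assumed placeholder format: NUL + index + NUL. No rule below can match it.
KEEP_FMT = "\x00{}\x00"
KEEP_RE = re.compile(r"\x00(\d+)\x00")

# Illustrative rules only: (pattern, function building the replacement text).
RULES = [
    (r"\^", lambda m: "**"),    # power operator
    (r"\*\*", lambda m: "@"),   # some other operator spelled ** in the input
]


def naive(text, rules):
    """Plain search-and-replace: later rules see earlier output."""
    for pattern, build in rules:
        text = re.sub(pattern, build, text)
    return text


def translate(text, rules):
    """Replace each match with a placeholder, restore all of them at the end."""
    kept = []  # kept[i] is the text hidden behind placeholder i

    def keep(replacement):
        kept.append(replacement)
        return KEEP_FMT.format(len(kept) - 1)

    for pattern, build in rules:
        text = re.sub(pattern, lambda m: keep(build(m)), text)

    # A kept text may itself contain an earlier placeholder, so repeat
    # until none are left.
    while KEEP_RE.search(text):
        text = KEEP_RE.sub(lambda m: kept[int(m.group(1))], text)
    return text


print(naive("a^b ** c", RULES))      # a@b @ c   (second rule ate the first rule's output)
print(translate("a^b ** c", RULES))  # a**b @ c
```

In the naive version the result depends on rule order; with placeholders, each rule only ever sees untouched input text.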

>> Way 2: Use a Perl-compatible regular expression engine. PCRE engines do have
>> support for parsing nested parentheses, see http://www.pcre.org/pcre.txt and
>> search for "recursive back references".
>> That support is limited both in what can be parsed and what you can do with
>> the parse, so it may or may not help in this particular case.
>
> Is there a way to do PCRE in Python?


I had expected that PCRE is already built-in.

But I see that Python has its own, with many ideas going from Python to
Perl to PCRE (and even a feature that didn't make it into Perl/PCRE,
namely (?P<name>...) - that's some awesomeness there).
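For readers who have not used the (?P<name>...) syntax, a short sketch with Python's standard re module. The pattern and the names CALL_RE, func and arg are made up for illustration; it handles only a single, non-nested call such as Sin[x].

```python
import re

# Named groups: refer to parts of the match by name instead of by number.
CALL_RE = re.compile(r"(?P<func>[A-Z][A-Za-z]*)\[(?P<arg>[^\[\]]*)\]")

m = CALL_RE.fullmatch("Sin[x]")
print(m.group("func"), m.group("arg"))  # Sin x
print(m.groupdict())                    # {'func': 'Sin', 'arg': 'x'}
```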

Unfortunately, Python does not have the recursive pattern hacks.
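Without recursive patterns, nested brackets have to be matched outside the regex. A minimal sketch of one way to do that with a depth counter; the function name matching_bracket and the sample expression are assumptions for illustration, not part of SymPy.

```python
def matching_bracket(text, start, open_ch="[", close_ch="]"):
    """Return the index of the bracket that closes the one at text[start]."""
    if text[start] != open_ch:
        raise ValueError(f"no {open_ch!r} at index {start}")
    depth = 0
    for i in range(start, len(text)):
        if text[i] == open_ch:
            depth += 1
        elif text[i] == close_ch:
            depth -= 1
            if depth == 0:
                return i
    raise ValueError("unbalanced brackets")


expr = "Sin[Cos[x] + Tan[y]] + 1"
start = expr.index("[")
end = matching_bracket(expr, start)
print(expr[start + 1:end])  # Cos[x] + Tan[y]
```

A regex can still be used to find the function name and the opening bracket; the helper then locates where the argument list ends.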

As regular expressions are too error-prone anyway, I'd avoid using two
engines (the regular Python one and PCRE). Programmers writing regular
expressions might start confusing which of the two engines is in use,
or might confuse specific features and quirks, and end up introducing
even more errors into the patterns.


Regards,
Jo
