> I've never had any serious experience with operator precedence
> parsing, but I have the intuition that this technique is going to be
> quite unwieldy if we would like to go beyond simple expressions like
> the ones you have shown.
>
> I'd advocate a more general approach to this problem, namely
> something like LL(1), SLR or LALR(1). When writing from scratch, this
> is obviously going to require more effort than operator precedence, but
> in return we will have a much wider class of languages covered.
> However, I'd recommend using a parser generator instead of writing a
> parser from scratch yourself. It will introduce some slowdown, but
> in return will make the whole thing a lot easier to write, maintain and,
> more importantly, to extend. I'm not sure about the state of parser
> generators for Python, but this page
> http://wiki.python.org/moin/LanguageParsing may provide some
> information.
>
Yeah, I think I should use a parser generator. But doesn't that introduce a
dependency on the parser generator? Is it OK to have this dependency?

> I can see a different problem here though: choosing a parsing method
> and producing the grammar of the language (or what information the
> parser would require) may not be enough to get the desired
> functionality. What we are trying to get at should incorporate
> something similar to natural language processing. Therefore, before
> actually starting the parsing process, I think the input should be
> brought to some kind of canonical form. I am strongly inclined to
> believe that Wolfram|Alpha takes exactly this way of doing things (but
> obviously with a lot more details).
>
> One simple thing I can think of is detecting synonyms. For example,
> "roots x**2 == 1" and "solutions x**2 == 1" and "solve x**2 == 1" are
> one and the same thing. Therefore, it may be useful to start with
> substituting each of these synonyms with some canonical word ("solve",
> for example). It is possible to argue that this can be solved by
> introducing additional rules into the grammar. However, I am inclined
> to believe that this will make the grammar unnecessarily large and,
> even worse, slow down the parser.
>
I was thinking the same thing. We have to substitute the synonyms before we
start parsing.
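
A minimal sketch of the synonym substitution, assuming a small hand-written
table. The names SYNONYMS and canonicalize are hypothetical, made up for
illustration, and are not part of SymPy:

```python
import re

# Hypothetical synonym table: every key is rewritten to its canonical keyword.
SYNONYMS = {
    "roots": "solve",
    "solutions": "solve",
    "solve": "solve",
}

def canonicalize(query):
    """Replace whole-word synonyms with their canonical keyword."""
    def replace(match):
        word = match.group(0)
        # Words that are not in the table (such as "x") are left untouched.
        return SYNONYMS.get(word.lower(), word)
    # Only runs of letters and underscores are looked up, so operators and
    # numbers in "x**2 == 1" pass through unchanged.
    return re.sub(r"[A-Za-z_]+", replace, query)

print(canonicalize("roots x**2 == 1"))
print(canonicalize("solutions x**2 == 1"))
```

Both calls should print "solve x**2 == 1". Keeping this as a table lookup in
front of the parser means the grammar itself only needs the one canonical
keyword.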
>
> Another simple thing is detecting spelling errors. Suppose I type
> "intgrate" or any of the other (numerous) wrong possibilities. I
> think it would be very nice to have such mistakes detected and
> corrected automatically. This http://packages.python.org/pyenchant/
> seems on topic.
>
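
For the spelling part, here is a rough sketch that does not need an extra
dependency. Note that it uses difflib from the standard library rather than
pyenchant, and that the KEYWORDS list and the correct_word helper are
assumptions made up for this example:

```python
import difflib

# Hypothetical vocabulary of keywords the parser understands.
KEYWORDS = ["integrate", "differentiate", "solve", "limit", "expand", "factor"]

def correct_word(word, cutoff=0.8):
    """Return the closest known keyword, or the word unchanged if none is close."""
    matches = difflib.get_close_matches(word.lower(), KEYWORDS, n=1, cutoff=cutoff)
    return matches[0] if matches else word

print(correct_word("intgrate"))  # close enough to "integrate" to be corrected
print(correct_word("x"))         # no keyword is similar, so it stays as is
```

The cutoff is a guess. If it is set too low, short variable names may be
"corrected" into keywords, so it would need tuning against real input.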
> I'd also recommend showing the substitutions and the resulting
> canonical form. In this way the user will be able to see how their
> input was interpreted and, maybe, change their formulation to get
> nearer to what they wanted.
>
We will implement this the way Wolfram|Alpha does it: we will show users what
their input was interpreted as.
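
Something along these lines, as a sketch only. The REWRITES table and the
interpret and report functions are hypothetical names. The idea is to record
every substitution while rewriting, so that it can be reported afterwards:

```python
import re

# Hypothetical rewrite table; a real one would be filled in by the synonym
# and spelling steps.
REWRITES = {"roots": "solve", "solutions": "solve", "intgrate": "integrate"}

def interpret(query):
    """Return the canonical form plus the list of substitutions that were made."""
    substitutions = []

    def replace(match):
        word = match.group(0)
        canonical = REWRITES.get(word.lower(), word)
        if canonical != word:
            substitutions.append((word, canonical))
        return canonical

    canonical_query = re.sub(r"[A-Za-z_]+", replace, query)
    return canonical_query, substitutions

def report(query):
    """Format a message that tells the user how the input was read."""
    canonical_query, substitutions = interpret(query)
    lines = ["Input interpreted as: " + canonical_query]
    for original, canonical in substitutions:
        lines.append("  replaced '%s' with '%s'" % (original, canonical))
    return "\n".join(lines)

print(report("roots x**2 == 1"))
```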
> The list of actions the preprocessor should perform can be extended
> arbitrarily, I guess. For example, it could try to fix wrongly
> balanced parentheses. It might also try to cope with a different
> order of keywords in the string, like in "sin(x) integral". It would
> be nice to parse single-argument functions written without parentheses
> ("sin x" instead of "sin(x)"). The preprocessor could also drop
> incomprehensible (and thus supposedly meaningless) words, like in
> "find the integral of x^2".
>
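
Two of these actions can be sketched quite simply: fixing unbalanced
parentheses and adding parentheses to single-argument functions. The FUNCTIONS
tuple and both helper names are assumptions for illustration. The second
helper only handles an argument that is a single name or number, so input like
"sin cos x" or "sin x**2" would still need proper handling:

```python
import re

# Hypothetical list of single-argument functions the parser knows.
FUNCTIONS = ("sin", "cos", "tan", "exp", "log")

def balance_parentheses(text):
    """Drop unmatched ')' and append the ')' needed to close every open '('."""
    result = []
    depth = 0
    for char in text:
        if char == "(":
            depth += 1
        elif char == ")":
            if depth == 0:
                continue  # unmatched closing parenthesis: skip it
            depth -= 1
        result.append(char)
    return "".join(result) + ")" * depth

def add_function_parentheses(text):
    """Rewrite 'sin x' as 'sin(x)' when the argument is one name or number."""
    names = "|".join(FUNCTIONS)
    pattern = r"\b(%s)\s+([A-Za-z_]\w*|\d+(?:\.\d+)?)\b" % names
    return re.sub(pattern, r"\1(\2)", text)

print(balance_parentheses("sin(x"))
print(add_function_parentheses("sin x + cos y"))
```

Appending the missing parentheses at the end is only one possible guess at
what the user meant, which is another reason to show the interpreted form.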
> Apparently, the elements in this list should also be given priorities,
> because some of them are essential (synonyms, for example, as I see
> it), others are less critical.
>
Thanks for your help. I think I have to look into all these ideas, prioritize
them, and come up with a plan to implement them.
> Sergiu
>
>