Re: [sympy] feedback for GSOC 2012 idea

Bharath M R Mon, 12 Mar 2012 21:16:55 -0700


> I've never had any serious experience with operator precedence
> parsing, but I have the intuition that this technique is going to be
> quite unwieldy if we would like to go beyond simple expressions like
> the ones you have shown.
>
> I'd advocate a more general approach to this problem, and namely
> something like LL(1), SLR or LALR(1).  When writing from scratch, this
> is obviously going to require more effort than operator precedence but
> instead we will have a much wider class of languages covered.
> However, I'd recommend using a parser generator instead of writing a
> parser from scratch yourself.  It will introduce some slowdown, but
> will instead make the whole thing a lot easier to write, maintain and,
> more importantly, to extend.  I'm not sure about the state of parser
> generators for Python, but this page
> http://wiki.python.org/moin/LanguageParsing may provide some
> information.
>
Yeah, I think I should use a parser generator. But doesn't that introduce a
dependency on the parser generator? Is it OK to have this dependency?

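To get a feel for the trade-off, here is a minimal sketch of a hand-written
recursive-descent (LL(1)-style) parser that needs no external dependency. The
toy grammar, the tuple-based tree and all names (tokenize, Parser, parse) are
my own assumptions for illustration; none of this is existing SymPy code.

```python
import re

# Hypothetical toy grammar (illustration only):
#   expr   -> term (('+' | '-') term)*
#   term   -> factor (('*' | '/') factor)*
#   factor -> NUMBER | '(' expr ')'
TOKEN = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(text):
    """Split the input into number and operator tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError("unexpected character at position %d" % pos)
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    """One method per grammar rule, with a single token of lookahead."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            node = (op, node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            node = (op, node, self.factor())
        return node

    def factor(self):
        token = self.advance()
        if token == "(":
            node = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if token is not None and token.isdigit():
            return int(token)
        raise SyntaxError("unexpected token %r" % (token,))


def parse(text):
    """Parse the whole input and return a nested-tuple syntax tree."""
    parser = Parser(tokenize(text))
    tree = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError("trailing input")
    return tree
```

Calling parse("1 + 2 * 3") gives a tree in which the multiplication binds
tighter than the addition. A parser generator would produce the equivalent of
the Parser class from the grammar comment alone, which is the maintenance
benefit being discussed, at the cost of the extra dependency.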

> I can see a different problem here though: choosing a parsing method
> and producing the grammar of the language (or what information the
> parser would require) may not be enough to get the desired
> functionality.  What we are trying to get at should incorporate
> something similar to natural language processing.  Therefore, before
> actually starting the parsing process, I think the input should be
> brought to some kind of canonical form.  I am strongly inclined to
> believe that Wolfram|Alpha takes exactly this way of doing things (but
> obviously with a lot more details). 
>
> One simple thing I can think of is detecting synonyms.  For example,
> "roots x**2 == 1" and "solutions x**2 == 1" and "solve x**2 == 1" are
> one and the same thing.  Therefore, it may be useful to start with
> substituting each of these synonyms with some canonical word ("solve",
> for example).  It is possible to argue that this can be solved by
> introducing additional rules into the grammar.  However, I am inclined
> to believe that this will make the grammar unnecessarily large and,
> even worse, slow down the parser.
>
I was thinking the same thing. We have to substitute the synonyms before we
start parsing.

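A minimal sketch of that substitution step, assuming a plain whitespace split
is good enough for a first pass. The SYNONYMS table and the canonicalize name
are hypothetical; the real list of synonyms would still have to be collected.

```python
# Hypothetical synonym table: every key maps to its canonical keyword.
SYNONYMS = {
    "roots": "solve",
    "solutions": "solve",
    "solve": "solve",
}


def canonicalize(query):
    """Replace each known synonym with its canonical keyword."""
    words = query.split()
    return " ".join(SYNONYMS.get(word.lower(), word) for word in words)
```

With this table, "roots x**2 == 1", "solutions x**2 == 1" and
"solve x**2 == 1" all become "solve x**2 == 1", so the grammar only has to
know the single keyword "solve".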
>
> Another simple thing is detecting spelling errors.  Suppose I type
> "intgrate" or any of the other (numerous) wrong possibilities.  I
> think it would be very nice to have such mistakes detected and
> corrected automatically.  This http://packages.python.org/pyenchant/
> seems on topic.
>
> I'd also recommend showing the substitutions and the resulting
> canonical form.  In this way the user will be able to see how their
> input was interpreted and, maybe, change their formulation to get
> nearer to what they wanted.
>
This will be implemented just like Wolfram|Alpha: we will show users how
their input was interpreted.
 

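As a rough sketch of how spelling correction and the "interpreted as" display
could fit together, the following uses difflib from the standard library
instead of pyenchant, purely to keep the example free of dependencies. The
KEYWORDS list, the 0.8 cutoff and the function names are assumptions of mine.

```python
import difflib

# Hypothetical list of keywords the parser would understand.
KEYWORDS = ["integrate", "differentiate", "solve", "limit", "factor"]


def correct_word(word, cutoff=0.8):
    """Return the closest known keyword, or the word itself if none is close."""
    matches = difflib.get_close_matches(word.lower(), KEYWORDS, n=1, cutoff=cutoff)
    return matches[0] if matches else word


def interpret(query):
    """Return the corrected query together with the substitutions made."""
    original = query.split()
    corrected = [correct_word(word) for word in original]
    changes = [(old, new) for old, new in zip(original, corrected) if old != new]
    return " ".join(corrected), changes
```

For "intgrate x**2" this returns the corrected string "integrate x**2" plus
the pair ("intgrate", "integrate"), which is the information needed to show
the user what was changed.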
> The list of actions the preprocessor should perform can be extended
> arbitrarily, I guess.  For example, it could try to fix wrongly
> balanced parentheses.  It might also try to cope with a different
> order of keywords in the string, like in "sin(x) integral".  It would
> be nice to parse single-argument functions written without parentheses
> ("sin x" instead of "sin(x)").  The preprocessor could also drop
> incomprehensible (and thus supposedly meaningless) words, like in
> "find the integral of x^2".
>
> Apparently, the elements in this list should also be given priorities,
> because some of them are essential (synonyms, for example, as I see
> it), others are less critical.
>
Thanks for your help. I will look into all these ideas, prioritize them,
and come up with a plan to implement them.

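To make the plan a bit more concrete, here is a minimal sketch of three of
the preprocessing steps mentioned above, applied in a fixed order. The
FUNCTIONS and FILLER lists, the order of the steps and all function names are
hypothetical; a real preprocessor would need a more careful design.

```python
import re

# Hypothetical lists, for illustration only.
FUNCTIONS = ("sin", "cos", "tan", "log", "exp")
FILLER = {"find", "the", "of", "please"}


def drop_filler(query):
    """Drop words that carry no meaning for the parser."""
    return " ".join(w for w in query.split() if w.lower() not in FILLER)


def add_call_parentheses(query):
    """Rewrite 'sin x' as 'sin(x)' for known single-argument functions."""
    pattern = r"\b(%s)\s+(\w+)" % "|".join(FUNCTIONS)
    return re.sub(pattern, r"\1(\2)", query)


def balance_parentheses(query):
    """Drop unmatched ')' and append any missing ')' at the end."""
    depth = 0
    kept = []
    for ch in query:
        if ch == "(":
            depth += 1
        elif ch == ")":
            if depth == 0:
                continue  # unmatched closing parenthesis, skip it
            depth -= 1
        kept.append(ch)
    return "".join(kept) + ")" * depth


# Steps run in list order; reordering this list is how priorities would change.
STEPS = [drop_filler, add_call_parentheses, balance_parentheses]


def preprocess(query):
    for step in STEPS:
        query = step(query)
    return query
```

For example, preprocess("find the integral of sin x") yields
"integral sin(x)". Reordering keywords such as "sin(x) integral" is not
covered by this sketch.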
> Sergiu
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"sympy" group.
To view this discussion on the web visit 
https://groups.google.com/d/msg/sympy/-/aNSWgICyxgoJ.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/sympy?hl=en.
