Alright, thanks for that other thread. I'll review this and discuss with my teacher to come up with a more specific plan.
The tokenize module is quite interesting. I guess the way Gamma would eventually work is to try to process non-Python syntax while still accepting Python expressions? Or perhaps some sort of relaxed, non-strict Python grammar could be implemented (with some preprocessing to allow "integrate", "integral of", etc.) so that tan x, 3x, and e^x are accepted as well as tan(x), 3*x, and exp(x).

On Monday, September 3, 2012 1:21:27 PM UTC-7, Joachim Durchholz wrote:
> We would not want to add an entire natural language processing toolkit;
> SymPy has a rather strict "no external dependencies" policy because it
> needs to be installable in installer-unfriendly environments
> (non-administrator accounts, mobile devices).

Alright, I wasn't aware of that requirement. Thanks for pointing that out. NLTK would have been too onerous a dependency in any case, as it requires a total of 800 MB of corpus data to run its various algorithms.

> However, it would be extremely useful if we could "steal" the parser
> engine from such a toolkit.
> Most if not all of these engines accept arbitrary context-free grammars.
> With such an engine, we could just write down the BNF of some grammar
> (Mathematica, Latex, natural language, whatever), and experiment with it
> until it works satisfactorily.

I think this was another of the ideas listed on the wiki, and I guess it falls under a similar category. So perhaps some heuristic for distinguishing between the various input languages (Python, TeX, "English-like", etc.) and then translating each of them into Python could also be an interesting task.

> A general remark: Natural language is notoriously hard to parse. Either
> it's intuitive, then it's too ambiguous to be useful in a context like
> that of symbolic math; or it's precise, in which case it isn't natural
> language anymore. Finding the right trade-off for such things is an
> ongoing research topic. And after that, you get into the *really*
> "interesting" problems...
>
> My advice would be to avoid natural language if it's just a means to an
> end; natural language processing just isn't explored well enough for
> that, and you'll likely get more problems than the approach can solve.
> If, on the other hand, natural language processing is your primary
> interest, by all means continue with it, there's a lot of PhD material
> in there :-)

Since Gamma only deals with mathematical expressions (a more limited domain than Wolfram|Alpha's), I believe at least some basic English-like queries can be interpreted. I should have been more specific about that. I thought that natural language processing could help somewhat with the task, or at least point me towards algorithms and ideas, which is why I mentioned it. Given how difficult it is, though, I guess just being able to interpret 2x, sin x, and integral of x^2 would be a nice step up in functionality.

Thanks for all your help and suggestions!

David Li
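
A rough sketch of the relaxed-input idea discussed above (accepting tan x, 3x, and e^x alongside tan(x), 3*x, and exp(x)). This is only an illustration and not Gamma's or SymPy's actual code. For the sake of being self-contained it uses a small regular-expression tokenizer rather than the standard tokenize module, and the names KNOWN_FUNCTIONS, tokenize_relaxed, and to_python are made up for this example. It assumes that a bare function name applies to the single token that follows it, and that ^ always means exponentiation.

```python
import re

# Hypothetical list of function names the sketch recognises.
KNOWN_FUNCTIONS = {"sin", "cos", "tan", "exp", "log", "sqrt"}

# One token: a number, a name, '**', or a single-character operator.
TOKEN_RE = re.compile(r"\s*(\d+\.\d+|\d+|[A-Za-z_]\w*|\*\*|[-+*/^(),])")


def tokenize_relaxed(text):
    """Split relaxed input into a flat list of token strings."""
    text = text.strip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError("unexpected character at position %d" % pos)
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def is_number(tok):
    return tok[0].isdigit()


def is_name(tok):
    return tok[0].isalpha() or tok[0] == "_"


def ends_operand(tok):
    # A value has just been completed: number, variable, or ')'.
    return tok == ")" or is_number(tok) or (is_name(tok) and tok not in KNOWN_FUNCTIONS)


def starts_operand(tok):
    # A new value is beginning: number, name, or '('.
    return tok == "(" or is_number(tok) or is_name(tok)


def to_python(text):
    """Rewrite relaxed input as a string in ordinary Python syntax."""
    tokens = tokenize_relaxed(text)
    out = []
    i = 0
    while i < len(tokens):
        tok = "**" if tokens[i] == "^" else tokens[i]
        # Two values side by side mean implicit multiplication: 3x -> 3*x.
        if out and ends_operand(out[-1]) and starts_operand(tok):
            out.append("*")
        out.append(tok)
        # A function name followed by a bare number or name: tan x -> tan(x).
        if tok in KNOWN_FUNCTIONS and i + 1 < len(tokens):
            nxt = tokens[i + 1]
            if nxt != "(" and starts_operand(nxt):
                out.extend(["(", nxt, ")"])
                i += 1
        i += 1
    return "".join(out)


for query in ["tan x", "3x", "e^x", "2(x + 1)", "tan(x)"]:
    print(query, "->", to_python(query))
```

The result is just a string, so it would still need to be handed to a real parser afterwards. The sketch is deliberately limited: sin 2x becomes sin(2)*x, xy is read as a single name rather than x*y, and phrases like "integral of" are not handled at all.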
