On 04.09.2012 00:11, David Li wrote:
> So perhaps some heuristic for differentiating
> between various input languages and then interpreting them as Python
> (Python, TeX, "English-like", etc.) could also be an interesting task.

Heh. That's simple:
- Have a grammar for each syntax that we have,
- run the input through all grammars,
- use the grammar that doesn't return an error.
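
A minimal sketch of those three steps, just to make them concrete. The names here are assumptions for illustration: Python's own parser (ast.parse) stands in for a "Python grammar", and parse_implicit_mul is a hypothetical toy grammar that only accepts input like "2x".

```python
import ast
import re


def parse_python(text):
    # Python's own expression parser stands in for a "Python grammar".
    return ast.parse(text, mode="eval")


def parse_implicit_mul(text):
    # Hypothetical toy grammar: an integer directly followed by a name, e.g. "2x".
    m = re.fullmatch(r"(\d+)([A-Za-z_]\w*)", text.strip())
    if m is None:
        raise SyntaxError("not of the form <int><name>")
    return ("mul", int(m.group(1)), m.group(2))


GRAMMARS = {"python": parse_python, "implicit_mul": parse_implicit_mul}


def matching_grammars(text, grammars=GRAMMARS):
    """Run the input through every grammar; keep the ones that do not fail."""
    matches = {}
    for name, parse in grammars.items():
        try:
            matches[name] = parse(text)
        except SyntaxError:
            pass  # this grammar rejects the input
    return matches
```

An empty result is case (1) below, and a result with more than one entry is case (2).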

The fun begins when considering the following cases:
1) No grammar matches.
2) More than one grammar matches.

For (1), you'd want to somehow rank the grammars according to how close the input is to each grammar, and assume the user really meant the closest one.
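
One crude way to rank, sketched under heavy assumptions: treat "how far the parser got before the first error" as the closeness measure. Everything below is made up for illustration (ParseError, make_token_grammar, and the two token-only "grammars"); a real ranking would probably want something smarter, such as the number of edits needed to make the input parse.

```python
import re


class ParseError(Exception):
    """Hypothetical error type that records where parsing stopped."""

    def __init__(self, pos):
        super().__init__(f"parse error at column {pos}")
        self.pos = pos


def make_token_grammar(token_pattern):
    # Toy "grammar": accepts any sequence of the given tokens.
    token = re.compile(r"\s*(?:%s)" % token_pattern)

    def parse(text):
        text = text.rstrip()
        pos = 0
        while pos < len(text):
            m = token.match(text, pos)
            if m is None:
                raise ParseError(pos)
            pos = m.end()
        return text

    return parse


GRAMMARS = {
    "python_like": make_token_grammar(r"\d+|[A-Za-z_]\w*|\*\*|[-+*/()]"),
    "tex_like": make_token_grammar(r"\d+|[A-Za-z]|\\[A-Za-z]+|[-+*/^{}()]"),
}


def rank_grammars(text, grammars=GRAMMARS):
    """Return (name, score) pairs, best first; score = characters accepted."""
    scores = []
    for name, parse in grammars.items():
        try:
            parse(text)
            score = len(text)  # a full match is the best possible score
        except ParseError as err:
            score = err.pos
        scores.append((name, score))
    return sorted(scores, key=lambda item: -item[1])
```

For an input such as x^2 + \frac{1}{2} ? neither toy grammar matches, but the TeX-like one gets much further, so it would be the guess.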

For (2), you'd want to check whether the different grammars all really mean the same thing. If they do (e.g. "1*1" should parse the same for all math grammars), just continue processing. Otherwise, you'll have to ask the user. Or guess one at random and let the user explicitly select grammars.

There's also a slight complication for case (2): You may get different parse trees that boil down to the same operations. For example, grammars with different numbers of precedence levels tend to end up that way; 1*2 could end up as

op: *
  int: 1
  int: 2

or as

op: *
  literal
    int: 1
  literal
    int: 2

where the second grammar would for some reason differentiate between literals, names, and other representations, whereas the first does not.

You'll either need a pass that normalizes grammars, or require that commonalities between grammars are handled by identical rules. The first approach probably requires less work because SymPy already has routines for simplifying expressions; however, that makes error reporting more difficult because the transformations aren't built for keeping track of input line/column numbers.
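
To illustrate the normalizing pass on the two trees for 1*2: the sketch below collapses wrapper nodes that carry no value and have exactly one child. Note that this is a plain tree rewrite, not SymPy's simplification routines, and the (kind, value, children) tuple layout is an assumption made up for the example. It also shows the error-reporting problem: nothing here carries line/column information along.

```python
def normalize(node):
    """Collapse wrapper nodes (no value, exactly one child) into their child."""
    kind, value, children = node
    children = [normalize(child) for child in children]
    if value is None and len(children) == 1:
        return children[0]
    return (kind, value, children)


# Tree from a grammar that puts ints directly under the operator.
tree_a = ("op", "*", [("int", 1, []), ("int", 2, [])])

# Tree from a grammar that wraps each int in a "literal" node.
tree_b = ("op", "*", [
    ("literal", None, [("int", 1, [])]),
    ("literal", None, [("int", 2, [])]),
])

same_meaning = normalize(tree_a) == normalize(tree_b)
```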

You see, there's enough to do :-)

Not all aspects need to be addressed on the first round though. Just choose how much of this all you want to deal with, and code in a way that the rest can be added later without rewriting everything.

> Since Gamma only deals with mathematical expressions (which is more limited
> than Wolfram|Alpha) I believe at least some basic English-like queries can
> be interpreted.
> ...
> Given how
> difficult it is, though, I guess just being able to interpret 2x, sin
> x, and integral of x^2 would be a nice step up in functionality.

Indeed, that's easy enough. You can always write a grammar that accepts a subset of English.
Main points:
- Do not require parentheses for function parameters; a function call is just: name {expr}
- Make name {expr} bind weaker than all operators, so sin x+y is equivalent to sin (x+y).
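
A small recursive-descent sketch of those two points. Assumptions: the parser knows the function names up front (the FUNCTIONS set is invented for the example), trees are plain nested tuples, and things like implicit multiplication ("2x") or "integral of" are left out.

```python
import re

FUNCTIONS = {"sin", "cos", "log"}  # assumed set of known function names


def tokenize(text):
    # Characters that are not part of any token are silently skipped here.
    return re.findall(r"\d+|[A-Za-z_]\w*|[-+*/()]", text)


def parse(text):
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def expr():
        node = term()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, term())
        return node

    def term():
        node = atom()
        while peek() in ("*", "/"):
            op = advance()
            node = (op, node, atom())
        return node

    def atom():
        if peek() is None:
            raise SyntaxError("unexpected end of input")
        tok = advance()
        if tok in FUNCTIONS:
            # name {expr}: the argument is a whole expression, so the call
            # binds weaker than every operator to its right.
            return ("call", tok, expr())
        if tok == "(":
            node = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        if tok.isdigit():
            return ("int", int(tok))
        if tok[0].isalpha() or tok[0] == "_":
            return ("name", tok)
        raise SyntaxError(f"unexpected token {tok!r}")

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree
```

With this, parse("sin x+y") and parse("sin (x+y)") give the same tree, and "2*sin x+y" comes out as 2*sin(x+y), which is what "binds weaker than all operators" implies.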

> I should've been more specific about that. I thought that
> natural language could help somewhat with the task, or at least point me
> towards algorithms and ideas, which is why I mentioned it.

That wouldn't have worked. Parsing natural language is really hard. And the algorithms beyond parsing aren't related much to natural language.

Still, the natural language parsers should be suitable.
