Alright, that seems like a good approach. Actually, after playing around 
with the parser, I found that it already seems to parse (but won't evaluate) 
expressions like 2x. If I add a print statement to show the final list of 
tokens, I get:

>>> sympy.parsing.sympy_parser.parse_expr("2 x y")
[(1, 'Integer'), (51, '('), (2, '2'), (51, ')'), (1, 'Symbol'), (51, '('), 
(1, "'x'"), (51, ')'), (1, 'Symbol'), (51, '('), (1, "'y'"), (51, ')'), (0, 
'')]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "sympy/parsing/sympy_parser.py", line 182, in parse_expr
    expr = eval(code, global_dict, local_dict) # take local objects in preference
  File "<string>", line 1
    Integer (2 )Symbol ('x' )Symbol ('y' )
                     ^
SyntaxError: invalid syntax

So it's already parsing the input almost correctly; it just needs to 
recognize that adjacent terms are being multiplied (so a '*' needs to be 
inserted between them).
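
To make the idea concrete, here is a minimal sketch of such a pass over the 
(toknum, tokval) pairs shown above. The function name 
insert_implicit_multiplication is hypothetical and not part of sympy_parser, 
and the rule it applies (a ')' directly followed by a name) is only an 
assumption that happens to cover this example. It would misfire on keywords 
such as "and", which the tokenizer also reports as names.

```python
from token import NAME, NUMBER, OP, ENDMARKER


def insert_implicit_multiplication(tokens):
    """Insert a '*' token wherever a ')' is directly followed by a name.

    Hypothetical helper, not part of SymPy. `tokens` is a list of
    (toknum, tokval) pairs like the one printed by the parser above.
    """
    result = []
    for tok in tokens:
        # Assumption: ')' followed by a NAME always means multiplication.
        if result and result[-1] == (OP, ')') and tok[0] == NAME:
            result.append((OP, '*'))
        result.append(tok)
    return result


# The token list from the session above, written with symbolic token types
# (the string arguments carry the NAME type, as in the printed output).
tokens = [
    (NAME, 'Integer'), (OP, '('), (NUMBER, '2'), (OP, ')'),
    (NAME, 'Symbol'), (OP, '('), (NAME, "'x'"), (OP, ')'),
    (NAME, 'Symbol'), (OP, '('), (NAME, "'y'"), (OP, ')'),
    (ENDMARKER, ''),
]

patched = insert_implicit_multiplication(tokens)
code = ' '.join(value for _, value in patched).strip()
print(code)
```

The resulting string has a '*' between each pair of adjacent calls, so it 
should no longer trip the SyntaxError shown in the traceback.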

David

On Tuesday, September 4, 2012 6:52:16 PM UTC-7, Joachim Durchholz wrote:
>
> On 05.09.2012 02:14, David Li wrote: 
> > Yes, what I was thinking is that there would be a "whitespace expansion" 
> > step (probably after tokenization) that would convert statements like 
> > 2xy into "2 x y" and then tokenize again 
>
> Multiple tokenization steps are usually not worth it. 
> Make it so that there's a token boundary between 2 and xy. 
>
> Splitting xy would be one of those things that need to be 
> syntax-dependent. 
> SymPy allows defining variable names, so if there's an "x" and a "y", 
> you can split, and if there's an "xy", you wouldn't want to split. 
> If there are all three of "x", "y", and "xy", you have an ambiguous 
> parse; report that as something that the user needs to decide (ambiguous 
> parses are a fact of life for ad-hoc grammars, and in fact we need to 
> deal with these for other reasons anyway). 
>
> If SymPy provides you with a variable named "xy", add a temporary 
> grammar rule 
>    xy ::= x y 
> and make each letter a separate token. (Temporary grammar rules and 
> ambiguities are anathema in the parser generators used for programming 
> languages, but they are no problem in the parser generators used for 
> natural languages. Different parsing technology.) 
>
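
The splitting rule described in the quoted message could be sketched as 
follows. This is only an illustration under my own assumptions: the helper 
names segmentations and split_name are hypothetical, the set of known names 
is passed in by hand rather than taken from SymPy, and raising ValueError is 
just one possible way to report an ambiguous parse to the user.

```python
def segmentations(name, known):
    """Return every way to write `name` as a concatenation of known names."""
    if not name:
        return [[]]  # exactly one way to split the empty string
    results = []
    for i in range(1, len(name) + 1):
        head = name[:i]
        if head in known:
            for rest in segmentations(name[i:], known):
                results.append([head] + rest)
    return results


def split_name(name, known):
    """Split `name` into known names, refusing to guess when ambiguous."""
    options = segmentations(name, known)
    if len(options) > 1:
        raise ValueError("ambiguous parse of %r: %r" % (name, options))
    # If no split into known names exists, keep the name as a single token.
    return options[0] if options else [name]


print(split_name("xy", {"x", "y"}))   # only "x" and "y" are defined
print(split_name("xy", {"xy"}))       # only "xy" is defined
try:
    split_name("xy", {"x", "y", "xy"})
except ValueError as exc:
    print(exc)                        # all three defined: user must decide
```

With only "x" and "y" defined the name is split, with only "xy" defined it 
is kept whole, and with all three defined the ambiguity is reported instead 
of being resolved silently.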
