Re: [sympy] Names that aren't unique

Oscar Benjamin Mon, 12 Apr 2021 06:42:21 -0700

On Mon, 12 Apr 2021 at 13:23, 'Bruce Allen' via sympy
<[email protected]> wrote:
>
> Hi Oscar,
>
> > It wouldn't be hard to make any new definition of a Symbol with the
> > same name as a previously created symbol raise an error but it would
> > break the assumption that it is okay to define a symbol that is only
> > used local to some context and that assumption is depended on by many
> > users and downstream libraries and is also used internally by sympy
> > itself.
>
> This context is what I meant when I was talking about the 'scope' of a
> variable, and now it makes sense to me.


I think you misunderstand how this works in Python. I'm going to guess
that you are more familiar with C and describe this in those terms.

In Python there are objects and then there are names. The Python
expression Symbol('x') creates an object. The Python statement y =
Symbol('x') binds the name y to that object within the current scope
or namespace e.g.:

from sympy import Symbol

# bind the name y in the module scope:
y = Symbol('x')

def f():
    # bind the name t in the local scope of the function f
    t = Symbol('x')
    return t

z = f()

Variable names are scoped but objects are not and reside in a global
space. Each call to Symbol('x') adds a new object to that global space
(ignoring SymPy's cache for the moment). In C terms the object itself
is a heap-allocated struct. Binding a name is like making a pointer
point at the struct. Returning from a function actually returns the
pointer. In the above, t and z are different pointers to the same
struct, referenced by different names in different scopes.
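
To make the names-versus-objects point concrete without relying on SymPy's
cache, here is a minimal sketch in plain Python. The class Obj is a
hypothetical stand-in for a SymPy object (it is not part of SymPy); it
stores only the name it was given.

```python
class Obj:
    """Hypothetical stand-in for a SymPy object; stores only a name."""
    def __init__(self, name):
        self.name = name

def f():
    # bind the name t in the local scope of f
    t = Obj('x')
    # id(t) is the identity of the object (its address in CPython)
    return t, id(t)

z, id_of_t = f()
y = Obj('x')

# z and t are two names, in two scopes, for one object
same_object = id(z) == id_of_t
# a second constructor call made a second object with the same stored name
distinct = y is not z

print(same_object, distinct)
```

Nothing stored on the object records that it was ever called t, y or z.
The constructor only sees the string 'x' that was passed in.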

The code implementing Symbol('x') has no way to know the scope of the
variable (pointer) that it is being assigned to. There is no way in
SymPy to know that y and t above are names in different scopes.
Likewise in C I can make a function that returns a pointer to a
heap-allocated object but then there is no way for me to keep track of
what a user does with that pointer:

object *x = make_heap_object();
object *y = x; /* the author of make_heap_object has no control over this */

Within SymPy we cannot distinguish where in the Python code SymPy
expressions are being used. We can only look at the values stored in
the "struct" when a user calls a SymPy function.

Note that I describe this in terms of structs and pointers as an
analogy but if you use the standard CPython interpreter then that is a
C program and this is literally how it is implemented under the hood.

> Is it right that this function
> is "broken":
>
> def my_power(n):
>         x=Symbol('x')
>         expr = x**n
>         return expr
>
> because the scope of x is lost on function return?  But this function is OK:
>
> def my_power(x, n):
>         expr = x**n
>         return expr
>
> because when the second function is called, the variable x is already
> defined in the scope/context of the calling function?

It's not necessarily broken. That depends on the context. Within the
SymPy codebase this would be considered bad practice because there's
no way to know if the user is already using a symbol called `x`.
Instead either the user should be able to specify the symbol or at
least a Dummy symbol should be used e.g.:

In [4]: minpoly(sqrt(2))
Out[4]:
 2
x  - 2

In [5]: [sym] = minpoly(sqrt(2)).free_symbols

In [6]: sym
Out[6]: x

In [7]: type(sym)
Out[7]: sympy.core.symbol.Dummy

In [8]: sym == Symbol('x')
Out[8]: False

In [9]: sym == Dummy('x')  # Dummy behaves differently from Symbol
Out[9]: False

In [10]: minpoly(sqrt(2), y)
Out[10]:
 2
y  - 2
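
Applied to the my_power example, the same pattern might look like the
sketch below. The optional x parameter and its default of None are my
assumptions for illustration and not something SymPy prescribes; Symbol,
Dummy and free_symbols are the real SymPy names used in the session above.

```python
from sympy import Symbol, Dummy

def my_power(n, x=None):
    # Let the caller choose the symbol. Otherwise fall back to a Dummy,
    # which never compares equal to a symbol the caller already has.
    if x is None:
        x = Dummy('x')
    return x**n

user_x = Symbol('x')

# No symbol given: the result contains a Dummy that prints as x
[sym] = my_power(3).free_symbols
print(sym == user_x)                      # False

# Symbol given: the result is built from the caller's own symbol
print(my_power(3, user_x) == user_x**3)   # True

# Every Dummy is unique, even when the name is the same
print(Dummy('x') == Dummy('x'))           # False
```

A caller who wants to combine the result with their own expressions
should pass their symbol in explicitly.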

> > What I said about the overhead is that when creating an expression like
> >
> > x = Symbol('x')
> > x2 = Symbol('x', positive=True)
> > expr = x*x2
> >
> > it would be possible to check here that expr contains two different
> > symbols having the same name. However that would need to be checked in
> > the evaluation of x*x2 and then also in the evaluation of cos(x) +
> > 3*sin(x2) etc. We would need to walk the expression tree looking for
> > symbols with the same name every time any operation constructs a new
> > expression which would be too expensive.
>
> Wouldn't it be enough if the second line above:
> x2 = Symbol('x', positive=True)
> issued a warning message to the user, saying that "Symbol('x', ...) was
> called more than once with the same name"?  Or is that what would break
> existing code/libraries?

I think that would lead to a whole load of warnings from sympy itself
let alone other libraries. This would give out warnings for code that
works perfectly fine. I would consider this a breaking change.
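
As a rough illustration of why, here is a sketch in plain Python. The
registry _seen_names and the constructor make_symbol are hypothetical and
are not how SymPy works; they only mimic a "warn when a name is reused"
rule. Two independent helpers that each use a local x are enough to
trigger it.

```python
import warnings

_seen_names = set()  # hypothetical global registry of names used so far

def make_symbol(name):
    """Hypothetical constructor that warns when a name is reused."""
    if name in _seen_names:
        warnings.warn(f"symbol name {name!r} used more than once")
    _seen_names.add(name)
    return ('symbol', name)

def helper_a():
    # x is only used local to this function
    x = make_symbol('x')
    return x

def helper_b():
    # an unrelated function that also picks the name x locally
    x = make_symbol('x')
    return x

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    helper_a()
    helper_b()

print(len(caught))
```

Neither helper does anything wrong, yet the second call is flagged. The
same would happen every time a function like this is called again.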


Oscar

