Re: [sympy] A few ideas

Ronan Lamy Thu, 17 Feb 2011 19:27:25 -0800

On Wednesday 16 February 2011 at 13:05 -0700, Aaron S. Meurer wrote:
> There are some good ideas here.  See my comments below.  I have
> responded to both Brian and Ronan.
> 
> On Feb 15, 2011, at 9:17 PM, Brian Granger wrote:
> 
> > Ronan,
> > 
> >> Considering the merge of the quantum branch and some of the recent
> >> discussions, I have a few ideas to make sympy more modular and to make
> >> it easier to extend it.
> > 
> > Yes, these issues came up for us a lot in the writing of the quantum
> > stuff.  For example, we currently make the implicit (and possibly
> > false) assumption that commutative=False means that something is an
> > Operator.  But, of course, you can easily have non-commutative things
> > other than Operators.
> > 
> >> Mathematical types
> >> ------------------
> >> 
> >> While the concept of type doesn't usually appear explicitly in
> >> elementary presentations of mathematics, it's very important for
> >> mathematical intuition, allowing for instance to recognize immediately
> >> that it's absurd to add a scalar and a vector.
> > 
> > Yes, absolutely.  Having a proper mathematical type system would allow
> > us to easily do validation of various operations.  We definitely need
> > this in quantum, where, for example, we can do lots of mathematically
> > insane things like 2 + Ket('psi').
> > 
> >> So, every sympy object should have a 'type' attribute (NB: this is
> >> independent from the class of the object returned by 'type(obj)'). These
> >> types should be sympy objects themselves, forming a hierarchy - a
> >> directed acyclic graph - similar to a class hierarchy, including
> >> equivalents of isinstance and issubclass. Also, each type defines a
> >> Python-level interface (implemented as a mixin or abstract base class)
> >> which objects belonging to that type must implement (e.g. by subclassing
> >> the interface). This allows the decoupling of the mathematical
> >> properties from the implementation and, provided that algorithms use
> >> only the interface instead of details of the actual class, should allow
> >> different implementations of the same type to coexist without troubles.
> > 
> > I think the basics of this idea are great.  Some questions though...
> > 
> > * For mathematical objects, it seems like there are two broad types of
> > things: spaces and things that live in those spaces (elements).  There
> > are vectors and vector spaces, Kets and Hilbert spaces, the ring of
> > integers, etc.  Do we also need to think about objects that encode the
> > spaces as well as the elements they contain?


Yes, objects representing spaces are important, but this kind of thing
is still underdeveloped in sympy, so I haven't thought about it much.
Quantum Hilbert spaces are brand new and though polynomials have Domain
objects, they're mostly meant for internal use. 

> I think having objects for all the sets (or sets with structures) is
>  another common thing that other CASs do.  Again, we should look at how
>  the better, more mature ones out there do it.

Yes, sets and structures are important as well. We need more of them,
independently of this proposal.

> Also, this makes me realize that this is related to assumptions.  How
>  would you recommend having the assumptions with respect to this
>  system?

Assumptions need to be aware of types so that checking that something
with integer type is an integer quickly resolves to True, but I don't
think major changes are required.

> > * Coercion.  Sage has thought through this type of thing in
> > considerable depth and one concept they have is coercion.  I don't
> > quite understand the difference (in Sage) between casting and
> > coercion, but the general idea is what do you do when you attempt an
> > operation between two different types - how do you coerce one type
> > into another so that the operation can happen.

I've looked into Sage's coercion model (starting from William Stein's
blog post:
http://sagemath.blogspot.com/2010/11/brief-history-and-motivation-behind.html) 
and while the Types I'm defining here and Sage's Parents are quite similar, 
there's a major difference in that Sage conflates types and algebraic 
structures. Therefore, in Sage ZZ(1) and QQ(1) aren't the same object, but in 
sympy there's a single Integer(1) object which can be considered as an element 
of the ring of integers or of the field of rationals. Also, we don't need to 
coerce between different types to allow the operation to happen but we do need 
to assign the correct type to the result (and possibly to take some action 
based on the type). 
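
As a rough illustration of "assigning the correct type to the result"
(everything below is hypothetical: MathType, add_result_type and the rule
"the result takes the more general operand type" are assumptions made for
this sketch, not sympy or Sage API):

```python
# Hypothetical sketch: none of these names exist in sympy.
class MathType:
    """A node in a directed acyclic graph of mathematical types."""

    def __init__(self, name, parents=()):
        self.name = name
        self.parents = tuple(parents)

    def issubtype(self, other):
        # A type is a subtype of itself and of each of its ancestors.
        return self is other or any(p.issubtype(other) for p in self.parents)

    def __repr__(self):
        return self.name


RealType = MathType("Real")
RationalType = MathType("Rational", [RealType])
IntegerType = MathType("Integer", [RationalType])
KetType = MathType("Ket")


def add_result_type(a, b):
    """Type of a + b: the more general operand type, else TypeError."""
    if a.issubtype(b):
        return b
    if b.issubtype(a):
        return a
    raise TypeError("cannot add %s and %s" % (a, b))
```

Here add_result_type(IntegerType, RealType) gives Real and Ket + Real
raises TypeError, while the operands themselves are never converted.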

> Isn't coercion implicit while casting is explicit?  At least that's
>  what the names imply to me.  Anyway, if it is, we might remember that
>  "explicit is better than implicit."

Yes, that's one more reason to avoid coercion as much as possible. It
should mostly be limited to Python builtins and interconversion between
different implementations of the same type. 
> 
> > * Are the interfaces you envision on the type objects or the objects
> > that have the type attributes?  Does that make sense?

I'm not sure I understand your question. Concretely, we should have:
>>> n = Integer(12)
>>> n.type
IntegerType
>>> IntegerType.interface
IntegerInterface
>>> isinstance(n, IntegerInterface)
True
>>> n.type in RationalType # or n.type.issubtype(RationalType)
True
>>> isinstance(n, RationalInterface)
True

where IntegerType and RationalType are instances of Basic, and
IntegerInterface and RationalInterface are mixin classes and possibly
ABCs. 
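
A minimal, self-contained sketch that makes the session above run. It is
only an assumption about how this could look: MathType is an invented
helper, and the types here are plain Python objects rather than instances
of Basic.

```python
# Hypothetical sketch: these classes are illustrative, not sympy API.
from abc import ABC


class RationalInterface(ABC):
    """Python-level interface for objects of rational type."""


class IntegerInterface(RationalInterface):
    """Integers satisfy the rational interface as well."""


class MathType:
    def __init__(self, name, interface, supertypes=()):
        self.name = name
        self.interface = interface
        self.supertypes = tuple(supertypes)

    def issubtype(self, other):
        return self is other or any(s.issubtype(other) for s in self.supertypes)

    def __contains__(self, other):
        # Lets "n.type in RationalType" read as a subtype check.
        return isinstance(other, MathType) and other.issubtype(self)

    def __repr__(self):
        return self.name


RationalType = MathType("RationalType", RationalInterface)
IntegerType = MathType("IntegerType", IntegerInterface, [RationalType])


class Integer(IntegerInterface):
    type = IntegerType  # the mathematical type, distinct from type(obj)

    def __init__(self, value):
        self.value = int(value)


n = Integer(12)
```

The interface hierarchy mirrors the type hierarchy, which is why both
isinstance checks and both subtype checks succeed.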

> >> Now, what prevents these types from simply being implemented as ABCs is
> >> that we need parametric types (like Vector(Real, 3)) and that we want
> >> operation objects like Add to have different types depending on their
> >> arguments so that Int + Real -> Real, Ket + Ket -> Ket and Ket + Real ->
> >> TypeError(?). Implementing these objects with covariant type is, of
> >> course, the tricky bit, but it would allow us to extend sympy without
> >> having to reimplement half the core's classes.
> > 
> > I think the extensibility of sympy is super important - it is one of
> > the most attractive parts of it.  The quantum stuff we did would have
> > been very difficult in many other environments, but having a proper
> > type system would make it even easier and more robust.
> > 
> >> Another benefit of this is that we could implement a Variable class so
> >> that implementing a type would automatically allow us to create symbolic
> >> instantiations of it. Which brings me to my second point…
> 
> This all sounds good.  We should look at what other, more mature
>  computer algebra systems do with this.
> 
> It sounds to me like it might require a pretty significant rewrite of
>  the core.  I think that if we ever do decide to make such a change,
>  that we should try to make the core as modular as possible, so that we
>  can avoid future rewrites.

Yes, but I think it can be done incrementally. And the objective is
certainly to allow more things to be done without changing a single line
in the core.

> 
> >> 
> >> Symbols vs variables
> >> --------------------
> 
> >> 
> >> Currently in sympy, there is no distinction between the concepts of
> >> symbol and variable. Yet these are different. A symbol is just a sign -
> >> blobs of ink on the page - without any intrinsic meaning. A variable is
> >> basically a placeholder that can be replaced by any value in some range.
> >> The properties of the variable are the properties that are common to all
> >> the elements of the range. Most of what we do in sympy should use
> >> variables, symbols should only matter for parsing and printing.
> > 
> > So are you thinking that a variable is a symbol that has a definite type?
> 
> > 
> >> Symbol() objects are actually almost variables already. Besides the
> >> name, they only lack a more explicit range and shouldn't require a name.
> 
> I understand mathematically the difference between the two, but I don't
>  see from your explanation how you think they should be different in
>  SymPy.  
> 
> Anyway, the name "symbol" is pretty ingrained in SymPy by now, so if
>  you are suggesting a name change, I don't think that would be
>  possible.

Actually, Variables would be used much like Symbols (or Dummies) are
now, but they are completely different internally. They must have a
type, but the name is optional. Also, variables have a concept of
instantiation and the instance must belong to the range of the variable.
Using instantiation of variables instead of substitution of symbols
should give better guarantees on the correctness of calculations.
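
To make the idea of instantiation concrete, here is a small sketch. The
Variable class, its instantiate method and the use of a plain predicate as
a stand-in for the type are all assumptions for illustration; nothing like
this exists in sympy today.

```python
# Hypothetical sketch: "Variable" and "instantiate" are not sympy API.
import itertools


class Variable:
    """A placeholder ranging over the values accepted by `contains`."""

    _ids = itertools.count()

    def __init__(self, contains, name=None):
        self.contains = contains        # predicate describing the range
        self.name = name                # optional, only used for printing
        self._id = next(Variable._ids)  # identity does not depend on the name

    def instantiate(self, value):
        # Unlike a blind substitution, the value must lie in the range.
        if not self.contains(value):
            raise ValueError("%r is outside the range of %r" % (value, self))
        return value

    def __repr__(self):
        return self.name or "_var%d" % self._id


def is_integer(value):
    return isinstance(value, int) and not isinstance(value, bool)


n = Variable(is_integer, name="n")
anonymous = Variable(is_integer)
```

With this, n.instantiate(3) succeeds while n.instantiate(0.5) raises
ValueError, which is the kind of guarantee substitution can't give.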

> >> 
> >> Intermediate representation for printing
> >> ----------------------------------------
> >> 
> >> At the moment, we have many different printers (str, repr, pretty,
> >> latex, mathml, ...) and printing logic specific to an object needs to be
> >> implemented for each of them even though we want the results to look
> >> similar between most printers. It would be much easier if we could
> >> specify once how the object should be printed (e.g "put the contents of
> >> self.label between a vertical bar and an angled bracket") and let sympy
> >> take care of outputting actual LaTeX/MathML/pretty-printer stuff/...
> > 
> > The only issue with this is that in many cases the abstract notion of
> > how something is printed is very dependent on the features of the
> > printer.  For example, in the quantum stuff for printers that have
> > subscripts (pretty, latex, etc.) we do one thing, but for printers
> > with no good looking subscripts, something completely different.  Thus
> > it may be difficult to abstract this in a clean way.
> 
> Brian makes a good point.  Why do we need what would essentially be a
>  separate implementation of the core to do printing?  The representation
>  for an object can just be the object itself.  This wouldn't, as you
>  say, make it very easy to adapt to other systems, but I think it would
>  be the easiest to work with from within SymPy itself.

It wouldn't be a separate implementation of the core. The things that
need to be implemented are subscripts, superscripts, matching
delimiters, ... But perhaps this is only applicable to pretty-printers
(defined as targets with a 2-D output and a wide range of symbols
available). Anyway, the preferred representation of an object is an
intrinsic property that can't be inferred from the structure of the
expression tree. 
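
A rough sketch of what such a description could look like, with a ket
described once and rendered by two targets. The node classes (Text, Sub,
Delimited) and the two renderers are invented for this example and are
not part of sympy's printing module.

```python
# Hypothetical sketch: node and renderer names are invented for illustration.
from dataclasses import dataclass


@dataclass
class Text:
    value: str


@dataclass
class Sub:
    base: object
    index: object


@dataclass
class Delimited:
    left: str
    body: object
    right: str


LATEX_DELIMS = {"|": "|", "<": r"\langle", ">": r"\rangle"}


def to_latex(node):
    if isinstance(node, Text):
        return node.value
    if isinstance(node, Sub):
        return "%s_{%s}" % (to_latex(node.base), to_latex(node.index))
    if isinstance(node, Delimited):
        return r"\left%s %s \right%s" % (
            LATEX_DELIMS[node.left], to_latex(node.body), LATEX_DELIMS[node.right])
    raise TypeError(node)


def to_plain(node):
    if isinstance(node, Text):
        return node.value
    if isinstance(node, Sub):
        # No real subscripts available: fall back to an underscore.
        return "%s_%s" % (to_plain(node.base), to_plain(node.index))
    if isinstance(node, Delimited):
        return node.left + to_plain(node.body) + node.right
    raise TypeError(node)


# "put the label between a vertical bar and an angled bracket"
ket = Delimited("|", Sub(Text("psi"), Text("+")), ">")
```

The object only says "bar, subscripted label, angle bracket"; each target
decides what a subscript or a bracket looks like.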

Having this information in the object would remove a lot of code
duplication. Compare for instance LatexPrinter._print_Add,
MathMLPrinter._print_Add and PrettyPrinter._print_Add. These are 3
different implementations of the same problem: sorting the arguments in
a sensible order and converting 'x + -1' into 'x - 1'.
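
For instance, the shared part could be factored out roughly as follows.
This is a toy sketch on (coefficient, label) pairs, not on sympy
expressions; signed_terms and render_sum are made-up names, and the
ordering rule (symbolic terms first, numbers last) is just one possible
choice.

```python
# Hypothetical sketch: a shared helper, not existing sympy code.
def signed_terms(terms):
    """Turn (coefficient, label) pairs into (sign, magnitude, label) triples.

    Symbolic terms come first, pure numbers (label "") last.
    """
    ordered = sorted(terms, key=lambda t: (t[1] == "", t[1]))
    return [("-" if c < 0 else "+", abs(c), label) for c, label in ordered]


def render_sum(terms, minus="-"):
    """Join terms so that 'x + -1' comes out as 'x - 1'."""
    out = []
    for i, (sign, mag, label) in enumerate(signed_terms(terms)):
        # Drop a unit coefficient in front of a symbol: 1*x prints as x.
        body = label if (mag == 1 and label) else "%s%s" % (mag, label)
        if i == 0:
            out.append(body if sign == "+" else minus + body)
        else:
            out.append("%s %s" % (minus if sign == "-" else "+", body))
    return " ".join(out)
```

Each printer would then only supply its own way of drawing the minus sign
and the terms, instead of repeating the sorting and sign handling.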

Also, it would allow creating symbols named (in LaTeX) 'x_+' or
'\gamma^5' without dirty hacks.

> 
> > 
> >> So, the idea would be to have a system of objects giving a high-level
> >> description of printed mathematical expressions - this system should
> >> probably look like a straightforward translation of Content MathML into
> >> an object hierarchy. Objects should have a single method returning their
> >> representation in this system while the Printers have all the logic
> >> needed to create the print_* output. This would replace the _latex(),
> >> _mathml() methods as well as all the _print_Stuff() mess in printer
> >> classes and turn this (N object classes) * (M printers) problem into an
> >> (N+M) one.
> > 
> > Also, our notion of printers is much more general than mathematical
> > notation.  Some of our printers output code, which is *very* different
> > from mathml or latex as a base.
> 
> This is another thing.  For an arbitrary printer, you might need any
> amount of information contained in the original object.  Or maybe you
> will say that the code printer shouldn't really be implemented as a
> printer.

Indeed, I'm not sure that code generation should use the printer model.
It works for expressions, but there are difficulties when you need to
output statements.

> 
> Aaron Meurer
> 
> > 
> >> I don't know whether this representation should use sympy objects (=
> >> instances of Basic): avoiding the dependence on the rest of sympy could
> >> allow as a long-term goal to get other projects (matplotlib,
> >> ipython, ...) to use it, so that it becomes possible to exchange
> >> mathematical expressions with them without serialising to LaTeX or
> >> MathML.
> >> 
> >> 
> >> 
> >> I don't know whether any of this can realistically be implemented any
> >> time soon but I think these are important medium-term goals that would
> >> make sympy more of a generic framework.
> > 
> > I think the first two of these proposals are quite interesting.  The
> > third, possibly, but it is not clear how it would all work.
> > 
> > Cheers,
> > 
> > Brian
> > 
> >> 
> >> --
> >> You received this message because you are subscribed to the Google Groups 
> >> "sympy" group.
> >> To post to this group, send email to [email protected].
> >> To unsubscribe from this group, send email to 
> >> [email protected].
> >> For more options, visit this group at 
> >> http://groups.google.com/group/sympy?hl=en.
> >> 
> >> 
> > 
> > 
> > 
> > -- 
> > Brian E. Granger, Ph.D.
> > Assistant Professor of Physics
> > Cal Poly State University, San Luis Obispo
> 



