Re: [sympy] A few ideas

Aaron S. Meurer Fri, 18 Feb 2011 11:39:52 -0800

On Feb 17, 2011, at 8:27 PM, Ronan Lamy wrote:

> On Wednesday, 16 February 2011 at 13:05 -0700, Aaron S. Meurer wrote:
>> There are some good ideas here.  See my comments below.  I have
>> responded to both Brian and Ronan.
>> 
>> On Feb 15, 2011, at 9:17 PM, Brian Granger wrote:
>> 
>>> Ronan,
>>> 
>>>> Considering the merge of the quantum branch and some of the recent
>>>> discussions, I have a few ideas to make sympy more modular and to make
>>>> it easier to extend it.
>>> 
>>> Yes, these issues came up for us a lot in the writing of the quantum
>>> stuff.  For example, we currently make the implicit (and possibly
>>> false) assumption that commutative=False means that something is an
>>> Operator.  But, of course, you can easily have non-commutative things
>>> other than Operators.
>>> 
>>>> Mathematical types
>>>> ------------------
>>>> 
>>>> While the concept of type doesn't usually appear explicitly in
>>>> elementary presentations of mathematics, it's very important for
>>>> mathematical intuition, allowing one, for instance, to recognize
>>>> immediately that it's absurd to add a scalar and a vector.
>>> 
>>> Yes, absolutely.  Having a proper mathematical type system would allow
>>> us to easily do validation of various operations.  We definitely need
>>> this in quantum, where, for example, we can do lots of mathematically
>>> insane things like 2 + Ket('psi').
>>> 
>>>> So, every sympy object should have a 'type' attribute (NB: this is
>>>> independent from the class of the object returned by 'type(obj)'). These
>>>> types should be sympy objects themselves, forming a hierarchy - a
>>>> directed acyclic graph - similar to a class hierarchy, including
>>>> equivalents of isinstance and issubclass. Also, each type defines a
>>>> Python-level interface (implemented as a mixin or abstract base class)
>>>> which objects belonging to that type must implement (e.g. by subclassing
>>>> the interface). This allows the decoupling of the mathematical
>>>> properties from the implementation and, provided that algorithms use
>>>> only the interface instead of details of the actual class, should allow
>>>> different implementations of the same type to coexist without troubles.
>>> 
>>> I think the basics of this idea are great.  Some questions though...
>>> 
>>> * For mathematical objects, it seems like there are two broad types of
>>> things: spaces and things that live in those spaces (elements).  There
>>> are vectors and vector spaces, Kets and Hilbert spaces, the ring of
>>> integers, etc.  Do we also need to think about objects that encode the
>>> spaces as well as the elements they contain?
> 
> Yes, objects representing spaces are important, but this kind of thing
> is still underdeveloped in sympy, so I haven't thought about it much.
> Quantum Hilbert spaces are brand new and though polynomials have Domain
> objects, they're mostly meant for internal use. 
> 
>> I think having objects for all the sets (or sets with structures) is
>> another common thing that other CASs do.  Again, we should look at how
>> the better, more mature ones out there do it.
> 
> Yes, sets and structures are important as well. We need more of them,
> independently of this proposal.
> 
>> Also, this makes me realize that this is related to assumptions.  How
>> would you recommend having the assumptions with respect to this
>> system?
> 
> Assumptions need to be aware of types so that checking that something
> with integer type is an integer quickly resolves to True, but I don't
> think major changes are required.
> 
>>> * Coercion.  Sage has thought through this type of thing in
>>> considerable depth and one concept they have is coercion.  I don't
>>> quite understand the difference (in Sage) between casting and
>>> coercion, but the general idea is what you do when you attempt an
>>> operation between two different types - how do you coerce one type
>>> into another so that the operation can happen.
> 
> I've looked into Sage's coercion model (starting from William Stein's
> blog post:
> http://sagemath.blogspot.com/2010/11/brief-history-and-motivation-behind.html)
>  and while the Types I'm defining here and Sage's Parents are quite similar, 
> there's a major difference in that Sage conflates types and algebraic 
> structures. Therefore, in Sage ZZ(1) and QQ(1) aren't the same object, but in 
> sympy there's a single Integer(1) object which can be considered as an 
> element of the ring of integers or of the field of rationals. Also, we don't 
> need to coerce between different types to allow the operation to happen but 
> we do need to assign the correct type to the result (and possibly to take 
> some action based on the type).


So which way is better in your opinion? 
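
For comparison, a minimal sketch of the "assign the correct type to the
result" approach described in the quoted text. Everything here is
hypothetical: MathType, result_type and the sample types are illustrative
names and not part of sympy. The operands are never converted; only the
type of the sum is computed from the type hierarchy.

```python
# Hypothetical names throughout: MathType, result_type and the sample
# types below are illustrations, not part of sympy.

class MathType:
    """A mathematical type; the parents form a directed acyclic graph."""

    def __init__(self, name, parents=()):
        self.name = name
        self.parents = tuple(parents)

    def issubtype(self, other):
        # A type is a subtype of itself and of anything reachable via parents.
        return self is other or any(p.issubtype(other) for p in self.parents)

    def __repr__(self):
        return self.name


RealType = MathType("Real")
RationalType = MathType("Rational", [RealType])
IntegerType = MathType("Integer", [RationalType])
KetType = MathType("Ket")


def result_type(a, b):
    """Type of a + b: the more general operand type, with no coercion."""
    if a.issubtype(b):
        return b
    if b.issubtype(a):
        return a
    raise TypeError(f"cannot add {a} and {b}")


print(result_type(IntegerType, RealType))   # Real
print(result_type(KetType, KetType))        # Ket
try:
    result_type(KetType, RealType)
except TypeError as exc:
    print("TypeError:", exc)                # TypeError: cannot add Ket and Real
```

This sketch only handles operand types where one is a subtype of the
other; a fuller version would presumably look for a common supertype
before giving up.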

> 
>> Isn't coercion implicit while casting is explicit?  At least that's
>> what the names imply to me.  Anyway, if it is, we might remember that
>> "explicit is better than implicit."
> 
> Yes, that's one more reason to avoid coercion as much as possible. It
> should mostly be limited to Python builtins and interconversion between
> different implementations of the same type. 
>> 
>>> * Are the interfaces you envision on the type objects or the objects
>>> that have the type attributes?  Does that make sense?
> 
> I'm not sure I understand your question. Concretely, we should have:
> >>> n = Integer(12)
> >>> n.type
> IntegerType
> >>> IntegerType.interface
> IntegerInterface
> >>> isinstance(n, IntegerInterface)
> True
> >>> n.type in RationalType # or n.type.issubtype(RationalType)
> True
> >>> isinstance(n, RationalInterface)
> True
> 
> where IntegerType and RationalType are instances of Basic, and
> IntegerInterface and RationalInterface are mixin classes and possibly
> ABCs. 
> 
>>>> Now, what prevents these types from simply being implemented as ABCs is
>>>> that we need parametric types (like Vector(Real, 3)) and that we want
>>>> operation objects like Add to have different types depending on their
>>>> arguments so that Int + Real -> Real, Ket + Ket -> Ket and Ket + Real ->
>>>> TypeError(?). Implementing these objects with covariant types is, of
>>>> course, the tricky bit, but it would allow us to extend sympy without
>>>> having to reimplement half the core's classes.
>>> 
>>> I think the extensibility of sympy is super important - it is one of
>>> the most attractive parts of it.  The quantum stuff we did would have
>>> been very difficult in many other environments, but having a proper
>>> type system would make it even easier and more robust.
>>> 
>>>> Another benefit of this is that we could implement a Variable class so
>>>> that implementing a type would automatically allow us to create symbolic
>>>> instantiations of it. Which brings me to my second point…
>> 
>> This all sounds good.  We should look at what other, more mature
>> computer algebra systems do with this.
>> 
>> It sounds to me like it might require a pretty significant rewrite of
>> the core.  I think that if we ever do decide to make such a change,
>> we should try to make the core as modular as possible, so that we
>> can avoid future rewrites.
> 
> Yes, but I think it can be done incrementally. And the objective is
> certainly to allow more things to be done without changing a single line
> in the core.
> 
>> 
>>>> 
>>>> Symbols vs variables
>>>> --------------------
>> 
>>>> 
>>>> Currently in sympy, there is no distinction between the concepts of
>>>> symbol and variable. Yet these are different. A symbol is just a sign -
>>>> blobs of ink on the page - without any intrinsic meaning. A variable is
>>>> basically a placeholder that can be replaced by any value in some range.
>>>> The properties of the variable are the properties that are common to all
>>>> the elements of the range. Most of what we do in sympy should use
>>>> variables, symbols should only matter for parsing and printing.
>>> 
>>> So are you thinking that a variable is a symbol that has a definite type?
>> 
>>> 
>>>> Symbol() objects are actually almost variables already. Besides the
>>>> name, they only lack a more explicit range and shouldn't require a name.
>> 
>> I understand mathematically the difference between the two, but I don't
>> see from your explanation how you think they should be different in
>> SymPy.  
>> 
>> Anyway, the name "symbol" is pretty ingrained in SymPy by now, so if
>> you are suggesting a name change, I don't think that would be
>> possible.
> 
> Actually, Variables would be used much like Symbols (or Dummies) are
> now, but they are completely different internally. They must have a
> type, but the name is optional. Also, variables have a concept of
> instantiation and the instance must belong to the range of the variable.
> Using instantiation of variables instead of substitution of symbols
> should give better guarantees on the correctness of calculations.

I still don't quite understand what you are saying here.  Can you give some 
kind of example?
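
One possible reading of the quoted proposal, sketched in plain Python.
This is an assumption about what is meant, not Ronan's design: Variable,
IntegerRange and instantiate() are hypothetical names that do not exist in
sympy. The point it tries to capture is that a variable carries a
mandatory range and an optional name, and that instantiation is checked
against the range.

```python
# Hypothetical sketch: Variable, IntegerRange and instantiate() are
# illustrative names, not existing sympy API.
import itertools


class IntegerRange:
    """Stand-in for a type/range: the set of Python ints."""

    def __contains__(self, value):
        return isinstance(value, int) and not isinstance(value, bool)


class Variable:
    """A placeholder with a mandatory range and an optional name."""

    _counter = itertools.count()

    def __init__(self, range_, name=None):
        self.range = range_
        # The name only matters for printing; generate one if it is absent.
        self.name = name if name is not None else f"_v{next(self._counter)}"

    def instantiate(self, value):
        # Unlike plain substitution, the value must belong to the range.
        if value not in self.range:
            raise ValueError(f"{value!r} is outside the range of {self.name}")
        return value


n = Variable(IntegerRange(), name="n")
print(n.instantiate(3))        # 3
try:
    n.instantiate(0.5)
except ValueError as exc:
    print("rejected:", exc)    # rejected: 0.5 is outside the range of n
```

The membership check in instantiate() is where the "better guarantees on
the correctness of calculations" would come from in this reading.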

> 
>>>> 
>>>> Intermediate representation for printing
>>>> ----------------------------------------
>>>> 
>>>> At the moment, we have many different printers (str, repr, pretty,
>>>> latex, mathml, ...) and printing logic specific to an object needs to be
>>>> implemented for each of them even though we want the results to look
>>>> similar between most printers. It would be much easier if we could
>>>> specify once how the object should be printed (e.g "put the contents of
>>>> self.label between a vertical bar and an angled bracket") and let sympy
>>>> take care of outputting actual LaTeX/MathML/pretty-printer stuff/...
>>> 
>>> The only issue with this is that in many cases the abstract notion of
>>> how something is printed is very dependent on the features of the
>>> printer.  For example, in the quantum stuff for printers that have
>>> subscripts (pretty, latex, etc.) we do one thing, but for printers
>>> with no good-looking subscripts, something completely different.  Thus
>>> it may be difficult to abstract this in a clean way.
>> 
>> Brian makes a good point.  Why do we need what would essentially be a
>> separate implementation of the core to do printing?  The representation
>> for an object can just be the object itself.  This wouldn't, as you
>> say, make it very easy to adapt to other systems, but I think it would
>> be the easiest to work with from within SymPy itself.
> 
> It wouldn't be a separate implementation of the core. The things that
> need to be implemented are subscripts, superscripts, matching
> delimiters, ... But perhaps this is only applicable to pretty-printers
> (defined as targets with a 2-D output and a wide range of symbols
> available). Anyway, the preferred representation of an object is an
> intrinsic property that can't be inferred from the structure of the
> expression tree. 
> 
> Having this information in the object would remove a lot of code
> duplication. Compare for instance LatexPrinter._print_Add,
> MathMLPrinter._print_Add and PrettyPrinter._print_Add. These are 3
> different implementations of the same problem: sorting the arguments in
> a sensible order and converting 'x + -1' into 'x - 1'.
> 
> Also, it would allow us to create symbols named (in LaTeX) 'x_+' or
> '\gamma^5' without dirty hacks.

OK, that makes sense then.
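
A minimal sketch of that idea, under stated assumptions: the node classes
(Text, Sub, Fenced), the Ket.layout() method and the two renderers are
hypothetical names, not existing sympy classes or printers. The object
describes its layout once, and each output target only needs to know how
to render the small set of layout nodes.

```python
# Hypothetical sketch: Text, Sub, Fenced, Ket, to_latex and to_plain are
# illustrative names, not existing sympy classes or functions.
from dataclasses import dataclass


@dataclass
class Text:
    value: str


@dataclass
class Sub:
    base: object
    index: object


@dataclass
class Fenced:
    left: str
    body: object
    right: str


class Ket:
    def __init__(self, label):
        self.label = label

    def layout(self):
        # Stated once: the label between a vertical bar and an angle bracket.
        return Fenced("|", Text(self.label), ">")


# Delimiter names as rendered by the LaTeX target.
LATEX_DELIMS = {"|": r"\left|", ">": r"\right\rangle"}


def to_latex(node):
    if isinstance(node, Text):
        return node.value  # no Greek-name translation in this sketch
    if isinstance(node, Sub):
        return f"{to_latex(node.base)}_{{{to_latex(node.index)}}}"
    if isinstance(node, Fenced):
        left, right = LATEX_DELIMS[node.left], LATEX_DELIMS[node.right]
        return f"{left} {to_latex(node.body)} {right}"
    raise TypeError(node)


def to_plain(node):
    if isinstance(node, Text):
        return node.value
    if isinstance(node, Sub):
        return f"{to_plain(node.base)}_{to_plain(node.index)}"
    if isinstance(node, Fenced):
        return f"{node.left}{to_plain(node.body)}{node.right}"
    raise TypeError(node)


ket = Ket("psi").layout()
print(to_latex(ket))                          # \left| psi \right\rangle
print(to_plain(ket))                          # |psi>
print(to_latex(Sub(Text("x"), Text("+"))))    # x_{+}
```

Adding a new object class means writing one layout() method, and adding a
new target means writing one renderer, which is the (N+M) shape described
in the quoted proposal.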

> 
>> 
>>> 
>>>> So, the idea would be to have a system of objects giving a high-level
>>>> description of printed mathematical expressions - this system should
>>>> probably look like a straightforward translation of Content MathML into
>>>> an object hierarchy. Objects should have a single method returning their
>>>> representation in this system while the Printers have all the logic
>>>> needed to create the print_* output. This would replace the _latex(),
>>>> _mathml() methods as well as all the _print_Stuff() mess in printer
>>>> classes and turn this (N object classes) * (M printers) problem into an
>>>> (N+M) one.
>>> 
>>> Also, our notion of printers is much more general than mathematical
>>> notation.  Some of our printers output code, which is *very* different
>>> from mathml or latex as a base.
>> 
>> This is another thing.  For an arbitrary printer, you might need any
>> amount of information contained in the original object.  Or maybe you
>> will say that the code printer shouldn't really be implemented as a
>> printer.
> 
> Indeed, I'm not sure that code generation should use the printer model.
> It works for expressions, but there are difficulties when you need to
> output statements.

Maybe there should be a separate class (subclassing from Printer) for 2-D 
output, which uses the model above, and the other "printers" like this one 
would just be regular Printers.

Aaron Meurer

> 
>> 
>> Aaron Meurer
>> 
>>> 
>>>> I don't know whether this representation should use sympy objects (=
>>>> instances of Basic): avoiding the dependence on the rest of sympy could
>>>> allow as a long-term goal to get other projects (matplotlib,
>>>> ipython, ...) to use it, so that it becomes possible to exchange
>>>> mathematical expressions with them without serialising to LaTeX or
>>>> MathML.
>>>> 
>>>> 
>>>> 
>>>> I don't know whether any of this can realistically be implemented any
>>>> time soon but I think these are important medium-term goals that would
>>>> make sympy more of a generic framework.
>>> 
>>> I think the first two of these proposals are quite interesting.  The
>>> third, possibly, but it is not clear how it would all work.
>>> 
>>> Cheers,
>>> 
>>> Brian
>>> 
>>>> 
>>>> --
>>>> You received this message because you are subscribed to the Google Groups 
>>>> "sympy" group.
>>>> To post to this group, send email to [email protected].
>>>> To unsubscribe from this group, send email to 
>>>> [email protected].
>>>> For more options, visit this group at 
>>>> http://groups.google.com/group/sympy?hl=en.
>>>> 
>>>> 
>>> 
>>> 
>>> 
>>> -- 
>>> Brian E. Granger, Ph.D.
>>> Assistant Professor of Physics
>>> Cal Poly State University, San Luis Obispo
>>> [email protected]
>>> [email protected]
>> 
> 
> 
> 

