On Wed, Jul 20, 2011 at 4:32 AM, someone <[email protected]> wrote:
> Hi,
>
>> >> For those who've not read my latest blog, I'm working on
>> >> implementing a new structure for the matrices module of sympy.
>> >> I faced many decision problems while doing so, like what should be
>> >> named what and what should be where.
>> >>
>> >> Should matrix and linear algebra be separated?
>>
>> So, here you need to be clear on what you think of as "linear
>> algebra". I know what the difference is mathematically, but I am not
>> entirely clear that you are meaning the same thing.
>>
>> To me, linear algebra is a theoretical way of looking at things.  In
>> the finite dimensional case, linear operators are equivalent to
>> matrices.
>
> That's the concise mathematical point. A more practice-oriented way to look
> at this is to use "matrix" for all the basic stuff like storage; essentially,
> just take "matrix" as a data structure with very little additional
> functionality.
> And then call things like solve, inverse, the BLAS and LAPACK routines,
> etc. "linalg".
>
> This was more my point of view.
>
>
>> There are two main reasons why you would want to do "linear
>> algebra" without directly using matrices:
>
>> - Infinite dimensional vector spaces.  This is probably not really
>> much of a consideration in the CAS view, but to be sure, matrix
>> representations only work in finite dimensional spaces.
>
> It depends on what you want to do and how to get there. Often enough I
> wish that a CAS could work at a more abstract level than what's usual today.
> (Thus I'm very interested in the topic of "abstract" matrices w/o "data";
> this goes in that direction.) But I'm drifting too much off topic now.
>
>> - Looking at things with respect to matrices requires you to have a
>> basis.  This is where I think some stuff that does direct things with
>> linear operators without the user having to manually use matrices
>> would be useful.  For example, suppose we had a linear operator F(p) =
>> p' + 2*p on the set Q[x] of polynomials in x.  This set is infinite
>> dimensional, but we can restrict ourselves to only polynomials of a
>> certain degree (depending on the problem), so matrix representations
>> are useful.  But actually constructing the matrix of this operator is
>> not trivial.  First, you need a basis.  Polynomials have a standard
>> basis, but in some cases you might want to work in some other basis.
>> Second, you must rewrite an arbitrary input in terms of this basis.
>> Third, you must convert this into a Matrix.  It would be awesome if
>> you could just tell SymPy that this is your operator and that it
>> operates over the set, say P2, of polynomials of degree 2 or lower.
>
> And it would also be awesome if sympy could return the matrices given
> the operator (and another basis if a default one doesn't fit). :-)
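To make the construction above concrete, here is a minimal sketch that builds the matrix of F(p) = p' + 2*p on P2 with respect to the standard basis {1, x, x**2}. Polynomials are represented as plain coefficient lists, and all function names are illustrative, not an existing SymPy API:

```python
def diff_coeffs(c):
    # derivative of a polynomial given as coefficients c[k] of x**k
    return [k * c[k] for k in range(1, len(c))] + [0]

def F(c):
    # the linear operator F(p) = p' + 2*p, acting on coefficient lists
    d = diff_coeffs(c)
    return [d[k] + 2 * c[k] for k in range(len(c))]

def operator_matrix(op, dim):
    # column j of the matrix is the image of the j-th standard basis vector
    cols = [op([int(k == j) for k in range(dim)]) for j in range(dim)]
    return [[cols[j][i] for j in range(dim)] for i in range(dim)]

M = operator_matrix(F, 3)
# rows of M: [2, 1, 0], [0, 2, 2], [0, 0, 2]
```

The three manual steps from above are visible here: fixing a basis, expressing images in that basis, and assembling the columns into a matrix.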
>
>
>> >> What should be the public API?
>> >
>> > I see two "parts" of a public API. Some operations are more like
>> > member functions while others are more independent of a concrete
>> > matrix and should be implemented as free functions. (This is also
>> > where we go more into the direction of procedural code.)
>>
>> I also see two parts of the public API, but in a different sense.
>> There is the user-level stuff.  This is pretty straightforward, as we
>> already basically have something that works, and we just need to adapt
>> the new stuff to work transparently with it.
>
> Yes and no. I think we should take the opportunity and
> streamline the API where it's necessary.
>
> Just a small example, there are three methods for computing
> determinants. The "simple" entry point is "det", but then there
> is also "det_bareis" which uses another algorithm. Why does this
> method exist at user level if "det" has the keyword argument "method"?
> And there is even a third method "berkowitz_det" but why are the two
> parts in the name switched here? If we really need the specialized
> methods, then I as a user would expect "det_berkowitz" ...
> But I think that at user level, "det" with a reasonable default
> for "method" is just fine.
>
> This is only to show why I think we should not just copy the API over.

I see.  Then yes, in some places the API should be cleaned up.
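For illustration, a single user-level entry point with a method keyword could look like this sketch (the dispatch pattern is the point; the algorithm bodies are stubbed out and the private helper names are hypothetical):

```python
class Matrix:
    # algorithm bodies elided; the point is the single entry point
    def _det_bareis(self):
        return "bareis result"

    def _det_berkowitz(self):
        return "berkowitz result"

    def det(self, method="bareis"):
        # user-level API: one name, the algorithm chosen by keyword
        algorithms = {
            "bareis": self._det_bareis,
            "berkowitz": self._det_berkowitz,
        }
        if method not in algorithms:
            raise ValueError("unknown det method: %r" % method)
        return algorithms[method]()
```

With this shape, adding a new algorithm extends the dictionary without growing the user-visible namespace.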

>
>
>> There is also the API for someone who wants to extend the module, say
>> by writing a new matrix algorithm or representation.  This is much
>> less easy to get right.
>
> That's another important point, yes. I was talking about the user
> level API mainly.
>
>
>> > But this also depends a little on the question if the matrix M
>> > is mutable here. For mutables, the member functions are better,
>> > but for immutable objects maybe the independent functions are
>> > more natural.
>>
>> Mutability is an interesting point to raise here, because currently
>> the mutable Matrix class is not changed by M.T or M.invert().  So
>> actually the question arises as to exactly which things should be done
>> mutably.  Personally, I think something should be mutable only if it
>> saves time or resources.  If you have to basically reconstruct the
>> Matrix, the operation might as well be immutable, as that is what
>> people will expect anyway.
>
> It depends on whether you take more the mathematician's or the programmer's view.
> But you are right, most of the time matrices should be considered
> as immutable at the user level. However, this is not true at lower
> levels with most algorithms operating on matrices. And it is also
> not true on the numerics side where slice assignment is very common.
> (But we leave numerics to numpy anyway.)
>
> And setting a single entry (or a whole row/col) of a matrix should
> be possible. I imagine that this could be useful for studying and
> developing numerical methods.
>
> But this would go below your first point in any case.
>
>>  I think this will always hold for inverse; for transpose it
>> depends on the internal data type.  Actually, that is something
>> else to consider, that you cannot completely ignore the internal data
>> type, as some operations are inherently more efficient in one way for
>> one data type than in a completely different way for a different data
>> type.
>
> That's why we have different storage formats and algorithms.
> But we should hide this from the user as much as possible.
> Just expose "Matrix" and maybe "SparseMatrix" for special demands
> but not the different storage schemes for sparse matrices.
>
> By expose I mean what's visible to the user first. Of course
> you can always import more low-level stuff ...
>
> Independent of storage, all square matrices should, e.g., provide
> a way to compute the inverse, even if it's horribly slow.
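As an illustration of such a storage-agnostic fallback, here is a plain Gauss-Jordan inverse over exact rationals (a sketch on lists of lists; a real implementation would live behind the Matrix API and pick faster paths per storage format):

```python
from fractions import Fraction

def inverse(rows):
    # generic fallback: Gauss-Jordan on the augmented matrix [A | I], O(n**3)
    # (a singular matrix will make the pivot search fail; error handling omitted)
    n = len(rows)
    a = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(rows)]
    for col in range(n):
        # find a nonzero pivot and swap it into place
        pivot = next(r for r in range(col, n) if a[r][col] != 0)
        a[col], a[pivot] = a[pivot], a[col]
        p = a[col][col]
        a[col] = [x / p for x in a[col]]
        # eliminate the column from all other rows
        for r in range(n):
            if r != col and a[r][col] != 0:
                f = a[r][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    # the right half of the augmented matrix is now A**-1
    return [row[n:] for row in a]
```

Horribly slow for large symbolic matrices, but it works for any square input regardless of how the entries are stored.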
>
>>  The classic example of this is when you have an efficiency
>> gain by doing things mutably.  This is, e.g., why list.sort()
>> modifies the list in place.
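The standard library makes exactly this distinction between a copying and an in-place variant:

```python
data = [3, 1, 2]

new = sorted(data)   # immutable style: returns a new sorted list
# data is unchanged: [3, 1, 2]; new is [1, 2, 3]

data.sort()          # mutable style: sorts in place, no copy allocated
# data is now [1, 2, 3]
```

The in-place version exists purely because it saves time and memory on large lists, which matches the criterion above.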
>
>> > If we go to rectangular matrices, some of the above functions
>> > are not defined, so the question is how we want to handle this?
>> >
> And we also get "new" functions, e.g. singular values, pseudo-
> inverses and all those generalisations. These would then also be
>> > member functions and reduce to the "usual" functions if the matrix
>> > is square. (Or do we want to have also a method to compute the
>> > singular values of a square matrix?)
>> >
>> > To show a code snippet:
>> >
>> >        def singularvalues(self, ...):
>> >                if not self.is_square():
>> >                        return self._singularvalues(...)
>> >                else:
>> >                        # Two choices:
>> >                        return self.eigenvalues()
>> >                        # or
>> >                        raise NotImplementedError
>>
>> I would say raise NotImplementedError.  Explicit is definitely better
>> than implicit in this case.
>
> And not only in this case :-)
>
>> If singularvalues() returns eigenvalues and eigenvalues() returns singular
>> values, then it will be impossible for the user to tell which she is getting.
>
> No, depending on the matrix shape and the function name
> it would be unambiguous. But it's probably bad design anyway.
>
> Having a rectangular matrix throw a NIE (NotImplementedError)
> for .eigenvalues() is a bit awkward too, isn't it?
>
> And btw, I think NIE gives a wrong impression as it's not
> just not implemented but really mathematically undefined.

That's why we have errors like ShapeError and MatrixError.
A non-square matrix's .eigenvalues() should raise a ShapeError.
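A sketch of that convention (SymPy's matrices module does define a ShapeError deriving from ValueError; the self-contained stand-in below and the function shape are illustrative):

```python
class ShapeError(ValueError):
    """Wrong matrix shape for a mathematically defined operation."""

def eigenvalues(rows):
    # eigenvalues are only defined for square matrices, so a wrong
    # shape is a ShapeError, not a NotImplementedError
    if any(len(r) != len(rows) for r in rows):
        raise ShapeError("eigenvalues requires a square matrix, got "
                         "%d x %d" % (len(rows), len(rows[0])))
    ...  # actual computation elided
```

The error name then tells the user "mathematically undefined for this shape" rather than "not written yet".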

>
>
>> There is another side to this, which is that functions would pollute
>> the namespace.  Having functions in the namespace is not an inherently
>> bad thing.
>
> The question is *which* namespace. We could put this in sympy.linalg
> for example. So a user can import with "as" and avoid any clash.

A large number of SymPy users use the "from sympy import *" namespace
that you get from isympy.  So unless you just don't want to import
these things by default at all, you have a problem.

>
>>  The main problem is that these functions operate on Matrix
>> objects, whereas most functions in SymPy operate on Expr objects.  It
>> depends on the name of course, but something like *_solve() and
>> variants would be very confusing to the user against the normal SymPy
>> solve() function.  Of course, we could name things matrix_solve(), but
>> at that point, you might as well just make them methods of Matrix.
>
> This is where strong typing comes in handy ... and solve(eqn, vars) can
> coexist with solve(A, b).

I'm not sure if that is a good idea.
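For what it's worth, type-based dispatch of a single solve() name could look like this sketch (using functools.singledispatch; a plain list stands in for a Matrix type, and the return values are just markers):

```python
from functools import singledispatch

@singledispatch
def solve(eqn, *symbols):
    # default path: treat the first argument as an equation/expression
    return ("expression solver", eqn)

@solve.register(list)        # a Matrix class would register itself here
def _(A, b):
    # linear-system path: solve A*x = b
    return ("matrix solver", A, b)
```

Whether overloading one name this way helps or confuses users is exactly the open question here.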

>
>>  So I actually vote to have almost all if not all matrix operations as
>> methods of Matrix.
>>
>> This might get hairy if there is some operation that has more than two
>> operands, in which case just having a function might make sense.
>
> So this should guide the API design, namely what gives more natural
> notation while working with these objects and functions.
>
> Also, member functions bring an asymmetry to the expression. Members
> are good for "unary" operations like trace, det, inv, etc. And maybe
> for "binary" operations with (very) different operands: A.solve(b)
> is not so nice but also not too bad. A.matrixmult(B) is worse because
> of this asymmetry. And for n-ary operations member functions will
> give a mess, I think. (And maybe difficulties in the implementation
> of some properties.)
>
>
>> > Another important topic is iteration. I think that matrices really
>> > should support iteration. And there are at least three cases which
>> > are useful:
>> >
>> > for row in matrix.iterrows():
>> >        ...
>> >
>> > for col in matrix.itercols():
>> >        ...
>> >
>> > for cell in matrix.iterentries():
>> >        ...
>>
>> Should Matrix.__iter__ just iterate over the entries, so "for cell in
>> matrix" is equivalent to "for cell in matrix.iterentries()"?
>
> Oh, I just used this iter*() naming in a dict-style way to make clear
> what I'm talking about. The question of which of these gets the
> "native" __iter__ is still open. I'm not sure whether iterating over
> columns as vectors is not more important than iterating over single
> entries. On the other hand, the most basic iteration in collections
> is always over single units ...

Well, I think having iterentries(), etc. would be useful,
especially if the data structure doesn't provide instant access to one
type (e.g., either columns or rows will have to be constructed in a
dense representation).  But, yes, clearly __iter__ should iterate over
all elements.
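A minimal sketch of the three iterators on a dense row-major representation (the names follow the proposal above; the class itself is illustrative):

```python
class DenseMatrix:
    # dense row-major storage: a list of row lists
    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def iterrows(self):
        # cheap here: rows are stored directly
        return iter(self.rows)

    def itercols(self):
        # columns must be assembled from the rows
        return iter(list(col) for col in zip(*self.rows))

    def iterentries(self):
        for row in self.rows:
            for entry in row:
                yield entry

    # __iter__ goes over single entries, like other collections
    __iter__ = iterentries
```

For a column-major or sparse representation the cheap/expensive roles of iterrows() and itercols() would swap, which is the argument for keeping all three names explicit.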

Aaron Meurer

>
> And the answer is probably: explicit is better than implicit ...
>
>> > Or there is an efficient (maybe lazy) transpose operation,
>> > and then if, e.g., iterrows() is efficient, itercols() replaced
>> > by M.T.iterrows() is somewhat efficient too? I don't know ...
>>
>> I think this depends on the internal representation of the Matrix.
>
> Of course.
>
>
> -- Raoul
>
