Re: [sympy] Re: GSoC: Sparse Matrix Implementation

Andy Ray Terrel Tue, 29 Mar 2011 11:44:33 -0700

On Tue, Mar 29, 2011 at 7:22 PM, Aaron S. Meurer <[email protected]> wrote:
>
> On Mar 29, 2011, at 12:51 AM, Andy Ray Terrel wrote:
>
>> On Tue, Mar 29, 2011 at 9:13 AM, Tim Lahey <[email protected]> wrote:
>>> On 03-29-2011, at 1:44 AM, Andy Ray Terrel wrote:
>>>
>>>> There are some major confusions on this list about the difference
>>>> between algorithms and storage.
>>>
>>> I'm not confused and I'm not sure if other people are. I just think storage 
>>> alone isn't suitable for a GSoC project. Certainly one could just implement 
>>> storage techniques behind the scenes and no algorithms, but that would only 
>>> save space, not computation time.
>>
>> On modern hardware, computation is free memory is expensive.
>>
>>>
>>>> Yes, it is typical to use iterative
>>>> algorithms with sparse matrices and factorizations with dense, but
>>>> this is because sparse matrices are typically very large (1M -- 10^12
>>>> entries with 1000 -- 2B non-zeros).  One will switch to things that
>>>> don't fill in the matrix but SymPy will probably never be at this
>>>> level.
>>>
>>> With symbolic matrices, the computation time for each calculation is much 
>>> higher than with numerical matrices so the benefit of sparse matrix 
>>> techniques occur at much smaller matrix sizes.
>>>
>>>>  One can implement some Krylov subspace iteration algorithms,
>>>> but its somewhat pointless (how do you evaluate a residual in
>>>> symbols).  Another algorithm ILU versus LU is also silly (basically it
>>>> throws away small fill in entries but we have no way of saying what is
>>>> small in symbolics).  For this reason I suggest just implementing
>>>> standard factorizations and dealing with the fill in as it happens if
>>>> we get to the point that we need to do UMFPACK style direct
>>>> factorizations we can do that the next summer.  We practitioners only
>>>> use "algorithms suited for sparse matrices" because of the size of the
>>>> matrices which will not be an issue in SymPy.
>>>
>>> I've already stated that storage isn't suitable for a GSoC project.
>>
>> I respectfully disagree.  Furthermore I find it distasteful for one
>> applicant to tell another that their idea is not suitable.
>
> Let's remain civil.  I think it is important to decide what is needed and 
> what is less needed, or what will be better than the current (dense) 
> implementation and what will actually end up being worse.


Please give me your guide of civility so I can rip out every page and
use it as toilet paper.

>
> Perhaps the only way we will really know for sure what sparse methods will 
> work best is to implement a bunch (in a modular way) and see what performs 
> the best.  By in a modular way, I mean they should be wrapped around classes 
> in such a way that choosing a different sparse method should be trivial and 
> the algorithms should be unaware to what is being used (except for the cases 
> where it matters because one algorithm will be more efficient for one data 
> type).
>
>>
>>>
>>> It's very true that residual based methods will have problems, so will 
>>> others. That's what I've been saying. The reasons the different methods 
>>> will have problems will vary.
>>>
>>> How can you say that size won't be an issue? I've seen very large symbolic 
>>> sparse matrices in CAS before. In fact, a friend of mine who was working on 
>>> a project built on top of Maple wrote his own sparse matrix algorithms 
>>> because Maple only handled dense matrices and bogged down at about 7x7 
>>> matrices, which isn't uncommon. I've personally wanted matrices with 10^4 
>>> -- 10^6 order entries with symbolic entries. Certainly not as large as a 
>>> lot of some of the numerical sparse matrices, but still a reasonable size 
>>> when each entry has several terms in it, rather than just a single number.
>>
>> I would have to see the problem to believe you.  In a field where a
>> single equation can "bog down", I find it impossible to separate such
>> concerns.  You wishes are the future GSOC is about today, I'm more
>> interested in projects that will be successful over the summer.
>
> Well, I admit that I am no expert in the area (I have only taken that 
> standard theoretical undergraduate courses in linear algebra for a math 
> major).  However, I have personally see the current (dense) module be bogged 
> down solving a sparse system.  Take a look at 
> http://code.google.com/p/sympy/issues/detail?id=1441.  When computing 
> integrate((sin(1/x) - x*exp(x))/((-sin(1/x) + x*exp(x))*x + x*sin(1/x))), x), 
> the heurisch algorithm sets up and tries to solve a very sparse system of 
> about 600 equations in about 450 variables with rational coefficients (very 
> sparse meaning quite a few of the equations are nothing more than something 
> like 5*x23 = 0).   (hopefully I remembered that all correctly)
>
> For this reason, the above integral will hang indefinitely and never return 
> anything, even if you leave it running for a very long time.  If you are 
> interested, put a debug break point or a print statement at line 364 of 
> risch.py (it will take 10-30 min. to reach that point because there is also 
> another slow line in the file related to expansion).
>
> So while I don't know for sure, it really seems to me when I look at those 
> equations that a good sparse solver should be able to return very fast with 
> them (by the way, this particular system is almost certainly inconsistent).
>
> By the way, this reminds me, the one thing that has not been mentioned here 
> by anyone (except for Sherjil because I told him about it already) is that 
> the matrices need to be rewritten to use the polys ground types, so that we 
> can use gmpy in the matrices just like we use it in the polys.  Ideally, we 
> could also use it for matrices with polynomial entries too (though that might 
> be cleaner once we have a Frac() class, I don't know).
>
> Aaron Meurer
>
>
>>
>>>
>>>>
>>>> It would be good if you listed what algorithms are "better" for sparse
>>>> matrices and we go from there.  It is a weak project to just do the
>>>> storage by itself, and unless there are many people who will use
>>>> sparse matrices, it's not particularly relevant.  For example I just
>>>> use scipy or petsc4py for sparse things in python.  With that said,
>>>> some day SymPy will be much faster than it is today (or at least I
>>>> hope) and such structures may be commonplace if they are there.
>>>>
>>>
>>> I'm mainly counselling being careful about the algorithms. I've seen work 
>>> done on this area, but I don't have the references handy, since I don't 
>>> actually work on this for my research. There can be clever ways of doing 
>>> standard dense linear algebra for sparse matrices. While I'd personally use 
>>> scipy for sparse matrices now, I'd prefer to do more symbolically. Because 
>>> the of the nature of my sparse matrices, if I had them symbolically, I'd be 
>>> able to simplify things before I converted to numerical sparse matrices.
>>
>> My approach, which is managable today, is to do symbolics at a very
>> coarse level, then push to numerics at a fine level via Full
>> Approximation Schemes.
>>
>> -- Andy
>>
>>>
>>>> The project becomes much more interesting if you take Ronan's point of
>>>> view of redesigning the Matrix class or perhaps implementing more
>>>> matrix factorizations (LU, QR, Cholesky, diagonalization, eigenvalues:
>>>> L\Lambda R should be achievable over the summer).
>>>
>>> I think implementing matrix factorizations are a good idea for somebody and 
>>> but I think sparse matrices (both algorithms and storage) are a good idea 
>>> too. Certainly factorization and eigenvalues would be of use with symbolic 
>>> sparse matrices as well.
>>>
>>> Cheers,
>>>
>>> Tim.
>>>
>>> ---
>>> Tim Lahey
>>> PhD Candidate, Systems Design Engineering
>>> University of Waterloo
>>> http://about.me/tjlahey
>>>
>>>
>>>
>>> --
>>> You received this message because you are subscribed to the Google Groups 
>>> "sympy" group.
>>> To post to this group, send email to [email protected].
>>> To unsubscribe from this group, send email to 
>>> [email protected].
>>> For more options, visit this group at 
>>> http://groups.google.com/group/sympy?hl=en.
>>>
>>>
>>
>> --
>> You received this message because you are subscribed to the Google Groups 
>> "sympy" group.
>> To post to this group, send email to [email protected].
>> To unsubscribe from this group, send email to 
>> [email protected].
>> For more options, visit this group at 
>> http://groups.google.com/group/sympy?hl=en.
>>
>
> --
> You received this message because you are subscribed to the Google Groups 
> "sympy" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to 
> [email protected].
> For more options, visit this group at 
> http://groups.google.com/group/sympy?hl=en.
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"sympy" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/sympy?hl=en.

Re: [sympy] Re: GSoC: Sparse Matrix Implementation

Reply via email to