On 02/23/2012 09:47 AM, Dag Sverre Seljebotn wrote:
> On 02/23/2012 05:50 AM, Jaakko Luttinen wrote:
>> Hi!
>>
>> I was wondering whether it would be easy/possible/reasonable to have
>> classes for arrays that have special structure, in order to use less
>> memory and speed up some computations?
>>
>> For instance:
>> - a symmetric matrix could be stored in almost half the memory required
>>   by a non-symmetric matrix
>> - a diagonal matrix only needs to store the diagonal vector
>> - a Toeplitz matrix only needs to store one or two vectors
>> - a sparse matrix only needs to store the non-zero elements (some
>>   implementations exist in scipy.sparse)
>> - and so on
>>
>> If such classes were implemented, it would be nice if they worked with
>> numpy functions (dot, diag, ...) and operations (+, *, +=, ...) easily.
>>
>> I believe this has been discussed before, but Google didn't help a lot..
>
> I'm currently working on a library for this. The catch is that I'm
> doing it as a work project, not a hobby project -- so only the features
> I strictly need for my PhD thesis really get priority. That means that
> it will only really be developed for use on clusters/MPI, not so much
> for single-node LAPACK.
>
> I'd love to pair up with someone who could make sure the library is more
> generally useful, which is my real goal (if I ever get spare time
> again...).
>
> The general idea of my approach is to have lazily evaluated expressions:
>
> A = # ... diagonal matrix
> B = # ... dense matrix
>
> L = (give(A) + give(B)).cholesky()  # only "symbolic"!
> # give means: overwrite if you want to
>
> explain(L)      # prints what it will do if it computes L
> L = compute(L)  # does the computation
>
> What the code above would do is:
>
> - First, determine that the fastest way of doing + is to take the
>   elements in A and += them in place into the diagonal of B
> - Then, do the Cholesky in
Sorry: Then, do the Cholesky inplace in the buffer of B, and use that
for L.

Dag

> Note that if you change the types of the matrices, the optimal sequence
> of operations changes as well. The goal is to facilitate writing
> general code which doesn't know the types of the matrices, yet still
> strings together the optimal chain of calls. This requires waiting with
> evaluation until an explicit compute call (which essentially does a
> "compilation").
>
> Adding matrix types and operations is done through pattern matching.
> One can provide code like this to supply optimized routines for weird
> special cases:
>
> @computation(RowMajorDense + ColMajorDense, RowMajorDense)
> def add(a, b):
>     # provide an optimized case for row-major + col-major, resulting
>     # in row-major
>
> @cost(add)
> def add_cost(a, b):
>     # provide an estimate of the cost of the above routine
>
> The compiler looks at all the provided @computation routines and
> determines the cheapest path.
>
> My code is at https://github.com/dagss/oomatrix, but I certainly haven't
> done anything yet to make the codebase useful to anyone but me, so you
> probably shouldn't look at it, but rather ask me here.
>
> Dag

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
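For readers new to the lazy-evaluation idea Dag describes, here is a toy
sketch in plain Python. This is not oomatrix's actual API -- the names
Expr, Leaf, Add, and compute() are hypothetical -- it only illustrates
how "A + B" can build a symbolic tree that a compiler could later
inspect before any arithmetic happens:

```python
import numpy as np

class Expr:
    """A node in a lazily evaluated matrix expression (toy sketch).

    Operators build a tree instead of computing; nothing happens
    until .compute() is called, so a 'compiler' could first inspect
    the whole tree and pick the cheapest evaluation order.
    """
    def __add__(self, other):
        return Add(self, other)

class Leaf(Expr):
    """Wraps a concrete array."""
    def __init__(self, array):
        self.array = array
    def compute(self):
        return self.array

class Add(Expr):
    """Symbolic sum of two sub-expressions."""
    def __init__(self, left, right):
        self.left, self.right = left, right
    def compute(self):
        # A real compiler would dispatch on operand types here, e.g.
        # += a diagonal in place into a dense buffer instead of this
        # generic elementwise add.
        return self.left.compute() + self.right.compute()

A = Leaf(np.diag([1.0, 2.0, 3.0]))  # stands in for a diagonal matrix
B = Leaf(np.ones((3, 3)))           # dense matrix
L = A + B                           # purely symbolic -- no arithmetic yet
result = L.compute()                # now the work actually happens
```

The point of the indirection is exactly what the post describes: because
evaluation is deferred to an explicit compute step, the library is free
to reorder or fuse operations based on the operand types it sees.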
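The @computation/@cost pattern-matching idea can also be sketched
concretely. Again, this is an illustration, not oomatrix's real API:
the names computation, dispatch, cost_fn, and the kind strings
('diagonal', 'dense') are all made up here. The sketch registers several
implementations per operand-kind pattern, each with a cost model, and
has the dispatcher pick the cheapest:

```python
import numpy as np

# (operation name, operand kinds) -> list of (implementation, cost model)
_registry = {}

def computation(op, kinds, cost_fn):
    """Register an implementation for an operand-kind pattern,
    together with a function estimating its cost."""
    def decorator(fn):
        _registry.setdefault((op, kinds), []).append((fn, cost_fn))
        return fn
    return decorator

def dispatch(op, kinds, *operands):
    """Evaluate op by picking the cheapest registered implementation."""
    candidates = _registry.get((op, kinds), [])
    if not candidates:
        raise NotImplementedError((op, kinds))
    impl, _ = min(candidates, key=lambda pair: pair[1](*operands))
    return impl(*operands)

@computation('add', ('diagonal', 'dense'),
             cost_fn=lambda d, m: m.shape[0])  # touches only the diagonal
def add_diag_dense(diag_vec, dense):
    out = dense.copy()
    out[np.diag_indices_from(out)] += diag_vec
    return out

@computation('add', ('diagonal', 'dense'),
             cost_fn=lambda d, m: m.size)      # materializes a full matrix
def add_diag_dense_naive(diag_vec, dense):
    return np.diag(diag_vec) + dense

# The dispatcher compares costs (2 vs. 4 here) and uses the cheap path.
result = dispatch('add', ('diagonal', 'dense'),
                  np.array([1.0, 2.0]), np.ones((2, 2)))
```

A real system would of course match patterns structurally (as in the
RowMajorDense + ColMajorDense example above) and search over chains of
conversions, not just single-step lookups, but the cost-driven selection
is the same idea.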