On Tue, Mar 9, 2010 at 11:14 PM, David Bateman <dbate...@dbateman.org> wrote:
> Jaroslav Hajek wrote:
>>
>> I think I looked into those sources briefly (because I also noticed
>> that it implements basic ops), but indeed there seems to be nothing
>> magical (read: no assembler). Some operations we need (such as
>> transpose-multiply or diagonal scaling) were not even there, I think.
>> That doesn't mean it's a bad idea to defer some operations to CXSparse
>> (as you said, it is a dependency already), but I don't expect big
>> gains unless there are some better algorithms.
>>
>> I didn't mean to imply that the current built-in sparse implementation
>> is somehow slow (although perhaps there is room for optimizations).
>> Block sparse storage is simply bound to beat the plain sparse one when
>> the blocks get large enough. It's just that I didn't expect 10x10
>> blocks to be enough, given the m-code implementation.
>
> Well, it's not hard to reimplement the cs_multiply mex file from CXSparse
> as Octave code, and I attach a cs_multiply oct-file to test... Running
> the example
>
>   n = [1e2, 5e2, 1e3, 1e3, 1e4, 1e4];
>   d = [0.01, 0.01, 0.001, 0.01, 0.001, 0.005];
>   for i = 1:length (n)
>     fprintf ("N = %i, d = %f\n", n(i), d(i));
>     A = sprand (n(i), n(i), d(i));
>     B = sprand (n(i), n(i), d(i));
>     tic; C0 = A * B; toc
>     tic; C1 = mymultiply (A, B); toc
>   endfor
>
> I see the results
>
>   N = 100, d = 0.010000
>   Elapsed time is 0.000342958 seconds.
>   Elapsed time is 0.00414899 seconds.
>   N = 500, d = 0.010000
>   Elapsed time is 0.00301409 seconds.
>   Elapsed time is 0.00264595 seconds.
>   N = 1000, d = 0.001000
>   Elapsed time is 0.000446094 seconds.
>   Elapsed time is 0.000560952 seconds.
>   N = 1000, d = 0.010000
>   Elapsed time is 0.0157239 seconds.
>   Elapsed time is 0.0209789 seconds.
>   N = 10000, d = 0.001000
>   Elapsed time is 0.317249 seconds.
>   Elapsed time is 0.374798 seconds.
>   N = 10000, d = 0.005000
>   Elapsed time is 5.38696 seconds.
>   Elapsed time is 9.66393 seconds.
>
> so it doesn't appear that using cs_multiply is an advantage.
>
> D.
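[For readers following the benchmark: cs_multiply builds the product column by column over CSC (compressed sparse column) storage, scattering scaled columns of A into a dense workspace. A rough pure-Python sketch of that scatter-based CSC product — hypothetical names, not the actual CXSparse API:]

```python
def csc_multiply(m, Ap, Ai, Ax, Bp, Bi, Bx):
    """C = A*B for CSC matrices A (m rows) and B.
    Ap/Bp: column pointers, Ai/Bi: row indices, Ax/Bx: values."""
    Cp, Ci, Cx = [0], [], []
    mark = [-1] * m       # last column j that touched each row
    work = [0.0] * m      # dense scatter workspace for one column of C
    for j in range(len(Bp) - 1):
        col_rows = []
        for p in range(Bp[j], Bp[j + 1]):      # for each nonzero B(k, j)
            k, bkj = Bi[p], Bx[p]
            for q in range(Ap[k], Ap[k + 1]):  # scatter bkj * A(:, k)
                i = Ai[q]
                if mark[i] != j:
                    mark[i] = j
                    col_rows.append(i)
                    work[i] = Ax[q] * bkj
                else:
                    work[i] += Ax[q] * bkj
        for i in sorted(col_rows):             # gather column j of C
            Ci.append(i)
            Cx.append(work[i])
        Cp.append(len(Ci))
    return Cp, Ci, Cx
```

Note that everything here is column-oriented, which is why no transposes should be required when both operands are already CSC.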
Hmm. One thing I don't understand: why the transposes? Isn't CXSparse using
CSC just like Octave?

I completed the block sparse-sparse multiplication (it's quite simple, but
hacky), so here's a benchmark (again with 10x10 and bigger blocks):

  nb = 10; n = 1000;
  [i, j] = find (sprand (n, n, 3/n) + speye (n));
  a = rand (nb, nb, length (i));
  s1 = blksparse (i, j, a, n, n);
  [i, j] = find (sprand (n, n, 3/n) + speye (n));
  a = rand (nb, nb, length (i));
  s2 = blksparse (i, j, a, n, n);
  tic; p1 = s1 * s2; toc
  s1 = sparse (s1); s2 = sparse (s2);
  tic; p2 = s1 * s2; toc
  norm (sparse (p1) - p2, "fro")

with nb = 10, n = 1000:

  Elapsed time is 0.0700891 seconds.
  Elapsed time is 0.154243 seconds.
  ans = 1.1327e-13

nb = 10, n = 100:

  Elapsed time is 0.005583 seconds.
  Elapsed time is 0.01352 seconds.
  ans = 8.7034e-14

nb = 50, n = 100:

  Elapsed time is 0.1525 seconds.
  Elapsed time is 1.456 seconds.
  ans = 3.9378e-12

nb = 20, n = 1000:

  Elapsed time is 0.2465 seconds.
  Elapsed time is 1.007 seconds.
  ans = 3.9142e-13

So it seems that block multiplication really is a big advantage (with larger
blocks), so much so that it even outweighs the extra work and memory needed
by the m-code. That's not too surprising, considering that optimized BLAS
implementations also get their speed from blocked operations.

Unfortunately, while it's relatively easy to split an arbitrary dense matrix
into blocks, it would be quite hard to do so with a general sparse (CSC)
matrix. That's why a separate class is needed to take advantage of the block
structure.

regards

--
RNDr. Jaroslav Hajek, PhD
computing expert & GNU Octave developer
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz
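[To illustrate where the speedup in the reply above comes from: a block-sparse product does ordinary sparse bookkeeping over nb-by-nb blocks, but each nonzero block-block product is a small *dense* matmul, so the per-entry indexing overhead is amortized over nb^2 values — the same blocking idea optimized BLAS relies on. A minimal pure-Python sketch using a hypothetical dict-of-blocks layout; blksparse itself stores its blocks differently:]

```python
def block_matmul(nb, A, B):
    """Multiply block-sparse matrices stored as {(I, J): nb x nb block},
    where blocks are lists of row lists.  Sparse logic runs per *block*,
    dense arithmetic runs inside each block."""
    # Index B's blocks by block-row so A(I, K) finds all B(K, J) quickly.
    by_row = {}
    for (K, J), b in B.items():
        by_row.setdefault(K, []).append((J, b))
    C = {}
    for (I, K), a in A.items():
        for J, b in by_row.get(K, []):
            c = C.setdefault((I, J), [[0.0] * nb for _ in range(nb)])
            for r in range(nb):            # dense nb x nb accumulation
                ar, cr = a[r], c[r]
                for k in range(nb):
                    ark = ar[k]
                    if ark:
                        bk = b[k]
                        for s in range(nb):
                            cr[s] += ark * bk[s]
    return C
```

With nb = 1 this degenerates to plain sparse multiplication; as nb grows, the inner dense loops dominate and the sparse bookkeeping cost per stored value shrinks.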
_______________________________________________
Octave-dev mailing list
Octave-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/octave-dev