w4nderlust wrote: > Hello everyone. > > I'm still working with sparse matrices in oct-file, but i've got some > problems: i'm doing some multithread and as the octave libraries > revealed to be not really thread safe (some unexpected crashed ofter > repeating executions) i did the dirty work by hand: i copied the 3 > vectors of the matrix B (the input matrix) and copied the full matrix > A into a vector. If you do something like
const SparseMatrix a = args(0).sparse_matrix_value (); const *double adata = a.data(); const *octave_idx_type acidx = a.cidx(); const *octave_idx_type aridx = a.ridx(); you can clean up a lot of the issues I believe you were having. As the rest of you code will then just work on these pointers there is no reason that the code shouldn't be thread-safe. However > Then did what i have to do and then gave back the 3 > vectors for the matrix C, then create the sparsematrix out of the 3 > vectors and return from the function. something in creating the c > sparse matrix out of the 3 vector is going wrong, and the > documentation about this seem to be poor. The documentation clearly mentions that octave uses "Compressed Column" format and gives a reference for them in section 21.1.1. It is then briefly discussed in the rest of section 21.1.1 with an example.. I don't see how the documentation is lacking. > For example, why the 3 > vector has to be the same lenght? the cidx is always really shorter > than the other 2. (i solved the problem filling with zeroes but i > don't know if it's the correct why). Another problem is that sometimes > when creating the matrix it says i'm referencing an item that's out of > the matrix bounds (even if the 3 vectors have correct indexes and > values inside i suppose). The last problems is that i get an error on > a free call, the log is long so i post an attachment with it. the call > is: > The vector cidx is one element longer than the number of columns in the sparse matrix.. cidx(0) is always 0, and "cidx(i+1) - cidx(i)" will give the number of non-zero elements in each column, with cidx(i) being the index of the first non-zero element of the column i (ie ridx(cidx(i)) is the row index of the first non zero element and data(cidx(i)) is the data). You can therefore step through all of the non-zero elements of a column of an octave sparse matrix a with do (octave_idx_type ii = a.cidx(j); ii < a.cidx(j+1); ii++) { // val is the a(i,j)-th element of the matrix and is non-zero i = a.ridx (ii); val = a.data(ii) // Now do something with it. } Your problem in your code comes from a basic misunderstanding on the compressed column format. Your function "thread_function" essentially has all of the threads writting to the beginning of the matrix C. The basic problem in threading this code is that your indexing into the data is dependent on the number of non-zero elements in the previous columns and you don't know that till the preceding threads have terminated. I see two solutions. 1) Precalculate the number of non-zero elements per-column. That is, in the code block I sent you for spmaxmin_new.cc calculate the output cidx at the same time you calculate nel, rather than in the second loop. If you first save the number of non-zero elements to ccidx you can even thread this. For example (with code that might be threaded, without the threading itself) c.xcidx (0) = 0; for (octave_idx_type i = 0; i < rowsA; i++) { // From here on this code could be threaded octave_idx_type kk = 0 for(octave_idx_type j = 0; j < colsB; j++) { if (b.cidx(j) != b.cidx(j+1)) kk++; else { for (octave_idx_type k = b.cidx(j); k <= (b.cidx(j+1) - 1); k++) { if (a(i, b.ridx(k)) != 0) { kk++; break; } } } } c.xcidx(i+1) = kk; } // Now that we've calculated the number of elements per column of the output matrix in code // that might be threaded, correct the indexing (Can't be threaded) for (octave_idx_type i = 0; i < rowsA; i++) c.xcidx(i+1) += c.xcidx(i); octave_idx_type nel = c.xcidx(rowsA); It might not be profitable to thread the above and it could be simplified to octave_idx_type nel = 0; c.xcidx(0) = 0; for (octave_idx_type i = 0; i < rowsA; i++) { for(octave_idx_type j = 0; j < colsB; j++) { if (b.cidx(j) != b.cidx(j+1)) nel++; else { for (octave_idx_type k = b.cidx(j); k <= (b.cidx(j+1) - 1); k++) { if (a(i, b.ridx(k)) != 0) { nel++; break; } } } } c.xcidx(i+1) = nel; } You can then uses c.xcidx in your thread_function to index correctly into c.xridx and c.xdata. 2) You could create N different sparse matrices, one per thread, and then concatenate them together with the concat method. This will probably be slower and more memory hungry than the above so I won't develop this idea further. If you read and understand section 21.1.1 of the manual, the above will probably give you the needed push to fix your code. D. ------------------------------------------------------------------------------ Learn how Oracle Real Application Clusters (RAC) One Node allows customers to consolidate database storage, standardize their database environment, and, should the need arise, upgrade to a full multi-node Oracle RAC database without downtime or disruption http://p.sf.net/sfu/oracle-sfdevnl _______________________________________________ Octave-dev mailing list Octave-dev@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/octave-dev