w4nderlust wrote:
> Hello everyone.
>
> I'm still working with sparse matrices in oct-file, but i've got some
> problems: i'm doing some multithread and as the octave libraries
> revealed to be not really thread safe (some unexpected crashed ofter
> repeating executions) i did the dirty work by hand: i copied the 3
> vectors of the matrix B (the input matrix) and copied the full matrix
> A into a vector.
If you do something like

const SparseMatrix a = args(0).sparse_matrix_value ();
const *double adata = a.data();
const *octave_idx_type acidx = a.cidx();
const *octave_idx_type aridx = a.ridx();

you can clean up a lot of the issues  I believe you were having. As the
rest of you code will then just work on these pointers there is no
reason that the code shouldn't be thread-safe. However

>  Then did what i have to do and then gave back the 3
> vectors for the matrix C, then create the sparsematrix out of the 3
> vectors and return from the function. something in creating the c
> sparse matrix out of the 3 vector is going wrong, and the
> documentation about this seem to be poor. 
The documentation clearly mentions that octave uses "Compressed Column"
format and gives a reference for them in section 21.1.1. It is then
briefly discussed in the rest of section 21.1.1 with an example.. I
don't see how the documentation is lacking.

> For example, why the 3
> vector has to be the same lenght? the cidx is always really shorter
> than the other 2. (i solved the problem filling with zeroes but i
> don't know if it's the correct why). Another problem is that sometimes
> when creating the matrix it says i'm referencing an item that's out of
> the matrix bounds (even if the 3 vectors have correct indexes and
> values inside i suppose). The last problems is that i get an error on
> a free call, the log is long so i post an attachment with it. the call
> is:
>   
The vector cidx is one element longer than the number of columns in the
sparse matrix.. cidx(0) is always 0, and "cidx(i+1) - cidx(i)" will give
the number of non-zero elements in each column, with cidx(i) being the
index of the first non-zero element of the column i (ie ridx(cidx(i)) is
the row index of the first non zero element and data(cidx(i)) is the
data). You can therefore step through all of the non-zero elements of a
column of an octave sparse matrix a with

do (octave_idx_type ii = a.cidx(j); ii < a.cidx(j+1); ii++)
  {
    // val is the a(i,j)-th element of the matrix and is non-zero
    i = a.ridx (ii);
    val = a.data(ii)
    // Now do something with it.

  }

Your problem in your code comes from a basic misunderstanding on the
compressed column format. Your function "thread_function" essentially
has all of the threads writting to the beginning of the matrix C. The
basic problem in threading this code is that your indexing into the data
is dependent on the number of non-zero elements in the previous columns
and you don't know that till the preceding threads have terminated.

I see two solutions.

1) Precalculate the number of non-zero elements per-column. That is, in
the code block I sent you for spmaxmin_new.cc calculate the output cidx
at the same time you calculate nel, rather than in the second loop. If
you first save the number of non-zero elements to ccidx you can even
thread this. For example (with code that might be threaded, without the
threading itself)

     c.xcidx (0) = 0;
      for (octave_idx_type i = 0; i < rowsA; i++)
        {
           // From here on this code could be threaded
           octave_idx_type kk = 0

          for(octave_idx_type j = 0; j < colsB; j++)
        {
          if (b.cidx(j) != b.cidx(j+1))
            kk++;
          else
            {
              for (octave_idx_type k = b.cidx(j);
               k <= (b.cidx(j+1) - 1); k++)
            {
              if (a(i, b.ridx(k)) != 0)
                {
                  kk++;
                  break;
                }
            }
            }
        }
        c.xcidx(i+1) = kk;
        }

       // Now that we've calculated the number of elements per column of
the output matrix in code
       // that might be threaded, correct the indexing (Can't be threaded)
      for (octave_idx_type i = 0; i < rowsA; i++)
        c.xcidx(i+1) += c.xcidx(i);

      octave_idx_type nel = c.xcidx(rowsA);

It might not be profitable to thread the above and it could be
simplified to

       octave_idx_type nel = 0;

      c.xcidx(0) = 0;
      for (octave_idx_type i = 0; i < rowsA; i++)
        {
          for(octave_idx_type j = 0; j < colsB; j++)
        {
          if (b.cidx(j) != b.cidx(j+1))
            nel++;
          else
            {
              for (octave_idx_type k = b.cidx(j);
               k <= (b.cidx(j+1) - 1); k++)
            {
              if (a(i, b.ridx(k)) != 0)
                {
                  nel++;
                  break;
                }
            }
            }
        }
        c.xcidx(i+1) = nel;
        }

You can then uses c.xcidx in your thread_function to index correctly
into c.xridx and c.xdata.


2) You could create N different sparse matrices, one per thread, and
then concatenate them together with the concat method. This will
probably be slower and more memory hungry than the above so I won't
develop this idea  further.

If you read and understand section 21.1.1 of the manual, the above will
probably give you the needed push to fix your code.

D.


------------------------------------------------------------------------------
Learn how Oracle Real Application Clusters (RAC) One Node allows customers
to consolidate database storage, standardize their database environment, and, 
should the need arise, upgrade to a full multi-node Oracle RAC database 
without downtime or disruption
http://p.sf.net/sfu/oracle-sfdevnl
_______________________________________________
Octave-dev mailing list
Octave-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/octave-dev

Reply via email to