Hi Chris,
> Ok, this seemed to work very well. I can then modify the element
> internal vector to zero out the matrix for my finite volume
> implementation, etc., and preserve the sparsity information so as to
> use operator() quickly.

It's still best to avoid operator() if you aim for maximum performance,
and instead work on the CSR arrays directly. Chances are, however, that
more time is already spent in other parts of your finite volume
application, in which case there's no need to optimize this part
further.

> The whole point of doing this was to avoid 2 sets of copies from main
> memory STL format to main memory compressed_matrix format when using
> OpenMP.

Yes, that's definitely the right way to do it.

> However, I'm not seeing any performance increase, and rather I am
> seeing a performance decrease!
>
> Is this to be expected?

In which part do you see the performance decrease? If it's in the
assembly, then work on the CSR arrays directly. Or are you referring to
other parts, e.g. sparse matrix-vector products?

Best regards,
Karli

> On 7 April 2017 at 10:12, Chris Marsh <[email protected]> wrote:
>
> Hi,
>
> Right, it's the sparsity pattern that you have no way of knowing a
> priori during allocation. The parallel insert is then of course an
> issue without the 2 passes...
>
> I have to build a new A and b many, many times (during some
> timestepping), so 2 passes is probably not much faster than what I'm
> getting with copy. The sparsity pattern will stay constant. If I
> initialize the sparsity, then operator() should work, correct? And
> make my parallel code faster, i.e., not require 2 passes.
>
> Following this further: if I use a std::map< ... > sparse
> representation and copy it to a compressed_matrix, it should set up
> the sparse structure for me. Then I can use operator() without
> slowdown, and access in parallel as the sparsity will be correctly
> set up. Is that a reasonable approach for host only?
> For GPU, I obviously will still need to copy. But this approach, if
> it works, should also reduce code duplication...
>
> (I'm trying to avoid learning CSR at the moment, have a time crunch!)
>
> Cheers,
> Chris
>
> On 7 April 2017 at 00:21, Karl Rupp <[email protected]> wrote:
>
> Hey,
>
> On 04/06/2017 11:48 PM, Chris Marsh wrote:
>
> > Unless you are changing only a few entries, this is likely to be
> > too slow.
>
> Big time :)
>
> Ok, so even though it is preallocated for the right number of nnz
> values, operator() still incurs the cost? Must admit that is not what
> I'd have expected.
>
> Well, this is a sparse matrix. Since operator() deals with a single
> entry, there is no way this could be fast (note that CSR requires
> entries from the same row to be located consecutively in memory).
>
> > When I obtain those CSR buffers, they will be the correct size, and
> > I should be able to insert into them in parallel, correct?
>
> Yes, exactly.
> You may need to populate the matrix in two passes: the first
> determines the sparsity pattern, the second writes the actual
> numerical values.
>
> Best regards,
> Karli
>
> On 6 April 2017 at 13:13, Karl Rupp <[email protected]> wrote:
>
> Hi!
>
> On 04/06/2017 06:44 PM, Chris Marsh wrote:
>
> > Hi,
> >
> > I know the number of non-zero entries for a sparse matrix, so I am
> > trying to pre-allocate it with
> >
> > viennacl::compressed_matrix<vcl_scalar_type> vl_C(row, col, nnz);
>
> At this point your matrix is still empty (i.e. no nonzeros). It only
> preallocated an array to hold up to 'nnz' entries.
>
> > and access it with vl_C.operator().
>
> Unless you are changing only a few entries, this is likely to be too
> slow.
>
> > I am using a host-only memory context, with ViennaCL 1.7.1 from
> > Homebrew.
> >
> > How should I proceed with this?
> To fill the CSR format efficiently, have a look here:
>
> https://sourceforge.net/p/viennacl/discussion/1143678/thread/325a937c/?limit=25#d6f0
>
> For host-based memory, an example of how to get pointers to the three
> CSR arrays is here:
>
> https://github.com/viennacl/viennacl-dev/blob/master/viennacl/linalg/host_based/sparse_matrix_operations.hpp#L115
>
> Best regards,
> Karli

------------------------------------------------------------------------------
_______________________________________________
ViennaCL-support mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/viennacl-support
