Hi Karl,
This looks great! Thank you very much for this effort. I will attempt to
implement around this next week.
Cheers
Chris
On 25 April 2017 at 09:25, Karl Rupp <[email protected]> wrote:
> Hi Chris,
>
> the copy-CTOR for compressed_matrix is now implemented:
> https://github.com/viennacl/viennacl-dev/commit/0d62d8e0fb9a3eefc37aa225b5eb7195256181c9
>
> You should get the desired behavior of just updating numerical values on
> the GPU with code similar to the following:
>
> viennacl::context host_ctx(viennacl::MAIN_MEMORY);
> viennacl::compressed_matrix<T> A(N,N, host_ctx); //your 'host matrix'
> /* fill A here */
>
> viennacl::compressed_matrix<T> B(A); //create copy of A
> viennacl::context gpu_ctx(viennacl::CUDA_MEMORY);
> B.switch_memory_context(gpu_ctx); //migrate B to CUDA memory
>
> // write to B, starting at offset 0, copy 'nnz' elements
> // use host data from nonzero floating point values of A
> viennacl::backend::memory_write(B.handle(), 0, sizeof(T) * A.nnz(),
> A.handle().ram_handle().get());
>
> Just repeat the last line every time you need to update the numerical
> values on the GPU.
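[For readers of the archive: the pattern above (fixed sparsity, values-only update each time step) can be sketched independently of ViennaCL roughly as follows. `CsrValues` and `update_values` are made-up names for illustration, not ViennaCL API; the `std::copy` here plays the role of `viennacl::backend::memory_write` copying `nnz` elements into B's handle.]

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch only: stand-in for the device-side value array of a
// compressed_matrix whose sparsity pattern is already fixed.
// row_ptr and cols never change between time steps, so an update
// touches only the nnz numerical entries.
struct CsrValues {
    std::vector<double> vals; // the nnz stored values
};

// Per-time-step update: overwrite the existing values with the freshly
// assembled host values. No reallocation, no pattern rebuild.
inline void update_values(CsrValues& device, const std::vector<double>& host) {
    assert(device.vals.size() == host.size()); // same sparsity pattern
    std::copy(host.begin(), host.end(), device.vals.begin());
}
```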
>
> Please let me know how this turns out.
>
> Best regards,
> Karli
>
>
> On 04/21/2017 09:06 PM, Chris Marsh wrote:
>
>> Karl,
>>
>> No problem, the copy-constructor sounds like a perfect solution. Thanks
>> for doing this.
>>
>> How big is your system?
>>
>> The sparse matrix has approx 10^10 entries, with about 1 million
>> total non-zero elements.
>>
>>
>> 2.5min for 5 time steps sounds a lot to me.
>>
>> I should have been more clear, sorry. The 2.5 min includes a bunch of
>> other routines that run during the time step, so it is more than just
>> the matrix solve. However, that 12 s is entirely attributable to the
>> difference between the STL approach and the copy plus operator()
>> access. Also, this is running on a single laptop core instead of a
>> cluster like it should be!
>>
>> However, one still has to compare against the available column indices
>>
>> Makes sense. In my case, I think I can just say I need the 3rd or 4th
>> non-zero item in a row, as I "know" where things are. But that's a
>> non-generic case.
>>
>> Cheers
>> Chris
>>
>>
>>
>> On 21 April 2017 at 04:34, Karl Rupp <[email protected]
>> <mailto:[email protected]>> wrote:
>>
>> Hi Chris,
>>
>> please excuse my late reply.
>>
>>
>> This is a local search operation
>>
>>
>> Oh, that isn't at all what I expected. I assumed that with the row
>> and column offset it could just index the CSR array directly?
>>
>> When you call operator(), you pass the row and column index. The row
>> index jumps to the beginning of the nonzeros for that row in the CSR
>> array. However, one still has to compare against the available
>> column indices to finally pick the correct entry (or create a new
>> one...). Only for dense matrices can you locate the respective entry
>> in the matrix directly.
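[For readers of the archive: the lookup described above can be sketched with a minimal self-contained CSR container. This is illustrative code, not ViennaCL internals; `Csr` and `get` are hypothetical names.]

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal CSR container (not ViennaCL): row_ptr has one entry per row
// plus a terminator; cols/vals hold the column index and value of each
// stored nonzero, row by row.
struct Csr {
    std::vector<std::size_t> row_ptr;
    std::vector<std::size_t> cols;
    std::vector<double> vals;

    // operator()-style read access: the row pointer jumps straight to
    // the start of row i's nonzeros, but column j must still be found
    // by scanning (or binary-searching) that row's column indices.
    double get(std::size_t i, std::size_t j) const {
        for (std::size_t k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
            if (cols[k] == j)
                return vals[k];
        return 0.0; // not stored, i.e. a structural zero
    }
};
```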
>>
>>
>>
>> By how much does your code slow down?
>>
>>
>> The "optimization"? Over 5 time steps or so it was 12 s slower, out
>> of a total of 2.5 min or so. So enough that when I run it for 15000
>> time steps it adds up!
>>
>>
>> So it's 10 percent. How big is your system? 2.5min for 5 time steps
>> sounds a lot to me.
>>
>>
>> Also, do you fill the CSR matrix by increasing row index, or is
>> your code filling rows at random?
>>
>>
>> I'm filling the CSR via operator(), and that is by increasing row
>> index.
>>
>>
>> Ok, this should be acceptable in terms of performance.
>>
>>
>> However, when it is run in parallel with openmp, it will
>> effectively be random.
>>
>>
>> In parallel you should really fill the CSR array directly (possibly
>> with the exception of the first time step, where you build the
>> sparsity pattern).
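[For readers of the archive: a hedged sketch of that idea in plain C++/OpenMP, not ViennaCL. `entry_value` is a hypothetical stand-in for whatever computes each coefficient; the point is that once `row_ptr`/`cols` are fixed, every row owns a disjoint slice of `vals`, so rows can be assembled in parallel without locking.]

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-entry assembly routine standing in for whatever
// computes the coefficient at (row, col) in a given time step.
inline double entry_value(std::size_t row, std::size_t col) {
    return static_cast<double>(row * 10 + col); // placeholder formula
}

// Fill the CSR value array directly. Because the sparsity pattern
// (row_ptr, cols) is fixed after the first time step, each row writes
// to a disjoint range of vals, so the outer loop parallelizes safely.
inline void fill_values(const std::vector<std::size_t>& row_ptr,
                        const std::vector<std::size_t>& cols,
                        std::vector<double>& vals) {
    #pragma omp parallel for
    for (long long i = 0; i + 1 < static_cast<long long>(row_ptr.size()); ++i)
        for (std::size_t k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
            vals[k] = entry_value(static_cast<std::size_t>(i), cols[k]);
}
```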
>>
>>
>> What are you trying to accomplish?
>>
>>
>> With an OpenMP backend, I want to avoid the copy from STL ->
>> compressed_matrix. So my idea is to pre-allocate A, a
>> compressed_matrix on the host, regardless of which backend I'm using
>> (instead of the STL variant). Then I want to either solve directly
>> using A, or copy A to a GPU and solve it on the GPU if configured.
>> For the former, this is currently working well, barring the
>> operator() issues we are discussing above. The problem arises with
>> the 2nd case. I could do the context change, but once it's been
>> copied to the GPU I have to copy it *back* to take advantage of the
>> pre-allocated matrix. That is, I'd like to avoid any additional
>> memory allocations. I would like to just copy(A, gpu_A) when a GPU
>> is available. However, there is no copy from compressed_matrix to
>> compressed_matrix.
>>
>>
>> Thanks, that helps me with understanding the setting better. Let me
>> add a copy-constructor for compressed_matrix for you, so you can
>> avoid the unnecessary copy back to the host. Copying the numerical
>> entries for a fixed sparsity pattern can be done efficiently; I'll
>> send you a code snippet when I'm done with the copy-constructor.
>>
>> Best regards,
>> Karli
>>
>>
>>
_______________________________________________
ViennaCL-support mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/viennacl-support