Hi Evan,

> I need to scatter the elements of a vector out to multiple processors.
> The mapping is one to many (vector elements can go to many procs). I
> would like to do this with a permutation matrix which has 1 nonzero
> per row.
>
> I'd like the process to run on the GPU, so a warp would need to
> operate on multiple rows of the permutation matrix. What do you think
> the best approach would be?

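The best approach with the existing API would be to fill such a
'permutation matrix'. However, this is a little wasteful in terms of
memory bandwidth, since each row stores a value of 1 and a column index
that have to be read in addition to the vector entry itself.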
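
To make the idea concrete, here is a minimal host-side sketch in plain C++
of what such a matrix looks like in CSR form and what the matrix-vector
product does with it. The names SelectionCsr, make_selection_matrix and
multiply are made up for illustration and are not part of the ViennaCL
API; with ViennaCL one would fill a sparse matrix type with the same
entries instead.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CSR container with exactly one nonzero (value 1) per row.
// Illustration only; this is not a ViennaCL type.
struct SelectionCsr {
    std::vector<std::size_t> row_ptr;  // rows + 1 entries
    std::vector<std::size_t> col_idx;  // one column index per row
    std::vector<double>      values;   // all 1.0, stored and read anyway
    std::size_t              cols;     // number of columns (size of y)
};

// Row r of the matrix selects source element indices[r].
// The same source index may appear in many rows (one-to-many scatter).
SelectionCsr make_selection_matrix(const std::vector<std::size_t>& indices,
                                   std::size_t cols) {
    SelectionCsr P;
    P.cols = cols;
    P.col_idx = indices;
    P.values.assign(indices.size(), 1.0);
    P.row_ptr.resize(indices.size() + 1);
    for (std::size_t r = 0; r <= indices.size(); ++r) {
        P.row_ptr[r] = r;  // one entry per row
    }
    for (std::size_t idx : indices) {
        assert(idx < cols);
    }
    return P;
}

// Plain CSR matrix-vector product x = P * y.
std::vector<double> multiply(const SelectionCsr& P,
                             const std::vector<double>& y) {
    assert(y.size() == P.cols);
    std::vector<double> x(P.row_ptr.size() - 1, 0.0);
    for (std::size_t r = 0; r + 1 < P.row_ptr.size(); ++r) {
        for (std::size_t k = P.row_ptr[r]; k < P.row_ptr[r + 1]; ++k) {
            x[r] += P.values[k] * y[P.col_idx[k]];
        }
    }
    return x;
}
```

With y = {10, 20, 30} and indices = {2, 0, 0, 1}, multiply returns
{30, 10, 10, 20}: element 0 of y lands in two destinations. The row_ptr
and values arrays carry no information here, which is where the extra
memory traffic comes from.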

Another option will be available soon: a couple of days ago I
added initial support for viennacl::vector<int> and friends, and for
the next release we want to provide an assignment of the form
  vector<double> x,y;
  vector<int> indices;
  ...
  x = y(indices); // or similar. API not decided yet.

Over time this will gradually be extended to support more complicated 
expressions, but this is quite some undertaking in terms of robust 
implementation and thus won't happen 'immediately'.
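
For reference, the semantics intended by an indexed assignment of this
kind can be sketched on the host in plain C++ as a gather. The function
name gather is an assumption for illustration only; it is not the
ViennaCL interface, which is not decided yet.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side sketch of x = y(indices): x[i] = y[indices[i]].
// Hypothetical helper, not a ViennaCL function.
std::vector<double> gather(const std::vector<double>& y,
                           const std::vector<int>& indices) {
    std::vector<double> x(indices.size());
    for (std::size_t i = 0; i < indices.size(); ++i) {
        const int idx = indices[i];
        assert(idx >= 0 && static_cast<std::size_t>(idx) < y.size());
        x[i] = y[static_cast<std::size_t>(idx)];
    }
    return x;
}
```

Each output entry depends only on its own index, so on a GPU one thread
per entry of x is a natural mapping, and only the index array and the
vector entries need to be read.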

Best regards,
Karli

