> On a 4-core machine (8 with hyperthreading) I'm observing a 10x speedup.
> The parallelism-related speedup is 4x. There is an additional 2.5x speedup
> which appears to come from the lower-level access to the Matrix memory
> done by RMatrix (and perhaps some elimination of copying).
>
Here's a parallel version:
https://github.com/jjallaire/RcppParallel/blob/master/inst/examples/parallel-distance-matrix.cpp
To make the code reasonable I introduced a new RMatrix class in
RcppParallel that makes offsetting into rows and columns safe and
straightforward. This class has no connection to R or Rcpp, so it can be
used safely from background threads ...
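(For readers skimming the digest: the linked file follows the pattern sketched
below. This is a simplified sketch, not a verbatim copy; a squared-Euclidean
placeholder stands in for the Jensen-Shannon distance used in the actual
example, and the names DistanceWorker / rcpp_parallel_distance are mine.)

// Sketch of the RMatrix + Worker pattern from the linked example (simplified).
// [[Rcpp::depends(RcppParallel)]]
#include <Rcpp.h>
#include <RcppParallel.h>
#include <cmath>
using namespace RcppParallel;

struct DistanceWorker : public Worker {
   const RMatrix<double> input;   // thread-safe read-only view of the input
   RMatrix<double> output;        // thread-safe writable view of the result

   DistanceWorker(const Rcpp::NumericMatrix input, Rcpp::NumericMatrix output)
      : input(input), output(output) {}

   // parallelFor calls this with the chunk of outer-loop indexes to process
   void operator()(std::size_t begin, std::size_t end) {
      for (std::size_t i = begin; i < end; i++) {
         for (std::size_t j = 0; j < i; j++) {
            RMatrix<double>::Row row1 = input.row(i);
            RMatrix<double>::Row row2 = input.row(j);
            double d = 0;   // squared-distance placeholder for the JS distance
            for (std::size_t k = 0; k < row1.length(); k++) {
               double diff = row1[k] - row2[k];
               d += diff * diff;
            }
            output(i, j) = output(j, i) = std::sqrt(d);
         }
      }
   }
};

// [[Rcpp::export]]
Rcpp::NumericMatrix rcpp_parallel_distance(Rcpp::NumericMatrix mat) {
   Rcpp::NumericMatrix rmat(mat.nrow(), mat.nrow());
   DistanceWorker worker(mat, rmat);
   parallelFor(0, mat.nrow(), worker);   // parallelize the outer loop
   return rmat;
}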
On 12 July 2014 at 12:37, JJ Allaire wrote:
| If you could send the full source code to your example (including js_distance
| and whatever R code you are using to test/exercise the functions) I'll see if I
| can come up with the code you'd use to parallelize the outer loop. Depending on
| how it ...
James,
If you could send the full source code to your example (including
js_distance and whatever R code you are using to test/exercise the
functions) I'll see if I can come up with the code you'd use to parallelize
the outer loop. Depending on how it turns out perhaps we can even convert
this into ...
James,
My attempt at marginal usefulness (compared to DE and JJE's comments) would
be to offer some code examples that I've got using OpenMP-based
parallelism. There are examples in the gallery, of course. But I've got a
number of others for Euclidean distance and one for an EM algorithm. If you ...
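(By way of illustration, an OpenMP version of a Euclidean distance matrix tends
to look like the sketch below. This is my own hypothetical example, not the
code being offered above; omp_euclidean and all of its details are assumptions.)

// Hypothetical OpenMP sketch of a parallel Euclidean distance matrix.
// [[Rcpp::plugins(openmp)]]
#include <Rcpp.h>
#include <cmath>
#ifdef _OPENMP
#include <omp.h>
#endif

// [[Rcpp::export]]
Rcpp::NumericMatrix omp_euclidean(Rcpp::NumericMatrix mat) {
   int n = mat.nrow(), p = mat.ncol();
   Rcpp::NumericMatrix out(n, n);
   // take raw pointers up front so the parallel region never touches the R API
   const double* x = mat.begin();
   double* res = out.begin();
   #pragma omp parallel for schedule(dynamic)
   for (int i = 0; i < n; i++) {
      for (int j = 0; j < i; j++) {
         double d = 0;
         for (int k = 0; k < p; k++) {
            // R matrices are column-major: element (i, k) lives at x[k * n + i]
            double diff = x[k * n + i] - x[k * n + j];
            d += diff * diff;
         }
         res[j * n + i] = res[i * n + j] = std::sqrt(d);
      }
   }
   return out;
}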
On 11 July 2014 at 21:19, JJ Allaire wrote:
| (2) The premise of RcppParallel is that you are reading and writing directly
| into C arrays in background threads (it's not safe to call into R and therefore
| not really safe to call into Rcpp). So to interact with a Matrix/Vector you
| need to calculate ...
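(To make the truncated point concrete: R matrices are stored column-major, so
the offset of element (i, j) in the underlying C array is j * nrow + i. A tiny
sketch of that arithmetic, with element_by_offset as a made-up name:)

// What manual offsetting looks like without RMatrix: grab the raw
// address once, then index into contiguous column-major storage.
#include <Rcpp.h>

// [[Rcpp::export]]
double element_by_offset(Rcpp::NumericMatrix m, int i, int j) {
   int nrow = m.nrow();
   const double* data = m.begin();   // contiguous, column-major storage
   return data[j * nrow + i];        // same value as m(i, j)
}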
Two things to consider:
(1) The parallelFor and parallelReduce functions don't require iterators --
they just take indexes which you can use for iterating over any range. In
the gallery examples they are used to offset NumericVector.begin() to get
the address of the slice of the vector or matrix ...
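(A stripped-down illustration of point (1), patterned after the Rcpp Gallery
transform example rather than taken from this thread; SqrtWorker and
parallel_sqrt are made-up names:)

// parallelFor hands the Worker plain [begin, end) indexes, which here
// offset raw pointers obtained from NumericVector::begin().
// [[Rcpp::depends(RcppParallel)]]
#include <Rcpp.h>
#include <RcppParallel.h>
#include <cmath>
#include <algorithm>

struct SqrtWorker : public RcppParallel::Worker {
   const double* input;    // from NumericVector::begin(), taken before threading
   double* output;

   SqrtWorker(Rcpp::NumericVector in, Rcpp::NumericVector out)
      : input(in.begin()), output(out.begin()) {}

   // begin/end are indexes into any range; offset the pointers with them
   void operator()(std::size_t begin, std::size_t end) {
      std::transform(input + begin, input + end, output + begin, ::sqrt);
   }
};

// [[Rcpp::export]]
Rcpp::NumericVector parallel_sqrt(Rcpp::NumericVector x) {
   Rcpp::NumericVector out(x.size());
   SqrtWorker worker(x, out);
   RcppParallel::parallelFor(0, x.size(), worker);
   return out;
}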
Sorry, this is a continuation of my previous email; I've just learned a new
Gmail keybinding for accidentally sending ...
My questions are:
- is it right to parallelize the outer loop versus doing a parallelReduce
on the kl_divergence function?
- is there a row iterator which returns NumericVector ...
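(For context on the second option asked about above: parallelReduce expects a
worker with a splitting constructor and a join() method. Below is a minimal
sketch using a plain sum as a stand-in, since the kl_divergence code isn't
shown in this digest; SumWorker and parallel_sum are hypothetical names.)

// Minimal parallelReduce sketch: a plain sum standing in for the
// kl_divergence accumulation being asked about.
// [[Rcpp::depends(RcppParallel)]]
#include <Rcpp.h>
#include <RcppParallel.h>

struct SumWorker : public RcppParallel::Worker {
   const double* input;
   double value;

   SumWorker(const Rcpp::NumericVector x) : input(x.begin()), value(0) {}
   // splitting constructor: each split starts its own accumulator
   SumWorker(const SumWorker& other, RcppParallel::Split)
      : input(other.input), value(0) {}

   void operator()(std::size_t begin, std::size_t end) {
      for (std::size_t i = begin; i < end; i++) value += input[i];
   }
   // combine partial results from two splits
   void join(const SumWorker& rhs) { value += rhs.value; }
};

// [[Rcpp::export]]
double parallel_sum(Rcpp::NumericVector x) {
   SumWorker worker(x);
   RcppParallel::parallelReduce(0, x.size(), worker);
   return worker.value;
}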
Hi All,
I'm attempting to parallelize some Rcpp code I have now, but I'm having a
hard time understanding the best route for this. I'm looking for some
guidance about how things need to be restructured to take advantage of
parallelFor (if that would be the speediest option).
I have the following ...