Embarrassingly parallel computation in SparkR?

Kristina Rogale Plazonic Mon, 17 Aug 2015 12:54:30 -0700

Hi,

I'm wondering how to run, say, a Monte Carlo simulation in SparkR
without using the low-level RDD functions that were made private in 1.4,
such as parallelize and map. Something like


parallelize(sc, 1:1000).map (
   ### R code that does my computation
)

where the code is the same on every node, only with different seeds.

(I'm going to use this code with SparkR:::parallelize, but I'm wondering if
there is a better way, or whether this might be a use case that would
justify not making those functions private?)
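
To make the pattern concrete, here is a minimal sketch in plain R, run locally with base lapply as a stand-in for the distributed map. The function simulate_once and its pi-estimation body are hypothetical placeholders for the real computation, and the sample size n is an arbitrary assumption. On a cluster, the same function would presumably be handed to the private SparkR:::parallelize and map calls in place of the local lapply.

```r
# Hypothetical per-task function (an assumption for illustration):
# estimate pi from n uniform points. Seeding inside the function makes
# each task reproducible and gives each seed a different random stream.
simulate_once <- function(seed, n = 10000) {
  set.seed(seed)
  x <- runif(n)
  y <- runif(n)
  4 * mean(x^2 + y^2 <= 1)
}

# One task per seed, as in the 1:1000 collection above.
seeds <- 1:1000

# Local stand-in for the distributed map: the code is identical for
# every element, only the seed differs.
estimates <- unlist(lapply(seeds, simulate_once))

# Combine the per-seed results on the driver.
pi_hat <- mean(estimates)
```

The only thing that varies between tasks is the seed passed in, so no state has to be shared across nodes.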

Many thanks!

kristina
