Hi
I'd like to use the result of one RDD (RDD1) in a computation on another 
(RDD2). Normally I would use something like a barrier to make the 2nd RDD 
wait until the computation of the 1st RDD is done, then include the result 
from RDD1 in the closure for RDD2. Currently I create another RDD, RDD3, out 
of the result of RDD1, then do a Cartesian product of RDD2 and RDD3. NB: This 
operation is slow and expands the number of partitions from 270 to 1200.

This is a simplified example, but I think it should help.

What I want to do (pseudocode):
   val a: Int = RDD1.reduce(..)
   RDD2.map(x => x * a)

What I use right now (pseudocode):
   val a: Int = RDD1.reduce(..)
   val RDD3 = sc.makeRDD(Seq(a))
   RDD2.cartesian(RDD3)
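
For concreteness, here is a minimal sketch of the first variant written as 
actual Scala. It assumes a local SparkContext and Int-valued RDDs; the object 
name ReduceThenMap, the helper scaleByClosure, the sum used as the reduce 
function and the sample data are all made up for illustration. As far as I 
understand, reduce is an action, so it runs the job for RDD1 and returns a 
plain Int to the driver before the map on RDD2 is even defined, and that Int 
is then shipped with the closure.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object ReduceThenMap {
  // Hypothetical helper: reduce is an action, so this call blocks until
  // rdd1 has been computed and hands back an ordinary Int on the driver.
  def scaleByClosure(rdd1: RDD[Int], rdd2: RDD[Int]): RDD[Int] = {
    val a: Int = rdd1.reduce(_ + _) // assumed reduce function: sum
    rdd2.map(x => x * a)            // a is serialized with the closure
  }

  def main(args: Array[String]): Unit = {
    // Assumption: local mode with two threads, for illustration only.
    val conf = new SparkConf().setAppName("reduce-then-map").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val rdd1 = sc.parallelize(Seq(1, 2, 3))       // made-up data, sum is 6
      val rdd2 = sc.parallelize(Seq(10, 20, 30), 4) // 4 partitions
      val scaled = scaleByClosure(rdd1, rdd2)
      println(scaled.collect().mkString(", "))
      // map is a narrow transformation, so the partition count of rdd2
      // should carry over unchanged.
      println(scaled.partitions.length)
    } finally {
      sc.stop()
    }
  }
}
```

In this form there is no RDD3 and no Cartesian product, so the partition 
count of RDD2 should stay as it is.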

How should I structure this type of operation so that I don't need a barrier 
to block computing RDD2 until RDD1 is done?

-Adrian
