Hi, I'd like to use the result of one RDD (RDD1) in a computation on another (RDD2). Normally I would use something like a barrier to make the second RDD wait until the computation of the first is done, and then include the result from RDD1 in the closure for RDD2. Currently I create another RDD, RDD3, out of the result of RDD1 and then take the Cartesian product of RDD2 and RDD3. NB: This operation is slow and expands the number of partitions from 270 to 1200.
This is a simplified example, but I think it should help.

What I want to do (pseudocode):

    val a: Int = RDD1.reduce(..)
    RDD2.map(x => x * a)

What I use right now (pseudocode):

    val a: Int = RDD1.reduce(..)
    val RDD3 = sc.makeRDD(Seq(a))
    RDD2.cartesian(RDD3)

How should I structure this type of operation so that it does not need a barrier that blocks the computation of RDD2 until RDD1 is done?

-Adrian
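
For reference, here is a minimal runnable sketch of the "what I want to do" pattern above. The local master, the object name, and the toy data are assumptions made purely for illustration and are not from my real job. It relies on reduce being an action that returns a plain value to the driver, which can then be captured in the closure passed to map, or wrapped in a broadcast variable.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object ReduceThenMap {
  def main(args: Array[String]): Unit = {
    // Assumption: local master and toy data, for illustration only.
    val conf = new SparkConf().setAppName("reduce-then-map").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val rdd1 = sc.parallelize(1 to 10)      // stands in for RDD1
      val rdd2 = sc.parallelize(Seq(1, 2, 3)) // stands in for RDD2

      // reduce is an action: it runs the job over rdd1 and returns an Int to the driver.
      val a: Int = rdd1.reduce(_ + _)

      // The closure captures `a`, so the value is shipped with the tasks.
      // No third RDD and no cartesian product are involved.
      val scaled = rdd2.map(x => x * a)

      // Alternative: wrap the value in a broadcast variable and read it inside the closure.
      val aBc = sc.broadcast(a)
      val scaledBc = rdd2.map(x => x * aBc.value)

      println(scaled.collect().mkString(", "))   // prints "55, 110, 165"
      println(scaledBc.collect().mkString(", ")) // prints "55, 110, 165"
    } finally {
      sc.stop()
    }
  }
}
```

In this sketch the driver still waits for the reduce over rdd1 to finish before the map over rdd2 is executed, so it removes the Cartesian product but not the ordering between the two computations.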