Hi Zhiliang,

How about doing something like this?

val rdd3 = rdd1.zip(rdd2).map(p =>
    p._1.zip(p._2).map(z => z._1 - z._2))

The first zip pairs up the two RDDs by position and produces an RDD of
(Array[Float], Array[Float]) pairs. On each pair, we zip the two Array[Float]
components together to form an Array[(Float, Float)], and then the inner map
subtracts the second element from the first (the inner map is a Scala
collection map, not a Spark one).
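
To make the inner step concrete, here is a minimal sketch on a single pair of rows using plain Scala collections only (no Spark). The values in a and b are made up for illustration.

```scala
// One row from each side; these values are illustrative only.
val a = Array(1.0f, 2.0f, 3.0f)
val b = Array(1.0f, 4.0f, 3.0f)

// zip pairs elements by position: (a(0), b(0)), (a(1), b(1)), ...
// If the arrays differ in length, zip stops at the shorter one.
val pairs: Array[(Float, Float)] = a.zip(b)

// x comes from a and y from b, so this computes a(j) - b(j).
val diff: Array[Float] = pairs.map { case (x, y) => x - y }
```

Swapping x and y in the last line would give rdd2 - rdd1 instead.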

I tried this out in a notebook (with Doubles rather than Floats):

val rdd1 = sc.parallelize(List(Array(1.0, 2.0, 3.0), Array(4.0, 5.0, 6.0),
Array(7.0, 8.0, 9.0)))
val rdd2 = sc.parallelize(List(Array(1.0, 4.0, 3.0), Array(4.0, 10.0, 6.0),
Array(7.0, 16.0, 9.0)))
val rdd3 = rdd1.zip(rdd2).map(p => p._1.zip(p._2).map(z => z._1 - z._2))
rdd3.collect()

gives me:
res0: Array[Array[Double]] = Array(Array(0.0, -2.0, 0.0), Array(0.0, -5.0,
0.0), Array(0.0, -8.0, 0.0))
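
One caveat: RDD.zip assumes both RDDs have the same number of partitions and the same number of elements in each partition. If that does not hold for your data, one possible workaround is to key each row by its index and join on that key. This is only a sketch; the name subtractByIndex is made up for illustration, and the join adds a shuffle, so it will be slower than zip.

```scala
import org.apache.spark.rdd.RDD

// Hypothetical helper (not part of Spark): row-wise subtraction that does not
// depend on the two RDDs being partitioned identically.
def subtractByIndex(left: RDD[Array[Double]],
                    right: RDD[Array[Double]]): RDD[Array[Double]] = {
  // Attach each row's position and use it as the key.
  val l = left.zipWithIndex().map { case (row, i) => (i, row) }
  val r = right.zipWithIndex().map { case (row, i) => (i, row) }

  l.join(r)        // match rows that share an index
    .sortByKey()   // restore the original row order
    .values
    .map { case (x, y) => x.zip(y).map { case (p, q) => p - q } }
}
```

With the rdd1 and rdd2 defined above, it would be called as subtractByIndex(rdd1, rdd2).collect().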

-sujit


On Wed, Sep 23, 2015 at 12:23 AM, Zhiliang Zhu <zchl.j...@yahoo.com> wrote:

> There is a matrix add API. Perhaps negate each element of every row in rdd2,
> then take rdd1 and rdd2 and call add?
>
> Or are there other ways ...
>
>
>
> On Wednesday, September 23, 2015 3:11 PM, Zhiliang Zhu <
> zchl.j...@yahoo.com> wrote:
>
>
> Hi All,
>
> There are two RDDs: RDD<Array<float>> rdd1 and RDD<Array<float>> rdd2.
> That is to say, rdd1 and rdd2 are similar to a DataFrame, or to matrices
> with the same number of rows and columns.
>
> I would like to get RDD<Array<float>> rdd3, where each element of rdd3 is
> the difference between the elements of rdd1 and rdd2 at the same position,
> similar to matrix subtraction:
> rdd3<i, j> = rdd1<i, j> - rdd2<i, j> ...
>
> It seems very difficult to do this kind of matrix arithmetic, even for
> add, subtract, multiply, diff, etc.
>
> I would appreciate your help very much.
> Zhiliang
>
>
>
>
>
