If you had an RDD[((i, j, k), value)], then you could reduce by j by essentially mapping j into the key slot, doing the reduce, and then mapping it back:
rdd.map { case ((i, j, k), v) => (j, (i, k, v)) }
   .reduceByKey( ... )
   .map { case (j, (i, k, v)) => ((i, j, k), v) }

It's not pretty, but I've had to use this pattern before too.

On Thu, Jan 2, 2014 at 6:23 PM, Aureliano Buendia <[email protected]> wrote:

> Hi,
>
> How is it possible to reduce by multidimensional keys?
>
> For example, if every line is a tuple like:
>
> (i, j, k, value)
>
> or, alternatively:
>
> ((i, j, k), value)
>
> how can spark handle reducing over j, or k?
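As a concrete instantiation of the pattern above, here is a minimal self-contained sketch that sums values over j on a local SparkContext. The object name, the sample data, and the choice of summation are made up for illustration; only the key-remapping trick itself comes from the reply above.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object ReduceByMiddleKey {
  def main(args: Array[String]): Unit = {
    // Local context purely for demonstration purposes.
    val sc = new SparkContext(
      new SparkConf().setAppName("reduce-by-j").setMaster("local[*]"))

    // ((i, j, k), value) tuples; sample data is invented.
    val rdd = sc.parallelize(Seq(
      ((1, 10, 100), 1.0),
      ((2, 10, 200), 2.0),
      ((1, 20, 100), 3.0)
    ))

    // Move j into the key slot, then combine per j. Here the
    // combining function is a plain sum over the values.
    val sumByJ = rdd
      .map { case ((i, j, k), v) => (j, v) }
      .reduceByKey(_ + _)

    sumByJ.collect().foreach(println)
    sc.stop()
  }
}
```

With the sample data, j = 10 accumulates 1.0 + 2.0 and j = 20 keeps 3.0. If the reduce needs the i and k coordinates, carry them in the value as (j, (i, k, v)) instead, as the snippet above does, and decide in the combining function how conflicting (i, k) pairs should be merged.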
