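You can work around it with a cast:

val rdd: RDD[SomeData] = ...  // some RDD[SomeData]

rdd.asInstanceOf[RDD[Tuple2[Int, Data]]].reduceByKey(...)

The cast is safe at runtime because every SomeData is a Tuple2[Int, Data];
it only changes the static type so that the implicit conversion to
PairRDDFunctions applies.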



I think RDD is invariant in T for historical reasons, and that would be
fairly hard to change now.
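
To make the invariance point concrete, here is a minimal sketch in plain Scala, without Spark. All names in it (Box, Animal, Dog, AnimalOps, VarianceSketch) are hypothetical stand-ins: Box plays the role of RDD, and AnimalOps plays the role of the implicitly added pair functions. It only illustrates why the implicit conversion is not found for the subtype and why the cast makes it apply; it is not how Spark itself is implemented.

```scala
import scala.language.implicitConversions

// Hypothetical stand-ins, not Spark classes.
class Animal(val name: String)
class Dog(name: String) extends Animal(name)

// Invariant in T, like RDD[T]: Box[Dog] is NOT a subtype of Box[Animal].
class Box[T](val items: Seq[T])

// Extra operations that exist only for Box[Animal],
// in the same way the pair functions exist only for RDD[(K, V)].
class AnimalOps(self: Box[Animal]) {
  def names: Seq[String] = self.items.map(_.name)
}

object VarianceSketch {
  // Implicit conversion that applies to Box[Animal] only.
  implicit def toAnimalOps(box: Box[Animal]): AnimalOps = new AnimalOps(box)

  def main(args: Array[String]): Unit = {
    val dogs = new Box[Dog](Seq(new Dog("rex"), new Dog("fido")))

    // dogs.names
    // The line above does not compile: because Box is invariant,
    // Box[Dog] does not conform to Box[Animal], so the conversion is not found.

    // The cast changes only the static type. Type arguments are erased at
    // runtime, and every Dog really is an Animal, so the call is safe here.
    val names = dogs.asInstanceOf[Box[Animal]].names
    println(names)
  }
}
```

If Box were declared as Box[+T], the commented-out line would compile without the cast, which is the change the question about covariance is asking for.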



--
Reynold Xin, AMPLab, UC Berkeley
http://rxin.org



On Thu, Sep 26, 2013 at 6:25 AM, Han JU <[email protected]> wrote:

> Hi,
>
> I have some classes like
>
> abstract class RawData[+K, +V](id: K, data: V) extends Tuple2[K, V](id,
> data)
>
> case class SomeData(id: Int, data: Data) extends RawData[Int, Data](id,
> data)
>
>
> to model some input data.
>
> Then I found out that RDD[SomeData] doesn't have access to
> PairRDDFunctions methods such as join, even though SomeData is a subclass
> of Tuple2.
>
> I guess the problem comes from the invariance of T in RDD[T]:
> RDD[SomeData] is not a subtype of RDD[Tuple2], so the implicit conversion
> doesn't apply.
>
> So,
>
> 1) How can I work around this? How do you model data with lots of fields
> that need to be joined? I don't really want to have things like "_._2._2"
> but rather "_.id" or "_.data.someFields".
>
> 2) Is there a reason for the invariance of T in RDD? Could it be covariant?
>
>
> Thanks!
>
> --
> *JU Han*
>
> Data Engineer @ Botify.com
>
> +33 0619608888
>
