You can do a cast:

    val rdd = ...  // some RDD[SomeData]
    rdd.asInstanceOf[RDD[Tuple2[Int, Data]]].reduceByKey(...)

It's invariant for historical reasons, I think. It would be fairly hard to change now.

--
Reynold Xin, AMPLab, UC Berkeley
http://rxin.org

On Thu, Sep 26, 2013 at 6:25 AM, Han JU <[email protected]> wrote:

> Hi,
>
> I have some classes like
>
> abstract class RawData[+K, +V](id: K, data: V) extends Tuple2[K, V](uid, data)
>
> case class SomeData(id: Int, data: Data) extends RawData[Int, Data](id, data)
>
> to model some input data.
>
> Then I find out that RDD[SomeData] doesn't have access to
> pairRDDFunctions, like join. But SomeData is indeed a subclass of Tuple2.
>
> I guess that the problem comes from the invariance of T in RDD[T], and
> RDD[SomeData] is not a subclass of RDD[Tuple2] so the implicit conversion
> won't work.
>
> So,
>
> 1) how could I work this around? How do you model data of lots of fields
> that need to be joined? I don't really want to have things like "_._2._2"
> but rather "_.id" or "_.data.someFields".
>
> 2) is there some reason for invariance of T in RDD? could it be covariant?
>
> Thanks!
>
> --
> *JU Han*
>
> Data Engineer @ Botify.com
>
> +33 0619608888
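As an alternative to the cast, and not something proposed in this thread, the records can be keyed explicitly before the pair operations are called, which keeps named access such as `_.id`. The sketch below is plain Scala rather than Spark code: `Box`, `PairOps`, and the `value` field on `Data` are hypothetical stand-ins used only to show why the implicit conversion applies to a container of tuples and not to a container of `SomeData`. With a real RDD, the analogous step would presumably be `rdd.map(d => (d.id, d))` or `rdd.keyBy(_.id)`.

```scala
// Hypothetical stand-in for an invariant container such as RDD[T]; not Spark's API.
final class Box[T](val items: Seq[T]) {
  def map[U](f: T => U): Box[U] = new Box(items.map(f))
}

object Box {
  // Pair-only operations, reached through an implicit conversion that applies
  // only when the element type is a Tuple2[K, V]. Because Box is invariant in T,
  // a Box[SomeData] never qualifies, whatever SomeData extends.
  implicit class PairOps[K, V](self: Box[(K, V)]) {
    def reduceByKey(f: (V, V) => V): Map[K, V] =
      self.items.groupBy(_._1).map { case (k, kvs) => k -> kvs.map(_._2).reduce(f) }
  }
}

// Plain case classes with named fields; the `value` field is an assumption.
final case class Data(value: Int)
final case class SomeData(id: Int, data: Data)

object KeyedExample {
  // Key each record by its id, then use the pair-only operation.
  def totals(records: Box[SomeData]): Map[Int, Int] =
    records
      .map(r => (r.id, r.data.value)) // Box[(Int, Int)], so PairOps applies
      .reduceByKey(_ + _)

  def main(args: Array[String]): Unit = {
    val records = new Box(Seq(
      SomeData(1, Data(10)),
      SomeData(1, Data(5)),
      SomeData(2, Data(7))
    ))
    println(totals(records))
  }
}
```

Keying with `map` costs one extra pass over the records but avoids both the unchecked cast and the need for the data classes to extend `Tuple2`.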
