Generic types and pair RDDs

2014-04-01 Thread Daniel Siegmann
When my tuple type includes a generic type parameter, the pair RDD functions aren't available. Take for example the following (a join on two RDDs, taking the sum of the values):

    def joinTest(rddA: RDD[(String, Int)], rddB: RDD[(String, Int)]): RDD[(String, Int)] = {
      rddA.join(rddB).map { case (k, (a, b)) => (k, a + b) }
    }
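
[Editor's note: the archived message is truncated above. A minimal sketch of the generic variant that presumably triggers the problem, assuming the type parameter is named K as in the replies below: without a ClassTag for K in scope, the implicit conversion to PairRDDFunctions is not applied, so join is not found on RDD[(K, Int)].]

    import org.apache.spark.SparkContext._
    import org.apache.spark.rdd.RDD

    // Sketch: generic key type with no ClassTag context bound.
    // In Spark of this era this fails to compile with an error along the lines of
    // "value join is not a member of org.apache.spark.rdd.RDD[(K, Int)]".
    def joinTest[K](rddA: RDD[(K, Int)], rddB: RDD[(K, Int)]): RDD[(K, Int)] = {
      rddA.join(rddB).map { case (k, (a, b)) => (k, a + b) }
    }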

Re: Generic types and pair RDDs

2014-04-01 Thread Koert Kuipers
    import org.apache.spark.SparkContext._
    import org.apache.spark.rdd.RDD
    import scala.reflect.ClassTag

    def joinTest[K: ClassTag](rddA: RDD[(K, Int)], rddB: RDD[(K, Int)]): RDD[(K, Int)] = {
      rddA.join(rddB).map { case (k, (a, b)) => (k, a + b) }
    }
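
[Editor's note: for illustration only, not part of the thread. At a call site where K is concrete, the compiler supplies the ClassTag implicitly, so callers are unaffected by the added context bound.]

    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(new SparkConf().setAppName("joinTest").setMaster("local[*]"))
    val rddA = sc.parallelize(Seq(("a", 1), ("b", 2)))
    val rddB = sc.parallelize(Seq(("a", 10), ("b", 20)))

    // K is inferred as String, so ClassTag[String] is found implicitly.
    joinTest(rddA, rddB).collect()   // e.g. Array((a,11), (b,22)) -- order may vary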

Re: Generic types and pair RDDs

2014-04-01 Thread Aaron Davidson
Koert's answer is very likely correct. This implicit definition, which converts an RDD[(K, V)] to provide PairRDDFunctions, requires that a ClassTag be available for K: https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/SparkContext.scala#L1124 To fully understand what's
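
[Editor's note: the linked conversion looked roughly like the paraphrased sketch below at the time, not an exact quote of that line. The ClassTag context bounds on the conversion are what force generic code to carry a ClassTag for K.]

    // In object SparkContext (Spark ~1.0), approximately:
    implicit def rddToPairRDDFunctions[K: ClassTag, V: ClassTag](rdd: RDD[(K, V)]) =
      new PairRDDFunctions(rdd)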