Hi Alexander,

Thanks for your reply.

In the custom RDD there are some fields I have defined so that both the
custom method and the compute method can see and operate on them. Can a
method in an implicit class do that as well? A simplified sketch follows
below; see also the P.S. at the bottom on Ted's point about the return
type.
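For illustration, here is a simplified sketch of what I mean (bulkBuffer
is a made-up placeholder for my real fields, and the method bodies are
stripped down):

import scala.collection.mutable.ArrayBuffer

import org.apache.spark.{Dependency, OneToOneDependency, Partition, Partitioner, TaskContext}
import org.apache.spark.rdd.RDD

class MyRDD[K, V](var base: RDD[(K, V)], part: Partitioner)
  extends RDD[(K, V)](base.context, Nil) {

  // Per-instance state that both customMethod and compute are meant to
  // read and update.
  private val bulkBuffer = new ArrayBuffer[(K, (V, Int))]()

  def customMethod(bulk: ArrayBuffer[(K, (V, Int))]): MyRDD[K, V] = {
    bulkBuffer ++= bulk // the custom method writes the shared field
    this
  }

  override def compute(split: Partition, context: TaskContext): Iterator[(K, V)] = {
    // The real code consults bulkBuffer here; simplified to a
    // pass-through of the base data.
    base.iterator(split, context)
  }

  override protected def getPartitions: Array[Partition] = base.partitions

  override protected def getDependencies: Seq[Dependency[_]] =
    Seq(new OneToOneDependency(base))
}

Since an implicit class builds a fresh wrapper object on every call, it
seems to have nowhere to keep such per-RDD state between calls. Am I
missing something?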
On Mon, Mar 28, 2016 at 1:09 AM, Alexander Krasnukhin <the.malk...@gmail.com> wrote:

> Extending breaks chaining and is not nice. I think it is much better to
> write an implicit class with extra methods. This way you add new methods
> without touching the hierarchy at all, i.e.
>
> object RddFunctions {
>   implicit class RddFunctionsImplicit[T](rdd: RDD[T]) {
>     /** Cache an RDD and name it in one step. */
>     def cacheNamed(name: String) = {
>       rdd.cache.setName(name)
>     }
>   }
> }
>
> ...
>
> import RddFunctions._
>
> val rdd = ...
> rdd.cacheNamed("banana")
>
> ...
>
> On Sun, Mar 27, 2016 at 6:50 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>
>> bq. def customable(partitioner: Partitioner): RDD[(K, V)] =
>> self.withScope {
>>
>> In the above, you declare the return type as RDD, while you actually
>> intended to declare MyRDD as the return type.
>> Or, you can cast myrdd to MyRDD in spark-shell.
>>
>> BTW, I don't think it is good practice to add a custom method to the
>> base RDD.
>>
>> On Sun, Mar 27, 2016 at 9:44 AM, Tenghuan He <tenghua...@gmail.com>
>> wrote:
>>
>>> Hi Ted,
>>>
>>> The code is run in spark-shell:
>>>
>>> scala> val part = new org.apache.spark.HashPartitioner(10)
>>> scala> val baseRDD = sc.parallelize(1 to 100000).map(x => (x,
>>> "hello")).partitionBy(part).cache()
>>> scala> val myrdd = baseRDD.customable(part) // here customable is a
>>> method added to the abstract RDD to create MyRDD
>>> myrdd: org.apache.spark.rdd.RDD[(Int, String)] = MyRDD[3] at
>>> customable at <console>:28
>>> scala> myrdd.customMethod(bulk)
>>> error: value customMethod is not a member of
>>> org.apache.spark.rdd.RDD[(Int, String)]
>>>
>>> and the customable method in PairRDDFunctions.scala is
>>>
>>> def customable(partitioner: Partitioner): RDD[(K, V)] = self.withScope {
>>>   new MyRDD[K, V](self, partitioner)
>>> }
>>>
>>> Thanks :)
>>>
>>> On Mon, Mar 28, 2016 at 12:28 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>>>
>>>> Can you show the full stack trace (or top 10 lines) and the snippet
>>>> using your MyRDD?
>>>>
>>>> Thanks
>>>>
>>>> On Sun, Mar 27, 2016 at 9:22 AM, Tenghuan He <tenghua...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi everyone,
>>>>>
>>>>> I am creating a custom RDD which extends RDD and adds a custom
>>>>> method; however, the custom method cannot be found.
>>>>> The custom RDD looks like the following:
>>>>>
>>>>> class MyRDD[K, V](
>>>>>     var base: RDD[(K, V)],
>>>>>     part: Partitioner
>>>>> ) extends RDD[(K, V)](base.context, Nil) {
>>>>>
>>>>>   def customMethod(bulk: ArrayBuffer[(K, (V, Int))]): MyRDD[K, V] = {
>>>>>     // ... custom code here
>>>>>   }
>>>>>
>>>>>   override def compute(split: Partition, context: TaskContext): Iterator[(K, V)] = {
>>>>>     // ... custom code here
>>>>>   }
>>>>>
>>>>>   override protected def getPartitions: Array[Partition] = {
>>>>>     // ... custom code here
>>>>>   }
>>>>>
>>>>>   override protected def getDependencies: Seq[Dependency[_]] = {
>>>>>     // ... custom code here
>>>>>   }
>>>>> }
>>>>>
>>>>> In spark-shell it turns out that the overridden methods work well,
>>>>> but when calling myrdd.customMethod(bulk) it throws:
>>>>> <console>:33: error: value customMethod is not a member of
>>>>> org.apache.spark.rdd.RDD[(Int, String)]
>>>>>
>>>>> Can anyone tell me why the custom method cannot be found?
>>>>> Or do I have to add customMethod to the abstract RDD and then
>>>>> override it in the custom RDD?
>>>>>
>>>>> PS: Spark version: 1.5.1
>>>>>
>>>>> Thanks & Best regards
>>>>>
>>>>> Tenghuan
>>>>>
>>>>
>>>
>>
>
> --
> Regards,
> Alexander
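P.S. On Ted's return-type point: if I understand it correctly, the
problem is that the static type of myrdd is RDD[(Int, String)] even
though its runtime class is MyRDD, so the compiler cannot see
customMethod. A minimal sketch of the two fixes Ted mentions (untested;
names taken from the thread):

// In PairRDDFunctions.scala: declare the concrete return type so the
// compiler keeps track of the MyRDD-specific methods.
def customable(partitioner: Partitioner): MyRDD[K, V] = self.withScope {
  new MyRDD[K, V](self, partitioner)
}

// Or keep the RDD return type and downcast in spark-shell:
val myrdd = baseRDD.customable(part).asInstanceOf[MyRDD[Int, String]]
myrdd.customMethod(bulk) // now resolves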