Hi Alexander,
Thanks for your reply.

In the custom RDD, I have defined some fields so that both the custom
method and the compute method can see and operate on them. Can a method in
an implicit class do that as well?
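
For example, something like this is what I have in mind (just a sketch;
the field name bulk below is hypothetical):

import scala.collection.mutable.ArrayBuffer

object MyRddOps {
  // Wrapping the concrete MyRDD type (not the base RDD) keeps its
  // public fields visible here; compute sees the same fields because
  // they live on the RDD instance, not on this short-lived wrapper.
  implicit class MyRddOpsImplicit[K, V](rdd: MyRDD[K, V]) {
    def customMethod(extra: ArrayBuffer[(K, (V, Int))]): MyRDD[K, V] = {
      rdd.bulk ++= extra  // assumes MyRDD exposes a field named bulk
      rdd
    }
  }
}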

On Mon, Mar 28, 2016 at 1:09 AM, Alexander Krasnukhin <the.malk...@gmail.com
> wrote:

> Extending breaks chaining and is not nice. I think it is much better to
> write an implicit class with extra methods. This way you add new methods
> without touching the hierarchy at all, i.e.
>
> object RddFunctions {
>   implicit class RddFunctionsImplicit[T](rdd: RDD[T]) {
>     /**
>      * Cache RDD and name it in one step.
>      */
>     def cacheNamed(name: String) = {
>       rdd.cache.setName(name)
>     }
>   }
> }
>
> ...
>
> import RddFunctions._
>
> val rdd = ...
> rdd.cacheNamed("banana")
>
> ...
>
> On Sun, Mar 27, 2016 at 6:50 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>
>> bq.   def customable(partitioner: Partitioner): RDD[(K, V)] =
>> self.withScope {
>>
>> In the above, you declare the return type as RDD, while you actually
>> intended MyRDD to be the return type.
>> Alternatively, you can cast myrdd to MyRDD in spark-shell.
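>>
>> For example, a minimal sketch of the two options (assuming MyRDD is
>> compiled into your Spark build and visible in spark-shell):
>>
>>   // Option 1: declare the concrete return type so customMethod
>>   // remains visible on the result
>>   def customable(partitioner: Partitioner): MyRDD[K, V] = self.withScope {
>>     new MyRDD[K, V](self, partitioner)
>>   }
>>
>>   // Option 2: cast at the call site in spark-shell
>>   val myrdd = baseRDD.customable(part).asInstanceOf[MyRDD[Int, String]]
>>   myrdd.customMethod(bulk)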
>>
>> BTW, I don't think it is good practice to add a custom method to the base
>> RDD.
>>
>> On Sun, Mar 27, 2016 at 9:44 AM, Tenghuan He <tenghua...@gmail.com>
>> wrote:
>>
>>> Hi Ted,
>>>
>>> The codes are running in spark-shell
>>>
>>> scala> val part = new org.apache.spark.HashPartitioner(10)
>>> scala> val baseRDD = sc.parallelize(1 to 100000).map(x => (x,
>>> "hello")).partitionBy(part).cache()
>>> scala> val myrdd = baseRDD.customable(part)  // here customable is a
>>> method added to the abstract RDD to create MyRDD
>>> myrdd: org.apache.spark.rdd.RDD[(Int, String)] = MyRDD[3] at customable
>>> at <console>:28
>>> scala> myrdd.customMethod(bulk)
>>> error: value customMethod is not a member of
>>> org.apache.spark.rdd.RDD[(Int, String)]
>>>
>>> and the customable method in PairRDDFunctions.scala is
>>>
>>>   def customable(partitioner: Partitioner): RDD[(K, V)] = self.withScope {
>>>     new MyRDD[K, V](self, partitioner)
>>>   }
>>>
>>> Thanks:)
>>>
>>> On Mon, Mar 28, 2016 at 12:28 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>>>
>>>> Can you show the full stack trace (or top 10 lines) and the snippet
>>>> using your MyRDD ?
>>>>
>>>> Thanks
>>>>
>>>> On Sun, Mar 27, 2016 at 9:22 AM, Tenghuan He <tenghua...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi everyone,
>>>>>
>>>>>     I am creating a custom RDD which extends RDD and adds a custom
>>>>> method; however, the custom method cannot be found.
>>>>>     The custom RDD looks like the following:
>>>>>
>>>>> class MyRDD[K, V](
>>>>>     var base: RDD[(K, V)],
>>>>>     part: Partitioner
>>>>>   ) extends RDD[(K, V)](base.context, Nil) {
>>>>>
>>>>>   def customMethod(bulk: ArrayBuffer[(K, (V, Int))]): MyRDD[K, V] = {
>>>>>   // ... custom code here
>>>>>   }
>>>>>
>>>>>   override def compute(split: Partition, context: TaskContext):
>>>>> Iterator[(K, V)] = {
>>>>>   // ... custom code here
>>>>>   }
>>>>>
>>>>>   override protected def getPartitions: Array[Partition] = {
>>>>>   // ... custom code here
>>>>>   }
>>>>>
>>>>>   override protected def getDependencies: Seq[Dependency[_]] = {
>>>>>   // ... custom code here
>>>>>   }
>>>>> }
>>>>>
>>>>> In spark-shell, it turns out that the overridden methods work well,
>>>>> but when calling myrdd.customMethod(bulk), it throws:
>>>>> <console>:33: error: value customMethod is not a member of
>>>>> org.apache.spark.rdd.RDD[(Int, String)]
>>>>>
>>>>> Can anyone tell me why the custom method cannot be found?
>>>>> Or do I have to add the customMethod to the abstract RDD and then
>>>>> override it in the custom RDD?
>>>>>
>>>>> PS: Spark version: 1.5.1
>>>>>
>>>>> Thanks & Best regards
>>>>>
>>>>> Tenghuan
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
>
> --
> Regards,
> Alexander
>
