One rule of thumb is to use rdd.toDebugString and check the lineage for a
ShuffledRDD. As long as the RDD does not need to be restructured (i.e. no
shuffle), operations can be pipelined on each partition.

"rdd.toDebugString" is your friend :-)

-kr, Gerard.


On Mon, Nov 17, 2014 at 7:37 AM, Mukesh Jha <me.mukesh....@gmail.com> wrote:

> Thanks, I did go through the video and it was very informative, but I think I was
> looking for the Transformations section on the page
> https://spark.apache.org/docs/0.9.1/scala-programming-guide.html.
>
>
> On Mon, Nov 17, 2014 at 10:31 AM, Samarth Mailinglist <
> mailinglistsama...@gmail.com> wrote:
>
>> Check this video out:
>> https://www.youtube.com/watch?v=dmL0N3qfSc8&list=UURzsq7k4-kT-h3TDUBQ82-w
>>
>> On Mon, Nov 17, 2014 at 9:43 AM, Deep Pradhan <pradhandeep1...@gmail.com>
>> wrote:
>>
>>> Hi,
>>> Is there any way to know which of my functions performs better in Spark?
>>> In other words, say I have achieved the same thing using two different
>>> implementations. How do I judge which implementation is better than the
>>> other? Is processing time the only metric we can use to claim that one
>>> implementation is better than the other?
>>> Can anyone please share some thoughts on this?
>>>
>>> Thank You
>>>
>>
>>
>
>
> --
>
>
> Thanks & Regards,
>
> *Mukesh Jha <me.mukesh....@gmail.com>*
>
