You can work out the complexity of these operators by reading
RDD.scala. There you will find, for example, what happens when you call
map on an RDD: it is a plain Scala map over an Iterator of type T.
Distinct is implemented by mapping and grouping on the iterator, as I
recall.
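
To make that concrete, here is a rough sketch in plain Scala of how I would reason about the cost. It has no Spark dependency and is not Spark's actual code: the names PartitionSketch, mapPartition and distinctAcross are made up for illustration, and the in-memory groupBy only stands in for whatever grouping step Spark really performs between partitions.

```scala
// Plain-Scala sketch, no Spark dependency. All names here are hypothetical.
object PartitionSketch {

  // Map over one partition, modelled as a lazy Iterator.map.
  // Each element is visited once, so the cost is n calls to f
  // for a partition holding n elements.
  def mapPartition[T, U](part: Iterator[T])(f: T => U): Iterator[U] =
    part.map(f)

  // Distinct over several partitions, modelled as:
  //   map each element to a (key, placeholder) pair,
  //   group the pairs by key,
  //   keep one key per group.
  // The groupBy is an in-memory stand-in for the cross-partition step.
  def distinctAcross[T](parts: Seq[Iterator[T]]): Set[T] =
    parts.iterator
      .flatMap(_.map(x => (x, ()))) // key every element, one pass
      .toList
      .groupBy(_._1)                // hash-based grouping by key
      .keySet

  def main(args: Array[String]): Unit = {
    val parts = Seq(Iterator(1, 2, 2), Iterator(2, 3))
    println(mapPartition(Iterator(1, 2, 3))(_ * 2).toList) // List(2, 4, 6)
    println(distinctAcross(parts).toList.sorted)           // List(1, 2, 3)
  }
}
```

In this model the map side is linear in the number of elements per partition, while the grouping step is where data from different partitions has to meet, so that is the part worth checking in the real source.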

Zoltán

On Sun, Apr 26, 2015 at 7:43 PM Vijayasarathy Kannan <kvi...@vt.edu> wrote:

> What is the complexity of transformations and actions in Spark, such as
> groupBy(), flatMap(), collect(), etc.?
>
> What attributes do we need to factor (such as number of partitions) in
> while analyzing codes using these operations?
>
