You can work out the complexity of these operators by reading RDD.scala. There you will find, for example, what happens when you call map on an RDD: it is a plain Scala map over each partition's Iterator[T], so it is a single linear pass with no data movement. As I recall, distinct is implemented by mapping each element to a key-value pair and then combining by key (reduceByKey), so it involves a shuffle.
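To make the difference concrete, here is a minimal sketch in plain Scala with no Spark dependency. It is a toy model, not Spark's actual code: `RddCostSketch`, `Partitions`, `mapPartitions` and `distinct` are hypothetical names, and the hash bucketing only stands in for a real shuffle.

```scala
// Toy model of an RDD: a sequence of partitions, each read through an Iterator.
// All names here are hypothetical and only illustrate the cost structure.
object RddCostSketch {
  type Partitions[T] = Seq[Seq[T]]

  // map-like step: one pass over each partition's iterator.
  // Work is linear in the number of elements; nothing moves between partitions.
  def mapPartitions[T, U](parts: Partitions[T])(f: T => U): Partitions[U] =
    parts.map(p => p.iterator.map(f).toSeq)

  // distinct-like step: key each element, re-bucket by hash (stand-in for a
  // shuffle), then keep one entry per key inside each bucket.
  def distinct[T](parts: Partitions[T], numPartitions: Int): Partitions[T] = {
    // Pair every element with a dummy value so it can be combined by key.
    val keyed: Seq[(T, Null)] = parts.flatMap(_.iterator.map(x => (x, null)))
    // Every element may land in a different partition than it started in.
    val shuffled = keyed.groupBy { case (k, _) =>
      Math.floorMod(k.hashCode, numPartitions)
    }
    (0 until numPartitions).map { i =>
      shuffled.getOrElse(i, Seq.empty[(T, Null)]).groupBy(_._1).keys.toSeq
    }
  }
}
```

In this model the map-like step costs one pass per partition, while the distinct-like step also pays for moving every element to its target bucket, which is why the amount of shuffled data and the number of partitions are the attributes worth factoring in.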
Zoltán

On Sun, Apr 26, 2015 at 7:43 PM Vijayasarathy Kannan <kvi...@vt.edu> wrote:

> What is the complexity of transformations and actions in Spark, such as
> groupBy(), flatMap(), collect(), etc.?
>
> What attributes do we need to factor (such as number of partitions) in
> while analyzing codes using these operations?