What is the complexity of transformations and actions in Spark, such as
groupBy(), flatMap(), collect(), etc.?
What factors (such as the number of partitions) do we need to take into
account when analyzing code that uses these operations?
You can work out the complexity of these operators by reading RDD.scala.
There you will find, for example, what happens when you call map on an RDD:
within each partition, it applies a plain Scala map function to an Iterator
of type T. distinct is implemented with mapping and grouping
on the