I am sorting a data frame using something like:

```scala
val sortedDF = df.orderBy(df("score").desc)
```

The sorting itself is really fast. The issue is that, after sorting, the resulting data frame `sortedDF` appears to be in a single partition, which becomes a problem when I try to execute another operation on this new data frame (e.g. `sortedDF.limit(1000000)`). I then get an error like the following:

```
Job aborted due to stage failure: Total size of serialized results of 194 tasks (5.0 GB) is bigger than spark.driver.maxResultSize (5.0 GB)
```

I have already tried repartitioning `sortedDF` before doing any operation on it, but the same error appears.

*Is there any smarter way to use DataFrame `orderBy` on Spark, such that I do not have this problem?*

The Spark version I am currently using is 1.3.0, and due to company policy it is not possible for me to try a newer version. Thanks!

-- Cesar Flores
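For reference, here is a minimal sketch of what I am doing, including the repartition attempt (the partition count `200` is just an arbitrary value I picked for illustration; `df` is an existing DataFrame with a numeric `score` column):

```scala
// Sort descending by score — this part is fast.
val sortedDF = df.orderBy(df("score").desc)

// Attempted workaround: repartition the sorted result before any
// further operation. In my case this does not help — the same
// spark.driver.maxResultSize error appears afterwards.
val repartitioned = sortedDF.repartition(200)

// The operation that triggers the "Total size of serialized results
// ... is bigger than spark.driver.maxResultSize" failure.
val top = repartitioned.limit(1000000)
```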
The sorting is really fast. The issue I have is that after sorting, the resulting data frame sortedDF appears to be in a single partition, which is a problem because when I try to execute another operation in this new data frame (i.e sortedDF.limit(1000000)) I have an error like the following: Job aborted due to stage failure: Total size of serialized results of 194 tasks (5.0 GB) is bigger than spark.driver.maxResultSize (5.0 GB) I have already tried to repartition the resulting sortedDF before doing any operation on it, but the same error appears. *Is there any smarter way to use dataframe orderBy on Spark, such that I do not have this problem?* The current version of spark I am using is 1.3.0, and due to company policy it is not possible for me to try it in a newer version. Thanks!!! -- Cesar Flores