GitHub user jerryshao commented on the pull request:

https://github.com/apache/spark/pull/1242#issuecomment-55352215

Hi @rxin, sorry to bring this up again. Are you planning to merge this TeraSort example into Spark? I think it would be a good standard benchmark for testing shuffle performance.

Also, I think the generated records should be copied; otherwise they will lead to errors in sort-based shuffle, such as the one reported in [SPARK-2967](https://issues.apache.org/jira/browse/SPARK-2967).

Finally, is it intentional that this does not do in-partition sorting, or will that be added later? Thanks a lot.
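To make the copying concern concrete, here is a minimal sketch in plain Python rather than Spark code. It assumes the record generator reuses a single mutable buffer and that the consumer holds references to records before sorting them; the function names `generate_reused` and `generate_copied` are hypothetical and do not come from the pull request.

```python
def generate_reused(n):
    """Yield n records while reusing one mutable buffer (the risky pattern)."""
    buf = bytearray(4)
    for i in range(n):
        buf[:] = i.to_bytes(4, "big")  # overwrite the buffer in place
        yield buf                      # the same object is yielded every time


def generate_copied(n):
    """Yield n records, copying each one so it is independent of the buffer."""
    buf = bytearray(4)
    for i in range(n):
        buf[:] = i.to_bytes(4, "big")
        yield bytes(buf)               # immutable copy of the current contents


# A consumer that buffers all records before processing them,
# which is the assumed behavior of a sort-based shuffle.
reused_values = [int.from_bytes(r, "big") for r in list(generate_reused(3))]
copied_values = [int.from_bytes(r, "big") for r in list(generate_copied(3))]

print(reused_values)  # every entry reflects the last record written
print(copied_values)  # each entry keeps its own value
```

Without the copy, every buffered reference points at the same underlying bytes, so the consumer sees only the last record; copying each record on emit avoids this at the cost of one extra allocation per record.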