Merge pull request #38 from AndreSchumacher/pyspark_sorting

SPARK-705: implement sortByKey() in PySpark
This PR implements a RangePartitioner in Python and uses its partition IDs to get a global sort in PySpark.

Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/b4fa11f6
Tree: http://git-wip-us.apache.org/repos/asf/incubator-spark/tree/b4fa11f6
Diff: http://git-wip-us.apache.org/repos/asf/incubator-spark/diff/b4fa11f6

Branch: refs/heads/master
Commit: b4fa11f6c96ee37ecd30231c1e22630055f52115
Parents: 19d445d fdbae41
Author: Matei Zaharia <ma...@eecs.berkeley.edu>
Authored: Wed Oct 9 11:59:47 2013 -0700
Committer: Matei Zaharia <ma...@eecs.berkeley.edu>
Committed: Wed Oct 9 11:59:47 2013 -0700

----------------------------------------------------------------------
 python/pyspark/rdd.py | 48 +++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 47 insertions(+), 1 deletion(-)
----------------------------------------------------------------------
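
For readers unfamiliar with the technique named in the commit message, the idea of range partitioning followed by a per-partition sort can be sketched in plain Python as below. This is not the code added to python/pyspark/rdd.py; the helper names (sample_bounds, range_partition_id, sort_by_key), the sample size, and the use of in-memory lists in place of RDD partitions are all assumptions made for illustration.

```python
import bisect
import random


def sample_bounds(keys, num_partitions, sample_size=100, seed=0):
    """Pick up to num_partitions - 1 boundary keys from a random sample.

    Hypothetical helper: a real implementation would sample the
    distributed dataset rather than an in-memory list.
    """
    if num_partitions <= 1 or not keys:
        return []
    rng = random.Random(seed)
    sample = sorted(rng.sample(keys, min(sample_size, len(keys))))
    # Evenly spaced cut points in the sorted sample.
    return [sample[len(sample) * (i + 1) // num_partitions]
            for i in range(num_partitions - 1)]


def range_partition_id(key, bounds):
    """Map a key to a partition ID in the range 0..len(bounds).

    Partition i receives the keys k with bounds[i-1] < k <= bounds[i],
    so every key in partition i is <= every key in partition i + 1.
    """
    return bisect.bisect_left(bounds, key)


def sort_by_key(pairs, num_partitions=4):
    """Return a list of partitions whose concatenation is sorted by key."""
    keys = [k for k, _ in pairs]
    bounds = sample_bounds(keys, num_partitions)
    partitions = [[] for _ in range(num_partitions)]
    for k, v in pairs:
        partitions[range_partition_id(k, bounds)].append((k, v))
    # Sorting each partition locally is enough: the partition IDs
    # already order the partitions relative to one another.
    return [sorted(p, key=lambda kv: kv[0]) for p in partitions]


if __name__ == "__main__":
    data = [(k, str(k)) for k in [5, 3, 9, 1, 7, 2, 8, 6, 4, 0]]
    for pid, part in enumerate(sort_by_key(data, num_partitions=3)):
        print(pid, part)
```

Because the boundaries come from a sample, partition sizes may be uneven, but the global order holds regardless of which boundaries are chosen.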