[
https://issues.apache.org/jira/browse/SPARK-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375323#comment-14375323
]
DoingDone9 edited comment on SPARK-2926 at 3/23/15 3:07 AM:
------------------------------------------------------------
Hi, I tested sortByKey with spark-perf (https://github.com/databricks/spark-perf),
but I got the following results:
Spark 1.3:
{"time":452.453},{"time":457.929},{"time":452.295}
With your PR:
{"time":471.215},{"time":460.59},{"time":463.795}
Could you tell me whether I did something incorrectly? Thank you.
> Add MR-style (merge-sort) SortShuffleReader for sort-based shuffle
> ------------------------------------------------------------------
>
> Key: SPARK-2926
> URL: https://issues.apache.org/jira/browse/SPARK-2926
> Project: Spark
> Issue Type: Improvement
> Components: Shuffle
> Affects Versions: 1.1.0
> Reporter: Saisai Shao
> Assignee: Saisai Shao
> Attachments: SortBasedShuffleRead.pdf, Spark Shuffle Test Report(contd).pdf,
> Spark Shuffle Test Report.pdf
>
>
> Spark has already integrated sort-based shuffle write, which greatly
> improves IO performance and reduces memory consumption when the number
> of reducers is very large. On the reducer side, however, it still uses the
> hash-based shuffle reader implementation, which in some situations ignores
> the ordering of the map output data.
> Here we propose an MR-style, sort-merge-like shuffle reader for sort-based
> shuffle to further improve its performance.
> Work-in-progress code and a performance test report will be posted later,
> once some unit test bugs are fixed.
> Any comments would be greatly appreciated.
> Thanks a lot.
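
To illustrate the idea in the description above, here is a minimal Python sketch of a merge-sort style reduce-side read. It is not the SortShuffleReader from the PR and not Spark code; the function names and the sample data are hypothetical. It assumes each map output arrives as a run of (key, value) pairs already sorted by key, so the reducer can do a streaming k-way merge instead of re-sorting or hashing everything.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def merge_sorted_map_outputs(map_outputs):
    """Lazily k-way merge runs that are each already sorted by key.

    Hypothetical helper: each element of map_outputs is an iterable of
    (key, value) pairs in ascending key order.
    """
    return heapq.merge(*map_outputs, key=itemgetter(0))


def reduce_by_key(merged, combine):
    """Combine values of equal, adjacent keys in a key-sorted stream."""
    for key, group in groupby(merged, key=itemgetter(0)):
        values = (value for _, value in group)
        acc = next(values)
        for value in values:
            acc = combine(acc, value)
        yield key, acc


# Three made-up "map outputs", each sorted by key.
runs = [
    [("a", 1), ("c", 3)],
    [("a", 2), ("b", 5)],
    [("b", 1), ("d", 4)],
]

merged = list(merge_sorted_map_outputs(runs))
totals = list(reduce_by_key(merged, lambda x, y: x + y))
print(totals)
```

Because the merge only keeps one head element per run in memory, equal keys come out adjacent and can be combined in a single pass, which is the property a hash-based reader does not exploit.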