GitHub user xiajunluan commented on the pull request:
https://github.com/apache/spark/pull/931#issuecomment-45059292
Hi @matei,
1. I will measure the performance impact after adding the pluggable
comparator.
2. I agree with you. If we only implement sortByKey, we should not use a
combiner (that is meant for the combineByKey-related APIs): it would have to
aggregate the values for each key first and then, after sorting, unfold the
values for each key again. In this patch, I would like to reuse the current
classes and fix this bug quickly. In the long term, I think we should write
separate classes, similar to AppendOnlyMap and ExternalAppendOnlyMap, for
sortByKey that leave out functions such as createCombiner and mergeValue. I
will try to design these classes later.
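
To make the trade-off in point 2 concrete, here is a rough sketch in plain Python (not Spark code; the function names `sort_via_combiner` and `sort_directly` are hypothetical and only for illustration). The first function mimics the combiner route: group values per key, sort the keys, then unfold each group. The second sorts the records by key directly, which is roughly what a dedicated sortByKey structure could do without createCombiner or mergeValue.

```python
from collections import defaultdict


def sort_via_combiner(records):
    """Combiner-style route: aggregate per key, sort keys, unfold groups."""
    groups = defaultdict(list)
    for key, value in records:
        # Stands in for createCombiner / mergeValue: collect values per key.
        groups[key].append(value)
    result = []
    for key in sorted(groups):
        # Unfold the aggregated values back into (key, value) records.
        for value in groups[key]:
            result.append((key, value))
    return result


def sort_directly(records):
    """Direct route: sort the records by key, with no aggregation step."""
    # sorted() is stable, so records with equal keys keep their input order.
    return sorted(records, key=lambda kv: kv[0])


records = [("b", 1), ("a", 2), ("b", 3), ("a", 4)]
via_combiner = sort_via_combiner(records)
direct = sort_directly(records)
```

In this sketch both routes return the same ordering; the difference is that the combiner route builds and then tears down an intermediate group for every key.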