[
https://issues.apache.org/jira/browse/CRUNCH-51?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13458644#comment-13458644
]
Rahul Sharma commented on CRUNCH-51:
------------------------------------
I just realized that it is a log statement that I wrote back in the Sort
function. We only need to create the partition file when the number of
reducers is greater than 1. So that number corresponds to the number of keys
in the partition file; if it is greater than 1, then we create it. I think we
can improve the log statement.
I tested it with string data and Avro data, in both ascending and descending
order. It worked fine. Please have a go at it and let me know if it still
gives issues.
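To illustrate the idea of creating a partition file only when there is more
than one reducer, here is a minimal sketch. It is not the code from the
attached patch: the class TotalOrderSketch and the methods splitPoints and
partitionFor are hypothetical names, keys are plain strings, and only
ascending order is handled. It assumes the usual total-order scheme in which
n reducers need n - 1 split points taken from a sample of the keys.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; not part of Crunch or of the patch.
public class TotalOrderSketch {

    // Picks (numReducers - 1) split points from a sample of keys.
    // With a single reducer the result is empty, so there is nothing
    // to write to a partition file.
    static List<String> splitPoints(List<String> sample, int numReducers) {
        List<String> points = new ArrayList<>();
        if (numReducers <= 1 || sample.isEmpty()) {
            return points;
        }
        List<String> sorted = new ArrayList<>(sample);
        Collections.sort(sorted);
        for (int i = 1; i < numReducers; i++) {
            // Evenly spaced positions in the sorted sample.
            int idx = (int) ((long) i * sorted.size() / numReducers);
            points.add(sorted.get(idx));
        }
        return points;
    }

    // Maps a key to a reducer index: the count of split points <= key.
    // Assumes the split points are sorted and distinct.
    static int partitionFor(String key, List<String> points) {
        int pos = Collections.binarySearch(points, key);
        return pos >= 0 ? pos + 1 : -(pos + 1);
    }
}
```

With such a scheme every key sent to reducer i sorts before every key sent to
reducer i + 1, so concatenating the reducer outputs in order gives a total
order. Descending order would need the reverse comparator in both methods.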
> PCollection#sort relies on using a single reducer for total order sorting
> -------------------------------------------------------------------------
>
> Key: CRUNCH-51
> URL: https://issues.apache.org/jira/browse/CRUNCH-51
> Project: Crunch
> Issue Type: Improvement
> Affects Versions: 0.3.0
> Reporter: Gabriel Reid
> Attachments: 0001-CRUNCH-51-Total-Order-Sort.patch, CRUNCH-51.patch,
> CRUNCH-51.patch, SortTest.java
>
>
> The total-order sorting provided by the Sort class (and therefore
> PCollection#sort) relies on using a single reducer in order to provide
> total-order sorting. This is very inefficient for large datasets, and should
> be replaced with a total order partitioner instead.
> For more information, see CRUNCH-23 (and possibly also MAPREDUCE-4574).