Unless you use a TotalOrderPartitioner, the output is sorted only within each part-* file, not across files.
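
To make the difference concrete, here is a small stand-alone Java sketch. It does not use the Hadoop API at all; it only simulates the behaviour. The class name, the keys, and the split points 3 and 6 are made up for illustration. Each partition is sorted on its own, the partitions are read back in part-file order, and the result is checked for global order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.IntUnaryOperator;

// Hypothetical simulation, not Hadoop code.
public class PartitionOrderDemo {

    // Route each key to a partition, sort every partition independently
    // (one sorted "part file" per reducer), then concatenate the partitions
    // in index order, as if reading part-00000, part-00001, ...
    static List<Integer> runJob(List<Integer> keys, int numPartitions,
                                IntUnaryOperator partitioner) {
        List<List<Integer>> parts = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            parts.add(new ArrayList<>());
        }
        for (int key : keys) {
            parts.get(partitioner.applyAsInt(key)).add(key);
        }
        List<Integer> concatenated = new ArrayList<>();
        for (List<Integer> part : parts) {
            Collections.sort(part);
            concatenated.addAll(part);
        }
        return concatenated;
    }

    // True if the list is in non-decreasing order.
    static boolean isSorted(List<Integer> values) {
        for (int i = 1; i < values.size(); i++) {
            if (values.get(i - 1) > values.get(i)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Integer> keys = Arrays.asList(7, 2, 9, 4, 1, 8, 3, 6, 5, 0);
        int numPartitions = 3;

        // Hash-style partitioning: neighbouring keys land in different files.
        List<Integer> hashed = runJob(keys, numPartitions,
                k -> (Integer.hashCode(k) & Integer.MAX_VALUE) % numPartitions);

        // Range partitioning with assumed split points 3 and 6:
        // every key in partition i is smaller than every key in partition i + 1.
        List<Integer> ranged = runJob(keys, numPartitions,
                k -> k < 3 ? 0 : (k < 6 ? 1 : 2));

        System.out.println(hashed + " sorted=" + isSorted(hashed));
        // prints [0, 3, 6, 9, 1, 4, 7, 2, 5, 8] sorted=false
        System.out.println(ranged + " sorted=" + isSorted(ranged));
        // prints [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] sorted=true
    }
}
```

The split points are hard-coded here only to keep the sketch short. In a real job they would have to match the actual key distribution, otherwise some reducers end up with far more data than others.
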
See http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/lib/TotalOrderPartitioner.html for achieving a total order sort output, which would give you what you want.

On Tue, Nov 22, 2011 at 7:57 PM, Leon Mergen <[email protected]> wrote:
> Hello,
>
> I wasn't able to find this information in the documentation anywhere, but
> are the part-* output files guaranteed to be sorted? As in, when
> traversing the files as part-00000, part-00001, part-00002, etc, am I
> guaranteed in getting the output results sorted? Or is it only sorted
> within a single file ?
>
> Thanks in advance!
>
> Regards,
>
> Leon Mergen
>

--
Harsh J
