A map-only job (one with zero reduce tasks) skips the standard shuffle-sort entirely. Map outputs are written directly to HDFS.
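To make the difference concrete, here is a minimal in-memory sketch of the two paths. It is a toy model in plain Python, not Hadoop API code: `mapper`, `run_map_only`, `run_with_reducers`, and `first_letter_partitioner` are hypothetical names invented for illustration, and the first-letter partitioner simply mirrors the "part-0 is all lines starting with a" idea from the quoted message. In the real Java API, zero reducers is requested with `job.setNumReduceTasks(0)`, and in that case the partitioner is never consulted.

```python
# Toy model of the two execution paths; this is NOT Hadoop code.

def mapper(line):
    """Identity-style map: emit the line as the key with an empty value."""
    yield line, ""

def run_map_only(splits):
    """Zero reducers: each map task writes its records straight to its own
    output file, in emit order. Nothing is partitioned or sorted."""
    return [[kv for line in split for kv in mapper(line)] for split in splits]

def first_letter_partitioner(key, num_partitions):
    """Hypothetical order-preserving partitioner: 'a...' -> 0, 'b...' -> 1,
    and so on, clamped to the last partition."""
    index = ord(key[0].lower()) - ord("a")
    return min(max(index, 0), num_partitions - 1)

def run_with_reducers(splits, num_reducers):
    """With reducers: map output is partitioned, then each partition is
    sorted by key before a reducer would see it (the shuffle-sort)."""
    partitions = [[] for _ in range(num_reducers)]
    for split in splits:
        for line in split:
            for key, value in mapper(line):
                target = first_letter_partitioner(key, num_reducers)
                partitions[target].append((key, value))
    return [sorted(partition) for partition in partitions]

# Two input splits, i.e. two map tasks.
splits = [["banana", "apple"], ["cherry", "avocado"]]
map_only_output = run_map_only(splits)
reduced_output = run_with_reducers(splits, 3)
```

In this model `map_only_output` keeps one unsorted list per input split, while `reduced_output` holds one key-sorted list per partition. Getting globally ordered output therefore needs at least an identity reduce phase behind the order-preserving partitioner, so that the sort actually runs.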
-Sandy

On Fri, Feb 15, 2013 at 12:23 PM, Jay Vyas <[email protected]> wrote:

> Maybe I'm mistaken about what is meant by map-only. Does a map-only job
> still result in the standard shuffle-sort? Or does that get cut short?
>
> Hmm, I think I see what you mean. I guess a map-only sort is possible as
> long as you use a custom partitioner and you let the shuffle/sort run to
> completion.
>
> I think the shuffle/sort, if you use a partitioner that partitions the
> sorting in order (i.e. part-0 is all lines starting with "a", part-1 is all
> starting with "b", etc.), does still run in spite of the fact that you're
> not running reducers.
>
> On Fri, Feb 15, 2013 at 3:09 PM, Michael Segel <[email protected]> wrote:
>
>> Why do you need a 1TB block?
>>
>> On Feb 15, 2013, at 1:29 PM, Jay Vyas <[email protected]> wrote:
>>
>> Well, OK. I guess you could have a 1TB block do an in-place sort on
>> the file, write it to a tmp directory, and then spill the records in order
>> or something. At that point you might as well not use Hadoop.
>>
>> Michael Segel <[email protected]> | (m) 312.755.9623
>>
>> Segel and Associates
>
> --
> Jay Vyas
> http://jayunit100.blogspot.com
