On 18/Feb/2014, at 10:59 , Dan Berindei <[email protected]> wrote:
> I think Hadoop only loads a block of intermediate values in memory at once, 
> and can even sort the intermediate values (with a user-supplied comparison 
> function) so that the reduce function can work on a sorted list without 
> loading the values in memory itself.

Actually, Hadoop sorts on the map side: the last two map-side steps are sort 
and combine. Reduce nodes then fetch their partitions from the map nodes and 
merge them. Partitions are fetched incrementally, and as soon as a given key 
has been fully received across all fetched partitions, reduce() is called for 
that key.
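To make the merge step concrete, here is a minimal sketch (not Hadoop's actual code; all class and method names are illustrative): each map-output partition arrives already sorted by key, so the reducer can run a k-way merge over the partition heads and invoke the reduce function (summing, in this toy example) once per key, as soon as that key is exhausted in every partition.

```java
import java.util.*;

public class ShuffleMergeSketch {
    // One (key, value) intermediate record; names are illustrative.
    static final class Rec {
        final String key; final int value;
        Rec(String k, int v) { key = k; value = v; }
    }

    // Cursor into one sorted map-output partition.
    static final class Cursor {
        final List<Rec> part; int pos = 0;
        Cursor(List<Rec> p) { part = p; }
        Rec peek() { return part.get(pos); }
        boolean done() { return pos >= part.size(); }
    }

    // K-way merge of sorted partitions; calls the reducer (here: sum of
    // values) once per key, as soon as that key is exhausted everywhere.
    static LinkedHashMap<String, Integer> mergeAndReduce(List<List<Rec>> partitions) {
        PriorityQueue<Cursor> heap =
            new PriorityQueue<>(Comparator.comparing(c -> c.peek().key));
        for (List<Rec> p : partitions)
            if (!p.isEmpty()) heap.add(new Cursor(p));

        LinkedHashMap<String, Integer> out = new LinkedHashMap<>();
        while (!heap.isEmpty()) {
            String key = heap.peek().peek().key;
            int sum = 0;                        // reduce(): sum the values
            while (!heap.isEmpty() && heap.peek().peek().key.equals(key)) {
                Cursor c = heap.poll();
                sum += c.peek().value;
                c.pos++;
                if (!c.done()) heap.add(c);     // re-insert with its next record
            }
            out.put(key, sum);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Rec> p1 = List.of(new Rec("a", 1), new Rec("b", 2));
        List<Rec> p2 = List.of(new Rec("a", 3), new Rec("c", 4));
        // Keys come out in sorted order: {a=4, b=2, c=4}
        System.out.println(mergeAndReduce(List.of(p1, p2)));
    }
}
```

Because every partition is sorted, no key's values need to be held in memory past the moment its run of records ends in the merge.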

Cheers, MP
--
Marcelo Pasin
Université de Neuchâtel · Institut d'informatique
rue Emile-Argand 11 · Case postale 158 · 2000 Neuchâtel · Switzerland



_______________________________________________
infinispan-dev mailing list
[email protected]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
