On 2/18/2014, 4:59 AM, Dan Berindei wrote:
>
> The limitation we have now is that in the reduce phase, the entire
> list of values for one intermediate key must be in memory at once. I
> think Hadoop only loads a block of intermediate values in memory at
> once, and can even sort the intermediate values (with a user-supplied
> comparison function) so that the reduce function can work on a sorted
> list without loading the values in memory itself.
>

Dan and others,
This is where Sanne's idea comes into play. Why collect the entire list of intermediate values for each intermediate key and then invoke reduce on those values, when we can instead invoke reduce each time a new intermediate value is inserted? See https://issues.jboss.org/browse/ISPN-3999
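To make the idea concrete, here is a minimal sketch (hypothetical names, not the actual Infinispan API). It assumes the reduce function is associative and, since intermediate values can arrive in any order, commutative:

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.function.BinaryOperator;

    // Sketch only: fold each intermediate value into a per-key accumulator
    // as soon as it is emitted, so memory per key stays O(1) instead of
    // O(number of intermediate values buffered for that key).
    class IncrementalReducer<K, V> {
        private final Map<K, V> accumulators = new ConcurrentHashMap<>();
        private final BinaryOperator<V> reduceFn;

        IncrementalReducer(BinaryOperator<V> reduceFn) {
            this.reduceFn = reduceFn;
        }

        // Called once per emitted intermediate (key, value) pair, instead
        // of collecting the full list and calling reduce once at the end.
        void insert(K key, V value) {
            accumulators.merge(key, value, reduceFn);
        }

        Map<K, V> results() {
            return accumulators;
        }
    }

    // Example: word count folds counts as they arrive.
    //   IncrementalReducer<String, Integer> wc =
    //       new IncrementalReducer<>(Integer::sum);
    //   wc.insert("infinispan", 1);
    //   wc.insert("infinispan", 1);
    //   wc.results()  // -> {infinispan=2}

The trade-off is that the user-supplied reduce function must tolerate being applied pairwise in arrival order, which rules out the kind of sorted-iteration reduce Dan mentions Hadoop supports.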
Cheers,
Vladimir