If you only have 1000 rows, why use MapReduce?
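
That said, the custom-split idea in the question below can be sketched roughly as follows. This is plain Java, not the HBase or Hadoop API: `SplitSketch`, `chunk`, `roundRobinHints`, and the `node1`..`node10` host names are all hypothetical, and rows are treated as numeric positions even though real HBase row keys are byte arrays. In Hadoop, each `InputSplit` reports its preferred hosts through `getLocations()`, and the scheduler treats those as hints rather than guarantees. A task placed away from the data still has to read its rows over the network from the region servers that hold them.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitSketch {
    /** Cut rows [0, totalRows) into half-open ranges of at most chunkSize rows. */
    public static List<long[]> chunk(long totalRows, long chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<long[]> ranges = new ArrayList<>();
        for (long start = 0; start < totalRows; start += chunkSize) {
            ranges.add(new long[] {start, Math.min(start + chunkSize, totalRows)});
        }
        return ranges;
    }

    /** Round-robin hint: split i prefers nodes[i % nodes.length], ignoring data location. */
    public static String[] roundRobinHints(int splitCount, String[] nodes) {
        String[] hints = new String[splitCount];
        for (int i = 0; i < splitCount; i++) {
            hints[i] = nodes[i % nodes.length];
        }
        return hints;
    }

    public static void main(String[] args) {
        // Hypothetical ten-node cluster, as in the question.
        String[] nodes = new String[10];
        for (int i = 0; i < nodes.length; i++) {
            nodes[i] = "node" + (i + 1);
        }
        List<long[]> ranges = chunk(1000, 100);
        String[] hints = roundRobinHints(ranges.size(), nodes);
        for (int i = 0; i < ranges.size(); i++) {
            long[] r = ranges.get(i);
            System.out.println("rows [" + r[0] + ", " + r[1] + ") -> " + hints[i]);
        }
    }
}
```

With 1000 rows and chunks of 100 this gives ten ranges, one per node. Whether spreading the work this way helps depends on whether the per-row processing outweighs the cost of the remote reads.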

On 4/5/12 6:37 AM, "Arnaud Le-roy" <[email protected]> wrote:

>But do you think that I can change the default behavior?
>
>For example, I have ten nodes in my cluster and my table is stored on
>only two of them. The table has 1000 rows.
>With the default behavior, only two nodes will work on a map/reduce
>task, won't they?
>
>If I write a custom input format that splits the table into chunks of
>100 rows, can I manually distribute each chunk to a node regardless of
>where the data is?
>
>On 5 April 2012 00:36, Doug Meil <[email protected]> wrote:
>>
>> The default behavior is that the input splits are where the data is
>> stored.
>>
>> On 4/4/12 5:24 PM, "sdnetwork" <[email protected]> wrote:
>>
>>>OK, thanks,
>>>
>>>but I can't find the information that tells me how the result of the
>>>split is distributed across the different nodes of the cluster.
>>>
>>>1) Randomly?
>>>2) Where the data is stored?
