If you only have 1000 rows, why use MapReduce?
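
For reference, here is a minimal sketch of the arithmetic a custom input format would need in order to cut 1000 rows into 100-row pieces. RowChunks, Chunk and split are hypothetical names used only for illustration; they are not Hadoop or HBase APIs. It also assumes rows can be addressed by a plain index, whereas real HBase splits are defined by row-key ranges. Note that the locations a Hadoop split reports are only a scheduling hint, so this does not by itself pin a chunk to a node.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Hadoop or HBase.
final class RowChunks {

    /** Half-open range of row indexes: [start, end). */
    static final class Chunk {
        final int start;
        final int end;

        Chunk(int start, int end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + start + ", " + end + ")";
        }
    }

    /** Cuts totalRows rows into consecutive chunks of at most chunkSize rows. */
    static List<Chunk> split(int totalRows, int chunkSize) {
        if (totalRows < 0 || chunkSize <= 0) {
            throw new IllegalArgumentException("totalRows must be >= 0 and chunkSize > 0");
        }
        List<Chunk> chunks = new ArrayList<>();
        int start = 0;
        while (start < totalRows) {
            // The last chunk may be shorter than chunkSize.
            int end = start + Math.min(chunkSize, totalRows - start);
            chunks.add(new Chunk(start, end));
            start = end;
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Print the chunks for the 1000-row table discussed in this thread.
        for (Chunk c : split(1000, 100)) {
            System.out.println(c);
        }
    }
}
```

Each chunk would still have to be turned into one map task's input by the input format, and at this size the per-task startup cost is likely to outweigh the work done per chunk.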




On 4/5/12 6:37 AM, "Arnaud Le-roy" <[email protected]> wrote:

>But do you think I can change the default behavior?
>
>For example, I have ten nodes in my cluster, and my table, which has
>1000 rows, is stored on only two of them.
>With the default behavior, only two nodes will work on a map/reduce
>task, won't they?
>
>If I write a custom input format that splits the table into 100-row
>chunks, can I manually assign each chunk to a node, regardless of where
>the data is?
>
>On 5 April 2012 00:36, Doug Meil <[email protected]> wrote:
>>
>> The default behavior is that the input splits are where the data is
>>stored.
>>
>>
>>
>>
>> On 4/4/12 5:24 PM, "sdnetwork" <[email protected]> wrote:
>>
>>>OK, thanks,
>>>
>>>but I can't find any information on how the resulting splits are
>>>distributed across the nodes of the cluster:
>>>
>>>1) randomly?
>>>2) where the data is stored?
>>>
>>>
>>>
>>>
>>>
>>
>>
>

