Matt,

You're right that slowstart does not (and would not) affect much. I
was merely explaining why he observed reducers getting scheduled
early, not recommending it as a tweak for performance.
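
For reference, a minimal sketch of how the slowstart threshold
(mapred.reduce.slowstart.completed.maps) gates reducer scheduling. This
is not the JobTracker's actual code: the class and method names are
hypothetical, and the ceiling-based rounding is my assumption about how
the fraction is turned into a map count. Only the 0.05 default comes
from this thread.

```java
// Hypothetical illustration only; names are not part of the Hadoop API.
class SlowstartSketch {

    // Default fraction of maps that must finish before reducers are scheduled (5%).
    static final double DEFAULT_SLOWSTART = 0.05;

    // Number of completed maps required before reducers may be scheduled.
    // Assumption: the fraction is rounded up to a whole number of maps.
    static int mapsNeededBeforeReduces(int totalMaps, double slowstartFraction) {
        return (int) Math.ceil(slowstartFraction * totalMaps);
    }

    // True once enough maps have finished for reducers to be scheduled.
    // Scheduled reducers only fetch map outputs; reduce() itself still
    // waits until every map is done.
    static boolean canScheduleReduces(int finishedMaps, int totalMaps,
                                      double slowstartFraction) {
        return finishedMaps >= mapsNeededBeforeReduces(totalMaps, slowstartFraction);
    }

    public static void main(String[] args) {
        int totalMaps = 100;
        System.out.println("Maps needed at default slowstart: "
                + mapsNeededBeforeReduces(totalMaps, DEFAULT_SLOWSTART));
        System.out.println("Can schedule reducers after 4 maps? "
                + canScheduleReduces(4, totalMaps, DEFAULT_SLOWSTART));
    }
}
```

Raising the fraction towards 1.0 delays reducer scheduling until nearly
all maps are done; lowering it starts the shuffle earlier, which is the
trade-off discussed below.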

On Tue, Jun 21, 2011 at 10:46 PM, GOEKE, MATTHEW (AG/1000)
<[email protected]> wrote:
> Harsh,
>
> Is it possible for mapred.reduce.slowstart.completed.maps to even play a
> significant role in this? The only benefit he would find in tweaking that for
> his problem would be to spread network traffic from the shuffle over a longer
> period of time, at the cost of the reducers using resources earlier.
> Either way he would see this effect across both sets of runs if he is using
> the default parameters. I guess it would all depend on what kind of network
> layout the cluster is on.
>
> Matt
>
> -----Original Message-----
> From: Harsh J [mailto:[email protected]]
> Sent: Tuesday, June 21, 2011 12:09 PM
> To: [email protected]
> Subject: Re: Poor scalability with map reduce application
>
> Alberto,
>
> On Tue, Jun 21, 2011 at 10:27 PM, Alberto Andreotti
> <[email protected]> wrote:
>> I don't know if speculative maps are on, I'll check it. One thing I
>> observed is that reduces begin before all maps have finished. Let me check
>> also if the difference is on the map side or in the reduce. I believe it's
>> balanced, both are slower when adding more nodes, but I'll confirm that.
>
> Maps and reduces are speculative by default, so it must've been ON.
> Could you also post general input vs. output record counts and similar
> statistics from your job runs, to correlate?
>
> The reducers get scheduled early but do not exactly "reduce()" until
> all maps are done. They just keep fetching outputs. Their scheduling
> can be controlled with some configurations (say, to start only after
> X% of maps are done -- by default it starts up when 5% of maps are
> done).
>
> --
> Harsh J



-- 
Harsh J
