Thank you, guys, I really appreciate your answers. I don't have access to the
cluster right now; I'll check the info you are asking for and come back in a
couple of hours.
BTW, I tried the app on two clusters with similar results. I'm using Hadoop
0.21.0.

Thanks again,
Alberto.
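
For reference, here is a minimal sketch of the slow-start behaviour discussed
in the quoted messages below. The property name and the 5% default come from
the thread itself. The helper function is hypothetical, and the assumption that
the threshold is the ceiling of (fraction x total maps) is an illustration,
not something confirmed against the Hadoop scheduler source.

```python
import math

# Property name as quoted in the thread; 0.05 is the "5%" default mentioned there.
SLOWSTART_PROPERTY = "mapred.reduce.slowstart.completed.maps"
DEFAULT_SLOWSTART = 0.05


def maps_before_reducers_start(total_maps, slowstart=DEFAULT_SLOWSTART):
    """Hypothetical helper: number of maps that must complete before
    reducers are scheduled.

    Assumption (not verified against the scheduler): the threshold is
    ceil(slowstart * total_maps).
    """
    if total_maps < 0:
        raise ValueError("total_maps must be non-negative")
    if not 0.0 <= slowstart <= 1.0:
        raise ValueError("slowstart must be a fraction between 0 and 1")
    return math.ceil(slowstart * total_maps)


if __name__ == "__main__":
    # Compare the default with later starts for a job of 100 maps.
    for fraction in (DEFAULT_SLOWSTART, 0.5, 1.0):
        needed = maps_before_reducers_start(100, fraction)
        print(f"{SLOWSTART_PROPERTY}={fraction}: reducers scheduled after {needed} maps")
```

A value close to 1.0 keeps reducers from occupying slots while maps are still
running, at the price of concentrating the shuffle traffic at the end of the
map phase. Either way, reduce() itself only runs once all maps are done.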

On 21 June 2011 14:16, GOEKE, MATTHEW (AG/1000)
<[email protected]> wrote:

> Harsh,
>
> Is it possible for mapred.reduce.slowstart.completed.maps to even play a
> significant role in this? The only benefit he would find in tweaking that
> for his problem would be to spread network traffic from the shuffle over a
> longer period of time at the cost of having the reducers use resources
> earlier. Either way he would see this effect across both sets of runs if he
> is using the default parameters. I guess it would all depend on what kind of
> network layout the cluster is on.
>
> Matt
>
> -----Original Message-----
> From: Harsh J [mailto:[email protected]]
> Sent: Tuesday, June 21, 2011 12:09 PM
> To: [email protected]
> Subject: Re: Poor scalability with map reduce application
>
> Alberto,
>
> On Tue, Jun 21, 2011 at 10:27 PM, Alberto Andreotti
> <[email protected]> wrote:
> > I don't know if speculative maps are on, I'll check it. One thing I
> > observed is that reduces begin before all maps have finished. Let me check
> > also if the difference is on the map side or in the reduce. I believe it's
> > balanced, both are slower when adding more nodes, but I'll confirm that.
>
> Maps and reduces are speculative by default, so they must have been on.
> Could you also post general input vs. output record counts and similar
> statistics from your job runs, so we can correlate them?
>
> The reducers get scheduled early but do not exactly "reduce()" until
> all maps are done. They just keep fetching outputs. Their scheduling
> can be controlled with some configurations (say, to start only after
> X% of maps are done -- by default it starts up when 5% of maps are
> done).
>
> --
> Harsh J


-- 
José Pablo Alberto Andreotti.
Tel: 54 351 4730292
Móvil: 54351156526363.
MSN: [email protected]
Skype: andreottialberto
