Re: Poor scalability with map reduce application

Alberto Andreotti Thu, 23 Jun 2011 06:21:12 -0700

Hi again!

Thanks for the very valuable answers! I think I may be running too few reduces.
Could that be the case? Take a look at this part of the log I sent you:


        Job Counters
                Data-local map tasks=18
                Total time spent by all maps waiting after reserving slots (ms)=0
                Total time spent by all reduces waiting after reserving slots (ms)=0
                Rack-local map tasks=3
                SLOTS_MILLIS_MAPS=1523846
                SLOTS_MILLIS_REDUCES=574976
                Launched map tasks=21
                Launched reduce tasks=2


Is it possible that the job ran with only TWO reduce tasks?
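
As a rough sanity check on the counters above, the average slot time per task can be worked out by dividing each SLOTS_MILLIS_* value by the number of launched tasks. The sketch below only does that arithmetic. The class and method names are made up for illustration and are not part of Hadoop, and the averages are approximate because launched tasks may include retried or speculative attempts.

```java
// Illustrative arithmetic only; CounterAverages and avgSlotMillis are
// hypothetical names, not Hadoop APIs.
class CounterAverages {

    // Average time (ms) that one task of a given kind held a slot.
    static double avgSlotMillis(long slotMillis, int launchedTasks) {
        if (launchedTasks <= 0) {
            throw new IllegalArgumentException("launchedTasks must be positive");
        }
        return (double) slotMillis / launchedTasks;
    }

    public static void main(String[] args) {
        // Values copied from the Job Counters in the log above.
        double perMap = avgSlotMillis(1_523_846L, 21);   // SLOTS_MILLIS_MAPS / Launched map tasks
        double perReduce = avgSlotMillis(574_976L, 2);   // SLOTS_MILLIS_REDUCES / Launched reduce tasks
        System.out.println("avg ms per map task:    " + Math.round(perMap));
        System.out.println("avg ms per reduce task: " + Math.round(perReduce));
    }
}
```

With these numbers each map held a slot for roughly 73 seconds on average, while each of the two reduces held one for roughly 287 seconds. If the job really has only two reduce tasks, extra nodes can add map capacity but cannot spread the reduce work any further.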

Alberto.
On 23 June 2011 01:55, Harsh J <[email protected]> wrote:

> Alberto,
>
> I can assure you that fiddling with default replication factors can't
> be the solution here. Most of us running a cluster of 3+ nodes still use a
> replication factor of 3, and it hardly introduces a performance lag. As long
> as your Hadoop cluster network is not shared with other network
> applications, you shouldn't be seeing any network slowdowns.
>
> Anyhow, dfs.replication.max is not the property you were looking to
> change; you wanted dfs.replication instead (it affects the replication
> value of all new files). AFAIK, there is no replication factor hardcoded
> anywhere in the code; it's all configurable, so it's just a matter of
> setting the right configuration :)
>
> Regarding the "10" thing: The MR components try to load their jars and
> other submitted code/files with a 10 replication factor by default, so
> that it propagates to all racks/etc and leads to a fast startup of
> tasks. I do not think that's a problem either in your case (if it gets
> 4, it will use 4, if it gets 7, it will use 7 -- but won't take too
> long).
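
The two behaviours described in this thread (a request above the configured maximum is rejected, and a request for 10 replicas settles for however many nodes are available) can be modelled in a few lines. This is a simplified sketch, not Hadoop code: the class, both method names, and the assumption that the achievable replica count is simply the smaller of the request and the number of live DataNodes are all illustrative.

```java
import java.io.IOException;

// A simplified model, not Hadoop code: ReplicationSketch and its methods are
// hypothetical names used only to illustrate the behaviour discussed here.
class ReplicationSketch {

    // Mimics a limit check: reject any request above the configured maximum.
    static void verifyReplication(String path, int requested, int max)
            throws IOException {
        if (requested > max) {
            throw new IOException("file " + path + ".\nRequested replication "
                    + requested + " exceeds maximum " + max);
        }
    }

    // Assumption: a file cannot get more replicas than there are live DataNodes.
    static int achievableReplicas(int requested, int liveDataNodes) {
        return Math.min(requested, liveDataNodes);
    }

    public static void main(String[] args) {
        System.out.println(achievableReplicas(10, 4)); // 4 nodes: settles for 4
        System.out.println(achievableReplicas(10, 7)); // 7 nodes: settles for 7
        try {
            // A maximum of 3 with a request of 10 fails, as in the reported exception.
            verifyReplication("job.jar", 10, 3);
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model, lowering the maximum below the requested job-file replication is what makes job submission fail, whereas a cluster with fewer than 10 nodes does not cause an error.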
>
> On Thu, Jun 23, 2011 at 6:14 AM, Alberto Andreotti
> <[email protected]> wrote:
> > Hi guys,
> >
> > I suspected that the problem was due to overhead introduced by the
> > filesystem, so I tried to set the "dfs.replication.max" property to
> > different values.
> > First, I tried with 2, and I got a message saying that I was requesting a
> > value of 3, which was bigger than the limit, so I couldn't do the run (it
> > seems this 3 is hardcoded somewhere; I read that in Jira).
> > Then I tried with 3. I could generate the input files for the map reduce
> > app, but when trying to run it I got this exception:
> >
> > Exception in thread "main" java.io.IOException: file /tmp/hadoop-aandre/mapred/staging/aandre/.staging/job_201106230004_0003/job.jar.
> > Requested replication 10 exceeds maximum 3
> >         at org.apache.hadoop.hdfs.server.namenode.BlockManager.verifyReplication(BlockManager.java:468)
> >
> >
> > which makes it seem like the framework was trying to replicate the output on
> > as many nodes as possible. Could this be the source of the degradation?
> > I have also attached the log for the run with 7 nodes.
> >
> > Alberto.
> >
> >
> > On 21 June 2011 14:40, Harsh J <[email protected]> wrote:
> >>
> >> Matt,
> >>
> >> You're right that it (slowstart) does not / would not affect much. I
> >> was merely explaining the reason behind his observation of reducers
> >> getting scheduled early, not really recommending a tweak for
> >> performance changes there.
> >>
> >> On Tue, Jun 21, 2011 at 10:46 PM, GOEKE, MATTHEW (AG/1000)
> >> <[email protected]> wrote:
> >> > Harsh,
> >> >
> >> > Is it possible for mapred.reduce.slowstart.completed.maps to even play a
> >> > significant role in this? The only benefit he would find in tweaking that
> >> > for his problem would be to spread network traffic from the shuffle over a
> >> > longer period of time at a cost of having the reducer using resources
> >> > earlier. Either way he would see this effect across both sets of runs if he
> >> > is using the default parameters. I guess it would all depend on what kind of
> >> > network layout the cluster is on.
> >> >
> >> > Matt
> >> >
> >> > -----Original Message-----
> >> > From: Harsh J [mailto:[email protected]]
> >> > Sent: Tuesday, June 21, 2011 12:09 PM
> >> > To: [email protected]
> >> > Subject: Re: Poor scalability with map reduce application
> >> >
> >> > Alberto,
> >> >
> >> > On Tue, Jun 21, 2011 at 10:27 PM, Alberto Andreotti
> >> > <[email protected]> wrote:
> >> >> I don't know if speculative maps are on; I'll check it. One thing I
> >> >> observed is that reduces begin before all maps have finished. Let me check
> >> >> also if the difference is on the map side or in the reduce. I believe it's
> >> >> balanced, both are slower when adding more nodes, but I'll confirm that.
> >> >
> >> > Maps and reduces are speculative by default, so they must have been ON.
> >> > Could you also post general input vs. output record counts and similar
> >> > statistics for each of your job runs, so we can correlate them?
> >> >
> >> > The reducers get scheduled early but do not exactly "reduce()" until
> >> > all maps are done. They just keep fetching outputs. Their scheduling
> >> > can be controlled with some configurations (say, to start only after
> >> > X% of maps are done -- by default it starts up when 5% of maps are
> >> > done).
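
To make the "X% of maps" threshold concrete, the sketch below computes how many maps would have to finish before reducers become eligible for scheduling. The helper is hypothetical and not a Hadoop API, and rounding the threshold up is an assumption of this sketch. The fraction corresponds to the mapred.reduce.slowstart.completed.maps setting mentioned in this thread, and 21 is the launched map count reported in the job counters at the top.

```java
// Hypothetical helper, not a Hadoop API: models a "start reducers after X% of
// maps are done" threshold. Rounding up is an assumption made for this sketch.
class SlowstartSketch {

    // Number of completed maps required before reducers may be scheduled.
    static int mapsNeededBeforeReducers(double completedMapsFraction, int totalMaps) {
        if (completedMapsFraction < 0.0 || completedMapsFraction > 1.0) {
            throw new IllegalArgumentException("fraction must be in [0, 1]");
        }
        return (int) Math.ceil(completedMapsFraction * totalMaps);
    }

    public static void main(String[] args) {
        // Default of 5% with 21 maps: reducers can start fetching after 2 maps.
        System.out.println(mapsNeededBeforeReducers(0.05, 21)); // prints 2
        // A fraction of 1.0 would hold reducers back until all 21 maps finish.
        System.out.println(mapsNeededBeforeReducers(1.0, 21));  // prints 21
    }
}
```

Under these assumptions the default lets reducers occupy slots almost immediately, which matches the observation that reduces begin before all maps have finished, even though the reduce() calls themselves wait for the last map.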
> >> >
> >> > --
> >> > Harsh J
> >>
> >>
> >>
> >> --
> >> Harsh J
> >
> >
> >
> > --
> > José Pablo Alberto Andreotti.
> > Tel: 54 351 4730292
> > Móvil: 54351156526363.
> > MSN: [email protected]
> > Skype: andreottialberto
> >
>
>
>
> --
> Harsh J
>



-- 
José Pablo Alberto Andreotti.
Tel: 54 351 4730292
Móvil: 54351156526363.
MSN: [email protected]
Skype: andreottialberto
