Can anyone please eyeball the config parameters defined below and share their thoughts?
Thanks,
Praveenesh

On Mon, Jan 30, 2012 at 6:20 PM, praveenesh kumar <praveen...@gmail.com> wrote:
> Hey guys,
>
> Just wanted to ask, are there any best practices to follow for improving
> Hadoop shuffle performance?
>
> I am running Hadoop 0.20.205 on an 8-node cluster. Each node has 24
> cores/CPUs and 48 GB RAM.
>
> I have set the following parameters:
>
> fs.inmemory.size.mb=2000
> io.sort.mb=2000
> io.sort.factor=200
> io.file.buffer.size=262544
>
> mapred.map.tasks=200
> mapred.reduce.tasks=40
> mapred.reduce.parallel.copies=80
> mapred.map.child.java.opts=-Xmx1024m
> mapred.reduce.child.java.opts=-Xmx1024m
>
> mapred.job.tracker.handler.count=60
> tasktracker.http.threads=50
> mapred.job.reuse.jvm.num.tasks=-1
> mapred.compress.map.output=true
> mapred.reduce.slowstart.completed.maps=0.5
>
> mapred.tasktracker.map.tasks.maximum=24
> mapred.tasktracker.reduce.tasks.maximum=12
>
> Can anyone please validate the above tuning parameters and suggest any
> further improvements?
> My mappers are running fine. The shuffle and reduce phases are slower than
> expected for normal jobs. I want to know what I am doing wrong or missing.
>
> Thanks,
> Praveenesh
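As a quick way to eyeball these numbers yourself, here is a minimal sanity check of the memory budget the settings above imply. It only uses values from the message (slot counts, 1024 MB child heap, io.sort.mb=2000, 48 GB nodes) and assumes the 0.20-era behaviour where the io.sort.mb buffer is allocated inside each map task's JVM heap:

```python
# Sanity-check the task-memory budget implied by the configuration above.
# All figures come from the quoted message; nothing here is measured.

node_ram_mb = 48 * 1024      # 48 GB RAM per node
map_slots = 24               # mapred.tasktracker.map.tasks.maximum
reduce_slots = 12            # mapred.tasktracker.reduce.tasks.maximum
child_heap_mb = 1024         # -Xmx1024m per map/reduce child JVM
io_sort_mb = 2000            # io.sort.mb

# Worst-case heap if every slot on a node is occupied at once.
total_task_heap_mb = (map_slots + reduce_slots) * child_heap_mb
print(total_task_heap_mb)    # -> 36864 (36 GB of 48 GB, before daemons)

# In Hadoop 0.20 the map-side sort buffer lives inside the child heap,
# so io.sort.mb must fit within the child JVM's -Xmx.
print(io_sort_mb <= child_heap_mb)   # -> False: 2000 MB buffer, 1024 MB heap
```

If all 36 slots fill, task heaps alone consume 36 GB of the 48 GB node, leaving headroom for the DataNode/TaskTracker daemons and OS cache; the io.sort.mb vs. child-heap mismatch, however, is worth a look when map-side spills or OOMs appear.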