MR job launching is slower

praveenesh kumar Tue, 20 Mar 2012 03:40:53 -0700

I have a 10-node cluster (around 24 CPUs, 48 GB RAM, 1 TB HDD, 10 GB
Ethernet connection).
After triggering any MR job, it takes 3-5 seconds to launch (I mean
the time until I can see any MR job completion % on the screen).
I know that internally it is launching the job, initializing mappers,
loading data, etc.
What I want to know: is this default/expected Hadoop behavior, or
are there ways to decrease this startup time?


Also, I feel my Hadoop jobs should run faster, but I have not been able
to make them as fast as I think they should be.
I have done some tuning. The following are the parameters I have been
experimenting with lately, but I feel there is still something I am
missing that I could use:

dfs.block.size:

mapred.compress.map.output

mapred.map/reduce.tasks.speculative.execution

mapred.tasktracker.map/reduce.tasks.maximum:

mapred.child.java.opts

io.sort.mb:

io.sort.factor:

mapred.reduce.parallel.copies:

mapred.job.reuse.jvm.num.tasks:
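
For illustration only, the job-level parameters above could be passed per
run as -D generic options. The Python sketch below just builds such a
command line. Every value, the jar name (my-job.jar) and the class name
(com.example.MyJob) are hypothetical placeholders, not recommendations,
and -D options are only picked up if the job driver goes through
ToolRunner/GenericOptionsParser. The tasktracker slot maximums are left
out because the TaskTracker daemon reads them at startup rather than per
job, and dfs.block.size is left out because it concerns how files are
written to HDFS.

```python
import shlex

# Illustrative values only (assumptions, not tuned recommendations).
JOB_SETTINGS = {
    "mapred.compress.map.output": "true",
    "mapred.map.tasks.speculative.execution": "false",
    "mapred.reduce.tasks.speculative.execution": "false",
    "mapred.child.java.opts": "-Xmx1024m",
    "io.sort.mb": "256",          # must fit inside the child JVM heap
    "io.sort.factor": "50",
    "mapred.reduce.parallel.copies": "10",
    "mapred.job.reuse.jvm.num.tasks": "-1",  # -1 = reuse a JVM without limit
}


def build_hadoop_command(jar, main_class, settings, job_args):
    """Return the argv list for 'hadoop jar' with one -D pair per setting."""
    cmd = ["hadoop", "jar", jar, main_class]
    for key, value in settings.items():
        cmd += ["-D", f"{key}={value}"]
    # Job-specific arguments (e.g. input and output paths) go last.
    cmd += list(job_args)
    return cmd


if __name__ == "__main__":
    # Hypothetical jar, class, and paths, used only to show the shape.
    cmd = build_hadoop_command(
        "my-job.jar", "com.example.MyJob", JOB_SETTINGS, ["/input", "/output"]
    )
    print(" ".join(shlex.quote(part) for part in cmd))
```

Keeping the settings in one dictionary makes it easy to change a single
value between runs and compare launch and completion times.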


Thanks,
Praveenesh
