About conception and usage of Uber

sam liu Thu, 07 Nov 2013 18:23:57 -0800

Hi Experts,

In previous discussions, I found the following description:
"mapreduce.job.ubertask.enable | (false) | 'Whether to enable the
small-jobs "ubertask" optimization, which runs "sufficiently small" jobs
sequentially within a single JVM. "Small" is defined by the following
maxmaps, maxreduces, and maxbytes settings. Users may override this value.'"
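
For illustration, the "sufficiently small" test in the description above
can be modeled as a comparison against the three limits. This is only a
minimal sketch in plain Java: the class UberEligibilitySketch and its
method names are hypothetical and are not Hadoop's API, the threshold
values are illustrative assumptions, and the real check may apply further
conditions that are omitted here.

```java
// Simplified model of the uber "small job" test (hypothetical class,
// not part of Hadoop). The fields mirror the settings named in the
// description: the enable flag plus the maxmaps, maxreduces and
// maxbytes limits.
class UberEligibilitySketch {
    final boolean enabled;  // models mapreduce.job.ubertask.enable
    final int maxMaps;      // models the maxmaps limit
    final int maxReduces;   // models the maxreduces limit
    final long maxBytes;    // models the maxbytes limit

    UberEligibilitySketch(boolean enabled, int maxMaps, int maxReduces, long maxBytes) {
        this.enabled = enabled;
        this.maxMaps = maxMaps;
        this.maxReduces = maxReduces;
        this.maxBytes = maxBytes;
    }

    // A job counts as "small" only if the feature is enabled and every
    // limit is respected.
    boolean isUber(int numMaps, int numReduces, long inputBytes) {
        return enabled
                && numMaps <= maxMaps
                && numReduces <= maxReduces
                && inputBytes <= maxBytes;
    }

    public static void main(String[] args) {
        // Illustrative thresholds (assumed values): 9 maps, 1 reduce, 128 MB.
        UberEligibilitySketch check =
                new UberEligibilitySketch(true, 9, 1, 128L * 1024 * 1024);
        System.out.println(check.isUber(4, 1, 10L * 1024 * 1024));  // within all limits
        System.out.println(check.isUber(20, 1, 10L * 1024 * 1024)); // too many maps
    }
}
```

With these assumed limits, the first job in main is within every limit and
the second exceeds the map limit, so only the first would be treated as
small.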


Based on the above description, I set "mapreduce.job.ubertask.enable" to
true and configured the other uber-related parameters. After some
experiments, my understanding is as follows.
1) If I submit a bunch of small MR jobs to a Hadoop cluster (each MR job
runs in uber mode):
   - Each MR job corresponds to an application, like
application_1383815949546_0006
   - Each application has its own container, like
container_1383815949546_0010_01_000001
   - When a container is launched by the NodeManager, it launches a JVM
too. When the container stops, the JVM stops as well. A container has only
one JVM in its whole life cycle.
   - Each application, such as application_1383815949546_0006, includes
some map tasks and reduce tasks
   - In uber mode, all the map tasks and reduce tasks of
application_1383815949546_0006 are executed in one and the same
container, container_1383815949546_0010_01_000001. It also means that all
map tasks and reduce tasks are executed in a single JVM.
   - A container cannot be shared among different applications (jobs)

2) If I submit a bunch of big MR jobs to a Hadoop cluster (each MR job
will NOT run in uber mode):
   - Each map task and reduce task of application_1383815949546_0006 is
executed in its own container. This means that
application_1383815949546_0006 will have many containers.
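
The difference between the two cases can be put as a rough container
count. The sketch below is plain Java with a hypothetical class name
(ContainerCountSketch); it assumes one container for the job's
ApplicationMaster, one container per task in non-uber mode, and no retried
or speculative task attempts.

```java
// Rough container count per job under the understanding above
// (hypothetical helper, not part of Hadoop).
class ContainerCountSketch {
    // Assumptions: one container is used for the ApplicationMaster;
    // in uber mode every task runs inside that same container;
    // otherwise each task gets a container of its own and no task
    // attempt is retried or run speculatively.
    static int containersUsed(boolean uberMode, int numMaps, int numReduces) {
        if (uberMode) {
            return 1;
        }
        return 1 + numMaps + numReduces;
    }

    public static void main(String[] args) {
        System.out.println(containersUsed(true, 4, 1));   // uber mode: 1
        System.out.println(containersUsed(false, 40, 2)); // non-uber: 1 + 40 + 2 = 43
    }
}
```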

I am not sure whether the above understanding is correct, so any
comments or corrections will be appreciated!
