Any comments or corrections on my understanding of uber mode? Thanks in advance!

2013/11/8 sam liu <[email protected]>

> Hi Experts,
>
> In previous discussions, I found the following description:
> "mapreduce.job.ubertask.enable | (false) | 'Whether to enable the
> small-jobs "ubertask" optimization, which runs "sufficiently small" jobs
> sequentially within a single JVM. "Small" is defined by the following
> maxmaps, maxreduces, and maxbytes settings. Users may override this value.'"
>
> Based on the above description, I set "mapreduce.job.ubertask.enable" to
> true, configured the other uber-related parameters, and ran some
> experiments. My understanding so far is as follows.
>
> 1) If I submit a bunch of small MR jobs to a Hadoop cluster (each MR job
> runs in uber mode):
>    - Each MR job corresponds to an application, like
>      application_1383815949546_0006
>    - Each application has its own container, like
>      container_1383815949546_0010_01_000001
>    - When a container is launched by the nodemanager, it launches a JVM
>      too. When the container stops, the JVM stops as well. A container has
>      only one JVM in its whole life cycle.
>    - Each application, such as application_1383815949546_0006, includes
>      some map tasks and reduce tasks
>    - In uber mode, all the map tasks and reduce tasks of
>      application_1383815949546_0006 are executed in one and the same
>      container, container_1383815949546_0010_01_000001. This also means
>      that all map tasks and reduce tasks are executed in a single JVM.
>    - A container cannot be shared among different applications (jobs)
>
> 2) If I submit a bunch of big MR jobs to a Hadoop cluster (each MR job
> runs NOT in uber mode):
>    - Each map task and reduce task of application_1383815949546_0006 is
>      executed in its own container. This means that
>      application_1383815949546_0006 will have lots of containers.
>
> I am not sure whether the above understanding is correct, so any
> comments/corrections will be appreciated!
>
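
For readers following the thread, the "sufficiently small" test and the container counts described above can be sketched roughly as below. This is a simplified model in plain Python, not Hadoop code: the names `UberSettings`, `is_small_enough` and `containers_needed` are made up for illustration, the default limits (9 maps, 1 reduce, one 128 MB block) are assumptions taken from memory of mapred-default.xml, and the real check also looks at memory and CPU requirements, which are left out here. It also assumes that in the non-uber case the MR ApplicationMaster occupies one container of its own in addition to one container per task.

```python
from dataclasses import dataclass


@dataclass
class UberSettings:
    """Hypothetical holder for the uber-related job settings."""
    enabled: bool = False                # mapreduce.job.ubertask.enable
    max_maps: int = 9                    # mapreduce.job.ubertask.maxmaps (assumed default)
    max_reduces: int = 1                 # mapreduce.job.ubertask.maxreduces (assumed default)
    max_bytes: int = 128 * 1024 * 1024   # mapreduce.job.ubertask.maxbytes (assumed: one block)


def is_small_enough(settings, num_maps, num_reduces, input_bytes):
    """Return True if the job would qualify for uber mode in this simplified model."""
    return (
        settings.enabled
        and num_maps <= settings.max_maps
        and num_reduces <= settings.max_reduces
        and input_bytes <= settings.max_bytes
    )


def containers_needed(settings, num_maps, num_reduces, input_bytes):
    """Rough container count for one job, ignoring retries and speculative attempts."""
    if is_small_enough(settings, num_maps, num_reduces, input_bytes):
        # Uber mode: every task runs sequentially inside the single
        # ApplicationMaster container (one JVM).
        return 1
    # Non-uber: one container for the ApplicationMaster plus one per task.
    return 1 + num_maps + num_reduces


if __name__ == "__main__":
    uber_on = UberSettings(enabled=True)
    print(containers_needed(uber_on, 3, 1, 1024))    # small job, uber enabled
    print(containers_needed(uber_on, 20, 2, 1024))   # too many maps for uber
```

Under these assumptions, a small job with uber enabled stays in one container, while the same job with uber disabled, or a job above any of the limits, gets the ApplicationMaster container plus one container per task.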
