Hi Adam,

yarn.nodemanager.resource.memory-mb = 2370 MiB
yarn.nodemanager.resource.cpu-vcores = 2
yarn.resourcemanager.scheduler.class = org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler
"Use CGroups for Resource Management" (yarn.nodemanager.linux-container-executor.resources-handler.class) is NOT checked.
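These numbers are consistent with the ResourceManager UI figures quoted below: 4 nodes x 2370 MiB = 9480 MiB, which is about 9.26 GB in total. Assuming the default 1024 MB per map container (mapreduce.map.memory.mb), memory alone already caps the cluster at 8 containers:

    floor(2370 MiB / 1024 MiB) = 2 containers per NodeManager
    2 containers x 4 nodes     = 8 containers = 1 AM + 7 map tasks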
Considering that I have 8 cores in my cluster, and not 16 as I thought at the beginning, starting more than 7 map tasks (plus the AM) should not give me any performance gain, as all the cores are already in use. Am I right? True, for hundreds or thousands of nodes a single coordination node might be a bottleneck. My deployments are not expected to exceed 32 or 64 nodes.

Pozdrawiam / Regards / Med venlig hilsen
Tomasz Guziałek

2014-07-09 16:01 GMT+02:00 Adam Kawa <[email protected]>:

> Hi Tomek,
>
> You have 9.26 GB on 4 nodes, which is 2.315 GB on average. What is your value of yarn.nodemanager.resource.memory-mb?
>
> You consume 1 GB of RAM per container (8 containers running = 8 GB of memory used). My idea is that, after running 8 containers (1 AM + 7 map tasks), you have only 315 MB of available memory left on each NodeManager. Therefore, when you request 1 GB to get a container for the 8th map task, there is no NodeManager that can give you a whole 1 GB (despite there being more than 1 GB of aggregated memory on the cluster).
>
> To verify this, please check the value of yarn.nodemanager.resource.memory-mb.
>
> Thanks,
> Adam
>
> PS1.
> Just out of curiosity, what are your values of:
> *yarn.nodemanager.resource.cpu-vcores* (isn't it 2?)
> *yarn.resourcemanager.scheduler.class* (I assume the Fair Scheduler, but just to confirm. Do you have any non-default settings in your scheduler's configuration that limit the resources per user?)
> *yarn.nodemanager.linux-container-executor.resources-handler.class*
>
> PS2.
> "I am comparing the M/R implementation with a custom one, where one node is dedicated to coordination and I utilize 4 slaves fully for computation."
>
> Note that this might not work on a larger scale, because the one node dedicated to coordination might become the bottleneck. This is one of the reasons why both YARN and the original MapReduce at Google run their coordination processes on the slave nodes.
>
> 2014-07-09 9:47 GMT+02:00 Tomasz Guziałek <[email protected]>:
>
>> Thank you for your assistance, Adam.
>>
>> Containers running | Memory used | Memory total | Memory reserved
>> 8                  | 8 GB        | 9.26 GB      | 0 B
>>
>> It seems you are right: the ApplicationMaster is occupying one slot, as I have 8 containers running but only 7 map tasks.
>>
>> Again, I revised my information about the m1.large instance on EC2. There are only 2 cores available per node, giving 4 compute units (the ECU units introduced by Amazon). So 8 slots at a time is expected. However, scheduling the AM on a slave node ruins my experiment: I am comparing the M/R implementation with a custom one, where one node is dedicated to coordination and I utilize 4 slaves fully for computation. This one core taken for the AM extends the execution time by a factor of 2. Does anyone have an idea how to get 8 map tasks running?
>>
>> Pozdrawiam / Regards / Med venlig hilsen
>> Tomasz Guziałek
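One way to get all 8 map tasks running concurrently, sketched here under the assumption (suggested by the 8 containers / 8 GB figures above) that every container, including the AM's, is currently requested at 1024 MiB: shrink the per-container requests so that three containers fit into each NodeManager's 2370 MiB. The 768 MiB value below is illustrative, not from the thread.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;

    Configuration mapReduceConfiguration = HBaseConfiguration.create();
    // 3 x 768 MiB = 2304 MiB fits under the 2370 MiB each NodeManager offers,
    // giving up to 12 containers on 4 nodes, enough for 1 AM + 8 map tasks.
    mapReduceConfiguration.set("mapreduce.map.memory.mb", "768");
    mapReduceConfiguration.set("yarn.app.mapreduce.am.resource.mb", "768");
    // Shrink the map JVM heap so it fits inside the smaller container.
    mapReduceConfiguration.set("mapreduce.map.java.opts", "-Xmx600m");

For the 768 MiB requests to take effect, the ResourceManager side must also be allowed to hand out sub-GB containers: with the defaults of yarn.scheduler.minimum-allocation-mb = 1024 and, under the Fair Scheduler, yarn.scheduler.increment-allocation-mb = 1024, such requests are rounded back up to 1 GB, so both would need to be lowered (e.g. to 256) in yarn-site.xml. Whether 9 containers on 8 physical cores actually runs faster is a separate question, as the core-count discussion above suggests.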
>> 2014-07-09 0:56 GMT+02:00 Adam Kawa <[email protected]>:
>>
>>> If you run an application (e.g. a MapReduce job) on a YARN cluster, first the ApplicationMaster is started on some slave node to coordinate the execution of all tasks within the job. The ApplicationMaster and the tasks that belong to its application run in containers controlled by the NodeManagers.
>>>
>>> Maybe you simply run 8 containers on your YARN cluster: 1 container consumed by the MapReduce AppMaster and 7 containers consumed by map tasks. But this does not seem to be the root cause of your problem, because according to your settings you should be able to run at most 16 containers.
>>>
>>> Another idea might be that you are bottlenecked by the amount of memory on the cluster (each container consumes memory) and, despite having vcores available, you cannot launch new tasks. When you go to the ResourceManager Web UI, do you see that you utilize the whole cluster memory?
>>>
>>> 2014-07-08 21:06 GMT+02:00 Tomasz Guziałek <[email protected]>:
>>>
>>>> I was not precise when describing my cluster. I have 4 slave nodes and a separate master node. The master has the ResourceManager role (along with the JobHistory role) and the rest have NodeManager roles. If this really is the ApplicationMaster, is it possible to schedule it on the master node? This single waiting map task is doubling my execution time.
>>>>
>>>> Pozdrawiam / Regards / Med venlig hilsen
>>>> Tomasz Guziałek
>>>>
>>>> 2014-07-08 18:42 GMT+02:00 Adam Kawa <[email protected]>:
>>>>
>>>>> Isn't your MapReduce AppMaster occupying one slot?
>>>>>
>>>>> Sent from my iPhone
>>>>>
>>>>> > On 8 jul 2014, at 13:01, Tomasz Guziałek <[email protected]> wrote:
>>>>> >
>>>>> > Hello all,
>>>>> >
>>>>> > I am running a 4-node CDH5 cluster on Amazon EC2. The instances used are m1.large, so I have 4 cores (2 cores x 2 units) per node. My HBase table has 8 regions, so I expected at least 8 (if not 16) mapper tasks to run simultaneously. However, only 7 are running and 1 is waiting for an empty slot. Why did this surprising number come up? I have checked that the regions are equally distributed across the region servers (2 per node).
>>>>> >
>>>>> > My properties in the job:
>>>>> > Configuration mapReduceConfiguration = HBaseConfiguration.create();
>>>>> > mapReduceConfiguration.set("hbase.client.max.perregion.tasks", "4");
>>>>> > mapReduceConfiguration.set("mapreduce.tasktracker.map.tasks.maximum", "16");
>>>>> >
>>>>> > My properties in the CDH:
>>>>> > yarn.scheduler.minimum-allocation-vcores = 1
>>>>> > yarn.scheduler.maximum-allocation-vcores = 4
>>>>> >
>>>>> > Am I missing some property? Please share your experience.
>>>>> >
>>>>> > Best regards
>>>>> > Tomasz
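A final note on the job properties in the original message: mapreduce.tasktracker.map.tasks.maximum is an MRv1 (TaskTracker) setting that YARN ignores, and hbase.client.max.perregion.tasks, as far as I can tell, throttles client writes per region rather than map-task scheduling, so neither of them can raise the number of concurrent maps here. A minimal sketch of the knobs that do govern it under YARN, with illustrative values that are not from this thread:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;

    Configuration conf = HBaseConfiguration.create();
    // Under YARN a map task's footprint is its container request:
    conf.set("mapreduce.map.memory.mb", "1024"); // memory per map container
    conf.set("mapreduce.map.cpu.vcores", "1");   // vcores per map container
    // The concurrent-map ceiling is then, per node,
    //   floor(yarn.nodemanager.resource.memory-mb / mapreduce.map.memory.mb)
    // summed over all NodeManagers, minus one container for the AM.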
