I also had this issue, and it was resolved by changing settings in yarn-site.xml and capacity-scheduler.xml. The amount of memory (and the number of virtual CPU cores) allocated to your jobs is controlled by settings in yarn-site.xml, and I suspect your jobs are sitting in ACCEPTED instead of moving to RUNNING because the default value (0.1) of yarn.scheduler.capacity.maximum-am-resource-percent in capacity-scheduler.xml is too low. For example, here are the values I'm using in my staging cluster (just two m3.medium EC2 instances), where I typically request 256 MB per container.
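To make the ACCEPTED-vs-RUNNING point concrete, here is a rough back-of-the-envelope calculation using the values from my yarn-site.xml below and assuming each Samza ApplicationMaster asks for 256 MB (these are my numbers, so treat them as illustrative rather than a recipe):

    total NodeManager memory     = 2 nodes x 3072 MB (yarn.nodemanager.resource.memory-mb) = 6144 MB
    AM headroom at 0.1 (default) = 0.1 x 6144 MB = ~614 MB  -> room for about 2 ApplicationMasters of 256 MB
    AM headroom at 0.5           = 0.5 x 6144 MB = 3072 MB  -> room for about 12 ApplicationMasters of 256 MB

Once that headroom is used up, the CapacityScheduler keeps any further applications in ACCEPTED until a running ApplicationMaster finishes, which matches the behaviour you are describing.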
-----------------------------------------------------------------------------
yarn-site.xml
-----------------------------------------------------------------------------
<?xml version="1.0"?>
<configuration>

  <!-- Site specific YARN configuration properties -->

  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>128</value>
    <description>Minimum limit of memory to allocate to each container request at the Resource Manager.</description>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>512</value>
    <description>Maximum limit of memory to allocate to each container request at the Resource Manager.</description>
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-vcores</name>
    <value>1</value>
    <description>The minimum allocation for every container request at the RM, in terms of virtual CPU cores. Requests lower than this won't take effect, and the specified value will get allocated the minimum.</description>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-vcores</name>
    <value>2</value>
    <description>The maximum allocation for every container request at the RM, in terms of virtual CPU cores. Requests higher than this won't take effect, and will get capped to this value.</description>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>3072</value>
    <description>Physical memory, in MB, to be made available to running containers.</description>
  </property>
  <property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>8</value>
    <description>Number of CPU cores that can be allocated for containers.</description>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>2.1</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>******.amazonaws.com</value>
  </property>
  <property>
    <name>yarn.nodemanager.log.retain-seconds</name>
    <value>86400</value>
  </property>

</configuration>

-----------------------------------------------------------------------------
capacity-scheduler.xml
-----------------------------------------------------------------------------
<configuration>

  <property>
    <name>yarn.scheduler.capacity.maximum-applications</name>
    <value>10000</value>
    <description>Maximum number of applications that can be pending and running.</description>
  </property>

  <!-- Changed by MM from 0.1 (default) to 0.5, as our Samza jobs typically have just one AppMaster and one job container. -->
  <property>
    <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
    <value>0.5</value>
    <description>Maximum percent of resources in the cluster which can be used to run application masters, i.e. controls the number of concurrently running applications.</description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.resource-calculator</name>
    <value>org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator</value>
    <description>The ResourceCalculator implementation to be used to compare Resources in the scheduler. The default, DefaultResourceCalculator, only uses Memory, while DominantResourceCalculator uses dominant-resource to compare multi-dimensional resources such as Memory, CPU etc.</description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>default</value>
    <description>The queues at this level (root is the root queue).</description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.root.default.capacity</name>
    <value>100</value>
    <description>Default queue target capacity.</description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.root.default.user-limit-factor</name>
    <value>1</value>
    <description>Default queue user limit, a percentage from 0.0 to 1.0.</description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.root.default.maximum-capacity</name>
    <value>100</value>
    <description>The maximum capacity of the default queue.</description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.root.default.state</name>
    <value>RUNNING</value>
    <description>The state of the default queue. State can be one of RUNNING or STOPPED.</description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.root.default.acl_submit_applications</name>
    <value>*</value>
    <description>The ACL of who can submit jobs to the default queue.</description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.root.default.acl_administer_queue</name>
    <value>*</value>
    <description>The ACL of who can administer jobs on the default queue.</description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.node-locality-delay</name>
    <value>40</value>
    <description>Number of missed scheduling opportunities after which the CapacityScheduler attempts to schedule rack-local containers. Typically this should be set to the number of nodes in the cluster. By default it is set to approximately the number of nodes in one rack, which is 40.</description>
  </property>

</configuration>
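On the Samza side, these cluster-wide limits work together with the per-job memory settings that Yan points to further down in this thread. As a minimal sketch (the 256 MB values are simply what I use on this cluster, adjust to your jobs), the relevant lines in a job's .properties file would look like:

    # Samza job configuration (.properties) -- illustrative values only
    yarn.am.container.memory.mb=256
    yarn.container.memory.mb=256

Note that the ResourceManager normalises these requests against yarn.scheduler.minimum-allocation-mb (128 above) and caps them at yarn.scheduler.maximum-allocation-mb (512 above), so the figure in the job config and the allocation you actually receive can differ.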
On September 15, 2015 at 5:05:28 AM, Jordi Blasi Uribarri (jbl...@nextel.es) wrote:

I have tried changing all the jobs' configuration to this:

yarn.container.memory.mb=128
yarn.am.container.memory.mb=128

and on startup I can see:

2015-09-15 12:40:18 ClientHelper [INFO] set memory request to 128 for application_1442313590092_0002

On the web interface of Hadoop I see that every job is still getting 2 GB each. In fact, only two of the jobs are in state RUNNING, while the rest are ACCEPTED.

Any ideas?

Thanks,

Jordi

-----Original message-----
From: Yan Fang [mailto:yanfang...@gmail.com]
Sent: Friday, September 11, 2015 20:56
To: dev@samza.apache.org
Subject: Re: memory limits

Hi Jordi,

I believe you can change the memory by *yarn.container.memory.mb*, default is 1024. And *yarn.am.container.memory.mb* is for the AM memory. See
http://samza.apache.org/learn/documentation/0.9/jobs/configuration-table.html

Thanks,

Fang, Yan
yanfang...@gmail.com

On Fri, Sep 11, 2015 at 4:21 AM, Jordi Blasi Uribarri <jbl...@nextel.es> wrote:

> Hi,
>
> I am trying to implement an environment that requires multiple
> combined Samza jobs for different tasks. I see that there is a limit
> to the number of jobs that can be running at the same time, as they block
> 1 GB of RAM each.
> I understand that this is a reasonable limit in a production
> environment (as long as we are speaking of Big Data, we need big
> amounts of resources ☺ ) but my lab does not have so much RAM.
> Is there a way to reduce this limit so I can test it properly?
> I am using Samza 0.9.
>
> Thanks in advance,
>
> Jordi
> ________________________________
> Jordi Blasi Uribarri
> R&D&I Department
>
> jbl...@nextel.es
> Bilbao Office
>