Thanks Lukas and Mark - setting yarn.nodemanager.resource.cpu-vcores to 10 allowed the 5th job to run, and it's working beautifully now!
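
For anyone hitting the same wall later, here is a rough sketch of the arithmetic as I understand it; it is not anything YARN itself runs. It assumes each job uses 2 containers (an ApplicationMaster plus one SamzaContainer, which matches the 8 containers for 4 jobs in the web ui), that each container requests 1 vCore, and that each container takes about 512 MB (4 GB used by 8 containers). The function name `max_jobs` and its parameters are made up for illustration.

```python
def max_jobs(node_vcores, node_memory_mb, containers_per_job=2,
             vcores_per_container=1, memory_mb_per_container=512):
    """Estimate how many jobs fit on one node.

    All defaults are assumptions taken from the numbers in this thread,
    not values read from YARN.
    """
    # Jobs that fit if vCores are the only constraint.
    by_vcores = node_vcores // (containers_per_job * vcores_per_container)
    # Jobs that fit if memory is the only constraint.
    by_memory = node_memory_mb // (containers_per_job * memory_mb_per_container)
    # The scarcer resource decides.
    return min(by_vcores, by_memory)


print(max_jobs(node_vcores=8, node_memory_mb=8192))   # 4
print(max_jobs(node_vcores=10, node_memory_mb=8192))  # 5
```

With 8 vCores the node tops out at 4 jobs even though half the memory is free, which is consistent with the "available=<memory:4096, vCores:0>" line in the resourcemanager log. Raising the limit to 10 makes room for the 5th job.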

On Mon, Oct 6, 2014 at 1:53 PM, Mark Mindenhall <[email protected]> wrote:

> Yes, looks like you need to increase the number of vCores to at least 10
> in order to run 5 jobs (yarn-site.xml):
>
> <property>
>   <name>yarn.nodemanager.resource.cpu-vcores</name>
>   <value>10</value>
>   <description>Number of CPU cores that can be allocated for containers.</description>
> </property>
>
>
> On Oct 6, 2014, at 12:27 PM, Zach Cox <[email protected]> wrote:
>
> > Thanks for the replies everyone! I did the 3 things that Mark mentioned,
> > re-built & deployed the .tar.gz, then did `bin/grid stop all` and
> > `bin/grid start all`. But when I re-submitted the 3 hello-samza jobs
> > along with my 2 new jobs, yarn still won't run my 5th job. The yarn web
> > ui now shows Memory Used = 4 GB and Memory Total = 8 GB, but my job
> > still sits at State = ACCEPTED.
> >
> > When I tail deploy/yarn/logs/yarn-vagrant-resourcemanager-precise64.log I
> > see this repeated continuously:
> >
> > https://gist.githubusercontent.com/zcox/0f2b260d29e18d40d038/raw/0d805bcd7d8fec5332756efc9c990679480df117/gistfile1.txt
> >
> > I notice it says "available=<memory:4096, vCores:0>" - is my job not
> > being run now because vCores=0?
> >
> > I also updated Vagrantfile to use:
> >
> > samza.vm.provider :virtualbox do |vb|
> >   vb.memory = 4096
> >   vb.cpus = 8
> > end
> >
> > Thanks,
> > Zach
> >
> >
> > On Mon, Oct 6, 2014 at 12:40 PM, Lukas Steiblys <[email protected]> wrote:
> >
> >> I'll add that if you check the YARN node application master container
> >> log and see that the job is constantly restarting, you might need to
> >> increase the container memory limit to 1024MB at least. Also, a good
> >> parameter to play with in YARN is yarn.nodemanager.vmem-pmem-ratio.
> >>
> >> Lukas
> >>
> >> -----Original Message-----
> >> From: Mark Mindenhall
> >> Sent: Monday, October 6, 2014 8:44 AM
> >> To: [email protected]
> >> Subject: Re: Problems running new jobs in hello-samza
> >>
> >>
> >> Hi Zach,
> >>
> >> I’m also a relative newbie, but I did run into this same issue. You are
> >> correct, in that your 5th job isn’t starting due to not enough resources
> >> available in the cluster, so you need to reduce the resources required.
> >>
> >> First, in yarn-site.xml I switched over to the FairScheduler
> >> <http://hadoop.apache.org/docs/r2.2.0/hadoop-yarn/hadoop-yarn-site/FairScheduler.html>:
> >>
> >> <property>
> >>   <name>yarn.resourcemanager.scheduler.class</name>
> >>   <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
> >> </property>
> >>
> >> I also added these two properties (yarn-site.xml) to control the amount
> >> of memory allocated to each job:
> >>
> >> <property>
> >>   <name>yarn.scheduler.minimum-allocation-mb</name>
> >>   <value>256</value>
> >>   <description>Minimum limit of memory to allocate to each container request at the Resource Manager.</description>
> >> </property>
> >> <property>
> >>   <name>yarn.scheduler.maximum-allocation-mb</name>
> >>   <value>512</value>
> >>   <description>Maximum limit of memory to allocate to each container request at the Resource Manager.</description>
> >> </property>
> >>
> >> Then, in each of my Samza properties files describing my jobs, I added
> >> the following two settings:
> >>
> >> yarn.container.memory.mb=512
> >> yarn.am.container.memory.mb=256
> >>
> >> Hope that helps!
> >>
> >> Best,
> >> Mark
> >>
> >>
> >> On Oct 6, 2014, at 6:27 AM, Zach Cox <[email protected]> wrote:
> >>
> >> Hi - I'm just getting started with Samza. I got the hello-samza example
> >> working properly in the vagrant box. Then I wrote 2 new tasks, rebuilt
> >> everything and submitted them to yarn using run-job.sh. These 2 new jobs
> >> show up in the yarn web ui, however only one of them has State=RUNNING,
> >> the other just sits forever at State=ACCEPTED.
> >>
> >> The Cluster Metrics section shows some interesting things:
> >> - Apps Pending = 1
> >> - Apps Running = 4
> >> - Containers Running = 8
> >> - Memory Used = 8 GB
> >> - Memory Total = 8 GB
> >> - Memory Reserved = 0 B
> >>
> >> Again I'm really new to samza & yarn, but does this mean that the node
> >> on this vagrant box has 8 GB memory available but all 8 GB is being
> >> used, so it can't run the 5th samza job?
> >>
> >> Are there 8 containers running because each Samza job has an
> >> ApplicationMaster and a SamzaContainer? Are each of those containers
> >> using 1 GB memory, and that's why all the available memory is used up?
> >> Do these containers really need 1 GB memory each? Can this be adjusted
> >> somehow?
> >>
> >> Just trying to better understand what's going on here, and see if
> >> there's a simple way to get both of my new tasks running in hello-samza.
> >>
> >> Thanks,
> >> Zach
