Hey John, I set up a role for myriad, restarted mesos-master, and now I'm seeing RMs starting in the Mesos UI, but they fail with the message "lost with exit status: 256". The executor log says "Error: JAVA_HOME is not set and could not be found." $JAVA_HOME is set on all my slaves as far as I'm aware, and running `java -version` confirms openjdk 1.7.0_111. Looks like it's close to a working state. Am I missing something?
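A side note on where JAVA_HOME tends to get lost here: executors are forked by the Mesos agent, and the agent, as a daemon, never sources ~/.bashrc or /etc/profile, so a JAVA_HOME that shows up in an interactive shell can still be absent from the executor's environment. A minimal sketch of the difference (`env -i` simulates a daemon's empty starting environment):

```shell
# JAVA_HOME exported from shell profiles is visible here...
echo "your shell:  JAVA_HOME=${JAVA_HOME:-unset}"
# ...but a process started with an empty environment (like a daemon under
# init) never sees it:
env -i sh -c 'echo "daemon-like: JAVA_HOME=${JAVA_HOME:-unset}"'
# To inspect what the running agent actually has (PID lookup is a sketch):
#   tr '\0' '\n' < /proc/$(pgrep -f mesos-slave | head -1)/environ | grep JAVA_HOME
```

If the agent's environment turns out to lack it, the usual fixes are exporting JAVA_HOME in /etc/default/mesos-slave (sourced by the stock init scripts; path is an assumption about the packaging) or adding it under `yarnEnvironment` in myriad-config-default.yml next to YARN_HOME.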
Thanks!
Matt

-----Original Message-----
From: John Yost [mailto:hokiege...@gmail.com]
Sent: Wednesday, August 17, 2016 2:38 PM
To: dev@myriad.incubator.apache.org
Subject: Re: Resource manager error

Please uncomment frameworkRole and then add the name of whatever Mesos role you have configured that is not *. Note: at the risk of telling you something you already know, you define roles in /etc/mesos-master/roles. In the meantime, I opened up a JIRA ticket and I'm gonna fix this ASAP starting now! :)

--John

On Wed, Aug 17, 2016 at 2:23 PM, Matthew J. Loppatto <mloppa...@keywcorp.com> wrote:
> Hey Darin,
>
> Commenting out myriadFrameworkRole got rid of the log message about
> the missing role, but I'm still seeing the "n must be positive" exception.
>
> The only other thing of interest I see in the log is
> WARN fair.AllocationFileLoaderService: fair-scheduler.xml not found
> on the classpath. Not sure if that is causing any issue, though.
>
> Matt
>
> -----Original Message-----
> From: Darin Johnson [mailto:dbjohnson1...@gmail.com]
> Sent: Wednesday, August 17, 2016 1:26 PM
> To: Dev
> Subject: Re: Resource manager error
>
> Hey Matt,
>
> Looking through the code, I think setting myriadFrameworkRole to "*"
> might be the problem. Can you try commenting out that line in your
> config? I'll double-check this in a little while too. If that works
> I'll submit a patch that checks for that.
>
> Sorry - Myriad is still a pretty young project! Thanks for checking
> it out though!
>
> Darin
>
> On Wed, Aug 17, 2016 at 11:25 AM, Matthew J. Loppatto <mloppa...@keywcorp.com> wrote:
>
> > Hey Darin,
> >
> > Pulling from master got rid of the errors I was seeing; however, I'm
> > running into a new issue. After starting the resource manager, I
> > see this in the logs:
> >
> > 2016-08-17 10:56:40,709 INFO org.apache.myriad.Main: Launching 1 NM(s) with profile medium
> > 2016-08-17 10:56:40,710 INFO org.apache.myriad.scheduler.MyriadOperations: Adding 1 NM instances to cluster
> > 2016-08-17 10:56:40,733 ERROR org.apache.myriad.scheduler.event.handlers.ErrorEventHandler: Role '' is not present in the master's --roles
> >
> > My Mesos cluster has the default "*" role so I tried setting
> > frameworkRole: "*" in myriad-config-default.yml, restarted the
> > resource manager and got this error:
> >
> > 2016-08-17 11:06:28,244 ERROR org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler: Exception thrown while trying to create a task for nm
> > java.lang.IllegalArgumentException: n must be positive
> >         at java.util.Random.nextInt(Random.java:300)
> >         at org.apache.myriad.scheduler.resource.RangeResource.getRandomValues(RangeResource.java:128)
> >         at org.apache.myriad.scheduler.resource.RangeResource.consumeResource(RangeResource.java:99)
> >         at org.apache.myriad.scheduler.resource.ResourceOfferContainer.consumePorts(ResourceOfferContainer.java:171)
> >         at org.apache.myriad.scheduler.NMTaskFactory.createTask(NMTaskFactory.java:45)
> >         at org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.java:119)
> >         at org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.java:49)
> >         at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
> >         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> >         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >         at java.lang.Thread.run(Thread.java:745)
> >
> > Does Myriad require its own role in Mesos?
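John's reply at the top of the thread answers this: define a non-"*" role on the master and point frameworkRole at it. The "n must be positive" IllegalArgumentException is plausibly a downstream symptom, since an empty role would leave Myriad with no port range for `Random.nextInt` to draw from. A sketch of the role setup, where the role name "myriad" follows John's example and the `write_role` helper is hypothetical; the /etc/mesos-master/roles path is from his note:

```shell
# write_role ROLE FILE: persist a Mesos role so mesos-master picks it up at
# startup; the stock package reads /etc/mesos-master/roles.
write_role() {
  printf '%s\n' "$1" > "$2" && echo "role $1 written to $2"
}

# On a real master you would run (as root), then restart mesos-master and
# set `frameworkRole: myriad` in myriad-config-default.yml:
#   write_role myriad /etc/mesos-master/roles
write_role myriad "${ROLES_FILE:-/tmp/roles.demo}"
```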
> > Thanks,
> > Matt
> >
> > -----Original Message-----
> > From: Darin Johnson [mailto:dbjohnson1...@gmail.com]
> > Sent: Tuesday, August 16, 2016 6:18 PM
> > To: Dev
> > Subject: Re: Resource manager error
> >
> > Hey Matthew, my coworker found the same issue recently; I fixed it in
> > my last pull request, if you'd like to pull from master.
> >
> > Alternatively, you could comment out the appendCgroups line in
> > myriad-scheduler/src/main/java/org/apache/myriad/scheduler/NMExecutorCLGenImpl
> > (https://github.com/apache/incubator-myriad/tree/0.2.x/myriad-scheduler)
> > and rebuild.
> >
> > Sorry that missed my QA; unfortunately I'm always using cgroups and
> > didn't test that. We may do a 0.2.1 release but I can't say when.
> >
> > Darin
> >
> > On Aug 16, 2016 8:49 AM, "Matthew J. Loppatto" <mloppa...@keywcorp.com> wrote:
> >
> > > Hi,
> > >
> > > I'm setting up Myriad 0.2.0 on my Mesos cluster following this guide:
> > > https://cwiki.apache.org/confluence/display/MYRIAD/Installing+for+Developers
> > >
> > > And I get the following error in the resource manager executor log in
> > > mesos after starting it with `/opt/hadoop-2.7.2/bin/yarn resourcemanager`:
> > >
> > > chown: cannot access ‘/sys/fs/cgroup/cpu/mesos/f5d6c530-c13d-4b1d-bc30-f298affb6442’:
> > > No such file or directory
> > > env: /bin/yarn: No such file or directory
> > >
> > > It appears the ‘mesos’ directory doesn’t exist under /sys/fs/cgroup/cpu.
> > > Any ideas what the issue could be?
> > > This is my yarn-site.xml:
> > >
> > > <configuration>
> > >   <!-- Site-specific YARN configuration properties -->
> > >   <property>
> > >     <name>yarn.nodemanager.aux-services</name>
> > >     <value>mapreduce_shuffle,myriad_executor</value>
> > >     <!-- If using the MapR distro, please use the following value:
> > >     <value>mapreduce_shuffle,mapr_direct_shuffle,myriad_executor</value> -->
> > >   </property>
> > >   <property>
> > >     <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
> > >     <value>org.apache.hadoop.mapred.ShuffleHandler</value>
> > >   </property>
> > >   <property>
> > >     <name>yarn.nodemanager.aux-services.myriad_executor.class</name>
> > >     <value>org.apache.myriad.executor.MyriadExecutorAuxService</value>
> > >   </property>
> > >   <property>
> > >     <name>yarn.nm.liveness-monitor.expiry-interval-ms</name>
> > >     <value>2000</value>
> > >   </property>
> > >   <property>
> > >     <name>yarn.am.liveness-monitor.expiry-interval-ms</name>
> > >     <value>10000</value>
> > >   </property>
> > >   <property>
> > >     <name>yarn.resourcemanager.nm.liveness-monitor.interval-ms</name>
> > >     <value>1000</value>
> > >   </property>
> > >   <!-- Needed for Fine-grained Scaling -->
> > >   <property>
> > >     <name>yarn.scheduler.minimum-allocation-vcores</name>
> > >     <value>0</value>
> > >   </property>
> > >   <property>
> > >     <name>yarn.scheduler.minimum-allocation-mb</name>
> > >     <value>0</value>
> > >   </property>
> > >   <property>
> > >     <name>yarn.nodemanager.resource.cpu-vcores</name>
> > >     <value>${nodemanager.resource.cpu-vcores}</value>
> > >   </property>
> > >   <property>
> > >     <name>yarn.nodemanager.resource.memory-mb</name>
> > >     <value>${nodemanager.resource.memory-mb}</value>
> > >   </property>
> > >   <!-- These options enable dynamic port assignment by Mesos -->
> > >   <property>
> > >     <name>yarn.nodemanager.address</name>
> > >     <value>${myriad.yarn.nodemanager.address}</value>
> > >   </property>
> > >   <property>
> > >     <name>yarn.nodemanager.webapp.address</name>
> > >     <value>${myriad.yarn.nodemanager.webapp.address}</value>
> > >   </property>
> > >   <property>
> > >     <name>yarn.nodemanager.webapp.https.address</name>
> > >     <value>${myriad.yarn.nodemanager.webapp.address}</value>
> > >   </property>
> > >   <property>
> > >     <name>yarn.nodemanager.localizer.address</name>
> > >     <value>${myriad.yarn.nodemanager.localizer.address}</value>
> > >   </property>
> > >   <!-- Configure Myriad Scheduler here -->
> > >   <property>
> > >     <name>yarn.resourcemanager.scheduler.class</name>
> > >     <value>org.apache.myriad.scheduler.yarn.MyriadFairScheduler</value>
> > >     <description>One can configure other schedulers as well from the
> > >     following list: org.apache.myriad.scheduler.yarn.MyriadCapacityScheduler,
> > >     org.apache.myriad.scheduler.yarn.MyriadFifoScheduler</description>
> > >   </property>
> > >   <!-- Disable PMem/VMem checks for Hadoop 2.7.2 -->
> > >   <property>
> > >     <name>yarn.nodemanager.pmem-check-enabled</name>
> > >     <value>false</value>
> > >   </property>
> > >   <property>
> > >     <name>yarn.nodemanager.vmem-check-enabled</name>
> > >     <value>false</value>
> > >   </property>
> > > </configuration>
> > >
> > > My myriad-config-default.yml:
> > >
> > > mesosMaster: zk://myip:2181/mesos
> > > checkpoint: false
> > > frameworkFailoverTimeout: 43200000
> > > frameworkName: MyriadAlpha
> > > frameworkRole:
> > > frameworkUser: root       # User the Node Manager runs as; required if
> > >                           # nodeManagerURI is set, otherwise defaults to the
> > >                           # user running the resource manager.
> > > frameworkSuperUser: root  # To be deprecated; currently permissions need to
> > >                           # be set by a superuser due to MESOS-1790. Must be
> > >                           # root or have passwordless sudo. Required if
> > >                           # nodeManagerURI is set, ignored otherwise.
> > > nativeLibrary: /usr/local/lib/libmesos.so
> > > zkServers: myip:2181
> > > zkTimeout: 20000
> > > restApiPort: 8192
> > > servedConfigPath: dist/config.tgz
> > > servedBinaryPath: dist/binary.tgz
> > > profiles:
> > >   zero:   # NMs launched with this profile dynamically obtain cpu/mem from Mesos
> > >     cpu: 0
> > >     mem: 0
> > >   small:
> > >     cpu: 2
> > >     mem: 2048
> > >   medium:
> > >     cpu: 4
> > >     mem: 4096
> > >   large:
> > >     cpu: 10
> > >     mem: 12288
> > > nmInstances:  # NMs to start with. Requires at least 1 NM with a non-zero profile.
> > >   medium: 1   # <profile_name>: <instances>
> > > rebalancer: false
> > > haEnabled: false
> > > nodemanager:
> > >   jvmMaxMemoryMB: 1024
> > >   cpus: 0.2
> > >   cgroups: false
> > > executor:
> > >   jvmMaxMemoryMB: 256
> > >   path: file:///usr/local/libexec/mesos/myriad-executor-runnable-0.1.0.jar
> > >   # The following should be used for a remotely distributed URI; hdfs is
> > >   # assumed, but other URI types are valid.
> > >   #nodeManagerUri: hdfs://namenode:port/dist/hadoop-2.7.0.tar.gz
> > >   #configUri: http://127.0.0.1/api/artifacts/config.tgz
> > >   #jvmUri: https://downloads.mycompany.com/java/jre-7u76-linux-x64.tar.gz
> > > yarnEnvironment:
> > >   YARN_HOME: /opt/hadoop-2.7.2
> > >
> > > Thanks!
> > > Matt
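Circling back to the chown error in the first message of the thread: the mesos/ directory under /sys/fs/cgroup/cpu is created by the Mesos agent only when it runs with cgroups isolation enabled, so with `cgroups: false` in the config above the executor command should not reference it at all, which is what Darin's NMExecutorCLGenImpl fix removes. A quick check, assuming the conventional cgroup v1 mount point and default `--cgroups_root` of "mesos"; the helper name is made up:

```shell
# check_mesos_cgroup DIR: report whether the agent has created its cgroup
# hierarchy (it only does so when started with e.g. --isolation=cgroups/cpu).
check_mesos_cgroup() {
  if [ -d "$1" ]; then
    echo "cgroup hierarchy present: $1"
  else
    echo "missing: $1 (agent not using cgroups isolation; keep cgroups: false in Myriad)"
  fi
}

check_mesos_cgroup /sys/fs/cgroup/cpu/mesos
```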