Hey John,

I set up a role for myriad, restarted mesos-master, and now I'm seeing RMs 
starting on the Mesos UI, but they fail with the message "lost with exit 
status: 256".  The executor log says "Error: JAVA_HOME is not set and could not 
be found."  $JAVA_HOME is set on all my slaves as far as I'm aware.  Running 
`java -version` confirms openjdk 1.7.0_111.  Looks like it's close to a working 
state.  Am I missing something?

Thanks!
Matt

-----Original Message-----
From: John Yost [mailto:hokiege...@gmail.com] 
Sent: Wednesday, August 17, 2016 2:38 PM
To: dev@myriad.incubator.apache.org
Subject: Re: Resource manager error

Please uncomment frameworkRole and then add the name of whatever Mesos role you 
have configured that is not *. Note: at the risk of telling you something you 
already know, you define roles in /etc/mesos-master/roles.
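
As a rough sketch of the two settings described above: the role name "myriad" is only a placeholder for whatever role you actually configured, and the file locations assume a package install that reads master flags from /etc/mesos-master/.

```yaml
# /etc/mesos-master/roles is a plain-text file holding the value of the
# master's --roles flag; it is shown here as a comment:
#   myriad
#
# myriad-config-default.yml: uncomment frameworkRole and point it at that role.
frameworkName: MyriadAlpha
frameworkRole: myriad   # hypothetical role name; anything other than "*"
```

The master most likely needs a restart after the roles file changes so that it picks up the new --roles value.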

In the meantime, I opened a JIRA ticket and am going to fix this ASAP, 
starting now! :)

--John

On Wed, Aug 17, 2016 at 2:23 PM, Matthew J. Loppatto <mloppa...@keywcorp.com> wrote:

> Hey Darin,
>
> Commenting out myriadFrameworkRole got rid of the log message about 
> the missing role, but I'm still seeing the "n must be positive" exception.
>
> The only other thing of interest I see in the log is WARN 
> fair.AllocationFileLoaderService:
> fair-scheduler.xml not found on the classpath.  Not sure if that is 
> causing any issue though.
>
> Matt
>
> -----Original Message-----
> From: Darin Johnson [mailto:dbjohnson1...@gmail.com]
> Sent: Wednesday, August 17, 2016 1:26 PM
> To: Dev
> Subject: Re: Resource manager error
>
> Hey Matt,
>
> Looking through the code, I think setting myriadFrameworkRole to "*" 
> might be the problem.  Can you try commenting out that line in your 
> config?  I'll double check this in a little while too.  If that works 
> I'll submit a patch that checks that.
>
> Sorry - Myriad is still a pretty young project!  Thanks for checking 
> it out though!
>
> Darin
>
> On Wed, Aug 17, 2016 at 11:25 AM, Matthew J. Loppatto <mloppa...@keywcorp.com> wrote:
>
> > Hey Darin,
> >
> > Pulling from master got rid of the errors I was seeing, however I'm 
> > running into a new issue.  After starting the resource manager, I 
> > see this in the logs:
> >
> > 2016-08-17 10:56:40,709 INFO org.apache.myriad.Main: Launching 1 NM(s) with profile medium
> > 2016-08-17 10:56:40,710 INFO org.apache.myriad.scheduler.MyriadOperations: Adding 1 NM instances to cluster
> > 2016-08-17 10:56:40,733 ERROR org.apache.myriad.scheduler.event.handlers.ErrorEventHandler: Role '' is not present in the master's --roles
> >
> > My Mesos cluster has the default "*" role so I tried setting
> > frameworkRole: "*" in myriad-config-default.yml, restarted the 
> > resource manager and got this error:
> >
> > 2016-08-17 11:06:28,244 ERROR org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler: Exception thrown while trying to create a task for nm
> > java.lang.IllegalArgumentException: n must be positive
> >     at java.util.Random.nextInt(Random.java:300)
> >     at org.apache.myriad.scheduler.resource.RangeResource.getRandomValues(RangeResource.java:128)
> >     at org.apache.myriad.scheduler.resource.RangeResource.consumeResource(RangeResource.java:99)
> >     at org.apache.myriad.scheduler.resource.ResourceOfferContainer.consumePorts(ResourceOfferContainer.java:171)
> >     at org.apache.myriad.scheduler.NMTaskFactory.createTask(NMTaskFactory.java:45)
> >     at org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.java:119)
> >     at org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.java:49)
> >     at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
> >     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >     at java.lang.Thread.run(Thread.java:745)
> >
> > Does Myriad require its own role in Mesos?
> >
> > Thanks,
> > Matt
> >
> >
> > -----Original Message-----
> > From: Darin Johnson [mailto:dbjohnson1...@gmail.com]
> > Sent: Tuesday, August 16, 2016 6:18 PM
> > To: Dev
> > Subject: Re: Resource manager error
> >
> > Hey Matthew, my coworker found the same issue recently. I fixed it in 
> > my last pull request, if you'd like to pull from master.
> >
> > Alternatively, you could comment out the appendCgroups line in
> > myriad-scheduler/src/main/java/org/apache/myriad/scheduler/NMExecutorCLGenImpl
> > (https://github.com/apache/incubator-myriad/tree/0.2.x/myriad-scheduler/src/main/java/org/apache/myriad/scheduler)
> > and rebuild.
> >
> > Sorry that slipped past my QA; unfortunately I'm always using cgroups 
> > and didn't test that.  We may do a 0.2.1 release, but I can't say when.
> >
> > Darin
> >
> > On Aug 16, 2016 8:49 AM, "Matthew J. Loppatto" <mloppa...@keywcorp.com> wrote:
> >
> > > Hi,
> > >
> > >
> > >
> > > I’m setting up Myriad 0.2.0 on my Mesos cluster following this guide:
> > > https://cwiki.apache.org/confluence/display/MYRIAD/Installing+for+Developers
> > >
> > >
> > >
> > > And I get the following error in the resource manager executor log 
> > > in mesos after starting it with `/opt/hadoop-2.7.2/bin/yarn resourcemanager`:
> > >
> > > chown: cannot access ‘/sys/fs/cgroup/cpu/mesos/f5d6c530-c13d-4b1d-bc30-f298affb6442’: No such file or directory
> > > env: /bin/yarn: No such file or directory
> > >
> > >
> > >
> > > It appears the ‘mesos’ directory doesn’t exist under /sys/fs/cgroup/cpu.
> > > Any ideas what the issue could be?
> > >
> > >
> > >
> > > This is my yarn-site.xml:
> > >
> > > <configuration>
> > > <!-- Site-specific YARN configuration properties -->
> > >    <property>
> > >        <name>yarn.nodemanager.aux-services</name>
> > >        <value>mapreduce_shuffle,myriad_executor</value>
> > >        <!-- If using MapR distro, please use the following value:
> > >        <value>mapreduce_shuffle,mapr_direct_shuffle,myriad_executor</value> -->
> > >    </property>
> > >    <property>
> > >        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
> > >        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
> > >    </property>
> > >    <property>
> > >        <name>yarn.nodemanager.aux-services.myriad_executor.class</name>
> > >        <value>org.apache.myriad.executor.MyriadExecutorAuxService</value>
> > >    </property>
> > >    <property>
> > >        <name>yarn.nm.liveness-monitor.expiry-interval-ms</name>
> > >        <value>2000</value>
> > >    </property>
> > >    <property>
> > >        <name>yarn.am.liveness-monitor.expiry-interval-ms</name>
> > >        <value>10000</value>
> > >    </property>
> > >    <property>
> > >        <name>yarn.resourcemanager.nm.liveness-monitor.interval-ms</name>
> > >        <value>1000</value>
> > >    </property>
> > > <!-- Needed for Fine Grain Scaling -->
> > >    <property>
> > >        <name>yarn.scheduler.minimum-allocation-vcores</name>
> > >        <value>0</value>
> > >    </property>
> > >    <property>
> > >        <name>yarn.scheduler.minimum-allocation-mb</name>
> > >        <value>0</value>
> > >    </property>
> > > <!-- Site specific YARN configuration properties -->
> > > <property>
> > >    <name>yarn.nodemanager.resource.cpu-vcores</name>
> > >    <value>${nodemanager.resource.cpu-vcores}</value>
> > > </property>
> > > <property>
> > >    <name>yarn.nodemanager.resource.memory-mb</name>
> > >    <value>${nodemanager.resource.memory-mb}</value>
> > > </property>
> > > <!--These options enable dynamic port assignment by mesos -->
> > > <property>
> > >    <name>yarn.nodemanager.address</name>
> > >    <value>${myriad.yarn.nodemanager.address}</value>
> > > </property>
> > > <property>
> > >    <name>yarn.nodemanager.webapp.address</name>
> > >    <value>${myriad.yarn.nodemanager.webapp.address}</value>
> > > </property>
> > > <property>
> > >    <name>yarn.nodemanager.webapp.https.address</name>
> > >    <value>${myriad.yarn.nodemanager.webapp.address}</value>
> > > </property>
> > > <property>
> > >    <name>yarn.nodemanager.localizer.address</name>
> > >    <value>${myriad.yarn.nodemanager.localizer.address}</value>
> > > </property>
> > > <!-- Configure Myriad Scheduler here -->
> > > <property>
> > >    <name>yarn.resourcemanager.scheduler.class</name>
> > >    <value>org.apache.myriad.scheduler.yarn.MyriadFairScheduler</value>
> > >    <description>One can configure other schedulers as well from the following list: org.apache.myriad.scheduler.yarn.MyriadCapacityScheduler, org.apache.myriad.scheduler.yarn.MyriadFifoScheduler</description>
> > > </property>
> > >
> > > <!-- Disable PMem/VMem checks for Hadoop 2.7.2 -->
> > > <property>
> > >    <name>yarn.nodemanager.pmem-check-enabled</name>
> > >    <value>false</value>
> > > </property>
> > > <property>
> > >    <name>yarn.nodemanager.vmem-check-enabled</name>
> > >    <value>false</value>
> > > </property>
> > > </configuration>
> > >
> > > My myriad-config-default.yml:
> > >
> > > mesosMaster: zk://myip:2181/mesos
> > > checkpoint: false
> > > frameworkFailoverTimeout: 43200000
> > > frameworkName: MyriadAlpha
> > > frameworkRole:
> > > frameworkUser: root     # User the Node Manager runs as, required if nodeManagerURI set, otherwise defaults to the user
> > >                          # running the resource manager.
> > > frameworkSuperUser: root  # To be deprecated, currently permissions need set by a superuser due to Mesos-1790.  Must be
> > >                          # root or have passwordless sudo. Required if nodeManagerURI set, ignored otherwise.
> > > nativeLibrary: /usr/local/lib/libmesos.so
> > > zkServers: myip:2181
> > > zkTimeout: 20000
> > > restApiPort: 8192
> > > servedConfigPath: dist/config.tgz
> > > servedBinaryPath: dist/binary.tgz
> > > profiles:
> > >   zero:  # NMs launched with this profile dynamically obtain cpu/mem from Mesos
> > >     cpu: 0
> > >     mem: 0
> > >   small:
> > >     cpu: 2
> > >     mem: 2048
> > >   medium:
> > >     cpu: 4
> > >     mem: 4096
> > >   large:
> > >     cpu: 10
> > >     mem: 12288
> > > nmInstances: # NMs to start with. Requires at least 1 NM with a non-zero profile.
> > >   medium: 1 # <profile_name : instances>
> > > rebalancer: false
> > > haEnabled: false
> > > nodemanager:
> > >   jvmMaxMemoryMB: 1024
> > >   cpus: 0.2
> > >   cgroups: false
> > > executor:
> > >   jvmMaxMemoryMB: 256
> > >   path: file:///usr/local/libexec/mesos/myriad-executor-runnable-0.1.0.jar
> > >   #The following should be used for a remotely distributed URI, hdfs assumed but other URI types valid.
> > >   #nodeManagerUri: hdfs://namenode:port/dist/hadoop-2.7.0.tar.gz
> > >   #configUri: http://127.0.0.1/api/artifacts/config.tgz
> > >   #jvmUri: https://downloads.mycompany.com/java/jre-7u76-linux-x64.tar.gz
> > > yarnEnvironment:
> > >   YARN_HOME: /opt/hadoop-2.7.2
> > >
> > >
> > >
> > >
> > >
> > > Thanks!
> > >
> > > Matt
> > >
> >
>
