[jira] [Commented] (MYRIAD-254) can not autoscale with zero profile on myriad 0.2.0

2017-05-23 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16021597#comment-16021597
 ] 

DarinJ commented on MYRIAD-254:
---

Hongtaosun, 
You need at least one small NodeManager running or the MapReduce job won't 
start.  This is mentioned in the Myriad configuration, though it should be 
more explicit in the FGS documentation.  I'll try to update it.

Once the one small NM and a few zero-profile NMs are up, you can start the 
MapReduce job.  Once it has started, the zero-profile NMs will resize and 
accept YARN tasks.  
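
The rejection in the quoted trace (requestedMemory=512, maxMemory=0) can be 
illustrated with a minimal Java sketch.  This is not the Hadoop SchedulerUtils 
code: the class and method names are hypothetical and only mirror the 
condition spelled out in the exception message.  The 2048 MB capacity for a 
small NM is an assumed value for illustration.

```java
// Hypothetical illustration; not the Hadoop SchedulerUtils implementation.
class ResourceRequestCheck {

    // Mirrors the condition in the exception message:
    // "requested memory < 0, or requested memory > max configured".
    static boolean isValidMemoryRequest(int requestedMb, int maxMb) {
        return requestedMb >= 0 && requestedMb <= maxMb;
    }

    public static void main(String[] args) {
        int requestedMb = 512; // requestedMemory from the trace

        // Only zero-profile NMs registered: maxMemory=0, as in the trace.
        System.out.println(isValidMemoryRequest(requestedMb, 0));

        // With one small NM registered (2048 MB is an assumed capacity).
        System.out.println(isValidMemoryRequest(requestedMb, 2048));
    }
}
```

With a maximum of 0 any positive request fails the check, which is consistent 
with needing one non-zero NM before the job can be submitted.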

> can not autoscale with zero profile on myriad 0.2.0
> ---
>
> Key: MYRIAD-254
> URL: https://issues.apache.org/jira/browse/MYRIAD-254
> Project: Myriad
>  Issue Type: Bug
>  Components: Scheduler
>Affects Versions: Myriad 0.2.0
> Environment: mesos 1.1.0
> docker 1.12.6
>Reporter: Hongtaosun
>
> [root@csv-dcosstorage60 hadoop]# hadoop jar 
> /usr/hadoop-2.7.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.0.jar 
> wordcount /wordcount/1.txt /wordcount/output
> 17/03/30 17:11:13 INFO client.RMProxy: Connecting to ResourceManager at 
> /20.26.28.246:8032
> 17/03/30 17:11:14 INFO input.FileInputFormat: Total input paths to process : 1
> 17/03/30 17:11:14 INFO mapreduce.JobSubmitter: number of splits:1
> 17/03/30 17:11:14 INFO mapreduce.JobSubmitter: Submitting tokens for job: 
> job_1490859953345_0002
> 17/03/30 17:11:15 INFO mapreduce.JobSubmitter: Cleaning up the staging area 
> /tmp/hadoop-yarn/staging/root/.staging/job_1490859953345_0002
> java.io.IOException: 
> org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid 
> resource request, requested memory < 0, or requested memory > max configured, 
> requestedMemory=512, maxMemory=0
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:204)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:385)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:328)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:281)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:580)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:218)
>   at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:419)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
>   at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:306)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:243)
>   at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
>   at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
>   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1308)
>   at org.apache.hadoop.examples.WordCount.main(WordCount.java:87)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
>   at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
>   at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> 

[jira] [Comment Edited] (MYRIAD-254) can not autoscale with zero profile on myriad 0.2.0

2017-03-30 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15949439#comment-15949439
 ] 

DarinJ edited comment on MYRIAD-254 at 3/30/17 5:18 PM:


Which Hadoop scheduler are you using?  Have you tried the Fair Scheduler?  The 
Capacity Scheduler won't allow NodeManagers with 0 cores and 0 memory.


was (Author: darinj):
Which Hadoop Scheduler are you using?  Have you tried Fair?


[jira] [Commented] (MYRIAD-251) ZERO size NodeManager fail to obtain resource from Mesos Offer

2016-12-14 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15749451#comment-15749451
 ] 

DarinJ commented on MYRIAD-251:
---

[~hokiegeek2] it looks like this occurs in PR-91, on line 145 of 
NMHeartBeatHandler.  Instead of not using any of the resources if they are 
above the threshold, you should take the min of the resources and the 
threshold.
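
The suggestion above can be sketched in Java as follows.  This is a 
hypothetical illustration, not the code from PR-91: the class and method names 
are made up, the 2252.8 MB limit is taken from the quoted issue, and the core 
limit of 4 is purely an assumed value.

```java
// Hypothetical sketch; names are illustrative, not Myriad's actual code.
class OfferClamp {

    // Instead of discarding an offer that exceeds the threshold,
    // use as much of it as the threshold allows.
    static double clamp(double offered, double threshold) {
        return Math.min(offered, threshold);
    }

    public static void main(String[] args) {
        // Values from the RM log in the quoted issue: 10 cores, 5888 MB.
        double offeredCores = 10;
        double offeredMemMb = 5888;

        // Memory limit from the issue description; core limit is assumed.
        double maxCores = 4;
        double maxMemMb = 2252.8;

        System.out.println(clamp(offeredCores, maxCores));
        System.out.println(clamp(offeredMemMb, maxMemMb));
    }
}
```

An offer below the threshold passes through unchanged, so only oversized 
offers are trimmed rather than dropped.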

> ZERO size NodeManager fail to obtain resource from Mesos Offer
> --
>
> Key: MYRIAD-251
> URL: https://issues.apache.org/jira/browse/MYRIAD-251
> Project: Myriad
>  Issue Type: Bug
>  Components: Scheduler
>Reporter: Tao Jie
>
> I tried Fine-Grained Scaling and flexed up a zero-size NodeManager, then ran 
> an MR job that requests resources.
> However, the zero-size NM did not obtain resources from the Mesos offer. The 
> RM logs look like:
> {code}
> 2016-12-14 16:58:23,929 INFO 
> org.apache.myriad.scheduler.fgs.NMHeartBeatHandler: Did not update 
> bdi13.cmss.com with 10 cores and 5888 memory, over max cpu cores and/or max 
> memory
> 2016-12-14 16:58:23,931 WARN 
> org.apache.myriad.scheduler.fgs.YarnNodeCapacityManager: Asked to set Node 
> bdi13.cmss.com:31905 to a value less than zero!  Had , 
> setting to .
> 2016-12-14 16:58:23,931 WARN 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler:
>  Update resource on node: bdi13.cmss.com with the same resource:  vCores:0>
> {code}
> It seems that a Mesos offer with more than 2252.8 MB of memory is rejected, 
> and 2252.8 MB is a fixed value in the code:
> {code}
> private Double generateNodeManagerMemory() {
>   return NodeManagerConfiguration.DEFAULT_JVM_MAX_MEMORY_MB
>       * (1 + NodeManagerConfiguration.JVM_OVERHEAD);
> }
> {code}
> where DEFAULT_JVM_MAX_MEMORY_MB=2048 and 
> NodeManagerConfiguration.JVM_OVERHEAD=0.1
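
The 2252.8 MB figure follows directly from the two constants quoted above.  A 
small self-contained Java sketch of the arithmetic, using the constant values 
from the issue description (the wrapper class is illustrative and is not 
Myriad's NodeManagerConfiguration):

```java
// Constants copied from the issue description; the class is illustrative.
class NodeManagerMemoryThreshold {
    static final double DEFAULT_JVM_MAX_MEMORY_MB = 2048;
    static final double JVM_OVERHEAD = 0.1;

    // Same formula as the quoted generateNodeManagerMemory().
    static double generateNodeManagerMemory() {
        return DEFAULT_JVM_MAX_MEMORY_MB * (1 + JVM_OVERHEAD);
    }

    public static void main(String[] args) {
        double thresholdMb = generateNodeManagerMemory();
        double offeredMb = 5888; // memory value from the RM log above

        System.out.println("threshold MB: " + thresholdMb);
        System.out.println("offer exceeds threshold: " + (offeredMb > thresholdMb));
    }
}
```

The 5888 MB offer in the log exceeds this value, which matches the "over max 
cpu cores and/or max memory" message.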



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MYRIAD-250) Should shutdown mesos framework when stop resourcemanager

2016-11-30 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15708570#comment-15708570
 ] 

DarinJ commented on MYRIAD-250:
---

Yes, I've always found it mildly annoying that when I shut down Myriad I had 
to kill the resource manager manually.

> Should shutdown mesos framework when stop resourcemanager
> -
>
> Key: MYRIAD-250
> URL: https://issues.apache.org/jira/browse/MYRIAD-250
> Project: Myriad
>  Issue Type: Bug
>Affects Versions: Myriad 0.2.0
>Reporter: Tao Jie
>
> When I started the ResourceManager and flexed up nodes, NodeManagers were 
> launched as Mesos tasks in the framework created by the RM.
> When I stopped the ResourceManager, the framework became inactive, but the 
> NodeManagers kept running as active tasks. I then restarted the 
> ResourceManager, which created another framework. Those NodeManagers reported 
> to the new ResourceManager, and I could not kill them by flexing down nodes.
> It seems that the framework should be shut down once the ResourceManager is 
> stopped.





[jira] [Commented] (MYRIAD-248) Fail to launch Nodemanager when frameworkRole is default value "*"

2016-11-28 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15704085#comment-15704085
 ] 

DarinJ commented on MYRIAD-248:
---

I quickly hacked this together after reproducing the problem: 
https://github.com/darinj/incubator-myriad.  I'd like to put some unit tests in 
place that demonstrate the problem before opening a PR, though.

> Fail to launch Nodemanager when frameworkRole is default value "*"
> --
>
> Key: MYRIAD-248
> URL: https://issues.apache.org/jira/browse/MYRIAD-248
> Project: Myriad
>  Issue Type: Bug
>Affects Versions: Myriad 0.2.0
>Reporter: Tao Jie
>
> I tried to start hadoop cluster with myriad-0.2.0, but got error message in 
> rm log:
> {code}
> 2016-11-25 10:32:50,750 ERROR 
> org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler: 
> Exception thrown while trying to create a task for nm
> java.lang.IllegalArgumentException: n must be positive
> at java.util.Random.nextInt(Random.java:300)
> at 
> org.apache.myriad.scheduler.resource.RangeResource.getRandomValues(RangeResource.java:128)
> at 
> org.apache.myriad.scheduler.resource.RangeResource.consumeResource(RangeResource.java:99)
> at 
> org.apache.myriad.scheduler.resource.ResourceOfferContainer.consumePorts(ResourceOfferContainer.java:171)
> at 
> org.apache.myriad.scheduler.NMTaskFactory.createTask(NMTaskFactory.java:45)
> at 
> org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.java:146)
> at 
> org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.java:51)
> at 
> com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> It seems that the failure is due to the default value ("*") of frameworkRole 
> in myriad-config-default.yml.
> I set frameworkRole to another value, and then it worked well.
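
The "n must be positive" message in the trace above is what 
java.util.Random.nextInt(int) throws when its bound is zero or negative (newer 
JDKs word it as "bound must be positive").  Since the call comes from 
RangeResource.getRandomValues via consumePorts, one plausible reading is that 
no port values were available to choose from; that is an assumption, not 
something the thread confirms.  A minimal Java sketch of the failure and of a 
guard, where pickFromRange is a hypothetical helper and not Myriad's code:

```java
import java.util.Random;

// Illustrative only; pickFromRange is hypothetical, not Myriad's RangeResource.
class RandomBoundDemo {

    // Picks a value in [begin, end] and fails with a clear message when the
    // range is empty, instead of letting Random.nextInt throw.
    static int pickFromRange(Random random, int begin, int end) {
        int size = end - begin + 1;
        if (size <= 0) {
            throw new IllegalStateException(
                "no values available in range " + begin + "-" + end);
        }
        return begin + random.nextInt(size);
    }

    // Returns true if Random.nextInt(0) throws IllegalArgumentException.
    static boolean nextIntRejectsZero() {
        try {
            new Random().nextInt(0);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(nextIntRejectsZero());
        System.out.println(pickFromRange(new Random(), 31000, 31000));
    }
}
```

A unit test along these lines would demonstrate the problem independently of 
a running cluster.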





[jira] [Commented] (MYRIAD-248) Fail to launch Nodemanager when frameworkRole is default value "*"

2016-11-28 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702397#comment-15702397
 ] 

DarinJ commented on MYRIAD-248:
---

Will take a look, I think this can be solved pretty easily if I can get a 
little time.







[jira] [Commented] (MYRIAD-96) Support network isolation between distinct YARN clusters using overlay networks

2016-10-10 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-96?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563092#comment-15563092
 ] 

DarinJ commented on MYRIAD-96:
--

I'd really like this to be a feature of Myriad.  I think presently we could 
attempt to use it via the Docker Network Plugin, but it'd be relatively simple 
to add the NetworkInfo section to the protobuf for the universal containerizer.

> Support network isolation between distinct YARN clusters using overlay 
> networks
> ---
>
> Key: MYRIAD-96
> URL: https://issues.apache.org/jira/browse/MYRIAD-96
> Project: Myriad
>  Issue Type: Improvement
>Reporter: Swapnil Daingade
>Assignee: Swapnil Daingade
>
> * Enable creation of an overlay network per tenant YARN cluster using a 
> virtual switch (like Open vSwitch)
> * Connect different docker containers belonging to a cluster to the same 
> overlay network.





[jira] [Commented] (MYRIAD-242) Task_Failed

2016-09-13 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15487013#comment-15487013
 ] 

DarinJ commented on MYRIAD-242:
---

yangjunfeng, it looks like you need to set yarn.resourcemanager.hostname in 
your yarn-site.xml.
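
The 0.0.0.0 in the fetch URI matches the YARN default for 
yarn.resourcemanager.hostname, and yarn.resourcemanager.webapp.address 
defaults to ${yarn.resourcemanager.hostname}:8088.  A minimal Java sketch of 
how an unset hostname ends up in the URI seen in the log; the confUri helper 
and the rm.example.com hostname are hypothetical, and how Myriad actually 
builds the URI is an assumption here:

```java
// Hypothetical sketch of the default substitution; not Myriad or YARN code.
class RmConfUri {
    // YARN default for yarn.resourcemanager.hostname.
    static final String DEFAULT_RM_HOSTNAME = "0.0.0.0";
    // Default port of yarn.resourcemanager.webapp.address.
    static final int DEFAULT_WEBAPP_PORT = 8088;

    // Falls back to the default hostname when none is configured.
    static String confUri(String configuredHostname) {
        String host = (configuredHostname == null || configuredHostname.isEmpty())
                ? DEFAULT_RM_HOSTNAME
                : configuredHostname;
        return "http://" + host + ":" + DEFAULT_WEBAPP_PORT + "/conf";
    }

    public static void main(String[] args) {
        // Hostname unset: the URI from the executor log.
        System.out.println(confUri(null));
        // Hostname set in yarn-site.xml (example value).
        System.out.println(confUri("rm.example.com"));
    }
}
```

An agent on another machine cannot reach 0.0.0.0, so the fetch fails until 
the hostname is set to an address the agents can resolve.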

> Task_Failed
> ---
>
> Key: MYRIAD-242
> URL: https://issues.apache.org/jira/browse/MYRIAD-242
> Project: Myriad
>  Issue Type: Bug
>  Components: Scheduler
>Affects Versions: Myriad 0.2.0
>Reporter: yangjunfeng
> Fix For: Myriad 0.2.0
>
>
> How can I fix this problem? Why does the computer fetch this profile?
> Fetching URI 'http://0.0.0.0:8088/conf'
> Fetching directly into the sandbox directory
> Fetching URI 'http://0.0.0.0:8088/conf'
> Downloading resource from 'http://0.0.0.0:8088/conf' to 
> '/var/lib/mesos/slaves/4133522e-9c27-4381-8fcf-6cf40a412f08-S5/frameworks/4133522e-9c27-4381-8fcf-6cf40a412f08-0009/executors/myriad_executor4133522e-9c27-4381-8fcf-6cf40a412f08-00094133522e-9c27-4381-8fcf-6cf40a412f08-O313144133522e-9c27-4381-8fcf-6cf40a412f08-S5/runs/a27db10b-f4ef-416f-86ee-1ca4f35d60cb/conf'
> Failed to fetch 'http://0.0.0.0:8088/conf': Error downloading resource: 
> Couldn't connect to server
> Failed to synchronize with agent (it's probably exited)





[jira] [Commented] (MYRIAD-240) Created lower limit on FGS

2016-08-24 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435179#comment-15435179
 ] 

DarinJ commented on MYRIAD-240:
---

This would be super useful!

> Created lower limit on FGS
> --
>
> Key: MYRIAD-240
> URL: https://issues.apache.org/jira/browse/MYRIAD-240
> Project: Myriad
>  Issue Type: New Feature
>Reporter: John Yost
>
> In analogy to MYRIAD-229, create a lower limit in the FGS





[jira] [Closed] (MYRIAD-241) Create Docker container based upon Cloudera Hadoop distribution

2016-08-24 Thread DarinJ (JIRA)

 [ 
https://issues.apache.org/jira/browse/MYRIAD-241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

DarinJ closed MYRIAD-241.
-
Resolution: Won't Fix

Not in scope of project.

> Create Docker container based upon Cloudera Hadoop distribution
> ---
>
> Key: MYRIAD-241
> URL: https://issues.apache.org/jira/browse/MYRIAD-241
> Project: Myriad
>  Issue Type: Improvement
>Reporter: John Yost
>






[jira] [Commented] (MYRIAD-241) Create Docker container based upon Cloudera Hadoop distribution

2016-08-24 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435174#comment-15435174
 ] 

DarinJ commented on MYRIAD-241:
---

While I think vendor-specific Docker images are a good idea, I'm not sure 
that's within the Apache project's scope.  If you'd like to create them and 
host them on a personal repo, that's fine, but I don't think they can be part 
of the project, especially if there are licensing issues.







[jira] [Commented] (MYRIAD-221) Dynamically reserve Mesos resources/quota

2016-08-12 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418780#comment-15418780
 ] 

DarinJ commented on MYRIAD-221:
---

[~samchen] typically you'd fork the project on GitHub, push your changes to 
your fork, and then submit a pull request.  This gives us the ability to 
review your code, make comments if necessary, and complete the merge.


> Dynamically reserve Mesos resources/quota
> -
>
> Key: MYRIAD-221
> URL: https://issues.apache.org/jira/browse/MYRIAD-221
> Project: Myriad
>  Issue Type: Improvement
>Reporter: Adam B
>Assignee: Sam chen
>
> Mesos allows a framework to dynamically reserve the resources offered to it, 
> so that if a task launched on reserved resources dies, the framework that 
> reserved those resources will immediately get offered back the same reserved 
> resources, without any other framework (under a different role) getting a 
> chance to use them. http://mesos.apache.org/documentation/latest/reservation/ 
> This would be valuable if the user truly wanted to guarantee resources on a 
> node once they have started using it. On a related note, we should also 
> support quota, for minimum resource guarantees that are not tied to 
> particular nodes: http://mesos.apache.org/documentation/latest/quota/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MYRIAD-238) Convert groovy-based unit tests to JUnit tests

2016-08-05 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15409478#comment-15409478
 ] 

DarinJ commented on MYRIAD-238:
---

[~kensipe] [~smarella] should probably weigh in here.

> Convert groovy-based unit tests to JUnit tests
> --
>
> Key: MYRIAD-238
> URL: https://issues.apache.org/jira/browse/MYRIAD-238
> Project: Myriad
>  Issue Type: Test
>Reporter: John Yost
>Assignee: John Yost
>Priority: Minor
>






[jira] [Commented] (MYRIAD-231) Failed to regist zero profile to ResourceManager (2.7.2)

2016-06-30 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15357930#comment-15357930
 ] 

DarinJ commented on MYRIAD-231:
---

[~klaus1982], I merged your PR.  Did that resolve the issue?  If so, I'll 
close this ticket.

> Failed to regist zero profile to ResourceManager (2.7.2)
> 
>
> Key: MYRIAD-231
> URL: https://issues.apache.org/jira/browse/MYRIAD-231
> Project: Myriad
>  Issue Type: Bug
>Reporter: Klaus Ma
>
> Here's the error message of NodeManager:
> {code}
> 16/06/29 16:27:50 ERROR nodemanager.NodeStatusUpdaterImpl: Unexpected error 
> starting NodeStatusUpdater
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Recieved SHUTDOWN 
> signal from Resourcemanager ,Registration of NodeManager failed, Message from 
> ResourceManager: NodeManager from  dcos33.private.dns.zone doesn't satisfy 
> minimum allocations, Sending SHUTDOWN signal to the NodeManager.
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:270)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:196)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>   at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStart(NodeManager.java:271)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:486)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:533)
> 16/06/29 16:27:50 INFO service.AbstractService: Service 
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl failed in 
> state STARTED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Recieved SHUTDOWN 
> signal from Resourcemanager ,Registration of NodeManager failed, Message from 
> ResourceManager: NodeManager from  dcos33.private.dns.zone doesn't satisfy 
> minimum allocations, Sending SHUTDOWN signal to the NodeManager.
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Recieved SHUTDOWN 
> signal from Resourcemanager ,Registration of NodeManager failed, Message from 
> ResourceManager: NodeManager from  dcos33.private.dns.zone doesn't satisfy 
> minimum allocations, Sending SHUTDOWN signal to the NodeManager.
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:202)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>   at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStart(NodeManager.java:271)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:486)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:533)
> Caused by: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Recieved 
> SHUTDOWN signal from Resourcemanager ,Registration of NodeManager failed, 
> Message from ResourceManager: NodeManager from  dcos33.private.dns.zone 
> doesn't satisfy minimum allocations, Sending SHUTDOWN signal to the 
> NodeManager.
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:270)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:196)
>   ... 6 more
> 16/06/29 16:27:50 INFO service.AbstractService: Service NodeManager failed in 
> state STARTED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Recieved SHUTDOWN 
> signal from Resourcemanager ,Registration of NodeManager failed, Message from 
> ResourceManager: NodeManager from  dcos33.private.dns.zone doesn't satisfy 
> minimum allocations, Sending SHUTDOWN signal to the NodeManager.
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Recieved SHUTDOWN 
> signal from Resourcemanager ,Registration of NodeManager failed, Message from 
> ResourceManager: NodeManager from  dcos33.private.dns.zone doesn't satisfy 
> minimum allocations, Sending SHUTDOWN signal to the NodeManager.
>   at 
> 

[jira] [Commented] (MYRIAD-231) Failed to regist zero profile to ResourceManager (2.7.2)

2016-06-30 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356898#comment-15356898
 ] 

DarinJ commented on MYRIAD-231:
---

Klaus, thanks.  I'll look at this later today.


[jira] [Comment Edited] (MYRIAD-231) Failed to regist zero profile to ResourceManager (2.7.2)

2016-06-29 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15355338#comment-15355338
 ] 

DarinJ edited comment on MYRIAD-231 at 6/29/16 2:39 PM:


[~klaus1982] can you try adding this to your yarn-site.xml? They changed a 
default setting in Hadoop 2.7.2.
{code:xml}
  <property>
    <name>yarn.nodemanager.pmem-check-enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>
{code}

There's a discussion on the mailing list about this; I'll update this comment 
with the link shortly.



was (Author: darinj):
[~klaus1982] can you try adding this to your yarn-site.xml? They changed a 
default setting in Hadoop 2.7.2.
[code]
  <property>
    <name>yarn.nodemanager.pmem-check-enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>
[code]

There's a discussion on the mailing list about this; I'll update this comment 
with the link shortly.


> Failed to regist zero profile to ResourceManager (2.7.2)
> 
>
> Key: MYRIAD-231
> URL: https://issues.apache.org/jira/browse/MYRIAD-231
> Project: Myriad
>  Issue Type: Bug
>Reporter: Klaus Ma
>
> Here's the error message of NodeManager:
> {code}
> 16/06/29 16:27:50 ERROR nodemanager.NodeStatusUpdaterImpl: Unexpected error 
> starting NodeStatusUpdater
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Recieved SHUTDOWN 
> signal from Resourcemanager ,Registration of NodeManager failed, Message from 
> ResourceManager: NodeManager from  dcos33.private.dns.zone doesn't satisfy 
> minimum allocations, Sending SHUTDOWN signal to the NodeManager.
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:270)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:196)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>   at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStart(NodeManager.java:271)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:486)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:533)
> 16/06/29 16:27:50 INFO service.AbstractService: Service 
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl failed in 
> state STARTED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Recieved SHUTDOWN 
> signal from Resourcemanager ,Registration of NodeManager failed, Message from 
> ResourceManager: NodeManager from  dcos33.private.dns.zone doesn't satisfy 
> minimum allocations, Sending SHUTDOWN signal to the NodeManager.
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Recieved SHUTDOWN 
> signal from Resourcemanager ,Registration of NodeManager failed, Message from 
> ResourceManager: NodeManager from  dcos33.private.dns.zone doesn't satisfy 
> minimum allocations, Sending SHUTDOWN signal to the NodeManager.
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:202)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>   at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStart(NodeManager.java:271)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:486)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:533)
> Caused by: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Recieved 
> SHUTDOWN signal from Resourcemanager ,Registration of NodeManager failed, 
> Message from ResourceManager: NodeManager from  dcos33.private.dns.zone 
> doesn't satisfy minimum allocations, Sending SHUTDOWN signal to the 
> NodeManager.
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:270)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:196)
>   ... 6 more
> 16/06/29 16:27:50 INFO service.AbstractService: Service NodeManager failed in 
> state STARTED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Recieved 

[jira] [Commented] (MYRIAD-231) Failed to regist zero profile to ResourceManager (2.7.2)

2016-06-29 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15355338#comment-15355338
 ] 

DarinJ commented on MYRIAD-231:
---

[~klaus1982] can you try adding this to your yarn-site.xml? They changed a 
default setting in Hadoop 2.7.2.
[code]
  <property>
    <name>yarn.nodemanager.pmem-check-enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>
[code]

There's a discussion on the mailing list about this; I'll update this comment 
with the link shortly.


> Failed to regist zero profile to ResourceManager (2.7.2)
> 
>
> Key: MYRIAD-231
> URL: https://issues.apache.org/jira/browse/MYRIAD-231
> Project: Myriad
>  Issue Type: Bug
>Reporter: Klaus Ma
>
> Here's the error message of NodeManager:
> {code}
> 16/06/29 16:27:50 ERROR nodemanager.NodeStatusUpdaterImpl: Unexpected error 
> starting NodeStatusUpdater
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Recieved SHUTDOWN 
> signal from Resourcemanager ,Registration of NodeManager failed, Message from 
> ResourceManager: NodeManager from  dcos33.private.dns.zone doesn't satisfy 
> minimum allocations, Sending SHUTDOWN signal to the NodeManager.
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:270)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:196)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>   at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStart(NodeManager.java:271)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:486)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:533)
> 16/06/29 16:27:50 INFO service.AbstractService: Service 
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl failed in 
> state STARTED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Recieved SHUTDOWN 
> signal from Resourcemanager ,Registration of NodeManager failed, Message from 
> ResourceManager: NodeManager from  dcos33.private.dns.zone doesn't satisfy 
> minimum allocations, Sending SHUTDOWN signal to the NodeManager.
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Recieved SHUTDOWN 
> signal from Resourcemanager ,Registration of NodeManager failed, Message from 
> ResourceManager: NodeManager from  dcos33.private.dns.zone doesn't satisfy 
> minimum allocations, Sending SHUTDOWN signal to the NodeManager.
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:202)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>   at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStart(NodeManager.java:271)
>   at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:486)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:533)
> Caused by: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Recieved 
> SHUTDOWN signal from Resourcemanager ,Registration of NodeManager failed, 
> Message from ResourceManager: NodeManager from  dcos33.private.dns.zone 
> doesn't satisfy minimum allocations, Sending SHUTDOWN signal to the 
> NodeManager.
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:270)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:196)
>   ... 6 more
> 16/06/29 16:27:50 INFO service.AbstractService: Service NodeManager failed in 
> state STARTED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Recieved SHUTDOWN 
> signal from Resourcemanager ,Registration of NodeManager failed, Message from 
> ResourceManager: NodeManager from  dcos33.private.dns.zone doesn't satisfy 
> minimum allocations, Sending SHUTDOWN signal to the NodeManager.
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Recieved SHUTDOWN 
> signal from Resourcemanager ,Registration of 

[jira] [Resolved] (MYRIAD-177) Figure out the right LICENSE for bootstrap bundled with Myriad

2016-06-28 Thread DarinJ (JIRA)

 [ 
https://issues.apache.org/jira/browse/MYRIAD-177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

DarinJ resolved MYRIAD-177.
---
Resolution: Cannot Reproduce

RAT didn't catch anything with the 0.2.0 release and it was a non-issue during 
IPMC voting this time.  Closing for now.

> Figure out the right LICENSE for bootstrap bundled with Myriad
> --
>
> Key: MYRIAD-177
> URL: https://issues.apache.org/jira/browse/MYRIAD-177
> Project: Myriad
>  Issue Type: Bug
>Affects Versions: Myriad 0.1.0
>Reporter: Santosh Marella
>Priority: Trivial
>  Labels: easyfix, newbie
> Fix For: Myriad 0.2.0
>
>
> As per the VOTE feedback for 0.1.0, we need to revisit the LICENSE for the 
> bootstrap version bundled with Myriad. 
> http://mail-archives.apache.org/mod_mbox/incubator-general/201512.mbox/%3C1E5E90A7-FAC3-4561-98E2-3CA7D98997C6%40classsoftware.com%3E



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MYRIAD-228) Duplicated NM opts

2016-06-28 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15353672#comment-15353672
 ] 

DarinJ commented on MYRIAD-228:
---

[~klaus1982] I just pushed some additional changes if you want to try it 
out/review.  Will be adding some unit tests but think it's close.

> Duplicated NM opts
> --
>
> Key: MYRIAD-228
> URL: https://issues.apache.org/jira/browse/MYRIAD-228
> Project: Myriad
>  Issue Type: Bug
>Reporter: Klaus Ma
>
> In {{NMExecutorCLGenImpl.java:addYarnNodemanagerOpt}}, it keeps appending NM 
> opts, which makes the arguments too long.
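
One way to avoid the repeated appends described above is to key the options by 
name, so that adding the same option twice overwrites it instead of growing the 
command line.  The sketch below is only an illustration under that assumption: 
NodeManagerOpts, add and render are hypothetical names, not the code in 
NMExecutorCLGenImpl and not necessarily what the eventual fix does.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: collects -D style NodeManager options without duplicating keys. */
final class NodeManagerOpts {
  // LinkedHashMap keeps first-insertion order, so the rendered string is stable.
  private final Map<String, String> opts = new LinkedHashMap<>();

  /** Adds or overwrites one option; repeating a key does not lengthen the output. */
  NodeManagerOpts add(String key, String value) {
    opts.put(key, value);
    return this;
  }

  /** Renders the options as space-separated -Dkey=value pairs. */
  String render() {
    StringBuilder sb = new StringBuilder();
    for (Map.Entry<String, String> e : opts.entrySet()) {
      if (sb.length() > 0) {
        sb.append(' ');
      }
      sb.append("-D").append(e.getKey()).append('=').append(e.getValue());
    }
    return sb.toString();
  }
}
```

Calling add twice with the same key leaves a single entry holding the latest 
value, so the argument string stays the same length however often the generator 
runs.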





[jira] [Created] (MYRIAD-229) Put upper limit on FGS

2016-06-20 Thread DarinJ (JIRA)
DarinJ created MYRIAD-229:
-

 Summary: Put upper limit on FGS
 Key: MYRIAD-229
 URL: https://issues.apache.org/jira/browse/MYRIAD-229
 Project: Myriad
  Issue Type: Bug
  Components: Scheduler
Affects Versions: Myriad 0.2.0
Reporter: DarinJ
Priority: Minor


Currently FGS can use all of the resources on a machine.  This is sometimes 
problematic on machines with disproportionate core-to-memory-to-disk ratios.  
One way to fix the issue is to cap the amount of CPU/MEM that Myriad will 
utilize with FGS.
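
A minimal sketch of what such a cap could look like, assuming a per-resource 
limit where a non-positive value means "no limit".  FgsCap and capped are 
hypothetical names used only for illustration; nothing here exists in Myriad.

```java
/** Hypothetical sketch: limit how much of an offered resource FGS may claim on one node. */
final class FgsCap {
  private FgsCap() {}

  /**
   * Returns the offered amount limited to the configured cap.
   * A cap of zero or less is treated as "uncapped" (an assumption of this sketch).
   */
  static double capped(double offered, double cap) {
    return cap > 0 ? Math.min(offered, cap) : offered;
  }
}
```

The same function would be applied separately to CPU and memory, so a machine 
with many cores but little memory is limited on each dimension independently.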





[jira] [Comment Edited] (MYRIAD-228) Duplicated NM opts

2016-06-18 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15338047#comment-15338047
 ] 

DarinJ edited comment on MYRIAD-228 at 6/18/16 5:22 PM:


I have a wip PR which addresses this.  Should be ready next week.
https://github.com/apache/incubator-myriad/pull/79



was (Author: darinj):
I have a wip PR which addresses this.  Should be ready next week.

> Duplicated NM opts
> --
>
> Key: MYRIAD-228
> URL: https://issues.apache.org/jira/browse/MYRIAD-228
> Project: Myriad
>  Issue Type: Bug
>Reporter: Klaus Ma
>
> In {{NMExecutorCLGenImpl.java:addYarnNodemanagerOpt}}, it keeps appending NM 
> opts, which makes the arguments too long.





[jira] [Commented] (MYRIAD-228) Duplicated NM opts

2016-06-18 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15338047#comment-15338047
 ] 

DarinJ commented on MYRIAD-228:
---

I have a wip PR which addresses this.  Should be ready next week.

> Duplicated NM opts
> --
>
> Key: MYRIAD-228
> URL: https://issues.apache.org/jira/browse/MYRIAD-228
> Project: Myriad
>  Issue Type: Bug
>Reporter: Klaus Ma
>
> In {{NMExecutorCLGenImpl.java:addYarnNodemanagerOpt}}, it keeps appending NM 
> opts, which makes the arguments too long.





[jira] [Commented] (MYRIAD-221) Dynamically reserve Mesos resources/quota

2016-06-14 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15330810#comment-15330810
 ] 

DarinJ commented on MYRIAD-221:
---

Might be worth exploring using an external store as well to preserve node 
manager state (currently lost when the node manager goes down).

> Dynamically reserve Mesos resources/quota
> -
>
> Key: MYRIAD-221
> URL: https://issues.apache.org/jira/browse/MYRIAD-221
> Project: Myriad
>  Issue Type: Improvement
>Reporter: Adam B
>
> Mesos allows a framework to dynamically reserve the resources offered to it, 
> so that if a task launched on reserved resources dies, the framework that 
> reserved those resources will immediately get offered back the same reserved 
> resources, without any other framework (under a different role) getting a 
> chance to use them. http://mesos.apache.org/documentation/latest/reservation/ 
> This would be valuable if the user truly wanted to guarantee resources on a 
> node once they have started using it. On a related note, we should also 
> support quota, for minimum resource guarantees that are not tied to 
> particular nodes: http://mesos.apache.org/documentation/latest/quota/





[jira] [Created] (MYRIAD-219) Cleanup method related to MyriadExecutorConfiguration

2016-06-10 Thread DarinJ (JIRA)
DarinJ created MYRIAD-219:
-

 Summary: Cleanup method related to MyriadExecutorConfiguration
 Key: MYRIAD-219
 URL: https://issues.apache.org/jira/browse/MYRIAD-219
 Project: Myriad
  Issue Type: Bug
  Components: Executor, Scheduler
Reporter: DarinJ
Priority: Minor


Originally the executor: section of the YAML config related to the Java app 
that started the node manager.  Since we moved to a model of starting the node 
manager directly, this has changed and should be tidied up.





[jira] [Created] (MYRIAD-218) Clean up TaskUtils

2016-06-10 Thread DarinJ (JIRA)
DarinJ created MYRIAD-218:
-

 Summary: Clean up TaskUtils
 Key: MYRIAD-218
 URL: https://issues.apache.org/jira/browse/MYRIAD-218
 Project: Myriad
  Issue Type: Bug
  Components: Scheduler
Reporter: DarinJ
Priority: Minor


I noticed several unused methods here and a few methods which are only used in 
unit tests.  Seems like an easy target for clean up.





[jira] [Commented] (MYRIAD-213) Better Resource Matching/Task Resource Creation

2016-06-09 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15322805#comment-15322805
 ] 

DarinJ commented on MYRIAD-213:
---

I've looked into this a bit.  I think I'll break it into the following subtasks:

1. Resolve NMPorts by adding a more general Ports container class.  Refactor 
ExecutorCommandLineGenerator classes to use this class (will resolve MYRIAD-214 
in the process).  Refactor TaskFactory classes as necessary to work with this.  
This adds a few methods to TaskUtils to get port Resources.

2. Refactor ResourceOffersEventHandler.  Here I'll actually create a 
ResourceProfile which will be passed to the TaskFactories.  I'll then refactor 
TaskFactories appropriately.  This should make everything more modular and 
remove a lot of redundant code (which is executed more than needed).


> Better Resource Matching/Task Resource Creation
> ---
>
> Key: MYRIAD-213
> URL: https://issues.apache.org/jira/browse/MYRIAD-213
> Project: Myriad
>  Issue Type: Bug
>  Components: Scheduler
>Affects Versions: Myriad 0.2.0
>Reporter: DarinJ
>Priority: Minor
>
> Currently ResourceOffersEventHandler matches resources in an offer to tasks 
> then TaskFactory adds them to a task.  However, they both have to cycle 
> through the offers.  This is inefficient and could be optimized. 
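
The optimization described above can be sketched as a single pass that totals 
an offer's resources once and hands the result to both the matching step and 
the task-building step.  This is an illustration only: OfferedResource and 
OfferTotals are hypothetical stand-ins, not the Mesos protobuf types or 
Myriad's ResourceProfile, and the resource names "cpus" and "mem" are the only 
ones the sketch handles.

```java
import java.util.List;

/** Hypothetical stand-in for one scalar resource in an offer (not the Mesos type). */
final class OfferedResource {
  final String name;
  final double value;

  OfferedResource(String name, double value) {
    this.name = name;
    this.value = value;
  }
}

/** Totals computed in one pass so matching and task creation can share them. */
final class OfferTotals {
  final double cpus;
  final double memMb;

  private OfferTotals(double cpus, double memMb) {
    this.cpus = cpus;
    this.memMb = memMb;
  }

  /** Walks the resource list exactly once and sums cpus and mem. */
  static OfferTotals of(List<OfferedResource> resources) {
    double cpuSum = 0;
    double memSum = 0;
    for (OfferedResource r : resources) {
      if ("cpus".equals(r.name)) {
        cpuSum += r.value;
      } else if ("mem".equals(r.name)) {
        memSum += r.value;
      }
    }
    return new OfferTotals(cpuSum, memSum);
  }

  /** True when the summed offer covers the requested amounts. */
  boolean fits(double needCpus, double needMemMb) {
    return cpus >= needCpus && memMb >= needMemMb;
  }
}
```

Because the totals object is immutable, the handler can compute it once per 
offer and pass it on without the factory having to walk the offer again.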





[jira] [Resolved] (MYRIAD-198) Remove optionals when sane defaults are available

2016-06-08 Thread DarinJ (JIRA)

 [ 
https://issues.apache.org/jira/browse/MYRIAD-198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

DarinJ resolved MYRIAD-198.
---
   Resolution: Fixed
Fix Version/s: Myriad 0.3.0

MYRIAD-198

> Remove optionals when sane defaults are available
> -
>
> Key: MYRIAD-198
> URL: https://issues.apache.org/jira/browse/MYRIAD-198
> Project: Myriad
>  Issue Type: Bug
>  Components: Executor, Scheduler
>Affects Versions: Myriad 0.2.0
>Reporter: DarinJ
>Assignee: John Yost
>Priority: Minor
>  Labels: easyfix, newbie
> Fix For: Myriad 0.3.0
>
>
> Currently we overuse Optionals in the config and then use an or method in 
> various factories later.  In many cases having the configuration return a 
> default when the parameter was not specified would create cleaner code.  For 
> instance:
> {quote}
> Optional<Boolean> getCgroups() {
>   return Optional.fromNullable(cgroups);
> }
> {quote}
> vs
> {quote}
> Boolean getCgroups() {
>   return cgroups != null ? cgroups : false;
> }
> {quote}





[jira] [Created] (MYRIAD-215) Support serving the JVM

2016-06-03 Thread DarinJ (JIRA)
DarinJ created MYRIAD-215:
-

 Summary: Support serving the JVM 
 Key: MYRIAD-215
 URL: https://issues.apache.org/jira/browse/MYRIAD-215
 Project: Myriad
  Issue Type: New Feature
  Components: Scheduler
Affects Versions: Myriad 0.2.0
Reporter: DarinJ


We currently support serving a tarball of Hadoop and the config.  We should 
consider serving a JVM as well.





[jira] [Comment Edited] (MYRIAD-212) Separate/Backup NodeState from the run time either using MyriadFS or Zookeeper

2016-06-03 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15314625#comment-15314625
 ] 

DarinJ edited comment on MYRIAD-212 at 6/3/16 7:15 PM:
---

[~zjaffee] can you articulate why we need to keep NodeStore resilient?  It 
looks as though if a NodeManager sends a heartbeat it will be re-added.


was (Author: darinj):
[~zjaffee] can you articulate why we need to keep NodeStore Resiliant?  It 
looks as though if a NodeManage sends a Heartbeat it will be readded.

> Separate/Backup NodeState from the run time either using MyriadFS or Zookeeper
> --
>
> Key: MYRIAD-212
> URL: https://issues.apache.org/jira/browse/MYRIAD-212
> Project: Myriad
>  Issue Type: Improvement
>Reporter: Zachary Jaffee
>
> I noticed that NodeStore has no form of recovery.  Let's assess the best way 
> to store the state so it can be restored on restart.





[jira] [Commented] (MYRIAD-212) Separate/Backup NodeState from the run time either using MyriadFS or Zookeeper

2016-06-03 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15314625#comment-15314625
 ] 

DarinJ commented on MYRIAD-212:
---

[~zjaffee] can you articulate why we need to keep NodeStore resilient?  It 
looks as though if a NodeManager sends a heartbeat it will be re-added.

> Separate/Backup NodeState from the run time either using MyriadFS or Zookeeper
> --
>
> Key: MYRIAD-212
> URL: https://issues.apache.org/jira/browse/MYRIAD-212
> Project: Myriad
>  Issue Type: Improvement
>Reporter: Zachary Jaffee
>
> I noticed that NodeStore has no form of recovery.  Let's assess the best way 
> to store the state so it can be restored on restart.





[jira] [Updated] (MYRIAD-134) Support Zookeeper based implementation of RMStateStore for storing Myriad state

2016-06-02 Thread DarinJ (JIRA)

 [ 
https://issues.apache.org/jira/browse/MYRIAD-134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

DarinJ updated MYRIAD-134:
--
Assignee: Zachary Jaffee  (was: Swapnil Daingade)

> Support Zookeeper based implementation of RMStateStore for storing Myriad 
> state
> ---
>
> Key: MYRIAD-134
> URL: https://issues.apache.org/jira/browse/MYRIAD-134
> Project: Myriad
>  Issue Type: Task
>  Components: Scheduler
>Reporter: Swapnil Daingade
>Assignee: Zachary Jaffee
>
> Currently we support a DFS-based implementation of RMStateStore for storing 
> Myriad State (MyriadFileSystemRMStateStore). We need to similarly support a 
> Zookeeper-based one.





[jira] [Commented] (MYRIAD-134) Support Zookeeper based implementation of RMStateStore for storing Myriad state

2016-06-01 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15310686#comment-15310686
 ] 

DarinJ commented on MYRIAD-134:
---

[~zjaffee] Go for it.  I'll add you and [~hokiegeek2] to JIRA so you can assign 
tasks to yourselves.

> Support Zookeeper based implementation of RMStateStore for storing Myriad 
> state
> ---
>
> Key: MYRIAD-134
> URL: https://issues.apache.org/jira/browse/MYRIAD-134
> Project: Myriad
>  Issue Type: Task
>  Components: Scheduler
>Reporter: Swapnil Daingade
>Assignee: Swapnil Daingade
>
> Currently we support a DFS-based implementation of RMStateStore for storing 
> Myriad State (MyriadFileSystemRMStateStore). We need to similarly support a 
> Zookeeper-based one.





[jira] [Commented] (MYRIAD-134) Support Zookeeper based implementation of RMStateStore for storing Myriad state

2016-06-01 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15310593#comment-15310593
 ] 

DarinJ commented on MYRIAD-134:
---

John, I think the ZK-based impl of the RMStateStore makes sense if you want to 
try it out.  I'm concerned about going after Consul though, as you'd 
effectively have to make a complete RMStateStore for YARN.  Probably not worth 
it.

> Support Zookeeper based implementation of RMStateStore for storing Myriad 
> state
> ---
>
> Key: MYRIAD-134
> URL: https://issues.apache.org/jira/browse/MYRIAD-134
> Project: Myriad
>  Issue Type: Task
>  Components: Scheduler
>Reporter: Swapnil Daingade
>Assignee: Swapnil Daingade
>
> Currently we support a DFS-based implementation of RMStateStore for storing 
> Myriad State (MyriadFileSystemRMStateStore). We need to similarly support a 
> Zookeeper-based one.





[jira] [Commented] (MYRIAD-134) Support Zookeeper based implementation of RMStateStore for storing Myriad state

2016-06-01 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15310601#comment-15310601
 ] 

DarinJ commented on MYRIAD-134:
---

Link to MyriadFileSystemRMStateStore for inspiration. 

https://github.com/apache/incubator-myriad/blob/master/myriad-scheduler/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/MyriadFileSystemRMStateStore.java

> Support Zookeeper based implementation of RMStateStore for storing Myriad 
> state
> ---
>
> Key: MYRIAD-134
> URL: https://issues.apache.org/jira/browse/MYRIAD-134
> Project: Myriad
>  Issue Type: Task
>  Components: Scheduler
>Reporter: Swapnil Daingade
>Assignee: Swapnil Daingade
>
> Currently we support a DFS-based implementation of RMStateStore for storing 
> Myriad State (MyriadFileSystemRMStateStore). We need to similarly support a 
> Zookeeper-based one.





[jira] [Created] (MYRIAD-204) CGroup support in Docker

2016-05-18 Thread DarinJ (JIRA)
DarinJ created MYRIAD-204:
-

 Summary: CGroup support in Docker
 Key: MYRIAD-204
 URL: https://issues.apache.org/jira/browse/MYRIAD-204
 Project: Myriad
  Issue Type: Bug
  Components: Executor
Affects Versions: Myriad 0.1.0, Myriad 0.2.0
Reporter: DarinJ
Priority: Minor
 Fix For: Myriad 0.3.0


Docker currently can't support cgroups.  This is because the framework is not 
able to execute sudo, as the Dockerfile sets USER to yarn, so 
frameworkSuperuser can't be set.  This means the sandbox will be owned by yarn, 
which causes container-executor to throw an exception.  While I have a fix, I 
think it's best to think more about this one.





[jira] [Resolved] (MYRIAD-138) ContainerInfo for Executor (Enable docker support on the executor)

2016-05-18 Thread DarinJ (JIRA)

 [ 
https://issues.apache.org/jira/browse/MYRIAD-138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

DarinJ resolved MYRIAD-138.
---
   Resolution: Fixed
Fix Version/s: Myriad 0.2.0

PR #76

> ContainerInfo for Executor (Enable docker support on the executor)
> --
>
> Key: MYRIAD-138
> URL: https://issues.apache.org/jira/browse/MYRIAD-138
> Project: Myriad
>  Issue Type: New Feature
>  Components: Executor, Scheduler
>Affects Versions: Myriad 0.2.0
>Reporter: DarinJ
>Assignee: DarinJ
>Priority: Minor
> Fix For: Myriad 0.2.0
>
>
> ContainerInfo allows one to specify a docker image to execute the executor 
> in.  It's a relatively straightforward option to add, since it's mostly 
> getting additional information out of the YAML configuration and passing it 
> to TaskFactory.  There's a clear advantage in having to assume less about 
> what is installed on the node (i.e. Java).





[jira] [Updated] (MYRIAD-168) Myriad should be able to use ports from the frameworkRole and the defaultRole.

2016-05-18 Thread DarinJ (JIRA)

 [ 
https://issues.apache.org/jira/browse/MYRIAD-168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

DarinJ updated MYRIAD-168:
--
Fix Version/s: (was: Myriad 0.2.0)
   Myriad 0.3.0

> Myriad should be able to use ports from the frameworkRole and the defaultRole.
> --
>
> Key: MYRIAD-168
> URL: https://issues.apache.org/jira/browse/MYRIAD-168
> Project: Myriad
>  Issue Type: Improvement
>  Components: Scheduler
>Affects Versions: Myriad 0.1.0
>Reporter: DarinJ
>Assignee: DarinJ
>Priority: Minor
> Fix For: Myriad 0.3.0
>
>
> Currently, Myriad ignores ports from roles other than the default role.  
> However, one could reserve ports on hosts in the resources file for the 
> framework role Myriad runs on.  This would prevent port binding collisions 
> for services needing fixed ports, such as the Job History Server. 
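
As a rough illustration of the idea above, port ranges could be accepted when 
they belong either to the default role (written "*" in Mesos) or to the role 
the framework registered with.  The types below are hypothetical stand-ins for 
this sketch only; they are not the Mesos protobuf classes or Myriad's code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an offered port range tagged with its Mesos role. */
final class PortRange {
  final String role;
  final long begin;
  final long end;

  PortRange(String role, long begin, long end) {
    this.role = role;
    this.begin = begin;
    this.end = end;
  }
}

/** Hypothetical filter: keep ranges from the default role and the framework role. */
final class RolePorts {
  static final String DEFAULT_ROLE = "*";

  private RolePorts() {}

  /** Returns the ranges the framework may use, preserving offer order. */
  static List<PortRange> usable(List<PortRange> offered, String frameworkRole) {
    List<PortRange> out = new ArrayList<>();
    for (PortRange p : offered) {
      if (DEFAULT_ROLE.equals(p.role) || p.role.equals(frameworkRole)) {
        out.add(p);
      }
    }
    return out;
  }
}
```

A fixed port reserved for the framework role would then show up in the usable 
list alongside the unreserved ranges, while ranges reserved for other roles are 
still skipped.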





[jira] [Commented] (MYRIAD-168) Myriad should be able to use ports from the frameworkRole and the defaultRole.

2016-05-18 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15288976#comment-15288976
 ] 

DarinJ commented on MYRIAD-168:
---

Pulled out for additional testing.

> Myriad should be able to use ports from the frameworkRole and the defaultRole.
> --
>
> Key: MYRIAD-168
> URL: https://issues.apache.org/jira/browse/MYRIAD-168
> Project: Myriad
>  Issue Type: Improvement
>  Components: Scheduler
>Affects Versions: Myriad 0.1.0
>Reporter: DarinJ
>Assignee: DarinJ
>Priority: Minor
> Fix For: Myriad 0.3.0
>
>
> Currently, Myriad ignores ports from roles other than the default role.  
> However, one could reserve ports on hosts in the resources file for the 
> framework role Myriad runs on.  This would prevent port binding collisions 
> for services needing fixed ports, such as the Job History Server. 





[jira] [Closed] (MYRIAD-189) Make Config URI Configurable

2016-05-18 Thread DarinJ (JIRA)

 [ 
https://issues.apache.org/jira/browse/MYRIAD-189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

DarinJ closed MYRIAD-189.
-
Resolution: Fixed
  Assignee: DarinJ

> Make Config URI Configurable
> 
>
> Key: MYRIAD-189
> URL: https://issues.apache.org/jira/browse/MYRIAD-189
> Project: Myriad
>  Issue Type: Bug
>  Components: Executor, Scheduler
>Affects Versions: Myriad 0.1.0, Myriad 0.2.0
>Reporter: DarinJ
>Assignee: DarinJ
>Priority: Minor
>  Labels: easyfix, features
> Fix For: Myriad 0.2.0
>
>
> Currently, when using remote distribution the config is pulled from the 
> resource manager, and when not using remote distribution the configs are 
> pulled from the local machine.  While these are reasonable defaults, they are 
> not ideal for many users.  Having a configurable configuration URI seems like 
> a much better option.
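
The behavior described above could be sketched roughly as follows, assuming the configured URI simply takes precedence over the two existing defaults. ConfigUriResolver, its parameters, and the example locations are hypothetical and only illustrate the selection logic; they are not Myriad's actual configuration API.

```java
// Hypothetical helper; names and parameters are illustrative only.
final class ConfigUriResolver {

    // Picks the location the executor configuration is fetched from.
    //   configuredUri      - value set by the user, or null/empty if unset
    //   remoteDistribution - whether remote distribution is in use
    //   rmConfUrl          - default used with remote distribution
    //   localConfPath      - default used without remote distribution
    static String resolve(String configuredUri,
                          boolean remoteDistribution,
                          String rmConfUrl,
                          String localConfPath) {
        if (configuredUri != null && !configuredUri.isEmpty()) {
            // An explicit setting always wins over the built-in defaults.
            return configuredUri;
        }
        return remoteDistribution ? rmConfUrl : localConfPath;
    }
}
```

Keeping the two existing defaults as fallbacks means deployments that never set the new option would behave as before.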





[jira] [Updated] (MYRIAD-180) Build should not pollute sources

2016-05-18 Thread DarinJ (JIRA)

 [ 
https://issues.apache.org/jira/browse/MYRIAD-180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

DarinJ updated MYRIAD-180:
--
Fix Version/s: (was: Myriad 0.2.0)
   Myriad 0.3.0

> Build should not pollute sources
> 
>
> Key: MYRIAD-180
> URL: https://issues.apache.org/jira/browse/MYRIAD-180
> Project: Myriad
>  Issue Type: Bug
>Affects Versions: Myriad 0.1.0
>Reporter: Santosh Marella
> Fix For: Myriad 0.3.0
>
>
> http://mail-archives.apache.org/mod_mbox/incubator-general/201512.mbox/%3CCAPmHmDTM4T37wJj6sPZniiU0mwm7qC4P9VCEHaOxhwWAvr%2Be_g%40mail.gmail.com%3E
> The Myriad scheduler build generates the node_modules directory 
> (myriad-scheduler/src/main/resources/webapp/node_modules). Ideally, any files 
> downloaded/generated by build should be outside of "src".





[jira] [Resolved] (MYRIAD-192) Better Support Cgroups

2016-05-18 Thread DarinJ (JIRA)

 [ 
https://issues.apache.org/jira/browse/MYRIAD-192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

DarinJ resolved MYRIAD-192.
---
Resolution: Fixed

Still need to correct for Docker, will add a separate issue.

> Better Support Cgroups
> --
>
> Key: MYRIAD-192
> URL: https://issues.apache.org/jira/browse/MYRIAD-192
> Project: Myriad
>  Issue Type: Bug
>  Components: Scheduler
>Affects Versions: Myriad 0.1.0
>Reporter: DarinJ
> Fix For: Myriad 0.2.0, Myriad 0.1.1
>
>
> Currently, many of the options for cgroups are hard-coded into Myriad.  These 
> should be configurable.  In addition, we should no longer chown the sandbox 
> directory to yarn in `DownloadNMExecutorCLGenImpl.java`.





[jira] [Updated] (MYRIAD-193) Umbrella JIRA for security in Myriad

2016-05-18 Thread DarinJ (JIRA)

 [ 
https://issues.apache.org/jira/browse/MYRIAD-193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

DarinJ updated MYRIAD-193:
--
Fix Version/s: (was: Myriad 0.2.0)
   Myriad 0.3.0

> Umbrella JIRA for security in Myriad
> 
>
> Key: MYRIAD-193
> URL: https://issues.apache.org/jira/browse/MYRIAD-193
> Project: Myriad
>  Issue Type: Bug
>  Components: Executor, Scheduler
>Affects Versions: Myriad 0.1.0, Myriad 0.2.0, Myriad 0.1.1
>Reporter: Swapnil Daingade
>Assignee: Swapnil Daingade
> Fix For: Myriad 0.3.0
>
>
> Creating an umbrella JIRA for security in Myriad. We can add sub tasks to 
> this one as we notice gaps.





[jira] [Resolved] (MYRIAD-197) Upgrade Myriad to the 0.28.1 version of Mesos

2016-05-09 Thread DarinJ (JIRA)

 [ 
https://issues.apache.org/jira/browse/MYRIAD-197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

DarinJ resolved MYRIAD-197.
---
Resolution: Fixed

PR#66

> Upgrade Myriad to the 0.28.1 version of Mesos
> -
>
> Key: MYRIAD-197
> URL: https://issues.apache.org/jira/browse/MYRIAD-197
> Project: Myriad
>  Issue Type: Improvement
>  Components: Executor, Scheduler
>Reporter: Mohit Soni
>Assignee: Mohit Soni
>  Labels: features
>
> Myriad currently depends on version {{0.24.1}} of Mesos. I would like to bump 
> up the version to the latest stable Mesos {{0.28.1}}.





[jira] [Commented] (MYRIAD-195) Node Managers randomly die on Mesos

2016-05-09 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276908#comment-15276908
 ] 

DarinJ commented on MYRIAD-195:
---

I was unable to recreate this on stock Myriad.  Do you have any updates?

> Node Managers randomly die on Mesos
> ---
>
> Key: MYRIAD-195
> URL: https://issues.apache.org/jira/browse/MYRIAD-195
> Project: Myriad
>  Issue Type: Bug
>  Components: Executor
>Affects Versions: Myriad 0.1.0
> Environment: Ubuntu 14.04; kernel 3.13.0-66-generic; MapR 5.0
>Reporter: Miguel Bernadin
>
> Hello, I have noticed that the Node Managers randomly die on Mesos. 
> Two logs are attached below. The first is the Mesos log, and the 
> second is the node manager createNMVolume log. I am looking to see if anyone 
> else is experiencing this. 
> Mesos logs: 
> 16/03/25 13:51:10 INFO nodemanager.NodeManager: STARTUP_MSG: 
> /
> STARTUP_MSG: Starting NodeManager
> STARTUP_MSG: host = nodemanager/10.1.194.71
> STARTUP_MSG: args = []
> STARTUP_MSG: version = 2.7.0-mapr-1506
> STARTUP_MSG: classpath = 
> 

[jira] [Resolved] (MYRIAD-196) Allow remote distribution and configuration of JVM for executors

2016-05-09 Thread DarinJ (JIRA)

 [ 
https://issues.apache.org/jira/browse/MYRIAD-196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

DarinJ resolved MYRIAD-196.
---
Resolution: Fixed

PR#65

> Allow remote distribution and configuration of JVM for executors
> 
>
> Key: MYRIAD-196
> URL: https://issues.apache.org/jira/browse/MYRIAD-196
> Project: Myriad
>  Issue Type: Improvement
>  Components: Executor
>Reporter: Mohit Soni
>Assignee: Mohit Soni
>  Labels: features
> Fix For: Myriad 0.2.0
>
>
> Currently, Myriad Executors rely on a system-wide JVM installation and 
> configuration. Extend the remote distribution support for binaries to the JVM 
> and make the {{JAVA_HOME}} and {{JAVA_LIBRARY_PATH}} environment variables 
> configurable.





[jira] [Updated] (MYRIAD-198) Remove optionals when sane defaults are available

2016-05-06 Thread DarinJ (JIRA)

 [ 
https://issues.apache.org/jira/browse/MYRIAD-198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

DarinJ updated MYRIAD-198:
--
Assignee: (was: DarinJ)

> Remove optionals when sane defaults are available
> -
>
> Key: MYRIAD-198
> URL: https://issues.apache.org/jira/browse/MYRIAD-198
> Project: Myriad
>  Issue Type: Bug
>  Components: Executor, Scheduler
>Affects Versions: Myriad 0.2.0
>Reporter: DarinJ
>Priority: Minor
>  Labels: easyfix, newbie
>
> Currently we overuse Optionals in the config and then use an or method in 
> various factories later.  In many cases, having the configuration return a 
> default when the parameter was not specified would create cleaner code.  For 
> instance:
> {quote}
> Optional<Boolean> getCgroups() {
>   return Optional.fromNullable(cgroups);
> }
> {quote}
> vs
> {quote}
> Boolean getCgroups() {
>   return cgroups != null ? cgroups : false;
> }
> {quote}





[jira] [Updated] (MYRIAD-198) Remove optionals when sane defaults are available

2016-05-06 Thread DarinJ (JIRA)

 [ 
https://issues.apache.org/jira/browse/MYRIAD-198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

DarinJ updated MYRIAD-198:
--
Assignee: DarinJ

> Remove optionals when sane defaults are available
> -
>
> Key: MYRIAD-198
> URL: https://issues.apache.org/jira/browse/MYRIAD-198
> Project: Myriad
>  Issue Type: Bug
>  Components: Executor, Scheduler
>Affects Versions: Myriad 0.2.0
>Reporter: DarinJ
>Assignee: DarinJ
>Priority: Minor
>  Labels: easyfix, newbie
>
> Currently we overuse Optionals in the config and then use an or method in 
> various factories later.  In many cases, having the configuration return a 
> default when the parameter was not specified would create cleaner code.  For 
> instance:
> {quote}
> Optional<Boolean> getCgroups() {
>   return Optional.fromNullable(cgroups);
> }
> {quote}
> vs
> {quote}
> Boolean getCgroups() {
>   return cgroups != null ? cgroups : false;
> }
> {quote}





[jira] [Commented] (MYRIAD-192) Better Support Cgroups

2016-05-05 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273612#comment-15273612
 ] 

DarinJ commented on MYRIAD-192:
---

https://github.com/apache/incubator-myriad/pull/69

> Better Support Cgroups
> --
>
> Key: MYRIAD-192
> URL: https://issues.apache.org/jira/browse/MYRIAD-192
> Project: Myriad
>  Issue Type: Bug
>  Components: Scheduler
>Affects Versions: Myriad 0.1.0
>Reporter: DarinJ
> Fix For: Myriad 0.2.0, Myriad 0.1.1
>
>
> Currently, many of the options for cgroups are hard-coded into Myriad.  These 
> should be configurable.  In addition, we should no longer chown the sandbox 
> directory to yarn in `DownloadNMExecutorCLGenImpl.java`.





[jira] [Created] (MYRIAD-200) Increased Unit Test and Integration Testing

2016-05-05 Thread DarinJ (JIRA)
DarinJ created MYRIAD-200:
-

 Summary: Increased Unit Test and Integration Testing
 Key: MYRIAD-200
 URL: https://issues.apache.org/jira/browse/MYRIAD-200
 Project: Myriad
  Issue Type: Bug
  Components: Executor, Scheduler
Reporter: DarinJ


Currently, unit test coverage is weak in places.  A good integration test 
framework would also be helpful (potentially [minimesos|http://minimesos.org]?). 





[jira] [Created] (MYRIAD-198) Remove optionals when sane defaults are available

2016-05-05 Thread DarinJ (JIRA)
DarinJ created MYRIAD-198:
-

 Summary: Remove optionals when sane defaults are available
 Key: MYRIAD-198
 URL: https://issues.apache.org/jira/browse/MYRIAD-198
 Project: Myriad
  Issue Type: Bug
  Components: Executor, Scheduler
Affects Versions: Myriad 0.2.0
Reporter: DarinJ
Priority: Minor


Currently we overuse Optionals in the config and then use an or method in 
various factories later.  In many cases, having the configuration return a 
default when the parameter was not specified would create cleaner code.  For 
instance:
{quote}
Optional<Boolean> getCgroups() {
  return Optional.fromNullable(cgroups);
}
{quote}
vs
{quote}
Boolean getCgroups() {
  return cgroups != null ? cgroups : false;
}
{quote}
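
The two styles can be compared side by side in a small, self-contained sketch. It is only an illustration: CgroupsConfig and its method names are hypothetical, and java.util.Optional is used here in place of the Optional with fromNullable/or quoted above (which looks like Guava's) so that the example needs no extra dependency.

```java
import java.util.Optional;

// Hypothetical configuration holder; field and method names are illustrative.
final class CgroupsConfig {
    // Boxed so that "not set in the config file" can be represented as null.
    private final Boolean cgroups;

    CgroupsConfig(Boolean cgroups) {
        this.cgroups = cgroups;
    }

    // Style 1: return an Optional and leave the fallback to every caller.
    Optional<Boolean> getCgroupsOptional() {
        return Optional.ofNullable(cgroups);
    }

    // Style 2: apply the default once, so callers receive a plain boolean.
    boolean getCgroups() {
        return cgroups != null ? cgroups : false;
    }
}
```

With the first style each factory has to repeat something like getCgroupsOptional().orElse(false); with the second the default lives in exactly one place.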





[jira] [Updated] (MYRIAD-177) Figure out the right LICENSE for bootstrap bundled with Myriad

2016-05-02 Thread DarinJ (JIRA)

 [ 
https://issues.apache.org/jira/browse/MYRIAD-177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

DarinJ updated MYRIAD-177:
--
Priority: Trivial  (was: Major)

> Figure out the right LICENSE for bootstrap bundled with Myriad
> --
>
> Key: MYRIAD-177
> URL: https://issues.apache.org/jira/browse/MYRIAD-177
> Project: Myriad
>  Issue Type: Bug
>Affects Versions: Myriad 0.1.0
>Reporter: Santosh Marella
>Priority: Trivial
>  Labels: easyfix, newbie
> Fix For: Myriad 0.2.0
>
>
> As per the VOTE feedback for 0.1.0, we need to revisit the LICENSE for 
> bootstrap version bundled with Myriad. 
> http://mail-archives.apache.org/mod_mbox/incubator-general/201512.mbox/%3C1E5E90A7-FAC3-4561-98E2-3CA7D98997C6%40classsoftware.com%3E





[jira] [Commented] (MYRIAD-36) Package executor/NM into a Docker image

2016-04-18 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-36?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15245545#comment-15245545
 ] 

DarinJ commented on MYRIAD-36:
--

Recent progress...
https://github.com/apache/incubator-myriad/pull/64

> Package executor/NM into a Docker image
> ---
>
> Key: MYRIAD-36
> URL: https://issues.apache.org/jira/browse/MYRIAD-36
> Project: Myriad
>  Issue Type: New Feature
>Reporter: Adam B
>Assignee: DarinJ
> Fix For: Myriad 0.2.0
>
>
> Probably has to be done in privileged mode, to support NM launching its own 
> Dockers.





[jira] [Updated] (MYRIAD-192) Better Support Cgroups

2016-04-02 Thread DarinJ (JIRA)

 [ 
https://issues.apache.org/jira/browse/MYRIAD-192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

DarinJ updated MYRIAD-192:
--
Fix Version/s: Myriad 0.1.1
   Myriad 0.2.0

> Better Support Cgroups
> --
>
> Key: MYRIAD-192
> URL: https://issues.apache.org/jira/browse/MYRIAD-192
> Project: Myriad
>  Issue Type: Bug
>  Components: Scheduler
>Affects Versions: Myriad 0.1.0
>Reporter: DarinJ
> Fix For: Myriad 0.2.0, Myriad 0.1.1
>
>
> Currently, many of the options for cgroups are hard-coded into Myriad.  These 
> should be configurable.  In addition, we should no longer chown the sandbox 
> directory to yarn in `DownloadNMExecutorCLGenImpl.java`.





[jira] [Updated] (MYRIAD-194) More REST Interfaces

2016-04-02 Thread DarinJ (JIRA)

 [ 
https://issues.apache.org/jira/browse/MYRIAD-194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

DarinJ updated MYRIAD-194:
--
Labels: easyfix features newbie  (was: )

> More REST Interfaces
> 
>
> Key: MYRIAD-194
> URL: https://issues.apache.org/jira/browse/MYRIAD-194
> Project: Myriad
>  Issue Type: New Feature
>  Components: Scheduler
>Reporter: DarinJ
>Priority: Minor
>  Labels: easyfix, features, newbie
>
> As an operator I'd like to be able to kill a particular task as opposed to a 
> specific number of tasks.  This could be combined with other interfaces to 
> create rolling restarts.
> Consider this a request for additional requests for interfaces as well.





[jira] [Created] (MYRIAD-194) More REST Interfaces

2016-04-02 Thread DarinJ (JIRA)
DarinJ created MYRIAD-194:
-

 Summary: More REST Interfaces
 Key: MYRIAD-194
 URL: https://issues.apache.org/jira/browse/MYRIAD-194
 Project: Myriad
  Issue Type: New Feature
  Components: Scheduler
Reporter: DarinJ
Priority: Minor


As an operator I'd like to be able to kill a particular task as opposed to a 
specific number of tasks.  This could be combined with other interfaces to 
create rolling restarts.

Consider this a request for additional requests for interfaces as well.





[jira] [Updated] (MYRIAD-88) Reduce verbose logging

2016-04-02 Thread DarinJ (JIRA)

 [ 
https://issues.apache.org/jira/browse/MYRIAD-88?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

DarinJ updated MYRIAD-88:
-
Issue Type: Improvement  (was: Bug)

> Reduce verbose logging
> --
>
> Key: MYRIAD-88
> URL: https://issues.apache.org/jira/browse/MYRIAD-88
> Project: Myriad
>  Issue Type: Improvement
>Reporter: Santosh Marella
>
> One instance of "INFO" logging that should perhaps be at "DEBUG" level is
> 15/04/06 08:15:52 INFO handlers.ResourceOffersEventHandler: Received offers 2
> 15/04/06 08:15:52 INFO handlers.ResourceOffersEventHandler: No pending
> tasks, declining all offers
> 15/04/06 08:15:55 INFO handlers.ResourceOffersEventHandler: Received offers 2
> 15/04/06 08:15:55 INFO handlers.ResourceOffersEventHandler: No pending
> tasks, declining all offers
> 15/04/06 08:15:56 INFO handlers.ResourceOffersEventHandler: Received offers 1
> 15/04/06 08:15:56 INFO handlers.ResourceOffersEventHandler: No pending
> tasks, declining all offers
> There might be other instances where Myriad is logging too much.





[jira] [Updated] (MYRIAD-103) Optimization: Reuse Mesos offer across YARN containers

2016-04-02 Thread DarinJ (JIRA)

 [ 
https://issues.apache.org/jira/browse/MYRIAD-103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

DarinJ updated MYRIAD-103:
--
   Assignee: (was: Kannan Rajah)
   Priority: Minor
Component/s: Scheduler
 Executor
 Issue Type: New Feature  (was: Bug)

> Optimization: Reuse Mesos offer across YARN containers
> --
>
> Key: MYRIAD-103
> URL: https://issues.apache.org/jira/browse/MYRIAD-103
> Project: Myriad
>  Issue Type: New Feature
>  Components: Executor, Scheduler
>Reporter: Kannan Rajah
>Priority: Minor
>
> In the fine-grained scheduling implementation, when the YARN scheduler 
> allocates a container, Myriad launches a pseudo Mesos task for it by 
> specifying its resource allocation. When the container completes, a task 
> finished update is sent to Mesos so that the pseudo task is stopped and its 
> resources reclaimed. When there are pending tasks in the YARN scheduler’s 
> pipeline, it would be beneficial to hold on to these resources and see if 
> they can be used again. This avoids the overhead of giving away resources and 
> waiting for Mesos to offer them back.
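
One way to picture the proposed optimization is as a small decision made when a container finishes. The sketch below is an assumption about how such a policy might look, not the Myriad implementation; Res, OfferReusePolicy, and holdForReuse are hypothetical names, and resources are reduced to CPU and memory for brevity.

```java
import java.util.List;

// Hypothetical, simplified resource vector; not a Mesos or YARN type.
final class Res {
    final double cpus;
    final double memMb;

    Res(double cpus, double memMb) {
        this.cpus = cpus;
        this.memMb = memMb;
    }

    // True if this request can be satisfied entirely from 'available'.
    boolean fitsIn(Res available) {
        return cpus <= available.cpus && memMb <= available.memMb;
    }
}

// Hypothetical policy object; illustrates the decision only.
final class OfferReusePolicy {

    // Decides what to do with the resources of a finished container:
    //   true  -> keep them, because a pending request could use them
    //   false -> send the task-finished update and give them back to Mesos
    static boolean holdForReuse(Res freed, List<Res> pendingRequests) {
        for (Res request : pendingRequests) {
            if (request.fitsIn(freed)) {
                return true;
            }
        }
        return false;
    }
}
```

A real implementation would probably also need a timeout, so that held resources are released if the pending request is not scheduled onto them.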





[jira] [Resolved] (MYRIAD-190) Tests fail on more recent versions of YARN

2016-04-02 Thread DarinJ (JIRA)

 [ 
https://issues.apache.org/jira/browse/MYRIAD-190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

DarinJ resolved MYRIAD-190.
---
   Resolution: Fixed
Fix Version/s: Myriad 0.1.1
   Myriad 0.2.0

> Tests fail on more recent versions of YARN
> --
>
> Key: MYRIAD-190
> URL: https://issues.apache.org/jira/browse/MYRIAD-190
> Project: Myriad
>  Issue Type: Bug
>  Components: Scheduler
>Affects Versions: Myriad 0.1.0, Myriad 0.2.0
>Reporter: DarinJ
>Assignee: DarinJ
>Priority: Minor
> Fix For: Myriad 0.2.0, Myriad 0.1.1
>
>
> For Hadoop versions 2.6.2+ the tests fail due to a missing method in the mock 
> rmContext.  While it doesn't affect the runtime or the build, some users may 
> wish to build against their specific Hadoop version to ensure no broken 
> dependencies.  Adding the mocked method corrects the behavior.





[jira] [Resolved] (MYRIAD-165) Cleanup old branches in git

2016-03-31 Thread DarinJ (JIRA)

 [ 
https://issues.apache.org/jira/browse/MYRIAD-165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

DarinJ resolved MYRIAD-165.
---
   Resolution: Fixed
Fix Version/s: (was: Myriad 0.2.0)

Thanks Ken

> Cleanup old branches in git
> ---
>
> Key: MYRIAD-165
> URL: https://issues.apache.org/jira/browse/MYRIAD-165
> Project: Myriad
>  Issue Type: Improvement
>Reporter: Jim Klucar
>Assignee: Ken Sipe
>Priority: Trivial
>
> The git repo contains several old branches that are out of date and/or have 
> been merged into master. We should review the published branches and prune 
> what's not needed.





[jira] [Resolved] (MYRIAD-141) Include myriad framework name into task name

2016-03-31 Thread DarinJ (JIRA)

 [ 
https://issues.apache.org/jira/browse/MYRIAD-141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

DarinJ resolved MYRIAD-141.
---
   Resolution: Fixed
Fix Version/s: Myriad 0.1.1
   Myriad 0.2.0

Thanks Brandon!

> Include myriad framework name into task name
> 
>
> Key: MYRIAD-141
> URL: https://issues.apache.org/jira/browse/MYRIAD-141
> Project: Myriad
>  Issue Type: Improvement
>  Components: Scheduler
>Affects Versions: Myriad 0.1.0
>Reporter: Yuliya Feldman
>Assignee: Brandon Gulla
> Fix For: Myriad 0.2.0, Myriad 0.1.1
>
>
> When people plan to run multiple Myriad clusters under the same Mesos, it is 
> very hard to figure out which task belongs to which framework when getting 
> the list of active tasks.





[jira] [Created] (MYRIAD-192) Better Support Cgroups

2016-03-23 Thread DarinJ (JIRA)
DarinJ created MYRIAD-192:
-

 Summary: Better Support Cgroups
 Key: MYRIAD-192
 URL: https://issues.apache.org/jira/browse/MYRIAD-192
 Project: Myriad
  Issue Type: Bug
  Components: Scheduler
Affects Versions: Myriad 0.1.0
Reporter: DarinJ


Currently, many of the options for cgroups are hard-coded into Myriad.  These 
should be configurable.  In addition, we should no longer chown the sandbox 
directory to yarn in `DownloadNMExecutorCLGenImpl.java`.





[jira] [Commented] (MYRIAD-182) Ability to ignore certificate warnings on config download from SSL secured RM

2016-03-14 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193335#comment-15193335
 ] 

DarinJ commented on MYRIAD-182:
---

I meant [MYRIAD-189], sorry.

> Ability to ignore certificate warnings on config download from SSL secured RM
> -
>
> Key: MYRIAD-182
> URL: https://issues.apache.org/jira/browse/MYRIAD-182
> Project: Myriad
>  Issue Type: Bug
>  Components: Executor
>Affects Versions: Myriad 0.1.0
>Reporter: John Omernik
>
> When SSL is enabled for the Resource Manager and the executor tries to 
> download the config from /conf, a warning is thrown and the download fails if 
> the CA is not valid. There are many cases where the SSL certificate may not 
> be valid (especially in test, but maybe in production), so we need the 
> ability to specify that certificate warnings should be ignored in that case.  
> The warning received is: 
> Failed to fetch 'https://myriadprod.marathonprod.mesos:8090/conf': Error 
> downloading resource: Peer certificate cannot be authenticated with given CA 
> certificates





[jira] [Comment Edited] (MYRIAD-182) Ability to ignore certificate warnings on config download from SSL secured RM

2016-03-14 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15192806#comment-15192806
 ] 

DarinJ edited comment on MYRIAD-182 at 3/14/16 2:09 PM:


I believe recently added [MYRIAD-189] might solve this.


was (Author: darinj):
I believe recently added [MYRIAD-190] might solve this.

> Ability to ignore certificate warnings on config download from SSL secured RM
> -
>
> Key: MYRIAD-182
> URL: https://issues.apache.org/jira/browse/MYRIAD-182
> Project: Myriad
>  Issue Type: Bug
>  Components: Executor
>Affects Versions: Myriad 0.1.0
>Reporter: John Omernik
>
> When SSL is enabled for the Resource Manager and the executor tries to 
> download the config from /conf, a warning is thrown and the download fails if 
> the CA is not valid. There are many cases where the SSL certificate may not 
> be valid (especially in test, but maybe in production), so we need the 
> ability to specify that certificate warnings should be ignored in that case.  
> The warning received is: 
> Failed to fetch 'https://myriadprod.marathonprod.mesos:8090/conf': Error 
> downloading resource: Peer certificate cannot be authenticated with given CA 
> certificates





[jira] [Created] (MYRIAD-190) Tests fail on more recent versions of YARN

2016-03-13 Thread DarinJ (JIRA)
DarinJ created MYRIAD-190:
-

 Summary: Tests fail on more recent versions of YARN
 Key: MYRIAD-190
 URL: https://issues.apache.org/jira/browse/MYRIAD-190
 Project: Myriad
  Issue Type: Bug
  Components: Scheduler
Affects Versions: Myriad 0.1.0, Myriad 0.2.0
Reporter: DarinJ
Assignee: DarinJ
Priority: Minor


For Hadoop versions 2.6.2+ the tests fail due to a missing method in the mock 
rmContext.  While it doesn't affect the runtime or the build, some users may 
wish to build against their specific Hadoop version to ensure no broken 
dependencies.  Adding the mocked method corrects the behavior.





[jira] [Resolved] (MYRIAD-188) Zero sized node managers can cause the Resource Manager to crash with an NPE

2016-03-13 Thread DarinJ (JIRA)

 [ 
https://issues.apache.org/jira/browse/MYRIAD-188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

DarinJ resolved MYRIAD-188.
---
Resolution: Fixed

> Zero sized node managers can cause the Resource Manager to crash with an NPE
> 
>
> Key: MYRIAD-188
> URL: https://issues.apache.org/jira/browse/MYRIAD-188
> Project: Myriad
>  Issue Type: Bug
>  Components: Scheduler
>Affects Versions: Myriad 0.1.0
>Reporter: DarinJ
>Assignee: DarinJ
> Fix For: Myriad 0.2.0
>
>






[jira] [Assigned] (MYRIAD-156) NullPointerException from "Error in handling event type NODE_RESOURCE_UPDATE to the scheduler"

2016-03-13 Thread DarinJ (JIRA)

 [ 
https://issues.apache.org/jira/browse/MYRIAD-156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

DarinJ reassigned MYRIAD-156:
-

Assignee: DarinJ  (was: Swapnil Daingade)

> NullPointerException from "Error in handling event type NODE_RESOURCE_UPDATE 
> to the scheduler"
> --
>
> Key: MYRIAD-156
> URL: https://issues.apache.org/jira/browse/MYRIAD-156
> Project: Myriad
>  Issue Type: Bug
>Reporter: Sarjeet Singh
>Assignee: DarinJ
>
> The NPE happens when a node in the cluster becomes unhealthy and the 
> scheduler removes it from its internal data structure. When the node then 
> heartbeats, the scheduler searches for it and tries to operate on the 
> result, which throws a NullPointerException. Here is the code snippet that 
> causes the NPE: 
> SchedulerNode node = getSchedulerNode(nm.getNodeID());
> The node object is null, causing the NullPointerException.
> Here is the RM log for the exception:
> 15/10/06 09:18:09 INFO handlers.ResourceOffersEventHandler: Offer not
> sufficient for task with, cpu: 4.4, memory: 5504.0, spindles: 4.0, ports: 996
> 15/10/06 09:18:11 FATAL resourcemanager.ResourceManager: Error in handling
> event type NODE_RESOURCE_UPDATE to the scheduler
> java.lang.NullPointerException
> at
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.updateNodeResource(AbstractYarnScheduler.java:548)
> at
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.updateNodeResource(FairScheduler.java:1712)
> at
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1293)
> at
> com.ebay.myriad.scheduler.yarn.MyriadFairScheduler.handle(MyriadFairScheduler.java:64)
> at
> com.ebay.myriad.scheduler.yarn.MyriadFairScheduler.handle(MyriadFairScheduler.java:17)
> at
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:693)
> at java.lang.Thread.run(Thread.java:745)
> 15/10/06 09:18:11 INFO resourcemanager.ResourceManager: Exiting, bbye..





[jira] [Commented] (MYRIAD-156) NullPointerException from "Error in handling event type NODE_RESOURCE_UPDATE to the scheduler"

2016-03-13 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15192515#comment-15192515
 ] 

DarinJ commented on MYRIAD-156:
---

As [~sarjeet] pointed out, this looks to be the same bug as [MYRIAD-188].  I've 
submitted a PR which fixes the issue.

> NullPointerException from "Error in handling event type NODE_RESOURCE_UPDATE 
> to the scheduler"
> --
>
> Key: MYRIAD-156
> URL: https://issues.apache.org/jira/browse/MYRIAD-156
> Project: Myriad
>  Issue Type: Bug
>Reporter: Sarjeet Singh
>Assignee: Swapnil Daingade
>
> The NPE happens when a node in the cluster becomes unhealthy and the 
> scheduler removes it from its internal data structure. When the node then 
> heartbeats, the scheduler searches for it and tries to operate on the 
> result, which throws a NullPointerException. Here is the code snippet that 
> causes the NPE: 
> SchedulerNode node = getSchedulerNode(nm.getNodeID());
> The node object is null, causing the NullPointerException.
> Here is the RM log for the exception:
> 15/10/06 09:18:09 INFO handlers.ResourceOffersEventHandler: Offer not
> sufficient for task with, cpu: 4.4, memory: 5504.0, spindles: 4.0, ports: 996
> 15/10/06 09:18:11 FATAL resourcemanager.ResourceManager: Error in handling
> event type NODE_RESOURCE_UPDATE to the scheduler
> java.lang.NullPointerException
> at
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.updateNodeResource(AbstractYarnScheduler.java:548)
> at
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.updateNodeResource(FairScheduler.java:1712)
> at
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1293)
> at
> com.ebay.myriad.scheduler.yarn.MyriadFairScheduler.handle(MyriadFairScheduler.java:64)
> at
> com.ebay.myriad.scheduler.yarn.MyriadFairScheduler.handle(MyriadFairScheduler.java:17)
> at
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:693)
> at java.lang.Thread.run(Thread.java:745)
> 15/10/06 09:18:11 INFO resourcemanager.ResourceManager: Exiting, bbye..





[jira] [Resolved] (MYRIAD-153) Placeholder tasks yarn_container_* is not cleaned after yarn job is complete.

2016-03-09 Thread DarinJ (JIRA)

 [ 
https://issues.apache.org/jira/browse/MYRIAD-153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

DarinJ resolved MYRIAD-153.
---
Resolution: Fixed

#53

> Placeholder tasks yarn_container_* is not cleaned after yarn job is complete.
> -
>
> Key: MYRIAD-153
> URL: https://issues.apache.org/jira/browse/MYRIAD-153
> Project: Myriad
>  Issue Type: Bug
>Reporter: Sarjeet Singh
>Assignee: DarinJ
> Fix For: Myriad 0.2.0
>
> Attachments: Mesos_UI_screeshot_placeholder_tasks_running.png
>
>
> Observed that the placeholder tasks for containers launched on FGS are still 
> in RUNNING state on Mesos. These container tasks are not cleaned up properly 
> after the job has finished.
> See the attached screenshot of the Mesos UI with placeholder tasks still running.





[jira] [Commented] (MYRIAD-188) Zero sized node managers can cause the Resource Manager to crash with an NPE

2016-03-08 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15185632#comment-15185632
 ] 

DarinJ commented on MYRIAD-188:
---

https://github.com/apache/incubator-myriad/pull/62

> Zero sized node managers can cause the Resource Manager to crash with an NPE
> 
>
> Key: MYRIAD-188
> URL: https://issues.apache.org/jira/browse/MYRIAD-188
> Project: Myriad
>  Issue Type: Bug
>  Components: Scheduler
>Affects Versions: Myriad 0.1.0
>Reporter: DarinJ
>






[jira] [Comment Edited] (MYRIAD-188) Zero sized node managers can cause the Resource Manager to crash with an NPE

2016-03-03 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15177133#comment-15177133
 ] 

DarinJ edited comment on MYRIAD-188 at 3/3/16 2:56 PM:
---

Changed setNodeCapacity in YarnNodeCapacity as below, which fixed the problem.  
Once I fix the unit tests I'll submit a patch.
  public void setNodeCapacity(RMNode rmNode, Resource newCapacity) {
    rmNode.getTotalCapability().setMemory(newCapacity.getMemory());
    rmNode.getTotalCapability().setVirtualCores(newCapacity.getVirtualCores());
    LOGGER.debug("Setting capacity for node {} to {}", rmNode.getHostName(), newCapacity);
    // updates the scheduler with the new capacity for the NM.
    // the event is handled by the scheduler asynchronously
    synchronized (yarnScheduler) {
      if (yarnScheduler.getSchedulerNode(rmNode.getNodeID()) != null) {
        rmContext.getDispatcher().getEventHandler().handle(new NodeResourceUpdateSchedulerEvent(rmNode,
            ResourceOption.newInstance(rmNode.getTotalCapability(),
                RMNode.OVER_COMMIT_TIMEOUT_MILLIS_DEFAULT)));
      } else {
        LOGGER.info("Node {} doesn't exist in Scheduler", rmNode.getNode());
      }
    }
  }



was (Author: darinj):
Changed setNodeCapacity in YarnNodeCapacity as below, fixed the problem once I 
fixed the Unit Tests I'll submit a patch.
  public void setNodeCapacity(RMNode rmNode, Resource newCapacity) {
    rmNode.getTotalCapability().setMemory(newCapacity.getMemory());
    rmNode.getTotalCapability().setVirtualCores(newCapacity.getVirtualCores());
    LOGGER.debug("Setting capacity for node {} to {}", rmNode.getHostName(), newCapacity);
    // updates the scheduler with the new capacity for the NM.
    // the event is handled by the scheduler asynchronously
    synchronized (yarnScheduler) {
      if (yarnScheduler.getSchedulerNode(rmNode.getNodeID()) != null) {
        yarnScheduler.updateNodeResource(rmNode,
            ResourceOption.newInstance(newCapacity,
                RMNode.OVER_COMMIT_TIMEOUT_MILLIS_DEFAULT));
        rmContext.getDispatcher().getEventHandler().handle(new NodeResourceUpdateSchedulerEvent(rmNode,
            ResourceOption.newInstance(rmNode.getTotalCapability(),
                RMNode.OVER_COMMIT_TIMEOUT_MILLIS_DEFAULT)));
      } else {
        LOGGER.info("Node {} doesn't exist in Scheduler", rmNode.getNode());
      }
    }
  }


> Zero sized node managers can cause the Resource Manager to crash with an NPE
> 
>
> Key: MYRIAD-188
> URL: https://issues.apache.org/jira/browse/MYRIAD-188
> Project: Myriad
>  Issue Type: Bug
>  Components: Scheduler
>Affects Versions: Myriad 0.1.0
>Reporter: DarinJ
>






[jira] [Commented] (MYRIAD-188) Zero sized node managers can cause the Resource Manager to crash with an NPE

2016-03-03 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15177908#comment-15177908
 ] 

DarinJ commented on MYRIAD-188:
---

[~sarjeet] good catch, I think this is definitely related. I believe 
[~sdaingade] is assigned to that ticket.  Is he actively working on it?  My patch 
fixes the issue, but I have some thoughts on how to improve it.  It would be 
helpful to discuss. The main one is the use of 

yarnScheduler.updateNodeResource(rmNode, 
ResourceOption.newInstance(newCapacity, 
RMNode.OVER_COMMIT_TIMEOUT_MILLIS_DEFAULT)); 

vs

yarnScheduler.updateNodeResource(rmNode, 
ResourceOption.newInstance(newCapacity, 
RMNode.OVER_COMMIT_TIMEOUT_MILLIS_DEFAULT)); 

> Zero sized node managers can cause the Resource Manager to crash with an NPE
> 
>
> Key: MYRIAD-188
> URL: https://issues.apache.org/jira/browse/MYRIAD-188
> Project: Myriad
>  Issue Type: Bug
>  Components: Scheduler
>Affects Versions: Myriad 0.1.0
>Reporter: DarinJ
>






[jira] [Commented] (MYRIAD-182) Ability to ignore certificate warnings on config download from SSL secured RM

2016-01-13 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15096652#comment-15096652
 ] 

DarinJ commented on MYRIAD-182:
---

The config is pulled using the Mesos Fetcher.  We will need to determine if 
it's possible to not use certs with Mesos and if so how.

> Ability to ignore certificate warnings on config download from SSL secured RM
> -
>
> Key: MYRIAD-182
> URL: https://issues.apache.org/jira/browse/MYRIAD-182
> Project: Myriad
>  Issue Type: Bug
>  Components: Executor
>Affects Versions: Myriad 0.1.0
>Reporter: John Omernik
>
> When SSL is enabled for the Resource Manager and the executor tries to download 
> the config from /conf, if the CA is not valid a warning is thrown and the 
> download fails. There are many cases where an SSL certificate may not be valid 
> (especially in test, but maybe in production), thus we need the ability to 
> specify that certificate warnings should be ignored in that case.  The 
> warning received is: 
> Failed to fetch 'https://myriadprod.marathonprod.mesos:8090/conf': Error 
> downloading resource: Peer certificate cannot be authenticated with given CA 
> certificates





[jira] [Updated] (MYRIAD-184) RM Ports are Hardcoded in NMExecutorCLGenImpl.java

2015-12-29 Thread DarinJ (JIRA)

 [ 
https://issues.apache.org/jira/browse/MYRIAD-184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

DarinJ updated MYRIAD-184:
--
Labels: easyfix newbie  (was: )

> RM Ports are Hardcoded in NMExecutorCLGenImpl.java
> --
>
> Key: MYRIAD-184
> URL: https://issues.apache.org/jira/browse/MYRIAD-184
> Project: Myriad
>  Issue Type: Bug
>  Components: Executor
>Affects Versions: Myriad 0.1.0
>Reporter: John Omernik
>  Labels: easyfix, newbie
>
> In NMExecutorCLGenImpl.java, the ports for Resource Manager are derived via 
> the http.policy config setting. Instead, the ports should be using a 
> different variable that actually corresponds to running port. The ports that 
> are hard coded are the default ports for the RM for HTTP and HTTPS (8088, and 
> 8090) but if a user changed the port, the config download would fail.   Thus 
> finding a better variable here would help make it so operators are not 
> limited to the default ports in their environments. 
> (Hard Coding in function public String getConfigurationUrl())
> https://github.com/apache/incubator-myriad/blob/df7d05c8639b371b94a1e94406e2f2446d10eaaf/myriad-scheduler/src/main/java/org/apache/myriad/scheduler/NMExecutorCLGenImpl.java
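
One way to read the request above is to take the Resource Manager web address from configuration and fall back to the default ports only when nothing is set. The sketch below is an assumption about how that could look, not the code in NMExecutorCLGenImpl.java: ConfigUrlSketch and configurationUrl are hypothetical names, and a plain Map stands in for the Hadoop Configuration object so the example runs without Hadoop on the classpath. The three property keys are the standard YARN ones.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: a plain Map stands in for the Hadoop Configuration object.
class ConfigUrlSketch {
  static final String HTTP_POLICY = "yarn.http.policy";
  static final String HTTP_ADDRESS = "yarn.resourcemanager.webapp.address";
  static final String HTTPS_ADDRESS = "yarn.resourcemanager.webapp.https.address";

  // Builds the /conf URL from the configured address, falling back to the
  // stock default ports (8088 for HTTP, 8090 for HTTPS) when none is set.
  static String configurationUrl(Map<String, String> conf, String rmHost) {
    boolean https = "HTTPS_ONLY".equals(conf.get(HTTP_POLICY));
    String key = https ? HTTPS_ADDRESS : HTTP_ADDRESS;
    String fallback = rmHost + ":" + (https ? 8090 : 8088);
    String address = conf.getOrDefault(key, fallback);
    return (https ? "https://" : "http://") + address + "/conf";
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put(HTTP_POLICY, "HTTPS_ONLY");
    conf.put(HTTPS_ADDRESS, "rm.example.com:9443");
    System.out.println(configurationUrl(conf, "rm.example.com"));
  }
}
```

The http.policy setting still selects the scheme here; only the host and port come from the address properties, so an operator who moves the RM to a non-default port would not break the config download.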





[jira] [Assigned] (MYRIAD-153) Placeholder tasks yarn_container_* is not cleaned after yarn job is complete.

2015-12-07 Thread DarinJ (JIRA)

 [ 
https://issues.apache.org/jira/browse/MYRIAD-153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

DarinJ reassigned MYRIAD-153:
-

Assignee: DarinJ  (was: Santosh Marella)

> Placeholder tasks yarn_container_* is not cleaned after yarn job is complete.
> -
>
> Key: MYRIAD-153
> URL: https://issues.apache.org/jira/browse/MYRIAD-153
> Project: Myriad
>  Issue Type: Bug
>Reporter: Sarjeet Singh
>Assignee: DarinJ
> Attachments: Mesos_UI_screeshot_placeholder_tasks_running.png
>
>
> Observed the placeholder tasks for containers launched on FGS are still in 
> RUNNING state on mesos. These container tasks are not cleaned up properly 
> after job is finished completely.
> see screenshot attached for mesos UI with placeholder tasks still running.





[jira] [Resolved] (MYRIAD-172) More resources assigned to executor

2015-11-29 Thread DarinJ (JIRA)

 [ 
https://issues.apache.org/jira/browse/MYRIAD-172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

DarinJ resolved MYRIAD-172.
---
   Resolution: Fixed
Fix Version/s: Myriad 0.1.0

PR #41

> More resources assigned to executor
> ---
>
> Key: MYRIAD-172
> URL: https://issues.apache.org/jira/browse/MYRIAD-172
> Project: Myriad
>  Issue Type: Bug
>  Components: Executor, Scheduler
>Reporter: Aashreya Ravi Shankar
>Assignee: DarinJ
> Fix For: Myriad 0.1.0
>
>
> Twice the number of resources for CPU and memory is being assigned for the 
> executor.
> Once the Node Manager task is launched I see the following resources in mesos:
> Zero Profile NM : Mem : 1.4 GB , CPU : 0.4
> Medium Profile NM: Mem : 5.4 GB CPU :  4.4 
> Earlier it was 
> Zero : 1.2 GB and CPU : 0.2   ( Mem : 1 GB + 10% , CPU static value of 0.2)





[jira] [Commented] (MYRIAD-168) Myriad should be able to use ports from the frameworkRole and the defaultRole.

2015-11-18 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15011614#comment-15011614
 ] 

DarinJ commented on MYRIAD-168:
---

started work here: https://github.com/DarinJ/incubator-myriad/tree/Docker

Doing some Docker work in tandem as that will affect my work.

> Myriad should be able to use ports from the frameworkRole and the defaultRole.
> --
>
> Key: MYRIAD-168
> URL: https://issues.apache.org/jira/browse/MYRIAD-168
> Project: Myriad
>  Issue Type: Improvement
>  Components: Scheduler
>Affects Versions: Myriad 0.1.0
>Reporter: DarinJ
>Assignee: DarinJ
>Priority: Minor
> Fix For: Myriad 0.2.0
>
>
> Currently, Myriad ignores ports from roles other than the default role.  
> However, one could reserve ports on hosts in the resources file for the 
> framework role Myriad runs on.  This would prevent port binding collisions 
> for services needing a fixed port such as the Job History Server. 





[jira] [Updated] (MYRIAD-138) ContainerInfo for Executor (Enable docker support on the executor)

2015-11-05 Thread DarinJ (JIRA)

 [ 
https://issues.apache.org/jira/browse/MYRIAD-138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

DarinJ updated MYRIAD-138:
--
Affects Version/s: (was: Myriad 0.1.0)

> ContainerInfo for Executor (Enable docker support on the executor)
> --
>
> Key: MYRIAD-138
> URL: https://issues.apache.org/jira/browse/MYRIAD-138
> Project: Myriad
>  Issue Type: New Feature
>  Components: Executor, Scheduler
>Affects Versions: Myriad 0.2.0
>Reporter: DarinJ
>Assignee: DarinJ
>Priority: Minor
>
> ContainerInfo allows one to specify a docker image to execute the executor 
> in. It's a relatively straightforward implementation to add as an option, 
> since it's mostly getting additional information out of the YAML 
> configuration and passing it to TaskFactory.  There's a clear advantage in 
> having to assume less about what is installed on the node (i.e. Java).





[jira] [Created] (MYRIAD-168) Myriad should be able to use ports from the frameworkRole and the defaultRole.

2015-11-05 Thread DarinJ (JIRA)
DarinJ created MYRIAD-168:
-

 Summary: Myriad should be able to use ports from the frameworkRole 
and the defaultRole.
 Key: MYRIAD-168
 URL: https://issues.apache.org/jira/browse/MYRIAD-168
 Project: Myriad
  Issue Type: Improvement
  Components: Scheduler
Affects Versions: Myriad 0.1.0
Reporter: DarinJ
Assignee: DarinJ
Priority: Minor
 Fix For: Myriad 0.2.0


Currently, Myriad ignores ports from roles other than the default role.  
However, one could reserve ports on hosts in the resources file for the 
framework role Myriad runs on.  This would prevent port binding collisions for 
services needing a fixed port such as the Job History Server. 





[jira] [Updated] (MYRIAD-36) Package executor/NM into a Docker image

2015-11-03 Thread DarinJ (JIRA)

 [ 
https://issues.apache.org/jira/browse/MYRIAD-36?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

DarinJ updated MYRIAD-36:
-
Fix Version/s: (was: Myriad 0.1.0)
   Myriad 0.2.0

> Package executor/NM into a Docker image
> ---
>
> Key: MYRIAD-36
> URL: https://issues.apache.org/jira/browse/MYRIAD-36
> Project: Myriad
>  Issue Type: New Feature
>Reporter: Adam B
>Assignee: DarinJ
> Fix For: Myriad 0.2.0
>
>
> Probably has to be done in privileged mode, to support NM launching its own 
> Dockers.





[jira] [Commented] (MYRIAD-36) Package executor/NM into a Docker image

2015-11-03 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-36?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14988001#comment-14988001
 ] 

DarinJ commented on MYRIAD-36:
--

Absolutely, I was planning that anyway.

> Package executor/NM into a Docker image
> ---
>
> Key: MYRIAD-36
> URL: https://issues.apache.org/jira/browse/MYRIAD-36
> Project: Myriad
>  Issue Type: New Feature
>Reporter: Adam B
>Assignee: DarinJ
> Fix For: Myriad 0.1.0
>
>
> Probably has to be done in privileged mode, to support NM launching its own 
> Dockers.





[jira] [Commented] (MYRIAD-153) Placeholder tasks yarn_container_* is not cleaned after yarn job is complete.

2015-11-02 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986651#comment-14986651
 ] 

DarinJ commented on MYRIAD-153:
---

I'm having a similar issue. Have you looked at the stderr in your sandbox and/or 
your Hadoop logs?  I've noticed this line in each task that gets stuck in 
running:

{quote}
15/11/03 03:44:25 WARN containermanager.ContainerManagerImpl: Event EventType: 
KILL_CONTAINER sent to absent container container_1446520127877_0004_01_000509
{quote}
Where container_X matches yarn_container_X.  I haven't had a 
chance to investigate further though.

> Placeholder tasks yarn_container_* is not cleaned after yarn job is complete.
> -
>
> Key: MYRIAD-153
> URL: https://issues.apache.org/jira/browse/MYRIAD-153
> Project: Myriad
>  Issue Type: Bug
>Reporter: Sarjeet Singh
> Attachments: Mesos_UI_screeshot_placeholder_tasks_running.png
>
>
> Observed the placeholder tasks for containers launched on FGS are still in 
> RUNNING state on mesos. These container tasks are not cleaned up properly 
> after job is finished completely.
> see screenshot attached for mesos UI with placeholder tasks still running.





[jira] [Commented] (MYRIAD-162) Myriad Not Correctly Dealing with Resources from Multiple Roles

2015-10-29 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14980728#comment-14980728
 ] 

DarinJ commented on MYRIAD-162:
---

I've got a pretty good handle on how to fix this; it should be ready for review 
by Monday/Tuesday.  

> Myriad Not Correctly Dealing with Resources from Multiple Roles
> ---
>
> Key: MYRIAD-162
> URL: https://issues.apache.org/jira/browse/MYRIAD-162
> Project: Myriad
>  Issue Type: Bug
>  Components: Scheduler
>Affects Versions: Myriad 0.1.0
> Environment: Any where frameworkRole is not *
>Reporter: DarinJ
>Assignee: DarinJ
> Fix For: Myriad 0.1.0
>
>
> When using Offers that have Resources from multiple roles, one needs to use 
> the setRole(String role) method to specify which role the resource belongs 
> to.  Myriad currently doesn't do this which causes TASK_LOST, with an error 
> in the mesos-master log stating in "attempted to use cpus(*): 1.2; mem(*): 
> 1305.6; ports(*): [31005-31005,31006-31006,...] greater than offered 
> cpu(*):1, mem(*): 1400, ports(*): [ ... ], cpu(roleA): 3, mem(roleA): 1, 
> ports(roleA): [...].





[jira] [Updated] (MYRIAD-162) Myriad Not Correctly Dealing with Resources from Multiple Roles

2015-10-29 Thread DarinJ (JIRA)

 [ 
https://issues.apache.org/jira/browse/MYRIAD-162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

DarinJ updated MYRIAD-162:
--
Description: When using Offers that have Resources from multiple roles, one 
needs to use the setRole(String role) method to specify which role the resource 
belongs to.  Myriad currently doesn't do this which causes TASK_LOST, with an 
error in the mesos-master log stating in "attempted to use cpus( * ): 1.2; mem( 
* ): 1305.6; ports( * ): [31005-31005,31006-31006,...] greater than offered 
cpu( * ):1, mem( * ): 1400, ports( * ): [ ... ], cpu(roleA): 3, mem(roleA): 
1, ports(roleA): [...].  (was: When using Offers that have Resources from 
multiple roles, one needs to use the setRole(String role) method to specify 
which role the resource belongs to.  Myriad currently doesn't do this which 
causes TASK_LOST, with an error in the mesos-master log stating in "attempted 
to use cpus(*): 1.2; mem(*): 1305.6; ports(*): [31005-31005,31006-31006,...] 
greater than offered cpu(*):1, mem(*): 1400, ports(*): [ ... ], cpu(roleA): 3, 
mem(roleA): 1, ports(roleA): [...].)

> Myriad Not Correctly Dealing with Resources from Multiple Roles
> ---
>
> Key: MYRIAD-162
> URL: https://issues.apache.org/jira/browse/MYRIAD-162
> Project: Myriad
>  Issue Type: Bug
>  Components: Scheduler
>Affects Versions: Myriad 0.1.0
> Environment: Any where frameworkRole is not *
>Reporter: DarinJ
>Assignee: DarinJ
> Fix For: Myriad 0.1.0
>
>
> When using Offers that have Resources from multiple roles, one needs to use 
> the setRole(String role) method to specify which role the resource belongs 
> to.  Myriad currently doesn't do this which causes TASK_LOST, with an error 
> in the mesos-master log stating in "attempted to use cpus( * ): 1.2; mem( * 
> ): 1305.6; ports( * ): [31005-31005,31006-31006,...] greater than offered 
> cpu( * ):1, mem( * ): 1400, ports( * ): [ ... ], cpu(roleA): 3, mem(roleA): 
> 1, ports(roleA): [...].





[jira] [Commented] (MYRIAD-162) Myriad Not Correctly Dealing with Resources from Multiple Roles

2015-10-29 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14980815#comment-14980815
 ] 

DarinJ commented on MYRIAD-162:
---

Close but not quite.  We want Myriad to be able to use Resources from both the 
frameworkUserRole AND the defaultRole (usually *).  So we need to be aware of 
both, otherwise you're essentially statically partitioning your data center 
instead of reserving resources. The idea isn't hard and shouldn't require a 
major refactor.  I'm prototyping now, will know more soon.

If you wish to recreate it, simply add a role with some cpu resources to your 
cluster and set frameworkRole to that role.  You may need to ensure your 
tasks require enough cpus/mem that they have to use the new role.
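
To make the "both roles" idea concrete, here is a minimal stand-alone model of splitting one scalar request (cpus or mem) across the framework role and the default role. It does not use the Mesos API and is not the eventual Myriad fix: RoleSplitSketch, split, and the choice to draw from the framework role first are assumptions made for illustration. In a real launch each resulting entry would become its own Resource with the matching role set via setRole(String role).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-alone model of splitting one scalar request across two roles.
// It does not use the Mesos API; role names and amounts are illustrative.
class RoleSplitSketch {
  // Takes from the framework role first, then from the default role "*".
  // Returns role -> amount, or an empty map if the offer cannot cover the request.
  static Map<String, Double> split(double needed, double fromFrameworkRole,
                                   double fromDefaultRole, String frameworkRole) {
    Map<String, Double> plan = new LinkedHashMap<>();
    if (needed > fromFrameworkRole + fromDefaultRole) {
      return plan; // not enough in this offer
    }
    double reserved = Math.min(needed, fromFrameworkRole);
    if (reserved > 0) {
      plan.put(frameworkRole, reserved);
    }
    double remainder = needed - reserved;
    if (remainder > 0) {
      plan.put("*", remainder);
    }
    return plan;
  }

  public static void main(String[] args) {
    // 4 cpus needed; the offer has 3 under roleA and 1 under "*".
    System.out.println(split(4.0, 3.0, 1.0, "roleA"));
  }
}
```

Drawing from the reserved role first leaves more of the unreserved pool for other frameworks; the opposite order would also be valid and is a policy choice.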

> Myriad Not Correctly Dealing with Resources from Multiple Roles
> ---
>
> Key: MYRIAD-162
> URL: https://issues.apache.org/jira/browse/MYRIAD-162
> Project: Myriad
>  Issue Type: Bug
>  Components: Scheduler
>Affects Versions: Myriad 0.1.0
> Environment: Any where frameworkRole is not *
>Reporter: DarinJ
>Assignee: DarinJ
> Fix For: Myriad 0.1.0
>
>
> When using Offers that have Resources from multiple roles, one needs to use 
> the setRole(String role) method to specify which role the resource belongs 
> to.  Myriad currently doesn't do this which causes TASK_LOST, with an error 
> in the mesos-master log stating in "attempted to use cpus( * ): 1.2; mem( * 
> ): 1305.6; ports( * ): [31005-31005,31006-31006,...] greater than offered 
> cpu( * ):1, mem( * ): 1400, ports( * ): [ ... ], cpu(roleA): 3, mem(roleA): 
> 1, ports(roleA): [...].





[jira] [Created] (MYRIAD-162) Myriad Not Correctly Dealing with Resources from Multiple Roles

2015-10-29 Thread DarinJ (JIRA)
DarinJ created MYRIAD-162:
-

 Summary: Myriad Not Correctly Dealing with Resources from Multiple 
Roles
 Key: MYRIAD-162
 URL: https://issues.apache.org/jira/browse/MYRIAD-162
 Project: Myriad
  Issue Type: Bug
  Components: Scheduler
Affects Versions: Myriad 0.1.0
 Environment: Any where frameworkRole is not *
Reporter: DarinJ
 Fix For: Myriad 0.1.0


When using Offers that have Resources from multiple roles, one needs to use the 
setRole(String role) method to specify which role the resource belongs to.  
Myriad currently doesn't do this, which causes TASK_LOST, with an error in the 
mesos-master log stating "attempted to use cpus(*): 1.2; mem(*): 1305.6; 
ports(*): [31005-31005,31006-31006,...] greater than offered cpu(*):1, mem(*): 
1400, ports(*): [ ... ], cpu(roleA): 3, mem(roleA): 1, ports(roleA): [...]".





[jira] [Comment Edited] (MYRIAD-162) Myriad Not Correctly Dealing with Resources from Multiple Roles

2015-10-29 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14981698#comment-14981698
 ] 

DarinJ edited comment on MYRIAD-162 at 10/30/15 5:01 AM:
-

Without a fix Myriad will work if frameworkRole is not present or set to *, but 
fail badly if a frameworkRole is present.  For a 0.1.0 release, the easiest 
nice things to do are 1. add a check if frameworkRole is present and log a 
warning and exit or 2. log and then set role to * anyway.  Beyond those options, 
the alternative is simply to fix it.  I'm close to a fix (mem/cpu done, working 
on ports now) and would like to have it for 0.1.0, but it can wait for 0.1.1.

Current copy of changes here: https://github.com/darinj/incubator-myriad

Ports are still to do.  This could be delayed on the assumption that ports 
aren't generally reserved for random ports; then we just force Myriad to get 
ports from the default role (done in TaskFactoryImpl, would need to do for 
ServiceTaskFactoryImpl).


was (Author: darinj):
Without a fix Myriad will work if frameworkRole is not present or set to *, but 
fail badly if a frameworkRole is present.  For a 0.1.0 release, the easiest 
nice things to do are 1. add a check if frameworkRole is present and log a 
warning and exit or 2. log and then set role to * anyway.  After those options 
it's simply to fix it.  I'm close to a fix (mem/cpu done working ports now) and 
would like to have it for 0.1.0, but it can wait for 0.1.1.

> Myriad Not Correctly Dealing with Resources from Multiple Roles
> ---
>
> Key: MYRIAD-162
> URL: https://issues.apache.org/jira/browse/MYRIAD-162
> Project: Myriad
>  Issue Type: Bug
>  Components: Scheduler
>Affects Versions: Myriad 0.1.0
> Environment: Any where frameworkRole is not *
>Reporter: DarinJ
>Assignee: DarinJ
> Fix For: Myriad 0.1.0
>
>
> When using Offers that have Resources from multiple roles, one needs to use 
> the setRole(String role) method to specify which role the resource belongs 
> to.  Myriad currently doesn't do this which causes TASK_LOST, with an error 
> in the mesos-master log stating in "attempted to use cpus( * ): 1.2; mem( * 
> ): 1305.6; ports( * ): [31005-31005,31006-31006,...] greater than offered 
> cpu( * ):1, mem( * ): 1400, ports( * ): [ ... ], cpu(roleA): 3, mem(roleA): 
> 1, ports(roleA): [...].





[jira] [Commented] (MYRIAD-162) Myriad Not Correctly Dealing with Resources from Multiple Roles

2015-10-29 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14981698#comment-14981698
 ] 

DarinJ commented on MYRIAD-162:
---

Without a fix Myriad will work if frameworkRole is not present or set to *, but 
fail badly if a frameworkRole is present.  For a 0.1.0 release, the easiest 
nice things to do are 1. add a check if frameworkRole is present and log a 
warning and exit or 2. log and then set role to * anyway.  Beyond those options, 
the alternative is simply to fix it.  I'm close to a fix (mem/cpu done, working 
on ports now) and would like to have it for 0.1.0, but it can wait for 0.1.1.
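
The two stop-gap options can be sketched as a single check. This is only an illustration under assumptions: FrameworkRoleCheck, effectiveRole and the strict flag are hypothetical names, not Myriad code, and "exit" is modelled as an exception so the behaviour can be exercised without terminating the JVM.

```java
// Sketch of the two stop-gap options; names are hypothetical, not Myriad code.
class FrameworkRoleCheck {
  static final String DEFAULT_ROLE = "*";

  // strict = option 1 (refuse to start); otherwise option 2 (fall back to "*").
  static String effectiveRole(String frameworkRole, boolean strict) {
    if (frameworkRole == null || frameworkRole.isEmpty()
        || DEFAULT_ROLE.equals(frameworkRole)) {
      return DEFAULT_ROLE; // nothing to warn about
    }
    String warning = "frameworkRole '" + frameworkRole + "' is not supported yet";
    if (strict) {
      // Option 1: surface the problem and stop.
      throw new IllegalStateException(warning);
    }
    // Option 2: log and carry on with the default role.
    System.err.println("WARN: " + warning + ", using '*' instead");
    return DEFAULT_ROLE;
  }

  public static void main(String[] args) {
    System.out.println(effectiveRole(null, true));
    System.out.println(effectiveRole("roleA", false));
  }
}
```

Option 1 fails fast and makes the limitation obvious to the operator; option 2 keeps the framework running but silently ignores the reservation, which may surprise anyone who set frameworkRole on purpose.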

> Myriad Not Correctly Dealing with Resources from Multiple Roles
> ---
>
> Key: MYRIAD-162
> URL: https://issues.apache.org/jira/browse/MYRIAD-162
> Project: Myriad
>  Issue Type: Bug
>  Components: Scheduler
>Affects Versions: Myriad 0.1.0
> Environment: Any where frameworkRole is not *
>Reporter: DarinJ
>Assignee: DarinJ
> Fix For: Myriad 0.1.0
>
>
> When using Offers that have Resources from multiple roles, one needs to use 
> the setRole(String role) method to specify which role the resource belongs 
> to.  Myriad currently doesn't do this which causes TASK_LOST, with an error 
> in the mesos-master log stating in "attempted to use cpus( * ): 1.2; mem( * 
> ): 1305.6; ports( * ): [31005-31005,31006-31006,...] greater than offered 
> cpu( * ):1, mem( * ): 1400, ports( * ): [ ... ], cpu(roleA): 3, mem(roleA): 
> 1, ports(roleA): [...].





[jira] [Updated] (MYRIAD-45) Make all Executor/NM ports configurable

2015-10-21 Thread DarinJ (JIRA)

 [ 
https://issues.apache.org/jira/browse/MYRIAD-45?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

DarinJ updated MYRIAD-45:
-
Fix Version/s: Myriad 0.1.0

> Make all Executor/NM ports configurable
> ---
>
> Key: MYRIAD-45
> URL: https://issues.apache.org/jira/browse/MYRIAD-45
> Project: Myriad
>  Issue Type: Improvement
>Reporter: Adam B
>Assignee: DarinJ
> Fix For: Myriad 0.1.0
>
>
> Requirement for supporting multi-tenancy of multiple NMs on same host





[jira] [Commented] (MYRIAD-36) Package executor/NM into a Docker image

2015-10-21 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-36?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14968425#comment-14968425
 ] 

DarinJ commented on MYRIAD-36:
--

I actually added MYRIAD-138, which is related, and started work on this.  It's 
reasonably far along, but missing some functionality.  I could potentially get 
a version done by 0.1.0 but I'd consider this a new feature.

> Package executor/NM into a Docker image
> ---
>
> Key: MYRIAD-36
> URL: https://issues.apache.org/jira/browse/MYRIAD-36
> Project: Myriad
>  Issue Type: Bug
>Reporter: Adam B
> Fix For: Myriad 0.1.0
>
>
> Probably has to be done in privileged mode, to support NM launching its own 
> Dockers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MYRIAD-98) Move from 4 spaces to 2 spaces for indentation

2015-10-08 Thread DarinJ (JIRA)

 [ 
https://issues.apache.org/jira/browse/MYRIAD-98?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

DarinJ updated MYRIAD-98:
-
Fix Version/s: Myriad 0.1.0

> Move from 4 spaces to 2 spaces for indentation
> --
>
> Key: MYRIAD-98
> URL: https://issues.apache.org/jira/browse/MYRIAD-98
> Project: Myriad
>  Issue Type: Bug
>Reporter: Santosh Marella
> Fix For: Myriad 0.1.0
>
>
> A lot of open source java projects use 2 spaces for indentation.





[jira] [Commented] (MYRIAD-22) Support Mesos Framework Authentication

2015-09-30 Thread DarinJ (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-22?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14939197#comment-14939197
 ] 

DarinJ commented on MYRIAD-22:
--

Merged a while ago.

> Support Mesos Framework Authentication
> --
>
> Key: MYRIAD-22
> URL: https://issues.apache.org/jira/browse/MYRIAD-22
> Project: Myriad
>  Issue Type: Bug
>Reporter: Adam B
>
> See the [Mesos protobuf for 
> Credential|https://github.com/apache/mesos/blob/master/include/mesos/mesos.proto#L820]
> Also see Marathon's issue/implementation: 
> https://github.com/mesosphere/marathon/issues/638


