[jira] [Created] (MYRIAD-251) ZERO size NodeManager fail to obtain resource from Mesos Offer

2016-12-14 Thread Tao Jie (JIRA)
Tao Jie created MYRIAD-251:
--

             Summary: ZERO size NodeManager fail to obtain resource from Mesos Offer
                 Key: MYRIAD-251
                 URL: https://issues.apache.org/jira/browse/MYRIAD-251
             Project: Myriad
          Issue Type: Bug
          Components: Scheduler
            Reporter: Tao Jie


I tried Fine-grained Scaling and flexed up a zero-size NodeManager, then ran an 
MR job that requests resources.
However, the zero-size NM did not obtain resources from the Mesos offer. The RM 
logs look like:
{code}
2016-12-14 16:58:23,929 INFO org.apache.myriad.scheduler.fgs.NMHeartBeatHandler: Did not update bdi13.cmss.com with 10 cores and 5888 memory, over max cpu cores and/or max memory
2016-12-14 16:58:23,931 WARN org.apache.myriad.scheduler.fgs.YarnNodeCapacityManager: Asked to set Node bdi13.cmss.com:31905 to a value less than zero!  Had , setting to .
2016-12-14 16:58:23,931 WARN org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler: Update resource on node: bdi13.cmss.com with the same resource: 
{code}
It seems that any Mesos offer with more than 2252.8 MB of memory is rejected, 
and 2252.8 MB is a fixed value in the code:
{code}
  private Double generateNodeManagerMemory() {
    return (NodeManagerConfiguration.DEFAULT_JVM_MAX_MEMORY_MB)
        * (1 + NodeManagerConfiguration.JVM_OVERHEAD);
  }
{code}
where NodeManagerConfiguration.DEFAULT_JVM_MAX_MEMORY_MB = 2048 and 
NodeManagerConfiguration.JVM_OVERHEAD = 0.1, so the limit is 
2048 * (1 + 0.1) = 2252.8 MB.
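
To make the arithmetic concrete, here is a minimal Java sketch of the cap 
computed from the two constants quoted above. The class name, the method names 
and the comparison in exceedsCap() are hypothetical; the report does not show 
how NMHeartBeatHandler actually compares an offer against the limit.

```java
// Sketch only: the constants mirror the values quoted in the report; the class
// and method names are hypothetical and are not taken from Myriad.
class NodeManagerMemoryCap {
    static final double DEFAULT_JVM_MAX_MEMORY_MB = 2048;
    static final double JVM_OVERHEAD = 0.1;

    // Same arithmetic as the quoted generateNodeManagerMemory().
    static double capMb() {
        return DEFAULT_JVM_MAX_MEMORY_MB * (1 + JVM_OVERHEAD);
    }

    // Assumed comparison: an offer is refused when its memory exceeds the cap.
    static boolean exceedsCap(double offeredMb) {
        return offeredMb > capMb();
    }

    public static void main(String[] args) {
        System.out.println("cap = " + capMb() + " MB");
        System.out.println("5888 MB offer exceeds cap: " + exceedsCap(5888));
    }
}
```

Under that assumption, the 5888 MB offer in the log above is larger than the 
cap, which would match the "over max cpu cores and/or max memory" message.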



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (MYRIAD-250) Should shutdown mesos framework when stop resourcemanager

2016-11-30 Thread Tao Jie (JIRA)

 [ 
https://issues.apache.org/jira/browse/MYRIAD-250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Jie updated MYRIAD-250:
---
Comment: was deleted

(was: This happens on any NM, and I am using code on master branch.)






[jira] [Commented] (MYRIAD-250) Should shutdown mesos framework when stop resourcemanager

2016-11-30 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15707916#comment-15707916
 ] 

Tao Jie commented on MYRIAD-250:


This happens for an NM of any size, and I am using the code on the master branch.






[jira] [Commented] (MYRIAD-250) Should shutdown mesos framework when stop resourcemanager

2016-11-29 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15707739#comment-15707739
 ] 

Tao Jie commented on MYRIAD-250:


Hi, [~yufeldman] [~darinj], I am trying to add an operation that shuts down the 
Mesos framework when the ResourceManager is stopped.
https://github.com/apache/incubator-myriad/pull/101






[jira] [Created] (MYRIAD-250) Should shutdown mesos framework when stop resourcemanager

2016-11-29 Thread Tao Jie (JIRA)
Tao Jie created MYRIAD-250:
--

             Summary: Should shutdown mesos framework when stop resourcemanager
                 Key: MYRIAD-250
                 URL: https://issues.apache.org/jira/browse/MYRIAD-250
             Project: Myriad
          Issue Type: Bug
    Affects Versions: Myriad 0.2.0
            Reporter: Tao Jie


When I started the ResourceManager and flexed up nodes, NodeManagers were 
launched as Mesos tasks in the framework created by the RM.
When I stopped the ResourceManager, the framework became inactive, but the 
NodeManagers kept running as active tasks. I then restarted the 
ResourceManager, which created another framework. The old NodeManagers reported 
to the new ResourceManager, and I could not kill them by flexing down nodes.
It seems that the framework should be shut down once the ResourceManager is 
stopped.
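
A minimal Java sketch of the proposed behaviour follows. It is not the change 
from any pull request. FrameworkDriver, RecordingDriver and 
FrameworkShutdownHook are hypothetical names, and the interface is a local 
stand-in so the example compiles without the Mesos library. Its single method 
mirrors the shape of SchedulerDriver.stop(boolean failover) in the Mesos Java 
API, where failover = false tells Mesos that the framework will not reconnect, 
so Mesos unregisters it and shuts down its tasks and executors.

```java
// Sketch only. FrameworkDriver is a hypothetical stand-in for the Mesos
// scheduler driver so that the example compiles without the Mesos library.
interface FrameworkDriver {
    // failover = false: the framework does not intend to reconnect.
    void stop(boolean failover);
}

// Test double that records how stop() was called.
class RecordingDriver implements FrameworkDriver {
    boolean stopped = false;
    boolean failoverRequested = true;

    @Override
    public void stop(boolean failover) {
        stopped = true;
        failoverRequested = failover;
    }
}

// Hypothetical hook that the ResourceManager would invoke while stopping.
class FrameworkShutdownHook {
    private final FrameworkDriver driver;

    FrameworkShutdownHook(FrameworkDriver driver) {
        this.driver = driver;
    }

    void onResourceManagerStop() {
        // Stop without failover so the NodeManager tasks are not left running.
        driver.stop(false);
    }

    public static void main(String[] args) {
        RecordingDriver driver = new RecordingDriver();
        new FrameworkShutdownHook(driver).onResourceManagerStop();
        System.out.println("stopped=" + driver.stopped
            + ", failover=" + driver.failoverRequested);
    }
}
```

One trade-off to keep in mind: stopping without failover also removes the 
NodeManagers on a planned RM restart, so a real implementation might need to 
make this behaviour configurable.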





[jira] [Comment Edited] (MYRIAD-249) Should set NodeManager vcores more flexibly

2016-11-29 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15704397#comment-15704397
 ] 

Tao Jie edited comment on MYRIAD-249 at 11/29/16 10:26 AM:
---

Hi, [~yufeldman] [~darinj], would you mind giving it a review?
https://github.com/apache/incubator-myriad/pull/100


was (Author: tao jie):
Hi, [~yufeldman] [~darinj], would you mind giving it a review?
https://github.com/apache/incubator-myriad/pull/98






[jira] [Comment Edited] (MYRIAD-249) Should set NodeManager vcores more flexibly

2016-11-29 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15704397#comment-15704397
 ] 

Tao Jie edited comment on MYRIAD-249 at 11/29/16 8:13 AM:
--

Hi, [~yufeldman] [~darinj], would you mind giving it a review?
https://github.com/apache/incubator-myriad/pull/98


was (Author: tao jie):
https://github.com/apache/incubator-myriad/pull/97






[jira] [Commented] (MYRIAD-247) Fail to fetch yarnConfiguration from Resourcemanager

2016-11-29 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15704616#comment-15704616
 ] 

Tao Jie commented on MYRIAD-247:


Hi [~yufeldman] [~darinj],
I am trying to fix this:
https://github.com/apache/incubator-myriad/pull/99






[jira] [Commented] (MYRIAD-249) Should set NodeManager vcores more flexibly

2016-11-28 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15704397#comment-15704397
 ] 

Tao Jie commented on MYRIAD-249:


https://github.com/apache/incubator-myriad/pull/97






[jira] [Created] (MYRIAD-249) Should set NodeManager vcores more flexibly

2016-11-28 Thread Tao Jie (JIRA)
Tao Jie created MYRIAD-249:
--

             Summary: Should set NodeManager vcores more flexibly
                 Key: MYRIAD-249
                 URL: https://issues.apache.org/jira/browse/MYRIAD-249
             Project: Myriad
          Issue Type: Bug
    Affects Versions: Myriad 0.2.0
            Reporter: Tao Jie


Today we set the resources of a NodeManager through configuration like:
{code}
profiles:
  zero:  # NMs launched with this profile dynamically obtain cpu/mem from Mesos
    cpu: 0
    mem: 0
  small:
    cpu: 2
    mem: 1024
  medium:
    cpu: 4
    mem: 4096
  large:
    cpu: 10
    mem: 12288
{code}
Here cpu/mem is the request made to Mesos. Once the resources are allocated, we 
launch the NodeManager and set {{nodemanager.resource.cpu-vcores}} and 
{{nodemanager.resource.memory-mb}} to those cpu/mem values. However, a vcore in 
YARN does not mean exactly the same thing as a cpu in Mesos. In YARN, we may set 
vcores to 12 when there are 6 physical cpus, and a request for vcores is then 
converted to real cpu. Also, in YARN the requested vcores must be an integer, so 
each task takes at least one vcore (which is not necessarily one real cpu).
We could add one more configuration field that multiplies the real cpu in Mesos 
to get the vcores in YARN. Or perhaps set vcores directly in the configuration?
I am new to Myriad and Mesos, so please correct me if I am wrong.
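
The multiplier idea can be sketched in Java as below. This is only an 
illustration: the class, the method and the vcoresPerCpu parameter are 
hypothetical and do not correspond to an existing Myriad configuration field. 
Rounding down is also an assumption, chosen so that the NodeManager never 
advertises more vcores than the multiplier allows.

```java
// Sketch only: the multiplier and all names here are hypothetical.
class VcoreMapping {
    // Converts cpus granted by Mesos into YARN vcores using a multiplier.
    static int toVcores(double mesosCpus, double vcoresPerCpu) {
        if (mesosCpus < 0 || vcoresPerCpu <= 0) {
            throw new IllegalArgumentException(
                "cpus must be >= 0 and the multiplier must be > 0");
        }
        // YARN vcores are integers, so round down.
        return (int) Math.floor(mesosCpus * vcoresPerCpu);
    }

    public static void main(String[] args) {
        // The example from the description: 6 physical cpus, 12 vcores.
        System.out.println(toVcores(6, 2.0));
        // A fractional Mesos cpu can still back one whole vcore.
        System.out.println(toVcores(0.5, 2.0));
    }
}
```

With a multiplier of 1.0 the mapping reduces to the current behaviour, where 
the vcores equal the Mesos cpus.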





[jira] [Commented] (MYRIAD-248) Fail to launch Nodemanager when frameworkRole is default value "*"

2016-11-28 Thread Tao Jie (JIRA)

[ 
https://issues.apache.org/jira/browse/MYRIAD-248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702381#comment-15702381
 ] 

Tao Jie commented on MYRIAD-248:


[~yufeldman], I am using code on master branch.






[jira] [Created] (MYRIAD-248) Fail to launch Nodemanager when frameworkRole is default value "*"

2016-11-28 Thread Tao Jie (JIRA)
Tao Jie created MYRIAD-248:
--

             Summary: Fail to launch Nodemanager when frameworkRole is default value "*"
                 Key: MYRIAD-248
                 URL: https://issues.apache.org/jira/browse/MYRIAD-248
             Project: Myriad
          Issue Type: Bug
    Affects Versions: Myriad 0.2.0
            Reporter: Tao Jie


I tried to start a Hadoop cluster with myriad-0.2.0, but got this error message 
in the RM log:
{code}
2016-11-25 10:32:50,750 ERROR org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler: Exception thrown while trying to create a task for nm
java.lang.IllegalArgumentException: n must be positive
    at java.util.Random.nextInt(Random.java:300)
    at org.apache.myriad.scheduler.resource.RangeResource.getRandomValues(RangeResource.java:128)
    at org.apache.myriad.scheduler.resource.RangeResource.consumeResource(RangeResource.java:99)
    at org.apache.myriad.scheduler.resource.ResourceOfferContainer.consumePorts(ResourceOfferContainer.java:171)
    at org.apache.myriad.scheduler.NMTaskFactory.createTask(NMTaskFactory.java:45)
    at org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.java:146)
    at org.apache.myriad.scheduler.event.handlers.ResourceOffersEventHandler.onEvent(ResourceOffersEventHandler.java:51)
    at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
{code}
It seems that the failure is due to the default value ("*") of frameworkRole in 
myriad-config-default.yml.
After I set frameworkRole to some other value, it worked well.
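
The trace shows java.util.Random.nextInt rejecting a non-positive bound inside 
RangeResource.getRandomValues. One possible explanation, which is an assumption 
and is not confirmed by this report, is that no port values were collected for 
the role, so the code picked from an empty collection. The Java sketch below 
reproduces that failure mode and shows a guard. PortPicker and its methods are 
hypothetical and are not Myriad code.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Sketch only: PortPicker is hypothetical and is not part of Myriad.
class PortPicker {
    // Returns a random port from the list, or -1 when the list is empty.
    static int pickPort(List<Integer> ports, Random random) {
        if (ports.isEmpty()) {
            return -1; // Guard: nextInt(0) would throw.
        }
        return ports.get(random.nextInt(ports.size()));
    }

    // Shows the unguarded failure: a bound of 0 is rejected by nextInt.
    static boolean unguardedPickThrows() {
        List<Integer> noPorts = Collections.emptyList();
        try {
            new Random().nextInt(noPorts.size());
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("unguarded pick throws: " + unguardedPickThrows());
        System.out.println("guarded pick on empty list: "
            + pickPort(Collections.<Integer>emptyList(), new Random()));
        System.out.println("guarded pick on one port: "
            + pickPort(Arrays.asList(31905), new Random()));
    }
}
```

A guard like this would turn the exception into an explicit "no ports 
available" result that the caller can log or use to decline the offer.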







[jira] [Created] (MYRIAD-247) Fail to fetch yarnConfiguration from Resourcemanager

2016-11-28 Thread Tao Jie (JIRA)
Tao Jie created MYRIAD-247:
--

             Summary: Fail to fetch yarnConfiguration from Resourcemanager
                 Key: MYRIAD-247
                 URL: https://issues.apache.org/jira/browse/MYRIAD-247
             Project: Myriad
          Issue Type: Bug
    Affects Versions: Myriad 0.2.0
            Reporter: Tao Jie


I set up a cluster with Mesos-1.0 and Myriad-0.2.0. When I tried to start a 
NodeManager, the Mesos task tried to download the Hadoop configuration file by 
fetching {{http://rm-addr:8088/yarnConfiguration}}, but it failed.
It seems that the YARN configuration file is available at 
{{http://rm-addr:8088/conf}} rather than 
{{http://rm-addr:8088/yarnConfiguration}}.
I modified ExecutorCommandLineGenerator.java to set the URI to 
{{http://rm-addr:8088/conf}}, and then the NodeManager started successfully.
I am not sure whether this is a real problem. Please correct me if I am wrong.
My Hadoop version is 2.6.0; I also tried Hadoop-3.0.0-alpha and found no 
difference.


