[jira] [Assigned] (MESOS-6581) Add Seccomp support at Mesos Agent level

2018-06-15 Thread Jay Guo (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-6581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo reassigned MESOS-6581:
--

Assignee: (was: Jay Guo)

> Add Seccomp support at Mesos Agent level
> 
>
> Key: MESOS-6581
> URL: https://issues.apache.org/jira/browse/MESOS-6581
> Project: Mesos
>  Issue Type: Task
> Environment: Linux Only
>Reporter: Jay Guo
>Priority: Major
>
> The operator of a Mesos cluster should be able to enforce a set of Seccomp rules on 
> a Mesos Agent to defend against potential exploits through syscalls. 
> When enabled, every container launched on the Agent must comply with the 
> Seccomp filter or be killed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (MESOS-1607) Introduce optimistic offers.

2017-06-22 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16060290#comment-16060290
 ] 

Jay Guo commented on MESOS-1607:


[~saitejar] It's still on the roadmap, but not actively being worked on at the 
moment. For the Mesos allocator, the current focus is on hierarchical roles: 
https://issues.apache.org/jira/browse/MESOS-6375

> Introduce optimistic offers.
> 
>
> Key: MESOS-1607
> URL: https://issues.apache.org/jira/browse/MESOS-1607
> Project: Mesos
>  Issue Type: Epic
>  Components: allocation, framework, master
>Reporter: Benjamin Hindman
>Assignee: Artem Harutyunyan
>  Labels: mesosphere
> Attachments: optimisitic-offers.pdf
>
>
> *Background*
> The current implementation of resource offers only enables a single framework 
> scheduler to make scheduling decisions for some available resources at a 
> time. In some circumstances, this is good, i.e., when we don't want other 
> framework schedulers to have access to some resources. However, in other 
> circumstances, there are advantages to letting multiple framework schedulers 
> attempt to make scheduling decisions for the _same_ allocation of resources 
> in parallel.
> If you think about this from a "concurrency control" perspective, the current 
> implementation of resource offers is _pessimistic_: the resources contained 
> within an offer are _locked_ until the framework scheduler that they were 
> offered to launches tasks with them or declines them. In addition to making 
> pessimistic offers we'd like to give out _optimistic_ offers, where the same 
> resources are offered to multiple framework schedulers at the same time, and 
> framework schedulers "compete" for those resources on a 
> first-come-first-served basis (i.e., the first to launch a task "wins"). We've 
> always reserved the right to rescind resource offers using the 'rescind' 
> primitive in the API, and a framework scheduler should be prepared to launch 
> tasks and have those tasks go lost because another framework already started 
> to use those resources.
> *Feature*
> We plan to take a step towards optimistic offers, by introducing primitives 
> that allow resources to be offered to multiple frameworks at once.  At first, 
> we will use these primitives to optimistically allocate resources that are 
> reserved for a particular framework/role but have not been allocated by that 
> framework/role.  
> The work with optimistic offers will closely resemble the existing 
> oversubscription feature.  Optimistically offered resources are likely to be 
> considered "revocable resources" (the concept that using resources not 
> reserved for you means you might get those resources revoked).  In effect, we 
> may create something like a "spot" market for unused resources, driving 
> up utilization by letting frameworks that are willing to use revocable 
> resources run tasks.
> *Future Work*
> This ticket tracks the introduction of some aspects of optimistic offers.  
> Taken to the limit, one could imagine always making optimistic resource 
> offers. This bears a striking resemblance to the Google Omega model (an 
> isomorphism even). However, being able to configure what resources should be 
> allocated optimistically and what resources should be allocated 
> pessimistically gives even more control to a datacenter/cluster operator that 
> might want to, for example, never let multiple frameworks (roles) compete for 
> some set of resources.
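
The first-come-first-served competition described above can be illustrated with a minimal C++ sketch. It is not Mesos code: `OptimisticOffer`, `tryClaim` and `UNCLAIMED` are hypothetical names, and an atomic compare-and-swap is only assumed here as one way to decide which framework "wins" a bundle of resources that was offered to several schedulers at once.

```cpp
#include <atomic>

// Hypothetical marker meaning "no framework has used this offer yet".
constexpr int UNCLAIMED = -1;

// Hypothetical model of one bundle of resources that has been offered
// optimistically to several frameworks at the same time.
class OptimisticOffer
{
public:
  // Returns true only for the first framework that tries to launch on
  // these resources. Every later caller loses the race; in Mesos terms
  // its task would go lost and it has to wait for another offer.
  bool tryClaim(int frameworkId)
  {
    int expected = UNCLAIMED;
    return owner_.compare_exchange_strong(expected, frameworkId);
  }

  // The framework that won the race, or UNCLAIMED.
  int owner() const { return owner_.load(); }

private:
  std::atomic<int> owner_{UNCLAIMED};
};
```

A pessimistic offer corresponds to handing the bundle to exactly one framework, so `tryClaim` could never fail; the optimistic variant trades that guarantee for higher utilization.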



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (MESOS-7402) Allocated quota of a child role should be also charged on all ancestors of that role

2017-05-09 Thread Jay Guo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo reassigned MESOS-7402:
--

Assignee: Jay Guo

> Allocated quota of a child role should be also charged on all ancestors of 
> that role
> 
>
> Key: MESOS-7402
> URL: https://issues.apache.org/jira/browse/MESOS-7402
> Project: Mesos
>  Issue Type: Bug
>Reporter: Jay Guo
>Assignee: Jay Guo
> Attachments: hrole_quota_test.patch
>
>
> Consider the following case: role {{a}} is quota'd with resource 100, role 
> {{a/b}} is quota'd with resource 40. In the current implementation, the quota of 
> a parent role is actually the aggregation of quota in its whole subtree, including 
> itself. Therefore, the internal node of {{a}} is actually quota'd with 60, 
> instead of 100. In other words, an allocation made for the quota of {{a/b}} should 
> also be charged to the quota of its parent.
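
The charging rule in this description can be sketched in a few lines of C++. This is only an illustration under stated assumptions, not the allocator's implementation: `selfAndAncestors`, `charge` and `headroom` are hypothetical helpers, and quota is modeled as a single scalar per role.

```cpp
#include <map>
#include <string>
#include <vector>

// Returns the role followed by all of its ancestors,
// e.g. "a/b/c" -> {"a/b/c", "a/b", "a"}.
std::vector<std::string> selfAndAncestors(const std::string& role)
{
  std::vector<std::string> result;
  std::string current = role;
  while (true) {
    result.push_back(current);
    const std::string::size_type slash = current.rfind('/');
    if (slash == std::string::npos) {
      break;
    }
    current = current.substr(0, slash);
  }
  return result;
}

// Hypothetical bookkeeping: charge `amount` to `role` and to every
// ancestor of `role`.
void charge(
    std::map<std::string, double>& charged,
    const std::string& role,
    double amount)
{
  for (const std::string& r : selfAndAncestors(role)) {
    charged[r] += amount;
  }
}

// Quota that is still unused for `role`. A role without quota is
// treated as having no headroom in this sketch.
double headroom(
    const std::map<std::string, double>& quota,
    const std::map<std::string, double>& charged,
    const std::string& role)
{
  const auto q = quota.find(role);
  if (q == quota.end()) {
    return 0.0;
  }
  const auto c = charged.find(role);
  return q->second - (c == charged.end() ? 0.0 : c->second);
}
```

With quota 100 on `a` and 40 on `a/b`, charging 40 to `a/b` leaves a headroom of 60 on `a`, which matches the numbers in the description.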



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (MESOS-7228) Upgrade Mesos to build with proto3.

2017-04-27 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15988106#comment-15988106
 ] 

Jay Guo commented on MESOS-7228:


[~zhitao] Thank you for clarifying it! I think I misinterpreted it as upgrading 
both bundle and syntax. Sounds good!

> Upgrade Mesos to build with proto3.
> ---
>
> Key: MESOS-7228
> URL: https://issues.apache.org/jira/browse/MESOS-7228
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Anand Mazumdar
>Assignee: Zhitao Li
>Priority: Critical
>
> We currently build Mesos with protobuf 2.6.1 and bundle it as a dependency. 
> We should upgrade it to use v3.2.0 instead. This would help us use arenas to 
> improve performance (MESOS-6971) and also help resolve some bugs around the 
> Mesos master not able to handle large protobufs (>64mbs in size, MESOS-4210) 
> etc. 





[jira] [Commented] (MESOS-7228) Upgrade Mesos to build with proto3.

2017-04-25 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984112#comment-15984112
 ] 

Jay Guo commented on MESOS-7228:


[~zhitao] so no more {{required}} field in Mesos?

> Upgrade Mesos to build with proto3.
> ---
>
> Key: MESOS-7228
> URL: https://issues.apache.org/jira/browse/MESOS-7228
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Anand Mazumdar
>Assignee: Zhitao Li
>Priority: Critical
>
> We currently build Mesos with protobuf 2.6.1 and bundle it as a dependency. 
> We should upgrade it to use v3.2.0 instead. This would help us use arenas to 
> improve performance (MESOS-6971) and also help resolve some bugs around the 
> Mesos master not able to handle large protobufs (>64mbs in size, MESOS-4210) 
> etc. 





[jira] [Commented] (MESOS-7402) Allocated quota of a child role should be also charged on all ancestors of that role

2017-04-20 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15976812#comment-15976812
 ] 

Jay Guo commented on MESOS-7402:


[~qianzhang] Yes, I think that could be a solution. Ideally we could have a 
tree structure in the master, sorter and allocator, and maybe the sorter should be 
quota-aware. I logged a JIRA for this: MESOS-7293

> Allocated quota of a child role should be also charged on all ancestors of 
> that role
> 
>
> Key: MESOS-7402
> URL: https://issues.apache.org/jira/browse/MESOS-7402
> Project: Mesos
>  Issue Type: Bug
>Reporter: Jay Guo
> Attachments: hrole_quota_test.patch
>
>
> Consider the following case: role {{a}} is quota'd with resource 100, role 
> {{a/b}} is quota'd with resource 40. In the current implementation, the quota of 
> a parent role is actually the aggregation of quota in its whole subtree, including 
> itself. Therefore, the internal node of {{a}} is actually quota'd with 60, 
> instead of 100. In other words, an allocation made for the quota of {{a/b}} should 
> also be charged to the quota of its parent.





[jira] [Comment Edited] (MESOS-7402) Allocated quota of a child role should be also charged on all ancestors of that role

2017-04-19 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15976051#comment-15976051
 ] 

Jay Guo edited comment on MESOS-7402 at 4/20/17 4:29 AM:
-

[~qianzhang] In the current implementation, when we check whether quota is satisfied, we 
check against the actual *individual roles* (either virtual or leaf), but *NOT* the 
tree as a whole entity. Also, how we calculate quota for a virtual node is 
incorrect (consider the example in this JIRA: {{a/.}} should have 60, not 100). 
Therefore, we risk overcommitting resources to a quota'd tree. To address your 
question: suppose we allocate 40 to {{a/b}}, which is accounted towards the parent 
role {{a}}. However, {{a/.}} is still allocated 0 and quota'd 100, so we 
might allocate 100 to {{a/.}}, which violates the quota.


was (Author: guoger):
[~qianzhang] In current implementation, when we check if quota is satisfied, we 
check against actual *individual roles* (either virtual or leaf), but *NOT* the 
tree as a whole entity. And how we calculate quota for virtual node is 
incorrect (consider example in this JIRA, {{a/.}} should have 60, not 100). 
Therefore, we risk overcommitting resources to quota'd tree.

> Allocated quota of a child role should be also charged on all ancestors of 
> that role
> 
>
> Key: MESOS-7402
> URL: https://issues.apache.org/jira/browse/MESOS-7402
> Project: Mesos
>  Issue Type: Bug
>Reporter: Jay Guo
> Attachments: hrole_quota_test.patch
>
>
> Consider the following case: role {{a}} is quota'd with resource 100, role 
> {{a/b}} is quota'd with resource 40. In the current implementation, the quota of 
> a parent role is actually the aggregation of quota in its whole subtree, including 
> itself. Therefore, the internal node of {{a}} is actually quota'd with 60, 
> instead of 100. In other words, an allocation made for the quota of {{a/b}} should 
> also be charged to the quota of its parent.





[jira] [Commented] (MESOS-7402) Allocated quota of a child role should be also charged on all ancestors of that role

2017-04-19 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15976051#comment-15976051
 ] 

Jay Guo commented on MESOS-7402:


[~qianzhang] In the current implementation, when we check whether quota is satisfied, we 
check against the actual *individual roles* (either virtual or leaf), but *NOT* the 
tree as a whole entity. Also, how we calculate quota for a virtual node is 
incorrect (consider the example in this JIRA: {{a/.}} should have 60, not 100). 
Therefore, we risk overcommitting resources to a quota'd tree.

> Allocated quota of a child role should be also charged on all ancestors of 
> that role
> 
>
> Key: MESOS-7402
> URL: https://issues.apache.org/jira/browse/MESOS-7402
> Project: Mesos
>  Issue Type: Bug
>Reporter: Jay Guo
> Attachments: hrole_quota_test.patch
>
>
> Consider the following case: role {{a}} is quota'd with resource 100, role 
> {{a/b}} is quota'd with resource 40. In the current implementation, the quota of 
> a parent role is actually the aggregation of quota in its whole subtree, including 
> itself. Therefore, the internal node of {{a}} is actually quota'd with 60, 
> instead of 100. In other words, an allocation made for the quota of {{a/b}} should 
> also be charged to the quota of its parent.





[jira] [Comment Edited] (MESOS-7260) Authorization for `/role` endpoint should take both VIEW_ROLES and VIEW_FRAMEWORKS into account.

2017-04-18 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15950323#comment-15950323
 ] 

Jay Guo edited comment on MESOS-7260 at 4/19/17 3:40 AM:
-

https://reviews.apache.org/r/58095/
https://reviews.apache.org/r/58096/
https://reviews.apache.org/r/58097/
https://reviews.apache.org/r/58099/


was (Author: guoger):
https://reviews.apache.org/r/58095/
https://reviews.apache.org/r/58096/
https://reviews.apache.org/r/58097/

> Authorization for `/role` endpoint should take both VIEW_ROLES and 
> VIEW_FRAMEWORKS into account.
> 
>
> Key: MESOS-7260
> URL: https://issues.apache.org/jira/browse/MESOS-7260
> Project: Mesos
>  Issue Type: Bug
>  Components: HTTP API, master
>Reporter: Jay Guo
>Assignee: Jay Guo
>
> Consider the following case: both {{framework1}} and {{framework2}} subscribe to 
> {{roleX}}, and {{principal}} is allowed to view {{roleX}} and {{framework1}}, but 
> *NOT* {{framework2}}. Therefore, the {{/role}} endpoint should only contain 
> {{framework1}}, not both frameworks.
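
The expected filtering can be sketched as follows. This is a simplified model, not the master's HTTP handler: `RoleView`, `filterRole` and the two predicates are hypothetical stand-ins for the role data and for the VIEW_ROLES and VIEW_FRAMEWORKS authorization checks.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical, simplified view of a role as rendered by the endpoint.
struct RoleView
{
  std::string name;
  std::vector<std::string> frameworks;
};

// Applies both checks: the role must be viewable (stand-in for
// VIEW_ROLES), and only viewable frameworks (stand-in for
// VIEW_FRAMEWORKS) are listed under it. Returns false if the role
// itself must be hidden; otherwise fills `out` and returns true.
bool filterRole(
    const RoleView& role,
    const std::function<bool(const std::string&)>& canViewRole,
    const std::function<bool(const std::string&)>& canViewFramework,
    RoleView* out)
{
  if (!canViewRole(role.name)) {
    return false;
  }

  out->name = role.name;
  out->frameworks.clear();
  for (const std::string& framework : role.frameworks) {
    if (canViewFramework(framework)) {
      out->frameworks.push_back(framework);
    }
  }
  return true;
}
```

For the case in this ticket, a principal that may view `roleX` and `framework1` but not `framework2` would see `roleX` listed with `framework1` only.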





[jira] [Created] (MESOS-7402) Allocated quota of a child role should be also charged on all ancestors of that role

2017-04-18 Thread Jay Guo (JIRA)
Jay Guo created MESOS-7402:
--

 Summary: Allocated quota of a child role should be also charged on 
all ancestors of that role
 Key: MESOS-7402
 URL: https://issues.apache.org/jira/browse/MESOS-7402
 Project: Mesos
  Issue Type: Bug
Reporter: Jay Guo


Consider the following case: role {{a}} is quota'd with resource 100, role {{a/b}} 
is quota'd with resource 40. In the current implementation, the quota of a parent role is 
actually the aggregation of quota in its whole subtree, including itself. 
Therefore, the internal node of {{a}} is actually quota'd with 60, instead of 
100. In other words, an allocation made for the quota of {{a/b}} should also be 
charged to the quota of its parent.





[jira] [Commented] (MESOS-7136) Eliminate fair sharing between frameworks within a role.

2017-04-18 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15973966#comment-15973966
 ] 

Jay Guo commented on MESOS-7136:


So we wouldn't calculate the fair share of frameworks anymore, but only that of 
roles? Then how do we handle multiple frameworks under the same role? Round-robin? 
First-come-first-served?

> Eliminate fair sharing between frameworks within a role.
> 
>
> Key: MESOS-7136
> URL: https://issues.apache.org/jira/browse/MESOS-7136
> Project: Mesos
>  Issue Type: Epic
>  Components: allocation, technical debt
>Reporter: Benjamin Mahler
>  Labels: multi-tenancy
>
> The current fair sharing algorithm performs fair sharing between frameworks 
> within a role. This is equivalent to having the framework id behave as a 
> pseudo-role beneath the role. Consider the case where there are two spark 
> frameworks running within the same "spark" role. This behaves similarly to 
> hierarchical roles with the framework ID acting as an implicit role:
> {noformat}
>                  ^
>                /   \
>            spark   services
>              ^
>           /     \
>        /           \
> FrameworkId1          FrameworkId2
> (fixed weight of 1)   (fixed weight of 1)
> {noformat}
> Unfortunately, the frameworks cannot change their weight to be a value other 
> than 1 (see MESOS-6247) and they cannot set quota.
> With the addition of hierarchical roles (see MESOS-6375) we can eliminate the 
> notion of the framework ID acting as a pseudo-role in favor of explicitly 
> using hierarchical roles. E.g.
> {noformat}
>             ^
>           /   \
>         eng   sales
>          ^
>        /   \
>   analytics  ui
>       ^
>     /   \
> learning reports
> {noformat}
> Here if two frameworks run within the eng/analytics role, then they will 
> compete for resources without fair sharing. However, if resource guarantees 
> are required, sub-roles can be created explicitly, e.g. 
> eng/analytics/learning and eng/analytics/reports. These roles can be given 
> weights and quota.





[jira] [Updated] (MESOS-7293) Refactor quota and weight to be stored and managed by roles

2017-04-18 Thread Jay Guo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo updated MESOS-7293:
---
Description: Currently {{weight}} and {{quota}} are stored in hashmaps separate 
from {{roles}} in master, even though they are actually attributes of 
roles. Having them stored and managed by {{Role}} makes it easier to fetch 
aggregated role-related information (e.g. when rendering endpoint responses), and the 
logic would be clearer. It could potentially make authZ simpler as well.  
(was: Currently {{weight}} and {{quota}} are stored as separate hashmap as 
{{roles}} in master, even though they are actually attributes of roles. Having 
them stored and managed by {{Role}} make it easier to fetch aggregated 
role-related information, and the logic would be more clear. It could 
potentially make authZ simpler as well.)

> Refactor quota and weight to be stored and managed by roles
> ---
>
> Key: MESOS-7293
> URL: https://issues.apache.org/jira/browse/MESOS-7293
> Project: Mesos
>  Issue Type: Bug
>  Components: master
>Reporter: Jay Guo
>
> Currently {{weight}} and {{quota}} are stored in hashmaps separate from 
> {{roles}} in master, even though they are actually attributes of roles. 
> Having them stored and managed by {{Role}} makes it easier to fetch aggregated 
> role-related information (e.g. when rendering endpoint responses), and the logic would 
> be clearer. It could potentially make authZ simpler as well.
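
The proposed refactoring could look roughly like the C++ sketch below. It is an assumption-laden illustration rather than the master's actual data structures: `Role`, `RoleRegistry` and their members are hypothetical, quota is reduced to a scalar, and a default weight of 1 is assumed.

```cpp
#include <map>
#include <string>

// Hypothetical role record that owns its weight and quota, instead of
// keeping them in hashmaps separate from the roles.
struct Role
{
  double weight = 1.0;    // Assumed default weight.
  bool hasQuota = false;  // Whether a quota has been set.
  double quota = 0.0;     // Scalar stand-in for the quota guarantee.
};

// Hypothetical single place for role bookkeeping.
class RoleRegistry
{
public:
  void setWeight(const std::string& name, double weight)
  {
    roles_[name].weight = weight;
  }

  void setQuota(const std::string& name, double quota)
  {
    Role& role = roles_[name];
    role.hasQuota = true;
    role.quota = quota;
  }

  // A single lookup returns everything an endpoint needs to render
  // for the role, or nullptr if the role is unknown.
  const Role* find(const std::string& name) const
  {
    const auto it = roles_.find(name);
    return it == roles_.end() ? nullptr : &it->second;
  }

private:
  std::map<std::string, Role> roles_;
};
```

Keeping both attributes on one record means an endpoint handler, or an authorization filter, only has to consult one structure per role.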





[jira] [Comment Edited] (MESOS-7078) Benchmarks to validate perf impact of hierarchical sorting

2017-04-17 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15970876#comment-15970876
 ] 

Jay Guo edited comment on MESOS-7078 at 4/17/17 9:04 AM:
-

[~neilc] I built a tree of clients in {{Sorter_BENCHMARK_Test.FullSort}} and the 
performance degrades pretty badly. I guess it may be inevitable due to tree 
traversal. Should I add this test to capture it?


was (Author: guoger):
I built a tree of client in {{Sorter_BENCHMARK_Test.FullSort}} and the 
performance downgrades pretty badly. I guess it may be inevitable due to tree 
traversal. Should I add this test to capture it?

> Benchmarks to validate perf impact of hierarchical sorting
> --
>
> Key: MESOS-7078
> URL: https://issues.apache.org/jira/browse/MESOS-7078
> Project: Mesos
>  Issue Type: Task
>Reporter: Neil Conway
>Assignee: Neil Conway
>  Labels: mesosphere
>
> Depending on how deeply we need to change the sorter/allocator, we should 
> ensure we take the time to run the existing benchmarks (and perhaps write new 
> benchmarks) to ensure we don't regress performance for existing 
> sorter/allocator use cases.





[jira] [Comment Edited] (MESOS-7078) Benchmarks to validate perf impact of hierarchical sorting

2017-04-17 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15970876#comment-15970876
 ] 

Jay Guo edited comment on MESOS-7078 at 4/17/17 9:03 AM:
-

I built a tree of clients in {{Sorter_BENCHMARK_Test.FullSort}} and the 
performance degrades pretty badly. I guess it may be inevitable due to tree 
traversal. Should I add this test to capture it?


was (Author: guoger):
Should we also add benchmark tests with hierarchical roles? More specifically, 
build a tree of clients and perform same procedures as 
{{Sorter_BENCHMARK_Test.FullSort}}.

> Benchmarks to validate perf impact of hierarchical sorting
> --
>
> Key: MESOS-7078
> URL: https://issues.apache.org/jira/browse/MESOS-7078
> Project: Mesos
>  Issue Type: Task
>Reporter: Neil Conway
>Assignee: Neil Conway
>  Labels: mesosphere
>
> Depending on how deeply we need to change the sorter/allocator, we should 
> ensure we take the time to run the existing benchmarks (and perhaps write new 
> benchmarks) to ensure we don't regress performance for existing 
> sorter/allocator use cases.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (MESOS-7078) Benchmarks to validate perf impact of hierarchical sorting

2017-04-17 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15970876#comment-15970876
 ] 

Jay Guo commented on MESOS-7078:


Should we also add benchmark tests with hierarchical roles? More specifically, 
build a tree of clients and perform the same procedures as 
{{Sorter_BENCHMARK_Test.FullSort}}.

> Benchmarks to validate perf impact of hierarchical sorting
> --
>
> Key: MESOS-7078
> URL: https://issues.apache.org/jira/browse/MESOS-7078
> Project: Mesos
>  Issue Type: Task
>Reporter: Neil Conway
>Assignee: Neil Conway
>  Labels: mesosphere
>
> Depending on how deeply we need to change the sorter/allocator, we should 
> ensure we take the time to run the existing benchmarks (and perhaps write new 
> benchmarks) to ensure we don't regress performance for existing 
> sorter/allocator use cases.





[jira] [Updated] (MESOS-7385) Framework should not starve due to `dovetailing` in naive H-DRF implementation.

2017-04-13 Thread Jay Guo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo updated MESOS-7385:
---
Description: 
Mesos currently implements the naive H-DRF algorithm, as described in the [h-drf 
paper|https://people.eecs.berkeley.edu/~alig/papers/h-drf.pdf], which may incur 
starvation due to `dovetailing`. Essentially, the following test should pass:
{code}
TEST_F(HierarchicalAllocatorTest, Starvation)
{
  Clock::pause();

  initialize();

  const string ROLE1 = "a";
  const string ROLE2 = "b/c";
  const string ROLE3 = "b/d";

  FrameworkInfo framework1 = createFrameworkInfo({ROLE1});
  allocator->addFramework(framework1.id(), framework1, {}, true);

  SlaveInfo agent1 = createSlaveInfo("cpus:1");
  allocator->addSlave(
  agent1.id(),
  agent1,
  AGENT_CAPABILITIES(),
  None(),
  agent1.resources(),
  {});

  // `framework1` will be offered all of the resources on `agent1`.
  {
    Allocation expected = Allocation(
        framework1.id(),
        {{ROLE1, {{agent1.id(), agent1.resources()}}}});

    AWAIT_EXPECT_EQ(expected, allocations.get());
  }

  // Create `framework2` in the child role.
  FrameworkInfo framework2 = createFrameworkInfo({ROLE2});
  allocator->addFramework(framework2.id(), framework2, {}, true);

  SlaveInfo agent2 = createSlaveInfo("mem:32");
  allocator->addSlave(
  agent2.id(),
  agent2,
  AGENT_CAPABILITIES(),
  None(),
  agent2.resources(),
  {});

  {
    Allocation expected = Allocation(
        framework2.id(),
        {{ROLE2, {{agent2.id(), agent2.resources()}}}});

    AWAIT_EXPECT_EQ(expected, allocations.get());
  }

  // Create `framework3` in the child role.
  FrameworkInfo framework3 = createFrameworkInfo({ROLE3});
  allocator->addFramework(framework3.id(), framework3, {}, true);

  SlaveInfo agent3 = createSlaveInfo("cpus:1");
  allocator->addSlave(
  agent3.id(),
  agent3,
  AGENT_CAPABILITIES(),
  None(),
  agent3.resources(),
  {});

  // Current fair share is:
  // - `framework1`: 50% (1/2 cpus)
  // - `framework2`: 100% (32/32 mem)
  // - `framework3`: 0% (0/2 cpus)
  // So `framework3` should be offered all of the resources on `agent3`.
  // However, `framework3` is punished due to the naive H-DRF implementation,
  // where parent role `b` has a fair share of 100%, which
  // leads to starvation.
  {
    Allocation expected = Allocation(
        framework3.id(),
        {{ROLE3, {{agent3.id(), agent3.resources()}}}});

    AWAIT_EXPECT_EQ(expected, allocations.get()); // It fails!
  }
}
{code}

This JIRA is created to make sure this behavior is captured and will be 
addressed in the future. Note that it affects the current implementation without 
hierarchical roles as well.

  was:
Mesos currently implements naive H-DRF algorithm, as described in [h-drf 
paper|https://people.eecs.berkeley.edu/~alig/papers/h-drf.pdf], which may incur 
starvation due to `dovetailing`. Essentially, following test should pass:
{{code}}
TEST_F(HierarchicalAllocatorTest, Starvation)
{
  Clock::pause();

  initialize();

  const string ROLE1 = "a";
  const string ROLE2 = "b/c";
  const string ROLE3 = "b/d";

  FrameworkInfo framework1 = createFrameworkInfo({ROLE1});
  allocator->addFramework(framework1.id(), framework1, {}, true);

  SlaveInfo agent1 = createSlaveInfo("cpus:1");
  allocator->addSlave(
  agent1.id(),
  agent1,
  AGENT_CAPABILITIES(),
  None(),
  agent1.resources(),
  {});

  // `framework1` will be offered all of the resources on `agent1`.
  {
    Allocation expected = Allocation(
        framework1.id(),
        {{ROLE1, {{agent1.id(), agent1.resources()}}}});

    AWAIT_EXPECT_EQ(expected, allocations.get());
  }

  // Create `framework2` in the child role.
  FrameworkInfo framework2 = createFrameworkInfo({ROLE2});
  allocator->addFramework(framework2.id(), framework2, {}, true);

  SlaveInfo agent2 = createSlaveInfo("mem:32");
  allocator->addSlave(
  agent2.id(),
  agent2,
  AGENT_CAPABILITIES(),
  None(),
  agent2.resources(),
  {});

  {
    Allocation expected = Allocation(
        framework2.id(),
        {{ROLE2, {{agent2.id(), agent2.resources()}}}});

    AWAIT_EXPECT_EQ(expected, allocations.get());
  }

  // Create `framework3` in the child role.
  FrameworkInfo framework3 = createFrameworkInfo({ROLE3});
  allocator->addFramework(framework3.id(), framework3, {}, true);

  SlaveInfo agent3 = createSlaveInfo("cpus:1");
  allocator->addSlave(
  agent3.id(),
  agent3,
  AGENT_CAPABILITIES(),
  None(),
  agent3.resources(),
  {});

  // Current fair share is:
  // - `framework1`: 50% (1/2 cpus)
  // - `framework2`: 100% (32/32 mem)
  // - `framework3`: 0% (0/2 cpus)
  // So `framework3` should be offered all of the resources on `agent3`.
  // However, `framework3` is punished due to naive h-drf implementation,
  // where fair share of parent 

[jira] [Created] (MESOS-7385) Framework should not starve due to `dovetailing` in naive H-DRF implementation.

2017-04-13 Thread Jay Guo (JIRA)
Jay Guo created MESOS-7385:
--

 Summary: Framework should not starve due to `dovetailing` in naive 
H-DRF implementation.
 Key: MESOS-7385
 URL: https://issues.apache.org/jira/browse/MESOS-7385
 Project: Mesos
  Issue Type: Bug
  Components: master
Reporter: Jay Guo


Mesos currently implements the naive H-DRF algorithm, as described in the [h-drf 
paper|https://people.eecs.berkeley.edu/~alig/papers/h-drf.pdf], which may incur 
starvation due to `dovetailing`. Essentially, the following test should pass:
{{code}}
TEST_F(HierarchicalAllocatorTest, Starvation)
{
  Clock::pause();

  initialize();

  const string ROLE1 = "a";
  const string ROLE2 = "b/c";
  const string ROLE3 = "b/d";

  FrameworkInfo framework1 = createFrameworkInfo({ROLE1});
  allocator->addFramework(framework1.id(), framework1, {}, true);

  SlaveInfo agent1 = createSlaveInfo("cpus:1");
  allocator->addSlave(
  agent1.id(),
  agent1,
  AGENT_CAPABILITIES(),
  None(),
  agent1.resources(),
  {});

  // `framework1` will be offered all of the resources on `agent1`.
  {
    Allocation expected = Allocation(
        framework1.id(),
        {{ROLE1, {{agent1.id(), agent1.resources()}}}});

    AWAIT_EXPECT_EQ(expected, allocations.get());
  }

  // Create `framework2` in the child role.
  FrameworkInfo framework2 = createFrameworkInfo({ROLE2});
  allocator->addFramework(framework2.id(), framework2, {}, true);

  SlaveInfo agent2 = createSlaveInfo("mem:32");
  allocator->addSlave(
  agent2.id(),
  agent2,
  AGENT_CAPABILITIES(),
  None(),
  agent2.resources(),
  {});

  {
    Allocation expected = Allocation(
        framework2.id(),
        {{ROLE2, {{agent2.id(), agent2.resources()}}}});

    AWAIT_EXPECT_EQ(expected, allocations.get());
  }

  // Create `framework3` in the child role.
  FrameworkInfo framework3 = createFrameworkInfo({ROLE3});
  allocator->addFramework(framework3.id(), framework3, {}, true);

  SlaveInfo agent3 = createSlaveInfo("cpus:1");
  allocator->addSlave(
  agent3.id(),
  agent3,
  AGENT_CAPABILITIES(),
  None(),
  agent3.resources(),
  {});

  // Current fair share is:
  // - `framework1`: 50% (1/2 cpus)
  // - `framework2`: 100% (32/32 mem)
  // - `framework3`: 0% (0/2 cpus)
  // So `framework3` should be offered all of the resources on `agent3`.
  // However, `framework3` is punished due to the naive H-DRF implementation,
  // where parent role `b` has a fair share of 100%, which
  // leads to starvation.
  {
    Allocation expected = Allocation(
        framework3.id(),
        {{ROLE3, {{agent3.id(), agent3.resources()}}}});

    AWAIT_EXPECT_EQ(expected, allocations.get()); // It fails!
  }
}
{{code}}

This JIRA is created to make sure this behavior is captured and will be 
addressed in the future. Note that it affects the current implementation without 
hierarchical roles as well.
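
The share numbers in the test comments can be reproduced with a small standalone C++ sketch. It is not the Mesos sorter: `Resources`, `dominantShare` and `sum` are hypothetical helpers, and the parent's allocation is assumed to be the plain sum of its children's allocations, which is the naive aggregation this ticket refers to.

```cpp
#include <algorithm>
#include <map>
#include <string>

// Hypothetical scalar resource vector, e.g. {"cpus": 1, "mem": 32}.
using Resources = std::map<std::string, double>;

// Dominant share: the largest fraction of any single resource in
// `total` that is held by `allocated`.
double dominantShare(const Resources& allocated, const Resources& total)
{
  double share = 0.0;
  for (const auto& entry : allocated) {
    const auto t = total.find(entry.first);
    if (t != total.end() && t->second > 0.0) {
      share = std::max(share, entry.second / t->second);
    }
  }
  return share;
}

// Naive aggregation: a parent's allocation is the sum of the
// allocations of its children.
Resources sum(const Resources& left, const Resources& right)
{
  Resources result = left;
  for (const auto& entry : right) {
    result[entry.first] += entry.second;
  }
  return result;
}
```

With a cluster total of 2 cpus and 32 mem, role `a` holds 1 cpu (share 0.5), `b/c` holds all 32 mem (share 1.0) and `b/d` holds nothing (share 0.0). Summing the children gives parent `b` a share of 1.0, so `a` sorts ahead of `b` at the top level and the new cpu does not reach `framework3`.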





[jira] [Commented] (MESOS-4732) Migrate rest of the endpoints to use `jsonify`

2017-04-04 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15956202#comment-15956202
 ] 

Jay Guo commented on MESOS-4732:


RR: https://reviews.apache.org/r/58095/

> Migrate rest of the endpoints to use `jsonify`
> --
>
> Key: MESOS-4732
> URL: https://issues.apache.org/jira/browse/MESOS-4732
> Project: Mesos
>  Issue Type: Task
>  Components: master
>Reporter: Michael Park
>  Labels: tech-debt
>
> As MVP, we shipped `/state` and `/state-summary` to use `jsonify`. We need to 
> follow through with the migration of the rest of the endpoints to use 
> `jsonify` as well.





[jira] [Updated] (MESOS-6995) Update the webui to reflect hierarchical roles.

2017-03-23 Thread Jay Guo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo updated MESOS-6995:
---
Description: 
It may not need any changes, but we should confirm that the new role format for 
hierarchical roles is correctly displayed in the webui.

In addition, we can add a roles tab that shows the summary information (shares, 
weights, quotas). For now, we don't need to make any of this clickable (e.g. to 
see the tasks / frameworks under the role).

This work also requires role-related information to be gathered into {{/roles}} 
endpoint of master, i.e. {{quota}}. Ideally, we should have a unified 
bookkeeping of {{roles}}/{{weight}}/{{quotas}}, see related issue of this 
ticket.

  was:
It may not need any changes, but we should confirm that the new role format for 
hierarchical roles is correctly displayed in the webui.

In addition, we can add a roles tab that shows the summary information (shares, 
weights, quotas). For now, we don't need to make any of this clickable (e.g. to 
see the tasks / frameworks under the role).


> Update the webui to reflect hierarchical roles.
> ---
>
> Key: MESOS-6995
> URL: https://issues.apache.org/jira/browse/MESOS-6995
> Project: Mesos
>  Issue Type: Task
>  Components: webui
>Reporter: Benjamin Mahler
>Assignee: Jay Guo
>
> It may not need any changes, but we should confirm that the new role format 
> for hierarchical roles is correctly displayed in the webui.
> In addition, we can add a roles tab that shows the summary information 
> (shares, weights, quotas). For now, we don't need to make any of this 
> clickable (e.g. to see the tasks / frameworks under the role).
> This work also requires role-related information, namely {{quota}}, to be 
> gathered into the {{/roles}} endpoint of the master. Ideally, we should have 
> unified bookkeeping of {{roles}}/{{weight}}/{{quotas}}; see the issue related 
> to this ticket.





[jira] [Created] (MESOS-7293) Refactor quota and weight to be stored and managed by roles

2017-03-22 Thread Jay Guo (JIRA)
Jay Guo created MESOS-7293:
--

 Summary: Refactor quota and weight to be stored and managed by 
roles
 Key: MESOS-7293
 URL: https://issues.apache.org/jira/browse/MESOS-7293
 Project: Mesos
  Issue Type: Bug
  Components: master
Reporter: Jay Guo


Currently {{weight}} and {{quota}} are stored in the master as hashmaps 
separate from {{roles}}, even though they are actually attributes of roles. 
Having them stored and managed by {{Role}} would make it easier to fetch 
aggregated role-related information, and the logic would be clearer. It could 
potentially make authZ simpler as well.
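
A minimal sketch of the proposed layout, written in Python for brevity (the master itself is C++). The {{Role}} class below and its field names are hypothetical and are not existing Mesos types; the only point is that weight and quota live on the role instead of in parallel maps, so aggregated information can be read in one pass.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Role:
    """Hypothetical role record that owns its weight and quota."""
    name: str
    weight: float = 1.0                        # assumed default weight
    quota: Optional[Dict[str, float]] = None   # e.g. {"cpus": 4.0}; None means no quota
    frameworks: Set[str] = field(default_factory=set)


def summarize(roles: Dict[str, Role]) -> list:
    """Aggregate role-related information from a single map."""
    return [
        {"name": r.name, "weight": r.weight, "quota": r.quota,
         "frameworks": sorted(r.frameworks)}
        for r in sorted(roles.values(), key=lambda r: r.name)
    ]


roles = {
    "eng": Role("eng", weight=2.0, quota={"cpus": 4.0}, frameworks={"fw1"}),
    "ops": Role("ops"),
}
summary = summarize(roles)
```

With separate hashmaps, the same summary would need three lookups per role, each of which may or may not have an entry.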





[jira] [Assigned] (MESOS-7260) Authorization for `/role` endpoint should take both VIEW_ROLES and VIEW_FRAMEWORKS into account.

2017-03-20 Thread Jay Guo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo reassigned MESOS-7260:
--

Assignee: Jay Guo

> Authorization for `/role` endpoint should take both VIEW_ROLES and 
> VIEW_FRAMEWORKS into account.
> 
>
> Key: MESOS-7260
> URL: https://issues.apache.org/jira/browse/MESOS-7260
> Project: Mesos
>  Issue Type: Bug
>  Components: HTTP API, master
>Reporter: Jay Guo
>Assignee: Jay Guo
>
> Consider the following case: both {{framework1}} and {{framework2}} subscribe 
> to {{roleX}}, and {{principal}} is allowed to view {{roleX}} and 
> {{framework1}} but *NOT* {{framework2}}. The {{/role}} endpoint should 
> therefore contain only {{framework1}}, not both frameworks.





[jira] [Created] (MESOS-7260) Authorization for {{/role}} endpoint should take both {{view_roles}} and {{view_frameworks}} into account.

2017-03-17 Thread Jay Guo (JIRA)
Jay Guo created MESOS-7260:
--

 Summary: Authorization for {{/role}} endpoint should take both 
{{view_roles}} and {{view_frameworks}} into account.
 Key: MESOS-7260
 URL: https://issues.apache.org/jira/browse/MESOS-7260
 Project: Mesos
  Issue Type: Bug
  Components: HTTP API, master
Reporter: Jay Guo


Consider the following case: both {{framework1}} and {{framework2}} subscribe 
to {{roleX}}, and {{principal}} is allowed to view {{roleX}} and 
{{ framework1}} but *NOT* {{framework2}}. The {{/role}} endpoint should 
therefore contain only {{framework1}}, not both frameworks.
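
The case above can be sketched as a two-step filter. This is an illustration in Python, not the master's code: {{can_view_role}} and {{can_view_framework}} are hypothetical predicates standing in for the two authorization checks, and the data layout is assumed.

```python
def visible_frameworks(role, subscriptions, can_view_role, can_view_framework):
    """Return the frameworks under `role` that the principal may see.

    `subscriptions` maps a role name to the list of subscribed framework names.
    The two predicates are hypothetical stand-ins for the role-level and
    framework-level authorization checks.
    """
    if not can_view_role(role):
        return None  # the role itself is hidden from this principal
    return [fw for fw in subscriptions.get(role, []) if can_view_framework(fw)]


subscriptions = {"roleX": ["framework1", "framework2"]}
result = visible_frameworks(
    "roleX",
    subscriptions,
    can_view_role=lambda r: r == "roleX",
    can_view_framework=lambda fw: fw == "framework1",
)
```

Checking only the role-level permission would return both frameworks here, which is the behaviour this ticket reports as a bug.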





[jira] [Updated] (MESOS-7260) Authorization for `/role` endpoint should take both VIEW_ROLES and VIEW_FRAMEWORKS into account.

2017-03-17 Thread Jay Guo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo updated MESOS-7260:
---
Description: Consider following case: both {{framework1}} and 
{{framework2}} subscribe to {{roleX}}, {{principal}} is allowed to view 
{{roleX}} and {{framework1}}, but *NOT* {{framework2}}, therefore, {{/role}} 
endpoint should only contain {{framework1}}, but not both frameworks.  (was: 
Consider following case: both {{framework1}} and {{framework2}} subscribe to 
{{roleX}}, {{principal}} is allowed to view {{roleX}} and {{ framework1}}, but 
*NOT* {{framework2}}, therefore, {{/role}} endpoint should only contain 
{{framework1}}, but not both frameworks.)

> Authorization for `/role` endpoint should take both VIEW_ROLES and 
> VIEW_FRAMEWORKS into account.
> 
>
> Key: MESOS-7260
> URL: https://issues.apache.org/jira/browse/MESOS-7260
> Project: Mesos
>  Issue Type: Bug
>  Components: HTTP API, master
>Reporter: Jay Guo
>
> Consider the following case: both {{framework1}} and {{framework2}} subscribe 
> to {{roleX}}, and {{principal}} is allowed to view {{roleX}} and 
> {{framework1}} but *NOT* {{framework2}}. The {{/role}} endpoint should 
> therefore contain only {{framework1}}, not both frameworks.





[jira] [Updated] (MESOS-7260) Authorization for `/role` endpoint should take both VIEW_ROLES and VIEW_FRAMEWORKS into account.

2017-03-17 Thread Jay Guo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo updated MESOS-7260:
---
Summary: Authorization for `/role` endpoint should take both VIEW_ROLES and 
VIEW_FRAMEWORKS into account.  (was: Authorization for {{/role}} endpoint 
should take both {{view_roles}} and {{view_frameworks}} into account.)

> Authorization for `/role` endpoint should take both VIEW_ROLES and 
> VIEW_FRAMEWORKS into account.
> 
>
> Key: MESOS-7260
> URL: https://issues.apache.org/jira/browse/MESOS-7260
> Project: Mesos
>  Issue Type: Bug
>  Components: HTTP API, master
>Reporter: Jay Guo
>
> Consider the following case: both {{framework1}} and {{framework2}} subscribe 
> to {{roleX}}, and {{principal}} is allowed to view {{roleX}} and 
> {{ framework1}} but *NOT* {{framework2}}. The {{/role}} endpoint should 
> therefore contain only {{framework1}}, not both frameworks.





[jira] [Commented] (MESOS-6995) Update the webui to reflect hierarchical roles.

2017-03-16 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15927621#comment-15927621
 ] 

Jay Guo commented on MESOS-6995:


[~bmahler] You mean injecting {{/quota}} into {{/roles}} endpoint?

> Update the webui to reflect hierarchical roles.
> ---
>
> Key: MESOS-6995
> URL: https://issues.apache.org/jira/browse/MESOS-6995
> Project: Mesos
>  Issue Type: Task
>  Components: webui
>Reporter: Benjamin Mahler
>Assignee: Jay Guo
>
> It may not need any changes, but we should confirm that the new role format 
> for hierarchical roles is correctly displayed in the webui.
> In addition, we can add a roles tab that shows the summary information 
> (shares, weights, quotas). For now, we don't need to make any of this 
> clickable (e.g. to see the tasks / frameworks under the role).





[jira] [Assigned] (MESOS-7029) FaultToleranceTest.FrameworkReregister is flaky

2017-03-06 Thread Jay Guo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo reassigned MESOS-7029:
--

Assignee: Jay Guo

> FaultToleranceTest.FrameworkReregister is flaky
> ---
>
> Key: MESOS-7029
> URL: https://issues.apache.org/jira/browse/MESOS-7029
> Project: Mesos
>  Issue Type: Bug
>  Components: test, tests
> Environment: ASF CI, cmake, gcc, Ubuntu 14.04, libevent/SSL enabled
>Reporter: Greg Mann
>Assignee: Jay Guo
>  Labels: flaky, flaky-test
> Attachments: FaultToleranceTest.FrameworkReregister.txt
>
>
> This was observed on ASF CI:
> {code}
> /mesos/src/tests/fault_tolerance_tests.cpp:903: Failure
> The difference between registerTime.secs() and 
> framework.values["registered_time"].as().as() is 
> 1.0100052356719971, which exceeds 1, where
> registerTime.secs() evaluates to 1485732879.7673652,
> framework.values["registered_time"].as().as() evaluates 
> to 1485732878.75736, and
> 1 evaluates to 1.
> {code}
> Find the full log attached.





[jira] [Commented] (MESOS-7029) FaultToleranceTest.FrameworkReregister is flaky

2017-03-06 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15898840#comment-15898840
 ] 

Jay Guo commented on MESOS-7029:


RR: https://reviews.apache.org/r/57364/

> FaultToleranceTest.FrameworkReregister is flaky
> ---
>
> Key: MESOS-7029
> URL: https://issues.apache.org/jira/browse/MESOS-7029
> Project: Mesos
>  Issue Type: Bug
>  Components: test, tests
> Environment: ASF CI, cmake, gcc, Ubuntu 14.04, libevent/SSL enabled
>Reporter: Greg Mann
>  Labels: flaky, flaky-test
> Attachments: FaultToleranceTest.FrameworkReregister.txt
>
>
> This was observed on ASF CI:
> {code}
> /mesos/src/tests/fault_tolerance_tests.cpp:903: Failure
> The difference between registerTime.secs() and 
> framework.values["registered_time"].as().as() is 
> 1.0100052356719971, which exceeds 1, where
> registerTime.secs() evaluates to 1485732879.7673652,
> framework.values["registered_time"].as().as() evaluates 
> to 1485732878.75736, and
> 1 evaluates to 1.
> {code}
> Find the full log attached.





[jira] [Commented] (MESOS-7029) FaultToleranceTest.FrameworkReregister is flaky

2017-03-06 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15898835#comment-15898835
 ] 

Jay Guo commented on MESOS-7029:


[~neilc] I think it is due to the intentional delays here: 
https://github.com/apache/mesos/blob/master/src/tests/fault_tolerance_tests.cpp#L824-L826
 Their sum may exceed 1 second.
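
The numbers in the failure message show how small the margin is. A quick check in Python, with both timestamps copied from the quoted log and the 1-second tolerance taken from the assertion text (the variable names are mine):

```python
# Values copied from the failure message quoted in this ticket.
register_time = 1485732879.7673652
registered_time_in_state = 1485732878.75736
tolerance = 1.0  # seconds, the bound reported by the failing assertion

difference = abs(register_time - registered_time_in_state)
exceeds = difference > tolerance
```

The difference is roughly 1.01 seconds, so the check misses the bound by about 10 ms.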

> FaultToleranceTest.FrameworkReregister is flaky
> ---
>
> Key: MESOS-7029
> URL: https://issues.apache.org/jira/browse/MESOS-7029
> Project: Mesos
>  Issue Type: Bug
>  Components: test, tests
> Environment: ASF CI, cmake, gcc, Ubuntu 14.04, libevent/SSL enabled
>Reporter: Greg Mann
>  Labels: flaky, flaky-test
> Attachments: FaultToleranceTest.FrameworkReregister.txt
>
>
> This was observed on ASF CI:
> {code}
> /mesos/src/tests/fault_tolerance_tests.cpp:903: Failure
> The difference between registerTime.secs() and 
> framework.values["registered_time"].as().as() is 
> 1.0100052356719971, which exceeds 1, where
> registerTime.secs() evaluates to 1485732879.7673652,
> framework.values["registered_time"].as().as() evaluates 
> to 1485732878.75736, and
> 1 evaluates to 1.
> {code}
> Find the full log attached.





[jira] [Assigned] (MESOS-7149) Support reservations for role subtrees

2017-03-06 Thread Jay Guo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo reassigned MESOS-7149:
--

Assignee: Jay Guo

> Support reservations for role subtrees
> --
>
> Key: MESOS-7149
> URL: https://issues.apache.org/jira/browse/MESOS-7149
> Project: Mesos
>  Issue Type: Task
>  Components: master
>Reporter: Neil Conway
>Assignee: Jay Guo
>  Labels: mesosphere
>
> When a reservation is made for a role path {{x}}, the reserved resource 
> should be offered to all frameworks registered in {{x}} _or any nested role 
> in the sub-tree under x_. For example, if a reservation is made for {{eng}}, 
> the reserved resource should be a candidate to appear in resource offers to 
> frameworks in any of the roles {{eng}}, {{eng/dev}}, and {{eng/prod}}.
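
The subtree rule in the description can be sketched as a path-prefix test. This is a Python illustration only; {{in_subtree}} is a hypothetical helper name, and it assumes role paths use {{/}} as the separator, as in the {{eng/dev}} example.

```python
def in_subtree(role: str, reserved_for: str) -> bool:
    """True if `role` is `reserved_for` itself or nested anywhere beneath it."""
    # Appending "/" avoids treating "engineering" as a child of "eng".
    return role == reserved_for or role.startswith(reserved_for + "/")


candidates = [r for r in ["eng", "eng/dev", "eng/prod", "engineering", "ops"]
              if in_subtree(r, "eng")]
```

A plain string-prefix comparison without the separator would wrongly match sibling roles that merely share leading characters.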





[jira] [Updated] (MESOS-1806) Etcd-based master contender/detector module

2017-03-04 Thread Jay Guo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-1806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo updated MESOS-1806:
---

Thanks will, I had patches for etcd but need to revive that thread... will
take a look at the reference you pointed out, appreciated.



> Etcd-based master contender/detector module
> ---
>
> Key: MESOS-1806
> URL: https://issues.apache.org/jira/browse/MESOS-1806
> Project: Mesos
>  Issue Type: Epic
>  Components: leader election
>Reporter: Ed Ropple
>Assignee: Shuai Lin
>Priority: Minor
>
>eropple: Could you also file a new JIRA for Mesos to drop ZK 
> in favor of etcd or ReplicatedLog? Would love to get some momentum going on 
> that one.
> --
> Consider it filed. =)





[jira] [Commented] (MESOS-6995) Update the webui to reflect hierarchical roles.

2017-03-03 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893914#comment-15893914
 ] 

Jay Guo commented on MESOS-6995:


To clarify, this includes two tasks:
1. Reflect hierarchical roles where {{Roles}} already appear, i.e. the 
frameworks tab and the agents tab. We have two options here: either a chain, 
e.g. {{eng/frontend/dev}}, or a tree:
eng
├── frontend
│   ├── *dev*
│   └── prod
└── backend
    ├── dev
    └── prod
I'm inclined to use the chain for tasks/executors and the tree for frameworks.

2. Add a new tab {{Roles}} at top-level.

Am I missing something?
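
For reference, the two display options are easy to convert between. The following Python sketch builds the tree form from chain-style paths; it is only an illustration of the idea (the webui is JavaScript), and {{build_tree}} and {{render}} are hypothetical helper names.

```python
def build_tree(role_paths):
    """Turn chain-style paths such as "eng/frontend/dev" into nested dicts."""
    tree = {}
    for path in role_paths:
        node = tree
        for part in path.split("/"):
            node = node.setdefault(part, {})
    return tree


def render(tree, prefix=""):
    """Render nested dicts as lines using box-drawing characters."""
    lines = []
    names = list(tree)
    for i, name in enumerate(names):
        last = i == len(names) - 1
        lines.append(prefix + ("└── " if last else "├── ") + name)
        # Children of the last entry no longer need the vertical guide.
        lines.extend(render(tree[name], prefix + ("    " if last else "│   ")))
    return lines


paths = ["eng/frontend/dev", "eng/frontend/prod",
         "eng/backend/dev", "eng/backend/prod"]
tree = build_tree(paths)
lines = ["eng"] + render(tree["eng"])
```

Joining {{lines}} with newlines gives the tree layout for the four roles, while the input list is already the chain layout.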

> Update the webui to reflect hierarchical roles.
> ---
>
> Key: MESOS-6995
> URL: https://issues.apache.org/jira/browse/MESOS-6995
> Project: Mesos
>  Issue Type: Task
>  Components: webui
>Reporter: Benjamin Mahler
>Assignee: Jay Guo
>
> It may not need any changes, but we should confirm that the new role format 
> for hierarchical roles is correctly displayed in the webui.
> In addition, we can add a roles tab that shows the summary information 
> (shares, weights, quotas). For now, we don't need to make any of this 
> clickable (e.g. to see the tasks / frameworks under the role).





[jira] [Assigned] (MESOS-6995) Update the webui to reflect hierarchical roles.

2017-03-03 Thread Jay Guo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo reassigned MESOS-6995:
--

Assignee: Jay Guo

> Update the webui to reflect hierarchical roles.
> ---
>
> Key: MESOS-6995
> URL: https://issues.apache.org/jira/browse/MESOS-6995
> Project: Mesos
>  Issue Type: Task
>  Components: webui
>Reporter: Benjamin Mahler
>Assignee: Jay Guo
>
> It may not need any changes, but we should confirm that the new role format 
> for hierarchical roles is correctly displayed in the webui.
> In addition, we can add a roles tab that shows the summary information 
> (shares, weights, quotas). For now, we don't need to make any of this 
> clickable (e.g. to see the tasks / frameworks under the role).





[jira] [Assigned] (MESOS-7048) Remove adjustment code within Resources::apply.

2017-03-02 Thread Jay Guo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo reassigned MESOS-7048:
--

Assignee: Jay Guo

> Remove adjustment code within Resources::apply.
> ---
>
> Key: MESOS-7048
> URL: https://issues.apache.org/jira/browse/MESOS-7048
> Project: Mesos
>  Issue Type: Task
>  Components: technical debt
>Reporter: Benjamin Mahler
>Assignee: Jay Guo
>
> Currently, {{Resources::apply()}} will strip allocation info from operation's 
> resources in order to have operations apply correctly to unallocated 
> resources. To make this more explicit, we should move the stripping up into 
> the call sites that need it. We'll need a helper to do this.
> As a result, the master and allocator will need to strip prior to applying 
> operations to the agent's total resources (which are stored as unallocated 
> resources).
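
A rough sketch of what stripping at the call site could look like, in Python with plain dicts rather than the real {{Resources}} class. {{strip_allocation_info}} is a hypothetical name for the helper the description asks for, and the dict layout is assumed.

```python
import copy


def strip_allocation_info(resources):
    """Hypothetical helper: return copies of `resources` without allocation info."""
    stripped = copy.deepcopy(resources)
    for resource in stripped:
        resource.pop("allocation_info", None)
    return stripped


# Resources named in an operation carry allocation info...
operation_resources = [
    {"name": "cpus", "value": 2.0, "allocation_info": {"role": "eng"}},
]
# ...while the agent's total resources are stored unallocated.
agent_total = [{"name": "cpus", "value": 2.0}]

# The caller strips explicitly before matching against the total.
stripped = strip_allocation_info(operation_resources)
matches_total = all(r in agent_total for r in stripped)
```

Doing this in the caller makes it visible which call sites compare against unallocated resources, instead of hiding the adjustment inside the apply step.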





[jira] [Updated] (MESOS-7182) Couple of MULTI_ROLE related tests are flaky

2017-03-01 Thread Jay Guo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo updated MESOS-7182:
---
Summary: Couple of MULTI_ROLE related tests are flaky  (was: 
MasterTest.MultiRoleFrameworkReceivesOffers is flaky)

> Couple of MULTI_ROLE related tests are flaky
> 
>
> Key: MESOS-7182
> URL: https://issues.apache.org/jira/browse/MESOS-7182
> Project: Mesos
>  Issue Type: Bug
>Reporter: Neil Conway
>Assignee: Jay Guo
>  Labels: flaky, mesosphere
> Attachments: test_fail.log
>
>
> {noformat}
> [ RUN  ] MasterTest.MultiRoleFrameworkReceivesOffers
> ../../mesos/src/tests/master_tests.cpp:6576: Failure
> Failed to wait 15secs for offers2
> ../../mesos/src/tests/master_tests.cpp:6564: Failure
> Actual function call count doesn't match EXPECT_CALL(sched, 
> resourceOffers(, _))...
>  Expected: to be called at least twice
>Actual: called once - unsatisfied and active
> [  FAILED  ] MasterTest.MultiRoleFrameworkReceivesOffers (15065 ms)
> {noformat}
> Verbose test log attached.





[jira] [Updated] (MESOS-7182) Couple of MULTI_ROLE related tests are flaky

2017-03-01 Thread Jay Guo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo updated MESOS-7182:
---
Description: 
Failed tests are:
MasterTest.MultiRoleSchedulerUpgrade
UpgradeTest.ReregisterOldAgentWithMultiRoleMaster
MasterTest.MultiRoleFrameworkReceivesOffers

{noformat}
[ RUN  ] MasterTest.MultiRoleFrameworkReceivesOffers
../../mesos/src/tests/master_tests.cpp:6576: Failure
Failed to wait 15secs for offers2
../../mesos/src/tests/master_tests.cpp:6564: Failure
Actual function call count doesn't match EXPECT_CALL(sched, 
resourceOffers(, _))...
 Expected: to be called at least twice
   Actual: called once - unsatisfied and active
[  FAILED  ] MasterTest.MultiRoleFrameworkReceivesOffers (15065 ms)
{noformat}

Verbose test log attached.

  was:
{noformat}
[ RUN  ] MasterTest.MultiRoleFrameworkReceivesOffers
../../mesos/src/tests/master_tests.cpp:6576: Failure
Failed to wait 15secs for offers2
../../mesos/src/tests/master_tests.cpp:6564: Failure
Actual function call count doesn't match EXPECT_CALL(sched, 
resourceOffers(, _))...
 Expected: to be called at least twice
   Actual: called once - unsatisfied and active
[  FAILED  ] MasterTest.MultiRoleFrameworkReceivesOffers (15065 ms)
{noformat}

Verbose test log attached.


> Couple of MULTI_ROLE related tests are flaky
> 
>
> Key: MESOS-7182
> URL: https://issues.apache.org/jira/browse/MESOS-7182
> Project: Mesos
>  Issue Type: Bug
>Reporter: Neil Conway
>Assignee: Jay Guo
>  Labels: flaky, mesosphere
> Attachments: test_fail.log
>
>
> Failed tests are:
> MasterTest.MultiRoleSchedulerUpgrade
> UpgradeTest.ReregisterOldAgentWithMultiRoleMaster
> MasterTest.MultiRoleFrameworkReceivesOffers
> {noformat}
> [ RUN  ] MasterTest.MultiRoleFrameworkReceivesOffers
> ../../mesos/src/tests/master_tests.cpp:6576: Failure
> Failed to wait 15secs for offers2
> ../../mesos/src/tests/master_tests.cpp:6564: Failure
> Actual function call count doesn't match EXPECT_CALL(sched, 
> resourceOffers(, _))...
>  Expected: to be called at least twice
>Actual: called once - unsatisfied and active
> [  FAILED  ] MasterTest.MultiRoleFrameworkReceivesOffers (15065 ms)
> {noformat}
> Verbose test log attached.





[jira] [Assigned] (MESOS-7182) MasterTest.MultiRoleFrameworkReceivesOffers is flaky

2017-03-01 Thread Jay Guo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo reassigned MESOS-7182:
--

Assignee: Jay Guo

> MasterTest.MultiRoleFrameworkReceivesOffers is flaky
> 
>
> Key: MESOS-7182
> URL: https://issues.apache.org/jira/browse/MESOS-7182
> Project: Mesos
>  Issue Type: Bug
>Reporter: Neil Conway
>Assignee: Jay Guo
>  Labels: flaky, mesosphere
> Attachments: test_fail.log
>
>
> {noformat}
> [ RUN  ] MasterTest.MultiRoleFrameworkReceivesOffers
> ../../mesos/src/tests/master_tests.cpp:6576: Failure
> Failed to wait 15secs for offers2
> ../../mesos/src/tests/master_tests.cpp:6564: Failure
> Actual function call count doesn't match EXPECT_CALL(sched, 
> resourceOffers(, _))...
>  Expected: to be called at least twice
>Actual: called once - unsatisfied and active
> [  FAILED  ] MasterTest.MultiRoleFrameworkReceivesOffers (15065 ms)
> {noformat}
> Verbose test log attached.





[jira] [Created] (MESOS-7165) Agents should be able to upgrade to be MULTI_ROLE capable

2017-02-24 Thread Jay Guo (JIRA)
Jay Guo created MESOS-7165:
--

 Summary: Agents should be able to upgrade to be MULTI_ROLE capable
 Key: MESOS-7165
 URL: https://issues.apache.org/jira/browse/MESOS-7165
 Project: Mesos
  Issue Type: Bug
Reporter: Jay Guo
Assignee: Jay Guo


If agent capabilities change upon re-registration, the allocator should be 
notified of the change so that it can allocate correctly. For example, when 
an agent is upgraded to be MULTI_ROLE capable, the allocator should be updated 
so that it allocates resources of that agent to MULTI_ROLE frameworks.
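
A toy model of the expected behaviour, in Python. The {{Allocator}} and {{Master}} classes and their methods are hypothetical and much simpler than the real ones; the sketch also assumes that a MULTI_ROLE framework is only offered resources from MULTI_ROLE capable agents.

```python
class Allocator:
    """Toy allocator that tracks per-agent capabilities (hypothetical API)."""

    def __init__(self):
        self.capabilities = {}

    def update_agent(self, agent_id, capabilities):
        self.capabilities[agent_id] = set(capabilities)

    def can_offer(self, agent_id, framework_is_multi_role):
        # Assumption for this sketch: MULTI_ROLE frameworks only receive
        # resources from agents that are themselves MULTI_ROLE capable.
        if not framework_is_multi_role:
            return True
        return "MULTI_ROLE" in self.capabilities.get(agent_id, set())


class Master:
    """Toy master that forwards capability changes to the allocator."""

    def __init__(self, allocator):
        self.allocator = allocator
        self.agents = {}

    def reregister_agent(self, agent_id, capabilities):
        changed = self.agents.get(agent_id) != set(capabilities)
        self.agents[agent_id] = set(capabilities)
        if changed:
            # Notify the allocator so its view does not go stale.
            self.allocator.update_agent(agent_id, capabilities)


allocator = Allocator()
master = Master(allocator)
master.reregister_agent("agent1", [])              # agent before the upgrade
before = allocator.can_offer("agent1", True)
master.reregister_agent("agent1", ["MULTI_ROLE"])  # agent after the upgrade
after = allocator.can_offer("agent1", True)
```

Without the notification in {{reregister_agent}}, the allocator would keep the capabilities it saw at first registration and never offer the upgraded agent to MULTI_ROLE frameworks.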





[jira] [Created] (MESOS-7162) Update long-running-framework to handle MULTI_ROLE support

2017-02-23 Thread Jay Guo (JIRA)
Jay Guo created MESOS-7162:
--

 Summary: Update long-running-framework to handle MULTI_ROLE support
 Key: MESOS-7162
 URL: https://issues.apache.org/jira/browse/MESOS-7162
 Project: Mesos
  Issue Type: Bug
Reporter: Jay Guo








[jira] [Updated] (MESOS-7158) Add `role` to task/executor to indicate allocation role of their resources

2017-02-22 Thread Jay Guo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo updated MESOS-7158:
---
Description: As we added {{allocation_info}} to {{resource}}, the v1 API 
inherently shows that information when {{getState}} is invoked. However, we 
need to explicitly add {{allocation_info}} to the {{/state}} v0 API. Hence, we 
need to add a {{role}} section to {{task}}/{{executor}}.

> Add `role` to task/executor to indicate allocation role of their resources
> --
>
> Key: MESOS-7158
> URL: https://issues.apache.org/jira/browse/MESOS-7158
> Project: Mesos
>  Issue Type: Bug
>Reporter: Jay Guo
>Assignee: Jay Guo
>
> As we added {{allocation_info}} to {{resource}}, the v1 API inherently shows 
> that information when {{getState}} is invoked. However, we need to 
> explicitly add {{allocation_info}} to the {{/state}} v0 API. Hence, we need 
> to add a {{role}} section to {{task}}/{{executor}}.





[jira] [Created] (MESOS-7158) Add `role` to task/executor to indicate allocation role of their resources

2017-02-22 Thread Jay Guo (JIRA)
Jay Guo created MESOS-7158:
--

 Summary: Add `role` to task/executor to indicate allocation role 
of their resources
 Key: MESOS-7158
 URL: https://issues.apache.org/jira/browse/MESOS-7158
 Project: Mesos
  Issue Type: Bug
Reporter: Jay Guo
Assignee: Jay Guo








[jira] [Issue Comment Deleted] (MESOS-6657) Update the webui to reflect that frameworks have multiple roles.

2017-02-21 Thread Jay Guo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo updated MESOS-6657:
---
Comment: was deleted

(was: Paused progress as we want multi-role and hierarchical roles 
functionalities to be in place first before making WebUI changes.)

> Update the webui to reflect that frameworks have multiple roles.
> 
>
> Key: MESOS-6657
> URL: https://issues.apache.org/jira/browse/MESOS-6657
> Project: Mesos
>  Issue Type: Task
>  Components: webui
>Reporter: Benjamin Mahler
>Assignee: Jay Guo
>
> With the support for multi-role frameworks, the webui will need to be updated 
> to reflect that frameworks can now have more than a single role (as well as 0 
> roles).
> Details about how we should best do this are TBD and will be added to this 
> ticket.
> Work items:
> (1) Show the roles of the framework in the framework tables. Now we'll 
> incorrectly show *. (medium: how to handle a lot of roles, show the number of 
> roles? is there some way to easily see the full list?)
> (2) Show the role of the offer within the Offers table (easy)
> (3) Show the allocation role of tasks / executors (medium: just look at first 
> allocation info, not sure if we need to handle the case where there is no 
> allocation info, since the ui will be running against a new master. The ui 
> can run against old agent though, if we want to handle that then we need to 
> use the framework's role when no allocation info is present in task/executor 
> resources).





[jira] [Commented] (MESOS-6902) Add support for agent capabilities

2017-02-13 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15865250#comment-15865250
 ] 

Jay Guo commented on MESOS-6902:


RR:
https://reviews.apache.org/r/56644/ - Add a member variable {{capabilities}} to 
slave
https://reviews.apache.org/r/56645/ - Add capabilities to {{/state}} endpoint 
of slave

> Add support for agent capabilities
> --
>
> Key: MESOS-6902
> URL: https://issues.apache.org/jira/browse/MESOS-6902
> Project: Mesos
>  Issue Type: Improvement
>  Components: agent
>Reporter: Neil Conway
>Assignee: Jay Guo
>  Labels: mesosphere
>
> Similarly to how we might add support for master capabilities (MESOS-5675), 
> agent capabilities would also make sense: in a mixed cluster, the master 
> might have support for features that are not present on certain agents, and 
> vice versa.





[jira] [Commented] (MESOS-6902) Add support for agent capabilities

2017-02-13 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15865099#comment-15865099
 ] 

Jay Guo commented on MESOS-6902:


[~bmahler] Will do. Should we expose {{Capabilities}} along with other sections 
in {{GetState}} of agent? Currently {{GetTasks}}, {{GetExecutors}} and 
{{GetFrameworks}} in {{GetState}} are all guarded by their own approver. If we 
put {{Capabilities}} in parallel with them, should we create a {{capability 
approver}} for viewing agent capabilities?

> Add support for agent capabilities
> --
>
> Key: MESOS-6902
> URL: https://issues.apache.org/jira/browse/MESOS-6902
> Project: Mesos
>  Issue Type: Improvement
>  Components: agent
>Reporter: Neil Conway
>Assignee: Jay Guo
>  Labels: mesosphere
>
> Similarly to how we might add support for master capabilities (MESOS-5675), 
> agent capabilities would also make sense: in a mixed cluster, the master 
> might have support for features that are not present on certain agents, and 
> vice versa.





[jira] [Updated] (MESOS-6657) Update the webui to reflect that frameworks have multiple roles.

2017-02-13 Thread Jay Guo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo updated MESOS-6657:
---
Description: 
With the support for multi-role frameworks, the webui will need to be updated 
to reflect that frameworks can now have more than a single role (as well as 0 
roles).

Details about how we should best do this are TBD and will be added to this 
ticket.

Work items:
(1) Show the roles of the framework in the framework tables. Now we'll 
incorrectly show *. (medium: how to handle a lot of roles, show the number of 
roles? is there some way to easily see the full list?)
(2) Show the role of the offer within the Offers table (easy)
(3) Show the allocation role of tasks / executors (medium: just look at first 
allocation info, not sure if we need to handle the case where there is no 
allocation info, since the ui will be running against a new master. The ui can 
run against old agent though, if we want to handle that then we need to use the 
framework's role when no allocation info is present in task/executor resources).
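
Work item (3) could be sketched as below. This is Python for illustration only (the webui is JavaScript); {{allocation_role}} is a hypothetical helper, and the dict shapes for tasks and frameworks are assumptions rather than the actual {{/state}} layout.

```python
def allocation_role(task, framework):
    """Role to display for a task or executor.

    Uses the first allocation info found in the resources; if none is
    present (e.g. data from an old agent), falls back to the framework's role.
    """
    for resource in task.get("resources", []):
        info = resource.get("allocation_info")
        if info and "role" in info:
            return info["role"]
    return framework.get("role")


framework = {"name": "fw1", "role": "eng"}
new_task = {"resources": [{"name": "cpus", "allocation_info": {"role": "eng/dev"}}]}
old_task = {"resources": [{"name": "cpus"}]}

shown_new = allocation_role(new_task, framework)
shown_old = allocation_role(old_task, framework)
```

The fallback branch is only needed if the ui has to work against old agents; otherwise the loop alone is enough.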

  was:
With the support for multi-role frameworks, the webui will need to be updated 
to reflect that frameworks can now have more than a single role (as well as 0 
roles).

Details about how we should best do this are TBD and will be added to this 
ticket.


> Update the webui to reflect that frameworks have multiple roles.
> 
>
> Key: MESOS-6657
> URL: https://issues.apache.org/jira/browse/MESOS-6657
> Project: Mesos
>  Issue Type: Task
>  Components: webui
>Reporter: Benjamin Mahler
>Assignee: Jay Guo
>
> With the support for multi-role frameworks, the webui will need to be updated 
> to reflect that frameworks can now have more than a single role (as well as 0 
> roles).
> Details about how we should best do this are TBD and will be added to this 
> ticket.
> Work items:
> (1) Show the roles of the framework in the framework tables. Now we'll 
> incorrectly show *. (medium: how to handle a lot of roles, show the number of 
> roles? is there some way to easily see the full list?)
> (2) Show the role of the offer within the Offers table (easy)
> (3) Show the allocation role of tasks / executors (medium: just look at first 
> allocation info, not sure if we need to handle the case where there is no 
> allocation info, since the ui will be running against a new master. The ui 
> can run against old agent though, if we want to handle that then we need to 
> use the framework's role when no allocation info is present in task/executor 
> resources).





[jira] [Assigned] (MESOS-7035) Add test for framework upgrading to MULTI_ROLE with tasks running

2017-02-09 Thread Jay Guo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo reassigned MESOS-7035:
--

Assignee: Jay Guo

> Add test for framework upgrading to MULTI_ROLE with tasks running
> -
>
> Key: MESOS-7035
> URL: https://issues.apache.org/jira/browse/MESOS-7035
> Project: Mesos
>  Issue Type: Bug
>  Components: tests
>Reporter: Benjamin Bannier
>Assignee: Jay Guo
>
> Frameworks can upgrade to MULTI_ROLE capability provided their new {{roles}} 
> are consistent with their old {{role}}. We should add tests ensuring that 
> this upgrade works even when the framework has tasks running.
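
The ticket does not spell out what "consistent" means. The Python sketch below assumes one possible reading, namely that the old {{role}} must be the only entry in the new {{roles}}; both the function name and that interpretation are assumptions, not the master's actual validation.

```python
def can_upgrade_to_multi_role(old_role, new_roles):
    """Assumed rule: the upgrade is allowed only if `roles` is exactly [`role`]."""
    return list(new_roles) == [old_role]


same_role = can_upgrade_to_multi_role("eng", ["eng"])
changed_role = can_upgrade_to_multi_role("eng", ["ops"])
```

A test for this ticket would launch tasks under the old {{role}}, re-subscribe with the MULTI_ROLE capability, and then check that the running tasks are still tracked.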





[jira] [Commented] (MESOS-7063) Add a test for a MULTI_ROLE master re-registering an old agent.

2017-02-08 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857726#comment-15857726
 ] 

Jay Guo commented on MESOS-7063:


[~gyliu] [~bmahler] I'm trying to figure out how to test that allocation info 
is correctly injected into tasks/executors. Essentially I want to be able to 
access {{struct Slave}} to see that {{tasks}} contain {{allocation_info}}. Any 
ideas?

I had the same question while writing a test for agent capabilities (to check 
that {{struct Slave}}'s {{capabilities}} are correctly constructed). Ben 
advised following the test pattern used for {{version}}, however I couldn't 
find a test for that...

> Add a test for a MULTI_ROLE master re-registering an old agent.
> ---
>
> Key: MESOS-7063
> URL: https://issues.apache.org/jira/browse/MESOS-7063
> Project: Mesos
>  Issue Type: Task
>  Components: test
>Reporter: Benjamin Mahler
>Assignee: Jay Guo
>
> We should ensure that the master's handling of non-MULTI_ROLE agents is 
> correct. The master handles this by injecting the resource allocation info 
> for the tasks / executors sent during the agent's re-registration.
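The injection described above could be sketched roughly as follows. This is only an illustration under stated assumptions: {{Resource}} and {{Task}} are simplified stand-ins for the protobuf messages, an empty string models an unset {{allocation_info}}, and {{injectAllocationInfo}} is a hypothetical name rather than the master's actual function.

```cpp
#include <string>
#include <vector>

// Simplified stand-ins for the Mesos protobuf messages (assumptions).
struct Resource
{
  std::string name;
  double value;
  std::string allocationRole; // Empty models an unset allocation_info.
};

struct Task
{
  std::vector<Resource> resources;
};

// Fills in the allocation role on every resource that lacks one, using the
// role of the framework that owns the task. Resources that already carry an
// allocation role are left untouched.
void injectAllocationInfo(Task& task, const std::string& frameworkRole)
{
  for (Resource& resource : task.resources) {
    if (resource.allocationRole.empty()) {
      resource.allocationRole = frameworkRole;
    }
  }
}
```

A test for this ticket would then re-register an old agent and check that every task and executor resource known to the master carries an allocation role.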





[jira] [Assigned] (MESOS-7063) Add a test for a MULTI_ROLE master re-registering an old agent.

2017-02-07 Thread Jay Guo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo reassigned MESOS-7063:
--

Assignee: Jay Guo

> Add a test for a MULTI_ROLE master re-registering an old agent.
> ---
>
> Key: MESOS-7063
> URL: https://issues.apache.org/jira/browse/MESOS-7063
> Project: Mesos
>  Issue Type: Task
>  Components: test
>Reporter: Benjamin Mahler
>Assignee: Jay Guo
>
> We should ensure that the master's handling of non-MULTI_ROLE agents is 
> correct. The master handles this by injecting the resource allocation info 
> for the tasks / executors sent during the agent's re-registration.





[jira] [Assigned] (MESOS-7062) Add a test for a MULTI_ROLE framework receiving offers for each of its roles.

2017-02-05 Thread Jay Guo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo reassigned MESOS-7062:
--

Assignee: Jay Guo

> Add a test for a MULTI_ROLE framework receiving offers for each of its roles.
> -
>
> Key: MESOS-7062
> URL: https://issues.apache.org/jira/browse/MESOS-7062
> Project: Mesos
>  Issue Type: Task
>  Components: test
>Reporter: Benjamin Mahler
>Assignee: Jay Guo
>
> Ideally, we could avoid maintaining a set of allocator tests for many 
> single-role frameworks separate from tests for one multi-role framework, since 
> from a resource allocation perspective, these situations are equivalent. (1)
> To start with, we could simply write a test that makes sure the multi role 
> framework can see the allocation info in the offer, and receives offers for 
> each of its roles.
> (1) Modulo the fair sharing that currently occurs when multiple frameworks 
> are running in a role.





[jira] [Assigned] (MESOS-6940) Do not send offers to MULTI_ROLE schedulers if agent does not have MULTI_ROLE capability.

2017-01-19 Thread Jay Guo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo reassigned MESOS-6940:
--

Assignee: Jay Guo

> Do not send offers to MULTI_ROLE schedulers if agent does not have MULTI_ROLE 
> capability.
> -
>
> Key: MESOS-6940
> URL: https://issues.apache.org/jira/browse/MESOS-6940
> Project: Mesos
>  Issue Type: Task
>  Components: allocation, master
>Reporter: Benjamin Mahler
>Assignee: Jay Guo
>
> Old agents that do not have the MULTI_ROLE capability cannot correctly 
> receive tasks from schedulers that have the MULTI_ROLE capability *and are 
> using multiple roles*. In this case, we should not send the offer to the 
> scheduler, rather than sending an offer but rejecting the scheduler's 
> operations.
> Note also that since we allow a single role scheduler to upgrade into having 
> the MULTI_ROLE capability (use of the {{FrameworkInfo.roles}} field) so long 
> as they continue to use a single role (in phase 1 of multi-role support the 
> roles cannot be changed), we could continue sending offers if the scheduler 
> is MULTI_ROLE capable but only uses a single role.
> In phase 2 of multi-role support, we cannot safely allow a MULTI_ROLE 
> scheduler to receive resources from a non-MULTI_ROLE agent, so it seems we 
> should simply disallow MULTI_ROLE schedulers from receiving offers from 
> non-MULTI_ROLE agents, regardless of how many roles the scheduler is using.
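The two policies discussed above can be compared side by side in a small sketch. The types below are simplified stand-ins, capabilities are modeled as plain strings, and {{mayOfferPhase1}} / {{mayOfferPhase2}} are hypothetical names; none of this is the actual master or allocator code.

```cpp
#include <set>
#include <string>

// Simplified stand-ins (assumptions, not the Mesos types).
struct Agent
{
  std::set<std::string> capabilities;
};

struct Framework
{
  std::set<std::string> capabilities;
  std::set<std::string> roles;
};

bool hasMultiRole(const std::set<std::string>& capabilities)
{
  return capabilities.count("MULTI_ROLE") > 0;
}

// Phase 1 policy: a MULTI_ROLE scheduler may still receive offers from an
// old agent as long as it uses at most one role.
bool mayOfferPhase1(const Framework& framework, const Agent& agent)
{
  if (hasMultiRole(agent.capabilities)) return true;
  if (!hasMultiRole(framework.capabilities)) return true;
  return framework.roles.size() <= 1;
}

// Phase 2 policy: a MULTI_ROLE scheduler never receives offers from an
// agent without the capability, regardless of how many roles it uses.
bool mayOfferPhase2(const Framework& framework, const Agent& agent)
{
  return hasMultiRole(agent.capabilities) ||
         !hasMultiRole(framework.capabilities);
}
```

The only case in which the two policies differ is a MULTI_ROLE scheduler that uses a single role and is paired with an old agent.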



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-6902) Add support for agent capabilities

2017-01-19 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15831216#comment-15831216
 ] 

Jay Guo commented on MESOS-6902:


https://reviews.apache.org/r/55710/ Add agent capabilities to v0 master API 
/state

> Add support for agent capabilities
> --
>
> Key: MESOS-6902
> URL: https://issues.apache.org/jira/browse/MESOS-6902
> Project: Mesos
>  Issue Type: Improvement
>  Components: agent
>Reporter: Neil Conway
>Assignee: Jay Guo
>  Labels: mesosphere
>
> Similarly to how we might add support for master capabilities (MESOS-5675), 
> agent capabilities would also make sense: in a mixed cluster, the master 
> might have support for features that are not present on certain agents, and 
> vice versa.





[jira] [Commented] (MESOS-6902) Add support for agent capabilities

2017-01-17 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15827419#comment-15827419
 ] 

Jay Guo commented on MESOS-6902:


[~bmahler] Sure, working on it.

> Add support for agent capabilities
> --
>
> Key: MESOS-6902
> URL: https://issues.apache.org/jira/browse/MESOS-6902
> Project: Mesos
>  Issue Type: Improvement
>  Components: agent
>Reporter: Neil Conway
>Assignee: Jay Guo
>  Labels: mesosphere
>
> Similarly to how we might add support for master capabilities (MESOS-5675), 
> agent capabilities would also make sense: in a mixed cluster, the master 
> might have support for features that are not present on certain agents, and 
> vice versa.





[jira] [Commented] (MESOS-6902) Add support for agent capabilities

2017-01-16 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15823689#comment-15823689
 ] 

Jay Guo commented on MESOS-6902:


Some initial patches on protobuf messages:
https://reviews.apache.org/r/55562/
https://reviews.apache.org/r/55563/

> Add support for agent capabilities
> --
>
> Key: MESOS-6902
> URL: https://issues.apache.org/jira/browse/MESOS-6902
> Project: Mesos
>  Issue Type: Improvement
>  Components: agent
>Reporter: Neil Conway
>Assignee: Jay Guo
>  Labels: mesosphere
>
> Similarly to how we might add support for master capabilities (MESOS-5675), 
> agent capabilities would also make sense: in a mixed cluster, the master 
> might have support for features that are not present on certain agents, and 
> vice versa.





[jira] [Commented] (MESOS-6854) Prevent launching MULTI_ROLE framework's tasks on agents without MULTI_ROLE support.

2017-01-16 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15823641#comment-15823641
 ] 

Jay Guo commented on MESOS-6854:


[~bmahler] Consider the following upgrade scenario, in which the agent is not upgraded:
# start an old cluster consisting of a master, an agent and a framework
# the framework launches an executor on the agent
# upgrade the master to support multi-role
# upgrade the framework to support multi-role
# the framework wants to launch a task on the existing executor

Should we allow the last step?

> Prevent launching MULTI_ROLE framework's tasks on agents without MULTI_ROLE 
> support.
> 
>
> Key: MESOS-6854
> URL: https://issues.apache.org/jira/browse/MESOS-6854
> Project: Mesos
>  Issue Type: Task
>  Components: agent, master
>Reporter: Benjamin Mahler
>Assignee: Jay Guo
>
> The proposal for upgrades / backwards compatibility in phase 1 of multi-role 
> framework support is that we require that masters and agents are all upgraded 
> before a multi-role framework registers.
> We need to explicitly protect against this situation occurring given it's 
> common for old agents to show up in a cluster. The master can prevent the 
> launching of MULTI_ROLE frameworks' tasks on agents without MULTI_ROLE 
> framework support.
> If we were to naively let this happen, the old agent would think the resources 
> are allocated to the "*" role, and there would need to be master logic to deal 
> with the old agent not populating Resource.AllocationInfo.
> The guard will either need to be version based or agent capability based, the 
> latter seeming like the stronger approach given some users upgrade off of 
> master rather than using release versions.
> We can initially start with the master side guard, and have the agent send 
> the capability once the agent-side implementation is complete.





[jira] [Assigned] (MESOS-6854) Prevent launching MULTI_ROLE framework's tasks on agents without MULTI_ROLE support.

2017-01-15 Thread Jay Guo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo reassigned MESOS-6854:
--

Assignee: Jay Guo  (was: Guangya Liu)

> Prevent launching MULTI_ROLE framework's tasks on agents without MULTI_ROLE 
> support.
> 
>
> Key: MESOS-6854
> URL: https://issues.apache.org/jira/browse/MESOS-6854
> Project: Mesos
>  Issue Type: Task
>  Components: agent, master
>Reporter: Benjamin Mahler
>Assignee: Jay Guo
>
> The proposal for upgrades / backwards compatibility in phase 1 of multi-role 
> framework support is that we require that masters and agents are all upgraded 
> before a multi-role framework registers.
> We need to explicitly protect against this situation occurring given it's 
> common for old agents to show up in a cluster. The master can prevent the 
> launching of MULTI_ROLE frameworks' tasks on agents without MULTI_ROLE 
> framework support.
> If we were to naively let this happen, the old agent would think the resources 
> are allocated to the "*" role, and there would need to be master logic to deal 
> with the old agent not populating Resource.AllocationInfo.
> The guard will either need to be version based or agent capability based, the 
> latter seeming like the stronger approach given some users upgrade off of 
> master rather than using release versions.
> We can initially start with the master side guard, and have the agent send 
> the capability once the agent-side implementation is complete.





[jira] [Assigned] (MESOS-6902) Add support for agent capabilities

2017-01-15 Thread Jay Guo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo reassigned MESOS-6902:
--

Assignee: Jay Guo

> Add support for agent capabilities
> --
>
> Key: MESOS-6902
> URL: https://issues.apache.org/jira/browse/MESOS-6902
> Project: Mesos
>  Issue Type: Improvement
>  Components: agent
>Reporter: Neil Conway
>Assignee: Jay Guo
>  Labels: mesosphere
>
> Similarly to how we might add support for master capabilities (MESOS-5675), 
> agent capabilities would also make sense: in a mixed cluster, the master 
> might have support for features that are not present on certain agents, and 
> vice versa.





[jira] [Assigned] (MESOS-6657) Update the webui to reflect that frameworks have multiple roles.

2017-01-04 Thread Jay Guo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo reassigned MESOS-6657:
--

Assignee: Jay Guo

> Update the webui to reflect that frameworks have multiple roles.
> 
>
> Key: MESOS-6657
> URL: https://issues.apache.org/jira/browse/MESOS-6657
> Project: Mesos
>  Issue Type: Task
>  Components: webui
>Reporter: Benjamin Mahler
>Assignee: Jay Guo
>
> With the support for multi-role frameworks, the webui will need to be updated 
> to reflect that frameworks can now have more than a single role (as well as 0 
> roles).
> Details about how we should best do this are TBD and will be added to this 
> ticket.





[jira] [Updated] (MESOS-6855) Add `role` section to response of /state endpoint

2017-01-04 Thread Jay Guo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo updated MESOS-6855:
---
Issue Type: Task  (was: Bug)

> Add `role` section to response of /state endpoint
> -
>
> Key: MESOS-6855
> URL: https://issues.apache.org/jira/browse/MESOS-6855
> Project: Mesos
>  Issue Type: Task
>  Components: HTTP API, master
>Reporter: Jay Guo
>Assignee: Jay Guo
>
> Role is becoming a more significant attribute as we implement multi-tenant 
> support in Mesos. Therefore, the response of the {{/state}} endpoint should be 
> more informative and include {{roles}}. One use case is that the WebUI could 
> use this information in a _roles_ tab, which will be added as a new top-level 
> tab.





[jira] [Created] (MESOS-6855) Add `role` section to response of /state endpoint

2017-01-04 Thread Jay Guo (JIRA)
Jay Guo created MESOS-6855:
--

 Summary: Add `role` section to response of /state endpoint
 Key: MESOS-6855
 URL: https://issues.apache.org/jira/browse/MESOS-6855
 Project: Mesos
  Issue Type: Bug
  Components: HTTP API, master
Reporter: Jay Guo
Assignee: Jay Guo


Role is becoming a more significant attribute as we implement multi-tenant 
support in Mesos. Therefore, the response of the {{/state}} endpoint should be 
more informative and include {{roles}}. One use case is that the WebUI could 
use this information in a _roles_ tab, which will be added as a new top-level tab.





[jira] [Commented] (MESOS-6637) Validate that schedulers cannot perform operations on offers with different allocation roles.

2016-12-13 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15744558#comment-15744558
 ] 

Jay Guo commented on MESOS-6637:


RR: https://reviews.apache.org/r/54650/

> Validate that schedulers cannot perform operations on offers with different 
> allocation roles.
> -
>
> Key: MESOS-6637
> URL: https://issues.apache.org/jira/browse/MESOS-6637
> Project: Mesos
>  Issue Type: Task
>  Components: master
>Reporter: Benjamin Mahler
>Assignee: Jay Guo
>
> With support for multi-role frameworks, offers contain allocation info 
> (currently just the role that the offer is being made to).
> In theory, schedulers could perform offer operations across multiple roles, 
> so long as the tasks, executors, and reservations individually don't mix 
> roles. However, there doesn't seem to be a clear reason to allow this. So, we 
> will validate against combining offers from multiple roles. This also makes 
> it semantically consistent with single-role frameworks (since they do not do 
> this either).
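The validation described above amounts to checking that all offers used in one operation share a single allocation role. A minimal sketch, assuming a simplified {{Offer}} stand-in and a hypothetical {{validateSingleRole}} helper that returns an empty string on success:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-in for an offer (assumption): only the fields needed here.
struct Offer
{
  std::string id;
  std::string allocationRole; // Models Offer.allocation_info.role.
};

// Returns an empty string if all offers are allocated to the same role,
// otherwise a message naming the first pair of offers that disagree.
std::string validateSingleRole(const std::vector<Offer>& offers)
{
  for (std::size_t i = 1; i < offers.size(); ++i) {
    if (offers[i].allocationRole != offers[0].allocationRole) {
      return "Offers '" + offers[0].id + "' and '" + offers[i].id +
             "' are allocated to different roles";
    }
  }
  return "";
}
```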





[jira] [Assigned] (MESOS-6636) Validate that tasks / executors / reservations do not mix Resource.allocation_info.roles.

2016-12-07 Thread Jay Guo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo reassigned MESOS-6636:
--

Assignee: Jay Guo

> Validate that tasks / executors / reservations do not mix 
> Resource.allocation_info.roles.
> -
>
> Key: MESOS-6636
> URL: https://issues.apache.org/jira/browse/MESOS-6636
> Project: Mesos
>  Issue Type: Task
>  Components: master
>Reporter: Benjamin Mahler
>Assignee: Jay Guo
>
> With support for multi-role frameworks, we need to make sure that individual 
> tasks and executors cannot mix roles. Likewise, we do not want to allow a 
> scheduler to make a reservation based on resources with different allocated 
> roles.
> We will however allow tasks from one role to run on executors from another 
> role.
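As a rough illustration of the rule above, the check can be applied to the task's resources and to the executor's resources separately, without comparing the two groups with each other. The {{Resource}} struct and the function names below are hypothetical stand-ins, not the master's validation code.

```cpp
#include <set>
#include <string>
#include <vector>

// Simplified stand-in for a resource (assumption).
struct Resource
{
  std::string name;
  std::string allocationRole; // Models Resource.allocation_info.role.
};

// True if every resource in the group is allocated to the same role.
bool usesSingleRole(const std::vector<Resource>& resources)
{
  std::set<std::string> roles;
  for (const Resource& resource : resources) {
    roles.insert(resource.allocationRole);
  }
  return roles.size() <= 1;
}

// The task and the executor are each checked on their own, so a task from
// one role may still run on an executor from another role.
bool validateLaunch(
    const std::vector<Resource>& taskResources,
    const std::vector<Resource>& executorResources)
{
  return usesSingleRole(taskResources) && usesSingleRole(executorResources);
}
```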





[jira] [Assigned] (MESOS-6637) Validate that schedulers cannot perform operations on offers with different allocation roles.

2016-12-07 Thread Jay Guo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo reassigned MESOS-6637:
--

Assignee: Jay Guo

> Validate that schedulers cannot perform operations on offers with different 
> allocation roles.
> -
>
> Key: MESOS-6637
> URL: https://issues.apache.org/jira/browse/MESOS-6637
> Project: Mesos
>  Issue Type: Task
>  Components: master
>Reporter: Benjamin Mahler
>Assignee: Jay Guo
>
> With support for multi-role frameworks, offers contain allocation info 
> (currently just the role that the offer is being made to).
> In theory, schedulers could perform offer operations across multiple roles, 
> so long as the tasks, executors, and reservations individually don't mix 
> roles. However, there doesn't seem to be a clear reason to allow this. So, we 
> will validate against combining offers from multiple roles. This also makes 
> it semantically consistent with single-role frameworks (since they do not do 
> this either).





[jira] [Comment Edited] (MESOS-6684) Update addFramework/removeFramework to handle multi-role frameworks

2016-12-06 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15721528#comment-15721528
 ] 

Jay Guo edited comment on MESOS-6684 at 12/6/16 1:53 PM:
-

RR:
https://reviews.apache.org/r/54360 addFramework
https://reviews.apache.org/r/54362 removeFramework
https://reviews.apache.org/r/54361 test for addFramework
https://reviews.apache.org/r/54363 test for removeFramework


was (Author: guoger):
RR:
https://reviews.apache.org/r/54360 addFramework
https://reviews.apache.org/r/54361 test for addFramework
https://reviews.apache.org/r/54362 removeFramework
https://reviews.apache.org/r/54363 test for removeFramework

> Update addFramework/removeFramework to handle multi-role frameworks
> ---
>
> Key: MESOS-6684
> URL: https://issues.apache.org/jira/browse/MESOS-6684
> Project: Mesos
>  Issue Type: Bug
>Reporter: Guangya Liu
>Assignee: Jay Guo
>
> The master's current add/remove framework logic only handles single-role 
> frameworks; it should be updated to support multi-role frameworks.
> {code}
>   if (!activeRoles.contains(role)) {
>     activeRoles[role] = new Role();
>   }
>   activeRoles[role]->addFramework(framework);
> {code}
> We should update both {{addFramework}} and {{removeFramework}} in master.cpp 
> to be able to map one framework to multiple roles.
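One way the multi-role mapping could look is sketched below. It is a simplified model, not the code in master.cpp: {{RoleTracker}} is a hypothetical class, frameworks are tracked by ID only, and a role entry is dropped once its last framework is removed.

```cpp
#include <map>
#include <set>
#include <string>

// Simplified stand-in for a framework (assumption).
struct Framework
{
  std::string id;
  std::set<std::string> roles;
};

// Hypothetical tracker mapping each active role to the IDs of the
// frameworks subscribed to it.
class RoleTracker
{
public:
  // Registers the framework under every one of its roles.
  void addFramework(const Framework& framework)
  {
    for (const std::string& role : framework.roles) {
      activeRoles[role].insert(framework.id);
    }
  }

  // Unregisters the framework from every one of its roles and drops any
  // role that no longer has frameworks.
  void removeFramework(const Framework& framework)
  {
    for (const std::string& role : framework.roles) {
      auto it = activeRoles.find(role);
      if (it == activeRoles.end()) continue;
      it->second.erase(framework.id);
      if (it->second.empty()) activeRoles.erase(it);
    }
  }

  std::map<std::string, std::set<std::string>> activeRoles;
};
```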





[jira] [Assigned] (MESOS-6685) Update Role::Resources to correctly account for multi-role frameworks

2016-12-05 Thread Jay Guo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo reassigned MESOS-6685:
--

Assignee: Jay Guo

> Update Role::Resources to correctly account for multi-role frameworks
> -
>
> Key: MESOS-6685
> URL: https://issues.apache.org/jira/browse/MESOS-6685
> Project: Mesos
>  Issue Type: Bug
>Reporter: Guangya Liu
>Assignee: Jay Guo
>
> With a single-role framework, when the get role endpoint is called, the master 
> returns for that role all of the resources of each framework that is using the 
> role. With a multi-role framework, however, the get role endpoint should only 
> return the resources that the framework uses under that one role, not those 
> used under its other roles.
> {code}
>   Resources resources() const
>   {
>     Resources resources;
>     foreachvalue (Framework* framework, frameworks) {
>       resources += framework->totalUsedResources;
>       resources += framework->totalOfferedResources;
>     }
>     return resources;
>   }
> {code}
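A per-role version of this accounting might filter on the allocation role of each resource, roughly as sketched below. The structs are simplified stand-ins, resources are summed by name as plain scalars, and {{resourcesForRole}} is a hypothetical free function rather than the actual {{Role::resources()}}.

```cpp
#include <map>
#include <string>
#include <vector>

// Simplified stand-ins (assumptions): scalar resources keyed by name.
struct Resource
{
  std::string name;
  double value;
  std::string allocationRole; // Models Resource.allocation_info.role.
};

struct Framework
{
  std::vector<Resource> usedResources;
  std::vector<Resource> offeredResources;
};

// Sums, per resource name, only those used and offered resources that are
// allocated to `role`, skipping whatever the frameworks hold under other
// roles.
std::map<std::string, double> resourcesForRole(
    const std::string& role,
    const std::vector<Framework>& frameworks)
{
  std::map<std::string, double> totals;
  for (const Framework& framework : frameworks) {
    for (const Resource& resource : framework.usedResources) {
      if (resource.allocationRole == role) totals[resource.name] += resource.value;
    }
    for (const Resource& resource : framework.offeredResources) {
      if (resource.allocationRole == role) totals[resource.name] += resource.value;
    }
  }
  return totals;
}
```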





[jira] [Assigned] (MESOS-6684) Update addFramework/removeFramework to handle multi-role frameworks

2016-12-04 Thread Jay Guo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo reassigned MESOS-6684:
--

Assignee: Jay Guo

> Update addFramework/removeFramework to handle multi-role frameworks
> ---
>
> Key: MESOS-6684
> URL: https://issues.apache.org/jira/browse/MESOS-6684
> Project: Mesos
>  Issue Type: Bug
>Reporter: Guangya Liu
>Assignee: Jay Guo
>
> The master's current add/remove framework logic only handles single-role 
> frameworks; it should be updated to support multi-role frameworks.
> {code}
>   if (!activeRoles.contains(role)) {
>     activeRoles[role] = new Role();
>   }
>   activeRoles[role]->addFramework(framework);
> {code}
> We should update both {{addFramework}} and {{removeFramework}} in master.cpp 
> to be able to map one framework to multiple roles.





[jira] [Assigned] (MESOS-6635) Update allocator to handle multi-role frameworks.

2016-12-02 Thread Jay Guo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo reassigned MESOS-6635:
--

Assignee: Jay Guo

> Update allocator to handle multi-role frameworks.
> -
>
> Key: MESOS-6635
> URL: https://issues.apache.org/jira/browse/MESOS-6635
> Project: Mesos
>  Issue Type: Task
>Reporter: Benjamin Mahler
>Assignee: Jay Guo
>
> The allocator needs to be adjusted once we allow frameworks to have multiple 
> roles:
> (1) When adding a framework, we need to store all of its roles and add it to 
> multiple role sorters.
> (2) We will CHECK that the framework does not modify its roles when updating 
> the framework (much like we do for single-role frameworks).
> (3) When performing an allocation, the allocator will set 
> allocation_info.role. When recovering resources, the allocator will unset 
> allocation_info.role.
> (4) The allocator will send AllocationInfo alongside offers that it sends to 
> the master, so that the master can easily augment {{Offer}} with allocation 
> info.
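Point (3) above can be pictured as a pair of inverse operations on a set of resources. The sketch below assumes a simplified {{Resource}} stand-in in which an empty string models an unset {{allocation_info.role}}; {{allocate}} and {{recover}} are hypothetical names, not the allocator's interface.

```cpp
#include <string>
#include <vector>

// Simplified stand-in for a resource (assumption).
struct Resource
{
  std::string name;
  double value;
  std::string allocationRole; // Empty models an unset allocation_info.role.
};

// Returns a copy of the resources marked as allocated to `role`.
std::vector<Resource> allocate(
    std::vector<Resource> resources,
    const std::string& role)
{
  for (Resource& resource : resources) {
    resource.allocationRole = role;
  }
  return resources;
}

// Returns a copy of the resources with the allocation role cleared, as
// would happen when they go back to the unallocated pool.
std::vector<Resource> recover(std::vector<Resource> resources)
{
  for (Resource& resource : resources) {
    resource.allocationRole.clear();
  }
  return resources;
}
```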





[jira] [Commented] (MESOS-6629) Add master validation of FrameworkInfo.roles.

2016-12-02 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15717354#comment-15717354
 ] 

Jay Guo commented on MESOS-6629:


RR:
https://reviews.apache.org/r/54062/ Introduced FrameworkInfo validation.
https://reviews.apache.org/r/54301/ Introduced 'roles' validation in master.
https://reviews.apache.org/r/54300/ Added a test for FrameworkInfo role(s) 
validation.
https://reviews.apache.org/r/54302/ Added multi-role master validation 
integration tests.

> Add master validation of FrameworkInfo.roles.
> -
>
> Key: MESOS-6629
> URL: https://issues.apache.org/jira/browse/MESOS-6629
> Project: Mesos
>  Issue Type: Task
>  Components: master
>Reporter: Benjamin Mahler
>Assignee: Jay Guo
>
> The master should disallow frameworks from subscribing based on the following:
> (1) Only one of {{FrameworkInfo.role}} and {{FrameworkInfo.roles}} must be 
> set at a time.
> (2) If {{FrameworkInfo.roles}} is set, then the MULTI_ROLE framework 
> capability must be provided.
> (3) If the MULTI_ROLE framework capability is provided, then 
> {{FrameworkInfo.role}} must not be set.
> (4) {{FrameworkInfo.roles}} must not contain duplicate entries.
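The four rules above can be sketched as a single validation function. This is an illustration only: {{FrameworkInfo}} is a simplified stand-in for the protobuf message, rule (1) is read as "{{role}} and {{roles}} must not both be set", and {{validateRoles}} is a hypothetical helper that returns an empty string when the info is valid.

```cpp
#include <set>
#include <string>
#include <vector>

// Simplified stand-in for the FrameworkInfo protobuf message (assumption).
struct FrameworkInfo
{
  bool hasRole;                   // Whether the old-style `role` is set.
  std::string role;
  std::vector<std::string> roles; // New-style `roles`.
  bool multiRole;                 // Whether MULTI_ROLE capability is provided.
};

// Returns an empty string if the info passes all four checks, otherwise a
// description of the first rule that is violated.
std::string validateRoles(const FrameworkInfo& info)
{
  if (info.hasRole && !info.roles.empty()) {
    return "'role' and 'roles' must not both be set";             // Rule (1).
  }
  if (!info.roles.empty() && !info.multiRole) {
    return "'roles' requires the MULTI_ROLE capability";          // Rule (2).
  }
  if (info.multiRole && info.hasRole) {
    return "'role' must not be set with MULTI_ROLE";              // Rule (3).
  }
  std::set<std::string> unique(info.roles.begin(), info.roles.end());
  if (unique.size() != info.roles.size()) {
    return "'roles' must not contain duplicate entries";          // Rule (4).
  }
  return "";
}
```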





[jira] [Commented] (MESOS-6637) Validate that schedulers cannot perform operations on offers with different allocation roles.

2016-11-29 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15707302#comment-15707302
 ] 

Jay Guo commented on MESOS-6637:


Could you elaborate on {quote}tasks, executors, and reservations individually 
don't mix roles{quote}? I don't quite follow what you mean here; an example 
would be excellent. Thanks!

> Validate that schedulers cannot perform operations on offers with different 
> allocation roles.
> -
>
> Key: MESOS-6637
> URL: https://issues.apache.org/jira/browse/MESOS-6637
> Project: Mesos
>  Issue Type: Task
>  Components: master
>Reporter: Benjamin Mahler
>
> With support for multi-role frameworks, offers contain allocation info 
> (currently just the role that the offer is being made to).
> In theory, schedulers could perform offer operations across multiple roles, 
> so long as the tasks, executors, and reservations individually don't mix 
> roles. However, there doesn't seem to be a clear reason to allow this. So, we 
> will validate against combining offers from multiple roles. This also makes 
> it semantically consistent with single-role frameworks (since they do not do 
> this either).





[jira] [Assigned] (MESOS-6634) Add Resource.AllocationInfo in Offer to indicate a single role per offer.

2016-11-27 Thread Jay Guo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo reassigned MESOS-6634:
--

Assignee: Jay Guo

> Add Resource.AllocationInfo in Offer to indicate a single role per offer.
> -
>
> Key: MESOS-6634
> URL: https://issues.apache.org/jira/browse/MESOS-6634
> Project: Mesos
>  Issue Type: Task
>  Components: framework api
>Reporter: Benjamin Mahler
>Assignee: Jay Guo
>
> With multi-role framework support, we need to ensure that the framework can 
> determine which resources are allocated to which roles. Since we'd like to 
> preserve the offer semantics between a single multi-tenant scheduler and 
> multiple single-tenant schedulers, we would like to ensure that an offer only 
> contains resources allocated to a single role:
> {code}
> message Offer {
>   required OfferID id = 1;
>   ...
>   required AgentID agent_id = 3;
>   ...
>   repeated Resource resources = 5;
>   ...
>   // An offer represents resources allocated to *one* of the
>   // roles managed by the scheduler. (Therefore, each
>   // `Offer.resources[i].allocation_info` will match the
>   // top level `Offer.allocation_info`).
>   optional Resource.AllocationInfo allocation_info = 10;
> }
> {code}
> The assumption is that this will make it easier for schedulers to manage 
> offers, since each offer is made to a single tenant.





[jira] [Assigned] (MESOS-6633) Introduce Resource.AllocationInfo.

2016-11-27 Thread Jay Guo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo reassigned MESOS-6633:
--

Assignee: Jay Guo

> Introduce Resource.AllocationInfo.
> --
>
> Key: MESOS-6633
> URL: https://issues.apache.org/jira/browse/MESOS-6633
> Project: Mesos
>  Issue Type: Task
>  Components: framework api
>Reporter: Benjamin Mahler
>Assignee: Jay Guo
>
> As part of supporting multi-role frameworks, we can no longer assume that the 
> framework ID maps directly to a role. Even without multi-role framework 
> support, this assumption breaks if we want to allow frameworks to modify 
> their role.
> To determine which role resources are allocated to, we now need to store 
> allocation information within the Resource:
> {code}
> message Resource {
>   ...
>   // The role that this resource is reserved for. If "*", this indicates
>   // that the resource is unreserved. Otherwise, the resource will only
>   // be offered to frameworks that belong to this role.
>   optional string role = 6 [default = "*"];
>   
>   message AllocationInfo {
>     // If set, this resource is allocated to a role. Note that
>     // in the future, this may be unset and the scheduler
>     // may be responsible for allocating to one of its roles.
>     optional string role = 1;
>     // In the future, we may add additional fields here, e.g. priority tier,
>     // type of allocation (quota / fair share).
>   }
>   optional AllocationInfo allocation_info = X;
>   ...
> }
> {code}
> An alternative considered was to augment {{TaskInfo}} or {{ExecutorInfo}} or 
> introduce another layer on top of {{Resource}} called {{Allocation}} which 
> contains {{Resource}}. The first option does not work since some components 
> that need to know about the allocation do not have visibility into the 
> tasks/executors. The second option requires dramatic changes and so is harder 
> to accomplish.





[jira] [Issue Comment Deleted] (MESOS-6629) Add master validation of FrameworkInfo.roles.

2016-11-24 Thread Jay Guo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo updated MESOS-6629:
---
Comment: was deleted

(was: bq. {{FrameworkInfo.roles}} must not contain duplicate entries.
Do we want subscription to fail for duplicate roles or we simply deduplicate it 
and generate a warning? For the latter case we could change {{roles::parse}} to 
return std::set and reuse it for both framework and master's {{-- roles}}. 
Currently, master throws out warning for duplicate roles in {{-- roles}}: 
https://github.com/apache/mesos/blob/master/src/master/master.cpp#L662)

> Add master validation of FrameworkInfo.roles.
> -
>
> Key: MESOS-6629
> URL: https://issues.apache.org/jira/browse/MESOS-6629
> Project: Mesos
>  Issue Type: Task
>  Components: master
>Reporter: Benjamin Mahler
>Assignee: Jay Guo
>
> The master should disallow frameworks from subscribing based on the following:
> (1) Only one of {{FrameworkInfo.role}} and {{FrameworkInfo.roles}} must be 
> set at a time.
> (2) If {{FrameworkInfo.roles}} is set, then the MULTI_ROLE framework 
> capability must be provided.
> (3) If the MULTI_ROLE framework capability is provided, then 
> {{FrameworkInfo.role}} must not be set.
> (4) {{FrameworkInfo.roles}} must not contain duplicate entries.





[jira] [Commented] (MESOS-6629) Add master validation of FrameworkInfo.roles.

2016-11-24 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15692580#comment-15692580
 ] 

Jay Guo commented on MESOS-6629:


bq. {{FrameworkInfo.roles}} must not contain duplicate entries.
Do we want subscription to fail on duplicate roles, or should we simply deduplicate 
them and generate a warning? In the latter case, we could change {{roles::parse}} to 
return a {{std::set}} and reuse it for both the framework's roles and the master's 
{{--roles}} flag. Currently, the master emits a warning for duplicate roles in {{--roles}}: 
https://github.com/apache/mesos/blob/master/src/master/master.cpp#L662
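To make the deduplicating option concrete, here is a rough sketch of a parser that returns a set and also reports the duplicates it dropped, so that the caller can decide whether to warn or to reject. {{parseRoles}} and {{ParsedRoles}} are hypothetical names; this is not the existing {{roles::parse}}, and the comma-separated input format is an assumption.

```cpp
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Result of parsing: the unique roles plus any entries seen more than once.
struct ParsedRoles
{
  std::set<std::string> roles;
  std::vector<std::string> duplicates;
};

// Hypothetical parser (not the actual roles::parse): splits a
// comma-separated list, skips empty entries, and records duplicates
// instead of failing on them.
ParsedRoles parseRoles(const std::string& text)
{
  ParsedRoles result;
  std::istringstream stream(text);
  std::string role;
  while (std::getline(stream, role, ',')) {
    if (role.empty()) continue;
    if (!result.roles.insert(role).second) {
      result.duplicates.push_back(role);
    }
  }
  return result;
}
```

With this shape, the master could log a warning when {{duplicates}} is non-empty, while framework subscription could treat the same condition as a validation error.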

> Add master validation of FrameworkInfo.roles.
> -
>
> Key: MESOS-6629
> URL: https://issues.apache.org/jira/browse/MESOS-6629
> Project: Mesos
>  Issue Type: Task
>  Components: master
>Reporter: Benjamin Mahler
>Assignee: Jay Guo
>
> The master should disallow frameworks from subscribing based on the following:
> (1) Only one of {{FrameworkInfo.role}} and {{FrameworkInfo.roles}} must be 
> set at a time.
> (2) If {{FrameworkInfo.roles}} is set, then the MULTI_ROLE framework 
> capability must be provided.
> (3) If the MULTI_ROLE framework capability is provided, then 
> {{FrameworkInfo.role}} must not be set.
> (4) {{FrameworkInfo.roles}} must not contain duplicate entries.





[jira] [Assigned] (MESOS-6629) Add master validation of FrameworkInfo.roles.

2016-11-23 Thread Jay Guo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo reassigned MESOS-6629:
--

Assignee: Jay Guo

> Add master validation of FrameworkInfo.roles.
> -
>
> Key: MESOS-6629
> URL: https://issues.apache.org/jira/browse/MESOS-6629
> Project: Mesos
>  Issue Type: Task
>  Components: master
>Reporter: Benjamin Mahler
>Assignee: Jay Guo
>
> The master should disallow frameworks from subscribing based on the following:
> (1) At most one of {{FrameworkInfo.role}} and {{FrameworkInfo.roles}} may be 
> set at a time.
> (2) If {{FrameworkInfo.roles}} is set, then the MULTI_ROLE framework 
> capability must be provided.
> (3) If the MULTI_ROLE framework capability is provided, then 
> {{FrameworkInfo.role}} must not be set.
> (4) {{FrameworkInfo.roles}} must not contain duplicate entries.





[jira] [Assigned] (MESOS-6628) Add a FrameworkInfo.roles field along with a MULTI_ROLE capability.

2016-11-23 Thread Jay Guo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo reassigned MESOS-6628:
--

Assignee: Jay Guo

> Add a FrameworkInfo.roles field along with a MULTI_ROLE capability.
> ---
>
> Key: MESOS-6628
> URL: https://issues.apache.org/jira/browse/MESOS-6628
> Project: Mesos
>  Issue Type: Task
>  Components: framework api
>Reporter: Benjamin Mahler
>Assignee: Jay Guo
>
> In order to support frameworks having multiple roles, we will introduce a 
> {{FrameworkInfo.roles}} field as a {{repeated string}}.
> Note that because we cannot distinguish between an empty set of {{roles}} 
> (new-style framework wanting no roles) and an unset {{role}} (old-style 
> framework wanting the "*" role), we must introduce a framework capability 
> (i.e. MULTI_ROLE). This capability will be required for a framework to use 
> the new {{roles}} field.
> {code}
> message FrameworkInfo {
>   ...
>   // Roles are the entities to which allocations are made.
>   // The framework must have at least one role in order to
>   // be offered resources. Note that `role` is deprecated
>   // in favor of `roles` and only one of these fields must
>   // be used. Since we cannot distinguish between empty
>   // `roles` and the default unset `role`, we require that
>   // frameworks set the `MULTI_ROLE` capability if
>   // setting the `roles` field.
>   optional string role = 6 [default="*", deprecated=true];
>   repeated string roles = 12;
>   message Capability {
>     enum Type {
>       ...
>       // This expresses the ability for the framework to be
>       // "multi-tenant" via using the newly introduced `roles`
>       // field, and examining `Offer.allocation_info` to determine
>       // which role the offers are being made to. We also
>       // expect that "single-tenant" schedulers eventually
>       // provide this and move away from the deprecated
>       // `role` field.
>       MULTI_ROLE = 3;
>     }
>     optional Type type = 1;
>   }
>   ...
> }
> {code}
> Validation will be added in MESOS-6629 and we will prevent roles from being 
> modified in MESOS-6631.





[jira] [Commented] (MESOS-6581) Add Seccomp support at Mesos Agent level

2016-11-14 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15663232#comment-15663232
 ] 

Jay Guo commented on MESOS-6581:


Initial patches for review:
https://reviews.apache.org/r/53604/
https://reviews.apache.org/r/53605/
https://reviews.apache.org/r/53606/
https://reviews.apache.org/r/53607/
https://reviews.apache.org/r/53608/

> Add Seccomp support at Mesos Agent level
> 
>
> Key: MESOS-6581
> URL: https://issues.apache.org/jira/browse/MESOS-6581
> Project: Mesos
>  Issue Type: Task
> Environment: Linux Only
>Reporter: Jay Guo
>Assignee: Jay Guo
>
> An operator of a Mesos cluster should be able to enforce a set of Seccomp rules on 
> a Mesos Agent to defend against potential exploits through syscalls. 
> When enabled, every container launched on the Agent must comply with the 
> Seccomp filter or be killed.





[jira] [Commented] (MESOS-6585) Create a user guide to document security features including capabilities and seccomp support

2016-11-14 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15663134#comment-15663134
 ] 

Jay Guo commented on MESOS-6585:


CC [~bbannier]

> Create a user guide to document security features including capabilities and 
> seccomp support
> 
>
> Key: MESOS-6585
> URL: https://issues.apache.org/jira/browse/MESOS-6585
> Project: Mesos
>  Issue Type: Task
>Reporter: Jay Guo
>
> We should have a user guide to document security features in Mesos, including 
> but not limited to: capabilities, seccomp.





[jira] [Created] (MESOS-6585) Create a user guide to document security features including capabilities and seccomp support

2016-11-14 Thread Jay Guo (JIRA)
Jay Guo created MESOS-6585:
--

 Summary: Create a user guide to document security features 
including capabilities and seccomp support
 Key: MESOS-6585
 URL: https://issues.apache.org/jira/browse/MESOS-6585
 Project: Mesos
  Issue Type: Task
Reporter: Jay Guo


We should have a user guide to document security features in Mesos, including 
but not limited to: capabilities, seccomp.





[jira] [Created] (MESOS-6584) Add tests for Seccomp support

2016-11-14 Thread Jay Guo (JIRA)
Jay Guo created MESOS-6584:
--

 Summary: Add tests for Seccomp support
 Key: MESOS-6584
 URL: https://issues.apache.org/jira/browse/MESOS-6584
 Project: Mesos
  Issue Type: Task
Reporter: Jay Guo
Assignee: Jay Guo


Add unit tests as well as integration tests for Seccomp support.





[jira] [Created] (MESOS-6583) Add Seccomp support for mesos-execute

2016-11-14 Thread Jay Guo (JIRA)
Jay Guo created MESOS-6583:
--

 Summary: Add Seccomp support for mesos-execute
 Key: MESOS-6583
 URL: https://issues.apache.org/jira/browse/MESOS-6583
 Project: Mesos
  Issue Type: Task
Reporter: Jay Guo
Assignee: Jay Guo


Users should be able to specify a Seccomp profile when launching a task using 
mesos-execute.





[jira] [Created] (MESOS-6582) Add Seccomp support at executor level

2016-11-13 Thread Jay Guo (JIRA)
Jay Guo created MESOS-6582:
--

 Summary: Add Seccomp support at executor level
 Key: MESOS-6582
 URL: https://issues.apache.org/jira/browse/MESOS-6582
 Project: Mesos
  Issue Type: Task
 Environment: Linux Only
Reporter: Jay Guo
Assignee: Jay Guo


In addition to the agent-level protection mentioned in MESOS-6581, Mesos users should 
be able to supply an extra Seccomp filter to their containers in order to impose 
more restrictive rules.





[jira] [Created] (MESOS-6581) Add Seccomp support at Mesos Agent level

2016-11-13 Thread Jay Guo (JIRA)
Jay Guo created MESOS-6581:
--

 Summary: Add Seccomp support at Mesos Agent level
 Key: MESOS-6581
 URL: https://issues.apache.org/jira/browse/MESOS-6581
 Project: Mesos
  Issue Type: Task
 Environment: Linux Only
Reporter: Jay Guo
Assignee: Jay Guo


An operator of a Mesos cluster should be able to enforce a set of Seccomp rules on 
a Mesos Agent to defend against potential exploits through syscalls. 
When enabled, every container launched on the Agent must comply with the 
Seccomp filter or be killed.





[jira] [Commented] (MESOS-3505) Support specifying Docker image by Image ID.

2016-10-11 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565098#comment-15565098
 ] 

Jay Guo commented on MESOS-3505:


Alright, I have unassigned myself. Please go ahead. If you need help reviewing 
the patch, I'd be glad to help.

> Support specifying Docker image by Image ID.
> 
>
> Key: MESOS-3505
> URL: https://issues.apache.org/jira/browse/MESOS-3505
> Project: Mesos
>  Issue Type: Story
>Reporter: Yan Xu
>  Labels: mesosphere
>
> A common way to specify a Docker image with the docker engine is through 
> {{repo:tag}}, which is convenient and sufficient for most people in most 
> scenarios. However this combination is neither precise nor immutable.
> For this reason, when an image with a given {{repo:tag}} is already 
> cached locally on an agent host and a task requiring this {{repo:tag}} 
> arrives, the task may run on an image different from the one the user intended.
> Docker CLI already supports referring to an image by {{repo@id}}, where the 
> ID can have two forms:
> * v1 Image ID
> * digest
> Native Mesos provisioner should support the same for Docker images. IMO it's 
> fine if image discovery by ID is not supported (and thus still requiring 
> {{repo:tag}} to be specified) (looks like [v2 
> registry|http://docs.docker.com/registry/spec/api/] does support it) but the 
> user can optionally specify an image ID and match it against the cached / 
> newly pulled image. If the ID doesn't match the cached image, the store can 
> re-pull it; if the ID doesn't match the newly pulled image (manifest), the 
> provisioner can fail the request without having the user unknowingly running 
> its task on the wrong image.





[jira] [Comment Edited] (MESOS-5735) Update WebUI to use v1 operator API

2016-09-09 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15476137#comment-15476137
 ] 

Jay Guo edited comment on MESOS-5735 at 9/9/16 6:58 AM:


JSONP won't work anyway since we moved from {{GET}} to {{POST}} in the HTTP API. 
{{CORS}} poses security risks and may only be suitable for development purposes. 
Hence, we are thinking of having proxies in the master that relay requests/responses 
between the WebUI and agents, so that resources always come from a single domain 
from the WebUI's point of view.


was (Author: guoger):
JSONP won't work anyway since we moved from `GET` to `POST` in the HTTP API. `CORS` 
poses security risks and may only be suitable for development purposes. Hence, we are 
thinking of having proxies in the master that relay requests/responses between the 
WebUI and agents, so that resources always come from a single domain from the 
WebUI's point of view.

> Update WebUI to use v1 operator API
> ---
>
> Key: MESOS-5735
> URL: https://issues.apache.org/jira/browse/MESOS-5735
> Project: Mesos
>  Issue Type: Bug
>Reporter: Vinod Kone
>Assignee: Jay Guo
>
> Having the WebUI use the v1 API would be a good validation of its usefulness 
> and correctness.





[jira] [Commented] (MESOS-5735) Update WebUI to use v1 operator API

2016-09-09 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15476137#comment-15476137
 ] 

Jay Guo commented on MESOS-5735:


JSONP won't work anyway since we moved from `GET` to `POST` in the HTTP API. `CORS` 
poses security risks and may only be suitable for development purposes. Hence, we are 
thinking of having proxies in the master that relay requests/responses between the 
WebUI and agents, so that resources always come from a single domain from the 
WebUI's point of view.

> Update WebUI to use v1 operator API
> ---
>
> Key: MESOS-5735
> URL: https://issues.apache.org/jira/browse/MESOS-5735
> Project: Mesos
>  Issue Type: Bug
>Reporter: Vinod Kone
>Assignee: Jay Guo
>
> Having the WebUI use the v1 API would be a good validation of its usefulness 
> and correctness.





[jira] [Commented] (MESOS-5735) Update WebUI to use v1 operator API

2016-09-04 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15464010#comment-15464010
 ] 

Jay Guo commented on MESOS-5735:


It seems to be the most reasonable solution so far.

> Update WebUI to use v1 operator API
> ---
>
> Key: MESOS-5735
> URL: https://issues.apache.org/jira/browse/MESOS-5735
> Project: Mesos
>  Issue Type: Bug
>Reporter: Vinod Kone
>Assignee: Jay Guo
>
> Having the WebUI use the v1 API would be a good validation of its usefulness 
> and correctness.





[jira] [Updated] (MESOS-5960) Design doc for supporting seccomp in Mesos container

2016-08-24 Thread Jay Guo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo updated MESOS-5960:
---
Description: 
https://docs.google.com/document/d/1ZryC0KAsKp8yw6L3_5ZkjL-oi0bvpoLmUKkLV3wNFGY/edit?usp=sharing

> Design doc for supporting seccomp in Mesos container
> 
>
> Key: MESOS-5960
> URL: https://issues.apache.org/jira/browse/MESOS-5960
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Jay Guo
>Assignee: Jay Guo
>
> https://docs.google.com/document/d/1ZryC0KAsKp8yw6L3_5ZkjL-oi0bvpoLmUKkLV3wNFGY/edit?usp=sharing





[jira] [Commented] (MESOS-5828) Modularize Network in replicated_log

2016-08-05 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15409176#comment-15409176
 ] 

Jay Guo commented on MESOS-5828:


Updated patch chain summary:

||Reviews||Summary||
|https://reviews.apache.org/r/50837|Fixed minor code style.|
|https://reviews.apache.org/r/50491|Added PIDGroup to libprocess.|
|https://reviews.apache.org/r/50492|Switched replicated log to use PIDGroup.|
|https://reviews.apache.org/r/50490|Separated ZooKeeper PIDGroup implementation into its own cpp/hpp.|
|https://reviews.apache.org/r/50493|Added `base` to PIDGroup.|
|https://reviews.apache.org/r/50494|Remove `base` from ZooKeeperPIDGroup.|
|https://reviews.apache.org/r/50495|Added PIDGroup module struct.|
|https://reviews.apache.org/r/50496|Added static `createPIDGroup` method to LogProcess.|
|https://reviews.apache.org/r/50497|Added new constructors in Log and LogProcess.|
|https://reviews.apache.org/r/50498|Added --pid_group flag in master.|
|https://reviews.apache.org/r/50499|Added logic in master/main.cpp to use pid_group module.|
|https://reviews.apache.org/r/50838|Updated modules documentation to reflect PIDGroup module.|

> Modularize Network in replicated_log
> 
>
> Key: MESOS-5828
> URL: https://issues.apache.org/jira/browse/MESOS-5828
> Project: Mesos
>  Issue Type: Bug
>  Components: replicated log
>Reporter: Jay Guo
>Assignee: Jay Guo
>
> Currently replicated_log relies on ZooKeeper for coordinator election. This 
> is done through the network abstraction _ZookeeperNetwork_. We need to modularize 
> this part in order to enable replicated_log when using Master 
> contender/detector modules.





[jira] [Updated] (MESOS-5960) Design doc for supporting seccomp in Mesos container

2016-08-02 Thread Jay Guo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo updated MESOS-5960:
---
   Assignee: Jay Guo
Component/s: containerization
 Issue Type: Task  (was: Bug)

> Design doc for supporting seccomp in Mesos container
> 
>
> Key: MESOS-5960
> URL: https://issues.apache.org/jira/browse/MESOS-5960
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Jay Guo
>Assignee: Jay Guo
>






[jira] [Created] (MESOS-5960) Design doc for supporting seccomp in Mesos container

2016-08-02 Thread Jay Guo (JIRA)
Jay Guo created MESOS-5960:
--

 Summary: Design doc for supporting seccomp in Mesos container
 Key: MESOS-5960
 URL: https://issues.apache.org/jira/browse/MESOS-5960
 Project: Mesos
  Issue Type: Bug
Reporter: Jay Guo








[jira] [Comment Edited] (MESOS-5186) mesos.interface: Allow using protobuf 3.x

2016-07-30 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400933#comment-15400933
 ] 

Jay Guo edited comment on MESOS-5186 at 7/31/16 3:11 AM:
-

Good to know! Unfortunately we are 'existing users' of proto2 and 'not 
recommended' to migrate to proto3. It's a bit hard to do that IMO, but I guess 
we could raise that in Mesos 2.0?


was (Author: guoger):
Good to know! Unfortunately we are 'existing users' of proto2 and 'not 
recommended' to migrate to proto3. It's a bit hard to do that IMO.

> mesos.interface: Allow using protobuf 3.x
> -
>
> Key: MESOS-5186
> URL: https://issues.apache.org/jira/browse/MESOS-5186
> Project: Mesos
>  Issue Type: Improvement
>  Components: python api
>Reporter: Myautsai PAN
>Assignee: Yong Tang
>Priority: Minor
>  Labels: easyfix
>   Original Estimate: 504h
>  Remaining Estimate: 504h
>
> We're working on integrating TensorFlow (https://www.tensorflow.org) with 
> Mesos. Both require {{protobuf}}. The Python package 
> {{mesos.interface}} requires {{protobuf>=2.6.1,<3}}, but {{tensorflow}} 
> requires {{protobuf>=3.0.0}}. Although protobuf 3.x is not compatible with 
> protobuf 2.x, we modified the {{setup.py}} 
> (https://github.com/apache/mesos/blob/66cddaf/src/python/interface/setup.py.in#L29)
> from {{'install_requires': [ 'google-common>=0.0.1', 'protobuf>=2.6.1,<3' 
> ],}} to {{'install_requires': [ 'google-common>=0.0.1', 'protobuf>=2.6.1' ],}}
> and it works fine. Would you please consider supporting protobuf 3.x officially in 
> the next release? Simply removing the {{,<3}} restriction may be enough.





[jira] [Commented] (MESOS-5186) mesos.interface: Allow using protobuf 3.x

2016-07-30 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15400933#comment-15400933
 ] 

Jay Guo commented on MESOS-5186:


Good to know! Unfortunately we are 'existing users' of proto2 and 'not 
recommended' to migrate to proto3. It's a bit hard to do that IMO.

> mesos.interface: Allow using protobuf 3.x
> -
>
> Key: MESOS-5186
> URL: https://issues.apache.org/jira/browse/MESOS-5186
> Project: Mesos
>  Issue Type: Improvement
>  Components: python api
>Reporter: Myautsai PAN
>Assignee: Yong Tang
>Priority: Minor
>  Labels: easyfix
>   Original Estimate: 504h
>  Remaining Estimate: 504h
>
> We're working on integrating TensorFlow (https://www.tensorflow.org) with 
> Mesos. Both require {{protobuf}}. The Python package 
> {{mesos.interface}} requires {{protobuf>=2.6.1,<3}}, but {{tensorflow}} 
> requires {{protobuf>=3.0.0}}. Although protobuf 3.x is not compatible with 
> protobuf 2.x, we modified the {{setup.py}} 
> (https://github.com/apache/mesos/blob/66cddaf/src/python/interface/setup.py.in#L29)
> from {{'install_requires': [ 'google-common>=0.0.1', 'protobuf>=2.6.1,<3' 
> ],}} to {{'install_requires': [ 'google-common>=0.0.1', 'protobuf>=2.6.1' ],}}
> and it works fine. Would you please consider supporting protobuf 3.x officially in 
> the next release? Simply removing the {{,<3}} restriction may be enough.





[jira] [Updated] (MESOS-5829) Mesos should be able to consume module for replicated_log

2016-07-10 Thread Jay Guo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo updated MESOS-5829:
---
External issue ID:   (was: https://issues.apache.org/jira/browse/MESOS-5828)

> Mesos should be able to consume module for replicated_log
> -
>
> Key: MESOS-5829
> URL: https://issues.apache.org/jira/browse/MESOS-5829
> Project: Mesos
>  Issue Type: Bug
>  Components: modules, replicated log
>Reporter: Jay Guo
>Assignee: Jay Guo
>
> Currently {{--quorum}} is hardcoded to 1 if no *zk* is provided, assuming 
> standalone mode; however, this assumption does not hold when using master 
> contender and detector modules.





[jira] [Created] (MESOS-5829) Mesos should be able to consume module for replicated_log

2016-07-10 Thread Jay Guo (JIRA)
Jay Guo created MESOS-5829:
--

 Summary: Mesos should be able to consume module for replicated_log
 Key: MESOS-5829
 URL: https://issues.apache.org/jira/browse/MESOS-5829
 Project: Mesos
  Issue Type: Bug
  Components: modules, replicated log
Reporter: Jay Guo
Assignee: Jay Guo


Currently {{--quorum}} is hardcoded to 1 if no *zk* is provided, assuming 
standalone mode; however, this assumption does not hold when using master 
contender and detector modules.





[jira] [Created] (MESOS-5828) Modularize Network in replicated_log

2016-07-10 Thread Jay Guo (JIRA)
Jay Guo created MESOS-5828:
--

 Summary: Modularize Network in replicated_log
 Key: MESOS-5828
 URL: https://issues.apache.org/jira/browse/MESOS-5828
 Project: Mesos
  Issue Type: Bug
  Components: replicated log
Reporter: Jay Guo
Assignee: Jay Guo


Currently replicated_log relies on ZooKeeper for coordinator election. This is 
done through the network abstraction _ZookeeperNetwork_. We need to modularize this 
part in order to enable replicated_log when using Master contender/detector 
modules.





[jira] [Commented] (MESOS-3505) Support specifying Docker image by Image ID.

2016-07-08 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15367475#comment-15367475
 ] 

Jay Guo commented on MESOS-3505:


ping [~gyliu] [~jieyu] [~xujyan]

> Support specifying Docker image by Image ID.
> 
>
> Key: MESOS-3505
> URL: https://issues.apache.org/jira/browse/MESOS-3505
> Project: Mesos
>  Issue Type: Story
>Reporter: Yan Xu
>Assignee: Jay Guo
>  Labels: mesosphere
>
> A common way to specify a Docker image with the docker engine is through 
> {{repo:tag}}, which is convenient and sufficient for most people in most 
> scenarios. However this combination is neither precise nor immutable.
> For this reason, when an image with a given {{repo:tag}} is already 
> cached locally on an agent host and a task requiring this {{repo:tag}} 
> arrives, the task may run on an image different from the one the user intended.
> Docker CLI already supports referring to an image by {{repo@id}}, where the 
> ID can have two forms:
> * v1 Image ID
> * digest
> Native Mesos provisioner should support the same for Docker images. IMO it's 
> fine if image discovery by ID is not supported (and thus still requiring 
> {{repo:tag}} to be specified) (looks like [v2 
> registry|http://docs.docker.com/registry/spec/api/] does support it) but the 
> user can optionally specify an image ID and match it against the cached / 
> newly pulled image. If the ID doesn't match the cached image, the store can 
> re-pull it; if the ID doesn't match the newly pulled image (manifest), the 
> provisioner can fail the request without having the user unknowingly running 
> its task on the wrong image.





[jira] [Comment Edited] (MESOS-3505) Support specifying Docker image by Image ID.

2016-07-01 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15358516#comment-15358516
 ] 

Jay Guo edited comment on MESOS-3505 at 7/1/16 7:19 AM:


Just want to make sure I understand the problem correctly. One should be able 
to do the following:
1. Specify the digest in the image name, e.g. {{debian@sha256:abcdef}}, in 
{{ContainerInfo.MesosInfo.Image.Docker.name}}.
2. The provisioner parses this information into an ImageReference, which should 
recognise {{digest}} (which is available in the protobuf message but not used at 
the moment).
3. If {{cached}} is set to _true_, *metadata_manager* should check whether the image 
with the specified digest exists in the local store. If not, it pulls from Docker 
registry v2 using the digest, e.g. {{debian@sha256:abcdef}}.
4. If the pull fails for any reason, the process should fail and an error should be 
returned to the user.

One question: do we assume that digests are supported when users pull from a 
private registry?

cc [~jieyu]


was (Author: guoger):
Just want to make sure I understand the problem correctly. One should be able 
to do the following:
1. Specify the digest in the image name, e.g. {{debian@sha256:abcdef}}, in 
{{ContainerInfo.MesosInfo.Image.Docker.name}}.
2. The provisioner parses this information into an ImageReference, which should 
recognise {{digest}} (which is available in the protobuf message but not used at 
the moment).
3. If {{cached}} is set to _true_, *metadata_manager* should check whether the image 
with the specified digest exists in the store. If not, it pulls from Docker 
registry v2 using the digest, e.g. {{debian@sha256:abcdef}}.
4. If the pull fails for any reason, the process should fail and an error should be 
returned to the user.

One question: do we assume that digests are supported when users pull from a 
private registry?

cc [~jieyu]

> Support specifying Docker image by Image ID.
> 
>
> Key: MESOS-3505
> URL: https://issues.apache.org/jira/browse/MESOS-3505
> Project: Mesos
>  Issue Type: Story
>Reporter: Yan Xu
>Assignee: Jay Guo
>  Labels: mesosphere
>
> A common way to specify a Docker image with the docker engine is through 
> {{repo:tag}}, which is convenient and sufficient for most people in most 
> scenarios. However this combination is neither precise nor immutable.
> For this reason, when an image with a given {{repo:tag}} is already 
> cached locally on an agent host and a task requiring this {{repo:tag}} 
> arrives, the task may run on an image different from the one the user intended.
> Docker CLI already supports referring to an image by {{repo@id}}, where the 
> ID can have two forms:
> * v1 Image ID
> * digest
> Native Mesos provisioner should support the same for Docker images. IMO it's 
> fine if image discovery by ID is not supported (and thus still requiring 
> {{repo:tag}} to be specified) (looks like [v2 
> registry|http://docs.docker.com/registry/spec/api/] does support it) but the 
> user can optionally specify an image ID and match it against the cached / 
> newly pulled image. If the ID doesn't match the cached image, the store can 
> re-pull it; if the ID doesn't match the newly pulled image (manifest), the 
> provisioner can fail the request without having the user unknowingly running 
> its task on the wrong image.





[jira] [Commented] (MESOS-3505) Support specifying Docker image by Image ID.

2016-07-01 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15358516#comment-15358516
 ] 

Jay Guo commented on MESOS-3505:


Just want to make sure I understand the problem correctly. One should be able 
to do the following:
1. Specify the digest in the image name, e.g. {{debian@sha256:abcdef}}, in 
{{ContainerInfo.MesosInfo.Image.Docker.name}}.
2. The provisioner parses this information into an ImageReference, which should 
recognise {{digest}} (which is available in the protobuf message but not used at 
the moment).
3. If {{cached}} is set to _true_, *metadata_manager* should check whether the image 
with the specified digest exists in the store. If not, it pulls from Docker 
registry v2 using the digest, e.g. {{debian@sha256:abcdef}}.
4. If the pull fails for any reason, the process should fail and an error should be 
returned to the user.

One question: do we assume that digests are supported when users pull from a 
private registry?
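
To make steps 1 to 3 concrete, here is a rough sketch of splitting an image name into repository, tag and digest, and of deciding whether a pull is needed. {{ImageReferenceSketch}}, {{parseImageName}} and {{needsPull}} are hypothetical names for illustration only; the real provisioner and *metadata_manager* code will differ.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical, simplified counterpart of the ImageReference message.
struct ImageReferenceSketch
{
  std::string repository;
  std::string tag;     // Empty when only a digest was given.
  std::string digest;  // Empty when no `@digest` suffix was given.
};

// Parses names such as `debian`, `debian:jessie`,
// `localhost:5000/debian:jessie` or `debian@sha256:abcdef`.
ImageReferenceSketch parseImageName(const std::string& name)
{
  ImageReferenceSketch reference;
  std::string remainder = name;

  // Everything after '@' is treated as the digest.
  const std::string::size_type at = remainder.find('@');
  if (at != std::string::npos) {
    reference.digest = remainder.substr(at + 1);
    remainder = remainder.substr(0, at);
  }

  // A ':' only introduces a tag if it comes after the last '/';
  // otherwise it belongs to a registry `host:port` prefix.
  const std::string::size_type slash = remainder.rfind('/');
  const std::string::size_type colon = remainder.rfind(':');
  if (colon != std::string::npos &&
      (slash == std::string::npos || colon > slash)) {
    reference.tag = remainder.substr(colon + 1);
    remainder = remainder.substr(0, colon);
  } else if (reference.digest.empty()) {
    // Docker's convention when neither tag nor digest is given.
    reference.tag = "latest";
  }

  reference.repository = remainder;
  return reference;
}

// Step 3, digest case only: pull when a digest was requested and it is
// not among the digests already present in the local store.
bool needsPull(
    const ImageReferenceSketch& reference,
    const std::set<std::string>& cachedDigests)
{
  return !reference.digest.empty() &&
         cachedDigests.count(reference.digest) == 0;
}
```

For {{debian@sha256:abcdef}} this yields repository {{debian}}, an empty tag and digest {{sha256:abcdef}}; tag-only references are left to the existing lookup path in this sketch.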

> Support specifying Docker image by Image ID.
> 
>
> Key: MESOS-3505
> URL: https://issues.apache.org/jira/browse/MESOS-3505
> Project: Mesos
>  Issue Type: Story
>Reporter: Yan Xu
>Assignee: Jay Guo
>  Labels: mesosphere
>
> A common way to specify a Docker image with the docker engine is through 
> {{repo:tag}}, which is convenient and sufficient for most people in most 
> scenarios. However this combination is neither precise nor immutable.
> For this reason, when an image with a given {{repo:tag}} is already 
> cached locally on an agent host and a task requiring this {{repo:tag}} 
> arrives, the task may run on an image different from the one the user intended.
> Docker CLI already supports referring to an image by {{repo@id}}, where the 
> ID can have two forms:
> * v1 Image ID
> * digest
> Native Mesos provisioner should support the same for Docker images. IMO it's 
> fine if image discovery by ID is not supported (and thus still requiring 
> {{repo:tag}} to be specified) (looks like [v2 
> registry|http://docs.docker.com/registry/spec/api/] does support it) but the 
> user can optionally specify an image ID and match it against the cached / 
> newly pulled image. If the ID doesn't match the cached image, the store can 
> re-pull it; if the ID doesn't match the newly pulled image (manifest), the 
> provisioner can fail the request without having the user unknowingly running 
> its task on the wrong image.





[jira] [Comment Edited] (MESOS-3505) Support specifying Docker image by Image ID.

2016-07-01 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15358516#comment-15358516
 ] 

Jay Guo edited comment on MESOS-3505 at 7/1/16 6:59 AM:


Just want to make sure I understand the problem correctly. One should be able 
to do the following:
1. Specify the digest in the image name, e.g. {{debian@sha256:abcdef}}, in 
{{ContainerInfo.MesosInfo.Image.Docker.name}}.
2. The provisioner parses this information into an ImageReference, which should 
recognise {{digest}} (which is available in the protobuf message but not used at 
the moment).
3. If {{cached}} is set to _true_, *metadata_manager* should check whether the image 
with the specified digest exists in the store. If not, it pulls from Docker 
registry v2 using the digest, e.g. {{debian@sha256:abcdef}}.
4. If the pull fails for any reason, the process should fail and an error should be 
returned to the user.

One question: do we assume that digests are supported when users pull from a 
private registry?

cc [~jieyu]


was (Author: guoger):
Just want to make sure I understand the problem correctly. One should be able 
to do the following:
1. Specify the digest in the image name, e.g. {{debian@sha256:abcdef}}, in 
{{ContainerInfo.MesosInfo.Image.Docker.name}}.
2. The provisioner parses this information into an ImageReference, which should 
recognise {{digest}} (which is available in the protobuf message but not used at 
the moment).
3. If {{cached}} is set to _true_, *metadata_manager* should check whether the image 
with the specified digest exists in the store. If not, it pulls from Docker 
registry v2 using the digest, e.g. {{debian@sha256:abcdef}}.
4. If the pull fails for any reason, the process should fail and an error should be 
returned to the user.

One question: do we assume that digests are supported when users pull from a 
private registry?

> Support specifying Docker image by Image ID.
> 
>
> Key: MESOS-3505
> URL: https://issues.apache.org/jira/browse/MESOS-3505
> Project: Mesos
>  Issue Type: Story
>Reporter: Yan Xu
>Assignee: Jay Guo
>  Labels: mesosphere
>
> A common way to specify a Docker image with the docker engine is through 
> {{repo:tag}}, which is convenient and sufficient for most people in most 
> scenarios. However, this combination is neither precise nor immutable.
> For this reason, it's possible that when an image with a {{repo:tag}} is 
> already cached locally on an agent host and a task requiring this 
> {{repo:tag}} arrives, the task uses an image that's different from the one 
> the user intended.
> Docker CLI already supports referring to an image by {{repo@id}}, where the 
> ID can have two forms:
> * v1 Image ID
> * digest
> The native Mesos provisioner should support the same for Docker images. IMO 
> it's fine if image discovery by ID is not supported, so that {{repo:tag}} 
> still has to be specified (although it looks like the [v2 
> registry|http://docs.docker.com/registry/spec/api/] does support it), but the 
> user can optionally specify an image ID and have it matched against the 
> cached / newly pulled image. If the ID doesn't match the cached image, the 
> store can re-pull it; if the ID doesn't match the newly pulled image 
> (manifest), the provisioner can fail the request rather than letting the user 
> unknowingly run its task on the wrong image.
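
The matching rule described in the issue (reuse the cached image if its ID 
matches, re-pull on a cache mismatch, fail if the freshly pulled image still 
does not match) could be sketched roughly as follows. All names here are 
hypothetical: {{cache}} is a plain dict from {{repo:tag}} to image ID and 
{{pull}} is any callable returning the ID of the pulled image. Neither is a 
Mesos API.

```python
from typing import Callable, Dict, Optional


class ImageMismatchError(Exception):
    """Raised when a freshly pulled image does not have the requested ID."""


def resolve_image(
    repo_tag: str,
    expected_id: Optional[str],
    cache: Dict[str, str],        # hypothetical store: repo:tag -> image ID
    pull: Callable[[str], str],   # hypothetical puller: returns the pulled ID
) -> str:
    cached_id = cache.get(repo_tag)
    # Use the cached image when no ID was requested or the cached ID matches.
    if cached_id is not None and expected_id in (None, cached_id):
        return cached_id

    # Cache miss or ID mismatch: pull (or re-pull) the image.
    pulled_id = pull(repo_tag)
    if expected_id is not None and pulled_id != expected_id:
        # Fail instead of silently running the task on the wrong image.
        raise ImageMismatchError(
            f"{repo_tag}: expected {expected_id}, pulled {pulled_id}"
        )

    cache[repo_tag] = pulled_id
    return pulled_id
```

The cache is only updated after the pulled image has passed the ID check, so a 
failed request leaves the previously cached entry untouched.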





[jira] [Updated] (MESOS-5751) Inconsistent display in webui

2016-06-30 Thread Jay Guo (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Guo updated MESOS-5751:
---
Attachment: homepage.png

> Inconsistent display in webui
> -
>
> Key: MESOS-5751
> URL: https://issues.apache.org/jira/browse/MESOS-5751
> Project: Mesos
>  Issue Type: Bug
>  Components: webui
>Reporter: Jay Guo
> Attachments: homepage.png
>
>
> To reproduce:
> 1. Launch master
> 2. Launch agent
> 3. Launch test-framework
> 4. Go to the webui
> We observe correct statistics on the left panel but no completed tasks on 
> the right side.





[jira] [Created] (MESOS-5751) Inconsistent display in webui

2016-06-30 Thread Jay Guo (JIRA)
Jay Guo created MESOS-5751:
--

 Summary: Inconsistent display in webui
 Key: MESOS-5751
 URL: https://issues.apache.org/jira/browse/MESOS-5751
 Project: Mesos
  Issue Type: Bug
  Components: webui
Reporter: Jay Guo


To reproduce:
1. Launch master
2. Launch agent
3. Launch test-framework
4. Go to the webui

We observe correct statistics on the left panel but no completed tasks on the 
right side.
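
One way to cross-check what the webui shows is to count completed tasks 
directly from the master's state JSON. The Python sketch below is only a rough 
aid and rests on assumptions: that the master listens on 127.0.0.1:5050, that 
the state is served at /master/state, and that each entry under "frameworks" 
and "completed_frameworks" carries a "completed_tasks" list. Adjust these to 
the actual deployment.

```python
import json
from typing import Any, Dict
from urllib.request import urlopen


def count_completed_tasks(state: Dict[str, Any]) -> int:
    """Sum completed tasks over active and completed frameworks."""
    # Assumed layout: both lists hold framework objects with "completed_tasks".
    frameworks = state.get("frameworks", []) + state.get("completed_frameworks", [])
    return sum(len(fw.get("completed_tasks", [])) for fw in frameworks)


def fetch_state(master: str = "127.0.0.1:5050") -> Dict[str, Any]:
    """Fetch the master state; address and path are assumptions, see above."""
    with urlopen(f"http://{master}/master/state") as response:
        return json.load(response)
```

Calling {{count_completed_tasks(fetch_state())}} after the test framework has 
finished gives a number to compare against both panels of the webui.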





[jira] [Commented] (MESOS-5227) Implement HTTP Docker Executor that uses the Executor Library

2016-06-26 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15350409#comment-15350409
 ] 

Jay Guo commented on MESOS-5227:


It would be great if you could add a few words next to each review link to 
summarize what it covers.

> Implement HTTP Docker Executor that uses the Executor Library
> -
>
> Key: MESOS-5227
> URL: https://issues.apache.org/jira/browse/MESOS-5227
> Project: Mesos
>  Issue Type: Bug
>Reporter: Vinod Kone
>Assignee: Yong Tang
>
> Similar to what we did with the HTTP command executor in MESOS-3558, we should 
> have an HTTP Docker executor that can speak the v1 Executor API.




