[jira] [Updated] (MESOS-6988) CLONE - WebUI redirect doesn't work with stats from /metric/snapshot

2017-01-24 Thread Yan Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Xu updated MESOS-6988:
--
Affects Version/s: 1.1.0  (was: 1.0.0)

> CLONE - WebUI redirect doesn't work with stats from /metric/snapshot
> 
>
> Key: MESOS-6988
> URL: https://issues.apache.org/jira/browse/MESOS-6988
> Project: Mesos
>  Issue Type: Bug
>  Components: webui
>Affects Versions: 1.1.0
>Reporter: Yan Xu
>
> The issue described in MESOS-6446 is still not fixed in 1.1.0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-6988) CLONE - WebUI redirect doesn't work with stats from /metric/snapshot

2017-01-24 Thread Yan Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Xu updated MESOS-6988:
--
Priority: Major  (was: Blocker)

> CLONE - WebUI redirect doesn't work with stats from /metric/snapshot
> 
>
> Key: MESOS-6988
> URL: https://issues.apache.org/jira/browse/MESOS-6988
> Project: Mesos
>  Issue Type: Bug
>  Components: webui
>Affects Versions: 1.1.0
>Reporter: Yan Xu
>
> The issue described in MESOS-6446 is still not fixed in 1.1.0.





[jira] [Updated] (MESOS-6988) CLONE - WebUI redirect doesn't work with stats from /metric/snapshot

2017-01-24 Thread Yan Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Xu updated MESOS-6988:
--
Shepherd:   (was: Vinod Kone)

> CLONE - WebUI redirect doesn't work with stats from /metric/snapshot
> 
>
> Key: MESOS-6988
> URL: https://issues.apache.org/jira/browse/MESOS-6988
> Project: Mesos
>  Issue Type: Bug
>  Components: webui
>Affects Versions: 1.0.0
>Reporter: Yan Xu
>Priority: Blocker
>
> The issue described in MESOS-6446 is still not fixed in 1.1.0.





[jira] [Updated] (MESOS-6988) CLONE - WebUI redirect doesn't work with stats from /metric/snapshot

2017-01-24 Thread Yan Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Xu updated MESOS-6988:
--
Target Version/s:   (was: 1.0.2, 1.1.0)

> CLONE - WebUI redirect doesn't work with stats from /metric/snapshot
> 
>
> Key: MESOS-6988
> URL: https://issues.apache.org/jira/browse/MESOS-6988
> Project: Mesos
>  Issue Type: Bug
>  Components: webui
>Affects Versions: 1.0.0
>Reporter: Yan Xu
>Priority: Blocker
>
> The issue described in MESOS-6446 is still not fixed in 1.1.0.





[jira] [Updated] (MESOS-6988) CLONE - WebUI redirect doesn't work with stats from /metric/snapshot

2017-01-24 Thread Yan Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Xu updated MESOS-6988:
--
Assignee: (was: haosdent)

> CLONE - WebUI redirect doesn't work with stats from /metric/snapshot
> 
>
> Key: MESOS-6988
> URL: https://issues.apache.org/jira/browse/MESOS-6988
> Project: Mesos
>  Issue Type: Bug
>  Components: webui
>Affects Versions: 1.0.0
>Reporter: Yan Xu
>Priority: Blocker
>
> The issue described in MESOS-6446 is still not fixed in 1.1.0.





[jira] [Updated] (MESOS-6988) CLONE - WebUI redirect doesn't work with stats from /metric/snapshot

2017-01-24 Thread Yan Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Xu updated MESOS-6988:
--
Fix Version/s:   (was: 1.0.2, 1.1.0, 1.2.0)

> CLONE - WebUI redirect doesn't work with stats from /metric/snapshot
> 
>
> Key: MESOS-6988
> URL: https://issues.apache.org/jira/browse/MESOS-6988
> Project: Mesos
>  Issue Type: Bug
>  Components: webui
>Affects Versions: 1.0.0
>Reporter: Yan Xu
>Priority: Blocker
>
> The issue described in MESOS-6446 is still not fixed in 1.1.0.





[jira] [Updated] (MESOS-6988) CLONE - WebUI redirect doesn't work with stats from /metric/snapshot

2017-01-24 Thread Yan Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Xu updated MESOS-6988:
--
Description: The issue described in MESOS-6446 is still not fixed in 1.1.0. 
 (was: The issue )

> CLONE - WebUI redirect doesn't work with stats from /metric/snapshot
> 
>
> Key: MESOS-6988
> URL: https://issues.apache.org/jira/browse/MESOS-6988
> Project: Mesos
>  Issue Type: Bug
>  Components: webui
>Affects Versions: 1.0.0
>Reporter: Yan Xu
>Assignee: haosdent
>Priority: Blocker
> Fix For: 1.0.2, 1.1.0, 1.2.0
>
>
> The issue described in MESOS-6446 is still not fixed in 1.1.0.





[jira] [Updated] (MESOS-6988) CLONE - WebUI redirect doesn't work with stats from /metric/snapshot

2017-01-24 Thread Yan Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Xu updated MESOS-6988:
--
Description: The issue   (was: After Mesos 1.0, the webUI redirect is 
hidden from the users so you can go to any of the master and the webUI is 
populated with state.json from the leading master. 

This doesn't include stats from /metric/snapshot though as it is not 
redirected. The user ends up seeing some fields with empty values.)

> CLONE - WebUI redirect doesn't work with stats from /metric/snapshot
> 
>
> Key: MESOS-6988
> URL: https://issues.apache.org/jira/browse/MESOS-6988
> Project: Mesos
>  Issue Type: Bug
>  Components: webui
>Affects Versions: 1.0.0
>Reporter: Yan Xu
>Assignee: haosdent
>Priority: Blocker
> Fix For: 1.0.2, 1.1.0, 1.2.0
>
>
> The issue 





[jira] [Created] (MESOS-6988) CLONE - WebUI redirect doesn't work with stats from /metric/snapshot

2017-01-24 Thread Yan Xu (JIRA)
Yan Xu created MESOS-6988:
-

 Summary: CLONE - WebUI redirect doesn't work with stats from 
/metric/snapshot
 Key: MESOS-6988
 URL: https://issues.apache.org/jira/browse/MESOS-6988
 Project: Mesos
  Issue Type: Bug
  Components: webui
Affects Versions: 1.0.0
Reporter: Yan Xu
Assignee: haosdent
Priority: Blocker
 Fix For: 1.0.2, 1.1.0, 1.2.0


After Mesos 1.0, the webUI redirect is hidden from users, so you can go to 
any of the masters and the webUI is populated with state.json from the leading 
master. 

This doesn't include stats from /metric/snapshot, though, because that 
endpoint is not redirected. The user ends up seeing some fields with empty values.





[jira] [Commented] (MESOS-6953) A compromised mesos-master node can execute code as root on agents.

2017-01-24 Thread Anindya Sinha (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15837018#comment-15837018
 ] 

Anindya Sinha commented on MESOS-6953:
--

In the normal case (when the master is not compromised), the {{run_tasks}} 
acls should always be the same on each agent of the cluster, so a framework 
can be sure that its tasks would launch on any agent if it passes 
authorization on the master. In the case of a compromised master, we do not 
want the agent to launch tasks as a privileged user. The check against the 
{{run_tasks}} acl on the agent is just for that purpose.

Regarding the live upgrade case: If this functionality is desired (i.e. to 
protect against running tasks on the agent as privileged users through a 
compromised master), we need to add the {{run_tasks}} acl (not all acls) on 
each agent that matches with the {{run_tasks}} acl on the master.

Another option, instead of using the framework principal as the "subject", could 
be to add another flag for mesos-slave that lists the {{whitelisted_users}} 
(instead of using {{acls}}). The agent would check that the task user for the 
task about to be launched is included in that list of whitelisted users. The 
reason for using {{acls}} on the agent is mainly to reuse the existing 
authorization module.
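
The whitelist alternative could be sketched roughly as follows. This is only an illustration in Python, not Mesos code: the function name `may_launch_task` is hypothetical, and `whitelisted_users` stands in for the proposed agent flag, which does not exist today.

```python
def may_launch_task(task_user, whitelisted_users):
    """Return True only if the task's user is in the agent's whitelist.

    Hypothetical helper: `whitelisted_users` models the proposed agent
    flag as an iterable of user names; nothing here is Mesos API.
    """
    return task_user in set(whitelisted_users)


# Example: an agent configured to allow only unprivileged users.
allowed = ["nobody", "app"]
print(may_launch_task("app", allowed))   # True
print(may_launch_task("root", allowed))  # False
```

With a check like this on the agent, a launch request for a `root` task would be rejected locally even if a compromised master forwarded it.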

> A compromised mesos-master node can execute code as root on agents.
> ---
>
> Key: MESOS-6953
> URL: https://issues.apache.org/jira/browse/MESOS-6953
> Project: Mesos
>  Issue Type: Bug
>  Components: security
>Reporter: Anindya Sinha
>Assignee: Anindya Sinha
>  Labels: security, slave
>
> mesos-master has a `--[no-]root_submissions` flag that controls whether 
> frameworks with `root` user are admitted to the cluster.
> However, if a mesos-master node is compromised, it can attempt to schedule 
> tasks on an agent as the `root` user. Since mesos-agent has no check against 
> tasks running on the agent for specific users, tasks can get run with `root` 
> privileges within the container on the agent.





[jira] [Commented] (MESOS-6446) WebUI redirect doesn't work with stats from /metric/snapshot

2017-01-24 Thread Yan Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15836998#comment-15836998
 ] 

Yan Xu commented on MESOS-6446:
---

You should be able to read the metrics endpoint directly; this ticket is about 
the webUI reading its metrics from the leading master.

Alright I'll open a new one.

> WebUI redirect doesn't work with stats from /metric/snapshot
> 
>
> Key: MESOS-6446
> URL: https://issues.apache.org/jira/browse/MESOS-6446
> Project: Mesos
>  Issue Type: Bug
>  Components: webui
>Affects Versions: 1.0.0
>Reporter: Yan Xu
>Assignee: haosdent
>Priority: Blocker
> Fix For: 1.0.2, 1.1.0, 1.2.0
>
> Attachments: Screen Shot 2016-10-21 at 12.04.23 PM.png, 
> webui_metrics.gif
>
>
> After Mesos 1.0, the webUI redirect is hidden from users, so you can go to 
> any of the masters and the webUI is populated with state.json from the leading 
> master. 
> This doesn't include stats from /metric/snapshot, though, because that 
> endpoint is not redirected. The user ends up seeing some fields with empty values.





[jira] [Commented] (MESOS-6446) WebUI redirect doesn't work with stats from /metric/snapshot

2017-01-24 Thread Adam B (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15836985#comment-15836985
 ] 

Adam B commented on MESOS-6446:
---

And please open a new (cloned even) ticket for the non-leading masters, since 
we've already committed some fixes to 3 different releases, and set the 
fixVersions accordingly. It'll be easier to track the fixVersions for the new 
issue/fix/backports.

> WebUI redirect doesn't work with stats from /metric/snapshot
> 
>
> Key: MESOS-6446
> URL: https://issues.apache.org/jira/browse/MESOS-6446
> Project: Mesos
>  Issue Type: Bug
>  Components: webui
>Affects Versions: 1.0.0
>Reporter: Yan Xu
>Assignee: haosdent
>Priority: Blocker
> Fix For: 1.0.2, 1.1.0, 1.2.0
>
> Attachments: Screen Shot 2016-10-21 at 12.04.23 PM.png, 
> webui_metrics.gif
>
>
> After Mesos 1.0, the webUI redirect is hidden from users, so you can go to 
> any of the masters and the webUI is populated with state.json from the leading 
> master. 
> This doesn't include stats from /metric/snapshot, though, because that 
> endpoint is not redirected. The user ends up seeing some fields with empty values.





[jira] [Commented] (MESOS-6446) WebUI redirect doesn't work with stats from /metric/snapshot

2017-01-24 Thread Adam B (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15836981#comment-15836981
 ] 

Adam B commented on MESOS-6446:
---

But might you need to read the metrics for the non-leading masters themselves, 
instead of always getting the metrics from the leading master? I'm not sure we 
always want to redirect for metrics..

> WebUI redirect doesn't work with stats from /metric/snapshot
> 
>
> Key: MESOS-6446
> URL: https://issues.apache.org/jira/browse/MESOS-6446
> Project: Mesos
>  Issue Type: Bug
>  Components: webui
>Affects Versions: 1.0.0
>Reporter: Yan Xu
>Assignee: haosdent
>Priority: Blocker
> Fix For: 1.0.2, 1.1.0, 1.2.0
>
> Attachments: Screen Shot 2016-10-21 at 12.04.23 PM.png, 
> webui_metrics.gif
>
>
> After Mesos 1.0, the webUI redirect is hidden from users, so you can go to 
> any of the masters and the webUI is populated with state.json from the leading 
> master. 
> This doesn't include stats from /metric/snapshot, though, because that 
> endpoint is not redirected. The user ends up seeing some fields with empty values.





[jira] [Comment Edited] (MESOS-5116) Investigate supporting accounting only mode in XFS isolator

2017-01-24 Thread James Peach (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15836891#comment-15836891
 ] 

James Peach edited comment on MESOS-5116 at 1/25/17 1:10 AM:
-

| Stop storing agent flags in the XFS disk isolator. | https://reviews.apache.org/r/55896/ |
| Add support for not enforcing XFS quotas. | https://reviews.apache.org/r/55897/ |
| Update XFS disk isolator documentation. | https://reviews.apache.org/r/55903/ |


was (Author: jamespeach):
| Stop storing agent flags in the XFS disk isolator. | https://reviews.apache.org/r/55896/ |
| Add support for not enforcing XFS quotas. | https://reviews.apache.org/r/55897/ |

> Investigate supporting accounting only mode in XFS isolator
> ---
>
> Key: MESOS-5116
> URL: https://issues.apache.org/jira/browse/MESOS-5116
> Project: Mesos
>  Issue Type: Improvement
>  Components: containerization
>Reporter: Yan Xu
>Assignee: James Peach
>
> The initial implementation of XFS isolator always enforces the disk quota 
> limit. In contrast, Posix disk isolator supports optionally monitoring the 
> disk usage without enforcement. This eases the transition into disk quota 
> enforcement mode.
> Mesos agent provides a {{flags.enforce_container_disk_quota}} flag to turn on 
> enforcement when the Posix isolator is added. With XFS either we support it 
> as well or we need to change the flag so it's Posix disk isolator specific.





[jira] [Commented] (MESOS-6953) A compromised mesos-master node can execute code as root on agents.

2017-01-24 Thread Adam B (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15836937#comment-15836937
 ] 

Adam B commented on MESOS-6953:
---

cc: [~arojas]
Interesting.. So you use the framework principal as the "subject", although 
it's the master that's actually making the request?
So, now, if a framework wants to run a task, it must have permission not just 
on the masters, but also on every agent (where it might want to run)? What if 
it has the ACL on some agents, but not others? How would it discover that, by 
trial and error?
What's the live upgrade story here? Operators must copy the run_tasks ACL from 
the masters to all agents (and restart the agents)?

> A compromised mesos-master node can execute code as root on agents.
> ---
>
> Key: MESOS-6953
> URL: https://issues.apache.org/jira/browse/MESOS-6953
> Project: Mesos
>  Issue Type: Bug
>  Components: security
>Reporter: Anindya Sinha
>Assignee: Anindya Sinha
>  Labels: security, slave
>
> mesos-master has a `--[no-]root_submissions` flag that controls whether 
> frameworks with `root` user are admitted to the cluster.
> However, if a mesos-master node is compromised, it can attempt to schedule 
> tasks on an agent as the `root` user. Since mesos-agent has no check against 
> tasks running on the agent for specific users, tasks can get run with `root` 
> privileges within the container on the agent.





[jira] [Commented] (MESOS-6375) Support hierarchical resource allocation roles.

2017-01-24 Thread Neil Conway (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15836740#comment-15836740
 ] 

Neil Conway commented on MESOS-6375:


Design doc: 
https://docs.google.com/document/d/1Ie2-6O400ayNXtRqipHq6_CCQ4wOoLWzoqql3b0Y6HU/edit#

> Support hierarchical resource allocation roles.
> ---
>
> Key: MESOS-6375
> URL: https://issues.apache.org/jira/browse/MESOS-6375
> Project: Mesos
>  Issue Type: Epic
>  Components: allocation
>Reporter: Benjamin Mahler
>
> Currently mesos provides a non-hierarchical resource allocation model, in 
> which all roles are siblings of one another.
> Organizations often have a need for hierarchical resource allocation 
> constraints, whether for fair sharing of resources or for specifying quota 
> constraints.
> Consider the following fair sharing hierarchy based on "shares":
> {noformat}
>           ^                           ^
>         /   \                       /   \
>       /       \                   /       \
>   eng (3)   sales (1)  =>   eng (75%)  sales (25%)
>      ^                          ^
>    /   \                      /   \
>  /       \                  /       \
> ads (2)  build (1)        ads (66%)  build (33%)
> {noformat}
> The hierarchy specifies that the engineering organization should get 3x as 
> many resources as sales, and within these resources the ads team should get 
> 2x as many resources as the build team. The implication of this is that, if 
> the ads team is not using some of its resources, the build team and 
> engineering organization will be able to use these resources before the sales 
> organization can. Without a hierarchy, the resources unused by the ads team 
> would be re-distributed among all other roles (rather than only its siblings).
> Quota can also apply in a hierarchical manner:
> {noformat}
>                  ^
>                /   \
>              /       \
>    eng (90 cpus)   sales (10 cpus)
>          ^
>        /   \
>      /       \
> ads (50 cpus)   build (10 cpus)
> {noformat}
> See https://people.eecs.berkeley.edu/~alig/papers/h-drf.pdf for some 
> discussion w.r.t. sharing resources in a hierarchical model.
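
The share arithmetic in the first diagram can be sketched as follows. This is a minimal illustration in Python, not the Mesos allocator: the function name `fractions`, the nested-dict layout, and the flat role names are all assumptions made for the example. Note that the diagram's 66%/33% for ads and build are relative to eng's portion, whereas this sketch reports each role's fraction of the whole cluster.

```python
def fractions(tree, total=1.0):
    """Map each role to its fraction of `total`, split among siblings by share.

    `tree` maps role name -> (share, children); `children` has the same
    shape. This layout is hypothetical, chosen only for the illustration.
    """
    result = {}
    share_sum = sum(share for share, _ in tree.values())
    for name, (share, children) in tree.items():
        portion = total * share / share_sum
        result[name] = portion
        # Children divide up their parent's portion, not the whole cluster.
        result.update(fractions(children, portion))
    return result


hierarchy = {
    "eng": (3, {"ads": (2, {}), "build": (1, {})}),
    "sales": (1, {}),
}

for role, fraction in fractions(hierarchy).items():
    print(f"{role}: {fraction:.0%} of the cluster")
```

Because children only ever split their parent's portion, resources left unused by ads stay within eng's subtree before sales is considered, which is the behaviour the description asks for.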





[jira] [Assigned] (MESOS-6896) Support backend per container.

2017-01-24 Thread Gilbert Song (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gilbert Song reassigned MESOS-6896:
---

Assignee: Gilbert Song

> Support backend per container.
> --
>
> Key: MESOS-6896
> URL: https://issues.apache.org/jira/browse/MESOS-6896
> Project: Mesos
>  Issue Type: Improvement
>  Components: containerization
>Reporter: Gilbert Song
>Assignee: Gilbert Song
>  Labels: backend, containerizer
>
> Currently, the container backend is determined by an agent flag, and all 
> containers use the same backend. A per-container backend could be achieved 
> by introducing a user-facing API, which would serve more demanding use 
> cases (e.g., imagine a group of containers/nested containers running an 
> application, where some containers only read from huge images and others 
> only write to pluggable volumes).





[jira] [Comment Edited] (MESOS-6904) Perform batching of allocations to reduce allocator queue backlogging.

2017-01-24 Thread Yan Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15817026#comment-15817026
 ] 

Yan Xu edited comment on MESOS-6904 at 1/24/17 9:55 PM:


Reviews currently in progress: 
https://reviews.apache.org/r/51027/
https://reviews.apache.org/r/51028/
https://reviews.apache.org/r/52534/
https://reviews.apache.org/r/55852/
https://reviews.apache.org/r/55893/
https://reviews.apache.org/r/55874/


was (Author: jjanco):
Reviews currently in progress: 
https://reviews.apache.org/r/51027/
https://reviews.apache.org/r/51028/
https://reviews.apache.org/r/52534/
WIP from [~gyliu]
https://reviews.apache.org/r/51621/

> Perform batching of allocations to reduce allocator queue backlogging.
> --
>
> Key: MESOS-6904
> URL: https://issues.apache.org/jira/browse/MESOS-6904
> Project: Mesos
>  Issue Type: Bug
>  Components: allocation
>Reporter: Jacob Janco
>Assignee: Jacob Janco
>Priority: Critical
>  Labels: allocator
>
> Per MESOS-3157:
> {quote}
> Our deployment environments have a lot of churn, with many short-lived 
> frameworks that often revive offers. Running the allocator takes a long time 
> (from seconds up to minutes).
> In this situation, event-triggered allocation causes the event queue in the 
> allocator process to get very long, and the allocator effectively becomes 
> unresponsive (eg. a revive offers message takes too long to come to the head 
> of the queue).
> {quote}
> To remedy the above scenario, it is proposed to perform batching of the 
> enqueued allocation operations so that a single allocation operation can 
> satisfy N enqueued allocations. This should reduce the potential for 
> backlogging in the allocator. See the discussion 
> [here|https://issues.apache.org/jira/browse/MESOS-3157?focusedCommentId=14728377&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14728377]
>  in MESOS-3157.
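
The batching idea can be sketched with a toy model. This is a Python illustration under stated assumptions, not the allocator code under review: the class `BatchingAllocator`, its method names, and the use of agent ids as the unit of work are all hypothetical.

```python
class BatchingAllocator:
    """Toy model: coalesce many allocation requests into one allocation run."""

    def __init__(self):
        self.pending_agents = set()  # agents with an allocation requested
        self.runs = 0                # number of (expensive) allocation runs

    def request_allocation(self, agent_id):
        # Cheap: only record that this agent needs allocating.
        self.pending_agents.add(agent_id)

    def run_pending(self):
        # One expensive pass serves every request enqueued so far.
        if not self.pending_agents:
            return []
        batch = sorted(self.pending_agents)
        self.pending_agents.clear()
        self.runs += 1
        return batch


allocator = BatchingAllocator()
for agent in ["a1", "a2", "a1", "a3", "a2"]:
    allocator.request_allocation(agent)  # five requests enqueued

print(allocator.run_pending())  # a single run covers the distinct agents
print(allocator.runs)
```

The point of the design is that enqueueing stays cheap while the expensive allocation pass runs once per batch, so N event-triggered requests no longer put N long-running allocations in the queue.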





[jira] [Created] (MESOS-6987) Incorrect metrics when framework on unreachable agent is torndown

2017-01-24 Thread Neil Conway (JIRA)
Neil Conway created MESOS-6987:
--

 Summary: Incorrect metrics when framework on unreachable agent is 
torndown
 Key: MESOS-6987
 URL: https://issues.apache.org/jira/browse/MESOS-6987
 Project: Mesos
  Issue Type: Bug
  Components: master
Reporter: Neil Conway
Assignee: Neil Conway
Priority: Minor
 Attachments: disconnect_framework_metrics_wrong-1.patch

See attached patch. Scenario:

* task T for framework F is launched on agent X
* agent X is marked unreachable
* framework F is torn down
* agent X re-registers
* task T is shut down

The task is listed as "killed" in the {{/tasks}} endpoint, but "unreachable" in 
the master's metrics.





[jira] [Updated] (MESOS-6987) Incorrect metrics when framework on unreachable agent is torndown

2017-01-24 Thread Neil Conway (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neil Conway updated MESOS-6987:
---
Attachment: disconnect_framework_metrics_wrong-1.patch

> Incorrect metrics when framework on unreachable agent is torndown
> -
>
> Key: MESOS-6987
> URL: https://issues.apache.org/jira/browse/MESOS-6987
> Project: Mesos
>  Issue Type: Bug
>  Components: master
>Reporter: Neil Conway
>Assignee: Neil Conway
>Priority: Minor
>  Labels: mesosphere, metrics
> Attachments: disconnect_framework_metrics_wrong-1.patch
>
>
> See attached patch. Scenario:
> * task T for framework F is launched on agent X
> * agent X is marked unreachable
> * framework F is torn down
> * agent X re-registers
> * task T is shut down
> The task is listed as "killed" in the {{/tasks}} endpoint, but "unreachable" 
> in the master's metrics.





[jira] [Created] (MESOS-6986) abort in DRFSorter::add

2017-01-24 Thread Yvan Royon (JIRA)
Yvan Royon created MESOS-6986:
-

 Summary: abort in DRFSorter::add
 Key: MESOS-6986
 URL: https://issues.apache.org/jira/browse/MESOS-6986
 Project: Mesos
  Issue Type: Bug
  Components: allocation
Affects Versions: 1.0.1
 Environment: Mesosphere Enterprise DC/OS, CoreOS
Reporter: Yvan Royon


My mesos-master process terminated on SIGABRT.
The CHECK failed in function {{DRFSorter::add}}:
https://github.com/apache/mesos/blob/master/src/master/allocator/sorter/drf/sorter.cpp#L74

It seems there is a condition during framework registration where names are 
lost?

We are using the mesos-go library ({{next}} branch), which uses the new HTTP 
API. The framework is custom Go code. The crash is hard to reliably reproduce.

{code}
mesos-master[90061]: F0119 01:07:57.426159 90086 sorter.cpp:73] Check failed: !contains(name)
mesos-master[90061]: *** Check failure stack trace: ***
mesos-master[90061]: @ 0x7f960d9299fd  google::LogMessage::Fail()
mesos-master[90061]: @ 0x7f960d92b82d  google::LogMessage::SendToLog()
mesos-master[90061]: @ 0x7f960d9295ec  google::LogMessage::Flush()
mesos-master[90061]: @ 0x7f960d92c129  google::LogMessageFatal::~LogMessageFatal()
mesos-master[90061]: @ 0x7f960d03460d  mesos::internal::master::allocator::DRFSorter::add()
mesos-master[90061]: @ 0x7f960d021177  mesos::internal::master::allocator::internal::HierarchicalAllocatorProcess::addFramework()
mesos-master[90061]: @ 0x7f960d8b9381  process::ProcessManager::resume()
mesos-master[90061]: @ 0x7f960d8b9687  _ZNSt6thread5_ImplISt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUt_vEEE6_M_runEv
mesos-master[90061]: @ 0x7f960bf52d73  (unknown)
mesos-master[90061]: @ 0x7f960b74f52c  (unknown)
mesos-master[90061]: @ 0x7f960b49180d  (unknown)
systemd[1]: dcos-mesos-master.service: Main process exited, code=killed, status=6/ABRT
{code}





[jira] [Commented] (MESOS-6985) os::getenv() can segfault

2017-01-24 Thread Benjamin Bannier (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15836657#comment-15836657
 ] 

Benjamin Bannier commented on MESOS-6985:
-

Are we sure this is caused by {{os::getenv}} itself? In test code we sometimes 
call, e.g., {{os::setenv}} and read the values back later. We avoid this in 
non-test code because {{::getenv}} is not required to be reentrant, and ideally 
we would not mutate the environment in test code either once multiple actors 
are running.

> os::getenv() can segfault
> -
>
> Key: MESOS-6985
> URL: https://issues.apache.org/jira/browse/MESOS-6985
> Project: Mesos
>  Issue Type: Bug
>  Components: stout
> Environment: ASF CI, Ubuntu 14.04 and CentOS 7 both with and without 
> libevent/SSL
>Reporter: Greg Mann
>  Labels: stout
> Attachments: 
> MasterMaintenanceTest.InverseOffersFilters-truncated.txt, 
> MasterTest.MultipleExecutors.txt
>
>
> This was observed on ASF CI. The segfault first showed up on CI on 9/20/16 
> and has been produced by the tests {{MasterTest.MultipleExecutors}} and 
> {{MasterMaintenanceTest.InverseOffersFilters}}. In both cases, 
> {{os::getenv()}} segfaults with the same stack trace:
> {code}
> *** Aborted at 1485241617 (unix time) try "date -d @1485241617" if you are 
> using GNU date ***
> PC: @ 0x2ad59e3ae82d (unknown)
> I0124 07:06:57.422080 28619 exec.cpp:162] Version: 1.2.0
> *** SIGSEGV (@0xf0) received by PID 28591 (TID 0x2ad5a7b87700) from PID 240; 
> stack trace: ***
> I0124 07:06:57.422336 28615 exec.cpp:212] Executor started at: 
> executor(75)@172.17.0.2:45752 with pid 28591
> @ 0x2ad5ab953197 (unknown)
> @ 0x2ad5ab957479 (unknown)
> @ 0x2ad59e165330 (unknown)
> @ 0x2ad59e3ae82d (unknown)
> @ 0x2ad594631358 os::getenv()
> @ 0x2ad59aba6acf mesos::internal::slave::executorEnvironment()
> @ 0x2ad59ab845c0 mesos::internal::slave::Framework::launchExecutor()
> @ 0x2ad59ab818a2 mesos::internal::slave::Slave::_run()
> @ 0x2ad59ac1ec10 
> _ZZN7process8dispatchIN5mesos8internal5slave5SlaveERKNS_6FutureIbEERKNS1_13FrameworkInfoERKNS1_12ExecutorInfoERK6OptionINS1_8TaskInfoEERKSF_INS1_13TaskGroupInfoEES6_S9_SC_SH_SL_EEvRKNS_3PIDIT_EEMSP_FvT0_T1_T2_T3_T4_ET5_T6_T7_T8_T9_ENKUlPNS_11ProcessBaseEE_clES16_
> @ 0x2ad59ac1e6bf 
> _ZNSt17_Function_handlerIFvPN7process11ProcessBaseEEZNS0_8dispatchIN5mesos8internal5slave5SlaveERKNS0_6FutureIbEERKNS5_13FrameworkInfoERKNS5_12ExecutorInfoERK6OptionINS5_8TaskInfoEERKSJ_INS5_13TaskGroupInfoEESA_SD_SG_SL_SP_EEvRKNS0_3PIDIT_EEMST_FvT0_T1_T2_T3_T4_ET5_T6_T7_T8_T9_EUlS2_E_E9_M_invokeERKSt9_Any_dataS2_
> @ 0x2ad59bce2304 std::function<>::operator()()
> @ 0x2ad59bcc9824 process::ProcessBase::visit()
> @ 0x2ad59bd4028e process::DispatchEvent::visit()
> @ 0x2ad594616df1 process::ProcessBase::serve()
> @ 0x2ad59bcc72b7 process::ProcessManager::resume()
> @ 0x2ad59bcd567c 
> process::ProcessManager::init_threads()::$_2::operator()()
> @ 0x2ad59bcd5585 
> _ZNSt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvE3$_2vEE9_M_invokeIJEEEvSt12_Index_tupleIJXspT_EEE
> @ 0x2ad59bcd std::_Bind_simple<>::operator()()
> @ 0x2ad59bcd552c std::thread::_Impl<>::_M_run()
> @ 0x2ad59d9e6a60 (unknown)
> @ 0x2ad59e15d184 start_thread
> @ 0x2ad59e46d37d (unknown)
> make[4]: *** [check-local] Segmentation fault
> {code}
> Find attached the full log from a failed run of 
> {{MasterTest.MultipleExecutors}} and a truncated log from a failed run of 
> {{MasterMaintenanceTest.InverseOffersFilters}}.





[jira] [Updated] (MESOS-6985) os::getenv() can segfault

2017-01-24 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-6985:
-
Attachment: MasterMaintenanceTest.InverseOffersFilters-truncated.txt

> os::getenv() can segfault
> -
>
> Key: MESOS-6985
> URL: https://issues.apache.org/jira/browse/MESOS-6985
> Project: Mesos
>  Issue Type: Bug
>  Components: stout
> Environment: ASF CI, Ubuntu 14.04 and CentOS 7 both with and without 
> libevent/SSL
>Reporter: Greg Mann
>  Labels: stout
> Attachments: 
> MasterMaintenanceTest.InverseOffersFilters-truncated.txt, 
> MasterTest.MultipleExecutors.txt
>
>
> This was observed on ASF CI. The segfault first showed up on CI on 9/20/16 
> and has been produced by the tests {{MasterTest.MultipleExecutors}} and 
> {{MasterMaintenanceTest.InverseOffersFilters}}. In both cases, 
> {{os::getenv()}} segfaults with the same stack trace:
> {code}
> *** Aborted at 1485241617 (unix time) try "date -d @1485241617" if you are 
> using GNU date ***
> PC: @ 0x2ad59e3ae82d (unknown)
> I0124 07:06:57.422080 28619 exec.cpp:162] Version: 1.2.0
> *** SIGSEGV (@0xf0) received by PID 28591 (TID 0x2ad5a7b87700) from PID 240; 
> stack trace: ***
> I0124 07:06:57.422336 28615 exec.cpp:212] Executor started at: 
> executor(75)@172.17.0.2:45752 with pid 28591
> @ 0x2ad5ab953197 (unknown)
> @ 0x2ad5ab957479 (unknown)
> @ 0x2ad59e165330 (unknown)
> @ 0x2ad59e3ae82d (unknown)
> @ 0x2ad594631358 os::getenv()
> @ 0x2ad59aba6acf mesos::internal::slave::executorEnvironment()
> @ 0x2ad59ab845c0 mesos::internal::slave::Framework::launchExecutor()
> @ 0x2ad59ab818a2 mesos::internal::slave::Slave::_run()
> @ 0x2ad59ac1ec10 
> _ZZN7process8dispatchIN5mesos8internal5slave5SlaveERKNS_6FutureIbEERKNS1_13FrameworkInfoERKNS1_12ExecutorInfoERK6OptionINS1_8TaskInfoEERKSF_INS1_13TaskGroupInfoEES6_S9_SC_SH_SL_EEvRKNS_3PIDIT_EEMSP_FvT0_T1_T2_T3_T4_ET5_T6_T7_T8_T9_ENKUlPNS_11ProcessBaseEE_clES16_
> @ 0x2ad59ac1e6bf 
> _ZNSt17_Function_handlerIFvPN7process11ProcessBaseEEZNS0_8dispatchIN5mesos8internal5slave5SlaveERKNS0_6FutureIbEERKNS5_13FrameworkInfoERKNS5_12ExecutorInfoERK6OptionINS5_8TaskInfoEERKSJ_INS5_13TaskGroupInfoEESA_SD_SG_SL_SP_EEvRKNS0_3PIDIT_EEMST_FvT0_T1_T2_T3_T4_ET5_T6_T7_T8_T9_EUlS2_E_E9_M_invokeERKSt9_Any_dataS2_
> @ 0x2ad59bce2304 std::function<>::operator()()
> @ 0x2ad59bcc9824 process::ProcessBase::visit()
> @ 0x2ad59bd4028e process::DispatchEvent::visit()
> @ 0x2ad594616df1 process::ProcessBase::serve()
> @ 0x2ad59bcc72b7 process::ProcessManager::resume()
> @ 0x2ad59bcd567c process::ProcessManager::init_threads()::$_2::operator()()
> @ 0x2ad59bcd5585 _ZNSt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvE3$_2vEE9_M_invokeIJEEEvSt12_Index_tupleIJXspT_EEE
> @ 0x2ad59bcd std::_Bind_simple<>::operator()()
> @ 0x2ad59bcd552c std::thread::_Impl<>::_M_run()
> @ 0x2ad59d9e6a60 (unknown)
> @ 0x2ad59e15d184 start_thread
> @ 0x2ad59e46d37d (unknown)
> make[4]: *** [check-local] Segmentation fault
> {code}
> Find attached the full log from a failed run of 
> {{MasterTest.MultipleExecutors}} and a truncated log from a failed run of 
> {{MasterMaintenanceTest.InverseOffersFilters}}.





[jira] [Updated] (MESOS-6985) os::getenv() can segfault

2017-01-24 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-6985:
-
Attachment: MasterTest.MultipleExecutors.txt

> os::getenv() can segfault
> -
>
> Key: MESOS-6985
> URL: https://issues.apache.org/jira/browse/MESOS-6985
> Project: Mesos
>  Issue Type: Bug
>  Components: stout
> Environment: ASF CI, Ubuntu 14.04 and CentOS 7 both with and without 
> libevent/SSL
>Reporter: Greg Mann
>  Labels: stout
> Attachments: MasterTest.MultipleExecutors.txt
>
>
> This was observed on ASF CI. The segfault first showed up on CI on 9/20/16 
> and has been produced by the tests {{MasterTest.MultipleExecutors}} and 
> {{MasterMaintenanceTest.InverseOffersFilters}}. In both cases, 
> {{os::getenv()}} segfaults with the same stack trace:
> {code}
> *** Aborted at 1485241617 (unix time) try "date -d @1485241617" if you are using GNU date ***
> PC: @ 0x2ad59e3ae82d (unknown)
> I0124 07:06:57.422080 28619 exec.cpp:162] Version: 1.2.0
> *** SIGSEGV (@0xf0) received by PID 28591 (TID 0x2ad5a7b87700) from PID 240; stack trace: ***
> I0124 07:06:57.422336 28615 exec.cpp:212] Executor started at: executor(75)@172.17.0.2:45752 with pid 28591
> @ 0x2ad5ab953197 (unknown)
> @ 0x2ad5ab957479 (unknown)
> @ 0x2ad59e165330 (unknown)
> @ 0x2ad59e3ae82d (unknown)
> @ 0x2ad594631358 os::getenv()
> @ 0x2ad59aba6acf mesos::internal::slave::executorEnvironment()
> @ 0x2ad59ab845c0 mesos::internal::slave::Framework::launchExecutor()
> @ 0x2ad59ab818a2 mesos::internal::slave::Slave::_run()
> @ 0x2ad59ac1ec10 _ZZN7process8dispatchIN5mesos8internal5slave5SlaveERKNS_6FutureIbEERKNS1_13FrameworkInfoERKNS1_12ExecutorInfoERK6OptionINS1_8TaskInfoEERKSF_INS1_13TaskGroupInfoEES6_S9_SC_SH_SL_EEvRKNS_3PIDIT_EEMSP_FvT0_T1_T2_T3_T4_ET5_T6_T7_T8_T9_ENKUlPNS_11ProcessBaseEE_clES16_
> @ 0x2ad59ac1e6bf _ZNSt17_Function_handlerIFvPN7process11ProcessBaseEEZNS0_8dispatchIN5mesos8internal5slave5SlaveERKNS0_6FutureIbEERKNS5_13FrameworkInfoERKNS5_12ExecutorInfoERK6OptionINS5_8TaskInfoEERKSJ_INS5_13TaskGroupInfoEESA_SD_SG_SL_SP_EEvRKNS0_3PIDIT_EEMST_FvT0_T1_T2_T3_T4_ET5_T6_T7_T8_T9_EUlS2_E_E9_M_invokeERKSt9_Any_dataS2_
> @ 0x2ad59bce2304 std::function<>::operator()()
> @ 0x2ad59bcc9824 process::ProcessBase::visit()
> @ 0x2ad59bd4028e process::DispatchEvent::visit()
> @ 0x2ad594616df1 process::ProcessBase::serve()
> @ 0x2ad59bcc72b7 process::ProcessManager::resume()
> @ 0x2ad59bcd567c process::ProcessManager::init_threads()::$_2::operator()()
> @ 0x2ad59bcd5585 _ZNSt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvE3$_2vEE9_M_invokeIJEEEvSt12_Index_tupleIJXspT_EEE
> @ 0x2ad59bcd std::_Bind_simple<>::operator()()
> @ 0x2ad59bcd552c std::thread::_Impl<>::_M_run()
> @ 0x2ad59d9e6a60 (unknown)
> @ 0x2ad59e15d184 start_thread
> @ 0x2ad59e46d37d (unknown)
> make[4]: *** [check-local] Segmentation fault
> {code}
> Find attached the full log from a failed run of 
> {{MasterTest.MultipleExecutors}} and a truncated log from a failed run of 
> {{MasterMaintenanceTest.InverseOffersFilters}}.





[jira] [Created] (MESOS-6985) os::getenv() can segfault

2017-01-24 Thread Greg Mann (JIRA)
Greg Mann created MESOS-6985:


 Summary: os::getenv() can segfault
 Key: MESOS-6985
 URL: https://issues.apache.org/jira/browse/MESOS-6985
 Project: Mesos
  Issue Type: Bug
  Components: stout
 Environment: ASF CI, Ubuntu 14.04 and CentOS 7 both with and without 
libevent/SSL
Reporter: Greg Mann


This was observed on ASF CI. The segfault first showed up on CI on 9/20/16 and 
has been produced by the tests {{MasterTest.MultipleExecutors}} and 
{{MasterMaintenanceTest.InverseOffersFilters}}. In both cases, {{os::getenv()}} 
segfaults with the same stack trace:
{code}
*** Aborted at 1485241617 (unix time) try "date -d @1485241617" if you are using GNU date ***
PC: @ 0x2ad59e3ae82d (unknown)
I0124 07:06:57.422080 28619 exec.cpp:162] Version: 1.2.0
*** SIGSEGV (@0xf0) received by PID 28591 (TID 0x2ad5a7b87700) from PID 240; stack trace: ***
I0124 07:06:57.422336 28615 exec.cpp:212] Executor started at: executor(75)@172.17.0.2:45752 with pid 28591
@ 0x2ad5ab953197 (unknown)
@ 0x2ad5ab957479 (unknown)
@ 0x2ad59e165330 (unknown)
@ 0x2ad59e3ae82d (unknown)
@ 0x2ad594631358 os::getenv()
@ 0x2ad59aba6acf mesos::internal::slave::executorEnvironment()
@ 0x2ad59ab845c0 mesos::internal::slave::Framework::launchExecutor()
@ 0x2ad59ab818a2 mesos::internal::slave::Slave::_run()
@ 0x2ad59ac1ec10 _ZZN7process8dispatchIN5mesos8internal5slave5SlaveERKNS_6FutureIbEERKNS1_13FrameworkInfoERKNS1_12ExecutorInfoERK6OptionINS1_8TaskInfoEERKSF_INS1_13TaskGroupInfoEES6_S9_SC_SH_SL_EEvRKNS_3PIDIT_EEMSP_FvT0_T1_T2_T3_T4_ET5_T6_T7_T8_T9_ENKUlPNS_11ProcessBaseEE_clES16_
@ 0x2ad59ac1e6bf _ZNSt17_Function_handlerIFvPN7process11ProcessBaseEEZNS0_8dispatchIN5mesos8internal5slave5SlaveERKNS0_6FutureIbEERKNS5_13FrameworkInfoERKNS5_12ExecutorInfoERK6OptionINS5_8TaskInfoEERKSJ_INS5_13TaskGroupInfoEESA_SD_SG_SL_SP_EEvRKNS0_3PIDIT_EEMST_FvT0_T1_T2_T3_T4_ET5_T6_T7_T8_T9_EUlS2_E_E9_M_invokeERKSt9_Any_dataS2_
@ 0x2ad59bce2304 std::function<>::operator()()
@ 0x2ad59bcc9824 process::ProcessBase::visit()
@ 0x2ad59bd4028e process::DispatchEvent::visit()
@ 0x2ad594616df1 process::ProcessBase::serve()
@ 0x2ad59bcc72b7 process::ProcessManager::resume()
@ 0x2ad59bcd567c process::ProcessManager::init_threads()::$_2::operator()()
@ 0x2ad59bcd5585 _ZNSt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvE3$_2vEE9_M_invokeIJEEEvSt12_Index_tupleIJXspT_EEE
@ 0x2ad59bcd std::_Bind_simple<>::operator()()
@ 0x2ad59bcd552c std::thread::_Impl<>::_M_run()
@ 0x2ad59d9e6a60 (unknown)
@ 0x2ad59e15d184 start_thread
@ 0x2ad59e46d37d (unknown)
make[4]: *** [check-local] Segmentation fault
{code}
Find attached the full log from a failed run of 
{{MasterTest.MultipleExecutors}} and a truncated log from a failed run of 
{{MasterMaintenanceTest.InverseOffersFilters}}.
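
The report above does not identify a root cause. One general hazard worth ruling out is that POSIX does not require getenv() to be thread-safe, so a read that overlaps with a setenv() on another thread can touch freed memory. The following is a minimal sketch of serializing environment access behind one mutex. The `sketch` namespace and both helper names are hypothetical and are not the stout API; the lock only helps if every environment access in the process goes through these helpers.

```cpp
#include <stdlib.h>  // ::getenv, ::setenv (POSIX)

#include <mutex>
#include <string>

namespace sketch {  // Hypothetical namespace, not part of stout or Mesos.

// Single process-wide lock guarding all environment access in this sketch.
inline std::mutex& envMutex() {
  static std::mutex mutex;
  return mutex;
}

// Copies the value of `key` into `*value` and returns true if it is set.
// The copy is made while the lock is held, so the caller never keeps a
// pointer into the environment block.
inline bool getenvLocked(const std::string& key, std::string* value) {
  std::lock_guard<std::mutex> lock(envMutex());
  const char* raw = ::getenv(key.c_str());
  if (raw == nullptr) {
    return false;
  }
  *value = raw;
  return true;
}

// Sets `key` to `value`, overwriting any existing value, under the same lock.
inline void setenvLocked(const std::string& key, const std::string& value) {
  std::lock_guard<std::mutex> lock(envMutex());
  ::setenv(key.c_str(), value.c_str(), 1);
}

}  // namespace sketch
```

Returning a copy rather than the raw pointer is the design choice that matters here: the pointer returned by getenv() may be invalidated by a later setenv(), even when both calls are individually locked.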





[jira] [Created] (MESOS-6984) Pull out the docker image build step out of `support/docker-build.sh`.

2017-01-24 Thread Michael Park (JIRA)
Michael Park created MESOS-6984:
---

 Summary: Pull out the docker image build step out of 
`support/docker-build.sh`.
 Key: MESOS-6984
 URL: https://issues.apache.org/jira/browse/MESOS-6984
 Project: Mesos
  Issue Type: Task
Reporter: Michael Park


The {{support/docker-build.sh}} script currently writes a {{Dockerfile}}, 
performs a docker build, runs the image, and then deletes the image.

The docker build step is quite expensive and often flaky. We should simply 
pull a docker image from Docker Hub so that we can make our CI more stable and 
efficient.





[jira] [Commented] (MESOS-6320) Implement clang-tidy check to catch incorrect flags hierarchies

2017-01-24 Thread Michael Park (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15836569#comment-15836569
 ] 

Michael Park commented on MESOS-6320:
-

{noformat}
commit d76f8d298b9f302c92ce4d0ff7ebed9e116a95a6
Author: Benjamin Bannier 
Date:   Wed Dec 21 19:33:30 2016 +0100

[clang-tidy] Added Mesos check of custom Flags classes.

This change fixes MESOS-6320.
{noformat}

> Implement clang-tidy check to catch incorrect flags hierarchies
> ---
>
> Key: MESOS-6320
> URL: https://issues.apache.org/jira/browse/MESOS-6320
> Project: Mesos
>  Issue Type: Bug
>Reporter: Benjamin Bannier
>Assignee: Benjamin Bannier
>  Labels: clang-tidy, mesosphere
> Fix For: 1.2.0
>
>
> Classes derived from {{FlagsBase}} must always use {{virtual}} inheritance. 
> Likewise, when such derived flags classes are composed, they must again be 
> inherited virtually.
> Some examples:
> {code}
> struct A : virtual FlagsBase {}; // OK
> struct B : FlagsBase {}; // ERROR
> struct C : A {}; // ERROR
> {code}
> We should implement a clang-tidy checker to catch such incorrect 
> inheritance.
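
To illustrate why the rule in the description matters, here is a minimal sketch of composing two flags classes. The `FlagsBase` below is a stand-in defined only for this example, not the stout class, and `LoggingFlags`, `HttpFlags`, `AgentFlags` and `sharesSingleBase` are hypothetical names. With virtual inheritance at every level, the composed class contains exactly one `FlagsBase` subobject.

```cpp
// Stand-in for the real base class; defined here only so the sketch compiles.
struct FlagsBase {
  int loaded = 0;
  virtual ~FlagsBase() = default;
};

// Each flags class derives virtually from the base (the "OK" case).
struct LoggingFlags : virtual FlagsBase {
  bool quiet = false;
};

struct HttpFlags : virtual FlagsBase {
  int port = 8080;
};

// Composition inherits virtually again, as the description requires.
struct AgentFlags : virtual LoggingFlags, virtual HttpFlags {};

// Returns true when both inheritance paths reach the same FlagsBase
// subobject, which is what virtual inheritance guarantees.
inline bool sharesSingleBase(AgentFlags& flags) {
  FlagsBase* viaLogging = static_cast<LoggingFlags*>(&flags);
  FlagsBase* viaHttp = static_cast<HttpFlags*>(&flags);
  return viaLogging == viaHttp;
}
```

If either `LoggingFlags` or `HttpFlags` dropped the `virtual` keyword, `AgentFlags` would contain two separate `FlagsBase` subobjects and an unqualified access to `loaded` would be ambiguous.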





[jira] [Commented] (MESOS-5393) XFS disk isolator should disallow sandbox writes when no 'disk' is used in executor/task

2017-01-24 Thread James Peach (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15836565#comment-15836565
 ] 

James Peach commented on MESOS-5393:


Implemented as a 1-block quota. Note that this makes it impossible to run a 
task because the quota gets used by agent logs.

> XFS disk isolator should disallow sandbox writes when no 'disk' is used in 
> executor/task
> 
>
> Key: MESOS-5393
> URL: https://issues.apache.org/jira/browse/MESOS-5393
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Affects Versions: 1.0.0
>Reporter: Yan Xu
>Assignee: James Peach
>
> This is similar to MESOS-5081 and was left as a TODO in the first patch for 
> the XFS isolator.
> {noformat}
> // TODO(jpeach) If there's no disk resource attached, we should set the
> // minimum quota (1 block), since a zero quota would be unconstrained.
> {noformat}
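
The rule in the TODO can be sketched as a small helper. This is an illustration only, not the isolator's code: `quotaBlocks` and `kBasicBlockSize` are hypothetical names, the 512-byte block size is an assumption about the unit in which quota limits are expressed, and a `diskBytes` of zero stands in for "no disk resource attached".

```cpp
#include <cstdint>

// Assumption: quota limits are counted in 512-byte basic blocks.
constexpr uint64_t kBasicBlockSize = 512;

// Returns the quota limit in blocks for a requested disk size in bytes.
// A request of zero bytes (no disk resource) maps to the minimum quota of
// one block, because a zero limit would mean "unconstrained".
inline uint64_t quotaBlocks(uint64_t diskBytes) {
  if (diskBytes == 0) {
    return 1;
  }
  // Round up so a partial block still counts as a full block.
  return (diskBytes + kBasicBlockSize - 1) / kBasicBlockSize;
}
```

For example, under these assumptions a 1 MiB request maps to 2048 blocks, while a container with no disk resource gets a limit of 1 block rather than 0.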


