[jira] [Commented] (MESOS-4428) Get only running tasks from mesos api

2016-01-19 Thread Guangya Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106542#comment-15106542
 ] 

Guangya Liu commented on MESOS-4428:


I think that you can take a look at MESOS-3307

> Get only running tasks from mesos api
> -
>
> Key: MESOS-4428
> URL: https://issues.apache.org/jira/browse/MESOS-4428
> Project: Mesos
>  Issue Type: Wish
>  Components: json api
>Reporter: Tymofii
>Priority: Trivial
>
> We're using /state.json for service discovery in our environment. Our 
> mesos-consul bridge reads /state.json from the current leader and then parses it 
> to register all running tasks in Consul.
> When using the Spark framework, it generates a lot of tasks, which all go into 
> /state.json as finished. The file itself can grow very large within a couple 
> of days of work.
> Is there any way to get only running tasks from the mesos leader right now?
> If not, could you add such a possibility?
> Or would you suggest a different approach for service discovery?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-3361) Update MesosContainerizer to dynamically pick/enable isolators

2016-01-19 Thread Niklas Quarfot Nielsen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niklas Quarfot Nielsen updated MESOS-3361:
--
Shepherd:   (was: Niklas Quarfot Nielsen)

> Update MesosContainerizer to dynamically pick/enable isolators
> --
>
> Key: MESOS-3361
> URL: https://issues.apache.org/jira/browse/MESOS-3361
> Project: Mesos
>  Issue Type: Task
>Reporter: Kapil Arya
>Assignee: Kapil Arya
>  Labels: mesosphere
>
> This would allow the frameworks to opt-in/opt-out of network isolation per 
> container. Thus, one can launch some containers with their own IPs while 
> other containers still share the host IP.





[jira] [Updated] (MESOS-3358) Add TaskStatus label decorator hooks for Master

2016-01-19 Thread Niklas Quarfot Nielsen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niklas Quarfot Nielsen updated MESOS-3358:
--
Shepherd:   (was: Niklas Quarfot Nielsen)

> Add TaskStatus label decorator hooks for Master
> ---
>
> Key: MESOS-3358
> URL: https://issues.apache.org/jira/browse/MESOS-3358
> Project: Mesos
>  Issue Type: Task
>Reporter: Kapil Arya
>Assignee: Kapil Arya
>  Labels: mesosphere
>
> The hook will be triggered when the Master receives a TaskStatus message from an Agent 
> or when the Master itself generates a TASK_LOST status. The hook should also 
> provide a list of the previous TaskStatuses to the module.
> The use case is to allow a "cleanup" module to release IPs if an agent is 
> lost. The previous statuses will contain the IP address(es) to be released.





[jira] [Updated] (MESOS-3362) Allow Isolators to advertise "capabilities" via SlaveInfo

2016-01-19 Thread Niklas Quarfot Nielsen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niklas Quarfot Nielsen updated MESOS-3362:
--
Shepherd:   (was: Niklas Quarfot Nielsen)

> Allow Isolators to advertise "capabilities" via SlaveInfo
> -
>
> Key: MESOS-3362
> URL: https://issues.apache.org/jira/browse/MESOS-3362
> Project: Mesos
>  Issue Type: Task
>Reporter: Kapil Arya
>Assignee: Kapil Arya
>  Labels: mesosphere
>
> A network-isolator module can thus advertise that it can assign per-container 
> IPs and provide network isolation.
> The SlaveInfo protobuf will be extended to include "Capabilities" similar to 
> FrameworkInfo::Capabilities.
> The isolator interface needs to be extended with an `info()` method that returns an 
> `IsolatorInfo` message. The `IsolatorInfo` message can include "Capabilities" 
> to be sent to Frameworks as part of SlaveInfo.
> The Isolator::info() interface will be used by the Slave during initialization to 
> compile SlaveInfo::Capabilities.





[jira] [Updated] (MESOS-3740) LIBPROCESS_IP not passed to Docker containers

2016-01-19 Thread Niklas Quarfot Nielsen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niklas Quarfot Nielsen updated MESOS-3740:
--
Shepherd:   (was: Niklas Quarfot Nielsen)

> LIBPROCESS_IP not passed to Docker containers
> -
>
> Key: MESOS-3740
> URL: https://issues.apache.org/jira/browse/MESOS-3740
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization, docker
>Affects Versions: 0.25.0
> Environment: Mesos 0.24.1
>Reporter: Cody Maloney
>  Labels: mesosphere
>
> Docker containers aren't currently passed all the same environment variables 
> that Mesos Containerizer tasks are. See: 
> https://github.com/apache/mesos/blob/master/src/slave/containerizer/containerizer.cpp#L254
>  for all the environment variables explicitly set for mesos containers.
> While some of them don't necessarily make sense for docker containers, when 
> the docker container runs a libprocess process (a mesos framework 
> scheduler) and uses {{--net=host}}, the task needs to have LIBPROCESS_IP 
> set. Otherwise the same sort of problems that happen because of MESOS-3553 can 
> occur (libprocess will try to guess the machine's IP address, with likely bad 
> results in a number of operating environments).





[jira] [Updated] (MESOS-3585) Add a test module for ip-per-container support

2016-01-19 Thread Niklas Quarfot Nielsen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niklas Quarfot Nielsen updated MESOS-3585:
--
Shepherd:   (was: Niklas Quarfot Nielsen)

> Add a test module for ip-per-container support
> --
>
> Key: MESOS-3585
> URL: https://issues.apache.org/jira/browse/MESOS-3585
> Project: Mesos
>  Issue Type: Task
>Reporter: Kapil Arya
>Assignee: Kapil Arya
>  Labels: mesosphere
>
> With the addition of {{NetworkInfo}} to allow frameworks to request 
> IP-per-container for their tasks, we should add a simple module that mimics 
> the behavior of a real network-isolation module for testing purposes. We can 
> then add this module in {{src/examples}} and write some tests against it.
> This module can also serve as a template module for third-party network 
> isolation providers to build their own network isolator modules.





[jira] [Commented] (MESOS-2646) Update Master to send revocable resources in separate offers

2016-01-19 Thread Niklas Quarfot Nielsen (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106566#comment-15106566
 ] 

Niklas Quarfot Nielsen commented on MESOS-2646:
---

[~JamesYongQiaoWang] Sorry about the delay. Do you still have capacity for some 
oversubscription work?

> Update Master to send revocable resources in separate offers
> 
>
> Key: MESOS-2646
> URL: https://issues.apache.org/jira/browse/MESOS-2646
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Vinod Kone
>Assignee: Yongqiao Wang
>  Labels: twitter
> Attachments: code-diff.txt
>
>
> The master will send separate offers for revocable and non-revocable/regular 
> resources. This allows the master to rescind revocable offers (e.g., when a new 
> oversubscribed resources estimate comes from the slave) without impacting 
> regular offers.
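
The split described above can be sketched roughly as follows. This is only an illustration, not the master's actual allocator code: resources are modelled as plain dicts, and the `revocable` key and the `split_offers` helper are hypothetical names chosen for this example.

```python
def split_offers(resources):
    """Partition resources into (regular, revocable) lists.

    Each resource is a dict; a truthy "revocable" key (a hypothetical
    stand-in for revocable-resource metadata) marks it as revocable.
    """
    regular = [r for r in resources if not r.get("revocable", False)]
    revocable = [r for r in resources if r.get("revocable", False)]
    return regular, revocable


def build_offers(resources):
    """Return one offer per non-empty partition, so that a revocable
    offer can be rescinded without touching the regular one."""
    regular, revocable = split_offers(resources)
    offers = []
    if regular:
        offers.append({"kind": "regular", "resources": regular})
    if revocable:
        offers.append({"kind": "revocable", "resources": revocable})
    return offers
```

Keeping the two kinds of resources in separate offers means a new oversubscription estimate only invalidates the revocable offer.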





[jira] [Commented] (MESOS-2688) Slave should kill revocable tasks if oversubscription is disabled

2016-01-19 Thread Niklas Quarfot Nielsen (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106561#comment-15106561
 ] 

Niklas Quarfot Nielsen commented on MESOS-2688:
---

[~bmahler] - so, is that an OK for doing it on the slave? :)

> Slave should kill revocable tasks if oversubscription is disabled
> -
>
> Key: MESOS-2688
> URL: https://issues.apache.org/jira/browse/MESOS-2688
> Project: Mesos
>  Issue Type: Task
>Reporter: Vinod Kone
>Assignee: Jie Yu
>  Labels: twitter
>
> If oversubscription is disabled on a restarted slave (that had it previously 
> enabled), it should kill revocable tasks.
> Slave knows this information from the Resources of a container that it 
> checkpoints and recovers.
> Add a new reason OVERSUBSCRIPTION_DISABLED.
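
A minimal sketch of the recovery-time check described in this ticket, assuming recovered tasks are available as dicts that list their resources. The `revocable` key and the `tasks_to_kill` helper are hypothetical; only the reason name OVERSUBSCRIPTION_DISABLED comes from the ticket.

```python
REASON = "OVERSUBSCRIPTION_DISABLED"  # reason name proposed in the ticket


def tasks_to_kill(recovered_tasks, oversubscription_enabled):
    """Return (task_id, reason) pairs for recovered tasks that hold
    revocable resources while oversubscription is disabled."""
    if oversubscription_enabled:
        return []
    return [
        (task["id"], REASON)
        for task in recovered_tasks
        if any(r.get("revocable", False) for r in task.get("resources", []))
    ]
```

Tasks that use only regular resources are left alone in this sketch.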





[jira] [Commented] (MESOS-2695) Add master flag to enable/disable oversubscription

2016-01-19 Thread Niklas Quarfot Nielsen (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106565#comment-15106565
 ] 

Niklas Quarfot Nielsen commented on MESOS-2695:
---

[~vi...@twitter.com] should we mark as 'won't fix' for now?

> Add master flag to enable/disable oversubscription
> --
>
> Key: MESOS-2695
> URL: https://issues.apache.org/jira/browse/MESOS-2695
> Project: Mesos
>  Issue Type: Task
>Reporter: Vinod Kone
>  Labels: twitter
>
> This flag lets an operator control cluster-level oversubscription. 
> The master should send revocable offers to a framework if this flag is enabled 
> and the framework opts in to receive them.
> The master should ignore revocable resources from slaves if the flag is disabled.
> We need tests for all these scenarios.





[jira] [Created] (MESOS-4427) Ensure ip_address in state.json (from NetworkInfo) is valid

2016-01-19 Thread Sargun Dhillon (JIRA)
Sargun Dhillon created MESOS-4427:
-

 Summary: Ensure ip_address in state.json (from NetworkInfo) is 
valid
 Key: MESOS-4427
 URL: https://issues.apache.org/jira/browse/MESOS-4427
 Project: Mesos
  Issue Type: Bug
Reporter: Sargun Dhillon
Priority: Critical


We have seen a master state.json containing a task entry that looks 
similar to:
```
---REDACTED---
{
  "container": {
    "docker": {
      "force_pull_image": false,
      "image": "REDACTED",
      "network": "HOST",
      "privileged": false
    },
    "type": "DOCKER"
  },
  "executor_id": "",
  "framework_id": "9f0e50ea-54b0-44e3-a451-c69e0c1a58fb-",
  "id": "ping-as-a-service.c2d1c17a-be22-11e5-b053-002590e56e25",
  "name": "ping-as-a-service",
  "resources": {
    "cpus": 0.1,
    "disk": 0,
    "mem": 64,
    "ports": "[7907-7907]"
  },
  "slave_id": "9f0e50ea-54b0-44e3-a451-c69e0c1a58fb-S76043",
  "state": "TASK_RUNNING",
  "statuses": [
    {
      "container_status": {
        "network_infos": [
          {
            "ip_address": "",
            "ip_addresses": [
              {
                "ip_address": ""
              }
            ]
          }
        ]
      },
      "labels": [
        {
          "key": "Docker.NetworkSettings.IPAddress",
          "value": ""
        }
      ],
      "state": "TASK_RUNNING",
      "timestamp": 1453149270.95511
    }
  ]
}
---REDACTED---
```

This is invalid, and mesos-core should filter it. 
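
For illustration, a consumer-side check for the empty `ip_address` values shown above could look like the sketch below. It is not what mesos-core does; the field names are taken from the JSON in this report, while `is_valid_ip` and `invalid_ip_fields` are hypothetical helpers built on Python's standard `ipaddress` module.

```python
import ipaddress


def is_valid_ip(value):
    """Return True if value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def invalid_ip_fields(task):
    """Collect (status_index, value) pairs for every ip_address in a
    task's statuses that does not parse as an IP address."""
    bad = []
    for i, status in enumerate(task.get("statuses", [])):
        infos = status.get("container_status", {}).get("network_infos", [])
        for info in infos:
            candidates = []
            if "ip_address" in info:
                candidates.append(info["ip_address"])
            for entry in info.get("ip_addresses", []):
                if "ip_address" in entry:
                    candidates.append(entry["ip_address"])
            bad.extend((i, c) for c in candidates if not is_valid_ip(c))
    return bad
```

Applied to the task above, both empty strings would be reported as invalid.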





[jira] [Updated] (MESOS-4427) Ensure ip_address in state.json (from NetworkInfo) is valid

2016-01-19 Thread Sargun Dhillon (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sargun Dhillon updated MESOS-4427:
--
Description: 
We have seen a master state.json containing a task entry that looks 
similar to:


---REDACTED---
{code:json}
{
  "container": {
    "docker": {
      "force_pull_image": false,
      "image": "REDACTED",
      "network": "HOST",
      "privileged": false
    },
    "type": "DOCKER"
  },
  "executor_id": "",
  "framework_id": "9f0e50ea-54b0-44e3-a451-c69e0c1a58fb-",
  "id": "ping-as-a-service.c2d1c17a-be22-11e5-b053-002590e56e25",
  "name": "ping-as-a-service",
  "resources": {
    "cpus": 0.1,
    "disk": 0,
    "mem": 64,
    "ports": "[7907-7907]"
  },
  "slave_id": "9f0e50ea-54b0-44e3-a451-c69e0c1a58fb-S76043",
  "state": "TASK_RUNNING",
  "statuses": [
    {
      "container_status": {
        "network_infos": [
          {
            "ip_address": "",
            "ip_addresses": [
              {
                "ip_address": ""
              }
            ]
          }
        ]
      },
      "labels": [
        {
          "key": "Docker.NetworkSettings.IPAddress",
          "value": ""
        }
      ],
      "state": "TASK_RUNNING",
      "timestamp": 1453149270.95511
    }
  ]
}
{code}
---REDACTED---


This is invalid, and mesos-core should filter it. 



> Ensure ip_address in state.json (from NetworkInfo) is valid
> ---
>
> Key: MESOS-4427
> URL: https://issues.apache.org/jira/browse/MESOS-4427
> Project: Mesos
>  Issue Type: Bug
>Reporter: Sargun Dhillon
>Priority: Critical
>  Labels: mesosphere
>
> We have seen a master state.json containing a task entry that looks 
> similar to:
> ---REDACTED---
> {code:json}
> {
>   "container": {
>     "docker": {
>       "force_pull_image": false,
>       "image": "REDACTED",
>       "network": "HOST",
>       "privileged": false
>     },
>     "type": "DOCKER"
>   },
>   "executor_id": "",
>   "framework_id": "9f0e50ea-54b0-44e3-a451-c69e0c1a58fb-",
>   "id": "ping-as-a-service.c2d1c17a-be22-11e5-b053-002590e56e25",
>   "name": "ping-as-a-service",
>   "resources": {
>     "cpus": 0.1,
>     "disk": 0,
>     "mem": 64,
>     "ports": "[7907-7907]"
>   },
>   "slave_id": "9f0e50ea-54b0-44e3-a451-c69e0c1a58fb-S76043",
>   "state": "TASK_RUNNING",
>   "statuses": [
>     {
>       "container_status": {
>         "network_infos": [
>           {
>             "ip_address": "",
>             "ip_addresses": [
>               {
>                 "ip_address": ""
>               }
>             ]
>           }
>         ]
>       },
>       "labels": [
>         {
>           "key": "Docker.NetworkSettings.IPAddress",
>           "value": ""
>         }
>       ],
>       "state": "TASK_RUNNING",
> 

[jira] [Created] (MESOS-4428) Get only running tasks from mesos api

2016-01-19 Thread Tymofii (JIRA)
Tymofii created MESOS-4428:
--

 Summary: Get only running tasks from mesos api
 Key: MESOS-4428
 URL: https://issues.apache.org/jira/browse/MESOS-4428
 Project: Mesos
  Issue Type: Wish
  Components: json api
Reporter: Tymofii
Priority: Trivial


We're using /state.json for service discovery in our environment. Our 
mesos-consul bridge reads /state.json from the current leader and then parses it to 
register all running tasks in Consul.

When using the Spark framework, it generates a lot of tasks, which all go into 
/state.json as finished. The file itself can grow very large within a couple 
of days of work.

Is there any way to get only running tasks from the mesos leader right now?
If not, could you add such a possibility?

Or would you suggest a different approach for service discovery?
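
As a stopgap on the consumer side, the bridge could filter the parsed document itself. The sketch below assumes the state document has a top-level `frameworks` list whose entries carry a `tasks` list, and that each task has a `state` field such as `TASK_RUNNING`; that layout and the `running_tasks` helper are assumptions for illustration, so check them against your Mesos version.

```python
def running_tasks(state):
    """Yield task dicts whose state is TASK_RUNNING.

    `state` is the parsed /state.json document. The frameworks/tasks
    layout used here is an assumption, not a documented contract.
    """
    for framework in state.get("frameworks", []):
        for task in framework.get("tasks", []):
            if task.get("state") == "TASK_RUNNING":
                yield task
```

This only reduces what gets registered in Consul. The full document is still downloaded and parsed, so it does not address the size of /state.json itself.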





[jira] [Updated] (MESOS-4411) Traverse all roles for quota allocation

2016-01-19 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-4411:
---
Sprint: Mesosphere Sprint 27

> Traverse all roles for quota allocation
> ---
>
> Key: MESOS-4411
> URL: https://issues.apache.org/jira/browse/MESOS-4411
> Project: Mesos
>  Issue Type: Bug
>  Components: allocation
>Reporter: Alexander Rukletsov
>Assignee: Guangya Liu
>Priority: Critical
>  Labels: mesosphere
>
> There might be a bug in how resources are allocated to multiple quota'ed 
> roles if one role's quota is met. We need to investigate this behavior.





[jira] [Commented] (MESOS-2646) Update Master to send revocable resources in separate offers

2016-01-19 Thread Yongqiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106625#comment-15106625
 ] 

Yongqiao Wang commented on MESOS-2646:
--

[~nnielsen], yes, I think I have. Do you mean I can start working on this 
ticket now?

> Update Master to send revocable resources in separate offers
> 
>
> Key: MESOS-2646
> URL: https://issues.apache.org/jira/browse/MESOS-2646
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Vinod Kone
>Assignee: Yongqiao Wang
>  Labels: twitter
> Attachments: code-diff.txt
>
>
> The master will send separate offers for revocable and non-revocable/regular 
> resources. This allows the master to rescind revocable offers (e.g., when a new 
> oversubscribed resources estimate comes from the slave) without impacting 
> regular offers.





[jira] [Updated] (MESOS-2296) Implement the Events stream on slave for Call endpoint

2016-01-19 Thread Anand Mazumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anand Mazumdar updated MESOS-2296:
--
Issue Type: Epic  (was: Task)

> Implement the Events stream on slave for Call endpoint
> --
>
> Key: MESOS-2296
> URL: https://issues.apache.org/jira/browse/MESOS-2296
> Project: Mesos
>  Issue Type: Epic
>Reporter: Vinod Kone
>Assignee: Anand Mazumdar
>  Labels: mesosphere
>






[jira] [Updated] (MESOS-4255) Add mechanism for testing recovery of HTTP based executors

2016-01-19 Thread Anand Mazumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anand Mazumdar updated MESOS-4255:
--
Sprint:   (was: Mesosphere Sprint 26)

> Add mechanism for testing recovery of HTTP based executors
> --
>
> Key: MESOS-4255
> URL: https://issues.apache.org/jira/browse/MESOS-4255
> Project: Mesos
>  Issue Type: Bug
>Reporter: Anand Mazumdar
>Assignee: Anand Mazumdar
>  Labels: mesosphere
>
> Currently, the slave process generates a process ID every time it is 
> initialized via the {{process::ID::generate}} function call. This is a problem 
> for testing HTTP executors, as an executor can't retry if there is a disconnection 
> after an agent restart, since the number in the ID is incremented. 
> {code}
> Agent PID before:
> slave(1)@127.0.0.1:43915
> Agent PID after restart:
> slave(2)@127.0.0.1:43915
> {code}
> There are a couple of ways to fix this:
> - Add a constructor to {{Slave}} exclusively for testing that passes on a 
> fixed {{ID}} instead of relying on {{ID::generate}}.
> - Currently, libprocess delegates to slave(1)@, i.e. ID (1), when no process is 
> specified in the URL, so {{127.0.0.1:43915/api/v1/executor}} would delegate 
> to {{slave(1)@127.0.0.1:43915/api/v1/executor}}. Instead of defaulting to 
> (1), we can default to the last known active ID.
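
The behaviour behind the problem, and the first proposed fix, can be modelled with a small sketch. This is not libprocess code: `generate` only imitates a per-prefix counter that produces IDs of the form `slave(N)`, and `make_agent_id` with its `fixed_id` parameter is a hypothetical test-only hook.

```python
import itertools
from collections import defaultdict

# One counter per prefix, starting at 1 (a simplified model).
_counters = defaultdict(lambda: itertools.count(1))


def generate(prefix):
    """Return 'prefix(N)', where N increments on every call for the
    same prefix. This is why a restarted agent gets slave(2)."""
    return f"{prefix}({next(_counters[prefix])})"


def make_agent_id(fixed_id=None):
    """Test-only hook: use the fixed ID if one is passed, otherwise
    fall back to the generated one."""
    return fixed_id if fixed_id is not None else generate("slave")
```

With a fixed ID, an agent restarted in a test keeps the same process ID, so an executor reconnecting to the old URL still reaches it.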





[jira] [Created] (MESOS-4434) Install 3rdparty package boost, glog, protobuf and picojson when installing Mesos

2016-01-19 Thread Kapil Arya (JIRA)
Kapil Arya created MESOS-4434:
-

 Summary: Install 3rdparty package boost, glog, protobuf and 
picojson when installing Mesos
 Key: MESOS-4434
 URL: https://issues.apache.org/jira/browse/MESOS-4434
 Project: Mesos
  Issue Type: Bug
  Components: build, modules
Reporter: Kapil Arya


Mesos modules depend on having these packages installed at exactly the same 
versions that Mesos was compiled with.





[jira] [Updated] (MESOS-4410) Introduce protobuf for quota set request.

2016-01-19 Thread Timothy Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Chen updated MESOS-4410:

Priority: Blocker  (was: Major)

> Introduce protobuf for quota set request.
> -
>
> Key: MESOS-4410
> URL: https://issues.apache.org/jira/browse/MESOS-4410
> Project: Mesos
>  Issue Type: Improvement
>  Components: master
>Reporter: Alexander Rukletsov
>Assignee: Alexander Rukletsov
>Priority: Blocker
>  Labels: mesosphere
>
> To document quota request JSON schema and simplify request processing, 
> introduce a {{QuotaRequest}} protobuf wrapper.





[jira] [Updated] (MESOS-4185) Revisit the "System Requirements" for all systems in the "Getting Started" guide

2016-01-19 Thread Kevin Klues (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Klues updated MESOS-4185:
---
Assignee: (was: Kevin Klues)

> Revisit the "System Requirements" for all systems in the "Getting Started" 
> guide
> 
>
> Key: MESOS-4185
> URL: https://issues.apache.org/jira/browse/MESOS-4185
> Project: Mesos
>  Issue Type: Documentation
>Reporter: Kevin Klues
>Priority: Minor
>
> The "System Requirements" section of our "Getting Started" guide needs an 
> overhaul.  Much of the information is outdated, and could likely be distilled 
> down to a simpler set of dependencies (especially for CentOS 6.6 and CentOS 
> 7.1).  We should take a good hard look at these and see whether all of the 
> dependencies listed are still necessary.





[jira] [Commented] (MESOS-4429) Add oversubscription benchmark/stress/test framework

2016-01-19 Thread Bartek Plotka (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107556#comment-15107556
 ] 

Bartek Plotka commented on MESOS-4429:
--

Let's start a `doc` to define the scope and input/output in detail: 
https://docs.google.com/document/d/1VyjbSXyvxyS95asFjzV5A19B_vcIAiUqgz3y_pMvtPs/edit?usp=sharing
 (:

Some notes on the Serenity framework that [~nnielsen] mentioned:
As you can see, it can be controlled via a JSON file (quite similar to Marathon's 
REST API input, 
https://mesosphere.github.io/marathon/docs/rest-api.html#post-v2-apps). IMO it 
gives a useful ability to store previous `tasks` and build reusable 
scenarios.

One of the interesting features of this framework is the ability to stress a slave 
with different kinds of tasks using logic similar to `shares`. For instance, you 
can specify that tasks of type A will be run 3 times more often than tasks of 
type B (type A task shares = 3, type B task shares = 1). As a result, the 
framework will spawn as many tasks of both types as possible in the 
defined "distribution". It also supports targeting tasks to a particular 
host.

It could be a good starting point for us.
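
To make the `shares` idea concrete, here is a rough sketch of one way such a distribution could be produced. It is not Serenity's implementation: the rule used (always launch the type with the smallest launched/shares ratio) and the helper names are assumptions for this example.

```python
def next_task_type(shares, launched):
    """Pick the task type with the smallest launched/shares ratio.
    Ties are broken by name so the result is deterministic."""
    return min(sorted(shares), key=lambda t: launched.get(t, 0) / shares[t])


def schedule(shares, n):
    """Launch n tasks and return (launch order, per-type counts)."""
    launched = {t: 0 for t in shares}
    order = []
    for _ in range(n):
        t = next_task_type(shares, launched)
        launched[t] += 1
        order.append(t)
    return order, launched
```

With shares of 3 for type A and 1 for type B, eight launches under this rule give six tasks of type A and two of type B, which matches the 3:1 ratio from the example above.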

> Add oversubscription benchmark/stress/test framework
> 
>
> Key: MESOS-4429
> URL: https://issues.apache.org/jira/browse/MESOS-4429
> Project: Mesos
>  Issue Type: Task
>Reporter: Niklas Quarfot Nielsen
>
> To evaluate the function and quality of oversubscription modules, we could 
> ship a test framework which can:
> 1) Launch on oversubscribed and non-oversubscribed resources in a controlled 
> manner. For example, register as two different frameworks and check that 
> slack resources of one framework can be used by the other.
> 2) Measure time to react for different scenarios. For example, measure the 
> time it takes from slack appearing on a slave to the offer being issued with 
> revocable resources, or the time to react to changing usage patterns, e.g. the time 
> to reclaim oversubscribed resources when regular tasks need them back.
> 3) Count the number of offer rescinds, preemptions, etc. to gauge the stability 
> of the policy.
> 4) Measure the percentage of extra work that can be run.
> 5) Work across different resource dimensions such as CPU time, memory, network, 
> caches.
> [~Bartek Plotka] has been working on something similar for Serenity in 
> https://github.com/mesosphere/serenity/tree/master/src/framework which we can 
> reuse as a base.





[jira] [Assigned] (MESOS-920) Set GLOG_drop_log_memory=false in environment prior to logging initialization.

2016-01-19 Thread Kapil Arya (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kapil Arya reassigned MESOS-920:


Assignee: Kapil Arya

> Set GLOG_drop_log_memory=false in environment prior to logging initialization.
> --
>
> Key: MESOS-920
> URL: https://issues.apache.org/jira/browse/MESOS-920
> Project: Mesos
>  Issue Type: Improvement
>  Components: technical debt
>Affects Versions: 0.15.0, 0.16.0
>Reporter: Benjamin Mahler
>Assignee: Kapil Arya
>
> We've observed issues where the masters are slow to respond. Two perf traces 
> collected while the masters were slow to respond:
> {noformat}
>  25.84%  [kernel]     [k] default_send_IPI_mask_sequence_phys
>  20.44%  [kernel]     [k] native_write_msr_safe
>   4.54%  [kernel]     [k] _raw_spin_lock
>   2.95%  libc-2.5.so  [.] _int_malloc
>   1.82%  libc-2.5.so  [.] malloc
>   1.55%  [kernel]     [k] apic_timer_interrupt
>   1.36%  libc-2.5.so  [.] _int_free
> {noformat}
> {noformat}
>  29.03%  [kernel]     [k] default_send_IPI_mask_sequence_phys
>   9.64%  [kernel]     [k] _raw_spin_lock
>   7.38%  [kernel]     [k] native_write_msr_safe
>   2.43%  libc-2.5.so  [.] _int_malloc
>   2.05%  libc-2.5.so  [.] _int_free
>   1.67%  [kernel]     [k] apic_timer_interrupt
>   1.58%  libc-2.5.so  [.] malloc
> {noformat}
> These have been attributed to the posix_fadvise calls made by 
> glog. We can disable these via the environment:
> {noformat}
> GLOG_DEFINE_bool(drop_log_memory, true, "Drop in-memory buffers of log contents. "
>                  "Logs can grow very quickly and they are rarely read before they "
>                  "need to be evicted from memory. Instead, drop them from memory "
>                  "as soon as they are flushed to disk.");
> {noformat}
> {code}
> if (FLAGS_drop_log_memory) {
>   if (file_length_ >= logging::kPageSize) {
>     // don't evict the most recent page
>     uint32 len = file_length_ & ~(logging::kPageSize - 1);
>     posix_fadvise(fileno(file_), 0, len, POSIX_FADV_DONTNEED);
>   }
> }
> {code}
> We should set GLOG_drop_log_memory=false prior to making our call to 
> google::InitGoogleLogging, to avoid others running into this issue.
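
Two pieces of the description can be illustrated with a short sketch: the page-boundary arithmetic in the glog snippet, and setting the environment variable before logging is initialized. The 4096-byte page size and both helper names are assumptions for this example, and leaving an operator-provided value untouched is a design choice here rather than something the ticket specifies.

```python
import os

PAGE_SIZE = 4096  # assumed page size, for illustration only


def evictable_length(file_length, page_size=PAGE_SIZE):
    """Mirror `file_length_ & ~(kPageSize - 1)`: round down to a page
    boundary so the most recent partial page is not evicted.
    Returns 0 below one page, where glog makes no fadvise call."""
    if file_length < page_size:
        return 0
    return file_length & ~(page_size - 1)


def disable_drop_log_memory(env=os.environ):
    """Set GLOG_drop_log_memory=false unless a value is already set.
    This must run before logging is initialized to take effect."""
    env.setdefault("GLOG_drop_log_memory", "false")
```

For a 10000-byte log file and 4096-byte pages, the first 8192 bytes would be passed to posix_fadvise.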





[jira] [Comment Edited] (MESOS-4429) Add oversubscription benchmark/stress/test framework

2016-01-19 Thread Bartek Plotka (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107556#comment-15107556
 ] 

Bartek Plotka edited comment on MESOS-4429 at 1/19/16 10:10 PM:


Let's start a `doc` to define the scope and input/output in detail: 
https://docs.google.com/document/d/1VyjbSXyvxyS95asFjzV5A19B_vcIAiUqgz3y_pMvtPs/edit?usp=sharing
 (:

Some notes on the Serenity framework which was mentioned by [~nnielsen]:
As you can see, it can be controlled via a JSON file (quite similar to Marathon's 
REST API input, 
https://mesosphere.github.io/marathon/docs/rest-api.html#post-v2-apps). IMO it 
gives a useful ability to store previous `tasks` and build reusable 
scenarios.

One of the interesting features of this framework is the ability to stress a slave 
with different kinds of tasks using logic similar to `shares`. For instance, you 
can specify that tasks of type A will be run 3 times more often than tasks of 
type B (type A task shares = 3, type B task shares = 1). As a result, the 
framework will spawn as many tasks of both types as possible in the 
defined "distribution". It also supports targeting tasks to a particular 
host.

It could be a good starting point for us.


> Add oversubscription benchmark/stress/test framework
> 
>
> Key: MESOS-4429
> URL: https://issues.apache.org/jira/browse/MESOS-4429
> Project: Mesos
>  Issue Type: Task
>Reporter: Niklas Quarfot Nielsen
>
> To evaluate the function and quality of oversubscription modules, we could 
> ship a test framework which can:
> 1) Launch on oversubscribed and non-oversubscribed resources in a controlled 
> manner. For example, register as two different frameworks and check that 
> slack resources of one framework can be used by the other.
> 2) Measure time to react for different scenarios. For example, measure the 
> time it takes from slack appearing on a slave to the offer being issued with 
> revocable resources, or the time to react to changing usage patterns, e.g. the time 
> to reclaim oversubscribed resources when regular tasks need them back.
> 3) Count the number of offer rescinds, preemptions, etc. to gauge the stability 
> of the policy.
> 4) Measure the percentage of extra work that can be run.
> 5) Work across different resource dimensions such as CPU time, memory, network, 
> caches.
> [~Bartek Plotka] has been working on something similar for Serenity in 
> https://github.com/mesosphere/serenity/tree/master/src/framework which we can 
> reuse as a base.





[jira] [Comment Edited] (MESOS-4429) Add oversubscription benchmark/stress/test framework

2016-01-19 Thread Bartek Plotka (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107556#comment-15107556
 ] 

Bartek Plotka edited comment on MESOS-4429 at 1/19/16 10:10 PM:


Let's start a `doc` to define the scope and input/output in detail: 
https://docs.google.com/document/d/1VyjbSXyvxyS95asFjzV5A19B_vcIAiUqgz3y_pMvtPs/edit?usp=sharing
 (:

Some notes on the mentioned Serenity framework:
As you can see, it can be controlled via a JSON file (quite similar to Marathon's 
REST API input, 
https://mesosphere.github.io/marathon/docs/rest-api.html#post-v2-apps). IMO it 
gives a useful ability to store previous `tasks` and build reusable 
scenarios.

One of the interesting features of this framework is the ability to stress a slave 
with different kinds of tasks using logic similar to `shares`. For instance, you 
can specify that tasks of type A will be run 3 times more often than tasks of 
type B (type A task shares = 3, type B task shares = 1). As a result, the 
framework will spawn as many tasks of both types as possible in the 
defined "distribution". It also supports targeting tasks to a particular 
host.

It could be a good starting point for us.


was (Author: bartek plotka):
Let's start a `doc` to define the scope and input/output in details: 
https://docs.google.com/document/d/1VyjbSXyvxyS95asFjzV5A19B_vcIAiUqgz3y_pMvtPs/edit?usp=sharing
 (:

Some notes on the Serenity framework which was mentioned by [~nnielsen]:
As you can see, it can be controlled via a JSON file (quite similar to 
Marathon's REST API input 
https://mesosphere.github.io/marathon/docs/rest-api.html#post-v2-apps). IMO it 
gives a useful ability to store previous `tasks` and build certain reusable 
scenarios.

One of the interesting features of this framework is the ability to stress a 
slave with different kinds of tasks using logic similar to `shares`. For 
instance, you can specify that tasks of type A will be run 3 times more often 
than tasks of type B (type A task shares = 3 & type B task shares = 1). As a 
result, the framework will spawn as many tasks of both types as possible in 
the defined "distribution". It also supports targeting tasks to a particular 
host.

It could be a good starting point for us.

> Add oversubscription benchmark/stress/test framework
> 
>
> Key: MESOS-4429
> URL: https://issues.apache.org/jira/browse/MESOS-4429
> Project: Mesos
>  Issue Type: Task
>Reporter: Niklas Quarfot Nielsen
>
> To evaluate the function and quality of oversubscription modules, we could 
> ship a test framework which can:
> 1) Launch on oversubscribed and non-oversubscribed resources in a controlled 
> manner. For example, register as two different frameworks and see that 
> slack resources of one framework can be used by the other.
> 2) Measure time to react for different scenarios. For example, measure the 
> time it takes from slack appearing on a slave to the offer being issued with 
> revocable resources. Also measure the time to react to changing usage 
> patterns, e.g. the time to reclaim oversubscribed resources when regular 
> tasks need them back.
> 3) Count the number of offer rescinds, preemptions, etc. to gauge the 
> stability of the policy.
> 4) Measure the percentage of extra work that was able to run.
> 5) Work across different resource dimensions such as cpu time, memory, 
> network, caches.
> [~Bartek Plotka] has been working on something similar for Serenity in 
> https://github.com/mesosphere/serenity/tree/master/src/framework which we can 
> reuse as a base.





[jira] [Updated] (MESOS-4425) Introduce filtering test abstractions for HTTP events to libprocess

2016-01-19 Thread Anand Mazumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anand Mazumdar updated MESOS-4425:
--
  Sprint: Mesosphere Sprint 27
Story Points: 3

> Introduce filtering test abstractions for HTTP events to libprocess
> ---
>
> Key: MESOS-4425
> URL: https://issues.apache.org/jira/browse/MESOS-4425
> Project: Mesos
>  Issue Type: Bug
>  Components: libprocess
>Reporter: Anand Mazumdar
>Assignee: Anand Mazumdar
>  Labels: mesosphere
>
> We need a test abstraction for {{HttpEvent}} similar to the already existing 
> ones for {{DispatchEvent}} and {{MessageEvent}} in libprocess.
> The abstraction can have semantics similar to the already existing 
> {{FUTURE_DISPATCH}}/{{FUTURE_MESSAGE}}.





[jira] [Updated] (MESOS-4429) Add oversubscription benchmark/stress/test framework

2016-01-19 Thread Niklas Quarfot Nielsen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niklas Quarfot Nielsen updated MESOS-4429:
--
Description: 
To evaluate the function and quality of oversubscription modules, we could ship 
a test framework which can:
1) Launch on oversubscribed and non-oversubscribed resources in a controlled 
manner. For example, register as two different frameworks and see that 
slack resources of one framework can be used by the other.

2) Measure time to react for different scenarios. For example, measure the time 
it takes from slack appearing on a slave to the offer being issued with 
revocable resources. Also measure the time to react to changing usage patterns, 
e.g. the time to reclaim oversubscribed resources when regular tasks need them 
back.

3) Count the number of offer rescinds, preemptions, etc. to gauge the stability 
of the policy.

4) Measure the percentage of extra work that was able to run.

5) Work across different resource dimensions such as cpu time, memory, network, 
caches.

[~Bartek Plotka] has been working on something similar for Serenity in 
https://github.com/mesosphere/serenity/tree/master/src/framework which we can 
reuse as a base.

  was:
To evaluate the function and quality of oversubscription modules, we could ship 
a test framework which can:
1) Launch on oversubscribed and non-oversubscribed resources in a controlled 
manner. For example, register as two different frameworks and see that 
slack resources of one framework can be used by the other.
2) Measure time to react for different scenarios. For example, measure the time 
it takes from slack appearing on a slave to the offer being issued with 
revocable resources. Also measure the time to react to changing usage patterns, 
e.g. the time to reclaim oversubscribed resources when regular tasks need them 
back.
3) Count the number of offer rescinds, preemptions, etc. to gauge the stability 
of the policy.
4) Measure the percentage of extra work that was able to run.
5) Work across different resource dimensions such as cpu time, memory, network, 
caches.

[~Bartek Plotka] has been working on something similar for Serenity in 
https://github.com/mesosphere/serenity/tree/master/src/framework which we can 
reuse as a base.


> Add oversubscription benchmark/stress/test framework
> 
>
> Key: MESOS-4429
> URL: https://issues.apache.org/jira/browse/MESOS-4429
> Project: Mesos
>  Issue Type: Task
>Reporter: Niklas Quarfot Nielsen
>
> To evaluate the function and quality of oversubscription modules, we could 
> ship a test framework which can:
> 1) Launch on oversubscribed and non-oversubscribed resources in a controlled 
> manner. For example, register as two different frameworks and see that 
> slack resources of one framework can be used by the other.
> 2) Measure time to react for different scenarios. For example, measure the 
> time it takes from slack appearing on a slave to the offer being issued with 
> revocable resources. Also measure the time to react to changing usage 
> patterns, e.g. the time to reclaim oversubscribed resources when regular 
> tasks need them back.
> 3) Count the number of offer rescinds, preemptions, etc. to gauge the 
> stability of the policy.
> 4) Measure the percentage of extra work that was able to run.
> 5) Work across different resource dimensions such as cpu time, memory, 
> network, caches.
> [~Bartek Plotka] has been working on something similar for Serenity in 
> https://github.com/mesosphere/serenity/tree/master/src/framework which we can 
> reuse as a base.
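
Point 2) of the description could be measured along the following lines. This 
is only a Python sketch under assumptions: the event names 
(`slack_appeared`, `revocable_offer`) and the trace format are hypothetical, 
and a real test framework would record these timestamps from Mesos callbacks.

```python
def reaction_time(events, start, end):
    """Return seconds from the first `start` event to the next `end` event.

    `events` is a list of (timestamp_seconds, name) tuples sorted by time.
    Returns None if either event is missing.
    """
    started_at = None
    for timestamp, name in events:
        if name == start and started_at is None:
            started_at = timestamp
        elif name == end and started_at is not None:
            return timestamp - started_at
    return None


# Hypothetical trace recorded by the test framework.
trace = [
    (10.0, "slack_appeared"),
    (10.5, "usage_sampled"),
    (12.5, "revocable_offer"),
]
latency = reaction_time(trace, "slack_appeared", "revocable_offer")
print(latency)
```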





[jira] [Commented] (MESOS-4173) HealthCheckTest.CheckCommandTimeout is slow

2016-01-19 Thread Isabel Jimenez (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107354#comment-15107354
 ] 

Isabel Jimenez commented on MESOS-4173:
---

commit f139333db8264f25173c88e5a3f0db76680f3c52
Author: Timothy Chen 
Date:   Wed Jan 13 13:23:50 2016 -0800

Reduced HealthCheckTest.CheckCommandTimeout test duration.

Review: https://reviews.apache.org/r/40956/

> HealthCheckTest.CheckCommandTimeout is slow
> ---
>
> Key: MESOS-4173
> URL: https://issues.apache.org/jira/browse/MESOS-4173
> Project: Mesos
>  Issue Type: Improvement
>  Components: technical debt, test
>Reporter: Alexander Rukletsov
>Priority: Minor
>  Labels: mesosphere, newbie++, tech-debt
>
> The {{HealthCheckTest.CheckCommandTimeout}} test takes more than {{15s}}! to 
> finish on my Mac OS 10.10.4:
> {code}
> HealthCheckTest.CheckCommandTimeout (15483 ms)
> {code}





[jira] [Updated] (MESOS-2296) Implement the Events stream on slave for Call endpoint

2016-01-19 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-2296:
--
Issue Type: Task  (was: Epic)

> Implement the Events stream on slave for Call endpoint
> --
>
> Key: MESOS-2296
> URL: https://issues.apache.org/jira/browse/MESOS-2296
> Project: Mesos
>  Issue Type: Task
>Reporter: Vinod Kone
>Assignee: Anand Mazumdar
>  Labels: mesosphere
> Fix For: 0.27.0
>
>






[jira] [Commented] (MESOS-4005) Support workdir runtime configuration from image

2016-01-19 Thread Gilbert Song (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107064#comment-15107064
 ] 

Gilbert Song commented on MESOS-4005:
-

Used to be blocked by docker v1 parse using protobuf parse. Fixed.

> Support workdir runtime configuration from image 
> -
>
> Key: MESOS-4005
> URL: https://issues.apache.org/jira/browse/MESOS-4005
> Project: Mesos
>  Issue Type: Improvement
>  Components: containerization
>Reporter: Timothy Chen
>Assignee: Gilbert Song
>  Labels: mesosphere, unified-containerizer-mvp
>
> We need to support the workdir runtime configuration returned from an image, 
> such as one set in a Dockerfile.





[jira] [Updated] (MESOS-4369) Enhance DockerExecuter to support Docker's user-defined networks

2016-01-19 Thread Ezra Silvera (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ezra Silvera updated MESOS-4369:

Summary: Enhance DockerExecuter to support Docker's user-defined networks  
(was: Enhance DockContainerizer to support Docker network created with Docker 
CLI)

> Enhance DockerExecuter to support Docker's user-defined networks
> 
>
> Key: MESOS-4369
> URL: https://issues.apache.org/jira/browse/MESOS-4369
> Project: Mesos
>  Issue Type: Improvement
>  Components: docker
>Reporter: Qian Zhang
>Assignee: Ezra Silvera
>
> Currently DockerContainerizer supports the following network options which 
> are Docker built-in networks:
> {code}
> message DockerInfo {
> ...
> // Network options.
> enum Network {
>   HOST = 1;
>   BRIDGE = 2;
>   NONE = 3;
> }
> ...
> {code}
> However, since Docker 1.9, Docker supports user-defined networks (both 
> local and overlay), e.g., {{docker network create --driver bridge 
> my-network}}. The user can then create containers that are attached 
> to these networks, e.g., {{docker run --net=my-network}}.
> We need to enhance DockerExecuter to support such a network option so that 
> the Docker container can connect to such a network.
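
As an illustration of the requested behaviour, the sketch below maps either a 
built-in network or a user-defined network name to the `--net` flag quoted in 
the description. It is written in Python for brevity and is not the Mesos 
implementation; `docker_run_args` and `BUILTIN_NETWORKS` are hypothetical 
names.

```python
import shlex

# Built-in enum names from DockerInfo mapped to Docker's built-in networks.
BUILTIN_NETWORKS = {"HOST": "host", "BRIDGE": "bridge", "NONE": "none"}


def docker_run_args(image, network):
    """Build a `docker run` argument list for the given network.

    Anything that is not a built-in enum name is passed through unchanged
    as the name of a user-defined network (an assumption of this sketch).
    """
    net = BUILTIN_NETWORKS.get(network, network)
    return ["docker", "run", "--net=" + net, image]


args = docker_run_args("busybox", "my-network")
print(shlex.join(args))
```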





[jira] [Created] (MESOS-4431) Sharing of persistent volumes via reference counting

2016-01-19 Thread Anindya Sinha (JIRA)
Anindya Sinha created MESOS-4431:


 Summary: Sharing of persistent volumes via reference counting
 Key: MESOS-4431
 URL: https://issues.apache.org/jira/browse/MESOS-4431
 Project: Mesos
  Issue Type: Improvement
  Components: general
Affects Versions: 0.25.0
Reporter: Anindya Sinha
Assignee: Anindya Sinha


Add capability for specific resources to be shared amongst tasks within or 
across frameworks/roles. Enable this functionality for persistent volumes.
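
A minimal sketch of the reference-counting idea, in Python, is shown below. 
The ticket does not describe a design, so `SharedVolume`, `acquire` and 
`release` are hypothetical names and the bookkeeping by task ID is an 
assumption.

```python
class SharedVolume:
    """Hypothetical reference-counted handle for a shared persistent volume."""

    def __init__(self, volume_id):
        self.volume_id = volume_id
        self.users = set()  # IDs of tasks currently using the volume

    def acquire(self, task_id):
        """Register a task as a user and return the new reference count."""
        self.users.add(task_id)
        return len(self.users)

    def release(self, task_id):
        """Drop a task's reference; return True once nobody uses the volume."""
        self.users.discard(task_id)
        return len(self.users) == 0


volume = SharedVolume("vol-1")
volume.acquire("task-a")
volume.acquire("task-b")
still_in_use = not volume.release("task-a")
can_destroy = volume.release("task-b")
print(still_in_use, can_destroy)
```

Only when the last reference is released would it be safe to destroy or 
unreserve the volume.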





[jira] [Created] (MESOS-4432) Condense (redundant) log messages related to task launch/status/finish

2016-01-19 Thread Kapil Arya (JIRA)
Kapil Arya created MESOS-4432:
-

 Summary: Condense (redundant) log messages related to task 
launch/status/finish
 Key: MESOS-4432
 URL: https://issues.apache.org/jira/browse/MESOS-4432
 Project: Mesos
  Issue Type: Bug
Reporter: Kapil Arya


As can be seen from the following snippet, there were about "25" different log 
entries for a task from launch to finish. This seems a bit too much.

{code}
$ grep " 7062 " /run/log/mesos/mesos-master.INFO 

I0113 23:42:39.464856 15109 master.hpp:176] Adding task 7062 with resources 
cpus(*):0.008 on slave 87f9cced-992e-4d35-9b9b-5a89b9563bba-S1 (10.0.1.112)

I0113 23:42:39.465308 15109 master.cpp:3245] Launching task 7062 of 
framework afb66c28-eddf-4e4e-8b7a-fe822a04eef8- (No Executor Framework) at 
scheduler-009f25ee-1afc-4c20-88c7-d85c46d4da41@10.0.4.15:35526 with resources 
cpus(*):0.008 on slave 87f9cced-992e-4d35-9b9b-5a89b9563bba-S1 at 
slave(1)@10.0.1.112:5051 (10.0.1.112)

I0113 23:43:04.300138 15110 master.cpp:4414] Status update TASK_RUNNING 
(UUID: 174415c6-cf82-400a-90fc-31d7dfbf4fdd) for task 7062 of framework 
afb66c28-eddf-4e4e-8b7a-fe822a04eef8- from slave 
87f9cced-992e-4d35-9b9b-5a89b9563bba-S1 at slave(1)@10.0.1.112:5051 (10.0.1.112)

I0113 23:43:04.300900 15110 master.cpp:4462] Forwarding status update 
TASK_RUNNING (UUID: 174415c6-cf82-400a-90fc-31d7dfbf4fdd) for task 7062 of 
framework afb66c28-eddf-4e4e-8b7a-fe822a04eef8-

I0113 23:43:04.301697 15110 master.cpp:6066] Updating the state of task 
7062 of framework afb66c28-eddf-4e4e-8b7a-fe822a04eef8- (latest state: 
TASK_RUNNING, status update state: TASK_RUNNING)

I0113 23:43:17.932242 15110 master.cpp:3571] Processing ACKNOWLEDGE call 
174415c6-cf82-400a-90fc-31d7dfbf4fdd for task 7062 of framework 
afb66c28-eddf-4e4e-8b7a-fe822a04eef8- (No Executor Framework) at 
scheduler-009f25ee-1afc-4c20-88c7-d85c46d4da41@10.0.4.15:35526 on slave 
87f9cced-992e-4d35-9b9b-5a89b9563bba-S1

I0113 23:43:29.625159 15110 master.cpp:4414] Status update TASK_RUNNING 
(UUID: 174415c6-cf82-400a-90fc-31d7dfbf4fdd) for task 7062 of framework 
afb66c28-eddf-4e4e-8b7a-fe822a04eef8- from slave 
87f9cced-992e-4d35-9b9b-5a89b9563bba-S1 at slave(1)@10.0.1.112:5051 (10.0.1.112)

I0113 23:43:29.626286 15110 master.cpp:4462] Forwarding status update 
TASK_RUNNING (UUID: 174415c6-cf82-400a-90fc-31d7dfbf4fdd) for task 7062 of 
framework afb66c28-eddf-4e4e-8b7a-fe822a04eef8-

I0113 23:43:29.627462 15110 master.cpp:6066] Updating the state of task 
7062 of framework afb66c28-eddf-4e4e-8b7a-fe822a04eef8- (latest state: 
TASK_RUNNING, status update state: TASK_RUNNING)

I0113 23:44:00.408326 15110 master.cpp:3571] Processing ACKNOWLEDGE call 
174415c6-cf82-400a-90fc-31d7dfbf4fdd for task 7062 of framework 
afb66c28-eddf-4e4e-8b7a-fe822a04eef8- (No Executor Framework) at 
scheduler-009f25ee-1afc-4c20-88c7-d85c46d4da41@10.0.4.15:35526 on slave 
87f9cced-992e-4d35-9b9b-5a89b9563bba-S1

I0113 23:46:51.557840 15109 master.cpp:4414] Status update TASK_FINISHED 
(UUID: 33941ab4-117f-4f7c-92eb-19717298bd20) for task 7062 of framework 
afb66c28-eddf-4e4e-8b7a-fe822a04eef8- from slave 
87f9cced-992e-4d35-9b9b-5a89b9563bba-S1 at slave(1)@10.0.1.112:5051 (10.0.1.112)

I0113 23:46:51.562408 15109 master.cpp:4462] Forwarding status update 
TASK_FINISHED (UUID: 33941ab4-117f-4f7c-92eb-19717298bd20) for task 7062 of 
framework afb66c28-eddf-4e4e-8b7a-fe822a04eef8-

I0113 23:46:51.564661 15109 master.cpp:6066] Updating the state of task 
7062 of framework afb66c28-eddf-4e4e-8b7a-fe822a04eef8- (latest state: 
TASK_FINISHED, status update state: TASK_FINISHED)

I0113 23:48:29.797819 15109 master.cpp:4414] Status update TASK_FINISHED 
(UUID: 33941ab4-117f-4f7c-92eb-19717298bd20) for task 7062 of framework 
afb66c28-eddf-4e4e-8b7a-fe822a04eef8- from slave 
87f9cced-992e-4d35-9b9b-5a89b9563bba-S1 at slave(1)@10.0.1.112:5051 (10.0.1.112)

I0113 23:48:29.803653 15109 master.cpp:4462] Forwarding status update 
TASK_FINISHED (UUID: 33941ab4-117f-4f7c-92eb-19717298bd20) for task 7062 of 
framework afb66c28-eddf-4e4e-8b7a-fe822a04eef8-

I0113 23:48:29.806558 15109 master.cpp:6066] Updating the state of task 
7062 of framework afb66c28-eddf-4e4e-8b7a-fe822a04eef8- (latest state: 
TASK_FINISHED, status update state: TASK_FINISHED)

I0113 23:50:39.551422 15109 master.cpp:4414] Status update TASK_FINISHED 
(UUID: 33941ab4-117f-4f7c-92eb-19717298bd20) for task 7062 of framework 
afb66c28-eddf-4e4e-8b7a-fe822a04eef8- from slave 
87f9cced-992e-4d35-9b9b-5a89b9563bba-S1 at slave(1)@10.0.1.112:5051 (10.0.1.112)

I0113 23:50:39.558599 15109 master.cpp:4462] Forwarding status update 
TASK_FINISHED (UUID: 33941ab4-117f-4f7c-92eb-19717298bd20) for task 7062 of 
framework afb66c28-eddf-4e4e-8b7a-fe822a04eef8-

I0113 
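
The redundancy in the log above can be tallied with a short script. This is 
only a sketch in Python: the three sample lines are shortened copies of 
entries from the log above, and the regular expression assumes the usual glog 
prefix layout (severity and date, time, thread ID, then `file:line]`).

```python
import re
from collections import Counter

# Shortened copies of entries from the master log quoted above.
sample_log = """\
I0113 23:43:04.300138 15110 master.cpp:4414] Status update TASK_RUNNING for task 7062
I0113 23:43:04.300900 15110 master.cpp:4462] Forwarding status update TASK_RUNNING for task 7062
I0113 23:43:29.625159 15110 master.cpp:4414] Status update TASK_RUNNING for task 7062
"""

# Assumed glog prefix: "I0113 23:43:04.300138 15110 master.cpp:4414] ".
LINE = re.compile(r"^[IWEF]\d{4} \S+\s+\d+ (?P<site>[\w.]+:\d+)\] ")


def entries_per_call_site(log_text):
    """Count log entries per source location (file:line)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LINE.match(line)
        if match:
            counts[match.group("site")] += 1
    return counts


per_site = entries_per_call_site(sample_log)
print(per_site.most_common())
```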

[jira] [Created] (MESOS-4433) Implement a callback testing interface for the Executor Library

2016-01-19 Thread Anand Mazumdar (JIRA)
Anand Mazumdar created MESOS-4433:
-

 Summary: Implement a callback testing interface for the Executor 
Library
 Key: MESOS-4433
 URL: https://issues.apache.org/jira/browse/MESOS-4433
 Project: Mesos
  Issue Type: Task
Reporter: Anand Mazumdar
Assignee: Anand Mazumdar


Currently, we do not have a mocking-based callback interface for the executor 
library. This should look similar to the ongoing work for MESOS-3339, i.e. the 
corresponding issue for the scheduler library.

The interface should allow us to set expectations like we do for the driver. An 
example:

{code}
EXPECT_CALL(executor, connected())
  .Times(1);
{code}





[jira] [Commented] (MESOS-4369) Enhance DockerExecuter to support Docker's user-defined networks

2016-01-19 Thread Ezra Silvera (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107184#comment-15107184
 ] 

Ezra Silvera commented on MESOS-4369:
-

See code in https://reviews.apache.org/r/42516/

> Enhance DockerExecuter to support Docker's user-defined networks
> 
>
> Key: MESOS-4369
> URL: https://issues.apache.org/jira/browse/MESOS-4369
> Project: Mesos
>  Issue Type: Improvement
>  Components: docker
>Reporter: Qian Zhang
>Assignee: Ezra Silvera
>
> Currently DockerContainerizer supports the following network options which 
> are Docker built-in networks:
> {code}
> message DockerInfo {
> ...
> // Network options.
> enum Network {
>   HOST = 1;
>   BRIDGE = 2;
>   NONE = 3;
> }
> ...
> {code}
> However, since Docker 1.9, Docker supports user-defined networks (both 
> local and overlay), e.g., {{docker network create --driver bridge 
> my-network}}. The user can then create containers that are attached 
> to these networks, e.g., {{docker run --net=my-network}}.
> We need to enhance DockerExecuter to support such a network option so that 
> the Docker container can connect to such a network.





[jira] [Comment Edited] (MESOS-4369) Enhance DockerExecuter to support Docker's user-defined networks

2016-01-19 Thread Ezra Silvera (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107184#comment-15107184
 ] 

Ezra Silvera edited comment on MESOS-4369 at 1/19/16 6:51 PM:
--

See details in https://reviews.apache.org/r/42516/


was (Author: ezrasilvera):
See code in https://reviews.apache.org/r/42516/

> Enhance DockerExecuter to support Docker's user-defined networks
> 
>
> Key: MESOS-4369
> URL: https://issues.apache.org/jira/browse/MESOS-4369
> Project: Mesos
>  Issue Type: Improvement
>  Components: docker
>Reporter: Qian Zhang
>Assignee: Ezra Silvera
>
> Currently DockerContainerizer supports the following network options which 
> are Docker built-in networks:
> {code}
> message DockerInfo {
> ...
> // Network options.
> enum Network {
>   HOST = 1;
>   BRIDGE = 2;
>   NONE = 3;
> }
> ...
> {code}
> However, since Docker 1.9, Docker supports user-defined networks (both 
> local and overlay), e.g., {{docker network create --driver bridge 
> my-network}}. The user can then create containers that are attached 
> to these networks, e.g., {{docker run --net=my-network}}.
> We need to enhance DockerExecuter to support such a network option so that 
> the Docker container can connect to such a network.





[jira] [Updated] (MESOS-3889) Modify Oversubscription documentation to explicitly forbid the QoS Controller from killing executors running on optimistically offered resources.

2016-01-19 Thread Joseph Wu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Wu updated MESOS-3889:
-
Description: 
The oversubscription documentation currently assumes that oversubscribed 
resources ({{USAGE_SLACK}}) are the only type of revocable resources.  
Optimistic offers will add a second type of revocable resource 
({{ALLOCATION_SLACK}}) that should not be acted upon by oversubscription 
components.

For example, the [oversubscription 
doc|http://mesos.apache.org/documentation/latest/oversubscription/] says the 
following:
{quote}
NOTE: If any resource used by a task or executor is revocable, the whole 
container is treated as a revocable container and can therefore be killed or 
throttled by the QoS Controller.
{quote}
which we may amend to something like:
{quote}
NOTE: If any resource used by a task or executor is revocable usage slack, the 
whole container is treated as an oversubscribed container and can therefore be 
killed or throttled by the QoS Controller.
{quote}

> Modify Oversubscription documentation to explicitly forbid the QoS Controller 
> from killing executors running on optimistically offered resources.
> -
>
> Key: MESOS-3889
> URL: https://issues.apache.org/jira/browse/MESOS-3889
> Project: Mesos
>  Issue Type: Bug
>Reporter: Artem Harutyunyan
>Assignee: Klaus Ma
>  Labels: mesosphere
>
> The oversubscription documentation currently assumes that oversubscribed 
> resources ({{USAGE_SLACK}}) are the only type of revocable resources.  
> Optimistic offers will add a second type of revocable resource 
> ({{ALLOCATION_SLACK}}) that should not be acted upon by oversubscription 
> components.
> For example, the [oversubscription 
> doc|http://mesos.apache.org/documentation/latest/oversubscription/] says the 
> following:
> {quote}
> NOTE: If any resource used by a task or executor is revocable, the whole 
> container is treated as a revocable container and can therefore be killed or 
> throttled by the QoS Controller.
> {quote}
> which we may amend to something like:
> {quote}
> NOTE: If any resource used by a task or executor is revocable usage slack, 
> the whole container is treated as an oversubscribed container and can 
> therefore be killed or throttled by the QoS Controller.
> {quote}
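
The amended rule can be sketched as a small predicate. This is an 
illustration in Python, not Mesos code: the dictionary shape of a resource 
and the name `qos_controller_may_act` are assumptions, while 
`USAGE_SLACK` and `ALLOCATION_SLACK` are the two revocable types named in the 
description.

```python
USAGE_SLACK = "USAGE_SLACK"
ALLOCATION_SLACK = "ALLOCATION_SLACK"


def qos_controller_may_act(resources):
    """Return True if the container counts as oversubscribed.

    `resources` is a list of dicts with an optional "revocable" key naming
    the slack type. Only usage slack makes the whole container eligible for
    killing or throttling; allocation slack (optimistic offers) does not.
    """
    return any(r.get("revocable") == USAGE_SLACK for r in resources)


oversubscribed = [{"name": "cpus", "revocable": USAGE_SLACK}, {"name": "mem"}]
optimistic = [{"name": "cpus", "revocable": ALLOCATION_SLACK}]
print(qos_controller_may_act(oversubscribed), qos_controller_may_act(optimistic))
```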





[jira] [Commented] (MESOS-4111) Provide a means for libprocess users to exit while ensuring messages are flushed.

2016-01-19 Thread Joseph Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107194#comment-15107194
 ] 

Joseph Wu commented on MESOS-4111:
--

{{process::finalize}} only waits for the event queue on all processes to 
finish.  (It does this by putting a {{TerminateEvent}} at the back of the 
queue.)

Writes to a socket (or any FD) do not have events.  So you'd need to augment 
{{process::finalize}} to clean up and flush sockets too.  This 
[patch|https://reviews.apache.org/r/40266] is part of a chain to do something 
similar.

> Provide a means for libprocess users to exit while ensuring messages are 
> flushed.
> -
>
> Key: MESOS-4111
> URL: https://issues.apache.org/jira/browse/MESOS-4111
> Project: Mesos
>  Issue Type: Bug
>  Components: libprocess
>Reporter: Benjamin Mahler
>Priority: Minor
>
> Currently after a {{send}} there is no way to ensure that the message is 
> flushed on the socket before terminating. We work around this by inserting 
> {{os::sleep}} calls (see MESOS-243, MESOS-4106).
> There are a number of approaches to this:
> (1) Return a Future from send that notifies when the message is flushed from 
> the system.
> (2) Call process::finalize before exiting. This would require that 
> process::finalize flushes all of the outstanding data on any active sockets, 
> which may block.
> Regardless of the approach, there needs to be a timer if we want to guarantee 
> termination.
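
The "flush, but bounded by a timer" idea can be illustrated outside of 
libprocess with plain sockets. The Python sketch below is an analogy only: 
`send_and_flush` and `peer` are hypothetical names, and waiting for the peer 
to close its end is just one way to learn that the data was consumed.

```python
import socket
import threading


def send_and_flush(sock, payload, timeout=1.0):
    """Send payload, then wait (bounded by timeout) until the peer closes.

    Returns True if the peer read everything and closed, False on timeout.
    """
    sock.sendall(payload)
    sock.shutdown(socket.SHUT_WR)  # signal that no more data follows
    sock.settimeout(timeout)       # the timer that guarantees termination
    try:
        return sock.recv(1) == b""  # EOF means the peer closed its end
    except socket.timeout:
        return False


def peer(sock, received):
    """Read until EOF, record the message, then close."""
    chunks = []
    while True:
        data = sock.recv(4096)
        if not data:
            break
        chunks.append(data)
    received.append(b"".join(chunks))
    sock.close()


sender, receiver = socket.socketpair()
received = []
worker = threading.Thread(target=peer, args=(receiver, received))
worker.start()
flushed = send_and_flush(sender, b"last message")
worker.join()
sender.close()
print(flushed, received)
```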





[jira] [Issue Comment Deleted] (MESOS-4390) Shared Volumes Design Doc

2016-01-19 Thread Anindya Sinha (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anindya Sinha updated MESOS-4390:
-
Comment: was deleted

(was: Design document updated for review...
https://docs.google.com/document/d/18O4SH3H4BQriW6CTrg3TlQTiVC-rBRsMePhz99Y_bss/edit)

> Shared Volumes Design Doc
> -
>
> Key: MESOS-4390
> URL: https://issues.apache.org/jira/browse/MESOS-4390
> Project: Mesos
>  Issue Type: Task
>Reporter: Adam B
>Assignee: Anindya Sinha
>  Labels: mesosphere
>
> Review & Approve design doc





[jira] [Created] (MESOS-4436) Propose design doc for fixed-point scalar resources

2016-01-19 Thread Neil Conway (JIRA)
Neil Conway created MESOS-4436:
--

 Summary: Propose design doc for fixed-point scalar resources
 Key: MESOS-4436
 URL: https://issues.apache.org/jira/browse/MESOS-4436
 Project: Mesos
  Issue Type: Task
  Components: general
Reporter: Neil Conway








[jira] [Created] (MESOS-4438) Add 'dependency' message to 'AppcImageManifest' protobuf.

2016-01-19 Thread Jojy Varghese (JIRA)
Jojy Varghese created MESOS-4438:


 Summary: Add 'dependency' message to 'AppcImageManifest' protobuf.
 Key: MESOS-4438
 URL: https://issues.apache.org/jira/browse/MESOS-4438
 Project: Mesos
  Issue Type: Task
  Components: containerization
Reporter: Jojy Varghese
Assignee: Jojy Varghese


AppcImageManifest protobuf currently lacks 'dependencies' which is necessary 
for image discovery.





[jira] [Updated] (MESOS-3570) Make Scheduler Library use HTTP Pipelining Abstraction in Libprocess

2016-01-19 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-3570:
--
  Sprint: Mesosphere Sprint 27
Story Points: 3

> Make Scheduler Library use HTTP Pipelining Abstraction in Libprocess
> 
>
> Key: MESOS-3570
> URL: https://issues.apache.org/jira/browse/MESOS-3570
> Project: Mesos
>  Issue Type: Bug
>Reporter: Anand Mazumdar
>Assignee: Vinod Kone
>  Labels: mesosphere, newbie
>
> Currently, the scheduler library sends calls in order by chaining them and 
> sending them only when it has received a response for the earlier call. This 
> was done because there was no HTTP Pipelining abstraction in Libprocess 
> {{process::post}}.
> However, once {{MESOS-3332}} is resolved, we should be able to use the new 
> abstraction.





[jira] [Assigned] (MESOS-3570) Make Scheduler Library use HTTP Pipelining Abstraction in Libprocess

2016-01-19 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone reassigned MESOS-3570:
-

Assignee: Vinod Kone

> Make Scheduler Library use HTTP Pipelining Abstraction in Libprocess
> 
>
> Key: MESOS-3570
> URL: https://issues.apache.org/jira/browse/MESOS-3570
> Project: Mesos
>  Issue Type: Bug
>Reporter: Anand Mazumdar
>Assignee: Vinod Kone
>  Labels: mesosphere, newbie
>
> Currently, the scheduler library sends calls in order by chaining them and 
> sending them only when it has received a response for the earlier call. This 
> was done because there was no HTTP Pipelining abstraction in Libprocess 
> {{process::post}}.
> However, once {{MESOS-3332}} is resolved, we should be able to use the new 
> abstraction.





[jira] [Commented] (MESOS-4228) Use std::is_bind_expression to reroute the result of std::bind.

2016-01-19 Thread Michael Park (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107697#comment-15107697
 ] 

Michael Park commented on MESOS-4228:
-

{noformat}
commit a5a42d5e861ee848c79370ed2408f8382ab1010a
Author: Michael Park 
Date:   Tue Jan 5 18:07:26 2016 -0800

Used `std::is_bind_expression` to SFINAE correctly.

Review: https://reviews.apache.org/r/41460
{noformat}

> Use std::is_bind_expression to reroute the result of std::bind.
> ---
>
> Key: MESOS-4228
> URL: https://issues.apache.org/jira/browse/MESOS-4228
> Project: Mesos
>  Issue Type: Task
>  Components: libprocess
>Reporter: Michael Park
>Assignee: Michael Park
>  Labels: mesosphere
>
> The Standard (C++11 through 17) does not require {{std::bind}}'s function 
> call operator to SFINAE, and VS 2015's doesn't. {{std::is_bind_expression}} 
> can be used to manually reroute bind expressions to the 1-arg overload, where 
> (conveniently) the argument will be ignored if necessary.





[jira] [Commented] (MESOS-4220) Introduce result_of with C++14 semantics to stout.

2016-01-19 Thread Michael Park (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107700#comment-15107700
 ] 

Michael Park commented on MESOS-4220:
-

{noformat}
commit 1565096f2fba4e6ac83e6ee44a81e0290b8f7f58
Author: Michael Park 
Date:   Sat Dec 12 11:29:36 2015 -0500

Used SFINAE-friendly `result_of` in libprocess.

Review: https://reviews.apache.org/r/41462
{noformat}
{noformat}
commit 576fa0ee11f81006950094d4e35d231e7cb11472
Author: Michael Park 
Date:   Sat Dec 12 11:29:12 2015 -0500

Added SFINAE-friendly `result_of` in stout.

Review: https://reviews.apache.org/r/41461
{noformat}

> Introduce result_of with C++14 semantics to stout.
> --
>
> Key: MESOS-4220
> URL: https://issues.apache.org/jira/browse/MESOS-4220
> Project: Mesos
>  Issue Type: Task
>  Components: stout
>Reporter: Michael Park
>Assignee: Michael Park
>  Labels: mesosphere
>
> The {{std::result_of}} in VS 2015 Update 1 implements C++11 semantics, which 
> do not allow it to be used in SFINAE contexts.
> Introduce a C++14 {{std::result_of}} into stout until we get to VS 2015 
> Update 2, at which point we can switch back to simply using 
> {{std::result_of}}.





[jira] [Commented] (MESOS-4221) Invoke _Deferred's implicit conversion operator explicitly.

2016-01-19 Thread Michael Park (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107699#comment-15107699
 ] 

Michael Park commented on MESOS-4221:
-

{noformat}
commit b15161eea964c196276b51ef24763fee2f409d57
Author: Michael Park 
Date:   Tue Dec 15 02:54:56 2015 +

Invoked `_Deferred`'s `operator F()` explicitly.

Review: https://reviews.apache.org/r/41459
{noformat}

> Invoke _Deferred's implicit conversion operator explicitly.
> ---
>
> Key: MESOS-4221
> URL: https://issues.apache.org/jira/browse/MESOS-4221
> Project: Mesos
>  Issue Type: Task
>  Components: libprocess
>Reporter: Michael Park
>Assignee: Michael Park
>  Labels: mesosphere
>
> As of VS 2015 Update 1, MSVC implements C++11 semantics for 
> {{std::function}}'s {{Callable}} constructor which does not SFINAE. In the 
> short term, we call the implicit conversion operator from {{_Deferred}} to 
> {{std::function}} explicitly.
> Going forward, I propose to make {{_Deferred}} callable which will bring us 
> to a state where {{process::defer}} is similar to {{std::bind}} in that the 
> objects returned from them are "implementation-defined" (i.e., {{_Deferred}} 
> and something like {{_Bind}}), and that they are both callable. {{Deferred}} 
> and {{std::function}} are similar in that they perform type-erasure.
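
The short-term workaround can be sketched as follows. This is a hedged illustration: {{DeferredLike}} is a hypothetical stand-in for {{_Deferred}}, and {{AddOne}} and {{callExplicitly}} are names assumed for the example, not libprocess code.

```cpp
#include <cassert>
#include <functional>

namespace sketch {

// Hypothetical stand-in for `_Deferred<F>`; not the real libprocess class.
template <typename F>
struct DeferredLike
{
  F f;

  // Implicit conversion to a type-erased callable.
  template <typename R, typename... Args>
  operator std::function<R(Args...)>() const
  {
    F copy = f;
    return [copy](Args... args) -> R { return copy(args...); };
  }
};

// Hypothetical functor wrapped by the stand-in.
struct AddOne
{
  int operator()(int x) const { return x + 1; }
};

inline int callExplicitly(int x)
{
  DeferredLike<AddOne> deferred{AddOne{}};

  // Name the conversion operator directly instead of relying on the
  // `std::function` constructor to pick the implicit conversion.
  std::function<int(int)> fn =
    deferred.operator std::function<int(int)>();

  return fn(x);
}

} // namespace sketch
```

Calling the operator by name avoids depending on how the {{std::function}} constructor taking a callable participates in overload resolution.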



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4191) Investigate switching to fixed point scalar resources

2016-01-19 Thread Neil Conway (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neil Conway updated MESOS-4191:
---
Summary: Investigate switching to fixed point scalar resources  (was: 
Design doc for fixed point resources)

> Investigate switching to fixed point scalar resources
> -
>
> Key: MESOS-4191
> URL: https://issues.apache.org/jira/browse/MESOS-4191
> Project: Mesos
>  Issue Type: Task
>  Components: master
>Reporter: Neil Conway
>Assignee: Neil Conway
>  Labels: mesosphere, resources
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (MESOS-4371) Enhance DockerContainerizer to support Docker volume created with Docker CLI

2016-01-19 Thread Qian Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qian Zhang reassigned MESOS-4371:
-

Assignee: Qian Zhang

> Enhance DockerContainerizer to support Docker volume created with Docker CLI
> --
>
> Key: MESOS-4371
> URL: https://issues.apache.org/jira/browse/MESOS-4371
> Project: Mesos
>  Issue Type: Improvement
>  Components: docker, volumes
>Reporter: Qian Zhang
>Assignee: Qian Zhang
>
> In Docker, a user can create a volume with the Docker CLI, e.g., {{docker volume 
> create --name my-volume}}. We need to enhance DockerContainerizer so that the 
> Docker containers it launches can use such a volume.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (MESOS-4421) Document that /reserve, /create-volumes endpoints can return misleading "success"

2016-01-19 Thread Neil Conway (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neil Conway reassigned MESOS-4421:
--

Assignee: Neil Conway

> Document that /reserve, /create-volumes endpoints can return misleading 
> "success"
> -
>
> Key: MESOS-4421
> URL: https://issues.apache.org/jira/browse/MESOS-4421
> Project: Mesos
>  Issue Type: Task
>  Components: documentation, master
>Reporter: Neil Conway
>Assignee: Neil Conway
>  Labels: documentation, endpoint, mesosphere, persistent-volumes, 
> reservations
>
> The docs for the {{/reserve}} endpoint say:
> {noformat}
> 200 OK: Success (the requested resources have been reserved).
> {noformat}
> This is not true: the master returns {{200}} when the request has been 
> validated and a {{CheckpointResourcesMessage}} has been sent to the agent, 
> but the master does not attempt to verify that the message has been received 
> or that the agent successfully checkpointed. Same behavior applies to 
> {{/unreserve}}, {{/create-volumes}}, and {{/destroy-volumes}}.
> We should _either_:
> 1. Accurately document what {{200}} return code means.
> 2. Change the implementation to wait for the agent's next checkpoint to 
> succeed (and to include the effect of the operation) before returning success 
> to the HTTP client.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-3273) EventCall Test Framework is flaky

2016-01-19 Thread Vinod Kone (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107747#comment-15107747
 ] 

Vinod Kone commented on MESOS-3273:
---

commit 147895b4bd6c421ac15db043b8d243c07e44fd7c
Author: Vinod Kone 
Date:   Tue Jan 19 16:05:59 2016 -0800

Temporarily disabled EventCallFramework test due to MESOS-3273.


> EventCall Test Framework is flaky
> -
>
> Key: MESOS-3273
> URL: https://issues.apache.org/jira/browse/MESOS-3273
> Project: Mesos
>  Issue Type: Bug
>  Components: HTTP API
>Affects Versions: 0.24.0
> Environment: 
> https://builds.apache.org/job/Mesos/705/COMPILER=clang,CONFIGURATION=--verbose,OS=ubuntu:14.04,label_exp=docker%7C%7CHadoop/consoleFull
>Reporter: Vinod Kone
>Assignee: Vinod Kone
>  Labels: flaky-test, mesosphere, tech-debt
> Attachments: asan.log
>
>
> Observed this on ASF CI. h/t [~haosd...@gmail.com]
> Looks like the HTTP scheduler never sent a SUBSCRIBE request to the master.
> {code}
> [ RUN  ] ExamplesTest.EventCallFramework
> Using temporary directory '/tmp/ExamplesTest_EventCallFramework_k4vXkx'
> I0813 19:55:15.643579 26085 exec.cpp:443] Ignoring exited event because the 
> driver is aborted!
> Shutting down
> Sending SIGTERM to process tree at pid 26061
> Killing the following process trees:
> [ 
> ]
> Shutting down
> Sending SIGTERM to process tree at pid 26062
> Shutting down
> Killing the following process trees:
> [ 
> ]
> Sending SIGTERM to process tree at pid 26063
> Killing the following process trees:
> [ 
> ]
> Shutting down
> Sending SIGTERM to process tree at pid 26098
> Killing the following process trees:
> [ 
> ]
> Shutting down
> Sending SIGTERM to process tree at pid 26099
> Killing the following process trees:
> [ 
> ]
> WARNING: Logging before InitGoogleLogging() is written to STDERR
> I0813 19:55:17.161726 26100 process.cpp:1012] libprocess is initialized on 
> 172.17.2.10:60249 for 16 cpus
> I0813 19:55:17.161888 26100 logging.cpp:177] Logging to STDERR
> I0813 19:55:17.163625 26100 scheduler.cpp:157] Version: 0.24.0
> I0813 19:55:17.175302 26100 leveldb.cpp:176] Opened db in 3.167446ms
> I0813 19:55:17.176393 26100 leveldb.cpp:183] Compacted db in 1.047996ms
> I0813 19:55:17.176496 26100 leveldb.cpp:198] Created db iterator in 77155ns
> I0813 19:55:17.176518 26100 leveldb.cpp:204] Seeked to beginning of db in 
> 8429ns
> I0813 19:55:17.176527 26100 leveldb.cpp:273] Iterated through 0 keys in the 
> db in 4219ns
> I0813 19:55:17.176708 26100 replica.cpp:744] Replica recovered with log 
> positions 0 -> 0 with 1 holes and 0 unlearned
> I0813 19:55:17.178951 26136 recover.cpp:449] Starting replica recovery
> I0813 19:55:17.179934 26136 recover.cpp:475] Replica is in EMPTY status
> I0813 19:55:17.181970 26126 master.cpp:378] Master 
> 20150813-195517-167907756-60249-26100 (297daca2d01a) started on 
> 172.17.2.10:60249
> I0813 19:55:17.182317 26126 master.cpp:380] Flags at startup: 
> --acls="permissive: false
> register_frameworks {
>   principals {
> type: SOME
> values: "test-principal"
>   }
>   roles {
> type: SOME
> values: "*"
>   }
> }
> run_tasks {
>   principals {
> type: SOME
> values: "test-principal"
>   }
>   users {
> type: SOME
> values: "mesos"
>   }
> }
> " --allocation_interval="1secs" --allocator="HierarchicalDRF" 
> --authenticate="false" --authenticate_slaves="false" 
> --authenticators="crammd5" 
> --credentials="/tmp/ExamplesTest_EventCallFramework_k4vXkx/credentials" 
> --framework_sorter="drf" --help="false" --initialize_driver_logging="true" 
> --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" 
> --max_slave_ping_timeouts="5" --quiet="false" 
> --recovery_slave_removal_limit="100%" --registry="replicated_log" 
> --registry_fetch_timeout="1mins" --registry_store_timeout="5secs" 
> --registry_strict="false" --root_submissions="true" 
> --slave_ping_timeout="15secs" --slave_reregister_timeout="10mins" 
> --user_sorter="drf" --version="false" 
> --webui_dir="/mesos/mesos-0.24.0/src/webui" --work_dir="/tmp/mesos-II8Gua" 
> --zk_session_timeout="10secs"
> I0813 19:55:17.183475 26126 master.cpp:427] Master allowing unauthenticated 
> frameworks to register
> I0813 19:55:17.183536 26126 master.cpp:432] Master allowing unauthenticated 
> slaves to register
> I0813 19:55:17.183615 26126 credentials.hpp:37] Loading credentials for 
> authentication from '/tmp/ExamplesTest_EventCallFramework_k4vXkx/credentials'
> W0813 19:55:17.183859 26126 credentials.hpp:52] Permissions on credentials 
> file '/tmp/ExamplesTest_EventCallFramework_k4vXkx/credentials' are too open. 
> It is recommended that your credentials file is NOT accessible by others.
> I0813 19:55:17.183969 26123 replica.cpp:641] Replica in EMPTY status received 
> a 

[jira] [Updated] (MESOS-4437) Disable the test RegistryClientTest.BadTokenServerAddress.

2016-01-19 Thread Jojy Varghese (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jojy Varghese updated MESOS-4437:
-
Labels: mesosphere  (was: )

> Disable the test RegistryClientTest.BadTokenServerAddress.
> --
>
> Key: MESOS-4437
> URL: https://issues.apache.org/jira/browse/MESOS-4437
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Jojy Varghese
>Assignee: Jojy Varghese
>  Labels: mesosphere
>
> As we are retiring registry client, disable this test which looks flaky.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-4437) Disable the test RegistryClientTest.BadTokenServerAddress.

2016-01-19 Thread Jojy Varghese (JIRA)
Jojy Varghese created MESOS-4437:


 Summary: Disable the test RegistryClientTest.BadTokenServerAddress.
 Key: MESOS-4437
 URL: https://issues.apache.org/jira/browse/MESOS-4437
 Project: Mesos
  Issue Type: Task
  Components: containerization
Reporter: Jojy Varghese
Assignee: Jojy Varghese


As we are retiring registry client, disable this test which looks flaky.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4439) Fix appc CachedImage image validation

2016-01-19 Thread Jojy Varghese (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jojy Varghese updated MESOS-4439:
-
Description: 
Currently image validation is done assuming that the image's filename will have 
 digest (SHA-512) information. This is not part of the spec
(https://github.com/appc/spec/blob/master/spec/discovery.md).

The spec specifies the tuple  as unique identifier for  
discovering an image.


  was:
Currently image validation is done assuming that the image's filename will have 
 digest (SHA-512) information. This is not part of the spec
(https://github.com/appc/spec/blob/master/spec/discovery.md).

The spec specifies the tuple  as unique identifier for
discovering an image.



> Fix appc CachedImage image validation
> -
>
> Key: MESOS-4439
> URL: https://issues.apache.org/jira/browse/MESOS-4439
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Jojy Varghese
>Assignee: Jojy Varghese
>  Labels: mesosphere, unified-containerizer-mvp
>
> Currently image validation is done assuming that the image's filename will 
> have  digest (SHA-512) information. This is not part of the spec
> (https://github.com/appc/spec/blob/master/spec/discovery.md).
> 
> The spec specifies the tuple  as unique identifier 
> for  discovering an image.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-4439) Fix appc CachedImage image validation

2016-01-19 Thread Jojy Varghese (JIRA)
Jojy Varghese created MESOS-4439:


 Summary: Fix appc CachedImage image validation
 Key: MESOS-4439
 URL: https://issues.apache.org/jira/browse/MESOS-4439
 Project: Mesos
  Issue Type: Task
  Components: containerization
Reporter: Jojy Varghese
Assignee: Jojy Varghese


Currently image validation is done assuming that the image's filename will have 
 digest (SHA-512) information. This is not part of the spec
(https://github.com/appc/spec/blob/master/spec/discovery.md).

The spec specifies the tuple  as unique identifier for
discovering an image.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-3763) Need for http::put request method

2016-01-19 Thread Adam B (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107899#comment-15107899
 ] 

Adam B commented on MESOS-3763:
---

[~jvanremoortere] Are you going to have time to shepherd this patch, or should 
I take it over and commit it?

> Need for http::put request method
> -
>
> Key: MESOS-3763
> URL: https://issues.apache.org/jira/browse/MESOS-3763
> Project: Mesos
>  Issue Type: Task
>Reporter: Joerg Schad
>Assignee: Yongqiao Wang
>Priority: Minor
>  Labels: mesosphere
>
> We decided to create a more RESTful API for managing quota requests, so we 
> also want to use the HTTP PUT method. This requires enabling libprocess/http 
> to send PUT requests in addition to GET and POST requests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-4435) Update `Master::Http::stateSummary` to use `jsonify`.

2016-01-19 Thread Michael Park (JIRA)
Michael Park created MESOS-4435:
---

 Summary: Update `Master::Http::stateSummary` to use `jsonify`.
 Key: MESOS-4435
 URL: https://issues.apache.org/jira/browse/MESOS-4435
 Project: Mesos
  Issue Type: Task
  Components: master
Reporter: Michael Park
Assignee: Michael Park


Update {{state-summary}} to use {{jsonify}} to stay consistent with {{state}} 
HTTP endpoint.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4421) Document that /reserve, /create-volumes endpoints can return misleading "success"

2016-01-19 Thread Neil Conway (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neil Conway updated MESOS-4421:
---
  Sprint: Mesosphere Sprint 27
Story Points: 3

> Document that /reserve, /create-volumes endpoints can return misleading 
> "success"
> -
>
> Key: MESOS-4421
> URL: https://issues.apache.org/jira/browse/MESOS-4421
> Project: Mesos
>  Issue Type: Task
>  Components: documentation, master
>Reporter: Neil Conway
>Assignee: Neil Conway
>  Labels: documentation, endpoint, mesosphere, persistent-volumes, 
> reservations
>
> The docs for the {{/reserve}} endpoint say:
> {noformat}
> 200 OK: Success (the requested resources have been reserved).
> {noformat}
> This is not true: the master returns {{200}} when the request has been 
> validated and a {{CheckpointResourcesMessage}} has been sent to the agent, 
> but the master does not attempt to verify that the message has been received 
> or that the agent successfully checkpointed. Same behavior applies to 
> {{/unreserve}}, {{/create-volumes}}, and {{/destroy-volumes}}.
> We should _either_:
> 1. Accurately document what {{200}} return code means.
> 2. Change the implementation to wait for the agent's next checkpoint to 
> succeed (and to include the effect of the operation) before returning success 
> to the HTTP client.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-4421) Document that /reserve, /create-volumes endpoints can return misleading "success"

2016-01-19 Thread Neil Conway (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neil Conway updated MESOS-4421:
---
Shepherd: Jie Yu

> Document that /reserve, /create-volumes endpoints can return misleading 
> "success"
> -
>
> Key: MESOS-4421
> URL: https://issues.apache.org/jira/browse/MESOS-4421
> Project: Mesos
>  Issue Type: Task
>  Components: documentation, master
>Reporter: Neil Conway
>Assignee: Neil Conway
>  Labels: documentation, endpoint, mesosphere, persistent-volumes, 
> reservations
>
> The docs for the {{/reserve}} endpoint say:
> {noformat}
> 200 OK: Success (the requested resources have been reserved).
> {noformat}
> This is not true: the master returns {{200}} when the request has been 
> validated and a {{CheckpointResourcesMessage}} has been sent to the agent, 
> but the master does not attempt to verify that the message has been received 
> or that the agent successfully checkpointed. Same behavior applies to 
> {{/unreserve}}, {{/create-volumes}}, and {{/destroy-volumes}}.
> We should _either_:
> 1. Accurately document what {{200}} return code means.
> 2. Change the implementation to wait for the agent's next checkpoint to 
> succeed (and to include the effect of the operation) before returning success 
> to the HTTP client.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4111) Provide a means for libprocess users to exit while ensuring messages are flushed.

2016-01-19 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107856#comment-15107856
 ] 

haosdent commented on MESOS-4111:
-

oh, thank you very much.

> Provide a means for libprocess users to exit while ensuring messages are 
> flushed.
> -
>
> Key: MESOS-4111
> URL: https://issues.apache.org/jira/browse/MESOS-4111
> Project: Mesos
>  Issue Type: Bug
>  Components: libprocess
>Reporter: Benjamin Mahler
>Priority: Minor
>
> Currently after a {{send}} there is no way to ensure that the message is 
> flushed on the socket before terminating. We work around this by inserting 
> {{os::sleep}} calls (see MESOS-243, MESOS-4106).
> There are a number of approaches to this:
> (1) Return a Future from send that notifies when the message is flushed from 
> the system.
> (2) Call process::finalize before exiting. This would require that 
> process::finalize flushes all of the outstanding data on any active sockets, 
> which may block.
> Regardless of the approach, there needs to be a timer if we want to guarantee 
> termination.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4349) GMock warning in SlaveTest.ContainerUpdatedBeforeTaskReachesExecutor

2016-01-19 Thread Timothy Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15108131#comment-15108131
 ] 

Timothy Chen commented on MESOS-4349:
-

commit d31f9152a7250583c51f2e0568aa0b5a09cc88e9
Author: Neil Conway 
Date:   Tue Jan 19 22:27:14 2016 -0800

Fixed more tests that didn't set a shutdown expect for MockExecutor.

Specifically, the following tests:

MasterTest.OfferNotRescindedOnceUsed
OversubscriptionTest.FetchResourceUsageFromMonitor
OversubscriptionTest.QoSFetchResourceUsageFromMonitor
SlaveTest.ContainerUpdatedBeforeTaskReachesExecutor

Review: https://reviews.apache.org/r/42265/

> GMock warning in SlaveTest.ContainerUpdatedBeforeTaskReachesExecutor
> 
>
> Key: MESOS-4349
> URL: https://issues.apache.org/jira/browse/MESOS-4349
> Project: Mesos
>  Issue Type: Bug
>  Components: tests
>Reporter: Neil Conway
>Assignee: Neil Conway
>  Labels: mesosphere, tests
> Fix For: 0.27.0
>
>
> {noformat}
> [ RUN  ] SlaveTest.ContainerUpdatedBeforeTaskReachesExecutor
> GMOCK WARNING:
> Uninteresting mock function call - returning directly.
> Function call: shutdown(0x7fe189cae850)
> Stack trace:
> [   OK ] SlaveTest.ContainerUpdatedBeforeTaskReachesExecutor (51 ms)
> {noformat}
> Occurs non-deterministically for me on OSX 10.10, perhaps one run in ten.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4389) Master "roles" endpoint only shows active role

2016-01-19 Thread Fan Du (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15108147#comment-15108147
 ] 

Fan Du commented on MESOS-4389:
---

Based on the code review, this is by design; it does not matter much in 
practice, though.
Just a random puzzle :)

> Master "roles" endpoint only shows active role
> --
>
> Key: MESOS-4389
> URL: https://issues.apache.org/jira/browse/MESOS-4389
> Project: Mesos
>  Issue Type: Improvement
>  Components: HTTP API, master
>Reporter: Fan Du
>
> Register two slaves with the master, with roles "busybox" and "ubuntu" 
> respectively, then run Marathon with role "busybox". After this, the master 
> "roles" endpoint returns only the default and active roles. Could this be 
> improved to show all available roles for easier checking?
> {code}
> {
> "roles": [
> {
> "frameworks": [],
> "name": "*",
> "resources": {
> "cpus": 0,
> "disk": 0,
> "mem": 0
> },
> "weight": 1.0
> },
> {
> "frameworks": [
> "2caebb14-161f-4941-b8ab-8990cef01ac0-"
> ],
> "name": "busybox",
> "resources": {
> "cpus": 0,
> "disk": 0,
> "mem": 0
> },
> "weight": 1.0
> }
> ]
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4435) Update `Master::Http::stateSummary` to use `jsonify`.

2016-01-19 Thread Michael Park (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15108003#comment-15108003
 ] 

Michael Park commented on MESOS-4435:
-

https://reviews.apache.org/r/42543/
https://reviews.apache.org/r/42546/

> Update `Master::Http::stateSummary` to use `jsonify`.
> -
>
> Key: MESOS-4435
> URL: https://issues.apache.org/jira/browse/MESOS-4435
> Project: Mesos
>  Issue Type: Task
>  Components: master
>Reporter: Michael Park
>Assignee: Michael Park
>
> Update {{state-summary}} to use {{jsonify}} to stay consistent with {{state}} 
> HTTP endpoint.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4339) Add weight support for framework sorter

2016-01-19 Thread Fan Du (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15108142#comment-15108142
 ] 

Fan Du commented on MESOS-4339:
---

[~adam-mesos] and [~bbannier]
Based on the proposal documentation from MESOS-4284, it is well justified to 
enable a weighted DRF framework sorter in a multi-role scenario, to keep the 
allocation decision fair across roles and frameworks. The design of the 
weighted DRF framework sorter is independent of that of multi-role frameworks 
(which is what I thought before, incompletely), but in implementation the 
former apparently needs to be done *AFTER* multi-role frameworks.

So, if you don't mind, I would still like to contribute this ticket to 
multi-role frameworks.

> Add weight support for framework sorter
> ---
>
> Key: MESOS-4339
> URL: https://issues.apache.org/jira/browse/MESOS-4339
> Project: Mesos
>  Issue Type: Improvement
>  Components: allocation
>Reporter: Fan Du
>Assignee: Fan Du
>
> The current framework sorter doesn't take weights into account when sorting 
> frameworks belonging to a particular role, i.e., all frameworks have an equal 
> weight of 1. Since the role weight is controlled by the operator, enabling 
> framework weights does not let any greedy framework affect the role-level 
> allocation decision, but it will benefit frameworks that should get more 
> resources within a specific role.
> The framework weight will come from the FrameworkInfo message when the 
> framework registers, and FrameworkSorters will "add" the framework with its 
> weight. This will eventually result in a weighted framework sorting flow when 
> the master makes the final allocation decision.
> Please review this ticket, which I will work on if it's considered acceptable.
> Thanks a lot.
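
One way to picture the proposed weighted sorting is to divide each framework's dominant share by its weight before ordering. The sketch below makes that assumption explicit; {{Framework}}, {{weightedShare}} and {{sortByWeightedShare}} are hypothetical names for illustration, not the Mesos sorter API.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

namespace sketch {

// Hypothetical view of a framework inside one role.
struct Framework
{
  std::string name;
  double dominantShare; // fraction of the dominant resource it holds
  double weight;        // assumed to be > 0; 1.0 is the unweighted case
};

// A larger weight makes the same share count for less.
inline double weightedShare(const Framework& f)
{
  return f.dominantShare / f.weight;
}

// Returns framework names ordered from the lowest weighted share (next
// in line for an offer) to the highest.
inline std::vector<std::string> sortByWeightedShare(
    std::vector<Framework> frameworks)
{
  std::stable_sort(
      frameworks.begin(),
      frameworks.end(),
      [](const Framework& a, const Framework& b) {
        return weightedShare(a) < weightedShare(b);
      });

  std::vector<std::string> names;
  for (const Framework& f : frameworks) {
    names.push_back(f.name);
  }
  return names;
}

} // namespace sketch
```

For example, a framework holding a 0.6 share with weight 2 sorts ahead of one holding a 0.4 share with weight 1, because 0.6 / 2 is smaller than 0.4 / 1.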



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (MESOS-4146) Distinguish usage slack and allocation slack revocable resources

2016-01-19 Thread Guangya Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15055424#comment-15055424
 ] 

Guangya Liu edited comment on MESOS-4146 at 1/20/16 7:43 AM:
-

https://reviews.apache.org/r/41333/
https://reviews.apache.org/r/41334/
https://reviews.apache.org/r/42547/


was (Author: gyliu):
https://reviews.apache.org/r/41333/
https://reviews.apache.org/r/41334/

> Distinguish usage slack and allocation slack revocable resources
> 
>
> Key: MESOS-4146
> URL: https://issues.apache.org/jira/browse/MESOS-4146
> Project: Mesos
>  Issue Type: Bug
>Reporter: Guangya Liu
>Assignee: Guangya Liu
>
> The API revocable() currently returns all revocable resources, including 
> both allocation slack and usage slack. It would be better to add two new APIs 
> that return the revocable resources for allocation slack and usage slack 
> separately.
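
A rough sketch of the split described above, under stated assumptions: the {{Slack}} enum, the {{Resource}} struct and the functions {{usageSlack}} and {{allocationSlack}} are hypothetical names chosen for the example and are not the API proposed in the linked reviews.

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace sketch {

// Hypothetical tag for why a resource is revocable.
enum class Slack { NONE, USAGE, ALLOCATION };

struct Resource
{
  std::string name;
  double amount;
  Slack slack; // NONE means the resource is not revocable
};

// Shared filter used by the two narrower accessors.
inline std::vector<Resource> withSlack(
    const std::vector<Resource>& all, Slack kind)
{
  std::vector<Resource> out;
  for (const Resource& r : all) {
    if (r.slack == kind) {
      out.push_back(r);
    }
  }
  return out;
}

// Existing behaviour as described: every revocable resource, any kind.
inline std::vector<Resource> revocable(const std::vector<Resource>& all)
{
  std::vector<Resource> out;
  for (const Resource& r : all) {
    if (r.slack != Slack::NONE) {
      out.push_back(r);
    }
  }
  return out;
}

inline std::vector<Resource> usageSlack(const std::vector<Resource>& all)
{
  return withSlack(all, Slack::USAGE);
}

inline std::vector<Resource> allocationSlack(const std::vector<Resource>& all)
{
  return withSlack(all, Slack::ALLOCATION);
}

// Small sample set for trying the accessors.
inline std::vector<Resource> example()
{
  return {
    {"cpus", 1.0, Slack::USAGE},
    {"mem", 512.0, Slack::ALLOCATION},
    {"disk", 1024.0, Slack::NONE},
  };
}

} // namespace sketch
```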



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-4440) Clean get/post/deleteRequest func and let the caller use the general function.

2016-01-19 Thread Yongqiao Wang (JIRA)
Yongqiao Wang created MESOS-4440:


 Summary: Clean get/post/deleteRequest func and let the caller 
use the general function.
 Key: MESOS-4440
 URL: https://issues.apache.org/jira/browse/MESOS-4440
 Project: Mesos
  Issue Type: Bug
Reporter: Yongqiao Wang
Assignee: Yongqiao Wang
Priority: Minor






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4440) Clean get/post/deleteRequest func and let the caller use the general function.

2016-01-19 Thread Yongqiao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15108162#comment-15108162
 ] 

Yongqiao Wang commented on MESOS-4440:
--

In MESOS-3763, we exposed the internal::http::request function in the header, 
so this ticket cleans up the other instances of post/get to use the 
http::request method.

> Clean get/post/deleteRequest func and let the caller use the general 
> function.
> 
>
> Key: MESOS-4440
> URL: https://issues.apache.org/jira/browse/MESOS-4440
> Project: Mesos
>  Issue Type: Bug
>Reporter: Yongqiao Wang
>Assignee: Yongqiao Wang
>Priority: Minor
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-2930) Allow the Resource Estimator to express over-allocation of revocable resources.

2016-01-19 Thread Niklas Quarfot Nielsen (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106688#comment-15106688
 ] 

Niklas Quarfot Nielsen commented on MESOS-2930:
---

Hi [~bmahler] - sorry for the super tardy reply.

For Serenity, the Estimator and QoS controllers act as edges on a shared 
pipeline of filters (which lives in its own actor). In short, the estimator 
pushes usage statistics in and awaits estimates, whereas the QoS controller 
awaits corrections from the pipeline.

> Allow the Resource Estimator to express over-allocation of revocable 
> resources.
> ---
>
> Key: MESOS-2930
> URL: https://issues.apache.org/jira/browse/MESOS-2930
> Project: Mesos
>  Issue Type: Improvement
>  Components: slave
>Reporter: Benjamin Mahler
>Assignee: Klaus Ma
>
> Currently the resource estimator returns the amount of oversubscription 
> resources that are available, since resources cannot be negative, this allows 
> the resource estimator to express the following:
> (1) Return empty resources: We are fully allocated for oversubscription 
> resources.
> (2) Return non-empty resources: We are under-allocated for oversubscription 
> resources. In other words, some are available.
> However, there is an additional situation that we cannot express:
> (3) Analogous to returning non-empty "negative" resources: We are 
> over-allocated for oversubscription resources. Do not re-offer any of the 
> over-allocated oversubscription resources that are recovered.
> Without (3), the slave can only shrink the total pool of oversubscription 
> resources by returning (1) as resources are recovered, until the pool is 
> shrunk to the desired size. However, this approach is only best-effort, it's 
> possible for a framework to launch more tasks in the window of time (15 
> seconds by default) that the slave polls the estimator.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-3889) Modify Oversubscription documentation to explicitly forbid the QoS Controller from killing executors running on optimistically offered resources.

2016-01-19 Thread Niklas Quarfot Nielsen (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106748#comment-15106748
 ] 

Niklas Quarfot Nielsen commented on MESOS-3889:
---

[~hartem] [~klaus1982] Can you add a bit of context on this ticket? :)

> Modify Oversubscription documentation to explicitly forbid the QoS Controller 
> from killing executors running on optimistically offered resources.
> -
>
> Key: MESOS-3889
> URL: https://issues.apache.org/jira/browse/MESOS-3889
> Project: Mesos
>  Issue Type: Bug
>Reporter: Artem Harutyunyan
>Assignee: Klaus Ma
>  Labels: mesosphere
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (MESOS-2930) Allow the Resource Estimator to express over-allocation of revocable resources.

2016-01-19 Thread Klaus Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106863#comment-15106863
 ] 

Klaus Ma edited comment on MESOS-2930 at 1/19/16 3:21 PM:
--

[~nnielsen], there's a case we need to handle regarding this JIRA:

* T1: in cluster, {{cpus=2}}: one is revocable and the other one is nonRevocable
* T2: {{framework1}} get offer {{cpus=2}}, but did NOT launch tasks
* T3: Estimator update empty resources; {{slave.total}} is updated to 
{{cpus=1}} in {{HierarchicalAllocatorProcess::updateSlave}}
* T4: in {{allocate()}}, slave.total (cpus=1) < slave.allocated (cpus=2), the 
resources {{cpus=1}} will re-offer to framework because {{operator-}} will 
return first item if {{subtractable}} is false.

Any comments?


was (Author: klaus1982):
[~nnielsen], there's a case we need to handle regarding this JIRA:

* T1: in cluster, {{cpus=2}}: one is revocable and the other one is nonRevocable
* T2: {{framework1}} get offer {{cpus=2}}, but did NOT launch tasks
* T3: Estimator update empty resources; {{slave.total}} is updated to 
{{cpus=1}} in {{HierarchicalAllocatorProcess::updateSlave}}
* T4: in {{allocate()}}, slave.total (cpus=1) < slave.allocated (cpus=2), the 
resources {{cpus=1}} will re-offer to framework because {{operator-}} will 
return first item is {{subtractable}} is false.

Any comments?

> Allow the Resource Estimator to express over-allocation of revocable 
> resources.
> ---
>
> Key: MESOS-2930
> URL: https://issues.apache.org/jira/browse/MESOS-2930
> Project: Mesos
>  Issue Type: Improvement
>  Components: slave
>Reporter: Benjamin Mahler
>Assignee: Klaus Ma
>
> Currently the resource estimator returns the amount of oversubscription 
> resources that are available, since resources cannot be negative, this allows 
> the resource estimator to express the following:
> (1) Return empty resources: We are fully allocated for oversubscription 
> resources.
> (2) Return non-empty resources: We are under-allocated for oversubscription 
> resources. In other words, some are available.
> However, there is an additional situation that we cannot express:
> (3) Analogous to returning non-empty "negative" resources: We are 
> over-allocated for oversubscription resources. Do not re-offer any of the 
> over-allocated oversubscription resources that are recovered.
> Without (3), the slave can only shrink the total pool of oversubscription 
> resources by returning (1) as resources are recovered, until the pool is 
> shrunk to the desired size. However, this approach is only best-effort, it's 
> possible for a framework to launch more tasks in the window of time (15 
> seconds by default) that the slave polls the estimator.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-2930) Allow the Resource Estimator to express over-allocation of revocable resources.

2016-01-19 Thread Klaus Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106863#comment-15106863
 ] 

Klaus Ma commented on MESOS-2930:
-

[~nnielsen], there's a case we need to handle regarding this JIRA:

* T1: in the cluster, {{cpus=2}}: one is revocable and the other is non-revocable
* T2: {{framework1}} gets an offer of {{cpus=2}}, but does NOT launch tasks
* T3: The estimator reports empty resources; {{slave.total}} is updated to 
{{cpus=1}} in {{HierarchicalAllocatorProcess::updateSlave}}
* T4: in {{allocate()}}, slave.total (cpus=1) < slave.allocated (cpus=2), so the 
resources {{cpus=1}} will be re-offered to the framework, because {{operator-}} 
returns its first operand when {{subtractable}} is false.

Any comments?

> Allow the Resource Estimator to express over-allocation of revocable 
> resources.
> ---
>
> Key: MESOS-2930
> URL: https://issues.apache.org/jira/browse/MESOS-2930
> Project: Mesos
>  Issue Type: Improvement
>  Components: slave
>Reporter: Benjamin Mahler
>Assignee: Klaus Ma
>
> Currently the resource estimator returns the amount of oversubscription 
> resources that are available. Since resources cannot be negative, this allows 
> the resource estimator to express the following:
> (1) Return empty resources: We are fully allocated for oversubscription 
> resources.
> (2) Return non-empty resources: We are under-allocated for oversubscription 
> resources. In other words, some are available.
> However, there is an additional situation that we cannot express:
> (3) Analogous to returning non-empty "negative" resources: We are 
> over-allocated for oversubscription resources. Do not re-offer any of the 
> over-allocated oversubscription resources that are recovered.
> Without (3), the slave can only shrink the total pool of oversubscription 
> resources by returning (1) as resources are recovered, until the pool is 
> shrunk to the desired size. However, this approach is only best-effort: it's 
> possible for a framework to launch more tasks in the window of time (15 
> seconds by default) between the slave's polls of the estimator.
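
Case (3) in the description can be illustrated with a small sketch. This is only a toy model under stated assumptions, not the Mesos allocator or estimator API: {{RevocablePool}}, {{update}} and {{recover}} are hypothetical names, and the estimate is reduced to a single signed cpu value where a negative number means "over-allocated by this much".

```cpp
#include <algorithm>

// Hypothetical model (not Mesos code): tracks how far the revocable
// pool is over-allocated.
struct RevocablePool {
  // Revocable cpus that must still be withheld from offers (case 3).
  // Zero corresponds to cases (1) and (2).
  double deficit = 0.0;

  // Assumption: the estimator reports a signed estimate, where a
  // negative value means "over-allocated by this much".
  void update(double estimate) {
    deficit = estimate < 0.0 ? -estimate : 0.0;
  }

  // Called when `amount` revocable cpus are recovered. Returns how much
  // of the recovered amount may be offered again.
  double recover(double amount) {
    double withheld = std::min(amount, deficit);
    deficit -= withheld;
    return amount - withheld;
  }
};
```

With an estimate of -1.5, the first recovered cpu is withheld entirely, half of the second is withheld, and later recoveries are offered in full. The point of the sketch is that the pool shrinks deterministically instead of depending on the polling window.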





[jira] [Created] (MESOS-4429) Add oversubscription benchmark/stress/test framework

2016-01-19 Thread Niklas Quarfot Nielsen (JIRA)
Niklas Quarfot Nielsen created MESOS-4429:
-

 Summary: Add oversubscription benchmark/stress/test framework
 Key: MESOS-4429
 URL: https://issues.apache.org/jira/browse/MESOS-4429
 Project: Mesos
  Issue Type: Task
Reporter: Niklas Quarfot Nielsen


To evaluate the function and quality of oversubscription modules, we could ship 
a test framework which can:
1) Launch on oversubscribed and non-oversubscribed resources in a controlled 
manner. For example, register as two different frameworks and check that 
slack resources of one framework can be used by the other.
2) Measure the time to react in different scenarios. For example, measure the 
time it takes from slack appearing on a slave to an offer being issued with 
revocable resources, or the time to react to changing usage patterns, e.g., the 
time to reclaim oversubscribed resources when regular tasks need them back.
3) Count the number of offer rescinds, preemptions, etc. to assess the stability 
of the policy.
4) Measure the percentage of extra work that can be run.
5) Work across different resource dimensions such as CPU time, memory, network, 
and caches.

[~Bartek Plotka] has been working on something similar for Serenity in 
https://github.com/mesosphere/serenity/tree/master/src/framework which we can 
reuse as a base.





[jira] [Updated] (MESOS-4102) Quota doesn't allocate resources on slave joining.

2016-01-19 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-4102:
---
Assignee: Klaus Ma  (was: Alexander Rukletsov)

> Quota doesn't allocate resources on slave joining.
> --
>
> Key: MESOS-4102
> URL: https://issues.apache.org/jira/browse/MESOS-4102
> Project: Mesos
>  Issue Type: Bug
>  Components: allocation
>Reporter: Neil Conway
>Assignee: Klaus Ma
>Priority: Blocker
>  Labels: mesosphere, quota
> Attachments: quota_absent_framework_test-1.patch
>
>
> See attached patch. {{framework1}} is not allocated any resources, despite 
> the fact that the resources on {{agent2}} can safely be allocated to it 
> without risk of violating {{quota1}}. If I understand the intended quota 
> behavior correctly, this doesn't seem intended.
> Note that if the framework is added _after_ the slaves are added, the 
> resources on {{agent2}} are allocated to {{framework1}}.





[jira] [Updated] (MESOS-4369) Enhance DockContainerizer to support Docker network created with Docker CLI

2016-01-19 Thread Ezra Silvera (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ezra Silvera updated MESOS-4369:

Description: 
Currently DockerContainerizer supports the following network options which are 
Docker built-in networks:
{code}
message DockerInfo {
...
// Network options.
enum Network {
  HOST = 1;
  BRIDGE = 2;
  NONE = 3;
}
...
{code}
However, since Docker 1.9, Docker supports user-defined networks (both 
local and overlay), e.g., {{docker network create --driver bridge 
my-network}}. The user can then create containers that are attached to 
these networks, e.g., {{docker run --net=my-network}}.
We need to enhance DockerExecutor to support such a network option so that the 
Docker container can connect to such a network.

  was:
Currently DockerContainerizer supports the following network options which are 
Docker built-in networks:
{code}
message DockerInfo {
...
// Network options.
enum Network {
  HOST = 1;
  BRIDGE = 2;
  NONE = 3;
}
...
{code}
However, with Docker CLI, user can create a customized network, e.g., {{docker 
network create my-network}}, we need to enhance DockerContainerizer to support 
such network so that the Docker container that user creates in Mesos with 
DockerContainerizer can connect into such network.


> Enhance DockContainerizer to support Docker network created with Docker CLI
> ---
>
> Key: MESOS-4369
> URL: https://issues.apache.org/jira/browse/MESOS-4369
> Project: Mesos
>  Issue Type: Improvement
>  Components: docker
>Reporter: Qian Zhang
>Assignee: Ezra Silvera
>
> Currently DockerContainerizer supports the following network options which 
> are Docker built-in networks:
> {code}
> message DockerInfo {
> ...
> // Network options.
> enum Network {
>   HOST = 1;
>   BRIDGE = 2;
>   NONE = 3;
> }
> ...
> {code}
> However, since Docker 1.9, Docker supports user-defined networks (both 
> local and overlay), e.g., {{docker network create --driver bridge 
> my-network}}. The user can then create containers that are attached 
> to these networks, e.g., {{docker run --net=my-network}}.
> We need to enhance DockerExecutor to support such a network option so that the 
> Docker container can connect to such a network.
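
One way to picture the requested change is a mapping from the network mode to the {{--net}} argument of {{docker run}}. The sketch below is an assumption-laden illustration, not the DockerContainerizer code: the {{USER}} enum value and the {{netArgument}} helper are hypothetical names introduced here, and only the three built-in modes come from the {{DockerInfo}} message quoted above.

```cpp
#include <string>

// Mirrors DockerInfo::Network from the description. USER is a
// hypothetical addition for user-defined networks, used here only
// for illustration.
enum class Network { HOST = 1, BRIDGE = 2, NONE = 3, USER = 4 };

// Builds the `--net` argument for `docker run`. `name` is only used for
// Network::USER and is the network created with `docker network create`.
std::string netArgument(Network mode, const std::string& name = "") {
  switch (mode) {
    case Network::HOST:   return "--net=host";
    case Network::BRIDGE: return "--net=bridge";
    case Network::NONE:   return "--net=none";
    case Network::USER:   return "--net=" + name;
  }
  return "";  // Not reached for the enum values listed above.
}
```

For the example in the description, {{netArgument(Network::USER, "my-network")}} yields {{--net=my-network}}, while the built-in modes keep their existing arguments.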





[jira] [Created] (MESOS-4430) Identify and change logging level for message that don't contain specific task/framework/slave info

2016-01-19 Thread Kapil Arya (JIRA)
Kapil Arya created MESOS-4430:
-

 Summary: Identify and change logging level for message that don't 
contain specific task/framework/slave info
 Key: MESOS-4430
 URL: https://issues.apache.org/jira/browse/MESOS-4430
 Project: Mesos
  Issue Type: Bug
Reporter: Kapil Arya


The idea is to identify messages such as:

{code}
mesos-slave[37891]: I0117 15:20:15.357344 37941 slave.cpp:4200] Received 
oversubscribable resources  from the resource estimator
mesos-slave[37891]: I0117 15:20:30.357959 37957 slave.cpp:4186] Querying 
resource estimator for oversubscribable resources
{code}

and remove them from the default logging level. These messages don't provide any 
value to the sysadmin, etc., and fill up the logs. In one incident, we observed 
over 12K lines of such messages in the log over a 33-hour run of a cluster.





[jira] [Commented] (MESOS-4422) Use adaptor::reverse for reverse iteration in the code base.

2016-01-19 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106926#comment-15106926
 ] 

haosdent commented on MESOS-4422:
-

A naive question: I saw [~tnachen] add reverse_foreach in 
[r42379|https://reviews.apache.org/r/42379/] earlier. Why did we give up on that 
proposal?

> Use adaptor::reverse for reverse iteration in the code base.
> 
>
> Key: MESOS-4422
> URL: https://issues.apache.org/jira/browse/MESOS-4422
> Project: Mesos
>  Issue Type: Task
>Reporter: Jie Yu
>Assignee: haosdent
>
> It would be good to be consistent on our looping structure.
> Currently, we use foreach for forward iteration and use rbegin/rend for 
> reverse iteration. We recently added adaptor::reverse 
> (https://reviews.apache.org/r/42450) in stout, which allows us to do:
> {noformat}
> vector<int> input = {};
> foreach (int i, adaptor::reverse(input)) {
>   ...
> }
> {noformat}
> We should clean up our code to consistently use this structure for reverse 
> iteration.
> {noformat}
> jie$ grep -R rbegin src
> src/common/protobuf_utils.cpp:  for (auto status = task.statuses().rbegin();
> src/slave/containerizer/mesos/containerizer.cpp:  for (auto it = 
> isolators.crbegin(); it != isolators.crend(); ++it) {
> {noformat}
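
To show what such a cleanup buys, here is a self-contained sketch in standard C++11. It does not use stout: {{Reversed}}, {{reversed}} and {{collectReversed}} are hypothetical stand-ins for {{adaptor::reverse}} and {{foreach}}, written only to illustrate how a reverse adaptor replaces an explicit rbegin/rend loop.

```cpp
#include <vector>

// Minimal stand-in for a reverse range adaptor (not the stout
// implementation): exposes begin()/end() that walk the container
// backwards, so it can be used in a range-based for loop.
template <typename Container>
struct Reversed {
  const Container& container;

  typename Container::const_reverse_iterator begin() const {
    return container.rbegin();
  }

  typename Container::const_reverse_iterator end() const {
    return container.rend();
  }
};

template <typename Container>
Reversed<Container> reversed(const Container& container) {
  return Reversed<Container>{container};
}

// Collects the elements of `input` in reverse order without spelling
// out rbegin()/rend() at the call site.
std::vector<int> collectReversed(const std::vector<int>& input) {
  std::vector<int> output;
  for (int i : reversed(input)) {
    output.push_back(i);
  }
  return output;
}
```

The loop body reads the same as a forward loop, which is the consistency argument made in the description.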





[jira] [Assigned] (MESOS-4422) Use adaptor::reverse for reverse iteration in the code base.

2016-01-19 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent reassigned MESOS-4422:
---

Assignee: haosdent

> Use adaptor::reverse for reverse iteration in the code base.
> 
>
> Key: MESOS-4422
> URL: https://issues.apache.org/jira/browse/MESOS-4422
> Project: Mesos
>  Issue Type: Task
>Reporter: Jie Yu
>Assignee: haosdent
>
> It would be good to be consistent on our looping structure.
> Currently, we use foreach for forward iteration and use rbegin/rend for 
> reverse iteration. We recently added adaptor::reverse 
> (https://reviews.apache.org/r/42450) in stout, which allows us to do:
> {noformat}
> vector<int> input = {};
> foreach (int i, adaptor::reverse(input)) {
>   ...
> }
> {noformat}
> We should clean up our code to consistently use this structure for reverse 
> iteration.
> {noformat}
> jie$ grep -R rbegin src
> src/common/protobuf_utils.cpp:  for (auto status = task.statuses().rbegin();
> src/slave/containerizer/mesos/containerizer.cpp:  for (auto it = 
> isolators.crbegin(); it != isolators.crend(); ++it) {
> {noformat}





[jira] [Commented] (MESOS-1718) Command executor can overcommit the slave.

2016-01-19 Thread Vinod Kone (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107782#comment-15107782
 ] 

Vinod Kone commented on MESOS-1718:
---

I don't have cycles to shepherd this at the moment :(

> Command executor can overcommit the slave.
> --
>
> Key: MESOS-1718
> URL: https://issues.apache.org/jira/browse/MESOS-1718
> Project: Mesos
>  Issue Type: Bug
>  Components: slave
>Reporter: Benjamin Mahler
>Assignee: Klaus Ma
>
> Currently we give a small amount of resources to the command executor, in 
> addition to resources used by the command task:
> https://github.com/apache/mesos/blob/0.20.0-rc1/src/slave/slave.cpp#L2448
> {code: title=}
> ExecutorInfo Slave::getExecutorInfo(
> const FrameworkID& frameworkId,
> const TaskInfo& task)
> {
>   ...
> // Add an allowance for the command executor. This does lead to a
> // small overcommit of resources.
> executor.mutable_resources()->MergeFrom(
> Resources::parse(
>   "cpus:" + stringify(DEFAULT_EXECUTOR_CPUS) + ";" +
>   "mem:" + stringify(DEFAULT_EXECUTOR_MEM.megabytes())).get());
>   ...
> }
> {code}
> This leads to an overcommit of the slave. Ideally, for command tasks we can 
> "transfer" all of the task resources to the executor at the slave / isolation 
> level.
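
The size of the overcommit is easy to work out with a small calculation. The sketch below is illustrative only: {{EXECUTOR_CPUS}} is an assumed allowance of 0.1 cpus per command executor (the real value comes from {{DEFAULT_EXECUTOR_CPUS}} and may differ), and {{cpuOvercommit}} is a hypothetical helper, not a function in the slave.

```cpp
// Assumed per-executor allowance. The real value comes from
// DEFAULT_EXECUTOR_CPUS and may differ from this placeholder.
const double EXECUTOR_CPUS = 0.1;

// Returns by how many cpus the slave is overcommitted when `tasks`
// command tasks of `taskCpus` each are launched on a slave that
// advertises `slaveCpus`. A result <= 0 means no overcommit.
double cpuOvercommit(double slaveCpus, int tasks, double taskCpus) {
  double used = tasks * (taskCpus + EXECUTOR_CPUS);
  return used - slaveCpus;
}
```

Under these assumptions, a slave with 4 cpus running four 1-cpu command tasks ends up using about 4.4 cpus, i.e. it is overcommitted by roughly 0.4 cpus. The overcommit grows linearly with the number of command tasks.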





[jira] [Commented] (MESOS-1718) Command executor can overcommit the slave.

2016-01-19 Thread Klaus Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107811#comment-15107811
 ] 

Klaus Ma commented on MESOS-1718:
-

NP :). Your comments are always helpful. I'll find a volunteer for this JIRA.

> Command executor can overcommit the slave.
> --
>
> Key: MESOS-1718
> URL: https://issues.apache.org/jira/browse/MESOS-1718
> Project: Mesos
>  Issue Type: Bug
>  Components: slave
>Reporter: Benjamin Mahler
>Assignee: Klaus Ma
>
> Currently we give a small amount of resources to the command executor, in 
> addition to resources used by the command task:
> https://github.com/apache/mesos/blob/0.20.0-rc1/src/slave/slave.cpp#L2448
> {code: title=}
> ExecutorInfo Slave::getExecutorInfo(
> const FrameworkID& frameworkId,
> const TaskInfo& task)
> {
>   ...
> // Add an allowance for the command executor. This does lead to a
> // small overcommit of resources.
> executor.mutable_resources()->MergeFrom(
> Resources::parse(
>   "cpus:" + stringify(DEFAULT_EXECUTOR_CPUS) + ";" +
>   "mem:" + stringify(DEFAULT_EXECUTOR_MEM.megabytes())).get());
>   ...
> }
> {code}
> This leads to an overcommit of the slave. Ideally, for command tasks we can 
> "transfer" all of the task resources to the executor at the slave / isolation 
> level.





[jira] [Updated] (MESOS-4417) Prevent allocator from crashing on successful recovery.

2016-01-19 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-4417:
---
Summary: Prevent allocator from crashing on successful recovery.  (was: 
Refactor allocator recovery.)

> Prevent allocator from crashing on successful recovery.
> ---
>
> Key: MESOS-4417
> URL: https://issues.apache.org/jira/browse/MESOS-4417
> Project: Mesos
>  Issue Type: Bug
>  Components: allocation
>Reporter: Alexander Rukletsov
>Assignee: Alexander Rukletsov
>Priority: Blocker
>  Labels: mesosphere
>
> There might be a bug that may crash the master as pointed out by [~bmahler] 
> in https://reviews.apache.org/r/4/.





[jira] [Updated] (MESOS-4382) Change the `principal` in `ReservationInfo` to optional

2016-01-19 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-4382:
-
Sprint: Mesosphere Sprint 26  (was: Mesosphere Sprint 27)

> Change the `principal` in `ReservationInfo` to optional
> ---
>
> Key: MESOS-4382
> URL: https://issues.apache.org/jira/browse/MESOS-4382
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: mesosphere, reservations
>
> With the addition of HTTP endpoints for {{/reserve}} and {{/unreserve}}, it 
> is now desirable to allow dynamic reservations without a principal, in the 
> case where HTTP authentication is disabled. To allow for this, we will change 
> the {{principal}} field in {{ReservationInfo}} from required to optional. For 
> backwards-compatibility, however, the master should currently invalidate any 
> {{ReservationInfo}} messages that do not have this field set.
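
The backwards-compatibility check described above can be sketched as a validation step. This is a simplified illustration, not the master's validation code: the {{ReservationInfo}} struct below is a hand-written stand-in for the protobuf message, {{hasPrincipal}} mimics the {{has_principal()}} accessor that protobuf generates for an optional field, and {{validate}} is a hypothetical helper.

```cpp
#include <string>

// Simplified stand-in for the protobuf message. `hasPrincipal` mimics
// the generated has_principal() accessor of an optional field.
struct ReservationInfo {
  bool hasPrincipal;
  std::string principal;
};

// Returns an empty string if the reservation is accepted, otherwise a
// description of the validation error.
std::string validate(const ReservationInfo& reservation) {
  if (!reservation.hasPrincipal) {
    return "ReservationInfo is missing the 'principal' field";
  }
  return "";
}
```

Making the field optional in the message while rejecting unset values in validation keeps the current behavior, and the check can be relaxed later without another schema change.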





[jira] [Commented] (MESOS-4102) Quota doesn't allocate resources on slave joining.

2016-01-19 Thread Alexander Rukletsov (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107026#comment-15107026
 ] 

Alexander Rukletsov commented on MESOS-4102:


https://reviews.apache.org/r/42510/

> Quota doesn't allocate resources on slave joining.
> --
>
> Key: MESOS-4102
> URL: https://issues.apache.org/jira/browse/MESOS-4102
> Project: Mesos
>  Issue Type: Bug
>  Components: allocation
>Reporter: Neil Conway
>Assignee: Klaus Ma
>Priority: Blocker
>  Labels: mesosphere, quota
> Attachments: quota_absent_framework_test-1.patch
>
>
> See attached patch. {{framework1}} is not allocated any resources, despite 
> the fact that the resources on {{agent2}} can safely be allocated to it 
> without risk of violating {{quota1}}. If I understand the intended quota 
> behavior correctly, this doesn't seem intended.
> Note that if the framework is added _after_ the slaves are added, the 
> resources on {{agent2}} are allocated to {{framework1}}.





[jira] [Issue Comment Deleted] (MESOS-4102) Quota doesn't allocate resources on slave joining.

2016-01-19 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-4102:
---
Comment: was deleted

(was: https://reviews.apache.org/r/42510/)

> Quota doesn't allocate resources on slave joining.
> --
>
> Key: MESOS-4102
> URL: https://issues.apache.org/jira/browse/MESOS-4102
> Project: Mesos
>  Issue Type: Bug
>  Components: allocation
>Reporter: Neil Conway
>Assignee: Klaus Ma
>Priority: Blocker
>  Labels: mesosphere, quota
> Attachments: quota_absent_framework_test-1.patch
>
>
> See attached patch. {{framework1}} is not allocated any resources, despite 
> the fact that the resources on {{agent2}} can safely be allocated to it 
> without risk of violating {{quota1}}. If I understand the intended quota 
> behavior correctly, this doesn't seem intended.
> Note that if the framework is added _after_ the slaves are added, the 
> resources on {{agent2}} are allocated to {{framework1}}.





[jira] [Comment Edited] (MESOS-4102) Quota doesn't allocate resources on slave joining.

2016-01-19 Thread Alexander Rukletsov (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106854#comment-15106854
 ] 

Alexander Rukletsov edited comment on MESOS-4102 at 1/19/16 5:17 PM:
-

https://reviews.apache.org/r/42289
https://reviews.apache.org/r/42510/


was (Author: alexr):
https://reviews.apache.org/r/42289

> Quota doesn't allocate resources on slave joining.
> --
>
> Key: MESOS-4102
> URL: https://issues.apache.org/jira/browse/MESOS-4102
> Project: Mesos
>  Issue Type: Bug
>  Components: allocation
>Reporter: Neil Conway
>Assignee: Klaus Ma
>Priority: Blocker
>  Labels: mesosphere, quota
> Attachments: quota_absent_framework_test-1.patch
>
>
> See attached patch. {{framework1}} is not allocated any resources, despite 
> the fact that the resources on {{agent2}} can safely be allocated to it 
> without risk of violating {{quota1}}. If I understand the intended quota 
> behavior correctly, this doesn't seem intended.
> Note that if the framework is added _after_ the slaves are added, the 
> resources on {{agent2}} are allocated to {{framework1}}.





[jira] [Commented] (MESOS-4411) Traverse all roles for quota allocation

2016-01-19 Thread Alexander Rukletsov (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107025#comment-15107025
 ] 

Alexander Rukletsov commented on MESOS-4411:


https://reviews.apache.org/r/42511/

> Traverse all roles for quota allocation
> ---
>
> Key: MESOS-4411
> URL: https://issues.apache.org/jira/browse/MESOS-4411
> Project: Mesos
>  Issue Type: Bug
>  Components: allocation
>Reporter: Alexander Rukletsov
>Assignee: Guangya Liu
>Priority: Critical
>  Labels: mesosphere
>
> There might be a bug in how resources are allocated to multiple quota'ed 
> roles if one role's quota is met. We need to investigate this behavior.





[jira] [Updated] (MESOS-4411) Traverse all roles for quota allocation.

2016-01-19 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-4411:
---
Priority: Blocker  (was: Critical)
 Summary: Traverse all roles for quota allocation.  (was: Traverse all 
roles for quota allocation)

> Traverse all roles for quota allocation.
> 
>
> Key: MESOS-4411
> URL: https://issues.apache.org/jira/browse/MESOS-4411
> Project: Mesos
>  Issue Type: Bug
>  Components: allocation
>Reporter: Alexander Rukletsov
>Assignee: Guangya Liu
>Priority: Blocker
>  Labels: mesosphere
>
> There might be a bug in how resources are allocated to multiple quota'ed 
> roles if one role's quota is met. We need to investigate this behavior.


