[jira] [Created] (MESOS-6397) Simplify the comparison logic for `ExecutorInfo`.

2016-10-14 Thread haosdent (JIRA)
haosdent created MESOS-6397:
---

 Summary: Simplify the comparison logic for `ExecutorInfo`.
 Key: MESOS-6397
 URL: https://issues.apache.org/jira/browse/MESOS-6397
 Project: Mesos
  Issue Type: Improvement
Reporter: haosdent
Assignee: haosdent


Per the review comment at https://reviews.apache.org/r/52817/#comment221849, 
we should simplify the comparison logic for {{ExecutorInfo}} in 
{{type_utils.cpp}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-6391) Command task's sandbox should not be owned by root if it uses container image.

2016-10-14 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576990#comment-15576990
 ] 

Jie Yu commented on MESOS-6391:
---

Looks like the 0.28.3 backport is rather challenging.

> Command task's sandbox should not be owned by root if it uses container image.
> --
>
> Key: MESOS-6391
> URL: https://issues.apache.org/jira/browse/MESOS-6391
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.28.2, 1.0.1
>Reporter: Jie Yu
>Assignee: Jie Yu
>Priority: Blocker
> Fix For: 1.0.2, 1.1.0
>
>
> Currently, if the task defines a container image, the command executor will 
> be run under root because it needs to perform pivot_root.
> That means if the task wants to run under an unprivileged user, the sandbox 
> of that task will not be writable because it's owned by root.





[jira] [Updated] (MESOS-6312) Update CHANGELOG to mention addition of agent '--runtime_dir' flag.

2016-10-14 Thread Kevin Klues (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Klues updated MESOS-6312:
---
Description: 
We recently introduced a new agent flag for {{\-\-runtime_dir}}. Unlike the 
{{\-\-work_dir}}, this directory is designed to hold the state of a running 
agent between subsequent agent-restarts (but not across host reboots).

By default, this flag is set to {{/var/run/mesos}} since this is a {{tmpfs}} 
on Linux that gets automatically cleaned up on reboot. When running as non-root 
we set the default to {{os::temp()/mesos/runtime}}.

We should call this out in the CHANGELOG.

  was:
We recently introduced a new agent flag for {{\-\-runtime_dir}}. Unlike the 
{{\-\-work_dir}}, this directory is designed to hold the state of a running 
agent between subsequent agent-restarts (but not across host reboots).

By default, this flag is set to {{/var/run/mesos}} since this is a {{tmpfs}} 
on Linux that gets automatically cleaned up on reboot. However, on most systems 
{{/var/run/mesos}} is only writable by root, causing problems when launching an 
agent as non-root and not pointing {{--runtime_dir}} to a different location.

We need to call this out in the upgrade.md and getting-started.md docs so that 
people know they may need to set this going forward.


> Update CHANGELOG to mention addition of agent '--runtime_dir' flag.
> --
>
> Key: MESOS-6312
> URL: https://issues.apache.org/jira/browse/MESOS-6312
> Project: Mesos
>  Issue Type: Task
>Reporter: Kevin Klues
>Assignee: Kevin Klues
>Priority: Blocker
>
> We recently introduced a new agent flag for {{\-\-runtime_dir}}. Unlike the 
> {{\-\-work_dir}}, this directory is designed to hold the state of a running 
> agent between subsequent agent-restarts (but not across host reboots).
> By default, this flag is set to {{/var/run/mesos}} since this is a {{tmpfs}} 
> on Linux that gets automatically cleaned up on reboot. When running as 
> non-root we set the default to {{os::temp()/mesos/runtime}}.
> We should call this out in the CHANGELOG.
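
The default selection described in this ticket could be sketched roughly as follows. This is an illustration only: `defaultRuntimeDir` and its parameters are hypothetical names rather than Mesos code, and `tempDir` stands in for whatever `os::temp()` returns on the host.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not the Mesos implementation): picks the default
// value for the agent's --runtime_dir flag as described in the ticket.
// `tempDir` stands in for the value of os::temp().
std::string defaultRuntimeDir(bool isRoot, const std::string& tempDir) {
  assert(!tempDir.empty());

  if (isRoot) {
    // Lives on a tmpfs, so it is cleaned up on reboot.
    return "/var/run/mesos";
  }

  // Non-root agents usually cannot write to /var/run.
  return tempDir + "/mesos/runtime";
}
```

A caller would typically pass something like `geteuid() == 0` for `isRoot`; that wiring is likewise an assumption.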





[jira] [Updated] (MESOS-6312) Update CHANGELOG to mention addition of agent '--runtime_dir' flag.

2016-10-14 Thread Kevin Klues (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Klues updated MESOS-6312:
---
Summary: Update CHANGELOG to mention addition of agent '--runtime_dir' flag. 
 (was: Add requirement in upgrade.md and getting-started.md for agent 
'--runtime_dir' in when running as non-root)

> Update CHANGELOG to mention addition of agent '--runtime_dir' flag.
> --
>
> Key: MESOS-6312
> URL: https://issues.apache.org/jira/browse/MESOS-6312
> Project: Mesos
>  Issue Type: Task
>Reporter: Kevin Klues
>Assignee: Kevin Klues
>Priority: Blocker
>
> We recently introduced a new agent flag for {{\-\-runtime_dir}}. Unlike the 
> {{\-\-work_dir}}, this directory is designed to hold the state of a running 
> agent between subsequent agent-restarts (but not across host reboots).
> By default, this flag is set to {{/var/run/mesos}} since this is a {{tmpfs}} 
> on Linux that gets automatically cleaned up on reboot. However, on most 
> systems {{/var/run/mesos}} is only writable by root, causing problems when 
> launching an agent as non-root and not pointing {{--runtime_dir}} to a 
> different location.
> We need to call this out in the upgrade.md and getting-started.md docs so 
> that people know they may need to set this going forward.





[jira] [Updated] (MESOS-6335) Add user doc for task group tasks

2016-10-14 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-6335:
--
Assignee: Gilbert Song  (was: Vinod Kone)
Target Version/s: 1.2.0  (was: 1.1.0)
 Description: Committed some basic documentation. So moving this to 
pods-improvements epic and targeting this for 1.2.0. I would like this to track 
the more comprehensive documentation.

> Add user doc for task group tasks
> -
>
> Key: MESOS-6335
> URL: https://issues.apache.org/jira/browse/MESOS-6335
> Project: Mesos
>  Issue Type: Documentation
>Reporter: Vinod Kone
>Assignee: Gilbert Song
>
> Committed some basic documentation. So moving this to pods-improvements epic 
> and targeting this for 1.2.0. I would like this to track the more 
> comprehensive documentation.





[jira] [Updated] (MESOS-6279) Add test cases for the TCP health check

2016-10-14 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-6279:
--
Sprint: Mesosphere Sprint 44, Mesosphere Sprint 45  (was: Mesosphere Sprint 
44)

> Add test cases for the TCP health check
> ---
>
> Key: MESOS-6279
> URL: https://issues.apache.org/jira/browse/MESOS-6279
> Project: Mesos
>  Issue Type: Task
>  Components: tests
>Reporter: haosdent
>Assignee: haosdent
>  Labels: health-check, mesosphere, test
>






[jira] [Updated] (MESOS-6335) Add user doc for task group tasks

2016-10-14 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-6335:
--
Sprint: Mesosphere Sprint 44, Mesosphere Sprint 45  (was: Mesosphere Sprint 
44)

> Add user doc for task group tasks
> -
>
> Key: MESOS-6335
> URL: https://issues.apache.org/jira/browse/MESOS-6335
> Project: Mesos
>  Issue Type: Documentation
>Reporter: Vinod Kone
>Assignee: Vinod Kone
>






[jira] [Updated] (MESOS-6288) The default executor should maintain launcher_dir.

2016-10-14 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-6288:
--
Sprint: Mesosphere Sprint 44, Mesosphere Sprint 45  (was: Mesosphere Sprint 
44)

> The default executor should maintain launcher_dir.
> --
>
> Key: MESOS-6288
> URL: https://issues.apache.org/jira/browse/MESOS-6288
> Project: Mesos
>  Issue Type: Bug
>Reporter: Alexander Rukletsov
>Assignee: Gastón Kleiman
>  Labels: health-check, mesosphere
>
> Both the command and docker executors require that {{launcher_dir}} be provided 
> via a flag. This directory contains mesos binaries, e.g. a TCP checker necessary 
> for TCP health checks. The default executor should somehow obtain (via a flag or 
> env var) and maintain this directory for the health checker to use.





[jira] [Updated] (MESOS-6348) Allow `network/cni` isolator unit-tests to run with CNI plugins

2016-10-14 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-6348:
--
Sprint: Mesosphere Sprint 44, Mesosphere Sprint 45  (was: Mesosphere Sprint 
44)

> Allow `network/cni` isolator unit-tests to run with CNI plugins 
> 
>
> Key: MESOS-6348
> URL: https://issues.apache.org/jira/browse/MESOS-6348
> Project: Mesos
>  Issue Type: Task
>Reporter: Avinash Sridharan
>Assignee: Avinash Sridharan
>  Labels: mesosphere
>
> Currently, we don't have any infrastructure that allows CNI plugins to be 
> used in `network/cni` isolator unit-tests. This forces us to mock CNI plugins 
> that don't use new network namespaces, leading to a very restrictive form of 
> unit-tests. 
> For the port-mapper plugin especially, testing its DNAT functionality would 
> benefit greatly from running the containers in a separate network namespace, 
> which requires an actual CNI plugin.
> The proposal is to introduce a test filter called CNIPLUGIN that gets 
> set when the CNI_PATH env var is set. Tests using the CNIPLUGIN filter can then 
> use actual CNI plugins.
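
The proposed filter could look roughly like the sketch below. It assumes that tests opt in by carrying a `CNIPLUGIN_` marker in their name and that an empty `CNI_PATH` counts as unset; both are assumptions made for illustration, and `cniPluginTestsEnabled` and `shouldRun` are hypothetical names, not part of the Mesos test harness.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper: the CNIPLUGIN filter is considered "set" when the
// CNI_PATH environment variable is present and non-empty (the non-empty
// requirement is an assumption).
bool cniPluginTestsEnabled() {
  const char* path = std::getenv("CNI_PATH");
  return path != nullptr && *path != '\0';
}

// Hypothetical filter: tests whose name carries the CNIPLUGIN_ marker run
// only when actual CNI plugins are available; all other tests always run.
bool shouldRun(const std::string& testName) {
  assert(!testName.empty());

  const bool needsPlugin = testName.find("CNIPLUGIN_") != std::string::npos;
  return !needsPlugin || cniPluginTestsEnabled();
}
```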





[jira] [Updated] (MESOS-6292) Add unit tests for nested container case for docker/runtime isolator.

2016-10-14 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-6292:
--
Sprint: Mesosphere Sprint 44, Mesosphere Sprint 45  (was: Mesosphere Sprint 
44)

> Add unit tests for nested container case for docker/runtime isolator.
> -
>
> Key: MESOS-6292
> URL: https://issues.apache.org/jira/browse/MESOS-6292
> Project: Mesos
>  Issue Type: Improvement
>  Components: isolation
>Reporter: Gilbert Song
>Assignee: Gilbert Song
>  Labels: isolator, mesosphere
>
> Launch nested containers with different container images specified.





[jira] [Updated] (MESOS-6119) TCP health checks are not portable.

2016-10-14 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-6119:
--
Sprint: Mesosphere Sprint 42, Mesosphere Sprint 43, Mesosphere Sprint 44, 
Mesosphere Sprint 45  (was: Mesosphere Sprint 42, Mesosphere Sprint 43, 
Mesosphere Sprint 44)

> TCP health checks are not portable.
> ---
>
> Key: MESOS-6119
> URL: https://issues.apache.org/jira/browse/MESOS-6119
> Project: Mesos
>  Issue Type: Bug
>Reporter: Alexander Rukletsov
>Assignee: Alexander Rukletsov
>  Labels: health-check, mesosphere
>
> MESOS-3567 introduced a dependency on "bash" for TCP health checks, which is 
> undesirable. We should implement a portable solution for TCP health checks.
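
For reference, a TCP check that does not shell out can be written directly against the sockets API. The sketch below is one possible shape on POSIX systems, not the solution adopted for this ticket: `tcpConnect` is a hypothetical name, it accepts only IPv4 literals, and it uses a blocking `connect` with no timeout, which a real health checker would need to add.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: returns true if a TCP connection to `ip`:`port` can
// be established. `ip` must be an IPv4 literal such as "127.0.0.1".
bool tcpConnect(const std::string& ip, uint16_t port) {
  assert(port != 0);

  sockaddr_in addr;
  std::memset(&addr, 0, sizeof(addr));
  addr.sin_family = AF_INET;
  addr.sin_port = htons(port);

  // Reject anything that is not a valid IPv4 literal.
  if (::inet_pton(AF_INET, ip.c_str(), &addr.sin_addr) != 1) {
    return false;
  }

  const int fd = ::socket(AF_INET, SOCK_STREAM, 0);
  if (fd < 0) {
    return false;
  }

  // Blocking connect; success means the three-way handshake completed.
  const bool connected =
    ::connect(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) == 0;

  ::close(fd);
  return connected;
}
```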





[jira] [Updated] (MESOS-3335) FlagsBase copy-ctor leads to dangling pointer.

2016-10-14 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-3335:
--
Sprint: Mesosphere Sprint 44, Mesosphere Sprint 45  (was: Mesosphere Sprint 
44)

> FlagsBase copy-ctor leads to dangling pointer.
> --
>
> Key: MESOS-3335
> URL: https://issues.apache.org/jira/browse/MESOS-3335
> Project: Mesos
>  Issue Type: Bug
>Reporter: Neil Conway
>Assignee: Benjamin Bannier
>  Labels: mesosphere
> Attachments: lambda_capture_bug.cpp
>
>
> Per [#3328], ubsan detects the following problem:
> [ RUN ] FaultToleranceTest.ReregisterCompletedFrameworks
> /mesos/3rdparty/libprocess/3rdparty/stout/include/stout/flags/flags.hpp:303:25:
>  runtime error: load of value 33, which is not a valid value for type 'bool'
> I believe what is going on here is the following:
> * The test calls StartMaster(), which does MesosTest::CreateMasterFlags()
> * MesosTest::CreateMasterFlags() allocates a new master::Flags on the stack, 
> which is subsequently copy-constructed back to StartMaster()
> * The FlagsBase constructor is:
> bq. {{FlagsBase() { add(&help, "help", "...", false); }}}
> where "help" is a member variable -- i.e., it is allocated on the stack in 
> this case.
> * {{FlagsBase()::add}} captures {{&help}}, e.g.:
> {noformat}
> flag.stringify = [t1](const FlagsBase&) -> Option<std::string> {
>     return stringify(*t1);
>   };
> {noformat}
> * The implicit copy constructor for FlagsBase is just going to copy the 
> lambda above, i.e., the result of the copy constructor will have a lambda 
> that points into MesosTest::CreateMasterFlags()'s stack frame, which is bad 
> news.
> Not sure of the right fix -- comments welcome. You could define a copy-ctor for 
> FlagsBase that does something gross (basically remove the old help flag and 
> define a new one that points into the target of the copy), but that seems, 
> well, gross.
> Probably not a pressing-problem to fix -- AFAICS worst symptom is that we end 
> up reading one byte from some random stack location when serving 
> {{state.json}}, for example.
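
The failure mode described above can be reproduced outside Mesos with a small sketch. `BrokenFlags` and `FixedFlags` are hypothetical stand-ins, not the stout `FlagsBase` API, and the pointer-to-member approach shown is only one possible fix, not necessarily the one that was committed.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical stand-in for the reported pattern: the constructor stores a
// lambda that captures the address of a member of this particular object.
struct BrokenFlags {
  bool help = false;
  std::function<std::string()> stringify;

  BrokenFlags() {
    bool* t1 = &help;
    stringify = [t1]() { return std::string(*t1 ? "true" : "false"); };
  }
  // The implicit copy constructor copies `stringify` as-is, so a copy's
  // lambda still points at the source object's `help`.
};

// One possible fix: capture a pointer-to-member and pass the owning object
// in at call time, so copies never refer back to the source object.
struct FixedFlags {
  bool help = false;
  std::function<std::string(const FixedFlags&)> stringify;

  FixedFlags() {
    bool FixedFlags::* member = &FixedFlags::help;
    stringify = [member](const FixedFlags& self) {
      return std::string(self.*member ? "true" : "false");
    };
  }
};

// Shows the difference while the source objects are still alive, so no
// undefined behavior is triggered here.
void demo() {
  BrokenFlags a;
  BrokenFlags b = a;
  b.help = true;
  assert(b.stringify() == "false");  // Reads a.help, not b.help.

  FixedFlags c;
  FixedFlags d = c;
  d.help = true;
  assert(d.stringify(d) == "true");  // Reads d.help.
}
```

If `a` were a stack object in a function that had already returned, as in the report, `b.stringify()` would read through a dangling pointer instead.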





[jira] [Updated] (MESOS-3753) Test the HTTP Scheduler library with SSL enabled

2016-10-14 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-3753:
--
Sprint: Mesosphere Sprint 39, Mesosphere Sprint 40, Mesosphere Sprint 41, 
Mesosphere Sprint 42, Mesosphere Sprint 44, Mesosphere Sprint 45  (was: 
Mesosphere Sprint 39, Mesosphere Sprint 40, Mesosphere Sprint 41, Mesosphere 
Sprint 42, Mesosphere Sprint 44)

> Test the HTTP Scheduler library with SSL enabled
> 
>
> Key: MESOS-3753
> URL: https://issues.apache.org/jira/browse/MESOS-3753
> Project: Mesos
>  Issue Type: Story
>  Components: framework, HTTP API, test
>Reporter: Joseph Wu
>Assignee: Greg Mann
>  Labels: mesosphere, security
>
> Currently, the HTTP Scheduler library does not support SSL-enabled Mesos.  
> (You can manually test this by spinning up an SSL-enabled master and attempting 
> to run the event-call framework example against it.)
> We need to add tests that check the HTTP Scheduler library against 
> SSL-enabled Mesos:
> * with downgrade support,
> * with required framework/client-side certificates,
> * with/without verification of certificates (master-side),
> * with/without verification of certificates (framework-side),
> * with a custom certificate authority (CA)
> These options should be controlled by the same environment variables found on 
> the [SSL user doc|http://mesos.apache.org/documentation/latest/ssl/].
> Note: This issue will be broken down into smaller sub-issues as bugs/problems 
> are discovered.





[jira] [Updated] (MESOS-6376) Add documentation for capabilities support of the mesos containerizer

2016-10-14 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-6376:
--
Sprint: Mesosphere Sprint 44, Mesosphere Sprint 45  (was: Mesosphere Sprint 
44)

> Add documentation for capabilities support of the mesos containerizer
> -
>
> Key: MESOS-6376
> URL: https://issues.apache.org/jira/browse/MESOS-6376
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Benjamin Bannier
>Assignee: Benjamin Bannier
>Priority: Blocker
>  Labels: mesosphere
>






[jira] [Updated] (MESOS-6142) Frameworks may RESERVE for an arbitrary role.

2016-10-14 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-6142:
--
Sprint: Mesosphere Sprint 44, Mesosphere Sprint 45  (was: Mesosphere Sprint 
44)

> Frameworks may RESERVE for an arbitrary role.
> -
>
> Key: MESOS-6142
> URL: https://issues.apache.org/jira/browse/MESOS-6142
> Project: Mesos
>  Issue Type: Bug
>  Components: allocation, master
>Affects Versions: 1.0.0
>Reporter: Alexander Rukletsov
>Assignee: Gastón Kleiman
>  Labels: mesosphere, reservations
>
> The master does not validate that resources from a reservation request have 
> the same role the framework is registered with. As a result, frameworks may 
> reserve resources for arbitrary roles.
> I've modified the role in [the {{ReserveThenUnreserve}} 
> test|https://github.com/apache/mesos/blob/bca600cf5602ed8227d91af9f73d689da14ad786/src/tests/reservation_tests.cpp#L117]
>  to "yoyo" and observed the following in the test's log:
> {noformat}
> I0908 18:35:43.379122 2138112 master.cpp:3362] Processing ACCEPT call for 
> offers: [ dfaf67e6-7c1c-4988-b427-c49842cb7bb7-O0 ] on agent 
> dfaf67e6-7c1c-4988-b427-c49842cb7bb7-S0 at slave(1)@10.200.181.237:60116 
> (alexr.railnet.train) for framework dfaf67e6-7c1c-4988-b427-c49842cb7bb7- 
> (default) at 
> scheduler-ca12a660-9f08-49de-be4e-d452aa3aa6da@10.200.181.237:60116
> I0908 18:35:43.379170 2138112 master.cpp:3022] Authorizing principal 
> 'test-principal' to reserve resources 'cpus(yoyo, test-principal):1; 
> mem(yoyo, test-principal):512'
> I0908 18:35:43.379678 2138112 master.cpp:3642] Applying RESERVE operation for 
> resources cpus(yoyo, test-principal):1; mem(yoyo, test-principal):512 from 
> framework dfaf67e6-7c1c-4988-b427-c49842cb7bb7- (default) at 
> scheduler-ca12a660-9f08-49de-be4e-d452aa3aa6da@10.200.181.237:60116 to agent 
> dfaf67e6-7c1c-4988-b427-c49842cb7bb7-S0 at slave(1)@10.200.181.237:60116 
> (alexr.railnet.train)
> I0908 18:35:43.379767 2138112 master.cpp:7341] Sending checkpointed resources 
> cpus(yoyo, test-principal):1; mem(yoyo, test-principal):512 to agent 
> dfaf67e6-7c1c-4988-b427-c49842cb7bb7-S0 at slave(1)@10.200.181.237:60116 
> (alexr.railnet.train)
> I0908 18:35:43.380273 3211264 slave.cpp:2497] Updated checkpointed resources 
> from  to cpus(yoyo, test-principal):1; mem(yoyo, test-principal):512
> I0908 18:35:43.380574 2674688 hierarchical.cpp:760] Updated allocation of 
> framework dfaf67e6-7c1c-4988-b427-c49842cb7bb7- on agent 
> dfaf67e6-7c1c-4988-b427-c49842cb7bb7-S0 from cpus(*):1; mem(*):512; 
> disk(*):470841; ports(*):[31000-32000] to ports(*):[31000-32000]; cpus(yoyo, 
> test-principal):1; disk(*):470841; mem(yoyo, test-principal):512 with RESERVE 
> operation
> {noformat}
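
The missing check could be sketched along these lines. The `Resource` struct and `reservationRolesValid` are simplified, hypothetical stand-ins for illustration; they are not the Mesos protobuf types or the master's actual validation code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified resource: only the fields needed for the check.
struct Resource {
  std::string name;  // e.g. "cpus"
  std::string role;  // role the resource is being reserved for
};

// Hypothetical validation: a RESERVE request is accepted only if every
// resource in it is reserved for the role the framework registered with.
bool reservationRolesValid(
    const std::string& frameworkRole,
    const std::vector<Resource>& resources) {
  assert(!frameworkRole.empty());

  for (const Resource& resource : resources) {
    if (resource.role != frameworkRole) {
      return false;
    }
  }

  return true;
}
```

With a check like this in place, the `yoyo` reservation from the log above would be rejected for a framework registered under a different role.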





[jira] [Updated] (MESOS-6184) Health checks should use a general mechanism to enter namespaces of the task.

2016-10-14 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-6184:
--
Sprint: Mesosphere Sprint 44, Mesosphere Sprint 45  (was: Mesosphere Sprint 
44)

> Health checks should use a general mechanism to enter namespaces of the task.
> -
>
> Key: MESOS-6184
> URL: https://issues.apache.org/jira/browse/MESOS-6184
> Project: Mesos
>  Issue Type: Improvement
>Reporter: haosdent
>Assignee: haosdent
>Priority: Blocker
>  Labels: health-check, mesosphere
>
> To perform health checks for tasks, we need to enter the corresponding 
> namespaces of the container. For now, health checks use a custom clone to 
> implement this:
> {code}
>   return process::defaultClone([=]() -> int {
>     if (taskPid.isSome()) {
>       foreach (const string& ns, namespaces) {
>         Try<Nothing> setns = ns::setns(taskPid.get(), ns);
>         if (setns.isError()) {
>           ...
>         }
>       }
>     }
>     return func();
>   });
> {code}
> After the childHooks patches are merged, we could change the health check to use 
> childHooks to call {{setns}} and make {{process::defaultClone}} private 
> again.
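
As a rough illustration of what such a hook would do, the sketch below enters a list of namespaces using the Linux `setns(2)` system call directly. `nsPath` and `enterNamespaces` are hypothetical names; this does not use the Mesos `ns::setns` helper or the childHooks API, whose exact signatures are not shown in this thread.

```cpp
#include <cassert>
#include <string>
#include <vector>

#include <fcntl.h>
#include <unistd.h>

#ifdef __linux__
#include <sched.h>
#endif

// Path of the namespace handle for `pid`, e.g. /proc/1234/ns/net.
std::string nsPath(pid_t pid, const std::string& ns) {
  assert(!ns.empty());
  return "/proc/" + std::to_string(pid) + "/ns/" + ns;
}

// Hypothetical hook body: joins each listed namespace of `pid` in order.
// Returns 0 on success and -1 on the first failure (or on non-Linux hosts).
int enterNamespaces(pid_t pid, const std::vector<std::string>& namespaces) {
#ifdef __linux__
  for (const std::string& ns : namespaces) {
    const int fd = ::open(nsPath(pid, ns).c_str(), O_RDONLY | O_CLOEXEC);
    if (fd < 0) {
      return -1;
    }

    // A type of 0 lets the kernel accept whatever namespace type `fd` is.
    const int result = ::setns(fd, 0);
    ::close(fd);

    if (result != 0) {
      return -1;
    }
  }

  return 0;
#else
  (void) pid;
  (void) namespaces;
  return -1;
#endif
}
```

Joining another process's namespaces generally requires elevated privileges, so a hook like this would run in the child before it executes the health check command.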





[jira] [Updated] (MESOS-6023) Create a binary for the port-mapper plugin

2016-10-14 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-6023:
--
Sprint: Mesosphere Sprint 44, Mesosphere Sprint 45  (was: Mesosphere Sprint 
44)

> Create a binary for the port-mapper plugin
> --
>
> Key: MESOS-6023
> URL: https://issues.apache.org/jira/browse/MESOS-6023
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
> Environment: Linux
>Reporter: Avinash Sridharan
>Assignee: Avinash Sridharan
>Priority: Blocker
>
> The CNI port mapper plugin needs to be a separate binary that will be invoked 
> by the `network/cni` isolator as a CNI plugin.





[jira] [Updated] (MESOS-6366) Design doc for agent secrets

2016-10-14 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-6366:
--
Sprint: Mesosphere Sprint 44, Mesosphere Sprint 45  (was: Mesosphere Sprint 
44)

> Design doc for agent secrets
> 
>
> Key: MESOS-6366
> URL: https://issues.apache.org/jira/browse/MESOS-6366
> Project: Mesos
>  Issue Type: Task
>  Components: slave
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: mesosphere
>
> Produce a design for the passing of credentials to the agent, and their use 
> in the following three scenarios:
> * HTTP executor authentication
> * Container image fetching
> * Artifact fetching





[jira] [Updated] (MESOS-6171) Introduce "global" decision policy for unhealthy tasks.

2016-10-14 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-6171:
--
Sprint: Mesosphere Sprint 44, Mesosphere Sprint 45  (was: Mesosphere Sprint 
44)

> Introduce "global" decision policy for unhealthy tasks.
> ---
>
> Key: MESOS-6171
> URL: https://issues.apache.org/jira/browse/MESOS-6171
> Project: Mesos
>  Issue Type: Improvement
>Affects Versions: 1.0.0
>Reporter: Alexander Rukletsov
>Assignee: haosdent
>  Labels: health-check, mesosphere
>
> Currently, if the task is deemed unhealthy, i.e. it failed a health check a 
> certain number of times, it is killed by both default executors: 
> [command|https://github.com/apache/mesos/blob/b053572bc424478cafcd60d1bce078f5132c4590/src/launcher/executor.cpp#L299]
>  and 
> [docker|https://github.com/apache/mesos/blob/b053572bc424478cafcd60d1bce078f5132c4590/src/docker/executor.cpp#L315].
>  This is what can be called a "local" kill policy.
> While a local kill policy can save some network traffic and offload the 
> scheduler, there are cases where a scheduler may want to decide what to do 
> and when. This is what can be called a "global" policy: the health check 
> library reports whether a health check failed or succeeded, and the 
> executor forwards this update to the scheduler without taking any action.
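
The distinction between the two policies can be sketched as a single decision function. `KillPolicy` and `executorShouldKill` are hypothetical names introduced for illustration; the ticket does not specify how the policy would be expressed in the API.

```cpp
#include <cassert>

// Hypothetical policy switch for unhealthy tasks.
enum class KillPolicy {
  LOCAL,   // The executor kills the task once it is deemed unhealthy.
  GLOBAL   // The executor only forwards health updates to the scheduler.
};

// Hypothetical decision: returns true if the executor itself should kill
// the task after `consecutiveFailures` failed health checks in a row.
bool executorShouldKill(
    KillPolicy policy,
    int consecutiveFailures,
    int maxConsecutiveFailures) {
  assert(maxConsecutiveFailures > 0);

  if (policy == KillPolicy::GLOBAL) {
    // The scheduler decides what to do and when.
    return false;
  }

  return consecutiveFailures >= maxConsecutiveFailures;
}
```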





[jira] [Updated] (MESOS-5303) Add capabilities support for mesos execute cli.

2016-10-14 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-5303:
--
Sprint: Mesosphere Sprint 34, Mesosphere Sprint 35, Mesosphere Sprint 37, 
Mesosphere Sprint 38, Mesosphere Sprint 39, Mesosphere Sprint 40, Mesosphere 
Sprint 41, Mesosphere Sprint 42, Mesosphere Sprint 44, Mesosphere Sprint 45  
(was: Mesosphere Sprint 34, Mesosphere Sprint 35, Mesosphere Sprint 37, 
Mesosphere Sprint 38, Mesosphere Sprint 39, Mesosphere Sprint 40, Mesosphere 
Sprint 41, Mesosphere Sprint 42, Mesosphere Sprint 44)

> Add capabilities support for mesos execute cli.
> ---
>
> Key: MESOS-5303
> URL: https://issues.apache.org/jira/browse/MESOS-5303
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Reporter: Jojy Varghese
>Assignee: Benjamin Bannier
>  Labels: mesosphere
>
> Add support for `user` and `capabilities` to the execute CLI. This will help in 
> testing the `capabilities` feature for the unified containerizer.





[jira] [Updated] (MESOS-5792) Add mesos tests to CMake (make check)

2016-10-14 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-5792:
--
Sprint: Mesosphere Sprint 40, Mesosphere Sprint 41, Mesosphere Sprint 42, 
Mesosphere Sprint 44, Mesosphere Sprint 45  (was: Mesosphere Sprint 40, 
Mesosphere Sprint 41, Mesosphere Sprint 42, Mesosphere Sprint 44)

> Add mesos tests to CMake (make check)
> -
>
> Key: MESOS-5792
> URL: https://issues.apache.org/jira/browse/MESOS-5792
> Project: Mesos
>  Issue Type: Improvement
>  Components: build
>Reporter: Srinivas
>Assignee: Srinivas
>  Labels: build, mesosphere
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Provide CMakeLists.txt and configuration files to build mesos tests using 
> CMake.





[jira] [Updated] (MESOS-5966) Add libprocess HTTP tests with SSL support

2016-10-14 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-5966:
--
Sprint: Mesosphere Sprint 40, Mesosphere Sprint 41, Mesosphere Sprint 42, 
Mesosphere Sprint 44, Mesosphere Sprint 45  (was: Mesosphere Sprint 40, 
Mesosphere Sprint 41, Mesosphere Sprint 42, Mesosphere Sprint 44)

> Add libprocess HTTP tests with SSL support
> --
>
> Key: MESOS-5966
> URL: https://issues.apache.org/jira/browse/MESOS-5966
> Project: Mesos
>  Issue Type: Task
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: mesosphere
>
> Libprocess contains SSL unit tests which test our SSL support using simple 
> sockets. We should add tests which also make use of libprocess's various HTTP 
> classes and helpers in a variety of SSL configurations.





[jira] [Updated] (MESOS-6291) Add unit tests for nested container case for filesystem/linux isolator.

2016-10-14 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-6291:
--
Sprint: Mesosphere Sprint 44, Mesosphere Sprint 45  (was: Mesosphere Sprint 
44)

> Add unit tests for nested container case for filesystem/linux isolator.
> ---
>
> Key: MESOS-6291
> URL: https://issues.apache.org/jira/browse/MESOS-6291
> Project: Mesos
>  Issue Type: Improvement
>  Components: isolation
>Reporter: Gilbert Song
>Assignee: Gilbert Song
>  Labels: isolator, mesosphere
>
> Parameterize the existing tests so that they all work for both top-level 
> and nested containers.





[jira] [Updated] (MESOS-6193) Make the docker/volume isolator nesting aware.

2016-10-14 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-6193:
--
Sprint: Mesosphere Sprint 44, Mesosphere Sprint 45  (was: Mesosphere Sprint 
44)

> Make the docker/volume isolator nesting aware.
> --
>
> Key: MESOS-6193
> URL: https://issues.apache.org/jira/browse/MESOS-6193
> Project: Mesos
>  Issue Type: Task
>Reporter: Jie Yu
>Assignee: Gilbert Song
>  Labels: isolator, mesosphere
>






[jira] [Updated] (MESOS-6278) Add test cases for the HTTP health checks

2016-10-14 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-6278:
--
Sprint: Mesosphere Sprint 44, Mesosphere Sprint 45  (was: Mesosphere Sprint 
44)

> Add test cases for the HTTP health checks
> -
>
> Key: MESOS-6278
> URL: https://issues.apache.org/jira/browse/MESOS-6278
> Project: Mesos
>  Issue Type: Task
>  Components: tests
>Reporter: haosdent
>Assignee: haosdent
>  Labels: health-check, mesosphere, test
>






[jira] [Updated] (MESOS-6293) HealthCheckTest.HealthyTaskViaHTTPWithoutType fails on some distros.

2016-10-14 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-6293:
--
Sprint: Mesosphere Sprint 44, Mesosphere Sprint 45  (was: Mesosphere Sprint 
44)

> HealthCheckTest.HealthyTaskViaHTTPWithoutType fails on some distros.
> 
>
> Key: MESOS-6293
> URL: https://issues.apache.org/jira/browse/MESOS-6293
> Project: Mesos
>  Issue Type: Bug
>Reporter: Alexander Rukletsov
>Assignee: haosdent
>  Labels: health-check, mesosphere
>
> I see consistent failures of this test in the internal CI in *some* distros, 
> specifically CentOS 6, Ubuntu 14, 15, 16. The source of the health check 
> failure is always the same: {{curl}} cannot connect to the target:
> {noformat}
> Received task health update, healthy: false
> W0929 17:22:05.270992  2730 health_checker.cpp:204] Health check failed 1 
> times consecutively: HTTP health check failed: curl returned exited with 
> status 7: curl: (7) couldn't connect to host
> I0929 17:22:05.273634 26850 slave.cpp:3609] Handling status update 
> TASK_RUNNING (UUID: f5408ac9-f6ba-447f-b3d7-9dce44384ffe) for task 
> aa0792d3-8d85-4c32-bd04-56a9b552ebda in health state unhealthy of framework 
> 2e0e9ea1-0ae5-4f28-80bb-a9abc56c5a6f- from executor(1)@172.30.2.20:58660
> I0929 17:22:05.274178 26844 status_update_manager.cpp:323] Received status 
> update TASK_RUNNING (UUID: f5408ac9-f6ba-447f-b3d7-9dce44384ffe) for task 
> aa0792d3-8d85-4c32-bd04-56a9b552ebda in health state unhealthy of framework 
> 2e0e9ea1-0ae5-4f28-80bb-a9abc56c5a6f-
> I0929 17:22:05.274226 26844 status_update_manager.cpp:377] Forwarding update 
> TASK_RUNNING (UUID: f5408ac9-f6ba-447f-b3d7-9dce44384ffe) for task 
> aa0792d3-8d85-4c32-bd04-56a9b552ebda in health state unhealthy of framework 
> 2e0e9ea1-0ae5-4f28-80bb-a9abc56c5a6f- to the agent
> I0929 17:22:05.274314 26845 slave.cpp:4026] Forwarding the update 
> TASK_RUNNING (UUID: f5408ac9-f6ba-447f-b3d7-9dce44384ffe) for task 
> aa0792d3-8d85-4c32-bd04-56a9b552ebda in health state unhealthy of framework 
> 2e0e9ea1-0ae5-4f28-80bb-a9abc56c5a6f- to master@172.30.2.20:38955
> I0929 17:22:05.274415 26845 slave.cpp:3920] Status update manager 
> successfully handled status update TASK_RUNNING (UUID: 
> f5408ac9-f6ba-447f-b3d7-9dce44384ffe) for task 
> aa0792d3-8d85-4c32-bd04-56a9b552ebda in health state unhealthy of framework 
> 2e0e9ea1-0ae5-4f28-80bb-a9abc56c5a6f-
> I0929 17:22:05.274436 26845 slave.cpp:3936] Sending acknowledgement for 
> status update TASK_RUNNING (UUID: f5408ac9-f6ba-447f-b3d7-9dce44384ffe) for 
> task aa0792d3-8d85-4c32-bd04-56a9b552ebda in health state unhealthy of 
> framework 2e0e9ea1-0ae5-4f28-80bb-a9abc56c5a6f- to 
> executor(1)@172.30.2.20:58660
> I0929 17:22:05.274534 26849 master.cpp:5661] Status update TASK_RUNNING 
> (UUID: f5408ac9-f6ba-447f-b3d7-9dce44384ffe) for task 
> aa0792d3-8d85-4c32-bd04-56a9b552ebda in health state unhealthy of framework 
> 2e0e9ea1-0ae5-4f28-80bb-a9abc56c5a6f- from agent 
> 2e0e9ea1-0ae5-4f28-80bb-a9abc56c5a6f-S0 at slave(77)@172.30.2.20:38955 
> (ip-172-30-2-20.mesosphere.io)
> ../../src/tests/health_check_tests.cpp:1398: Failure
> I0929 17:22:05.274567 26849 master.cpp:5723] Forwarding status update 
> TASK_RUNNING (UUID: f5408ac9-f6ba-447f-b3d7-9dce44384ffe) for task 
> aa0792d3-8d85-4c32-bd04-56a9b552ebda in health state unhealthy of framework 
> 2e0e9ea1-0ae5-4f28-80bb-a9abc56c5a6f-
> Value of: statusHealth.get().healthy()
>   Actual: false
>   Expected: true
> I0929 17:22:05.274636 26849 master.cpp:7560] Updating the state of task 
> aa0792d3-8d85-4c32-bd04-56a9b552ebda of framework 
> 2e0e9ea1-0ae5-4f28-80bb-a9abc56c5a6f- (latest state: TASK_RUNNING, status 
> update state: TASK_RUNNING)
> I0929 17:22:05.274829 26844 sched.cpp:1025] Scheduler::statusUpdate took 
> 43297ns
> Received SHUTDOWN event
> {noformat}





[jira] [Updated] (MESOS-5856) Logrotate ContainerLogger module does not rotate logs when run as root with --switch_user

2016-10-14 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-5856:
--
Sprint: Mesosphere Sprint 44, Mesosphere Sprint 45  (was: Mesosphere Sprint 
44)

> Logrotate ContainerLogger module does not rotate logs when run as root with 
> --switch_user
> -
>
> Key: MESOS-5856
> URL: https://issues.apache.org/jira/browse/MESOS-5856
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.27.0, 0.28.0, 1.0.0
>Reporter: Joseph Wu
>Assignee: Sivaram Kannan
>Priority: Critical
>  Labels: logger, mesosphere, newbie
>
> The logrotate ContainerLogger module runs as the agent's user.  In most 
> cases, this is {{root}}.
> When {{logrotate}} is run as root, there is an additional check the 
> configuration files must pass (because a root {{logrotate}} needs to be 
> secured against non-root modifications to the configuration):
> https://github.com/logrotate/logrotate/blob/fe80cb51a2571ca35b1a7c8ba0695db5a68feaba/config.c#L807-L815
> Log rotation will fail under the following scenario:
> 1) The agent is run with {{--switch_user}} (default: true)
> 2) A task is launched with a non-root user specified
> 3) The logrotate module spawns a few companion processes (as root) and this 
> creates the {{stdout}}, {{stderr}}, {{stdout.logrotate.conf}}, and 
> {{stderr.logrotate.conf}} files (as root).  This step races with the next 
> step.
> 4) The Mesos containerizer and Fetcher will {{chown}} the task's sandbox to 
> the non-root user, including the files just created.
> 5) When {{logrotate}} is run, it will skip any non-root configuration files.  
> This means the files are not rotated.
> 
> Fix: The logrotate module's companion processes should call {{setuid}} and 
> {{setgid}}.
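
The proposed fix can be sketched as follows. This is only an illustration of the technique, not the Mesos patch: the helper name dropPrivileges and its parameters are hypothetical. A companion process would presumably call something like it before creating the log and configuration files, so that the files are created by the task's user and {{logrotate}} is not run as root.

```cpp
#include <sys/types.h>
#include <unistd.h>

#include <cerrno>
#include <cstring>
#include <string>

// Hypothetical helper, not part of Mesos: switches the calling process to
// the given user and group. Returns an empty string on success, otherwise
// the name of the failed call and the reason.
std::string dropPrivileges(uid_t uid, gid_t gid)
{
  // The group has to be changed first: once the UID is no longer root,
  // the process may no longer be allowed to change its GID.
  if (::setgid(gid) != 0) {
    return std::string("setgid: ") + std::strerror(errno);
  }

  if (::setuid(uid) != 0) {
    return std::string("setuid: ") + std::strerror(errno);
  }

  return "";
}
```

The order of the two calls is the main design point; a fuller version would likely also reset supplementary groups while still privileged.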





[jira] [Commented] (MESOS-6310) Remove or define non-POSIX function

2016-10-14 Thread Marc Villacorta (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576711#comment-15576711
 ] 

Marc Villacorta commented on MESOS-6310:


It builds successfully after I applied that last patch.

> Remove or define non-POSIX function
> ---
>
> Key: MESOS-6310
> URL: https://issues.apache.org/jira/browse/MESOS-6310
> Project: Mesos
>  Issue Type: Improvement
>  Components: containerization
>Affects Versions: 1.0.2
>Reporter: Marc Villacorta
>Assignee: Kevin Klues
>Priority: Minor
> Fix For: 1.1.0
>
>
> I was trying to compile Mesos using _musl_ inside Alpine Linux 3.4.
> But this [commit| 
> https://github.com/apache/mesos/commit/498d14e934233e4501597b43da3924bfe8b2de20]
>  introduced the {{W_EXITCODE()}} macro which is not defined in _musl_ and 
> seems to be non-POSIX.
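
One common workaround for a libc that lacks the macro is a guarded fallback definition. The sketch below assumes the conventional Linux encoding of wait statuses (exit code in bits 8 to 15, terminating signal in the low 7 bits), which is what the glibc definition expands to. The two helper functions are hypothetical and only show the two ways the macro is typically used; this is not necessarily how the issue was resolved in Mesos.

```cpp
#include <sys/wait.h>

#include <csignal>

// Fallback for C libraries (such as musl) that do not provide the macro.
// Assumption: the platform uses the usual Linux wait status layout.
#ifndef W_EXITCODE
#define W_EXITCODE(ret, sig) ((ret) << 8 | (sig))
#endif

// Hypothetical helper: wait status of a process that exited with `code`.
int statusForExit(int code)
{
  return W_EXITCODE(code, 0);
}

// Hypothetical helper: wait status of a process killed by signal `sig`.
int statusForSignal(int sig)
{
  return W_EXITCODE(0, sig);
}
```

The statuses built this way can be decoded again with the POSIX macros WIFEXITED, WEXITSTATUS, WIFSIGNALED and WTERMSIG.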





[jira] [Updated] (MESOS-4566) Avoid unnecessary temporary `std::string` constructions and copies in `jsonify`.

2016-10-14 Thread Michael Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Park updated MESOS-4566:

Description: A few of the critical code paths in {{jsonify}} involve 
unnecessary temporary string construction and copies (inherited from the 
{{JSON::*}}). For example, {{strings::trim}} is used to remove trailing 0s from 
printing {{double}}. We print {{double}} a lot, and therefore constructing a 
temporary {{std::string}} on printing of every double is extremely costly. This 
ticket captures the work involved in avoiding them.  (was: A few of the 
critical code paths in {{jsonify}} involve unnecessary temporary string 
construction and copies (inherited from the {{JSON::*}}). For example, 
{{strings::trim}} is used to remove trailing 0s from printing {{double}}s. We 
print {{double}}s a lot, and therefore constructing a temporary {{std::string}} 
on printing of every double is extremely costly. This ticket captures the work 
involved in avoiding them.)

> Avoid unnecessary temporary `std::string` constructions and copies in 
> `jsonify`.
> 
>
> Key: MESOS-4566
> URL: https://issues.apache.org/jira/browse/MESOS-4566
> Project: Mesos
>  Issue Type: Improvement
>  Components: stout
>Reporter: Michael Park
>Assignee: Michael Park
>  Labels: mesosphere
> Fix For: 0.24.2, 0.25.1, 0.26.1, 0.27.1, 0.28.0
>
>
> A few of the critical code paths in {{jsonify}} involve unnecessary temporary 
> string construction and copies (inherited from the {{JSON::*}}). For example, 
> {{strings::trim}} is used to remove trailing 0s from printing {{double}}. We 
> print {{double}} a lot, and therefore constructing a temporary 
> {{std::string}} on printing of every double is extremely costly. This ticket 
> captures the work involved in avoiding them.





[jira] [Updated] (MESOS-6391) Command task's sandbox should not be owned by root if it uses container image.

2016-10-14 Thread Gilbert Song (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gilbert Song updated MESOS-6391:

Target Version/s: 0.28.3, 1.0.2, 1.1.0  (was: 1.1.0)

> Command task's sandbox should not be owned by root if it uses container image.
> --
>
> Key: MESOS-6391
> URL: https://issues.apache.org/jira/browse/MESOS-6391
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.28.2, 1.0.1
>Reporter: Jie Yu
>Assignee: Jie Yu
>Priority: Blocker
>
> Currently, if the task defines a container image, the command executor will 
> be run under root because it needs to perform pivot_root.
> That means if the task wants to run under an unprivileged user, the sandbox 
> of that task will not be writable because it's owned by root.





[jira] [Commented] (MESOS-6391) Command task's sandbox should not be owned by root if it uses container image.

2016-10-14 Thread Gilbert Song (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576557#comment-15576557
 ] 

Gilbert Song commented on MESOS-6391:
-

Yes, let me update it.

> Command task's sandbox should not be owned by root if it uses container image.
> --
>
> Key: MESOS-6391
> URL: https://issues.apache.org/jira/browse/MESOS-6391
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.28.2, 1.0.1
>Reporter: Jie Yu
>Assignee: Jie Yu
>Priority: Blocker
>
> Currently, if the task defines a container image, the command executor will 
> be run under root because it needs to perform pivot_root.
> That means if the task wants to run under an unprivileged user, the sandbox 
> of that task will not be writable because it's owned by root.





[jira] [Commented] (MESOS-6391) Command task's sandbox should not be owned by root if it uses container image.

2016-10-14 Thread Vinod Kone (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576544#comment-15576544
 ] 

Vinod Kone commented on MESOS-6391:
---

Should this be targeted for 0.28.3 and 1.0.2 as well?

> Command task's sandbox should not be owned by root if it uses container image.
> --
>
> Key: MESOS-6391
> URL: https://issues.apache.org/jira/browse/MESOS-6391
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.28.2, 1.0.1
>Reporter: Jie Yu
>Assignee: Jie Yu
>Priority: Blocker
>
> Currently, if the task defines a container image, the command executor will 
> be run under root because it needs to perform pivot_root.
> That means if the task wants to run under an unprivileged user, the sandbox 
> of that task will not be writable because it's owned by root.





[jira] [Updated] (MESOS-6028) mesos-execute has a typo in volume help.

2016-10-14 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-6028:
---
Summary: mesos-execute has a typo in volume help.  (was: typo in 
mesos-execute usage)

> mesos-execute has a typo in volume help.
> 
>
> Key: MESOS-6028
> URL: https://issues.apache.org/jira/browse/MESOS-6028
> Project: Mesos
>  Issue Type: Documentation
>Reporter: Stéphane Cottin
>Assignee: Tomasz Janiszewski
>Priority: Minor
>
> s/docker_options/driver_options/g





[jira] [Commented] (MESOS-6050) Add an agent flag for 'runtime_dir'

2016-10-14 Thread Yan Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576397#comment-15576397
 ] 

Yan Xu commented on MESOS-6050:
---

{{/var/run/}} is generally known as run state. "Runtime" is pretty ambiguous to 
me, and a [separate effort|https://reviews.apache.org/r/52556/] has been made to 
name something else "runtime" in Mesos.

Would you consider {{--runstate_dir}} or {{MESOS_RUNSTATE_DIR}} as better 
alternatives? See 
https://www.gnu.org/prep/standards/html_node/Directory-Variables.html for other 
conventions.

> Add an agent flag for 'runtime_dir'
> ---
>
> Key: MESOS-6050
> URL: https://issues.apache.org/jira/browse/MESOS-6050
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Kevin Klues
>Assignee: Kevin Klues
> Fix For: 1.1.0
>
>
> Currently, a number of agent components hard code a path
> under '/var/run/mesos' in order to persist runtime information across
> agent crashes. This path should be configurable.





[jira] [Updated] (MESOS-3959) Executor page of mesos ui does not show slave hostname.

2016-10-14 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-3959:
---
Summary: Executor page of mesos ui does not show slave hostname.  (was: 
Executor page of mesos ui does not show slave hostname)

> Executor page of mesos ui does not show slave hostname.
> ---
>
> Key: MESOS-3959
> URL: https://issues.apache.org/jira/browse/MESOS-3959
> Project: Mesos
>  Issue Type: Bug
>  Components: webui
>Reporter: Ian Babrou
>
> This is not really convenient.





[jira] [Commented] (MESOS-6360) The handling of whiteout files in provisioner is not correct

2016-10-14 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576335#comment-15576335
 ] 

Jie Yu commented on MESOS-6360:
---

I am fine if we don't target this for 1.1.0. I marked the target version as 1.1.1 
because I think it's CRITICAL to get this fix out.

> The handling of whiteout files in provisioner is not correct
> 
>
> Key: MESOS-6360
> URL: https://issues.apache.org/jira/browse/MESOS-6360
> Project: Mesos
>  Issue Type: Bug
>Reporter: Qian Zhang
>Assignee: Qian Zhang
>Priority: Blocker
>
> Currently when user launches a container from a Docker image via universal 
> containerizer, we always handle the whiteout files in 
> {{ProvisionerProcess::__provision()}} regardless of which backend is used.
> However this is actually not correct, because the way to handle whiteout 
> files is backend dependent, that means for different backends, we need to 
> handle whiteout files in different ways, e.g.:
> * AUFS backend: It seems the AUFS whiteout ({{.wh.}} and 
> {{.wh..wh..opq}}) is the whiteout standard in Docker (see [this comment | 
> https://github.com/docker/docker/blob/v1.12.1/pkg/archive/archive.go#L259:L262]
>  for details), so that means after the Docker image is pulled, its whiteout 
> files in the store are already in aufs format, then we do not need to do 
> anything about whiteout file handling because the aufs mount done in 
> {{AufsBackendProcess::provision()}} will handle it automatically.
> * Overlay backend: Overlayfs has its own whiteout files (see [this doc | 
> https://www.kernel.org/doc/Documentation/filesystems/overlayfs.txt] for 
> details), so we need to convert the aufs whiteout files to overlayfs whiteout 
> files before we do the overlay mount in {{OverlayBackendProcess::provision}} 
> which will automatically handle the overlayfs whiteout files.
> * Copy backend: We need to manually handle the aufs whiteout files when we 
> copy each layer in {{CopyBackendProcess::_provision()}}.
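
As a rough illustration of the aufs convention described above, the sketch below classifies a file name found in an image layer. The type and function names are hypothetical and this is not the provisioner code. It relies on the aufs naming used by Docker: a name starting with ".wh." marks a deleted entry, and ".wh..wh..opq" marks an opaque directory. For the overlay backend, the kernel documentation linked above describes the equivalent representations: a whiteout is a character device with device number 0/0, and an opaque directory carries the extended attribute "trusted.overlay.opaque" set to "y".

```cpp
#include <string>

// Hypothetical types for this example only.
enum class WhiteoutKind
{
  None,      // Regular entry, or aufs metadata that is ignored here.
  Entry,     // ".wh.<name>": <name> was deleted in this layer.
  OpaqueDir  // ".wh..wh..opq": the containing directory is opaque.
};

struct AufsWhiteout
{
  WhiteoutKind kind;
  std::string target;  // Name of the deleted entry, if any.
};

static bool startsWith(const std::string& name, const std::string& prefix)
{
  return name.compare(0, prefix.size(), prefix) == 0;
}

// Classifies the base name (not the full path) of a file in a layer.
AufsWhiteout classifyAufsWhiteout(const std::string& basename)
{
  static const std::string PREFIX = ".wh.";
  static const std::string META_PREFIX = ".wh..wh.";
  static const std::string OPAQUE_MARKER = ".wh..wh..opq";

  if (basename == OPAQUE_MARKER) {
    return {WhiteoutKind::OpaqueDir, ""};
  }

  // Simplification: other ".wh..wh."-prefixed names are treated as
  // metadata and not as whiteouts of real entries.
  if (startsWith(basename, META_PREFIX)) {
    return {WhiteoutKind::None, ""};
  }

  if (basename.size() > PREFIX.size() && startsWith(basename, PREFIX)) {
    return {WhiteoutKind::Entry, basename.substr(PREFIX.size())};
  }

  return {WhiteoutKind::None, ""};
}
```

A copy backend could use such a classification to remove the target (or clear the directory) while copying a layer, whereas an overlay backend would translate each result into the overlayfs form before mounting.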





[jira] [Updated] (MESOS-6360) The handling of whiteout files in provisioner is not correct

2016-10-14 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-6360:
--
Target Version/s: 1.1.1  (was: 1.2.0)

> The handling of whiteout files in provisioner is not correct
> 
>
> Key: MESOS-6360
> URL: https://issues.apache.org/jira/browse/MESOS-6360
> Project: Mesos
>  Issue Type: Bug
>Reporter: Qian Zhang
>Assignee: Qian Zhang
>Priority: Blocker
>
> Currently when user launches a container from a Docker image via universal 
> containerizer, we always handle the whiteout files in 
> {{ProvisionerProcess::__provision()}} regardless of which backend is used.
> However this is actually not correct, because the way to handle whiteout 
> files is backend dependent, that means for different backends, we need to 
> handle whiteout files in different ways, e.g.:
> * AUFS backend: It seems the AUFS whiteout ({{.wh.}} and 
> {{.wh..wh..opq}}) is the whiteout standard in Docker (see [this comment | 
> https://github.com/docker/docker/blob/v1.12.1/pkg/archive/archive.go#L259:L262]
>  for details), so that means after the Docker image is pulled, its whiteout 
> files in the store are already in aufs format, then we do not need to do 
> anything about whiteout file handling because the aufs mount done in 
> {{AufsBackendProcess::provision()}} will handle it automatically.
> * Overlay backend: Overlayfs has its own whiteout files (see [this doc | 
> https://www.kernel.org/doc/Documentation/filesystems/overlayfs.txt] for 
> details), so we need to convert the aufs whiteout files to overlayfs whiteout 
> files before we do the overlay mount in {{OverlayBackendProcess::provision}} 
> which will automatically handle the overlayfs whiteout files.
> * Copy backend: We need to manually handle the aufs whiteout files when we 
> copy each layer in {{CopyBackendProcess::_provision()}}.





[jira] [Issue Comment Deleted] (MESOS-6360) The handling of whiteout files in provisioner is not correct

2016-10-14 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-6360:
--
Comment: was deleted

(was: commit 3f6503861e92896a619e3b26a65f9b679a2dd3a9
Author: Qian Zhang 
Date:   Fri Oct 14 11:41:02 2016 -0700

Added backend suffix to image layer rootfs path.

Previously image layer rootfs path is in the format below regardless
of which backend is used.
  /layers//rootfs
This introduced an issue: when agent is restarted with a different
backend, we will wrongly handle the whiteout files since different
backends(e.g., aufs and overlay) have different whiteout standard.

In this commit, we added backend suffix to image layer rootfs path
for overlay backend like below.
  /layers//rootfs.overlay
For non-overlay backends, it is still in the previous format, this
is because they share the same whiteout standard (aufs standard),
and also used to handle backward compatibility.

So the expected result of this commit is:
1. If user switches backend from overlay to a non-overlay (or vice
versa) when restarting agent, all the image layers of the previous
backend in the store will just be ignored.
2. In the upgrade case, if user starts a new version of Mesos agent
with overlay backend, then all the image layers in the store pulled
by the old agent will just be ignored, but if user starts the new
agent with a non-overlay backend, then all the image layers in the
store can still be used.

Review: https://reviews.apache.org/r/52827/)

> The handling of whiteout files in provisioner is not correct
> 
>
> Key: MESOS-6360
> URL: https://issues.apache.org/jira/browse/MESOS-6360
> Project: Mesos
>  Issue Type: Bug
>Reporter: Qian Zhang
>Assignee: Qian Zhang
>Priority: Blocker
>
> Currently when user launches a container from a Docker image via universal 
> containerizer, we always handle the whiteout files in 
> {{ProvisionerProcess::__provision()}} regardless of which backend is used.
> However this is actually not correct, because the way to handle whiteout 
> files is backend dependent, that means for different backends, we need to 
> handle whiteout files in different ways, e.g.:
> * AUFS backend: It seems the AUFS whiteout ({{.wh.}} and 
> {{.wh..wh..opq}}) is the whiteout standard in Docker (see [this comment | 
> https://github.com/docker/docker/blob/v1.12.1/pkg/archive/archive.go#L259:L262]
>  for details), so that means after the Docker image is pulled, its whiteout 
> files in the store are already in aufs format, then we do not need to do 
> anything about whiteout file handling because the aufs mount done in 
> {{AufsBackendProcess::provision()}} will handle it automatically.
> * Overlay backend: Overlayfs has its own whiteout files (see [this doc | 
> https://www.kernel.org/doc/Documentation/filesystems/overlayfs.txt] for 
> details), so we need to convert the aufs whiteout files to overlayfs whiteout 
> files before we do the overlay mount in {{OverlayBackendProcess::provision}} 
> which will automatically handle the overlayfs whiteout files.
> * Copy backend: We need to manually handle the aufs whiteout files when we 
> copy each layer in {{CopyBackendProcess::_provision()}}.





[jira] [Commented] (MESOS-6360) The handling of whiteout files in provisioner is not correct

2016-10-14 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576329#comment-15576329
 ] 

Jie Yu commented on MESOS-6360:
---

commit 3f6503861e92896a619e3b26a65f9b679a2dd3a9
Author: Qian Zhang 
Date:   Fri Oct 14 11:41:02 2016 -0700

Added backend suffix to image layer rootfs path.

Previously image layer rootfs path is in the format below regardless
of which backend is used.
  /layers//rootfs
This introduced an issue: when agent is restarted with a different
backend, we will wrongly handle the whiteout files since different
backends(e.g., aufs and overlay) have different whiteout standard.

In this commit, we added backend suffix to image layer rootfs path
for overlay backend like below.
  /layers//rootfs.overlay
For non-overlay backends, it is still in the previous format, this
is because they share the same whiteout standard (aufs standard),
and also used to handle backward compatibility.

So the expected result of this commit is:
1. If user switches backend from overlay to a non-overlay (or vice
versa) when restarting agent, all the image layers of the previous
backend in the store will just be ignored.
2. In the upgrade case, if user starts a new version of Mesos agent
with overlay backend, then all the image layers in the store pulled
by the old agent will just be ignored, but if user starts the new
agent with a non-overlay backend, then all the image layers in the
store can still be used.

Review: https://reviews.apache.org/r/52827/
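
The path scheme described in the commit message can be sketched as a small helper. The function name and its parameters are hypothetical, and the layer directory is passed in as a whole because the full path layout is not spelled out above; only the "rootfs" versus "rootfs.overlay" distinction is taken from the commit message.

```cpp
#include <string>

// Hypothetical helper: returns the rootfs path of an image layer for the
// given backend. `layerDir` is the directory of a single layer in the
// store; `backend` is the backend name, e.g. "overlay", "aufs" or "copy".
std::string layerRootfsPath(
    const std::string& layerDir,
    const std::string& backend)
{
  const std::string rootfs = layerDir + "/rootfs";

  // Only the overlay backend gets a suffix. All other backends share the
  // aufs whiteout standard and keep the previous path, which also keeps
  // layers pulled by older agents usable.
  return backend == "overlay" ? rootfs + ".overlay" : rootfs;
}
```

With this scheme a layer extracted for the overlay backend is never found when the agent restarts with another backend (and vice versa), which matches the expected results listed in the commit message.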

> The handling of whiteout files in provisioner is not correct
> 
>
> Key: MESOS-6360
> URL: https://issues.apache.org/jira/browse/MESOS-6360
> Project: Mesos
>  Issue Type: Bug
>Reporter: Qian Zhang
>Assignee: Qian Zhang
>Priority: Blocker
>
> Currently when user launches a container from a Docker image via universal 
> containerizer, we always handle the whiteout files in 
> {{ProvisionerProcess::__provision()}} regardless of which backend is used.
> However this is actually not correct, because the way to handle whiteout 
> files is backend dependent, that means for different backends, we need to 
> handle whiteout files in different ways, e.g.:
> * AUFS backend: It seems the AUFS whiteout ({{.wh.}} and 
> {{.wh..wh..opq}}) is the whiteout standard in Docker (see [this comment | 
> https://github.com/docker/docker/blob/v1.12.1/pkg/archive/archive.go#L259:L262]
>  for details), so that means after the Docker image is pulled, its whiteout 
> files in the store are already in aufs format, then we do not need to do 
> anything about whiteout file handling because the aufs mount done in 
> {{AufsBackendProcess::provision()}} will handle it automatically.
> * Overlay backend: Overlayfs has its own whiteout files (see [this doc | 
> https://www.kernel.org/doc/Documentation/filesystems/overlayfs.txt] for 
> details), so we need to convert the aufs whiteout files to overlayfs whiteout 
> files before we do the overlay mount in {{OverlayBackendProcess::provision}} 
> which will automatically handle the overlayfs whiteout files.
> * Copy backend: We need to manually handle the aufs whiteout files when we 
> copy each layer in {{CopyBackendProcess::_provision()}}.





[jira] [Commented] (MESOS-6360) The handling of whiteout files in provisioner is not correct

2016-10-14 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576328#comment-15576328
 ] 

Jie Yu commented on MESOS-6360:
---

commit 3f6503861e92896a619e3b26a65f9b679a2dd3a9
Author: Qian Zhang 
Date:   Fri Oct 14 11:41:02 2016 -0700

Added backend suffix to image layer rootfs path.

Previously image layer rootfs path is in the format below regardless
of which backend is used.
  /layers//rootfs
This introduced an issue: when agent is restarted with a different
backend, we will wrongly handle the whiteout files since different
backends(e.g., aufs and overlay) have different whiteout standard.

In this commit, we added backend suffix to image layer rootfs path
for overlay backend like below.
  /layers//rootfs.overlay
For non-overlay backends, it is still in the previous format, this
is because they share the same whiteout standard (aufs standard),
and also used to handle backward compatibility.

So the expected result of this commit is:
1. If user switches backend from overlay to a non-overlay (or vice
versa) when restarting agent, all the image layers of the previous
backend in the store will just be ignored.
2. In the upgrade case, if user starts a new version of Mesos agent
with overlay backend, then all the image layers in the store pulled
by the old agent will just be ignored, but if user starts the new
agent with a non-overlay backend, then all the image layers in the
store can still be used.

Review: https://reviews.apache.org/r/52827/

> The handling of whiteout files in provisioner is not correct
> 
>
> Key: MESOS-6360
> URL: https://issues.apache.org/jira/browse/MESOS-6360
> Project: Mesos
>  Issue Type: Bug
>Reporter: Qian Zhang
>Assignee: Qian Zhang
>Priority: Blocker
>
> Currently when user launches a container from a Docker image via universal 
> containerizer, we always handle the whiteout files in 
> {{ProvisionerProcess::__provision()}} regardless of which backend is used.
> However this is actually not correct, because the way to handle whiteout 
> files is backend dependent, that means for different backends, we need to 
> handle whiteout files in different ways, e.g.:
> * AUFS backend: It seems the AUFS whiteout ({{.wh.}} and 
> {{.wh..wh..opq}}) is the whiteout standard in Docker (see [this comment | 
> https://github.com/docker/docker/blob/v1.12.1/pkg/archive/archive.go#L259:L262]
>  for details), so that means after the Docker image is pulled, its whiteout 
> files in the store are already in aufs format, then we do not need to do 
> anything about whiteout file handling because the aufs mount done in 
> {{AufsBackendProcess::provision()}} will handle it automatically.
> * Overlay backend: Overlayfs has its own whiteout files (see [this doc | 
> https://www.kernel.org/doc/Documentation/filesystems/overlayfs.txt] for 
> details), so we need to convert the aufs whiteout files to overlayfs whiteout 
> files before we do the overlay mount in {{OverlayBackendProcess::provision}} 
> which will automatically handle the overlayfs whiteout files.
> * Copy backend: We need to manually handle the aufs whiteout files when we 
> copy each layer in {{CopyBackendProcess::_provision()}}.





[jira] [Commented] (MESOS-6310) Remove or define non-POSIX function

2016-10-14 Thread Marc Villacorta (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576290#comment-15576290
 ] 

Marc Villacorta commented on MESOS-6310:


I think {{stout/os/wait.hpp}} should be included in 
{{src/slave/containerizer/mesos/launch.cpp}} too.

{code:none}
  CXX  slave/containerizer/mesos/libmesos_no_3rdparty_la-launch.lo
../../src/slave/containerizer/mesos/launch.cpp: In function 'void 
mesos::internal::slave::exitWithSignal(int)':
../../src/slave/containerizer/mesos/launch.cpp:224:44: error: 'W_EXITCODE' was 
not declared in this scope
 signalSafeWriteStatus(W_EXITCODE(0, sig));
^
../../src/slave/containerizer/mesos/launch.cpp: In function 'void 
mesos::internal::slave::exitWithStatus(int)':
../../src/slave/containerizer/mesos/launch.cpp:236:47: error: 'W_EXITCODE' was 
not declared in this scope
 signalSafeWriteStatus(W_EXITCODE(status, 0));
   ^
{code}

> Remove or define non-POSIX function
> ---
>
> Key: MESOS-6310
> URL: https://issues.apache.org/jira/browse/MESOS-6310
> Project: Mesos
>  Issue Type: Improvement
>  Components: containerization
>Affects Versions: 1.0.2
>Reporter: Marc Villacorta
>Assignee: Kevin Klues
>Priority: Minor
> Fix For: 1.1.0
>
>
> I was trying to compile Mesos using _musl_ inside Alpine Linux 3.4.
> But this [commit| 
> https://github.com/apache/mesos/commit/498d14e934233e4501597b43da3924bfe8b2de20]
>  introduced the {{W_EXITCODE()}} macro which is not defined in _musl_ and 
> seems to be non-POSIX.





[jira] [Comment Edited] (MESOS-6396) Hooks should allow sandbox dependent environment variables.

2016-10-14 Thread Till Toenshoff (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576156#comment-15576156
 ] 

Till Toenshoff edited comment on MESOS-6396 at 10/14/16 7:23 PM:
-

I wish - unfortunately that one only mutates the Task environment.

see 
https://github.com/apache/mesos/blob/master/src/slave/containerizer/docker.cpp#L229,
 
https://github.com/apache/mesos/blob/master/src/slave/containerizer/docker.cpp#L1060


was (Author: tillt):
I wish - unfortunately that one only mutates the Task environment.

see 
https://github.com/apache/mesos/blob/master/src/slave/containerizer/docker.cpp#L229

> Hooks should allow sandbox dependent environment variables.
> ---
>
> Key: MESOS-6396
> URL: https://issues.apache.org/jira/browse/MESOS-6396
> Project: Mesos
>  Issue Type: Improvement
>Affects Versions: 1.1.0
>Reporter: Till Toenshoff
>  Labels: containerizer, docker, hooks, module
>
> The {{slaveExecutorEnvironmentDecorator}} hook is the only one that allows 
> mutating the executor environment of a Docker container. That callback has no 
> means of getting the location of the sandbox. That in turn means that it is 
> not possible for a hook to create files and respective environment variables 
> listing paths within the sandbox for the executor to access.





[jira] [Updated] (MESOS-6396) Hooks should allow sandbox dependent environment variables.

2016-10-14 Thread Till Toenshoff (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Toenshoff updated MESOS-6396:
--
Labels: containerizer docker hooks  (was: )

> Hooks should allow sandbox dependent environment variables.
> ---
>
> Key: MESOS-6396
> URL: https://issues.apache.org/jira/browse/MESOS-6396
> Project: Mesos
>  Issue Type: Improvement
>Affects Versions: 1.1.0
>Reporter: Till Toenshoff
>  Labels: containerizer, docker, hooks, module
>
> The {{slaveExecutorEnvironmentDecorator}} hook is the only one that allows 
> mutating the executor environment of a Docker container. That callback has no 
> means of getting the location of the sandbox. That in turn means that it is 
> not possible for a hook to create files and respective environment variables 
> listing paths within the sandbox for the executor to access.





[jira] [Updated] (MESOS-6396) Hooks should allow sandbox dependent environment variables.

2016-10-14 Thread Till Toenshoff (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Toenshoff updated MESOS-6396:
--
Labels: containerizer docker hooks module  (was: containerizer docker hooks)

> Hooks should allow sandbox dependent environment variables.
> ---
>
> Key: MESOS-6396
> URL: https://issues.apache.org/jira/browse/MESOS-6396
> Project: Mesos
>  Issue Type: Improvement
>Affects Versions: 1.1.0
>Reporter: Till Toenshoff
>  Labels: containerizer, docker, hooks, module
>
> The {{slaveExecutorEnvironmentDecorator}} hook is the only one that allows 
> mutating the executor environment of a Docker container. That callback has no 
> means of getting the location of the sandbox. That in turn means that it is 
> not possible for a hook to create files and respective environment variables 
> listing paths within the sandbox for the executor to access.





[jira] [Updated] (MESOS-6396) Hooks should allow sandbox dependent environment variables.

2016-10-14 Thread Till Toenshoff (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Toenshoff updated MESOS-6396:
--
Affects Version/s: 1.1.0

> Hooks should allow sandbox dependent environment variables.
> ---
>
> Key: MESOS-6396
> URL: https://issues.apache.org/jira/browse/MESOS-6396
> Project: Mesos
>  Issue Type: Improvement
>Affects Versions: 1.1.0
>Reporter: Till Toenshoff
>  Labels: containerizer, docker, hooks, module
>
> The {{slaveExecutorEnvironmentDecorator}} hook is the only one that allows 
> mutating the executor environment of a Docker container. That callback has no 
> means of getting the location of the sandbox. That in turn means that it is 
> not possible for a hook to create files and respective environment variables 
> listing paths within the sandbox for the executor to access.





[jira] [Updated] (MESOS-6396) Hooks should allow sandbox dependent environment variables.

2016-10-14 Thread Till Toenshoff (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Toenshoff updated MESOS-6396:
--
Issue Type: Improvement  (was: Bug)

> Hooks should allow sandbox dependent environment variables.
> ---
>
> Key: MESOS-6396
> URL: https://issues.apache.org/jira/browse/MESOS-6396
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Till Toenshoff
>
> The {{slaveExecutorEnvironmentDecorator}} hook is the only one that allows 
> mutating the executor environment of a Docker container. That callback has no 
> means of getting the location of the sandbox. That in turn means that it is 
> not possible for a hook to create files and respective environment variables 
> listing paths within the sandbox for the executor to access.





[jira] [Commented] (MESOS-6396) Hooks should allow sandbox dependent environment variables.

2016-10-14 Thread Till Toenshoff (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576156#comment-15576156
 ] 

Till Toenshoff commented on MESOS-6396:
---

I wish - unfortunately that one only mutates the Task environment.

see 
https://github.com/apache/mesos/blob/master/src/slave/containerizer/docker.cpp#L229

> Hooks should allow sandbox dependent environment variables.
> ---
>
> Key: MESOS-6396
> URL: https://issues.apache.org/jira/browse/MESOS-6396
> Project: Mesos
>  Issue Type: Bug
>Reporter: Till Toenshoff
>
> The {{slaveExecutorEnvironmentDecorator}} hook is the only one that allows 
> mutating the executor environment of a Docker container. That callback has no 
> means of getting the location of the sandbox. That in turn means that it is 
> not possible for a hook to create files and respective environment variables 
> listing paths within the sandbox for the executor to access.





[jira] [Commented] (MESOS-6396) Hooks should allow sandbox dependent environment variables.

2016-10-14 Thread Joseph Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576144#comment-15576144
 ] 

Joseph Wu commented on MESOS-6396:
--

The {{slavePreLaunchDockerEnvironmentDecorator}} hook should fit your purposes.

> Hooks should allow sandbox dependent environment variables.
> ---
>
> Key: MESOS-6396
> URL: https://issues.apache.org/jira/browse/MESOS-6396
> Project: Mesos
>  Issue Type: Bug
>Reporter: Till Toenshoff
>
> The {{slaveExecutorEnvironmentDecorator}} hook is the only one that allows 
> mutating the executor environment of a Docker container. That callback has no 
> means of getting the location of the sandbox. That in turn means that it is 
> not possible for a hook to create files and respective environment variables 
> listing paths within the sandbox for the executor to access.





[jira] [Created] (MESOS-6396) Hooks should allow sandbox dependent environment variables.

2016-10-14 Thread Till Toenshoff (JIRA)
Till Toenshoff created MESOS-6396:
-

 Summary: Hooks should allow sandbox dependent environment 
variables.
 Key: MESOS-6396
 URL: https://issues.apache.org/jira/browse/MESOS-6396
 Project: Mesos
  Issue Type: Bug
Reporter: Till Toenshoff


The {{slaveExecutorEnvironmentDecorator}} hook is the only one that allows 
mutating the executor environment of a Docker container. That callback has no 
means of getting the location of the sandbox. That in turn means that it is not 
possible for a hook to create files and respective environment variables 
listing paths within the sandbox for the executor to access.
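
To make the request above concrete, here is a minimal sketch of what a sandbox-aware decorator could do if it were handed the sandbox location. This is a hypothetical Python model, not the Mesos hook API: the function name `decorate_executor_environment`, the `sandbox_directory` parameter, the file name `hook-token`, and the variable `HOOK_TOKEN_FILE` are all assumptions made for illustration.

```python
import os
import tempfile


def decorate_executor_environment(environment, sandbox_directory):
    """Hypothetical decorator: create a file in the sandbox and expose its path.

    Returns a copy of `environment` extended with a variable that points
    at the file written into the sandbox.
    """
    token_path = os.path.join(sandbox_directory, "hook-token")
    with open(token_path, "w") as handle:
        handle.write("example-content\n")

    decorated = dict(environment)  # leave the caller's mapping untouched
    decorated["HOOK_TOKEN_FILE"] = token_path
    return decorated


if __name__ == "__main__":
    # A temporary directory stands in for the executor sandbox.
    with tempfile.TemporaryDirectory() as sandbox:
        env = decorate_executor_environment({"PATH": "/usr/bin"}, sandbox)
        print(env["HOOK_TOKEN_FILE"])
```

Without the sandbox path as an input, the decorator has no reliable place to put the file, which is the gap this issue describes.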






[jira] [Commented] (MESOS-6312) Add requirement in upgrade.md and getting-started.md for agent '--runtime_dir' when running as non-root

2016-10-14 Thread Kevin Klues (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576055#comment-15576055
 ] 

Kevin Klues commented on MESOS-6312:


Also, this one now: https://reviews.apache.org/r/52856/ 

> Add requirement in upgrade.md and getting-started.md for agent 
> '--runtime_dir' when running as non-root
> --
>
> Key: MESOS-6312
> URL: https://issues.apache.org/jira/browse/MESOS-6312
> Project: Mesos
>  Issue Type: Task
>Reporter: Kevin Klues
>Assignee: Kevin Klues
>Priority: Blocker
>
> We recently introduced a new agent flag for {{\-\-runtime_dir}}. Unlike the 
> {{\-\-work_dir}}, this directory is designed to hold the state of a running 
> agent between subsequent agent restarts (but not across host reboots).
> By default, this flag is set to {{/var/run/mesos}} since this is a {{tmpfs}} 
> on Linux that gets automatically cleaned up on reboot. However, on most 
> systems {{/var/run/mesos}} is only writable by root, causing problems when 
> launching an agent as non-root without pointing {{--runtime_dir}} to a 
> different location.
> We need to call this out in the upgrade.md and getting-started.md docs so 
> that people know they may need to set this going forward.
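
The situation described above can be checked before starting an agent. The following is a rough sketch, not part of Mesos: `pick_runtime_dir` and the fallback path are hypothetical, and the check only looks at the nearest existing parent directory.

```python
import os
import tempfile


def pick_runtime_dir(default_dir, fallback_dir):
    """Return default_dir if it could be created and written, else fallback_dir.

    Walks up from default_dir to the nearest existing ancestor and tests
    whether the current user may write there.
    """
    probe = default_dir
    while not os.path.exists(probe):
        parent = os.path.dirname(probe)
        if parent == probe:  # reached the top without finding anything
            break
        probe = parent
    if os.access(probe, os.W_OK | os.X_OK):
        return default_dir
    return fallback_dir


if __name__ == "__main__":
    # The fallback location is an arbitrary example.
    fallback = os.path.join(tempfile.gettempdir(), "mesos-runtime")
    print(pick_runtime_dir("/var/run/mesos", fallback))
```

A non-root user would then pass the chosen path to the agent via `--runtime_dir`.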





[jira] [Commented] (MESOS-6380) mesos-local failed to start without sudo

2016-10-14 Thread Kevin Klues (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576050#comment-15576050
 ] 

Kevin Klues commented on MESOS-6380:


Also requires https://reviews.apache.org/r/52856/ now.

> mesos-local failed to start without sudo
> 
>
> Key: MESOS-6380
> URL: https://issues.apache.org/jira/browse/MESOS-6380
> Project: Mesos
>  Issue Type: Bug
>Reporter: haosdent
>
> Got this error when launching mesos-local without sudo:
> {code}
>  message: 'Failed to launch container: Failed to make the containerizer 
> runtime directory 
> '/var/run/mesos/containers/f2d6947f-2916-4f1a-90dc-3d137b360b9c': Permission 
> denied; Abnormal executor termination: unknown container'
> {code}





[jira] [Updated] (MESOS-6239) Fix warnings and errors produced by new hardened CXXFLAGS

2016-10-14 Thread Aaron Wood (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Wood updated MESOS-6239:
--
Description: 
Most of the new warnings/errors come from libprocess/stout as there were never 
any CXXFLAGS propagated to them.

https://reviews.apache.org/r/52647/
https://reviews.apache.org/r/52754/
https://reviews.apache.org/r/52886/

  was:
Most of the new warnings/errors come from libprocess/stout as there were never 
any CXXFLAGS propagated to them.

https://reviews.apache.org/r/52647/
https://reviews.apache.org/r/52754/


> Fix warnings and errors produced by new hardened CXXFLAGS
> -
>
> Key: MESOS-6239
> URL: https://issues.apache.org/jira/browse/MESOS-6239
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Aaron Wood
>Assignee: Aaron Wood
>Priority: Minor
>  Labels: c++, clang, gcc, libprocess, security, stout
>
> Most of the new warnings/errors come from libprocess/stout as there were 
> never any CXXFLAGS propagated to them.
> https://reviews.apache.org/r/52647/
> https://reviews.apache.org/r/52754/
> https://reviews.apache.org/r/52886/





[jira] [Updated] (MESOS-6386) "Reached unreachable statement" in LinuxCapabilitiesIsolatorTest

2016-10-14 Thread Benjamin Bannier (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Bannier updated MESOS-6386:

Shepherd: Jie Yu

> "Reached unreachable statement" in LinuxCapabilitiesIsolatorTest
> 
>
> Key: MESOS-6386
> URL: https://issues.apache.org/jira/browse/MESOS-6386
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
> Environment: CentOS Linux release 7.2.1511 (Core), amd64
>Reporter: Neil Conway
>Assignee: Benjamin Bannier
>Priority: Minor
>  Labels: mesosphere
> Attachments: verbose-test-output.txt
>
>
> {noformat}
> [ RUN  ] TestParam/LinuxCapabilitiesIsolatorTest.ROOT_Ping/2
> Failed to execute command: Permission denied
> Reached unreachable statement at 
> ../../mesos/src/slave/containerizer/mesos/launch.cpp:710
> [   OK ] TestParam/LinuxCapabilitiesIsolatorTest.ROOT_Ping/2 (366 ms)
> {noformat}
> Observed running the tests as root on CentOS 7.2. Verbose test output 
> attached.





[jira] [Assigned] (MESOS-6386) "Reached unreachable statement" in LinuxCapabilitiesIsolatorTest

2016-10-14 Thread Benjamin Bannier (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Bannier reassigned MESOS-6386:
---

Assignee: Benjamin Bannier

> "Reached unreachable statement" in LinuxCapabilitiesIsolatorTest
> 
>
> Key: MESOS-6386
> URL: https://issues.apache.org/jira/browse/MESOS-6386
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
> Environment: CentOS Linux release 7.2.1511 (Core), amd64
>Reporter: Neil Conway
>Assignee: Benjamin Bannier
>Priority: Minor
>  Labels: mesosphere
> Attachments: verbose-test-output.txt
>
>
> {noformat}
> [ RUN  ] TestParam/LinuxCapabilitiesIsolatorTest.ROOT_Ping/2
> Failed to execute command: Permission denied
> Reached unreachable statement at 
> ../../mesos/src/slave/containerizer/mesos/launch.cpp:710
> [   OK ] TestParam/LinuxCapabilitiesIsolatorTest.ROOT_Ping/2 (366 ms)
> {noformat}
> Observed running the tests as root on CentOS 7.2. Verbose test output 
> attached.





[jira] [Updated] (MESOS-5963) HealthChecker should not decide when to kill tasks and when to stop performing health checks.

2016-10-14 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-5963:
---
Shepherd: Benjamin Mahler
Assignee: Alexander Rukletsov  (was: haosdent)
  Sprint: Mesosphere Sprint 45
Target Version/s: 1.2.0

> HealthChecker should not decide when to kill tasks and when to stop 
> performing health checks.
> -
>
> Key: MESOS-5963
> URL: https://issues.apache.org/jira/browse/MESOS-5963
> Project: Mesos
>  Issue Type: Bug
>Reporter: Alexander Rukletsov
>Assignee: Alexander Rukletsov
>  Labels: health-check, mesosphere
>
> Currently, the {{HealthChecker}} library decides when a task should be killed 
> based on its health status. Moreover, it stops checking the task's health 
> after that. This seems unfortunate, because it's up to the executor and / or 
> framework to decide both when to kill tasks and when to health check them. 
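
One way to read the separation proposed above: the checker only reports results, and the caller owns the policy. The sketch below is a hypothetical Python model, not the Mesos `HealthChecker` code; `KillPolicy`, `max_consecutive_failures`, and the returned action strings are assumptions made for illustration.

```python
class KillPolicy:
    """Hypothetical executor-side policy; not part of Mesos."""

    def __init__(self, max_consecutive_failures):
        self.max_consecutive_failures = max_consecutive_failures
        self.consecutive_failures = 0

    def on_health_result(self, healthy):
        """Return the action the executor chooses for one health result."""
        if healthy:
            self.consecutive_failures = 0
            return "continue"
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.max_consecutive_failures:
            # The executor, not the checker, decides to kill. Whether to
            # keep checking afterwards is also left to the caller.
            return "kill"
        return "continue"
```

With this split, a framework that prefers to keep health checking a failing task simply supplies a different policy object.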





[jira] [Created] (MESOS-6395) HealthChecker sends updates to executor via libprocess messaging.

2016-10-14 Thread Alexander Rukletsov (JIRA)
Alexander Rukletsov created MESOS-6395:
--

 Summary: HealthChecker sends updates to executor via libprocess 
messaging.
 Key: MESOS-6395
 URL: https://issues.apache.org/jira/browse/MESOS-6395
 Project: Mesos
  Issue Type: Improvement
Reporter: Alexander Rukletsov
Assignee: Alexander Rukletsov


Currently {{HealthChecker}} sends status updates via libprocess messaging to 
the executor's UPID. This seems unnecessary after refactoring the health 
checker into a library: a simple callback will do. Moreover, not requiring the 
executor's {{UPID}} will simplify creating a mocked {{HealthChecker}}.
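
A minimal sketch of the callback idea, assuming a checker that is handed a plain function instead of a process identifier. This is a Python model for illustration only; the class below is not the Mesos `HealthChecker`, and `probe`, `on_update`, and `collect_updates` are hypothetical names.

```python
class HealthChecker:
    """Hypothetical stand-in for the health check library."""

    def __init__(self, probe, on_update):
        self._probe = probe          # callable returning True when healthy
        self._on_update = on_update  # plain callback, no UPID required

    def check_once(self):
        healthy = bool(self._probe())
        self._on_update({"healthy": healthy})
        return healthy


def collect_updates(probe_results):
    """Drive a checker with canned results and record every update."""
    results = iter(probe_results)
    updates = []
    checker = HealthChecker(lambda: next(results), updates.append)
    for _ in probe_results:
        checker.check_once()
    return updates
```

Because the consumer is just a callable, a test can pass `list.append` and inspect the recorded updates, which is the kind of mocking the description has in mind.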





[jira] [Comment Edited] (MESOS-6035) Add non-recursive version of cgroups::get

2016-10-14 Thread Alexander Rukletsov (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15575167#comment-15575167
 ] 

Alexander Rukletsov edited comment on MESOS-6035 at 10/14/16 12:16 PM:
---

{noformat}
Commit: fcd5106b5dfa14bc83eae68415bd4782c16f79a4 [fcd5106]
Author: Alexander Rukletsov 
Date: 14 October 2016 at 14:10:23 GMT+2
Commit Date: 14 October 2016 at 14:15:02 GMT+2

Revert "Removed the expired TODO about non-recursive version...
`cgroups::get`."

This reverts commit e042aa071a77ef1922d9b1a93f6e8adf221979b3.

RR https://reviews.apache.org/r/51185/ should have been committed
together with https://reviews.apache.org/r/51031/. However, the
latter is not going to make it into the 1.1.0 release, hence the
former is reverted now to avoid confusion.
{noformat}


was (Author: alexr):
{noformat}
Commit: fcd5106b5dfa14bc83eae68415bd4782c16f79a4 [fcd5106]
Parents: 9fc2901d23
Author: Alexander Rukletsov 
Date: 14 October 2016 at 14:10:23 GMT+2
Commit Date: 14 October 2016 at 14:15:02 GMT+2
Labels: HEAD -> master

Revert "Removed the expired TODO about non-recursive version...
`cgroups::get`."

This reverts commit e042aa071a77ef1922d9b1a93f6e8adf221979b3.

RR https://reviews.apache.org/r/51185/ should have been committed
together with https://reviews.apache.org/r/51031/. However, the
latter is not going to make it into the 1.1.0 release, hence the
former is reverted now to avoid confusion.
{noformat}

> Add non-recursive version of cgroups::get
> -
>
> Key: MESOS-6035
> URL: https://issues.apache.org/jira/browse/MESOS-6035
> Project: Mesos
>  Issue Type: Improvement
>Reporter: haosdent
>Assignee: haosdent
>Priority: Minor
>
> In some cases, we only need to get the top-level cgroups instead of getting 
> all cgroups recursively. Adding a non-recursive version could help to avoid 
> traversing unnecessary paths.
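
The difference between the two lookups can be sketched on a plain directory tree, since a cgroup hierarchy is exposed as nested directories. This is an illustrative Python model, not the `cgroups::get` implementation; the function name `get_cgroups` and its `recursive` parameter are assumptions.

```python
import os
import tempfile


def get_cgroups(hierarchy, cgroup="", recursive=True):
    """List cgroups below `cgroup`, as paths relative to `hierarchy`."""
    root = os.path.join(hierarchy, cgroup)
    found = []
    if recursive:
        # Visits every nested directory.
        for dirpath, dirnames, _ in os.walk(root):
            for name in dirnames:
                full = os.path.join(dirpath, name)
                found.append(os.path.relpath(full, hierarchy))
    else:
        # Looks at direct children only and never descends.
        with os.scandir(root) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    found.append(os.path.relpath(entry.path, hierarchy))
    return sorted(found)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as hierarchy:
        os.makedirs(os.path.join(hierarchy, "mesos", "container-1"))
        print(get_cgroups(hierarchy, recursive=False))
        print(get_cgroups(hierarchy, recursive=True))
```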





[jira] [Commented] (MESOS-6035) Add non-recursive version of cgroups::get

2016-10-14 Thread Alexander Rukletsov (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15575167#comment-15575167
 ] 

Alexander Rukletsov commented on MESOS-6035:


{noformat}
Commit: fcd5106b5dfa14bc83eae68415bd4782c16f79a4 [fcd5106]
Parents: 9fc2901d23
Author: Alexander Rukletsov 
Date: 14 October 2016 at 14:10:23 GMT+2
Commit Date: 14 October 2016 at 14:15:02 GMT+2
Labels: HEAD -> master

Revert "Removed the expired TODO about non-recursive version...
`cgroups::get`."

This reverts commit e042aa071a77ef1922d9b1a93f6e8adf221979b3.

RR https://reviews.apache.org/r/51185/ should have been committed
together with https://reviews.apache.org/r/51031/. However, the
latter is not going to make it into the 1.1.0 release, hence the
former is reverted now to avoid confusion.
{noformat}

> Add non-recursive version of cgroups::get
> -
>
> Key: MESOS-6035
> URL: https://issues.apache.org/jira/browse/MESOS-6035
> Project: Mesos
>  Issue Type: Improvement
>Reporter: haosdent
>Assignee: haosdent
>Priority: Minor
>
> In some cases, we only need to get the top-level cgroups instead of getting 
> all cgroups recursively. Adding a non-recursive version could help to avoid 
> traversing unnecessary paths.





[jira] [Created] (MESOS-6394) Improvements to partition-aware Mesos frameworks.

2016-10-14 Thread Alexander Rukletsov (JIRA)
Alexander Rukletsov created MESOS-6394:
--

 Summary: Improvements to partition-aware Mesos frameworks.
 Key: MESOS-6394
 URL: https://issues.apache.org/jira/browse/MESOS-6394
 Project: Mesos
  Issue Type: Epic
  Components: master
Reporter: Alexander Rukletsov
Assignee: Neil Conway


This is a follow-up epic to MESOS-5344 to capture further improvements and 
changes that need to be made to the MVP.





[jira] [Updated] (MESOS-5344) Partition-aware Mesos frameworks.

2016-10-14 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-5344:
---
Summary: Partition-aware Mesos frameworks.  (was: Partition-aware Mesos 
frameworks)

> Partition-aware Mesos frameworks.
> -
>
> Key: MESOS-5344
> URL: https://issues.apache.org/jira/browse/MESOS-5344
> Project: Mesos
>  Issue Type: Epic
>  Components: master
>Reporter: Neil Conway
>Assignee: Neil Conway
>Priority: Blocker
>  Labels: mesosphere
>
> This epic covers three related tasks:
> 1. Allowing partitioned agents to reregister with the master. This allows 
> frameworks to control how tasks running on partitioned agents should be dealt 
> with.
> 2. Replacing the TASK_LOST task state with a set of more granular states with 
> more precise semantics: UNREACHABLE, DROPPED, UNKNOWN, GONE, and 
> GONE_BY_OPERATOR.
> 3. Allowing frameworks to be informed when a task that was running on a 
> partitioned agent has been terminated (GONE and GONE_BY_OPERATOR states).
> These new behaviors will be guarded by the {{PARTITION_AWARE}} framework 
> capability.
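
As a rough illustration of the states listed above, the sketch below models them as an enum and shows one possible reading of the capability guard. It is a Python model, not Mesos code: the helper names are hypothetical, and the assumption that frameworks without `PARTITION_AWARE` keep receiving TASK_LOST is mine, not stated in the epic.

```python
from enum import Enum


class TaskState(Enum):
    TASK_LOST = 1
    UNREACHABLE = 2
    DROPPED = 3
    UNKNOWN = 4
    GONE = 5
    GONE_BY_OPERATOR = 6


# States that, per the description above, tell a framework the task has
# been terminated.
TERMINATED_STATES = {TaskState.GONE, TaskState.GONE_BY_OPERATOR}


def state_for_framework(state, capabilities):
    """Assumed guard: only PARTITION_AWARE frameworks see granular states."""
    if "PARTITION_AWARE" in capabilities:
        return state
    return TaskState.TASK_LOST


def is_known_terminated(state):
    """True when the state says the task is definitely gone."""
    return state in TERMINATED_STATES
```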





[jira] [Updated] (MESOS-6391) Command task's sandbox should not be owned by root if it uses container image.

2016-10-14 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-6391:
---
Priority: Blocker  (was: Major)

> Command task's sandbox should not be owned by root if it uses container image.
> --
>
> Key: MESOS-6391
> URL: https://issues.apache.org/jira/browse/MESOS-6391
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.28.2, 1.0.1
>Reporter: Jie Yu
>Assignee: Jie Yu
>Priority: Blocker
>
> Currently, if the task defines a container image, the command executor will 
> be run under root because it needs to perform pivot_root.
> That means if the task wants to run under an unprivileged user, the sandbox 
> of that task will not be writable because it's owned by root.
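
The consequence described above follows from the classic owner/group/other permission check. Here is a small sketch of that check in Python, assuming plain mode bits only; `can_write` is a hypothetical helper that ignores the root override, ACLs, and capabilities.

```python
import os
import stat
import tempfile


def can_write(st, uid, gids):
    """Owner/group/other write check against an os.stat() result."""
    if uid == st.st_uid:
        return bool(st.st_mode & stat.S_IWUSR)
    if st.st_gid in gids:
        return bool(st.st_mode & stat.S_IWGRP)
    return bool(st.st_mode & stat.S_IWOTH)


if __name__ == "__main__":
    # A temporary directory with mode 0755 stands in for the sandbox.
    with tempfile.TemporaryDirectory() as sandbox:
        os.chmod(sandbox, 0o755)
        st = os.stat(sandbox)
        print(can_write(st, st.st_uid, [st.st_gid]))          # the owner
        print(can_write(st, st.st_uid + 1, [st.st_gid + 1]))  # someone else
```

With a sandbox owned by root and mode 0755, only the first branch grants write access, so a task running as any other user cannot write to it.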





[jira] [Updated] (MESOS-6337) Nested containers getting killed before network isolation can be applied to them.

2016-10-14 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-6337:
---
Target Version/s:   (was: 1.1.0)

> Nested containers getting killed before network isolation can be applied to 
> them.
> -
>
> Key: MESOS-6337
> URL: https://issues.apache.org/jira/browse/MESOS-6337
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
> Environment: Linux
>Reporter: Avinash Sridharan
>Assignee: Gilbert Song
>  Labels: mesosphere
>
> Seeing this odd behavior in one of our clusters:
> ```
> http.cpp:1948] Failed to launch nested container 
> cb92634b-42b3-40f3-94f7-609f89a362bc.46d884e4-d0eb-4572-be1d-24414df7cb2e: 
> Collect failed: Failed to seed container 
> cb92634b-42b3-40f3-94f7-609f89a362bc.46d884e4-d0eb-4572-be1d-24414df7cb2e: 
> Collect failed: Failed to setup hostname and network files: Failed to enter 
> the mount namespace of pid 21591: Pid 21591 does not exist
> Oct 07 02:05:55 ip-10-10-0-207 mesos-agent[31520]: I1007 02:05:55.894485 
> 31531 containerizer.cpp:1931] Destroying container 
> cb92634b-42b3-40f3-94f7-609f89a362bc.46d884e4-d0eb-4572-be1d-24414df7cb2e in 
> ISOLATING state
> Oct 07 02:05:55 ip-10-10-0-207 mesos-agent[31520]: I1007 02:05:55.894439 
> 31531 containerizer.cpp:2300] Container 
> cb92634b-42b3-40f3-94f7-609f89a362bc.46d884e4-d0eb-4572-be1d-24414df7cb2e has 
> exited
> Oct 07 02:05:55 ip-10-10-0-207 mesos-agent[31520]: I1007 02:05:55.854456 
> 31534 systemd.cpp:96] Assigned child process '21591' to 
> 'mesos_executors.slice'
> Oct 07 02:05:55 ip-10-10-0-207 mesos-agent[31520]: W1007 02:05:55.831861 
> 21580 process.cpp:882] Failed SSL connections will be downgraded to a non-SSL 
> socket
> Oct 07 02:05:55 ip-10-10-0-207 mesos-agent[31520]: NOTE: Set 
> LIBPROCESS_SSL_REQUIRE_CERT=1 to require peer certificate verification
> Oct 07 02:05:55 ip-10-10-0-207 mesos-agent[31520]: I1007 02:05:55.831526 
> 21580 openssl.cpp:432] Will only verify peer certificate if presented!
> Oct 07 02:05:55 ip-10-10-0-207 mesos-agent[31520]: NOTE: Set 
> LIBPROCESS_SSL_VERIFY_CERT=1 to enable peer certificate verification
> Oct 07 02:05:55 ip-10-10-0-207 mesos-agent[31520]: I1007 02:05:55.831521 
> 21580 openssl.cpp:426] Will not verify peer certificate!
> Oct 07 02:05:55 ip-10-10-0-207 mesos-agent[31520]: I1007 02:05:55.831511 
> 21580 openssl.cpp:421] CA directory path unspecified! NOTE: Set CA directory 
> path with LIBPROCESS_SSL_CA_DIR=
> Oct 07 02:05:55 ip-10-10-0-207 mesos-agent[31520]: W1007 02:05:55.831405 
> 21580 openssl.cpp:399] Failed SSL connections will be downgraded to a non-SSL 
> socket
> Oct 07 02:05:55 ip-10-10-0-207 mesos-agent[31520]: WARNING: Logging before 
> InitGoogleLogging() is written to STDERR
> Oct 07 02:05:55 ip-10-10-0-207 mesos-agent[31520]: W1007 02:05:55.828413 
> 21581 process.cpp:882] Failed SSL connections will be downgraded to a non-SSL 
> socket
> Oct 07 02:05:55 ip-10-10-0-207 mesos-agent[31520]: NOTE: Set 
> LIBPROCESS_SSL_REQUIRE_CERT=1 to require peer certificate verification
> ```
> The above log is in reverse chronological order, so please read it bottom up.
> The relevant log entry is:
> ```
> http.cpp:1948] Failed to launch nested container 
> cb92634b-42b3-40f3-94f7-609f89a362bc.46d884e4-d0eb-4572-be1d-24414df7cb2e: 
> Collect failed: Failed to seed container 
> cb92634b-42b3-40f3-94f7-609f89a362bc.46d884e4-d0eb-4572-be1d-24414df7cb2e: 
> Collect failed: Failed to setup hostname and network files: Failed to enter 
> the mount namespace of pid 21591: Pid 21591 does not exist
> ```
> Looks like the nested container failed to launch because the `isolate` call 
> to the `network/cni` isolator failed. It seems that by the time the isolator 
> received the `isolate` call, the process of the nested container had already 
> exited, so the isolator couldn't enter its mount namespace to set up the 
> network files. 
> The odd thing here is that the nested container would have been frozen, and 
> hence was not running, so I am not sure what killed the nested container. My 
> suspicion falls on systemd, since I also see this log message:
> ```
> Oct 07 18:02:31 ip-10-10-0-207 mesos-agent[31520]: I1007 18:02:31.473656 
> 31532 systemd.cpp:96] Assigned child process '1596' to 'mesos_executors.slice'
> ```





[jira] [Updated] (MESOS-6376) Add documentation for capabilities support of the mesos containerizer

2016-10-14 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-6376:
---
Priority: Blocker  (was: Major)

> Add documentation for capabilities support of the mesos containerizer
> -
>
> Key: MESOS-6376
> URL: https://issues.apache.org/jira/browse/MESOS-6376
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Benjamin Bannier
>Assignee: Benjamin Bannier
>Priority: Blocker
>  Labels: mesosphere
>






[jira] [Commented] (MESOS-6283) Fix the Web UI allowing access to the task sandbox for nested containers.

2016-10-14 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15575101#comment-15575101
 ] 

haosdent commented on MESOS-6283:
-

I think this is a blocker, let me ping [~vinodkone] today again.

> Fix the Web UI allowing access to the task sandbox for nested containers.
> -
>
> Key: MESOS-6283
> URL: https://issues.apache.org/jira/browse/MESOS-6283
> Project: Mesos
>  Issue Type: Bug
>  Components: webui
>Reporter: Anand Mazumdar
>Assignee: haosdent
>Priority: Blocker
>  Labels: mesosphere
> Attachments: sandbox.gif
>
>
> Currently, the sandbox button for a child task is broken on the WebUI. It 
> does nothing and dies with an error that the executor for this task cannot be 
> found. We need to fix the WebUI to follow the symlink "tasks/taskId" and 
> display the task sandbox to the users.





[jira] [Updated] (MESOS-6134) Port CFS quota support to Docker Containerizer using command executor.

2016-10-14 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-6134:
---
Target Version/s: 1.2.0  (was: 1.1.0)

> Port CFS quota support to Docker Containerizer using command executor.
> --
>
> Key: MESOS-6134
> URL: https://issues.apache.org/jira/browse/MESOS-6134
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization, docker
>Reporter: Zhitao Li
>Assignee: Zhitao Li
>
> MESOS-2154 only partially fixed the CFS quota support in Docker 
> Containerizer: that fix only works for custom executors.
> This tracks the fix for the command executor so we can declare this complete.





[jira] [Commented] (MESOS-6134) Port CFS quota support to Docker Containerizer using command executor.

2016-10-14 Thread Alexander Rukletsov (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15575096#comment-15575096
 ] 

Alexander Rukletsov commented on MESOS-6134:


This issue is delaying the 1.1.0 release and has shown no progress in the last 
few days. It is retargeted for 1.2.0.

> Port CFS quota support to Docker Containerizer using command executor.
> --
>
> Key: MESOS-6134
> URL: https://issues.apache.org/jira/browse/MESOS-6134
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization, docker
>Reporter: Zhitao Li
>Assignee: Zhitao Li
>
> MESOS-2154 only partially fixed the CFS quota support in Docker 
> Containerizer: that fix only works for custom executors.
> This tracks the fix for the command executor so we can declare this complete.





[jira] [Commented] (MESOS-6335) Add user doc for task group tasks

2016-10-14 Thread Alexander Rukletsov (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15575094#comment-15575094
 ] 

Alexander Rukletsov commented on MESOS-6335:


Is it still to land in 1.1.0?

> Add user doc for task group tasks
> -
>
> Key: MESOS-6335
> URL: https://issues.apache.org/jira/browse/MESOS-6335
> Project: Mesos
>  Issue Type: Documentation
>Reporter: Vinod Kone
>Assignee: Vinod Kone
>






[jira] [Updated] (MESOS-2449) Support group of tasks (Pod) constructs and API in Mesos.

2016-10-14 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-2449:
---
Priority: Blocker  (was: Major)

> Support group of tasks (Pod) constructs and API in Mesos.
> -
>
> Key: MESOS-2449
> URL: https://issues.apache.org/jira/browse/MESOS-2449
> Project: Mesos
>  Issue Type: Epic
>Reporter: Timothy Chen
>Priority: Blocker
>  Labels: mesosphere
>
> There is a common need among different frameworks to start a group of tasks 
> that either depend on each other or are co-located with each other.
> Although a framework can schedule individual tasks within the same offer and 
> slave id, it doesn't have a way to describe dependencies, failure policies 
> (if one of the tasks fails), network setup, group container information, 
> etc.
> I want to create an epic to start the discussion around the requirements 
> folks have, and see where we can take this.





[jira] [Commented] (MESOS-6344) Allow `network/cni` isolator to take a search path for CNI plugins instead of single directory

2016-10-14 Thread Till Toenshoff (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15575042#comment-15575042
 ] 

Till Toenshoff commented on MESOS-6344:
---

from [~avin...@mesosphere.io] via slack:

@till @alexr we are still reviewing MESOS-6344 and MESOS-6023. Jie has done 
one pass and we need to re-factor some code to get it right. My estimate is by 
EOD tomorrow (Friday). MESOS-6014 is just the epic, so will close it out once 
we are done with MESOS-6344 and MESOS-6023. MESOS-6040 is just enabling the 
port-mapper plugin in the CMake build. Should be accomplished by EOD tomorrow 
as well.

> Allow `network/cni` isolator to take a search path for CNI plugins instead of 
> single directory
> --
>
> Key: MESOS-6344
> URL: https://issues.apache.org/jira/browse/MESOS-6344
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Avinash Sridharan
>Assignee: Avinash Sridharan
>Priority: Blocker
>  Labels: mesosphere
>
> Currently the `network/cni` isolator expects a single directory in the 
> `--network_cni_plugins_dir` flag. This is very limiting because it forces the 
> operator to put all the CNI plugins in the same directory. 
> With the Mesos port-mapper CNI plugin this would also imply that the operator 
> would have to move this plugin from the Mesos installation directory to the 
> directory specified in `--network_cni_plugins_dir`. 
> To simplify the operator's experience it would make sense for the 
> `--network_cni_plugins_dir` flag to take a set of directories instead of a 
> single directory. The `network/cni` isolator can then search this set of 
> directories to find the CNI plugin.
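
The search described above could look roughly like the following. This is a sketch under assumptions, not the isolator's code: `find_plugin` is a hypothetical helper, and the separator between directories is taken to be the platform's path-list separator (`os.pathsep`), which the issue does not specify.

```python
import os
import tempfile


def find_plugin(name, search_path):
    """Return the first executable file called `name` on search_path, or None."""
    for directory in search_path.split(os.pathsep):
        if not directory:
            continue  # tolerate empty entries such as a trailing separator
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if __name__ == "__main__":
    # Two temporary directories stand in for an operator's plugin
    # directory and the Mesos installation directory.
    with tempfile.TemporaryDirectory() as custom, \
            tempfile.TemporaryDirectory() as bundled:
        plugin = os.path.join(bundled, "mesos-port-mapper")
        with open(plugin, "w") as handle:
            handle.write("#!/bin/sh\n")
        os.chmod(plugin, 0o755)
        print(find_plugin("mesos-port-mapper", os.pathsep.join([custom, bundled])))
```

Directories are tried in order, so an operator could place an override directory ahead of the bundled plugins.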





[jira] [Commented] (MESOS-6040) Add a CMake build for `mesos-port-mapper`

2016-10-14 Thread Till Toenshoff (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15575041#comment-15575041
 ] 

Till Toenshoff commented on MESOS-6040:
---

via slack from [~avin...@mesosphere.io];
@till @alexr we are still reviewing MESOS-6344 and MESOS-6023. Jie has done 
one pass and we need to re-factor some code to get it right. My estimate is by 
EOD tomorrow (Friday). MESOS-6014 is just the epic, so will close it out once 
we are done with MESOS-6344 and MESOS-6023. MESOS-6040 is just enabling the 
port-mapper plugin in the CMake build. Should be accomplished by EOD tomorrow 
as well.

> Add a CMake build for `mesos-port-mapper`
> -
>
> Key: MESOS-6040
> URL: https://issues.apache.org/jira/browse/MESOS-6040
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Avinash Sridharan
>Assignee: Avinash Sridharan
>Priority: Blocker
>  Labels: mesosphere
>
> Once the port-mapper binary compiles with GNU make, we need to modify the 
> CMake build to build the port-mapper binary as well. 





[jira] [Updated] (MESOS-6360) The handling of whiteout files in provisioner is not correct

2016-10-14 Thread Till Toenshoff (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Toenshoff updated MESOS-6360:
--
Target Version/s: 1.2.0  (was: 1.1.0)

This issue is delaying the 1.1.0 release and has shown no progress in the last 
few days. It has been retargeted to 1.2.0.

> The handling of whiteout files in provisioner is not correct
> 
>
> Key: MESOS-6360
> URL: https://issues.apache.org/jira/browse/MESOS-6360
> Project: Mesos
>  Issue Type: Bug
>Reporter: Qian Zhang
>Assignee: Qian Zhang
>Priority: Blocker
>
> Currently, when a user launches a container from a Docker image via the 
> universal containerizer, we always handle the whiteout files in 
> {{ProvisionerProcess::__provision()}}, regardless of which backend is used.
> However, this is not correct, because whiteout handling is backend 
> dependent: different backends need to handle whiteout files in different 
> ways, e.g.:
> * AUFS backend: The AUFS whiteout format ({{.wh.}} and 
> {{.wh..wh..opq}}) appears to be the whiteout standard in Docker (see [this comment | 
> https://github.com/docker/docker/blob/v1.12.1/pkg/archive/archive.go#L259:L262]
>  for details). After a Docker image is pulled, its whiteout files in the 
> store are therefore already in AUFS format, and we do not need to do 
> anything, because the AUFS mount done in 
> {{AufsBackendProcess::provision()}} will handle them automatically.
> * Overlay backend: Overlayfs has its own whiteout format (see [this doc | 
> https://www.kernel.org/doc/Documentation/filesystems/overlayfs.txt] for 
> details), so we need to convert the AUFS whiteout files to overlayfs whiteout 
> files before we do the overlay mount in {{OverlayBackendProcess::provision}}, 
> which will then handle the overlayfs whiteout files automatically.
> * Copy backend: We need to handle the AUFS whiteout files manually when we 
> copy each layer in {{CopyBackendProcess::_provision()}}.
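
The backend-specific handling described above starts from the same step: finding the AUFS-style whiteout markers in an unpacked layer. The following is a minimal Python sketch of that step only, not the Mesos C++ implementation. The function name `plan_whiteout_conversion` and the action labels are hypothetical; the marker names ({{.wh.}} prefix, {{.wh..wh..opq}}) are the ones mentioned in the issue, and skipping other {{.wh..wh.}} entries as AUFS metadata is an assumption based on Docker's convention.

```python
import os
from typing import List, Tuple

WHITEOUT_PREFIX = ".wh."            # ".wh.<name>" marks "<name>" as deleted
META_PREFIX = ".wh..wh."            # assumed: other AUFS bookkeeping entries
OPAQUE_MARKER = ".wh..wh..opq"      # marks the containing directory as opaque


def plan_whiteout_conversion(layer_root: str) -> List[Tuple[str, str]]:
    """Scan an unpacked layer and list what a backend would have to do.

    Returns sorted (action, path) pairs, where action is either
    "mark_opaque" (path is a directory) or "whiteout" (path is the
    entry hidden by the whiteout). Nothing on disk is modified.
    """
    actions = []
    for dirpath, _dirnames, filenames in os.walk(layer_root):
        for filename in filenames:
            if filename == OPAQUE_MARKER:
                actions.append(("mark_opaque", dirpath))
            elif filename.startswith(META_PREFIX):
                # Other AUFS metadata: ignored in this sketch.
                continue
            elif filename.startswith(WHITEOUT_PREFIX):
                hidden = filename[len(WHITEOUT_PREFIX):]
                actions.append(("whiteout", os.path.join(dirpath, hidden)))
    return sorted(actions)
```

Given such a plan, an overlay backend would replace each "whiteout" entry with a 0/0 character device and set the `trusted.overlay.opaque` xattr on each "mark_opaque" directory (as described in the overlayfs documentation linked above), while a copy backend would delete the hidden paths from the lower layers as it copies. The AUFS backend would not need the plan at all.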





[jira] [Updated] (MESOS-6014) Create a CNI plugin that provides port mapping functionality for various CNI plugins.

2016-10-14 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-6014:
---
Priority: Blocker  (was: Major)

> Create a CNI plugin that provides port mapping functionality for various CNI 
> plugins.
> -
>
> Key: MESOS-6014
> URL: https://issues.apache.org/jira/browse/MESOS-6014
> Project: Mesos
>  Issue Type: Epic
>  Components: containerization
> Environment: Linux
>Reporter: Avinash Sridharan
>Assignee: Avinash Sridharan
>Priority: Blocker
>  Labels: mesosphere
>
> Currently there is no CNI plugin that supports port mapping. Given that the 
> unified containerizer is becoming the de facto container runtime, 
> a CNI plugin that provides port mapping is a must-have. This is 
> primarily required to support BRIDGE networking mode, similar to the Docker 
> bridge networking that users expect when using Docker containers. 
> While the most obvious use case is using the port-mapper plugin with 
> the bridge plugin, the port-mapping functionality itself is generic and 
> should be usable with any CNI plugin that needs it.
> Keeping port mapping as a CNI plugin gives operators the ability to use the 
> default port-mapper (CNI plugin) that Mesos provides, or to use their own plugin.





[jira] [Updated] (MESOS-6040) Add a CMake build for `mesos-port-mapper`

2016-10-14 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-6040:
---
Priority: Blocker  (was: Major)

> Add a CMake build for `mesos-port-mapper`
> -
>
> Key: MESOS-6040
> URL: https://issues.apache.org/jira/browse/MESOS-6040
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Avinash Sridharan
>Assignee: Avinash Sridharan
>Priority: Blocker
>  Labels: mesosphere
>
> Once the port-mapper binary compiles with GNU make, we need to modify the 
> CMake build to build the port-mapper binary as well. 





[jira] [Updated] (MESOS-6023) Create a binary for the port-mapper plugin

2016-10-14 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-6023:
---
Priority: Blocker  (was: Major)

> Create a binary for the port-mapper plugin
> --
>
> Key: MESOS-6023
> URL: https://issues.apache.org/jira/browse/MESOS-6023
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
> Environment: Linux
>Reporter: Avinash Sridharan
>Assignee: Avinash Sridharan
>Priority: Blocker
>
> The CNI port mapper plugin needs to be a separate binary that will be invoked 
> by the `network/cni` isolator as a CNI plugin.





[jira] [Updated] (MESOS-6344) Allow `network/cni` isolator to take a search path for CNI plugins instead of single directory

2016-10-14 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-6344:
---
Priority: Blocker  (was: Major)

> Allow `network/cni` isolator to take a search path for CNI plugins instead of 
> single directory
> --
>
> Key: MESOS-6344
> URL: https://issues.apache.org/jira/browse/MESOS-6344
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Avinash Sridharan
>Assignee: Avinash Sridharan
>Priority: Blocker
>  Labels: mesosphere
>
> Currently the `network/cni` isolator expects a single directory in the 
> `--network_cni_plugins_dir` flag. This is very limiting because it forces the 
> operator to put all the CNI plugins in the same directory. 
> With the Mesos port-mapper CNI plugin, this would also imply that the operator 
> would have to move this plugin from the Mesos installation directory to the 
> directory specified in `--network_cni_plugins_dir`. 
> To simplify the operator's experience, it would make sense for the 
> `--network_cni_plugins_dir` flag to take a set of directories instead of a 
> single directory. The `network/cni` isolator can then search this set of 
> directories to find the CNI plugin.
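
The search-path lookup proposed above could look roughly like the following Python sketch. It is illustrative only and not the isolator's actual C++ code: the function names `parse_plugin_dirs` and `find_cni_plugin` are hypothetical, and the `:` separator is an assumption borrowed from `PATH`, since the issue does not say how the set of directories would be encoded in the flag.

```python
import os
from typing import Iterable, List, Optional


def parse_plugin_dirs(flag_value: str, separator: str = ":") -> List[str]:
    """Split a flag value into directories, dropping empty entries.

    The ':' separator is an assumption (PATH-style), not taken from Mesos.
    """
    return [d for d in flag_value.split(separator) if d]


def find_cni_plugin(name: str, search_dirs: Iterable[str]) -> Optional[str]:
    """Return the first executable file called `name` in `search_dirs`.

    Directories are tried in order, so earlier entries take precedence.
    Returns None when no directory contains a matching executable.
    """
    for directory in search_dirs:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None
```

With a lookup like this, an operator could list both the Mesos installation directory and a site-specific plugin directory in the flag, and the port-mapper plugin would be found without being moved.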





[jira] [Comment Edited] (MESOS-3421) Support sharing of resources across task instances

2016-10-14 Thread Yan Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15574412#comment-15574412
 ] 

Yan Xu edited comment on MESOS-3421 at 10/14/16 6:28 AM:
-

Resolving it as all sub-issues are resolved. Further improvements are tracked 
by MESOS-6372.


was (Author: xujyan):
Resolving it as all sub-issues are solved. Further improvements are tracked by 
MESOS-6372.

> Support sharing of resources across task instances
> --
>
> Key: MESOS-3421
> URL: https://issues.apache.org/jira/browse/MESOS-3421
> Project: Mesos
>  Issue Type: Epic
>  Components: isolation, volumes
>Reporter: Anindya Sinha
>Assignee: Anindya Sinha
>  Labels: external-volumes, persistent-volumes
> Fix For: 1.1.0
>
>
> A service that needs a persistent volume needs access to the same 
> persistent volume (RW) from multiple task instances on the same agent 
> node. Currently, once a persistent volume is offered to the framework(s), it can be 
> scheduled to a task, and until that task terminates, that persistent volume 
> cannot be used by another task.
> Explore providing the capability of sharing persistent volumes across task 
> instances scheduled on a single agent node.
> Based on discussion within the community, we would allow sharing of resources 
> in general, and add support to enable shareability for persistent volumes.


