[jira] [Commented] (MESOS-5377) Improve DRF behavior with scarce resources.

2016-05-13 Thread Klaus Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283415#comment-15283415
 ] 

Klaus Ma commented on MESOS-5377:
-

I think we can ignore resources that are 100% used, so the dominant resource 
would then become CPU or MEM.

> Improve DRF behavior with scarce resources.
> ---
>
> Key: MESOS-5377
> URL: https://issues.apache.org/jira/browse/MESOS-5377
> Project: Mesos
>  Issue Type: Epic
>  Components: allocation
>Reporter: Benjamin Mahler
>
> The allocator currently uses the notion of Weighted [Dominant Resource 
> Fairness|https://www.cs.berkeley.edu/~alig/papers/drf.pdf] (WDRF) to 
> establish a linear notion of fairness across allocation roles.
> DRF behaves well for resources that are present within each machine in a 
> cluster (e.g. CPUs, memory, disk). However, some resources (e.g. GPUs) are 
> only present on a subset of machines in the cluster.
> Consider the behavior when there are the following agents in a cluster:
> 1000 agents with (cpus:4,mem:1024,disk:1024)
> 1 agent with (gpus:1,cpus:4,mem:1024,disk:1024)
> If a role wishes to use both GPU and non-GPU resources for tasks, consuming 1 
> GPU will lead DRF to consider the role to have a 100% share of the cluster, 
> since it consumes 100% of the GPUs in the cluster. This framework will then 
> not receive any other offers.
> Among possible improvements, fairness could be made aware of resource 
> packages. In a sense there is 1 GPU package that is competed on and 1000 
> non-GPU packages competed on, and ideally a role's consumption of the single 
> GPU package does not have a large effect on the role's access to the other 
> 1000 non-GPU packages.
> In the interim, we should consider having a recommended way to deal with 
> scarce resources in the current model.
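
To make the arithmetic in the description concrete, here is a minimal Python sketch of an unweighted dominant-share calculation for the example cluster. It is not the Mesos allocator code: the function name `dominant_share` and the `ignore` parameter are hypothetical, and `ignore` only illustrates the idea from the comment above of leaving a fully used resource out of the calculation.

```python
def dominant_share(allocated, total, ignore=()):
    """Return the largest per-resource share held by a role.

    `ignore` is a hypothetical knob for resources that should not
    count towards the dominant share (e.g. a fully used scarce one).
    """
    shares = {
        name: allocated.get(name, 0) / capacity
        for name, capacity in total.items()
        if capacity > 0 and name not in ignore
    }
    return max(shares.values(), default=0.0)


# Cluster from the description: 1000 plain agents plus 1 GPU agent.
total = {"cpus": 1001 * 4, "mem": 1001 * 1024, "disk": 1001 * 1024, "gpus": 1}

# A role that holds everything on the single GPU agent.
role = {"gpus": 1, "cpus": 4, "mem": 1024, "disk": 1024}

# The single GPU pushes the role's dominant share to 100%.
share_with_gpu = dominant_share(role, total)

# Without the GPU the role holds about 0.1% of the cluster.
share_without_gpu = dominant_share(role, total, ignore={"gpus"})
```

With the GPU counted, the role looks like it owns the whole cluster even though it holds 1 of 1001 agents' worth of CPU, memory and disk, which is the behavior the ticket describes.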



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-5380) Killing a queued task can cause the corresponding command executor to never terminate.

2016-05-13 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-5380:
--
Description: 
We observed this in our testing environment. Sequence of events:

1) A command task is queued since the executor has not registered yet.
2) The framework issues a killTask.
3) Since executor is in REGISTERING state, agent calls 
`statusUpdate(TASK_KILLED, UPID())`
4) `statusUpdate` now will call `containerizer->status()` before calling 
`executor->terminateTask(status.task_id(), status);` which will remove the 
queued task. (Introduced in this patch: https://reviews.apache.org/r/43258).
5) Since the above is async, it's possible that the task is still in the 
queued tasks when we check whether to kill the unregistered executor in 
`killTask`:
{code}
  // TODO(jieyu): Here, we kill the executor if it no longer has
  // any task to run and has not yet registered. This is a
  // workaround for those single task executors that do not have a
  // proper self terminating logic when they haven't received the
  // task within a timeout.
  if (executor->queuedTasks.empty()) {
    CHECK(executor->launchedTasks.empty())
        << " Unregistered executor '" << executor->id
        << "' has launched tasks";

    LOG(WARNING) << "Killing the unregistered executor " << *executor
                 << " because it has no tasks";

    executor->state = Executor::TERMINATING;

    containerizer->destroy(executor->containerId);
  }
{code}

6) Consequently, the executor will never be terminated by Mesos.
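
The ordering problem in steps 4 to 6 can be modelled with a small Python asyncio sketch. This is a simplified stand-in and not the agent code: the `Executor` class, `status_update` and `kill_task` below are hypothetical names that only mirror the sequence above, with `asyncio.sleep(0)` standing in for the asynchronous `containerizer->status()` call.

```python
import asyncio


class Executor:
    """Toy model of an unregistered executor with one queued task."""

    def __init__(self):
        self.queued_tasks = {"task-1"}
        self.state = "REGISTERING"


async def status_update(executor, task_id):
    # Stand-in for the asynchronous status lookup that now runs
    # before the queued task is removed.
    await asyncio.sleep(0)
    executor.queued_tasks.discard(task_id)


async def kill_task(executor, task_id):
    # The update is scheduled but has not run yet.
    update = asyncio.create_task(status_update(executor, task_id))

    # This check runs first and still sees the queued task,
    # so the executor is never moved to TERMINATING.
    if not executor.queued_tasks:
        executor.state = "TERMINATING"

    await update
    return executor


def run_race():
    return asyncio.run(kill_task(Executor(), "task-1"))
```

After `run_race()` the queue is empty but the state is still `REGISTERING`: nothing re-evaluates the emptiness check once the update completes, which matches step 6.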

Attaching the relevant agent log:
{noformat}
May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: 
I0513 15:36:13.640527  1342 slave.cpp:1361] Got assigned task 
mesosvol.6ccd993c-1920-11e6-a722-9648cb19afd6 for framework 
a3ad8418-cb77-4705-b353-4b514ceca52c-
May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: 
I0513 15:36:13.641034  1342 slave.cpp:1480] Launching task 
mesosvol.6ccd993c-1920-11e6-a722-9648cb19afd6 for framework 
a3ad8418-cb77-4705-b353-4b514ceca52c-
May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: 
I0513 15:36:13.641440  1342 paths.cpp:528] Trying to chown 
'/var/lib/mesos/slave/slaves/a3ad8418-cb77-4705-b353-4b514ceca52c-S0/frameworks/a3ad8418-cb77-4705-b353-4b514ceca52c-/executors/mesosvol.6ccd993c-1920-11e6-a722-9648cb19afd6/runs/24762d43-2134-475e-b724-caa72110497a'
 to user 'root'
May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: 
I0513 15:36:13.644664  1342 slave.cpp:5389] Launching executor 
mesosvol.6ccd993c-1920-11e6-a722-9648cb19afd6 of framework 
a3ad8418-cb77-4705-b353-4b514ceca52c- with resources cpus(*):0.1; mem(*):32 
in work directory 
'/var/lib/mesos/slave/slaves/a3ad8418-cb77-4705-b353-4b514ceca52c-S0/frameworks/a3ad8418-cb77-4705-b353-4b514ceca52c-/executors/mesosvol.6ccd993c-1920-11e6-a722-9648cb19afd6/runs/24762d43-2134-475e-b724-caa72110497a'
May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: 
I0513 15:36:13.645195  1342 slave.cpp:1698] Queuing task 
'mesosvol.6ccd993c-1920-11e6-a722-9648cb19afd6' for executor 
'mesosvol.6ccd993c-1920-11e6-a722-9648cb19afd6' of framework 
a3ad8418-cb77-4705-b353-4b514ceca52c-
May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: 
I0513 15:36:13.645491  1338 containerizer.cpp:671] Starting container 
'24762d43-2134-475e-b724-caa72110497a' for executor 
'mesosvol.6ccd993c-1920-11e6-a722-9648cb19afd6' of framework 
'a3ad8418-cb77-4705-b353-4b514ceca52c-'
May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: 
I0513 15:36:13.647897  1345 cpushare.cpp:389] Updated 'cpu.shares' to 1126 
(cpus 1.1) for container 24762d43-2134-475e-b724-caa72110497a
May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: 
I0513 15:36:13.648619  1345 cpushare.cpp:411] Updated 'cpu.cfs_period_us' to 
100ms and 'cpu.cfs_quota_us' to 110ms (cpus 1.1) for container 
24762d43-2134-475e-b724-caa72110497a
May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: 
I0513 15:36:13.650180  1341 mem.cpp:602] Started listening for OOM events for 
container 24762d43-2134-475e-b724-caa72110497a
May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: 
I0513 15:36:13.650718  1341 mem.cpp:722] Started listening on low memory 
pressure events for container 24762d43-2134-475e-b724-caa72110497a
May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: 
I0513 15:36:13.651147  1341 mem.cpp:722] Started listening on medium memory 
pressure events for container 24762d43-2134-475e-b724-caa72110497a
May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: 
I0513 15:36:13.651599  1341 mem.cpp:722] Started listening on critical memory 
pressure events for container 

[jira] [Updated] (MESOS-5380) Killing a queued task can cause the corresponding command executor to never terminate.

2016-05-13 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-5380:
--
Description: 
We observed this in our testing environment. Sequence of events:

1) A command task is queued since the executor has not registered yet.
2) The framework issues a killTask.
3) Since executor is in REGISTERING state, agent calls 
`statusUpdate(TASK_KILLED, UPID())`
4) `statusUpdate` now will call `containerizer->status()` before calling 
`executor->terminateTask(status.task_id(), status);` which will remove the 
queued task. (Introduced in this patch: https://reviews.apache.org/r/43258).
5) Since the above is async, it's possible that the task is still in the 
queued tasks when we check whether to kill the unregistered executor in 
`killTask`:
{code}
  // TODO(jieyu): Here, we kill the executor if it no longer has
  // any task to run and has not yet registered. This is a
  // workaround for those single task executors that do not have a
  // proper self terminating logic when they haven't received the
  // task within a timeout.
  if (executor->queuedTasks.empty()) {
    CHECK(executor->launchedTasks.empty())
        << " Unregistered executor '" << executor->id
        << "' has launched tasks";

    LOG(WARNING) << "Killing the unregistered executor " << *executor
                 << " because it has no tasks";

    executor->state = Executor::TERMINATING;

    containerizer->destroy(executor->containerId);
  }
{code}

6) Consequently, the executor will never be terminated by Mesos.

Attaching the relevant agent log:
{noformat}
May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: 
I0513 15:36:13.640527  1342 slave.cpp:1361] Got assigned task 
mesosvol.6ccd993c-1920-11e6-a722-9648cb19afd6 for framework 
a3ad8418-cb77-4705-b353-4b514ceca52c-
May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: 
I0513 15:36:13.641034  1342 slave.cpp:1480] Launching task 
mesosvol.6ccd993c-1920-11e6-a722-9648cb19afd6 for framework 
a3ad8418-cb77-4705-b353-4b514ceca52c-
May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: 
I0513 15:36:13.641440  1342 paths.cpp:528] Trying to chown 
'/var/lib/mesos/slave/slaves/a3ad8418-cb77-4705-b353-4b514ceca52c-S0/frameworks/a3ad8418-cb77-4705-b353-4b514ceca52c-/executors/mesosvol.6ccd993c-1920-11e6-a722-9648cb19afd6/runs/24762d43-2134-475e-b724-caa72110497a'
 to user 'root'
May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: 
I0513 15:36:13.644664  1342 slave.cpp:5389] Launching executor 
mesosvol.6ccd993c-1920-11e6-a722-9648cb19afd6 of framework 
a3ad8418-cb77-4705-b353-4b514ceca52c- with resources cpus(*):0.1; mem(*):32 
in work directory 
'/var/lib/mesos/slave/slaves/a3ad8418-cb77-4705-b353-4b514ceca52c-S0/frameworks/a3ad8418-cb77-4705-b353-4b514ceca52c-/executors/mesosvol.6ccd993c-1920-11e6-a722-9648cb19afd6/runs/24762d43-2134-475e-b724-caa72110497a'
May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: 
I0513 15:36:13.645195  1342 slave.cpp:1698] Queuing task 
'mesosvol.6ccd993c-1920-11e6-a722-9648cb19afd6' for executor 
'mesosvol.6ccd993c-1920-11e6-a722-9648cb19afd6' of framework 
a3ad8418-cb77-4705-b353-4b514ceca52c-
May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: 
I0513 15:36:13.645491  1338 containerizer.cpp:671] Starting container 
'24762d43-2134-475e-b724-caa72110497a' for executor 
'mesosvol.6ccd993c-1920-11e6-a722-9648cb19afd6' of framework 
'a3ad8418-cb77-4705-b353-4b514ceca52c-'
May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: 
I0513 15:36:13.647897  1345 cpushare.cpp:389] Updated 'cpu.shares' to 1126 
(cpus 1.1) for container 24762d43-2134-475e-b724-caa72110497a
May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: 
I0513 15:36:13.648619  1345 cpushare.cpp:411] Updated 'cpu.cfs_period_us' to 
100ms and 'cpu.cfs_quota_us' to 110ms (cpus 1.1) for container 
24762d43-2134-475e-b724-caa72110497a
May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: 
I0513 15:36:13.650180  1341 mem.cpp:602] Started listening for OOM events for 
container 24762d43-2134-475e-b724-caa72110497a
May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: 
I0513 15:36:13.650718  1341 mem.cpp:722] Started listening on low memory 
pressure events for container 24762d43-2134-475e-b724-caa72110497a
May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: 
I0513 15:36:13.651147  1341 mem.cpp:722] Started listening on medium memory 
pressure events for container 24762d43-2134-475e-b724-caa72110497a
May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: 
I0513 15:36:13.651599  1341 mem.cpp:722] Started listening on critical memory 

[jira] [Commented] (MESOS-5302) Consider adding an Executor Shim/Adapter for the new/old API

2016-05-13 Thread Anand Mazumdar (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283250#comment-15283250
 ] 

Anand Mazumdar commented on MESOS-5302:
---

Review Chain: https://reviews.apache.org/r/47363/

> Consider adding an Executor Shim/Adapter for the new/old API
> 
>
> Key: MESOS-5302
> URL: https://issues.apache.org/jira/browse/MESOS-5302
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Anand Mazumdar
>Assignee: Anand Mazumdar
>  Labels: mesosphere
>
> Currently, the business logic for the HTTP-based command executor and the 
> driver-based command executor lives in two different files. As features are 
> added to the executor or bugs are discovered in it, changes need to be made in 
> both places. It would be nice to have some kind of shim/adapter that 
> abstracts away the underlying library details from the executor. The 
> executor could then toggle between the driver and the new API via an 
> environment variable.





[jira] [Updated] (MESOS-5005) Enforce that DiskInfo principal is equal to framework/operator principal

2016-05-13 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-5005:
-
Sprint: Mesosphere Sprint 32, Mesosphere Sprint 35  (was: Mesosphere Sprint 
32)

> Enforce that DiskInfo principal is equal to framework/operator principal
> 
>
> Key: MESOS-5005
> URL: https://issues.apache.org/jira/browse/MESOS-5005
> Project: Mesos
>  Issue Type: Bug
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: mesosphere, persistent-volumes, reservations
> Fix For: 0.29.0
>
>
> Currently, we require that {{ReservationInfo.principal}} be equal to the 
> principal provided for authentication, which means that when HTTP 
> authentication is disabled this field cannot be set. Based on comments in 
> 'mesos.proto', the original intention was to enforce this same constraint for 
> {{Persistence.principal}}, but it seems that we don't enforce it. This should 
> be changed to make the two fields equivalent, with one exception: when the 
> framework/operator principal is {{None}}, we should allow the principal in 
> {{DiskInfo}} to take any value, along the same lines as MESOS-5212.
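
The rule proposed in the description can be written down as a short Python sketch. It is an assumption-laden illustration, not the master's validation code: the function name `disk_principal_allowed` and its parameters are hypothetical, and `None` stands for "no framework/operator principal".

```python
from typing import Optional


def disk_principal_allowed(disk_principal: Optional[str],
                           request_principal: Optional[str]) -> bool:
    """Check a DiskInfo principal against the framework/operator principal."""
    # Exception from the ticket: with no framework/operator principal,
    # the principal in DiskInfo may take any value.
    if request_principal is None:
        return True

    # Otherwise the two principals must be equal.
    return disk_principal == request_principal
```

Under this sketch a request authenticated as `ops` may only create volumes whose principal is `ops`, while an unauthenticated request may set any value.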





[jira] [Updated] (MESOS-5380) Killing a queued task can cause the corresponding command executor to never terminate.

2016-05-13 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-5380:
--
Shepherd: Jie Yu
  Sprint: Mesosphere Sprint 35
Story Points: 3

> Killing a queued task can cause the corresponding command executor to never 
> terminate.
> --
>
> Key: MESOS-5380
> URL: https://issues.apache.org/jira/browse/MESOS-5380
> Project: Mesos
>  Issue Type: Bug
>  Components: slave
>Affects Versions: 0.28.0, 0.28.1
>Reporter: Jie Yu
>Assignee: Vinod Kone
>Priority: Blocker
>  Labels: mesosphere
> Fix For: 0.29.0, 0.28.2
>
>
> We observed this in our testing environment. Sequence of events:
> 1) A command task is queued since the executor has not registered yet.
> 2) The framework issues a killTask.
> 3) Since executor is in REGISTERING state, agent calls 
> `statusUpdate(TASK_KILLED, UPID())`
> 4) `statusUpdate` now will call `containerizer->status()` before calling 
> `executor->terminateTask(status.task_id(), status);` which will remove the 
> queued task. (Introduced in this patch: https://reviews.apache.org/r/43258).
> 5) Since the above is async, it's possible that the task is still in the 
> queued tasks when we check whether to kill the unregistered executor in 
> `killTask`:
> {code}
>   // TODO(jieyu): Here, we kill the executor if it no longer has
>   // any task to run and has not yet registered. This is a
>   // workaround for those single task executors that do not have a
>   // proper self terminating logic when they haven't received the
>   // task within a timeout.
>   if (executor->queuedTasks.empty()) {
>     CHECK(executor->launchedTasks.empty())
>         << " Unregistered executor '" << executor->id
>         << "' has launched tasks";
>     LOG(WARNING) << "Killing the unregistered executor " << *executor
>                  << " because it has no tasks";
>     executor->state = Executor::TERMINATING;
>     containerizer->destroy(executor->containerId);
>   }
> {code}
> 6) Consequently, the executor will never be terminated by Mesos.





[jira] [Commented] (MESOS-5215) Update the documentation for '/reserve' and '/create-volumes'

2016-05-13 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283211#comment-15283211
 ] 

Greg Mann commented on MESOS-5215:
--

Review here: https://reviews.apache.org/r/47360/

> Update the documentation for '/reserve' and '/create-volumes'
> -
>
> Key: MESOS-5215
> URL: https://issues.apache.org/jira/browse/MESOS-5215
> Project: Mesos
>  Issue Type: Documentation
>  Components: documentation
>Affects Versions: 0.28.1
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: documentation, mesosphere
> Fix For: 0.29.0
>
>
> There are a couple issues related to the {{principal}} field in {{DiskInfo}} 
> and {{ReservationInfo}} (see linked JIRAs) that should be better documented. 
> We need to help users understand the purpose of these fields and how they 
> interact with the principal provided in the HTTP authentication header. See 
> linked tickets for background.





[jira] [Commented] (MESOS-5340) libevent builds may prevent new connections

2016-05-13 Thread Benjamin Mahler (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283203#comment-15283203
 ] 

Benjamin Mahler commented on MESOS-5340:


Added a unit test for this issue here:
https://reviews.apache.org/r/47362/

> libevent builds may prevent new connections
> ---
>
> Key: MESOS-5340
> URL: https://issues.apache.org/jira/browse/MESOS-5340
> Project: Mesos
>  Issue Type: Bug
>  Components: security
>Affects Versions: 0.29.0, 0.28.1
>Reporter: Till Toenshoff
>Assignee: Benjamin Mahler
>Priority: Blocker
>  Labels: mesosphere, security, ssl
> Fix For: 0.29.0
>
>
> When using an SSL-enabled build of Mesos in combination with SSL-downgrading 
> support, any connection that does not actually transmit data will hang the 
> runnable (e.g. master).
> To reproduce the issue (on any platform):
> Spin up a master with SSL downgrading enabled:
> {noformat}
> $ export SSL_ENABLED=true
> $ export SSL_SUPPORT_DOWNGRADE=true
> $ export SSL_KEY_FILE=/path/to/your/foo.key
> $ export SSL_CERT_FILE=/path/to/your/foo.crt
> $ export SSL_CA_FILE=/path/to/your/ca.crt
> $ ./bin/mesos-master.sh --work_dir=/tmp/foo
> {noformat}
> Create some artificial HTTP request load so that the problem is easy to spot 
> in both the master logs and the output of curl itself:
> {noformat}
> $ while true; do sleep 0.1; echo $( date +">%H:%M:%S.%3N"; curl -s -k -A "SSL Debug" http://localhost:5050/master/slaves; echo ;date +"<%H:%M:%S.%3N"; echo); done
> {noformat}
> Now create a connection to the master that does not transmit any data:
> {noformat}
> $ telnet localhost 5050
> {noformat}
> You should now see the curl requests hang: the master stops responding to 
> new connections. This will persist until either some data is transmitted via 
> the above telnet connection or it is closed.
> This problem was initially observed when running Mesos on an AWS cluster 
> with a load balancer enabled for the master node (the load balancer uses an 
> idle, persistent connection). Such a connection naturally does not transmit 
> any data as long as there are no external requests routed via the load 
> balancer. AWS allows setting a timeout for those connections; in our test 
> environment this duration was set to 60 seconds, so we saw our master become 
> unresponsive for 60 seconds at a time, then get "unstuck" for a brief 
> period until it got stuck again.
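
For scripted reproduction, the idle telnet connection can be replaced by a few lines of Python. This is only a sketch of the "connect and send nothing" step: the helper name `open_idle_connection` is hypothetical, and the default host and port are simply those used in the commands above.

```python
import socket


def open_idle_connection(host="localhost", port=5050, timeout=5.0):
    """Open a TCP connection and send nothing, like an idle telnet session."""
    # create_connection completes the TCP handshake and returns the socket;
    # no bytes are written, so the peer sees a connection with no data.
    return socket.create_connection((host, port), timeout=timeout)
```

Keep the returned socket open for as long as the hang should last, and call `close()` on it (or send some bytes) to release the master again.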





[jira] [Updated] (MESOS-5380) Killing a queued task can cause the corresponding command executor to never terminate.

2016-05-13 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-5380:
--
Labels: mesosphere  (was: )

> Killing a queued task can cause the corresponding command executor to never 
> terminate.
> --
>
> Key: MESOS-5380
> URL: https://issues.apache.org/jira/browse/MESOS-5380
> Project: Mesos
>  Issue Type: Bug
>  Components: slave
>Affects Versions: 0.28.0, 0.28.1
>Reporter: Jie Yu
>Assignee: Vinod Kone
>Priority: Blocker
>  Labels: mesosphere
> Fix For: 0.29.0, 0.28.2
>
>
> We observed this in our testing environment. Sequence of events:
> 1) A command task is queued since the executor has not registered yet.
> 2) The framework issues a killTask.
> 3) Since executor is in REGISTERING state, agent calls 
> `statusUpdate(TASK_KILLED, UPID())`
> 4) `statusUpdate` now will call `containerizer->status()` before calling 
> `executor->terminateTask(status.task_id(), status);` which will remove the 
> queued task. (Introduced in this patch: https://reviews.apache.org/r/43258).
> 5) Since the above is async, it's possible that the task is still in the 
> queued tasks when we check whether to kill the unregistered executor in 
> `killTask`:
> {code}
>   // TODO(jieyu): Here, we kill the executor if it no longer has
>   // any task to run and has not yet registered. This is a
>   // workaround for those single task executors that do not have a
>   // proper self terminating logic when they haven't received the
>   // task within a timeout.
>   if (executor->queuedTasks.empty()) {
>     CHECK(executor->launchedTasks.empty())
>         << " Unregistered executor '" << executor->id
>         << "' has launched tasks";
>     LOG(WARNING) << "Killing the unregistered executor " << *executor
>                  << " because it has no tasks";
>     executor->state = Executor::TERMINATING;
>     containerizer->destroy(executor->containerId);
>   }
> {code}
> 6) Consequently, the executor will never be terminated by Mesos.





[jira] [Updated] (MESOS-5380) Killing a queued task can cause the corresponding command executor to never terminate.

2016-05-13 Thread Anand Mazumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anand Mazumdar updated MESOS-5380:
--
Description: 
We observed this in our testing environment. Sequence of events:

1) A command task is queued since the executor has not registered yet.
2) The framework issues a killTask.
3) Since executor is in REGISTERING state, agent calls 
`statusUpdate(TASK_KILLED, UPID())`
4) `statusUpdate` now will call `containerizer->status()` before calling 
`executor->terminateTask(status.task_id(), status);` which will remove the 
queued task. (Introduced in this patch: https://reviews.apache.org/r/43258).
5) Since the above is async, it's possible that the task is still in the 
queued tasks when we check whether to kill the unregistered executor in 
`killTask`:
{code}
  // TODO(jieyu): Here, we kill the executor if it no longer has
  // any task to run and has not yet registered. This is a
  // workaround for those single task executors that do not have a
  // proper self terminating logic when they haven't received the
  // task within a timeout.
  if (executor->queuedTasks.empty()) {
    CHECK(executor->launchedTasks.empty())
        << " Unregistered executor '" << executor->id
        << "' has launched tasks";

    LOG(WARNING) << "Killing the unregistered executor " << *executor
                 << " because it has no tasks";

    executor->state = Executor::TERMINATING;

    containerizer->destroy(executor->containerId);
  }
{code}

6) Consequently, the executor will never be terminated by Mesos.

  was:
We observed that in our testing environment. So here is the sequence of events:

1) A command task is queued, the executor is not registered yet
2) The framework issues a killTask
3) Since executor is in REGISTERING state, agent calls 
`statusUpdate(TASK_KILLED, UPID())`
4) `statusUpdate` now will call `containerizer->status()` before calling 
`executor->terminateTask(status.task_id(), status);` which will remove the 
queued task. (introduced in this patch https://reviews.apache.org/r/43258).
5) Since the above is async, it's possible that the task is still in queued 
task when we trying to see if we need to kill unregistered executor in 
`killTask`:
{code}
  // TODO(jieyu): Here, we kill the executor if it no longer has
  // any task to run and has not yet registered. This is a
  // workaround for those single task executors that do not have a
  // proper self terminating logic when they haven't received the
  // task within a timeout.
  if (executor->queuedTasks.empty()) {
    CHECK(executor->launchedTasks.empty())
        << " Unregistered executor '" << executor->id
        << "' has launched tasks";

    LOG(WARNING) << "Killing the unregistered executor " << *executor
                 << " because it has no tasks";

    executor->state = Executor::TERMINATING;

    containerizer->destroy(executor->containerId);
  }
{code}

6) The executor will never be terminated by Mesos after that.

Attaching the relevant agent log:
{noformat}
May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: 
I0513 15:36:13.640527  1342 slave.cpp:1361] Got assigned task 
mesosvol.6ccd993c-1920-11e6-a722-9648cb19afd6 for framework 
a3ad8418-cb77-4705-b353-4b514ceca52c-
May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: 
I0513 15:36:13.641034  1342 slave.cpp:1480] Launching task 
mesosvol.6ccd993c-1920-11e6-a722-9648cb19afd6 for framework 
a3ad8418-cb77-4705-b353-4b514ceca52c-
May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: 
I0513 15:36:13.641440  1342 paths.cpp:528] Trying to chown 
'/var/lib/mesos/slave/slaves/a3ad8418-cb77-4705-b353-4b514ceca52c-S0/frameworks/a3ad8418-cb77-4705-b353-4b514ceca52c-/executors/mesosvol.6ccd993c-1920-11e6-a722-9648cb19afd6/runs/24762d43-2134-475e-b724-caa72110497a'
 to user 'root'
May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: 
I0513 15:36:13.644664  1342 slave.cpp:5389] Launching executor 
mesosvol.6ccd993c-1920-11e6-a722-9648cb19afd6 of framework 
a3ad8418-cb77-4705-b353-4b514ceca52c- with resources cpus(*):0.1; mem(*):32 
in work directory 
'/var/lib/mesos/slave/slaves/a3ad8418-cb77-4705-b353-4b514ceca52c-S0/frameworks/a3ad8418-cb77-4705-b353-4b514ceca52c-/executors/mesosvol.6ccd993c-1920-11e6-a722-9648cb19afd6/runs/24762d43-2134-475e-b724-caa72110497a'
May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: 
I0513 15:36:13.645195  1342 slave.cpp:1698] Queuing task 
'mesosvol.6ccd993c-1920-11e6-a722-9648cb19afd6' for executor 
'mesosvol.6ccd993c-1920-11e6-a722-9648cb19afd6' of framework 
a3ad8418-cb77-4705-b353-4b514ceca52c-
May 13 15:36:13 ip-10-0-2-74.us-west-2.compute.internal mesos-slave[1304]: 
I0513 15:36:13.645491  1338 

[jira] [Updated] (MESOS-5380) Killing a queued task can cause the corresponding command executor to never terminate.

2016-05-13 Thread Anand Mazumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anand Mazumdar updated MESOS-5380:
--
Summary: Killing a queued task can cause the corresponding command executor 
to never terminate.  (was: Killing a queued task can cause the corresponding 
command executor never terminates.)

> Killing a queued task can cause the corresponding command executor to never 
> terminate.
> --
>
> Key: MESOS-5380
> URL: https://issues.apache.org/jira/browse/MESOS-5380
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.28.0, 0.28.1
>Reporter: Jie Yu
>Assignee: Vinod Kone
>Priority: Blocker
> Fix For: 0.29.0, 0.28.2
>
>
> We observed that in our testing environment. So here is the sequence of 
> events:
> 1) A command task is queued, the executor is not registered yet
> 2) The framework issues a killTask
> 3) Since executor is in REGISTERING state, agent calls 
> `statusUpdate(TASK_KILLED, UPID())`
> 4) `statusUpdate` now will call `containerizer->status()` before calling 
> `executor->terminateTask(status.task_id(), status);` which will remove the 
> queued task. (introduced in this patch https://reviews.apache.org/r/43258).
> 5) Since the above is async, it's possible that the task is still in the 
> queued tasks when we check whether to kill the unregistered executor in 
> `killTask`:
> {code}
>   // TODO(jieyu): Here, we kill the executor if it no longer has
>   // any task to run and has not yet registered. This is a
>   // workaround for those single task executors that do not have a
>   // proper self terminating logic when they haven't received the
>   // task within a timeout.
>   if (executor->queuedTasks.empty()) {
>     CHECK(executor->launchedTasks.empty())
>         << " Unregistered executor '" << executor->id
>         << "' has launched tasks";
>     LOG(WARNING) << "Killing the unregistered executor " << *executor
>                  << " because it has no tasks";
>     executor->state = Executor::TERMINATING;
>     containerizer->destroy(executor->containerId);
>   }
> {code}
> 6) The executor will never be terminated by Mesos after that.





[jira] [Updated] (MESOS-5380) Killing a queued task can cause the corresponding command executor never terminates.

2016-05-13 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-5380:
--
Description: 
We observed that in our testing environment. So here is the sequence of events:

1) A command task is queued, the executor is not registered yet
2) The framework issues a killTask
3) Since executor is in REGISTERING state, agent calls 
`statusUpdate(TASK_KILLED, UPID())`
4) `statusUpdate` now will call `containerizer->status()` before calling 
`executor->terminateTask(status.task_id(), status);` which will remove the 
queued task. (introduced in this patch https://reviews.apache.org/r/43258).
5) Since the above is async, it's possible that the task is still in the 
queued tasks when we check whether to kill the unregistered executor in 
`killTask`:
{code}
  // TODO(jieyu): Here, we kill the executor if it no longer has
  // any task to run and has not yet registered. This is a
  // workaround for those single task executors that do not have a
  // proper self terminating logic when they haven't received the
  // task within a timeout.
  if (executor->queuedTasks.empty()) {
    CHECK(executor->launchedTasks.empty())
        << " Unregistered executor '" << executor->id
        << "' has launched tasks";

    LOG(WARNING) << "Killing the unregistered executor " << *executor
                 << " because it has no tasks";

    executor->state = Executor::TERMINATING;

    containerizer->destroy(executor->containerId);
  }
{code}

6) The executor will never be terminated by Mesos after that.

  was:
We observed this in our testing environment. Here is the sequence of events:

1) A command task is queued; the executor is not registered yet.
2) The framework issues a killTask.
3) Since the executor is in the REGISTERING state, the agent calls 
`statusUpdate(TASK_KILLED, UPID())`.
4) `statusUpdate` now calls `containerizer->status()` before calling 
`executor->terminateTask(status.task_id(), status);`, which removes the 
queued task (introduced in this patch: https://reviews.apache.org/r/43258).
5) Since the above is asynchronous, the task may still be among the queued 
tasks when we check whether we need to kill the unregistered executor in 
`killTask`:
```
  // TODO(jieyu): Here, we kill the executor if it no longer has
  // any task to run and has not yet registered. This is a
  // workaround for those single task executors that do not have a
  // proper self terminating logic when they haven't received the
  // task within a timeout.
  if (executor->queuedTasks.empty()) {
    CHECK(executor->launchedTasks.empty())
        << " Unregistered executor '" << executor->id
        << "' has launched tasks";

    LOG(WARNING) << "Killing the unregistered executor " << *executor
                 << " because it has no tasks";

    executor->state = Executor::TERMINATING;

    containerizer->destroy(executor->containerId);
  }
```
6) The executor will never be terminated by Mesos after that.


> Killing a queued task can cause the corresponding command executor to never 
> terminate.
> 
>
> Key: MESOS-5380
> URL: https://issues.apache.org/jira/browse/MESOS-5380
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.28.0, 0.28.1
>Reporter: Jie Yu
>Assignee: Vinod Kone
>Priority: Blocker
> Fix For: 0.29.0, 0.28.2
>
>
> We observed this in our testing environment. Here is the sequence of 
> events:
> 1) A command task is queued; the executor is not registered yet.
> 2) The framework issues a killTask.
> 3) Since the executor is in the REGISTERING state, the agent calls 
> `statusUpdate(TASK_KILLED, UPID())`.
> 4) `statusUpdate` now calls `containerizer->status()` before calling 
> `executor->terminateTask(status.task_id(), status);`, which removes the 
> queued task (introduced in this patch: https://reviews.apache.org/r/43258).
> 5) Since the above is asynchronous, the task may still be among the queued 
> tasks when we check whether we need to kill the unregistered executor in 
> `killTask`:
> {code}
>   // TODO(jieyu): Here, we kill the executor if it no longer has
>   // any task to run and has not yet registered. This is a
>   // workaround for those single task executors that do not have a
>   // proper self terminating logic when they haven't received the
>   // task within a timeout.
>   if (executor->queuedTasks.empty()) {
>     CHECK(executor->launchedTasks.empty())
>         << " Unregistered executor '" << executor->id
>         << "' has launched tasks";
>     LOG(WARNING) << "Killing the unregistered executor " << *executor
>                  << " because it has no tasks";
> 

[jira] [Created] (MESOS-5380) Killing a queued task can cause the corresponding command executor to never terminate.

2016-05-13 Thread Jie Yu (JIRA)
Jie Yu created MESOS-5380:
-

 Summary: Killing a queued task can cause the corresponding command 
executor to never terminate.
 Key: MESOS-5380
 URL: https://issues.apache.org/jira/browse/MESOS-5380
 Project: Mesos
  Issue Type: Bug
Affects Versions: 0.28.1, 0.28.0
Reporter: Jie Yu
Assignee: Vinod Kone
Priority: Blocker
 Fix For: 0.29.0, 0.28.2


We observed this in our testing environment. Here is the sequence of events:

1) A command task is queued; the executor is not registered yet.
2) The framework issues a killTask.
3) Since the executor is in the REGISTERING state, the agent calls 
`statusUpdate(TASK_KILLED, UPID())`.
4) `statusUpdate` now calls `containerizer->status()` before calling 
`executor->terminateTask(status.task_id(), status);`, which removes the 
queued task (introduced in this patch: https://reviews.apache.org/r/43258).
5) Since the above is asynchronous, the task may still be among the queued 
tasks when we check whether we need to kill the unregistered executor in 
`killTask`:
```
  // TODO(jieyu): Here, we kill the executor if it no longer has
  // any task to run and has not yet registered. This is a
  // workaround for those single task executors that do not have a
  // proper self terminating logic when they haven't received the
  // task within a timeout.
  if (executor->queuedTasks.empty()) {
    CHECK(executor->launchedTasks.empty())
        << " Unregistered executor '" << executor->id
        << "' has launched tasks";

    LOG(WARNING) << "Killing the unregistered executor " << *executor
                 << " because it has no tasks";

    executor->state = Executor::TERMINATING;

    containerizer->destroy(executor->containerId);
  }
```
6) The executor will never be terminated by Mesos after that.
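
The race in steps 4 to 6 can be modeled in a few lines. This is only a sketch under stated assumptions: `Executor`, `Agent`, `pending`, `drain`, and `reproducesLeak` are hypothetical stand-ins rather than the Mesos agent code, and the asynchronous `containerizer->status()` call is modeled as a callback that runs later.

```cpp
#include <functional>
#include <queue>
#include <set>
#include <string>

// Hypothetical, simplified stand-in for the agent's executor bookkeeping.
struct Executor {
  std::set<std::string> queuedTasks;
  bool destroyed = false;
};

// Hypothetical model of the agent. `pending` plays the role of work that
// is dispatched asynchronously (the continuation that runs after
// `containerizer->status()` completes).
struct Agent {
  Executor executor;
  std::queue<std::function<void()>> pending;

  // Models `statusUpdate`: the queued task is removed only later, when
  // the deferred continuation runs.
  void statusUpdate(const std::string& taskId) {
    pending.push([this, taskId]() { executor.queuedTasks.erase(taskId); });
  }

  // Models `killTask` for an executor that has not registered yet.
  void killTask(const std::string& taskId) {
    statusUpdate(taskId);

    // This check runs before the deferred removal, so the task is still
    // queued and the executor is not destroyed.
    if (executor.queuedTasks.empty()) {
      executor.destroyed = true;
    }
  }

  // Runs all deferred work, as the event loop eventually would.
  void drain() {
    while (!pending.empty()) {
      pending.front()();
      pending.pop();
    }
  }
};

// Returns true if the executor ends up alive with no tasks left.
bool reproducesLeak() {
  Agent agent;
  agent.executor.queuedTasks.insert("task-1");
  agent.killTask("task-1");
  agent.drain();
  return agent.executor.queuedTasks.empty() && !agent.executor.destroyed;
}
```

In this model nothing revisits the "no tasks left" check after the deferred removal, which matches step 6: the executor is never terminated.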





[jira] [Updated] (MESOS-5371) Implement `fcntl.hpp`

2016-05-13 Thread Alex Clemmer (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Clemmer updated MESOS-5371:

Description: 
`fcntl.hpp` has a bunch of functions that will never work on Windows. We will 
need to work around them, either by working around specific call sites of 
functions like `os::cloexec`, or by implementing something that keeps track of 
which file descriptors are cloexec, and which aren't.

NOTE: We have elected to log warnings for these functions when we call them, so 
that it is obvious they have done nothing. This carries a performance penalty 
especially for the master, and when we resolve this issue, it is important we 
remove the logging as well.

  was:`fcntl.hpp` has a bunch of functions that will never work on Windows. We 
will need to work around them, either by working around specific call sites of 
functions like `os::cloexec`, or by implementing something that keeps track of 
which file descriptors are cloexec, and which aren't.


> Implement `fcntl.hpp`
> -
>
> Key: MESOS-5371
> URL: https://issues.apache.org/jira/browse/MESOS-5371
> Project: Mesos
>  Issue Type: Bug
>  Components: stout
>Reporter: Alex Clemmer
>Assignee: Alex Clemmer
>  Labels: mesosphere, stout, windows-mvp
>
> `fcntl.hpp` has a bunch of functions that will never work on Windows. We will 
> need to work around them, either by working around specific call sites of 
> functions like `os::cloexec`, or by implementing something that keeps track 
> of which file descriptors are cloexec, and which aren't.
> NOTE: We have elected to log warnings for these functions when we call them, 
> so that it is obvious they have done nothing. This carries a performance 
> penalty especially for the master, and when we resolve this issue, it is 
> important we remove the logging as well.
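
The second option, keeping track of which file descriptors are cloexec, could look roughly like the sketch below. `CloexecRegistry` and its methods are hypothetical names chosen for illustration; this is not the stout API, and thread safety is left out.

```cpp
#include <set>

// Hypothetical registry that records which descriptors are marked
// close-on-exec on a platform where the flag cannot be set on the
// descriptor itself.
class CloexecRegistry {
public:
  void set(int fd) { fds_.insert(fd); }
  void unset(int fd) { fds_.erase(fd); }
  bool isCloexec(int fd) const { return fds_.count(fd) > 0; }

  // Descriptors a child process may inherit: every open descriptor that
  // is not marked close-on-exec.
  std::set<int> inheritable(const std::set<int>& open) const {
    std::set<int> result;
    for (int fd : open) {
      if (!isCloexec(fd)) {
        result.insert(fd);
      }
    }
    return result;
  }

private:
  std::set<int> fds_;
};
```

Code that spawns a child would consult `inheritable` when deciding which handles to pass on, instead of relying on a per-descriptor flag.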





[jira] [Updated] (MESOS-5215) Update the documentation for '/reserve' and '/create-volumes'

2016-05-13 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-5215:
-
Fix Version/s: 0.29.0

> Update the documentation for '/reserve' and '/create-volumes'
> -
>
> Key: MESOS-5215
> URL: https://issues.apache.org/jira/browse/MESOS-5215
> Project: Mesos
>  Issue Type: Documentation
>  Components: documentation
>Affects Versions: 0.28.1
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: documentation, mesosphere
> Fix For: 0.29.0
>
>
> There are a couple of issues related to the {{principal}} field in {{DiskInfo}} 
> and {{ReservationInfo}} (see linked JIRAs) that should be better documented. 
> We need to help users understand the purpose of these fields and how they 
> interact with the principal provided in the HTTP authentication header. See 
> linked tickets for background.





[jira] [Assigned] (MESOS-5215) Update the documentation for '/reserve' and '/create-volumes'

2016-05-13 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann reassigned MESOS-5215:


Assignee: Greg Mann

> Update the documentation for '/reserve' and '/create-volumes'
> -
>
> Key: MESOS-5215
> URL: https://issues.apache.org/jira/browse/MESOS-5215
> Project: Mesos
>  Issue Type: Documentation
>  Components: documentation
>Affects Versions: 0.28.1
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: documentation, mesosphere
>
> There are a couple of issues related to the {{principal}} field in {{DiskInfo}} 
> and {{ReservationInfo}} (see linked JIRAs) that should be better documented. 
> We need to help users understand the purpose of these fields and how they 
> interact with the principal provided in the HTTP authentication header. See 
> linked tickets for background.





[jira] [Updated] (MESOS-4801) Updated `createFrameworkInfo` for hierarchical_allocator_tests.cpp.

2016-05-13 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-4801:
---
Shepherd: Alexander Rukletsov  (was: Benjamin Mahler)
  Sprint: Mesosphere Sprint 35
Story Points: 1
  Labels: mesosphere  (was: )
Priority: Minor  (was: Major)
 Component/s: tests
  Issue Type: Improvement  (was: Bug)

> Updated `createFrameworkInfo` for hierarchical_allocator_tests.cpp.
> ---
>
> Key: MESOS-4801
> URL: https://issues.apache.org/jira/browse/MESOS-4801
> Project: Mesos
>  Issue Type: Improvement
>  Components: tests
>Reporter: Guangya Liu
>Assignee: Guangya Liu
>Priority: Minor
>  Labels: mesosphere
>
> The {{createFrameworkInfo}} function in hierarchical_allocator_tests.cpp 
> should be updated so that the caller can set a framework capability, making 
> it possible to create a framework which can use revocable resources.





[jira] [Commented] (MESOS-5371) Implement `fcntl.hpp`

2016-05-13 Thread Joris Van Remoortere (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282908#comment-15282908
 ] 

Joris Van Remoortere commented on MESOS-5371:
-

{code}
commit 4c6162d5e3535f4611e869e143c91454033dca2d
Author: Alex Clemmer 
Date:   Fri May 13 13:25:57 2016 -0400

Windows: Added stub implementations of `fcntl.hpp` functions.

This commit introduces temporary versions of 2 important functions:
`os::nonblock` and `os::cloexec`. We put them here in a placeholder
commit so that reviewers can make progress on subprocess. In the
immediate term, the plan is to figure out on a callsite-by-callsite
basis how to work around the functionality of `os::cloexec`. When we
collect more data, we will be in a better position to offer a way
forward here.

Review: https://reviews.apache.org/r/46392/
{code}

> Implement `fcntl.hpp`
> -
>
> Key: MESOS-5371
> URL: https://issues.apache.org/jira/browse/MESOS-5371
> Project: Mesos
>  Issue Type: Bug
>  Components: stout
>Reporter: Alex Clemmer
>Assignee: Alex Clemmer
>  Labels: mesosphere, stout, windows-mvp
>
> `fcntl.hpp` has a bunch of functions that will never work on Windows. We will 
> need to work around them, either by working around specific call sites of 
> functions like `os::cloexec`, or by implementing something that keeps track 
> of which file descriptors are cloexec, and which aren't.





[jira] [Updated] (MESOS-4785) Reorganize ACL subject/object descriptions

2016-05-13 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-4785:
---
Story Points: 5  (was: 1)
Priority: Blocker  (was: Major)

> Reorganize ACL subject/object descriptions
> --
>
> Key: MESOS-4785
> URL: https://issues.apache.org/jira/browse/MESOS-4785
> Project: Mesos
>  Issue Type: Documentation
>  Components: documentation
>Reporter: Greg Mann
>Assignee: Alexander Rojas
>Priority: Blocker
>  Labels: documentation, mesosphere, security
> Fix For: 0.29.0
>
>
> The authorization documentation would benefit from a reorganization of the 
> ACL subject/object descriptions. Instead of simple lists of the available 
> subjects and objects, it would be nice to see a table showing which subject 
> and object is used with each action.





[jira] [Updated] (MESOS-5379) Authentication documentation for libprocess endpoints can be misleading.

2016-05-13 Thread Joris Van Remoortere (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joris Van Remoortere updated MESOS-5379:

Assignee: (was: Joris Van Remoortere)

> Authentication documentation for libprocess endpoints can be misleading.
> 
>
> Key: MESOS-5379
> URL: https://issues.apache.org/jira/browse/MESOS-5379
> Project: Mesos
>  Issue Type: Bug
>  Components: documentation, libprocess
>Affects Versions: 0.29.0
>Reporter: Benjamin Bannier
>Priority: Blocker
>  Labels: mesosphere, tech-debt
> Fix For: 0.29.0
>
>
> Libprocess exposes a number of endpoints (at least: {{/logging}}, 
> {{/metrics}}, and {{/profiler}}). If libprocess was initialized with some 
> realm, these endpoints require authentication; otherwise they do not.
> To generate endpoint help we currently also use the function 
> {{AUTHENTICATION}}, which injects the following into the help string:
> {code}
> This endpoints requires authentication iff HTTP authentication is enabled.
> {code}
> Here {{iff}} documents a stronger coupling between required authentication 
> and enabled authentication than may hold for the above libprocess 
> endpoints: it is true, e.g., when these endpoints are exposed through mesos 
> masters/agents, but possibly not if they are exposed through other executables.
> For libprocess endpoints, a weaker formulation such as
> {code}
> This endpoints supports authentication. If HTTP authentication is enabled, 
> this endpoint may require authentication.
> {code}
> might make the generated help strings more reusable.





[jira] [Assigned] (MESOS-5379) Authentication documentation for libprocess endpoints can be misleading.

2016-05-13 Thread Joris Van Remoortere (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joris Van Remoortere reassigned MESOS-5379:
---

Assignee: Joris Van Remoortere

> Authentication documentation for libprocess endpoints can be misleading.
> 
>
> Key: MESOS-5379
> URL: https://issues.apache.org/jira/browse/MESOS-5379
> Project: Mesos
>  Issue Type: Bug
>  Components: documentation, libprocess
>Affects Versions: 0.29.0
>Reporter: Benjamin Bannier
>Assignee: Joris Van Remoortere
>Priority: Blocker
>  Labels: mesosphere, tech-debt
> Fix For: 0.29.0
>
>
> Libprocess exposes a number of endpoints (at least: {{/logging}}, 
> {{/metrics}}, and {{/profiler}}). If libprocess was initialized with some 
> realm, these endpoints require authentication; otherwise they do not.
> To generate endpoint help we currently also use the function 
> {{AUTHENTICATION}}, which injects the following into the help string:
> {code}
> This endpoints requires authentication iff HTTP authentication is enabled.
> {code}
> Here {{iff}} documents a stronger coupling between required authentication 
> and enabled authentication than may hold for the above libprocess 
> endpoints: it is true, e.g., when these endpoints are exposed through mesos 
> masters/agents, but possibly not if they are exposed through other executables.
> For libprocess endpoints, a weaker formulation such as
> {code}
> This endpoints supports authentication. If HTTP authentication is enabled, 
> this endpoint may require authentication.
> {code}
> might make the generated help strings more reusable.





[jira] [Issue Comment Deleted] (MESOS-5377) Improve DRF behavior with scarce resources.

2016-05-13 Thread Klaus Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Klaus Ma updated MESOS-5377:

Comment: was deleted

(was: [~bmahler], I think that's because we did not consider the framework's 
request: the {{allocator}} assumes all frameworks have the same request for all 
resources, so the “dominant resource” is based on allocated resources instead 
of the framework's request. One proposal I have in mind is to implement 
{{requestResource}}; in your case, the dominant resource is GPU when only one 
task requests a GPU, and it will change to CPU as more incoming tasks in the 
same framework request CPUs.)

> Improve DRF behavior with scarce resources.
> ---
>
> Key: MESOS-5377
> URL: https://issues.apache.org/jira/browse/MESOS-5377
> Project: Mesos
>  Issue Type: Epic
>  Components: allocation
>Reporter: Benjamin Mahler
>
> The allocator currently uses the notion of Weighted [Dominant Resource 
> Fairness|https://www.cs.berkeley.edu/~alig/papers/drf.pdf] (WDRF) to 
> establish a linear notion of fairness across allocation roles.
> DRF behaves well for resources that are present within each machine in a 
> cluster (e.g. CPUs, memory, disk). However, some resources (e.g. GPUs) are 
> only present on a subset of machines in the cluster.
> Consider the behavior when there are the following agents in a cluster:
> 1000 agents with (cpus:4,mem:1024,disk:1024)
> 1 agent with (gpus:1,cpus:4,mem:1024,disk:1024)
> If a role wishes to use both GPU and non-GPU resources for tasks, consuming 1 
> GPU will lead DRF to consider the role to have a 100% share of the cluster, 
> since it consumes 100% of the GPUs in the cluster. This framework will then 
> not receive any other offers.
> Among possible improvements, fairness could be made aware of resource 
> packages. In a sense there is 1 GPU package that is competed for and 1000 
> non-GPU packages that are competed for, and ideally a role's consumption of 
> the single GPU package does not have a large effect on the role's access to 
> the other 1000 non-GPU packages.
> In the interim, we should consider having a recommended way to deal with 
> scarce resources in the current model.
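
The arithmetic behind the example can be checked with a small sketch. It computes an unweighted dominant share (weights are ignored for simplicity); `Resources`, `dominantShare`, and `exampleClusterTotal` are hypothetical names and not the allocator's or sorter's API.

```cpp
#include <algorithm>
#include <map>
#include <string>

using Resources = std::map<std::string, double>;

// Dominant share: the largest fraction of any single cluster resource
// that has been allocated to the role.
double dominantShare(const Resources& allocated, const Resources& total) {
  double share = 0.0;
  for (const auto& entry : allocated) {
    auto it = total.find(entry.first);
    if (it != total.end() && it->second > 0.0) {
      share = std::max(share, entry.second / it->second);
    }
  }
  return share;
}

// Cluster totals from the example: 1000 agents without a GPU plus one
// agent with a GPU, each with cpus:4, mem:1024, disk:1024.
Resources exampleClusterTotal() {
  Resources total;
  total["cpus"] = 1001 * 4.0;
  total["mem"] = 1001 * 1024.0;
  total["disk"] = 1001 * 1024.0;
  total["gpus"] = 1.0;
  return total;
}
```

A role holding only `cpus:4` has a dominant share of 4/4004, roughly 0.1%. Once it also holds `gpus:1`, its dominant share jumps to 1/1 = 100%, which is why it sorts last and stops receiving offers.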





[jira] [Updated] (MESOS-5272) Support docker image labels.

2016-05-13 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-5272:
-
Sprint: Mesosphere Sprint 34, Mesosphere Sprint 35  (was: Mesosphere Sprint 
34)

> Support docker image labels.
> 
>
> Key: MESOS-5272
> URL: https://issues.apache.org/jira/browse/MESOS-5272
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Gilbert Song
>Assignee: Gilbert Song
>  Labels: containerizer, gpu, mesosphere
>
> Docker image labels, which can be used for applying custom metadata, should 
> be supported in the unified containerizer. Image labels are necessary for 
> mesos features that support docker in the unified containerizer (e.g., the 
> mesos GPU device isolator).





[jira] [Updated] (MESOS-5232) Add capability information to ContainerInfo protobuf message.

2016-05-13 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-5232:
-
Sprint: Mesosphere Sprint 33, Mesosphere Sprint 34, Mesosphere Sprint 35  
(was: Mesosphere Sprint 33, Mesosphere Sprint 34)

> Add capability information to ContainerInfo protobuf message.
> -
>
> Key: MESOS-5232
> URL: https://issues.apache.org/jira/browse/MESOS-5232
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Jojy Varghese
>Assignee: Jojy Varghese
>  Labels: mesosphere
>
> To enable support for capabilities as a first-class framework entity, we need 
> to add capability-related information to the ContainerInfo protobuf.





[jira] [Updated] (MESOS-5153) Sandboxes contents should be protected from unauthorized users

2016-05-13 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-5153:
-
Sprint: Mesosphere Sprint 33, Mesosphere Sprint 34, Mesosphere Sprint 35  
(was: Mesosphere Sprint 33, Mesosphere Sprint 34)

> Sandboxes contents should be protected from unauthorized users
> --
>
> Key: MESOS-5153
> URL: https://issues.apache.org/jira/browse/MESOS-5153
> Project: Mesos
>  Issue Type: Bug
>  Components: security, slave
>Reporter: Alexander Rojas
>Assignee: Alexander Rojas
>  Labels: mesosphere, security
> Fix For: 0.29.0
>
>
> MESOS-4956 introduced authentication support for the sandboxes. However, 
> authentication can only go as far as telling whether a user is known to 
> mesos or not. An additional step is necessary to verify whether the 
> known user is allowed to execute the requested operation on the sandbox 
> (browse, read, download, debug).





[jira] [Updated] (MESOS-5155) Consolidate authorization actions for quota.

2016-05-13 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-5155:
-
Sprint: Mesosphere Sprint 33, Mesosphere Sprint 34, Mesosphere Sprint 35  
(was: Mesosphere Sprint 33, Mesosphere Sprint 34)

> Consolidate authorization actions for quota.
> 
>
> Key: MESOS-5155
> URL: https://issues.apache.org/jira/browse/MESOS-5155
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Alexander Rukletsov
>Assignee: Zhitao Li
>  Labels: mesosphere
>
> We should have just a single authz action: {{UPDATE_QUOTA_WITH_ROLE}}. It was 
> a mistake in retrospect to introduce multiple actions.
> Actions that are not symmetrical are register/teardown and dynamic 
> reservations. They are implemented this way because the entities that 
> perform one action differ from the entities that perform the other. For 
> example, register framework is issued by a framework, teardown by an 
> operator. What is a good way to identify a framework? The role it runs in, 
> which may be different on each launch and makes no sense in a multi-role 
> framework setup, or, better, a sort of group id, which is its principal. 
> Dynamic reservations and persistent volumes can be issued by both frameworks 
> and operators, hence similar reasoning applies. 
> Now, quota is associated with a role and set only by operators. Do we need to 
> care about principals that set it? Not that much. 





[jira] [Updated] (MESOS-4689) Design doc for v1 Operator API

2016-05-13 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-4689:
-
Sprint: Mesosphere Sprint 29, Mesosphere Sprint 33, Mesosphere Sprint 34, 
Mesosphere Sprint 35  (was: Mesosphere Sprint 29, Mesosphere Sprint 33, 
Mesosphere Sprint 34)

> Design doc for v1 Operator API
> --
>
> Key: MESOS-4689
> URL: https://issues.apache.org/jira/browse/MESOS-4689
> Project: Mesos
>  Issue Type: Documentation
>Reporter: Vinod Kone
>Assignee: Kevin Klues
>
> We need to design how the v1 operator API (all the HTTP endpoints exposed by 
> master/agent that are not for scheduler/executor interactions) looks and 
> works.





[jira] [Updated] (MESOS-5173) Allow master/agent to take multiple modules manifest files

2016-05-13 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-5173:
-
Sprint: Mesosphere Sprint 33, Mesosphere Sprint 34, Mesosphere Sprint 35  
(was: Mesosphere Sprint 33, Mesosphere Sprint 34)

> Allow master/agent to take multiple modules manifest files
> --
>
> Key: MESOS-5173
> URL: https://issues.apache.org/jira/browse/MESOS-5173
> Project: Mesos
>  Issue Type: Task
>Reporter: Kapil Arya
>Assignee: Kapil Arya
>  Labels: mesosphere
> Fix For: 0.29.0
>
>
> When loading multiple modules into master/agent, one has to merge all module 
> metadata (library name, module name, parameters, etc.) into a single json 
> file which is then passed on to the --modules flag. This quickly becomes 
> cumbersome, especially if the modules are coming from different 
> vendors/developers.
> An alternative would be to allow multiple invocations of the --modules flag 
> that can then be passed on to the module manager. That way, each flag 
> corresponds to just one module library and the modules from that library.
> Another approach is to create a new flag (e.g., --modules-dir) that contains 
> a path to a directory that would contain multiple json files. One can think 
> of it as analogous to systemd units: the operator drops a new file into this 
> directory and the file is automatically picked up by the master/agent module 
> manager. Further, the naming scheme can also be inherited, prefixing the 
> filename with an "NN_" to signify load order.
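
The "NN_" ordering for the proposed --modules-dir flag can be illustrated with a short sketch. `loadOrder` is a hypothetical helper, and the directory listing is passed in as a plain list of file names so that the example stays independent of any filesystem API; the example file names in the note below are made up as well.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Returns the manifest file names in the order they would be loaded:
// only `.json` files are kept, sorted lexicographically so that an
// "NN_" prefix decides the order.
std::vector<std::string> loadOrder(std::vector<std::string> files) {
  const std::string suffix = ".json";
  files.erase(
      std::remove_if(
          files.begin(),
          files.end(),
          [&suffix](const std::string& name) {
            return name.size() < suffix.size() ||
                   name.compare(
                       name.size() - suffix.size(),
                       suffix.size(),
                       suffix) != 0;
          }),
      files.end());

  std::sort(files.begin(), files.end());
  return files;
}
```

Given `20_isolator.json`, `README`, and `10_allocator.json`, the sketch skips `README` and loads `10_allocator.json` before `20_isolator.json`. Lexicographic sorting only matches numeric order if the prefixes are zero-padded to the same width.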





[jira] [Updated] (MESOS-3781) Replace Master/Slave Terminology Phase I - Add duplicate agent flags

2016-05-13 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-3781:
-
Sprint: Mesosphere Sprint 32, Mesosphere Sprint 33, Mesosphere Sprint 34, 
Mesosphere Sprint 35  (was: Mesosphere Sprint 32, Mesosphere Sprint 33, 
Mesosphere Sprint 34)

> Replace Master/Slave Terminology Phase I - Add duplicate agent flags 
> -
>
> Key: MESOS-3781
> URL: https://issues.apache.org/jira/browse/MESOS-3781
> Project: Mesos
>  Issue Type: Task
>Reporter: Diana Arroyo
>Assignee: Jay Guo
>






[jira] [Updated] (MESOS-4233) Logging is too verbose for sysadmins / syslog

2016-05-13 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-4233:
-
Sprint: Mesosphere Sprint 26, Mesosphere Sprint 27, Mesosphere Sprint 28, 
Mesosphere Sprint 29, Mesosphere Sprint 30, Mesosphere Sprint 31, Mesosphere 
Sprint 32, Mesosphere Sprint 33, Mesosphere Sprint 34, Mesosphere Sprint 35  
(was: Mesosphere Sprint 26, Mesosphere Sprint 27, Mesosphere Sprint 28, 
Mesosphere Sprint 29, Mesosphere Sprint 30, Mesosphere Sprint 31, Mesosphere 
Sprint 32, Mesosphere Sprint 33, Mesosphere Sprint 34)

> Logging is too verbose for sysadmins / syslog
> -
>
> Key: MESOS-4233
> URL: https://issues.apache.org/jira/browse/MESOS-4233
> Project: Mesos
>  Issue Type: Epic
>Reporter: Cody Maloney
>Assignee: Kapil Arya
>  Labels: mesosphere
> Attachments: giant_port_range_logging
>
>
> Currently mesos logs a lot. When launching a thousand tasks in the space of 
> 10 seconds it will print tens of thousands of log lines, overwhelming syslog 
> (there is a max rate at which a process can send stuff over a unix socket) 
> and not giving useful information to a sysadmin who cares about just the 
> high-level activity and when something goes wrong.
> Note mesos also blocks writing to its log locations, so when writing a lot of 
> log messages, it can fill up the write buffer in the kernel, and be suspended 
> until the syslog agent catches up reading from the socket (GLOG does a 
> blocking fwrite to stderr). GLOG also has a big mutex around logging so only 
> one thing logs at a time.
> While for "internal debugging" it is useful to see things like "message went 
> from internal component x to internal component y", from a sysadmin 
> perspective I only care about the high-level actions taken (launched task for 
> framework x, sent offer to framework y, got task failed from host z). Note 
> those are what I'd expect at the "INFO" level. At the "WARNING" level I'd 
> expect very little to be logged / almost nothing in normal operation, just 
> things like "WARN: Replicated log write took longer than expected". WARN 
> would also get things like backtraces on crashes and abnormal exits / aborts.
> When trying to launch 3k+ tasks inside a second, mesos logging currently 
> overwhelms syslog with 100k+ messages, many of which are thousands of bytes. 
> Sysadmins expect to be able to use syslog to monitor basic events in their 
> system. This is too much.
> We can keep logging the messages to files, but the logging to stderr needs to 
> be reduced significantly (stderr gets picked up and forwarded to syslog / 
> central aggregation).
> What I would like is to be able to set the stderr logging level independently 
> of the file logging level (syslog giving the "sysadmin" aggregated overview, 
> files useful for debugging in depth what happened in a cluster). A lot of 
> what mesos currently logs at info is really debugging info and should show up 
> at the debug log level.
> Some samples of mesos logging a lot more than a sysadmin would want / expect 
> are attached, and some are below:
>  - Every task gets printed multiple times for a basic launch:
> {noformat}
> Dec 15 22:58:30 ip-10-0-7-60.us-west-2.compute.internal mesos-master[1311]: 
> I1215 22:58:29.382644  1315 master.cpp:3248] Launching task 
> envy.5b19a713-a37f-11e5-8b3e-0251692d6109 of framework 
> 5178f46d-71d6-422f-922c-5bbe82dff9cc- (marathon)
> Dec 15 22:58:30 ip-10-0-7-60.us-west-2.compute.internal mesos-master[1311]: 
> I1215 22:58:29.382925  1315 master.hpp:176] Adding task 
> envy.5b1958f2-a37f-11e5-8b3e-0251692d6109 with resources cpus(*):0.0001; 
> mem(*):16; ports(*):[14047-14047]
> {noformat}
>  - Every task status update prints many log lines. Successful ones are part 
> of normal operation and maybe should be logged at info / debug levels, but 
> not shown to a sysadmin (just show when things fail, and maybe aggregate 
> counters to indicate the volume of successful work).
>  - No log messages should be really big / more than 1k characters (this would 
> prevent the giant port list attached, and make that easily discoverable / bug 
> filable / fixable).
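
The request for a stderr level that is independent of the file level can be sketched as a logger with one threshold per sink. Everything here is hypothetical and for illustration only: `Level`, `Logger`, and the two in-memory sinks are not part of Mesos or GLOG, and the vectors merely stand in for the log file and for stderr.

```cpp
#include <string>
#include <vector>

enum class Level { Debug = 0, Info = 1, Warning = 2, Error = 3 };

// Hypothetical logger with one threshold per sink. `file` stands in for
// the log file and `stderrSink` for stderr, which is what gets forwarded
// to syslog.
struct Logger {
  Level fileThreshold = Level::Info;
  Level stderrThreshold = Level::Warning;
  std::vector<std::string> file;
  std::vector<std::string> stderrSink;

  // Writes the message to every sink whose threshold it meets.
  void log(Level level, const std::string& message) {
    if (level >= fileThreshold) {
      file.push_back(message);
    }
    if (level >= stderrThreshold) {
      stderrSink.push_back(message);
    }
  }
};
```

With these defaults an INFO line such as a task launch is kept in the file for in-depth debugging, while only WARNING and above reach the stderr sink and therefore syslog.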





[jira] [Updated] (MESOS-4781) Executor env variables should not be leaked to the command task.

2016-05-13 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-4781:
-
Sprint: Mesosphere Sprint 30, Mesosphere Sprint 31, Mesosphere Sprint 32, 
Mesosphere Sprint 33, Mesosphere Sprint 34, Mesosphere Sprint 35  (was: 
Mesosphere Sprint 30, Mesosphere Sprint 31, Mesosphere Sprint 32, Mesosphere 
Sprint 33, Mesosphere Sprint 34)

> Executor env variables should not be leaked to the command task.
> 
>
> Key: MESOS-4781
> URL: https://issues.apache.org/jira/browse/MESOS-4781
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Reporter: Gilbert Song
>Assignee: Gilbert Song
>  Labels: mesosphere
>
> Currently, the command task inherits the env variables of the command 
> executor. This is not ideal because the command executor's environment 
> variables include some Mesos internal env variables like MESOS_XXX and 
> LIBPROCESS_XXX. Also, this behavior does not match what the Docker 
> containerizer does. We should construct the env variables from scratch for 
> the command task, rather than relying on inheriting the env variables from 
> the command executor.
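
The difference between the two behaviors can be shown with a minimal sketch. `Environment`, `inherited`, and `fromScratch` are hypothetical names, and the maps stand in for whatever the command executor actually passes to the task; this is not the executor's code.

```cpp
#include <map>
#include <string>

using Environment = std::map<std::string, std::string>;

// Behavior as described in the ticket: the task starts from the
// executor's environment, so internal variables leak into the task.
Environment inherited(
    const Environment& executorEnv,
    const Environment& taskEnv) {
  Environment env = executorEnv;
  for (const auto& entry : taskEnv) {
    env[entry.first] = entry.second;
  }
  return env;
}

// Proposed behavior: build the task environment from scratch, using
// only what the task itself declares.
Environment fromScratch(const Environment& taskEnv) {
  return taskEnv;
}
```

With an executor environment containing `MESOS_XXX` and `LIBPROCESS_XXX`, `inherited` exposes both to the task, while `fromScratch` exposes neither.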





[jira] [Updated] (MESOS-5296) Split Resource and Inverse offer protobufs for V1 API

2016-05-13 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-5296:
-
Sprint: Mesosphere Sprint 34, Mesosphere Sprint 35  (was: Mesosphere Sprint 
34)

> Split Resource and Inverse offer protobufs for V1 API
> -
>
> Key: MESOS-5296
> URL: https://issues.apache.org/jira/browse/MESOS-5296
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Joris Van Remoortere
>Assignee: Joris Van Remoortere
> Fix For: 0.29.0
>
>
> The protobufs for the V1 API regarding inverse offers initially re-used the 
> existing offer / rescind / accept / decline messages for regular offers.
> We should split these out to be more explicit, and provide the ability to 
> augment the messages with particulars of either resource or inverse offers.





[jira] [Updated] (MESOS-4766) Improve allocator performance.

2016-05-13 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-4766:
-
Sprint: Mesosphere Sprint 32, Mesosphere Sprint 33, Mesosphere Sprint 34, 
Mesosphere Sprint 35  (was: Mesosphere Sprint 32, Mesosphere Sprint 33, 
Mesosphere Sprint 34)

> Improve allocator performance.
> --
>
> Key: MESOS-4766
> URL: https://issues.apache.org/jira/browse/MESOS-4766
> Project: Mesos
>  Issue Type: Epic
>  Components: allocation
>Reporter: Benjamin Mahler
>Assignee: Michael Park
>Priority: Critical
>
> This is an epic to track the various tickets around improving the performance 
> of the allocator, including the following:
> * Preventing unnecessary backup of the allocator.
> * Reducing the cost of allocations and allocator state updates.
> * Improving performance of the DRF sorter.
> * More benchmarking to simulate scenarios with performance issues.





[jira] [Updated] (MESOS-4938) Support docker registry authentication

2016-05-13 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-4938:
-
Sprint: Mesosphere Sprint 31, Mesosphere Sprint 32, Mesosphere Sprint 33, 
Mesosphere Sprint 34, Mesosphere Sprint 35  (was: Mesosphere Sprint 31, 
Mesosphere Sprint 32, Mesosphere Sprint 33, Mesosphere Sprint 34)

> Support docker registry authentication
> --
>
> Key: MESOS-4938
> URL: https://issues.apache.org/jira/browse/MESOS-4938
> Project: Mesos
>  Issue Type: Task
>Reporter: Jie Yu
>Assignee: Gilbert Song
>  Labels: mesosphere
>






[jira] [Updated] (MESOS-5167) Add tests for `network/cni` isolator

2016-05-13 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-5167:
-
Sprint: Mesosphere Sprint 33, Mesosphere Sprint 34, Mesosphere Sprint 35  
(was: Mesosphere Sprint 33, Mesosphere Sprint 34)

> Add tests for `network/cni` isolator
> 
>
> Key: MESOS-5167
> URL: https://issues.apache.org/jira/browse/MESOS-5167
> Project: Mesos
>  Issue Type: Task
>  Components: test
>Reporter: Qian Zhang
>Assignee: Qian Zhang
>  Labels: mesosphere
>
> We need to add tests to verify the functionality of `network/cni` isolator.





[jira] [Updated] (MESOS-5168) Benchmark overhead of authorization based filtering.

2016-05-13 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-5168:
-
Sprint: Mesosphere Sprint 33, Mesosphere Sprint 34, Mesosphere Sprint 35  
(was: Mesosphere Sprint 33, Mesosphere Sprint 34)

> Benchmark overhead of authorization based filtering.
> 
>
> Key: MESOS-5168
> URL: https://issues.apache.org/jira/browse/MESOS-5168
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Joerg Schad
>Assignee: Joerg Schad
>  Labels: authorization, mesosphere, security
> Fix For: 0.29.0
>
>
> When adding authorization based filtering as outlined in MESOS-4931, we need 
> to be careful, especially for performance critical endpoints such as /state.
> We should ensure via a benchmark that performance does not degrade below an 
> acceptable level.





[jira] [Updated] (MESOS-4690) Reorganize 3rdparty directory

2016-05-13 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-4690:
-
Sprint: Mesosphere Sprint 33, Mesosphere Sprint 34, Mesosphere Sprint 35  
(was: Mesosphere Sprint 33, Mesosphere Sprint 34)

> Reorganize 3rdparty directory
> -
>
> Key: MESOS-4690
> URL: https://issues.apache.org/jira/browse/MESOS-4690
> Project: Mesos
>  Issue Type: Epic
>  Components: build, libprocess, stout
>Reporter: Kapil Arya
>Assignee: Kapil Arya
>  Labels: mesosphere
>
> This issue is currently being discussed on the dev mailing list:
> http://www.mail-archive.com/dev@mesos.apache.org/msg34349.html





[jira] [Updated] (MESOS-4544) Propose design doc for agent partitioning behavior

2016-05-13 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-4544:
-
Sprint: Mesosphere Sprint 28, Mesosphere Sprint 29, Mesosphere Sprint 30, 
Mesosphere Sprint 31, Mesosphere Sprint 32, Mesosphere Sprint 33, Mesosphere 
Sprint 34, Mesosphere Sprint 35  (was: Mesosphere Sprint 28, Mesosphere Sprint 
29, Mesosphere Sprint 30, Mesosphere Sprint 31, Mesosphere Sprint 32, 
Mesosphere Sprint 33, Mesosphere Sprint 34)

> Propose design doc for agent partitioning behavior
> --
>
> Key: MESOS-4544
> URL: https://issues.apache.org/jira/browse/MESOS-4544
> Project: Mesos
>  Issue Type: Task
>  Components: general
>Reporter: Neil Conway
>Assignee: Neil Conway
>  Labels: mesosphere
>






[jira] [Updated] (MESOS-5275) Add capabilities support for unified containerizer.

2016-05-13 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-5275:
-
Sprint: Mesosphere Sprint 34, Mesosphere Sprint 35  (was: Mesosphere Sprint 
34)

> Add capabilities support for unified containerizer.
> ---
>
> Key: MESOS-5275
> URL: https://issues.apache.org/jira/browse/MESOS-5275
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Jojy Varghese
>Assignee: Jojy Varghese
>  Labels: mesosphere
>
> Add capabilities support for unified containerizer. 
> Requirements:
> 1. Use the mesos capabilities API.
> 2. Frameworks should be able to add capability requests for containers.
> 3. Agents should be able to set the maximum allowed capabilities for all 
> containers launched.
> Design document: 
> https://docs.google.com/document/d/1YiTift8TQla2vq3upQr7K-riQ_pQ-FKOCOsysQJROGc/edit#heading=h.rgfwelqrskmd





[jira] [Updated] (MESOS-4785) Reorganize ACL subject/object descriptions

2016-05-13 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-4785:
-
Sprint: Mesosphere Sprint 33, Mesosphere Sprint 34, Mesosphere Sprint 35  
(was: Mesosphere Sprint 33, Mesosphere Sprint 34)

> Reorganize ACL subject/object descriptions
> --
>
> Key: MESOS-4785
> URL: https://issues.apache.org/jira/browse/MESOS-4785
> Project: Mesos
>  Issue Type: Documentation
>  Components: documentation
>Reporter: Greg Mann
>Assignee: Alexander Rojas
>  Labels: documentation, mesosphere, security
> Fix For: 0.29.0
>
>
> The authorization documentation would benefit from a reorganization of the 
> ACL subject/object descriptions. Instead of simple lists of the available 
> subjects and objects, it would be nice to see a table showing which subject 
> and object is used with each action.





[jira] [Updated] (MESOS-5169) Introduce new Authorizer Actions for Authorized based filtering of endpoints.

2016-05-13 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-5169:
-
Sprint: Mesosphere Sprint 33, Mesosphere Sprint 34, Mesosphere Sprint 35  
(was: Mesosphere Sprint 33, Mesosphere Sprint 34)

> Introduce new Authorizer Actions for Authorized based filtering of endpoints.
> -
>
> Key: MESOS-5169
> URL: https://issues.apache.org/jira/browse/MESOS-5169
> Project: Mesos
>  Issue Type: Improvement
>  Components: security
>Reporter: Joerg Schad
>Assignee: Joerg Schad
>  Labels: authorization, mesosphere, security
> Fix For: 0.29.0
>
>
> For authorization based endpoint filtering we need to introduce the 
> authorizer actions outlined via MESOS-4932.





[jira] [Updated] (MESOS-5051) Create helpers for manipulating Linux capabilities.

2016-05-13 Thread Artem Harutyunyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Artem Harutyunyan updated MESOS-5051:
-
Sprint: Mesosphere Sprint 32, Mesosphere Sprint 33, Mesosphere Sprint 34, 
Mesosphere Sprint 35  (was: Mesosphere Sprint 32, Mesosphere Sprint 33, 
Mesosphere Sprint 34)

> Create helpers for manipulating Linux capabilities.
> ---
>
> Key: MESOS-5051
> URL: https://issues.apache.org/jira/browse/MESOS-5051
> Project: Mesos
>  Issue Type: Task
>Reporter: Jie Yu
>Assignee: Jojy Varghese
>  Labels: mesosphere
>
> These helpers can either be based on some existing library (e.g. libcap), or 
> use system calls directly.





[jira] [Updated] (MESOS-5316) Authenticate the agent's '/containers' endpoint

2016-05-13 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-5316:
---
Shepherd: Alexander Rukletsov
  Sprint: Mesosphere Sprint 35

> Authenticate the agent's '/containers' endpoint
> ---
>
> Key: MESOS-5316
> URL: https://issues.apache.org/jira/browse/MESOS-5316
> Project: Mesos
>  Issue Type: Improvement
>  Components: security, slave
>Reporter: Greg Mann
>Assignee: Abhishek Dasgupta
>  Labels: authentication, mesosphere
> Fix For: 0.29.0
>
>
> The {{/containers}} endpoint was recently added to the agent. Authentication 
> should be enabled on this endpoint.





[jira] [Updated] (MESOS-5336) Add authorization to GET /quota

2016-05-13 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-5336:
---
Sprint: Mesosphere Sprint 35

> Add authorization to GET /quota
> ---
>
> Key: MESOS-5336
> URL: https://issues.apache.org/jira/browse/MESOS-5336
> Project: Mesos
>  Issue Type: Improvement
>  Components: master, security
>Reporter: Adam B
>  Labels: mesosphere, security
> Fix For: 0.29.0
>
>
> We already authorize which http users can set/remove quota for particular 
> roles, but even knowing of the existence of these roles (let alone their 
> quotas) may be sensitive information. We should add authz around GET 
> operations on /quota.





[jira] [Commented] (MESOS-5336) Add authorization to GET /quota

2016-05-13 Thread Alexander Rukletsov (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282727#comment-15282727
 ] 

Alexander Rukletsov commented on MESOS-5336:


Let's punt on the coarse-grained authz for now.

> Add authorization to GET /quota
> ---
>
> Key: MESOS-5336
> URL: https://issues.apache.org/jira/browse/MESOS-5336
> Project: Mesos
>  Issue Type: Improvement
>  Components: master, security
>Reporter: Adam B
>  Labels: mesosphere, security
> Fix For: 0.29.0
>
>
> We already authorize which http users can set/remove quota for particular 
> roles, but even knowing of the existence of these roles (let alone their 
> quotas) may be sensitive information. We should add authz around GET 
> operations on /quota.





[jira] [Commented] (MESOS-5316) Authenticate the agent's '/containers' endpoint

2016-05-13 Thread Abhishek Dasgupta (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282578#comment-15282578
 ] 

Abhishek Dasgupta commented on MESOS-5316:
--

Another RR has been added:
https://reviews.apache.org/r/47340/

> Authenticate the agent's '/containers' endpoint
> ---
>
> Key: MESOS-5316
> URL: https://issues.apache.org/jira/browse/MESOS-5316
> Project: Mesos
>  Issue Type: Improvement
>  Components: security, slave
>Reporter: Greg Mann
>Assignee: Abhishek Dasgupta
>  Labels: authentication, mesosphere
> Fix For: 0.29.0
>
>
> The {{/containers}} endpoint was recently added to the agent. Authentication 
> should be enabled on this endpoint.





[jira] [Commented] (MESOS-5377) Improve DRF behavior with scarce resources.

2016-05-13 Thread Guangya Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282574#comment-15282574
 ] 

Guangya Liu commented on MESOS-5377:


What about enhancing the sorter to ignore the {{scarce resources}} when computing 
the share, and only consider the major resources (cpu, memory, disk, etc.)? The 
cluster admin can define the {{scarce resources}} list via a master flag.
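
A rough sketch of what that could look like, using the cluster from the issue description (1000 agents with cpus:4,mem:1024,disk:1024 plus 1 agent that also has gpus:1). This is not the Mesos DRF sorter: `Resources`, `dominantShare`, and `clusterTotal` are hypothetical names, weights are left out, and the `scarce` set stands in for whatever the master flag would provide.

```cpp
#include <algorithm>
#include <map>
#include <set>
#include <string>

// Illustrative type: resource name -> scalar amount. Not a Mesos type.
using Resources = std::map<std::string, double>;

// Returns the dominant share of a role: the largest allocated / total
// fraction over all resources, skipping every resource named in `scarce`.
double dominantShare(
    const Resources& allocated,
    const Resources& total,
    const std::set<std::string>& scarce = {})
{
  double share = 0.0;
  for (const auto& resource : allocated) {
    if (scarce.count(resource.first) > 0) {
      continue;  // Excluded from the share calculation.
    }
    auto it = total.find(resource.first);
    if (it == total.end() || it->second <= 0.0) {
      continue;  // Unknown or empty resource: nothing to divide by.
    }
    share = std::max(share, resource.second / it->second);
  }
  return share;
}

// Totals for the example cluster: 1001 agents with 4 cpus, 1024 mem and
// 1024 disk each, and a single GPU in the whole cluster.
Resources clusterTotal()
{
  return {
    {"cpus", 4004.0},
    {"mem", 1025024.0},
    {"disk", 1025024.0},
    {"gpus", 1.0}};
}
```

For a role holding {gpus:1, cpus:4, mem:1024}, `dominantShare(allocated, clusterTotal())` is 1.0 because of the single GPU, while passing `{"gpus"}` as the scarce set brings it down to roughly 0.001, so the role would keep receiving offers for the non-GPU agents.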

> Improve DRF behavior with scarce resources.
> ---
>
> Key: MESOS-5377
> URL: https://issues.apache.org/jira/browse/MESOS-5377
> Project: Mesos
>  Issue Type: Epic
>  Components: allocation
>Reporter: Benjamin Mahler
>
> The allocator currently uses the notion of Weighted [Dominant Resource 
> Fairness|https://www.cs.berkeley.edu/~alig/papers/drf.pdf] (WDRF) to 
> establish a linear notion of fairness across allocation roles.
> DRF behaves well for resources that are present within each machine in a 
> cluster (e.g. CPUs, memory, disk). However, some resources (e.g. GPUs) are 
> only present on a subset of machines in the cluster.
> Consider the behavior when there are the following agents in a cluster:
> 1000 agents with (cpus:4,mem:1024,disk:1024)
> 1 agent with (gpus:1,cpus:4,mem:1024,disk:1024)
> If a role wishes to use both GPU and non-GPU resources for tasks, consuming 1 
> GPU will lead DRF to consider the role to have a 100% share of the cluster, 
> since it consumes 100% of the GPUs in the cluster. This framework will then 
> not receive any other offers.
> Among possible improvements, fairness can have understanding of resource 
> packages. In a sense there is 1 GPU package that is competed on and 1000 
> non-GPU packages competed on, and ideally a role's consumption of the single 
> GPU package does not have a large effect on the role's access to the other 
> 1000 non-GPU packages.
> In the interim, we should consider having a recommended way to deal with 
> scarce resources in the current model.





[jira] [Commented] (MESOS-5379) Authentication documentation for libprocess endpoints can be misleading.

2016-05-13 Thread Benjamin Bannier (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282572#comment-15282572
 ] 

Benjamin Bannier commented on MESOS-5379:
-

What I personally like about the current {{AUTHENTICATION}} function is that it 
leaves little room to deviate from a standard format. This makes the resulting 
output extremely easy to parse and, e.g., allows confirming expected behavior 
automatically. Sadly, this is also the reason it has problems with this use 
case in libprocess.

It would be great if we could integrate a fix here into a larger story of 
enabling generation of a machine-parseable REST API spec (like swagger/RAML/...) 
in addition to the text representation for our human friends.

> Authentication documentation for libprocess endpoints can be misleading.
> 
>
> Key: MESOS-5379
> URL: https://issues.apache.org/jira/browse/MESOS-5379
> Project: Mesos
>  Issue Type: Bug
>  Components: documentation, libprocess
>Affects Versions: 0.29.0
>Reporter: Benjamin Bannier
>Priority: Blocker
>  Labels: mesosphere, tech-debt
> Fix For: 0.29.0
>
>
> Libprocess exposes a number of endpoints (at least: {{/logging}}, 
> {{/metrics}}, and {{/profiler}}). If libprocess was initialized with some 
> realm these endpoints require authentication, and don't if not.
> To generate endpoint help we currently use the also function 
> {{AUTHENTICATION}} which injects the following into the help string,
> {code}
> This endpoints requires authentication iff HTTP authentication is enabled.
> {code}
> with {{iff}} documenting a coupling stronger between required authentication 
> and enabled authentication which might not be true for above libprocess 
> endpoints -- it is e.g., true when these endpoints are exposed through mesos 
> masters/agents, but possibly not if exposed through other executables.
> It seems for libprocess endpoint a less strong formulation like e.g.,
> {code}
> This endpoints supports authentication. If HTTP authentication is enabled, 
> this endpoint may require authentication.
> {code}
> might make the generated help strings more reusable.





[jira] [Updated] (MESOS-3335) FlagsBase copy-ctor leads to dangling pointer.

2016-05-13 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-3335:
---
  Labels: mesosphere  (was: )
Priority: Major  (was: Minor)
 Summary: FlagsBase copy-ctor leads to dangling pointer.  (was: FlagsBase 
copy-ctor leads to dangling pointer)

> FlagsBase copy-ctor leads to dangling pointer.
> --
>
> Key: MESOS-3335
> URL: https://issues.apache.org/jira/browse/MESOS-3335
> Project: Mesos
>  Issue Type: Bug
>Reporter: Neil Conway
>Assignee: Benjamin Bannier
>  Labels: mesosphere
> Attachments: lambda_capture_bug.cpp
>
>
> Per [#3328], ubsan detects the following problem:
> [ RUN ] FaultToleranceTest.ReregisterCompletedFrameworks
> /mesos/3rdparty/libprocess/3rdparty/stout/include/stout/flags/flags.hpp:303:25:
>  runtime error: load of value 33, which is not a valid value for type 'bool'
> I believe what is going on here is the following:
> * The test calls StartMaster(), which does MesosTest::CreateMasterFlags()
> * MesosTest::CreateMasterFlags() allocates a new master::Flags on the stack, 
> which is subsequently copy-constructed back to StartMaster()
> * The FlagsBase constructor is:
> bq. {{FlagsBase() { add(&help, "help", "...", false); }}}
> where "help" is a member variable -- i.e., it is allocated on the stack in 
> this case.
> * {{FlagsBase::add}} captures {{&help}}, e.g.:
> {noformat}
> flag.stringify = [t1](const FlagsBase&) -> Option<std::string> {
>     return stringify(*t1);
>   };
> {noformat}
> * The implicit copy constructor for FlagsBase is just going to copy the 
> lambda above, i.e., the result of the copy constructor will have a lambda 
> that points into MesosTest::CreateMasterFlags()'s stack frame, which is bad 
> news.
> Not sure of the right fix -- comments welcome. You could define a copy-ctor 
> for FlagsBase that does something gross (basically remove the old help flag 
> and define a new one that points into the target of the copy), but that seems, 
> well, gross.
> Probably not a pressing problem to fix -- AFAICS the worst symptom is that we 
> end up reading one byte from some random stack location when serving 
> {{state.json}}, for example.
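
The aliasing described above can be reproduced in isolation. Below is a minimal sketch, assuming a heavily simplified stand-in for FlagsBase: `Flags`, `SafeFlags`, and `copyReportsAfterChange` are hypothetical names, not the stout code. To stay free of undefined behavior the original object is kept alive, so the copy's lambda reads a stale but still valid member instead of a dead stack slot. `SafeFlags` shows one possible way out (capturing a pointer-to-member and using the object passed to the lambda), which is not necessarily the fix that should go into stout.

```cpp
#include <functional>
#include <string>

// Simplified stand-in for FlagsBase. The lambda captures a raw pointer to
// this object's own member, as in the snippet quoted above.
struct Flags
{
  Flags()
  {
    bool* t1 = &help;
    stringify = [t1]() { return std::string(*t1 ? "true" : "false"); };
  }

  // No user-defined copy constructor: the implicit one copies `stringify`
  // as-is, so a copy's lambda still points at the source object's `help`.

  bool help = false;
  std::function<std::string()> stringify;
};

// Copies a Flags object, changes only the copy, and returns what the copy
// reports. The lambda reads `original.help`, so the change is not visible.
std::string copyReportsAfterChange()
{
  Flags original;
  Flags copy = original;
  copy.help = true;
  return copy.stringify();
}

// One possible alternative: capture a pointer-to-member, which carries no
// object address, and dereference it on the object passed in by the caller.
struct SafeFlags
{
  SafeFlags()
  {
    bool SafeFlags::* member = &SafeFlags::help;
    stringify = [member](const SafeFlags& flags) {
      return std::string(flags.*member ? "true" : "false");
    };
  }

  bool help = false;
  std::function<std::string(const SafeFlags&)> stringify;
};
```

In the real bug the source object lives in the stack frame of MesosTest::CreateMasterFlags() and is gone by the time the copy is used, which is why ubsan sees a garbage value for the bool rather than just a stale one.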





[jira] [Commented] (MESOS-5379) Authentication documentation for libprocess endpoints can be misleading.

2016-05-13 Thread Joerg Schad (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282558#comment-15282558
 ] 

Joerg Schad commented on MESOS-5379:


When introducing that macro, we discussed that we might want to extend the 
message (e.g., by making it a custom string or an enum) to allow a more 
detailed message later on.
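
As a rough illustration of the enum variant, something along these lines might work. This is only a sketch: `AuthenticationHelp` and `authenticationHelp` are made-up names, not the existing {{AUTHENTICATION}} helper, and the two strings are the formulations from the ticket with the grammar tidied.

```cpp
#include <string>

// Hypothetical enum selecting which formulation goes into the help string.
enum class AuthenticationHelp
{
  REQUIRED_IFF_ENABLED,  // Strong coupling, e.g. for master/agent endpoints.
  MAY_BE_REQUIRED        // Weaker wording, e.g. for libprocess endpoints.
};

// Returns the fixed help text for the given kind. Keeping the set of
// possible strings closed preserves the easy-to-parse standard format.
std::string authenticationHelp(AuthenticationHelp kind)
{
  switch (kind) {
    case AuthenticationHelp::REQUIRED_IFF_ENABLED:
      return "This endpoint requires authentication iff HTTP authentication"
             " is enabled.";
    case AuthenticationHelp::MAY_BE_REQUIRED:
      return "This endpoint supports authentication. If HTTP authentication"
             " is enabled, this endpoint may require authentication.";
  }
  return "";  // Unreachable for valid enum values.
}
```

An enum keeps the output as uniform as it is today, whereas a free-form custom string would be more flexible but would give up the property that the generated help can be checked automatically.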


> Authentication documentation for libprocess endpoints can be misleading.
> 
>
> Key: MESOS-5379
> URL: https://issues.apache.org/jira/browse/MESOS-5379
> Project: Mesos
>  Issue Type: Bug
>  Components: documentation, libprocess
>Affects Versions: 0.29.0
>Reporter: Benjamin Bannier
>Priority: Blocker
>  Labels: mesosphere, tech-debt
> Fix For: 0.29.0
>
>
> Libprocess exposes a number of endpoints (at least: {{/logging}}, 
> {{/metrics}}, and {{/profiler}}). If libprocess was initialized with some 
> realm these endpoints require authentication, and don't if not.
> To generate endpoint help we currently use the also function 
> {{AUTHENTICATION}} which injects the following into the help string,
> {code}
> This endpoints requires authentication iff HTTP authentication is enabled.
> {code}
> with {{iff}} documenting a coupling stronger between required authentication 
> and enabled authentication which might not be true for above libprocess 
> endpoints -- it is e.g., true when these endpoints are exposed through mesos 
> masters/agents, but possibly not if exposed through other executables.
> It seems for libprocess endpoint a less strong formulation like e.g.,
> {code}
> This endpoints supports authentication. If HTTP authentication is enabled, 
> this endpoint may require authentication.
> {code}
> might make the generated help strings more reusable.





[jira] [Updated] (MESOS-5379) Authentication documentation for libprocess endpoints can be misleading.

2016-05-13 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-5379:
---
 Shepherd: Alexander Rukletsov
 Priority: Blocker  (was: Major)
Fix Version/s: 0.29.0

> Authentication documentation for libprocess endpoints can be misleading.
> 
>
> Key: MESOS-5379
> URL: https://issues.apache.org/jira/browse/MESOS-5379
> Project: Mesos
>  Issue Type: Bug
>  Components: documentation, libprocess
>Affects Versions: 0.29.0
>Reporter: Benjamin Bannier
>Priority: Blocker
>  Labels: mesosphere, tech-debt
> Fix For: 0.29.0
>
>
> Libprocess exposes a number of endpoints (at least: {{/logging}}, 
> {{/metrics}}, and {{/profiler}}). If libprocess was initialized with some 
> realm these endpoints require authentication, and don't if not.
> To generate endpoint help we currently use the also function 
> {{AUTHENTICATION}} which injects the following into the help string,
> {code}
> This endpoints requires authentication iff HTTP authentication is enabled.
> {code}
> with {{iff}} documenting a coupling stronger between required authentication 
> and enabled authentication which might not be true for above libprocess 
> endpoints -- it is e.g., true when these endpoints are exposed through mesos 
> masters/agents, but possibly not if exposed through other executables.
> It seems for libprocess endpoint a less strong formulation like e.g.,
> {code}
> This endpoints supports authentication. If HTTP authentication is enabled, 
> this endpoint may require authentication.
> {code}
> might make the generated help strings more reusable.





[jira] [Commented] (MESOS-2731) Allow frameworks to deploy storage drivers on demand.

2016-05-13 Thread Guangya Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282547#comment-15282547
 ] 

Guangya Liu commented on MESOS-2731:


MESOS-4355 resolved part of this: with MESOS-4355, Mesos can leverage external 
storage through the docker volume API.

> Allow frameworks to deploy storage drivers on demand.
> -
>
> Key: MESOS-2731
> URL: https://issues.apache.org/jira/browse/MESOS-2731
> Project: Mesos
>  Issue Type: Epic
>Reporter: Joerg Schad
>  Labels: mesosphere
>
> Certain storage options require storage drivers to access them, including the 
> HDFS driver, Quobyte client, database drivers, and so on.
> When tasks in Mesos require access to such storage, they also need access to 
> the respective driver on the node where they were scheduled.
> As it is not desirable to deploy the driver onto all nodes in the cluster, it 
> would be good to deploy the driver on demand.
> Use Cases:
> 1. Fetcher Cache pulling resources from user-provided URIs
> 2. Framework executors/tasks requiring r/w access to HDFS/DFS
> 3. Framework executors/tasks requiring r/w database access (requiring 
> drivers)





[jira] [Updated] (MESOS-5379) Authentication documentation for libprocess endpoints can be misleading

2016-05-13 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-5379:
---
Description: 
Libprocess exposes a number of endpoints (at least: {{/logging}}, {{/metrics}}, 
and {{/profiler}}). If libprocess was initialized with some realm these 
endpoints require authentication, and don't if not.

To generate endpoint help we currently use the also function {{AUTHENTICATION}} 
which injects the following into the help string,
{code}
This endpoints requires authentication iff HTTP authentication is enabled.
{code}
with {{iff}} documenting a coupling stronger between required authentication 
and enabled authentication which might not be true for above libprocess 
endpoints -- it is e.g., true when these endpoints are exposed through mesos 
masters/agents, but possibly not if exposed through other executables.

It seems for libprocess endpoint a less strong formulation like e.g.,
{code}
This endpoints supports authentication. If HTTP authentication is enabled, this 
endpoint may require authentication.
{code}
might make the generated help strings more reusable.

  was:
Libprocess exposes a number of endpoints (at least: {{/logging}}, {{/metrics}}, 
and {{/profiler}}). If libprocess was initialized with some realm these 
endpoints require authentication, and don't if not.

To generate endpoint help we currently use the also function {{AUTHENTICATION}} 
which injects the following into the helpstring,
{code}
This endpoints requires authentication iff HTTP authentication is enabled.
{code}
with {{iff}} documenting a coupling stronger between required authentication 
and enabled authentication which might not be true for above libprocess 
endpoints -- it is e.g., true when these endpoints are exposed through mesos 
masters/agents, but possibly not if exposed through other actors.

It seems for libprocess endpoint a less strong formulation like e.g.,
{code}
This endpoints supports authentication. If HTTP authentication is enabled, this 
endpoint may require authentication.`
{code}
might make the generated helpstrings more reusable.


> Authentication documentation for libprocess endpoints can be misleading
> ---
>
> Key: MESOS-5379
> URL: https://issues.apache.org/jira/browse/MESOS-5379
> Project: Mesos
>  Issue Type: Bug
>  Components: documentation, libprocess
>Affects Versions: 0.29.0
>Reporter: Benjamin Bannier
>  Labels: mesosphere, tech-debt
>
> Libprocess exposes a number of endpoints (at least: {{/logging}}, 
> {{/metrics}}, and {{/profiler}}). If libprocess was initialized with some 
> realm these endpoints require authentication, and don't if not.
> To generate endpoint help we currently use the also function 
> {{AUTHENTICATION}} which injects the following into the help string,
> {code}
> This endpoints requires authentication iff HTTP authentication is enabled.
> {code}
> with {{iff}} documenting a coupling stronger between required authentication 
> and enabled authentication which might not be true for above libprocess 
> endpoints -- it is e.g., true when these endpoints are exposed through mesos 
> masters/agents, but possibly not if exposed through other executables.
> It seems for libprocess endpoint a less strong formulation like e.g.,
> {code}
> This endpoints supports authentication. If HTTP authentication is enabled, 
> this endpoint may require authentication.
> {code}
> might make the generated help strings more reusable.





[jira] [Updated] (MESOS-5379) Authentication documentation for libprocess endpoints can be misleading.

2016-05-13 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-5379:
---
Summary: Authentication documentation for libprocess endpoints can be 
misleading.  (was: Authentication documentation for libprocess endpoints can be 
misleading)

> Authentication documentation for libprocess endpoints can be misleading.
> 
>
> Key: MESOS-5379
> URL: https://issues.apache.org/jira/browse/MESOS-5379
> Project: Mesos
>  Issue Type: Bug
>  Components: documentation, libprocess
>Affects Versions: 0.29.0
>Reporter: Benjamin Bannier
>  Labels: mesosphere, tech-debt
>
> Libprocess exposes a number of endpoints (at least: {{/logging}}, 
> {{/metrics}}, and {{/profiler}}). If libprocess was initialized with some 
> realm these endpoints require authentication, and don't if not.
> To generate endpoint help we currently use the also function 
> {{AUTHENTICATION}} which injects the following into the help string,
> {code}
> This endpoints requires authentication iff HTTP authentication is enabled.
> {code}
> with {{iff}} documenting a coupling stronger between required authentication 
> and enabled authentication which might not be true for above libprocess 
> endpoints -- it is e.g., true when these endpoints are exposed through mesos 
> masters/agents, but possibly not if exposed through other executables.
> It seems for libprocess endpoint a less strong formulation like e.g.,
> {code}
> This endpoints supports authentication. If HTTP authentication is enabled, 
> this endpoint may require authentication.
> {code}
> might make the generated help strings more reusable.





[jira] [Updated] (MESOS-5379) Authentication documentation for libprocess endpoints can be misleading

2016-05-13 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-5379:
---
Labels: mesosphere tech-debt  (was: )

> Authentication documentation for libprocess endpoints can be misleading
> ---
>
> Key: MESOS-5379
> URL: https://issues.apache.org/jira/browse/MESOS-5379
> Project: Mesos
> Issue Type: Bug
> Components: documentation, libprocess
> Affects Versions: 0.29.0
> Reporter: Benjamin Bannier
> Labels: mesosphere, tech-debt
>
> Libprocess exposes a number of endpoints (at least {{/logging}}, 
> {{/metrics}}, and {{/profiler}}). If libprocess was initialized with some 
> realm, these endpoints require authentication; otherwise they do not.
> To generate endpoint help we currently use the function 
> {{AUTHENTICATION}}, which injects the following into the help string:
> {code}
> This endpoint requires authentication iff HTTP authentication is enabled.
> {code}
> Here {{iff}} documents a stronger coupling between required authentication 
> and enabled authentication than might hold for the above libprocess 
> endpoints: it holds, e.g., when these endpoints are exposed through Mesos 
> masters/agents, but possibly not when they are exposed through other actors.
> For libprocess endpoints, a weaker formulation such as
> {code}
> This endpoint supports authentication. If HTTP authentication is enabled, 
> this endpoint may require authentication.
> {code}
> might make the generated help strings more reusable.





[jira] [Created] (MESOS-5379) Authentication documentation for libprocess endpoints can be misleading

2016-05-13 Thread Benjamin Bannier (JIRA)
Benjamin Bannier created MESOS-5379:
---

 Summary: Authentication documentation for libprocess endpoints can be misleading
 Key: MESOS-5379
 URL: https://issues.apache.org/jira/browse/MESOS-5379
 Project: Mesos
 Issue Type: Bug
 Components: documentation, libprocess
 Affects Versions: 0.29.0
 Reporter: Benjamin Bannier


Libprocess exposes a number of endpoints (at least {{/logging}}, {{/metrics}}, 
and {{/profiler}}). If libprocess was initialized with some realm, these 
endpoints require authentication; otherwise they do not.

To generate endpoint help we currently use the function {{AUTHENTICATION}}, 
which injects the following into the help string:
{code}
This endpoint requires authentication iff HTTP authentication is enabled.
{code}
Here {{iff}} documents a stronger coupling between required authentication 
and enabled authentication than might hold for the above libprocess 
endpoints: it holds, e.g., when these endpoints are exposed through Mesos 
masters/agents, but possibly not when they are exposed through other actors.

For libprocess endpoints, a weaker formulation such as
{code}
This endpoint supports authentication. If HTTP authentication is enabled, this 
endpoint may require authentication.
{code}
might make the generated help strings more reusable.
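
As a rough illustration of the proposal, a help-string generator could let the caller choose between the strict and the weaker wording. This is only a sketch: the names {{Coupling}} and {{authenticationHelp}} are hypothetical and are not part of the libprocess API, and the real {{AUTHENTICATION}} function may have a different signature.

```cpp
#include <string>

// Hypothetical: how tightly "authentication enabled" is coupled to
// "authentication required" for a given endpoint.
enum class Coupling { Strict, Weak };

// Hypothetical helper returning the authentication paragraph of an
// endpoint's help string. Not the actual libprocess implementation.
inline std::string authenticationHelp(Coupling coupling)
{
  switch (coupling) {
    case Coupling::Strict:
      // Wording suitable when the executable guarantees the coupling,
      // e.g. endpoints exposed through Mesos masters/agents.
      return "This endpoint requires authentication iff HTTP "
             "authentication is enabled.";
    case Coupling::Weak:
      // Weaker wording for generic libprocess endpoints.
      return "This endpoint supports authentication. If HTTP "
             "authentication is enabled, this endpoint may require "
             "authentication.";
  }
  return "";  // Unreachable for valid enum values.
}
```

With such a split, the libprocess endpoints could use the weak wording while Mesos masters/agents keep the strict one.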





[jira] [Commented] (MESOS-3435) Add containerizer support for hyper

2016-05-13 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282478#comment-15282478
 ] 

haosdent commented on MESOS-3435:
-

Unfortunately, Till is not available for at least the next 4 weeks, so we have 
not yet started to review the containerizer modularization.

{quote}
Sounds like the hyper folks are improving their APIs for easier integration 
with Mesos
{quote}

Do you have more details about this or their contact information? I would like 
to contact them. :-)

> Add containerizer support for hyper
> ---
>
> Key: MESOS-3435
> URL: https://issues.apache.org/jira/browse/MESOS-3435
> Project: Mesos
> Issue Type: Story
> Reporter: Deshi Xiao
> Assignee: haosdent
>
> As secure as a hypervisor, as fast and easy to use as Docker: this is hyper. 
> https://docs.hyper.sh/Introduction/what_is_hyper_.html We could implement 
> this as a module once MESOS-3709 is finished.





[jira] [Commented] (MESOS-3435) Add containerizer support for hyper

2016-05-13 Thread Timothy Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282470#comment-15282470
 ] 

Timothy Chen commented on MESOS-3435:
-

Sounds like the hyper folks are improving their APIs for easier integration 
with Mesos. [~haosd...@gmail.com], is the module for containerization merged now?

> Add containerizer support for hyper
> ---
>
> Key: MESOS-3435
> URL: https://issues.apache.org/jira/browse/MESOS-3435
> Project: Mesos
> Issue Type: Story
> Reporter: Deshi Xiao
> Assignee: haosdent
>
> As secure as a hypervisor, as fast and easy to use as Docker: this is hyper. 
> https://docs.hyper.sh/Introduction/what_is_hyper_.html We could implement 
> this as a module once MESOS-3709 is finished.


