[jira] [Commented] (MESOS-5735) Update WebUI to use v1 operator API

2016-09-08 Thread Jay Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15476137#comment-15476137
 ] 

Jay Guo commented on MESOS-5735:


JSONP won't work anyway, since the HTTP API moved from `GET` to `POST`. `CORS` 
imposes security risks and may only be suitable for development purposes. 
Hence, we are thinking of having a proxy in the master that relays 
requests/responses between the WebUI and agents, so that from the WebUI's 
point of view all resources always come from a single domain.
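
For context, a quick sketch of why JSONP is a dead end here: JSONP can only 
issue script-tag `GET` requests, while every v1 operator call is a `POST` with 
a typed JSON body. Assuming the standard `/api/v1` endpoint on the master 
(host and port below are placeholders), a state query looks like:

{code}
# v1 operator API calls are POSTs carrying a typed JSON body; JSONP
# (script-tag GETs) cannot express this, hence the need for a proxy.
curl -X POST http://<master-host>:5050/api/v1 \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  -d '{"type": "GET_STATE"}'
{code}

With a master-side proxy, the WebUI would issue the same `POST` against the 
master's own origin even when the data ultimately comes from an agent.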

> Update WebUI to use v1 operator API
> ---
>
> Key: MESOS-5735
> URL: https://issues.apache.org/jira/browse/MESOS-5735
> Project: Mesos
>  Issue Type: Bug
>Reporter: Vinod Kone
>Assignee: Jay Guo
>
> Having the WebUI use the v1 API would be a good validation of its usefulness 
> and correctness.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (MESOS-5735) Update WebUI to use v1 operator API

2016-09-08 Thread Yan Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15475960#comment-15475960
 ] 

Yan Xu edited comment on MESOS-5735 at 9/9/16 5:22 AM:
---

Could you elaborate on "in master to proxy the requests to every agent"?

Just a reminder that ultimately we'd like to replace JSONP (with CORS?). This 
is tracked by MESOS-5918, which resulted from the discussion in MESOS-5911 (in 
which [~anandmazumdar] pointed out its implications for the webUI work as well).

Do we want to take this on before we address MESOS-5918? Are we able to address 
this without incurring too much tech debt?


was (Author: xujyan):
Could you elaborate on "in master to proxy the requests to every agent"?

Just a reminder that ultimately we'd like to replace JSONP (with CORS?). This 
is tracked by MESOS-5911, which resulted from the discussion in MESOS-5918 (in 
which [~anandmazumdar] pointed out its implications for the webUI work as well).

Do we want to take this on before we address MESOS-5911? Are we able to address 
this without incurring too much tech debt?

> Update WebUI to use v1 operator API
> ---
>
> Key: MESOS-5735
> URL: https://issues.apache.org/jira/browse/MESOS-5735
> Project: Mesos
>  Issue Type: Bug
>Reporter: Vinod Kone
>Assignee: Jay Guo
>
> Having the WebUI use the v1 API would be a good validation of its usefulness 
> and correctness.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-5735) Update WebUI to use v1 operator API

2016-09-08 Thread Yan Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15475960#comment-15475960
 ] 

Yan Xu commented on MESOS-5735:
---

Could you elaborate on "in master to proxy the requests to every agent"?

Just a reminder that ultimately we'd like to replace JSONP (with CORS?). This 
is tracked by MESOS-5911, which resulted from the discussion in MESOS-5918 (in 
which [~anandmazumdar] pointed out its implications for the webUI work as well).

Do we want to take this on before we address MESOS-5911? Are we able to address 
this without incurring too much tech debt?
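
For reference, replacing JSONP with CORS would mean masters/agents answering 
cross-origin requests with {{Access-Control-Allow-*}} headers, which they do 
not implement today (that is what MESOS-5918 tracks). A sketch of the 
preflight a browser would send before a cross-origin {{POST}} (hosts and ports 
are placeholders):

{code}
# CORS preflight for a cross-origin POST; the agent would have to answer
# with appropriate Access-Control-Allow-* headers for the call to proceed.
curl -i -X OPTIONS http://<agent-host>:5051/api/v1 \
  -H 'Origin: http://<master-host>:5050' \
  -H 'Access-Control-Request-Method: POST' \
  -H 'Access-Control-Request-Headers: content-type'
{code}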

> Update WebUI to use v1 operator API
> ---
>
> Key: MESOS-5735
> URL: https://issues.apache.org/jira/browse/MESOS-5735
> Project: Mesos
>  Issue Type: Bug
>Reporter: Vinod Kone
>Assignee: Jay Guo
>
> Having the WebUI use the v1 API would be a good validation of its usefulness 
> and correctness.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-6143) resolv.conf is not copied when using the Mesos containerizer with a Docker image

2016-09-08 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15475434#comment-15475434
 ] 

Jie Yu commented on MESOS-6143:
---

Can you try the alpine image to see if that works?

We have a unit test that covers this:
https://github.com/apache/mesos/blob/master/src/tests/containerizer/cni_isolator_tests.cpp#L521

We've been running this unit test on every commit on Debian 8.
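
A minimal check along the lines of the reproduction above, with only the image 
swapped for stock alpine (the master address is taken from the reporter's 
command):

{code}
# Same repro as above but with the alpine image; if /etc/resolv.conf is
# populated inside the container, the copy works on this setup.
mesos-execute --master=mesosmaster1:5050 \
  --name=dns-test-alpine \
  --docker_image=alpine \
  --command="cat /etc/resolv.conf"
{code}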

> resolv.conf is not copied when using the Mesos containerizer with a Docker 
> image
> 
>
> Key: MESOS-6143
> URL: https://issues.apache.org/jira/browse/MESOS-6143
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization, isolation
>Affects Versions: 1.0.0
> Environment: OS: Debian Jessie
> Mesos version: 1.0.0
>Reporter: Justin Pinkul
>Assignee: Avinash Sridharan
> Fix For: 1.1.0
>
>
> When using the Mesos containerizer, host networking, and a Docker image, 
> {{resolv.conf}} is not copied from the host. The only piece of Mesos code 
> that copies this file is currently in the {{network/cni}} isolator, so I 
> tried turning this on by setting 
> {{isolation=network/cni,namespaces/pid,docker/runtime,cgroups/devices,gpu/nvidia,cgroups/cpu,disk/du,filesystem/linux}},
>  but the issue still remained. I suspect this might be related to not setting 
> {{network_cni_config_dir}} and {{network_cni_plugins_dir}} but it seems 
> incorrect that these flags would be required to use host networking.
> Here is how I am able to reproduce this issue:
> {code}
> mesos-execute --master=mesosmaster1:5050 \
>   --name=dns-test \
>   --docker_image=my-docker-image:1.1.3 \
>   --command="bash -c 'ping google.com; while ((1)); do date; 
> sleep 10; done'"
> # Find the PID of mesos-executor's child process and enter it
> nsenter -m -u -i -n -p -r -w -t $PID
> # This file will be empty
> cat /etc/resolv.conf
> {code}
> {code:title=Mesos agent log}
> I0908 17:39:24.599149 181564 slave.cpp:1688] Launching task dns-test for 
> framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006
> I0908 17:39:24.599567 181564 paths.cpp:528] Trying to chown 
> '/mnt/01/mesos_work/slaves/67025326-9dfd-4cbb-a008-454a40bce2f5-S2/frameworks/51831498-0902-4ae9-a1ff-4396f8b8d823-0006/executors/dns-test/runs/52bdce71-04b0-4440-bb71-cb826f0635c6'
>  to user 'root'
> I0908 17:39:24.603970 181564 slave.cpp:5748] Launching executor dns-test of 
> framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006 with resources 
> cpus(*):0.1; mem(*):32 in work directory 
> '/mnt/01/mesos_work/slaves/67025326-9dfd-4cbb-a008-454a40bce2f5-S2/frameworks/51831498-0902-4ae9-a1ff-4396f8b8d823-0006/executors/dns-test/runs/52bdce71-04b0-4440-bb71-cb826f0635c6'
> I0908 17:39:24.604178 181564 slave.cpp:1914] Queuing task 'dns-test' for 
> executor 'dns-test' of framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006
> I0908 17:39:24.604284 181571 docker.cpp:1020] Skipping non-docker container
> I0908 17:39:24.604532 181578 containerizer.cpp:781] Starting container 
> '52bdce71-04b0-4440-bb71-cb826f0635c6' for executor 'dns-test' of framework 
> '51831498-0902-4ae9-a1ff-4396f8b8d823-0006'
> I0908 17:39:24.606972 181571 provisioner.cpp:294] Provisioning image rootfs 
> '/mnt/01/mesos_work/provisioner/containers/52bdce71-04b0-4440-bb71-cb826f0635c6/backends/copy/rootfses/db97ba50-c9f0-45e7-8a39-871e4038abf9'
>  for container 52bdce71-04b0-4440-bb71-cb826f0635c6
> I0908 17:39:30.037472 181564 cpushare.cpp:389] Updated 'cpu.shares' to 102 
> (cpus 0.1) for container 52bdce71-04b0-4440-bb71-cb826f0635c6
> I0908 17:39:30.038415 181560 linux_launcher.cpp:281] Cloning child process 
> with flags = CLONE_NEWNS | CLONE_NEWPID
> I0908 17:39:30.040742 181560 systemd.cpp:96] Assigned child process '190563' 
> to 'mesos_executors.slice'
> I0908 17:39:30.161613 181576 slave.cpp:2902] Got registration for executor 
> 'dns-test' of framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006 from 
> executor(1)@10.191.4.65:43707
> I0908 17:39:30.162148 181563 disk.cpp:171] Updating the disk resources for 
> container 52bdce71-04b0-4440-bb71-cb826f0635c6 to cpus(*):0.1; mem(*):32; 
> gpus(*):2
> I0908 17:39:30.162648 181566 cpushare.cpp:389] Updated 'cpu.shares' to 102 
> (cpus 0.1) for container 52bdce71-04b0-4440-bb71-cb826f0635c6
> I0908 17:39:30.162822 181574 slave.cpp:2079] Sending queued task 'dns-test' 
> to executor 'dns-test' of framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006 
> at executor(1)@10.191.4.65:43707
> I0908 17:39:30.168383 181570 slave.cpp:3285] Handling status update 
> TASK_RUNNING (UUID: 319e0235-01b9-42ce-a2f8-ed9fc33de150) for task dns-test 
> of framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006 from 
> executor(1)@10.191.4.65:43707
> I0908 17:39:30.169019 181577 status_update_manager.

[jira] [Updated] (MESOS-6143) resolv.conf is not copied when using the Mesos containerizer with a Docker image

2016-09-08 Thread Avinash Sridharan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Avinash Sridharan updated MESOS-6143:
-
Fix Version/s: 1.1.0

> resolv.conf is not copied when using the Mesos containerizer with a Docker 
> image
> 
>
> Key: MESOS-6143
> URL: https://issues.apache.org/jira/browse/MESOS-6143
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization, isolation
>Affects Versions: 1.0.0
> Environment: OS: Debian Jessie
> Mesos version: 1.0.0
>Reporter: Justin Pinkul
>Assignee: Avinash Sridharan
> Fix For: 1.1.0
>
>
> When using the Mesos containerizer, host networking, and a Docker image, 
> {{resolv.conf}} is not copied from the host. The only piece of Mesos code 
> that copies this file is currently in the {{network/cni}} isolator, so I 
> tried turning this on by setting 
> {{isolation=network/cni,namespaces/pid,docker/runtime,cgroups/devices,gpu/nvidia,cgroups/cpu,disk/du,filesystem/linux}},
>  but the issue still remained. I suspect this might be related to not setting 
> {{network_cni_config_dir}} and {{network_cni_plugins_dir}} but it seems 
> incorrect that these flags would be required to use host networking.
> Here is how I am able to reproduce this issue:
> {code}
> mesos-execute --master=mesosmaster1:5050 \
>   --name=dns-test \
>   --docker_image=my-docker-image:1.1.3 \
>   --command="bash -c 'ping google.com; while ((1)); do date; 
> sleep 10; done'"
> # Find the PID of mesos-executor's child process and enter it
> nsenter -m -u -i -n -p -r -w -t $PID
> # This file will be empty
> cat /etc/resolv.conf
> {code}
> {code:title=Mesos agent log}
> I0908 17:39:24.599149 181564 slave.cpp:1688] Launching task dns-test for 
> framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006
> I0908 17:39:24.599567 181564 paths.cpp:528] Trying to chown 
> '/mnt/01/mesos_work/slaves/67025326-9dfd-4cbb-a008-454a40bce2f5-S2/frameworks/51831498-0902-4ae9-a1ff-4396f8b8d823-0006/executors/dns-test/runs/52bdce71-04b0-4440-bb71-cb826f0635c6'
>  to user 'root'
> I0908 17:39:24.603970 181564 slave.cpp:5748] Launching executor dns-test of 
> framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006 with resources 
> cpus(*):0.1; mem(*):32 in work directory 
> '/mnt/01/mesos_work/slaves/67025326-9dfd-4cbb-a008-454a40bce2f5-S2/frameworks/51831498-0902-4ae9-a1ff-4396f8b8d823-0006/executors/dns-test/runs/52bdce71-04b0-4440-bb71-cb826f0635c6'
> I0908 17:39:24.604178 181564 slave.cpp:1914] Queuing task 'dns-test' for 
> executor 'dns-test' of framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006
> I0908 17:39:24.604284 181571 docker.cpp:1020] Skipping non-docker container
> I0908 17:39:24.604532 181578 containerizer.cpp:781] Starting container 
> '52bdce71-04b0-4440-bb71-cb826f0635c6' for executor 'dns-test' of framework 
> '51831498-0902-4ae9-a1ff-4396f8b8d823-0006'
> I0908 17:39:24.606972 181571 provisioner.cpp:294] Provisioning image rootfs 
> '/mnt/01/mesos_work/provisioner/containers/52bdce71-04b0-4440-bb71-cb826f0635c6/backends/copy/rootfses/db97ba50-c9f0-45e7-8a39-871e4038abf9'
>  for container 52bdce71-04b0-4440-bb71-cb826f0635c6
> I0908 17:39:30.037472 181564 cpushare.cpp:389] Updated 'cpu.shares' to 102 
> (cpus 0.1) for container 52bdce71-04b0-4440-bb71-cb826f0635c6
> I0908 17:39:30.038415 181560 linux_launcher.cpp:281] Cloning child process 
> with flags = CLONE_NEWNS | CLONE_NEWPID
> I0908 17:39:30.040742 181560 systemd.cpp:96] Assigned child process '190563' 
> to 'mesos_executors.slice'
> I0908 17:39:30.161613 181576 slave.cpp:2902] Got registration for executor 
> 'dns-test' of framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006 from 
> executor(1)@10.191.4.65:43707
> I0908 17:39:30.162148 181563 disk.cpp:171] Updating the disk resources for 
> container 52bdce71-04b0-4440-bb71-cb826f0635c6 to cpus(*):0.1; mem(*):32; 
> gpus(*):2
> I0908 17:39:30.162648 181566 cpushare.cpp:389] Updated 'cpu.shares' to 102 
> (cpus 0.1) for container 52bdce71-04b0-4440-bb71-cb826f0635c6
> I0908 17:39:30.162822 181574 slave.cpp:2079] Sending queued task 'dns-test' 
> to executor 'dns-test' of framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006 
> at executor(1)@10.191.4.65:43707
> I0908 17:39:30.168383 181570 slave.cpp:3285] Handling status update 
> TASK_RUNNING (UUID: 319e0235-01b9-42ce-a2f8-ed9fc33de150) for task dns-test 
> of framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006 from 
> executor(1)@10.191.4.65:43707
> I0908 17:39:30.169019 181577 status_update_manager.cpp:320] Received status 
> update TASK_RUNNING (UUID: 319e0235-01b9-42ce-a2f8-ed9fc33de150) for task 
> dns-test of framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006
> I0908 17:39:30.169173 181576 slave.cpp:3678] Forwarding the update 
> TASK_RUNN

[jira] [Assigned] (MESOS-6143) resolv.conf is not copied when using the Mesos containerizer with a Docker image

2016-09-08 Thread Avinash Sridharan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Avinash Sridharan reassigned MESOS-6143:


Assignee: Avinash Sridharan

> resolv.conf is not copied when using the Mesos containerizer with a Docker 
> image
> 
>
> Key: MESOS-6143
> URL: https://issues.apache.org/jira/browse/MESOS-6143
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization, isolation
>Affects Versions: 1.0.0
> Environment: OS: Debian Jessie
> Mesos version: 1.0.0
>Reporter: Justin Pinkul
>Assignee: Avinash Sridharan
>
> When using the Mesos containerizer, host networking, and a Docker image, 
> {{resolv.conf}} is not copied from the host. The only piece of Mesos code 
> that copies this file is currently in the {{network/cni}} isolator, so I 
> tried turning this on by setting 
> {{isolation=network/cni,namespaces/pid,docker/runtime,cgroups/devices,gpu/nvidia,cgroups/cpu,disk/du,filesystem/linux}},
>  but the issue still remained. I suspect this might be related to not setting 
> {{network_cni_config_dir}} and {{network_cni_plugins_dir}} but it seems 
> incorrect that these flags would be required to use host networking.
> Here is how I am able to reproduce this issue:
> {code}
> mesos-execute --master=mesosmaster1:5050 \
>   --name=dns-test \
>   --docker_image=my-docker-image:1.1.3 \
>   --command="bash -c 'ping google.com; while ((1)); do date; 
> sleep 10; done'"
> # Find the PID of mesos-executor's child process and enter it
> nsenter -m -u -i -n -p -r -w -t $PID
> # This file will be empty
> cat /etc/resolv.conf
> {code}
> {code:title=Mesos agent log}
> I0908 17:39:24.599149 181564 slave.cpp:1688] Launching task dns-test for 
> framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006
> I0908 17:39:24.599567 181564 paths.cpp:528] Trying to chown 
> '/mnt/01/mesos_work/slaves/67025326-9dfd-4cbb-a008-454a40bce2f5-S2/frameworks/51831498-0902-4ae9-a1ff-4396f8b8d823-0006/executors/dns-test/runs/52bdce71-04b0-4440-bb71-cb826f0635c6'
>  to user 'root'
> I0908 17:39:24.603970 181564 slave.cpp:5748] Launching executor dns-test of 
> framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006 with resources 
> cpus(*):0.1; mem(*):32 in work directory 
> '/mnt/01/mesos_work/slaves/67025326-9dfd-4cbb-a008-454a40bce2f5-S2/frameworks/51831498-0902-4ae9-a1ff-4396f8b8d823-0006/executors/dns-test/runs/52bdce71-04b0-4440-bb71-cb826f0635c6'
> I0908 17:39:24.604178 181564 slave.cpp:1914] Queuing task 'dns-test' for 
> executor 'dns-test' of framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006
> I0908 17:39:24.604284 181571 docker.cpp:1020] Skipping non-docker container
> I0908 17:39:24.604532 181578 containerizer.cpp:781] Starting container 
> '52bdce71-04b0-4440-bb71-cb826f0635c6' for executor 'dns-test' of framework 
> '51831498-0902-4ae9-a1ff-4396f8b8d823-0006'
> I0908 17:39:24.606972 181571 provisioner.cpp:294] Provisioning image rootfs 
> '/mnt/01/mesos_work/provisioner/containers/52bdce71-04b0-4440-bb71-cb826f0635c6/backends/copy/rootfses/db97ba50-c9f0-45e7-8a39-871e4038abf9'
>  for container 52bdce71-04b0-4440-bb71-cb826f0635c6
> I0908 17:39:30.037472 181564 cpushare.cpp:389] Updated 'cpu.shares' to 102 
> (cpus 0.1) for container 52bdce71-04b0-4440-bb71-cb826f0635c6
> I0908 17:39:30.038415 181560 linux_launcher.cpp:281] Cloning child process 
> with flags = CLONE_NEWNS | CLONE_NEWPID
> I0908 17:39:30.040742 181560 systemd.cpp:96] Assigned child process '190563' 
> to 'mesos_executors.slice'
> I0908 17:39:30.161613 181576 slave.cpp:2902] Got registration for executor 
> 'dns-test' of framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006 from 
> executor(1)@10.191.4.65:43707
> I0908 17:39:30.162148 181563 disk.cpp:171] Updating the disk resources for 
> container 52bdce71-04b0-4440-bb71-cb826f0635c6 to cpus(*):0.1; mem(*):32; 
> gpus(*):2
> I0908 17:39:30.162648 181566 cpushare.cpp:389] Updated 'cpu.shares' to 102 
> (cpus 0.1) for container 52bdce71-04b0-4440-bb71-cb826f0635c6
> I0908 17:39:30.162822 181574 slave.cpp:2079] Sending queued task 'dns-test' 
> to executor 'dns-test' of framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006 
> at executor(1)@10.191.4.65:43707
> I0908 17:39:30.168383 181570 slave.cpp:3285] Handling status update 
> TASK_RUNNING (UUID: 319e0235-01b9-42ce-a2f8-ed9fc33de150) for task dns-test 
> of framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006 from 
> executor(1)@10.191.4.65:43707
> I0908 17:39:30.169019 181577 status_update_manager.cpp:320] Received status 
> update TASK_RUNNING (UUID: 319e0235-01b9-42ce-a2f8-ed9fc33de150) for task 
> dns-test of framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006
> I0908 17:39:30.169173 181576 slave.cpp:3678] Forwarding the update 
> TASK_RUNNING (UUID: 319e02

[jira] [Created] (MESOS-6143) resolv.conf is not copied when using the Mesos containerizer with a Docker image

2016-09-08 Thread Justin Pinkul (JIRA)
Justin Pinkul created MESOS-6143:


 Summary: resolv.conf is not copied when using the Mesos 
containerizer with a Docker image
 Key: MESOS-6143
 URL: https://issues.apache.org/jira/browse/MESOS-6143
 Project: Mesos
  Issue Type: Bug
  Components: containerization, isolation
Affects Versions: 1.0.0
 Environment: OS: Debian Jessie
Mesos version: 1.0.0
Reporter: Justin Pinkul


When using the Mesos containerizer, host networking, and a Docker image, 
{{resolv.conf}} is not copied from the host. The only piece of Mesos code that 
copies this file is currently in the {{network/cni}} isolator, so I tried 
turning this on by setting 
{{isolation=network/cni,namespaces/pid,docker/runtime,cgroups/devices,gpu/nvidia,cgroups/cpu,disk/du,filesystem/linux}},
 but the issue still remained. I suspect this might be related to not setting 
{{network_cni_config_dir}} and {{network_cni_plugins_dir}} but it seems 
incorrect that these flags would be required to use host networking.

Here is how I am able to reproduce this issue:
{code}
mesos-execute --master=mesosmaster1:5050 \
--name=dns-test \
--docker_image=my-docker-image:1.1.3 \
--command="bash -c 'ping google.com; while ((1)); do date; 
sleep 10; done'"

# Find the PID of mesos-executor's child process and enter it
nsenter -m -u -i -n -p -r -w -t $PID

# This file will be empty
cat /etc/resolv.conf
{code}

{code:title=Mesos agent log}
I0908 17:39:24.599149 181564 slave.cpp:1688] Launching task dns-test for 
framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006
I0908 17:39:24.599567 181564 paths.cpp:528] Trying to chown 
'/mnt/01/mesos_work/slaves/67025326-9dfd-4cbb-a008-454a40bce2f5-S2/frameworks/51831498-0902-4ae9-a1ff-4396f8b8d823-0006/executors/dns-test/runs/52bdce71-04b0-4440-bb71-cb826f0635c6'
 to user 'root'
I0908 17:39:24.603970 181564 slave.cpp:5748] Launching executor dns-test of 
framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006 with resources cpus(*):0.1; 
mem(*):32 in work directory 
'/mnt/01/mesos_work/slaves/67025326-9dfd-4cbb-a008-454a40bce2f5-S2/frameworks/51831498-0902-4ae9-a1ff-4396f8b8d823-0006/executors/dns-test/runs/52bdce71-04b0-4440-bb71-cb826f0635c6'
I0908 17:39:24.604178 181564 slave.cpp:1914] Queuing task 'dns-test' for 
executor 'dns-test' of framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006
I0908 17:39:24.604284 181571 docker.cpp:1020] Skipping non-docker container
I0908 17:39:24.604532 181578 containerizer.cpp:781] Starting container 
'52bdce71-04b0-4440-bb71-cb826f0635c6' for executor 'dns-test' of framework 
'51831498-0902-4ae9-a1ff-4396f8b8d823-0006'
I0908 17:39:24.606972 181571 provisioner.cpp:294] Provisioning image rootfs 
'/mnt/01/mesos_work/provisioner/containers/52bdce71-04b0-4440-bb71-cb826f0635c6/backends/copy/rootfses/db97ba50-c9f0-45e7-8a39-871e4038abf9'
 for container 52bdce71-04b0-4440-bb71-cb826f0635c6
I0908 17:39:30.037472 181564 cpushare.cpp:389] Updated 'cpu.shares' to 102 
(cpus 0.1) for container 52bdce71-04b0-4440-bb71-cb826f0635c6
I0908 17:39:30.038415 181560 linux_launcher.cpp:281] Cloning child process with 
flags = CLONE_NEWNS | CLONE_NEWPID
I0908 17:39:30.040742 181560 systemd.cpp:96] Assigned child process '190563' to 
'mesos_executors.slice'
I0908 17:39:30.161613 181576 slave.cpp:2902] Got registration for executor 
'dns-test' of framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006 from 
executor(1)@10.191.4.65:43707
I0908 17:39:30.162148 181563 disk.cpp:171] Updating the disk resources for 
container 52bdce71-04b0-4440-bb71-cb826f0635c6 to cpus(*):0.1; mem(*):32; 
gpus(*):2
I0908 17:39:30.162648 181566 cpushare.cpp:389] Updated 'cpu.shares' to 102 
(cpus 0.1) for container 52bdce71-04b0-4440-bb71-cb826f0635c6
I0908 17:39:30.162822 181574 slave.cpp:2079] Sending queued task 'dns-test' to 
executor 'dns-test' of framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006 at 
executor(1)@10.191.4.65:43707
I0908 17:39:30.168383 181570 slave.cpp:3285] Handling status update 
TASK_RUNNING (UUID: 319e0235-01b9-42ce-a2f8-ed9fc33de150) for task dns-test of 
framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006 from 
executor(1)@10.191.4.65:43707
I0908 17:39:30.169019 181577 status_update_manager.cpp:320] Received status 
update TASK_RUNNING (UUID: 319e0235-01b9-42ce-a2f8-ed9fc33de150) for task 
dns-test of framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006
I0908 17:39:30.169173 181576 slave.cpp:3678] Forwarding the update TASK_RUNNING 
(UUID: 319e0235-01b9-42ce-a2f8-ed9fc33de150) for task dns-test of framework 
51831498-0902-4ae9-a1ff-4396f8b8d823-0006 to master@10.191.248.194:5050
I0908 17:39:30.169242 181576 slave.cpp:3588] Sending acknowledgement for status 
update TASK_RUNNING (UUID: 319e0235-01b9-42ce-a2f8-ed9fc33de150) for task 
dns-test of framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006 to 
executor(1)@10.191.4.65:43707
I0908 17:39:30.17131

[jira] [Commented] (MESOS-6067) Support provisioner to be nested aware for Mesos Pods.

2016-09-08 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15475370#comment-15475370
 ] 

Jie Yu commented on MESOS-6067:
---

commit 1ba5153ee4d0a0075b51debab72903bba561c6b7
Author: Gilbert Song 
Date:   Thu Sep 8 16:05:43 2016 -0700

Supported provisioner listContainers() to be recursive.

This patch supports collecting all containerIds (all containers in the
nested hierarchy) recursively.

Review: https://reviews.apache.org/r/51392/

> Support provisioner to be nested aware for Mesos Pods.
> --
>
> Key: MESOS-6067
> URL: https://issues.apache.org/jira/browse/MESOS-6067
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
>Reporter: Gilbert Song
>Assignee: Gilbert Song
>  Labels: containerizer, provisioner
>
> The provisioner has to be nested aware for sub-container provisioning, as 
> well as for recovery and nested container destroy. It would be better to 
> support a multi-level hierarchy. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-6136) Duplicate framework id handling

2016-09-08 Thread Christopher Hunt (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15475329#comment-15475329
 ] 

Christopher Hunt commented on MESOS-6136:
-

Thanks - sounds reasonable.

I still think that failover_timeout has a place though. I like what Mesos does 
in terms of preventing a framework from rejoining given the inconsistent state 
of its tasks. I also want the operator to explicitly clear this condition, 
either by killing all tasks so that Mesos can subsequently accept the same 
framework id, or by leaving the tasks and allowing the framework to rejoin and 
clean up its own tasks. Perhaps /teardown for the former, and a new endpoint, 
/accept, for the latter.
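
For reference, the existing /teardown endpoint already covers the first case; 
it takes the framework id as a POST parameter (host, port, and id below are 
placeholders). The proposed /accept endpoint does not exist today:

{code}
# Existing operator endpoint: tears the framework down, kills its tasks,
# and frees the framework id for reuse.
curl -X POST http://<master-host>:5050/teardown \
  -d 'frameworkId=<framework-id>'
{code}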

> Duplicate framework id handling
> ---
>
> Key: MESOS-6136
> URL: https://issues.apache.org/jira/browse/MESOS-6136
> Project: Mesos
>  Issue Type: Improvement
>  Components: general
>Affects Versions: 0.28.1
> Environment: DCOS 1.7 Cloud Formation scripts
>Reporter: Christopher Hunt
>Priority: Critical
>  Labels: framework, lifecyclemanagement, task
>
> We have observed a situation where Mesos will kill tasks belonging to a 
> framework when that framework times out with the Mesos master for some 
> reason, perhaps even because of a network partition.
> While we can set a long timeout so that, for practical purposes, Mesos will 
> not kill a framework's tasks, I'm wondering if there's an improvement where a 
> framework still wouldn't be permitted to re-register with a given id (as 
> now), but Mesos also wouldn't kill its tasks. What I'm thinking is that Mesos 
> could be "told" by an operator that this condition should be cleared.
> IMHO frameworks should be the only entity requesting that tasks be killed 
> unless manually overridden by an operator.
> I'm flagging this as a critical improvement because a) the focus should be on 
> keeping tasks running in a system, and it isn't; and b) Mesos is working as 
> designed. 
> In summary, I feel that Mesos is taking on a responsibility in killing tasks 
> that it shouldn't have.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-6101) Add event for Framework added to master operator API

2016-09-08 Thread Zhitao Li (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15475101#comment-15475101
 ] 

Zhitao Li commented on MESOS-6101:
--

[~vinodkone] [~anandmazumdar], for completeness, I think we have two events:

1. FRAMEWORK_ADDED, which maps to the case when a FRAMEWORK_INFO is first known 
to the master;
2. FRAMEWORK_REMOVED, which maps to the case when a FRAMEWORK_INFO is removed 
from the master.

I'm open on whether we need events for framework 
failover/disconnect/reconnect, but I guess it's probably fine to leave that for 
the next iteration?

https://reviews.apache.org/r/51700/ has a patch for FRAMEWORK_ADDED; I'll start 
one for FRAMEWORK_REMOVED.
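
For context, subscribers would receive these events on the master's v1 
operator event stream; a minimal subscription (host and port are placeholders; 
the SUBSCRIBE call is the one the issue description refers to) looks like:

{code}
# Subscribe to the master's v1 operator event stream; FRAMEWORK_ADDED /
# FRAMEWORK_REMOVED would arrive on this stream once the patches land.
curl -X POST http://<master-host>:5050/api/v1 \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  -d '{"type": "SUBSCRIBE"}'
{code}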

> Add event for Framework added to master operator API
> ---
>
> Key: MESOS-6101
> URL: https://issues.apache.org/jira/browse/MESOS-6101
> Project: Mesos
>  Issue Type: Task
>Reporter: Zhitao Li
>Assignee: Zhitao Li
>
> Consider the following case:
> 1) a subscriber connects to master;
> 2) a new scheduler registers as a new framework;
> 3) a task is launched from this framework.
> In this sequence, the subscriber has no way to know the FrameworkInfo 
> belonging to the FrameworkId.
> We should support an event (e.g., for when the framework info in the master 
> is added/changed).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (MESOS-6101) Add event for Framework added to master operator API

2016-09-08 Thread Zhitao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhitao Li reassigned MESOS-6101:


Assignee: Zhitao Li

> Add event for Framework added to master operator API
> ---
>
> Key: MESOS-6101
> URL: https://issues.apache.org/jira/browse/MESOS-6101
> Project: Mesos
>  Issue Type: Task
>Reporter: Zhitao Li
>Assignee: Zhitao Li
>
> Consider the following case:
> 1) a subscriber connects to master;
> 2) a new scheduler registers as a new framework;
> 3) a task is launched from this framework.
> In this sequence, the subscriber has no way to know the FrameworkInfo 
> belonging to the FrameworkId.
> We should support an event (e.g., for when the framework info in the master 
> is added/changed).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-6142) Frameworks may RESERVE for an arbitrary role.

2016-09-08 Thread Alexander Rukletsov (JIRA)
Alexander Rukletsov created MESOS-6142:
--

 Summary: Frameworks may RESERVE for an arbitrary role.
 Key: MESOS-6142
 URL: https://issues.apache.org/jira/browse/MESOS-6142
 Project: Mesos
  Issue Type: Bug
  Components: allocation, master
Affects Versions: 1.0.0
Reporter: Alexander Rukletsov
Priority: Blocker
 Fix For: 1.1.0


The master does not validate that resources from a reservation request have the 
same role the framework is registered with. As a result, frameworks may reserve 
resources for arbitrary roles.

I've modified the role in [the {{ReserveThenUnreserve}} 
test|https://github.com/apache/mesos/blob/bca600cf5602ed8227d91af9f73d689da14ad786/src/tests/reservation_tests.cpp#L117]
 to "yoyo" and observed the following in the test's log:
{noformat}
I0908 18:35:43.379122 2138112 master.cpp:3362] Processing ACCEPT call for 
offers: [ dfaf67e6-7c1c-4988-b427-c49842cb7bb7-O0 ] on agent 
dfaf67e6-7c1c-4988-b427-c49842cb7bb7-S0 at slave(1)@10.200.181.237:60116 
(alexr.railnet.train) for framework dfaf67e6-7c1c-4988-b427-c49842cb7bb7- 
(default) at scheduler-ca12a660-9f08-49de-be4e-d452aa3aa6da@10.200.181.237:60116
I0908 18:35:43.379170 2138112 master.cpp:3022] Authorizing principal 
'test-principal' to reserve resources 'cpus(yoyo, test-principal):1; mem(yoyo, 
test-principal):512'
I0908 18:35:43.379678 2138112 master.cpp:3642] Applying RESERVE operation for 
resources cpus(yoyo, test-principal):1; mem(yoyo, test-principal):512 from 
framework dfaf67e6-7c1c-4988-b427-c49842cb7bb7- (default) at 
scheduler-ca12a660-9f08-49de-be4e-d452aa3aa6da@10.200.181.237:60116 to agent 
dfaf67e6-7c1c-4988-b427-c49842cb7bb7-S0 at slave(1)@10.200.181.237:60116 
(alexr.railnet.train)
I0908 18:35:43.379767 2138112 master.cpp:7341] Sending checkpointed resources 
cpus(yoyo, test-principal):1; mem(yoyo, test-principal):512 to agent 
dfaf67e6-7c1c-4988-b427-c49842cb7bb7-S0 at slave(1)@10.200.181.237:60116 
(alexr.railnet.train)
I0908 18:35:43.380273 3211264 slave.cpp:2497] Updated checkpointed resources 
from  to cpus(yoyo, test-principal):1; mem(yoyo, test-principal):512
I0908 18:35:43.380574 2674688 hierarchical.cpp:760] Updated allocation of 
framework dfaf67e6-7c1c-4988-b427-c49842cb7bb7- on agent 
dfaf67e6-7c1c-4988-b427-c49842cb7bb7-S0 from cpus(*):1; mem(*):512; 
disk(*):470841; ports(*):[31000-32000] to ports(*):[31000-32000]; cpus(yoyo, 
test-principal):1; disk(*):470841; mem(yoyo, test-principal):512 with RESERVE 
operation
{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-6128) Make "re-register" vs. "reregister" consistent in the master

2016-09-08 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15474149#comment-15474149
 ] 

Greg Mann commented on MESOS-6128:
--

Also one less character to type! :) +1 for reregister
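
Once a form is picked, the sweep itself is mechanical; a sketch, assuming we 
settle on "reregister" and review the resulting diff by hand:

{code}
# Find remaining occurrences of the hyphenated form...
git grep -n 're-regist' -- src docs
# ...and rewrite them in place (covers re-register, re-registration, etc.).
git grep -l 're-regist' -- src docs | xargs sed -i 's/re-regist/reregist/g'
{code}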

> Make "re-register" vs. "reregister" consistent in the master
> 
>
> Key: MESOS-6128
> URL: https://issues.apache.org/jira/browse/MESOS-6128
> Project: Mesos
>  Issue Type: Improvement
>  Components: master
>Reporter: Neil Conway
>  Labels: mesosphere, newbie
>
> Per discussion in https://reviews.apache.org/r/50705/, we sometimes use 
> "re-register" in comments and elsewhere we use "reregister". We should pick 
> one form and use it consistently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-6136) Duplicate framework id handling

2016-09-08 Thread Neil Conway (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15473653#comment-15473653
 ] 

Neil Conway commented on MESOS-6136:


We might want to distinguish between "framework has been explicitly torn down" 
(via the {{/teardown}} endpoint) and "framework has been disconnected for 
longer than {{failover_timeout}}". In the former case, the operator has 
explicitly removed the framework, so it seems quite reasonable for Mesos to 
kill the associated tasks (and we should arrange to do this even for tasks 
running on agents that are partitioned at the time of the {{/teardown}}). In 
the latter case, having Mesos kill tasks at any point is more debatable. 
Obviously the recommended practice for production frameworks is to set a high 
{{failover_timeout}}. We could perhaps change this behavior: e.g., deprecate 
{{failover_timeout}}, and say that tasks associated with disconnected 
frameworks continue running indefinitely until/unless killed by the operator. 
As part of this, we would probably want to provide better support for cleaning 
up the state associated with such a disconnected framework -- e.g., allowing 
{{/teardown}} to be used for this purpose.

> Duplicate framework id handling
> ---
>
> Key: MESOS-6136
> URL: https://issues.apache.org/jira/browse/MESOS-6136
> Project: Mesos
>  Issue Type: Improvement
>  Components: general
>Affects Versions: 0.28.1
> Environment: DCOS 1.7 Cloud Formation scripts
>Reporter: Christopher Hunt
>Priority: Critical
>  Labels: framework, lifecyclemanagement, task
>
> We have observed a situation where Mesos will kill tasks belonging to a 
> framework when that framework times out with the Mesos master for some 
> reason, perhaps even because of a network partition.
> While we can set a long timeout so that, for practical purposes, Mesos will 
> not kill a framework's tasks, I'm wondering if there's an improvement where a 
> framework still wouldn't be permitted to re-register with a given id (as 
> now), but Mesos also wouldn't kill its tasks. What I'm thinking is that Mesos 
> could be "told" by an operator that this condition should be cleared.
> IMHO frameworks should be the only entity requesting that tasks be killed 
> unless manually overridden by an operator.
> I'm flagging this as a critical improvement because a) the focus should be on 
> keeping tasks running in a system, and it isn't; and b) Mesos is working as 
> designed. 
> In summary, I feel that Mesos is taking on a responsibility in killing tasks 
> that it shouldn't have.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-6141) Some tests do not properly set 'flags.launcher' with the correct value

2016-09-08 Thread Kevin Klues (JIRA)
Kevin Klues created MESOS-6141:
--

 Summary: Some tests do not properly set 'flags.launcher' with the 
correct value
 Key: MESOS-6141
 URL: https://issues.apache.org/jira/browse/MESOS-6141
 Project: Mesos
  Issue Type: Bug
Reporter: Kevin Klues
Assignee: Kevin Klues
 Fix For: 1.1.0


In some of our tests we manually create a 'PosixLauncher' rather than relying 
on the value of 'flags.launcher' to decide which type of launcher to create. 
Since calls to 'CreateSlaveFlags()' set 'flags.launcher' to 'linux' by default, 
there is a discrepancy between what the flags say and what launcher type we 
actually create.

We should fix this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-6140) Add a parallel test runner

2016-09-08 Thread Benjamin Bannier (JIRA)
Benjamin Bannier created MESOS-6140:
---

 Summary: Add a parallel test runner
 Key: MESOS-6140
 URL: https://issues.apache.org/jira/browse/MESOS-6140
 Project: Mesos
  Issue Type: Improvement
  Components: tests
Reporter: Benjamin Bannier


In order to allow parallel test execution, we should add a parallel test 
runner to Mesos and subsequently activate it in the build setup.
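
Until such a runner lands, gtest's built-in sharding sketches the idea 
(assuming the standard mesos-tests binary in the build tree; the shard count 
is arbitrary):

{code}
# Run the test binary as N independent shards in parallel; gtest divides
# the test set across shards via these environment variables.
N=4
for i in $(seq 0 $((N - 1))); do
  GTEST_TOTAL_SHARDS=$N GTEST_SHARD_INDEX=$i ./src/mesos-tests &
done
wait
{code}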



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (MESOS-6140) Add a parallel test runner

2016-09-08 Thread Benjamin Bannier (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Bannier reassigned MESOS-6140:
---

Assignee: Benjamin Bannier

> Add a parallel test runner
> --
>
> Key: MESOS-6140
> URL: https://issues.apache.org/jira/browse/MESOS-6140
> Project: Mesos
>  Issue Type: Improvement
>  Components: tests
>Reporter: Benjamin Bannier
>Assignee: Benjamin Bannier
>
> In order to allow parallel test execution, we should add a parallel test 
> runner to Mesos and subsequently activate it in the build setup.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)