[jira] [Commented] (MESOS-5735) Update WebUI to use v1 operator API
[ https://issues.apache.org/jira/browse/MESOS-5735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15476137#comment-15476137 ] Jay Guo commented on MESOS-5735: JSONP won't work anyway since we moved from `GET` to `POST` in the HTTP API. `CORS` imposes security risks and may only be suitable for dev purposes. Hence, we are thinking of having a proxy in the master that relays requests/responses between the WebUI and the agents, so that resources always come from a single domain from the WebUI's point of view. > Update WebUI to use v1 operator API > --- > > Key: MESOS-5735 > URL: https://issues.apache.org/jira/browse/MESOS-5735 > Project: Mesos > Issue Type: Bug >Reporter: Vinod Kone >Assignee: Jay Guo > > Having the WebUI use the v1 API would be a good validation of its usefulness > and correctness. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
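The single-origin proxy idea above can be sketched with a small path-routing helper: the WebUI would talk only to the master, and agent-bound paths would be relayed server-side. This is a minimal illustration only; the `/agents/<id>/<endpoint>` path scheme, the `AGENTS` registry, and the function name are all assumptions, not Mesos APIs.

```python
# Hypothetical sketch of the master-side proxy routing discussed above.
# Because the browser only ever contacts the master, neither JSONP nor
# CORS would be needed. The agent registry and path layout here are
# illustrative assumptions.

AGENTS = {
    "S0": "10.191.4.65:5051",  # agent_id -> host:port (hypothetical)
}

def route_agent_request(path):
    """Map a WebUI-facing proxy path to the backing agent URL."""
    parts = path.lstrip("/").split("/", 2)
    if len(parts) != 3 or parts[0] != "agents":
        raise ValueError("not an agent proxy path: " + path)
    _, agent_id, endpoint = parts
    if agent_id not in AGENTS:
        raise KeyError("unknown agent: " + agent_id)
    return "http://%s/%s" % (AGENTS[agent_id], endpoint)
```

In a real implementation the master would then issue the `POST` to the resolved agent URL and stream the response back to the WebUI.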
[jira] [Comment Edited] (MESOS-5735) Update WebUI to use v1 operator API
[ https://issues.apache.org/jira/browse/MESOS-5735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15475960#comment-15475960 ] Yan Xu edited comment on MESOS-5735 at 9/9/16 5:22 AM: --- Could you elaborate on "in master to proxy the requests to every agent"? Just a reminder that ultimately we'd like to replace JSONP (with CORS?). This is tracked by MESOS-5918 which resulted from the discussion in MESOS-5911 (in which [~anandmazumdar] pointed out its implication on the webUI work as well). Do we want to take this on before we address MESOS-5918? Are we able to address this without incurring too much tech debt? was (Author: xujyan): Could you elaborate on "in master to proxy the requests to every agent"? Just a reminder that ultimately we'd like to replace JSONP (with CORS?). This is tracked by MESOS-5911 which resulted from the discussion in MESOS-5918 (in which [~anandmazumdar] pointed out its implication on the webUI work as well). Do we want to take this on before we address MESOS-5911? Are we able to address this without incurring too much tech debt? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-5735) Update WebUI to use v1 operator API
[ https://issues.apache.org/jira/browse/MESOS-5735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15475960#comment-15475960 ] Yan Xu commented on MESOS-5735: --- Could you elaborate on "in master to proxy the requests to every agent"? Just a reminder that ultimately we'd like to replace JSONP (with CORS?). This is tracked by MESOS-5911 which resulted from the discussion in MESOS-5918 (in which [~anandmazumdar] pointed out its implication on the webUI work as well). Do we want to take this on before we address MESOS-5911? Are we able to address this without incurring too much tech debt? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-6143) resolv.conf is not copied when using the Mesos containerizer with a Docker image
[ https://issues.apache.org/jira/browse/MESOS-6143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15475434#comment-15475434 ] Jie Yu commented on MESOS-6143: --- Can you try the alpine image to see if that works? We have a unit test which tests this: https://github.com/apache/mesos/blob/master/src/tests/containerizer/cni_isolator_tests.cpp#L521 We've been running this unit test on every commit on Debian 8. > resolv.conf is not copied when using the Mesos containerizer with a Docker > image > > > Key: MESOS-6143 > URL: https://issues.apache.org/jira/browse/MESOS-6143 > Project: Mesos > Issue Type: Bug > Components: containerization, isolation >Affects Versions: 1.0.0 > Environment: OS: Debian Jessie > Mesos version: 1.0.0 >Reporter: Justin Pinkul >Assignee: Avinash Sridharan > Fix For: 1.1.0 > > > When using the Mesos containerizer with host networking and a Docker image, > {{resolv.conf}} is not copied from the host. The only piece of Mesos code > that copies these files is currently in the {{network/cni}} isolator so I > tried turning it on, by setting > {{isolation=network/cni,namespaces/pid,docker/runtime,cgroups/devices,gpu/nvidia,cgroups/cpu,disk/du,filesystem/linux}}, > but the issue still remained. I suspect this might be related to not setting > {{network_cni_config_dir}} and {{network_cni_plugins_dir}} but it seems > incorrect that these flags would be required to use host networking. 
> Here is how I am able to reproduce this issue: > {code} > mesos-execute --master=mesosmaster1:5050 \ > --name=dns-test \ > --docker_image=my-docker-image:1.1.3 \ > --command="bash -c 'ping google.com; while ((1)); do date; > sleep 10; done'" > # Find the PID of mesos-executor's child process and enter it > nsenter -m -u -i -n -p -r -w -t $PID > # This file will be empty > cat /etc/resolv.conf > {code} > {code:title=Mesos agent log} > I0908 17:39:24.599149 181564 slave.cpp:1688] Launching task dns-test for > framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006 > I0908 17:39:24.599567 181564 paths.cpp:528] Trying to chown > '/mnt/01/mesos_work/slaves/67025326-9dfd-4cbb-a008-454a40bce2f5-S2/frameworks/51831498-0902-4ae9-a1ff-4396f8b8d823-0006/executors/dns-test/runs/52bdce71-04b0-4440-bb71-cb826f0635c6' > to user 'root' > I0908 17:39:24.603970 181564 slave.cpp:5748] Launching executor dns-test of > framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006 with resources > cpus(*):0.1; mem(*):32 in work directory > '/mnt/01/mesos_work/slaves/67025326-9dfd-4cbb-a008-454a40bce2f5-S2/frameworks/51831498-0902-4ae9-a1ff-4396f8b8d823-0006/executors/dns-test/runs/52bdce71-04b0-4440-bb71-cb826f0635c6' > I0908 17:39:24.604178 181564 slave.cpp:1914] Queuing task 'dns-test' for > executor 'dns-test' of framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006 > I0908 17:39:24.604284 181571 docker.cpp:1020] Skipping non-docker container > I0908 17:39:24.604532 181578 containerizer.cpp:781] Starting container > '52bdce71-04b0-4440-bb71-cb826f0635c6' for executor 'dns-test' of framework > '51831498-0902-4ae9-a1ff-4396f8b8d823-0006' > I0908 17:39:24.606972 181571 provisioner.cpp:294] Provisioning image rootfs > '/mnt/01/mesos_work/provisioner/containers/52bdce71-04b0-4440-bb71-cb826f0635c6/backends/copy/rootfses/db97ba50-c9f0-45e7-8a39-871e4038abf9' > for container 52bdce71-04b0-4440-bb71-cb826f0635c6 > I0908 17:39:30.037472 181564 cpushare.cpp:389] Updated 'cpu.shares' to 102 > (cpus 0.1) for 
container 52bdce71-04b0-4440-bb71-cb826f0635c6 > I0908 17:39:30.038415 181560 linux_launcher.cpp:281] Cloning child process > with flags = CLONE_NEWNS | CLONE_NEWPID > I0908 17:39:30.040742 181560 systemd.cpp:96] Assigned child process '190563' > to 'mesos_executors.slice' > I0908 17:39:30.161613 181576 slave.cpp:2902] Got registration for executor > 'dns-test' of framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006 from > executor(1)@10.191.4.65:43707 > I0908 17:39:30.162148 181563 disk.cpp:171] Updating the disk resources for > container 52bdce71-04b0-4440-bb71-cb826f0635c6 to cpus(*):0.1; mem(*):32; > gpus(*):2 > I0908 17:39:30.162648 181566 cpushare.cpp:389] Updated 'cpu.shares' to 102 > (cpus 0.1) for container 52bdce71-04b0-4440-bb71-cb826f0635c6 > I0908 17:39:30.162822 181574 slave.cpp:2079] Sending queued task 'dns-test' > to executor 'dns-test' of framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006 > at executor(1)@10.191.4.65:43707 > I0908 17:39:30.168383 181570 slave.cpp:3285] Handling status update > TASK_RUNNING (UUID: 319e0235-01b9-42ce-a2f8-ed9fc33de150) for task dns-test > of framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006 from > executor(1)@10.191.4.65:43707 > I0908 17:39:30.169019 181577 status_update_manager.
[jira] [Updated] (MESOS-6143) resolv.conf is not copied when using the Mesos containerizer with a Docker image
[ https://issues.apache.org/jira/browse/MESOS-6143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Avinash Sridharan updated MESOS-6143: - Fix Version/s: 1.1.0
[jira] [Assigned] (MESOS-6143) resolv.conf is not copied when using the Mesos containerizer with a Docker image
[ https://issues.apache.org/jira/browse/MESOS-6143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Avinash Sridharan reassigned MESOS-6143: Assignee: Avinash Sridharan
[jira] [Created] (MESOS-6143) resolv.conf is not copied when using the Mesos containerizer with a Docker image
Justin Pinkul created MESOS-6143: Summary: resolv.conf is not copied when using the Mesos containerizer with a Docker image Key: MESOS-6143 URL: https://issues.apache.org/jira/browse/MESOS-6143 Project: Mesos Issue Type: Bug Components: containerization, isolation Affects Versions: 1.0.0 Environment: OS: Debian Jessie Mesos version: 1.0.0 Reporter: Justin Pinkul When using the Mesos containerizer with host networking and a Docker image, {{resolv.conf}} is not copied from the host. The only piece of Mesos code that copies these files is currently in the {{network/cni}} isolator, so I tried turning it on by setting {{isolation=network/cni,namespaces/pid,docker/runtime,cgroups/devices,gpu/nvidia,cgroups/cpu,disk/du,filesystem/linux}}, but the issue still remained. I suspect this might be related to not setting {{network_cni_config_dir}} and {{network_cni_plugins_dir}}, but it seems incorrect that these flags would be required to use host networking. Here is how I am able to reproduce this issue: {code} mesos-execute --master=mesosmaster1:5050 \ --name=dns-test \ --docker_image=my-docker-image:1.1.3 \ --command="bash -c 'ping google.com; while ((1)); do date; sleep 10; done'" # Find the PID of mesos-executor's child process and enter it nsenter -m -u -i -n -p -r -w -t $PID # This file will be empty cat /etc/resolv.conf {code} {code:title=Mesos agent log} I0908 17:39:24.599149 181564 slave.cpp:1688] Launching task dns-test for framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006 I0908 17:39:24.599567 181564 paths.cpp:528] Trying to chown '/mnt/01/mesos_work/slaves/67025326-9dfd-4cbb-a008-454a40bce2f5-S2/frameworks/51831498-0902-4ae9-a1ff-4396f8b8d823-0006/executors/dns-test/runs/52bdce71-04b0-4440-bb71-cb826f0635c6' to user 'root' I0908 17:39:24.603970 181564 slave.cpp:5748] Launching executor dns-test of framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006 with resources cpus(*):0.1; mem(*):32 in work directory 
'/mnt/01/mesos_work/slaves/67025326-9dfd-4cbb-a008-454a40bce2f5-S2/frameworks/51831498-0902-4ae9-a1ff-4396f8b8d823-0006/executors/dns-test/runs/52bdce71-04b0-4440-bb71-cb826f0635c6' I0908 17:39:24.604178 181564 slave.cpp:1914] Queuing task 'dns-test' for executor 'dns-test' of framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006 I0908 17:39:24.604284 181571 docker.cpp:1020] Skipping non-docker container I0908 17:39:24.604532 181578 containerizer.cpp:781] Starting container '52bdce71-04b0-4440-bb71-cb826f0635c6' for executor 'dns-test' of framework '51831498-0902-4ae9-a1ff-4396f8b8d823-0006' I0908 17:39:24.606972 181571 provisioner.cpp:294] Provisioning image rootfs '/mnt/01/mesos_work/provisioner/containers/52bdce71-04b0-4440-bb71-cb826f0635c6/backends/copy/rootfses/db97ba50-c9f0-45e7-8a39-871e4038abf9' for container 52bdce71-04b0-4440-bb71-cb826f0635c6 I0908 17:39:30.037472 181564 cpushare.cpp:389] Updated 'cpu.shares' to 102 (cpus 0.1) for container 52bdce71-04b0-4440-bb71-cb826f0635c6 I0908 17:39:30.038415 181560 linux_launcher.cpp:281] Cloning child process with flags = CLONE_NEWNS | CLONE_NEWPID I0908 17:39:30.040742 181560 systemd.cpp:96] Assigned child process '190563' to 'mesos_executors.slice' I0908 17:39:30.161613 181576 slave.cpp:2902] Got registration for executor 'dns-test' of framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006 from executor(1)@10.191.4.65:43707 I0908 17:39:30.162148 181563 disk.cpp:171] Updating the disk resources for container 52bdce71-04b0-4440-bb71-cb826f0635c6 to cpus(*):0.1; mem(*):32; gpus(*):2 I0908 17:39:30.162648 181566 cpushare.cpp:389] Updated 'cpu.shares' to 102 (cpus 0.1) for container 52bdce71-04b0-4440-bb71-cb826f0635c6 I0908 17:39:30.162822 181574 slave.cpp:2079] Sending queued task 'dns-test' to executor 'dns-test' of framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006 at executor(1)@10.191.4.65:43707 I0908 17:39:30.168383 181570 slave.cpp:3285] Handling status update TASK_RUNNING (UUID: 
319e0235-01b9-42ce-a2f8-ed9fc33de150) for task dns-test of framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006 from executor(1)@10.191.4.65:43707 I0908 17:39:30.169019 181577 status_update_manager.cpp:320] Received status update TASK_RUNNING (UUID: 319e0235-01b9-42ce-a2f8-ed9fc33de150) for task dns-test of framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006 I0908 17:39:30.169173 181576 slave.cpp:3678] Forwarding the update TASK_RUNNING (UUID: 319e0235-01b9-42ce-a2f8-ed9fc33de150) for task dns-test of framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006 to master@10.191.248.194:5050 I0908 17:39:30.169242 181576 slave.cpp:3588] Sending acknowledgement for status update TASK_RUNNING (UUID: 319e0235-01b9-42ce-a2f8-ed9fc33de150) for task dns-test of framework 51831498-0902-4ae9-a1ff-4396f8b8d823-0006 to executor(1)@10.191.4.65:43707 I0908 17:39:30.17131
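The behavior the reporter expected amounts to copying the host's {{/etc/resolv.conf}} into the provisioned container rootfs (which, per the report, only the {{network/cni}} isolator currently does). A minimal sketch of that copy step, with the rootfs path layout and function name as illustrative assumptions rather than actual Mesos code:

```python
import os
import shutil

def copy_resolv_conf(rootfs, host_resolv="/etc/resolv.conf"):
    """Copy the host's resolv.conf into a container rootfs, mirroring
    the behavior the network/cni isolator provides for containers
    launched from an image. 'rootfs' is the provisioned image root
    (hypothetical path layout for illustration)."""
    target_dir = os.path.join(rootfs, "etc")
    os.makedirs(target_dir, exist_ok=True)
    target = os.path.join(target_dir, "resolv.conf")
    shutil.copy2(host_resolv, target)  # copy2 preserves mtime/permissions
    return target
```

Without such a copy (or a bind mount of the host file), DNS resolution inside the container rootfs fails exactly as shown in the reproduction above.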
[jira] [Commented] (MESOS-6067) Support provisioner to be nested aware for Mesos Pods.
[ https://issues.apache.org/jira/browse/MESOS-6067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15475370#comment-15475370 ] Jie Yu commented on MESOS-6067: --- commit 1ba5153ee4d0a0075b51debab72903bba561c6b7 Author: Gilbert Song Date: Thu Sep 8 16:05:43 2016 -0700 Supported provisioner listContainers() to be recursive. This patch supports collecting all containerIds (all containers in the nested hierarchy) recursively. Review: https://reviews.apache.org/r/51392/ > Support provisioner to be nested aware for Mesos Pods. > -- > > Key: MESOS-6067 > URL: https://issues.apache.org/jira/browse/MESOS-6067 > Project: Mesos > Issue Type: Task > Components: containerization >Reporter: Gilbert Song >Assignee: Gilbert Song > Labels: containerizer, provisioner > > The provisioner has to be nested aware for sub-container provisioning, as > well as recovery and nested container destroy. Better to support multi-level > hierarchy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-6136) Duplicate framework id handling
[ https://issues.apache.org/jira/browse/MESOS-6136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15475329#comment-15475329 ] Christopher Hunt commented on MESOS-6136: - Thanks - sounds reasonable. I still think that failover_timeout has a place though. I like what Mesos does in terms of preventing a framework from rejoining given the inconsistent state of tasks. I also want the operator to explicitly clear this condition, either by killing all tasks so that Mesos can subsequently accept the same framework id, or by leaving the tasks and allowing the framework to join and clean up its own tasks. Perhaps /teardown for the former, and a new one, /accept, for the latter. > Duplicate framework id handling > --- > > Key: MESOS-6136 > URL: https://issues.apache.org/jira/browse/MESOS-6136 > Project: Mesos > Issue Type: Improvement > Components: general >Affects Versions: 0.28.1 > Environment: DCOS 1.7 Cloud Formation scripts >Reporter: Christopher Hunt >Priority: Critical > Labels: framework, lifecyclemanagement, task > > We have observed a situation where Mesos will kill tasks belonging to a > framework where that framework times out with the Mesos master for some > reason, perhaps even because of a network partition. > While we can provide a long timeout so that Mesos will not kill a framework's > tasks for practical purposes, I'm wondering if there's an improvement where a > framework shouldn't be permitted to re-register for a given id (as now), but > Mesos doesn't also kill tasks? What I'm thinking is that Mesos could be > "told" by an operator that this condition should be cleared. > IMHO frameworks should be the only entity requesting that tasks be killed > unless manually overridden by an operator. > I'm flagging this as a critical improvement because a) the focus should be on > keeping tasks running in a system, and it isn't; and b) Mesos is working as > designed. 
> In summary I feel that Mesos is taking on a responsibility in killing tasks > where it shouldn't be. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-6101) Add event for Framework added to master operator API
[ https://issues.apache.org/jira/browse/MESOS-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15475101#comment-15475101 ] Zhitao Li commented on MESOS-6101: -- [~vinodkone] [~anandmazumdar], for completeness, I think we have two events: 1. FRAMEWORK_ADDED, which maps to the case when a FrameworkInfo is first known to the master; 2. FRAMEWORK_REMOVED, which maps to the case when a FrameworkInfo is removed from the master. I'm open on whether we need events for framework failover/disconnect/reconnect, but I guess it's probably fine to leave that for the next iteration? https://reviews.apache.org/r/51700/ has a patch for FRAMEWORK_ADDED; I'll start one for FRAMEWORK_REMOVED. > Add event for Framework added to master operator API > --- > > Key: MESOS-6101 > URL: https://issues.apache.org/jira/browse/MESOS-6101 > Project: Mesos > Issue Type: Task >Reporter: Zhitao Li >Assignee: Zhitao Li > > Consider the following case: > 1) a subscriber connects to master; > 2) a new scheduler registered as a new framework; > 3) a task is launched from this framework. > In this sequence, subscriber does not have a way to know the FrameworkInfo > belonging to the FrameworkId. > We should support an event (e.g. when framework info in master is > added/changed). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
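A subscriber consuming the proposed events could keep a local FrameworkId-to-FrameworkInfo map, which closes exactly the gap described in the ticket. The event names come from the comment above, but the payload shape below is an assumption for illustration, not the actual v1 protobuf layout:

```python
# Sketch of how an operator-API subscriber could apply the proposed
# FRAMEWORK_ADDED / FRAMEWORK_REMOVED events to a local map of
# FrameworkId -> FrameworkInfo. Payload field names are hypothetical.

def apply_event(frameworks, event):
    """Update the subscriber's framework map for one streamed event."""
    etype = event["type"]
    if etype == "FRAMEWORK_ADDED":
        info = event["framework_added"]["framework_info"]
        frameworks[info["id"]] = info
    elif etype == "FRAMEWORK_REMOVED":
        frameworks.pop(event["framework_removed"]["framework_id"], None)
    # Other event types (e.g. TASK_ADDED) are ignored by this sketch.
    return frameworks
```

With such events, a subscriber that sees a task launched for an unknown FrameworkId would no longer need a separate state query to resolve the FrameworkInfo.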
[jira] [Assigned] (MESOS-6101) Add event for Framework added to master operator API
[ https://issues.apache.org/jira/browse/MESOS-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhitao Li reassigned MESOS-6101: Assignee: Zhitao Li -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-6142) Frameworks may RESERVE for an arbitrary role.
Alexander Rukletsov created MESOS-6142: -- Summary: Frameworks may RESERVE for an arbitrary role. Key: MESOS-6142 URL: https://issues.apache.org/jira/browse/MESOS-6142 Project: Mesos Issue Type: Bug Components: allocation, master Affects Versions: 1.0.0 Reporter: Alexander Rukletsov Priority: Blocker Fix For: 1.1.0 The master does not validate that resources from a reservation request have the same role the framework is registered with. As a result, frameworks may reserve resources for arbitrary roles. I've modified the role in [the {{ReserveThenUnreserve}} test|https://github.com/apache/mesos/blob/bca600cf5602ed8227d91af9f73d689da14ad786/src/tests/reservation_tests.cpp#L117] to "yoyo" and observed the following in the test's log: {noformat} I0908 18:35:43.379122 2138112 master.cpp:3362] Processing ACCEPT call for offers: [ dfaf67e6-7c1c-4988-b427-c49842cb7bb7-O0 ] on agent dfaf67e6-7c1c-4988-b427-c49842cb7bb7-S0 at slave(1)@10.200.181.237:60116 (alexr.railnet.train) for framework dfaf67e6-7c1c-4988-b427-c49842cb7bb7- (default) at scheduler-ca12a660-9f08-49de-be4e-d452aa3aa6da@10.200.181.237:60116 I0908 18:35:43.379170 2138112 master.cpp:3022] Authorizing principal 'test-principal' to reserve resources 'cpus(yoyo, test-principal):1; mem(yoyo, test-principal):512' I0908 18:35:43.379678 2138112 master.cpp:3642] Applying RESERVE operation for resources cpus(yoyo, test-principal):1; mem(yoyo, test-principal):512 from framework dfaf67e6-7c1c-4988-b427-c49842cb7bb7- (default) at scheduler-ca12a660-9f08-49de-be4e-d452aa3aa6da@10.200.181.237:60116 to agent dfaf67e6-7c1c-4988-b427-c49842cb7bb7-S0 at slave(1)@10.200.181.237:60116 (alexr.railnet.train) I0908 18:35:43.379767 2138112 master.cpp:7341] Sending checkpointed resources cpus(yoyo, test-principal):1; mem(yoyo, test-principal):512 to agent dfaf67e6-7c1c-4988-b427-c49842cb7bb7-S0 at slave(1)@10.200.181.237:60116 (alexr.railnet.train) I0908 18:35:43.380273 3211264 slave.cpp:2497] Updated checkpointed resources from to 
cpus(yoyo, test-principal):1; mem(yoyo, test-principal):512 I0908 18:35:43.380574 2674688 hierarchical.cpp:760] Updated allocation of framework dfaf67e6-7c1c-4988-b427-c49842cb7bb7- on agent dfaf67e6-7c1c-4988-b427-c49842cb7bb7-S0 from cpus(*):1; mem(*):512; disk(*):470841; ports(*):[31000-32000] to ports(*):[31000-32000]; cpus(yoyo, test-principal):1; disk(*):470841; mem(yoyo, test-principal):512 with RESERVE operation {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
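The missing check described above amounts to validating that every resource in a RESERVE operation carries the role the framework registered with. A sketch of such a validation, using a simplified (name, role) pair in place of the real Resource protobuf; this is not the actual master code:

```python
def validate_reserve(framework_role, resources):
    """Reject a RESERVE operation whose resources name a role other
    than the one the framework registered with. 'resources' is a list
    of (name, role) pairs, a simplification of the Resource message.
    Returns an error string, or None if the operation is valid."""
    for name, role in resources:
        if role != framework_role:
            return ("framework with role '%s' cannot reserve '%s' "
                    "for role '%s'" % (framework_role, name, role))
    return None
```

Under this check, the "yoyo" reservation from the modified {{ReserveThenUnreserve}} test would be rejected during ACCEPT-call validation instead of being applied.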
[jira] [Commented] (MESOS-6128) Make "re-register" vs. "reregister" consistent in the master
[ https://issues.apache.org/jira/browse/MESOS-6128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15474149#comment-15474149 ] Greg Mann commented on MESOS-6128: -- Also one less character to type! :) +1 for reregister > Make "re-register" vs. "reregister" consistent in the master > > > Key: MESOS-6128 > URL: https://issues.apache.org/jira/browse/MESOS-6128 > Project: Mesos > Issue Type: Improvement > Components: master >Reporter: Neil Conway > Labels: mesosphere, newbie > > Per discussion in https://reviews.apache.org/r/50705/, we sometimes use > "re-register" in comments and elsewhere we use "reregister". We should pick > one form and use it consistently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-6136) Duplicate framework id handling
[ https://issues.apache.org/jira/browse/MESOS-6136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15473653#comment-15473653 ] Neil Conway commented on MESOS-6136: We might want to distinguish between "framework has been explicitly torn down" (via the {{/teardown}} endpoint) and "framework has been disconnected for longer than {{failover_timeout}}". In the former case, the operator has explicitly removed the framework, so it seems quite reasonable for Mesos to kill the associated tasks (and we should arrange to do this even for tasks running on agents that are partitioned at the time of the {{/teardown}}). In the latter case, having Mesos kill tasks at any point is more debatable. Obviously the recommended practice for production frameworks is to set a high {{failover_timeout}}. We could perhaps change this behavior: e.g., deprecate {{failover_timeout}}, and say that tasks associated with disconnected frameworks continue running indefinitely until/unless killed by the operator. As part of this, we would probably want to provide better support for cleaning up the state associated with such a disconnected framework -- e.g., allowing {{/teardown}} to be used for this purpose. > Duplicate framework id handling > --- > > Key: MESOS-6136 > URL: https://issues.apache.org/jira/browse/MESOS-6136 > Project: Mesos > Issue Type: Improvement > Components: general >Affects Versions: 0.28.1 > Environment: DCOS 1.7 Cloud Formation scripts >Reporter: Christopher Hunt >Priority: Critical > Labels: framework, lifecyclemanagement, task > > We have observed a situation where Mesos will kill tasks belonging to a > framework where that framework times out with the Mesos master for some > reason, perhaps even because of a network partition. 
> While we can provide a long timeout so that Mesos will not kill a framework's > tasks for practical purposes, I'm wondering if there's an improvement where a > framework still shouldn't be permitted to re-register for a given id (as now), but > Mesos also doesn't kill its tasks? What I'm thinking is that Mesos could be > "told" by an operator that this condition should be cleared. > IMHO frameworks should be the only entity requesting that tasks be killed, > unless manually overridden by an operator. > I'm flagging this as a critical improvement because a) the focus should be on > keeping tasks running in a system, and it isn't; and b) Mesos is working as > designed. > In summary I feel that Mesos is taking on a responsibility for killing tasks > where it shouldn't be.
[jira] [Created] (MESOS-6141) Some tests do not properly set 'flags.launcher' with the correct value
Kevin Klues created MESOS-6141: -- Summary: Some tests do not properly set 'flags.launcher' with the correct value Key: MESOS-6141 URL: https://issues.apache.org/jira/browse/MESOS-6141 Project: Mesos Issue Type: Bug Reporter: Kevin Klues Assignee: Kevin Klues Fix For: 1.1.0 In some of our tests we manually create a 'PosixLauncher' rather than relying on the value of 'flags.launcher' to decide which type of launcher to create. Since calls to 'CreateSlaveFlags()' set 'flags.launcher' to 'linux' by default, there is a discrepancy between what the flags say and what launcher type we are actually creating. We should fix this.
[jira] [Created] (MESOS-6140) Add a parallel test runner
Benjamin Bannier created MESOS-6140: --- Summary: Add a parallel test runner Key: MESOS-6140 URL: https://issues.apache.org/jira/browse/MESOS-6140 Project: Mesos Issue Type: Improvement Components: tests Reporter: Benjamin Bannier In order to allow parallelization of the test execution we should add a parallel test executor to Mesos, and subsequently activate it in the build setup.
[jira] [Assigned] (MESOS-6140) Add a parallel test runner
[ https://issues.apache.org/jira/browse/MESOS-6140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Bannier reassigned MESOS-6140: --- Assignee: Benjamin Bannier > Add a parallel test runner > -- > > Key: MESOS-6140 > URL: https://issues.apache.org/jira/browse/MESOS-6140 > Project: Mesos > Issue Type: Improvement > Components: tests >Reporter: Benjamin Bannier >Assignee: Benjamin Bannier > > In order to allow parallelization of the test execution we should add a > parallel test executor to Mesos, and subsequently activate it in the build > setup.
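A parallel test executor along these lines could lean on Google Test's built-in sharding protocol, where the binary itself partitions its test cases across shards. The sketch below is a minimal illustration only, assuming the test binary honors the standard GTEST_TOTAL_SHARDS/GTEST_SHARD_INDEX environment variables; the `run_parallel` helper name is hypothetical and not part of Mesos.

```python
# Minimal sketch of a parallel test runner: split a Google Test binary's
# test cases into N shards via gtest's sharding environment variables and
# run the shards concurrently.
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor

def run_shard(binary, index, total):
    # Each shard runs a disjoint subset of tests, selected by gtest itself
    # based on GTEST_SHARD_INDEX out of GTEST_TOTAL_SHARDS.
    env = dict(os.environ,
               GTEST_TOTAL_SHARDS=str(total),
               GTEST_SHARD_INDEX=str(index))
    return subprocess.call([binary], env=env)

def run_parallel(binary, jobs):
    # Launch all shards concurrently; the overall result is non-zero
    # if any shard reported a failure.
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        results = pool.map(lambda i: run_shard(binary, i, jobs), range(jobs))
        return max(results)
```

Since gtest distributes work at test-case granularity, the runner itself needs no bookkeeping about which tests exist; a real implementation would additionally want to capture each shard's output separately so interleaved logs stay readable.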