[
https://issues.apache.org/jira/browse/MESOS-5225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Qian Zhang updated MESOS-5225:
------------------------------
Description:
Steps to reproduce:
1. Start master
{code}
sudo ./bin/mesos-master.sh --work_dir=/tmp
{code}
2. Start agent
{code}
sudo ./bin/mesos-slave.sh --master=192.168.122.171:5050 --containerizers=mesos \
  --image_providers=docker \
  --isolation=filesystem/linux,docker/runtime,network/cni \
  --network_cni_config_dir=/opt/cni/net_configs \
  --network_cni_plugins_dir=/opt/cni/plugins
{code}
3. Launch a command task with mesos-execute so that it joins the CNI network {{net1}}.
{code}
sudo src/mesos-execute --master=192.168.122.171:5050 --name=test \
  --docker_image=library/busybox --networks=net1 --command="sleep 10" --shell=true
I0418 08:25:35.746758 24923 scheduler.cpp:177] Version: 0.29.0
Subscribed with ID '3c4796f0-eee7-4939-a036-7c6387c370eb-0000'
Submitted task 'test' to agent 'b74535d8-276f-4e09-ab47-53e3721ab271-S0'
Received status update TASK_FAILED for task 'test'
message: 'Executor terminated'
source: SOURCE_AGENT
reason: REASON_EXECUTOR_TERMINATED
{code}
The task failed with reason {{REASON_EXECUTOR_TERMINATED}} ("Executor terminated"). Here is the agent log:
{code}
I0418 08:25:35.804873 24911 slave.cpp:1514] Got assigned task test for framework 3c4796f0-eee7-4939-a036-7c6387c370eb-0000
I0418 08:25:35.807937 24911 slave.cpp:1633] Launching task test for framework 3c4796f0-eee7-4939-a036-7c6387c370eb-0000
I0418 08:25:35.812503 24911 paths.cpp:528] Trying to chown '/tmp/mesos/slaves/b74535d8-276f-4e09-ab47-53e3721ab271-S0/frameworks/3c4796f0-eee7-4939-a036-7c6387c370eb-0000/executors/test/runs/2b29d6d6-b314-477f-b734-7771d07d41e3' to user 'root'
I0418 08:25:35.820339 24911 slave.cpp:5620] Launching executor test of framework 3c4796f0-eee7-4939-a036-7c6387c370eb-0000 with resources cpus(*):0.1; mem(*):32 in work directory '/tmp/mesos/slaves/b74535d8-276f-4e09-ab47-53e3721ab271-S0/frameworks/3c4796f0-eee7-4939-a036-7c6387c370eb-0000/executors/test/runs/2b29d6d6-b314-477f-b734-7771d07d41e3'
I0418 08:25:35.822576 24914 containerizer.cpp:698] Starting container '2b29d6d6-b314-477f-b734-7771d07d41e3' for executor 'test' of framework '3c4796f0-eee7-4939-a036-7c6387c370eb-0000'
I0418 08:25:35.825996 24911 slave.cpp:1851] Queuing task 'test' for executor 'test' of framework 3c4796f0-eee7-4939-a036-7c6387c370eb-0000
I0418 08:25:35.832348 24911 provisioner.cpp:285] Provisioning image rootfs '/tmp/mesos/provisioner/containers/2b29d6d6-b314-477f-b734-7771d07d41e3/backends/copy/rootfses/d219ec3a-ea31-45f6-b578-a62cd02392e7' for container 2b29d6d6-b314-477f-b734-7771d07d41e3
I0418 08:25:36.061249 24913 linux_launcher.cpp:281] Cloning child process with flags = CLONE_NEWNET | CLONE_NEWUTS | CLONE_NEWNS
I0418 08:25:36.071208 24915 cni.cpp:643] Bind mounted '/proc/24950/ns/net' to '/run/mesos/isolators/network/cni/2b29d6d6-b314-477f-b734-7771d07d41e3/ns' for container 2b29d6d6-b314-477f-b734-7771d07d41e3
I0418 08:25:36.250573 24916 cni.cpp:962] Got assigned IPv4 address '192.168.1.2/24' from CNI network 'net1' for container 2b29d6d6-b314-477f-b734-7771d07d41e3
I0418 08:25:36.252002 24917 cni.cpp:765] Unable to find DNS nameservers for container 2b29d6d6-b314-477f-b734-7771d07d41e3. Using host '/etc/resolv.conf'
I0418 08:25:37.663487 24916 containerizer.cpp:1696] Executor for container '2b29d6d6-b314-477f-b734-7771d07d41e3' has exited
I0418 08:25:37.663745 24916 containerizer.cpp:1461] Destroying container '2b29d6d6-b314-477f-b734-7771d07d41e3'
I0418 08:25:37.670574 24915 cgroups.cpp:2676] Freezing cgroup /sys/fs/cgroup/freezer/mesos/2b29d6d6-b314-477f-b734-7771d07d41e3
I0418 08:25:37.676864 24912 cgroups.cpp:1409] Successfully froze cgroup /sys/fs/cgroup/freezer/mesos/2b29d6d6-b314-477f-b734-7771d07d41e3 after 6.061056ms
I0418 08:25:37.680552 24913 cgroups.cpp:2694] Thawing cgroup /sys/fs/cgroup/freezer/mesos/2b29d6d6-b314-477f-b734-7771d07d41e3
I0418 08:25:37.683346 24913 cgroups.cpp:1438] Successfully thawed cgroup /sys/fs/cgroup/freezer/mesos/2b29d6d6-b314-477f-b734-7771d07d41e3 after 2.46016ms
I0418 08:25:37.874023 24914 cni.cpp:1121] Unmounted the network namespace handle '/run/mesos/isolators/network/cni/2b29d6d6-b314-477f-b734-7771d07d41e3/ns' for container 2b29d6d6-b314-477f-b734-7771d07d41e3
I0418 08:25:37.874194 24914 cni.cpp:1132] Removed the container directory '/run/mesos/isolators/network/cni/2b29d6d6-b314-477f-b734-7771d07d41e3'
I0418 08:25:37.877306 24912 linux.cpp:814] Ignoring unmounting sandbox/work directory for container 2b29d6d6-b314-477f-b734-7771d07d41e3
I0418 08:25:37.879295 24912 provisioner.cpp:338] Destroying container rootfs at '/tmp/mesos/provisioner/containers/2b29d6d6-b314-477f-b734-7771d07d41e3/backends/copy/rootfses/d219ec3a-ea31-45f6-b578-a62cd02392e7' for container 2b29d6d6-b314-477f-b734-7771d07d41e3
I0418 08:25:37.970871 24914 slave.cpp:4113] Executor 'test' of framework 3c4796f0-eee7-4939-a036-7c6387c370eb-0000 exited with status 1
I0418 08:25:37.975452 24914 slave.cpp:3201] Handling status update TASK_FAILED (UUID: a5e19b2d-b234-4adc-8791-9046af4c1395) for task test of framework 3c4796f0-eee7-4939-a036-7c6387c370eb-0000 from @0.0.0.0:0
W0418 08:25:37.978974 24911 containerizer.cpp:1303] Ignoring update for unknown container: 2b29d6d6-b314-477f-b734-7771d07d41e3
I0418 08:25:37.980370 24917 status_update_manager.cpp:320] Received status update TASK_FAILED (UUID: a5e19b2d-b234-4adc-8791-9046af4c1395) for task test of framework 3c4796f0-eee7-4939-a036-7c6387c370eb-0000
I0418 08:25:37.983105 24913 slave.cpp:3599] Forwarding the update TASK_FAILED (UUID: a5e19b2d-b234-4adc-8791-9046af4c1395) for task test of framework 3c4796f0-eee7-4939-a036-7c6387c370eb-0000 to master@192.168.122.171:5050
I0418 08:25:38.017352 24917 slave.cpp:2232] Asked to shut down framework 3c4796f0-eee7-4939-a036-7c6387c370eb-0000 by master@192.168.122.171:5050
I0418 08:25:38.018487 24917 slave.cpp:2257] Shutting down framework 3c4796f0-eee7-4939-a036-7c6387c370eb-0000
I0418 08:25:38.019630 24917 slave.cpp:4217] Cleaning up executor 'test' of framework 3c4796f0-eee7-4939-a036-7c6387c370eb-0000
I0418 08:25:38.020967 24911 gc.cpp:55] Scheduling '/tmp/mesos/slaves/b74535d8-276f-4e09-ab47-53e3721ab271-S0/frameworks/3c4796f0-eee7-4939-a036-7c6387c370eb-0000/executors/test/runs/2b29d6d6-b314-477f-b734-7771d07d41e3' for gc 6.99999975983704days in the future
I0418 08:25:38.022328 24917 slave.cpp:4305] Cleaning up framework 3c4796f0-eee7-4939-a036-7c6387c370eb-0000
I0418 08:25:38.022847 24915 status_update_manager.cpp:282] Closing status update streams for framework 3c4796f0-eee7-4939-a036-7c6387c370eb-0000
I0418 08:25:38.022459 24912 gc.cpp:55] Scheduling '/tmp/mesos/slaves/b74535d8-276f-4e09-ab47-53e3721ab271-S0/frameworks/3c4796f0-eee7-4939-a036-7c6387c370eb-0000/executors/test' for gc 6.99999974402963days in the future
I0418 08:25:38.023483 24916 gc.cpp:55] Scheduling '/tmp/mesos/slaves/b74535d8-276f-4e09-ab47-53e3721ab271-S0/frameworks/3c4796f0-eee7-4939-a036-7c6387c370eb-0000' for gc 6.99999973582222days in the future
...
{code}
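When triaging an agent log like this one, it can help to split each glog-style line into its severity, timestamp, thread ID, and source location, for example to pick out the lone warning among the info lines. The following is only a minimal Python sketch; the function name {{parse_glog_line}} and the regular expression are illustrative assumptions, not part of Mesos.

```python
import re

# glog-style prefix: severity letter, MMDD, time with microseconds,
# thread ID, then "file:line] message". (Illustrative pattern only.)
GLOG_LINE = re.compile(
    r"^(?P<severity>[IWEF])"
    r"(?P<month>\d{2})(?P<day>\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6}) +"
    r"(?P<thread>\d+) "
    r"(?P<file>[^:\s]+):(?P<line>\d+)\] "
    r"(?P<message>.*)$"
)


def parse_glog_line(text):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = GLOG_LINE.match(text)
    if match is None:
        return None  # e.g. a wrapped continuation line
    fields = match.groupdict()
    fields["thread"] = int(fields["thread"])
    fields["line"] = int(fields["line"])
    return fields


if __name__ == "__main__":
    sample = (
        "W0418 08:25:37.978974 24911 containerizer.cpp:1303] "
        "Ignoring update for unknown container: "
        "2b29d6d6-b314-477f-b734-7771d07d41e3"
    )
    record = parse_glog_line(sample)
    if record is not None and record["severity"] != "I":
        print(record["file"], record["line"], record["message"])
```

Filtering on {{severity}} this way only narrows down where to look; the actual failure here is reported in the executor's stderr rather than in the agent log.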
And this is the stderr of the executor:
{code}
cat /tmp/mesos/slaves/b74535d8-276f-4e09-ab47-53e3721ab271-S0/frameworks/3c4796f0-eee7-4939-a036-7c6387c370eb-0000/executors/test/runs/2b29d6d6-b314-477f-b734-7771d07d41e3/stderr
+ /home/stack/workspace/mesos/build/src/mesos-containerizer mount --help=false --operation=make-rslave --path=/
+ grep -E /tmp/mesos/.+ /proc/self/mountinfo
+ grep -v 2b29d6d6-b314-477f-b734-7771d07d41e3
+ cut -d -f5
+ xargs --no-run-if-empty umount -l
+ mount -n --rbind /tmp/mesos/provisioner/containers/2b29d6d6-b314-477f-b734-7771d07d41e3/backends/copy/rootfses/d219ec3a-ea31-45f6-b578-a62cd02392e7 /tmp/mesos/slaves/b74535d8-276f-4e09-ab47-53e3721ab271-S0/frameworks/3c4796f0-eee7-4939-a036-7c6387c370eb-0000/executors/test/runs/2b29d6d6-b314-477f-b734-7771d07d41e3/.rootfs
Failed to obtain the IP address for '2b29d6d6-b314-477f-b734-7771d07d41e3'; the DNS service may not be able to resolve it: Name or service not known
{code}
So the executor terminated because the libprocess inside it failed to
resolve its own hostname, {{2b29d6d6-b314-477f-b734-7771d07d41e3}} (the container ID); see
https://github.com/apache/mesos/blob/0.28.0/3rdparty/libprocess/src/process.cpp#L929:L935
for details.
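To illustrate the failure mode, here is a minimal Python sketch of a "look up my own hostname" check. It is not the libprocess code; the function name {{resolve_hostname}} is hypothetical, and the sketch only assumes that the lookup goes through the system resolver ({{/etc/hosts}}, then DNS). When the hostname is a container ID that no resolver knows about, the lookup fails in the same way as the "Name or service not known" error above.

```python
import socket


def resolve_hostname(hostname=None):
    """Return the IPv4 address for `hostname`, or None if it cannot be resolved.

    With no argument, the machine's (or container's) own hostname is used.
    """
    if hostname is None:
        hostname = socket.gethostname()
    try:
        return socket.gethostbyname(hostname)
    except OSError:
        # socket.gaierror is a subclass of OSError; it is raised when
        # neither /etc/hosts nor DNS can map the name to an address.
        return None


if __name__ == "__main__":
    # Inside the container from this report, the hostname is the container ID.
    name = "2b29d6d6-b314-477f-b734-7771d07d41e3"
    address = resolve_hostname(name)
    if address is None:
        print("cannot resolve hostname:", name)
    else:
        print(name, "->", address)
```

Running the equivalent check from inside the container's network and UTS namespaces is a quick way to confirm whether the hostname is resolvable before the executor starts.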
> Command executor can not start when joining a CNI network
> ---------------------------------------------------------
>
> Key: MESOS-5225
> URL: https://issues.apache.org/jira/browse/MESOS-5225
> Project: Mesos
> Issue Type: Bug
> Components: isolation
> Reporter: Qian Zhang
> Assignee: Avinash Sridharan