[
https://issues.apache.org/jira/browse/MESOS-8414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16326155#comment-16326155
]
Armand Grillet commented on MESOS-8414:
---------------------------------------
[~kaysoky], could you please take a look, since you worked on
[https://reviews.apache.org/r/43963/]?
> DockerContainerizerTest.ROOT_DOCKER_Logs fails on CentOS 6
> ----------------------------------------------------------
>
> Key: MESOS-8414
> URL: https://issues.apache.org/jira/browse/MESOS-8414
> Project: Mesos
> Issue Type: Bug
> Components: test
> Environment: CentOS 6, Docker version 1.7.1, build 786b29d
> Reporter: Armand Grillet
> Assignee: Armand Grillet
> Priority: Major
> Attachments:
> centos6-ssl-DockerContainerizerTest.ROOT_DOCKER_Logs.txt, centos6-vlog2.txt,
> docker-inspect.json, docker-logs.txt
>
>
> You can find the verbose logs attached.
> The most interesting part:
> {code:java}
> I0108 16:35:45.887037 17805 sched.cpp:897] Received 1 offers
> I0108 16:35:45.887070 17805 sched.cpp:921] Scheduler::resourceOffers took
> 12130ns
> I0108 16:35:45.985957 17808 docker.cpp:349] Unable to detect IP Address at
> 'NetworkSettings.Networks.host.IPAddress', attempting deprecated field
> I0108 16:35:45.986428 17809 task_status_update_manager.cpp:328] Received task
> status update TASK_FAILED (Status UUID: 7f544700-215b-4d27-ab43-b48e19592d00)
> for task 1 of framework f09c89e1-aa62-4662-bda8-15a2c87f412e-0000
> I0108 16:35:45.986552 17809 task_status_update_manager.cpp:383] Forwarding
> task status update TASK_FAILED (Status UUID:
> 7f544700-215b-4d27-ab43-b48e19592d00) for task 1 of framework
> f09c89e1-aa62-4662-bda8-15a2c87f412e-0000 to the agent
> I0108 16:35:45.986654 17809 slave.cpp:5209] Forwarding the update TASK_FAILED
> (Status UUID: 7f544700-215b-4d27-ab43-b48e19592d00) for task 1 of framework
> f09c89e1-aa62-4662-bda8-15a2c87f412e-0000 to [email protected]:37252
> I0108 16:35:45.986795 17809 slave.cpp:5102] Task status update manager
> successfully handled status update TASK_FAILED (Status UUID:
> 7f544700-215b-4d27-ab43-b48e19592d00) for task 1 of framework
> f09c89e1-aa62-4662-bda8-15a2c87f412e-0000
> I0108 16:35:45.986829 17809 slave.cpp:5118] Sending acknowledgement for
> status update TASK_FAILED (Status UUID: 7f544700-215b-4d27-ab43-b48e19592d00)
> for task 1 of framework f09c89e1-aa62-4662-bda8-15a2c87f412e-0000 to
> executor(1)@172.16.10.110:38499
> I0108 16:35:45.986901 17805 master.cpp:7890] Status update TASK_FAILED
> (Status UUID: 7f544700-215b-4d27-ab43-b48e19592d00) for task 1 of framework
> f09c89e1-aa62-4662-bda8-15a2c87f412e-0000 from agent
> f09c89e1-aa62-4662-bda8-15a2c87f412e-S0 at slave(1)@172.16.10.110:37252
> (ip-172-16-10-110.ec2.internal)
> I0108 16:35:45.986928 17805 master.cpp:7946] Forwarding status update
> TASK_FAILED (Status UUID: 7f544700-215b-4d27-ab43-b48e19592d00) for task 1 of
> framework f09c89e1-aa62-4662-bda8-15a2c87f412e-0000
> I0108 16:35:45.986984 17805 master.cpp:10193] Updating the state of task 1 of
> framework f09c89e1-aa62-4662-bda8-15a2c87f412e-0000 (latest state:
> TASK_FAILED, status update state: TASK_FAILED)
> I0108 16:35:45.987047 17805 sched.cpp:990] Received status update TASK_FAILED
> (Status UUID: 7f544700-215b-4d27-ab43-b48e19592d00) for task 1 of framework
> f09c89e1-aa62-4662-bda8-15a2c87f412e-0000 from slave(1)@172.16.10.110:37252
> I0108 16:35:45.987103 17805 sched.cpp:1029] Scheduler::statusUpdate took
> 30948ns
> I0108 16:35:45.987112 17805 sched.cpp:1048] Sending ACK for status update
> TASK_FAILED (Status UUID: 7f544700-215b-4d27-ab43-b48e19592d00) for task 1 of
> framework f09c89e1-aa62-4662-bda8-15a2c87f412e-0000 to
> [email protected]:37252
> I0108 16:35:45.987221 17805 master.cpp:5826] Processing ACKNOWLEDGE call
> 7f544700-215b-4d27-ab43-b48e19592d00 for task 1 of framework
> f09c89e1-aa62-4662-bda8-15a2c87f412e-0000 (default) at
> [email protected]:37252 on agent
> f09c89e1-aa62-4662-bda8-15a2c87f412e-S0
> I0108 16:35:45.987267 17805 master.cpp:10299] Removing task 1 with resources
> cpus(allocated: *):2; mem(allocated: *):1024; disk(allocated: *):1024;
> ports(allocated: *):[31000-32000] of framework
> f09c89e1-aa62-4662-bda8-15a2c87f412e-0000 on agent
> f09c89e1-aa62-4662-bda8-15a2c87f412e-S0 at slave(1)@172.16.10.110:37252
> (ip-172-16-10-110.ec2.internal)
> I0108 16:35:45.987473 17807 task_status_update_manager.cpp:401] Received task
> status update acknowledgement (UUID: 7f544700-215b-4d27-ab43-b48e19592d00)
> for task 1 of framework f09c89e1-aa62-4662-bda8-15a2c87f412e-0000
> I0108 16:35:45.987561 17807 task_status_update_manager.cpp:538] Cleaning up
> status update stream for task 1 of framework
> f09c89e1-aa62-4662-bda8-15a2c87f412e-0000
> I0108 16:35:45.987814 17807 slave.cpp:3974] Task status update manager
> successfully handled status update acknowledgement (UUID:
> 7f544700-215b-4d27-ab43-b48e19592d00) for task 1 of framework
> f09c89e1-aa62-4662-bda8-15a2c87f412e-0000
> I0108 16:35:45.987849 17807 slave.cpp:8935] Completing task 1
> {code}
> After further testing,
> [https://github.com/apache/mesos/blob/51a3bd95bd2d740a39b55634251abeadb561e5c8/src/docker/docker.cpp#L384]
> appears to never be reached as {{ipAddressValue->value}} is an empty string.
> What happens according to {{docker logs}}:
> {code:java}
> The system has no more ptys. Ask your system administrator to create more.
> while executing
> "spawn -noecho echo out849cd741-4ba7-429c-979f-da5cb0b16e0c"
> ("eval" body line 1)
> invoked from within
> "eval [list spawn -noecho] $argv"
> invoked from within
> "if {[string compare [lindex $argv 0] "-p"] == 0} {
> # pipeline
> set stty_init "-echo"
> eval [list spawn -noecho] [lrange $argv 1 end]
> clo..."
> (file "/usr/bin/unbuffer" line 13)
> The system has no more ptys. Ask your system administrator to create more.
> while executing
> "spawn -noecho echo err849cd741-4ba7-429c-979f-da5cb0b16e0c"
> ("eval" body line 1)
> invoked from within
> "eval [list spawn -noecho] $argv"
> invoked from within
> "if {[string compare [lindex $argv 0] "-p"] == 0} {
> # pipeline
> set stty_init "-echo"
> eval [list spawn -noecho] [lrange $argv 1 end]
> clo..."
> (file "/usr/bin/unbuffer" line 13)
> {code}
> Removing {{unbuffer}} from the commands in
> [https://github.com/apache/mesos/blob/9c03a463c1ac8f63dc00255945a04016c45f04e9/src/tests/containerizer/docker_containerizer_tests.cpp#L2150-L2151]
> solves the issue, but it is there for a reason:
> [https://reviews.apache.org/r/43963]. Without {{unbuffer}}, the test
> passed every time when run 100 times using {{GLOG_v=2 sudo ./mesos-tests.sh
> --gtest_filter="DockerContainerizerTest.ROOT_DOCKER_Logs"
> --gtest_break_on_failure --gtest_repeat=100 --verbose}}.
> The failure can be reproduced without Mesos on CentOS 6 (using Docker
> version 1.7.1, build 786b29d):
> {code:java}
> [centos@ip-10-16-234-12 bin]$ sudo docker run -i -t mesosphere/alpine-expect
> /bin/sh
> / # unbuffer echo mesos
> The system has no more ptys. Ask your system administrator to create more.
> while executing
> "spawn -noecho echo mesos"
> ("eval" body line 1)
> invoked from within
> "eval [list spawn -noecho] $argv"
> invoked from within
> "if {[string compare [lindex $argv 0] "-p"] == 0} {
> # pipeline
> set stty_init "-echo"
> eval [list spawn -noecho] [lrange $argv 1 end]
> clo..."
> (file "/usr/bin/unbuffer" line 13)
> / # echo mesos
> mesos{code}
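One way to narrow down the "no more ptys" error is to check whether the device nodes that pty allocation normally relies on are visible inside the container. This is only a diagnostic sketch: the report does not confirm that missing nodes are the cause, and the function name `pty_devices_present` and its optional ROOT argument are hypothetical, introduced here for illustration.

```shell
# Hypothetical helper (not part of Mesos or Docker).
# Reports whether /dev/ptmx and /dev/pts exist under ROOT.
# Usage: pty_devices_present [ROOT]
#   ROOT defaults to the empty string, i.e. the running system's /dev.
pty_devices_present() {
    root="${1:-}"
    if [ -e "$root/dev/ptmx" ] && [ -d "$root/dev/pts" ]; then
        echo "present"
    else
        echo "missing"
        return 1
    fi
}
```

Running `pty_devices_present` from the `/bin/sh` prompt inside the container would show whether both nodes exist there. If they do, the problem more likely lies in how devpts is mounted or in its permissions, which this check does not cover.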
> By default, {{unbuffer}} is not installed on CentOS 6. Installing it on the
> host does not resolve the issue in the container, although the command then
> works on the CentOS 6 host itself:
> {code:java}
> [centos@ip-10-16-234-12 bin]$ sudo yum install expect
> [centos@ip-10-16-234-12 bin]$ unbuffer echo mesos
> mesos{code}
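Since `unbuffer` works on the host but not in the container, a wrapper that probes `unbuffer` before using it can help when comparing the two environments. The sketch below is an assumption-laden illustration, not a proposed change to the test: `run_maybe_unbuffered` is a hypothetical name, and falling back to a direct invocation gives up the unbuffering that https://reviews.apache.org/r/43963 introduced.

```shell
# Hypothetical helper: run a command through unbuffer only when unbuffer
# is installed and can spawn a trivial command; otherwise run it directly.
run_maybe_unbuffered() {
    # Probe with a no-op command. The probe fails when unbuffer is missing
    # or exits non-zero (for example, when its spawn call errors out).
    if unbuffer true >/dev/null 2>&1; then
        unbuffer "$@"
    else
        "$@"
    fi
}
```

For example, `run_maybe_unbuffered echo mesos` can be run both on the host and inside the container. Output that passes through a pty may contain carriage returns, so strip them before comparing the results.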