[jira] [Updated] (MESOS-8105) Docker containerizer fails with "Unable to get executor pid after launch"
[ https://issues.apache.org/jira/browse/MESOS-8105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] maybob updated MESOS-8105: -- Description: When running many commands at the same time, with each command using the same executor image but a different executorId under Docker, some executors fail with the error "Unable to get executor pid after launch". This may happen because "docker inspect" hangs, or exits 0 while reporting pid 0. Another possible cause is that running many Docker containers consumes a large amount of resources, e.g. file descriptors. {color:red}Log:{color} {code:java} I1012 16:15:01.003931 124081 slave.cpp:1619] Got assigned task '920860' for framework framework-id-daily I1012 16:15:01.006091 124081 slave.cpp:1900] Authorizing task '920860' for framework framework-id-daily I1012 16:15:01.008281 124081 slave.cpp:2087] Launching task '920860' for framework framework-id-daily I1012 16:15:01.008779 124081 paths.cpp:573] Trying to chown '/volumes/sdb1/mesos/slaves/89192f68-d28f-498c-808f-442a1ef576b3-S2/frameworks/framework-id-daily/executors/Executor_920860/runs/29c82b61-1242-4de9-80cf-16f46c30e7e3' to user 'maybob' I1012 16:15:01.009027 124081 slave.cpp:7401] Checkpointing ExecutorInfo to '/volumes/sdb1/mesos/meta/slaves/89192f68-d28f-498c-808f-442a1ef576b3-S2/frameworks/framework-id-daily/executors/Executor_920860/executor.info' I1012 16:15:01.009546 124081 slave.cpp:7038] Launching executor 'Executor_920860' of framework framework-id-daily with resources {} in work directory '/volumes/sdb1/mesos/slaves/89192f68-d28f-498c-808f-442a1ef576b3-S2/frameworks/framework-id-daily/executors/Executor_920860/runs/29c82b61-1242-4de9-80cf-16f46c30e7e3' I1012 16:15:01.010339 124081 slave.cpp:7429] Checkpointing TaskInfo to '/volumes/sdb1/mesos/meta/slaves/89192f68-d28f-498c-808f-442a1ef576b3-S2/frameworks/framework-id-daily/executors/Executor_920860/runs/29c82b61-1242-4de9-80cf-16f46c30e7e3/tasks/920860/task.info' I1012 16:15:01.010726 124081 slave.cpp:2316] Queued task '920860' for executor 'Executor_920860' of
framework framework-id-daily I1012 16:15:01.011740 124088 docker.cpp:1175] Starting container '29c82b61-1242-4de9-80cf-16f46c30e7e3' for executor 'Executor_920860' and framework framework-id-daily I1012 16:15:01.013123 124081 slave.cpp:877] Successfully attached file '/volumes/sdb1/mesos/slaves/89192f68-d28f-498c-808f-442a1ef576b3-S2/frameworks/framework-id-daily/executors/Executor_920860/runs/29c82b61-1242-4de9-80cf-16f46c30e7e3' I1012 16:15:01.013290 124080 fetcher.cpp:353] Starting to fetch URIs for container: 29c82b61-1242-4de9-80cf-16f46c30e7e3, directory: /volumes/sdb1/mesos/slaves/89192f68-d28f-498c-808f-442a1ef576b3-S2/frameworks/framework-id-daily/executors/Executor_920860/runs/29c82b61-1242-4de9-80cf-16f46c30e7e3 I1012 16:15:01.706429 124071 docker.cpp:909] Running docker -H unix:///var/run/docker.sock run --cpu-shares 378 --memory 427819008 -e LIBPROCESS_PORT=0 -e MESOS_AGENT_ENDPOINT=xxx.xxx.xxx.xxx:5051 -e MESOS_CHECKPOINT=1 -e MESOS_CONTAINER_NAME=mesos-89192f68-d28f-498c-808f-442a1ef576b3-S2.29c82b61-1242-4de9-80cf-16f46c30e7e3 -e MESOS_DIRECTORY=/volumes/sdb1/mesos/slaves/89192f68-d28f-498c-808f-442a1ef576b3-S2/frameworks/framework-id-daily/executors/Executor_920860/runs/29c82b61-1242-4de9-80cf-16f46c30e7e3 -e MESOS_EXECUTOR_ID=Executor_920860 -e MESOS_EXECUTOR_SHUTDOWN_GRACE_PERIOD=5secs -e MESOS_FRAMEWORK_ID=framework-id-daily -e MESOS_HTTP_COMMAND_EXECUTOR=0 -e MESOS_NATIVE_JAVA_LIBRARY=/usr/local/lib/libmesos-1.3.1.so -e MESOS_NATIVE_LIBRARY=/usr/local/lib/libmesos-1.3.1.so -e MESOS_RECOVERY_TIMEOUT=15mins -e MESOS_SANDBOX=/mnt/mesos/sandbox -e MESOS_SLAVE_ID=89192f68-d28f-498c-808f-442a1ef576b3-S2 -e MESOS_SLAVE_PID=slave(1)@xxx.xxx.xxx.xxx:5051 -e MESOS_SUBSCRIPTION_BACKOFF_MAX=2secs -v /volumes/sdb1/mesos/slaves/89192f68-d28f-498c-808f-442a1ef576b3-S2/frameworks/framework-id-daily/executors/Executor_920860/runs/29c82b61-1242-4de9-80cf-16f46c30e7e3:/mnt/mesos/sandbox --net host --entrypoint /bin/sh --name 
mesos-89192f68-d28f-498c-808f-442a1ef576b3-S2.29c82b61-1242-4de9-80cf-16f46c30e7e3 reg.docker.xxx/xx/executor:v25 -c env && cd $MESOS_SANDBOX && ./executor.sh I1012 16:15:01.717859 124071 docker.cpp:1071] Running docker -H unix:///var/run/docker.sock inspect mesos-89192f68-d28f-498c-808f-442a1ef576b3-S2.29c82b61-1242-4de9-80cf-16f46c30e7e3 I1012 16:15:02.033951 124085 docker.cpp:1118] Retrying inspect with non-zero status code. cmd: 'docker -H unix:///var/run/docker.sock inspect mesos-89192f68-d28f-498c-808f-442a1ef576b3-S2.29c82b61-1242-4de9-80cf-16f46c30e7e3', interval: 1secs I1012 16:15:03.034230 124090 docker.cpp:1071] Running docker -H unix:///var/run/docker.sock inspect mesos-89192f68-d28f-498c-808f-442a1ef576b3-S2.29c82b61-1242-4de9-80cf-16f46c30e7e3 I1012 16:15:03.518020 124078 docker.cpp:1071] Running docker -H unix:///var/run/docker.sock inspect mesos-89192f68-d28f-498c-808f-442a1ef576b3-S2.29c82b61-1242-4de9-80cf-16f46c3
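The retry behavior described in this report (repeated `docker inspect` calls at an interval until a usable pid appears, or giving up) can be sketched as below. This is not Mesos code: the helper name and the injectable `docker` command tuple are hypothetical, added so the polling logic is self-contained.

```python
import json
import subprocess
import time

def inspect_container_pid(name, interval=1.0, timeout=30.0, docker=("docker",)):
    """Poll `docker inspect` until the container reports a nonzero pid.

    Returns the pid, or None when inspect keeps failing, hangs, or keeps
    reporting pid 0 -- the situation in which the containerizer gives up
    with "Unable to get executor pid after launch".
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            proc = subprocess.run(
                list(docker) + ["inspect", name],
                capture_output=True, timeout=timeout)
        except subprocess.TimeoutExpired:
            continue  # the inspect call itself hung; try again
        if proc.returncode == 0:
            try:
                pid = json.loads(proc.stdout)[0]["State"]["Pid"]
            except (ValueError, KeyError, IndexError):
                pid = 0
            if pid:  # pid 0 means the container has not actually started
                return pid
        time.sleep(interval)  # non-zero exit or pid 0: retry after interval
    return None
```

Under heavy load (many concurrent `docker run`s exhausting file descriptors), every iteration can keep returning pid 0 until the deadline passes, which matches the failure mode described above.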
[jira] [Updated] (MESOS-8032) Launch CSI plugins in storage local resource provider.
[ https://issues.apache.org/jira/browse/MESOS-8032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chun-Hung Hsiao updated MESOS-8032: --- Component/s: (was: agent) storage > Launch CSI plugins in storage local resource provider. > -- > > Key: MESOS-8032 > URL: https://issues.apache.org/jira/browse/MESOS-8032 > Project: Mesos > Issue Type: Task > Components: storage >Reporter: Chun-Hung Hsiao >Assignee: Chun-Hung Hsiao > Labels: mesosphere, storage > Fix For: 1.5.0 > > > Launching a CSI plugin requires the following steps: > 1. Verify the configuration. > 2. Prepare a directory in the work directory of the resource provider where > the socket file should be placed, and construct the path of the socket file. > 3. If the socket file already exists and the plugin is already running, we > should not launch another plugin instance. > 4. Otherwise, launch a standalone container to run the plugin and connect to > it through the socket file. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
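The four launch steps listed in MESOS-8032 can be sketched as follows. This is illustrative only, not the Mesos implementation: `is_running` and `launch_container` are hypothetical callbacks standing in for the real liveness check and standalone-container launch, and the socket file name is made up.

```python
import os

def launch_csi_plugin(workdir, plugin_name, is_running, launch_container):
    # Step 1 (verifying the configuration) is assumed done by the caller.
    # Step 2: prepare a directory under the resource provider's work
    # directory and construct the socket path inside it.
    sock_dir = os.path.join(workdir, "csi", plugin_name)
    os.makedirs(sock_dir, exist_ok=True)
    socket_path = os.path.join(sock_dir, "endpoint.sock")
    # Step 3: if the socket file already exists and the plugin answers on
    # it, reuse it rather than launching a second plugin instance.
    if os.path.exists(socket_path) and is_running(socket_path):
        return socket_path
    # Step 4: launch a standalone container serving the plugin on that
    # socket, then connect through it.
    launch_container(plugin_name, socket_path)
    return socket_path
```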
[jira] [Updated] (MESOS-7561) Add storage resource provider specific information in ResourceProviderInfo.
[ https://issues.apache.org/jira/browse/MESOS-7561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chun-Hung Hsiao updated MESOS-7561: --- Component/s: storage > Add storage resource provider specific information in ResourceProviderInfo. > --- > > Key: MESOS-7561 > URL: https://issues.apache.org/jira/browse/MESOS-7561 > Project: Mesos > Issue Type: Task > Components: storage >Reporter: Jie Yu >Assignee: Chun-Hung Hsiao > Labels: mesosphere, storage > Fix For: 1.5.0 > > > For a storage resource provider, there will be some specific configuration > information. For instance, the most important piece is the `ContainerConfig` of > the CSI Plugin container. > That config information will be sent to the corresponding agent that will use > the resources provided by the resource provider. For a storage resource > provider in particular, the agent needs to launch the CSI Node Plugin to mount > the volumes. > Compared to adding first-class storage resource provider information, an > alternative is to add a generic labels field in ResourceProviderInfo and let > the resource provider itself figure out the format of the labels. However, I > believe a first-class solution is better and clearer.
[jira] [Created] (MESOS-8108) Process offer operations in storage local resource provider
Chun-Hung Hsiao created MESOS-8108: -- Summary: Process offer operations in storage local resource provider Key: MESOS-8108 URL: https://issues.apache.org/jira/browse/MESOS-8108 Project: Mesos Issue Type: Task Components: storage Reporter: Chun-Hung Hsiao Assignee: Chun-Hung Hsiao Fix For: 1.5.0 The storage local resource provider receives offer operations for reservations and resource conversions, and invokes the proper CSI calls to implement these operations.
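A rough sketch of the dispatch this ticket describes, mapping offer operations either to local bookkeeping or to CSI calls. The operation type names and the `csi` client methods here are hypothetical placeholders, not the real Mesos protobuf messages or CSI stubs.

```python
def apply_offer_operation(operation, csi, state):
    """Map an offer operation to bookkeeping and/or a CSI call."""
    op_type = operation["type"]
    if op_type in ("RESERVE", "UNRESERVE"):
        # Pure resource conversions: update local bookkeeping only,
        # no CSI interaction is required.
        state[op_type.lower()] = operation["resources"]
        return None
    if op_type == "CREATE_VOLUME":
        return csi.create_volume(operation["name"], operation["size"])
    if op_type == "DESTROY_VOLUME":
        return csi.delete_volume(operation["volume_id"])
    raise ValueError("unsupported offer operation: " + op_type)
```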
[jira] [Updated] (MESOS-8101) Import resources from CSI plugins in storage local resource provider.
[ https://issues.apache.org/jira/browse/MESOS-8101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chun-Hung Hsiao updated MESOS-8101: --- Component/s: (was: agent) storage > Import resources from CSI plugins in storage local resource provider. > - > > Key: MESOS-8101 > URL: https://issues.apache.org/jira/browse/MESOS-8101 > Project: Mesos > Issue Type: Task > Components: storage >Reporter: Chun-Hung Hsiao >Assignee: Chun-Hung Hsiao > Labels: mesosphere, storage > Fix For: 1.5.0 > > > The following lists the steps to import resources from a CSI plugin: > 1. Launch the node plugin > 1.1 GetSupportedVersions > 1.2 GetPluginInfo > 1.3 ProbeNode > 1.4 GetNodeCapabilities > 2. Launch the controller plugin > 2.1 GetSupportedVersions > 2.2 GetPluginInfo > 2.3 GetControllerCapabilities > 3. GetCapacity > 4. ListVolumes > 5. Report to the resource provider through UPDATE_TOTAL_RESOURCES
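The sequence above can be sketched as ordered calls against two hypothetical CSI client stubs (`node`, `controller`); the snake_case method names mirror the RPCs listed in the ticket but are placeholders, not a real CSI client library.

```python
def import_resources(node, controller, update_total):
    """Run the import sequence and report totals upstream."""
    # 1. Node plugin handshake.
    node.get_supported_versions()
    node.get_plugin_info()
    node.probe_node()
    node.get_node_capabilities()
    # 2. Controller plugin handshake.
    controller.get_supported_versions()
    controller.get_plugin_info()
    controller.get_controller_capabilities()
    # 3-4. Discover capacity and any pre-existing volumes.
    capacity = controller.get_capacity()
    volumes = controller.list_volumes()
    # 5. Report the discovered total to the resource provider manager
    # (standing in for the UPDATE_TOTAL_RESOURCES call).
    update_total(capacity, volumes)
    return capacity, volumes
```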
[jira] [Created] (MESOS-8107) Add a call to update total resources in the resource provider API.
Chun-Hung Hsiao created MESOS-8107: -- Summary: Add a call to update total resources in the resource provider API. Key: MESOS-8107 URL: https://issues.apache.org/jira/browse/MESOS-8107 Project: Mesos Issue Type: Task Reporter: Chun-Hung Hsiao Assignee: Chun-Hung Hsiao Fix For: 1.5.0 We should add a call for a resource provider to update its total resources, remove {{resources}} from the {{SUBSCRIBE}} call, and instead move to a protocol where a resource provider first subscribes and then updates its resources.
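The proposed subscribe-then-update protocol can be sketched as below. The message type names follow the ticket, but the dict-based wire format and the `send` transport callback are made up for illustration; the real API uses protobuf messages.

```python
class ResourceProviderClient:
    """Sketch of the proposed protocol: SUBSCRIBE carries no resources;
    totals arrive only in a later UPDATE_TOTAL_RESOURCES call."""

    def __init__(self, send):
        self.send = send          # transport callback, e.g. an HTTP POST
        self.subscribed = False

    def subscribe(self, info):
        # Per the ticket, SUBSCRIBE no longer carries a 'resources' field.
        self.send({"type": "SUBSCRIBE", "info": info})
        self.subscribed = True

    def update_total_resources(self, resources):
        # Totals may only be reported after a successful subscription.
        if not self.subscribed:
            raise RuntimeError("must SUBSCRIBE before updating totals")
        self.send({"type": "UPDATE_TOTAL_RESOURCES", "resources": resources})
```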
[jira] [Updated] (MESOS-6792) MasterSlaveReconciliationTest.ReconcileLostTask test is flaky
[ https://issues.apache.org/jira/browse/MESOS-6792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Rukletsov updated MESOS-6792: --- Description: The test {{MasterSlaveReconciliationTest.ReconcileLostTask}} is flaky for me as of {{e99ea9ce8b1de01dd8b3cac6675337edb6320f38}}, {code} Repeating all tests (iteration 912) . . . Note: Google Test filter = <...> [==] Running 1 test from 1 test case. [--] Global test environment set-up. [--] 1 test from MasterSlaveReconciliationTest [ RUN ] MasterSlaveReconciliationTest.SlaveReregisterTerminatedExecutor I1214 04:41:11.559672 2005 cluster.cpp:160] Creating default 'local' authorizer I1214 04:41:11.560848 2045 master.cpp:380] Master 87dd8179-dd7d-4270-ace2-ea771b57371c (gru1.hw.ca1.mesosphere.com) started on 192.99.40.208:37659 I1214 04:41:11.560878 2045 master.cpp:382] Flags at startup: --acls="" --agent_ping_timeout="15secs" --agent_reregister_timeout="10mins" --allocation_interval="1secs" --allocator="HierarchicalDRF" --authenticate_agents="true" --authenticate_frameworks="true" --authenticate_http_frameworks="true" --authenticate_http_readonly="true" --authenticate_http_readwrite="true" --authenticators="crammd5" --authorizers="local" --credentials="/tmp/cXHI89/credentials" --framework_sorter="drf" --help="false" --hostname_lookup="true" --http_authenticators="basic" --http_framework_authenticators="basic" --initialize_driver_logging="true" --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" --max_agent_ping_timeouts="5" --max_completed_frameworks="50" --max_completed_tasks_per_framework="1000" --quiet="false" --recovery_agent_removal_limit="100%" --registry="in_memory" --registry_fetch_timeout="1mins" --registry_gc_interval="15mins" --registry_max_agent_age="2weeks" --registry_max_agent_count="102400" --registry_store_timeout="100secs" --registry_strict="false" --root_submissions="true" --user_sorter="drf" --version="false" 
--webui_dir="/home/bbannier/src/mesos/build/P/share/mesos/webui" --work_dir="/tmp/cXHI89/master" --zk_session_timeout="10secs" I1214 04:41:11.561079 2045 master.cpp:432] Master only allowing authenticated frameworks to register I1214 04:41:11.561089 2045 master.cpp:446] Master only allowing authenticated agents to register I1214 04:41:11.561095 2045 master.cpp:459] Master only allowing authenticated HTTP frameworks to register I1214 04:41:11.561101 2045 credentials.hpp:39] Loading credentials for authentication from '/tmp/cXHI89/credentials' I1214 04:41:11.561194 2045 master.cpp:504] Using default 'crammd5' authenticator I1214 04:41:11.561236 2045 http.cpp:922] Using default 'basic' HTTP authenticator for realm 'mesos-master-readonly' I1214 04:41:11.561274 2045 http.cpp:922] Using default 'basic' HTTP authenticator for realm 'mesos-master-readwrite' I1214 04:41:11.561301 2045 http.cpp:922] Using default 'basic' HTTP authenticator for realm 'mesos-master-scheduler' I1214 04:41:11.561326 2045 master.cpp:584] Authorization enabled I1214 04:41:11.562155 2039 master.cpp:2045] Elected as the leading master! 
I1214 04:41:11.562173 2039 master.cpp:1568] Recovering from registrar I1214 04:41:11.562347 2045 registrar.cpp:362] Successfully fetched the registry (0B) in 114944ns I1214 04:41:11.562441 2045 registrar.cpp:461] Applied 1 operations in 7920ns; attempting to update the registry I1214 04:41:11.562621 2048 registrar.cpp:506] Successfully updated the registry in 155136ns I1214 04:41:11.562664 2048 registrar.cpp:392] Successfully recovered registrar I1214 04:41:11.562832 2044 master.cpp:1684] Recovered 0 agents from the registry (166B); allowing 10mins for agents to re-register I1214 04:41:11.568444 2005 cluster.cpp:446] Creating default 'local' authorizer I1214 04:41:11.569344 2005 sched.cpp:232] Version: 1.2.0 I1214 04:41:11.569842 2035 slave.cpp:209] Mesos agent started on (912)@192.99.40.208:37659 I1214 04:41:11.570080 2040 sched.cpp:336] New master detected at master@192.99.40.208:37659 I1214 04:41:11.570117 2040 sched.cpp:402] Authenticating with master master@192.99.40.208:37659 I1214 04:41:11.570127 2040 sched.cpp:409] Using default CRAM-MD5 authenticatee I1214 04:41:11.570220 2040 authenticatee.cpp:121] Creating new client SASL connection I1214 04:41:11.57 2035 slave.cpp:210] Flags at startup: --acls="" --appc_simple_discovery_uri_prefix="http://"; --appc_store_dir="/tmp/mesos/store/appc" --authenticate_http_readonly="true" --authenticate_http_readwrite="true" --authenticatee="crammd5" --authentication_backoff_factor="1secs" --authorizer="local" --cgroups_cpu_enable_pids_and_tids_count="false" --cgroups_enable_cfs="false" --cgroups_hierarchy="/sys/fs/cgroup" --cgroups_limit_swap="false" --cgroups_root="mesos" --container_disk_watch_interval="15secs" --containerizers="mesos" --credential="/tmp/MasterSlaveReconciliationTest_SlaveReregisterTe
[jira] [Commented] (MESOS-8106) Docker fetcher plugin unsupported scheme failure message is not accurate.
[ https://issues.apache.org/jira/browse/MESOS-8106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208122#comment-16208122 ] Gilbert Song commented on MESOS-8106: - This came from this patch: https://reviews.apache.org/r/58778/. When we added the GCE registry support, we did not update the error message. /cc [~chhsia0] [~zhitao] > Docker fetcher plugin unsupported scheme failure message is not accurate. > - > > Key: MESOS-8106 > URL: https://issues.apache.org/jira/browse/MESOS-8106 > Project: Mesos > Issue Type: Improvement > Components: containerization >Reporter: Gilbert Song > Labels: containerizer, docker-fetcher > > https://github.com/apache/mesos/blob/1.4.0/src/uri/fetchers/docker.cpp#L843 > This failure message is not accurate. For example, if the user/operator > gives a wrong credential to a BASIC-auth-based Docker private registry, > the authentication fails but the log still says: "Unsupported > auth-scheme: BASIC"
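A minimal sketch of the message-accuracy fix this issue asks for: only call a scheme unsupported when it actually is, and report a credential problem separately. The helper, the supported-scheme set, and the exact wording are hypothetical, not the Mesos fetcher's code.

```python
def auth_failure_message(scheme, status_code):
    """Produce a failure message that distinguishes an unsupported
    auth-scheme from a credential failure under a supported scheme."""
    supported = {"BASIC", "BEARER"}  # assumed set, for illustration
    if scheme.upper() not in supported:
        return "Unsupported auth-scheme: " + scheme
    if status_code == 401:
        # A 401 under a supported scheme means the credentials are wrong,
        # not that the scheme is unsupported.
        return "Authentication failed for auth-scheme '%s': check credentials" % scheme
    return "Unexpected HTTP %d during '%s' authentication" % (status_code, scheme)
```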
[jira] [Commented] (MESOS-8104) Add code coverage to continuous integration.
[ https://issues.apache.org/jira/browse/MESOS-8104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208117#comment-16208117 ] Vinod Kone commented on MESOS-8104: --- Huge +1. We attempted this a while back but the build itself was hanging/segfaulting IIRC. Need to dig up what tools we used back then. > Add code coverage to continuous integration. > > > Key: MESOS-8104 > URL: https://issues.apache.org/jira/browse/MESOS-8104 > Project: Mesos > Issue Type: Bug > Components: build, test >Reporter: James Peach > > We should integrate code coverage into the CI testing. Adding the right > compiler options looks like > [this|https://github.com/apache/trafficserver/commit/be237ea7ee874355c6f8209b4793dfe4c4fedd88] > in automake (though we need to remove the bugs). We can push the coverage > data to coveralls.io from specific test build configurations, and there's > even precedent for the ASF infra team wiring it into GitHub directly.
[jira] [Commented] (MESOS-8090) Mesos 1.4.0 crashes with 1.3.x agent with oversubscription
[ https://issues.apache.org/jira/browse/MESOS-8090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208103#comment-16208103 ] Zhitao Li commented on MESOS-8090: -- A quick attempt to fix: https://reviews.apache.org/r/63084/ > Mesos 1.4.0 crashes with 1.3.x agent with oversubscription > -- > > Key: MESOS-8090 > URL: https://issues.apache.org/jira/browse/MESOS-8090 > Project: Mesos > Issue Type: Bug > Components: master, oversubscription >Affects Versions: 1.4.0 >Reporter: Zhitao Li >Assignee: Michael Park > > We are seeing a crash in 1.4.0 master when it receives {{updateSlave}} from a > over-subscription enabled agent running 1.3.1 code. > The crash line is: > {code:none} > resources.cpp:1050] Check failed: !resource.has_role() cpus{REV}:19 > {code} > Stack trace in gdb: > {panel:title=My title} > #0 0x7f22f3553067 in __GI_raise (sig=sig@entry=6) at > ../nptl/sysdeps/unix/sysv/linux/raise.c:56 > #1 0x7f22f3554448 in __GI_abort () at abort.c:89 > #2 0x7f22f615cd79 in google::DumpStackTraceAndExit () at > src/utilities.cc:147 > #3 0x7f22f6154a4d in google::LogMessage::Fail () at src/logging.cc:1458 > #4 0x7f22f61566cd in google::LogMessage::SendToLog (this= out>) at src/logging.cc:1412 > #5 0x7f22f6154612 in google::LogMessage::Flush (this=0x18ac7) at > src/logging.cc:1281 > #6 0x7f22f61570b9 in google::LogMessageFatal::~LogMessageFatal > (this=, __in_chrg=) at src/logging.cc:1984 > #7 0x7f22f527e133 in mesos::Resources::isEmpty (resource=...) at > /mesos/src/common/resources.cpp:1051 > #8 0x7f22f527e1e5 in mesos::Resources::Resource_::isEmpty > (this=this@entry=0x7f22e713d2e0) at /mesos/src/common/resources.cpp:1173 > #9 0x7f22f527e20c in mesos::Resources::add (this=0x7f22e713d400, > that=...) at /mesos/src/common/resources.cpp:1993 > #10 0x7f22f527f860 in mesos::Resources::operator+= > (this=this@entry=0x7f22e713d400, that=...) 
at > /mesos/src/common/resources.cpp:2016 > #11 0x7f22f527f91d in mesos::Resources::operator+= > (this=this@entry=0x7f22e713d400, that=...) at > /mesos/src/common/resources.cpp:2025 > #12 0x7f22f527fa4b in mesos::Resources::Resources (this=0x7f22e713d400, > _resources=...) at /mesos/src/common/resources.cpp:1277 > #13 0x7f22f548b812 in mesos::internal::master::Master::updateSlave > (this=0x558137bbae70, message=...) at /mesos/src/master/master.cpp:6681 > #14 0x7f22f550adc1 in > ProtobufProcess::_handlerM > (t=0x558137bbae70, method= > (void > (mesos::internal::master::Master::*)(mesos::internal::master::Master * const, > const mesos::internal::UpdateSlaveMessage &)) 0x7f22f548b6d0 > const&)>, > data="\n)\n'07ba28cc-d9fa-44fb-8d6b-f8c5c90f8a90-S1\022\030\n\004cpus\020\000\032\t\t\000\000\000\000\000\000\063@2\001*J") > at /mesos/3rdparty/libprocess/include/process/protobuf.hpp:799 > #15 0x7f22f54c8791 in > ProtobufProcess::visit (this=0x558137bbae70, > event=...) at /mesos/3rdparty/libprocess/include/process/protobuf.hpp:104 > #16 0x7f22f54572d4 in mesos::internal::master::Master::_visit > (this=this@entry=0x558137bbae70, event=...) at > /mesos/src/master/master.cpp:1643 > #17 0x7f22f547014d in mesos::internal::master::Master::visit > (this=0x558137bbae70, event=...) at /mesos/src/master/master.cpp:1575 > #18 0x7f22f60b7169 in serve (event=..., this=0x558137bbbf28) at > /mesos/3rdparty/libprocess/include/process/process.hpp:87 > #19 process::ProcessManager::resume (this=, > process=0x558137bbbf28) at /mesos/3rdparty/libprocess/src/process.cpp:3346 > #20 0x7f22f60bd056 in operator() (__closure=0x558137aa3218) at > /mesos/3rdparty/libprocess/src/process.cpp:2881 > #21 _M_invoke<> (this=0x558137aa3218) at /usr/include/c++/4.9/functional:1700 > #22 operator() (this=0x558137aa3218) at /usr/include/c++/4.9/functional:1688 > #23 > std::thread::_Impl()> > >::_M_run(void) (this=0x558137aa3200) at /usr/include/c++/4.9/thread:115 > #24 0x7f22f40b3970 in ?? 
() from /usr/lib/x86_64-linux-gnu/libstdc++.so.6 > #25 0x7f22f38d1064 in start_thread (arg=0x7f22e713e700) at > pthread_create.c:309 > #26 0x7f22f360662d in clone () at > ../sysdeps/unix/sysv/linux/x86_64/clone.S:111 > {panel}
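The crash comes from the 1.4.0 master CHECK-ing that incoming resources carry no top-level `role` field (the post-reservation-refinement format), while a 1.3.x agent still sends the old format. A hedged sketch of the kind of upgrade conversion a fix might perform, using plain dicts as stand-ins for the protobuf messages; the field names are illustrative only and cover only the static-reservation case.

```python
def upgrade_resource(resource):
    """Convert a pre-reservation-refinement resource (top-level 'role')
    into the newer form with a 'reservations' list, instead of crashing
    on it. Illustrative only; the real code manipulates protobufs."""
    resource = dict(resource)  # do not mutate the caller's message
    role = resource.pop("role", None)
    if role and role != "*":
        # A non-default role with no explicit reservation info maps to a
        # static reservation in the refined representation.
        resource.setdefault("reservations", []).append(
            {"type": "STATIC", "role": role})
    return resource
```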
[jira] [Created] (MESOS-8106) Docker fetcher plugin unsupported scheme failure message is not accurate.
Gilbert Song created MESOS-8106: --- Summary: Docker fetcher plugin unsupported scheme failure message is not accurate. Key: MESOS-8106 URL: https://issues.apache.org/jira/browse/MESOS-8106 Project: Mesos Issue Type: Improvement Components: containerization Reporter: Gilbert Song https://github.com/apache/mesos/blob/1.4.0/src/uri/fetchers/docker.cpp#L843 This failure message is not accurate. For example, if the user/operator gives a wrong credential to a BASIC-auth-based Docker private registry, the authentication fails but the log still says: "Unsupported auth-scheme: BASIC"
[jira] [Commented] (MESOS-7504) Parent's mount namespace cannot be determined when launching a nested container.
[ https://issues.apache.org/jira/browse/MESOS-7504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207751#comment-16207751 ] Andrei Budnik commented on MESOS-7504: -- List of failing tests: {{NestedMesosContainerizerTest.ROOT_CGROUPS_DestroyDebugContainerOnRecover}} {{ROOT_CGROUPS_DebugNestedContainerInheritsEnvironment}} > Parent's mount namespace cannot be determined when launching a nested > container. > > > Key: MESOS-7504 > URL: https://issues.apache.org/jira/browse/MESOS-7504 > Project: Mesos > Issue Type: Bug > Components: containerization >Affects Versions: 1.3.0 > Environment: Ubuntu 16.04 >Reporter: Alexander Rukletsov >Assignee: Andrei Budnik > Labels: containerizer, flaky-test, mesosphere > > I've observed this failure twice in different Linux environments. Here is an > example of such failure: > {noformat} > [ RUN ] > NestedMesosContainerizerTest.ROOT_CGROUPS_DestroyDebugContainerOnRecover > I0509 21:53:25.471657 17167 containerizer.cpp:221] Using isolation: > cgroups/cpu,filesystem/linux,namespaces/pid,network/cni,volume/image > I0509 21:53:25.475124 17167 linux_launcher.cpp:150] Using > /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher > I0509 21:53:25.475407 17167 provisioner.cpp:249] Using default backend > 'overlay' > I0509 21:53:25.481232 17186 containerizer.cpp:608] Recovering containerizer > I0509 21:53:25.482295 17186 provisioner.cpp:410] Provisioner recovery complete > I0509 21:53:25.482587 17187 containerizer.cpp:1001] Starting container > 21bc372c-0f2c-49f5-b8ab-8d32c232b95d for executor 'executor' of framework > I0509 21:53:25.482918 17189 cgroups.cpp:410] Creating cgroup at > '/sys/fs/cgroup/cpu,cpuacct/mesos_test_d989f526-efe0-4553-bf79-936ad66c3753/21bc372c-0f2c-49f5-b8ab-8d32c232b95d' > for container 21bc372c-0f2c-49f5-b8ab-8d32c232b95d > I0509 21:53:25.484103 17190 cpu.cpp:101] Updated 'cpu.shares' to 1024 (cpus > 1) for container 21bc372c-0f2c-49f5-b8ab-8d32c232b95d > 
I0509 21:53:25.484808 17186 containerizer.cpp:1524] Launching > 'mesos-containerizer' with flags '--help="false" > --launch_info="{"clone_namespaces":[131072,536870912],"command":{"shell":true,"value":"sleep > > 1000"},"environment":{"variables":[{"name":"MESOS_SANDBOX","type":"VALUE","value":"\/tmp\/NestedMesosContainerizerTest_ROOT_CGROUPS_DestroyDebugContainerOnRecover_zlywyr"}]},"pre_exec_commands":[{"arguments":["mesos-containerizer","mount","--help=false","--operation=make-rslave","--path=\/"],"shell":false,"value":"\/home\/ubuntu\/workspace\/mesos\/Mesos_CI-build\/FLAG\/SSL\/label\/mesos-ec2-ubuntu-16.04\/mesos\/build\/src\/mesos-containerizer"},{"shell":true,"value":"mount > -n -t proc proc \/proc -o > nosuid,noexec,nodev"}],"working_directory":"\/tmp\/NestedMesosContainerizerTest_ROOT_CGROUPS_DestroyDebugContainerOnRecover_zlywyr"}" > --pipe_read="29" --pipe_write="32" > --runtime_directory="/tmp/NestedMesosContainerizerTest_ROOT_CGROUPS_DestroyDebugContainerOnRecover_sKhtj7/containers/21bc372c-0f2c-49f5-b8ab-8d32c232b95d" > --unshare_namespace_mnt="false"' > I0509 21:53:25.484978 17189 linux_launcher.cpp:429] Launching container > 21bc372c-0f2c-49f5-b8ab-8d32c232b95d and cloning with namespaces CLONE_NEWNS > | CLONE_NEWPID > I0509 21:53:25.513890 17186 containerizer.cpp:1623] Checkpointing container's > forked pid 1873 to > '/tmp/NestedMesosContainerizerTest_ROOT_CGROUPS_DestroyDebugContainerOnRecover_Rdjw6M/meta/slaves/frameworks/executors/executor/runs/21bc372c-0f2c-49f5-b8ab-8d32c232b95d/pids/forked.pid' > I0509 21:53:25.515878 17190 fetcher.cpp:353] Starting to fetch URIs for > container: 21bc372c-0f2c-49f5-b8ab-8d32c232b95d, directory: > /tmp/NestedMesosContainerizerTest_ROOT_CGROUPS_DestroyDebugContainerOnRecover_zlywyr > I0509 21:53:25.517715 17193 containerizer.cpp:1791] Starting nested container > 21bc372c-0f2c-49f5-b8ab-8d32c232b95d.ea991d38-e1a5-44fe-a522-622b15142e35 > I0509 21:53:25.518569 17193 switchboard.cpp:545] Launching > 
'mesos-io-switchboard' with flags '--heartbeat_interval="30secs" > --help="false" > --socket_address="/tmp/mesos-io-switchboard-ca463cf2-70ba-4121-a5c6-1a170ae40c1b" > --stderr_from_fd="36" --stderr_to_fd="2" --stdin_to_fd="32" > --stdout_from_fd="33" --stdout_to_fd="1" --tty="false" > --wait_for_connection="true"' for container > 21bc372c-0f2c-49f5-b8ab-8d32c232b95d.ea991d38-e1a5-44fe-a522-622b15142e35 > I0509 21:53:25.521229 17193 switchboard.cpp:575] Created I/O switchboard > server (pid: 1881) listening on socket file > '/tmp/mesos-io-switchboard-ca463cf2-70ba-4121-a5c6-1a170ae40c1b' for > container > 21bc372c-0f2c-49f5-b8ab-8d32c232b95d.ea991d38-e1a5-44fe-a522-622b15142e35 > I0509 21:53:25.522195 17191 containerizer.cpp:15
[jira] [Commented] (MESOS-7504) Parent's mount namespace cannot be determined when launching a nested container.
[ https://issues.apache.org/jira/browse/MESOS-7504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207669#comment-16207669 ] Andrei Budnik commented on MESOS-7504: -- https://reviews.apache.org/r/63074/ https://reviews.apache.org/r/63035/ > Parent's mount namespace cannot be determined when launching a nested > container. > > > Key: MESOS-7504 > URL: https://issues.apache.org/jira/browse/MESOS-7504 > Project: Mesos > Issue Type: Bug > Components: containerization >Affects Versions: 1.3.0 > Environment: Ubuntu 16.04 >Reporter: Alexander Rukletsov >Assignee: Andrei Budnik > Labels: containerizer, flaky-test, mesosphere > > I've observed this failure twice in different Linux environments. Here is an > example of such failure: > {noformat} > [ RUN ] > NestedMesosContainerizerTest.ROOT_CGROUPS_DestroyDebugContainerOnRecover > I0509 21:53:25.471657 17167 containerizer.cpp:221] Using isolation: > cgroups/cpu,filesystem/linux,namespaces/pid,network/cni,volume/image > I0509 21:53:25.475124 17167 linux_launcher.cpp:150] Using > /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher > I0509 21:53:25.475407 17167 provisioner.cpp:249] Using default backend > 'overlay' > I0509 21:53:25.481232 17186 containerizer.cpp:608] Recovering containerizer > I0509 21:53:25.482295 17186 provisioner.cpp:410] Provisioner recovery complete > I0509 21:53:25.482587 17187 containerizer.cpp:1001] Starting container > 21bc372c-0f2c-49f5-b8ab-8d32c232b95d for executor 'executor' of framework > I0509 21:53:25.482918 17189 cgroups.cpp:410] Creating cgroup at > '/sys/fs/cgroup/cpu,cpuacct/mesos_test_d989f526-efe0-4553-bf79-936ad66c3753/21bc372c-0f2c-49f5-b8ab-8d32c232b95d' > for container 21bc372c-0f2c-49f5-b8ab-8d32c232b95d > I0509 21:53:25.484103 17190 cpu.cpp:101] Updated 'cpu.shares' to 1024 (cpus > 1) for container 21bc372c-0f2c-49f5-b8ab-8d32c232b95d > I0509 21:53:25.484808 17186 containerizer.cpp:1524] Launching > 'mesos-containerizer' 
with flags '--help="false" > --launch_info="{"clone_namespaces":[131072,536870912],"command":{"shell":true,"value":"sleep > > 1000"},"environment":{"variables":[{"name":"MESOS_SANDBOX","type":"VALUE","value":"\/tmp\/NestedMesosContainerizerTest_ROOT_CGROUPS_DestroyDebugContainerOnRecover_zlywyr"}]},"pre_exec_commands":[{"arguments":["mesos-containerizer","mount","--help=false","--operation=make-rslave","--path=\/"],"shell":false,"value":"\/home\/ubuntu\/workspace\/mesos\/Mesos_CI-build\/FLAG\/SSL\/label\/mesos-ec2-ubuntu-16.04\/mesos\/build\/src\/mesos-containerizer"},{"shell":true,"value":"mount > -n -t proc proc \/proc -o > nosuid,noexec,nodev"}],"working_directory":"\/tmp\/NestedMesosContainerizerTest_ROOT_CGROUPS_DestroyDebugContainerOnRecover_zlywyr"}" > --pipe_read="29" --pipe_write="32" > --runtime_directory="/tmp/NestedMesosContainerizerTest_ROOT_CGROUPS_DestroyDebugContainerOnRecover_sKhtj7/containers/21bc372c-0f2c-49f5-b8ab-8d32c232b95d" > --unshare_namespace_mnt="false"' > I0509 21:53:25.484978 17189 linux_launcher.cpp:429] Launching container > 21bc372c-0f2c-49f5-b8ab-8d32c232b95d and cloning with namespaces CLONE_NEWNS > | CLONE_NEWPID > I0509 21:53:25.513890 17186 containerizer.cpp:1623] Checkpointing container's > forked pid 1873 to > '/tmp/NestedMesosContainerizerTest_ROOT_CGROUPS_DestroyDebugContainerOnRecover_Rdjw6M/meta/slaves/frameworks/executors/executor/runs/21bc372c-0f2c-49f5-b8ab-8d32c232b95d/pids/forked.pid' > I0509 21:53:25.515878 17190 fetcher.cpp:353] Starting to fetch URIs for > container: 21bc372c-0f2c-49f5-b8ab-8d32c232b95d, directory: > /tmp/NestedMesosContainerizerTest_ROOT_CGROUPS_DestroyDebugContainerOnRecover_zlywyr > I0509 21:53:25.517715 17193 containerizer.cpp:1791] Starting nested container > 21bc372c-0f2c-49f5-b8ab-8d32c232b95d.ea991d38-e1a5-44fe-a522-622b15142e35 > I0509 21:53:25.518569 17193 switchboard.cpp:545] Launching > 'mesos-io-switchboard' with flags '--heartbeat_interval="30secs" > --help="false" > 
--socket_address="/tmp/mesos-io-switchboard-ca463cf2-70ba-4121-a5c6-1a170ae40c1b" > --stderr_from_fd="36" --stderr_to_fd="2" --stdin_to_fd="32" > --stdout_from_fd="33" --stdout_to_fd="1" --tty="false" > --wait_for_connection="true"' for container > 21bc372c-0f2c-49f5-b8ab-8d32c232b95d.ea991d38-e1a5-44fe-a522-622b15142e35 > I0509 21:53:25.521229 17193 switchboard.cpp:575] Created I/O switchboard > server (pid: 1881) listening on socket file > '/tmp/mesos-io-switchboard-ca463cf2-70ba-4121-a5c6-1a170ae40c1b' for > container > 21bc372c-0f2c-49f5-b8ab-8d32c232b95d.ea991d38-e1a5-44fe-a522-622b15142e35 > I0509 21:53:25.522195 17191 containerizer.cpp:1524] Launching > 'mesos-containerizer' with flags '--help="false" > --launch_info="{
[jira] [Updated] (MESOS-8105) Docker containerizer fails with "Unable to get executor pid after launch"
[ https://issues.apache.org/jira/browse/MESOS-8105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] maybob updated MESOS-8105: -- Description: When running many commands at the same time, with each command using the same executor image but a different executorId under Docker, some executors fail with the error "Unable to get executor pid after launch". This may happen because "docker inspect" hangs or does not return. {color:red}Log:{color} {code:java} I1012 16:15:01.003931 124081 slave.cpp:1619] Got assigned task '920860' for framework framework-id-daily I1012 16:15:01.006091 124081 slave.cpp:1900] Authorizing task '920860' for framework framework-id-daily I1012 16:15:01.008281 124081 slave.cpp:2087] Launching task '920860' for framework framework-id-daily I1012 16:15:01.008779 124081 paths.cpp:573] Trying to chown '/volumes/sdb1/mesos/slaves/89192f68-d28f-498c-808f-442a1ef576b3-S2/frameworks/framework-id-daily/executors/Executor_920860/runs/29c82b61-1242-4de9-80cf-16f46c30e7e3' to user 'maybob' I1012 16:15:01.009027 124081 slave.cpp:7401] Checkpointing ExecutorInfo to '/volumes/sdb1/mesos/meta/slaves/89192f68-d28f-498c-808f-442a1ef576b3-S2/frameworks/framework-id-daily/executors/Executor_920860/executor.info' I1012 16:15:01.009546 124081 slave.cpp:7038] Launching executor 'Executor_920860' of framework framework-id-daily with resources {} in work directory '/volumes/sdb1/mesos/slaves/89192f68-d28f-498c-808f-442a1ef576b3-S2/frameworks/framework-id-daily/executors/Executor_920860/runs/29c82b61-1242-4de9-80cf-16f46c30e7e3' I1012 16:15:01.010339 124081 slave.cpp:7429] Checkpointing TaskInfo to '/volumes/sdb1/mesos/meta/slaves/89192f68-d28f-498c-808f-442a1ef576b3-S2/frameworks/framework-id-daily/executors/Executor_920860/runs/29c82b61-1242-4de9-80cf-16f46c30e7e3/tasks/920860/task.info' I1012 16:15:01.010726 124081 slave.cpp:2316] Queued task '920860' for executor 'Executor_920860' of framework framework-id-daily I1012 16:15:01.011740 124088 docker.cpp:1175] Starting
container '29c82b61-1242-4de9-80cf-16f46c30e7e3' for executor 'Executor_920860' and framework framework-id-daily I1012 16:15:01.013123 124081 slave.cpp:877] Successfully attached file '/volumes/sdb1/mesos/slaves/89192f68-d28f-498c-808f-442a1ef576b3-S2/frameworks/framework-id-daily/executors/Executor_920860/runs/29c82b61-1242-4de9-80cf-16f46c30e7e3' I1012 16:15:01.013290 124080 fetcher.cpp:353] Starting to fetch URIs for container: 29c82b61-1242-4de9-80cf-16f46c30e7e3, directory: /volumes/sdb1/mesos/slaves/89192f68-d28f-498c-808f-442a1ef576b3-S2/frameworks/framework-id-daily/executors/Executor_920860/runs/29c82b61-1242-4de9-80cf-16f46c30e7e3 I1012 16:15:01.706429 124071 docker.cpp:909] Running docker -H unix:///var/run/docker.sock run --cpu-shares 378 --memory 427819008 -e LIBPROCESS_PORT=0 -e MESOS_AGENT_ENDPOINT=xxx.xxx.xxx.xxx:5051 -e MESOS_CHECKPOINT=1 -e MESOS_CONTAINER_NAME=mesos-89192f68-d28f-498c-808f-442a1ef576b3-S2.29c82b61-1242-4de9-80cf-16f46c30e7e3 -e MESOS_DIRECTORY=/volumes/sdb1/mesos/slaves/89192f68-d28f-498c-808f-442a1ef576b3-S2/frameworks/framework-id-daily/executors/Executor_920860/runs/29c82b61-1242-4de9-80cf-16f46c30e7e3 -e MESOS_EXECUTOR_ID=Executor_920860 -e MESOS_EXECUTOR_SHUTDOWN_GRACE_PERIOD=5secs -e MESOS_FRAMEWORK_ID=framework-id-daily -e MESOS_HTTP_COMMAND_EXECUTOR=0 -e MESOS_NATIVE_JAVA_LIBRARY=/usr/local/lib/libmesos-1.3.1.so -e MESOS_NATIVE_LIBRARY=/usr/local/lib/libmesos-1.3.1.so -e MESOS_RECOVERY_TIMEOUT=15mins -e MESOS_SANDBOX=/mnt/mesos/sandbox -e MESOS_SLAVE_ID=89192f68-d28f-498c-808f-442a1ef576b3-S2 -e MESOS_SLAVE_PID=slave(1)@xxx.xxx.xxx.xxx:5051 -e MESOS_SUBSCRIPTION_BACKOFF_MAX=2secs -v /volumes/sdb1/mesos/slaves/89192f68-d28f-498c-808f-442a1ef576b3-S2/frameworks/framework-id-daily/executors/Executor_920860/runs/29c82b61-1242-4de9-80cf-16f46c30e7e3:/mnt/mesos/sandbox --net host --entrypoint /bin/sh --name mesos-89192f68-d28f-498c-808f-442a1ef576b3-S2.29c82b61-1242-4de9-80cf-16f46c30e7e3 reg.docker.xxx/xx/executor:v25 -c env 
&& cd $MESOS_SANDBOX && ./executor.sh I1012 16:15:01.717859 124071 docker.cpp:1071] Running docker -H unix:///var/run/docker.sock inspect mesos-89192f68-d28f-498c-808f-442a1ef576b3-S2.29c82b61-1242-4de9-80cf-16f46c30e7e3 I1012 16:15:02.033951 124085 docker.cpp:1118] Retrying inspect with non-zero status code. cmd: 'docker -H unix:///var/run/docker.sock inspect mesos-89192f68-d28f-498c-808f-442a1ef576b3-S2.29c82b61-1242-4de9-80cf-16f46c30e7e3', interval: 1secs I1012 16:15:03.034230 124090 docker.cpp:1071] Running docker -H unix:///var/run/docker.sock inspect mesos-89192f68-d28f-498c-808f-442a1ef576b3-S2.29c82b61-1242-4de9-80cf-16f46c30e7e3 I1012 16:15:03.518020 124078 docker.cpp:1071] Running docker -H unix:///var/run/docker.sock inspect mesos-89192f68-d28f-498c-808f-442a1ef576b3-S2.29c82b61-1242-4de9-80cf-16f46c30e7e3 I1012 16:15:29.554232 124076 docker.cpp:1753] Updated 'cpu.shares' to 378 at /sys/f
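The log above shows the agent repeatedly running {{docker inspect}} against the container without ever obtaining a running executor PID. The agent's real retry logic lives in docker.cpp; as a way to observe the suspected failure mode from outside Mesos, one could poll {{docker inspect}} for {{.State.Pid}} with an explicit timeout. The helper below is a hypothetical sketch for such a check (the function name and timeout are illustrative, not Mesos code):

```python
import subprocess

def inspect_pid(container_name, timeout_secs=5):
    """Return the container's PID via `docker inspect`, or None if the
    inspect call hangs past the timeout, exits non-zero (container not
    created yet), or reports PID 0 (container not running)."""
    cmd = ["docker", "inspect", "--format", "{{.State.Pid}}", container_name]
    try:
        result = subprocess.run(cmd, capture_output=True, text=True,
                                timeout=timeout_secs)
    except subprocess.TimeoutExpired:
        return None  # inspect hung -- the failure mode suspected above
    except OSError:
        return None  # docker binary not available on this host
    if result.returncode != 0:
        return None  # container does not exist yet; caller may retry
    pid = int(result.stdout.strip() or 0)
    return pid if pid > 0 else None  # PID 0 means not (yet) running
```

A caller reproducing the agent's behavior would invoke this in a retry loop after {{docker run}} and treat a PID that never becomes positive as the "Unable to get executor pid after launch" condition.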
[jira] [Created] (MESOS-8105) Docker containerizer fails with "Unable to get executor pid after launch"
maybob created MESOS-8105:
--------------------------

             Summary: Docker containerizer fails with "Unable to get executor pid after launch"
                 Key: MESOS-8105
                 URL: https://issues.apache.org/jira/browse/MESOS-8105
             Project: Mesos
          Issue Type: Bug
          Components: containerization
            Reporter: maybob