[jira] [Updated] (MESOS-8105) Docker containerizer fails with "Unable to get executor pid after launch"

2017-10-17 Thread maybob (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

maybob updated MESOS-8105:
--
Description: 
When running many commands at the same time, each command using the same 
executor with a different executorId under Docker, some executors fail with the 
error "Unable to get executor pid after launch".
One possible reason is that "docker inspect" hangs, or exits 0 while reporting 
pid 0. Another is that running many Docker containers at once consumes a lot of 
resources, e.g. file descriptors.
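
For reference, a minimal standalone sketch of the inspect-and-retry pattern 
visible in the log below (assuming POSIX popen; the container name and retry 
bound are illustrative, and this is not the agent's actual code). It treats a 
missing or zero pid as "not ready yet":

{code:cpp}
// Sketch only: poll `docker inspect` for a container's pid, as the agent's
// Docker containerizer does after `docker run`. A hang here, or an inspect
// that exits 0 while .State.Pid is still 0, reproduces the symptom above.
#include <cstdio>
#include <cstdlib>
#include <chrono>
#include <iostream>
#include <string>
#include <thread>

// Run `docker inspect` once and return the pid, or -1 on failure.
int inspectPid(const std::string& container) {
  const std::string cmd =
      "docker -H unix:///var/run/docker.sock inspect"
      " --format '{{.State.Pid}}' " + container + " 2>/dev/null";

  FILE* pipe = popen(cmd.c_str(), "r");
  if (pipe == nullptr) return -1;

  char buffer[32] = {0};
  const bool read = fgets(buffer, sizeof(buffer), pipe) != nullptr;

  // pclose() returns the child's wait status; nonzero means inspect failed.
  if (pclose(pipe) != 0 || !read) return -1;

  return std::atoi(buffer);  // Stays 0 until the container is actually running.
}

int main() {
  const std::string container = "mesos-<agent-id>.<container-id>";  // placeholder

  // Retry at a 1s interval, mirroring the "Retrying inspect ... interval:
  // 1secs" lines in the log; give up after a bounded number of attempts.
  for (int attempt = 0; attempt < 10; ++attempt) {
    const int pid = inspectPid(container);
    if (pid > 0) {
      std::cout << "executor pid: " << pid << std::endl;
      return 0;
    }
    std::this_thread::sleep_for(std::chrono::seconds(1));
  }

  std::cerr << "Unable to get executor pid after launch" << std::endl;
  return 1;
}
{code}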

{color:red}Log:{color}

{code:none}
I1012 16:15:01.003931 124081 slave.cpp:1619] Got assigned task '920860' for 
framework framework-id-daily
I1012 16:15:01.006091 124081 slave.cpp:1900] Authorizing task '920860' for 
framework framework-id-daily
I1012 16:15:01.008281 124081 slave.cpp:2087] Launching task '920860' for 
framework framework-id-daily
I1012 16:15:01.008779 124081 paths.cpp:573] Trying to chown 
'/volumes/sdb1/mesos/slaves/89192f68-d28f-498c-808f-442a1ef576b3-S2/frameworks/framework-id-daily/executors/Executor_920860/runs/29c82b61-1242-4de9-80cf-16f46c30e7e3'
 to user 'maybob'
I1012 16:15:01.009027 124081 slave.cpp:7401] Checkpointing ExecutorInfo to 
'/volumes/sdb1/mesos/meta/slaves/89192f68-d28f-498c-808f-442a1ef576b3-S2/frameworks/framework-id-daily/executors/Executor_920860/executor.info'
I1012 16:15:01.009546 124081 slave.cpp:7038] Launching executor 
'Executor_920860' of framework framework-id-daily with resources {} in work 
directory 
'/volumes/sdb1/mesos/slaves/89192f68-d28f-498c-808f-442a1ef576b3-S2/frameworks/framework-id-daily/executors/Executor_920860/runs/29c82b61-1242-4de9-80cf-16f46c30e7e3'
I1012 16:15:01.010339 124081 slave.cpp:7429] Checkpointing TaskInfo to 
'/volumes/sdb1/mesos/meta/slaves/89192f68-d28f-498c-808f-442a1ef576b3-S2/frameworks/framework-id-daily/executors/Executor_920860/runs/29c82b61-1242-4de9-80cf-16f46c30e7e3/tasks/920860/task.info'
I1012 16:15:01.010726 124081 slave.cpp:2316] Queued task '920860' for executor 
'Executor_920860' of framework framework-id-daily
I1012 16:15:01.011740 124088 docker.cpp:1175] Starting container 
'29c82b61-1242-4de9-80cf-16f46c30e7e3' for executor 'Executor_920860' and 
framework framework-id-daily
I1012 16:15:01.013123 124081 slave.cpp:877] Successfully attached file 
'/volumes/sdb1/mesos/slaves/89192f68-d28f-498c-808f-442a1ef576b3-S2/frameworks/framework-id-daily/executors/Executor_920860/runs/29c82b61-1242-4de9-80cf-16f46c30e7e3'
I1012 16:15:01.013290 124080 fetcher.cpp:353] Starting to fetch URIs for 
container: 29c82b61-1242-4de9-80cf-16f46c30e7e3, directory: 
/volumes/sdb1/mesos/slaves/89192f68-d28f-498c-808f-442a1ef576b3-S2/frameworks/framework-id-daily/executors/Executor_920860/runs/29c82b61-1242-4de9-80cf-16f46c30e7e3

I1012 16:15:01.706429 124071 docker.cpp:909] Running docker -H 
unix:///var/run/docker.sock run --cpu-shares 378 --memory 427819008 -e 
LIBPROCESS_PORT=0 -e MESOS_AGENT_ENDPOINT=xxx.xxx.xxx.xxx:5051 -e 
MESOS_CHECKPOINT=1 -e 
MESOS_CONTAINER_NAME=mesos-89192f68-d28f-498c-808f-442a1ef576b3-S2.29c82b61-1242-4de9-80cf-16f46c30e7e3
 -e 
MESOS_DIRECTORY=/volumes/sdb1/mesos/slaves/89192f68-d28f-498c-808f-442a1ef576b3-S2/frameworks/framework-id-daily/executors/Executor_920860/runs/29c82b61-1242-4de9-80cf-16f46c30e7e3
 -e MESOS_EXECUTOR_ID=Executor_920860 -e 
MESOS_EXECUTOR_SHUTDOWN_GRACE_PERIOD=5secs -e 
MESOS_FRAMEWORK_ID=framework-id-daily -e MESOS_HTTP_COMMAND_EXECUTOR=0 -e 
MESOS_NATIVE_JAVA_LIBRARY=/usr/local/lib/libmesos-1.3.1.so -e 
MESOS_NATIVE_LIBRARY=/usr/local/lib/libmesos-1.3.1.so -e 
MESOS_RECOVERY_TIMEOUT=15mins -e MESOS_SANDBOX=/mnt/mesos/sandbox -e 
MESOS_SLAVE_ID=89192f68-d28f-498c-808f-442a1ef576b3-S2 -e 
MESOS_SLAVE_PID=slave(1)@xxx.xxx.xxx.xxx:5051 -e 
MESOS_SUBSCRIPTION_BACKOFF_MAX=2secs -v 
/volumes/sdb1/mesos/slaves/89192f68-d28f-498c-808f-442a1ef576b3-S2/frameworks/framework-id-daily/executors/Executor_920860/runs/29c82b61-1242-4de9-80cf-16f46c30e7e3:/mnt/mesos/sandbox
 --net host --entrypoint /bin/sh --name 
mesos-89192f68-d28f-498c-808f-442a1ef576b3-S2.29c82b61-1242-4de9-80cf-16f46c30e7e3
 reg.docker.xxx/xx/executor:v25 -c env && cd $MESOS_SANDBOX && ./executor.sh
I1012 16:15:01.717859 124071 docker.cpp:1071] Running docker -H 
unix:///var/run/docker.sock inspect 
mesos-89192f68-d28f-498c-808f-442a1ef576b3-S2.29c82b61-1242-4de9-80cf-16f46c30e7e3

I1012 16:15:02.033951 124085 docker.cpp:1118] Retrying inspect with non-zero 
status code. cmd: 'docker -H unix:///var/run/docker.sock inspect 
mesos-89192f68-d28f-498c-808f-442a1ef576b3-S2.29c82b61-1242-4de9-80cf-16f46c30e7e3',
 interval: 1secs

I1012 16:15:03.034230 124090 docker.cpp:1071] Running docker -H 
unix:///var/run/docker.sock inspect 
mesos-89192f68-d28f-498c-808f-442a1ef576b3-S2.29c82b61-1242-4de9-80cf-16f46c30e7e3

I1012 16:15:03.518020 124078 docker.cpp:1071] Running docker -H 
unix:///var/run/docker.sock inspect 
mesos-89192f68-d28f-498c-808f-442a1ef576b3-S2.29c82b61-1242-4de9-80cf-16f46c3

[jira] [Updated] (MESOS-8032) Launch CSI plugins in storage local resource provider.

2017-10-17 Thread Chun-Hung Hsiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chun-Hung Hsiao updated MESOS-8032:
---
Component/s: (was: agent)
 storage

> Launch CSI plugins in storage local resource provider.
> --
>
> Key: MESOS-8032
> URL: https://issues.apache.org/jira/browse/MESOS-8032
> Project: Mesos
>  Issue Type: Task
>  Components: storage
>Reporter: Chun-Hung Hsiao
>Assignee: Chun-Hung Hsiao
>  Labels: mesosphere, storage
> Fix For: 1.5.0
>
>
> Launching a CSI plugin requires the following steps (sketched in code after 
> the list):
> 1. Verify the configuration.
> 2. Prepare a directory in the work directory of the resource provider where 
> the socket file should be placed, and construct the path of the socket file.
> 3. If the socket file already exists and the plugin is already running, we 
> should not launch another plugin instance.
> 4. Otherwise, launch a standalone container to run the plugin and connect to 
> it through the socket file.
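
A minimal sketch of steps 2-4, assuming C++17 {{<filesystem>}}; the directory 
layout and names are illustrative, not the actual implementation:

{code:cpp}
// Sketch only: construct the CSI socket path under the resource provider's
// work directory and decide whether a plugin container must be launched.
#include <filesystem>
#include <iostream>

namespace fs = std::filesystem;

int main() {
  // Step 2: a directory for the socket inside the work directory.
  const fs::path workDir = "/var/lib/mesos/resource_providers/storage";  // illustrative
  const fs::path socketDir = workDir / "csi";
  fs::create_directories(socketDir);
  const fs::path socketPath = socketDir / "endpoint.sock";

  if (fs::exists(socketPath)) {
    // Step 3: the socket exists; if the plugin still answers on it, reuse
    // it rather than launching a second plugin instance.
    std::cout << "Reuse plugin at " << socketPath << std::endl;
  } else {
    // Step 4: launch a standalone container running the plugin, then
    // connect to it through the socket file.
    std::cout << "Launch plugin container; connect via " << socketPath << std::endl;
  }
}
{code}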





[jira] [Updated] (MESOS-7561) Add storage resource provider specific information in ResourceProviderInfo.

2017-10-17 Thread Chun-Hung Hsiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chun-Hung Hsiao updated MESOS-7561:
---
Component/s: storage

> Add storage resource provider specific information in ResourceProviderInfo.
> ---
>
> Key: MESOS-7561
> URL: https://issues.apache.org/jira/browse/MESOS-7561
> Project: Mesos
>  Issue Type: Task
>  Components: storage
>Reporter: Jie Yu
>Assignee: Chun-Hung Hsiao
>  Labels: mesosphere, storage
> Fix For: 1.5.0
>
>
> For a storage resource provider, there will be some provider-specific 
> configuration information. For instance, the most important piece is the 
> `ContainerConfig` of the CSI plugin container.
> That config information will be sent to the corresponding agent that will use 
> the resources provided by the resource provider. For a storage resource 
> provider in particular, the agent needs to launch the CSI Node Plugin to 
> mount the volumes.
> Compared to adding first-class storage resource provider information, an 
> alternative is to add a generic labels field in ResourceProviderInfo and let 
> each resource provider define the format of its labels. However, I believe a 
> first-class solution is better and clearer.





[jira] [Created] (MESOS-8108) Process offer operations in storage local resource provider

2017-10-17 Thread Chun-Hung Hsiao (JIRA)
Chun-Hung Hsiao created MESOS-8108:
--

 Summary: Process offer operations in storage local resource 
provider
 Key: MESOS-8108
 URL: https://issues.apache.org/jira/browse/MESOS-8108
 Project: Mesos
  Issue Type: Task
  Components: storage
Reporter: Chun-Hung Hsiao
Assignee: Chun-Hung Hsiao
 Fix For: 1.5.0


The storage local resource provider receives offer operations for reservations 
and resource conversions, and invokes the proper CSI calls to implement these 
operations.
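
A minimal sketch of the dispatch this implies; the operation set and CSI call 
names here are assumptions, since the exact mapping was still being designed:

{code:cpp}
// Sketch only: map each offer operation to the CSI call that implements it.
#include <iostream>
#include <string>

enum class Operation { RESERVE, UNRESERVE, CREATE_VOLUME, DESTROY_VOLUME };

std::string csiCallFor(Operation op) {
  switch (op) {
    case Operation::CREATE_VOLUME:  return "CreateVolume";   // conversion
    case Operation::DESTROY_VOLUME: return "DeleteVolume";   // conversion
    case Operation::RESERVE:                                 // reservations are
    case Operation::UNRESERVE:      return "<bookkeeping>";  // Mesos-side only
  }
  return "<unknown>";
}

int main() {
  std::cout << csiCallFor(Operation::CREATE_VOLUME) << std::endl;  // CreateVolume
}
{code}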





[jira] [Updated] (MESOS-8101) Import resources from CSI plugins in storage local resource provider.

2017-10-17 Thread Chun-Hung Hsiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chun-Hung Hsiao updated MESOS-8101:
---
Component/s: (was: agent)
 storage

> Import resources from CSI plugins in storage local resource provider.
> -
>
> Key: MESOS-8101
> URL: https://issues.apache.org/jira/browse/MESOS-8101
> Project: Mesos
>  Issue Type: Task
>  Components: storage
>Reporter: Chun-Hung Hsiao
>Assignee: Chun-Hung Hsiao
>  Labels: mesosphere, storage
> Fix For: 1.5.0
>
>
> The following lists the steps to import resources from a CSI plugin (see the 
> sketch after the list):
> 1. Launch the node plugin
> 1.1 GetSupportedVersions
> 1.2 GetPluginInfo
> 1.3 ProbeNode
> 1.4 GetNodeCapabilities
> 2. Launch the controller plugin
> 2.1 GetSupportedVersions
> 2.2 GetPluginInfo
> 2.3 GetControllerCapabilities
> 3. GetCapacity
> 4. ListVolumes
> 5. Report to the resource provider through UPDATE_TOTAL_RESOURCES
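
A sketch of that sequence with stubbed-out CSI calls (the real calls are RPCs 
against the plugin's endpoint; every value below is illustrative):

{code:cpp}
// Sketch only: the import sequence, steps 1-5, with local stubs standing in
// for the CSI RPCs.
#include <cstdint>
#include <iostream>
#include <string>
#include <vector>

std::string GetSupportedVersions() { return "0.1.0"; }
std::string GetPluginInfo()        { return "example-plugin"; }
void ProbeNode()                   {}
void GetNodeCapabilities()         {}
void GetControllerCapabilities()   {}
uint64_t GetCapacity()             { return 100; }  // illustrative
std::vector<std::string> ListVolumes() { return {"vol-1", "vol-2"}; }

int main() {
  // 1. Node plugin handshake.
  GetSupportedVersions(); GetPluginInfo(); ProbeNode(); GetNodeCapabilities();

  // 2. Controller plugin handshake.
  GetSupportedVersions(); GetPluginInfo(); GetControllerCapabilities();

  // 3-4. Discover what the plugin exposes.
  const uint64_t capacity = GetCapacity();
  const std::vector<std::string> volumes = ListVolumes();

  // 5. Report the discovered total (stand-in for UPDATE_TOTAL_RESOURCES).
  std::cout << "UPDATE_TOTAL_RESOURCES: capacity=" << capacity
            << ", volumes=" << volumes.size() << std::endl;
}
{code}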





[jira] [Created] (MESOS-8107) Add a call to update total resources in the resource provider API.

2017-10-17 Thread Chun-Hung Hsiao (JIRA)
Chun-Hung Hsiao created MESOS-8107:
--

 Summary: Add a call to update total resources in the resource 
provider API.
 Key: MESOS-8107
 URL: https://issues.apache.org/jira/browse/MESOS-8107
 Project: Mesos
  Issue Type: Task
Reporter: Chun-Hung Hsiao
Assignee: Chun-Hung Hsiao
 Fix For: 1.5.0


We should add a call for a resource provider to update its total resources, 
remove {{resources}} from the {{SUBSCRIBE}} call, and instead move to a 
protocol where a resource provider first subscribes and then updates its 
resources.
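
A sketch of the proposed flow (the message shapes are illustrative, not the 
actual resource provider API):

{code:cpp}
// Sketch only: subscribe first, without resources; update the total after.
#include <iostream>
#include <string>
#include <vector>

struct Subscribe {
  std::string name;  // Note: no `resources` field anymore.
};

struct UpdateTotalResources {
  std::string providerId;          // Assigned on subscription.
  std::vector<std::string> total;  // e.g. {"disk(*):100"}
};

int main() {
  // Step 1: SUBSCRIBE without declaring resources.
  const Subscribe sub{"storage-provider"};
  std::cout << "SUBSCRIBE " << sub.name << std::endl;

  // Step 2: once subscribed (and whenever the total changes), send the total.
  const UpdateTotalResources update{"provider-0", {"disk(*):100"}};
  std::cout << "UPDATE_TOTAL_RESOURCES with " << update.total.size()
            << " resource(s)" << std::endl;
}
{code}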





[jira] [Updated] (MESOS-6792) MasterSlaveReconciliationTest.ReconcileLostTask test is flaky

2017-10-17 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-6792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-6792:
---
Description: 
The test {{MasterSlaveReconciliationTest.ReconcileLostTask}} is flaky for me as 
of {{e99ea9ce8b1de01dd8b3cac6675337edb6320f38}}:

{code}
Repeating all tests (iteration 912) . . .

Note: Google Test filter = <...>
[==] Running 1 test from 1 test case.
[--] Global test environment set-up.
[--] 1 test from MasterSlaveReconciliationTest
[ RUN  ] MasterSlaveReconciliationTest.SlaveReregisterTerminatedExecutor
I1214 04:41:11.559672  2005 cluster.cpp:160] Creating default 'local' authorizer
I1214 04:41:11.560848  2045 master.cpp:380] Master 
87dd8179-dd7d-4270-ace2-ea771b57371c (gru1.hw.ca1.mesosphere.com) started on 
192.99.40.208:37659
I1214 04:41:11.560878  2045 master.cpp:382] Flags at startup: --acls="" 
--agent_ping_timeout="15secs" --agent_reregister_timeout="10mins" 
--allocation_interval="1secs" --allocator="HierarchicalDRF" 
--authenticate_agents="true" --authenticate_frameworks="true" 
--authenticate_http_frameworks="true" --authenticate_http_readonly="true" 
--authenticate_http_readwrite="true" --authenticators="crammd5" 
--authorizers="local" --credentials="/tmp/cXHI89/credentials" 
--framework_sorter="drf" --help="false" --hostname_lookup="true" 
--http_authenticators="basic" --http_framework_authenticators="basic" 
--initialize_driver_logging="true" --log_auto_initialize="true" 
--logbufsecs="0" --logging_level="INFO" --max_agent_ping_timeouts="5" 
--max_completed_frameworks="50" --max_completed_tasks_per_framework="1000" 
--quiet="false" --recovery_agent_removal_limit="100%" --registry="in_memory" 
--registry_fetch_timeout="1mins" --registry_gc_interval="15mins" 
--registry_max_agent_age="2weeks" --registry_max_agent_count="102400" 
--registry_store_timeout="100secs" --registry_strict="false" 
--root_submissions="true" --user_sorter="drf" --version="false" 
--webui_dir="/home/bbannier/src/mesos/build/P/share/mesos/webui" 
--work_dir="/tmp/cXHI89/master" --zk_session_timeout="10secs"
I1214 04:41:11.561079  2045 master.cpp:432] Master only allowing authenticated 
frameworks to register
I1214 04:41:11.561089  2045 master.cpp:446] Master only allowing authenticated 
agents to register
I1214 04:41:11.561095  2045 master.cpp:459] Master only allowing authenticated 
HTTP frameworks to register
I1214 04:41:11.561101  2045 credentials.hpp:39] Loading credentials for 
authentication from '/tmp/cXHI89/credentials'
I1214 04:41:11.561194  2045 master.cpp:504] Using default 'crammd5' 
authenticator
I1214 04:41:11.561236  2045 http.cpp:922] Using default 'basic' HTTP 
authenticator for realm 'mesos-master-readonly'
I1214 04:41:11.561274  2045 http.cpp:922] Using default 'basic' HTTP 
authenticator for realm 'mesos-master-readwrite'
I1214 04:41:11.561301  2045 http.cpp:922] Using default 'basic' HTTP 
authenticator for realm 'mesos-master-scheduler'
I1214 04:41:11.561326  2045 master.cpp:584] Authorization enabled
I1214 04:41:11.562155  2039 master.cpp:2045] Elected as the leading master!
I1214 04:41:11.562173  2039 master.cpp:1568] Recovering from registrar
I1214 04:41:11.562347  2045 registrar.cpp:362] Successfully fetched the 
registry (0B) in 114944ns
I1214 04:41:11.562441  2045 registrar.cpp:461] Applied 1 operations in 7920ns; 
attempting to update the registry
I1214 04:41:11.562621  2048 registrar.cpp:506] Successfully updated the 
registry in 155136ns
I1214 04:41:11.562664  2048 registrar.cpp:392] Successfully recovered registrar
I1214 04:41:11.562832  2044 master.cpp:1684] Recovered 0 agents from the 
registry (166B); allowing 10mins for agents to re-register
I1214 04:41:11.568444  2005 cluster.cpp:446] Creating default 'local' authorizer
I1214 04:41:11.569344  2005 sched.cpp:232] Version: 1.2.0
I1214 04:41:11.569842  2035 slave.cpp:209] Mesos agent started on 
(912)@192.99.40.208:37659
I1214 04:41:11.570080  2040 sched.cpp:336] New master detected at 
master@192.99.40.208:37659
I1214 04:41:11.570117  2040 sched.cpp:402] Authenticating with master 
master@192.99.40.208:37659
I1214 04:41:11.570127  2040 sched.cpp:409] Using default CRAM-MD5 authenticatee
I1214 04:41:11.570220  2040 authenticatee.cpp:121] Creating new client SASL 
connection
I1214 04:41:11.57  2035 slave.cpp:210] Flags at startup: --acls="" 
--appc_simple_discovery_uri_prefix="http://" 
--appc_store_dir="/tmp/mesos/store/appc" --authenticate_http_readonly="true" 
--authenticate_http_readwrite="true" --authenticatee="crammd5" 
--authentication_backoff_factor="1secs" --authorizer="local" 
--cgroups_cpu_enable_pids_and_tids_count="false" --cgroups_enable_cfs="false" 
--cgroups_hierarchy="/sys/fs/cgroup" --cgroups_limit_swap="false" 
--cgroups_root="mesos" --container_disk_watch_interval="15secs" 
--containerizers="mesos" 
--credential="/tmp/MasterSlaveReconciliationTest_SlaveReregisterTe

[jira] [Commented] (MESOS-8106) Docker fetcher plugin unsupported scheme failure message is not accurate.

2017-10-17 Thread Gilbert Song (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208122#comment-16208122
 ] 

Gilbert Song commented on MESOS-8106:
-

This came from the patch https://reviews.apache.org/r/58778/. When we added the 
GCE registry support, we did not update the error message.

/cc [~chhsia0] [~zhitao]

> Docker fetcher plugin unsupported scheme failure message is not accurate.
> -
>
> Key: MESOS-8106
> URL: https://issues.apache.org/jira/browse/MESOS-8106
> Project: Mesos
>  Issue Type: Improvement
>  Components: containerization
>Reporter: Gilbert Song
>  Labels: containerizer, docker-fetcher
>
> https://github.com/apache/mesos/blob/1.4.0/src/uri/fetchers/docker.cpp#L843
> This failure message is not accurate. For example, if the user/operator gives 
> a wrong credential when talking to a Docker private registry that uses BASIC 
> auth, the authentication fails but the log still says: "Unsupported 
> auth-scheme: BASIC"
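
A sketch of the distinction the message should make (illustrative control flow 
only, not the actual docker.cpp code; the scheme names are assumptions):

{code:cpp}
// Sketch only: report an authentication failure as such, and reserve the
// "unsupported scheme" error for schemes the fetcher really cannot handle.
#include <iostream>
#include <string>

std::string fetchError(const std::string& scheme, bool credentialAccepted) {
  if (scheme == "Basic" || scheme == "Bearer") {
    if (!credentialAccepted) {
      return "Authentication failed (scheme: " + scheme + ")";
    }
    return "";  // OK
  }
  return "Unsupported auth-scheme: " + scheme;
}

int main() {
  // A wrong credential against a BASIC-auth registry should not claim the
  // scheme itself is unsupported.
  std::cout << fetchError("Basic", false) << std::endl;
}
{code}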





[jira] [Commented] (MESOS-8104) Add code coverage to continuous integration.

2017-10-17 Thread Vinod Kone (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208117#comment-16208117
 ] 

Vinod Kone commented on MESOS-8104:
---

Huge +1.

We attempted this a while back but the build itself was hanging/segfaulting 
IIRC. Need to dig up what tools we used back then.

> Add code coverage to continuous integration.
> 
>
> Key: MESOS-8104
> URL: https://issues.apache.org/jira/browse/MESOS-8104
> Project: Mesos
>  Issue Type: Bug
>  Components: build, test
>Reporter: James Peach
>
> We should integrate code coverage into the CI testing. Adding the right 
> compiler options looks like 
> [this|https://github.com/apache/trafficserver/commit/be237ea7ee874355c6f8209b4793dfe4c4fedd88]
>  in automake (though we need to remove the bugs). We can push the coverage 
> data to coveralls.io from specific test build configurations, and there's 
> even precedent for the ASF infra team wiring it into GitHub directly.
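
As a rough sketch, assuming GCC/Clang and the autotools build (compile/link 
flags only; the coveralls upload step is omitted):

{noformat}
# --coverage is shorthand for -fprofile-arcs -ftest-coverage.
./configure CFLAGS="--coverage -O0" CXXFLAGS="--coverage -O0" LDFLAGS="--coverage"
make check
# Summarize the resulting .gcno/.gcda files with lcov or gcovr before
# uploading to a service such as coveralls.io.
{noformat}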





[jira] [Commented] (MESOS-8090) Mesos 1.4.0 crashes with 1.3.x agent with oversubscription

2017-10-17 Thread Zhitao Li (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16208103#comment-16208103
 ] 

Zhitao Li commented on MESOS-8090:
--

A quick attempt to fix: https://reviews.apache.org/r/63084/

> Mesos 1.4.0 crashes with 1.3.x agent with oversubscription
> --
>
> Key: MESOS-8090
> URL: https://issues.apache.org/jira/browse/MESOS-8090
> Project: Mesos
>  Issue Type: Bug
>  Components: master, oversubscription
>Affects Versions: 1.4.0
>Reporter: Zhitao Li
>Assignee: Michael Park
>
> We are seeing a crash in the 1.4.0 master when it receives {{updateSlave}} from 
> an oversubscription-enabled agent running 1.3.1 code.
> The crash line is:
> {code:none}
> resources.cpp:1050] Check failed: !resource.has_role() cpus{REV}:19
> {code}
> Stack trace in gdb:
> {panel:title=My title}
> #0  0x7f22f3553067 in __GI_raise (sig=sig@entry=6) at 
> ../nptl/sysdeps/unix/sysv/linux/raise.c:56
> #1  0x7f22f3554448 in __GI_abort () at abort.c:89
> #2  0x7f22f615cd79 in google::DumpStackTraceAndExit () at 
> src/utilities.cc:147
> #3  0x7f22f6154a4d in google::LogMessage::Fail () at src/logging.cc:1458
> #4  0x7f22f61566cd in google::LogMessage::SendToLog (this=<optimized out>) at src/logging.cc:1412
> #5  0x7f22f6154612 in google::LogMessage::Flush (this=0x18ac7) at 
> src/logging.cc:1281
> #6  0x7f22f61570b9 in google::LogMessageFatal::~LogMessageFatal 
> (this=<optimized out>, __in_chrg=<optimized out>) at src/logging.cc:1984
> #7  0x7f22f527e133 in mesos::Resources::isEmpty (resource=...) at 
> /mesos/src/common/resources.cpp:1051
> #8  0x7f22f527e1e5 in mesos::Resources::Resource_::isEmpty 
> (this=this@entry=0x7f22e713d2e0) at /mesos/src/common/resources.cpp:1173
> #9  0x7f22f527e20c in mesos::Resources::add (this=0x7f22e713d400, 
> that=...) at /mesos/src/common/resources.cpp:1993
> #10 0x7f22f527f860 in mesos::Resources::operator+= 
> (this=this@entry=0x7f22e713d400, that=...) at 
> /mesos/src/common/resources.cpp:2016
> #11 0x7f22f527f91d in mesos::Resources::operator+= 
> (this=this@entry=0x7f22e713d400, that=...) at 
> /mesos/src/common/resources.cpp:2025
> #12 0x7f22f527fa4b in mesos::Resources::Resources (this=0x7f22e713d400, 
> _resources=...) at /mesos/src/common/resources.cpp:1277
> #13 0x7f22f548b812 in mesos::internal::master::Master::updateSlave 
> (this=0x558137bbae70, message=...) at /mesos/src/master/master.cpp:6681
> #14 0x7f22f550adc1 in 
> ProtobufProcess<mesos::internal::master::Master>::_handlerM<mesos::internal::UpdateSlaveMessage>
>  (t=0x558137bbae70, method=
> (void 
> (mesos::internal::master::Master::*)(mesos::internal::master::Master * const, 
> const mesos::internal::UpdateSlaveMessage &)) 0x7f22f548b6d0 
> <mesos::internal::master::Master::updateSlave(mesos::internal::UpdateSlaveMessage const&)>, 
> data="\n)\n'07ba28cc-d9fa-44fb-8d6b-f8c5c90f8a90-S1\022\030\n\004cpus\020\000\032\t\t\000\000\000\000\000\000\063@2\001*J")
> at /mesos/3rdparty/libprocess/include/process/protobuf.hpp:799
> #15 0x7f22f54c8791 in 
> ProtobufProcess<mesos::internal::master::Master>::visit (this=0x558137bbae70, 
> event=...) at /mesos/3rdparty/libprocess/include/process/protobuf.hpp:104
> #16 0x7f22f54572d4 in mesos::internal::master::Master::_visit 
> (this=this@entry=0x558137bbae70, event=...) at 
> /mesos/src/master/master.cpp:1643
> #17 0x7f22f547014d in mesos::internal::master::Master::visit 
> (this=0x558137bbae70, event=...) at /mesos/src/master/master.cpp:1575
> #18 0x7f22f60b7169 in serve (event=..., this=0x558137bbbf28) at 
> /mesos/3rdparty/libprocess/include/process/process.hpp:87
> #19 process::ProcessManager::resume (this=<optimized out>, 
> process=0x558137bbbf28) at /mesos/3rdparty/libprocess/src/process.cpp:3346
> #20 0x7f22f60bd056 in operator() (__closure=0x558137aa3218) at 
> /mesos/3rdparty/libprocess/src/process.cpp:2881
> #21 _M_invoke<> (this=0x558137aa3218) at /usr/include/c++/4.9/functional:1700
> #22 operator() (this=0x558137aa3218) at /usr/include/c++/4.9/functional:1688
> #23 
> std::thread::_Impl()>
>  >::_M_run(void) (this=0x558137aa3200) at /usr/include/c++/4.9/thread:115
> #24 0x7f22f40b3970 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
> #25 0x7f22f38d1064 in start_thread (arg=0x7f22e713e700) at 
> pthread_create.c:309
> #26 0x7f22f360662d in clone () at 
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
> {panel}





[jira] [Created] (MESOS-8106) Docker fetcher plugin unsupported scheme failure message is not accurate.

2017-10-17 Thread Gilbert Song (JIRA)
Gilbert Song created MESOS-8106:
---

 Summary: Docker fetcher plugin unsupported scheme failure message 
is not accurate.
 Key: MESOS-8106
 URL: https://issues.apache.org/jira/browse/MESOS-8106
 Project: Mesos
  Issue Type: Improvement
  Components: containerization
Reporter: Gilbert Song


https://github.com/apache/mesos/blob/1.4.0/src/uri/fetchers/docker.cpp#L843

This failure message is not accurate. For example, if the user/operator gives a 
wrong credential when talking to a Docker private registry that uses BASIC 
auth, the authentication fails but the log still says: "Unsupported 
auth-scheme: BASIC"







[jira] [Commented] (MESOS-7504) Parent's mount namespace cannot be determined when launching a nested container.

2017-10-17 Thread Andrei Budnik (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207751#comment-16207751
 ] 

Andrei Budnik commented on MESOS-7504:
--

List of failing tests:
{{NestedMesosContainerizerTest.ROOT_CGROUPS_DestroyDebugContainerOnRecover}}
{{ROOT_CGROUPS_DebugNestedContainerInheritsEnvironment}}

> Parent's mount namespace cannot be determined when launching a nested 
> container.
> 
>
> Key: MESOS-7504
> URL: https://issues.apache.org/jira/browse/MESOS-7504
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Affects Versions: 1.3.0
> Environment: Ubuntu 16.04
>Reporter: Alexander Rukletsov
>Assignee: Andrei Budnik
>  Labels: containerizer, flaky-test, mesosphere
>
> I've observed this failure twice in different Linux environments. Here is an 
> example of such failure:
> {noformat}
> [ RUN  ] 
> NestedMesosContainerizerTest.ROOT_CGROUPS_DestroyDebugContainerOnRecover
> I0509 21:53:25.471657 17167 containerizer.cpp:221] Using isolation: 
> cgroups/cpu,filesystem/linux,namespaces/pid,network/cni,volume/image
> I0509 21:53:25.475124 17167 linux_launcher.cpp:150] Using 
> /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
> I0509 21:53:25.475407 17167 provisioner.cpp:249] Using default backend 
> 'overlay'
> I0509 21:53:25.481232 17186 containerizer.cpp:608] Recovering containerizer
> I0509 21:53:25.482295 17186 provisioner.cpp:410] Provisioner recovery complete
> I0509 21:53:25.482587 17187 containerizer.cpp:1001] Starting container 
> 21bc372c-0f2c-49f5-b8ab-8d32c232b95d for executor 'executor' of framework 
> I0509 21:53:25.482918 17189 cgroups.cpp:410] Creating cgroup at 
> '/sys/fs/cgroup/cpu,cpuacct/mesos_test_d989f526-efe0-4553-bf79-936ad66c3753/21bc372c-0f2c-49f5-b8ab-8d32c232b95d'
>  for container 21bc372c-0f2c-49f5-b8ab-8d32c232b95d
> I0509 21:53:25.484103 17190 cpu.cpp:101] Updated 'cpu.shares' to 1024 (cpus 
> 1) for container 21bc372c-0f2c-49f5-b8ab-8d32c232b95d
> I0509 21:53:25.484808 17186 containerizer.cpp:1524] Launching 
> 'mesos-containerizer' with flags '--help="false" 
> --launch_info="{"clone_namespaces":[131072,536870912],"command":{"shell":true,"value":"sleep
>  
> 1000"},"environment":{"variables":[{"name":"MESOS_SANDBOX","type":"VALUE","value":"\/tmp\/NestedMesosContainerizerTest_ROOT_CGROUPS_DestroyDebugContainerOnRecover_zlywyr"}]},"pre_exec_commands":[{"arguments":["mesos-containerizer","mount","--help=false","--operation=make-rslave","--path=\/"],"shell":false,"value":"\/home\/ubuntu\/workspace\/mesos\/Mesos_CI-build\/FLAG\/SSL\/label\/mesos-ec2-ubuntu-16.04\/mesos\/build\/src\/mesos-containerizer"},{"shell":true,"value":"mount
>  -n -t proc proc \/proc -o 
> nosuid,noexec,nodev"}],"working_directory":"\/tmp\/NestedMesosContainerizerTest_ROOT_CGROUPS_DestroyDebugContainerOnRecover_zlywyr"}"
>  --pipe_read="29" --pipe_write="32" 
> --runtime_directory="/tmp/NestedMesosContainerizerTest_ROOT_CGROUPS_DestroyDebugContainerOnRecover_sKhtj7/containers/21bc372c-0f2c-49f5-b8ab-8d32c232b95d"
>  --unshare_namespace_mnt="false"'
> I0509 21:53:25.484978 17189 linux_launcher.cpp:429] Launching container 
> 21bc372c-0f2c-49f5-b8ab-8d32c232b95d and cloning with namespaces CLONE_NEWNS 
> | CLONE_NEWPID
> I0509 21:53:25.513890 17186 containerizer.cpp:1623] Checkpointing container's 
> forked pid 1873 to 
> '/tmp/NestedMesosContainerizerTest_ROOT_CGROUPS_DestroyDebugContainerOnRecover_Rdjw6M/meta/slaves/frameworks/executors/executor/runs/21bc372c-0f2c-49f5-b8ab-8d32c232b95d/pids/forked.pid'
> I0509 21:53:25.515878 17190 fetcher.cpp:353] Starting to fetch URIs for 
> container: 21bc372c-0f2c-49f5-b8ab-8d32c232b95d, directory: 
> /tmp/NestedMesosContainerizerTest_ROOT_CGROUPS_DestroyDebugContainerOnRecover_zlywyr
> I0509 21:53:25.517715 17193 containerizer.cpp:1791] Starting nested container 
> 21bc372c-0f2c-49f5-b8ab-8d32c232b95d.ea991d38-e1a5-44fe-a522-622b15142e35
> I0509 21:53:25.518569 17193 switchboard.cpp:545] Launching 
> 'mesos-io-switchboard' with flags '--heartbeat_interval="30secs" 
> --help="false" 
> --socket_address="/tmp/mesos-io-switchboard-ca463cf2-70ba-4121-a5c6-1a170ae40c1b"
>  --stderr_from_fd="36" --stderr_to_fd="2" --stdin_to_fd="32" 
> --stdout_from_fd="33" --stdout_to_fd="1" --tty="false" 
> --wait_for_connection="true"' for container 
> 21bc372c-0f2c-49f5-b8ab-8d32c232b95d.ea991d38-e1a5-44fe-a522-622b15142e35
> I0509 21:53:25.521229 17193 switchboard.cpp:575] Created I/O switchboard 
> server (pid: 1881) listening on socket file 
> '/tmp/mesos-io-switchboard-ca463cf2-70ba-4121-a5c6-1a170ae40c1b' for 
> container 
> 21bc372c-0f2c-49f5-b8ab-8d32c232b95d.ea991d38-e1a5-44fe-a522-622b15142e35
> I0509 21:53:25.522195 17191 containerizer.cpp:15

[jira] [Commented] (MESOS-7504) Parent's mount namespace cannot be determined when launching a nested container.

2017-10-17 Thread Andrei Budnik (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207669#comment-16207669
 ] 

Andrei Budnik commented on MESOS-7504:
--

https://reviews.apache.org/r/63074/
https://reviews.apache.org/r/63035/

> Parent's mount namespace cannot be determined when launching a nested 
> container.
> 
>
> Key: MESOS-7504
> URL: https://issues.apache.org/jira/browse/MESOS-7504
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Affects Versions: 1.3.0
> Environment: Ubuntu 16.04
>Reporter: Alexander Rukletsov
>Assignee: Andrei Budnik
>  Labels: containerizer, flaky-test, mesosphere
>
> I've observed this failure twice in different Linux environments. Here is an 
> example of such failure:
> {noformat}
> [ RUN  ] 
> NestedMesosContainerizerTest.ROOT_CGROUPS_DestroyDebugContainerOnRecover
> I0509 21:53:25.471657 17167 containerizer.cpp:221] Using isolation: 
> cgroups/cpu,filesystem/linux,namespaces/pid,network/cni,volume/image
> I0509 21:53:25.475124 17167 linux_launcher.cpp:150] Using 
> /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
> I0509 21:53:25.475407 17167 provisioner.cpp:249] Using default backend 
> 'overlay'
> I0509 21:53:25.481232 17186 containerizer.cpp:608] Recovering containerizer
> I0509 21:53:25.482295 17186 provisioner.cpp:410] Provisioner recovery complete
> I0509 21:53:25.482587 17187 containerizer.cpp:1001] Starting container 
> 21bc372c-0f2c-49f5-b8ab-8d32c232b95d for executor 'executor' of framework 
> I0509 21:53:25.482918 17189 cgroups.cpp:410] Creating cgroup at 
> '/sys/fs/cgroup/cpu,cpuacct/mesos_test_d989f526-efe0-4553-bf79-936ad66c3753/21bc372c-0f2c-49f5-b8ab-8d32c232b95d'
>  for container 21bc372c-0f2c-49f5-b8ab-8d32c232b95d
> I0509 21:53:25.484103 17190 cpu.cpp:101] Updated 'cpu.shares' to 1024 (cpus 
> 1) for container 21bc372c-0f2c-49f5-b8ab-8d32c232b95d
> I0509 21:53:25.484808 17186 containerizer.cpp:1524] Launching 
> 'mesos-containerizer' with flags '--help="false" 
> --launch_info="{"clone_namespaces":[131072,536870912],"command":{"shell":true,"value":"sleep
>  
> 1000"},"environment":{"variables":[{"name":"MESOS_SANDBOX","type":"VALUE","value":"\/tmp\/NestedMesosContainerizerTest_ROOT_CGROUPS_DestroyDebugContainerOnRecover_zlywyr"}]},"pre_exec_commands":[{"arguments":["mesos-containerizer","mount","--help=false","--operation=make-rslave","--path=\/"],"shell":false,"value":"\/home\/ubuntu\/workspace\/mesos\/Mesos_CI-build\/FLAG\/SSL\/label\/mesos-ec2-ubuntu-16.04\/mesos\/build\/src\/mesos-containerizer"},{"shell":true,"value":"mount
>  -n -t proc proc \/proc -o 
> nosuid,noexec,nodev"}],"working_directory":"\/tmp\/NestedMesosContainerizerTest_ROOT_CGROUPS_DestroyDebugContainerOnRecover_zlywyr"}"
>  --pipe_read="29" --pipe_write="32" 
> --runtime_directory="/tmp/NestedMesosContainerizerTest_ROOT_CGROUPS_DestroyDebugContainerOnRecover_sKhtj7/containers/21bc372c-0f2c-49f5-b8ab-8d32c232b95d"
>  --unshare_namespace_mnt="false"'
> I0509 21:53:25.484978 17189 linux_launcher.cpp:429] Launching container 
> 21bc372c-0f2c-49f5-b8ab-8d32c232b95d and cloning with namespaces CLONE_NEWNS 
> | CLONE_NEWPID
> I0509 21:53:25.513890 17186 containerizer.cpp:1623] Checkpointing container's 
> forked pid 1873 to 
> '/tmp/NestedMesosContainerizerTest_ROOT_CGROUPS_DestroyDebugContainerOnRecover_Rdjw6M/meta/slaves/frameworks/executors/executor/runs/21bc372c-0f2c-49f5-b8ab-8d32c232b95d/pids/forked.pid'
> I0509 21:53:25.515878 17190 fetcher.cpp:353] Starting to fetch URIs for 
> container: 21bc372c-0f2c-49f5-b8ab-8d32c232b95d, directory: 
> /tmp/NestedMesosContainerizerTest_ROOT_CGROUPS_DestroyDebugContainerOnRecover_zlywyr
> I0509 21:53:25.517715 17193 containerizer.cpp:1791] Starting nested container 
> 21bc372c-0f2c-49f5-b8ab-8d32c232b95d.ea991d38-e1a5-44fe-a522-622b15142e35
> I0509 21:53:25.518569 17193 switchboard.cpp:545] Launching 
> 'mesos-io-switchboard' with flags '--heartbeat_interval="30secs" 
> --help="false" 
> --socket_address="/tmp/mesos-io-switchboard-ca463cf2-70ba-4121-a5c6-1a170ae40c1b"
>  --stderr_from_fd="36" --stderr_to_fd="2" --stdin_to_fd="32" 
> --stdout_from_fd="33" --stdout_to_fd="1" --tty="false" 
> --wait_for_connection="true"' for container 
> 21bc372c-0f2c-49f5-b8ab-8d32c232b95d.ea991d38-e1a5-44fe-a522-622b15142e35
> I0509 21:53:25.521229 17193 switchboard.cpp:575] Created I/O switchboard 
> server (pid: 1881) listening on socket file 
> '/tmp/mesos-io-switchboard-ca463cf2-70ba-4121-a5c6-1a170ae40c1b' for 
> container 
> 21bc372c-0f2c-49f5-b8ab-8d32c232b95d.ea991d38-e1a5-44fe-a522-622b15142e35
> I0509 21:53:25.522195 17191 containerizer.cpp:1524] Launching 
> 'mesos-containerizer' with flags '--help="false" 
> --launch_info="{

[jira] [Updated] (MESOS-8105) Docker containerizer fails with "Unable to get executor pid after launch"

2017-10-17 Thread maybob (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

maybob updated MESOS-8105:
--
Description: 
When running many commands at the same time, each command using the same 
executor with a different executorId under Docker, some executors fail with the 
error "Unable to get executor pid after launch".
One possible reason is that "docker inspect" hangs or does not return.

{color:red}Log:{color}

{code:none}
I1012 16:15:01.003931 124081 slave.cpp:1619] Got assigned task '920860' for 
framework framework-id-daily
I1012 16:15:01.006091 124081 slave.cpp:1900] Authorizing task '920860' for 
framework framework-id-daily
I1012 16:15:01.008281 124081 slave.cpp:2087] Launching task '920860' for 
framework framework-id-daily
I1012 16:15:01.008779 124081 paths.cpp:573] Trying to chown 
'/volumes/sdb1/mesos/slaves/89192f68-d28f-498c-808f-442a1ef576b3-S2/frameworks/framework-id-daily/executors/Executor_920860/runs/29c82b61-1242-4de9-80cf-16f46c30e7e3'
 to user 'maybob'
I1012 16:15:01.009027 124081 slave.cpp:7401] Checkpointing ExecutorInfo to 
'/volumes/sdb1/mesos/meta/slaves/89192f68-d28f-498c-808f-442a1ef576b3-S2/frameworks/framework-id-daily/executors/Executor_920860/executor.info'
I1012 16:15:01.009546 124081 slave.cpp:7038] Launching executor 
'Executor_920860' of framework framework-id-daily with resources {} in work 
directory 
'/volumes/sdb1/mesos/slaves/89192f68-d28f-498c-808f-442a1ef576b3-S2/frameworks/framework-id-daily/executors/Executor_920860/runs/29c82b61-1242-4de9-80cf-16f46c30e7e3'
I1012 16:15:01.010339 124081 slave.cpp:7429] Checkpointing TaskInfo to 
'/volumes/sdb1/mesos/meta/slaves/89192f68-d28f-498c-808f-442a1ef576b3-S2/frameworks/framework-id-daily/executors/Executor_920860/runs/29c82b61-1242-4de9-80cf-16f46c30e7e3/tasks/920860/task.info'
I1012 16:15:01.010726 124081 slave.cpp:2316] Queued task '920860' for executor 
'Executor_920860' of framework framework-id-daily
I1012 16:15:01.011740 124088 docker.cpp:1175] Starting container 
'29c82b61-1242-4de9-80cf-16f46c30e7e3' for executor 'Executor_920860' and 
framework framework-id-daily
I1012 16:15:01.013123 124081 slave.cpp:877] Successfully attached file 
'/volumes/sdb1/mesos/slaves/89192f68-d28f-498c-808f-442a1ef576b3-S2/frameworks/framework-id-daily/executors/Executor_920860/runs/29c82b61-1242-4de9-80cf-16f46c30e7e3'
I1012 16:15:01.013290 124080 fetcher.cpp:353] Starting to fetch URIs for 
container: 29c82b61-1242-4de9-80cf-16f46c30e7e3, directory: 
/volumes/sdb1/mesos/slaves/89192f68-d28f-498c-808f-442a1ef576b3-S2/frameworks/framework-id-daily/executors/Executor_920860/runs/29c82b61-1242-4de9-80cf-16f46c30e7e3

I1012 16:15:01.706429 124071 docker.cpp:909] Running docker -H 
unix:///var/run/docker.sock run --cpu-shares 378 --memory 427819008 -e 
LIBPROCESS_PORT=0 -e MESOS_AGENT_ENDPOINT=xxx.xxx.xxx.xxx:5051 -e 
MESOS_CHECKPOINT=1 -e 
MESOS_CONTAINER_NAME=mesos-89192f68-d28f-498c-808f-442a1ef576b3-S2.29c82b61-1242-4de9-80cf-16f46c30e7e3
 -e 
MESOS_DIRECTORY=/volumes/sdb1/mesos/slaves/89192f68-d28f-498c-808f-442a1ef576b3-S2/frameworks/framework-id-daily/executors/Executor_920860/runs/29c82b61-1242-4de9-80cf-16f46c30e7e3
 -e MESOS_EXECUTOR_ID=Executor_920860 -e 
MESOS_EXECUTOR_SHUTDOWN_GRACE_PERIOD=5secs -e 
MESOS_FRAMEWORK_ID=framework-id-daily -e MESOS_HTTP_COMMAND_EXECUTOR=0 -e 
MESOS_NATIVE_JAVA_LIBRARY=/usr/local/lib/libmesos-1.3.1.so -e 
MESOS_NATIVE_LIBRARY=/usr/local/lib/libmesos-1.3.1.so -e 
MESOS_RECOVERY_TIMEOUT=15mins -e MESOS_SANDBOX=/mnt/mesos/sandbox -e 
MESOS_SLAVE_ID=89192f68-d28f-498c-808f-442a1ef576b3-S2 -e 
MESOS_SLAVE_PID=slave(1)@xxx.xxx.xxx.xxx:5051 -e 
MESOS_SUBSCRIPTION_BACKOFF_MAX=2secs -v 
/volumes/sdb1/mesos/slaves/89192f68-d28f-498c-808f-442a1ef576b3-S2/frameworks/framework-id-daily/executors/Executor_920860/runs/29c82b61-1242-4de9-80cf-16f46c30e7e3:/mnt/mesos/sandbox
 --net host --entrypoint /bin/sh --name 
mesos-89192f68-d28f-498c-808f-442a1ef576b3-S2.29c82b61-1242-4de9-80cf-16f46c30e7e3
 reg.docker.xxx/xx/executor:v25 -c env && cd $MESOS_SANDBOX && ./executor.sh
I1012 16:15:01.717859 124071 docker.cpp:1071] Running docker -H 
unix:///var/run/docker.sock inspect 
mesos-89192f68-d28f-498c-808f-442a1ef576b3-S2.29c82b61-1242-4de9-80cf-16f46c30e7e3

I1012 16:15:02.033951 124085 docker.cpp:1118] Retrying inspect with non-zero 
status code. cmd: 'docker -H unix:///var/run/docker.sock inspect 
mesos-89192f68-d28f-498c-808f-442a1ef576b3-S2.29c82b61-1242-4de9-80cf-16f46c30e7e3',
 interval: 1secs

I1012 16:15:03.034230 124090 docker.cpp:1071] Running docker -H 
unix:///var/run/docker.sock inspect 
mesos-89192f68-d28f-498c-808f-442a1ef576b3-S2.29c82b61-1242-4de9-80cf-16f46c30e7e3

I1012 16:15:03.518020 124078 docker.cpp:1071] Running docker -H 
unix:///var/run/docker.sock inspect 
mesos-89192f68-d28f-498c-808f-442a1ef576b3-S2.29c82b61-1242-4de9-80cf-16f46c30e7e3

I1012 16:15:29.554232 124076 docker.cpp:1753] Updated 'cpu.shares' to 378 at 
/sys/f

[jira] [Created] (MESOS-8105) Docker containerizer fails with "Unable to get executor pid after launch"

2017-10-17 Thread maybob (JIRA)
maybob created MESOS-8105:
-

 Summary: Docker containerizer fails with "Unable to get executor 
pid after launch"
 Key: MESOS-8105
 URL: https://issues.apache.org/jira/browse/MESOS-8105
 Project: Mesos
  Issue Type: Bug
  Components: containerization
Reporter: maybob


When running many commands at the same time, each command using the same 
executor with a different executorId under Docker, some executors fail with the 
error "Unable to get executor pid after launch".
One possible reason is that "docker inspect" hangs or does not return.

{color:red}Log:{color}

{code:none}
I1012 16:15:01.003931 124081 slave.cpp:1619] Got assigned task '920860' for 
framework framework-id-daily
I1012 16:15:01.006091 124081 slave.cpp:1900] Authorizing task '920860' for 
framework framework-id-daily
I1012 16:15:01.008281 124081 slave.cpp:2087] Launching task '920860' for 
framework framework-id-daily
I1012 16:15:01.008779 124081 paths.cpp:573] Trying to chown 
'/volumes/sdb1/mesos/slaves/89192f68-d28f-498c-808f-442a1ef576b3-S2/frameworks/framework-id-daily/executors/Executor_920860/runs/29c82b61-1242-4de9-80cf-16f46c30e7e3'
 to user 'maybob'
I1012 16:15:01.009027 124081 slave.cpp:7401] Checkpointing ExecutorInfo to 
'/volumes/sdb1/mesos/meta/slaves/89192f68-d28f-498c-808f-442a1ef576b3-S2/frameworks/framework-id-daily/executors/Executor_920860/executor.info'
I1012 16:15:01.009546 124081 slave.cpp:7038] Launching executor 
'Executor_920860' of framework framework-id-daily with resources {} in work 
directory 
'/volumes/sdb1/mesos/slaves/89192f68-d28f-498c-808f-442a1ef576b3-S2/frameworks/framework-id-daily/executors/Executor_920860/runs/29c82b61-1242-4de9-80cf-16f46c30e7e3'
I1012 16:15:01.010339 124081 slave.cpp:7429] Checkpointing TaskInfo to 
'/volumes/sdb1/mesos/meta/slaves/89192f68-d28f-498c-808f-442a1ef576b3-S2/frameworks/framework-id-daily/executors/Executor_920860/runs/29c82b61-1242-4de9-80cf-16f46c30e7e3/tasks/920860/task.info'
I1012 16:15:01.010726 124081 slave.cpp:2316] Queued task '920860' for executor 
'Executor_920860' of framework framework-id-daily
I1012 16:15:01.011740 124088 docker.cpp:1175] Starting container 
'29c82b61-1242-4de9-80cf-16f46c30e7e3' for executor 'Executor_920860' and 
framework framework-id-daily
I1012 16:15:01.013123 124081 slave.cpp:877] Successfully attached file 
'/volumes/sdb1/mesos/slaves/89192f68-d28f-498c-808f-442a1ef576b3-S2/frameworks/framework-id-daily/executors/Executor_920860/runs/29c82b61-1242-4de9-80cf-16f46c30e7e3'
I1012 16:15:01.013290 124080 fetcher.cpp:353] Starting to fetch URIs for 
container: 29c82b61-1242-4de9-80cf-16f46c30e7e3, directory: 
/volumes/sdb1/mesos/slaves/89192f68-d28f-498c-808f-442a1ef576b3-S2/frameworks/framework-id-daily/executors/Executor_920860/runs/29c82b61-1242-4de9-80cf-16f46c30e7e3

I1012 16:15:01.706429 124071 docker.cpp:909] Running docker -H 
unix:///var/run/docker.sock run --cpu-shares 378 --memory 427819008 -e 
LIBPROCESS_PORT=0 -e MESOS_AGENT_ENDPOINT=xxx.xxx.xxx.xxx:5051 -e 
MESOS_CHECKPOINT=1 -e 
MESOS_CONTAINER_NAME=mesos-89192f68-d28f-498c-808f-442a1ef576b3-S2.29c82b61-1242-4de9-80cf-16f46c30e7e3
 -e 
MESOS_DIRECTORY=/volumes/sdb1/mesos/slaves/89192f68-d28f-498c-808f-442a1ef576b3-S2/frameworks/framework-id-daily/executors/Executor_920860/runs/29c82b61-1242-4de9-80cf-16f46c30e7e3
 -e MESOS_EXECUTOR_ID=Executor_920860 -e 
MESOS_EXECUTOR_SHUTDOWN_GRACE_PERIOD=5secs -e 
MESOS_FRAMEWORK_ID=framework-id-daily -e MESOS_HTTP_COMMAND_EXECUTOR=0 -e 
MESOS_NATIVE_JAVA_LIBRARY=/usr/local/lib/libmesos-1.3.1.so -e 
MESOS_NATIVE_LIBRARY=/usr/local/lib/libmesos-1.3.1.so -e 
MESOS_RECOVERY_TIMEOUT=15mins -e MESOS_SANDBOX=/mnt/mesos/sandbox -e 
MESOS_SLAVE_ID=89192f68-d28f-498c-808f-442a1ef576b3-S2 -e 
MESOS_SLAVE_PID=slave(1)@xxx.xxx.xxx.xxx:5051 -e 
MESOS_SUBSCRIPTION_BACKOFF_MAX=2secs -v 
/volumes/sdb1/mesos/slaves/89192f68-d28f-498c-808f-442a1ef576b3-S2/frameworks/framework-id-daily/executors/Executor_920860/runs/29c82b61-1242-4de9-80cf-16f46c30e7e3:/mnt/mesos/sandbox
 --net host --entrypoint /bin/sh --name 
mesos-89192f68-d28f-498c-808f-442a1ef576b3-S2.29c82b61-1242-4de9-80cf-16f46c30e7e3
 reg.docker.xxx/xx/executor:v25 -c env && cd $MESOS_SANDBOX && ./executor.sh
I1012 16:15:01.717859 124071 docker.cpp:1071] Running docker -H 
unix:///var/run/docker.sock inspect 
mesos-89192f68-d28f-498c-808f-442a1ef576b3-S2.29c82b61-1242-4de9-80cf-16f46c30e7e3

I1012 16:15:02.033951 124085 docker.cpp:1118] Retrying inspect with non-zero 
status code. cmd: 'docker -H unix:///var/run/docker.sock inspect 
mesos-89192f68-d28f-498c-808f-442a1ef576b3-S2.29c82b61-1242-4de9-80cf-16f46c30e7e3',
 interval: 1secs

I1012 16:15:03.034230 124090 docker.cpp:1071] Running docker -H 
unix:///var/run/docker.sock inspect 
mesos-89192f68-d28f-498c-808f-442a1ef576b3-S2.29c82b61-1242-4de9-80cf-16f46c30e7e3

I1012 16:15:03.518020 124078 docker.cpp:1071] Running docker -H 
unix:///var/run/docker