[jira] [Comment Edited] (MESOS-7971) PersistentVolumeEndpointsTest.EndpointCreateThenOfferRemove test is flaky
[ https://issues.apache.org/jira/browse/MESOS-7971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16739635#comment-16739635 ]

Andrei Budnik edited comment on MESOS-7971 at 1/10/19 5:40 PM:
---

This is something different from previous ones.
{code:java}
E0110 17:13:09.326659 13916 master.cpp:8586] Failed to find the operation '' (uuid: 825f65eb-3ba1-4dfa-bdfa-8eb29194ace3) for an operator API call on agent ae22a9c8-0ef6-4f1e-b1eb-7b55f6e4508b-S0
{code}
Full log:
{code:java}
[ RUN ] PersistentVolumeEndpointsTest.EndpointCreateThenOfferRemove
I0110 17:12:59.303460 13893 cluster.cpp:174] Creating default 'local' authorizer
I0110 17:12:59.304430 13912 master.cpp:416] Master ae22a9c8-0ef6-4f1e-b1eb-7b55f6e4508b (ip-172-16-10-92.ec2.internal) started on 172.16.10.92:42320
I0110 17:12:59.304451 13912 master.cpp:419] Flags at startup: --acls="" --agent_ping_timeout="15secs" --agent_reregister_timeout="10mins" --allocation_interval="1000secs" --allocator="hierarchical" --authenticate_agents="true" --authenticate_frameworks="true" --authenticate_http_frameworks="true" --authenticate_http_readonly="true" --authenticate_http_readwrite="true" --authentication_v0_timeout="15secs" --authenticators="crammd5" --authorizers="local" --credentials="/tmp/PfFTwT/credentials" --filter_gpu_resources="true" --framework_sorter="drf" --help="false" --hostname_lookup="true" --http_authenticators="basic" --http_framework_authenticators="basic" --initialize_driver_logging="true" --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" --max_agent_ping_timeouts="5" --max_completed_frameworks="50" --max_completed_tasks_per_framework="1000" --max_operator_event_stream_subscribers="1000" --max_unreachable_tasks_per_framework="1000" --memory_profiling="false" --min_allocatable_resources="cpus:0.01|mem:32" --port="5050" --publish_per_framework_metrics="true" --quiet="false" --recovery_agent_removal_limit="100%" --registry="in_memory" --registry_fetch_timeout="1mins" --registry_gc_interval="15mins" --registry_max_agent_age="2weeks" --registry_max_agent_count="102400" --registry_store_timeout="100secs" --registry_strict="false" --require_agent_domain="false" --role_sorter="drf" --roles="role1" --root_submissions="true" --version="false" --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/PfFTwT/master" --zk_session_timeout="10secs"
I0110 17:12:59.304585 13912 master.cpp:468] Master only allowing authenticated frameworks to register
I0110 17:12:59.304595 13912 master.cpp:474] Master only allowing authenticated agents to register
I0110 17:12:59.304603 13912 master.cpp:480] Master only allowing authenticated HTTP frameworks to register
I0110 17:12:59.304615 13912 credentials.hpp:37] Loading credentials for authentication from '/tmp/PfFTwT/credentials'
I0110 17:12:59.304684 13912 master.cpp:524] Using default 'crammd5' authenticator
I0110 17:12:59.304744 13912 http.cpp:965] Creating default 'basic' HTTP authenticator for realm 'mesos-master-readonly'
I0110 17:12:59.304831 13912 http.cpp:965] Creating default 'basic' HTTP authenticator for realm 'mesos-master-readwrite'
I0110 17:12:59.304889 13912 http.cpp:965] Creating default 'basic' HTTP authenticator for realm 'mesos-master-scheduler'
I0110 17:12:59.304941 13912 master.cpp:605] Authorization enabled
W0110 17:12:59.304967 13912 master.cpp:668] The '--roles' flag is deprecated. This flag will be removed in the future. See the Mesos 0.27 upgrade notes for more information
I0110 17:12:59.305047 13919 hierarchical.cpp:176] Initialized hierarchical allocator process
I0110 17:12:59.305128 13918 whitelist_watcher.cpp:77] No whitelist given
I0110 17:12:59.305600 13914 master.cpp:2085] Elected as the leading master!
I0110 17:12:59.305622 13914 master.cpp:1640] Recovering from registrar
I0110 17:12:59.305698 13913 registrar.cpp:339] Recovering registrar
I0110 17:12:59.305853 13912 registrar.cpp:383] Successfully fetched the registry (0B) in 118016ns
I0110 17:12:59.305899 13912 registrar.cpp:487] Applied 1 operations in 8238ns; attempting to update the registry
I0110 17:12:59.306036 13912 registrar.cpp:544] Successfully updated the registry in 112128ns
I0110 17:12:59.306092 13912 registrar.cpp:416] Successfully recovered registrar
I0110 17:12:59.306217 13916 master.cpp:1754] Recovered 0 agents from the registry (172B); allowing 10mins for agents to reregister
I0110 17:12:59.306258 13919 hierarchical.cpp:216] Skipping recovery of hierarchical allocator: nothing to recover
W0110 17:12:59.307780 13893 process.cpp:2829] Attempted to spawn already running process files@172.16.10.92:42320
I0110 17:12:59.308149 13893 containerizer.cpp:305] Using isolation { environment_secret, posix/cpu, posix/mem, filesystem/posix, network/cni }
I0110 17:12:59.310348 13893 linux_launcher.cpp:144] Using /sys/fs/cgroup/freezer as the freezer hierarchy for the Linux
[jira] [Comment Edited] (MESOS-7971) PersistentVolumeEndpointsTest.EndpointCreateThenOfferRemove test is flaky
[ https://issues.apache.org/jira/browse/MESOS-7971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16712278#comment-16712278 ]

Chun-Hung Hsiao edited comment on MESOS-7971 at 12/7/18 2:52 AM:
-

For resource provider operations, we use {{resource_version_uuid}} to resolve this. It seems to me that we should do the same in {{Slave::applyOperation}} as well: check whether {{ApplyOperationMessage.resource_version_uuid}} equals {{resourceVersion}}, and apply the speculative operation only if the versions match.

However, {{resource_version_uuid}} has only existed since 1.5 (with the {{RESOURCE_PROVIDER}} agent capability), so we could not use the same strategy to fix this in 1.4 even if we wanted to (1.4 is no longer supported, though).
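The version check proposed in this comment can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Mesos code: the `Agent` class, its `bump_version` method, and the Python `ApplyOperationMessage` dataclass are hypothetical stand-ins that borrow the names used in the comment.

```python
import uuid
from dataclasses import dataclass


@dataclass
class ApplyOperationMessage:
    """Hypothetical stand-in for the message the master sends to the agent."""
    operation: str
    resource_version_uuid: uuid.UUID


class Agent:
    """Toy agent that tracks a resource version and the operations it applied."""

    def __init__(self):
        self.resource_version = uuid.uuid4()
        self.applied = []

    def bump_version(self):
        # Assumption: the agent picks a new version whenever its resources
        # change in a way the master has not seen yet.
        self.resource_version = uuid.uuid4()

    def apply_operation(self, message):
        # Apply the speculative operation only if it was computed against
        # the agent's current resource version; otherwise reject it.
        if message.resource_version_uuid != self.resource_version:
            return False
        self.applied.append(message.operation)
        return True
```

In this model, a message built against an outdated version is rejected instead of being applied on top of resources the master did not know about.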
> PersistentVolumeEndpointsTest.EndpointCreateThenOfferRemove test is flaky
> -
>
> Key: MESOS-7971
> URL: https://issues.apache.org/jira/browse/MESOS-7971
> Project: Mesos
> Issue Type: Bug
> Components: allocation
> Affects Versions: 1.4.0, 1.6.0, 1.7.0, 1.8.0
> Reporter: Vinod Kone
> Assignee: Meng Zhu
> Priority: Critical
> Labels: flaky-test, mesosphere
> Attachments: ApacheJenkinsConsoleText_autotools_gcc_ubuntu16.txt
>
>
> Saw this when testing 1.4.0-rc5
> {code}
> [ RUN ] PersistentVolumeEndpointsTest.EndpointCreateThenOfferRemove
> I0912 05:40:27.335222 30860 cluster.cpp:162] Creating default 'local'
> authorizer
> I0912 05:40:27.338429 30867 master.cpp:442] Master
> 2bd1e8eb-e314-4181-9ed3-d397ec1dbede (6aa774430302) started on
> 172.17.0.3:54639
> I0912 05:40:27.338472 30867 master.cpp:444] Flags at startup: --acls=""
> --agent_ping_timeout="15secs" --agent_reregister_timeout="10mins"
> --allocation_interval="50ms" --allocator="HierarchicalDRF"
> --authenticate_agents="true" --authenticate_frameworks="true"
> --authenticate_http_frameworks="true" --authenticate_http_readonly="true"
> --authenticate_http_readwrite="true" --authenticators="crammd5"
> --authorizers="local" --credentials="/tmp/hH0YXe/credentials"
> --filter_gpu_resources="true" --framework_sorter="drf" --help="false"
> --hostname_lookup="true" --http_authenticators="basic"
> --http_framework_authenticators="basic" --initialize_driver_logging="true"
> --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO"
> --max_agent_ping_timeouts="5" --max_completed_frameworks="50"
> --max_completed_tasks_per_framework="1000"
> --max_unreachable_tasks_per_framework="1000" --port="5050" --quiet="false"
> --recovery_agent_removal_limit="100%" --registry="in_memory"
> --registry_fetch_timeout="1mins" --registry_gc_interval="15mins"
> --registry_max_agent_age="2weeks" --registry_max_agent_count="102400"
> --registry_store_timeout="100secs" --registry_strict="false" --roles="role1"
> --root_submissions="true" --user_sorter="drf" --version="false"
> --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/hH0YXe/master"
> --zk_session_timeout="10secs"
> I0912 05:40:27.338778 30867 master.cpp:494] Master only allowing
> authenticated frameworks to register
> I0912 05:40:27.338788 30867 master.cpp:508] Master only allowing
> authenticated agents to register
> I0912 05:40:27.338793 30867 master.cpp:521] Master only allowing
> authenticated HTTP frameworks to register
> I0912 05:40:27.338799 30867 credentials.hpp:37] Loading credentials for
> authentication from '/tmp/hH0YXe/credentials'
> I0912 05:40:27.353009 30867 master.cpp:566] Using default 'crammd5'
> authenticator
> I0912 05:40:27.353183 30867 http.cpp:1026] Creating default 'basic' HTTP
> authenticator for realm 'mesos-master-readonly'
> I0912 05:40:27.353364 30867 http.cpp:1026] Creating default 'basic' HTTP
> authenticator for realm 'mesos-master-readwrite'
> I0912 05:40:27.353482 30867 http.cpp:1026] Creating default 'basic' HTTP
> authenticator for realm 'mesos-master-scheduler'
> I0912 05:40:27.353588 30867 master.cpp:646] Authorization enabled
> W0912 05:40:27.353605 30867 master.cpp:709] The '--roles' flag is deprecated.
> This flag will be removed in the future. See the Mesos 0.27 upgrade notes for
> more information
> I0912 05:40:27.353742 30868 hierarchical.cpp:171] Initialized hierarchical
> allocator process
[jira] [Comment Edited] (MESOS-7971) PersistentVolumeEndpointsTest.EndpointCreateThenOfferRemove test is flaky
[ https://issues.apache.org/jira/browse/MESOS-7971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16712200#comment-16712200 ]

Meng Zhu edited comment on MESOS-7971 at 12/7/18 1:12 AM:
--

This looks like a legitimate bug. Here is a sequence of events that can trigger it:
- The agent (re)registers with the master.
- Operation calls are made to the master (let’s say create volume).
- The allocator is speculatively updated in https://github.com/apache/mesos/blob/master/src/master/master.cpp#L11315
- Before the agent's resources are updated, the agent sends `UpdateSlaveMessage` upon receiving the (re)registered message in https://github.com/apache/mesos/blob/master/src/slave/slave.cpp#L1551 and https://github.com/apache/mesos/blob/master/src/slave/slave.cpp#L1633
- The `UpdateSlaveMessage` triggers the allocator to update the total resources with STALE info sent from the agent (https://github.com/apache/mesos/blob/master/src/master/master.cpp#L8205), so the update from the previous operation is overwritten and LOST.
- The agent finishes the operation and informs the master through `UpdateOperationStatusMessage`.
- But for a speculative operation, we do not update the allocator at that point: https://github.com/apache/mesos/blob/master/src/master/master.cpp#L11177

Thus, the speculative operation fails to be applied to the allocator but is successfully applied on the agent.
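The event sequence described in this comment can be replayed with a toy model. This is a rough sketch, not Mesos code: resources are modelled as plain sets of strings, and `replay` and `apply_create` are hypothetical helpers introduced only for illustration.

```python
def apply_create(resources):
    """Toy CREATE operation: convert raw disk into a persistent volume."""
    return (resources - {"disk"}) | {"volume"}


def replay(stale_update_arrives):
    """Replay the event sequence; return (allocator_total, agent_total)."""
    agent_total = {"disk"}              # agent resources before the operation
    allocator_total = set(agent_total)  # allocator view after (re)registration

    # The agent snapshots its resources for its update message before the
    # operation has reached it, so the snapshot does not include the volume.
    stale_update = set(agent_total)

    # Operator API call: the master speculatively updates the allocator.
    allocator_total = apply_create(allocator_total)

    # The stale update overwrites the allocator's total resources.
    if stale_update_arrives:
        allocator_total = set(stale_update)

    # The agent applies the operation. For a speculative operation the
    # status update does not touch the allocator again.
    agent_total = apply_create(agent_total)
    return allocator_total, agent_total
```

With `replay(True)` the allocator still sees raw disk while the agent holds the volume; with `replay(False)` both views agree.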
[jira] [Comment Edited] (MESOS-7971) PersistentVolumeEndpointsTest.EndpointCreateThenOfferRemove test is flaky
[ https://issues.apache.org/jira/browse/MESOS-7971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707765#comment-16707765 ]

Vinod Kone edited comment on MESOS-7971 at 12/3/18 8:50 PM:


Saw this again.
{noformat}
06:14:51 [ RUN ] PersistentVolumeEndpointsTest.EndpointCreateThenOfferRemove
06:14:51 I1203 06:14:50.630549 19784 cluster.cpp:173] Creating default 'local' authorizer
06:14:51 I1203 06:14:50.633529 19796 master.cpp:413] Master f1ffe054-ad44-45d4-9f39-84b048e1a359 (c16130e94783) started on 172.17.0.3:44340
06:14:51 I1203 06:14:50.633581 19796 master.cpp:416] Flags at startup: --acls="" --agent_ping_timeout="15secs" --agent_reregister_timeout="10mins" --allocation_interval="1000secs" --allocator="hierarchical" --authenticate_agents="true" --authenticate_frameworks="true" --authenticate_http_frameworks="true" --authenticate_http_readonly="true" --authenticate_http_readwrite="true" --authentication_v0_timeout="15secs" --authenticators="crammd5" --authorizers="local" --credentials="/tmp/4vMyjy/credentials" --filter_gpu_resources="true" --framework_sorter="drf" --help="false" --hostname_lookup="true" --http_authenticators="basic" --http_framework_authenticators="basic" --initialize_driver_logging="true" --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" --max_agent_ping_timeouts="5" --max_completed_frameworks="50" --max_completed_tasks_per_framework="1000" --max_unreachable_tasks_per_framework="1000" --memory_profiling="false" --min_allocatable_resources="cpus:0.01|mem:32" --port="5050" --publish_per_framework_metrics="true" --quiet="false" --recovery_agent_removal_limit="100%" --registry="in_memory" --registry_fetch_timeout="1mins" --registry_gc_interval="15mins" --registry_max_agent_age="2weeks" --registry_max_agent_count="102400" --registry_store_timeout="100secs" --registry_strict="false" --require_agent_domain="false" --role_sorter="drf" --roles="role1" --root_submissions="true" --version="false" --webui_dir="/tmp/SRC/build/mesos-1.8.0/_inst/share/mesos/webui" --work_dir="/tmp/4vMyjy/master" --zk_session_timeout="10secs"
06:14:51 I1203 06:14:50.634217 19796 master.cpp:465] Master only allowing authenticated frameworks to register
06:14:51 I1203 06:14:50.634236 19796 master.cpp:471] Master only allowing authenticated agents to register
06:14:51 I1203 06:14:50.634253 19796 master.cpp:477] Master only allowing authenticated HTTP frameworks to register
06:14:51 I1203 06:14:50.634270 19796 credentials.hpp:37] Loading credentials for authentication from '/tmp/4vMyjy/credentials'
06:14:51 I1203 06:14:50.634608 19796 master.cpp:521] Using default 'crammd5' authenticator
06:14:51 I1203 06:14:50.634840 19796 http.cpp:1042] Creating default 'basic' HTTP authenticator for realm 'mesos-master-readonly'
06:14:51 I1203 06:14:50.635052 19796 http.cpp:1042] Creating default 'basic' HTTP authenticator for realm 'mesos-master-readwrite'
06:14:51 I1203 06:14:50.635200 19796 http.cpp:1042] Creating default 'basic' HTTP authenticator for realm 'mesos-master-scheduler'
06:14:51 I1203 06:14:50.635373 19796 master.cpp:602] Authorization enabled
06:14:51 W1203 06:14:50.635457 19796 master.cpp:665] The '--roles' flag is deprecated. This flag will be removed in the future. See the Mesos 0.27 upgrade notes for more information
06:14:51 I1203 06:14:50.635991 19800 whitelist_watcher.cpp:77] No whitelist given
06:14:51 I1203 06:14:50.636032 19793 hierarchical.cpp:175] Initialized hierarchical allocator process
06:14:51 I1203 06:14:50.638939 19796 master.cpp:2105] Elected as the leading master!
06:14:51 I1203 06:14:50.638975 19796 master.cpp:1660] Recovering from registrar
06:14:51 I1203 06:14:50.639200 19792 registrar.cpp:339] Recovering registrar
06:14:51 I1203 06:14:50.639927 19792 registrar.cpp:383] Successfully fetched the registry (0B) in 672768ns
06:14:51 I1203 06:14:50.640069 19792 registrar.cpp:487] Applied 1 operations in 48006ns; attempting to update the registry
06:14:51 I1203 06:14:50.640718 19792 registrar.cpp:544] Successfully updated the registry in 582912ns
06:14:51 I1203 06:14:50.640852 19792 registrar.cpp:416] Successfully recovered registrar
06:14:51 I1203 06:14:50.641299 19800 master.cpp:1774] Recovered 0 agents from the registry (135B); allowing 10mins for agents to reregister
06:14:51 I1203 06:14:50.641340 19799 hierarchical.cpp:215] Skipping recovery of hierarchical allocator: nothing to recover
06:14:51 W1203 06:14:50.647153 19784 process.cpp:2829] Attempted to spawn already running process files@172.17.0.3:44340
06:14:51 I1203 06:14:50.648453 19784 containerizer.cpp:305] Using isolation { environment_secret, posix/cpu, posix/mem, filesystem/posix, network/cni }
06:14:51 W1203 06:14:50.649060 19784 backend.cpp:76] Failed to create 'aufs' backend: AufsBackend requires root privileges
06:14:51 W1203 06:14:50.649088 19784 backend.cpp:76] Failed