[
https://issues.apache.org/jira/browse/MESOS-4832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15174482#comment-15174482
]
Jie Yu commented on MESOS-4832:
-------------------------------
I think the problem is that we're trying to unmount the persistent volume twice:
{noformat}
I0226 03:17:28.127876 1114 docker.cpp:912] Unmounting volume for container
'bcc90102-163d-4ff6-a3fc-a1b2e3fc3b7c'
I0226 03:17:28.127957 1114 docker.cpp:912] Unmounting volume for container
'bcc90102-163d-4ff6-a3fc-a1b2e3fc3b7c'
{noformat}
Looking at the code:
{code}
Try<Nothing> unmountPersistentVolumes(const ContainerID& containerId)
{
  // We assume volumes are only supported on Linux, and also
  // the target path contains the containerId.
#ifdef __linux__
  Try<fs::MountInfoTable> table = fs::MountInfoTable::read();
  if (table.isError()) {
    return Error("Failed to get mount table: " + table.error());
  }

  foreach (const fs::MountInfoTable::Entry& entry,
           adaptor::reverse(table.get().entries)) {
    // TODO(tnachen): We assume there is only one docker container
    // running per container Id and no other mounts will have the
    // container Id name. We might need to revisit if this is no
    // longer true.
    if (strings::contains(entry.target, containerId.value())) {
      LOG(INFO) << "Unmounting volume for container '" << containerId
                << "'";

      Try<Nothing> unmount = fs::unmount(entry.target);
      if (unmount.isError()) {
        return Error("Failed to unmount volume '" + entry.target +
                     "': " + unmount.error());
      }
    }
  }
#endif // __linux__

  return Nothing();
}
{code}
We rely on {noformat}if (strings::contains(entry.target, containerId.value())){noformat} to discover persistent volume mounts. However, on some systems, if the slave's work_dir is under a bind mount and the parent of that bind mount is a 'shared' mount, the persistent volume mount will be propagated to another mount point. As a result, there will be two entries in the mount table that contain the 'containerId', and we try to unmount the volume once for each of them.
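To illustrate the propagation, here is a minimal sketch (needs root; the paths, the peer mount point and the placeholder container id are all hypothetical, not taken from the test environment). It mirrors what {{mount --bind}} and {{mount --make-shared}} do via the mount(2) syscall:
{code}
// Minimal sketch (hypothetical paths, run as root) of the setup that makes a
// persistent volume mount show up twice in the mount table.
#include <sys/mount.h>

#include <cstdio>
#include <cstdlib>

static void check(int ret, const char* what)
{
  if (ret != 0) {
    perror(what);
    exit(EXIT_FAILURE);
  }
}

int main()
{
  // '/mnt/parent' is bind-mounted onto itself and marked shared, and
  // '/mnt/peer' is a bind mount of it, so the two form a peer group.
  check(mount("/mnt/parent", "/mnt/parent", nullptr, MS_BIND, nullptr),
        "bind /mnt/parent");
  check(mount(nullptr, "/mnt/parent", nullptr, MS_SHARED, nullptr),
        "make-shared /mnt/parent");
  check(mount("/mnt/parent", "/mnt/peer", nullptr, MS_BIND, nullptr),
        "bind /mnt/peer");

  // If the slave's work_dir lives under '/mnt/parent', mounting a persistent
  // volume into a container's run directory ('<containerId>' is a
  // placeholder) ...
  check(mount("/data/volumes/role1/id1",
              "/mnt/parent/work_dir/runs/<containerId>/path1",
              nullptr, MS_BIND, nullptr),
        "bind volume");

  // ... is propagated to the peer, so '/proc/self/mountinfo' now also lists
  // '/mnt/peer/work_dir/runs/<containerId>/path1'. A substring match on the
  // containerId therefore finds (and tries to unmount) both entries.
  return EXIT_SUCCESS;
}
{code}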
There are two issues:
1) We should make unmountPersistentVolumes more robust. One simple fix is to check whether 'entry.target' is under the slave's work_dir (see the sketch after this list).
2) Ideally, we should do the same as we did in LinuxFilesystemIsolator and make the slave's work_dir a slave+shared mount. I'll add a TODO.
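For (1), a minimal sketch of what that check could look like, assuming {{flags.work_dir}} (the slave's --work_dir) is reachable in this scope as it is elsewhere in the docker containerizer; this is only an illustration of the idea, not the actual patch:
{code}
  foreach (const fs::MountInfoTable::Entry& entry,
           adaptor::reverse(table.get().entries)) {
    // Only consider mounts under the slave's work_dir; copies of the volume
    // mount that were propagated through a shared parent mount live outside
    // the work_dir and are skipped.
    if (!strings::startsWith(entry.target, flags.work_dir)) {
      continue;
    }

    if (strings::contains(entry.target, containerId.value())) {
      LOG(INFO) << "Unmounting volume for container '" << containerId << "'";

      Try<Nothing> unmount = fs::unmount(entry.target);
      if (unmount.isError()) {
        return Error("Failed to unmount volume '" + entry.target +
                     "': " + unmount.error());
      }
    }
  }
{code}
For (2), making the work_dir a slave+shared mount (as LinuxFilesystemIsolator does, typically by self-bind-mounting the directory and then applying make-slave and make-shared) would keep mounts made underneath it from propagating out through the parent's peer group in the first place; as noted above, that part is left as a TODO here.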
> DockerContainerizerTest.ROOT_DOCKER_RecoverOrphanedPersistentVolumes exits
> when the /tmp directory is bind-mounted
> ------------------------------------------------------------------------------------------------------------------
>
> Key: MESOS-4832
> URL: https://issues.apache.org/jira/browse/MESOS-4832
> Project: Mesos
> Issue Type: Bug
> Components: containerization, docker
> Affects Versions: 0.27.0
> Environment: Seen on CentOS 7 & Debian 8.
> Reporter: Joseph Wu
> Assignee: Jie Yu
> Labels: mesosphere, test
> Fix For: 0.28.0
>
>
> If the {{/tmp}} directory (where Mesos tests create temporary directories) is
> a bind mount, the test suite will exit here:
> {code}
> [ RUN ]
> DockerContainerizerTest.ROOT_DOCKER_RecoverOrphanedPersistentVolumes
> I0226 03:17:26.722806 1097 leveldb.cpp:174] Opened db in 12.587676ms
> I0226 03:17:26.723496 1097 leveldb.cpp:181] Compacted db in 636999ns
> I0226 03:17:26.723536 1097 leveldb.cpp:196] Created db iterator in 18271ns
> I0226 03:17:26.723547 1097 leveldb.cpp:202] Seeked to beginning of db in
> 1555ns
> I0226 03:17:26.723554 1097 leveldb.cpp:271] Iterated through 0 keys in the
> db in 363ns
> I0226 03:17:26.723593 1097 replica.cpp:779] Replica recovered with log
> positions 0 -> 0 with 1 holes and 0 unlearned
> I0226 03:17:26.724128 1117 recover.cpp:447] Starting replica recovery
> I0226 03:17:26.724367 1117 recover.cpp:473] Replica is in EMPTY status
> I0226 03:17:26.725237 1117 replica.cpp:673] Replica in EMPTY status received
> a broadcasted recover request from (13810)@172.30.2.151:51934
> I0226 03:17:26.725744 1114 recover.cpp:193] Received a recover response from
> a replica in EMPTY status
> I0226 03:17:26.726356 1111 master.cpp:376] Master
> 5cc57c0e-f1ad-4107-893f-420ed1a1db1a (ip-172-30-2-151.mesosphere.io) started
> on 172.30.2.151:51934
> I0226 03:17:26.726369 1118 recover.cpp:564] Updating replica status to
> STARTING
> I0226 03:17:26.726378 1111 master.cpp:378] Flags at startup: --acls=""
> --allocation_interval="1secs" --allocator="HierarchicalDRF"
> --authenticate="true" --authenticate_http="true" --authenticate_slaves="true"
> --authenticators="crammd5" --authorizers="local"
> --credentials="/tmp/djHTVQ/credentials" --framework_sorter="drf"
> --help="false" --hostname_lookup="true" --http_authenticators="basic"
> --initialize_driver_logging="true" --log_auto_initialize="true"
> --logbufsecs="0" --logging_level="INFO" --max_completed_frameworks="50"
> --max_completed_tasks_per_framework="1000" --max_slave_ping_timeouts="5"
> --quiet="false" --recovery_slave_removal_limit="100%"
> --registry="replicated_log" --registry_fetch_timeout="1mins"
> --registry_store_timeout="100secs" --registry_strict="true"
> --root_submissions="true" --slave_ping_timeout="15secs"
> --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false"
> --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/djHTVQ/master"
> --zk_session_timeout="10secs"
> I0226 03:17:26.726605 1111 master.cpp:423] Master only allowing
> authenticated frameworks to register
> I0226 03:17:26.726616 1111 master.cpp:428] Master only allowing
> authenticated slaves to register
> I0226 03:17:26.726632 1111 credentials.hpp:35] Loading credentials for
> authentication from '/tmp/djHTVQ/credentials'
> I0226 03:17:26.726860 1111 master.cpp:468] Using default 'crammd5'
> authenticator
> I0226 03:17:26.726977 1111 master.cpp:537] Using default 'basic' HTTP
> authenticator
> I0226 03:17:26.727092 1111 master.cpp:571] Authorization enabled
> I0226 03:17:26.727243 1118 hierarchical.cpp:144] Initialized hierarchical
> allocator process
> I0226 03:17:26.727285 1116 whitelist_watcher.cpp:77] No whitelist given
> I0226 03:17:26.728852 1114 master.cpp:1712] The newly elected leader is
> [email protected]:51934 with id 5cc57c0e-f1ad-4107-893f-420ed1a1db1a
> I0226 03:17:26.728876 1114 master.cpp:1725] Elected as the leading master!
> I0226 03:17:26.728891 1114 master.cpp:1470] Recovering from registrar
> I0226 03:17:26.728977 1117 registrar.cpp:307] Recovering registrar
> I0226 03:17:26.731503 1112 leveldb.cpp:304] Persisting metadata (8 bytes) to
> leveldb took 4.977811ms
> I0226 03:17:26.731539 1112 replica.cpp:320] Persisted replica status to
> STARTING
> I0226 03:17:26.731711 1111 recover.cpp:473] Replica is in STARTING status
> I0226 03:17:26.732501 1114 replica.cpp:673] Replica in STARTING status
> received a broadcasted recover request from (13812)@172.30.2.151:51934
> I0226 03:17:26.732862 1111 recover.cpp:193] Received a recover response from
> a replica in STARTING status
> I0226 03:17:26.733264 1117 recover.cpp:564] Updating replica status to VOTING
> I0226 03:17:26.733836 1118 leveldb.cpp:304] Persisting metadata (8 bytes) to
> leveldb took 388246ns
> I0226 03:17:26.733855 1118 replica.cpp:320] Persisted replica status to
> VOTING
> I0226 03:17:26.733979 1113 recover.cpp:578] Successfully joined the Paxos
> group
> I0226 03:17:26.734149 1113 recover.cpp:462] Recover process terminated
> I0226 03:17:26.734478 1111 log.cpp:659] Attempting to start the writer
> I0226 03:17:26.735523 1114 replica.cpp:493] Replica received implicit
> promise request from (13813)@172.30.2.151:51934 with proposal 1
> I0226 03:17:26.736130 1114 leveldb.cpp:304] Persisting metadata (8 bytes) to
> leveldb took 576451ns
> I0226 03:17:26.736150 1114 replica.cpp:342] Persisted promised to 1
> I0226 03:17:26.736709 1115 coordinator.cpp:238] Coordinator attempting to
> fill missing positions
> I0226 03:17:26.737771 1114 replica.cpp:388] Replica received explicit
> promise request from (13814)@172.30.2.151:51934 for position 0 with proposal 2
> I0226 03:17:26.738386 1114 leveldb.cpp:341] Persisting action (8 bytes) to
> leveldb took 583184ns
> I0226 03:17:26.738404 1114 replica.cpp:712] Persisted action at 0
> I0226 03:17:26.739312 1118 replica.cpp:537] Replica received write request
> for position 0 from (13815)@172.30.2.151:51934
> I0226 03:17:26.739367 1118 leveldb.cpp:436] Reading position from leveldb
> took 26157ns
> I0226 03:17:26.740638 1118 leveldb.cpp:341] Persisting action (14 bytes) to
> leveldb took 1.238477ms
> I0226 03:17:26.740669 1118 replica.cpp:712] Persisted action at 0
> I0226 03:17:26.741158 1118 replica.cpp:691] Replica received learned notice
> for position 0 from @0.0.0.0:0
> I0226 03:17:26.742878 1118 leveldb.cpp:341] Persisting action (16 bytes) to
> leveldb took 1.697254ms
> I0226 03:17:26.742902 1118 replica.cpp:712] Persisted action at 0
> I0226 03:17:26.742916 1118 replica.cpp:697] Replica learned NOP action at
> position 0
> I0226 03:17:26.743393 1117 log.cpp:675] Writer started with ending position 0
> I0226 03:17:26.744370 1112 leveldb.cpp:436] Reading position from leveldb
> took 34329ns
> I0226 03:17:26.745240 1117 registrar.cpp:340] Successfully fetched the
> registry (0B) in 16.21888ms
> I0226 03:17:26.745350 1117 registrar.cpp:439] Applied 1 operations in
> 30460ns; attempting to update the 'registry'
> I0226 03:17:26.746016 1111 log.cpp:683] Attempting to append 210 bytes to
> the log
> I0226 03:17:26.746119 1116 coordinator.cpp:348] Coordinator attempting to
> write APPEND action at position 1
> I0226 03:17:26.746798 1114 replica.cpp:537] Replica received write request
> for position 1 from (13816)@172.30.2.151:51934
> I0226 03:17:26.747251 1114 leveldb.cpp:341] Persisting action (229 bytes) to
> leveldb took 411333ns
> I0226 03:17:26.747269 1114 replica.cpp:712] Persisted action at 1
> I0226 03:17:26.747808 1113 replica.cpp:691] Replica received learned notice
> for position 1 from @0.0.0.0:0
> I0226 03:17:26.749511 1113 leveldb.cpp:341] Persisting action (231 bytes) to
> leveldb took 1.673488ms
> I0226 03:17:26.749534 1113 replica.cpp:712] Persisted action at 1
> I0226 03:17:26.749550 1113 replica.cpp:697] Replica learned APPEND action at
> position 1
> I0226 03:17:26.750422 1111 registrar.cpp:484] Successfully updated the
> 'registry' in 5.021952ms
> I0226 03:17:26.750560 1111 registrar.cpp:370] Successfully recovered
> registrar
> I0226 03:17:26.750635 1112 log.cpp:702] Attempting to truncate the log to 1
> I0226 03:17:26.750751 1113 coordinator.cpp:348] Coordinator attempting to
> write TRUNCATE action at position 2
> I0226 03:17:26.751096 1116 master.cpp:1522] Recovered 0 slaves from the
> Registry (171B) ; allowing 10mins for slaves to re-register
> I0226 03:17:26.751126 1111 hierarchical.cpp:171] Skipping recovery of
> hierarchical allocator: nothing to recover
> I0226 03:17:26.751561 1118 replica.cpp:537] Replica received write request
> for position 2 from (13817)@172.30.2.151:51934
> I0226 03:17:26.751999 1118 leveldb.cpp:341] Persisting action (16 bytes) to
> leveldb took 406823ns
> I0226 03:17:26.752018 1118 replica.cpp:712] Persisted action at 2
> I0226 03:17:26.752521 1113 replica.cpp:691] Replica received learned notice
> for position 2 from @0.0.0.0:0
> I0226 03:17:26.754161 1113 leveldb.cpp:341] Persisting action (18 bytes) to
> leveldb took 1.614888ms
> I0226 03:17:26.754210 1113 leveldb.cpp:399] Deleting ~1 keys from leveldb
> took 26384ns
> I0226 03:17:26.754225 1113 replica.cpp:712] Persisted action at 2
> I0226 03:17:26.754240 1113 replica.cpp:697] Replica learned TRUNCATE action
> at position 2
> I0226 03:17:26.765103 1115 slave.cpp:193] Slave started on
> 399)@172.30.2.151:51934
> I0226 03:17:26.765130 1115 slave.cpp:194] Flags at startup:
> --appc_simple_discovery_uri_prefix="http://"
> --appc_store_dir="/tmp/mesos/store/appc" --authenticatee="crammd5"
> --cgroups_cpu_enable_pids_and_tids_count="false" --cgroups_enable_cfs="false"
> --cgroups_hierarchy="/sys/fs/cgroup" --cgroups_limit_swap="false"
> --cgroups_root="mesos" --container_disk_watch_interval="15secs"
> --containerizers="mesos"
> --credential="/tmp/DockerContainerizerTest_ROOT_DOCKER_RecoverOrphanedPersistentVolumes_aJOesP/credential"
> --default_role="*" --disk_watch_interval="1mins" --docker="docker"
> --docker_kill_orphans="true" --docker_registry="https://registry-1.docker.io"
> --docker_remove_delay="6hrs" --docker_socket="/var/run/docker.sock"
> --docker_stop_timeout="0ns" --docker_store_dir="/tmp/mesos/store/docker"
> --enforce_container_disk_quota="false"
> --executor_registration_timeout="1mins"
> --executor_shutdown_grace_period="5secs"
> --fetcher_cache_dir="/tmp/DockerContainerizerTest_ROOT_DOCKER_RecoverOrphanedPersistentVolumes_aJOesP/fetch"
> --fetcher_cache_size="2GB" --frameworks_home="" --gc_delay="1weeks"
> --gc_disk_headroom="0.1" --hadoop_home="" --help="false"
> --hostname_lookup="true" --image_provisioner_backend="copy"
> --initialize_driver_logging="true" --isolation="posix/cpu,posix/mem"
> --launcher_dir="/mnt/teamcity/work/4240ba9ddd0997c3/build/src"
> --logbufsecs="0" --logging_level="INFO"
> --oversubscribed_resources_interval="15secs" --perf_duration="10secs"
> --perf_interval="1mins" --qos_correction_interval_min="0ns" --quiet="false"
> --recover="reconnect" --recovery_timeout="15mins"
> --registration_backoff_factor="10ms"
> --resources="cpu:2;mem:2048;disk(role1):2048"
> --revocable_cpu_low_priority="true" --sandbox_directory="/mnt/mesos/sandbox"
> --strict="true" --switch_user="true" --systemd_enable_support="true"
> --systemd_runtime_directory="/run/systemd/system" --version="false"
> --work_dir="/tmp/DockerContainerizerTest_ROOT_DOCKER_RecoverOrphanedPersistentVolumes_aJOesP"
> I0226 03:17:26.765403 1115 credentials.hpp:83] Loading credential for
> authentication from
> '/tmp/DockerContainerizerTest_ROOT_DOCKER_RecoverOrphanedPersistentVolumes_aJOesP/credential'
> I0226 03:17:26.765573 1115 slave.cpp:324] Slave using credential for:
> test-principal
> I0226 03:17:26.765733 1115 resources.cpp:576] Parsing resources as JSON
> failed: cpu:2;mem:2048;disk(role1):2048
> Trying semicolon-delimited string format instead
> I0226 03:17:26.766185 1115 slave.cpp:464] Slave resources: cpu(*):2;
> mem(*):2048; disk(role1):2048; cpus(*):8; ports(*):[31000-32000]
> I0226 03:17:26.766242 1115 slave.cpp:472] Slave attributes: [ ]
> I0226 03:17:26.766250 1115 slave.cpp:477] Slave hostname:
> ip-172-30-2-151.mesosphere.io
> I0226 03:17:26.767325 1097 sched.cpp:222] Version: 0.28.0
> I0226 03:17:26.767390 1111 state.cpp:58] Recovering state from
> '/tmp/DockerContainerizerTest_ROOT_DOCKER_RecoverOrphanedPersistentVolumes_aJOesP/meta'
> I0226 03:17:26.767603 1115 status_update_manager.cpp:200] Recovering status
> update manager
> I0226 03:17:26.767865 1113 docker.cpp:726] Recovering Docker containers
> I0226 03:17:26.767971 1111 sched.cpp:326] New master detected at
> [email protected]:51934
> I0226 03:17:26.768045 1111 sched.cpp:382] Authenticating with master
> [email protected]:51934
> I0226 03:17:26.768059 1111 sched.cpp:389] Using default CRAM-MD5
> authenticatee
> I0226 03:17:26.768070 1118 slave.cpp:4565] Finished recovery
> I0226 03:17:26.768273 1112 authenticatee.cpp:121] Creating new client SASL
> connection
> I0226 03:17:26.768435 1118 slave.cpp:4737] Querying resource estimator for
> oversubscribable resources
> I0226 03:17:26.768565 1111 master.cpp:5526] Authenticating
> [email protected]:51934
> I0226 03:17:26.768661 1118 slave.cpp:796] New master detected at
> [email protected]:51934
> I0226 03:17:26.768659 1115 authenticator.cpp:413] Starting authentication
> session for crammd5_authenticatee(839)@172.30.2.151:51934
> I0226 03:17:26.768679 1113 status_update_manager.cpp:174] Pausing sending
> status updates
> I0226 03:17:26.768728 1118 slave.cpp:859] Authenticating with master
> [email protected]:51934
> I0226 03:17:26.768743 1118 slave.cpp:864] Using default CRAM-MD5
> authenticatee
> I0226 03:17:26.768865 1118 slave.cpp:832] Detecting new master
> I0226 03:17:26.768868 1112 authenticator.cpp:98] Creating new server SASL
> connection
> I0226 03:17:26.768908 1114 authenticatee.cpp:121] Creating new client SASL
> connection
> I0226 03:17:26.769003 1118 slave.cpp:4751] Received oversubscribable
> resources from the resource estimator
> I0226 03:17:26.769103 1115 authenticatee.cpp:212] Received SASL
> authentication mechanisms: CRAM-MD5
> I0226 03:17:26.769131 1115 authenticatee.cpp:238] Attempting to authenticate
> with mechanism 'CRAM-MD5'
> I0226 03:17:26.769209 1116 master.cpp:5526] Authenticating
> slave(399)@172.30.2.151:51934
> I0226 03:17:26.769253 1114 authenticator.cpp:203] Received SASL
> authentication start
> I0226 03:17:26.769295 1115 authenticator.cpp:413] Starting authentication
> session for crammd5_authenticatee(840)@172.30.2.151:51934
> I0226 03:17:26.769307 1114 authenticator.cpp:325] Authentication requires
> more steps
> I0226 03:17:26.769403 1117 authenticatee.cpp:258] Received SASL
> authentication step
> I0226 03:17:26.769495 1114 authenticator.cpp:98] Creating new server SASL
> connection
> I0226 03:17:26.769531 1115 authenticator.cpp:231] Received SASL
> authentication step
> I0226 03:17:26.769554 1115 auxprop.cpp:107] Request to lookup properties for
> user: 'test-principal' realm: 'ip-172-30-2-151.mesosphere.io' server FQDN:
> 'ip-172-30-2-151.mesosphere.io' SASL_AUXPROP_VERIFY_AGAINST_HASH: false
> SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: false
> I0226 03:17:26.769562 1115 auxprop.cpp:179] Looking up auxiliary property
> '*userPassword'
> I0226 03:17:26.769608 1115 auxprop.cpp:179] Looking up auxiliary property
> '*cmusaslsecretCRAM-MD5'
> I0226 03:17:26.769629 1115 auxprop.cpp:107] Request to lookup properties for
> user: 'test-principal' realm: 'ip-172-30-2-151.mesosphere.io' server FQDN:
> 'ip-172-30-2-151.mesosphere.io' SASL_AUXPROP_VERIFY_AGAINST_HASH: false
> SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: true
> I0226 03:17:26.769637 1115 auxprop.cpp:129] Skipping auxiliary property
> '*userPassword' since SASL_AUXPROP_AUTHZID == true
> I0226 03:17:26.769642 1115 auxprop.cpp:129] Skipping auxiliary property
> '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true
> I0226 03:17:26.769654 1115 authenticator.cpp:317] Authentication success
> I0226 03:17:26.769728 1117 authenticatee.cpp:298] Authentication success
> I0226 03:17:26.769769 1112 authenticatee.cpp:212] Received SASL
> authentication mechanisms: CRAM-MD5
> I0226 03:17:26.769767 1118 master.cpp:5556] Successfully authenticated
> principal 'test-principal' at
> [email protected]:51934
> I0226 03:17:26.769803 1112 authenticatee.cpp:238] Attempting to authenticate
> with mechanism 'CRAM-MD5'
> I0226 03:17:26.769798 1114 authenticator.cpp:431] Authentication session
> cleanup for crammd5_authenticatee(839)@172.30.2.151:51934
> I0226 03:17:26.769881 1112 authenticator.cpp:203] Received SASL
> authentication start
> I0226 03:17:26.769932 1112 authenticator.cpp:325] Authentication requires
> more steps
> I0226 03:17:26.769981 1117 sched.cpp:471] Successfully authenticated with
> master [email protected]:51934
> I0226 03:17:26.770004 1117 sched.cpp:776] Sending SUBSCRIBE call to
> [email protected]:51934
> I0226 03:17:26.770064 1118 authenticatee.cpp:258] Received SASL
> authentication step
> I0226 03:17:26.770102 1117 sched.cpp:809] Will retry registration in
> 1.937819802secs if necessary
> I0226 03:17:26.770165 1115 authenticator.cpp:231] Received SASL
> authentication step
> I0226 03:17:26.770193 1115 auxprop.cpp:107] Request to lookup properties for
> user: 'test-principal' realm: 'ip-172-30-2-151.mesosphere.io' server FQDN:
> 'ip-172-30-2-151.mesosphere.io' SASL_AUXPROP_VERIFY_AGAINST_HASH: false
> SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: false
> I0226 03:17:26.770207 1115 auxprop.cpp:179] Looking up auxiliary property
> '*userPassword'
> I0226 03:17:26.770213 1116 master.cpp:2280] Received SUBSCRIBE call for
> framework 'default' at
> [email protected]:51934
> I0226 03:17:26.770241 1115 auxprop.cpp:179] Looking up auxiliary property
> '*cmusaslsecretCRAM-MD5'
> I0226 03:17:26.770274 1115 auxprop.cpp:107] Request to lookup properties for
> user: 'test-principal' realm: 'ip-172-30-2-151.mesosphere.io' server FQDN:
> 'ip-172-30-2-151.mesosphere.io' SASL_AUXPROP_VERIFY_AGAINST_HASH: false
> SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: true
> I0226 03:17:26.770277 1116 master.cpp:1751] Authorizing framework principal
> 'test-principal' to receive offers for role 'role1'
> I0226 03:17:26.770298 1115 auxprop.cpp:129] Skipping auxiliary property
> '*userPassword' since SASL_AUXPROP_AUTHZID == true
> I0226 03:17:26.770331 1115 auxprop.cpp:129] Skipping auxiliary property
> '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true
> I0226 03:17:26.770349 1115 authenticator.cpp:317] Authentication success
> I0226 03:17:26.770428 1118 authenticatee.cpp:298] Authentication success
> I0226 03:17:26.770442 1116 master.cpp:5556] Successfully authenticated
> principal 'test-principal' at slave(399)@172.30.2.151:51934
> I0226 03:17:26.770547 1116 authenticator.cpp:431] Authentication session
> cleanup for crammd5_authenticatee(840)@172.30.2.151:51934
> I0226 03:17:26.770846 1116 master.cpp:2351] Subscribing framework default
> with checkpointing enabled and capabilities [ ]
> I0226 03:17:26.770866 1118 slave.cpp:927] Successfully authenticated with
> master [email protected]:51934
> I0226 03:17:26.770966 1118 slave.cpp:1321] Will retry registration in
> 1.453415ms if necessary
> I0226 03:17:26.771225 1115 hierarchical.cpp:265] Added framework
> 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-0000
> I0226 03:17:26.771275 1118 sched.cpp:703] Framework registered with
> 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-0000
> I0226 03:17:26.771299 1115 hierarchical.cpp:1434] No resources available to
> allocate!
> I0226 03:17:26.771328 1115 hierarchical.cpp:1529] No inverse offers to send
> out!
> I0226 03:17:26.771344 1118 sched.cpp:717] Scheduler::registered took 50146ns
> I0226 03:17:26.771356 1116 master.cpp:4240] Registering slave at
> slave(399)@172.30.2.151:51934 (ip-172-30-2-151.mesosphere.io) with id
> 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-S0
> I0226 03:17:26.771348 1115 hierarchical.cpp:1127] Performed allocation for 0
> slaves in 101438ns
> I0226 03:17:26.771860 1114 registrar.cpp:439] Applied 1 operations in
> 59672ns; attempting to update the 'registry'
> I0226 03:17:26.772645 1117 log.cpp:683] Attempting to append 423 bytes to
> the log
> I0226 03:17:26.772758 1112 coordinator.cpp:348] Coordinator attempting to
> write APPEND action at position 3
> I0226 03:17:26.773435 1117 replica.cpp:537] Replica received write request
> for position 3 from (13824)@172.30.2.151:51934
> I0226 03:17:26.773586 1111 slave.cpp:1321] Will retry registration in
> 2.74261ms if necessary
> I0226 03:17:26.773682 1115 master.cpp:4228] Ignoring register slave message
> from slave(399)@172.30.2.151:51934 (ip-172-30-2-151.mesosphere.io) as
> admission is already in progress
> I0226 03:17:26.773937 1117 leveldb.cpp:341] Persisting action (442 bytes) to
> leveldb took 469969ns
> I0226 03:17:26.773957 1117 replica.cpp:712] Persisted action at 3
> I0226 03:17:26.774605 1114 replica.cpp:691] Replica received learned notice
> for position 3 from @0.0.0.0:0
> I0226 03:17:26.775961 1114 leveldb.cpp:341] Persisting action (444 bytes) to
> leveldb took 1.329435ms
> I0226 03:17:26.775986 1114 replica.cpp:712] Persisted action at 3
> I0226 03:17:26.776008 1114 replica.cpp:697] Replica learned APPEND action at
> position 3
> I0226 03:17:26.777228 1115 slave.cpp:1321] Will retry registration in
> 41.5608ms if necessary
> I0226 03:17:26.777300 1112 registrar.cpp:484] Successfully updated the
> 'registry' in 5.378048ms
> I0226 03:17:26.777361 1114 master.cpp:4228] Ignoring register slave message
> from slave(399)@172.30.2.151:51934 (ip-172-30-2-151.mesosphere.io) as
> admission is already in progress
> I0226 03:17:26.777505 1113 log.cpp:702] Attempting to truncate the log to 3
> I0226 03:17:26.777616 1111 coordinator.cpp:348] Coordinator attempting to
> write TRUNCATE action at position 4
> I0226 03:17:26.778062 1114 slave.cpp:3482] Received ping from
> slave-observer(369)@172.30.2.151:51934
> I0226 03:17:26.778139 1118 master.cpp:4308] Registered slave
> 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-S0 at slave(399)@172.30.2.151:51934
> (ip-172-30-2-151.mesosphere.io) with cpu(*):2; mem(*):2048; disk(role1):2048;
> cpus(*):8; ports(*):[31000-32000]
> I0226 03:17:26.778213 1113 replica.cpp:537] Replica received write request
> for position 4 from (13825)@172.30.2.151:51934
> I0226 03:17:26.778291 1114 slave.cpp:971] Registered with master
> [email protected]:51934; given slave ID
> 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-S0
> I0226 03:17:26.778316 1114 fetcher.cpp:81] Clearing fetcher cache
> I0226 03:17:26.778367 1116 hierarchical.cpp:473] Added slave
> 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-S0 (ip-172-30-2-151.mesosphere.io) with
> cpu(*):2; mem(*):2048; disk(role1):2048; cpus(*):8; ports(*):[31000-32000]
> (allocated: )
> I0226 03:17:26.778447 1117 status_update_manager.cpp:181] Resuming sending
> status updates
> I0226 03:17:26.778617 1113 leveldb.cpp:341] Persisting action (16 bytes) to
> leveldb took 375414ns
> I0226 03:17:26.778635 1113 replica.cpp:712] Persisted action at 4
> I0226 03:17:26.778650 1114 slave.cpp:994] Checkpointing SlaveInfo to
> '/tmp/DockerContainerizerTest_ROOT_DOCKER_RecoverOrphanedPersistentVolumes_aJOesP/meta/slaves/5cc57c0e-f1ad-4107-893f-420ed1a1db1a-S0/slave.info'
> I0226 03:17:26.778900 1114 slave.cpp:1030] Forwarding total oversubscribed
> resources
> I0226 03:17:26.779109 1114 master.cpp:4649] Received update of slave
> 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-S0 at slave(399)@172.30.2.151:51934
> (ip-172-30-2-151.mesosphere.io) with total oversubscribed resources
> I0226 03:17:26.779139 1112 replica.cpp:691] Replica received learned notice
> for position 4 from @0.0.0.0:0
> I0226 03:17:26.779331 1116 hierarchical.cpp:1529] No inverse offers to send
> out!
> I0226 03:17:26.779369 1116 hierarchical.cpp:1147] Performed allocation for
> slave 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-S0 in 969593ns
> I0226 03:17:26.779645 1113 master.cpp:5355] Sending 1 offers to framework
> 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-0000 (default) at
> [email protected]:51934
> I0226 03:17:26.779700 1116 hierarchical.cpp:531] Slave
> 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-S0 (ip-172-30-2-151.mesosphere.io)
> updated with oversubscribed resources (total: cpu(*):2; mem(*):2048;
> disk(role1):2048; cpus(*):8; ports(*):[31000-32000], allocated:
> disk(role1):2048; cpu(*):2; mem(*):2048; cpus(*):8; ports(*):[31000-32000])
> I0226 03:17:26.779819 1116 hierarchical.cpp:1434] No resources available to
> allocate!
> I0226 03:17:26.779847 1116 hierarchical.cpp:1529] No inverse offers to send
> out!
> I0226 03:17:26.779865 1116 hierarchical.cpp:1147] Performed allocation for
> slave 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-S0 in 133437ns
> I0226 03:17:26.780025 1118 sched.cpp:873] Scheduler::resourceOffers took
> 102165ns
> I0226 03:17:26.780372 1097 resources.cpp:576] Parsing resources as JSON
> failed: cpus:1;mem:64;
> Trying semicolon-delimited string format instead
> I0226 03:17:26.780882 1112 leveldb.cpp:341] Persisting action (18 bytes) to
> leveldb took 1.715066ms
> I0226 03:17:26.780938 1112 leveldb.cpp:399] Deleting ~2 keys from leveldb
> took 32370ns
> I0226 03:17:26.780953 1112 replica.cpp:712] Persisted action at 4
> I0226 03:17:26.780971 1112 replica.cpp:697] Replica learned TRUNCATE action
> at position 4
> I0226 03:17:26.781693 1117 master.cpp:3138] Processing ACCEPT call for
> offers: [ 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-O0 ] on slave
> 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-S0 at slave(399)@172.30.2.151:51934
> (ip-172-30-2-151.mesosphere.io) for framework
> 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-0000 (default) at
> [email protected]:51934
> I0226 03:17:26.781731 1117 master.cpp:2926] Authorizing principal
> 'test-principal' to create volumes
> I0226 03:17:26.781801 1117 master.cpp:2825] Authorizing framework principal
> 'test-principal' to launch task 1 as user 'root'
> I0226 03:17:26.782827 1114 master.cpp:3467] Applying CREATE operation for
> volumes disk(role1)[id1:path1]:64 from framework
> 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-0000 (default) at
> [email protected]:51934 to slave
> 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-S0 at slave(399)@172.30.2.151:51934
> (ip-172-30-2-151.mesosphere.io)
> I0226 03:17:26.783136 1114 master.cpp:6589] Sending checkpointed resources
> disk(role1)[id1:path1]:64 to slave 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-S0 at
> slave(399)@172.30.2.151:51934 (ip-172-30-2-151.mesosphere.io)
> I0226 03:17:26.783641 1111 slave.cpp:2341] Updated checkpointed resources
> from to disk(role1)[id1:path1]:64
> I0226 03:17:26.783911 1114 master.hpp:176] Adding task 1 with resources
> cpus(*):1; mem(*):64; disk(role1)[id1:path1]:64 on slave
> 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-S0 (ip-172-30-2-151.mesosphere.io)
> I0226 03:17:26.784056 1114 master.cpp:3623] Launching task 1 of framework
> 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-0000 (default) at
> [email protected]:51934 with
> resources cpus(*):1; mem(*):64; disk(role1)[id1:path1]:64 on slave
> 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-S0 at slave(399)@172.30.2.151:51934
> (ip-172-30-2-151.mesosphere.io)
> I0226 03:17:26.784397 1115 slave.cpp:1361] Got assigned task 1 for framework
> 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-0000
> I0226 03:17:26.784557 1115 slave.cpp:5287] Checkpointing FrameworkInfo to
> '/tmp/DockerContainerizerTest_ROOT_DOCKER_RecoverOrphanedPersistentVolumes_aJOesP/meta/slaves/5cc57c0e-f1ad-4107-893f-420ed1a1db1a-S0/frameworks/5cc57c0e-f1ad-4107-893f-420ed1a1db1a-0000/framework.info'
> I0226 03:17:26.784739 1116 hierarchical.cpp:653] Updated allocation of
> framework 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-0000 on slave
> 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-S0 from disk(role1):2048; cpu(*):2;
> mem(*):2048; cpus(*):8; ports(*):[31000-32000] to disk(role1):1984; cpu(*):2;
> mem(*):2048; cpus(*):8; ports(*):[31000-32000]; disk(role1)[id1:path1]:64
> I0226 03:17:26.784848 1115 slave.cpp:5298] Checkpointing framework pid
> '[email protected]:51934' to
> '/tmp/DockerContainerizerTest_ROOT_DOCKER_RecoverOrphanedPersistentVolumes_aJOesP/meta/slaves/5cc57c0e-f1ad-4107-893f-420ed1a1db1a-S0/frameworks/5cc57c0e-f1ad-4107-893f-420ed1a1db1a-0000/framework.pid'
> I0226 03:17:26.785078 1115 resources.cpp:576] Parsing resources as JSON
> failed: cpus:0.1;mem:32
> Trying semicolon-delimited string format instead
> I0226 03:17:26.785322 1116 hierarchical.cpp:892] Recovered disk(role1):1984;
> cpu(*):2; mem(*):1984; cpus(*):7; ports(*):[31000-32000] (total: cpu(*):2;
> mem(*):2048; disk(role1):1984; cpus(*):8; ports(*):[31000-32000];
> disk(role1)[id1:path1]:64, allocated: disk(role1)[id1:path1]:64; cpus(*):1;
> mem(*):64) on slave 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-S0 from framework
> 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-0000
> I0226 03:17:26.785658 1115 slave.cpp:1480] Launching task 1 for framework
> 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-0000
> I0226 03:17:26.785719 1115 resources.cpp:576] Parsing resources as JSON
> failed: cpus:0.1;mem:32
> Trying semicolon-delimited string format instead
> I0226 03:17:26.786197 1115 paths.cpp:474] Trying to chown
> '/tmp/DockerContainerizerTest_ROOT_DOCKER_RecoverOrphanedPersistentVolumes_aJOesP/slaves/5cc57c0e-f1ad-4107-893f-420ed1a1db1a-S0/frameworks/5cc57c0e-f1ad-4107-893f-420ed1a1db1a-0000/executors/1/runs/bcc90102-163d-4ff6-a3fc-a1b2e3fc3b7c'
> to user 'root'
> I0226 03:17:26.791122 1115 slave.cpp:5739] Checkpointing ExecutorInfo to
> '/tmp/DockerContainerizerTest_ROOT_DOCKER_RecoverOrphanedPersistentVolumes_aJOesP/meta/slaves/5cc57c0e-f1ad-4107-893f-420ed1a1db1a-S0/frameworks/5cc57c0e-f1ad-4107-893f-420ed1a1db1a-0000/executors/1/executor.info'
> I0226 03:17:26.791543 1115 slave.cpp:5367] Launching executor 1 of framework
> 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-0000 with resources cpus(*):0.1;
> mem(*):32 in work directory
> '/tmp/DockerContainerizerTest_ROOT_DOCKER_RecoverOrphanedPersistentVolumes_aJOesP/slaves/5cc57c0e-f1ad-4107-893f-420ed1a1db1a-S0/frameworks/5cc57c0e-f1ad-4107-893f-420ed1a1db1a-0000/executors/1/runs/bcc90102-163d-4ff6-a3fc-a1b2e3fc3b7c'
> I0226 03:17:26.792325 1115 slave.cpp:5762] Checkpointing TaskInfo to
> '/tmp/DockerContainerizerTest_ROOT_DOCKER_RecoverOrphanedPersistentVolumes_aJOesP/meta/slaves/5cc57c0e-f1ad-4107-893f-420ed1a1db1a-S0/frameworks/5cc57c0e-f1ad-4107-893f-420ed1a1db1a-0000/executors/1/runs/bcc90102-163d-4ff6-a3fc-a1b2e3fc3b7c/tasks/1/task.info'
> I0226 03:17:26.794337 1115 slave.cpp:1698] Queuing task '1' for executor '1'
> of framework 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-0000
> I0226 03:17:26.794478 1115 slave.cpp:749] Successfully attached file
> '/tmp/DockerContainerizerTest_ROOT_DOCKER_RecoverOrphanedPersistentVolumes_aJOesP/slaves/5cc57c0e-f1ad-4107-893f-420ed1a1db1a-S0/frameworks/5cc57c0e-f1ad-4107-893f-420ed1a1db1a-0000/executors/1/runs/bcc90102-163d-4ff6-a3fc-a1b2e3fc3b7c'
> I0226 03:17:26.797106 1116 docker.cpp:1023] Starting container
> 'bcc90102-163d-4ff6-a3fc-a1b2e3fc3b7c' for task '1' (and executor '1') of
> framework '5cc57c0e-f1ad-4107-893f-420ed1a1db1a-0000'
> I0226 03:17:26.797462 1116 docker.cpp:1053] Running docker -H
> unix:///var/run/docker.sock inspect alpine:latest
> I0226 03:17:26.910549 1111 docker.cpp:394] Docker pull alpine completed
> I0226 03:17:26.910800 1111 docker.cpp:483] Changing the ownership of the
> persistent volume at
> '/tmp/DockerContainerizerTest_ROOT_DOCKER_RecoverOrphanedPersistentVolumes_aJOesP/volumes/roles/role1/id1'
> with uid 0 and gid 0
> I0226 03:17:26.915712 1111 docker.cpp:504] Mounting
> '/tmp/DockerContainerizerTest_ROOT_DOCKER_RecoverOrphanedPersistentVolumes_aJOesP/volumes/roles/role1/id1'
> to
> '/tmp/DockerContainerizerTest_ROOT_DOCKER_RecoverOrphanedPersistentVolumes_aJOesP/slaves/5cc57c0e-f1ad-4107-893f-420ed1a1db1a-S0/frameworks/5cc57c0e-f1ad-4107-893f-420ed1a1db1a-0000/executors/1/runs/bcc90102-163d-4ff6-a3fc-a1b2e3fc3b7c/path1'
> for persistent volume disk(role1)[id1:path1]:64 of container
> bcc90102-163d-4ff6-a3fc-a1b2e3fc3b7c
> I0226 03:17:26.919000 1117 docker.cpp:576] Checkpointing pid 9568 to
> '/tmp/DockerContainerizerTest_ROOT_DOCKER_RecoverOrphanedPersistentVolumes_aJOesP/meta/slaves/5cc57c0e-f1ad-4107-893f-420ed1a1db1a-S0/frameworks/5cc57c0e-f1ad-4107-893f-420ed1a1db1a-0000/executors/1/runs/bcc90102-163d-4ff6-a3fc-a1b2e3fc3b7c/pids/forked.pid'
> I0226 03:17:26.974776 1114 slave.cpp:2643] Got registration for executor '1'
> of framework 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-0000 from
> executor(1)@172.30.2.151:46052
> I0226 03:17:26.975217 1114 slave.cpp:2729] Checkpointing executor pid
> 'executor(1)@172.30.2.151:46052' to
> '/tmp/DockerContainerizerTest_ROOT_DOCKER_RecoverOrphanedPersistentVolumes_aJOesP/meta/slaves/5cc57c0e-f1ad-4107-893f-420ed1a1db1a-S0/frameworks/5cc57c0e-f1ad-4107-893f-420ed1a1db1a-0000/executors/1/runs/bcc90102-163d-4ff6-a3fc-a1b2e3fc3b7c/pids/libprocess.pid'
> I0226 03:17:26.976177 1113 docker.cpp:1303] Ignoring updating container
> 'bcc90102-163d-4ff6-a3fc-a1b2e3fc3b7c' with resources passed to update is
> identical to existing resources
> I0226 03:17:26.976492 1115 slave.cpp:1863] Sending queued task '1' to
> executor '1' of framework 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-0000 at
> executor(1)@172.30.2.151:46052
> I0226 03:17:27.691769 1111 slave.cpp:3002] Handling status update
> TASK_RUNNING (UUID: 9f75a4e5-9ff4-4ca9-8623-8b2574796229) for task 1 of
> framework 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-0000 from
> executor(1)@172.30.2.151:46052
> I0226 03:17:27.692291 1116 status_update_manager.cpp:320] Received status
> update TASK_RUNNING (UUID: 9f75a4e5-9ff4-4ca9-8623-8b2574796229) for task 1
> of framework 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-0000
> I0226 03:17:27.692327 1116 status_update_manager.cpp:497] Creating
> StatusUpdate stream for task 1 of framework
> 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-0000
> I0226 03:17:27.692773 1116 status_update_manager.cpp:824] Checkpointing
> UPDATE for status update TASK_RUNNING (UUID:
> 9f75a4e5-9ff4-4ca9-8623-8b2574796229) for task 1 of framework
> 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-0000
> I0226 03:17:27.700090 1116 status_update_manager.cpp:374] Forwarding update
> TASK_RUNNING (UUID: 9f75a4e5-9ff4-4ca9-8623-8b2574796229) for task 1 of
> framework 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-0000 to the slave
> I0226 03:17:27.700389 1113 slave.cpp:3400] Forwarding the update
> TASK_RUNNING (UUID: 9f75a4e5-9ff4-4ca9-8623-8b2574796229) for task 1 of
> framework 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-0000 to
> [email protected]:51934
> I0226 03:17:27.700606 1113 slave.cpp:3294] Status update manager
> successfully handled status update TASK_RUNNING (UUID:
> 9f75a4e5-9ff4-4ca9-8623-8b2574796229) for task 1 of framework
> 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-0000
> I0226 03:17:27.700644 1113 slave.cpp:3310] Sending acknowledgement for
> status update TASK_RUNNING (UUID: 9f75a4e5-9ff4-4ca9-8623-8b2574796229) for
> task 1 of framework 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-0000 to
> executor(1)@172.30.2.151:46052
> I0226 03:17:27.700742 1117 master.cpp:4794] Status update TASK_RUNNING
> (UUID: 9f75a4e5-9ff4-4ca9-8623-8b2574796229) for task 1 of framework
> 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-0000 from slave
> 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-S0 at slave(399)@172.30.2.151:51934
> (ip-172-30-2-151.mesosphere.io)
> I0226 03:17:27.700775 1117 master.cpp:4842] Forwarding status update
> TASK_RUNNING (UUID: 9f75a4e5-9ff4-4ca9-8623-8b2574796229) for task 1 of
> framework 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-0000
> I0226 03:17:27.700923 1117 master.cpp:6450] Updating the state of task 1 of
> framework 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-0000 (latest state:
> TASK_RUNNING, status update state: TASK_RUNNING)
> I0226 03:17:27.701145 1118 sched.cpp:981] Scheduler::statusUpdate took
> 107222ns
> I0226 03:17:27.701550 1112 master.cpp:3952] Processing ACKNOWLEDGE call
> 9f75a4e5-9ff4-4ca9-8623-8b2574796229 for task 1 of framework
> 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-0000 (default) at
> [email protected]:51934 on slave
> 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-S0
> I0226 03:17:27.701828 1114 status_update_manager.cpp:392] Received status
> update acknowledgement (UUID: 9f75a4e5-9ff4-4ca9-8623-8b2574796229) for task
> 1 of framework 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-0000
> I0226 03:17:27.701962 1114 status_update_manager.cpp:824] Checkpointing ACK
> for status update TASK_RUNNING (UUID: 9f75a4e5-9ff4-4ca9-8623-8b2574796229)
> for task 1 of framework 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-0000
> I0226 03:17:27.701987 1112 slave.cpp:668] Slave terminating
> I0226 03:17:27.702256 1117 master.cpp:1174] Slave
> 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-S0 at slave(399)@172.30.2.151:51934
> (ip-172-30-2-151.mesosphere.io) disconnected
> I0226 03:17:27.702275 1117 master.cpp:2635] Disconnecting slave
> 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-S0 at slave(399)@172.30.2.151:51934
> (ip-172-30-2-151.mesosphere.io)
> I0226 03:17:27.702335 1117 master.cpp:2654] Deactivating slave
> 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-S0 at slave(399)@172.30.2.151:51934
> (ip-172-30-2-151.mesosphere.io)
> I0226 03:17:27.702492 1111 hierarchical.cpp:560] Slave
> 5cc57c0e-f1ad-4107-893f-420ed1a1db1a-S0 deactivated
> I0226 03:17:27.707713 1115 slave.cpp:193] Slave started on
> 400)@172.30.2.151:51934
> I0226 03:17:27.707739 1115 slave.cpp:194] Flags at startup:
> --appc_simple_discovery_uri_prefix="http://"
> --appc_store_dir="/tmp/mesos/store/appc" --authenticatee="crammd5"
> --cgroups_cpu_enable_pids_and_tids_count="false" --cgroups_enable_cfs="false"
> --cgroups_hierarchy="/sys/fs/cgroup" --cgroups_limit_swap="false"
> --cgroups_root="mesos" --container_disk_watch_interval="15secs"
> --containerizers="mesos"
> --credential="/tmp/DockerContainerizerTest_ROOT_DOCKER_RecoverOrphanedPersistentVolumes_aJOesP/credential"
> --default_role="*" --disk_watch_interval="1mins" --docker="docker"
> --docker_kill_orphans="true" --docker_registry="https://registry-1.docker.io"
> --docker_remove_delay="6hrs" --docker_socket="/var/run/docker.sock"
> --docker_stop_timeout="0ns" --docker_store_dir="/tmp/mesos/store/docker"
> --enforce_container_disk_quota="false"
> --executor_registration_timeout="1mins"
> --executor_shutdown_grace_period="5secs"
> --fetcher_cache_dir="/tmp/DockerContainerizerTest_ROOT_DOCKER_RecoverOrphanedPersistentVolumes_aJOesP/fetch"
> --fetcher_cache_size="2GB" --frameworks_home="" --gc_delay="1weeks"
> --gc_disk_headroom="0.1" --hadoop_home="" --help="false"
> --hostname_lookup="true" --image_provisioner_backend="copy"
> --initialize_driver_logging="true" --isolation="posix/cpu,posix/mem"
> --launcher_dir="/mnt/teamcity/work/4240ba9ddd0997c3/build/src"
> --logbufsecs="0" --logging_level="INFO"
> --oversubscribed_resources_interval="15secs" --perf_duration="10secs"
> --perf_interval="1mins" --qos_correction_interval_min="0ns" --quiet="false"
> --recover="reconnect" --recovery_timeout="15mins"
> --registration_backoff_factor="10ms"
> --resources="cpu:2;mem:2048;disk(role1):2048"
> --revocable_cpu_low_priority="true" --sandbox_directory="/mnt/mesos/sandbox"
> --strict="true" --switch_user="true" --systemd_enable_support="true"
> --systemd_runtime_directory="/run/systemd/system" --version="false"
> --work_dir="/tmp/DockerContainerizerTest_ROOT_DOCKER_RecoverOrphanedPersistentVolumes_aJOesP"
> I0226 03:17:27.708133 1115 credentials.hpp:83] Loading credential for
> authentication from
> '/tmp/DockerContainerizerTest_ROOT_DOCKER_RecoverOrphanedPersistentVolumes_aJOesP/credential'
> I0226 03:17:27.708282 1115 slave.cpp:324] Slave using credential for:
> test-principal
> I0226 03:17:27.708407 1115 resources.cpp:576] Parsing resources as JSON
> failed: cpu:2;mem:2048;disk(role1):2048
> Trying semicolon-delimited string format instead
> I0226 03:17:27.708874 1115 slave.cpp:464] Slave resources: cpu(*):2;
> mem(*):2048; disk(role1):2048; cpus(*):8; ports(*):[31000-32000]
> I0226 03:17:27.708931 1115 slave.cpp:472] Slave attributes: [ ]
> I0226 03:17:27.708941 1115 slave.cpp:477] Slave hostname:
> ip-172-30-2-151.mesosphere.io
> I0226 03:17:27.710033 1113 state.cpp:58] Recovering state from
> '/tmp/DockerContainerizerTest_ROOT_DOCKER_RecoverOrphanedPersistentVolumes_aJOesP/meta'
> I0226 03:17:27.711252 1114 fetcher.cpp:81] Clearing fetcher cache
> I0226 03:17:27.711447 1116 status_update_manager.cpp:200] Recovering status
> update manager
> I0226 03:17:27.711727 1111 docker.cpp:726] Recovering Docker containers
> I0226 03:17:27.711839 1111 docker.cpp:885] Running docker -H
> unix:///var/run/docker.sock ps -a
> I0226 03:17:27.728170 1117 hierarchical.cpp:1434] No resources available to
> allocate!
> I0226 03:17:27.728235 1117 hierarchical.cpp:1529] No inverse offers to send
> out!
> I0226 03:17:27.728268 1117 hierarchical.cpp:1127] Performed allocation for 1
> slaves in 296715ns
> I0226 03:17:27.817551 1113 docker.cpp:766] Running docker -H
> unix:///var/run/docker.sock inspect
> mesos-5cc57c0e-f1ad-4107-893f-420ed1a1db1a-S0.bcc90102-163d-4ff6-a3fc-a1b2e3fc3b7c
> I0226 03:17:27.923014 1112 docker.cpp:932] Checking if Docker container
> named
> '/mesos-5cc57c0e-f1ad-4107-893f-420ed1a1db1a-S0.bcc90102-163d-4ff6-a3fc-a1b2e3fc3b7c'
> was started by Mesos
> I0226 03:17:27.923071 1112 docker.cpp:942] Checking if Mesos container with
> ID 'bcc90102-163d-4ff6-a3fc-a1b2e3fc3b7c' has been orphaned
> I0226 03:17:27.923122 1112 docker.cpp:678] Running docker -H
> unix:///var/run/docker.sock stop -t 0
> 0a10ad8641f8e85227324a979817933322dc901706cb4430eab0bcaf979835d1
> I0226 03:17:28.023885 1116 docker.cpp:727] Running docker -H
> unix:///var/run/docker.sock rm -v
> 0a10ad8641f8e85227324a979817933322dc901706cb4430eab0bcaf979835d1
> I0226 03:17:28.127876 1114 docker.cpp:912] Unmounting volume for container
> 'bcc90102-163d-4ff6-a3fc-a1b2e3fc3b7c'
> ../../3rdparty/libprocess/include/process/gmock.hpp:214: ERROR: this mock
> object (used in test
> DockerContainerizerTest.ROOT_DOCKER_RecoverOrphanedPersistentVolumes) should
> be deleted but never is. Its address is @0x5781dd8.
> I0226 03:17:28.127957 1114 docker.cpp:912] Unmounting volume for container
> 'bcc90102-163d-4ff6-a3fc-a1b2e3fc3b7c'
> ../../src/tests/mesos.cpp:673: ERROR: this mock object (used in test
> DockerContainerizerTest.ROOT_DOCKER_RecoverOrphanedPersistentVolumes) should
> be deleted but never is. Its address is @0x5a03260.
> ../../src/tests/mesos.hpp:1357: ERROR: this mock object (used in test
> DockerContainerizerTest.ROOT_DOCKER_RecoverOrphanedPersistentVolumes) should
> be deleted but never is. Its address is @0x5b477c0.
> Failed to perform recovery: Unable to unmount volumes for Docker container
> 'bcc90102-163d-4ff6-a3fc-a1b2e3fc3b7c': Failed to unmount volume
> '/tmp/DockerContainerizerTest_ROOT_DOCKER_RecoverOrphanedPersistentVolumes_aJOesP/slaves/5cc57c0e-f1ad-4107-893f-420ed1a1db1a-S0/frameworks/5cc57c0e-f1ad-4107-893f-420ed1a1db1a-0000/executors/1/runs/bcc90102-163d-4ff6-a3fc-a1b2e3fc3b7c/path1':
> Failed to unmount
> '/tmp/DockerContainerizerTest_ROOT_DOCKER_RecoverOrphanedPersistentVolumes_aJOesP/slaves/5cc57c0e-f1ad-4107-893f-420ed1a1db1a-S0/frameworks/5cc57c0e-f1ad-4107-893f-420ed1a1db1a-0000/executors/1/runs/bcc90102-163d-4ff6-a3fc-a1b2e3fc3b7c/path1':
> Invalid argument
> ../../src/tests/containerizer/docker_containerizer_tests.cpp:1650: ERROR:
> this mock object (used in test
> DockerContainerizerTest.ROOT_DOCKER_RecoverOrphanedPersistentVolumes) should
> be deleted but never is. Its address is @0x7ffe75a8d310.
> To remedy this do as follows:
> ERROR: 4 leaked mock objects found at program exit.
> Step 1: rm -f
> /tmp/DockerContainerizerTest_ROOT_DOCKER_RecoverOrphanedPersistentVolumes_aJOesP/meta/slaves/latest
> This ensures slave doesn't recover old live executors.
> Step 2: Restart the slave.
> Process exited with code 1
> {code}
> There appear to be two problems:
> 1) The docker containerizer should not exit on failure to clean up orphans.
> The MesosContainerizer does not do this (see [MESOS-2367]).
> 2) Unmounting the orphan persistent volume fails for some reason.