[
https://issues.apache.org/jira/browse/MESOS-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15033936#comment-15033936
]
Jan Schlicht edited comment on MESOS-4032 at 12/1/15 4:14 PM:
--------------------------------------------------------------
The tests work fine if Mesos is compiled with libev, without SSL.
Running the test in isolation also fails.
Verbose output:
{noformat}
[ RUN ] SlaveRecoveryTest/0.Reboot
I1201 16:13:43.764530 30105 cgroups.cpp:2429] Freezing cgroup
/sys/fs/cgroup/freezer/mesos_test_4ea77e5a-030e-468d-aa54-6cf580143b86
I1201 16:13:43.955772 30100 cgroups.cpp:1411] Successfully froze cgroup
/sys/fs/cgroup/freezer/mesos_test_4ea77e5a-030e-468d-aa54-6cf580143b86 after
190.95296ms
I1201 16:13:44.151808 30106 cgroups.cpp:2447] Thawing cgroup
/sys/fs/cgroup/freezer/mesos_test_4ea77e5a-030e-468d-aa54-6cf580143b86
I1201 16:13:44.338899 30103 cgroups.cpp:1440] Successfullly thawed cgroup
/sys/fs/cgroup/freezer/mesos_test_4ea77e5a-030e-468d-aa54-6cf580143b86 after
186.987008ms
I1201 16:13:46.429718 30085 leveldb.cpp:176] Opened db in 6.794189ms
I1201 16:13:46.431185 30085 leveldb.cpp:183] Compacted db in 1.403926ms
I1201 16:13:46.431273 30085 leveldb.cpp:198] Created db iterator in 55789ns
I1201 16:13:46.431289 30085 leveldb.cpp:204] Seeked to beginning of db in 3775ns
I1201 16:13:46.431293 30085 leveldb.cpp:273] Iterated through 0 keys in the db
in 120ns
I1201 16:13:46.431409 30085 replica.cpp:780] Replica recovered with log
positions 0 -> 0 with 1 holes and 0 unlearned
I1201 16:13:46.432781 30104 recover.cpp:449] Starting replica recovery
I1201 16:13:46.433365 30104 recover.cpp:475] Replica is in EMPTY status
I1201 16:13:46.438645 30104 replica.cpp:676] Replica in EMPTY status received a
broadcasted recover request from (9)@127.0.0.1:52014
I1201 16:13:46.439353 30099 master.cpp:367] Master
0c54b5bb-d0f8-4c94-8f2a-c49672419e62 (centos71) started on 127.0.0.1:52014
I1201 16:13:46.439602 30100 recover.cpp:195] Received a recover response from a
replica in EMPTY status
I1201 16:13:46.439393 30099 master.cpp:369] Flags at startup: --acls=""
--allocation_interval="1secs" --allocator="HierarchicalDRF"
--authenticate="true" --authenticate_slaves="true" --authenticators="crammd5"
--authorizers="local" --credentials="/tmp/qZBjUp/credentials"
--framework_sorter="drf" --help="false" --hostname_lookup="true"
--initialize_driver_logging="true" --log_auto_initialize="true"
--logbufsecs="0" --logging_level="INFO" --max_slave_ping_timeouts="5"
--quiet="false" --recovery_slave_removal_limit="100%"
--registry="replicated_log" --registry_fetch_timeout="1mins"
--registry_store_timeout="25secs" --registry_strict="true"
--root_submissions="true" --slave_ping_timeout="15secs"
--slave_reregister_timeout="10mins" --user_sorter="drf" --version="false"
--webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/qZBjUp/master"
--zk_session_timeout="10secs"
W1201 16:13:46.439997 30099 master.cpp:372]
**************************************************
Master bound to loopback interface! Cannot communicate with remote schedulers
or slaves. You might want to set '--ip' flag to a routable IP address.
**************************************************
I1201 16:13:46.440037 30099 master.cpp:414] Master only allowing authenticated
frameworks to register
I1201 16:13:46.440042 30099 master.cpp:419] Master only allowing authenticated
slaves to register
I1201 16:13:46.440047 30099 credentials.hpp:37] Loading credentials for
authentication from '/tmp/qZBjUp/credentials'
I1201 16:13:46.440315 30106 recover.cpp:566] Updating replica status to STARTING
I1201 16:13:46.440580 30099 master.cpp:458] Using default 'crammd5'
authenticator
I1201 16:13:46.440743 30099 authenticator.cpp:520] Initializing server SASL
I1201 16:13:46.442067 30099 master.cpp:495] Authorization enabled
I1201 16:13:46.447201 30099 master.cpp:1606] The newly elected leader is
[email protected]:52014 with id 0c54b5bb-d0f8-4c94-8f2a-c49672419e62
I1201 16:13:46.447230 30099 master.cpp:1619] Elected as the leading master!
I1201 16:13:46.447255 30099 master.cpp:1379] Recovering from registrar
I1201 16:13:46.447590 30099 registrar.cpp:309] Recovering registrar
I1201 16:13:46.451647 30100 leveldb.cpp:306] Persisting metadata (8 bytes) to
leveldb took 10.746719ms
I1201 16:13:46.451686 30100 replica.cpp:323] Persisted replica status to
STARTING
I1201 16:13:46.451942 30106 recover.cpp:475] Replica is in STARTING status
I1201 16:13:46.452819 30100 replica.cpp:676] Replica in STARTING status
received a broadcasted recover request from (10)@127.0.0.1:52014
I1201 16:13:46.453064 30105 recover.cpp:195] Received a recover response from a
replica in STARTING status
I1201 16:13:46.453727 30104 recover.cpp:566] Updating replica status to VOTING
I1201 16:13:46.454529 30105 leveldb.cpp:306] Persisting metadata (8 bytes) to
leveldb took 720044ns
I1201 16:13:46.454548 30105 replica.cpp:323] Persisted replica status to VOTING
I1201 16:13:46.454638 30105 recover.cpp:580] Successfully joined the Paxos group
I1201 16:13:46.454787 30105 recover.cpp:464] Recover process terminated
I1201 16:13:46.455405 30105 log.cpp:661] Attempting to start the writer
I1201 16:13:46.457233 30106 replica.cpp:496] Replica received implicit promise
request from (11)@127.0.0.1:52014 with proposal 1
I1201 16:13:46.458009 30106 leveldb.cpp:306] Persisting metadata (8 bytes) to
leveldb took 744722ns
I1201 16:13:46.458032 30106 replica.cpp:345] Persisted promised to 1
I1201 16:13:46.458905 30099 coordinator.cpp:240] Coordinator attempting to fill
missing positions
I1201 16:13:46.460537 30104 replica.cpp:391] Replica received explicit promise
request from (12)@127.0.0.1:52014 for position 0 with proposal 2
I1201 16:13:46.461196 30104 leveldb.cpp:343] Persisting action (8 bytes) to
leveldb took 615463ns
I1201 16:13:46.461220 30104 replica.cpp:715] Persisted action at 0
I1201 16:13:46.462738 30103 replica.cpp:540] Replica received write request for
position 0 from (13)@127.0.0.1:52014
I1201 16:13:46.462808 30103 leveldb.cpp:438] Reading position from leveldb took
44123ns
I1201 16:13:46.463500 30103 leveldb.cpp:343] Persisting action (14 bytes) to
leveldb took 634612ns
I1201 16:13:46.463520 30103 replica.cpp:715] Persisted action at 0
I1201 16:13:46.464489 30106 replica.cpp:694] Replica received learned notice
for position 0 from @0.0.0.0:0
I1201 16:13:46.465271 30106 leveldb.cpp:343] Persisting action (16 bytes) to
leveldb took 750397ns
I1201 16:13:46.465294 30106 replica.cpp:715] Persisted action at 0
I1201 16:13:46.465309 30106 replica.cpp:700] Replica learned NOP action at
position 0
I1201 16:13:46.466377 30102 log.cpp:677] Writer started with ending position 0
I1201 16:13:46.468979 30104 leveldb.cpp:438] Reading position from leveldb took
69844ns
I1201 16:13:46.471437 30100 registrar.cpp:342] Successfully fetched the
registry (0B) in 23.800064ms
I1201 16:13:46.471635 30100 registrar.cpp:441] Applied 1 operations in 63798ns;
attempting to update the 'registry'
I1201 16:13:46.474053 30100 log.cpp:685] Attempting to append 158 bytes to the
log
I1201 16:13:46.474249 30100 coordinator.cpp:350] Coordinator attempting to
write APPEND action at position 1
I1201 16:13:46.475517 30100 replica.cpp:540] Replica received write request for
position 1 from (14)@127.0.0.1:52014
I1201 16:13:46.476613 30100 leveldb.cpp:343] Persisting action (177 bytes) to
leveldb took 1.056031ms
I1201 16:13:46.476636 30100 replica.cpp:715] Persisted action at 1
I1201 16:13:46.477459 30099 replica.cpp:694] Replica received learned notice
for position 1 from @0.0.0.0:0
I1201 16:13:46.478224 30099 leveldb.cpp:343] Persisting action (179 bytes) to
leveldb took 731548ns
I1201 16:13:46.478245 30099 replica.cpp:715] Persisted action at 1
I1201 16:13:46.478258 30099 replica.cpp:700] Replica learned APPEND action at
position 1
I1201 16:13:46.479604 30103 registrar.cpp:486] Successfully updated the
'registry' in 7.875072ms
I1201 16:13:46.479717 30103 registrar.cpp:372] Successfully recovered registrar
I1201 16:13:46.479877 30101 log.cpp:704] Attempting to truncate the log to 1
I1201 16:13:46.480336 30099 coordinator.cpp:350] Coordinator attempting to
write TRUNCATE action at position 2
I1201 16:13:46.480451 30100 master.cpp:1416] Recovered 0 slaves from the
Registry (120B) ; allowing 10mins for slaves to re-register
I1201 16:13:46.481528 30105 replica.cpp:540] Replica received write request for
position 2 from (15)@127.0.0.1:52014
I1201 16:13:46.482573 30105 leveldb.cpp:343] Persisting action (16 bytes) to
leveldb took 1.016896ms
I1201 16:13:46.482623 30105 replica.cpp:715] Persisted action at 2
I1201 16:13:46.483778 30104 replica.cpp:694] Replica received learned notice
for position 2 from @0.0.0.0:0
I1201 16:13:46.485167 30104 leveldb.cpp:343] Persisting action (18 bytes) to
leveldb took 1.362734ms
I1201 16:13:46.485237 30104 leveldb.cpp:401] Deleting ~1 keys from leveldb took
41835ns
I1201 16:13:46.485252 30104 replica.cpp:715] Persisted action at 2
I1201 16:13:46.485272 30104 replica.cpp:700] Replica learned TRUNCATE action at
position 2
I1201 16:13:46.494474 30085 containerizer.cpp:142] Using isolation:
cgroups/cpu,cgroups/mem,filesystem/posix
I1201 16:13:49.671596 30085 linux_launcher.cpp:103] Using
/sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
I1201 16:13:49.676256 30085 systemd.cpp:128] systemd version `208` detected
W1201 16:13:49.676290 30085 systemd.cpp:136] Required functionality `Delegate`
was introduced in Version `218`. Your system may not function properly; however
since some distributions have patched systemd packages, your system may still
be functional. This is why we keep running. See MESOS-3352 for more information
I1201 16:13:49.682055 30085 systemd.cpp:210] Started systemd slice
`mesos_executors.slice`
I1201 16:13:49.752518 30106 slave.cpp:191] Slave started on 1)@127.0.0.1:52014
I1201 16:13:49.752552 30106 slave.cpp:192] Flags at startup:
--appc_store_dir="/tmp/mesos/store/appc" --authenticatee="crammd5"
--cgroups_cpu_enable_pids_and_tids_count="false" --cgroups_enable_cfs="false"
--cgroups_hierarchy="/sys/fs/cgroup" --cgroups_limit_swap="false"
--cgroups_root="mesos_test_fe0ffaac-0ed6-4cac-939b-d5d65d306e0a"
--container_disk_watch_interval="15secs" --containerizers="mesos"
--credential="/tmp/SlaveRecoveryTest_0_Reboot_5xoqS5/credential"
--default_role="*" --disk_watch_interval="1mins" --docker="docker"
--docker_auth_server="auth.docker.io" --docker_auth_server_port="443"
--docker_kill_orphans="true"
--docker_local_archives_dir="/tmp/mesos/images/docker" --docker_puller="local"
--docker_puller_timeout="60" --docker_registry="registry-1.docker.io"
--docker_registry_port="443" --docker_remove_delay="6hrs"
--docker_socket="/var/run/docker.sock" --docker_stop_timeout="0ns"
--docker_store_dir="/tmp/mesos/store/docker"
--enforce_container_disk_quota="false" --executor_registration_timeout="1mins"
--executor_shutdown_grace_period="5secs"
--fetcher_cache_dir="/tmp/SlaveRecoveryTest_0_Reboot_5xoqS5/fetch"
--fetcher_cache_size="2GB" --frameworks_home="" --gc_delay="1weeks"
--gc_disk_headroom="0.1" --hadoop_home="" --help="false"
--hostname_lookup="true" --image_provisioner_backend="copy"
--initialize_driver_logging="true" --isolation="cgroups/cpu,cgroups/mem"
--launcher_dir="/home/vagrant/mesos/build-ssl/src" --logbufsecs="0"
--logging_level="INFO" --oversubscribed_resources_interval="15secs"
--perf_duration="10secs" --perf_interval="1mins"
--qos_correction_interval_min="0ns" --quiet="false" --recover="reconnect"
--recovery_timeout="15mins" --registration_backoff_factor="10ms"
--resources="cpus:2;mem:1024;disk:1024;ports:[31000-32000]"
--revocable_cpu_low_priority="true" --sandbox_directory="/mnt/mesos/sandbox"
--slave_subsystems="memory,cpuacct" --strict="false" --switch_user="true"
--systemd_runtime_directory="/run/systemd/system" --version="false"
--work_dir="/tmp/SlaveRecoveryTest_0_Reboot_5xoqS5"
W1201 16:13:49.752852 30106 slave.cpp:195]
**************************************************
Slave bound to loopback interface! Cannot communicate with remote master(s).
You might want to set '--ip' flag to a routable IP address.
**************************************************
I1201 16:13:49.752867 30106 slave.cpp:212] Moving slave process into its own
cgroup for subsystem: memory
W1201 16:13:49.753592 30085 sched.cpp:1542]
**************************************************
Scheduler driver bound to loopback interface! Cannot communicate with remote
master(s). You might want to set 'LIBPROCESS_IP' environment variable to use a
routable IP address.
**************************************************
I1201 16:13:49.755309 30085 sched.cpp:166] Version: 0.26.0
I1201 16:13:49.756275 30105 sched.cpp:264] New master detected at
[email protected]:52014
I1201 16:13:49.756366 30105 sched.cpp:320] Authenticating with master
[email protected]:52014
I1201 16:13:49.756381 30105 sched.cpp:327] Using default CRAM-MD5 authenticatee
I1201 16:13:49.756722 30101 authenticatee.cpp:99] Initializing client SASL
I1201 16:13:49.756868 30101 authenticatee.cpp:123] Creating new client SASL
connection
I1201 16:13:49.757745 30105 master.cpp:5150] Authenticating
[email protected]:52014
I1201 16:13:49.758275 30105 authenticator.cpp:100] Creating new server SASL
connection
I1201 16:13:49.758677 30101 authenticatee.cpp:214] Received SASL authentication
mechanisms: CRAM-MD5
I1201 16:13:49.758704 30101 authenticatee.cpp:240] Attempting to authenticate
with mechanism 'CRAM-MD5'
I1201 16:13:49.758795 30101 authenticator.cpp:205] Received SASL authentication
start
I1201 16:13:49.758844 30101 authenticator.cpp:327] Authentication requires more
steps
I1201 16:13:49.758931 30101 authenticatee.cpp:260] Received SASL authentication
step
I1201 16:13:49.759091 30100 authenticator.cpp:233] Received SASL authentication
step
I1201 16:13:49.759285 30100 authenticator.cpp:319] Authentication success
I1201 16:13:49.759389 30105 authenticatee.cpp:300] Authentication success
I1201 16:13:49.759475 30105 master.cpp:5180] Successfully authenticated
principal 'test-principal' at
[email protected]:52014
I1201 16:13:49.759702 30101 sched.cpp:409] Successfully authenticated with
master [email protected]:52014
I1201 16:13:49.760151 30104 master.cpp:2176] Received SUBSCRIBE call for
framework 'default' at
[email protected]:52014
I1201 16:13:49.760252 30104 master.cpp:1645] Authorizing framework principal
'test-principal' to receive offers for role '*'
I1201 16:13:49.760586 30103 master.cpp:2247] Subscribing framework default with
checkpointing enabled and capabilities [ ]
I1201 16:13:49.761400 30104 hierarchical.cpp:195] Added framework
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000
I1201 16:13:49.761494 30103 sched.cpp:643] Framework registered with
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000
I1201 16:13:50.846541 30106 slave.cpp:212] Moving slave process into its own
cgroup for subsystem: cpuacct
I1201 16:13:51.606369 30106 credentials.hpp:85] Loading credential for
authentication from '/tmp/SlaveRecoveryTest_0_Reboot_5xoqS5/credential'
I1201 16:13:51.606848 30106 slave.cpp:322] Slave using credential for:
test-principal
I1201 16:13:51.607802 30106 slave.cpp:392] Slave resources: cpus(*):2;
mem(*):1024; disk(*):1024; ports(*):[31000-32000]
I1201 16:13:51.607900 30106 slave.cpp:400] Slave attributes: [ ]
I1201 16:13:51.607913 30106 slave.cpp:405] Slave hostname: centos71
I1201 16:13:51.607918 30106 slave.cpp:410] Slave checkpoint: true
I1201 16:13:51.609427 30100 state.cpp:54] Recovering state from
'/tmp/SlaveRecoveryTest_0_Reboot_5xoqS5/meta'
I1201 16:13:51.609800 30100 status_update_manager.cpp:202] Recovering status
update manager
I1201 16:13:51.610244 30104 containerizer.cpp:384] Recovering containerizer
I1201 16:13:51.805718 30102 slave.cpp:4230] Finished recovery
I1201 16:13:51.806789 30105 slave.cpp:729] New master detected at
[email protected]:52014
I1201 16:13:51.806830 30102 status_update_manager.cpp:176] Pausing sending
status updates
I1201 16:13:51.806860 30105 slave.cpp:792] Authenticating with master
[email protected]:52014
I1201 16:13:51.806871 30105 slave.cpp:797] Using default CRAM-MD5 authenticatee
I1201 16:13:51.807067 30099 authenticatee.cpp:123] Creating new client SASL
connection
I1201 16:13:51.807073 30105 slave.cpp:765] Detecting new master
I1201 16:13:51.809116 30102 master.cpp:5150] Authenticating
slave(1)@127.0.0.1:52014
I1201 16:13:51.809415 30105 authenticator.cpp:100] Creating new server SASL
connection
I1201 16:13:51.810292 30099 authenticatee.cpp:214] Received SASL authentication
mechanisms: CRAM-MD5
I1201 16:13:51.810312 30099 authenticatee.cpp:240] Attempting to authenticate
with mechanism 'CRAM-MD5'
I1201 16:13:51.810420 30102 authenticator.cpp:205] Received SASL authentication
start
I1201 16:13:51.810454 30102 authenticator.cpp:327] Authentication requires more
steps
I1201 16:13:51.810523 30102 authenticatee.cpp:260] Received SASL authentication
step
I1201 16:13:51.810619 30105 authenticator.cpp:233] Received SASL authentication
step
I1201 16:13:51.810670 30105 authenticator.cpp:319] Authentication success
I1201 16:13:51.810789 30102 authenticatee.cpp:300] Authentication success
I1201 16:13:51.810870 30103 master.cpp:5180] Successfully authenticated
principal 'test-principal' at slave(1)@127.0.0.1:52014
I1201 16:13:51.811317 30102 slave.cpp:860] Successfully authenticated with
master [email protected]:52014
I1201 16:13:51.811728 30105 master.cpp:3859] Registering slave at
slave(1)@127.0.0.1:52014 (centos71) with id
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0
I1201 16:13:51.812587 30099 registrar.cpp:441] Applied 1 operations in 67177ns;
attempting to update the 'registry'
I1201 16:13:51.814066 30101 log.cpp:685] Attempting to append 324 bytes to the
log
I1201 16:13:51.814266 30103 coordinator.cpp:350] Coordinator attempting to
write APPEND action at position 3
I1201 16:13:51.815278 30104 replica.cpp:540] Replica received write request for
position 3 from (34)@127.0.0.1:52014
I1201 16:13:51.817994 30104 leveldb.cpp:343] Persisting action (343 bytes) to
leveldb took 2.682526ms
I1201 16:13:51.818027 30104 replica.cpp:715] Persisted action at 3
I1201 16:13:51.818933 30106 replica.cpp:694] Replica received learned notice
for position 3 from @0.0.0.0:0
I1201 16:13:51.819807 30106 leveldb.cpp:343] Persisting action (345 bytes) to
leveldb took 845203ns
I1201 16:13:51.819829 30106 replica.cpp:715] Persisted action at 3
I1201 16:13:51.819845 30106 replica.cpp:700] Replica learned APPEND action at
position 3
I1201 16:13:51.820930 30102 registrar.cpp:486] Successfully updated the
'registry' in 8.269056ms
I1201 16:13:51.821074 30101 log.cpp:704] Attempting to truncate the log to 3
I1201 16:13:51.821261 30101 coordinator.cpp:350] Coordinator attempting to
write TRUNCATE action at position 4
I1201 16:13:51.821889 30101 replica.cpp:540] Replica received write request for
position 4 from (35)@127.0.0.1:52014
I1201 16:13:51.821899 30102 master.cpp:3927] Registered slave
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0 at slave(1)@127.0.0.1:52014 (centos71)
with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]
I1201 16:13:51.822307 30102 slave.cpp:904] Registered with master
[email protected]:52014; given slave ID 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0
I1201 16:13:51.822427 30103 hierarchical.cpp:344] Added slave
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0 (centos71) with cpus(*):2; mem(*):1024;
disk(*):1024; ports(*):[31000-32000] (allocated: )
I1201 16:13:51.822471 30100 status_update_manager.cpp:183] Resuming sending
status updates
I1201 16:13:51.822811 30102 slave.cpp:963] Forwarding total oversubscribed
resources
I1201 16:13:51.822967 30102 master.cpp:4269] Received update of slave
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0 at slave(1)@127.0.0.1:52014 (centos71)
with total oversubscribed resources
I1201 16:13:51.823230 30105 hierarchical.cpp:400] Slave
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0 (centos71) updated with oversubscribed
resources (total: cpus(*):2; mem(*):1024; disk(*):1024;
ports(*):[31000-32000], allocated: cpus(*):2; mem(*):1024; disk(*):1024;
ports(*):[31000-32000])
I1201 16:13:51.823547 30102 master.cpp:4979] Sending 1 offers to framework
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000 (default) at
[email protected]:52014
I1201 16:13:51.824229 30101 leveldb.cpp:343] Persisting action (16 bytes) to
leveldb took 2.310016ms
I1201 16:13:51.824267 30101 replica.cpp:715] Persisted action at 4
I1201 16:13:51.826622 30099 replica.cpp:694] Replica received learned notice
for position 4 from @0.0.0.0:0
I1201 16:13:51.827656 30099 leveldb.cpp:343] Persisting action (18 bytes) to
leveldb took 1.002838ms
I1201 16:13:51.827719 30099 leveldb.cpp:401] Deleting ~2 keys from leveldb took
34558ns
I1201 16:13:51.827733 30099 replica.cpp:715] Persisted action at 4
I1201 16:13:51.827749 30099 replica.cpp:700] Replica learned TRUNCATE action at
position 4
I1201 16:13:51.829840 30101 master.cpp:2915] Processing ACCEPT call for offers:
[ 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-O0 ] on slave
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0 at slave(1)@127.0.0.1:52014 (centos71)
for framework 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000 (default) at
[email protected]:52014
I1201 16:13:51.829892 30101 master.cpp:2711] Authorizing framework principal
'test-principal' to launch task ccd11e20-3250-4007-b026-e665da093cd1 as user
'root'
I1201 16:13:51.831492 30104 master.hpp:176] Adding task
ccd11e20-3250-4007-b026-e665da093cd1 with resources cpus(*):2; mem(*):1024;
disk(*):1024; ports(*):[31000-32000] on slave
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0 (centos71)
I1201 16:13:51.831588 30104 master.cpp:3245] Launching task
ccd11e20-3250-4007-b026-e665da093cd1 of framework
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000 (default) at
[email protected]:52014 with resources
cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0 at slave(1)@127.0.0.1:52014 (centos71)
I1201 16:13:51.831887 30100 slave.cpp:1294] Got assigned task
ccd11e20-3250-4007-b026-e665da093cd1 for framework
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000
I1201 16:13:51.832974 30100 slave.cpp:1410] Launching task
ccd11e20-3250-4007-b026-e665da093cd1 for framework
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000
I1201 16:13:51.833433 30100 paths.cpp:436] Trying to chown
'/tmp/SlaveRecoveryTest_0_Reboot_5xoqS5/slaves/0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0/frameworks/0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000/executors/ccd11e20-3250-4007-b026-e665da093cd1/runs/312026ae-212e-4e4c-bac9-84c8bb2930d9'
to user 'root'
I1201 16:13:51.837865 30100 slave.cpp:4999] Launching executor
ccd11e20-3250-4007-b026-e665da093cd1 of framework
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000 with resources cpus(*):0.1; mem(*):32
in work directory
'/tmp/SlaveRecoveryTest_0_Reboot_5xoqS5/slaves/0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0/frameworks/0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000/executors/ccd11e20-3250-4007-b026-e665da093cd1/runs/312026ae-212e-4e4c-bac9-84c8bb2930d9'
I1201 16:13:51.838582 30106 containerizer.cpp:618] Starting container
'312026ae-212e-4e4c-bac9-84c8bb2930d9' for executor
'ccd11e20-3250-4007-b026-e665da093cd1' of framework
'0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000'
I1201 16:13:51.838891 30100 slave.cpp:1628] Queuing task
'ccd11e20-3250-4007-b026-e665da093cd1' for executor
'ccd11e20-3250-4007-b026-e665da093cd1' of framework
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000
I1201 16:13:52.052389 30106 cpushare.cpp:392] Updated 'cpu.shares' to 2150
(cpus 2.1) for container 312026ae-212e-4e4c-bac9-84c8bb2930d9
I1201 16:13:52.276257 30104 mem.cpp:605] Started listening for OOM events for
container 312026ae-212e-4e4c-bac9-84c8bb2930d9
I1201 16:13:52.342397 30104 mem.cpp:725] Started listening on low memory
pressure events for container 312026ae-212e-4e4c-bac9-84c8bb2930d9
I1201 16:13:52.409441 30104 mem.cpp:725] Started listening on medium memory
pressure events for container 312026ae-212e-4e4c-bac9-84c8bb2930d9
I1201 16:13:52.472625 30104 mem.cpp:725] Started listening on critical memory
pressure events for container 312026ae-212e-4e4c-bac9-84c8bb2930d9
I1201 16:13:52.533138 30104 mem.cpp:356] Updated 'memory.soft_limit_in_bytes'
to 1056MB for container 312026ae-212e-4e4c-bac9-84c8bb2930d9
I1201 16:13:52.654757 30104 mem.cpp:391] Updated 'memory.limit_in_bytes' to
1056MB for container 312026ae-212e-4e4c-bac9-84c8bb2930d9
I1201 16:13:52.811933 30103 linux_launcher.cpp:365] Cloning child process with
flags =
I1201 16:13:52.935303 30103 linux_launcher.cpp:422] Assigned child process
'30121' to 'mesos_executors.slice'
I1201 16:13:52.935662 30103 containerizer.cpp:851] Checkpointing executor's
forked pid 30121 to
'/tmp/SlaveRecoveryTest_0_Reboot_5xoqS5/meta/slaves/0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0/frameworks/0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000/executors/ccd11e20-3250-4007-b026-e665da093cd1/runs/312026ae-212e-4e4c-bac9-84c8bb2930d9/pids/forked.pid'
I1201 16:13:53.059762 30121 exec.cpp:136] Version: 0.26.0
I1201 16:13:53.066401 30102 slave.cpp:2405] Got registration for executor
'ccd11e20-3250-4007-b026-e665da093cd1' of framework
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000 from executor(1)@127.0.0.1:38774
I1201 16:13:53.071707 30148 exec.cpp:210] Executor registered on slave
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0
Registered executor on centos71
I1201 16:13:53.136584 30100 mem.cpp:356] Updated 'memory.soft_limit_in_bytes'
to 1056MB for container 312026ae-212e-4e4c-bac9-84c8bb2930d9
I1201 16:13:53.142530 30101 cpushare.cpp:392] Updated 'cpu.shares' to 2150
(cpus 2.1) for container 312026ae-212e-4e4c-bac9-84c8bb2930d9
I1201 16:13:53.202687 30104 slave.cpp:1793] Sending queued task
'ccd11e20-3250-4007-b026-e665da093cd1' to executor
'ccd11e20-3250-4007-b026-e665da093cd1' of framework
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000 at executor(1)@127.0.0.1:38774
Starting task ccd11e20-3250-4007-b026-e665da093cd1
sh -c 'sleep 1000'
Forked command at 30157
I1201 16:13:53.208745 30100 slave.cpp:2762] Handling status update TASK_RUNNING
(UUID: bc5157f4-d1f5-4f97-93df-55a6e2d17330) for task
ccd11e20-3250-4007-b026-e665da093cd1 of framework
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000 from executor(1)@127.0.0.1:38774
I1201 16:13:53.209280 30101 status_update_manager.cpp:322] Received status
update TASK_RUNNING (UUID: bc5157f4-d1f5-4f97-93df-55a6e2d17330) for task
ccd11e20-3250-4007-b026-e665da093cd1 of framework
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000
I1201 16:13:53.209790 30101 status_update_manager.cpp:826] Checkpointing UPDATE
for status update TASK_RUNNING (UUID: bc5157f4-d1f5-4f97-93df-55a6e2d17330) for
task ccd11e20-3250-4007-b026-e665da093cd1 of framework
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000
I1201 16:13:53.214834 30104 slave.cpp:3087] Forwarding the update TASK_RUNNING
(UUID: bc5157f4-d1f5-4f97-93df-55a6e2d17330) for task
ccd11e20-3250-4007-b026-e665da093cd1 of framework
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000 to [email protected]:52014
I1201 16:13:53.215004 30104 slave.cpp:3011] Sending acknowledgement for status
update TASK_RUNNING (UUID: bc5157f4-d1f5-4f97-93df-55a6e2d17330) for task
ccd11e20-3250-4007-b026-e665da093cd1 of framework
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000 to executor(1)@127.0.0.1:38774
I1201 16:13:53.215369 30099 master.cpp:4414] Status update TASK_RUNNING (UUID:
bc5157f4-d1f5-4f97-93df-55a6e2d17330) for task
ccd11e20-3250-4007-b026-e665da093cd1 of framework
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000 from slave
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0 at slave(1)@127.0.0.1:52014 (centos71)
I1201 16:13:53.215407 30099 master.cpp:4462] Forwarding status update
TASK_RUNNING (UUID: bc5157f4-d1f5-4f97-93df-55a6e2d17330) for task
ccd11e20-3250-4007-b026-e665da093cd1 of framework
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000
I1201 16:13:53.215579 30099 master.cpp:6066] Updating the state of task
ccd11e20-3250-4007-b026-e665da093cd1 of framework
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000 (latest state: TASK_RUNNING, status
update state: TASK_RUNNING)
I1201 16:13:53.215934 30104 master.cpp:3571] Processing ACKNOWLEDGE call
bc5157f4-d1f5-4f97-93df-55a6e2d17330 for task
ccd11e20-3250-4007-b026-e665da093cd1 of framework
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000 (default) at
[email protected]:52014 on slave
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0
I1201 16:13:53.216327 30104 status_update_manager.cpp:394] Received status
update acknowledgement (UUID: bc5157f4-d1f5-4f97-93df-55a6e2d17330) for task
ccd11e20-3250-4007-b026-e665da093cd1 of framework
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000
I1201 16:13:53.216462 30104 status_update_manager.cpp:826] Checkpointing ACK
for status update TASK_RUNNING (UUID: bc5157f4-d1f5-4f97-93df-55a6e2d17330) for
task ccd11e20-3250-4007-b026-e665da093cd1 of framework
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000
I1201 16:13:53.220029 30085 slave.cpp:601] Slave terminating
I1201 16:13:53.220582 30103 master.cpp:1083] Slave
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0 at slave(1)@127.0.0.1:52014 (centos71)
disconnected
I1201 16:13:53.220607 30103 master.cpp:2531] Disconnecting slave
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0 at slave(1)@127.0.0.1:52014 (centos71)
I1201 16:13:53.220788 30103 master.cpp:2550] Deactivating slave
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0 at slave(1)@127.0.0.1:52014 (centos71)
I1201 16:13:53.221032 30103 hierarchical.cpp:429] Slave
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0 deactivated
I1201 16:13:53.471288 30149 exec.cpp:383] Executor asked to shutdown
Shutting down
Sending SIGTERM to process tree at pid 30157
Killing the following process trees:
[
--- 30157 sleep 1000
]
Command terminated with signal Terminated (pid: 30157)
I1201 16:13:54.534430 30085 containerizer.cpp:142] Using isolation:
cgroups/cpu,cgroups/mem,filesystem/posix
I1201 16:13:57.248833 30085 linux_launcher.cpp:103] Using
/sys/fs/cgroup/freezer as the freezer hierarchy for the Linux launcher
I1201 16:13:57.254952 30085 systemd.cpp:210] Started systemd slice
`mesos_executors.slice`
I1201 16:13:57.322489 30104 slave.cpp:191] Slave started on 2)@127.0.0.1:52014
I1201 16:13:57.322525 30104 slave.cpp:192] Flags at startup:
--appc_store_dir="/tmp/mesos/store/appc" --authenticatee="crammd5"
--cgroups_cpu_enable_pids_and_tids_count="false" --cgroups_enable_cfs="false"
--cgroups_hierarchy="/sys/fs/cgroup" --cgroups_limit_swap="false"
--cgroups_root="mesos_test_fe0ffaac-0ed6-4cac-939b-d5d65d306e0a"
--container_disk_watch_interval="15secs" --containerizers="mesos"
--credential="/tmp/SlaveRecoveryTest_0_Reboot_5xoqS5/credential"
--default_role="*" --disk_watch_interval="1mins" --docker="docker"
--docker_auth_server="auth.docker.io" --docker_auth_server_port="443"
--docker_kill_orphans="true"
--docker_local_archives_dir="/tmp/mesos/images/docker" --docker_puller="local"
--docker_puller_timeout="60" --docker_registry="registry-1.docker.io"
--docker_registry_port="443" --docker_remove_delay="6hrs"
--docker_socket="/var/run/docker.sock" --docker_stop_timeout="0ns"
--docker_store_dir="/tmp/mesos/store/docker"
--enforce_container_disk_quota="false" --executor_registration_timeout="1mins"
--executor_shutdown_grace_period="5secs"
--fetcher_cache_dir="/tmp/SlaveRecoveryTest_0_Reboot_5xoqS5/fetch"
--fetcher_cache_size="2GB" --frameworks_home="" --gc_delay="1weeks"
--gc_disk_headroom="0.1" --hadoop_home="" --help="false"
--hostname_lookup="true" --image_provisioner_backend="copy"
--initialize_driver_logging="true" --isolation="cgroups/cpu,cgroups/mem"
--launcher_dir="/home/vagrant/mesos/build-ssl/src" --logbufsecs="0"
--logging_level="INFO" --oversubscribed_resources_interval="15secs"
--perf_duration="10secs" --perf_interval="1mins"
--qos_correction_interval_min="0ns" --quiet="false" --recover="reconnect"
--recovery_timeout="15mins" --registration_backoff_factor="10ms"
--resources="cpus:2;mem:1024;disk:1024;ports:[31000-32000]"
--revocable_cpu_low_priority="true" --sandbox_directory="/mnt/mesos/sandbox"
--slave_subsystems="memory,cpuacct" --strict="false" --switch_user="true"
--systemd_runtime_directory="/run/systemd/system" --version="false"
--work_dir="/tmp/SlaveRecoveryTest_0_Reboot_5xoqS5"
W1201 16:13:57.322767 30104 slave.cpp:195]
**************************************************
Slave bound to loopback interface! Cannot communicate with remote master(s).
You might want to set '--ip' flag to a routable IP address.
**************************************************
I1201 16:13:57.322778 30104 slave.cpp:212] Moving slave process into its own
cgroup for subsystem: memory
I1201 16:13:58.308562 30104 slave.cpp:212] Moving slave process into its own
cgroup for subsystem: cpuacct
I1201 16:13:58.973721 30104 credentials.hpp:85] Loading credential for
authentication from '/tmp/SlaveRecoveryTest_0_Reboot_5xoqS5/credential'
I1201 16:13:58.973891 30104 slave.cpp:322] Slave using credential for:
test-principal
I1201 16:13:58.974534 30104 slave.cpp:392] Slave resources: cpus(*):2;
mem(*):1024; disk(*):1024; ports(*):[31000-32000]
I1201 16:13:58.974582 30104 slave.cpp:400] Slave attributes: [ ]
I1201 16:13:58.974588 30104 slave.cpp:405] Slave hostname: centos71
I1201 16:13:58.974591 30104 slave.cpp:410] Slave checkpoint: true
I1201 16:13:58.975349 30104 state.cpp:54] Recovering state from
'/tmp/SlaveRecoveryTest_0_Reboot_5xoqS5/meta'
I1201 16:13:58.975401 30104 state.cpp:681] No checkpointed resources found at
'/tmp/SlaveRecoveryTest_0_Reboot_5xoqS5/meta/resources/resources.info'
I1201 16:13:58.975617 30104 state.cpp:85] Slave host rebooted
I1201 16:13:58.975929 30099 status_update_manager.cpp:202] Recovering status
update manager
I1201 16:13:58.976277 30103 containerizer.cpp:384] Recovering containerizer
I1201 16:13:59.167284 30103 containerizer.cpp:522] Removing orphan container
312026ae-212e-4e4c-bac9-84c8bb2930d9
I1201 16:13:59.358265 30101 cgroups.cpp:2429] Freezing cgroup
/sys/fs/cgroup/freezer/mesos_test_fe0ffaac-0ed6-4cac-939b-d5d65d306e0a/312026ae-212e-4e4c-bac9-84c8bb2930d9
I1201 16:13:59.358578 30103 slave.cpp:4230] Finished recovery
I1201 16:13:59.359745 30103 slave.cpp:4263] Garbage collecting old slave
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0
I1201 16:13:59.359963 30102 gc.cpp:56] Scheduling
'/tmp/SlaveRecoveryTest_0_Reboot_5xoqS5/slaves/0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0'
for gc 6.99999583571852days in the future
I1201 16:13:59.360177 30102 gc.cpp:56] Scheduling
'/tmp/SlaveRecoveryTest_0_Reboot_5xoqS5/meta/slaves/0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0'
for gc 6.99999583361185days in the future
I1201 16:13:59.360265 30102 status_update_manager.cpp:176] Pausing sending
status updates
I1201 16:13:59.360267 30099 slave.cpp:729] New master detected at
[email protected]:52014
I1201 16:13:59.360348 30099 slave.cpp:792] Authenticating with master
[email protected]:52014
I1201 16:13:59.360359 30099 slave.cpp:797] Using default CRAM-MD5 authenticatee
I1201 16:13:59.360474 30099 slave.cpp:765] Detecting new master
I1201 16:13:59.360571 30102 authenticatee.cpp:123] Creating new client SASL
connection
I1201 16:13:59.360900 30100 master.cpp:5150] Authenticating
slave(2)@127.0.0.1:52014
I1201 16:13:59.361102 30100 authenticator.cpp:100] Creating new server SASL
connection
I1201 16:13:59.361421 30105 authenticatee.cpp:214] Received SASL authentication
mechanisms: CRAM-MD5
I1201 16:13:59.361444 30105 authenticatee.cpp:240] Attempting to authenticate
with mechanism 'CRAM-MD5'
I1201 16:13:59.361508 30105 authenticator.cpp:205] Received SASL authentication
start
I1201 16:13:59.361547 30105 authenticator.cpp:327] Authentication requires more
steps
I1201 16:13:59.361611 30105 authenticatee.cpp:260] Received SASL authentication
step
I1201 16:13:59.361699 30102 authenticator.cpp:233] Received SASL authentication
step
I1201 16:13:59.361757 30102 authenticator.cpp:319] Authentication success
I1201 16:13:59.361841 30105 authenticatee.cpp:300] Authentication success
I1201 16:13:59.361862 30102 master.cpp:5180] Successfully authenticated
principal 'test-principal' at slave(2)@127.0.0.1:52014
I1201 16:13:59.362381 30105 slave.cpp:860] Successfully authenticated with
master [email protected]:52014
I1201 16:13:59.362884 30104 master.cpp:3859] Registering slave at
slave(2)@127.0.0.1:52014 (centos71) with id
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S1
I1201 16:13:59.363746 30105 registrar.cpp:441] Applied 1 operations in 69347ns;
attempting to update the 'registry'
I1201 16:13:59.365444 30103 log.cpp:685] Attempting to append 486 bytes to the
log
I1201 16:13:59.365583 30106 coordinator.cpp:350] Coordinator attempting to
write APPEND action at position 5
I1201 16:13:59.366288 30100 replica.cpp:540] Replica received write request for
position 5 from (68)@127.0.0.1:52014
I1201 16:13:59.366833 30100 leveldb.cpp:343] Persisting action (505 bytes) to
leveldb took 513939ns
I1201 16:13:59.366853 30100 replica.cpp:715] Persisted action at 5
I1201 16:13:59.367413 30100 replica.cpp:694] Replica received learned notice
for position 5 from @0.0.0.0:0
I1201 16:13:59.368077 30100 leveldb.cpp:343] Persisting action (507 bytes) to
leveldb took 636735ns
I1201 16:13:59.368098 30100 replica.cpp:715] Persisted action at 5
I1201 16:13:59.368113 30100 replica.cpp:700] Replica learned APPEND action at
position 5
I1201 16:13:59.369359 30099 registrar.cpp:486] Successfully updated the
'registry' in 5.55392ms
I1201 16:13:59.369951 30100 log.cpp:704] Attempting to truncate the log to 5
I1201 16:13:59.371310 30104 coordinator.cpp:350] Coordinator attempting to
write TRUNCATE action at position 6
I1201 16:13:59.371423 30103 master.cpp:3927] Registered slave
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S1 at slave(2)@127.0.0.1:52014 (centos71)
with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]
I1201 16:13:59.371464 30100 hierarchical.cpp:344] Added slave
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S1 (centos71) with cpus(*):2; mem(*):1024;
disk(*):1024; ports(*):[31000-32000] (allocated: )
I1201 16:13:59.371532 30105 slave.cpp:904] Registered with master
[email protected]:52014; given slave ID 0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S1
I1201 16:13:59.371687 30103 status_update_manager.cpp:183] Resuming sending
status updates
I1201 16:13:59.371963 30104 replica.cpp:540] Replica received write request for
position 6 from (69)@127.0.0.1:52014
I1201 16:13:59.372654 30103 master.cpp:4979] Sending 1 offers to framework
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000 (default) at
[email protected]:52014
I1201 16:13:59.373229 30105 slave.cpp:963] Forwarding total oversubscribed
resources
I1201 16:13:59.373275 30104 leveldb.cpp:343] Persisting action (16 bytes) to
leveldb took 1.280817ms
I1201 16:13:59.373320 30104 replica.cpp:715] Persisted action at 6
I1201 16:13:59.373451 30105 master.cpp:4269] Received update of slave
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S1 at slave(2)@127.0.0.1:52014 (centos71)
with total oversubscribed resources
I1201 16:13:59.373554 30085 sched.cpp:1805] Asked to stop the driver
I1201 16:13:59.373631 30105 hierarchical.cpp:400] Slave
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S1 (centos71) updated with oversubscribed
resources (total: cpus(*):2; mem(*):1024; disk(*):1024;
ports(*):[31000-32000], allocated: cpus(*):2; mem(*):1024; disk(*):1024;
ports(*):[31000-32000])
I1201 16:13:59.373713 30104 sched.cpp:1043] Stopping framework
'0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000'
I1201 16:13:59.373881 30104 replica.cpp:694] Replica received learned notice
for position 6 from @0.0.0.0:0
I1201 16:13:59.374068 30085 master.cpp:922] Master terminating
I1201 16:13:59.374452 30099 hierarchical.cpp:373] Removed slave
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S1
W1201 16:13:59.374821 30085 master.cpp:6118] Removing task
ccd11e20-3250-4007-b026-e665da093cd1 with resources cpus(*):2; mem(*):1024;
disk(*):1024; ports(*):[31000-32000] of framework
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-0000 on slave
0c54b5bb-d0f8-4c94-8f2a-c49672419e62-S0 at slave(1)@127.0.0.1:52014 (centos71)
in non-terminal state TASK_RUNNING
I1201 16:13:59.377071 30099 slave.cpp:3215] [email protected]:52014 exited
W1201 16:13:59.377092 30099 slave.cpp:3218] Master disconnected! Waiting for a
new master to be elected
I1201 16:13:59.381366 30104 leveldb.cpp:343] Persisting action (18 bytes) to
leveldb took 7.454044ms
I1201 16:13:59.381429 30104 leveldb.cpp:401] Deleting ~2 keys from leveldb took
36702ns
I1201 16:13:59.381441 30104 replica.cpp:715] Persisted action at 6
I1201 16:13:59.381458 30104 replica.cpp:700] Replica learned TRUNCATE action at
position 6
I1201 16:13:59.383708 30085 slave.cpp:601] Slave terminating
I1201 16:13:59.561939 30101 cgroups.cpp:1411] Successfully froze cgroup
/sys/fs/cgroup/freezer/mesos_test_fe0ffaac-0ed6-4cac-939b-d5d65d306e0a/312026ae-212e-4e4c-bac9-84c8bb2930d9
after 203.608064ms
I1201 16:13:59.755419 30106 cgroups.cpp:2447] Thawing cgroup
/sys/fs/cgroup/freezer/mesos_test_fe0ffaac-0ed6-4cac-939b-d5d65d306e0a/312026ae-212e-4e4c-bac9-84c8bb2930d9
I1201 16:13:59.880949 30106 cgroups.cpp:2429] Freezing cgroup
/sys/fs/cgroup/freezer/mesos_test_fe0ffaac-0ed6-4cac-939b-d5d65d306e0a/312026ae-212e-4e4c-bac9-84c8bb2930d9
I1201 16:13:59.947829 30105 cgroups.cpp:1440] Successfullly thawed cgroup
/sys/fs/cgroup/freezer/mesos_test_fe0ffaac-0ed6-4cac-939b-d5d65d306e0a/312026ae-212e-4e4c-bac9-84c8bb2930d9
after 192.301824ms
../../src/tests/mesos.cpp:781: Failure
(cgroups::destroy(hierarchy, cgroup)).failure(): Failed to kill tasks in nested
cgroups: Collect failed: Failed to write 'FROZEN' to control 'freezer.state':
'mesos_test_fe0ffaac-0ed6-4cac-939b-d5d65d306e0a/312026ae-212e-4e4c-bac9-84c8bb2930d9'
is not a valid cgroup
[ FAILED ] SlaveRecoveryTest/0.Reboot, where TypeParam =
mesos::internal::slave::MesosContainerizer (18301 ms)
{noformat}
was (Author: nfnt):
{noformat}
[ RUN ] SlaveRecoveryTest/0.Reboot
I1201 15:59:03.294540 21012 exec.cpp:136] Version: 0.26.0
I1201 15:59:03.302486 21039 exec.cpp:210] Executor registered on slave
b17072f2-ce17-4f80-aa41-2197194f7cd0-S0
Registered executor on centos71
Starting task 6060349a-ab26-45d2-a2fa-96e561f794a8
sh -c 'sleep 1000'
Forked command at 21048
I1201 15:59:03.420940 21044 exec.cpp:383] Executor asked to shutdown
Shutting down
Sending SIGTERM to process tree at pid 21048
Killing the following process trees:
[
--- 21048 sleep 1000
]
Command terminated with signal Terminated (pid: 21048)
../../src/tests/mesos.cpp:781: Failure
(cgroups::destroy(hierarchy, cgroup)).failure(): Failed to kill tasks in nested
cgroups: Collect failed: Invalid freezer cgroup:
'mesos_test_d456f5bc-7718-4850-990e-8961404efd15/8fa1aee7-b393-4a20-85e1-33cd2fca0b10'
is not a valid cgroup
[ FAILED ] SlaveRecoveryTest/0.Reboot, where TypeParam =
mesos::internal::slave::MesosContainerizer (4835 ms)
{noformat}
> SlaveRecoveryTest/0.Reboot fails under CentOS 7.1 with libevent & SSL enabled.
> ------------------------------------------------------------------------------
>
> Key: MESOS-4032
> URL: https://issues.apache.org/jira/browse/MESOS-4032
> Project: Mesos
> Issue Type: Bug
> Environment: CentOS 7.1, {{--enable-libevent --enable-ssl}}
> Reporter: Jan Schlicht
>
> Running {{sudo ./bin/mesos-tests.sh}} has SlaveRecoveryTest/0.Reboot failing.
> A virtual env was used to run the tests.
> Vagrantfile generator:
> {noformat}
> cat << EOF > Vagrantfile
> # -*- mode: ruby -*-" >
> # vi: set ft=ruby :
> Vagrant.configure(2) do |config|
> # Disable shared folder to prevent certain kernel module dependencies.
> config.vm.synced_folder ".", "/vagrant", disabled: true
> config.vm.hostname = "centos71"
> config.vm.box = "bento/centos-7.1"
> config.vm.provider "virtualbox" do |vb|
> vb.memory = 8192
> vb.cpus = 8
> end
> config.vm.provision "shell", inline: <<-SHELL
> yum -y update systemd
> yum install -y tar wget
> wget
> http://repos.fedorapeople.org/repos/dchen/apache-maven/epel-apache-maven.repo
> -O /etc/yum.repos.d/epel-apache-maven.repo
> yum groupinstall -y "Development Tools"
> yum install -y apache-maven python-devel java-1.7.0-openjdk-devel
> zlib-devel libcurl-devel openssl-devel cyrus-sasl-devel cyrus-sasl-md5
> apr-devel subversion-devel apr-util-devel
> yum install -y libevent-devel
> yum install -y perf nmap-ncat
> yum install -y git
> yum install -y docker
> systemctl start docker
> systemctl enable docker
> SHELL
> end
> EOF
> vagrant up
> vagrant reload
> vagrant ssh -c "
> git clone https://github.com/apache/mesos.git mesos
> cd mesos
> git checkout -b 0.26.0-rc2 0.26.0-rc2
> ./bootstrap
> mkdir build
> cd build
> ../configure --enable-libevent --enable-ssl
> GTEST_FILTER="" make check
> sudo ./bin/mesos-tests.sh
> "
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)