[
https://issues.apache.org/jira/browse/MESOS-3770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vinod Kone updated MESOS-3770:
------------------------------
Affects Version/s: (was: 0.25.0)
0.26.0
The test is flaky because of an actual bug in the code which was introduced in
MESOS-2864. I'll re-open it add details of the bug there.
> SlaveRecoveryTest/0.RecoverCompletedExecutor is flaky
> -----------------------------------------------------
>
> Key: MESOS-3770
> URL: https://issues.apache.org/jira/browse/MESOS-3770
> Project: Mesos
> Issue Type: Bug
> Components: test
> Affects Versions: 0.26.0
> Reporter: Vinod Kone
> Assignee: Vinod Kone
>
> Observed this in internal CI
> {code}
> DEBUG: [ RUN ] SlaveRecoveryTest/0.RecoverCompletedExecutor
> DEBUG: Using temporary directory
> '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_rTtR9B'
> DEBUG: I1020 08:56:36.634321 28115 leveldb.cpp:176] Opened db in 185.662339ms
> DEBUG: I1020 08:56:36.701638 28115 leveldb.cpp:183] Compacted db in
> 67.257643ms
> DEBUG: I1020 08:56:36.701705 28115 leveldb.cpp:198] Created db iterator in
> 8212ns
> DEBUG: I1020 08:56:36.701719 28115 leveldb.cpp:204] Seeked to beginning of db
> in 1417ns
> DEBUG: I1020 08:56:36.701730 28115 leveldb.cpp:273] Iterated through 0 keys
> in the db in 357ns
> DEBUG: I1020 08:56:36.701756 28115 replica.cpp:746] Replica recovered with
> log positions 0 -> 0 with 1 holes and 0 unlearned
> DEBUG: I1020 08:56:36.702062 28132 recover.cpp:449] Starting replica recovery
> DEBUG: I1020 08:56:36.702116 28132 recover.cpp:475] Replica is in EMPTY status
> DEBUG: I1020 08:56:36.702952 28132 replica.cpp:642] Replica in EMPTY status
> received a broadcasted recover request from (7143)@172.16.132.117:37586
> DEBUG: I1020 08:56:36.703795 28141 recover.cpp:195] Received a recover
> response from a replica in EMPTY status
> DEBUG: I1020 08:56:36.704100 28138 recover.cpp:566] Updating replica status
> to STARTING
> DEBUG: I1020 08:56:36.705229 28133 master.cpp:376] Master
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b (smfc-ahy-19-sr2.corpdc.twitter.com)
> started on 172.16.132.117:37586
> DEBUG: I1020 08:56:36.705289 28133 master.cpp:378] Flags at startup:
> --acls="" --allocation_interval="1secs" --allocator="HierarchicalDRF"
> --authenticate="true" --authenticate_slaves="true" --authenticators="crammd5"
> --authorizers="local"
> --credentials="/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_rTtR9B/credentials"
> --framework_sorter="drf" --help="false" --hostname_lookup="true"
> --initialize_driver_logging="true" --log_auto_initialize="true"
> --logbufsecs="0" --logging_level="INFO" --max_slave_ping_timeouts="5"
> --quiet="false" --recovery_slave_removal_limit="100%"
> --registry="replicated_log" --registry_fetch_timeout="1mins"
> --registry_store_timeout="25secs" --registry_strict="true"
> --root_submissions="true" --slave_ping_timeout="15secs"
> --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false"
> --webui_dir="/usr/local/share/mesos/webui"
> --work_dir="/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_rTtR9B/master"
> --zk_session_timeout="10secs"
> DEBUG: I1020 08:56:36.705440 28133 master.cpp:423] Master only allowing
> authenticated frameworks to register
> DEBUG: I1020 08:56:36.705446 28133 master.cpp:428] Master only allowing
> authenticated slaves to register
> DEBUG: I1020 08:56:36.705451 28133 credentials.hpp:37] Loading credentials
> for authentication from
> '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_rTtR9B/credentials'
> DEBUG: I1020 08:56:36.705587 28133 master.cpp:467] Using default 'crammd5'
> authenticator
> DEBUG: I1020 08:56:36.705651 28133 master.cpp:504] Authorization enabled
> DEBUG: I1020 08:56:36.706521 28134 master.cpp:1609] The newly elected leader
> is [email protected]:37586 with id 0d54e2f1-43d7-4f74-8532-9c37ac40791b
> DEBUG: I1020 08:56:36.706539 28134 master.cpp:1622] Elected as the leading
> master!
> DEBUG: I1020 08:56:36.706545 28134 master.cpp:1382] Recovering from registrar
> DEBUG: I1020 08:56:36.706681 28146 registrar.cpp:309] Recovering registrar
> DEBUG: I1020 08:56:36.768453 28138 leveldb.cpp:306] Persisting metadata (8
> bytes) to leveldb took 64.300669ms
> DEBUG: I1020 08:56:36.768492 28138 replica.cpp:323] Persisted replica status
> to STARTING
> DEBUG: I1020 08:56:36.768568 28138 recover.cpp:475] Replica is in STARTING
> status
> DEBUG: I1020 08:56:36.769737 28131 replica.cpp:642] Replica in STARTING
> status received a broadcasted recover request from (7144)@172.16.132.117:37586
> DEBUG: I1020 08:56:36.769816 28131 recover.cpp:195] Received a recover
> response from a replica in STARTING status
> DEBUG: I1020 08:56:36.770355 28141 recover.cpp:566] Updating replica status
> to VOTING
> DEBUG: I1020 08:56:36.818709 28136 leveldb.cpp:306] Persisting metadata (8
> bytes) to leveldb took 48.054479ms
> DEBUG: I1020 08:56:36.818743 28136 replica.cpp:323] Persisted replica status
> to VOTING
> DEBUG: I1020 08:56:36.818791 28136 recover.cpp:580] Successfully joined the
> Paxos group
> DEBUG: I1020 08:56:36.818842 28136 recover.cpp:464] Recover process terminated
> DEBUG: I1020 08:56:36.818954 28130 log.cpp:661] Attempting to start the writer
> DEBUG: I1020 08:56:36.820060 28140 replica.cpp:478] Replica received implicit
> promise request from (7145)@172.16.132.117:37586 with proposal 1
> DEBUG: I1020 08:56:36.885488 28140 leveldb.cpp:306] Persisting metadata (8
> bytes) to leveldb took 65.391754ms
> DEBUG: I1020 08:56:36.885524 28140 replica.cpp:345] Persisted promised to 1
> DEBUG: I1020 08:56:36.885898 28148 coordinator.cpp:231] Coordinator attemping
> to fill missing position
> DEBUG: I1020 08:56:36.888197 28143 replica.cpp:378] Replica received explicit
> promise request from (7146)@172.16.132.117:37586 for position 0 with proposal
> 2
> DEBUG: I1020 08:56:36.952364 28143 leveldb.cpp:343] Persisting action (8
> bytes) to leveldb took 64.11496ms
> DEBUG: I1020 08:56:36.952407 28143 replica.cpp:681] Persisted action at 0
> DEBUG: I1020 08:56:36.953027 28139 replica.cpp:512] Replica received write
> request for position 0 from (7147)@172.16.132.117:37586
> DEBUG: I1020 08:56:36.953059 28139 leveldb.cpp:438] Reading position from
> leveldb took 14203ns
> DEBUG: I1020 08:56:36.994161 28139 leveldb.cpp:343] Persisting action (14
> bytes) to leveldb took 41.082908ms
> DEBUG: I1020 08:56:36.994199 28139 replica.cpp:681] Persisted action at 0
> DEBUG: I1020 08:56:36.994415 28143 replica.cpp:660] Replica received learned
> notice for position 0 from @0.0.0.0:0
> DEBUG: I1020 08:56:37.036006 28143 leveldb.cpp:343] Persisting action (16
> bytes) to leveldb took 41.569884ms
> DEBUG: I1020 08:56:37.036043 28143 replica.cpp:681] Persisted action at 0
> DEBUG: I1020 08:56:37.036056 28143 replica.cpp:666] Replica learned NOP
> action at position 0
> DEBUG: I1020 08:56:37.036365 28138 log.cpp:677] Writer started with ending
> position 0
> DEBUG: I1020 08:56:37.037250 28133 leveldb.cpp:438] Reading position from
> leveldb took 45165ns
> DEBUG: I1020 08:56:37.038321 28131 registrar.cpp:342] Successfully fetched
> the registry (0B) in 331.61216ms
> DEBUG: I1020 08:56:37.038378 28131 registrar.cpp:441] Applied 1 operations in
> 6704ns; attempting to update the 'registry'
> DEBUG: I1020 08:56:37.038996 28132 log.cpp:685] Attempting to append 228
> bytes to the log
> DEBUG: I1020 08:56:37.039103 28132 coordinator.cpp:341] Coordinator
> attempting to write APPEND action at position 1
> DEBUG: I1020 08:56:37.039717 28152 replica.cpp:512] Replica received write
> request for position 1 from (7148)@172.16.132.117:37586
> DEBUG: I1020 08:56:37.111176 28152 leveldb.cpp:343] Persisting action (247
> bytes) to leveldb took 71.428059ms
> DEBUG: I1020 08:56:37.111225 28152 replica.cpp:681] Persisted action at 1
> DEBUG: I1020 08:56:37.111547 28138 replica.cpp:660] Replica received learned
> notice for position 1 from @0.0.0.0:0
> DEBUG: I1020 08:56:37.177995 28138 leveldb.cpp:343] Persisting action (249
> bytes) to leveldb took 66.426857ms
> DEBUG: I1020 08:56:37.178033 28138 replica.cpp:681] Persisted action at 1
> DEBUG: I1020 08:56:37.178045 28138 replica.cpp:666] Replica learned APPEND
> action at position 1
> DEBUG: I1020 08:56:37.178407 28137 registrar.cpp:486] Successfully updated
> the 'registry' in 139.99104ms
> DEBUG: I1020 08:56:37.178473 28137 registrar.cpp:372] Successfully recovered
> registrar
> DEBUG: I1020 08:56:37.178496 28146 log.cpp:704] Attempting to truncate the
> log to 1
> DEBUG: I1020 08:56:37.178966 28145 master.cpp:1419] Recovered 0 slaves from
> the Registry (189B) ; allowing 10mins for slaves to re-register
> DEBUG: I1020 08:56:37.178951 28140 coordinator.cpp:341] Coordinator
> attempting to write TRUNCATE action at position 2
> DEBUG: I1020 08:56:37.180043 28132 replica.cpp:512] Replica received write
> request for position 2 from (7149)@172.16.132.117:37586
> DEBUG: I1020 08:56:37.219796 28132 leveldb.cpp:343] Persisting action (16
> bytes) to leveldb took 39.72368ms
> DEBUG: I1020 08:56:37.219825 28132 replica.cpp:681] Persisted action at 2
> DEBUG: I1020 08:56:37.219990 28132 replica.cpp:660] Replica received learned
> notice for position 2 from @0.0.0.0:0
> DEBUG: I1020 08:56:37.278291 28132 leveldb.cpp:343] Persisting action (18
> bytes) to leveldb took 58.280037ms
> DEBUG: I1020 08:56:37.278342 28132 leveldb.cpp:401] Deleting ~1 keys from
> leveldb took 21056ns
> DEBUG: I1020 08:56:37.278355 28132 replica.cpp:681] Persisted action at 2
> DEBUG: I1020 08:56:37.278363 28132 replica.cpp:666] Replica learned TRUNCATE
> action at position 2
> DEBUG: I1020 08:56:37.281244 28115 containerizer.cpp:143] Using isolation:
> posix/cpu,posix/mem,filesystem/posix
> DEBUG: W1020 08:56:37.281390 28115 backend.cpp:50] Failed to create 'bind'
> backend: BindBackend requires root privileges
> DEBUG: I1020 08:56:37.284584 28153 slave.cpp:191] Slave started on
> 255)@172.16.132.117:37586
> DEBUG: I1020 08:56:37.284600 28153 slave.cpp:192] Flags at startup:
> --appc_store_dir="/tmp/mesos/store/appc" --authenticatee="crammd5"
> --cgroups_cpu_enable_pids_and_tids_count="false" --cgroups_enable_cfs="false"
> --cgroups_hierarchy="/sys/fs/cgroup" --cgroups_limit_swap="false"
> --cgroups_root="mesos" --container_disk_watch_interval="15secs"
> --containerizers="mesos"
> --credential="/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/credential"
> --default_role="*" --disk_watch_interval="1mins" --docker="docker"
> --docker_kill_orphans="true"
> --docker_local_archives_dir="/tmp/mesos/images/docker"
> --docker_puller="local" --docker_remove_delay="6hrs"
> --docker_socket="/var/run/docker.sock" --docker_stop_timeout="0ns"
> --docker_store_dir="/tmp/mesos/store/docker"
> --egress_unique_flow_per_container="false"
> --enforce_container_disk_quota="false" --ephemeral_ports_per_container="1024"
> --executor_registration_timeout="1mins"
> --executor_shutdown_grace_period="5secs"
> --fetcher_cache_dir="/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/fetch"
> --fetcher_cache_size="2GB" --frameworks_home="" --gc_delay="1weeks"
> --gc_disk_headroom="0.1" --hadoop_home="" --help="false"
> --hostname_lookup="true" --image_provisioner_backend="copy"
> --initialize_driver_logging="true" --isolation="posix/cpu,posix/mem"
> --launcher_dir="/builddir/build/BUILD/mesos-0.26.0/src" --logbufsecs="0"
> --logging_level="INFO" --network_enable_socket_statistics_details="false"
> --network_enable_socket_statistics_summary="false"
> --oversubscribed_resources_interval="15secs" --perf_duration="10secs"
> --perf_interval="1mins" --qos_correction_interval_min="0ns" --quiet="false"
> --recover="reconnect" --recovery_timeout="15mins"
> --registration_backoff_factor="10ms"
> --resources="cpus:2;mem:1024;disk:1024;ports:[31000-32000]"
> --revocable_cpu_low_priority="true" --sandbox_directory="/mnt/mesos/sandbox"
> --strict="true" --switch_user="true"
> --systemd_runtime_directory="/run/systemd/system" --version="false"
> --work_dir="/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx"
> DEBUG: I1020 08:56:37.284816 28153 credentials.hpp:85] Loading credential for
> authentication from
> '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/credential'
> DEBUG: I1020 08:56:37.284910 28153 slave.cpp:322] Slave using credential for:
> test-principal
> DEBUG: I1020 08:56:37.285879 28153 slave.cpp:392] Slave resources: cpus(*):2;
> mem(*):1024; disk(*):1024; ports(*):[31000-32000]
> DEBUG: I1020 08:56:37.285928 28153 slave.cpp:400] Slave attributes: [ ]
> DEBUG: I1020 08:56:37.285943 28153 slave.cpp:405] Slave hostname:
> smfc-ahy-19-sr2.corpdc.twitter.com
> DEBUG: I1020 08:56:37.285955 28153 slave.cpp:410] Slave checkpoint: true
> DEBUG: I1020 08:56:37.286746 28115 sched.cpp:166] Version: 0.26.0-rc0
> DEBUG: I1020 08:56:37.286973 28151 state.cpp:54] Recovering state from
> '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/meta'
> DEBUG: I1020 08:56:37.287063 28143 status_update_manager.cpp:202] Recovering
> status update manager
> DEBUG: I1020 08:56:37.287384 28137 containerizer.cpp:385] Recovering
> containerizer
> DEBUG: I1020 08:56:37.287608 28139 sched.cpp:264] New master detected at
> [email protected]:37586
> DEBUG: I1020 08:56:37.287746 28139 sched.cpp:320] Authenticating with master
> [email protected]:37586
> DEBUG: I1020 08:56:37.287757 28139 sched.cpp:327] Using default CRAM-MD5
> authenticatee
> DEBUG: I1020 08:56:37.287824 28135 slave.cpp:4222] Finished recovery
> DEBUG: I1020 08:56:37.287967 28149 authenticatee.cpp:115] Creating new client
> SASL connection
> DEBUG: I1020 08:56:37.288192 28139 master.cpp:5144] Authenticating
> [email protected]:37586
> DEBUG: I1020 08:56:37.288382 28136 status_update_manager.cpp:176] Pausing
> sending status updates
> DEBUG: I1020 08:56:37.288444 28140 slave.cpp:726] New master detected at
> [email protected]:37586
> DEBUG: I1020 08:56:37.288467 28140 slave.cpp:789] Authenticating with master
> [email protected]:37586
> DEBUG: I1020 08:56:37.288475 28140 slave.cpp:794] Using default CRAM-MD5
> authenticatee
> DEBUG: I1020 08:56:37.288508 28140 slave.cpp:762] Detecting new master
> DEBUG: I1020 08:56:37.288604 28130 authenticatee.cpp:115] Creating new client
> SASL connection
> DEBUG: I1020 08:56:37.288717 28140 authenticator.cpp:92] Creating new server
> SASL connection
> DEBUG: I1020 08:56:37.288785 28139 master.cpp:5144] Authenticating
> slave(255)@172.16.132.117:37586
> DEBUG: I1020 08:56:37.288861 28135 authenticatee.cpp:206] Received SASL
> authentication mechanisms: CRAM-MD5
> DEBUG: I1020 08:56:37.288882 28135 authenticatee.cpp:232] Attempting to
> authenticate with mechanism 'CRAM-MD5'
> DEBUG: I1020 08:56:37.289080 28133 authenticator.cpp:92] Creating new server
> SASL connection
> DEBUG: I1020 08:56:37.289175 28131 authenticatee.cpp:206] Received SASL
> authentication mechanisms: CRAM-MD5
> DEBUG: I1020 08:56:37.289199 28131 authenticatee.cpp:232] Attempting to
> authenticate with mechanism 'CRAM-MD5'
> DEBUG: I1020 08:56:37.289355 28144 authenticator.cpp:197] Received SASL
> authentication start
> DEBUG: I1020 08:56:37.289371 28140 authenticator.cpp:197] Received SASL
> authentication start
> DEBUG: I1020 08:56:37.289440 28144 authenticator.cpp:319] Authentication
> requires more steps
> DEBUG: I1020 08:56:37.289459 28140 authenticator.cpp:319] Authentication
> requires more steps
> DEBUG: I1020 08:56:37.289480 28144 authenticatee.cpp:252] Received SASL
> authentication step
> DEBUG: I1020 08:56:37.289489 28140 authenticatee.cpp:252] Received SASL
> authentication step
> DEBUG: I1020 08:56:37.289813 28132 authenticator.cpp:225] Received SASL
> authentication step
> DEBUG: I1020 08:56:37.289852 28132 authenticator.cpp:311] Authentication
> success
> DEBUG: I1020 08:56:37.289898 28143 authenticatee.cpp:292] Authentication
> success
> DEBUG: I1020 08:56:37.289971 28142 master.cpp:5174] Successfully
> authenticated principal 'test-principal' at
> [email protected]:37586
> DEBUG: I1020 08:56:37.290009 28143 sched.cpp:409] Successfully authenticated
> with master [email protected]:37586
> DEBUG: I1020 08:56:37.290163 28137 authenticator.cpp:225] Received SASL
> authentication step
> DEBUG: I1020 08:56:37.290220 28137 authenticator.cpp:311] Authentication
> success
> DEBUG: I1020 08:56:37.290266 28134 master.cpp:2185] Received SUBSCRIBE call
> for framework 'default' at
> [email protected]:37586
> DEBUG: I1020 08:56:37.290289 28134 master.cpp:1648] Authorizing framework
> principal 'test-principal' to receive offers for role '*'
> DEBUG: I1020 08:56:37.290405 28153 authenticatee.cpp:292] Authentication
> success
> DEBUG: I1020 08:56:37.290431 28144 master.cpp:5174] Successfully
> authenticated principal 'test-principal' at slave(255)@172.16.132.117:37586
> DEBUG: I1020 08:56:37.290464 28144 master.cpp:2256] Subscribing framework
> default with checkpointing enabled and capabilities [ ]
> DEBUG: I1020 08:56:37.290654 28136 hierarchical.cpp:185] Added framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:37.290650 28148 slave.cpp:857] Successfully authenticated
> with master [email protected]:37586
> DEBUG: I1020 08:56:37.290817 28144 master.cpp:3868] Registering slave at
> slave(255)@172.16.132.117:37586 (smfc-ahy-19-sr2.corpdc.twitter.com) with id
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0
> DEBUG: I1020 08:56:37.290873 28151 sched.cpp:643] Framework registered with
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:37.290930 28130 registrar.cpp:441] Applied 1 operations in
> 16086ns; attempting to update the 'registry'
> DEBUG: I1020 08:56:37.291250 28151 log.cpp:685] Attempting to append 419
> bytes to the log
> DEBUG: I1020 08:56:37.291321 28139 coordinator.cpp:341] Coordinator
> attempting to write APPEND action at position 3
> DEBUG: I1020 08:56:37.291939 28131 replica.cpp:512] Replica received write
> request for position 3 from (7161)@172.16.132.117:37586
> DEBUG: I1020 08:56:37.300041 28136 master.cpp:3856] Ignoring register slave
> message from slave(255)@172.16.132.117:37586
> (smfc-ahy-19-sr2.corpdc.twitter.com) as admission is already in progress
> DEBUG: I1020 08:56:37.311420 28141 master.cpp:3856] Ignoring register slave
> message from slave(255)@172.16.132.117:37586
> (smfc-ahy-19-sr2.corpdc.twitter.com) as admission is already in progress
> DEBUG: I1020 08:56:37.341991 28130 master.cpp:3856] Ignoring register slave
> message from slave(255)@172.16.132.117:37586
> (smfc-ahy-19-sr2.corpdc.twitter.com) as admission is already in progress
> DEBUG: I1020 08:56:37.354004 28131 leveldb.cpp:343] Persisting action (438
> bytes) to leveldb took 62.03948ms
> DEBUG: I1020 08:56:37.354042 28131 replica.cpp:681] Persisted action at 3
> DEBUG: I1020 08:56:37.354390 28136 replica.cpp:660] Replica received learned
> notice for position 3 from @0.0.0.0:0
> DEBUG: I1020 08:56:37.378435 28130 master.cpp:3856] Ignoring register slave
> message from slave(255)@172.16.132.117:37586
> (smfc-ahy-19-sr2.corpdc.twitter.com) as admission is already in progress
> DEBUG: I1020 08:56:37.395829 28136 leveldb.cpp:343] Persisting action (440
> bytes) to leveldb took 41.417781ms
> DEBUG: I1020 08:56:37.395867 28136 replica.cpp:681] Persisted action at 3
> DEBUG: I1020 08:56:37.395879 28136 replica.cpp:666] Replica learned APPEND
> action at position 3
> DEBUG: I1020 08:56:37.396229 28136 registrar.cpp:486] Successfully updated
> the 'registry' in 105.257984ms
> DEBUG: I1020 08:56:37.396466 28148 log.cpp:704] Attempting to truncate the
> log to 3
> DEBUG: I1020 08:56:37.396519 28151 coordinator.cpp:341] Coordinator
> attempting to write TRUNCATE action at position 4
> DEBUG: I1020 08:56:37.396827 28136 master.cpp:3936] Registered slave
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 at slave(255)@172.16.132.117:37586
> (smfc-ahy-19-sr2.corpdc.twitter.com) with cpus(*):2; mem(*):1024;
> disk(*):1024; ports(*):[31000-32000]
> DEBUG: I1020 08:56:37.396867 28137 slave.cpp:901] Registered with master
> [email protected]:37586; given slave ID
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0
> DEBUG: I1020 08:56:37.396874 28152 hierarchical.cpp:335] Added slave
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 (smfc-ahy-19-sr2.corpdc.twitter.com)
> with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] (allocated:
> )
> DEBUG: I1020 08:56:37.397186 28150 master.cpp:4973] Sending 1 offers to
> framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 (default) at
> [email protected]:37586
> DEBUG: I1020 08:56:37.397260 28132 status_update_manager.cpp:183] Resuming
> sending status updates
> DEBUG: I1020 08:56:37.397285 28136 replica.cpp:512] Replica received write
> request for position 4 from (7162)@172.16.132.117:37586
> DEBUG: I1020 08:56:37.397516 28137 slave.cpp:960] Forwarding total
> oversubscribed resources
> DEBUG: I1020 08:56:37.397769 28150 master.cpp:4278] Received update of slave
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 at slave(255)@172.16.132.117:37586
> (smfc-ahy-19-sr2.corpdc.twitter.com) with total oversubscribed resources
> DEBUG: I1020 08:56:37.397902 28149 hierarchical.cpp:391] Slave
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 (smfc-ahy-19-sr2.corpdc.twitter.com)
> updated with oversubscribed resources (total: cpus(*):2; mem(*):1024;
> disk(*):1024; ports(*):[31000-32000], allocated: cpus(*):2; mem(*):1024;
> disk(*):1024; ports(*):[31000-32000])
> DEBUG: I1020 08:56:37.399044 28152 master.cpp:2924] Processing ACCEPT call
> for offers: [ 0d54e2f1-43d7-4f74-8532-9c37ac40791b-O0 ] on slave
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 at slave(255)@172.16.132.117:37586
> (smfc-ahy-19-sr2.corpdc.twitter.com) for framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 (default) at
> [email protected]:37586
> DEBUG: I1020 08:56:37.399066 28152 master.cpp:2720] Authorizing framework
> principal 'test-principal' to launch task
> 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b as user 'mockbuild'
> DEBUG: I1020 08:56:37.399948 28147 master.hpp:176] Adding task
> 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b with resources cpus(*):2; mem(*):1024;
> disk(*):1024; ports(*):[31000-32000] on slave
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 (smfc-ahy-19-sr2.corpdc.twitter.com)
> DEBUG: I1020 08:56:37.399991 28147 master.cpp:3254] Launching task
> 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 (default) at
> [email protected]:37586 with
> resources cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on
> slave 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 at
> slave(255)@172.16.132.117:37586 (smfc-ahy-19-sr2.corpdc.twitter.com)
> DEBUG: I1020 08:56:37.400190 28147 slave.cpp:1291] Got assigned task
> 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b for framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:37.409716 28147 slave.cpp:1407] Launching task
> 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b for framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:37.410130 28147 paths.cpp:417] Trying to chown
> '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/slaves/0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0/frameworks/0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000/executors/0bbb69a3-3ba3-4016-86ed-4d80c1bf594b/runs/2b1371cc-4d78-44d1-b1f1-e1b29127adc8'
> to user 'mockbuild'
> DEBUG: I1020 08:56:37.418064 28147 slave.cpp:4994] Launching executor
> 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 with resources cpus(*):0.1;
> mem(*):32 in work directory
> '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/slaves/0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0/frameworks/0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000/executors/0bbb69a3-3ba3-4016-86ed-4d80c1bf594b/runs/2b1371cc-4d78-44d1-b1f1-e1b29127adc8'
> DEBUG: I1020 08:56:37.418360 28137 containerizer.cpp:639] Starting container
> '2b1371cc-4d78-44d1-b1f1-e1b29127adc8' for executor
> '0bbb69a3-3ba3-4016-86ed-4d80c1bf594b' of framework
> '0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000'
> DEBUG: I1020 08:56:37.418512 28147 slave.cpp:1625] Queuing task
> '0bbb69a3-3ba3-4016-86ed-4d80c1bf594b' for executor
> 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework
> '0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:37.421762 28143 launcher.cpp:132] Forked child with pid
> '40438' for container '2b1371cc-4d78-44d1-b1f1-e1b29127adc8'
> DEBUG: I1020 08:56:37.421979 28143 containerizer.cpp:872] Checkpointing
> executor's forked pid 40438 to
> '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/meta/slaves/0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0/frameworks/0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000/executors/0bbb69a3-3ba3-4016-86ed-4d80c1bf594b/runs/2b1371cc-4d78-44d1-b1f1-e1b29127adc8/pids/forked.pid'
> DEBUG: I1020 08:56:37.463163 28136 leveldb.cpp:343] Persisting action (16
> bytes) to leveldb took 65.83211ms
> DEBUG: I1020 08:56:37.463227 28136 replica.cpp:681] Persisted action at 4
> DEBUG: I1020 08:56:37.464862 28136 replica.cpp:660] Replica received learned
> notice for position 4 from @0.0.0.0:0
> DEBUG: I1020 08:56:37.504696 40438 exec.cpp:136] Version: 0.26.0-rc0
> DEBUG: I1020 08:56:37.506574 28150 slave.cpp:2415] Got registration for
> executor '0bbb69a3-3ba3-4016-86ed-4d80c1bf594b' of framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 from
> executor(1)@172.16.132.117:41415
> DEBUG: I1020 08:56:37.575783 40482 exec.cpp:210] Executor registered on slave
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0
> DEBUG: I1020 08:56:37.576037 28147 slave.cpp:1796] Sending queued task
> '0bbb69a3-3ba3-4016-86ed-4d80c1bf594b' to executor
> '0bbb69a3-3ba3-4016-86ed-4d80c1bf594b' of framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: Registered executor on smfc-ahy-19-sr2.corpdc.twitter.com
> DEBUG: Starting task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b
> DEBUG: sh -c 'exit 0'
> DEBUG: Forked command at 40500
> DEBUG: I1020 08:56:37.580664 28139 slave.cpp:2778] Handling status update
> TASK_RUNNING (UUID: fb1a8b73-5136-4a8f-ae92-9d26b13fb41b) for task
> 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 from
> executor(1)@172.16.132.117:41415
> DEBUG: I1020 08:56:37.580994 28146 status_update_manager.cpp:322] Received
> status update TASK_RUNNING (UUID: fb1a8b73-5136-4a8f-ae92-9d26b13fb41b) for
> task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:37.581665 28146 status_update_manager.cpp:826]
> Checkpointing UPDATE for status update TASK_RUNNING (UUID:
> fb1a8b73-5136-4a8f-ae92-9d26b13fb41b) for task
> 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:37.623446 28136 leveldb.cpp:343] Persisting action (18
> bytes) to leveldb took 158.550858ms
> DEBUG: Command exited with status 0 (pid: 40500)
> DEBUG: I1020 08:56:37.623538 28136 leveldb.cpp:401] Deleting ~2 keys from
> leveldb took 41621ns
> DEBUG: I1020 08:56:37.623558 28136 replica.cpp:681] Persisted action at 4
> DEBUG: I1020 08:56:37.623571 28136 replica.cpp:666] Replica learned TRUNCATE
> action at position 4
> DEBUG: I1020 08:56:37.681676 28145 slave.cpp:2778] Handling status update
> TASK_FINISHED (UUID: 4a38e137-c590-48f6-ba10-2e0ba4903ac5) for task
> 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 from
> executor(1)@172.16.132.117:41415
> DEBUG: I1020 08:56:37.840840 28146 status_update_manager.cpp:322] Received
> status update TASK_FINISHED (UUID: 4a38e137-c590-48f6-ba10-2e0ba4903ac5) for
> task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:37.840920 28146 status_update_manager.cpp:826]
> Checkpointing UPDATE for status update TASK_FINISHED (UUID:
> 4a38e137-c590-48f6-ba10-2e0ba4903ac5) for task
> 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:37.840901 28148 slave.cpp:3090] Forwarding the update
> TASK_RUNNING (UUID: fb1a8b73-5136-4a8f-ae92-9d26b13fb41b) for task
> 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 to [email protected]:37586
> DEBUG: I1020 08:56:37.841084 28148 slave.cpp:3020] Sending acknowledgement
> for status update TASK_RUNNING (UUID: fb1a8b73-5136-4a8f-ae92-9d26b13fb41b)
> for task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 to executor(1)@172.16.132.117:41415
> DEBUG: I1020 08:56:37.841506 28153 master.cpp:4421] Status update
> TASK_RUNNING (UUID: fb1a8b73-5136-4a8f-ae92-9d26b13fb41b) for task
> 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 from slave
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 at slave(255)@172.16.132.117:37586
> (smfc-ahy-19-sr2.corpdc.twitter.com)
> DEBUG: I1020 08:56:37.841531 28153 master.cpp:4460] Forwarding status update
> TASK_RUNNING (UUID: fb1a8b73-5136-4a8f-ae92-9d26b13fb41b) for task
> 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:37.841815 28153 master.cpp:6086] Updating the latest state
> of task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 to TASK_FINISHED (status update
> state: TASK_RUNNING)
> DEBUG: I1020 08:56:37.842349 28131 hierarchical.cpp:739] Recovered cpus(*):2;
> mem(*):1024; disk(*):1024; ports(*):[31000-32000] (total: cpus(*):2;
> mem(*):1024; disk(*):1024; ports(*):[31000-32000], allocated: ) on slave
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 from framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:37.843065 28152 master.cpp:3612] Processing ACKNOWLEDGE
> call fb1a8b73-5136-4a8f-ae92-9d26b13fb41b for task
> 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 (default) at
> [email protected]:37586 on slave
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0
> DEBUG: I1020 08:56:37.932744 28146 status_update_manager.cpp:394] Received
> status update acknowledgement (UUID: fb1a8b73-5136-4a8f-ae92-9d26b13fb41b)
> for task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:37.932793 28146 status_update_manager.cpp:826]
> Checkpointing ACK for status update TASK_RUNNING (UUID:
> fb1a8b73-5136-4a8f-ae92-9d26b13fb41b) for task
> 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:37.932839 28152 slave.cpp:3020] Sending acknowledgement
> for status update TASK_FINISHED (UUID: 4a38e137-c590-48f6-ba10-2e0ba4903ac5)
> for task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 to executor(1)@172.16.132.117:41415
> DEBUG: 2015-10-20
> 08:56:37,948:28115(0x7f526963c700):ZOO_ERROR@handle_socket_error_msg@1697:
> Socket [127.0.0.1:34917] zk retcode=-4, errno=111(Connection refused): server
> refused to accept the client
> DEBUG: I1020 08:56:38.091502 28144 slave.cpp:3090] Forwarding the update
> TASK_FINISHED (UUID: 4a38e137-c590-48f6-ba10-2e0ba4903ac5) for task
> 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 to [email protected]:37586
> DEBUG: I1020 08:56:38.091954 28152 master.cpp:4421] Status update
> TASK_FINISHED (UUID: 4a38e137-c590-48f6-ba10-2e0ba4903ac5) for task
> 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 from slave
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 at slave(255)@172.16.132.117:37586
> (smfc-ahy-19-sr2.corpdc.twitter.com)
> DEBUG: I1020 08:56:38.092000 28152 master.cpp:4460] Forwarding status update
> TASK_FINISHED (UUID: 4a38e137-c590-48f6-ba10-2e0ba4903ac5) for task
> 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: E1020 08:56:38.092056 28152 master.cpp:6032] Ignoring status update
> for the terminated task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b (TASK_FINISHED
> -> TASK_FINISHED) of framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:38.092725 28153 master.cpp:3612] Processing ACKNOWLEDGE
> call 4a38e137-c590-48f6-ba10-2e0ba4903ac5 for task
> 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 (default) at
> [email protected]:37586 on slave
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0
> DEBUG: I1020 08:56:38.092818 28153 status_update_manager.cpp:394] Received
> status update acknowledgement (UUID: 4a38e137-c590-48f6-ba10-2e0ba4903ac5)
> for task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:38.092852 28153 status_update_manager.cpp:826]
> Checkpointing ACK for status update TASK_FINISHED (UUID:
> 4a38e137-c590-48f6-ba10-2e0ba4903ac5) for task
> 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:38.691695 28137 containerizer.cpp:1278] Executor for
> container '2b1371cc-4d78-44d1-b1f1-e1b29127adc8' has exited
> DEBUG: I1020 08:56:38.691723 28137 containerizer.cpp:1095] Destroying
> container '2b1371cc-4d78-44d1-b1f1-e1b29127adc8'
> DEBUG: I1020 08:56:38.708005 28138 master.cpp:4973] Sending 1 offers to
> framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 (default) at
> [email protected]:37586
> DEBUG: I1020 08:56:38.715711 28141 slave.cpp:3542] Executor
> '0bbb69a3-3ba3-4016-86ed-4d80c1bf594b' of framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 exited with status 0
> DEBUG: I1020 08:56:38.715741 28141 slave.cpp:3646] Cleaning up executor
> '0bbb69a3-3ba3-4016-86ed-4d80c1bf594b' of framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: tests/slave_recovery_tests.cpp:881: Failure
> DEBUG: Mock function called more times than expected - returning directly.
> DEBUG: Function call: statusUpdate(0x7fffabf3b0c0, @0x7f52f0b9d740
> 112-byte object <F0-C1 40-FC 52-7F 00-00 00-00 00-00 00-00 00-00 B0-24 01-AC
> 52-7F 00-00 00-95 01-AC 52-7F 00-00 05-00 00-00 01-00 00-00 80-28 40-01 00-00
> 00-00 40-AF 01-AC 52-7F 00-00 00-00 00-00 00-00 00-00 09-00 00-00 00-00 00-00
> 8B-FB 6F-FD 6F-90 D5-41 20-AF 01-AC 52-7F 00-00 00-00 00-00 00-00 00-00 00-00
> 00-00 00-00 00-00 00-00 00-00 5F-03 00-00>)
> DEBUG: Expected: to be called twice
> DEBUG: Actual: called 3 times - over-saturated and active
> DEBUG: I1020 08:56:38.716214 28141 slave.cpp:3735] Cleaning up framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:38.716245 28131 gc.cpp:56] Scheduling
> '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/slaves/0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0/frameworks/0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000/executors/0bbb69a3-3ba3-4016-86ed-4d80c1bf594b/runs/2b1371cc-4d78-44d1-b1f1-e1b29127adc8'
> for gc 6.99999171211852days in the future
> DEBUG: I1020 08:56:38.716320 28151 status_update_manager.cpp:284] Closing
> status update streams for framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:38.716517 28131 gc.cpp:56] Scheduling
> '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/slaves/0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0/frameworks/0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000/executors/0bbb69a3-3ba3-4016-86ed-4d80c1bf594b'
> for gc 6.99999171165333days in the future
> DEBUG: I1020 08:56:38.716562 28131 gc.cpp:56] Scheduling
> '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/meta/slaves/0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0/frameworks/0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000/executors/0bbb69a3-3ba3-4016-86ed-4d80c1bf594b/runs/2b1371cc-4d78-44d1-b1f1-e1b29127adc8'
> for gc 6.99999171135407days in the future
> DEBUG: I1020 08:56:38.716585 28131 gc.cpp:56] Scheduling
> '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/meta/slaves/0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0/frameworks/0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000/executors/0bbb69a3-3ba3-4016-86ed-4d80c1bf594b'
> for gc 6.99999171074074days in the future
> DEBUG: I1020 08:56:38.716604 28131 gc.cpp:56] Scheduling
> '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/slaves/0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0/frameworks/0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000'
> for gc 6.99999171001185days in the future
> DEBUG: I1020 08:56:38.716624 28131 gc.cpp:56] Scheduling
> '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/meta/slaves/0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0/frameworks/0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000'
> for gc 6.9999917082637days in the future
> DEBUG: I1020 08:56:38.716841 28115 slave.cpp:606] Slave terminating
> DEBUG: I1020 08:56:38.716974 28150 master.cpp:1086] Slave
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 at slave(255)@172.16.132.117:37586
> (smfc-ahy-19-sr2.corpdc.twitter.com) disconnected
> DEBUG: I1020 08:56:38.716994 28150 master.cpp:2540] Disconnecting slave
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 at slave(255)@172.16.132.117:37586
> (smfc-ahy-19-sr2.corpdc.twitter.com)
> DEBUG: I1020 08:56:38.717082 28150 master.cpp:2559] Deactivating slave
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 at slave(255)@172.16.132.117:37586
> (smfc-ahy-19-sr2.corpdc.twitter.com)
> DEBUG: I1020 08:56:38.717128 28136 hierarchical.cpp:420] Slave
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 deactivated
> DEBUG: I1020 08:56:38.717308 28136 hierarchical.cpp:739] Recovered cpus(*):2;
> mem(*):1024; disk(*):1024; ports(*):[31000-32000] (total: cpus(*):2;
> mem(*):1024; disk(*):1024; ports(*):[31000-32000], allocated: ) on slave
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 from framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:38.718613 28115 containerizer.cpp:143] Using isolation:
> posix/cpu,posix/mem,filesystem/posix
> DEBUG: W1020 08:56:38.718729 28115 backend.cpp:50] Failed to create 'bind'
> backend: BindBackend requires root privileges
> DEBUG: I1020 08:56:38.720690 28142 slave.cpp:191] Slave started on
> 256)@172.16.132.117:37586
> DEBUG: I1020 08:56:38.720705 28142 slave.cpp:192] Flags at startup:
> --appc_store_dir="/tmp/mesos/store/appc" --authenticatee="crammd5"
> --cgroups_cpu_enable_pids_and_tids_count="false" --cgroups_enable_cfs="false"
> --cgroups_hierarchy="/sys/fs/cgroup" --cgroups_limit_swap="false"
> --cgroups_root="mesos" --container_disk_watch_interval="15secs"
> --containerizers="mesos"
> --credential="/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/credential"
> --default_role="*" --disk_watch_interval="1mins" --docker="docker"
> --docker_kill_orphans="true"
> --docker_local_archives_dir="/tmp/mesos/images/docker"
> --docker_puller="local" --docker_remove_delay="6hrs"
> --docker_socket="/var/run/docker.sock" --docker_stop_timeout="0ns"
> --docker_store_dir="/tmp/mesos/store/docker"
> --egress_unique_flow_per_container="false"
> --enforce_container_disk_quota="false" --ephemeral_ports_per_container="1024"
> --executor_registration_timeout="1mins"
> --executor_shutdown_grace_period="5secs"
> --fetcher_cache_dir="/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/fetch"
> --fetcher_cache_size="2GB" --frameworks_home="" --gc_delay="1weeks"
> --gc_disk_headroom="0.1" --hadoop_home="" --help="false"
> --hostname_lookup="true" --image_provisioner_backend="copy"
> --initialize_driver_logging="true" --isolation="posix/cpu,posix/mem"
> --launcher_dir="/builddir/build/BUILD/mesos-0.26.0/src" --logbufsecs="0"
> --logging_level="INFO" --network_enable_socket_statistics_details="false"
> --network_enable_socket_statistics_summary="false"
> --oversubscribed_resources_interval="15secs" --perf_duration="10secs"
> --perf_interval="1mins" --qos_correction_interval_min="0ns" --quiet="false"
> --recover="reconnect" --recovery_timeout="15mins"
> --registration_backoff_factor="10ms"
> --resources="cpus:2;mem:1024;disk:1024;ports:[31000-32000]"
> --revocable_cpu_low_priority="true" --sandbox_directory="/mnt/mesos/sandbox"
> --strict="true" --switch_user="true"
> --systemd_runtime_directory="/run/systemd/system" --version="false"
> --work_dir="/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx"
> DEBUG: I1020 08:56:38.720965 28142 credentials.hpp:85] Loading credential for
> authentication from
> '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/credential'
> DEBUG: I1020 08:56:38.721045 28142 slave.cpp:322] Slave using credential for:
> test-principal
> DEBUG: I1020 08:56:38.721573 28142 slave.cpp:392] Slave resources: cpus(*):2;
> mem(*):1024; disk(*):1024; ports(*):[31000-32000]
> DEBUG: I1020 08:56:38.721603 28142 slave.cpp:400] Slave attributes: [ ]
> DEBUG: I1020 08:56:38.721609 28142 slave.cpp:405] Slave hostname:
> smfc-ahy-19-sr2.corpdc.twitter.com
> DEBUG: I1020 08:56:38.721613 28142 slave.cpp:410] Slave checkpoint: true
> DEBUG: I1020 08:56:38.721895 28131 state.cpp:54] Recovering state from
> '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/meta'
> DEBUG: I1020 08:56:38.721935 28131 state.cpp:681] No checkpointed resources
> found at
> '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/meta/resources/resources.info'
> DEBUG: I1020 08:56:38.723804 28131 slave.cpp:4310] Recovering framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:38.723827 28131 slave.cpp:5103] Recovering executor
> '0bbb69a3-3ba3-4016-86ed-4d80c1bf594b' of framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:38.724068 28142 gc.cpp:56] Scheduling
> '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/slaves/0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0/frameworks/0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000/executors/0bbb69a3-3ba3-4016-86ed-4d80c1bf594b/runs/2b1371cc-4d78-44d1-b1f1-e1b29127adc8'
> for gc 6.99999162064889days in the future
> DEBUG: I1020 08:56:38.724123 28142 gc.cpp:56] Scheduling
> '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/meta/slaves/0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0/frameworks/0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000/executors/0bbb69a3-3ba3-4016-86ed-4d80c1bf594b/runs/2b1371cc-4d78-44d1-b1f1-e1b29127adc8'
> for gc 6.99999161984889days in the future
> DEBUG: I1020 08:56:38.724180 28142 gc.cpp:56] Scheduling
> '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/slaves/0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0/frameworks/0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000/executors/0bbb69a3-3ba3-4016-86ed-4d80c1bf594b'
> for gc 6.99999161965333days in the future
> DEBUG: I1020 08:56:38.724195 28131 slave.cpp:3735] Cleaning up framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:38.724220 28142 gc.cpp:56] Scheduling
> '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/meta/slaves/0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0/frameworks/0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000/executors/0bbb69a3-3ba3-4016-86ed-4d80c1bf594b'
> for gc 6.99999161943111days in the future
> DEBUG: I1020 08:56:38.724318 28142 status_update_manager.cpp:284] Closing
> status update streams for framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:38.724431 28142 gc.cpp:56] Scheduling
> '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/slaves/0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0/frameworks/0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000'
> for gc 6.99999161661037days in the future
> DEBUG: I1020 08:56:38.724465 28142 gc.cpp:56] Scheduling
> '/tmp/SlaveRecoveryTest_0_RecoverCompletedExecutor_WO4xyx/meta/slaves/0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0/frameworks/0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000'
> for gc 6.99999161520889days in the future
> DEBUG: I1020 08:56:38.724778 28140 status_update_manager.cpp:202] Recovering
> status update manager
> DEBUG: I1020 08:56:38.724791 28140 status_update_manager.cpp:210] Recovering
> executor '0bbb69a3-3ba3-4016-86ed-4d80c1bf594b' of framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:38.725278 28132 containerizer.cpp:385] Recovering
> containerizer
> DEBUG: I1020 08:56:38.725953 28142 slave.cpp:4222] Finished recovery
> DEBUG: I1020 08:56:38.726372 28131 status_update_manager.cpp:176] Pausing
> sending status updates
> DEBUG: I1020 08:56:38.726676 28133 slave.cpp:726] New master detected at
> [email protected]:37586
> DEBUG: I1020 08:56:38.726698 28133 slave.cpp:789] Authenticating with master
> [email protected]:37586
> DEBUG: I1020 08:56:38.726706 28133 slave.cpp:794] Using default CRAM-MD5
> authenticatee
> DEBUG: I1020 08:56:38.726750 28133 slave.cpp:762] Detecting new master
> DEBUG: I1020 08:56:38.726832 28153 authenticatee.cpp:115] Creating new client
> SASL connection
> DEBUG: I1020 08:56:38.727023 28153 master.cpp:5144] Authenticating
> slave(256)@172.16.132.117:37586
> DEBUG: I1020 08:56:38.727404 28149 authenticator.cpp:92] Creating new server
> SASL connection
> DEBUG: I1020 08:56:38.727483 28149 authenticatee.cpp:206] Received SASL
> authentication mechanisms: CRAM-MD5
> DEBUG: I1020 08:56:38.727496 28149 authenticatee.cpp:232] Attempting to
> authenticate with mechanism 'CRAM-MD5'
> DEBUG: I1020 08:56:38.727525 28149 authenticator.cpp:197] Received SASL
> authentication start
> DEBUG: I1020 08:56:38.727591 28149 authenticator.cpp:319] Authentication
> requires more steps
> DEBUG: I1020 08:56:38.727670 28135 authenticatee.cpp:252] Received SASL
> authentication step
> DEBUG: I1020 08:56:38.727736 28135 authenticator.cpp:225] Received SASL
> authentication step
> DEBUG: I1020 08:56:38.727766 28135 authenticator.cpp:311] Authentication
> success
> DEBUG: I1020 08:56:38.727821 28147 authenticatee.cpp:292] Authentication
> success
> DEBUG: I1020 08:56:38.727826 28132 master.cpp:5174] Successfully
> authenticated principal 'test-principal' at slave(256)@172.16.132.117:37586
> DEBUG: I1020 08:56:38.728272 28144 slave.cpp:857] Successfully authenticated
> with master [email protected]:37586
> DEBUG: I1020 08:56:38.728466 28153 master.cpp:4028] Re-registering slave
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 at slave(255)@172.16.132.117:37586
> (smfc-ahy-19-sr2.corpdc.twitter.com)
> DEBUG: W1020 08:56:38.728508 28153 master.cpp:5241] Task
> 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 unknown to the slave
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 at slave(256)@172.16.132.117:37586
> (smfc-ahy-19-sr2.corpdc.twitter.com) during re-registration: reconciling with
> the slave
> DEBUG: I1020 08:56:38.728565 28153 master.cpp:4216] Sending updated
> checkpointed resources to slave 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 at
> slave(256)@172.16.132.117:37586 (smfc-ahy-19-sr2.corpdc.twitter.com)
> DEBUG: I1020 08:56:38.728612 28145 slave.cpp:1001] Re-registered with master
> [email protected]:37586
> DEBUG: I1020 08:56:38.728628 28140 hierarchical.cpp:408] Slave
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 reactivated
> DEBUG: I1020 08:56:38.728693 28145 slave.cpp:1037] Forwarding total
> oversubscribed resources
> DEBUG: I1020 08:56:38.728703 28132 status_update_manager.cpp:183] Resuming
> sending status updates
> DEBUG: W1020 08:56:38.728718 28145 slave.cpp:1077] Slave reconciling task
> 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 in state TASK_LOST: task unknown to
> the slave
> DEBUG: I1020 08:56:38.728761 28135 master.cpp:4278] Received update of slave
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 at slave(256)@172.16.132.117:37586
> (smfc-ahy-19-sr2.corpdc.twitter.com) with total oversubscribed resources
> DEBUG: I1020 08:56:38.728778 28141 status_update_manager.cpp:322] Received
> status update TASK_LOST (UUID: 30c316cf-fda2-4ab4-9c43-910e9416bab2) for task
> 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:38.728927 28135 hierarchical.cpp:391] Slave
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 (smfc-ahy-19-sr2.corpdc.twitter.com)
> updated with oversubscribed resources (total: cpus(*):2; mem(*):1024;
> disk(*):1024; ports(*):[31000-32000], allocated: )
> DEBUG: I1020 08:56:38.728973 28145 slave.cpp:2284] Updated checkpointed
> resources from to
> DEBUG: I1020 08:56:38.729003 28145 slave.cpp:3090] Forwarding the update
> TASK_LOST (UUID: 30c316cf-fda2-4ab4-9c43-910e9416bab2) for task
> 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 to [email protected]:37586
> DEBUG: I1020 08:56:38.729135 28148 master.cpp:4973] Sending 1 offers to
> framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 (default) at
> [email protected]:37586
> DEBUG: I1020 08:56:38.729395 28148 master.cpp:4421] Status update TASK_LOST
> (UUID: 30c316cf-fda2-4ab4-9c43-910e9416bab2) for task
> 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 from slave
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 at slave(256)@172.16.132.117:37586
> (smfc-ahy-19-sr2.corpdc.twitter.com)
> DEBUG: I1020 08:56:38.729423 28148 master.cpp:4460] Forwarding status update
> TASK_LOST (UUID: 30c316cf-fda2-4ab4-9c43-910e9416bab2) for task
> 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b of framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: E1020 08:56:38.729467 28148 master.cpp:6032] Ignoring status update
> for the terminated task 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b (TASK_FINISHED
> -> TASK_LOST) of framework 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:38.729768 28115 sched.cpp:1781] Asked to stop the driver
> DEBUG: I1020 08:56:38.729800 28115 master.cpp:925] Master terminating
> DEBUG: I1020 08:56:38.729825 28115 master.cpp:6154] Removing task
> 0bbb69a3-3ba3-4016-86ed-4d80c1bf594b with resources cpus(*):2; mem(*):1024;
> disk(*):1024; ports(*):[31000-32000] of framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000 on slave
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0 at slave(256)@172.16.132.117:37586
> (smfc-ahy-19-sr2.corpdc.twitter.com)
> DEBUG: I1020 08:56:38.729920 28137 hierarchical.cpp:364] Removed slave
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-S0
> DEBUG: I1020 08:56:38.729949 28135 sched.cpp:1042] Stopping framework
> '0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000'
> DEBUG: I1020 08:56:38.730121 28137 hierarchical.cpp:220] Removed framework
> 0d54e2f1-43d7-4f74-8532-9c37ac40791b-0000
> DEBUG: I1020 08:56:38.730507 28135 slave.cpp:3258]
> [email protected]:37586 exited
> DEBUG: W1020 08:56:38.730526 28135 slave.cpp:3261] Master disconnected!
> Waiting for a new master to be elected
> DEBUG: [ FAILED ] SlaveRecoveryTest/0.RecoverCompletedExecutor, where
> TypeParam = mesos::internal::slave::MesosContainerizer (2288 ms)
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)