Jie Yu created MESOS-3509:
-----------------------------
Summary: SlaveTest.TerminatingSlaveDoesNotReregister is flaky
Key: MESOS-3509
URL: https://issues.apache.org/jira/browse/MESOS-3509
Project: Mesos
Issue Type: Bug
Affects Versions: 0.25.0
Reporter: Jie Yu
Observed on Apache CI
{noformat}
[ RUN ] SlaveTest.TerminatingSlaveDoesNotReregister
Using temporary directory
'/tmp/SlaveTest_TerminatingSlaveDoesNotReregister_PiT5jn'
I0924 04:54:06.645386 29977 leveldb.cpp:176] Opened db in 112.347533ms
I0924 04:54:06.665618 29977 leveldb.cpp:183] Compacted db in 20.158157ms
I0924 04:54:06.665704 29977 leveldb.cpp:198] Created db iterator in 24721ns
I0924 04:54:06.665727 29977 leveldb.cpp:204] Seeked to beginning of db in 3125ns
I0924 04:54:06.665741 29977 leveldb.cpp:273] Iterated through 0 keys in the db
in 256ns
I0924 04:54:06.665799 29977 replica.cpp:744] Replica recovered with log
positions 0 -> 0 with 1 holes and 0 unlearned
I0924 04:54:06.666378 30000 recover.cpp:449] Starting replica recovery
I0924 04:54:06.666831 30000 recover.cpp:475] Replica is in EMPTY status
I0924 04:54:06.667945 30008 replica.cpp:641] Replica in EMPTY status received a
broadcasted recover request
I0924 04:54:06.668473 30005 recover.cpp:195] Received a recover response from a
replica in EMPTY status
I0924 04:54:06.669003 30007 recover.cpp:566] Updating replica status to STARTING
I0924 04:54:06.669154 30000 master.cpp:370] Master
bd006350-cdbf-414f-9eef-25f03ccdc5fb (7c9c99874a2d) started on
172.17.1.154:55676
I0924 04:54:06.669347 30000 master.cpp:372] Flags at startup: --acls=""
--allocation_interval="1secs" --allocator="HierarchicalDRF"
--authenticate="true" --authenticate_slaves="true" --authenticators="crammd5"
--authorizers="local"
--credentials="/tmp/SlaveTest_TerminatingSlaveDoesNotReregister_PiT5jn/credentials"
--framework_sorter="drf" --help="false" --initialize_driver_logging="true"
--log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO"
--max_slave_ping_timeouts="5" --quiet="false"
--recovery_slave_removal_limit="100%" --registry="replicated_log"
--registry_fetch_timeout="1mins" --registry_store_timeout="25secs"
--registry_strict="true" --root_submissions="true"
--slave_ping_timeout="15secs" --slave_reregister_timeout="10mins"
--user_sorter="drf" --version="false"
--webui_dir="/mesos/mesos-0.25.0/_inst/share/mesos/webui"
--work_dir="/tmp/SlaveTest_TerminatingSlaveDoesNotReregister_PiT5jn/master"
--zk_session_timeout="10secs"
I0924 04:54:06.669595 30000 master.cpp:417] Master only allowing authenticated
frameworks to register
I0924 04:54:06.669605 30000 master.cpp:422] Master only allowing authenticated
slaves to register
I0924 04:54:06.669613 30000 credentials.hpp:37] Loading credentials for
authentication from
'/tmp/SlaveTest_TerminatingSlaveDoesNotReregister_PiT5jn/credentials'
I0924 04:54:06.669874 30000 master.cpp:461] Using default 'crammd5'
authenticator
I0924 04:54:06.670027 30000 master.cpp:498] Authorization enabled
I0924 04:54:06.670264 30005 hierarchical.hpp:468] Initialized hierarchical
allocator process
I0924 04:54:06.670320 30005 whitelist_watcher.cpp:79] No whitelist given
I0924 04:54:06.671406 29999 master.cpp:1597] The newly elected leader is
[email protected]:55676 with id bd006350-cdbf-414f-9eef-25f03ccdc5fb
I0924 04:54:06.671438 29999 master.cpp:1610] Elected as the leading master!
I0924 04:54:06.671461 29999 master.cpp:1370] Recovering from registrar
I0924 04:54:06.671612 30011 registrar.cpp:309] Recovering registrar
I0924 04:54:06.704048 30002 leveldb.cpp:306] Persisting metadata (8 bytes) to
leveldb took 34.707125ms
I0924 04:54:06.704123 30002 replica.cpp:323] Persisted replica status to
STARTING
I0924 04:54:06.704504 29997 recover.cpp:475] Replica is in STARTING status
I0924 04:54:06.705658 30002 replica.cpp:641] Replica in STARTING status
received a broadcasted recover request
I0924 04:54:06.706466 30007 recover.cpp:195] Received a recover response from a
replica in STARTING status
I0924 04:54:06.707064 30004 recover.cpp:566] Updating replica status to VOTING
I0924 04:54:06.737488 30004 leveldb.cpp:306] Persisting metadata (8 bytes) to
leveldb took 30.261425ms
I0924 04:54:06.737560 30004 replica.cpp:323] Persisted replica status to VOTING
I0924 04:54:06.737746 30004 recover.cpp:580] Successfully joined the Paxos group
I0924 04:54:06.737932 30004 recover.cpp:464] Recover process terminated
I0924 04:54:06.738535 30009 log.cpp:661] Attempting to start the writer
I0924 04:54:06.739835 30001 replica.cpp:477] Replica received implicit promise
request with proposal 1
I0924 04:54:06.770998 30001 leveldb.cpp:306] Persisting metadata (8 bytes) to
leveldb took 31.093864ms
I0924 04:54:06.771077 30001 replica.cpp:345] Persisted promised to 1
I0924 04:54:06.771845 30002 coordinator.cpp:231] Coordinator attemping to fill
missing position
I0924 04:54:06.773066 30002 replica.cpp:378] Replica received explicit promise
request for position 0 with proposal 2
I0924 04:54:06.804437 30002 leveldb.cpp:343] Persisting action (8 bytes) to
leveldb took 31.314548ms
I0924 04:54:06.804520 30002 replica.cpp:679] Persisted action at 0
I0924 04:54:06.805915 30002 replica.cpp:511] Replica received write request for
position 0
I0924 04:54:06.806094 30002 leveldb.cpp:438] Reading position from leveldb took
45268ns
I0924 04:54:06.837913 30002 leveldb.cpp:343] Persisting action (14 bytes) to
leveldb took 31.745501ms
I0924 04:54:06.838006 30002 replica.cpp:679] Persisted action at 0
I0924 04:54:06.838798 30002 replica.cpp:658] Replica received learned notice
for position 0
I0924 04:54:06.871426 30002 leveldb.cpp:343] Persisting action (16 bytes) to
leveldb took 32.585604ms
I0924 04:54:06.871502 30002 replica.cpp:679] Persisted action at 0
I0924 04:54:06.871536 30002 replica.cpp:664] Replica learned NOP action at
position 0
I0924 04:54:06.872514 30006 log.cpp:677] Writer started with ending position 0
I0924 04:54:06.873711 30006 leveldb.cpp:438] Reading position from leveldb took
48006ns
I0924 04:54:06.874902 30004 registrar.cpp:342] Successfully fetched the
registry (0B) in 203136us
I0924 04:54:06.875100 30004 registrar.cpp:441] Applied 1 operations in 34506ns;
attempting to update the 'registry'
I0924 04:54:06.876008 29999 log.cpp:685] Attempting to append 176 bytes to the
log
I0924 04:54:06.876159 30006 coordinator.cpp:341] Coordinator attempting to
write APPEND action at position 1
I0924 04:54:06.877343 29996 replica.cpp:511] Replica received write request for
position 1
I0924 04:54:06.913260 29996 leveldb.cpp:343] Persisting action (195 bytes) to
leveldb took 35.865668ms
I0924 04:54:06.913338 29996 replica.cpp:679] Persisted action at 1
I0924 04:54:06.914463 29996 replica.cpp:658] Replica received learned notice
for position 1
I0924 04:54:06.941951 29996 leveldb.cpp:343] Persisting action (197 bytes) to
leveldb took 27.224493ms
I0924 04:54:06.942047 29996 replica.cpp:679] Persisted action at 1
I0924 04:54:06.942083 29996 replica.cpp:664] Replica learned APPEND action at
position 1
I0924 04:54:06.943408 30005 registrar.cpp:486] Successfully updated the
'registry' in 68.169984ms
I0924 04:54:06.943563 30005 registrar.cpp:372] Successfully recovered registrar
I0924 04:54:06.944015 29998 log.cpp:704] Attempting to truncate the log to 1
I0924 04:54:06.944458 29997 coordinator.cpp:341] Coordinator attempting to
write TRUNCATE action at position 2
I0924 04:54:06.944510 29996 master.cpp:1407] Recovered 0 slaves from the
Registry (137B) ; allowing 10mins for slaves to re-register
I0924 04:54:06.945586 30001 replica.cpp:511] Replica received write request for
position 2
I0924 04:54:06.967072 30001 leveldb.cpp:343] Persisting action (16 bytes) to
leveldb took 21.44102ms
I0924 04:54:06.967136 30001 replica.cpp:679] Persisted action at 2
I0924 04:54:06.968070 30001 replica.cpp:658] Replica received learned notice
for position 2
I0924 04:54:06.992204 30001 leveldb.cpp:343] Persisting action (18 bytes) to
leveldb took 24.095375ms
I0924 04:54:06.992321 30001 leveldb.cpp:401] Deleting ~1 keys from leveldb took
55868ns
I0924 04:54:06.992532 30001 replica.cpp:679] Persisted action at 2
I0924 04:54:06.992811 30001 replica.cpp:664] Replica learned TRUNCATE action at
position 2
I0924 04:54:07.000255 30008 slave.cpp:190] Slave started on
299)@172.17.1.154:55676
I0924 04:54:07.000752 30008 slave.cpp:191] Flags at startup:
--appc_store_dir="/tmp/mesos/store/appc" --authenticatee="crammd5"
--cgroups_cpu_enable_pids_and_tids_count="false" --cgroups_enable_cfs="false"
--cgroups_hierarchy="/sys/fs/cgroup" --cgroups_limit_swap="false"
--cgroups_root="mesos" --container_disk_watch_interval="15secs"
--containerizers="mesos"
--credential="/tmp/SlaveTest_TerminatingSlaveDoesNotReregister_yCbi9q/credential"
--default_role="*" --disk_watch_interval="1mins" --docker="docker"
--docker_kill_orphans="true" --docker_remove_delay="6hrs"
--docker_socket="/var/run/docker.sock" --docker_stop_timeout="0ns"
--enforce_container_disk_quota="false" --executor_registration_timeout="1mins"
--executor_shutdown_grace_period="2mins"
--fetcher_cache_dir="/tmp/SlaveTest_TerminatingSlaveDoesNotReregister_yCbi9q/fetch"
--fetcher_cache_size="2GB" --frameworks_home="" --gc_delay="1weeks"
--gc_disk_headroom="0.1" --hadoop_home="" --help="false"
--image_provisioner_backend="copy" --initialize_driver_logging="true"
--isolation="posix/cpu,posix/mem"
--launcher_dir="/mesos/mesos-0.25.0/_build/src" --logbufsecs="0"
--logging_level="INFO" --oversubscribed_resources_interval="15secs"
--perf_duration="10secs" --perf_interval="1mins"
--qos_correction_interval_min="0ns" --quiet="false" --recover="reconnect"
--recovery_timeout="15mins" --registration_backoff_factor="10ms"
--resource_monitoring_interval="1secs"
--resources="cpus:2;mem:1024;disk:1024;ports:[31000-32000]"
--revocable_cpu_low_priority="true" --sandbox_directory="/mnt/mesos/sandbox"
--strict="true" --switch_user="true"
--systemd_runtime_directory="/run/systemd/system" --version="false"
--work_dir="/tmp/SlaveTest_TerminatingSlaveDoesNotReregister_yCbi9q"
I0924 04:54:07.001853 30008 credentials.hpp:85] Loading credential for
authentication from
'/tmp/SlaveTest_TerminatingSlaveDoesNotReregister_yCbi9q/credential'
I0924 04:54:07.002065 29977 sched.cpp:164] Version: 0.25.0
I0924 04:54:07.002388 30008 slave.cpp:321] Slave using credential for:
test-principal
I0924 04:54:07.003141 30004 sched.cpp:262] New master detected at
[email protected]:55676
I0924 04:54:07.003265 30008 slave.cpp:354] Slave resources: cpus(*):2;
mem(*):1024; disk(*):1024; ports(*):[31000-32000]
I0924 04:54:07.003336 30004 sched.cpp:318] Authenticating with master
[email protected]:55676
I0924 04:54:07.003521 30008 slave.cpp:384] Slave hostname: 7c9c99874a2d
I0924 04:54:07.003526 30004 sched.cpp:325] Using default CRAM-MD5 authenticatee
I0924 04:54:07.003676 30008 slave.cpp:389] Slave checkpoint: true
I0924 04:54:07.004098 30011 authenticatee.cpp:115] Creating new client SASL
connection
I0924 04:54:07.004578 30007 master.cpp:5132] Authenticating
[email protected]:55676
I0924 04:54:07.004796 30006 authenticator.cpp:407] Starting authentication
session for crammd5_authenticatee(669)@172.17.1.154:55676
I0924 04:54:07.004910 30004 state.cpp:54] Recovering state from
'/tmp/SlaveTest_TerminatingSlaveDoesNotReregister_yCbi9q/meta'
I0924 04:54:07.005316 29997 authenticator.cpp:92] Creating new server SASL
connection
I0924 04:54:07.005751 30010 status_update_manager.cpp:202] Recovering status
update manager
I0924 04:54:07.005754 29997 authenticatee.cpp:206] Received SASL authentication
mechanisms: CRAM-MD5
I0924 04:54:07.005810 29997 authenticatee.cpp:232] Attempting to authenticate
with mechanism 'CRAM-MD5'
I0924 04:54:07.006036 30010 authenticator.cpp:197] Received SASL authentication
start
I0924 04:54:07.006132 30010 authenticator.cpp:319] Authentication requires more
steps
I0924 04:54:07.006291 30010 authenticatee.cpp:252] Received SASL authentication
step
I0924 04:54:07.006407 29997 slave.cpp:4104] Finished recovery
I0924 04:54:07.006443 30010 authenticator.cpp:225] Received SASL authentication
step
I0924 04:54:07.006700 30010 auxprop.cpp:102] Request to lookup properties for
user: 'test-principal' realm: '7c9c99874a2d' server FQDN: '7c9c99874a2d'
SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false
SASL_AUXPROP_AUTHZID: false
I0924 04:54:07.006810 30010 auxprop.cpp:174] Looking up auxiliary property
'*userPassword'
I0924 04:54:07.007015 30010 auxprop.cpp:174] Looking up auxiliary property
'*cmusaslsecretCRAM-MD5'
I0924 04:54:07.007077 30010 auxprop.cpp:102] Request to lookup properties for
user: 'test-principal' realm: '7c9c99874a2d' server FQDN: '7c9c99874a2d'
SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false
SASL_AUXPROP_AUTHZID: true
I0924 04:54:07.007202 30010 auxprop.cpp:124] Skipping auxiliary property
'*userPassword' since SASL_AUXPROP_AUTHZID == true
I0924 04:54:07.007321 30010 auxprop.cpp:124] Skipping auxiliary property
'*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true
I0924 04:54:07.007336 29997 slave.cpp:4261] Querying resource estimator for
oversubscribable resources
I0924 04:54:07.007480 30010 authenticator.cpp:311] Authentication success
I0924 04:54:07.007685 29999 authenticatee.cpp:292] Authentication success
I0924 04:54:07.007732 30004 master.cpp:5162] Successfully authenticated
principal 'test-principal' at
[email protected]:55676
I0924 04:54:07.007818 30010 authenticator.cpp:425] Authentication session
cleanup for crammd5_authenticatee(669)@172.17.1.154:55676
I0924 04:54:07.008213 30006 sched.cpp:407] Successfully authenticated with
master [email protected]:55676
I0924 04:54:07.008236 29997 slave.cpp:699] New master detected at
[email protected]:55676
I0924 04:54:07.008430 30006 sched.cpp:714] Sending SUBSCRIBE call to
[email protected]:55676
I0924 04:54:07.008483 29998 status_update_manager.cpp:176] Pausing sending
status updates
I0924 04:54:07.008586 29997 slave.cpp:762] Authenticating with master
[email protected]:55676
I0924 04:54:07.008723 29997 slave.cpp:767] Using default CRAM-MD5 authenticatee
I0924 04:54:07.008754 30006 sched.cpp:747] Will retry registration in
1.626336152secs if necessary
I0924 04:54:07.008901 29997 slave.cpp:735] Detecting new master
I0924 04:54:07.008930 30006 master.cpp:2173] Received SUBSCRIBE call for
framework 'default' at
[email protected]:55676
I0924 04:54:07.008987 29998 authenticatee.cpp:115] Creating new client SASL
connection
I0924 04:54:07.009007 30006 master.cpp:1636] Authorizing framework principal
'test-principal' to receive offers for role '*'
I0924 04:54:07.009068 29997 slave.cpp:4275] Received oversubscribable resources
from the resource estimator
I0924 04:54:07.009196 30006 master.cpp:5132] Authenticating
slave(299)@172.17.1.154:55676
I0924 04:54:07.009300 29997 authenticator.cpp:407] Starting authentication
session for crammd5_authenticatee(670)@172.17.1.154:55676
I0924 04:54:07.009388 30006 master.cpp:2244] Subscribing framework default with
checkpointing disabled and capabilities [ ]
I0924 04:54:07.009531 30005 authenticator.cpp:92] Creating new server SASL
connection
I0924 04:54:07.009745 29997 authenticatee.cpp:206] Received SASL authentication
mechanisms: CRAM-MD5
I0924 04:54:07.009881 29997 authenticatee.cpp:232] Attempting to authenticate
with mechanism 'CRAM-MD5'
I0924 04:54:07.010095 30006 sched.cpp:641] Framework registered with
bd006350-cdbf-414f-9eef-25f03ccdc5fb-0000
I0924 04:54:07.010107 30002 authenticator.cpp:197] Received SASL authentication
start
I0924 04:54:07.010146 30006 sched.cpp:655] Scheduler::registered took 21969ns
I0924 04:54:07.010164 30002 authenticator.cpp:319] Authentication requires more
steps
I0924 04:54:07.010167 29998 hierarchical.hpp:515] Added framework
bd006350-cdbf-414f-9eef-25f03ccdc5fb-0000
I0924 04:54:07.010208 29998 hierarchical.hpp:1326] No resources available to
allocate!
I0924 04:54:07.010232 29998 hierarchical.hpp:1421] No inverse offers to send
out!
I0924 04:54:07.010251 29998 hierarchical.hpp:1221] Performed allocation for 0
slaves in 50319ns
I0924 04:54:07.010268 30002 authenticatee.cpp:252] Received SASL authentication
step
I0924 04:54:07.010380 30002 authenticator.cpp:225] Received SASL authentication
step
I0924 04:54:07.010416 30002 auxprop.cpp:102] Request to lookup properties for
user: 'test-principal' realm: '7c9c99874a2d' server FQDN: '7c9c99874a2d'
SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false
SASL_AUXPROP_AUTHZID: false
I0924 04:54:07.010434 30002 auxprop.cpp:174] Looking up auxiliary property
'*userPassword'
I0924 04:54:07.010474 30002 auxprop.cpp:174] Looking up auxiliary property
'*cmusaslsecretCRAM-MD5'
I0924 04:54:07.010514 30002 auxprop.cpp:102] Request to lookup properties for
user: 'test-principal' realm: '7c9c99874a2d' server FQDN: '7c9c99874a2d'
SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false
SASL_AUXPROP_AUTHZID: true
I0924 04:54:07.010532 30002 auxprop.cpp:124] Skipping auxiliary property
'*userPassword' since SASL_AUXPROP_AUTHZID == true
I0924 04:54:07.010545 30002 auxprop.cpp:124] Skipping auxiliary property
'*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true
I0924 04:54:07.010566 30002 authenticator.cpp:311] Authentication success
I0924 04:54:07.010723 29998 authenticatee.cpp:292] Authentication success
I0924 04:54:07.010924 30006 master.cpp:5162] Successfully authenticated
principal 'test-principal' at slave(299)@172.17.1.154:55676
I0924 04:54:07.011070 29998 slave.cpp:830] Successfully authenticated with
master [email protected]:55676
I0924 04:54:07.011245 29998 slave.cpp:1224] Will retry registration in
15.453813ms if necessary
I0924 04:54:07.011473 30009 authenticator.cpp:425] Authentication session
cleanup for crammd5_authenticatee(670)@172.17.1.154:55676
I0924 04:54:07.011514 30008 master.cpp:3856] Registering slave at
slave(299)@172.17.1.154:55676 (7c9c99874a2d) with id
bd006350-cdbf-414f-9eef-25f03ccdc5fb-S0
I0924 04:54:07.012133 30010 registrar.cpp:441] Applied 1 operations in 52871ns;
attempting to update the 'registry'
I0924 04:54:07.013129 30004 log.cpp:685] Attempting to append 345 bytes to the
log
I0924 04:54:07.013411 29996 coordinator.cpp:341] Coordinator attempting to
write APPEND action at position 3
I0924 04:54:07.014320 30009 replica.cpp:511] Replica received write request for
position 3
I0924 04:54:07.027837 30007 slave.cpp:1224] Will retry registration in
38.842723ms if necessary
I0924 04:54:07.028054 29996 master.cpp:3844] Ignoring register slave message
from slave(299)@172.17.1.154:55676 (7c9c99874a2d) as admission is already in
progress
I0924 04:54:07.042615 30009 leveldb.cpp:343] Persisting action (364 bytes) to
leveldb took 28.247365ms
I0924 04:54:07.042661 30009 replica.cpp:679] Persisted action at 3
I0924 04:54:07.043460 30003 replica.cpp:658] Replica received learned notice
for position 3
I0924 04:54:07.067387 30004 slave.cpp:1224] Will retry registration in
78.089263ms if necessary
I0924 04:54:07.067571 30009 master.cpp:3844] Ignoring register slave message
from slave(299)@172.17.1.154:55676 (7c9c99874a2d) as admission is already in
progress
I0924 04:54:07.067726 30003 leveldb.cpp:343] Persisting action (366 bytes) to
leveldb took 24.157267ms
I0924 04:54:07.067771 30003 replica.cpp:679] Persisted action at 3
I0924 04:54:07.067807 30003 replica.cpp:664] Replica learned APPEND action at
position 3
I0924 04:54:07.069293 30009 registrar.cpp:486] Successfully updated the
'registry' in 57.085952ms
I0924 04:54:07.069537 30000 log.cpp:704] Attempting to truncate the log to 3
I0924 04:54:07.069870 30000 coordinator.cpp:341] Coordinator attempting to
write TRUNCATE action at position 4
I0924 04:54:07.070420 30000 hierarchical.hpp:675] Added slave
bd006350-cdbf-414f-9eef-25f03ccdc5fb-S0 (7c9c99874a2d) with cpus(*):2;
mem(*):1024; disk(*):1024; ports(*):[31000-32000] (allocated: )
I0924 04:54:07.070986 30000 hierarchical.hpp:1421] No inverse offers to send
out!
I0924 04:54:07.071085 30004 replica.cpp:511] Replica received write request for
position 4
I0924 04:54:07.071107 30005 slave.cpp:3132] Received ping from
slave-observer(268)@172.17.1.154:55676
I0924 04:54:07.071499 30000 hierarchical.hpp:1239] Performed allocation for
slave bd006350-cdbf-414f-9eef-25f03ccdc5fb-S0 in 710571ns
I0924 04:54:07.071566 30005 slave.cpp:874] Registered with master
[email protected]:55676; given slave ID
bd006350-cdbf-414f-9eef-25f03ccdc5fb-S0
I0924 04:54:07.071707 30005 fetcher.cpp:77] Clearing fetcher cache
I0924 04:54:07.071993 30000 status_update_manager.cpp:183] Resuming sending
status updates
I0924 04:54:07.072276 30005 slave.cpp:897] Checkpointing SlaveInfo to
'/tmp/SlaveTest_TerminatingSlaveDoesNotReregister_yCbi9q/meta/slaves/bd006350-cdbf-414f-9eef-25f03ccdc5fb-S0/slave.info'
I0924 04:54:07.072701 30005 slave.cpp:933] Forwarding total oversubscribed
resources
I0924 04:54:07.071038 30011 master.cpp:3924] Registered slave
bd006350-cdbf-414f-9eef-25f03ccdc5fb-S0 at slave(299)@172.17.1.154:55676
(7c9c99874a2d) with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]
I0924 04:54:07.073376 30011 master.cpp:4961] Sending 1 offers to framework
bd006350-cdbf-414f-9eef-25f03ccdc5fb-0000 (default) at
[email protected]:55676
I0924 04:54:07.073616 30011 master.cpp:4266] Received update of slave
bd006350-cdbf-414f-9eef-25f03ccdc5fb-S0 at slave(299)@172.17.1.154:55676
(7c9c99874a2d) with total oversubscribed resources
I0924 04:54:07.074031 30011 hierarchical.hpp:735] Slave
bd006350-cdbf-414f-9eef-25f03ccdc5fb-S0 (7c9c99874a2d) updated with
oversubscribed resources (total: cpus(*):2; mem(*):1024; disk(*):1024;
ports(*):[31000-32000], allocated: cpus(*):2; mem(*):1024; disk(*):1024;
ports(*):[31000-32000])
I0924 04:54:07.074501 30011 hierarchical.hpp:1326] No resources available to
allocate!
I0924 04:54:07.074534 30011 hierarchical.hpp:1421] No inverse offers to send
out!
I0924 04:54:07.074550 30011 hierarchical.hpp:1239] Performed allocation for
slave bd006350-cdbf-414f-9eef-25f03ccdc5fb-S0 in 464700ns
I0924 04:54:07.074717 29998 sched.cpp:811] Scheduler::resourceOffers took
833347ns
I0924 04:54:07.075932 30001 master.cpp:2912] Processing ACCEPT call for offers:
[ bd006350-cdbf-414f-9eef-25f03ccdc5fb-O0 ] on slave
bd006350-cdbf-414f-9eef-25f03ccdc5fb-S0 at slave(299)@172.17.1.154:55676
(7c9c99874a2d) for framework bd006350-cdbf-414f-9eef-25f03ccdc5fb-0000
(default) at [email protected]:55676
I0924 04:54:07.076009 30001 master.cpp:2708] Authorizing framework principal
'test-principal' to launch task 0 as user 'mesos'
W0924 04:54:07.077509 30001 validation.cpp:422] Executor default for task 0
uses less CPUs (None) than the minimum required (0.01). Please update your
executor, as this will be mandatory in future releases.
W0924 04:54:07.077560 30001 validation.cpp:434] Executor default for task 0
uses less memory (None) than the minimum required (32MB). Please update your
executor, as this will be mandatory in future releases.
I0924 04:54:07.077864 30001 master.hpp:176] Adding task 0 with resources
cpus(*):1; mem(*):64 on slave bd006350-cdbf-414f-9eef-25f03ccdc5fb-S0
(7c9c99874a2d)
I0924 04:54:07.078027 30001 master.cpp:3242] Launching task 0 of framework
bd006350-cdbf-414f-9eef-25f03ccdc5fb-0000 (default) at
[email protected]:55676 with
resources cpus(*):1; mem(*):64 on slave bd006350-cdbf-414f-9eef-25f03ccdc5fb-S0
at slave(299)@172.17.1.154:55676 (7c9c99874a2d)
I0924 04:54:07.078366 30002 slave.cpp:1264] Got assigned task 0 for framework
bd006350-cdbf-414f-9eef-25f03ccdc5fb-0000
I0924 04:54:07.078779 30001 hierarchical.hpp:1103] Recovered cpus(*):1;
mem(*):960; disk(*):1024; ports(*):[31000-32000] (total: cpus(*):2;
mem(*):1024; disk(*):1024; ports(*):[31000-32000], allocated: cpus(*):1;
mem(*):64) on slave bd006350-cdbf-414f-9eef-25f03ccdc5fb-S0 from framework
bd006350-cdbf-414f-9eef-25f03ccdc5fb-0000
I0924 04:54:07.078855 30001 hierarchical.hpp:1140] Framework
bd006350-cdbf-414f-9eef-25f03ccdc5fb-0000 filtered slave
bd006350-cdbf-414f-9eef-25f03ccdc5fb-S0 for 5secs
I0924 04:54:07.078819 30002 slave.cpp:1380] Launching task 0 for framework
bd006350-cdbf-414f-9eef-25f03ccdc5fb-0000
I0924 04:54:07.090544 30002 slave.cpp:4846] Launching executor default of
framework bd006350-cdbf-414f-9eef-25f03ccdc5fb-0000 with resources in work
directory
'/tmp/SlaveTest_TerminatingSlaveDoesNotReregister_yCbi9q/slaves/bd006350-cdbf-414f-9eef-25f03ccdc5fb-S0/frameworks/bd006350-cdbf-414f-9eef-25f03ccdc5fb-0000/executors/default/runs/53b62666-1c94-4942-9cc0-a96f43a023f0'
I0924 04:54:07.092171 30002 exec.cpp:134] Version: 0.25.0
I0924 04:54:07.092450 29996 exec.cpp:184] Executor started at:
executor(96)@172.17.1.154:55676 with pid 29977
I0924 04:54:07.092682 30002 slave.cpp:1598] Queuing task '0' for executor
default of framework 'bd006350-cdbf-414f-9eef-25f03ccdc5fb-0000
I0924 04:54:07.092792 30002 slave.cpp:652] Successfully attached file
'/tmp/SlaveTest_TerminatingSlaveDoesNotReregister_yCbi9q/slaves/bd006350-cdbf-414f-9eef-25f03ccdc5fb-S0/frameworks/bd006350-cdbf-414f-9eef-25f03ccdc5fb-0000/executors/default/runs/53b62666-1c94-4942-9cc0-a96f43a023f0'
I0924 04:54:07.093004 30002 slave.cpp:2373] Got registration for executor
'default' of framework bd006350-cdbf-414f-9eef-25f03ccdc5fb-0000 from
executor(96)@172.17.1.154:55676
I0924 04:54:07.093211 30004 leveldb.cpp:343] Persisting action (16 bytes) to
leveldb took 21.885809ms
I0924 04:54:07.093260 30004 replica.cpp:679] Persisted action at 4
I0924 04:54:07.093518 30004 exec.cpp:208] Executor registered on slave
bd006350-cdbf-414f-9eef-25f03ccdc5fb-S0
I0924 04:54:07.093580 30004 exec.cpp:220] Executor::registered took 31899ns
I0924 04:54:07.093616 30002 slave.cpp:1754] Sending queued task '0' to executor
'default' of framework bd006350-cdbf-414f-9eef-25f03ccdc5fb-0000
I0924 04:54:07.093946 29998 exec.cpp:295] Executor asked to run task '0'
I0924 04:54:07.094220 29998 exec.cpp:304] Executor::launchTask took 68187ns
I0924 04:54:07.094442 29996 replica.cpp:658] Replica received learned notice
for position 4
I0924 04:54:07.094440 29998 exec.cpp:517] Executor sending status update
TASK_RUNNING (UUID: a2b0d4fd-ef47-40b4-9550-750dfd926f9d) for task 0 of
framework bd006350-cdbf-414f-9eef-25f03ccdc5fb-0000
I0924 04:54:07.094717 30011 slave.cpp:2711] Handling status update TASK_RUNNING
(UUID: a2b0d4fd-ef47-40b4-9550-750dfd926f9d) for task 0 of framework
bd006350-cdbf-414f-9eef-25f03ccdc5fb-0000 from executor(96)@172.17.1.154:55676
I0924 04:54:07.095018 29998 status_update_manager.cpp:322] Received status
update TASK_RUNNING (UUID: a2b0d4fd-ef47-40b4-9550-750dfd926f9d) for task 0 of
framework bd006350-cdbf-414f-9eef-25f03ccdc5fb-0000
I0924 04:54:07.095068 29998 status_update_manager.cpp:499] Creating
StatusUpdate stream for task 0 of framework
bd006350-cdbf-414f-9eef-25f03ccdc5fb-0000
I0924 04:54:07.095501 29998 status_update_manager.cpp:376] Forwarding update
TASK_RUNNING (UUID: a2b0d4fd-ef47-40b4-9550-750dfd926f9d) for task 0 of
framework bd006350-cdbf-414f-9eef-25f03ccdc5fb-0000 to the slave
I0924 04:54:07.095829 30004 slave.cpp:3010] Forwarding the update TASK_RUNNING
(UUID: a2b0d4fd-ef47-40b4-9550-750dfd926f9d) for task 0 of framework
bd006350-cdbf-414f-9eef-25f03ccdc5fb-0000 to [email protected]:55676
I0924 04:54:07.096191 30006 master.cpp:4409] Status update TASK_RUNNING (UUID:
a2b0d4fd-ef47-40b4-9550-750dfd926f9d) for task 0 of framework
bd006350-cdbf-414f-9eef-25f03ccdc5fb-0000 from slave
bd006350-cdbf-414f-9eef-25f03ccdc5fb-S0 at slave(299)@172.17.1.154:55676
(7c9c99874a2d)
I0924 04:54:07.096230 30006 master.cpp:4448] Forwarding status update
TASK_RUNNING (UUID: a2b0d4fd-ef47-40b4-9550-750dfd926f9d) for task 0 of
framework bd006350-cdbf-414f-9eef-25f03ccdc5fb-0000
I0924 04:54:07.096398 30006 master.cpp:6075] Updating the latest state of task
0 of framework bd006350-cdbf-414f-9eef-25f03ccdc5fb-0000 to TASK_RUNNING
I0924 04:54:07.096498 30004 slave.cpp:2934] Status update manager successfully
handled status update TASK_RUNNING (UUID: a2b0d4fd-ef47-40b4-9550-750dfd926f9d)
for task 0 of framework bd006350-cdbf-414f-9eef-25f03ccdc5fb-0000
I0924 04:54:07.096537 30004 slave.cpp:2940] Sending acknowledgement for status
update TASK_RUNNING (UUID: a2b0d4fd-ef47-40b4-9550-750dfd926f9d) for task 0 of
framework bd006350-cdbf-414f-9eef-25f03ccdc5fb-0000 to
executor(96)@172.17.1.154:55676
I0924 04:54:07.096684 30004 exec.cpp:341] Executor received status update
acknowledgement a2b0d4fd-ef47-40b4-9550-750dfd926f9d for task 0 of framework
bd006350-cdbf-414f-9eef-25f03ccdc5fb-0000
I0924 04:54:07.096706 30005 sched.cpp:918] Scheduler::statusUpdate took 159638ns
I0924 04:54:07.097100 30005 master.cpp:3600] Processing ACKNOWLEDGE call
a2b0d4fd-ef47-40b4-9550-750dfd926f9d for task 0 of framework
bd006350-cdbf-414f-9eef-25f03ccdc5fb-0000 (default) at
[email protected]:55676 on slave
bd006350-cdbf-414f-9eef-25f03ccdc5fb-S0
I0924 04:54:07.097384 30006 status_update_manager.cpp:394] Received status
update acknowledgement (UUID: a2b0d4fd-ef47-40b4-9550-750dfd926f9d) for task 0
of framework bd006350-cdbf-414f-9eef-25f03ccdc5fb-0000
I0924 04:54:07.097646 30006 slave.cpp:2313] Status update manager successfully
handled status update acknowledgement (UUID:
a2b0d4fd-ef47-40b4-9550-750dfd926f9d) for task 0 of framework
bd006350-cdbf-414f-9eef-25f03ccdc5fb-0000
I0924 04:54:07.098183 29999 status_update_manager.cpp:176] Pausing sending
status updates
I0924 04:54:07.098225 29997 slave.cpp:699] New master detected at
[email protected]:55676
I0924 04:54:07.098330 29997 slave.cpp:762] Authenticating with master
[email protected]:55676
I0924 04:54:07.098354 29997 slave.cpp:767] Using default CRAM-MD5 authenticatee
I0924 04:54:07.098578 29997 slave.cpp:735] Detecting new master
I0924 04:54:07.098682 29998 authenticatee.cpp:115] Creating new client SASL
connection
I0924 04:54:07.099108 30000 master.cpp:5132] Authenticating
slave(299)@172.17.1.154:55676
I0924 04:54:07.099225 30006 authenticator.cpp:407] Starting authentication
session for crammd5_authenticatee(671)@172.17.1.154:55676
I0924 04:54:07.099690 30000 authenticator.cpp:92] Creating new server SASL
connection
I0924 04:54:07.099915 30006 authenticatee.cpp:206] Received SASL authentication
mechanisms: CRAM-MD5
I0924 04:54:07.100013 30006 authenticatee.cpp:232] Attempting to authenticate
with mechanism 'CRAM-MD5'
I0924 04:54:07.100164 30006 authenticator.cpp:197] Received SASL authentication
start
I0924 04:54:07.100245 30006 authenticator.cpp:319] Authentication requires more
steps
I0924 04:54:07.100467 30007 authenticatee.cpp:252] Received SASL authentication
step
I0924 04:54:07.100605 30007 authenticator.cpp:225] Received SASL authentication
step
I0924 04:54:07.100642 30007 auxprop.cpp:102] Request to lookup properties for
user: 'test-principal' realm: '7c9c99874a2d' server FQDN: '7c9c99874a2d'
SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false
SASL_AUXPROP_AUTHZID: false
I0924 04:54:07.100658 30007 auxprop.cpp:174] Looking up auxiliary property
'*userPassword'
I0924 04:54:07.100718 30007 auxprop.cpp:174] Looking up auxiliary property
'*cmusaslsecretCRAM-MD5'
I0924 04:54:07.100766 30007 auxprop.cpp:102] Request to lookup properties for
user: 'test-principal' realm: '7c9c99874a2d' server FQDN: '7c9c99874a2d'
SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false
SASL_AUXPROP_AUTHZID: true
I0924 04:54:07.100790 30007 auxprop.cpp:124] Skipping auxiliary property
'*userPassword' since SASL_AUXPROP_AUTHZID == true
I0924 04:54:07.100802 30007 auxprop.cpp:124] Skipping auxiliary property
'*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true
I0924 04:54:07.100826 30007 authenticator.cpp:311] Authentication success
I0924 04:54:07.101024 30006 authenticatee.cpp:292] Authentication success
I0924 04:54:07.101088 30011 master.cpp:5162] Successfully authenticated
principal 'test-principal' at slave(299)@172.17.1.154:55676
I0924 04:54:07.101315 30006 slave.cpp:830] Successfully authenticated with
master [email protected]:55676
I0924 04:54:07.101629 30006 slave.cpp:1224] Will retry registration in 226426ns
if necessary
I0924 04:54:07.101769 30005 authenticator.cpp:425] Authentication session
cleanup for crammd5_authenticatee(671)@172.17.1.154:55676
I0924 04:54:07.102208 30008 master.cpp:4016] Re-registering slave
bd006350-cdbf-414f-9eef-25f03ccdc5fb-S0 at slave(299)@172.17.1.154:55676
(7c9c99874a2d)
I0924 04:54:07.102916 30008 master.cpp:4204] Sending updated checkpointed
resources to slave bd006350-cdbf-414f-9eef-25f03ccdc5fb-S0 at
slave(299)@172.17.1.154:55676 (7c9c99874a2d)
W0924 04:54:07.103030 30010 slave.cpp:2127] Dropping updateFramework message
for bd006350-cdbf-414f-9eef-25f03ccdc5fb-0000 because the slave is in
DISCONNECTED state
I0924 04:54:07.113637 30007 slave.cpp:2242] Updated checkpointed resources from
to
I0924 04:54:07.121868 29996 leveldb.cpp:343] Persisting action (18 bytes) to
leveldb took 27.362746ms
I0924 04:54:07.122032 29996 leveldb.cpp:401] Deleting ~2 keys from leveldb took
100541ns
I0924 04:54:07.122069 29996 replica.cpp:679] Persisted action at 4
I0924 04:54:07.122108 29996 replica.cpp:664] Replica learned TRUNCATE action at
position 4
I0924 04:54:07.128528 30006 slave.cpp:621] Slave asked to shut down by
[email protected]:55676
I0924 04:54:07.129091 30006 slave.cpp:1974] Asked to shut down framework
bd006350-cdbf-414f-9eef-25f03ccdc5fb-0000 by [email protected]:55676
I0924 04:54:07.129199 30006 slave.cpp:1999] Shutting down framework
bd006350-cdbf-414f-9eef-25f03ccdc5fb-0000
I0924 04:54:07.129402 30006 slave.cpp:3745] Shutting down executor 'default' of
framework bd006350-cdbf-414f-9eef-25f03ccdc5fb-0000
I0924 04:54:07.130210 30006 slave.cpp:1095] Skipping registration because slave
is terminating
I0924 04:54:07.130611 30006 slave.cpp:1095] Skipping registration because slave
is terminating
I0924 04:54:07.131108 30006 slave.cpp:4261] Querying resource estimator for
oversubscribable resources
I0924 04:54:07.131669 30006 slave.cpp:3846] Ignoring registration timeout for
executor 'default' because the framework
bd006350-cdbf-414f-9eef-25f03ccdc5fb-0000 is terminating
I0924 04:54:07.131726 30006 slave.cpp:3168] No pings from master received
within 75secs
I0924 04:54:07.131886 30006 slave.cpp:3920] Current disk usage 13.01%. Max
allowed age: 5.389529832493959days
I0924 04:54:07.132267 29999 hierarchical.hpp:1421] No inverse offers to send
out!
I0924 04:54:07.132382 29999 hierarchical.hpp:1221] Performed allocation for 1
slaves in 1.163512ms
I0924 04:54:07.132840 30006 master.cpp:4961] Sending 1 offers to framework
bd006350-cdbf-414f-9eef-25f03ccdc5fb-0000 (default) at
[email protected]:55676
I0924 04:54:07.132977 29999 slave.cpp:3132] Received ping from
slave-observer(268)@172.17.1.154:55676
I0924 04:54:07.133216 29999 slave.cpp:4275] Received oversubscribable resources
from the resource estimator
I0924 04:54:07.133376 30006 sched.cpp:811] Scheduler::resourceOffers took
29150ns
I0924 04:54:07.139232 29977 sched.cpp:1771] Asked to stop the driver
I0924 04:54:07.139361 29977 master.cpp:913] Master terminating
I0924 04:54:07.139490 29999 sched.cpp:1040] Stopping framework
'bd006350-cdbf-414f-9eef-25f03ccdc5fb-0000'
I0924 04:54:07.140082 29996 hierarchical.hpp:706] Removed slave
bd006350-cdbf-414f-9eef-25f03ccdc5fb-S0
W0924 04:54:07.140210 29977 master.cpp:6129] Removing task 0 with resources
cpus(*):1; mem(*):64 of framework bd006350-cdbf-414f-9eef-25f03ccdc5fb-0000 on
slave bd006350-cdbf-414f-9eef-25f03ccdc5fb-S0 at slave(299)@172.17.1.154:55676
(7c9c99874a2d) in non-terminal state TASK_RUNNING
I0924 04:54:07.140774 29977 master.cpp:6172] Removing executor 'default' with
resources of framework bd006350-cdbf-414f-9eef-25f03ccdc5fb-0000 on slave
bd006350-cdbf-414f-9eef-25f03ccdc5fb-S0 at slave(299)@172.17.1.154:55676
(7c9c99874a2d)
I0924 04:54:07.141835 30000 hierarchical.hpp:552] Removed framework
bd006350-cdbf-414f-9eef-25f03ccdc5fb-0000
I0924 04:54:07.142173 30000 slave.cpp:3178] [email protected]:55676 exited
W0924 04:54:07.142199 30000 slave.cpp:3181] Master disconnected! Waiting for a
new master to be elected
I0924 04:54:07.146646 30000 slave.cpp:3814] Killing executor 'default' of
framework bd006350-cdbf-414f-9eef-25f03ccdc5fb-0000
F0924 04:54:07.146987 30000 owned.hpp:111] Check failed: 'get()' Must be non
NULL
*** Check failure stack trace: ***
@ 0x2aeb5646f500 google::LogMessage::Fail()
@ 0x2aeb5646f44c google::LogMessage::SendToLog()
@ 0x2aeb5646ee4e google::LogMessage::Flush()
@ 0x2aeb56471d62 google::LogMessageFatal::~LogMessageFatal()
@ 0xa3e076 google::CheckNotNull<>()
@ 0xa39165 process::Owned<>::operator->()
@ 0xa34091 mesos::internal::tests::TestContainerizer::destroy()
@ 0x2aeb55adde4c
mesos::internal::slave::Slave::shutdownExecutorTimeout()
@ 0x2aeb55b1b2a0
_ZZN7process8dispatchIN5mesos8internal5slave5SlaveERKNS1_11FrameworkIDERKNS1_10ExecutorIDERKNS1_11ContainerIDES5_S8_SB_EEvRKNS_3PIDIT_EEMSF_FvT0_T1_T2_ET3_T4_T5_ENKUlPNS_11ProcessBaseEE_clESS_
@ 0x2aeb55b49f60
_ZNSt17_Function_handlerIFvPN7process11ProcessBaseEEZNS0_8dispatchIN5mesos8internal5slave5SlaveERKNS5_11FrameworkIDERKNS5_10ExecutorIDERKNS5_11ContainerIDES9_SC_SF_EEvRKNS0_3PIDIT_EEMSJ_FvT0_T1_T2_ET3_T4_T5_EUlS2_E_E9_M_invokeERKSt9_Any_dataS2_
@ 0x2aeb563df9df std::function<>::operator()()
@ 0x2aeb563c9211 process::ProcessBase::visit()
@ 0x2aeb563cc08c process::DispatchEvent::visit()
@ 0xbbf49c process::ProcessBase::serve()
@ 0x2aeb563c568a process::ProcessManager::resume()
@ 0x2aeb563b8ece process::internal::schedule()
@ 0x2aeb564154d1
_ZNSt12_Bind_simpleIFPFvvEvEE9_M_invokeIJEEEvSt12_Index_tupleIJXspT_EEE
@ 0x2aeb5641542b std::_Bind_simple<>::operator()()
@ 0x2aeb564153c4 std::thread::_Impl<>::_M_run()
@ 0x2aeb57bf9a40 (unknown)
@ 0x2aeb58370182 start_thread
@ 0x2aeb5868047d (unknown)
{noformat}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)