[ https://issues.apache.org/jira/browse/MESOS-1613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15049850#comment-15049850 ]
Benjamin Mahler commented on MESOS-1613:
----------------------------------------
For posterity: I also wasn't able to reproduce this just by running the test
repeatedly. However, when I ran one {{openssl speed}} process per core on my
laptop to induce load, I could reproduce it easily. We probably want to direct
folks to try this when they are having trouble reproducing something flaky
from CI.
I will post a fix through MESOS-4106.
> HealthCheckTest.ConsecutiveFailures is flaky
> --------------------------------------------
>
> Key: MESOS-1613
> URL: https://issues.apache.org/jira/browse/MESOS-1613
> Project: Mesos
> Issue Type: Bug
> Components: test
> Affects Versions: 0.20.0
> Environment: Ubuntu 10.04 GCC
> Reporter: Vinod Kone
> Assignee: Timothy Chen
> Labels: flaky, mesosphere
>
> {code}
> [ RUN ] HealthCheckTest.ConsecutiveFailures
> Using temporary directory '/tmp/HealthCheckTest_ConsecutiveFailures_AzK0OV'
> I0717 04:39:59.288471 5009 leveldb.cpp:176] Opened db in 21.575631ms
> I0717 04:39:59.295274 5009 leveldb.cpp:183] Compacted db in 6.471982ms
> I0717 04:39:59.295552 5009 leveldb.cpp:198] Created db iterator in 16783ns
> I0717 04:39:59.296026 5009 leveldb.cpp:204] Seeked to beginning of db in
> 2125ns
> I0717 04:39:59.296257 5009 leveldb.cpp:273] Iterated through 0 keys in the
> db in 10747ns
> I0717 04:39:59.296584 5009 replica.cpp:741] Replica recovered with log
> positions 0 -> 0 with 1 holes and 0 unlearned
> I0717 04:39:59.297322 5033 recover.cpp:425] Starting replica recovery
> I0717 04:39:59.297413 5033 recover.cpp:451] Replica is in EMPTY status
> I0717 04:39:59.297824 5033 replica.cpp:638] Replica in EMPTY status received
> a broadcasted recover request
> I0717 04:39:59.297899 5033 recover.cpp:188] Received a recover response from
> a replica in EMPTY status
> I0717 04:39:59.297997 5033 recover.cpp:542] Updating replica status to
> STARTING
> I0717 04:39:59.301985 5031 master.cpp:288] Master
> 20140717-043959-16842879-40280-5009 (lucid) started on 127.0.1.1:40280
> I0717 04:39:59.302026 5031 master.cpp:325] Master only allowing
> authenticated frameworks to register
> I0717 04:39:59.302032 5031 master.cpp:330] Master only allowing
> authenticated slaves to register
> I0717 04:39:59.302039 5031 credentials.hpp:36] Loading credentials for
> authentication from
> '/tmp/HealthCheckTest_ConsecutiveFailures_AzK0OV/credentials'
> I0717 04:39:59.302283 5031 master.cpp:359] Authorization enabled
> I0717 04:39:59.302971 5031 hierarchical_allocator_process.hpp:301]
> Initializing hierarchical allocator process with master :
> [email protected]:40280
> I0717 04:39:59.303022 5031 master.cpp:122] No whitelist given. Advertising
> offers for all slaves
> I0717 04:39:59.303390 5033 leveldb.cpp:306] Persisting metadata (8 bytes) to
> leveldb took 5.325097ms
> I0717 04:39:59.303419 5033 replica.cpp:320] Persisted replica status to
> STARTING
> I0717 04:39:59.304076 5030 master.cpp:1128] The newly elected leader is
> [email protected]:40280 with id 20140717-043959-16842879-40280-5009
> I0717 04:39:59.304095 5030 master.cpp:1141] Elected as the leading master!
> I0717 04:39:59.304102 5030 master.cpp:959] Recovering from registrar
> I0717 04:39:59.304182 5030 registrar.cpp:313] Recovering registrar
> I0717 04:39:59.304635 5033 recover.cpp:451] Replica is in STARTING status
> I0717 04:39:59.304962 5033 replica.cpp:638] Replica in STARTING status
> received a broadcasted recover request
> I0717 04:39:59.305026 5033 recover.cpp:188] Received a recover response from
> a replica in STARTING status
> I0717 04:39:59.305130 5033 recover.cpp:542] Updating replica status to VOTING
> I0717 04:39:59.310416 5033 leveldb.cpp:306] Persisting metadata (8 bytes) to
> leveldb took 5.204157ms
> I0717 04:39:59.310459 5033 replica.cpp:320] Persisted replica status to
> VOTING
> I0717 04:39:59.310534 5033 recover.cpp:556] Successfully joined the Paxos
> group
> I0717 04:39:59.310607 5033 recover.cpp:440] Recover process terminated
> I0717 04:39:59.310773 5033 log.cpp:656] Attempting to start the writer
> I0717 04:39:59.311157 5033 replica.cpp:474] Replica received implicit
> promise request with proposal 1
> I0717 04:39:59.313451 5033 leveldb.cpp:306] Persisting metadata (8 bytes) to
> leveldb took 2.271822ms
> I0717 04:39:59.313627 5033 replica.cpp:342] Persisted promised to 1
> I0717 04:39:59.318038 5031 coordinator.cpp:230] Coordinator attemping to
> fill missing position
> I0717 04:39:59.318430 5031 replica.cpp:375] Replica received explicit
> promise request for position 0 with proposal 2
> I0717 04:39:59.323459 5031 leveldb.cpp:343] Persisting action (8 bytes) to
> leveldb took 5.004323ms
> I0717 04:39:59.323493 5031 replica.cpp:676] Persisted action at 0
> I0717 04:39:59.323799 5031 replica.cpp:508] Replica received write request
> for position 0
> I0717 04:39:59.323837 5031 leveldb.cpp:438] Reading position from leveldb
> took 21901ns
> I0717 04:39:59.329038 5031 leveldb.cpp:343] Persisting action (14 bytes) to
> leveldb took 5.175998ms
> I0717 04:39:59.329244 5031 replica.cpp:676] Persisted action at 0
> I0717 04:39:59.332335 5037 replica.cpp:655] Replica received learned notice
> for position 0
> I0717 04:39:59.337894 5037 leveldb.cpp:343] Persisting action (16 bytes) to
> leveldb took 5.532845ms
> I0717 04:39:59.337926 5037 replica.cpp:676] Persisted action at 0
> I0717 04:39:59.337934 5037 replica.cpp:661] Replica learned NOP action at
> position 0
> I0717 04:39:59.338152 5037 log.cpp:672] Writer started with ending position 0
> I0717 04:39:59.338438 5037 leveldb.cpp:438] Reading position from leveldb
> took 15499ns
> I0717 04:39:59.339648 5037 registrar.cpp:346] Successfully fetched the
> registry (0B)
> I0717 04:39:59.339676 5037 registrar.cpp:422] Attempting to update the
> 'registry'
> I0717 04:39:59.340885 5037 log.cpp:680] Attempting to append 117 bytes to
> the log
> I0717 04:39:59.340929 5037 coordinator.cpp:340] Coordinator attempting to
> write APPEND action at position 1
> I0717 04:39:59.341130 5037 replica.cpp:508] Replica received write request
> for position 1
> I0717 04:39:59.343456 5037 leveldb.cpp:343] Persisting action (134 bytes) to
> leveldb took 2.30603ms
> I0717 04:39:59.343638 5037 replica.cpp:676] Persisted action at 1
> I0717 04:39:59.352447 5034 replica.cpp:655] Replica received learned notice
> for position 1
> I0717 04:39:59.365022 5034 leveldb.cpp:343] Persisting action (136 bytes) to
> leveldb took 12.534333ms
> I0717 04:39:59.365072 5034 replica.cpp:676] Persisted action at 1
> I0717 04:39:59.365082 5034 replica.cpp:661] Replica learned APPEND action at
> position 1
> I0717 04:39:59.365470 5034 registrar.cpp:479] Successfully updated 'registry'
> I0717 04:39:59.365520 5034 registrar.cpp:372] Successfully recovered
> registrar
> I0717 04:39:59.365578 5034 log.cpp:699] Attempting to truncate the log to 1
> I0717 04:39:59.365656 5034 master.cpp:986] Recovered 0 slaves from the
> Registry (81B) ; allowing 10mins for slaves to re-register
> I0717 04:39:59.365696 5034 coordinator.cpp:340] Coordinator attempting to
> write TRUNCATE action at position 2
> I0717 04:39:59.366221 5030 replica.cpp:508] Replica received write request
> for position 2
> I0717 04:39:59.374423 5030 leveldb.cpp:343] Persisting action (16 bytes) to
> leveldb took 8.172997ms
> I0717 04:39:59.374459 5030 replica.cpp:676] Persisted action at 2
> I0717 04:39:59.374671 5030 replica.cpp:655] Replica received learned notice
> for position 2
> I0717 04:39:59.383867 5030 leveldb.cpp:343] Persisting action (18 bytes) to
> leveldb took 9.168201ms
> I0717 04:39:59.383939 5030 leveldb.cpp:401] Deleting ~1 keys from leveldb
> took 33989ns
> I0717 04:39:59.383952 5030 replica.cpp:676] Persisted action at 2
> I0717 04:39:59.383960 5030 replica.cpp:661] Replica learned TRUNCATE action
> at position 2
> I0717 04:39:59.387097 5009 containerizer.cpp:124] Using isolation:
> posix/cpu,posix/mem
> I0717 04:39:59.391551 5009 sched.cpp:139] Version: 0.20.0
> I0717 04:39:59.392025 5036 slave.cpp:168] Slave started on
> 116)@127.0.1.1:40280
> I0717 04:39:59.392042 5036 credentials.hpp:84] Loading credential for
> authentication from
> '/tmp/HealthCheckTest_ConsecutiveFailures_sOluTu/credential'
> I0717 04:39:59.392148 5036 slave.cpp:266] Slave using credential for:
> test-principal
> I0717 04:39:59.392262 5036 slave.cpp:279] Slave resources: cpus(*):2;
> mem(*):1024; disk(*):1024; ports(*):[31000-32000]
> I0717 04:39:59.392325 5036 slave.cpp:324] Slave hostname: lucid
> I0717 04:39:59.392336 5036 slave.cpp:325] Slave checkpoint: false
> I0717 04:39:59.392853 5036 state.cpp:33] Recovering state from
> '/tmp/HealthCheckTest_ConsecutiveFailures_sOluTu/meta'
> I0717 04:39:59.393005 5036 sched.cpp:235] New master detected at
> [email protected]:40280
> I0717 04:39:59.393023 5036 sched.cpp:285] Authenticating with master
> [email protected]:40280
> I0717 04:39:59.393102 5036 status_update_manager.cpp:193] Recovering status
> update manager
> I0717 04:39:59.393179 5036 authenticatee.hpp:128] Creating new client SASL
> connection
> I0717 04:39:59.393326 5036 master.cpp:3512] Authenticating
> [email protected]:40280
> I0717 04:39:59.393432 5036 containerizer.cpp:287] Recovering containerizer
> I0717 04:39:59.393569 5036 authenticator.hpp:156] Creating new server SASL
> connection
> I0717 04:39:59.393748 5036 authenticatee.hpp:219] Received SASL
> authentication mechanisms: CRAM-MD5
> I0717 04:39:59.393764 5036 authenticatee.hpp:245] Attempting to authenticate
> with mechanism 'CRAM-MD5'
> I0717 04:39:59.393842 5036 authenticator.hpp:262] Received SASL
> authentication start
> I0717 04:39:59.393888 5036 authenticator.hpp:384] Authentication requires
> more steps
> I0717 04:39:59.393934 5036 slave.cpp:3127] Finished recovery
> I0717 04:39:59.394132 5036 authenticatee.hpp:265] Received SASL
> authentication step
> I0717 04:39:59.394208 5036 authenticator.hpp:290] Received SASL
> authentication step
> I0717 04:39:59.394234 5036 auxprop.cpp:81] Request to lookup properties for
> user: 'test-principal' realm: 'lucid' server FQDN: 'lucid'
> SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: false
> I0717 04:39:59.394243 5036 auxprop.cpp:153] Looking up auxiliary property
> '*userPassword'
> I0717 04:39:59.394255 5036 auxprop.cpp:153] Looking up auxiliary property
> '*cmusaslsecretCRAM-MD5'
> I0717 04:39:59.394268 5036 auxprop.cpp:81] Request to lookup properties for
> user: 'test-principal' realm: 'lucid' server FQDN: 'lucid'
> SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: true
> I0717 04:39:59.394274 5036 auxprop.cpp:103] Skipping auxiliary property
> '*userPassword' since SASL_AUXPROP_AUTHZID == true
> I0717 04:39:59.394279 5036 auxprop.cpp:103] Skipping auxiliary property
> '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true
> I0717 04:39:59.394290 5036 authenticator.hpp:376] Authentication success
> I0717 04:39:59.394330 5036 slave.cpp:600] New master detected at
> [email protected]:40280
> I0717 04:39:59.394348 5036 slave.cpp:676] Authenticating with master
> [email protected]:40280
> I0717 04:39:59.394392 5036 slave.cpp:649] Detecting new master
> I0717 04:39:59.394434 5036 authenticatee.hpp:305] Authentication success
> I0717 04:39:59.394461 5036 master.cpp:3552] Successfully authenticated
> principal 'test-principal' at
> [email protected]:40280
> I0717 04:39:59.394516 5036 status_update_manager.cpp:167] New master
> detected at [email protected]:40280
> I0717 04:39:59.394542 5036 authenticatee.hpp:128] Creating new client SASL
> connection
> I0717 04:39:59.394605 5036 sched.cpp:359] Successfully authenticated with
> master [email protected]:40280
> I0717 04:39:59.394618 5036 sched.cpp:478] Sending registration request to
> [email protected]:40280
> I0717 04:39:59.394666 5036 master.cpp:3512] Authenticating
> slave(116)@127.0.1.1:40280
> I0717 04:39:59.394728 5036 master.cpp:1247] Received registration request
> from [email protected]:40280
> I0717 04:39:59.394757 5036 master.cpp:1207] Authorizing framework principal
> 'test-principal' to receive offers for role '*'
> I0717 04:39:59.394816 5036 authenticator.hpp:156] Creating new server SASL
> connection
> I0717 04:39:59.394917 5036 authenticatee.hpp:219] Received SASL
> authentication mechanisms: CRAM-MD5
> I0717 04:39:59.394932 5036 authenticatee.hpp:245] Attempting to authenticate
> with mechanism 'CRAM-MD5'
> I0717 04:39:59.394989 5036 master.cpp:1306] Registering framework
> 20140717-043959-16842879-40280-5009-0000 at
> [email protected]:40280
> I0717 04:39:59.395094 5036 authenticator.hpp:262] Received SASL
> authentication start
> I0717 04:39:59.395126 5036 authenticator.hpp:384] Authentication requires
> more steps
> I0717 04:39:59.395166 5036 sched.cpp:409] Framework registered with
> 20140717-043959-16842879-40280-5009-0000
> I0717 04:39:59.395187 5036 sched.cpp:423] Scheduler::registered took 9209ns
> I0717 04:39:59.395221 5036 hierarchical_allocator_process.hpp:331] Added
> framework 20140717-043959-16842879-40280-5009-0000
> I0717 04:39:59.395228 5036 hierarchical_allocator_process.hpp:724] No
> resources available to allocate!
> I0717 04:39:59.395234 5036 hierarchical_allocator_process.hpp:686] Performed
> allocation for 0 slaves in 5941ns
> I0717 04:39:59.395278 5036 authenticatee.hpp:265] Received SASL
> authentication step
> I0717 04:39:59.395310 5036 authenticator.hpp:290] Received SASL
> authentication step
> I0717 04:39:59.395323 5036 auxprop.cpp:81] Request to lookup properties for
> user: 'test-principal' realm: 'lucid' server FQDN: 'lucid'
> SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: false
> I0717 04:39:59.395329 5036 auxprop.cpp:153] Looking up auxiliary property
> '*userPassword'
> I0717 04:39:59.395336 5036 auxprop.cpp:153] Looking up auxiliary property
> '*cmusaslsecretCRAM-MD5'
> I0717 04:39:59.395344 5036 auxprop.cpp:81] Request to lookup properties for
> user: 'test-principal' realm: 'lucid' server FQDN: 'lucid'
> SASL_AUXPROP_OVERRIDE: false SASL_AUXPROP_AUTHZID: true
> I0717 04:39:59.395349 5036 auxprop.cpp:103] Skipping auxiliary property
> '*userPassword' since SASL_AUXPROP_AUTHZID == true
> I0717 04:39:59.395354 5036 auxprop.cpp:103] Skipping auxiliary property
> '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true
> I0717 04:39:59.395362 5036 authenticator.hpp:376] Authentication success
> I0717 04:39:59.395385 5036 authenticatee.hpp:305] Authentication success
> I0717 04:39:59.395402 5036 master.cpp:3552] Successfully authenticated
> principal 'test-principal' at slave(116)@127.0.1.1:40280
> I0717 04:39:59.395457 5036 slave.cpp:733] Successfully authenticated with
> master [email protected]:40280
> I0717 04:39:59.395503 5036 slave.cpp:971] Will retry registration in
> 15.135662ms if necessary
> I0717 04:39:59.395601 5037 master.cpp:2789] Registering slave at
> slave(116)@127.0.1.1:40280 (lucid) with id
> 20140717-043959-16842879-40280-5009-0
> I0717 04:39:59.395699 5037 registrar.cpp:422] Attempting to update the
> 'registry'
> I0717 04:39:59.396891 5031 log.cpp:680] Attempting to append 287 bytes to
> the log
> I0717 04:39:59.396939 5031 coordinator.cpp:340] Coordinator attempting to
> write APPEND action at position 3
> I0717 04:39:59.397140 5031 replica.cpp:508] Replica received write request
> for position 3
> I0717 04:39:59.405050 5031 leveldb.cpp:343] Persisting action (306 bytes) to
> leveldb took 7.878137ms
> I0717 04:39:59.405097 5031 replica.cpp:676] Persisted action at 3
> I0717 04:39:59.405342 5031 replica.cpp:655] Replica received learned notice
> for position 3
> I0717 04:39:59.412286 5037 slave.cpp:971] Will retry registration in
> 9.096666ms if necessary
> I0717 04:39:59.412365 5037 master.cpp:2777] Ignoring register slave message
> from slave(116)@127.0.1.1:40280 (lucid) as admission is already in progress
> I0717 04:39:59.414619 5031 leveldb.cpp:343] Persisting action (308 bytes) to
> leveldb took 9.256058ms
> I0717 04:39:59.414646 5031 replica.cpp:676] Persisted action at 3
> I0717 04:39:59.414654 5031 replica.cpp:661] Replica learned APPEND action at
> position 3
> I0717 04:39:59.415129 5031 registrar.cpp:479] Successfully updated 'registry'
> I0717 04:39:59.415212 5031 log.cpp:699] Attempting to truncate the log to 3
> I0717 04:39:59.415269 5031 master.cpp:2829] Registered slave
> 20140717-043959-16842879-40280-5009-0 at slave(116)@127.0.1.1:40280 (lucid)
> I0717 04:39:59.415293 5031 master.cpp:3980] Adding slave
> 20140717-043959-16842879-40280-5009-0 at slave(116)@127.0.1.1:40280 (lucid)
> with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]
> I0717 04:39:59.415415 5031 coordinator.cpp:340] Coordinator attempting to
> write TRUNCATE action at position 4
> I0717 04:39:59.415485 5031 slave.cpp:767] Registered with master
> [email protected]:40280; given slave ID 20140717-043959-16842879-40280-5009-0
> I0717 04:39:59.415550 5031 hierarchical_allocator_process.hpp:444] Added
> slave 20140717-043959-16842879-40280-5009-0 (lucid) with cpus(*):2;
> mem(*):1024; disk(*):1024; ports(*):[31000-32000] (and cpus(*):2;
> mem(*):1024; disk(*):1024; ports(*):[31000-32000] available)
> I0717 04:39:59.415602 5031 hierarchical_allocator_process.hpp:750] Offering
> cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave
> 20140717-043959-16842879-40280-5009-0 to framework
> 20140717-043959-16842879-40280-5009-0000
> I0717 04:39:59.415699 5031 hierarchical_allocator_process.hpp:706] Performed
> allocation for slave 20140717-043959-16842879-40280-5009-0 in 119984ns
> I0717 04:39:59.415757 5031 slave.cpp:2324] Received ping from
> slave-observer(104)@127.0.1.1:40280
> I0717 04:39:59.415824 5031 master.hpp:800] Adding offer
> 20140717-043959-16842879-40280-5009-0 with resources cpus(*):2; mem(*):1024;
> disk(*):1024; ports(*):[31000-32000] on slave
> 20140717-043959-16842879-40280-5009-0 (lucid)
> I0717 04:39:59.415864 5031 master.cpp:3459] Sending 1 offers to framework
> 20140717-043959-16842879-40280-5009-0000
> I0717 04:39:59.416179 5033 replica.cpp:508] Replica received write request
> for position 4
> I0717 04:39:59.416697 5031 sched.cpp:546] Scheduler::resourceOffers took
> 34644ns
> I0717 04:39:59.417361 5037 master.hpp:810] Removing offer
> 20140717-043959-16842879-40280-5009-0 with resources cpus(*):2; mem(*):1024;
> disk(*):1024; ports(*):[31000-32000] on slave
> 20140717-043959-16842879-40280-5009-0 (lucid)
> I0717 04:39:59.417414 5037 master.cpp:2133] Processing reply for offers: [
> 20140717-043959-16842879-40280-5009-0 ] on slave
> 20140717-043959-16842879-40280-5009-0 at slave(116)@127.0.1.1:40280 (lucid)
> for framework 20140717-043959-16842879-40280-5009-0000
> I0717 04:39:59.417453 5037 master.cpp:2219] Authorizing framework principal
> 'test-principal' to launch task 1 as user 'jenkins'
> I0717 04:39:59.417733 5037 master.hpp:772] Adding task 1 with resources
> cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave
> 20140717-043959-16842879-40280-5009-0 (lucid)
> I0717 04:39:59.417768 5037 master.cpp:2285] Launching task 1 of framework
> 20140717-043959-16842879-40280-5009-0000 with resources cpus(*):2;
> mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave
> 20140717-043959-16842879-40280-5009-0 at slave(116)@127.0.1.1:40280 (lucid)
> I0717 04:39:59.417913 5037 slave.cpp:1002] Got assigned task 1 for framework
> 20140717-043959-16842879-40280-5009-0000
> I0717 04:39:59.418223 5037 slave.cpp:1112] Launching task 1 for framework
> 20140717-043959-16842879-40280-5009-0000
> I0717 04:39:59.419910 5037 slave.cpp:1222] Queuing task '1' for executor 1
> of framework '20140717-043959-16842879-40280-5009-0000
> I0717 04:39:59.420047 5037 containerizer.cpp:427] Starting container
> '31868c4e-911b-4198-88b2-2ec5a7022adb' for executor '1' of framework
> '20140717-043959-16842879-40280-5009-0000'
> I0717 04:39:59.420281 5037 slave.cpp:561] Successfully attached file
> '/tmp/HealthCheckTest_ConsecutiveFailures_sOluTu/slaves/20140717-043959-16842879-40280-5009-0/frameworks/20140717-043959-16842879-40280-5009-0000/executors/1/runs/31868c4e-911b-4198-88b2-2ec5a7022adb'
> I0717 04:39:59.424384 5033 leveldb.cpp:343] Persisting action (16 bytes) to
> leveldb took 8.154005ms
> I0717 04:39:59.424438 5033 replica.cpp:676] Persisted action at 4
> I0717 04:39:59.424990 5035 replica.cpp:655] Replica received learned notice
> for position 4
> I0717 04:39:59.425606 5037 launcher.cpp:137] Forked child with pid '5746'
> for container '31868c4e-911b-4198-88b2-2ec5a7022adb'
> I0717 04:39:59.428802 5037 containerizer.cpp:537] Fetching URIs for
> container '31868c4e-911b-4198-88b2-2ec5a7022adb' using command
> '/var/jenkins/workspace/mesos-ubuntu-10.04-gcc/src/mesos-fetcher'
> I0717 04:39:59.434170 5035 leveldb.cpp:343] Persisting action (18 bytes) to
> leveldb took 9.084582ms
> I0717 04:39:59.434267 5035 leveldb.cpp:401] Deleting ~2 keys from leveldb
> took 45795ns
> I0717 04:39:59.434283 5035 replica.cpp:676] Persisted action at 4
> I0717 04:39:59.434299 5035 replica.cpp:661] Replica learned TRUNCATE action
> at position 4
> 2014-07-17
> 04:40:00,248:5009(0x2ad17a97f700):ZOO_ERROR@handle_socket_error_msg@1697:
> Socket [127.0.0.1:46988] zk retcode=-4, errno=111(Connection refused): server
> refused to accept the client
> I0717 04:40:00.276789 5035 slave.cpp:2469] Monitoring executor '1' of
> framework '20140717-043959-16842879-40280-5009-0000' in container
> '31868c4e-911b-4198-88b2-2ec5a7022adb'
> I0717 04:40:00.312453 5035 hierarchical_allocator_process.hpp:686] Performed
> allocation for 1 slaves in 43185ns
> I0717 04:40:00.318514 5035 slave.cpp:1733] Got registration for executor '1'
> of framework 20140717-043959-16842879-40280-5009-0000
> I0717 04:40:00.319555 5035 slave.cpp:1852] Flushing queued task 1 for
> executor '1' of framework 20140717-043959-16842879-40280-5009-0000
> I0717 04:40:00.320169 5038 process.cpp:1098] Socket closed while receiving
> I0717 04:40:00.325037 5035 slave.cpp:2087] Handling status update
> TASK_RUNNING (UUID: a769a032-5aa7-41fb-9830-d6745df0e18e) for task 1 of
> framework 20140717-043959-16842879-40280-5009-0000 from
> executor(1)@127.0.1.1:59033
> I0717 04:40:00.325140 5035 status_update_manager.cpp:320] Received status
> update TASK_RUNNING (UUID: a769a032-5aa7-41fb-9830-d6745df0e18e) for task 1
> of framework 20140717-043959-16842879-40280-5009-0000
> I0717 04:40:00.325155 5035 status_update_manager.cpp:499] Creating
> StatusUpdate stream for task 1 of framework
> 20140717-043959-16842879-40280-5009-0000
> I0717 04:40:00.325295 5035 status_update_manager.cpp:373] Forwarding status
> update TASK_RUNNING (UUID: a769a032-5aa7-41fb-9830-d6745df0e18e) for task 1
> of framework 20140717-043959-16842879-40280-5009-0000 to
> [email protected]:40280
> I0717 04:40:00.325455 5035 master.cpp:3120] Status update TASK_RUNNING
> (UUID: a769a032-5aa7-41fb-9830-d6745df0e18e) for task 1 of framework
> 20140717-043959-16842879-40280-5009-0000 from slave
> 20140717-043959-16842879-40280-5009-0 at slave(116)@127.0.1.1:40280 (lucid)
> I0717 04:40:00.325495 5035 slave.cpp:2245] Status update manager
> successfully handled status update TASK_RUNNING (UUID:
> a769a032-5aa7-41fb-9830-d6745df0e18e) for task 1 of framework
> 20140717-043959-16842879-40280-5009-0000
> I0717 04:40:00.325506 5035 slave.cpp:2251] Sending acknowledgement for
> status update TASK_RUNNING (UUID: a769a032-5aa7-41fb-9830-d6745df0e18e) for
> task 1 of framework 20140717-043959-16842879-40280-5009-0000 to
> executor(1)@127.0.1.1:59033
> I0717 04:40:00.325717 5035 sched.cpp:637] Scheduler::statusUpdate took
> 31980ns
> I0717 04:40:00.325794 5035 master.cpp:2639] Forwarding status update
> acknowledgement a769a032-5aa7-41fb-9830-d6745df0e18e for task 1 of framework
> 20140717-043959-16842879-40280-5009-0000 to slave
> 20140717-043959-16842879-40280-5009-0 at slave(116)@127.0.1.1:40280 (lucid)
> I0717 04:40:00.325899 5035 status_update_manager.cpp:398] Received status
> update acknowledgement (UUID: a769a032-5aa7-41fb-9830-d6745df0e18e) for task
> 1 of framework 20140717-043959-16842879-40280-5009-0000
> I0717 04:40:00.325950 5035 slave.cpp:1673] Status update manager
> successfully handled status update acknowledgement (UUID:
> a769a032-5aa7-41fb-9830-d6745df0e18e) for task 1 of framework
> 20140717-043959-16842879-40280-5009-0000
> I0717 04:40:00.326431 5038 process.cpp:1098] Socket closed while receiving
> I0717 04:40:01.322517 5031 hierarchical_allocator_process.hpp:686] Performed
> allocation for 1 slaves in 42022ns
> I0717 04:40:01.376111 5031 slave.cpp:2087] Handling status update
> TASK_RUNNING (UUID: 8e06aa77-eb92-486e-a20f-0d67752a7754) for task 1 in
> health state unhealthy of framework 20140717-043959-16842879-40280-5009-0000
> from executor(1)@127.0.1.1:59033
> I0717 04:40:01.376238 5031 status_update_manager.cpp:320] Received status
> update TASK_RUNNING (UUID: 8e06aa77-eb92-486e-a20f-0d67752a7754) for task 1
> in health state unhealthy of framework
> 20140717-043959-16842879-40280-5009-0000
> I0717 04:40:01.376266 5031 status_update_manager.cpp:373] Forwarding status
> update TASK_RUNNING (UUID: 8e06aa77-eb92-486e-a20f-0d67752a7754) for task 1
> in health state unhealthy of framework
> 20140717-043959-16842879-40280-5009-0000 to [email protected]:40280
> I0717 04:40:01.376397 5031 master.cpp:3120] Status update TASK_RUNNING
> (UUID: 8e06aa77-eb92-486e-a20f-0d67752a7754) for task 1 in health state
> unhealthy of framework 20140717-043959-16842879-40280-5009-0000 from slave
> 20140717-043959-16842879-40280-5009-0 at slave(116)@127.0.1.1:40280 (lucid)
> I0717 04:40:01.376454 5031 slave.cpp:2245] Status update manager
> successfully handled status update TASK_RUNNING (UUID:
> 8e06aa77-eb92-486e-a20f-0d67752a7754) for task 1 in health state unhealthy of
> framework 20140717-043959-16842879-40280-5009-0000
> I0717 04:40:01.376466 5031 slave.cpp:2251] Sending acknowledgement for
> status update TASK_RUNNING (UUID: 8e06aa77-eb92-486e-a20f-0d67752a7754) for
> task 1 in health state unhealthy of framework
> 20140717-043959-16842879-40280-5009-0000 to executor(1)@127.0.1.1:59033
> I0717 04:40:01.376648 5031 sched.cpp:637] Scheduler::statusUpdate took
> 18560ns
> I0717 04:40:01.376713 5031 master.cpp:2639] Forwarding status update
> acknowledgement 8e06aa77-eb92-486e-a20f-0d67752a7754 for task 1 of framework
> 20140717-043959-16842879-40280-5009-0000 to slave
> 20140717-043959-16842879-40280-5009-0 at slave(116)@127.0.1.1:40280 (lucid)
> I0717 04:40:01.376857 5035 status_update_manager.cpp:398] Received status
> update acknowledgement (UUID: 8e06aa77-eb92-486e-a20f-0d67752a7754) for task
> 1 of framework 20140717-043959-16842879-40280-5009-0000
> I0717 04:40:01.376911 5035 slave.cpp:1673] Status update manager
> successfully handled status update acknowledgement (UUID:
> 8e06aa77-eb92-486e-a20f-0d67752a7754) for task 1 of framework
> 20140717-043959-16842879-40280-5009-0000
> I0717 04:40:01.377177 5038 process.cpp:1098] Socket closed while receiving
> 2014-07-17
> 04:40:04,062:5009(0x2ad17a97f700):ZOO_WARN@zookeeper_interest@1557: Exceeded
> deadline by 483ms
> 2014-07-17
> 04:40:04,062:5009(0x2ad17a97f700):ZOO_ERROR@handle_socket_error_msg@1697:
> Socket [127.0.0.1:46988] zk retcode=-4, errno=111(Connection refused): server
> refused to accept the client
> I0717 04:40:04.063637 5030 hierarchical_allocator_process.hpp:686] Performed
> allocation for 1 slaves in 51405ns
> I0717 04:40:04.134610 5034 slave.cpp:2087] Handling status update
> TASK_RUNNING (UUID: cc04d02c-753e-4fc8-b35a-61cc33b4d796) for task 1 in
> health state unhealthy of framework 20140717-043959-16842879-40280-5009-0000
> from executor(1)@127.0.1.1:59033
> I0717 04:40:04.134814 5034 status_update_manager.cpp:320] Received status
> update TASK_RUNNING (UUID: cc04d02c-753e-4fc8-b35a-61cc33b4d796) for task 1
> in health state unhealthy of framework
> 20140717-043959-16842879-40280-5009-0000
> I0717 04:40:04.134857 5034 status_update_manager.cpp:373] Forwarding status
> update TASK_RUNNING (UUID: cc04d02c-753e-4fc8-b35a-61cc33b4d796) for task 1
> in health state unhealthy of framework
> 20140717-043959-16842879-40280-5009-0000 to [email protected]:40280
> I0717 04:40:04.135036 5034 master.cpp:3120] Status update TASK_RUNNING
> (UUID: cc04d02c-753e-4fc8-b35a-61cc33b4d796) for task 1 in health state
> unhealthy of framework 20140717-043959-16842879-40280-5009-0000 from slave
> 20140717-043959-16842879-40280-5009-0 at slave(116)@127.0.1.1:40280 (lucid)
> I0717 04:40:04.135073 5034 slave.cpp:2245] Status update manager
> successfully handled status update TASK_RUNNING (UUID:
> cc04d02c-753e-4fc8-b35a-61cc33b4d796) for task 1 in health state unhealthy of
> framework 20140717-043959-16842879-40280-5009-0000
> I0717 04:40:04.135083 5034 slave.cpp:2251] Sending acknowledgement for
> status update TASK_RUNNING (UUID: cc04d02c-753e-4fc8-b35a-61cc33b4d796) for
> task 1 in health state unhealthy of framework
> 20140717-043959-16842879-40280-5009-0000 to executor(1)@127.0.1.1:59033
> I0717 04:40:04.135267 5034 sched.cpp:637] Scheduler::statusUpdate took
> 27539ns
> I0717 04:40:04.135339 5034 master.cpp:2639] Forwarding status update
> acknowledgement cc04d02c-753e-4fc8-b35a-61cc33b4d796 for task 1 of framework
> 20140717-043959-16842879-40280-5009-0000 to slave
> 20140717-043959-16842879-40280-5009-0 at slave(116)@127.0.1.1:40280 (lucid)
> I0717 04:40:04.135452 5034 status_update_manager.cpp:398] Received status
> update acknowledgement (UUID: cc04d02c-753e-4fc8-b35a-61cc33b4d796) for task
> 1 of framework 20140717-043959-16842879-40280-5009-0000
> I0717 04:40:04.135503 5034 slave.cpp:1673] Status update manager
> successfully handled status update acknowledgement (UUID:
> cc04d02c-753e-4fc8-b35a-61cc33b4d796) for task 1 of framework
> 20140717-043959-16842879-40280-5009-0000
> I0717 04:40:04.135925 5038 process.cpp:1098] Socket closed while receiving
> I0717 04:40:04.312355 5036 master.cpp:122] No whitelist given. Advertising
> offers for all slaves
> I0717 04:40:05.072487 5036 hierarchical_allocator_process.hpp:686] Performed
> allocation for 1 slaves in 49651ns
> I0717 04:40:05.084954 5036 slave.cpp:2087] Handling status update
> TASK_RUNNING (UUID: 5ec41670-9320-4ecb-9767-6c020d771cd9) for task 1 in
> health state unhealthy of framework 20140717-043959-16842879-40280-5009-0000
> from executor(1)@127.0.1.1:59033
> I0717 04:40:05.085073 5036 status_update_manager.cpp:320] Received status
> update TASK_RUNNING (UUID: 5ec41670-9320-4ecb-9767-6c020d771cd9) for task 1
> in health state unhealthy of framework
> 20140717-043959-16842879-40280-5009-0000
> I0717 04:40:05.085100 5036 status_update_manager.cpp:373] Forwarding status
> update TASK_RUNNING (UUID: 5ec41670-9320-4ecb-9767-6c020d771cd9) for task 1
> in health state unhealthy of framework
> 20140717-043959-16842879-40280-5009-0000 to [email protected]:40280
> I0717 04:40:05.085227 5036 master.cpp:3120] Status update TASK_RUNNING
> (UUID: 5ec41670-9320-4ecb-9767-6c020d771cd9) for task 1 in health state
> unhealthy of framework 20140717-043959-16842879-40280-5009-0000 from slave
> 20140717-043959-16842879-40280-5009-0 at slave(116)@127.0.1.1:40280 (lucid)
> I0717 04:40:05.085263 5036 slave.cpp:2245] Status update manager
> successfully handled status update TASK_RUNNING (UUID:
> 5ec41670-9320-4ecb-9767-6c020d771cd9) for task 1 in health state unhealthy of
> framework 20140717-043959-16842879-40280-5009-0000
> I0717 04:40:05.085273 5036 slave.cpp:2251] Sending acknowledgement for
> status update TASK_RUNNING (UUID: 5ec41670-9320-4ecb-9767-6c020d771cd9) for
> task 1 in health state unhealthy of framework
> 20140717-043959-16842879-40280-5009-0000 to executor(1)@127.0.1.1:59033
> I0717 04:40:05.087087 5037 sched.cpp:637] Scheduler::statusUpdate took
> 19205ns
> I0717 04:40:05.087158 5030 master.cpp:2639] Forwarding status update
> acknowledgement 5ec41670-9320-4ecb-9767-6c020d771cd9 for task 1 of framework
> 20140717-043959-16842879-40280-5009-0000 to slave
> 20140717-043959-16842879-40280-5009-0 at slave(116)@127.0.1.1:40280 (lucid)
> I0717 04:40:05.087779 5038 process.cpp:1098] Socket closed while receiving
> I0717 04:40:05.088122 5032 status_update_manager.cpp:398] Received status
> update acknowledgement (UUID: 5ec41670-9320-4ecb-9767-6c020d771cd9) for task
> 1 of framework 20140717-043959-16842879-40280-5009-0000
> I0717 04:40:05.088302 5036 slave.cpp:1673] Status update manager
> successfully handled status update acknowledgement (UUID:
> 5ec41670-9320-4ecb-9767-6c020d771cd9) for task 1 of framework
> 20140717-043959-16842879-40280-5009-0000
> I0717 04:40:06.082578 5031 hierarchical_allocator_process.hpp:686] Performed
> allocation for 1 slaves in 71517ns
> I0717 04:40:07.092687 5037 hierarchical_allocator_process.hpp:686] Performed
> allocation for 1 slaves in 34535ns
> 2014-07-17
> 04:40:07,399:5009(0x2ad17a97f700):ZOO_ERROR@handle_socket_error_msg@1697:
> Socket [127.0.0.1:46988] zk retcode=-4, errno=111(Connection refused): server
> refused to accept the client
> I0717 04:40:08.102656 5035 hierarchical_allocator_process.hpp:686] Performed
> allocation for 1 slaves in 70999ns
> I0717 04:40:09.112658 5031 hierarchical_allocator_process.hpp:686] Performed
> allocation for 1 slaves in 35887ns
> I0717 04:40:09.322443 5037 master.cpp:122] No whitelist given. Advertising
> offers for all slaves
> 2014-07-17
> 04:40:11,494:5009(0x2ad17a97f700):ZOO_WARN@zookeeper_interest@1557: Exceeded
> deadline by 764ms
> 2014-07-17
> 04:40:11,495:5009(0x2ad17a97f700):ZOO_ERROR@handle_socket_error_msg@1697:
> Socket [127.0.0.1:46988] zk retcode=-4, errno=111(Connection refused): server
> refused to accept the client
> I0717 04:40:11.499094 5037 hierarchical_allocator_process.hpp:686] Performed
> allocation for 1 slaves in 46842ns
> I0717 04:40:12.502545 5034 hierarchical_allocator_process.hpp:686] Performed
> allocation for 1 slaves in 34124ns
> I0717 04:40:13.512511 5032 hierarchical_allocator_process.hpp:686] Performed
> allocation for 1 slaves in 33974ns
> I0717 04:40:14.332535 5031 master.cpp:122] No whitelist given. Advertising
> offers for all slaves
> I0717 04:40:14.422514 5037 slave.cpp:2324] Received ping from
> slave-observer(104)@127.0.1.1:40280
> I0717 04:40:14.522536 5037 hierarchical_allocator_process.hpp:686] Performed
> allocation for 1 slaves in 46113ns
> 2014-07-17
> 04:40:14,832:5009(0x2ad17a97f700):ZOO_ERROR@handle_socket_error_msg@1697:
> Socket [127.0.0.1:46988] zk retcode=-4, errno=111(Connection refused): server
> refused to accept the client
> tests/health_check_tests.cpp:247: Failure
> Failed to wait 10secs for status4
> tests/health_check_tests.cpp:222: Failure
> Actual function call count doesn't match EXPECT_CALL(sched,
> statusUpdate(&driver, _))...
> Expected: to be called 6 times
> Actual: called 4 times - unsatisfied and active
> I0717 04:40:15.093816 5009 master.cpp:625] Master terminating
> I0717 04:40:15.093973 5009 master.hpp:790] Removing task 1 with resources
> cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave
> 20140717-043959-16842879-40280-5009-0 (lucid)
> W0717 04:40:15.094081 5009 master.cpp:4266] Removing task 1 of framework
> 20140717-043959-16842879-40280-5009-0000 and slave
> 20140717-043959-16842879-40280-5009-0 in non-terminal state TASK_RUNNING
> I0717 04:40:15.097363 5036 slave.cpp:2331] [email protected]:40280 exited
> W0717 04:40:15.097385 5036 slave.cpp:2334] Master disconnected! Waiting for
> a new master to be elected
> I0717 04:40:15.097913 5032 containerizer.cpp:903] Destroying container
> '31868c4e-911b-4198-88b2-2ec5a7022adb'
> I0717 04:40:15.146953 5038 process.cpp:1037] Socket closed while receiving
> I0717 04:40:15.532558 5035 containerizer.cpp:1019] Executor for container
> '31868c4e-911b-4198-88b2-2ec5a7022adb' has exited
> I0717 04:40:15.532965 5035 slave.cpp:2527] Executor '1' of framework
> 20140717-043959-16842879-40280-5009-0000 terminated with signal Killed
> I0717 04:40:15.534054 5035 slave.cpp:2087] Handling status update
> TASK_FAILED (UUID: 360ed575-2fac-4a2b-9a5f-794f703546bf) for task 1 of
> framework 20140717-043959-16842879-40280-5009-0000 from @0.0.0.0:0
> I0717 04:40:15.534097 5035 slave.cpp:3769] Terminating task 1
> W0717 04:40:15.534314 5035 containerizer.cpp:809] Ignoring update for
> unknown container: 31868c4e-911b-4198-88b2-2ec5a7022adb
> I0717 04:40:15.534819 5030 status_update_manager.cpp:320] Received status
> update TASK_FAILED (UUID: 360ed575-2fac-4a2b-9a5f-794f703546bf) for task 1 of
> framework 20140717-043959-16842879-40280-5009-0000
> I0717 04:40:15.534883 5030 status_update_manager.cpp:373] Forwarding status
> update TASK_FAILED (UUID: 360ed575-2fac-4a2b-9a5f-794f703546bf) for task 1 of
> framework 20140717-043959-16842879-40280-5009-0000 to [email protected]:40280
> I0717 04:40:15.535047 5030 slave.cpp:2245] Status update manager
> successfully handled status update TASK_FAILED (UUID:
> 360ed575-2fac-4a2b-9a5f-794f703546bf) for task 1 of framework
> 20140717-043959-16842879-40280-5009-0000
> I0717 04:40:15.535524 5009 slave.cpp:484] Slave terminating
> I0717 04:40:15.535552 5009 slave.cpp:1406] Asked to shut down framework
> 20140717-043959-16842879-40280-5009-0000 by @0.0.0.0:0
> I0717 04:40:15.535562 5009 slave.cpp:1431] Shutting down framework
> 20140717-043959-16842879-40280-5009-0000
> I0717 04:40:15.535588 5009 slave.cpp:2661] Cleaning up executor '1' of
> framework 20140717-043959-16842879-40280-5009-0000
> I0717 04:40:15.535840 5009 slave.cpp:2736] Cleaning up framework
> 20140717-043959-16842879-40280-5009-0000
> [ FAILED ] HealthCheckTest.ConsecutiveFailures (16270 ms)
> {code}