[
https://issues.apache.org/jira/browse/MESOS-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15244718#comment-15244718
]
haosdent commented on MESOS-1653:
---------------------------------
[~tnachen] I looked at the log [~xujyan] posted. The second {{statusUpdate}}
arrives nearly 5 seconds after {{14:46:23}}:
{code}
I0909 14:46:23.633633 944 hierarchical_allocator_process.hpp:659] Performed
allocation for 1 slaves in 61631ns
I0909 14:46:27.799932 947 hierarchical_allocator_process.hpp:659] Performed
allocation for 1 slaves in 95512ns
I0909 14:46:27.800237 947 master.cpp:120] No whitelist given. Advertising
offers for all slaves
I0909 14:46:27.800612 947 slave.cpp:2329] Received ping from
slave-observer(2)@127.0.1.1:47396
tests/health_check_tests.cpp:557: Failure
Failed to wait 10secs for statusHealth
tests/health_check_tests.cpp:539: Failure
Actual function call count doesn't match EXPECT_CALL(sched,
statusUpdate(&driver, _))...
Expected: to be called at least twice
Actual: called once - unsatisfied and active
I0909 14:46:27.815444 928 master.cpp:650] Master terminating
I0909 14:46:27.815640 928 master.hpp:851] Removing task 1 with resources
cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave
20140909-144617-16842879-47396-928-0 (lucid)
W0909 14:46:27.815795 928 master.cpp:4419] Removing task 1 of framework
20140909-144617-16842879-47396-928-0000 and slave
20140909-144617-16842879-47396-928-0 in non-terminal state TASK_RUNNING
I0909 14:46:27.823565 943 slave.cpp:2361] [email protected]:47396 exited
W0909 14:46:27.823611 943 slave.cpp:2364] Master disconnected! Waiting for a
new master to be elected
I0909 14:46:27.828475 943 slave.cpp:2093] Handling status update TASK_RUNNING
(UUID: 5f53830d-cd08-4c57-be42-33be367d3f01) for task 1 in health state
unhealthy of framework 20140909-144617-16842879-47396-928-0000 from
executor(1)@127.0.1.1:52801
{code}
I think we need to add an {{AWAIT_READY(statusRunning);}} before advancing the
clock, so the test only manipulates the clock once the initial
{{TASK_RUNNING}} update has actually been delivered. Do you think it is OK to
add this and re-enable the test case?
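A rough sketch of the change I have in mind. {{FutureArg}}, {{AWAIT_READY}}
and {{Clock}} are the existing libprocess/gmock test helpers; the surrounding
lines are paraphrased from the usual test pattern, not the exact code in
{{health_check_tests.cpp}}, and the advance duration is illustrative:
{code}
// Capture the first two status updates: the initial TASK_RUNNING and
// the subsequent health-check update.
Future<TaskStatus> statusRunning;
Future<TaskStatus> statusHealth;
EXPECT_CALL(sched, statusUpdate(&driver, _))
  .WillOnce(FutureArg<1>(&statusRunning))
  .WillOnce(FutureArg<1>(&statusHealth));

driver.launchTasks(offers.get()[0].id(), tasks);

// Wait until TASK_RUNNING has actually been delivered before touching
// the clock. Without this, the clock can be advanced while the executor
// is still starting up (as in the log above, where the update arrives
// ~5 seconds late), and the health update then misses the 10s wait.
AWAIT_READY(statusRunning);

Clock::pause();
Clock::advance(Seconds(1)); // illustrative; use the test's grace period.
Clock::resume();

AWAIT_READY(statusHealth);
{code}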
> HealthCheckTest.GracePeriod is flaky.
> -------------------------------------
>
> Key: MESOS-1653
> URL: https://issues.apache.org/jira/browse/MESOS-1653
> Project: Mesos
> Issue Type: Bug
> Components: test
> Reporter: Benjamin Mahler
> Assignee: Timothy Chen
> Labels: flaky, health-check, mesosphere
>
> {noformat}
> [----------] 3 tests from HealthCheckTest
> [ RUN ] HealthCheckTest.GracePeriod
> Using temporary directory '/tmp/HealthCheckTest_GracePeriod_d7zCPr'
> I0729 17:10:10.484951 1176 leveldb.cpp:176] Opened db in 28.883552ms
> I0729 17:10:10.499487 1176 leveldb.cpp:183] Compacted db in 13.674118ms
> I0729 17:10:10.500200 1176 leveldb.cpp:198] Created db iterator in 7394ns
> I0729 17:10:10.500692 1176 leveldb.cpp:204] Seeked to beginning of db in
> 2317ns
> I0729 17:10:10.501113 1176 leveldb.cpp:273] Iterated through 0 keys in the
> db in 1367ns
> I0729 17:10:10.501535 1176 replica.cpp:741] Replica recovered with log
> positions 0 -> 0 with 1 holes and 0 unlearned
> I0729 17:10:10.502233 1212 recover.cpp:425] Starting replica recovery
> I0729 17:10:10.502295 1212 recover.cpp:451] Replica is in EMPTY status
> I0729 17:10:10.502825 1212 replica.cpp:638] Replica in EMPTY status received
> a broadcasted recover request
> I0729 17:10:10.502877 1212 recover.cpp:188] Received a recover response from
> a replica in EMPTY status
> I0729 17:10:10.502980 1212 recover.cpp:542] Updating replica status to
> STARTING
> I0729 17:10:10.508482 1213 master.cpp:289] Master
> 20140729-171010-16842879-54701-1176 (trusty) started on 127.0.1.1:54701
> I0729 17:10:10.508607 1213 master.cpp:326] Master only allowing
> authenticated frameworks to register
> I0729 17:10:10.508632 1213 master.cpp:331] Master only allowing
> authenticated slaves to register
> I0729 17:10:10.508656 1213 credentials.hpp:36] Loading credentials for
> authentication from '/tmp/HealthCheckTest_GracePeriod_d7zCPr/credentials'
> I0729 17:10:10.509407 1213 master.cpp:360] Authorization enabled
> I0729 17:10:10.510030 1207 hierarchical_allocator_process.hpp:301]
> Initializing hierarchical allocator process with master :
> [email protected]:54701
> I0729 17:10:10.510113 1207 master.cpp:123] No whitelist given. Advertising
> offers for all slaves
> I0729 17:10:10.511699 1213 master.cpp:1129] The newly elected leader is
> [email protected]:54701 with id 20140729-171010-16842879-54701-1176
> I0729 17:10:10.512230 1213 master.cpp:1142] Elected as the leading master!
> I0729 17:10:10.512692 1213 master.cpp:960] Recovering from registrar
> I0729 17:10:10.513226 1210 registrar.cpp:313] Recovering registrar
> I0729 17:10:10.516006 1212 leveldb.cpp:306] Persisting metadata (8 bytes) to
> leveldb took 12.946461ms
> I0729 17:10:10.516047 1212 replica.cpp:320] Persisted replica status to
> STARTING
> I0729 17:10:10.516129 1212 recover.cpp:451] Replica is in STARTING status
> I0729 17:10:10.516520 1212 replica.cpp:638] Replica in STARTING status
> received a broadcasted recover request
> I0729 17:10:10.516592 1212 recover.cpp:188] Received a recover response from
> a replica in STARTING status
> I0729 17:10:10.516767 1212 recover.cpp:542] Updating replica status to VOTING
> I0729 17:10:10.528376 1212 leveldb.cpp:306] Persisting metadata (8 bytes) to
> leveldb took 11.537102ms
> I0729 17:10:10.528430 1212 replica.cpp:320] Persisted replica status to
> VOTING
> I0729 17:10:10.528501 1212 recover.cpp:556] Successfully joined the Paxos
> group
> I0729 17:10:10.528565 1212 recover.cpp:440] Recover process terminated
> I0729 17:10:10.528700 1212 log.cpp:656] Attempting to start the writer
> I0729 17:10:10.528960 1212 replica.cpp:474] Replica received implicit
> promise request with proposal 1
> I0729 17:10:10.537821 1212 leveldb.cpp:306] Persisting metadata (8 bytes) to
> leveldb took 8.830863ms
> I0729 17:10:10.537869 1212 replica.cpp:342] Persisted promised to 1
> I0729 17:10:10.540550 1209 coordinator.cpp:230] Coordinator attemping to
> fill missing position
> I0729 17:10:10.540856 1209 replica.cpp:375] Replica received explicit
> promise request for position 0 with proposal 2
> I0729 17:10:10.547430 1209 leveldb.cpp:343] Persisting action (8 bytes) to
> leveldb took 6.548344ms
> I0729 17:10:10.547471 1209 replica.cpp:676] Persisted action at 0
> I0729 17:10:10.547732 1209 replica.cpp:508] Replica received write request
> for position 0
> I0729 17:10:10.547765 1209 leveldb.cpp:438] Reading position from leveldb
> took 15676ns
> I0729 17:10:10.557169 1209 leveldb.cpp:343] Persisting action (14 bytes) to
> leveldb took 9.373798ms
> I0729 17:10:10.557241 1209 replica.cpp:676] Persisted action at 0
> I0729 17:10:10.560642 1210 replica.cpp:655] Replica received learned notice
> for position 0
> I0729 17:10:10.570312 1210 leveldb.cpp:343] Persisting action (16 bytes) to
> leveldb took 9.61503ms
> I0729 17:10:10.570380 1210 replica.cpp:676] Persisted action at 0
> I0729 17:10:10.570406 1210 replica.cpp:661] Replica learned NOP action at
> position 0
> I0729 17:10:10.570746 1210 log.cpp:672] Writer started with ending position 0
> I0729 17:10:10.571141 1210 leveldb.cpp:438] Reading position from leveldb
> took 24085ns
> I0729 17:10:10.580646 1210 registrar.cpp:346] Successfully fetched the
> registry (0B)
> I0729 17:10:10.580708 1210 registrar.cpp:422] Attempting to update the
> 'registry'
> I0729 17:10:10.582347 1209 log.cpp:680] Attempting to append 118 bytes to
> the log
> I0729 17:10:10.582406 1209 coordinator.cpp:340] Coordinator attempting to
> write APPEND action at position 1
> I0729 17:10:10.582655 1209 replica.cpp:508] Replica received write request
> for position 1
> I0729 17:10:10.591322 1209 leveldb.cpp:343] Persisting action (135 bytes) to
> leveldb took 8.628332ms
> I0729 17:10:10.591384 1209 replica.cpp:676] Persisted action at 1
> I0729 17:10:10.591640 1209 replica.cpp:655] Replica received learned notice
> for position 1
> I0729 17:10:10.601271 1209 leveldb.cpp:343] Persisting action (137 bytes) to
> leveldb took 9.600283ms
> I0729 17:10:10.601327 1209 replica.cpp:676] Persisted action at 1
> I0729 17:10:10.601341 1209 replica.cpp:661] Replica learned APPEND action at
> position 1
> I0729 17:10:10.608666 1209 registrar.cpp:479] Successfully updated 'registry'
> I0729 17:10:10.608739 1209 registrar.cpp:372] Successfully recovered
> registrar
> I0729 17:10:10.608855 1209 master.cpp:987] Recovered 0 slaves from the
> Registry (82B) ; allowing 10mins for slaves to re-register
> I0729 17:10:10.608901 1209 log.cpp:699] Attempting to truncate the log to 1
> I0729 17:10:10.608949 1209 coordinator.cpp:340] Coordinator attempting to
> write TRUNCATE action at position 2
> I0729 17:10:10.609171 1209 replica.cpp:508] Replica received write request
> for position 2
> I0729 17:10:10.618360 1209 leveldb.cpp:343] Persisting action (16 bytes) to
> leveldb took 9.157424ms
> I0729 17:10:10.618412 1209 replica.cpp:676] Persisted action at 2
> I0729 17:10:10.618633 1209 replica.cpp:655] Replica received learned notice
> for position 2
> I0729 17:10:10.627727 1209 leveldb.cpp:343] Persisting action (18 bytes) to
> leveldb took 9.062752ms
> I0729 17:10:10.628556 1209 leveldb.cpp:401] Deleting ~1 keys from leveldb
> took 58420ns
> I0729 17:10:10.628873 1209 replica.cpp:676] Persisted action at 2
> I0729 17:10:10.628903 1209 replica.cpp:661] Replica learned TRUNCATE action
> at position 2
> I0729 17:10:10.640494 1176 containerizer.cpp:124] Using isolation:
> posix/cpu,posix/mem
> I0729 17:10:10.644561 1205 slave.cpp:169] Slave started on
> 29)@127.0.1.1:54701
> I0729 17:10:10.644593 1205 credentials.hpp:84] Loading credential for
> authentication from '/tmp/HealthCheckTest_GracePeriod_f0DeJJ/credential'
> I0729 17:10:10.644702 1205 slave.cpp:267] Slave using credential for:
> test-principal
> I0729 17:10:10.644851 1205 slave.cpp:280] Slave resources: cpus(*):2;
> mem(*):1024; disk(*):1024; ports(*):[31000-32000]
> I0729 17:10:10.644940 1205 slave.cpp:325] Slave hostname: trusty
> I0729 17:10:10.644954 1205 slave.cpp:326] Slave checkpoint: false
> I0729 17:10:10.645478 1205 state.cpp:33] Recovering state from
> '/tmp/HealthCheckTest_GracePeriod_f0DeJJ/meta'
> I0729 17:10:10.645634 1205 status_update_manager.cpp:193] Recovering status
> update manager
> I0729 17:10:10.645722 1205 containerizer.cpp:287] Recovering containerizer
> I0729 17:10:10.646106 1205 slave.cpp:3128] Finished recovery
> I0729 17:10:10.646536 1205 slave.cpp:601] New master detected at
> [email protected]:54701
> I0729 17:10:10.646589 1205 slave.cpp:677] Authenticating with master
> [email protected]:54701
> I0729 17:10:10.646651 1205 slave.cpp:650] Detecting new master
> I0729 17:10:10.646694 1205 status_update_manager.cpp:167] New master
> detected at [email protected]:54701
> I0729 17:10:10.646770 1205 authenticatee.hpp:128] Creating new client SASL
> connection
> I0729 17:10:10.647045 1205 master.cpp:3504] Authenticating
> slave(29)@127.0.1.1:54701
> I0729 17:10:10.647186 1205 authenticator.hpp:156] Creating new server SASL
> connection
> I0729 17:10:10.647390 1205 authenticatee.hpp:219] Received SASL
> authentication mechanisms: CRAM-MD5
> I0729 17:10:10.647426 1205 authenticatee.hpp:245] Attempting to authenticate
> with mechanism 'CRAM-MD5'
> I0729 17:10:10.647460 1205 authenticator.hpp:262] Received SASL
> authentication start
> I0729 17:10:10.647511 1205 authenticator.hpp:384] Authentication requires
> more steps
> I0729 17:10:10.647543 1205 authenticatee.hpp:265] Received SASL
> authentication step
> I0729 17:10:10.647629 1205 authenticator.hpp:290] Received SASL
> authentication step
> I0729 17:10:10.647678 1205 auxprop.cpp:81] Request to lookup properties for
> user: 'test-principal' realm: 'trusty' server FQDN: 'trusty'
> SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false
> SASL_AUXPROP_AUTHZID: false
> I0729 17:10:10.647706 1205 auxprop.cpp:153] Looking up auxiliary property
> '*userPassword'
> I0729 17:10:10.647721 1205 auxprop.cpp:153] Looking up auxiliary property
> '*cmusaslsecretCRAM-MD5'
> I0729 17:10:10.647765 1205 auxprop.cpp:81] Request to lookup properties for
> user: 'test-principal' realm: 'trusty' server FQDN: 'trusty'
> SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false
> SASL_AUXPROP_AUTHZID: true
> I0729 17:10:10.647776 1205 auxprop.cpp:103] Skipping auxiliary property
> '*userPassword' since SASL_AUXPROP_AUTHZID == true
> I0729 17:10:10.647783 1205 auxprop.cpp:103] Skipping auxiliary property
> '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true
> I0729 17:10:10.647806 1205 authenticator.hpp:376] Authentication success
> I0729 17:10:10.647850 1205 authenticatee.hpp:305] Authentication success
> I0729 17:10:10.647874 1205 master.cpp:3544] Successfully authenticated
> principal 'test-principal' at slave(29)@127.0.1.1:54701
> I0729 17:10:10.647960 1205 slave.cpp:734] Successfully authenticated with
> master [email protected]:54701
> I0729 17:10:10.648056 1205 slave.cpp:972] Will retry registration in
> 11.218662ms if necessary
> I0729 17:10:10.648203 1210 master.cpp:2762] Registering slave at
> slave(29)@127.0.1.1:54701 (trusty) with id
> 20140729-171010-16842879-54701-1176-0
> I0729 17:10:10.648386 1204 registrar.cpp:422] Attempting to update the
> 'registry'
> I0729 17:10:10.649412 1212 log.cpp:680] Attempting to append 290 bytes to
> the log
> I0729 17:10:10.649459 1212 coordinator.cpp:340] Coordinator attempting to
> write APPEND action at position 3
> I0729 17:10:10.649659 1212 replica.cpp:508] Replica received write request
> for position 3
> I0729 17:10:10.652601 1176 sched.cpp:139] Version: 0.20.0
> I0729 17:10:10.653542 1209 sched.cpp:235] New master detected at
> [email protected]:54701
> I0729 17:10:10.653595 1209 sched.cpp:285] Authenticating with master
> [email protected]:54701
> I0729 17:10:10.653727 1209 authenticatee.hpp:128] Creating new client SASL
> connection
> I0729 17:10:10.653970 1209 master.cpp:3504] Authenticating
> [email protected]:54701
> I0729 17:10:10.654096 1209 authenticator.hpp:156] Creating new server SASL
> connection
> I0729 17:10:10.654660 1209 authenticatee.hpp:219] Received SASL
> authentication mechanisms: CRAM-MD5
> I0729 17:10:10.654700 1209 authenticatee.hpp:245] Attempting to authenticate
> with mechanism 'CRAM-MD5'
> I0729 17:10:10.654752 1209 authenticator.hpp:262] Received SASL
> authentication start
> I0729 17:10:10.655232 1209 authenticator.hpp:384] Authentication requires
> more steps
> I0729 17:10:10.655627 1213 authenticatee.hpp:265] Received SASL
> authentication step
> I0729 17:10:10.656039 1209 authenticator.hpp:290] Received SASL
> authentication step
> I0729 17:10:10.656082 1209 auxprop.cpp:81] Request to lookup properties for
> user: 'test-principal' realm: 'trusty' server FQDN: 'trusty'
> SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false
> SASL_AUXPROP_AUTHZID: false
> I0729 17:10:10.656105 1209 auxprop.cpp:153] Looking up auxiliary property
> '*userPassword'
> I0729 17:10:10.656131 1209 auxprop.cpp:153] Looking up auxiliary property
> '*cmusaslsecretCRAM-MD5'
> I0729 17:10:10.656157 1209 auxprop.cpp:81] Request to lookup properties for
> user: 'test-principal' realm: 'trusty' server FQDN: 'trusty'
> SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false
> SASL_AUXPROP_AUTHZID: true
> I0729 17:10:10.656174 1209 auxprop.cpp:103] Skipping auxiliary property
> '*userPassword' since SASL_AUXPROP_AUTHZID == true
> I0729 17:10:10.656189 1209 auxprop.cpp:103] Skipping auxiliary property
> '*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true
> I0729 17:10:10.656215 1209 authenticator.hpp:376] Authentication success
> I0729 17:10:10.656738 1204 authenticatee.hpp:305] Authentication success
> I0729 17:10:10.659857 1212 leveldb.cpp:343] Persisting action (309 bytes) to
> leveldb took 10.166506ms
> I0729 17:10:10.659914 1212 replica.cpp:676] Persisted action at 3
> I0729 17:10:10.660799 1207 slave.cpp:972] Will retry registration in
> 8.165348ms if necessary
> I0729 17:10:10.660914 1207 master.cpp:2750] Ignoring register slave message
> from slave(29)@127.0.1.1:54701 (trusty) as admission is already in progress
> I0729 17:10:10.661286 1212 replica.cpp:655] Replica received learned notice
> for position 3
> I0729 17:10:10.662286 1213 master.cpp:3544] Successfully authenticated
> principal 'test-principal' at
> [email protected]:54701
> I0729 17:10:10.662789 1204 sched.cpp:359] Successfully authenticated with
> master [email protected]:54701
> I0729 17:10:10.663375 1204 sched.cpp:478] Sending registration request to
> [email protected]:54701
> I0729 17:10:10.664089 1207 master.cpp:1248] Received registration request
> from [email protected]:54701
> I0729 17:10:10.664163 1207 master.cpp:1208] Authorizing framework principal
> 'test-principal' to receive offers for role '*'
> I0729 17:10:10.664638 1207 master.cpp:1307] Registering framework
> 20140729-171010-16842879-54701-1176-0000 at
> [email protected]:54701
> I0729 17:10:10.664856 1207 hierarchical_allocator_process.hpp:331] Added
> framework 20140729-171010-16842879-54701-1176-0000
> I0729 17:10:10.664894 1207 hierarchical_allocator_process.hpp:724] No
> resources available to allocate!
> I0729 17:10:10.664917 1207 hierarchical_allocator_process.hpp:686] Performed
> allocation for 0 slaves in 25838ns
> I0729 17:10:10.665393 1204 sched.cpp:409] Framework registered with
> 20140729-171010-16842879-54701-1176-0000
> I0729 17:10:10.666017 1204 sched.cpp:423] Scheduler::registered took 32926ns
> I0729 17:10:10.669276 1210 slave.cpp:972] Will retry registration in
> 28.005974ms if necessary
> I0729 17:10:10.669387 1210 master.cpp:2750] Ignoring register slave message
> from slave(29)@127.0.1.1:54701 (trusty) as admission is already in progress
> I0729 17:10:10.669477 1212 leveldb.cpp:343] Persisting action (311 bytes) to
> leveldb took 8.158554ms
> I0729 17:10:10.669513 1212 replica.cpp:676] Persisted action at 3
> I0729 17:10:10.669533 1212 replica.cpp:661] Replica learned APPEND action at
> position 3
> I0729 17:10:10.670101 1212 registrar.cpp:479] Successfully updated 'registry'
> I0729 17:10:10.670265 1212 log.cpp:699] Attempting to truncate the log to 3
> I0729 17:10:10.670399 1212 master.cpp:2802] Registered slave
> 20140729-171010-16842879-54701-1176-0 at slave(29)@127.0.1.1:54701 (trusty)
> I0729 17:10:10.670454 1212 master.cpp:3972] Adding slave
> 20140729-171010-16842879-54701-1176-0 at slave(29)@127.0.1.1:54701 (trusty)
> with cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]
> I0729 17:10:10.670725 1212 coordinator.cpp:340] Coordinator attempting to
> write TRUNCATE action at position 4
> I0729 17:10:10.670883 1212 slave.cpp:768] Registered with master
> [email protected]:54701; given slave ID 20140729-171010-16842879-54701-1176-0
> I0729 17:10:10.671022 1212 hierarchical_allocator_process.hpp:444] Added
> slave 20140729-171010-16842879-54701-1176-0 (trusty) with cpus(*):2;
> mem(*):1024; disk(*):1024; ports(*):[31000-32000] (and cpus(*):2;
> mem(*):1024; disk(*):1024; ports(*):[31000-32000] available)
> I0729 17:10:10.671140 1212 hierarchical_allocator_process.hpp:750] Offering
> cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave
> 20140729-171010-16842879-54701-1176-0 to framework
> 20140729-171010-16842879-54701-1176-0000
> I0729 17:10:10.671286 1212 hierarchical_allocator_process.hpp:706] Performed
> allocation for slave 20140729-171010-16842879-54701-1176-0 in 201281ns
> I0729 17:10:10.671383 1212 slave.cpp:2325] Received ping from
> slave-observer(27)@127.0.1.1:54701
> I0729 17:10:10.671535 1212 master.hpp:816] Adding offer
> 20140729-171010-16842879-54701-1176-0 with resources cpus(*):2; mem(*):1024;
> disk(*):1024; ports(*):[31000-32000] on slave
> 20140729-171010-16842879-54701-1176-0 (trusty)
> I0729 17:10:10.671619 1212 master.cpp:3451] Sending 1 offers to framework
> 20140729-171010-16842879-54701-1176-0000
> I0729 17:10:10.671906 1212 sched.cpp:546] Scheduler::resourceOffers took
> 48649ns
> I0729 17:10:10.672193 1212 replica.cpp:508] Replica received write request
> for position 4
> I0729 17:10:10.675730 1207 master.hpp:826] Removing offer
> 20140729-171010-16842879-54701-1176-0 with resources cpus(*):2; mem(*):1024;
> disk(*):1024; ports(*):[31000-32000] on slave
> 20140729-171010-16842879-54701-1176-0 (trusty)
> I0729 17:10:10.675851 1207 master.cpp:2125] Processing reply for offers: [
> 20140729-171010-16842879-54701-1176-0 ] on slave
> 20140729-171010-16842879-54701-1176-0 at slave(29)@127.0.1.1:54701 (trusty)
> for framework 20140729-171010-16842879-54701-1176-0000
> I0729 17:10:10.675937 1207 master.cpp:2211] Authorizing framework principal
> 'test-principal' to launch task 1 as user 'jenkins'
> I0729 17:10:10.676506 1207 master.hpp:788] Adding task 1 with resources
> cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave
> 20140729-171010-16842879-54701-1176-0 (trusty)
> I0729 17:10:10.676573 1207 master.cpp:2277] Launching task 1 of framework
> 20140729-171010-16842879-54701-1176-0000 with resources cpus(*):2;
> mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave
> 20140729-171010-16842879-54701-1176-0 at slave(29)@127.0.1.1:54701 (trusty)
> I0729 17:10:10.676806 1207 slave.cpp:1003] Got assigned task 1 for framework
> 20140729-171010-16842879-54701-1176-0000
> I0729 17:10:10.677433 1207 slave.cpp:1113] Launching task 1 for framework
> 20140729-171010-16842879-54701-1176-0000
> I0729 17:10:10.679307 1205 hierarchical_allocator_process.hpp:686] Performed
> allocation for 1 slaves in 34137ns
> I0729 17:10:10.679386 1205 master.cpp:123] No whitelist given. Advertising
> offers for all slaves
> I0729 17:10:10.680003 1205 hierarchical_allocator_process.hpp:686] Performed
> allocation for 1 slaves in 23267ns
> I0729 17:10:10.682334 1212 leveldb.cpp:343] Persisting action (16 bytes) to
> leveldb took 10.104651ms
> I0729 17:10:10.682368 1212 replica.cpp:676] Persisted action at 4
> I0729 17:10:10.682564 1212 replica.cpp:655] Replica received learned notice
> for position 4
> I0729 17:10:10.684147 1210 containerizer.cpp:427] Starting container
> 'a91c7b26-f710-4ef4-b0d8-a21c4e12b982' for executor '1' of framework
> '20140729-171010-16842879-54701-1176-0000'
> I0729 17:10:10.700867 1207 slave.cpp:1223] Queuing task '1' for executor 1
> of framework '20140729-171010-16842879-54701-1176-0000
> I0729 17:10:10.702924 1212 leveldb.cpp:343] Persisting action (18 bytes) to
> leveldb took 20.082347ms
> I0729 17:10:10.703104 1212 leveldb.cpp:401] Deleting ~2 keys from leveldb
> took 110401ns
> I0729 17:10:10.703147 1212 replica.cpp:676] Persisted action at 4
> I0729 17:10:10.703191 1212 replica.cpp:661] Replica learned TRUNCATE action
> at position 4
> I0729 17:10:10.704622 1207 slave.cpp:562] Successfully attached file
> '/tmp/HealthCheckTest_GracePeriod_f0DeJJ/slaves/20140729-171010-16842879-54701-1176-0/frameworks/20140729-171010-16842879-54701-1176-0000/executors/1/runs/a91c7b26-f710-4ef4-b0d8-a21c4e12b982'
> I0729 17:10:10.706490 1210 launcher.cpp:137] Forked child with pid '1418'
> for container 'a91c7b26-f710-4ef4-b0d8-a21c4e12b982'
> I0729 17:10:10.714651 1210 containerizer.cpp:537] Fetching URIs for
> container 'a91c7b26-f710-4ef4-b0d8-a21c4e12b982' using command
> '/var/jenkins/workspace/mesos-ubuntu-14.04-gcc/src/mesos-fetcher'
> I0729 17:10:11.679939 1209 hierarchical_allocator_process.hpp:686] Performed
> allocation for 1 slaves in 74216ns
> I0729 17:10:11.688122 1206 slave.cpp:2470] Monitoring executor '1' of
> framework '20140729-171010-16842879-54701-1176-0000' in container
> 'a91c7b26-f710-4ef4-b0d8-a21c4e12b982'
> I0729 17:10:13.250228 1209 hierarchical_allocator_process.hpp:686] Performed
> allocation for 1 slaves in 58601ns
> I0729 17:10:13.676744 1210 slave.cpp:1734] Got registration for executor '1'
> of framework 20140729-171010-16842879-54701-1176-0000
> I0729 17:10:13.678890 1214 process.cpp:1098] Socket closed while receiving
> I0729 17:10:13.679328 1210 slave.cpp:1853] Flushing queued task 1 for
> executor '1' of framework 20140729-171010-16842879-54701-1176-0000
> I0729 17:10:13.682858 1214 process.cpp:1098] Socket closed while receiving
> I0729 17:10:13.689992 1205 slave.cpp:2088] Handling status update
> TASK_RUNNING (UUID: c0053bfe-8014-4d12-ac27-83fb49300c05) for task 1 of
> framework 20140729-171010-16842879-54701-1176-0000 from
> executor(1)@127.0.1.1:57713
> I0729 17:10:13.690161 1205 status_update_manager.cpp:320] Received status
> update TASK_RUNNING (UUID: c0053bfe-8014-4d12-ac27-83fb49300c05) for task 1
> of framework 20140729-171010-16842879-54701-1176-0000
> I0729 17:10:13.690181 1205 status_update_manager.cpp:499] Creating
> StatusUpdate stream for task 1 of framework
> 20140729-171010-16842879-54701-1176-0000
> I0729 17:10:13.690289 1205 status_update_manager.cpp:373] Forwarding status
> update TASK_RUNNING (UUID: c0053bfe-8014-4d12-ac27-83fb49300c05) for task 1
> of framework 20140729-171010-16842879-54701-1176-0000 to
> [email protected]:54701
> I0729 17:10:13.690444 1205 master.cpp:3131] Forwarding status update
> TASK_RUNNING (UUID: c0053bfe-8014-4d12-ac27-83fb49300c05) for task 1 of
> framework 20140729-171010-16842879-54701-1176-0000
> I0729 17:10:13.690490 1205 master.cpp:3097] Status update TASK_RUNNING
> (UUID: c0053bfe-8014-4d12-ac27-83fb49300c05) for task 1 of framework
> 20140729-171010-16842879-54701-1176-0000 from slave
> 20140729-171010-16842879-54701-1176-0 at slave(29)@127.0.1.1:54701 (trusty)
> I0729 17:10:13.690525 1205 slave.cpp:2246] Status update manager
> successfully handled status update TASK_RUNNING (UUID:
> c0053bfe-8014-4d12-ac27-83fb49300c05) for task 1 of framework
> 20140729-171010-16842879-54701-1176-0000
> I0729 17:10:13.690537 1205 slave.cpp:2252] Sending acknowledgement for
> status update TASK_RUNNING (UUID: c0053bfe-8014-4d12-ac27-83fb49300c05) for
> task 1 of framework 20140729-171010-16842879-54701-1176-0000 to
> executor(1)@127.0.1.1:57713
> I0729 17:10:13.690765 1205 sched.cpp:637] Scheduler::statusUpdate took
> 51765ns
> I0729 17:10:13.690829 1205 master.cpp:2612] Forwarding status update
> acknowledgement c0053bfe-8014-4d12-ac27-83fb49300c05 for task 1 of framework
> 20140729-171010-16842879-54701-1176-0000 to slave
> 20140729-171010-16842879-54701-1176-0 at slave(29)@127.0.1.1:54701 (trusty)
> I0729 17:10:13.690901 1205 status_update_manager.cpp:398] Received status
> update acknowledgement (UUID: c0053bfe-8014-4d12-ac27-83fb49300c05) for task
> 1 of framework 20140729-171010-16842879-54701-1176-0000
> I0729 17:10:13.690966 1205 slave.cpp:1674] Status update manager
> successfully handled status update acknowledgement (UUID:
> c0053bfe-8014-4d12-ac27-83fb49300c05) for task 1 of framework
> 20140729-171010-16842879-54701-1176-0000
> I0729 17:10:13.691596 1214 process.cpp:1098] Socket closed while receiving
> I0729 17:10:15.462894 1205 hierarchical_allocator_process.hpp:686] Performed
> allocation for 1 slaves in 41972ns
> I0729 17:10:15.463147 1205 master.cpp:123] No whitelist given. Advertising
> offers for all slaves
> I0729 17:10:16.464249 1210 hierarchical_allocator_process.hpp:686] Performed
> allocation for 1 slaves in 52485ns
> I0729 17:10:17.465081 1205 hierarchical_allocator_process.hpp:686] Performed
> allocation for 1 slaves in 63487ns
> I0729 17:10:18.465644 1206 hierarchical_allocator_process.hpp:686] Performed
> allocation for 1 slaves in 67852ns
> I0729 17:10:19.466233 1205 hierarchical_allocator_process.hpp:686] Performed
> allocation for 1 slaves in 56573ns
> I0729 17:10:19.676609 1206 slave.cpp:2325] Received ping from
> slave-observer(27)@127.0.1.1:54701
> I0729 17:10:20.463575 1206 master.cpp:123] No whitelist given. Advertising
> offers for all slaves
> I0729 17:10:20.467237 1206 hierarchical_allocator_process.hpp:686] Performed
> allocation for 1 slaves in 64896ns
> tests/health_check_tests.cpp:384: Failure
> Failed to wait 10secs for statusHealth
> tests/health_check_tests.cpp:366: Failure
> Actual function call count doesn't match EXPECT_CALL(sched,
> statusUpdate(&driver, _))...
> Expected: to be called at least twice
> Actual: called once - unsatisfied and active
> I0729 17:10:20.818284 1207 master.cpp:717] Framework
> 20140729-171010-16842879-54701-1176-0000 disconnected
> I0729 17:10:20.818326 1207 master.cpp:1584] Deactivating framework
> 20140729-171010-16842879-54701-1176-0000
> I0729 17:10:20.818362 1207 master.cpp:739] Giving framework
> 20140729-171010-16842879-54701-1176-0000 0ns to failover
> I0729 17:10:20.818475 1207 hierarchical_allocator_process.hpp:407]
> Deactivated framework 20140729-171010-16842879-54701-1176-0000
> I0729 17:10:20.818897 1207 master.cpp:3367] Framework failover timeout,
> removing framework 20140729-171010-16842879-54701-1176-0000
> I0729 17:10:20.818914 1207 master.cpp:3826] Removing framework
> 20140729-171010-16842879-54701-1176-0000
> I0729 17:10:20.819068 1207 master.hpp:806] Removing task 1 with resources
> cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] on slave
> 20140729-171010-16842879-54701-1176-0 (trusty)
> W0729 17:10:20.819154 1207 master.cpp:4261] Removing task 1 of framework
> 20140729-171010-16842879-54701-1176-0000 and slave
> 20140729-171010-16842879-54701-1176-0 in non-terminal state TASK_RUNNING
> I0729 17:10:20.819250 1207 slave.cpp:1407] Asked to shut down framework
> 20140729-171010-16842879-54701-1176-0000 by [email protected]:54701
> I0729 17:10:20.819282 1207 slave.cpp:1432] Shutting down framework
> 20140729-171010-16842879-54701-1176-0000
> I0729 17:10:20.819300 1207 slave.cpp:2808] Shutting down executor '1' of
> framework 20140729-171010-16842879-54701-1176-0000
> I0729 17:10:20.819830 1204 hierarchical_allocator_process.hpp:635] Recovered
> cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000] (total
> allocatable: cpus(*):2; mem(*):1024; disk(*):1024; ports(*):[31000-32000]) on
> slave 20140729-171010-16842879-54701-1176-0 from framework
> 20140729-171010-16842879-54701-1176-0000
> I0729 17:10:20.819888 1204 hierarchical_allocator_process.hpp:362] Removed
> framework 20140729-171010-16842879-54701-1176-0000
> I0729 17:10:20.824790 1214 process.cpp:1098] Socket closed while receiving
> I0729 17:10:20.826246 1209 master.cpp:626] Master terminating
> I0729 17:10:20.833003 1205 slave.cpp:2332] [email protected]:54701 exited
> W0729 17:10:20.833041 1205 slave.cpp:2335] Master disconnected! Waiting for
> a new master to be elected
> I0729 17:10:20.840898 1207 containerizer.cpp:903] Destroying container
> 'a91c7b26-f710-4ef4-b0d8-a21c4e12b982'
> I0729 17:10:20.973546 1214 process.cpp:1037] Socket closed while receiving
> I0729 17:10:21.469235 1213 containerizer.cpp:1019] Executor for container
> 'a91c7b26-f710-4ef4-b0d8-a21c4e12b982' has exited
> I0729 17:10:21.470084 1213 slave.cpp:2528] Executor '1' of framework
> 20140729-171010-16842879-54701-1176-0000 terminated with signal Killed
> I0729 17:10:21.470188 1213 slave.cpp:2662] Cleaning up executor '1' of
> framework 20140729-171010-16842879-54701-1176-0000
> I0729 17:10:21.470685 1213 slave.cpp:2737] Cleaning up framework
> 20140729-171010-16842879-54701-1176-0000
> I0729 17:10:21.471007 1213 gc.cpp:56] Scheduling
> '/tmp/HealthCheckTest_GracePeriod_f0DeJJ/slaves/20140729-171010-16842879-54701-1176-0/frameworks/20140729-171010-16842879-54701-1176-0000/executors/1/runs/a91c7b26-f710-4ef4-b0d8-a21c4e12b982'
> for gc 6.99999455449185days in the future
> I0729 17:10:21.471166 1213 gc.cpp:56] Scheduling
> '/tmp/HealthCheckTest_GracePeriod_f0DeJJ/slaves/20140729-171010-16842879-54701-1176-0/frameworks/20140729-171010-16842879-54701-1176-0000/executors/1'
> for gc 6.99999455256889days in the future
> I0729 17:10:21.471251 1213 gc.cpp:56] Scheduling
> '/tmp/HealthCheckTest_GracePeriod_f0DeJJ/slaves/20140729-171010-16842879-54701-1176-0/frameworks/20140729-171010-16842879-54701-1176-0000'
> for gc 6.99999455120296days in the future
> I0729 17:10:21.471336 1213 status_update_manager.cpp:282] Closing status
> update streams for framework 20140729-171010-16842879-54701-1176-0000
> I0729 17:10:21.471395 1213 status_update_manager.cpp:530] Cleaning up status
> update stream for task 1 of framework 20140729-171010-16842879-54701-1176-0000
> I0729 17:10:21.472653 1176 slave.cpp:485] Slave terminating
> [ FAILED ] HealthCheckTest.GracePeriod (11019 ms)
> {noformat}