Vinod Kone created MESOS-3275:
---------------------------------

             Summary: 
ContentType/HttpApiTest.UpdatePidToHttpSchedulerWithoutForce is flaky
                 Key: MESOS-3275
                 URL: https://issues.apache.org/jira/browse/MESOS-3275
             Project: Mesos
          Issue Type: Bug
    Affects Versions: 0.24.0
         Environment: 
https://builds.apache.org/job/Mesos/COMPILER=gcc,CONFIGURATION=--verbose%20--enable-libevent,OS=centos:7,label_exp=docker%7C%7CHadoop/709/consoleFull
            Reporter: Vinod Kone


Observed this on ASF CI.

This looks like the HTTP scheduler received an event (heartbeat?) after the 
master disallowed it to upgrade from PID to HTTP.

{code}
[ RUN      ] ContentType/HttpApiTest.UpdatePidToHttpSchedulerWithoutForce/1
Using temporary directory 
'/tmp/ContentType_HttpApiTest_UpdatePidToHttpSchedulerWithoutForce_1_DTcsOS'
I0816 23:17:25.201861 28791 leveldb.cpp:176] Opened db in 2.031526ms
I0816 23:17:25.202529 28791 leveldb.cpp:183] Compacted db in 623251ns
I0816 23:17:25.202581 28791 leveldb.cpp:198] Created db iterator in 25756ns
I0816 23:17:25.202605 28791 leveldb.cpp:204] Seeked to beginning of db in 2310ns
I0816 23:17:25.202620 28791 leveldb.cpp:273] Iterated through 0 keys in the db 
in 478ns
I0816 23:17:25.202680 28791 replica.cpp:744] Replica recovered with log 
positions 0 -> 0 with 1 holes and 0 unlearned
I0816 23:17:25.203255 28814 recover.cpp:449] Starting replica recovery
I0816 23:17:25.203511 28814 recover.cpp:475] Replica is in EMPTY status
I0816 23:17:25.204787 28819 replica.cpp:641] Replica in EMPTY status received a 
broadcasted recover request
I0816 23:17:25.205131 28823 recover.cpp:195] Received a recover response from a 
replica in EMPTY status
I0816 23:17:25.205700 28825 recover.cpp:566] Updating replica status to STARTING
I0816 23:17:25.206635 28811 leveldb.cpp:306] Persisting metadata (8 bytes) to 
leveldb took 724632ns
I0816 23:17:25.206665 28811 replica.cpp:323] Persisted replica status to 
STARTING
I0816 23:17:25.206836 28811 recover.cpp:475] Replica is in STARTING status
I0816 23:17:25.207716 28810 replica.cpp:641] Replica in STARTING status 
received a broadcasted recover request
I0816 23:17:25.208051 28812 recover.cpp:195] Received a recover response from a 
replica in STARTING status
I0816 23:17:25.209333 28817 master.cpp:378] Master 
20150816-231725-3170963884-43501-28791 (f3a9451c4cf3) started on 
172.17.1.189:43501
I0816 23:17:25.209509 28817 master.cpp:380] Flags at startup: --acls="" 
--allocation_interval="1secs" --allocator="HierarchicalDRF" 
--authenticate="false" --authenticate_slaves="true" --authenticators="crammd5" 
--authorizers="local" 
--credentials="/tmp/ContentType_HttpApiTest_UpdatePidToHttpSchedulerWithoutForce_1_DTcsOS/credentials"
 --framework_sorter="drf" --help="false" --initialize_driver_logging="true" 
--log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" 
--max_slave_ping_timeouts="5" --quiet="false" 
--recovery_slave_removal_limit="100%" --registry="replicated_log" 
--registry_fetch_timeout="1mins" --registry_store_timeout="25secs" 
--registry_strict="true" --root_submissions="true" 
--slave_ping_timeout="15secs" --slave_reregister_timeout="10mins" 
--user_sorter="drf" --version="false" 
--webui_dir="/mesos/mesos-0.24.0/_inst/share/mesos/webui" 
--work_dir="/tmp/ContentType_HttpApiTest_UpdatePidToHttpSchedulerWithoutForce_1_DTcsOS/master"
 --zk_session_timeout="10secs"
I0816 23:17:25.209997 28817 master.cpp:427] Master allowing unauthenticated 
frameworks to register
I0816 23:17:25.210186 28817 master.cpp:430] Master only allowing authenticated 
slaves to register
I0816 23:17:25.210292 28817 credentials.hpp:37] Loading credentials for 
authentication from 
'/tmp/ContentType_HttpApiTest_UpdatePidToHttpSchedulerWithoutForce_1_DTcsOS/credentials'
I0816 23:17:25.210752 28817 master.cpp:469] Using default 'crammd5' 
authenticator
I0816 23:17:25.210754 28825 recover.cpp:566] Updating replica status to VOTING
I0816 23:17:25.211364 28817 master.cpp:506] Authorization enabled
I0816 23:17:25.211706 28813 hierarchical.hpp:346] Initialized hierarchical 
allocator process
I0816 23:17:25.211778 28811 whitelist_watcher.cpp:79] No whitelist given
I0816 23:17:25.213102 28817 master.cpp:1525] The newly elected leader is 
[email protected]:43501 with id 20150816-231725-3170963884-43501-28791
I0816 23:17:25.213229 28817 master.cpp:1538] Elected as the leading master!
I0816 23:17:25.213354 28817 master.cpp:1308] Recovering from registrar
I0816 23:17:25.213632 28825 leveldb.cpp:306] Persisting metadata (8 bytes) to 
leveldb took 1.357147ms
I0816 23:17:25.213762 28825 replica.cpp:323] Persisted replica status to VOTING
I0816 23:17:25.213724 28817 registrar.cpp:311] Recovering registrar
I0816 23:17:25.213999 28825 recover.cpp:580] Successfully joined the Paxos group
I0816 23:17:25.215277 28825 recover.cpp:464] Recover process terminated
I0816 23:17:25.215857 28813 log.cpp:661] Attempting to start the writer
I0816 23:17:25.217447 28814 replica.cpp:477] Replica received implicit promise 
request with proposal 1
I0816 23:17:25.218040 28814 leveldb.cpp:306] Persisting metadata (8 bytes) to 
leveldb took 424309ns
I0816 23:17:25.218194 28814 replica.cpp:345] Persisted promised to 1
I0816 23:17:25.218904 28814 coordinator.cpp:231] Coordinator attemping to fill 
missing position
I0816 23:17:25.220862 28819 replica.cpp:378] Replica received explicit promise 
request for position 0 with proposal 2
I0816 23:17:25.222211 28819 leveldb.cpp:343] Persisting action (8 bytes) to 
leveldb took 1.302196ms
I0816 23:17:25.222260 28819 replica.cpp:679] Persisted action at 0
I0816 23:17:25.223625 28823 replica.cpp:511] Replica received write request for 
position 0
I0816 23:17:25.223696 28823 leveldb.cpp:438] Reading position from leveldb took 
35551ns
I0816 23:17:25.224108 28823 leveldb.cpp:343] Persisting action (14 bytes) to 
leveldb took 358020ns
I0816 23:17:25.224138 28823 replica.cpp:679] Persisted action at 0
I0816 23:17:25.224994 28816 replica.cpp:658] Replica received learned notice 
for position 0
I0816 23:17:25.225673 28816 leveldb.cpp:343] Persisting action (16 bytes) to 
leveldb took 363713ns
I0816 23:17:25.225883 28816 replica.cpp:679] Persisted action at 0
I0816 23:17:25.226079 28816 replica.cpp:664] Replica learned NOP action at 
position 0
I0816 23:17:25.226887 28816 log.cpp:677] Writer started with ending position 0
I0816 23:17:25.228317 28810 leveldb.cpp:438] Reading position from leveldb took 
40494ns
I0816 23:17:25.231591 28810 registrar.cpp:344] Successfully fetched the 
registry (0B) in 17.540864ms
I0816 23:17:25.231739 28810 registrar.cpp:443] Applied 1 operations in 30654ns; 
attempting to update the 'registry'
I0816 23:17:25.234084 28818 log.cpp:685] Attempting to append 178 bytes to the 
log
I0816 23:17:25.234246 28815 coordinator.cpp:341] Coordinator attempting to 
write APPEND action at position 1
I0816 23:17:25.235080 28821 replica.cpp:511] Replica received write request for 
position 1
I0816 23:17:25.235671 28821 leveldb.cpp:343] Persisting action (197 bytes) to 
leveldb took 556874ns
I0816 23:17:25.235700 28821 replica.cpp:679] Persisted action at 1
I0816 23:17:25.236559 28821 replica.cpp:658] Replica received learned notice 
for position 1
I0816 23:17:25.236899 28821 leveldb.cpp:343] Persisting action (199 bytes) to 
leveldb took 317171ns
I0816 23:17:25.236922 28821 replica.cpp:679] Persisted action at 1
I0816 23:17:25.236938 28821 replica.cpp:664] Replica learned APPEND action at 
position 1
I0816 23:17:25.238258 28821 log.cpp:704] Attempting to truncate the log to 1
I0816 23:17:25.238286 28818 registrar.cpp:488] Successfully updated the 
'registry' in 6.47808ms
I0816 23:17:25.238657 28818 registrar.cpp:374] Successfully recovered registrar
I0816 23:17:25.238965 28821 coordinator.cpp:341] Coordinator attempting to 
write TRUNCATE action at position 2
I0816 23:17:25.239836 28823 replica.cpp:511] Replica received write request for 
position 2
I0816 23:17:25.240257 28823 leveldb.cpp:343] Persisting action (16 bytes) to 
leveldb took 384807ns
I0816 23:17:25.240303 28823 replica.cpp:679] Persisted action at 2
I0816 23:17:25.241461 28823 replica.cpp:658] Replica received learned notice 
for position 2
I0816 23:17:25.241971 28813 master.cpp:1335] Recovered 0 slaves from the 
Registry (139B) ; allowing 10mins for slaves to re-register
I0816 23:17:25.242049 28823 leveldb.cpp:343] Persisting action (18 bytes) to 
leveldb took 555462ns
I0816 23:17:25.242127 28823 leveldb.cpp:401] Deleting ~1 keys from leveldb took 
48784ns
I0816 23:17:25.242152 28823 replica.cpp:679] Persisted action at 2
I0816 23:17:25.242205 28823 replica.cpp:664] Replica learned TRUNCATE action at 
position 2
I0816 23:17:25.257192 28791 sched.cpp:164] Version: 0.24.0
I0816 23:17:25.257940 28819 sched.cpp:262] New master detected at 
[email protected]:43501
I0816 23:17:25.258066 28819 sched.cpp:318] Authenticating with master 
[email protected]:43501
I0816 23:17:25.258091 28819 sched.cpp:325] Using default CRAM-MD5 authenticatee
I0816 23:17:25.258472 28819 authenticatee.cpp:115] Creating new client SASL 
connection
I0816 23:17:25.258849 28819 master.cpp:4694] Authenticating 
[email protected]:43501
I0816 23:17:25.259053 28819 authenticator.cpp:407] Starting authentication 
session for crammd5_authenticatee(694)@172.17.1.189:43501
I0816 23:17:25.259399 28812 authenticator.cpp:92] Creating new server SASL 
connection
I0816 23:17:25.259770 28819 authenticatee.cpp:206] Received SASL authentication 
mechanisms: CRAM-MD5
I0816 23:17:25.259820 28819 authenticatee.cpp:232] Attempting to authenticate 
with mechanism 'CRAM-MD5'
I0816 23:17:25.259961 28819 authenticator.cpp:197] Received SASL authentication 
start
I0816 23:17:25.260041 28819 authenticator.cpp:319] Authentication requires more 
steps
I0816 23:17:25.260155 28819 authenticatee.cpp:252] Received SASL authentication 
step
I0816 23:17:25.260319 28812 authenticator.cpp:225] Received SASL authentication 
step
I0816 23:17:25.260354 28812 auxprop.cpp:102] Request to lookup properties for 
user: 'test-principal' realm: 'f3a9451c4cf3' server FQDN: 'f3a9451c4cf3' 
SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false 
SASL_AUXPROP_AUTHZID: false 
I0816 23:17:25.260370 28812 auxprop.cpp:174] Looking up auxiliary property 
'*userPassword'
I0816 23:17:25.260426 28812 auxprop.cpp:174] Looking up auxiliary property 
'*cmusaslsecretCRAM-MD5'
I0816 23:17:25.260464 28812 auxprop.cpp:102] Request to lookup properties for 
user: 'test-principal' realm: 'f3a9451c4cf3' server FQDN: 'f3a9451c4cf3' 
SASL_AUXPROP_VERIFY_AGAINST_HASH: false SASL_AUXPROP_OVERRIDE: false 
SASL_AUXPROP_AUTHZID: true 
I0816 23:17:25.260483 28812 auxprop.cpp:124] Skipping auxiliary property 
'*userPassword' since SASL_AUXPROP_AUTHZID == true
I0816 23:17:25.260494 28812 auxprop.cpp:124] Skipping auxiliary property 
'*cmusaslsecretCRAM-MD5' since SASL_AUXPROP_AUTHZID == true
I0816 23:17:25.260515 28812 authenticator.cpp:311] Authentication success
I0816 23:17:25.260612 28819 authenticatee.cpp:292] Authentication success
I0816 23:17:25.260768 28811 authenticator.cpp:425] Authentication session 
cleanup for crammd5_authenticatee(694)@172.17.1.189:43501
I0816 23:17:25.260694 28812 master.cpp:4724] Successfully authenticated 
principal 'test-principal' at 
[email protected]:43501
I0816 23:17:25.261353 28819 sched.cpp:407] Successfully authenticated with 
master [email protected]:43501
I0816 23:17:25.261376 28819 sched.cpp:713] Sending SUBSCRIBE call to 
[email protected]:43501
I0816 23:17:25.261497 28819 sched.cpp:746] Will retry registration in 
1.206717027secs if necessary
I0816 23:17:25.261682 28825 master.cpp:2094] Received SUBSCRIBE call for 
framework 'default' at 
[email protected]:43501
I0816 23:17:25.261885 28825 master.cpp:1564] Authorizing framework principal 
'test-principal' to receive offers for role '*'
I0816 23:17:25.262307 28825 master.cpp:2164] Subscribing framework default with 
checkpointing disabled and capabilities [  ]
I0816 23:17:25.262744 28814 hierarchical.hpp:391] Added framework 
20150816-231725-3170963884-43501-28791-0000
I0816 23:17:25.262915 28814 hierarchical.hpp:1008] No resources available to 
allocate!
I0816 23:17:25.263098 28814 hierarchical.hpp:908] Performed allocation for 0 
slaves in 192184ns
I0816 23:17:25.263025 28825 sched.cpp:640] Framework registered with 
20150816-231725-3170963884-43501-28791-0000
I0816 23:17:25.263273 28825 sched.cpp:654] Scheduler::registered took 123101ns
I0816 23:17:25.265447 28823 process.cpp:3043] Handling HTTP event for process 
'master' with path: '/master/api/v1/scheduler'
I0816 23:17:25.265732 28823 http.cpp:319] HTTP POST for 
/master/api/v1/scheduler from 172.17.1.189:41579
I0816 23:17:25.266278 28823 master.cpp:1783] Received subscription request for 
HTTP framework 'default'
I0816 23:17:25.266372 28823 master.cpp:1564] Authorizing framework principal 
'test-principal' to receive offers for role '*'
I0816 23:17:25.266746 28822 master.cpp:1875] Subscribing framework 'default' 
with checkpointing disabled and capabilities [  ]
I0816 23:17:25.266922 28822 master.cpp:1916] Updating info for framework 
20150816-231725-3170963884-43501-28791-0000
E0816 23:17:25.267024 28822 master.cpp:1931] Disallowing subscription attempt 
of framework 20150816-231725-3170963884-43501-28791-0000 (default) at 
[email protected]:43501 because it is 
already connected
I0816 23:17:25.269484 28820 hierarchical.hpp:1008] No resources available to 
allocate!
I0816 23:17:25.269604 28820 hierarchical.hpp:908] Performed allocation for 0 
slaves in 137770ns
../../src/tests/http_api_tests.cpp:598: Failure
Value of: event.isPending()
  Actual: false
Expected: true
I0816 23:17:25.285747 28823 master.cpp:1051] Framework 
20150816-231725-3170963884-43501-28791-0000 (default) at 
[email protected]:43501 disconnected
I0816 23:17:25.285939 28823 master.cpp:2370] Disconnecting framework 
20150816-231725-3170963884-43501-28791-0000 (default) at 
[email protected]:43501
I0816 23:17:25.286100 28823 master.cpp:2394] Deactivating framework 
20150816-231725-3170963884-43501-28791-0000 (default) at 
[email protected]:43501
I0816 23:17:25.286303 28814 hierarchical.hpp:474] Deactivated framework 
20150816-231725-3170963884-43501-28791-0000
I0816 23:17:25.286293 28823 master.cpp:1075] Giving framework 
20150816-231725-3170963884-43501-28791-0000 (default) at 
[email protected]:43501 2weeks to 
failover
I0816 23:17:25.286866 28812 master.cpp:860] Master terminating
I0816 23:17:25.287035 28817 hierarchical.hpp:428] Removed framework 
20150816-231725-3170963884-43501-28791-0000
[  FAILED  ] ContentType/HttpApiTest.UpdatePidToHttpSchedulerWithoutForce/1, 
where GetParam() = "application/json" (94 ms)

{code}




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to