Mike Percy created KUDU-2433:
--------------------------------

             Summary: Potential flake in client-negotiation-failover-itest due 
to negotiation timeout
                 Key: KUDU-2433
                 URL: https://issues.apache.org/jira/browse/KUDU-2433
             Project: Kudu
          Issue Type: Bug
            Reporter: Mike Percy


There appears to be a rare flake in client-negotiation-failover-itest in 
ClientFailoverOnNegotiationTimeoutITest.Kudu1580ConnectToTServer due to a 
negotiation timeout. This appeared on one of the test dashboards:
{code:java}
I0507 20:10:39.140327 1387 raft_consensus.cc:2459] T 
7161aaf1a71d455883ef478337dfaf19 P d1bf0f72a88f4483a1906bc8246849dd [term 3 
FOLLOWER]: Leader pre-election lost for term 4. Reason: could not achieve 
majority
W0507 20:10:39.140895 296 leader_election.cc:286] T 
7161aaf1a71d455883ef478337dfaf19 P d1bf0f72a88f4483a1906bc8246849dd 
[CANDIDATE]: Term 4 pre-election: RPC error from VoteRequest() call to peer 
5014824b94b64fd49da2c1a645c078aa: Network error: Client connection negotiation 
failed: client connection to 127.0.52.194:43942: connect: Connection refused 
(error 111)
I0507 20:10:39.165484 1389 raft_consensus.cc:435] T 
a0376a5e3b9f495aa00d366b35594a27 P d1bf0f72a88f4483a1906bc8246849dd [term 3 
FOLLOWER]: Starting pre-election (detected failure of leader 
b5e887ddd8a14cf3b2c24a0aa14be025)
I0507 20:10:39.165598 1389 raft_consensus.cc:2704] T 
a0376a5e3b9f495aa00d366b35594a27 P d1bf0f72a88f4483a1906bc8246849dd: Snoozing 
failure detection for 0.853s (starting election)
I0507 20:10:39.165699 1389 raft_consensus.cc:457] T 
a0376a5e3b9f495aa00d366b35594a27 P d1bf0f72a88f4483a1906bc8246849dd [term 3 
FOLLOWER]: Starting pre-election with config: opid_index: -1 OBSOLETE_local: 
false peers { permanent_uuid: "b5e887ddd8a14cf3b2c24a0aa14be025" member_type: 
VOTER last_known_addr { host: "127.0.52.195" port: 38453 } } peers { 
permanent_uuid: "d1bf0f72a88f4483a1906bc8246849dd" member_type: VOTER 
last_known_addr { host: "127.0.52.193" port: 46057 } } peers { permanent_uuid: 
"5014824b94b64fd49da2c1a645c078aa" member_type: VOTER last_known_addr { host: 
"127.0.52.194" port: 43942 } }
I0507 20:10:39.166122 1389 leader_election.cc:230] T 
a0376a5e3b9f495aa00d366b35594a27 P d1bf0f72a88f4483a1906bc8246849dd 
[CANDIDATE]: Term 4 pre-election: Requesting pre-vote from peer 
b5e887ddd8a14cf3b2c24a0aa14be025
I0507 20:10:39.166318 1389 leader_election.cc:230] T 
a0376a5e3b9f495aa00d366b35594a27 P d1bf0f72a88f4483a1906bc8246849dd 
[CANDIDATE]: Term 4 pre-election: Requesting pre-vote from peer 
5014824b94b64fd49da2c1a645c078aa
W0507 20:10:39.166895 297 leader_election.cc:286] T 
a0376a5e3b9f495aa00d366b35594a27 P d1bf0f72a88f4483a1906bc8246849dd 
[CANDIDATE]: Term 4 pre-election: RPC error from VoteRequest() call to peer 
b5e887ddd8a14cf3b2c24a0aa14be025: Timed out: RequestConsensusVote RPC to 
127.0.52.195:38453 timed out after 0.633s (SENT)
I0507 20:10:39.166954 297 leader_election.cc:257] T 
a0376a5e3b9f495aa00d366b35594a27 P d1bf0f72a88f4483a1906bc8246849dd 
[CANDIDATE]: Term 4 pre-election: Election decided. Result: candidate lost.
W0507 20:10:39.167076 296 leader_election.cc:286] T 
a0376a5e3b9f495aa00d366b35594a27 P d1bf0f72a88f4483a1906bc8246849dd 
[CANDIDATE]: Term 4 pre-election: RPC error from VoteRequest() call to peer 
5014824b94b64fd49da2c1a645c078aa: Network error: Client connection negotiation 
failed: client connection to 127.0.52.194:43942: connect: Connection refused 
(error 111)
I0507 20:10:39.167102 1389 raft_consensus.cc:2704] T 
a0376a5e3b9f495aa00d366b35594a27 P d1bf0f72a88f4483a1906bc8246849dd: Snoozing 
failure detection for 0.685s (election complete)
I0507 20:10:39.167136 1389 raft_consensus.cc:2459] T 
a0376a5e3b9f495aa00d366b35594a27 P d1bf0f72a88f4483a1906bc8246849dd [term 3 
FOLLOWER]: Leader pre-election lost for term 4. Reason: could not achieve 
majority
W0507 20:10:39.179502 660 negotiation.cc:313] Failed RPC negotiation. Trace:
0507 20:10:36.182479 (+ 0us) reactor.cc:577] Submitting negotiation task for 
client connection to 127.0.52.195:38453
0507 20:10:36.182694 (+ 215us) negotiation.cc:98] Waiting for socket to connect
0507 20:10:36.182707 (+ 13us) client_negotiation.cc:160] Beginning negotiation
0507 20:10:36.182797 (+ 90us) client_negotiation.cc:237] Sending NEGOTIATE 
NegotiatePB request
0507 20:10:39.179166 (+2996369us) negotiation.cc:304] Negotiation complete: 
Timed out: Client connection negotiation failed: client connection to 
127.0.52.195:38453
Metrics: {"client-negotiator.queue_time_us":63}
W0507 20:10:39.179760 659 meta_cache.cc:216] Tablet 
a0376a5e3b9f495aa00d366b35594a27: Replica b5e887ddd8a14cf3b2c24a0aa14be025 
(127.0.52.195:38453) has failed: Timed out: Client connection negotiation 
failed: client connection to 127.0.52.195:38453
W0507 20:10:39.242803 297 leader_election.cc:286] T 
7161aaf1a71d455883ef478337dfaf19 P d1bf0f72a88f4483a1906bc8246849dd 
[CANDIDATE]: Term 4 pre-election: RPC error from VoteRequest() call to peer 
b5e887ddd8a14cf3b2c24a0aa14be025: Timed out: RequestConsensusVote RPC to 
127.0.52.195:38453 timed out after 0.101s (SENT)
I0507 20:10:39.242909 297 leader_election.cc:257] T 
7161aaf1a71d455883ef478337dfaf19 P d1bf0f72a88f4483a1906bc8246849dd 
[CANDIDATE]: Term 4 pre-election: Election decided. Result: candidate lost.
I0507 20:10:39.243211 1389 raft_consensus.cc:2704] T 
7161aaf1a71d455883ef478337dfaf19 P d1bf0f72a88f4483a1906bc8246849dd: Snoozing 
failure detection for 0.097s (election complete)
I0507 20:10:39.243288 1389 raft_consensus.cc:2459] T 
7161aaf1a71d455883ef478337dfaf19 P d1bf0f72a88f4483a1906bc8246849dd [term 3 
FOLLOWER]: Leader pre-election lost for term 4. Reason: could not achieve 
majority
W0507 20:10:39.263934 297 leader_election.cc:286] T 
d80f4fae4b9f4d238f7f626507290390 P d1bf0f72a88f4483a1906bc8246849dd 
[CANDIDATE]: Term 5 pre-election: RPC error from VoteRequest() call to peer 
b5e887ddd8a14cf3b2c24a0aa14be025: Timed out: RequestConsensusVote RPC to 
127.0.52.195:38453 timed out after 0.975s (SENT)
I0507 20:10:39.264031 297 leader_election.cc:257] T 
d80f4fae4b9f4d238f7f626507290390 P d1bf0f72a88f4483a1906bc8246849dd 
[CANDIDATE]: Term 5 pre-election: Election decided. Result: candidate lost.
I0507 20:10:39.264344 1389 raft_consensus.cc:2704] T 
d80f4fae4b9f4d238f7f626507290390 P d1bf0f72a88f4483a1906bc8246849dd: Snoozing 
failure detection for 0.720s (election complete)
I0507 20:10:39.264427 1389 raft_consensus.cc:2459] T 
d80f4fae4b9f4d238f7f626507290390 P d1bf0f72a88f4483a1906bc8246849dd [term 4 
FOLLOWER]: Leader pre-election lost for term 5. Reason: could not achieve 
majority
I0507 20:10:39.340955 1389 raft_consensus.cc:435] T 
7161aaf1a71d455883ef478337dfaf19 P d1bf0f72a88f4483a1906bc8246849dd [term 3 
FOLLOWER]: Starting pre-election (detected failure of leader 
b5e887ddd8a14cf3b2c24a0aa14be025)
I0507 20:10:39.341055 1389 raft_consensus.cc:2704] T 
7161aaf1a71d455883ef478337dfaf19 P d1bf0f72a88f4483a1906bc8246849dd: Snoozing 
failure detection for 0.311s (starting election)
I0507 20:10:39.341125 1389 raft_consensus.cc:457] T 
7161aaf1a71d455883ef478337dfaf19 P d1bf0f72a88f4483a1906bc8246849dd [term 3 
FOLLOWER]: Starting pre-election with config: opid_index: -1 OBSOLETE_local: 
false peers { permanent_uuid: "5014824b94b64fd49da2c1a645c078aa" member_type: 
VOTER last_known_addr { host: "127.0.52.194" port: 43942 } } peers { 
permanent_uuid: "d1bf0f72a88f4483a1906bc8246849dd" member_type: VOTER 
last_known_addr { host: "127.0.52.193" port: 46057 } } peers { permanent_uuid: 
"b5e887ddd8a14cf3b2c24a0aa14be025" member_type: VOTER last_known_addr { host: 
"127.0.52.195" port: 38453 } }
I0507 20:10:39.341584 1389 leader_election.cc:230] T 
7161aaf1a71d455883ef478337dfaf19 P d1bf0f72a88f4483a1906bc8246849dd 
[CANDIDATE]: Term 4 pre-election: Requesting pre-vote from peer 
5014824b94b64fd49da2c1a645c078aa
I0507 20:10:39.341753 1389 leader_election.cc:230] T 
7161aaf1a71d455883ef478337dfaf19 P d1bf0f72a88f4483a1906bc8246849dd 
[CANDIDATE]: Term 4 pre-election: Requesting pre-vote from peer 
b5e887ddd8a14cf3b2c24a0aa14be025
W0507 20:10:39.342707 296 leader_election.cc:286] T 
7161aaf1a71d455883ef478337dfaf19 P d1bf0f72a88f4483a1906bc8246849dd 
[CANDIDATE]: Term 4 pre-election: RPC error from VoteRequest() call to peer 
5014824b94b64fd49da2c1a645c078aa: Network error: Client connection negotiation 
failed: client connection to 127.0.52.194:43942: connect: Connection refused 
(error 111)
W0507 20:10:39.443085 659 batcher.cc:348] Timed out: Failed to write batch of 1 
ops to tablet a0376a5e3b9f495aa00d366b35594a27 after 261 attempt(s): Failed to 
write to server: (no server available): Write(tablet: 
a0376a5e3b9f495aa00d366b35594a27, num_ops: 1, num_attempts: 261) passed its 
deadline: Illegal state: Replica d1bf0f72a88f4483a1906bc8246849dd is not leader 
of this config. Role: FOLLOWER. Consensus state: current_term: 3 leader_uuid: 
"b5e887ddd8a14cf3b2c24a0aa14be025" committed_config { opid_index: -1 
OBSOLETE_local: false peers { permanent_uuid: 
"b5e887ddd8a14cf3b2c24a0aa14be025" member_type: VOTER last_known_addr { host: 
"127.0.52.195" port: 38453 } } peers { permanent_uuid: 
"d1bf0f72a88f4483a1906bc8246849dd" member_type: VOTER last_known_addr { host: 
"127.0.52.193" port: 46057 } } peers { permanent_uuid: 
"5014824b94b64fd49da2c1a645c078aa" member_type: VOTER last_known_addr { host: 
"127.0.52.194" port: 43942 } } }
/data/somelongdirectorytoavoidrpathissues/src/kudu/src/kudu/integration-tests/client-negotiation-failover-itest.cc:186:
 Failure
Failed
Bad status: IO error: Some errors occurred
...
[ FAILED ] ClientFailoverOnNegotiationTimeoutITest.Kudu1580ConnectToTServer 
(325968 ms){code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to