[ 
https://issues.apache.org/jira/browse/KUDU-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17010986#comment-17010986
 ] 

Alexey Serbin commented on KUDU-1781:
-------------------------------------

[~gaojun2048], this issue doesn't contain a proper report of any isolated 
problem, so it's not possible to answer your question.  RPC queue overflows 
might occur from time to time due to spikes in workload, extra load, etc.

For troubleshooting, please reach out via the Kudu Slack channel or the Kudu 
user mailing list:
  https://kudu.apache.org/community.html
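
Before asking on those channels, it may help to condense the warnings into a 
few numbers. The following is only a minimal sketch, not a Kudu tool: the 
function name summarize() and both regular expressions are assumptions 
derived from the wording of the warnings quoted in this issue, and other Kudu 
versions may phrase these messages differently.

```python
import re
from collections import Counter

# Patterns based on the warning text quoted in this issue (an assumption,
# not an official log format specification).
TIMEOUT_RE = re.compile(r"(\w+) RPC to (\S+) timed out after ([\d.]+)s")
BACKPRESSURE_RE = re.compile(
    r"(\w+) request on \S+ from \S+ dropped due to backpressure"
)


def summarize(log_text):
    """Count RPC timeouts per (method, remote) and backpressure drops per method."""
    # The archived log is hard-wrapped and quoted with "> ", so strip the
    # quote prefix and join everything into one line before matching.
    flat = " ".join(line.lstrip("> ").strip() for line in log_text.splitlines())
    timeouts = Counter(
        (m.group(1), m.group(2)) for m in TIMEOUT_RE.finditer(flat)
    )
    drops = Counter(m.group(1) for m in BACKPRESSURE_RE.finditer(flat))
    return timeouts, drops
```

Calling summarize() on the text of a tablet server's warning log returns two 
counters: timeouts keyed by (RPC method, remote address) and backpressure 
drops keyed by RPC method. A single remote address dominating the counts 
would point at one overloaded server rather than a cluster-wide problem.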

> RPC calls continuously attempt to elect leader
> ----------------------------------------------
>
>                 Key: KUDU-1781
>                 URL: https://issues.apache.org/jira/browse/KUDU-1781
>             Project: Kudu
>          Issue Type: Bug
>          Components: rpc
>         Environment: Centos 6.4
>            Reporter: Henry Tang
>            Priority: Major
>             Fix For: n/a
>
>
> Kudu fails large inserts eventually due to RPC hang (smaller ones work fine).
> {code:title=90a20dfea5be457b8a47024785be0834.Warning}
> W1201 17:07:20.848736 34491 leader_election.cc:271] T 
> d1d484ace2704bbab6d8421060061502 P 90a20dfea5be457b8a47024785be0834 
> [CANDIDATE]: Term 4 pre-election: RPC error from VoteRequest() call to peer 
> 783ac0cf9d1c409db2f5c759a5984401: Timed out: RequestConsensusVote RPC to 
> 198.135.236.108:7050 timed out after 4.382s
> W1201 17:07:20.852082 34491 leader_election.cc:271] T 
> 4a0fb77695f7439382a1ce8d36b33e90 P 90a20dfea5be457b8a47024785be0834 
> [CANDIDATE]: Term 5 pre-election: RPC error from VoteRequest() call to peer 
> 783ac0cf9d1c409db2f5c759a5984401: Timed out: RequestConsensusVote RPC to 
> 198.135.236.108:7050 timed out after 5.310s
> W1201 17:07:20.853283 34491 leader_election.cc:271] T 
> 0c4525fd4bc0454f9bda79f83edf0d13 P 90a20dfea5be457b8a47024785be0834 
> [CANDIDATE]: Term 9 pre-election: RPC error from VoteRequest() call to peer 
> 783ac0cf9d1c409db2f5c759a5984401: Timed out: RequestConsensusVote RPC to 
> 198.135.236.108:7050 timed out after 6.091s
> W1201 17:07:20.854065 34491 leader_election.cc:271] T 
> 8c820bf3b775479287a735d6f4538726 P 90a20dfea5be457b8a47024785be0834 
> [CANDIDATE]: Term 3 pre-election: RPC error from VoteRequest() call to peer 
> 783ac0cf9d1c409db2f5c759a5984401: Timed out: RequestConsensusVote RPC to 
> 198.135.236.108:7050 timed out after 6.318s
> W1201 17:07:20.854698 34491 leader_election.cc:271] T 
> 74c24ed49c0c441ba9adde5fb055a713 P 90a20dfea5be457b8a47024785be0834 
> [CANDIDATE]: Term 3 pre-election: RPC error from VoteRequest() call to peer 
> 783ac0cf9d1c409db2f5c759a5984401: Timed out: RequestConsensusVote RPC to 
> 198.135.236.108:7050 timed out after 6.414s
> W1201 17:07:20.855315 34491 leader_election.cc:271] T 
> 1277e88877b046508b794da55e84b40c P 90a20dfea5be457b8a47024785be0834 
> [CANDIDATE]: Term 3 pre-election: RPC error from VoteRequest() call to peer 
> 783ac0cf9d1c409db2f5c759a5984401: Timed out: RequestConsensusVote RPC to 
> 198.135.236.108:7050 timed out after 6.418s
> W1201 17:07:20.857662 34494 leader_election.cc:331] T 
> a775c0a120fb42cdab5aec2148181790 P 90a20dfea5be457b8a47024785be0834 
> [CANDIDATE]: Term 3 pre-election: Vote denied by peer 
> ccf6f0332e414ef6af63fdee7f24e952 with higher term. Message: Invalid argument: 
> T a775c0a120fb42cdab5aec2148181790 P ccf6f0332e414ef6af63fdee7f24e952 [term 4 
> FOLLOWER]: Leader pre-election vote request: Denying vote to candidate 
> 90a20dfea5be457b8a47024785be0834 for term 3 because replica is either leader 
> or believes a valid leader to be alive.
> W1201 17:07:20.857676 34491 leader_election.cc:271] T 
> a775c0a120fb42cdab5aec2148181790 P 90a20dfea5be457b8a47024785be0834 
> [CANDIDATE]: Term 3 pre-election: RPC error from VoteRequest() call to peer 
> 783ac0cf9d1c409db2f5c759a5984401: Timed out: RequestConsensusVote RPC to 
> 198.135.236.108:7050 timed out after 6.902s
> W1201 17:07:20.864228 34494 leader_election.cc:331] T 
> ff94ba2330cd4f1cae67b271a681ee97 P 90a20dfea5be457b8a47024785be0834 
> [CANDIDATE]: Term 4 pre-election: Vote denied by peer 
> ccf6f0332e414ef6af63fdee7f24e952 with higher term. Message: Invalid argument: 
> T ff94ba2330cd4f1cae67b271a681ee97 P ccf6f0332e414ef6af63fdee7f24e952 [term 5 
> FOLLOWER]: Leader pre-election vote request: Denying vote to candidate 
> 90a20dfea5be457b8a47024785be0834 for term 4 because replica is either leader 
> or believes a valid leader to be alive.
> W1201 17:07:20.887080 34491 leader_election.cc:331] T 
> a775c0a120fb42cdab5aec2148181790 P 90a20dfea5be457b8a47024785be0834 
> [CANDIDATE]: Term 3 pre-election: Vote denied by peer 
> 783ac0cf9d1c409db2f5c759a5984401 with higher term. Message: Invalid argument: 
> T a775c0a120fb42cdab5aec2148181790 P 783ac0cf9d1c409db2f5c759a5984401 [term 4 
> LEADER]: Leader pre-election vote request: Denying vote to candidate 
> 90a20dfea5be457b8a47024785be0834 for term 3 because replica is either leader 
> or believes a valid leader to be alive.
> W1201 17:07:20.890411 34491 leader_election.cc:331] T 
> ff94ba2330cd4f1cae67b271a681ee97 P 90a20dfea5be457b8a47024785be0834 
> [CANDIDATE]: Term 4 pre-election: Vote denied by peer 
> 783ac0cf9d1c409db2f5c759a5984401 with higher term. Message: Invalid argument: 
> T ff94ba2330cd4f1cae67b271a681ee97 P 783ac0cf9d1c409db2f5c759a5984401 [term 5 
> LEADER]: Leader pre-election vote request: Denying vote to candidate 
> 90a20dfea5be457b8a47024785be0834 for term 4 because replica is either leader 
> or believes a valid leader to be alive.
> W1201 17:09:07.194459 34491 leader_election.cc:331] T 
> c8a0477d19b5409ea429b7e9256d9cb6 P 90a20dfea5be457b8a47024785be0834 
> [CANDIDATE]: Term 5 pre-election: Vote denied by peer 
> 93e2916028fb4868a0b860bece51b420 with higher term. Message: Invalid argument: 
> T c8a0477d19b5409ea429b7e9256d9cb6 P 93e2916028fb4868a0b860bece51b420 [term 6 
> FOLLOWER]: Leader pre-election vote request: Denying vote to candidate 
> 90a20dfea5be457b8a47024785be0834 for term 5 because replica is either leader 
> or believes a valid leader to be alive.
> W1201 17:09:07.258124 34493 leader_election.cc:331] T 
> 33ea6a60809c4384aadc140d2083ad1d P 90a20dfea5be457b8a47024785be0834 
> [CANDIDATE]: Term 2 pre-election: Vote denied by peer 
> cf8fe3aebd7f403f87f0fc83050238b1 with higher term. Message: Invalid argument: 
> T 33ea6a60809c4384aadc140d2083ad1d P cf8fe3aebd7f403f87f0fc83050238b1 [term 3 
> LEADER]: Leader pre-election vote request: Denying vote to candidate 
> 90a20dfea5be457b8a47024785be0834 for term 2 because replica is either leader 
> or believes a valid leader to be alive.
> W1201 17:09:07.347940 34491 leader_election.cc:331] T 
> c8a0477d19b5409ea429b7e9256d9cb6 P 90a20dfea5be457b8a47024785be0834 
> [CANDIDATE]: Term 5 pre-election: Vote denied by peer 
> 783ac0cf9d1c409db2f5c759a5984401 with higher term. Message: Invalid argument: 
> T c8a0477d19b5409ea429b7e9256d9cb6 P 783ac0cf9d1c409db2f5c759a5984401 [term 6 
> LEADER]: Leader pre-election vote request: Denying vote to candidate 
> 90a20dfea5be457b8a47024785be0834 for term 5 because replica is either leader 
> or believes a valid leader to be alive.
> W1201 17:09:07.412025 34491 leader_election.cc:331] T 
> 33ea6a60809c4384aadc140d2083ad1d P 90a20dfea5be457b8a47024785be0834 
> [CANDIDATE]: Term 2 pre-election: Vote denied by peer 
> 93e2916028fb4868a0b860bece51b420 with higher term. Message: Invalid argument: 
> T 33ea6a60809c4384aadc140d2083ad1d P 93e2916028fb4868a0b860bece51b420 [term 3 
> FOLLOWER]: Leader pre-election vote request: Denying vote to candidate 
> 90a20dfea5be457b8a47024785be0834 for term 2 because replica is either leader 
> or believes a valid leader to be alive.
> W1201 17:09:07.473625 34491 leader_election.cc:331] T 
> 1277e88877b046508b794da55e84b40c P 90a20dfea5be457b8a47024785be0834 
> [CANDIDATE]: Term 3 pre-election: Vote denied by peer 
> 783ac0cf9d1c409db2f5c759a5984401 with higher term. Message: Invalid argument: 
> T 1277e88877b046508b794da55e84b40c P 783ac0cf9d1c409db2f5c759a5984401 [term 4 
> LEADER]: Leader pre-election vote request: Denying vote to candidate 
> 90a20dfea5be457b8a47024785be0834 for term 3 because replica is either leader 
> or believes a valid leader to be alive.
> W1201 17:09:07.575197 34494 leader_election.cc:331] T 
> a808886d01bb4218babf6dd4cf6fd62e P 90a20dfea5be457b8a47024785be0834 
> [CANDIDATE]: Term 5 pre-election: Vote denied by peer 
> ccf6f0332e414ef6af63fdee7f24e952 with higher term. Message: Invalid argument: 
> T a808886d01bb4218babf6dd4cf6fd62e P ccf6f0332e414ef6af63fdee7f24e952 [term 6 
> LEADER]: Leader pre-election vote request: Denying vote to candidate 
> 90a20dfea5be457b8a47024785be0834 for term 5 because replica is either leader 
> or believes a valid leader to be alive.
> W1201 17:09:11.101361 34491 leader_election.cc:331] T 
> 1277e88877b046508b794da55e84b40c P 90a20dfea5be457b8a47024785be0834 
> [CANDIDATE]: Term 3 pre-election: Vote denied by peer 
> 93e2916028fb4868a0b860bece51b420 with higher term. Message: Invalid argument: 
> T 1277e88877b046508b794da55e84b40c P 93e2916028fb4868a0b860bece51b420 [term 4 
> FOLLOWER]: Leader pre-election vote request: Denying vote to candidate 
> 90a20dfea5be457b8a47024785be0834 for term 3 because replica is either leader 
> or believes a valid leader to be alive.
> W1201 17:09:11.126883 34491 leader_election.cc:331] T 
> a808886d01bb4218babf6dd4cf6fd62e P 90a20dfea5be457b8a47024785be0834 
> [CANDIDATE]: Term 5 pre-election: Vote denied by peer 
> 93e2916028fb4868a0b860bece51b420 with higher term. Message: Invalid argument: 
> T a808886d01bb4218babf6dd4cf6fd62e P 93e2916028fb4868a0b860bece51b420 [term 6 
> FOLLOWER]: Leader pre-election vote request: Denying vote to candidate 
> 90a20dfea5be457b8a47024785be0834 for term 5 because replica is either leader 
> or believes a valid leader to be alive.
> W1201 17:06:15.173542 34491 consensus_peers.cc:328] T 
> 7148720df08b4cf2ae08ca08c6a0cdc9 P 90a20dfea5be457b8a47024785be0834 -> Peer 
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't 
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet 
> 7148720df08b4cf2ae08ca08c6a0cdc9. Status: Remote error: Service unavailable: 
> UpdateConsensus request on kudu.consensus.ConsensusService from 
> 198.135.236.109:3614 dropped due to backpressure. The service queue is full; 
> it has 50 items.. Retrying in the next heartbeat period. Already tried 2 
> times.
> W1201 17:06:15.173975 34491 consensus_peers.cc:328] T 
> 3e02c6f1140042ddb45ff60a53be7d18 P 90a20dfea5be457b8a47024785be0834 -> Peer 
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't 
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet 
> 3e02c6f1140042ddb45ff60a53be7d18. Status: Remote error: Service unavailable: 
> UpdateConsensus request on kudu.consensus.ConsensusService from 
> 198.135.236.109:3614 dropped due to backpressure. The service queue is full; 
> it has 50 items.. Retrying in the next heartbeat period. Already tried 2 
> times.
> W1201 17:06:15.174345 34491 consensus_peers.cc:328] T 
> a808886d01bb4218babf6dd4cf6fd62e P 90a20dfea5be457b8a47024785be0834 -> Peer 
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't 
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet 
> a808886d01bb4218babf6dd4cf6fd62e. Status: Remote error: Service unavailable: 
> UpdateConsensus request on kudu.consensus.ConsensusService from 
> 198.135.236.109:3614 dropped due to backpressure. The service queue is full; 
> it has 50 items.. Retrying in the next heartbeat period. Already tried 2 
> times.
> W1201 17:06:19.025938 34491 consensus_peers.cc:328] T 
> 6fed775bbe6a4e70ad4f22393ed53508 P 90a20dfea5be457b8a47024785be0834 -> Peer 
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't 
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet 
> 6fed775bbe6a4e70ad4f22393ed53508. Status: Timed out: UpdateConsensus RPC to 
> 198.135.236.105:7050 timed out after 1.000s. Retrying in the next heartbeat 
> period. Already tried 1 times.
> W1201 17:06:19.031250 34491 consensus_peers.cc:328] T 
> 58cd689cc54d41e2981dbd4459340889 P 90a20dfea5be457b8a47024785be0834 -> Peer 
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't 
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet 
> 58cd689cc54d41e2981dbd4459340889. Status: Timed out: UpdateConsensus RPC to 
> 198.135.236.105:7050 timed out after 1.000s. Retrying in the next heartbeat 
> period. Already tried 1 times.
> W1201 17:06:19.040518 34491 consensus_peers.cc:328] T 
> a808886d01bb4218babf6dd4cf6fd62e P 90a20dfea5be457b8a47024785be0834 -> Peer 
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't 
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet 
> a808886d01bb4218babf6dd4cf6fd62e. Status: Timed out: UpdateConsensus RPC to 
> 198.135.236.105:7050 timed out after 1.000s. Retrying in the next heartbeat 
> period. Already tried 1 times.
> {code}
> {code:title=783ac0cf9d1c409db2f5c759a5984401.Warning}
> W1201 17:06:19.120246  8034 consensus_peers.cc:328] T 
> 8a31f19251c44158af6e6fcd20cbda05 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer 
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't 
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet 
> 8a31f19251c44158af6e6fcd20cbda05. Status: Timed out: UpdateConsensus RPC to 
> 198.135.236.105:7050 timed out after 1.000s. Retrying in the next heartbeat 
> period. Already tried 1 times.
> W1201 17:06:19.121495  8034 consensus_peers.cc:328] T 
> 2a78537dea7e4b6abfee96ea78e2c4ab P 783ac0cf9d1c409db2f5c759a5984401 -> Peer 
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't 
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet 
> 2a78537dea7e4b6abfee96ea78e2c4ab. Status: Timed out: UpdateConsensus RPC to 
> 198.135.236.105:7050 timed out after 1.000s. Retrying in the next heartbeat 
> period. Already tried 1 times.
> W1201 17:06:19.123369  8034 consensus_peers.cc:328] T 
> a9c651a58c804c5891933372d0190084 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer 
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't 
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet 
> a9c651a58c804c5891933372d0190084. Status: Timed out: UpdateConsensus RPC to 
> 198.135.236.105:7050 timed out after 1.000s. Retrying in the next heartbeat 
> period. Already tried 1 times.
> W1201 17:06:19.123680  8034 consensus_peers.cc:328] T 
> b6bb1fd948264d0bb070e43fde24311b P 783ac0cf9d1c409db2f5c759a5984401 -> Peer 
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't 
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet 
> b6bb1fd948264d0bb070e43fde24311b. Status: Timed out: UpdateConsensus RPC to 
> 198.135.236.105:7050 timed out after 1.000s. Retrying in the next heartbeat 
> period. Already tried 1 times.
> W1201 17:06:19.123839  8034 consensus_peers.cc:328] T 
> 61214477f4254b688860e501b5a08460 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer 
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't 
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet 
> 61214477f4254b688860e501b5a08460. Status: Timed out: UpdateConsensus RPC to 
> 198.135.236.105:7050 timed out after 1.000s. Retrying in the next heartbeat 
> period. Already tried 1 times.
> W1201 17:06:19.123981  8034 consensus_peers.cc:328] T 
> 90624be426244deb8304e61a3bf7de80 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer 
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't 
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet 
> 90624be426244deb8304e61a3bf7de80. Status: Timed out: UpdateConsensus RPC to 
> 198.135.236.105:7050 timed out after 1.000s. Retrying in the next heartbeat 
> period. Already tried 1 times.
> W1201 17:06:19.124114  8034 consensus_peers.cc:328] T 
> 938fe0c077e441c681453f18c70bd72a P 783ac0cf9d1c409db2f5c759a5984401 -> Peer 
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't 
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet 
> 938fe0c077e441c681453f18c70bd72a. Status: Timed out: UpdateConsensus RPC to 
> 198.135.236.105:7050 timed out after 1.000s. Retrying in the next heartbeat 
> period. Already tried 1 times.
> W1201 17:06:19.124477  8034 consensus_peers.cc:328] T 
> 629637ec8e36411d88e5fec51f778811 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer 
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't 
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet 
> 629637ec8e36411d88e5fec51f778811. Status: Timed out: UpdateConsensus RPC to 
> 198.135.236.105:7050 timed out after 1.000s. Retrying in the next heartbeat 
> period. Already tried 1 times.
> W1201 17:06:19.124670  8034 consensus_peers.cc:328] T 
> 1277e88877b046508b794da55e84b40c P 783ac0cf9d1c409db2f5c759a5984401 -> Peer 
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't 
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet 
> 1277e88877b046508b794da55e84b40c. Status: Timed out: UpdateConsensus RPC to 
> 198.135.236.105:7050 timed out after 1.000s. Retrying in the next heartbeat 
> period. Already tried 1 times.
> W1201 17:06:19.124796  8034 consensus_peers.cc:328] T 
> 98cb9a3d623041488eea38554255f224 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer 
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't 
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet 
> 98cb9a3d623041488eea38554255f224. Status: Timed out: UpdateConsensus RPC to 
> 198.135.236.105:7050 timed out after 1.000s. Retrying in the next heartbeat 
> period. Already tried 1 times.
> W1201 17:06:19.125264  8034 consensus_peers.cc:328] T 
> d692e056b366489996c45e62a9f19b98 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer 
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't 
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet 
> d692e056b366489996c45e62a9f19b98. Status: Timed out: UpdateConsensus RPC to 
> 198.135.236.105:7050 timed out after 1.000s. Retrying in the next heartbeat 
> period. Already tried 1 times.
> W1201 17:06:19.125483  8034 consensus_peers.cc:328] T 
> b34bd2ac337140e79c306720641ab624 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer 
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't 
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet 
> b34bd2ac337140e79c306720641ab624. Status: Timed out: UpdateConsensus RPC to 
> 198.135.236.105:7050 timed out after 1.000s. Retrying in the next heartbeat 
> period. Already tried 1 times.
> W1201 17:06:19.125762  8034 consensus_peers.cc:328] T 
> ef6051d64b8b465e9be70c049f3d475a P 783ac0cf9d1c409db2f5c759a5984401 -> Peer 
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't 
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet 
> ef6051d64b8b465e9be70c049f3d475a. Status: Timed out: UpdateConsensus RPC to 
> 198.135.236.105:7050 timed out after 1.000s. Retrying in the next heartbeat 
> period. Already tried 1 times.
> W1201 17:06:19.125893  8034 consensus_peers.cc:328] T 
> 64a7e27aa1a3488096999556810d929a P 783ac0cf9d1c409db2f5c759a5984401 -> Peer 
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't 
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet 
> 64a7e27aa1a3488096999556810d929a. Status: Timed out: UpdateConsensus RPC to 
> 198.135.236.105:7050 timed out after 1.000s. Retrying in the next heartbeat 
> period. Already tried 1 times.
> W1201 17:06:19.126030  8034 consensus_peers.cc:328] T 
> 0ad86cc48c3e45188956a4f019c11488 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer 
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't 
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet 
> 0ad86cc48c3e45188956a4f019c11488. Status: Timed out: UpdateConsensus RPC to 
> 198.135.236.105:7050 timed out after 1.000s. Retrying in the next heartbeat 
> period. Already tried 1 times.
> W1201 17:06:19.126160  8034 consensus_peers.cc:328] T 
> 9fe8802892bc4bd18d3bf0ce64c8a65c P 783ac0cf9d1c409db2f5c759a5984401 -> Peer 
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't 
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet 
> 9fe8802892bc4bd18d3bf0ce64c8a65c. Status: Timed out: UpdateConsensus RPC to 
> 198.135.236.105:7050 timed out after 1.000s. Retrying in the next heartbeat 
> period. Already tried 1 times.
> W1201 17:06:19.126323  8034 consensus_peers.cc:328] T 
> c8a0477d19b5409ea429b7e9256d9cb6 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer 
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't 
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet 
> c8a0477d19b5409ea429b7e9256d9cb6. Status: Timed out: UpdateConsensus RPC to 
> 198.135.236.105:7050 timed out after 1.000s. Retrying in the next heartbeat 
> period. Already tried 1 times.
> W1201 17:06:20.012154 37885 arena.cc:132] Arena 0x2fcc1980 footprint 
> (276365079 bytes) exceeded warning threshold (268435456 bytes)
>     @          0x1a1adef  kudu::ArenaBase<>::AddComponent()
>     @          0x1a1b324  kudu::ArenaBase<>::AllocateBytesFallback()
>     @           0x9544bf  kudu::tablet::MemRowSet::Insert()
>     @           0x90060f  kudu::tablet::Tablet::InsertOrUpsertUnlocked()
>     @           0x900d3e  kudu::tablet::Tablet::ApplyRowOperation()
>     @           0x900e86  kudu::tablet::Tablet::ApplyRowOperations()
>     @           0x93407f  kudu::tablet::WriteTransaction::Apply()
>     @           0x92de09  kudu::tablet::TransactionDriver::ApplyTask()
>     @          0x1a6067e  kudu::ThreadPool::DispatchThread()
>     @          0x1a5b42a  kudu::Thread::SuperviseThread()
>     @     0x7fb2dafe0a51  start_thread
>     @     0x7fb2d991c93d  clone
>     @              (nil)  (unknown)
> W1201 17:06:22.162607  8037 consensus_peers.cc:328] T 
> 679a82648d094465a2fc43b1a139bb70 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer 
> ccf6f0332e414ef6af63fdee7f24e952 (hadoop110.datablocks.net:7050): Couldn't 
> send request to peer ccf6f0332e414ef6af63fdee7f24e952 for tablet 
> 679a82648d094465a2fc43b1a139bb70. Status: Timed out: UpdateConsensus RPC to 
> 198.135.236.110:7050 timed out after 1.000s. Retrying in the next heartbeat 
> period. Already tried 1 times.
> W1201 17:06:22.163031  8037 consensus_peers.cc:328] T 
> 5c218656ce724d5d9ed8d9b4cfe31d3f P 783ac0cf9d1c409db2f5c759a5984401 -> Peer 
> ccf6f0332e414ef6af63fdee7f24e952 (hadoop110.datablocks.net:7050): Couldn't 
> send request to peer ccf6f0332e414ef6af63fdee7f24e952 for tablet 
> 5c218656ce724d5d9ed8d9b4cfe31d3f. Status: Timed out: UpdateConsensus RPC to 
> 198.135.236.110:7050 timed out after 1.000s. Retrying in the next heartbeat 
> period. Already tried 1 times.
> W1201 17:06:22.163307  8037 consensus_peers.cc:328] T 
> 5270ccb899ee4555bf2db043949adff3 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer 
> ccf6f0332e414ef6af63fdee7f24e952 (hadoop110.datablocks.net:7050): Couldn't 
> send request to peer ccf6f0332e414ef6af63fdee7f24e952 for tablet 
> 5270ccb899ee4555bf2db043949adff3. Status: Timed out: UpdateConsensus RPC to 
> 198.135.236.110:7050 timed out after 1.000s. Retrying in the next heartbeat 
> period. Already tried 1 times.
> W1201 17:06:22.163542  8037 consensus_peers.cc:328] T 
> bedcc43b14064131a04e2e185a79dea6 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer 
> ccf6f0332e414ef6af63fdee7f24e952 (hadoop110.datablocks.net:7050): Couldn't 
> send request to peer ccf6f0332e414ef6af63fdee7f24e952 for tablet 
> bedcc43b14064131a04e2e185a79dea6. Status: Timed out: UpdateConsensus RPC to 
> 198.135.236.110:7050 timed out after 1.000s. Retrying in the next heartbeat 
> period. Already tried 1 times.
> W1201 17:06:22.164099  8037 consensus_peers.cc:328] T 
> 47c9adb8dfbb438fa6245bce263f01f7 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer 
> ccf6f0332e414ef6af63fdee7f24e952 (hadoop110.datablocks.net:7050): Couldn't 
> send request to peer ccf6f0332e414ef6af63fdee7f24e952 for tablet 
> 47c9adb8dfbb438fa6245bce263f01f7. Status: Timed out: UpdateConsensus RPC to 
> 198.135.236.110:7050 timed out after 1.000s. Retrying in the next heartbeat 
> period. Already tried 1 times.
> W1201 17:06:22.164248  8037 consensus_peers.cc:328] T 
> 125aedead9d64b90b6905f139b277a7f P 783ac0cf9d1c409db2f5c759a5984401 -> Peer 
> ccf6f0332e414ef6af63fdee7f24e952 (hadoop110.datablocks.net:7050): Couldn't 
> send request to peer ccf6f0332e414ef6af63fdee7f24e952 for tablet 
> 125aedead9d64b90b6905f139b277a7f. Status: Timed out: UpdateConsensus RPC to 
> 198.135.236.110:7050 timed out after 1.000s. Retrying in the next heartbeat 
> period. Already tried 1 times.
> W1201 17:06:22.164388  8037 consensus_peers.cc:328] T 
> ff94ba2330cd4f1cae67b271a681ee97 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer 
> ccf6f0332e414ef6af63fdee7f24e952 (hadoop110.datablocks.net:7050): Couldn't 
> send request to peer ccf6f0332e414ef6af63fdee7f24e952 for tablet 
> ff94ba2330cd4f1cae67b271a681ee97. Status: Timed out: UpdateConsensus RPC to 
> 198.135.236.110:7050 timed out after 1.000s. Retrying in the next heartbeat 
> period. Already tried 1 times.
> W1201 17:06:22.164504  8037 consensus_peers.cc:328] T 
> 57fdc0e34e704782878ebd739151b4f3 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer 
> ccf6f0332e414ef6af63fdee7f24e952 (hadoop110.datablocks.net:7050): Couldn't 
> send request to peer ccf6f0332e414ef6af63fdee7f24e952 for tablet 
> 57fdc0e34e704782878ebd739151b4f3. Status: Timed out: UpdateConsensus RPC to 
> 198.135.236.110:7050 timed out after 1.000s. Retrying in the next heartbeat 
> period. Already tried 1 times.
> W1201 17:06:22.164909  8037 consensus_peers.cc:328] T 
> 98cb9a3d623041488eea38554255f224 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer 
> ccf6f0332e414ef6af63fdee7f24e952 (hadoop110.datablocks.net:7050): Couldn't 
> send request to peer ccf6f0332e414ef6af63fdee7f24e952 for tablet 
> 98cb9a3d623041488eea38554255f224. Status: Timed out: UpdateConsensus RPC to 
> 198.135.236.110:7050 timed out after 1.000s. Retrying in the next heartbeat 
> period. Already...
> ...........
> W1201 17:08:09.960381  8037 consensus_peers.cc:328] T 
> 30b2ead4ee044bb096582d554972ec30 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer 
> ccf6f0332e414ef6af63fdee7f24e952 (hadoop110.datablocks.net:7050): Couldn't 
> send request to peer ccf6f0332e414ef6af63fdee7f24e952 for tablet 
> 30b2ead4ee044bb096582d554972ec30. Status: Timed out: UpdateConsensus RPC to 
> 198.135.236.110:7050 timed out after 1.000s. Retrying in the next heartbeat 
> period. Already tried 1 times.
> W1201 17:08:11.079704  8037 outbound_call.cc:199] RPC callback for RPC call 
> kudu.consensus.ConsensusService.UpdateConsensus -> 
> {remote=198.135.236.110:7050, user_credentials={real_user=kudu}} blocked 
> reactor thread for 1.11935e+06us
> W1201 17:08:11.068311 25944 log.cc:531] Time spent Append to log took a long 
> time: real 0.948s  user 0.000s sys 0.000s
> W1201 17:08:11.068356 25992 log.cc:531] Time spent Append to log took a long 
> time: real 0.951s  user 0.000s sys 0.000s
> W1201 17:08:11.068356 26352 log.cc:531] Time spent Append to log took a long 
> time: real 0.948s  user 0.000s sys 0.000s
> W1201 17:08:11.068473 25984 log.cc:531] Time spent Append to log took a long 
> time: real 0.949s  user 0.000s sys 0.000s
> W1201 17:08:11.068496 26015 log.cc:531] Time spent Append to log took a long 
> time: real 0.947s  user 0.000s sys 0.000s
> W1201 17:08:11.068532 26032 log.cc:531] Time spent Append to log took a long 
> time: real 0.948s  user 0.000s sys 0.000s
> W1201 17:08:11.068536 26101 log.cc:531] Time spent Append to log took a long 
> time: real 0.946s  user 0.001s sys 0.000s
> W1201 17:08:11.068539 26025 log.cc:531] Time spent Append to log took a long 
> time: real 0.951s  user 0.000s sys 0.000s
> W1201 17:08:11.068676 26212 log.cc:531] Time spent Append to log took a long 
> time: real 0.961s  user 0.000s sys 0.000s
> W1201 17:08:11.068936 25979 log.cc:531] Time spent Append to log took a long 
> time: real 0.956s  user 0.000s sys 0.000s
> W1201 17:08:11.080740  8037 connection.cc:205] RPC call timeout handler was 
> delayed by 1.11752s! This may be due to a process-wide pause such as 
> swapping, logging-related delays, or allocator lock contention. Will allow an 
> additional 0.1s for a response.
> W1201 17:08:11.082247  8216 rpcz_store.cc:234] Call 
> kudu.consensus.ConsensusService.UpdateConsensus from 198.135.236.105:64587 
> (request call id 146425) took 960ms (client timeout 1000).
> W1201 17:08:11.083259  8037 connection.cc:205] RPC call timeout handler was 
> delayed by 1.1151s! This may be due to a process-wide pause such as swapping, 
> logging-related delays, or allocator lock contention. Will allow an 
> additional 0.1s for a response.
> W1201 17:08:11.083643  8216 rpcz_store.cc:238] Trace:
> 1201 17:08:10.121239 (+     0us) service_pool.cc:143] Inserting onto call 
> queue
> 1201 17:08:10.121250 (+    11us) service_pool.cc:202] Handling call
> 1201 17:08:10.121310 (+    60us) raft_consensus.cc:1127] Updating replica for 
> 1 ops
> 1201 17:08:10.121325 (+    15us) raft_consensus.cc:1166] Early marking 
> committed up to index 0
> 1201 17:08:10.121325 (+     0us) raft_consensus.cc:1171] Triggering prepare 
> for 1 ops
> 1201 17:08:10.121646 (+   321us) log.cc:447] Serialized 114384 byte log entry
> 1201 17:08:10.121647 (+     1us) raft_consensus.cc:1274] Marking committed up 
> to 5529
> 1201 17:08:10.121648 (+     1us) raft_consensus.cc:1323] Filling consensus 
> response to leader.
> 1201 17:08:10.121649 (+     1us) raft_consensus.cc:1297] Waiting on the 
> replicates to finish logging
> 1201 17:08:11.082219 (+960570us) raft_consensus.cc:1310] finished
> 1201 17:08:11.082220 (+     1us) raft_consensus.cc:1318] UpdateReplicas() 
> finished
> 1201 17:08:11.082230 (+    10us) inbound_call.cc:130] Queueing success 
> response
> Related trace 'txn':
> 1201 17:08:10.121603 (+     0us) write_transaction.cc:75] PREPARE: Starting
> 1201 17:08:10.121642 (+    39us) write_transaction.cc:241] Acquiring schema 
> lock in shared mode
> 1201 17:08:10.121642 (+     0us) write_transaction.cc:244] Acquired schema 
> lock
> 1201 17:08:10.121642 (+     0us) tablet.cc:306] PREPARE: Decoding operations
> 1201 17:08:10.121862 (+   220us) tablet.cc:335] PREPARE: Acquiring locks for 
> 172 operations
> 1201 17:08:10.122085 (+   223us) tablet.cc:339] PREPARE: locks acquired
> 1201 17:08:10.122085 (+     0us) write_transaction.cc:100] PREPARE: finished.
> 1201 17:08:10.122094 (+     9us) write_transaction.cc:106] Start()
> 1201 17:08:10.122097 (+     3us) write_transaction.cc:108] Timestamp: P: 
> 1480630090120349 usec, L: 0
> Metrics: 
> {"tcmalloc_contention_cycles":16000,"child_traces":[["txn",{"num_ops":172,"prepare.queue_time_us":242,"prepare.run_cpu_time_us":496,"prepare.run_wall_time_us":500,"tcmalloc_contention_cycles":74624,"thread_start_us":230,"threads_started":1}]]}
> W1201 17:08:11.083673  8037 connection.cc:205] RPC call timeout handler was 
> delayed by 1.11161s! This may be due to a process-wide pause such as 
> swapping, logging-related delays, or allocator lock contention. Will allow an 
> additional 0.1s for a response.
> W1201 17:08:11.084110  8037 connection.cc:205] RPC call timeout handler was 
> delayed by 1.111s! This may be due to a process-wide pause such as swapping, 
> logging-related delays, or allocator lock contention. Will allow an 
> additional 0.1s for a response.
> W1201 17:08:11.084240  8037 connection.cc:205] RPC call timeout handler was 
> delayed by 1.09875s! This may be due to a process-wide pause such as 
> swapping, logging-related delays, or allocator lock contention. Will allow an 
> additional 0.1s for a response.
> W1201 17:08:11.084365  8037 connection.cc:205] RPC call timeout handler was 
> delayed by 1.08527s! This may be due to a process-wide pause such as 
> swapping, logging-related delays, or allocator lock contention. Will allow an 
> additional 0.1s for a response.
> W1201 17:08:11.084503  8037 consensus_peers.cc:328] T 
> f165a3739dca46aab29b2c965212d3f5 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer 
> 90a20dfea5be457b8a47024785be0834 (hadoop109.datablocks.net:7050): Couldn't 
> send request to peer 90a20dfea5be457b8a47024785be0834 for tablet 
> f165a3739dca46aab29b2c965212d3f5. Status: Timed out: UpdateConsensus RPC to 
> 198.135.236.109:7050 timed out after 1.000s. Retrying in the next heartbeat 
> period. Already tried 41 times.
> W1201 17:08:11.084656  8037 consensus_peers.cc:328] T 
> c121f0c905df47eb9d33773e5528e854 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer 
> ccf6f0332e414ef6af63fdee7f24e952 (hadoop110.datablocks.net:7050): Couldn't 
> send request to peer ccf6f0332e414ef6af63fdee7f24e952 for tablet 
> c121f0c905df47eb9d33773e5528e854. Status: Timed out: UpdateConsensus RPC to 
> 198.135.236.110:7050 timed out after 1.000s. Retrying in the next heartbeat 
> period. Already tried 1 times.
> W1201 17:08:11.084799  8037 connection.cc:205] RPC call timeout handler was 
> delayed by 1.0541s! This may be due to a process-wide pause such as swapping, 
> logging-related delays, or allocator lock contention. Will allow an 
> additional 0.1s for a response.
> W1201 17:08:11.084997  8037 consensus_peers.cc:328] T 
> ff94ba2330cd4f1cae67b271a681ee97 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer 
> ccf6f0332e414ef6af63fdee7f24e952 (hadoop110.datablocks.net:7050): Couldn't 
> send request to peer ccf6f0332e414ef6af63fdee7f24e952 for tablet 
> ff94ba2330cd4f1cae67b271a681ee97. Status: Timed out: UpdateConsensus RPC to 
> 198.135.236.110:7050 timed out after 1.000s. Retrying in the next heartbeat 
> period. Already tried 1 times.
> W1201 17:08:11.085119  8037 connection.cc:205] RPC call timeout handler was 
> delayed by 1.02432s! This may be due to a process-wide pause such as 
> swapping, logging-related delays, or allocator lock contention. Will allow an 
> additional 0.1s for a response.
> W1201 17:08:11.085245  8037 consensus_peers.cc:328] T 
> 0a22984862114f6ca1b8488236ac5ad0 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer 
> ccf6f0332e414ef6af63fdee7f24e952 (hadoop110.datablocks.net:7050): Couldn't 
> send request to peer ccf6f0332e414ef6af63fdee7f24e952 for tablet 
> 0a22984862114f6ca1b8488236ac5ad0. Status: Timed out: UpdateConsensus RPC to 
> 198.135.236.110:7050 timed out after 1.000s. Retrying in the next heartbeat 
> period. Already tried 1 times.
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
