[
https://issues.apache.org/jira/browse/KUDU-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17010986#comment-17010986
]
Alexey Serbin commented on KUDU-1781:
-------------------------------------
[~gaojun2048], this issue doesn't contain a proper report of any isolated
problem, so it's not possible to answer your question. RPC queue overflows
might occur time to time due to spikes in workload, extra load, etc.
For troubleshooting, please contact Kudu Slack channel or Kudu user mailing
list:
https://kudu.apache.org/community.html
> RPC calls continuously attempt to elect leader
> ----------------------------------------------
>
> Key: KUDU-1781
> URL: https://issues.apache.org/jira/browse/KUDU-1781
> Project: Kudu
> Issue Type: Bug
> Components: rpc
> Environment: Centos 6.4
> Reporter: Henry Tang
> Priority: Major
> Fix For: n/a
>
>
> Kudu fails large inserts eventually due to RPC hang (smaller ones work fine).
> {code:title=90a20dfea5be457b8a47024785be0834.Warning}
> W1201 17:07:20.848736 34491 leader_election.cc:271] T
> d1d484ace2704bbab6d8421060061502 P 90a20dfea5be457b8a47024785be0834
> [CANDIDATE]: Term 4 pre-election: RPC error from VoteRequest() call to peer
> 783ac0cf9d1c409db2f5c759a5984401: Timed out: RequestConsensusVote RPC to
> 198.135.236.108:7050 timed out after 4.382s
> W1201 17:07:20.852082 34491 leader_election.cc:271] T
> 4a0fb77695f7439382a1ce8d36b33e90 P 90a20dfea5be457b8a47024785be0834
> [CANDIDATE]: Term 5 pre-election: RPC error from VoteRequest() call to peer
> 783ac0cf9d1c409db2f5c759a5984401: Timed out: RequestConsensusVote RPC to
> 198.135.236.108:7050 timed out after 5.310s
> W1201 17:07:20.853283 34491 leader_election.cc:271] T
> 0c4525fd4bc0454f9bda79f83edf0d13 P 90a20dfea5be457b8a47024785be0834
> [CANDIDATE]: Term 9 pre-election: RPC error from VoteRequest() call to peer
> 783ac0cf9d1c409db2f5c759a5984401: Timed out: RequestConsensusVote RPC to
> 198.135.236.108:7050 timed out after 6.091s
> W1201 17:07:20.854065 34491 leader_election.cc:271] T
> 8c820bf3b775479287a735d6f4538726 P 90a20dfea5be457b8a47024785be0834
> [CANDIDATE]: Term 3 pre-election: RPC error from VoteRequest() call to peer
> 783ac0cf9d1c409db2f5c759a5984401: Timed out: RequestConsensusVote RPC to
> 198.135.236.108:7050 timed out after 6.318s
> W1201 17:07:20.854698 34491 leader_election.cc:271] T
> 74c24ed49c0c441ba9adde5fb055a713 P 90a20dfea5be457b8a47024785be0834
> [CANDIDATE]: Term 3 pre-election: RPC error from VoteRequest() call to peer
> 783ac0cf9d1c409db2f5c759a5984401: Timed out: RequestConsensusVote RPC to
> 198.135.236.108:7050 timed out after 6.414s
> W1201 17:07:20.855315 34491 leader_election.cc:271] T
> 1277e88877b046508b794da55e84b40c P 90a20dfea5be457b8a47024785be0834
> [CANDIDATE]: Term 3 pre-election: RPC error from VoteRequest() call to peer
> 783ac0cf9d1c409db2f5c759a5984401: Timed out: RequestConsensusVote RPC to
> 198.135.236.108:7050 timed out after 6.418s
> W1201 17:07:20.857662 34494 leader_election.cc:331] T
> a775c0a120fb42cdab5aec2148181790 P 90a20dfea5be457b8a47024785be0834
> [CANDIDATE]: Term 3 pre-election: Vote denied by peer
> ccf6f0332e414ef6af63fdee7f24e952 with higher term. Message: Invalid argument:
> T a775c0a120fb42cdab5aec2148181790 P ccf6f0332e414ef6af63fdee7f24e952 [term 4
> FOLLOWER]: Leader pre-election vote request: Denying vote to candidate
> 90a20dfea5be457b8a47024785be0834 for term 3 because replica is either leader
> or believes a valid leader to be alive.
> W1201 17:07:20.857676 34491 leader_election.cc:271] T
> a775c0a120fb42cdab5aec2148181790 P 90a20dfea5be457b8a47024785be0834
> [CANDIDATE]: Term 3 pre-election: RPC error from VoteRequest() call to peer
> 783ac0cf9d1c409db2f5c759a5984401: Timed out: RequestConsensusVote RPC to
> 198.135.236.108:7050 timed out after 6.902s
> W1201 17:07:20.864228 34494 leader_election.cc:331] T
> ff94ba2330cd4f1cae67b271a681ee97 P 90a20dfea5be457b8a47024785be0834
> [CANDIDATE]: Term 4 pre-election: Vote denied by peer
> ccf6f0332e414ef6af63fdee7f24e952 with higher term. Message: Invalid argument:
> T ff94ba2330cd4f1cae67b271a681ee97 P ccf6f0332e414ef6af63fdee7f24e952 [term 5
> FOLLOWER]: Leader pre-election vote request: Denying vote to candidate
> 90a20dfea5be457b8a47024785be0834 for term 4 because replica is either leader
> or believes a valid leader to be alive.
> W1201 17:07:20.887080 34491 leader_election.cc:331] T
> a775c0a120fb42cdab5aec2148181790 P 90a20dfea5be457b8a47024785be0834
> [CANDIDATE]: Term 3 pre-election: Vote denied by peer
> 783ac0cf9d1c409db2f5c759a5984401 with higher term. Message: Invalid argument:
> T a775c0a120fb42cdab5aec2148181790 P 783ac0cf9d1c409db2f5c759a5984401 [term 4
> LEADER]: Leader pre-election vote request: Denying vote to candidate
> 90a20dfea5be457b8a47024785be0834 for term 3 because replica is either leader
> or believes a valid leader to be alive.
> W1201 17:07:20.890411 34491 leader_election.cc:331] T
> ff94ba2330cd4f1cae67b271a681ee97 P 90a20dfea5be457b8a47024785be0834
> [CANDIDATE]: Term 4 pre-election: Vote denied by peer
> 783ac0cf9d1c409db2f5c759a5984401 with higher term. Message: Invalid argument:
> T ff94ba2330cd4f1cae67b271a681ee97 P 783ac0cf9d1c409db2f5c759a5984401 [term 5
> LEADER]: Leader pre-election vote request: Denying vote to candidate
> 90a20dfea5be457b8a47024785be0834 for term 4 because replica is either leader
> or believes a valid leader to be alive.
> W1201 17:09:07.194459 34491 leader_election.cc:331] T
> c8a0477d19b5409ea429b7e9256d9cb6 P 90a20dfea5be457b8a47024785be0834
> [CANDIDATE]: Term 5 pre-election: Vote denied by peer
> 93e2916028fb4868a0b860bece51b420 with higher term. Message: Invalid argument:
> T c8a0477d19b5409ea429b7e9256d9cb6 P 93e2916028fb4868a0b860bece51b420 [term 6
> FOLLOWER]: Leader pre-election vote request: Denying vote to candidate
> 90a20dfea5be457b8a47024785be0834 for term 5 because replica is either leader
> or believes a valid leader to be alive.
> W1201 17:09:07.258124 34493 leader_election.cc:331] T
> 33ea6a60809c4384aadc140d2083ad1d P 90a20dfea5be457b8a47024785be0834
> [CANDIDATE]: Term 2 pre-election: Vote denied by peer
> cf8fe3aebd7f403f87f0fc83050238b1 with higher term. Message: Invalid argument:
> T 33ea6a60809c4384aadc140d2083ad1d P cf8fe3aebd7f403f87f0fc83050238b1 [term 3
> LEADER]: Leader pre-election vote request: Denying vote to candidate
> 90a20dfea5be457b8a47024785be0834 for term 2 because replica is either leader
> or believes a valid leader to be alive.
> W1201 17:09:07.347940 34491 leader_election.cc:331] T
> c8a0477d19b5409ea429b7e9256d9cb6 P 90a20dfea5be457b8a47024785be0834
> [CANDIDATE]: Term 5 pre-election: Vote denied by peer
> 783ac0cf9d1c409db2f5c759a5984401 with higher term. Message: Invalid argument:
> T c8a0477d19b5409ea429b7e9256d9cb6 P 783ac0cf9d1c409db2f5c759a5984401 [term 6
> LEADER]: Leader pre-election vote request: Denying vote to candidate
> 90a20dfea5be457b8a47024785be0834 for term 5 because replica is either leader
> or believes a valid leader to be alive.
> W1201 17:09:07.412025 34491 leader_election.cc:331] T
> 33ea6a60809c4384aadc140d2083ad1d P 90a20dfea5be457b8a47024785be0834
> [CANDIDATE]: Term 2 pre-election: Vote denied by peer
> 93e2916028fb4868a0b860bece51b420 with higher term. Message: Invalid argument:
> T 33ea6a60809c4384aadc140d2083ad1d P 93e2916028fb4868a0b860bece51b420 [term 3
> FOLLOWER]: Leader pre-election vote request: Denying vote to candidate
> 90a20dfea5be457b8a47024785be0834 for term 2 because replica is either leader
> or believes a valid leader to be alive.
> W1201 17:09:07.473625 34491 leader_election.cc:331] T
> 1277e88877b046508b794da55e84b40c P 90a20dfea5be457b8a47024785be0834
> [CANDIDATE]: Term 3 pre-election: Vote denied by peer
> 783ac0cf9d1c409db2f5c759a5984401 with higher term. Message: Invalid argument:
> T 1277e88877b046508b794da55e84b40c P 783ac0cf9d1c409db2f5c759a5984401 [term 4
> LEADER]: Leader pre-election vote request: Denying vote to candidate
> 90a20dfea5be457b8a47024785be0834 for term 3 because replica is either leader
> or believes a valid leader to be alive.
> W1201 17:09:07.575197 34494 leader_election.cc:331] T
> a808886d01bb4218babf6dd4cf6fd62e P 90a20dfea5be457b8a47024785be0834
> [CANDIDATE]: Term 5 pre-election: Vote denied by peer
> ccf6f0332e414ef6af63fdee7f24e952 with higher term. Message: Invalid argument:
> T a808886d01bb4218babf6dd4cf6fd62e P ccf6f0332e414ef6af63fdee7f24e952 [term 6
> LEADER]: Leader pre-election vote request: Denying vote to candidate
> 90a20dfea5be457b8a47024785be0834 for term 5 because replica is either leader
> or believes a valid leader to be alive.
> W1201 17:09:11.101361 34491 leader_election.cc:331] T
> 1277e88877b046508b794da55e84b40c P 90a20dfea5be457b8a47024785be0834
> [CANDIDATE]: Term 3 pre-election: Vote denied by peer
> 93e2916028fb4868a0b860bece51b420 with higher term. Message: Invalid argument:
> T 1277e88877b046508b794da55e84b40c P 93e2916028fb4868a0b860bece51b420 [term 4
> FOLLOWER]: Leader pre-election vote request: Denying vote to candidate
> 90a20dfea5be457b8a47024785be0834 for term 3 because replica is either leader
> or believes a valid leader to be alive.
> W1201 17:09:11.126883 34491 leader_election.cc:331] T
> a808886d01bb4218babf6dd4cf6fd62e P 90a20dfea5be457b8a47024785be0834
> [CANDIDATE]: Term 5 pre-election: Vote denied by peer
> 93e2916028fb4868a0b860bece51b420 with higher term. Message: Invalid argument:
> T a808886d01bb4218babf6dd4cf6fd62e P 93e2916028fb4868a0b860bece51b420 [term 6
> FOLLOWER]: Leader pre-election vote request: Denying vote to candidate
> 90a20dfea5be457b8a47024785be0834 for term 5 because replica is either leader
> or believes a valid leader to be alive.
> W1201 17:06:15.173542 34491 consensus_peers.cc:328] T
> 7148720df08b4cf2ae08ca08c6a0cdc9 P 90a20dfea5be457b8a47024785be0834 -> Peer
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet
> 7148720df08b4cf2ae08ca08c6a0cdc9. Status: Remote error: Service unavailable:
> UpdateConsensus request on kudu.consensus.ConsensusService from
> 198.135.236.109:3614 dropped due to backpressure. The service queue is full;
> it has 50 items.. Retrying in the next heartbeat period. Already tried 2
> times.
> W1201 17:06:15.173975 34491 consensus_peers.cc:328] T
> 3e02c6f1140042ddb45ff60a53be7d18 P 90a20dfea5be457b8a47024785be0834 -> Peer
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet
> 3e02c6f1140042ddb45ff60a53be7d18. Status: Remote error: Service unavailable:
> UpdateConsensus request on kudu.consensus.ConsensusService from
> 198.135.236.109:3614 dropped due to backpressure. The service queue is full;
> it has 50 items.. Retrying in the next heartbeat period. Already tried 2
> times.
> W1201 17:06:15.174345 34491 consensus_peers.cc:328] T
> a808886d01bb4218babf6dd4cf6fd62e P 90a20dfea5be457b8a47024785be0834 -> Peer
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet
> a808886d01bb4218babf6dd4cf6fd62e. Status: Remote error: Service unavailable:
> UpdateConsensus request on kudu.consensus.ConsensusService from
> 198.135.236.109:3614 dropped due to backpressure. The service queue is full;
> it has 50 items.. Retrying in the next heartbeat period. Already tried 2
> times.
> W1201 17:06:19.025938 34491 consensus_peers.cc:328] T
> 6fed775bbe6a4e70ad4f22393ed53508 P 90a20dfea5be457b8a47024785be0834 -> Peer
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet
> 6fed775bbe6a4e70ad4f22393ed53508. Status: Timed out: UpdateConsensus RPC to
> 198.135.236.105:7050 timed out after 1.000s. Retrying in the next heartbeat
> period. Already tried 1 times.
> W1201 17:06:19.031250 34491 consensus_peers.cc:328] T
> 58cd689cc54d41e2981dbd4459340889 P 90a20dfea5be457b8a47024785be0834 -> Peer
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet
> 58cd689cc54d41e2981dbd4459340889. Status: Timed out: UpdateConsensus RPC to
> 198.135.236.105:7050 timed out after 1.000s. Retrying in the next heartbeat
> period. Already tried 1 times.
> W1201 17:06:19.040518 34491 consensus_peers.cc:328] T
> a808886d01bb4218babf6dd4cf6fd62e P 90a20dfea5be457b8a47024785be0834 -> Peer
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet
> a808886d01bb4218babf6dd4cf6fd62e. Status: Timed out: UpdateConsensus RPC to
> 198.135.236.105:7050 timed out after 1.000s. Retrying in the next heartbeat
> period. Already tried 1 times.
> {code}
> {code:title=783ac0cf9d1c409db2f5c759a5984401.Warning}
> W1201 17:06:19.120246 8034 consensus_peers.cc:328] T
> 8a31f19251c44158af6e6fcd20cbda05 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet
> 8a31f19251c44158af6e6fcd20cbda05. Status: Timed out: UpdateConsensus RPC to
> 198.135.236.105:7050 timed out after 1.000s. Retrying in the next heartbeat
> period. Already tried 1 times.
> W1201 17:06:19.121495 8034 consensus_peers.cc:328] T
> 2a78537dea7e4b6abfee96ea78e2c4ab P 783ac0cf9d1c409db2f5c759a5984401 -> Peer
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet
> 2a78537dea7e4b6abfee96ea78e2c4ab. Status: Timed out: UpdateConsensus RPC to
> 198.135.236.105:7050 timed out after 1.000s. Retrying in the next heartbeat
> period. Already tried 1 times.
> W1201 17:06:19.123369 8034 consensus_peers.cc:328] T
> a9c651a58c804c5891933372d0190084 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet
> a9c651a58c804c5891933372d0190084. Status: Timed out: UpdateConsensus RPC to
> 198.135.236.105:7050 timed out after 1.000s. Retrying in the next heartbeat
> period. Already tried 1 times.
> W1201 17:06:19.123680 8034 consensus_peers.cc:328] T
> b6bb1fd948264d0bb070e43fde24311b P 783ac0cf9d1c409db2f5c759a5984401 -> Peer
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet
> b6bb1fd948264d0bb070e43fde24311b. Status: Timed out: UpdateConsensus RPC to
> 198.135.236.105:7050 timed out after 1.000s. Retrying in the next heartbeat
> period. Already tried 1 times.
> W1201 17:06:19.123839 8034 consensus_peers.cc:328] T
> 61214477f4254b688860e501b5a08460 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet
> 61214477f4254b688860e501b5a08460. Status: Timed out: UpdateConsensus RPC to
> 198.135.236.105:7050 timed out after 1.000s. Retrying in the next heartbeat
> period. Already tried 1 times.
> W1201 17:06:19.123981 8034 consensus_peers.cc:328] T
> 90624be426244deb8304e61a3bf7de80 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet
> 90624be426244deb8304e61a3bf7de80. Status: Timed out: UpdateConsensus RPC to
> 198.135.236.105:7050 timed out after 1.000s. Retrying in the next heartbeat
> period. Already tried 1 times.
> W1201 17:06:19.124114 8034 consensus_peers.cc:328] T
> 938fe0c077e441c681453f18c70bd72a P 783ac0cf9d1c409db2f5c759a5984401 -> Peer
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet
> 938fe0c077e441c681453f18c70bd72a. Status: Timed out: UpdateConsensus RPC to
> 198.135.236.105:7050 timed out after 1.000s. Retrying in the next heartbeat
> period. Already tried 1 times.
> W1201 17:06:19.124477 8034 consensus_peers.cc:328] T
> 629637ec8e36411d88e5fec51f778811 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet
> 629637ec8e36411d88e5fec51f778811. Status: Timed out: UpdateConsensus RPC to
> 198.135.236.105:7050 timed out after 1.000s. Retrying in the next heartbeat
> period. Already tried 1 times.
> W1201 17:06:19.124670 8034 consensus_peers.cc:328] T
> 1277e88877b046508b794da55e84b40c P 783ac0cf9d1c409db2f5c759a5984401 -> Peer
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet
> 1277e88877b046508b794da55e84b40c. Status: Timed out: UpdateConsensus RPC to
> 198.135.236.105:7050 timed out after 1.000s. Retrying in the next heartbeat
> period. Already tried 1 times.
> W1201 17:06:19.124796 8034 consensus_peers.cc:328] T
> 98cb9a3d623041488eea38554255f224 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet
> 98cb9a3d623041488eea38554255f224. Status: Timed out: UpdateConsensus RPC to
> 198.135.236.105:7050 timed out after 1.000s. Retrying in the next heartbeat
> period. Already tried 1 times.
> W1201 17:06:19.125264 8034 consensus_peers.cc:328] T
> d692e056b366489996c45e62a9f19b98 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet
> d692e056b366489996c45e62a9f19b98. Status: Timed out: UpdateConsensus RPC to
> 198.135.236.105:7050 timed out after 1.000s. Retrying in the next heartbeat
> period. Already tried 1 times.
> W1201 17:06:19.125483 8034 consensus_peers.cc:328] T
> b34bd2ac337140e79c306720641ab624 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet
> b34bd2ac337140e79c306720641ab624. Status: Timed out: UpdateConsensus RPC to
> 198.135.236.105:7050 timed out after 1.000s. Retrying in the next heartbeat
> period. Already tried 1 times.
> W1201 17:06:19.125762 8034 consensus_peers.cc:328] T
> ef6051d64b8b465e9be70c049f3d475a P 783ac0cf9d1c409db2f5c759a5984401 -> Peer
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet
> ef6051d64b8b465e9be70c049f3d475a. Status: Timed out: UpdateConsensus RPC to
> 198.135.236.105:7050 timed out after 1.000s. Retrying in the next heartbeat
> period. Already tried 1 times.
> W1201 17:06:19.125893 8034 consensus_peers.cc:328] T
> 64a7e27aa1a3488096999556810d929a P 783ac0cf9d1c409db2f5c759a5984401 -> Peer
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet
> 64a7e27aa1a3488096999556810d929a. Status: Timed out: UpdateConsensus RPC to
> 198.135.236.105:7050 timed out after 1.000s. Retrying in the next heartbeat
> period. Already tried 1 times.
> W1201 17:06:19.126030 8034 consensus_peers.cc:328] T
> 0ad86cc48c3e45188956a4f019c11488 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet
> 0ad86cc48c3e45188956a4f019c11488. Status: Timed out: UpdateConsensus RPC to
> 198.135.236.105:7050 timed out after 1.000s. Retrying in the next heartbeat
> period. Already tried 1 times.
> W1201 17:06:19.126160 8034 consensus_peers.cc:328] T
> 9fe8802892bc4bd18d3bf0ce64c8a65c P 783ac0cf9d1c409db2f5c759a5984401 -> Peer
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet
> 9fe8802892bc4bd18d3bf0ce64c8a65c. Status: Timed out: UpdateConsensus RPC to
> 198.135.236.105:7050 timed out after 1.000s. Retrying in the next heartbeat
> period. Already tried 1 times.
> W1201 17:06:19.126323 8034 consensus_peers.cc:328] T
> c8a0477d19b5409ea429b7e9256d9cb6 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer
> 93e2916028fb4868a0b860bece51b420 (hadoop105.datablocks.net:7050): Couldn't
> send request to peer 93e2916028fb4868a0b860bece51b420 for tablet
> c8a0477d19b5409ea429b7e9256d9cb6. Status: Timed out: UpdateConsensus RPC to
> 198.135.236.105:7050 timed out after 1.000s. Retrying in the next heartbeat
> period. Already tried 1 times.
> W1201 17:06:20.012154 37885 arena.cc:132] Arena 0x2fcc1980 footprint
> (276365079 bytes) exceeded warning threshold (268435456 bytes)
> @ 0x1a1adef kudu::ArenaBase<>::AddComponent()
> @ 0x1a1b324 kudu::ArenaBase<>::AllocateBytesFallback()
> @ 0x9544bf kudu::tablet::MemRowSet::Insert()
> @ 0x90060f kudu::tablet::Tablet::InsertOrUpsertUnlocked()
> @ 0x900d3e kudu::tablet::Tablet::ApplyRowOperation()
> @ 0x900e86 kudu::tablet::Tablet::ApplyRowOperations()
> @ 0x93407f kudu::tablet::WriteTransaction::Apply()
> @ 0x92de09 kudu::tablet::TransactionDriver::ApplyTask()
> @ 0x1a6067e kudu::ThreadPool::DispatchThread()
> @ 0x1a5b42a kudu::Thread::SuperviseThread()
> @ 0x7fb2dafe0a51 start_thread
> @ 0x7fb2d991c93d clone
> @ (nil) (unknown)
> W1201 17:06:22.162607 8037 consensus_peers.cc:328] T
> 679a82648d094465a2fc43b1a139bb70 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer
> ccf6f0332e414ef6af63fdee7f24e952 (hadoop110.datablocks.net:7050): Couldn't
> send request to peer ccf6f0332e414ef6af63fdee7f24e952 for tablet
> 679a82648d094465a2fc43b1a139bb70. Status: Timed out: UpdateConsensus RPC to
> 198.135.236.110:7050 timed out after 1.000s. Retrying in the next heartbeat
> period. Already tried 1 times.
> W1201 17:06:22.163031 8037 consensus_peers.cc:328] T
> 5c218656ce724d5d9ed8d9b4cfe31d3f P 783ac0cf9d1c409db2f5c759a5984401 -> Peer
> ccf6f0332e414ef6af63fdee7f24e952 (hadoop110.datablocks.net:7050): Couldn't
> send request to peer ccf6f0332e414ef6af63fdee7f24e952 for tablet
> 5c218656ce724d5d9ed8d9b4cfe31d3f. Status: Timed out: UpdateConsensus RPC to
> 198.135.236.110:7050 timed out after 1.000s. Retrying in the next heartbeat
> period. Already tried 1 times.
> W1201 17:06:22.163307 8037 consensus_peers.cc:328] T
> 5270ccb899ee4555bf2db043949adff3 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer
> ccf6f0332e414ef6af63fdee7f24e952 (hadoop110.datablocks.net:7050): Couldn't
> send request to peer ccf6f0332e414ef6af63fdee7f24e952 for tablet
> 5270ccb899ee4555bf2db043949adff3. Status: Timed out: UpdateConsensus RPC to
> 198.135.236.110:7050 timed out after 1.000s. Retrying in the next heartbeat
> period. Already tried 1 times.
> W1201 17:06:22.163542 8037 consensus_peers.cc:328] T
> bedcc43b14064131a04e2e185a79dea6 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer
> ccf6f0332e414ef6af63fdee7f24e952 (hadoop110.datablocks.net:7050): Couldn't
> send request to peer ccf6f0332e414ef6af63fdee7f24e952 for tablet
> bedcc43b14064131a04e2e185a79dea6. Status: Timed out: UpdateConsensus RPC to
> 198.135.236.110:7050 timed out after 1.000s. Retrying in the next heartbeat
> period. Already tried 1 times.
> W1201 17:06:22.164099 8037 consensus_peers.cc:328] T
> 47c9adb8dfbb438fa6245bce263f01f7 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer
> ccf6f0332e414ef6af63fdee7f24e952 (hadoop110.datablocks.net:7050): Couldn't
> send request to peer ccf6f0332e414ef6af63fdee7f24e952 for tablet
> 47c9adb8dfbb438fa6245bce263f01f7. Status: Timed out: UpdateConsensus RPC to
> 198.135.236.110:7050 timed out after 1.000s. Retrying in the next heartbeat
> period. Already tried 1 times.
> W1201 17:06:22.164248 8037 consensus_peers.cc:328] T
> 125aedead9d64b90b6905f139b277a7f P 783ac0cf9d1c409db2f5c759a5984401 -> Peer
> ccf6f0332e414ef6af63fdee7f24e952 (hadoop110.datablocks.net:7050): Couldn't
> send request to peer ccf6f0332e414ef6af63fdee7f24e952 for tablet
> 125aedead9d64b90b6905f139b277a7f. Status: Timed out: UpdateConsensus RPC to
> 198.135.236.110:7050 timed out after 1.000s. Retrying in the next heartbeat
> period. Already tried 1 times.
> W1201 17:06:22.164388 8037 consensus_peers.cc:328] T
> ff94ba2330cd4f1cae67b271a681ee97 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer
> ccf6f0332e414ef6af63fdee7f24e952 (hadoop110.datablocks.net:7050): Couldn't
> send request to peer ccf6f0332e414ef6af63fdee7f24e952 for tablet
> ff94ba2330cd4f1cae67b271a681ee97. Status: Timed out: UpdateConsensus RPC to
> 198.135.236.110:7050 timed out after 1.000s. Retrying in the next heartbeat
> period. Already tried 1 times.
> W1201 17:06:22.164504 8037 consensus_peers.cc:328] T
> 57fdc0e34e704782878ebd739151b4f3 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer
> ccf6f0332e414ef6af63fdee7f24e952 (hadoop110.datablocks.net:7050): Couldn't
> send request to peer ccf6f0332e414ef6af63fdee7f24e952 for tablet
> 57fdc0e34e704782878ebd739151b4f3. Status: Timed out: UpdateConsensus RPC to
> 198.135.236.110:7050 timed out after 1.000s. Retrying in the next heartbeat
> period. Already tried 1 times.
> W1201 17:06:22.164909 8037 consensus_peers.cc:328] T
> 98cb9a3d623041488eea38554255f224 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer
> ccf6f0332e414ef6af63fdee7f24e952 (hadoop110.datablocks.net:7050): Couldn't
> send request to peer ccf6f0332e414ef6af63fdee7f24e952 for tablet
> 98cb9a3d623041488eea38554255f224. Status: Timed out: UpdateConsensus RPC to
> 198.135.236.110:7050 timed out after 1.000s. Retrying in the next heartbeat
> period. Already...
> ...........
> W1201 17:08:09.960381 8037 consensus_peers.cc:328] T
> 30b2ead4ee044bb096582d554972ec30 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer
> ccf6f0332e414ef6af63fdee7f24e952 (hadoop110.datablocks.net:7050): Couldn't
> send request to peer ccf6f0332e414ef6af63fdee7f24e952 for tablet
> 30b2ead4ee044bb096582d554972ec30. Status: Timed out: UpdateConsensus RPC to
> 198.135.236.110:7050 timed out after 1.000s. Retrying in the next heartbeat
> period. Already tried 1 times.
> W1201 17:08:11.079704 8037 outbound_call.cc:199] RPC callback for RPC call
> kudu.consensus.ConsensusService.UpdateConsensus ->
> {remote=198.135.236.110:7050, user_credentials={real_user=kudu}} blocked
> reactor thread for 1.11935e+06us
> W1201 17:08:11.068311 25944 log.cc:531] Time spent Append to log took a long
> time: real 0.948s user 0.000s sys 0.000s
> W1201 17:08:11.068356 25992 log.cc:531] Time spent Append to log took a long
> time: real 0.951s user 0.000s sys 0.000s
> W1201 17:08:11.068356 26352 log.cc:531] Time spent Append to log took a long
> time: real 0.948s user 0.000s sys 0.000s
> W1201 17:08:11.068473 25984 log.cc:531] Time spent Append to log took a long
> time: real 0.949s user 0.000s sys 0.000s
> W1201 17:08:11.068496 26015 log.cc:531] Time spent Append to log took a long
> time: real 0.947s user 0.000s sys 0.000s
> W1201 17:08:11.068532 26032 log.cc:531] Time spent Append to log took a long
> time: real 0.948s user 0.000s sys 0.000s
> W1201 17:08:11.068536 26101 log.cc:531] Time spent Append to log took a long
> time: real 0.946s user 0.001s sys 0.000s
> W1201 17:08:11.068539 26025 log.cc:531] Time spent Append to log took a long
> time: real 0.951s user 0.000s sys 0.000s
> W1201 17:08:11.068676 26212 log.cc:531] Time spent Append to log took a long
> time: real 0.961s user 0.000s sys 0.000s
> W1201 17:08:11.068936 25979 log.cc:531] Time spent Append to log took a long
> time: real 0.956s user 0.000s sys 0.000s
> W1201 17:08:11.080740 8037 connection.cc:205] RPC call timeout handler was
> delayed by 1.11752s! This may be due to a process-wide pause such as
> swapping, logging-related delays, or allocator lock contention. Will allow an
> additional 0.1s for a response.
> W1201 17:08:11.082247 8216 rpcz_store.cc:234] Call
> kudu.consensus.ConsensusService.UpdateConsensus from 198.135.236.105:64587
> (request call id 146425) took 960ms (client timeout 1000).
> W1201 17:08:11.083259 8037 connection.cc:205] RPC call timeout handler was
> delayed by 1.1151s! This may be due to a process-wide pause such as swapping,
> logging-related delays, or allocator lock contention. Will allow an
> additional 0.1s for a response.
> W1201 17:08:11.083643 8216 rpcz_store.cc:238] Trace:
> 1201 17:08:10.121239 (+ 0us) service_pool.cc:143] Inserting onto call
> queue
> 1201 17:08:10.121250 (+ 11us) service_pool.cc:202] Handling call
> 1201 17:08:10.121310 (+ 60us) raft_consensus.cc:1127] Updating replica for
> 1 ops
> 1201 17:08:10.121325 (+ 15us) raft_consensus.cc:1166] Early marking
> committed up to index 0
> 1201 17:08:10.121325 (+ 0us) raft_consensus.cc:1171] Triggering prepare
> for 1 ops
> 1201 17:08:10.121646 (+ 321us) log.cc:447] Serialized 114384 byte log entry
> 1201 17:08:10.121647 (+ 1us) raft_consensus.cc:1274] Marking committed up
> to 5529
> 1201 17:08:10.121648 (+ 1us) raft_consensus.cc:1323] Filling consensus
> response to leader.
> 1201 17:08:10.121649 (+ 1us) raft_consensus.cc:1297] Waiting on the
> replicates to finish logging
> 1201 17:08:11.082219 (+960570us) raft_consensus.cc:1310] finished
> 1201 17:08:11.082220 (+ 1us) raft_consensus.cc:1318] UpdateReplicas()
> finished
> 1201 17:08:11.082230 (+ 10us) inbound_call.cc:130] Queueing success
> response
> Related trace 'txn':
> 1201 17:08:10.121603 (+ 0us) write_transaction.cc:75] PREPARE: Starting
> 1201 17:08:10.121642 (+ 39us) write_transaction.cc:241] Acquiring schema
> lock in shared mode
> 1201 17:08:10.121642 (+ 0us) write_transaction.cc:244] Acquired schema
> lock
> 1201 17:08:10.121642 (+ 0us) tablet.cc:306] PREPARE: Decoding operations
> 1201 17:08:10.121862 (+ 220us) tablet.cc:335] PREPARE: Acquiring locks for
> 172 operations
> 1201 17:08:10.122085 (+ 223us) tablet.cc:339] PREPARE: locks acquired
> 1201 17:08:10.122085 (+ 0us) write_transaction.cc:100] PREPARE: finished.
> 1201 17:08:10.122094 (+ 9us) write_transaction.cc:106] Start()
> 1201 17:08:10.122097 (+ 3us) write_transaction.cc:108] Timestamp: P:
> 1480630090120349 usec, L: 0
> Metrics:
> {"tcmalloc_contention_cycles":16000,"child_traces":[["txn",{"num_ops":172,"prepare.queue_time_us":242,"prepare.run_cpu_time_us":496,"prepare.run_wall_time_us":500,"tcmalloc_contention_cycles":74624,"thread_start_us":230,"threads_started":1}]]}
> W1201 17:08:11.083673 8037 connection.cc:205] RPC call timeout handler was
> delayed by 1.11161s! This may be due to a process-wide pause such as
> swapping, logging-related delays, or allocator lock contention. Will allow an
> additional 0.1s for a response.
> W1201 17:08:11.084110 8037 connection.cc:205] RPC call timeout handler was
> delayed by 1.111s! This may be due to a process-wide pause such as swapping,
> logging-related delays, or allocator lock contention. Will allow an
> additional 0.1s for a response.
> W1201 17:08:11.084240 8037 connection.cc:205] RPC call timeout handler was
> delayed by 1.09875s! This may be due to a process-wide pause such as
> swapping, logging-related delays, or allocator lock contention. Will allow an
> additional 0.1s for a response.
> W1201 17:08:11.084365 8037 connection.cc:205] RPC call timeout handler was
> delayed by 1.08527s! This may be due to a process-wide pause such as
> swapping, logging-related delays, or allocator lock contention. Will allow an
> additional 0.1s for a response.
> W1201 17:08:11.084503 8037 consensus_peers.cc:328] T
> f165a3739dca46aab29b2c965212d3f5 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer
> 90a20dfea5be457b8a47024785be0834 (hadoop109.datablocks.net:7050): Couldn't
> send request to peer 90a20dfea5be457b8a47024785be0834 for tablet
> f165a3739dca46aab29b2c965212d3f5. Status: Timed out: UpdateConsensus RPC to
> 198.135.236.109:7050 timed out after 1.000s. Retrying in the next heartbeat
> period. Already tried 41 times.
> W1201 17:08:11.084656 8037 consensus_peers.cc:328] T
> c121f0c905df47eb9d33773e5528e854 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer
> ccf6f0332e414ef6af63fdee7f24e952 (hadoop110.datablocks.net:7050): Couldn't
> send request to peer ccf6f0332e414ef6af63fdee7f24e952 for tablet
> c121f0c905df47eb9d33773e5528e854. Status: Timed out: UpdateConsensus RPC to
> 198.135.236.110:7050 timed out after 1.000s. Retrying in the next heartbeat
> period. Already tried 1 times.
> W1201 17:08:11.084799 8037 connection.cc:205] RPC call timeout handler was
> delayed by 1.0541s! This may be due to a process-wide pause such as swapping,
> logging-related delays, or allocator lock contention. Will allow an
> additional 0.1s for a response.
> W1201 17:08:11.084997 8037 consensus_peers.cc:328] T
> ff94ba2330cd4f1cae67b271a681ee97 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer
> ccf6f0332e414ef6af63fdee7f24e952 (hadoop110.datablocks.net:7050): Couldn't
> send request to peer ccf6f0332e414ef6af63fdee7f24e952 for tablet
> ff94ba2330cd4f1cae67b271a681ee97. Status: Timed out: UpdateConsensus RPC to
> 198.135.236.110:7050 timed out after 1.000s. Retrying in the next heartbeat
> period. Already tried 1 times.
> W1201 17:08:11.085119 8037 connection.cc:205] RPC call timeout handler was
> delayed by 1.02432s! This may be due to a process-wide pause such as
> swapping, logging-related delays, or allocator lock contention. Will allow an
> additional 0.1s for a response.
> W1201 17:08:11.085245 8037 consensus_peers.cc:328] T
> 0a22984862114f6ca1b8488236ac5ad0 P 783ac0cf9d1c409db2f5c759a5984401 -> Peer
> ccf6f0332e414ef6af63fdee7f24e952 (hadoop110.datablocks.net:7050): Couldn't
> send request to peer ccf6f0332e414ef6af63fdee7f24e952 for tablet
> 0a22984862114f6ca1b8488236ac5ad0. Status: Timed out: UpdateConsensus RPC to
> 198.135.236.110:7050 timed out after 1.000s. Retrying in the next heartbeat
> period. Already tried 1 times.
> {code}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)