Hi,
Our kudu cluster have ran well a long time, but write became slowly
recently,client also come out rpc timeout. I check the warning and find
vast error look this:
W1104 10:25:16.833736 10271 consensus_peers.cc:365] T
149ffa58ac274c9ba8385ccfdc01ea14 P 59c768eb799243678ee7fa3f83801316 -> Peer
1c67a7e7ff8f4de494469766641fccd1 (cloud-sk-ds-08:7050): Couldn't send
request to peer 1c67a7e7ff8f4de494469766641fccd1 for tablet
149ffa58ac274c9ba8385ccfdc01ea14. Status: Timed out: UpdateConsensus RPC to
10.6.60.9:7050 timed out after 1.000s (SENT). Retrying in the next
heartbeat period. Already tried 5 times.
I change the
configure rpc_service_queue_length=400,rpc_num_service_threads=40, but it
takes no effect.
Our cluster include 5 master , 10 ts. 3800G data, 800 tablet per ts. I
check one of the ts machine's memory, 14G left(128 In all), thread 4739(max
32000), openfile 28000(max 65536), cpu disk utilization ratio about 30%(32
core), disk util less than 30%.
Any suggestion for this? Thanks!