We have a Kudu cluster with 5 tablet servers, each with 28 CPU cores, 160 GB of
RAM, and 2 TB of SSD storage.
We set the RPC queue length to 500.
We currently write to 10 tables at the same time.
For 8 of these 10 tables, we use 10 threads per table to write (simple inserts).
For the remaining two tables, we have 5 tasks (each with 10 threads) that upsert
their corresponding fields.
For example, one of these two tables has 5 fields (a, b, c, d, e), with the
`key` fields as the primary key:
1 task (10 threads) runs upsert (key, a)
1 task (10 threads) runs upsert (key, b)
1 task (10 threads) runs upsert (key, c)
1 task (10 threads) runs upsert (key, d)
1 task (10 threads) runs upsert (key, e)
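To make the workload above concrete, here is a small Python model of it. It is only a sketch under stated assumptions: it assumes each of the two upsert tables gets its own 5 tasks (the description could also be read as 5 tasks in total), and the dict-based `apply_upsert` helper is a hypothetical stand-in for an upsert, not the Kudu client API.

```python
# Rough model of the write workload described above (not Kudu client code).

INSERT_TABLES = 8             # tables receiving plain inserts
THREADS_PER_INSERT_TABLE = 10
UPSERT_TABLES = 2             # tables receiving per-column upserts
TASKS_PER_UPSERT_TABLE = 5    # ASSUMPTION: 5 tasks for each of the two tables
THREADS_PER_TASK = 10


def total_writer_threads():
    """Count concurrent writer threads across all 10 tables."""
    insert_threads = INSERT_TABLES * THREADS_PER_INSERT_TABLE
    upsert_threads = UPSERT_TABLES * TASKS_PER_UPSERT_TABLE * THREADS_PER_TASK
    return insert_threads + upsert_threads


def apply_upsert(table, key, columns):
    """Hypothetical stand-in for an upsert: create the row if the key is new,
    otherwise overwrite only the columns that were supplied."""
    row = table.setdefault(key, {})
    row.update(columns)


def simulate_one_key(key="k1"):
    """Replay the five per-column upserts for a single primary key."""
    table = {}
    for column in ("a", "b", "c", "d", "e"):
        apply_upsert(table, key, {column: f"{column}-value"})
    return table[key]


if __name__ == "__main__":
    print(total_writer_threads())
    print(simulate_one_key())
```

Under that assumption the model counts 180 concurrent writer threads, and each complete row in an upsert table is built from five separate write operations on the same key.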
We have observed that writes are very slow (fewer than 1,000 records/second).
We have also observed that with fewer writer threads the speed is not as bad
(a few thousand records/second).
Here’s the CPU utilization report for Kudu threads.
Threads: 724 total, 15 running, 709 sleeping, 0 stopped, 0 zombie
%Cpu(s): 18.5 us, 8.3 sy, 0.0 ni, 67.6 id, 4.9 wa, 0.0 hi, 0.7 si, 0.0 st
KiB Mem : 16488888+total, 1776956 free, 12737900 used, 15037401+buff/cache
KiB Swap: 3145724 total, 3004924 free, 140800 used. 15048467+avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
14000 kudu 20 0 70.0g 12.7g 1.9g R 72.8 8.1 26083:49 MaintenanceMgr
13992 kudu 20 0 70.0g 12.7g 1.9g R 53.2 8.1 4067:40 rpc reactor-139
13995 kudu 20 0 70.0g 12.7g 1.9g R 42.9 8.1 3996:32 rpc reactor-139
13993 kudu 20 0 70.0g 12.7g 1.9g R 39.9 8.1 3167:19 rpc reactor-139
14231 kudu 20 0 70.0g 12.7g 1.9g S 11.3 8.1 142:14.00 rpc worker-1423
14242 kudu 20 0 70.0g 12.7g 1.9g S 11.3 8.1 107:12.97 rpc worker-1424
14226 kudu 20 0 70.0g 12.7g 1.9g S 10.6 8.1 109:21.71 rpc worker-1422
14274 kudu 20 0 70.0g 12.7g 1.9g S 10.6 8.1 95:54.12 rpc worker-1427
14216 kudu 20 0 70.0g 12.7g 1.9g S 10.0 8.1 136:26.72 rpc worker-1421
14221 kudu 20 0 70.0g 12.7g 1.9g S 10.0 8.1 129:04.78 rpc worker-1422
14253 kudu 20 0 70.0g 12.7g 1.9g S 9.3 8.1 104:26.75 rpc worker-1425
14250 kudu 20 0 70.0g 12.7g 1.9g S 8.6 8.1 145:44.18 rpc worker-1425
14224 kudu 20 0 70.0g 12.7g 1.9g S 7.3 8.1 112:22.56 rpc worker-1422
14255 kudu 20 0 70.0g 12.7g 1.9g S 7.3 8.1 133:47.72 rpc worker-1425
14282 kudu 20 0 70.0g 12.7g 1.9g S 7.3 8.1 126:01.94 rpc worker-1428
10864 kudu 20 0 70.0g 12.7g 1.9g S 7.3 8.1 0:00.45 apply [worker]-
10932 kudu 20 0 70.0g 12.7g 1.9g S 6.6 8.1 0:00.38 apply [worker]-
14271 kudu 20 0 70.0g 12.7g 1.9g S 6.3 8.1 98:02.94 rpc worker-1427
11099 kudu 20 0 70.0g 12.7g 1.9g S 6.3 8.1 0:00.19 prepare [worker
11001 kudu 20 0 70.0g 12.7g 1.9g S 6.0 8.1 0:00.29 apply [worker]-
11103 kudu 20 0 70.0g 12.7g 1.9g S 6.0 8.1 0:00.18 prepare [worker
11105 kudu 20 0 70.0g 12.7g 1.9g S 6.0 8.1 0:00.18 prepare [worker
14057 kudu 20 0 70.0g 12.7g 1.9g S 5.3 8.1 1427:58 rpc worker-1405
11004 kudu 20 0 70.0g 12.7g 1.9g S 5.3 8.1 0:00.23 apply [worker]-
11037 kudu 20 0 70.0g 12.7g 1.9g S 5.3 8.1 0:00.20 prepare [worker
14270 kudu 20 0 70.0g 12.7g 1.9g S 5.0 8.1 146:34.95 rpc worker-1427
14280 kudu 20 0 70.0g 12.7g 1.9g R 5.0 8.1 133:00.90 rpc worker-1428
10366 kudu 20 0 70.0g 12.7g 1.9g S 5.0 8.1 0:00.77 raft [worker]-1
10749 kudu 20 0 70.0g 12.7g 1.9g S 5.0 8.1 0:00.36 raft [worker]-1
14053 kudu 20 0 70.0g 12.7g 1.9g S 4.7 8.1 1428:13 rpc worker-1405
14213 kudu 20 0 70.0g 12.7g 1.9g S 4.7 8.1 145:18.80 rpc worker-1421
And here is the memory usage info:
      total  used  free  shared  buff/cache  available
Mem:  157G   12G   1.7G  726M    143G        143G
Swap: 3.0G   137M  2.9G
Here are the recent logs from one of the tablet servers:
https://justpaste.it/76qg2
Please advise on how I can optimize the write performance.