Hi,

Thank you for the stats.
I guess one crucial point is using the proper flush mode for Kudu sessions:
make sure it's AUTO_FLUSH_BACKGROUND, not AUTO_FLUSH_SYNC.

Another important point is the number of RPC workers: by default it's 20, but
given that your servers have 28 CPU cores each (I guess it's twice that if
counting in hyperthreading, right?), you could try increasing
--rpc_num_service_threads up to 30 or even 40. More RPC workers will be able
to clear the RPC queue faster if there are enough hardware resources
available. I guess that with too many writer threads, some requests in the
RPC queue eventually time out and there might be RPC queue overflows. The
Kudu client automatically retries failed write requests, but as a result the
overall write performance degrades compared with the case of no retries.

In addition, make sure the cluster is well balanced, so there is much less
chance of hot-spotting. I'd run 'kudu cluster rebalance' prior to running the
benchmarks.

Also, I'd take a look at the I/O statistics reported by iostat (e.g., run
`iostat -dx 1` for some time and look at the read/write I/O stats and, most
importantly, the I/O bandwidth utilization). If you see iostat reporting
bandwidth utilization close to 100%, consider adding a separate SSD drive for
the tablet write-ahead logs (WAL).

A very good starting point for troubleshooting is looking into the tablet
servers' logs and /metrics pages -- warning messages might give you some
insight into what's going wrong, if anything.

Best regards,

Alexey

On Tue, Jun 9, 2020 at 8:47 AM Ray Liu (rayliu) <ray...@cisco.com> wrote:

> We also observed that when we have fewer threads for writing, the speed is
> not that bad (about a few thousand records/second).
>
> This sentence may not be correct.
>
> We just ran another test reducing the thread number from 10 to 2.
> The speed is much slower.
>
> We're using the Flink Kudu sink to write to the Kudu cluster.
> https://github.com/apache/bahir-flink/blob/8723e6b01dd5568f318204abdf7f7a07b32ff70d/flink-connector-kudu/src/main/java/org/apache/flink/connectors/kudu/connector/writer/KuduWriter.java#L89
>
> Basically, it calls KuduSession.apply with each element that needs to be
> inserted.
>
> *From:* "Ray Liu (rayliu)" <ray...@cisco.com>
> *Reply-To:* "user@kudu.apache.org" <user@kudu.apache.org>
> *Date:* Tuesday, June 9, 2020 at 23:31
> *To:* "user@kudu.apache.org" <user@kudu.apache.org>
> *Subject:* Why is it slow to write Kudu with 100+ threads?
>
> We have a Kudu cluster with 5 tablet servers, each with 28 CPU cores, 160 GB
> RAM, and a 2 TB SSD.
>
> The RPC queue length we set is 500.
>
> We now write to 10 tables at the same time.
>
> We're using 10 threads each to write (simply insert) to 8 out of these 10
> tables.
>
> We have 5 tasks (each task with 10 threads) to upsert the corresponding
> fields for the remaining two tables.
>
> For example, for one of these two tables we have 5 fields (a, b, c, d, e),
> with the `key` field as the primary key.
>
> 1 task (10 threads) is running upsert (key, a)
> 1 task (10 threads) is running upsert (key, b)
> 1 task (10 threads) is running upsert (key, c)
> 1 task (10 threads) is running upsert (key, d)
> 1 task (10 threads) is running upsert (key, e)
>
> Now we observe that writes are very slow (less than 1000 records/second).
>
> We also observed that when we have fewer threads for writing, the speed is
> not that bad (about a few thousand records/second).
>
> Here's the CPU utilization report for the Kudu threads:
>
> Threads: 724 total, 15 running, 709 sleeping, 0 stopped, 0 zombie
> %Cpu(s): 18.5 us, 8.3 sy, 0.0 ni, 67.6 id, 4.9 wa, 0.0 hi, 0.7 si, 0.0 st
> KiB Mem : 16488888+total, 1776956 free, 12737900 used, 15037401+buff/cache
> KiB Swap: 3145724 total, 3004924 free, 140800 used. 15048467+avail Mem
>
>   PID  USER  PR  NI   VIRT    RES   SHR  S  %CPU  %MEM      TIME+  COMMAND
> 14000  kudu  20   0  70.0g  12.7g  1.9g  R  72.8   8.1   26083:49  MaintenanceMgr
> 13992  kudu  20   0  70.0g  12.7g  1.9g  R  53.2   8.1    4067:40  rpc reactor-139
> 13995  kudu  20   0  70.0g  12.7g  1.9g  R  42.9   8.1    3996:32  rpc reactor-139
> 13993  kudu  20   0  70.0g  12.7g  1.9g  R  39.9   8.1    3167:19  rpc reactor-139
> 14231  kudu  20   0  70.0g  12.7g  1.9g  S  11.3   8.1  142:14.00  rpc worker-1423
> 14242  kudu  20   0  70.0g  12.7g  1.9g  S  11.3   8.1  107:12.97  rpc worker-1424
> 14226  kudu  20   0  70.0g  12.7g  1.9g  S  10.6   8.1  109:21.71  rpc worker-1422
> 14274  kudu  20   0  70.0g  12.7g  1.9g  S  10.6   8.1   95:54.12  rpc worker-1427
> 14216  kudu  20   0  70.0g  12.7g  1.9g  S  10.0   8.1  136:26.72  rpc worker-1421
> 14221  kudu  20   0  70.0g  12.7g  1.9g  S  10.0   8.1  129:04.78  rpc worker-1422
> 14253  kudu  20   0  70.0g  12.7g  1.9g  S   9.3   8.1  104:26.75  rpc worker-1425
> 14250  kudu  20   0  70.0g  12.7g  1.9g  S   8.6   8.1  145:44.18  rpc worker-1425
> 14224  kudu  20   0  70.0g  12.7g  1.9g  S   7.3   8.1  112:22.56  rpc worker-1422
> 14255  kudu  20   0  70.0g  12.7g  1.9g  S   7.3   8.1  133:47.72  rpc worker-1425
> 14282  kudu  20   0  70.0g  12.7g  1.9g  S   7.3   8.1  126:01.94  rpc worker-1428
> 10864  kudu  20   0  70.0g  12.7g  1.9g  S   7.3   8.1    0:00.45  apply [worker]-
> 10932  kudu  20   0  70.0g  12.7g  1.9g  S   6.6   8.1    0:00.38  apply [worker]-
> 14271  kudu  20   0  70.0g  12.7g  1.9g  S   6.3   8.1   98:02.94  rpc worker-1427
> 11099  kudu  20   0  70.0g  12.7g  1.9g  S   6.3   8.1    0:00.19  prepare [worker
> 11001  kudu  20   0  70.0g  12.7g  1.9g  S   6.0   8.1    0:00.29  apply [worker]-
> 11103  kudu  20   0  70.0g  12.7g  1.9g  S   6.0   8.1    0:00.18  prepare [worker
> 11105  kudu  20   0  70.0g  12.7g  1.9g  S   6.0   8.1    0:00.18  prepare [worker
> 14057  kudu  20   0  70.0g  12.7g  1.9g  S   5.3   8.1    1427:58  rpc worker-1405
> 11004  kudu  20   0  70.0g  12.7g  1.9g  S   5.3   8.1    0:00.23  apply [worker]-
> 11037  kudu  20   0  70.0g  12.7g  1.9g  S   5.3   8.1    0:00.20  prepare [worker
> 14270  kudu  20   0  70.0g  12.7g  1.9g  S   5.0   8.1  146:34.95  rpc worker-1427
> 14280  kudu  20   0  70.0g  12.7g  1.9g  R   5.0   8.1  133:00.90  rpc worker-1428
> 10366  kudu  20   0  70.0g  12.7g  1.9g  S   5.0   8.1    0:00.77  raft [worker]-1
> 10749  kudu  20   0  70.0g  12.7g  1.9g  S   5.0   8.1    0:00.36  raft [worker]-1
> 14053  kudu  20   0  70.0g  12.7g  1.9g  S   4.7   8.1    1428:13  rpc worker-1405
> 14213  kudu  20   0  70.0g  12.7g  1.9g  S   4.7   8.1  145:18.80  rpc worker-1421
>
> And the memory usage info:
>
>        total  used  free  shared  buff/cache  available
> Mem:    157G   12G  1.7G    726M        143G       143G
> Swap:   3.0G  137M  2.9G
>
> Here are the recent logs from one of the tablet servers:
>
> https://justpaste.it/76qg2
>
> Please advise me how I can optimize the write performance.
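To make the flush-mode suggestion above concrete, here is a minimal sketch
against the Kudu Java client (the same client the Flink connector wraps). The
class name, buffer size, and the assumption that the caller supplies a
connected KuduClient are illustrative, not taken from this thread; the snippet
needs the kudu-client jar on the classpath.

```java
import org.apache.kudu.client.KuduClient;
import org.apache.kudu.client.KuduSession;
import org.apache.kudu.client.SessionConfiguration;

public class SessionSetup {
    // Sketch: configure a session for batched background flushing.
    static KuduSession newBackgroundFlushSession(KuduClient client) {
        KuduSession session = client.newSession();
        // Buffer operations client-side and flush batches asynchronously,
        // instead of issuing one synchronous round-trip per apply() call.
        session.setFlushMode(SessionConfiguration.FlushMode.AUTO_FLUSH_BACKGROUND);
        // Illustrative cap on buffered operations to bound client memory.
        session.setMutationBufferSpace(10000);
        return session;
    }
}
```

Note that with AUTO_FLUSH_BACKGROUND, apply() returns before the write has
been acknowledged, so write errors have to be collected afterwards via
session.getPendingErrors().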
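The operational checks suggested above, as a shell fragment. The host names
are placeholders; 7051 and 8050 are Kudu's default master RPC and tablet
server web UI ports, and metric names may differ between versions.

```shell
# Rebalance tablet replicas across tablet servers before benchmarking.
kudu cluster rebalance master-1:7051,master-2:7051,master-3:7051

# Watch per-device I/O statistics for a while; sustained %util close to
# 100% on the data drive suggests moving the WALs to a separate SSD.
iostat -dx 1

# Inspect a tablet server's /metrics page for signs of RPC queue pressure,
# e.g. dropped requests due to a full service queue.
curl -s http://tserver-1:8050/metrics | grep -i 'rpcs_queue_overflow'

# To raise the RPC worker count, set in the tablet server's flag file and
# restart the tserver:
#   --rpc_num_service_threads=40
```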