Hi,

Thank you for the stats.
I guess one crucial point is using the proper flush mode for Kudu sessions:
make sure it's AUTO_FLUSH_BACKGROUND, not AUTO_FLUSH_SYNC.

Another important point is the number of RPC workers: by default it's 20, but
given that your servers have 28 CPU cores each (I guess it's twice that if
counting in hyperthreading, right?), you could try increasing
--rpc_num_service_threads up to 30 or even 40. More RPC workers will be able
to clear the RPC queue faster if there are enough hardware resources
available. I guess that with too many writer threads, some requests in the
RPC queue eventually time out and there might be RPC queue overflows. The
Kudu client automatically retries failed write requests, but as a result the
overall write performance degrades compared with the case of no retries.

In addition, make sure the cluster is well balanced, so there is much less
chance of hot-spotting. I'd run 'kudu cluster rebalance' prior to running the
benchmarks.

Also, I'd take a look at the I/O statistics reported by iostat (e.g., run
`iostat -dx 1` for some time and look at the read/write I/O stats and, most
importantly, the I/O bandwidth utilization). If you see iostat reporting
bandwidth utilization close to 100%, consider adding a separate SSD drive for
the tablet write-ahead logs (WAL).

A very good starting point for troubleshooting is looking into the tablet
servers' logs and /metrics pages -- warning messages might give you some
insight into what's going wrong, if anything.

Best regards,

Alexey

On Tue, Jun 9, 2020 at 8:47 AM Ray Liu (rayliu) <ray...@cisco.com> wrote:

> We also observed that when we have fewer threads for writing, the speed is
> not that bad (about a few thousand records/second).
>
> This sentence may not be correct.
>
> We just ran another test reducing the thread number from 10 to 2.
> The speed is much slower.
>
> We're using the Flink Kudu sink to write to the Kudu cluster.
> https://github.com/apache/bahir-flink/blob/8723e6b01dd5568f318204abdf7f7a07b32ff70d/flink-connector-kudu/src/main/java/org/apache/flink/connectors/kudu/connector/writer/KuduWriter.java#L89
>
> Basically, it calls KuduSession.apply with each element that needs to be
> inserted.
>
> *From:* "Ray Liu (rayliu)" <ray...@cisco.com>
> *Reply-To:* "user@kudu.apache.org" <user@kudu.apache.org>
> *Date:* Tuesday, June 9, 2020 at 23:31
> *To:* "user@kudu.apache.org" <user@kudu.apache.org>
> *Subject:* Why is it slow to write Kudu with 100+ threads?
>
> We have a Kudu cluster with 5 tablet servers, each with 28 CPU cores, 160 GB
> RAM, and a 2 TB SSD.
>
> The RPC queue length we set is 500.
>
> We now write to 10 tables at the same time.
>
> We're using 10 threads each to write (simply insert) to 8 out of these 10
> tables.
>
> We have 5 tasks (each task with 10 threads) to upsert the corresponding
> fields for the remaining two tables.
>
> For example, for one of these two tables we have 5 fields (a, b, c, d, e),
> with the `key` field as the primary key.
>
> 1 task (10 threads) is running upsert (key, a)
> 1 task (10 threads) is running upsert (key, b)
> 1 task (10 threads) is running upsert (key, c)
> 1 task (10 threads) is running upsert (key, d)
> 1 task (10 threads) is running upsert (key, e)
>
> Now we observe that writes are very slow (less than 1000 records/second).
>
> We also observed that when we have fewer threads for writing, the speed is
> not that bad (about a few thousand records/second).
>
> Here's the CPU utilization report for the Kudu threads:
>
> Threads: 724 total, 15 running, 709 sleeping, 0 stopped, 0 zombie
> %Cpu(s): 18.5 us, 8.3 sy, 0.0 ni, 67.6 id, 4.9 wa, 0.0 hi, 0.7 si, 0.0 st
> KiB Mem : 16488888+total, 1776956 free, 12737900 used, 15037401+buff/cache
> KiB Swap: 3145724 total, 3004924 free, 140800 used. 15048467+avail Mem
>
>   PID  USER  PR  NI   VIRT    RES   SHR  S  %CPU  %MEM      TIME+  COMMAND
> 14000  kudu  20   0  70.0g  12.7g  1.9g  R  72.8   8.1   26083:49  MaintenanceMgr
> 13992  kudu  20   0  70.0g  12.7g  1.9g  R  53.2   8.1    4067:40  rpc reactor-139
> 13995  kudu  20   0  70.0g  12.7g  1.9g  R  42.9   8.1    3996:32  rpc reactor-139
> 13993  kudu  20   0  70.0g  12.7g  1.9g  R  39.9   8.1    3167:19  rpc reactor-139
> 14231  kudu  20   0  70.0g  12.7g  1.9g  S  11.3   8.1  142:14.00  rpc worker-1423
> 14242  kudu  20   0  70.0g  12.7g  1.9g  S  11.3   8.1  107:12.97  rpc worker-1424
> 14226  kudu  20   0  70.0g  12.7g  1.9g  S  10.6   8.1  109:21.71  rpc worker-1422
> 14274  kudu  20   0  70.0g  12.7g  1.9g  S  10.6   8.1   95:54.12  rpc worker-1427
> 14216  kudu  20   0  70.0g  12.7g  1.9g  S  10.0   8.1  136:26.72  rpc worker-1421
> 14221  kudu  20   0  70.0g  12.7g  1.9g  S  10.0   8.1  129:04.78  rpc worker-1422
> 14253  kudu  20   0  70.0g  12.7g  1.9g  S   9.3   8.1  104:26.75  rpc worker-1425
> 14250  kudu  20   0  70.0g  12.7g  1.9g  S   8.6   8.1  145:44.18  rpc worker-1425
> 14224  kudu  20   0  70.0g  12.7g  1.9g  S   7.3   8.1  112:22.56  rpc worker-1422
> 14255  kudu  20   0  70.0g  12.7g  1.9g  S   7.3   8.1  133:47.72  rpc worker-1425
> 14282  kudu  20   0  70.0g  12.7g  1.9g  S   7.3   8.1  126:01.94  rpc worker-1428
> 10864  kudu  20   0  70.0g  12.7g  1.9g  S   7.3   8.1    0:00.45  apply [worker]-
> 10932  kudu  20   0  70.0g  12.7g  1.9g  S   6.6   8.1    0:00.38  apply [worker]-
> 14271  kudu  20   0  70.0g  12.7g  1.9g  S   6.3   8.1   98:02.94  rpc worker-1427
> 11099  kudu  20   0  70.0g  12.7g  1.9g  S   6.3   8.1    0:00.19  prepare [worker
> 11001  kudu  20   0  70.0g  12.7g  1.9g  S   6.0   8.1    0:00.29  apply [worker]-
> 11103  kudu  20   0  70.0g  12.7g  1.9g  S   6.0   8.1    0:00.18  prepare [worker
> 11105  kudu  20   0  70.0g  12.7g  1.9g  S   6.0   8.1    0:00.18  prepare [worker
> 14057  kudu  20   0  70.0g  12.7g  1.9g  S   5.3   8.1    1427:58  rpc worker-1405
> 11004  kudu  20   0  70.0g  12.7g  1.9g  S   5.3   8.1    0:00.23  apply [worker]-
> 11037  kudu  20   0  70.0g  12.7g  1.9g  S   5.3   8.1    0:00.20  prepare [worker
> 14270  kudu  20   0  70.0g  12.7g  1.9g  S   5.0   8.1  146:34.95  rpc worker-1427
> 14280  kudu  20   0  70.0g  12.7g  1.9g  R   5.0   8.1  133:00.90  rpc worker-1428
> 10366  kudu  20   0  70.0g  12.7g  1.9g  S   5.0   8.1    0:00.77  raft [worker]-1
> 10749  kudu  20   0  70.0g  12.7g  1.9g  S   5.0   8.1    0:00.36  raft [worker]-1
> 14053  kudu  20   0  70.0g  12.7g  1.9g  S   4.7   8.1    1428:13  rpc worker-1405
> 14213  kudu  20   0  70.0g  12.7g  1.9g  S   4.7   8.1  145:18.80  rpc worker-1421
>
> And the memory usage info:
>
>        total  used  free  shared  buff/cache  available
> Mem:    157G   12G  1.7G    726M        143G       143G
> Swap:   3.0G  137M  2.9G
>
> Here are the recent logs from one of the tablet servers:
>
> https://justpaste.it/76qg2
>
> Please advise me how I can optimize the write performance.
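To make the flush-mode suggestion above concrete, here is a minimal sketch
against the Kudu Java client (the same client the Flink connector wraps). The
class name, buffer size, and the assumption that the caller supplies a
connected KuduClient are illustrative, not taken from this thread; the snippet
needs the kudu-client jar on the classpath.

```java
import org.apache.kudu.client.KuduClient;
import org.apache.kudu.client.KuduSession;
import org.apache.kudu.client.SessionConfiguration;

public class SessionSetup {
    // Sketch: configure a session for batched background flushing.
    static KuduSession newBackgroundFlushSession(KuduClient client) {
        KuduSession session = client.newSession();
        // Buffer operations client-side and flush batches asynchronously,
        // instead of issuing one synchronous round-trip per apply() call.
        session.setFlushMode(SessionConfiguration.FlushMode.AUTO_FLUSH_BACKGROUND);
        // Illustrative cap on buffered operations to bound client memory.
        session.setMutationBufferSpace(10000);
        return session;
    }
}
```

Note that with AUTO_FLUSH_BACKGROUND, apply() returns before the write has
been acknowledged, so write errors have to be collected afterwards via
session.getPendingErrors().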
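The operational checks suggested above, as a shell fragment. The host names
are placeholders; 7051 and 8050 are Kudu's default master RPC and tablet
server web UI ports, and metric names may differ between versions.

```shell
# Rebalance tablet replicas across tablet servers before benchmarking.
kudu cluster rebalance master-1:7051,master-2:7051,master-3:7051

# Watch per-device I/O statistics for a while; sustained %util close to
# 100% on the data drive suggests moving the WALs to a separate SSD.
iostat -dx 1

# Inspect a tablet server's /metrics page for signs of RPC queue pressure,
# e.g. dropped requests due to a full service queue.
curl -s http://tserver-1:8050/metrics | grep -i 'rpcs_queue_overflow'

# To raise the RPC worker count, set in the tablet server's flag file and
# restart the tserver:
#   --rpc_num_service_threads=40
```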