GitHub user bowenli86 opened a pull request: https://github.com/apache/flink/pull/4656
[FLINK-7508][kinesis] switch FlinkKinesisProducer to use KPL's ThreadingMode to ThreadedPool mode rather than Per_Request mode ## What is the purpose of the change KinesisProducerLibrary (KPL) 0.10.x had been using a One-New-Thread-Per-Request model for all requests sent to AWS Kinesis, which is very expensive. 0.12.4 introduced a new ThreadingMode - Pooled, which will use a thread pool. This hugely improves KPL's performance and reduces consumed resources. By default, KPL still uses per-request mode. We should explicitly switch FlinkKinesisProducer's KPL threading mode to 'Pooled'. This work depends on FLINK-7366 and FLINK-7508 Benchmarking I did: - Environment: Running a Flink hourly-sliding windowing job on 18-node EMR cluster with R4.2xlarge instances. Each hourly-sliding window in the Flink job generates about 21million UserRecords, which means that we generated a test load of 21million UserRecords at the first minute of each hour. - Criteria: Test KPL throughput per minute. Since the default RecordTTL for KPL is 30 sec, we can be sure that either all UserRecords are sent by KPL within a minute, or we will see UserRecord expiration errors. - One-New-Thread-Per-Request model: max throughput is about 2million UserRecords per min; it doesn't go beyond that because CPU utilization goes to 100%, everything stopped working and that Flink job crashed. - Thread-Pool model with pool size of 10: it sends out 21million UserRecords within 30 sec without any UserRecord expiration errors. The average peak CPU utilization is about 20% - 30%. So 21million UserRecords/min is not the max throughput of thread-pool model. We didn't go any further because 1) this throughput is already a couple times more than what we really need, and 2) we don't have a quick way of increasing the test load Thus, I propose switching FlinkKinesisProducer to Thread-Pool mode. ## Brief change log - *switch FlinkKinesisProducer to use KPL's ThreadingMode to ThreadedPool mode rather than Per_Request mode* - *update docs* ## Verifying this change This change added tests and can be verified as follows: - *added unit tests in flink-connector-kinesis/src/test/java/org/apache/flink/streaming/connectors/kinesis/util/KinesisConfigUtilTest.java* ## Does this pull request potentially affect one of the following parts: ## Documentation - Does this pull request introduce a new feature? (yes) - If yes, how is the feature documented? (docs) You can merge this pull request into a Git repository by running: $ git pull https://github.com/bowenli86/flink FLINK-7508 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/4656.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #4656 ---- commit 6386983239bd3024b395c865ec4fd33e232ca5a3 Author: Bowen Li <bowenl...@gmail.com> Date: 2017-08-30T16:35:03Z FLINK-7422 Upgrade Kinesis Client Library (KCL) and AWS SDK in flink-connector-kinesis commit 381cd4156b84673a1d32d2db3f7b2d748d90d980 Author: Bowen Li <bowenl...@gmail.com> Date: 2017-09-07T06:33:37Z Merge remote-tracking branch 'upstream/master' commit 893ec61bebfa20a038819bf1929791e57b98f33b Author: Bowen Li <bowenl...@gmail.com> Date: 2017-09-07T20:34:09Z FLINK-7508 switch FlinkKinesisProducer to use KPL's ThreadingMode to threaded-pool mode rather than one_thread_per_request mode ---- ---