[ 
https://issues.apache.org/jira/browse/FLINK-7508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bowen Li updated FLINK-7508:
----------------------------
    Description: 
KinesisProducerLibrary (KPL) 0.10.x had been using a One-New-Thread-Per-Request 
model for all requests sent to AWS Kinesis, which is very expensive.

0.12.4 introduced a new [ThreadingMode - 
Pooled|https://github.com/awslabs/amazon-kinesis-producer/blob/master/java/amazon-kinesis-producer/src/main/java/com/amazonaws/services/kinesis/producer/KinesisProducerConfiguration.java#L225],
 which will use a thread pool. This hugely improves KPL's performance and 
reduces consumed resources. By default, KPL still uses per-request mode. We 
should explicitly switch FlinkKinesisProducer's KPL threading mode to 'Pooled'.

This work depends on FLINK-7366 and FLINK-7508

Benchmarking I did:

* Environment: Running a Flink hourly-sliding windowing job on 18-node EMR 
cluster with R4.2xlarge instances. Each hourly-sliding window in the Flink job 
generates about 21million UserRecords, which means that we generated a test 
load of 21million UserRecords at the first minute of each hour.
* Criteria: 
* One-New-Thread-Per-Request model: max throughput is about 2million 
UserRecords per min; it doesn't go beyond that because CPU utilization goes to 
100%, everything stopped working and that Flink job crashed.
* Thread-Pool model: it sends out 21million UserRecords within one minute. The 
CPU utilization is about 





  was:
KinesisProducerLibrary (KPL) 0.10.x had been using a One-New-Thread-Per-Request 
model for all requests sent to AWS Kinesis, which is very expensive.

0.12.4 introduced a new [ThreadingMode - 
Pooled|https://github.com/awslabs/amazon-kinesis-producer/blob/master/java/amazon-kinesis-producer/src/main/java/com/amazonaws/services/kinesis/producer/KinesisProducerConfiguration.java#L225],
 which will use a thread pool. This hugely improves KPL's performance and 
reduces consumed resources. By default, KPL still uses per-request mode. We 
should explicitly switch FlinkKinesisProducer's KPL threading mode to 'Pooled'.

This work depends on FLINK-7366 and FLINK-7508




> switch FlinkKinesisProducer to use KPL's ThreadingMode to ThreadedPool mode 
> rather than Per_Request mode
> --------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-7508
>                 URL: https://issues.apache.org/jira/browse/FLINK-7508
>             Project: Flink
>          Issue Type: Improvement
>          Components: Kinesis Connector
>    Affects Versions: 1.3.0
>            Reporter: Bowen Li
>            Assignee: Bowen Li
>            Priority: Critical
>             Fix For: 1.4.0, 1.3.3
>
>
> KinesisProducerLibrary (KPL) 0.10.x had been using a 
> One-New-Thread-Per-Request model for all requests sent to AWS Kinesis, which 
> is very expensive.
> 0.12.4 introduced a new [ThreadingMode - 
> Pooled|https://github.com/awslabs/amazon-kinesis-producer/blob/master/java/amazon-kinesis-producer/src/main/java/com/amazonaws/services/kinesis/producer/KinesisProducerConfiguration.java#L225],
>  which will use a thread pool. This hugely improves KPL's performance and 
> reduces consumed resources. By default, KPL still uses per-request mode. We 
> should explicitly switch FlinkKinesisProducer's KPL threading mode to 
> 'Pooled'.
> This work depends on FLINK-7366 and FLINK-7508
> Benchmarking I did:
> * Environment: Running a Flink hourly-sliding windowing job on 18-node EMR 
> cluster with R4.2xlarge instances. Each hourly-sliding window in the Flink 
> job generates about 21million UserRecords, which means that we generated a 
> test load of 21million UserRecords at the first minute of each hour.
> * Criteria: 
> * One-New-Thread-Per-Request model: max throughput is about 2million 
> UserRecords per min; it doesn't go beyond that because CPU utilization goes 
> to 100%, everything stopped working and that Flink job crashed.
> * Thread-Pool model: it sends out 21million UserRecords within one minute. 
> The CPU utilization is about 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to