[jira] [Commented] (FLINK-29398) Utilize Rack Awareness in Flink Consumer

Jeremy DeGroot (Jira) Thu, 13 Apr 2023 07:47:32 -0700


    [ 
https://issues.apache.org/jira/browse/FLINK-29398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17711929#comment-17711929
 ]


Jeremy DeGroot commented on FLINK-29398:
----------------------------------------

[~chiggi_dev] There is a PR open for it now 
(https://github.com/apache/flink-connector-kafka/pull/20), so hopefully it 
will land in the next release or two.

Regarding how we use it, we have an MSK cluster stretched across three AZs and 
a Flink cluster across those same AZs. Our consumers do a lot of feature 
extraction, filtering, mapping, and windowing, which reduces the data volume 
significantly from what is initially read in from Kafka. What gets sent to 
downstream processors and sinks is orders of magnitude smaller and less 
expensive. If your workflow doesn't reduce intra-cluster traffic as 
dramatically, you'll probably want to look carefully at your partitioning and 
chaining choices in your jobs.

> Utilize Rack Awareness in Flink Consumer
> ----------------------------------------
>
>                 Key: FLINK-29398
>                 URL: https://issues.apache.org/jira/browse/FLINK-29398
>             Project: Flink
>          Issue Type: Improvement
>          Components: Connectors / Kafka
>            Reporter: Jeremy DeGroot
>            Assignee: Jeremy DeGroot
>            Priority: Major
>              Labels: pull-request-available
>
> [KIP-36|https://cwiki.apache.org/confluence/display/KAFKA/KIP-36+Rack+aware+replica+assignment]
>  was implemented some time ago in Kafka. This allows brokers and consumers to 
> communicate about the rack (or AWS Availability Zone) they're located in. 
> Reading from a local broker can save money in bandwidth and improve latency 
> for your consumers.
> Flink Kafka consumers currently cannot easily use rack awareness if they're 
> deployed across multiple racks or availability zones, because they have no 
> control over which rack the Task Manager they'll be assigned to may be in. 
> This improvement proposes that a Kafka Consumer could be configured with a 
> callback or Future that could be run when it's being configured on the task 
> manager, that will set the appropriate value at runtime if a value is 
> provided. 
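
The callback idea in the description above can be sketched roughly as follows. 
This is a minimal illustration in plain Java, not the code from the PR: the 
class name, the withRack and fromEnv helpers, and the TM_AVAILABILITY_ZONE 
variable are all hypothetical. The only real Kafka name used is the consumer 
property client.rack. The point is that the supplier is evaluated where the 
consumer is actually built (on the task manager), not where the job is 
defined.

```java
import java.util.Map;
import java.util.Properties;
import java.util.function.Supplier;

// Hypothetical helper class; not part of Flink or the Kafka connector.
final class RackAwareConsumerConfig {

    // Standard Kafka consumer property that carries the client's rack ID.
    public static final String CLIENT_RACK = "client.rack";

    // Returns a copy of the base properties, with client.rack set only if the
    // supplier yields a non-empty value. The base properties are not modified.
    public static Properties withRack(Properties base, Supplier<String> rackIdSupplier) {
        Properties props = new Properties();
        props.putAll(base);
        String rack = rackIdSupplier.get();
        if (rack != null && !rack.isEmpty()) {
            props.setProperty(CLIENT_RACK, rack);
        }
        return props;
    }

    // Assumed lookup: the AZ is exposed to the task manager process through an
    // environment variable. TM_AVAILABILITY_ZONE is a made-up name.
    public static Supplier<String> fromEnv(Map<String, String> env) {
        return () -> env.get("TM_AVAILABILITY_ZONE");
    }

    public static void main(String[] args) {
        Properties base = new Properties();
        base.setProperty("bootstrap.servers", "broker:9092");
        base.setProperty("group.id", "feature-extraction");

        // Resolved at runtime, on whichever machine executes this code.
        Properties props = withRack(base, fromEnv(System.getenv()));
        System.out.println("client.rack = " + props.getProperty(CLIENT_RACK));
    }
}
```

If the supplier returns nothing, client.rack is left unset and the consumer 
behaves as it does today. Note that the brokers also have to be configured 
for rack-aware fetching (broker.rack plus a rack-aware replica.selector.class) 
before setting client.rack has any effect.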



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
