Xu xiaolong created FLINK-22840:
-----------------------------------
Summary: Assign-evenly kafkaTopicPartitions of multiple topics to
flinkKafkaConsumer subtask
Key: FLINK-22840
URL: https://issues.apache.org/jira/browse/FLINK-22840
Project: Flink
Issue Type: Improvement
Components: Connectors / Kafka
Affects Versions: 1.11.0
Reporter: Xu xiaolong
Now ,with flink1.11 kafka connecotr,when we consume multiple kafka topics by
one flinkkafkaconsumer , when we set the consumer parallelism equals with the
total partitions count of multiple topic , make a decision each topic partition
consume by one kafka consumer, so each topic partition count is less than the
subtask count. But,the problem is that currently, some subtask is total free
while someothers workload is very high, this problem is caused by that the
partitionAssigner assign partion of earch topic indepently currently.
Following is one example: Target topics: topi1, topic2 ,topic3 ,topic4. each
has 3 partitions. In our job we consume the 4 topic by one consumer , our flink
standalone cluster got 9 taskworkers on different nodes. we want balance the
workload as much as possible, so we set the paralelism of flinkkafkaconsumer
to 12. from the UI we notice that the 0-5 subtask is free without partition
assigned, the total 12 partiton is assigned to 6-11 subtask. We learned the
source code of KafkaTopicPartitionAssigner to explain this phenomenon ,and then
we extend one more partition assign strategy which can deal with the need we
describe up, this stategy can evenly assign partiton from multiple topic
grobally to subtask of consumer. we want to contibute to flink, so someone has
the same requirement can use it directlly.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)