[
https://issues.apache.org/jira/browse/FLINK-11948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16794775#comment-16794775
]
Congxian Qiu(klion26) commented on FLINK-11948:
-----------------------------------------------
As the doc says:
{noformat}
To avoid such an unbalanced partitioning, use a round-robin kafka partitioner
(note that this will cause a lot of network connections between all the Flink
instances and all the Kafka brokers).{noformat}
Maybe we could have a round-robin partitioner here?
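A minimal sketch of what such a partitioner could look like, assuming the same partition(...) signature as in the issue description below. The class name RoundRobinPartitionerSketch is hypothetical and the class is kept standalone so it runs without Flink on the classpath; a real version would presumably extend FlinkKafkaPartitioner<T> instead.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical standalone sketch, not part of Flink. A real implementation
// would extend FlinkKafkaPartitioner<T> and override partition(...).
class RoundRobinPartitionerSketch<T> {

    // Per-instance counter, so every sink subtask cycles through the
    // partitions independently of the other subtasks.
    private final AtomicInteger next = new AtomicInteger(0);

    public int partition(T record, byte[] key, byte[] value, String targetTopic, int[] partitions) {
        if (partitions == null || partitions.length == 0) {
            throw new IllegalArgumentException("Partitions of the target topic is empty.");
        }
        // floorMod keeps the index non-negative after the counter overflows.
        int index = Math.floorMod(next.getAndIncrement(), partitions.length);
        return partitions[index];
    }
}
```

With this approach every subtask eventually writes to every partition, which is also why the doc warns about the number of network connections.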
> When kafka sink parallelism<kafka partition num,kafka data distribution
> unbalance
> ---------------------------------------------------------------------------------
>
> Key: FLINK-11948
> URL: https://issues.apache.org/jira/browse/FLINK-11948
> Project: Flink
> Issue Type: Bug
> Components: Connectors / Kafka
> Affects Versions: 1.6.3, 1.6.4, 1.7.2
> Reporter: qi quan
> Priority: Major
>
> The default FlinkFixedPartitioner picks the target partition as subtask ID %
> partitions.length. When the Kafka sink parallelism is smaller than the number
> of Kafka partitions, only the first few Kafka partitions receive data.
> I think this needs to be improved.
> {code:java}
> @Override
> public int partition(T record, byte[] key, byte[] value, String targetTopic, int[] partitions) {
>     Preconditions.checkArgument(
>         partitions != null && partitions.length > 0,
>         "Partitions of the target topic is empty.");
>     return partitions[parallelInstanceId % partitions.length];
> }
> {code}
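To make the imbalance in the quoted snippet concrete, here is a small standalone sketch that applies the same subtask ID % partitions.length rule and collects the partitions that get written. The class and method names are hypothetical, and the parallelism of 2 with 6 partitions is only an example.

```java
import java.util.TreeSet;

// Hypothetical demo class, not part of Flink.
class FixedPartitionerImbalanceDemo {

    // Partitions reached when subtask i always writes to
    // partitions[i % partitions.length], as in the quoted code.
    static TreeSet<Integer> usedPartitions(int parallelism, int[] partitions) {
        TreeSet<Integer> used = new TreeSet<>();
        for (int subtask = 0; subtask < parallelism; subtask++) {
            used.add(partitions[subtask % partitions.length]);
        }
        return used;
    }

    public static void main(String[] args) {
        int[] partitions = {0, 1, 2, 3, 4, 5};
        // With 2 sink subtasks and 6 partitions, only partitions 0 and 1 are used.
        System.out.println(usedPartitions(2, partitions)); // prints [0, 1]
    }
}
```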
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)