[ https://issues.apache.org/jira/browse/FLINK-8516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16340564#comment-16340564 ]
Thomas Weise commented on FLINK-8516:
-------------------------------------
Relevant piece of code:
[https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/internals/KinesisDataFetcher.java#L594]
{code:java}
public static boolean isThisSubtaskShouldSubscribeTo(StreamShardHandle shard,
                                                     int totalNumberOfConsumerSubtasks,
                                                     int indexOfThisConsumerSubtask) {
    return (Math.abs(shard.hashCode() % totalNumberOfConsumerSubtasks)) == indexOfThisConsumerSubtask;
}
{code}
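To make the skew concrete, here is a minimal sketch of the same modulo rule applied to plain integers. It assumes, purely for illustration, that a shard's hash behaves like its numeric identifier; the real StreamShardHandle.hashCode() is not reproduced here, and the class name, method names, and shard numbers below are all hypothetical.

```java
import java.util.Arrays;

// Hypothetical stand-in for the connector logic; not part of Flink.
final class ShardAssignmentSketch {

    // Same rule as isThisSubtaskShouldSubscribeTo, but returning the owning
    // subtask index for a plain int hash (assumption: hash == shard number).
    static int assignedSubtask(int shardHash, int totalSubtasks) {
        return Math.abs(shardHash % totalSubtasks);
    }

    // Counts how many shards each subtask would end up consuming.
    static int[] countPerSubtask(int[] shardHashes, int totalSubtasks) {
        int[] counts = new int[totalSubtasks];
        for (int hash : shardHashes) {
            counts[assignedSubtask(hash, totalSubtasks)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Eight shards with sequential numbers, four subtasks.
        int[] sequential = {0, 1, 2, 3, 4, 5, 6, 7};
        // Eight shards whose numbers are no longer sequential (made-up values).
        int[] sparse = {0, 4, 8, 12, 16, 20, 24, 28};

        System.out.println(Arrays.toString(countPerSubtask(sequential, 4))); // [2, 2, 2, 2]
        System.out.println(Arrays.toString(countPerSubtask(sparse, 4)));     // [8, 0, 0, 0]
    }
}
```

With sequential numbers every subtask gets two shards; with the gapped numbers all eight shards land on subtask 0 and the other three subtasks stay idle.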
> FlinkKinesisConsumer does not balance shards over subtasks
> ----------------------------------------------------------
>
> Key: FLINK-8516
> URL: https://issues.apache.org/jira/browse/FLINK-8516
> Project: Flink
> Issue Type: Bug
> Components: Kinesis Connector
> Reporter: Thomas Weise
> Priority: Major
>
> The hash code of the shard is used to distribute discovered shards over
> subtasks round robin. This works as long as shard identifiers are sequential.
> After shards are rebalanced in Kinesis, that may no longer be the case and
> the distribution becomes skewed.