Tzu-Li (Gordon) Tai created FLINK-7407: ------------------------------------------
Summary: Assumption of partition id strict contiguity is too naive in Kafka consumer's AbstractPartitionDiscoverer Key: FLINK-7407 URL: https://issues.apache.org/jira/browse/FLINK-7407 Project: Flink Issue Type: Improvement Components: Kafka Connector Reporter: Tzu-Li (Gordon) Tai Fix For: 1.4.0 In the Kafka Consumer's {{AbstractPartitionDiscoverer}}, for partition discovery, already discovered partitions are tracked with the following map: ``` Map<String, Integer> topicsToLargestDiscoveredPartitionId ``` Simply put, on each discovery attempt's metadata fetch, all partition ids of a given topic that are smaller than the largest seen id will be ignored and not assigned. This approach lies on the assumption that fetched partition ids of a single topic are always strictly contiguous starting from 0. This assumption may be too naive, in that partitions which were temporarily unavailable at the time of a discovery would be shadowed by available partitions with larger ids, and from then on would be left unassigned. We should redesign how the `AbstractPartitionDiscoverer` tracks discovered partitions by not relying on the contiguity assumption, and also add test cases for non-contiguous fetched partition ids. -- This message was sent by Atlassian JIRA (v6.4.14#64029)