[jira] [Updated] (FLINK-7407) Assumption of partition id strict contiguity is too naive in Kafka consumer's AbstractPartitionDiscoverer
[ https://issues.apache.org/jira/browse/FLINK-7407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tzu-Li (Gordon) Tai updated FLINK-7407: --- Priority: Blocker (was: Critical) > Assumption of partition id strict contiguity is too naive in Kafka consumer's > AbstractPartitionDiscoverer > - > > Key: FLINK-7407 > URL: https://issues.apache.org/jira/browse/FLINK-7407 > Project: Flink > Issue Type: Improvement > Components: Kafka Connector >Affects Versions: 1.4.0 >Reporter: Tzu-Li (Gordon) Tai >Assignee: Tzu-Li (Gordon) Tai >Priority: Blocker > Fix For: 1.4.0 > > > In the Kafka Consumer's {{AbstractPartitionDiscoverer}}, for partition > discovery, already discovered partitions are tracked with the following map: > {code} > MaptopicsToLargestDiscoveredPartitionId > {code} > Simply put, on each discovery attempt's metadata fetch, all partition ids of > a given topic that are smaller than the largest seen id will be ignored and > not assigned. This approach lies on the assumption that fetched partition ids > of a single topic are always strictly contiguous starting from 0. > This assumption may be too naive, in that partitions which were temporarily > unavailable at the time of a discovery would be shadowed by available > partitions with larger ids, and from then on would be left unassigned. > We should redesign how the {{AbstractPartitionDiscoverer}} tracks discovered > partitions by not relying on the contiguity assumption, and also add test > cases for non-contiguous fetched partition ids. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (FLINK-7407) Assumption of partition id strict contiguity is too naive in Kafka consumer's AbstractPartitionDiscoverer
[ https://issues.apache.org/jira/browse/FLINK-7407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tzu-Li (Gordon) Tai updated FLINK-7407: --- Description: In the Kafka Consumer's {{AbstractPartitionDiscoverer}}, for partition discovery, already discovered partitions are tracked with the following map: {code} MaptopicsToLargestDiscoveredPartitionId {code} Simply put, on each discovery attempt's metadata fetch, all partition ids of a given topic that are smaller than the largest seen id will be ignored and not assigned. This approach lies on the assumption that fetched partition ids of a single topic are always strictly contiguous starting from 0. This assumption may be too naive, in that partitions which were temporarily unavailable at the time of a discovery would be shadowed by available partitions with larger ids, and from then on would be left unassigned. We should redesign how the {{AbstractPartitionDiscoverer}} tracks discovered partitions by not relying on the contiguity assumption, and also add test cases for non-contiguous fetched partition ids. was: In the Kafka Consumer's {{AbstractPartitionDiscoverer}}, for partition discovery, already discovered partitions are tracked with the following map: ``` Map topicsToLargestDiscoveredPartitionId ``` Simply put, on each discovery attempt's metadata fetch, all partition ids of a given topic that are smaller than the largest seen id will be ignored and not assigned. This approach lies on the assumption that fetched partition ids of a single topic are always strictly contiguous starting from 0. This assumption may be too naive, in that partitions which were temporarily unavailable at the time of a discovery would be shadowed by available partitions with larger ids, and from then on would be left unassigned. We should redesign how the `AbstractPartitionDiscoverer` tracks discovered partitions by not relying on the contiguity assumption, and also add test cases for non-contiguous fetched partition ids. > Assumption of partition id strict contiguity is too naive in Kafka consumer's > AbstractPartitionDiscoverer > - > > Key: FLINK-7407 > URL: https://issues.apache.org/jira/browse/FLINK-7407 > Project: Flink > Issue Type: Improvement > Components: Kafka Connector >Affects Versions: 1.4.0 >Reporter: Tzu-Li (Gordon) Tai > Fix For: 1.4.0 > > > In the Kafka Consumer's {{AbstractPartitionDiscoverer}}, for partition > discovery, already discovered partitions are tracked with the following map: > {code} > Map topicsToLargestDiscoveredPartitionId > {code} > Simply put, on each discovery attempt's metadata fetch, all partition ids of > a given topic that are smaller than the largest seen id will be ignored and > not assigned. This approach lies on the assumption that fetched partition ids > of a single topic are always strictly contiguous starting from 0. > This assumption may be too naive, in that partitions which were temporarily > unavailable at the time of a discovery would be shadowed by available > partitions with larger ids, and from then on would be left unassigned. > We should redesign how the {{AbstractPartitionDiscoverer}} tracks discovered > partitions by not relying on the contiguity assumption, and also add test > cases for non-contiguous fetched partition ids. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (FLINK-7407) Assumption of partition id strict contiguity is too naive in Kafka consumer's AbstractPartitionDiscoverer
[ https://issues.apache.org/jira/browse/FLINK-7407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tzu-Li (Gordon) Tai updated FLINK-7407: --- Priority: Critical (was: Major) > Assumption of partition id strict contiguity is too naive in Kafka consumer's > AbstractPartitionDiscoverer > - > > Key: FLINK-7407 > URL: https://issues.apache.org/jira/browse/FLINK-7407 > Project: Flink > Issue Type: Improvement > Components: Kafka Connector >Affects Versions: 1.4.0 >Reporter: Tzu-Li (Gordon) Tai >Priority: Critical > Fix For: 1.4.0 > > > In the Kafka Consumer's {{AbstractPartitionDiscoverer}}, for partition > discovery, already discovered partitions are tracked with the following map: > {code} > MaptopicsToLargestDiscoveredPartitionId > {code} > Simply put, on each discovery attempt's metadata fetch, all partition ids of > a given topic that are smaller than the largest seen id will be ignored and > not assigned. This approach lies on the assumption that fetched partition ids > of a single topic are always strictly contiguous starting from 0. > This assumption may be too naive, in that partitions which were temporarily > unavailable at the time of a discovery would be shadowed by available > partitions with larger ids, and from then on would be left unassigned. > We should redesign how the {{AbstractPartitionDiscoverer}} tracks discovered > partitions by not relying on the contiguity assumption, and also add test > cases for non-contiguous fetched partition ids. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (FLINK-7407) Assumption of partition id strict contiguity is too naive in Kafka consumer's AbstractPartitionDiscoverer
[ https://issues.apache.org/jira/browse/FLINK-7407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tzu-Li (Gordon) Tai updated FLINK-7407: --- Affects Version/s: 1.4.0 > Assumption of partition id strict contiguity is too naive in Kafka consumer's > AbstractPartitionDiscoverer > - > > Key: FLINK-7407 > URL: https://issues.apache.org/jira/browse/FLINK-7407 > Project: Flink > Issue Type: Improvement > Components: Kafka Connector >Affects Versions: 1.4.0 >Reporter: Tzu-Li (Gordon) Tai > Fix For: 1.4.0 > > > In the Kafka Consumer's {{AbstractPartitionDiscoverer}}, for partition > discovery, already discovered partitions are tracked with the following map: > {code} > MaptopicsToLargestDiscoveredPartitionId > {code} > Simply put, on each discovery attempt's metadata fetch, all partition ids of > a given topic that are smaller than the largest seen id will be ignored and > not assigned. This approach lies on the assumption that fetched partition ids > of a single topic are always strictly contiguous starting from 0. > This assumption may be too naive, in that partitions which were temporarily > unavailable at the time of a discovery would be shadowed by available > partitions with larger ids, and from then on would be left unassigned. > We should redesign how the {{AbstractPartitionDiscoverer}} tracks discovered > partitions by not relying on the contiguity assumption, and also add test > cases for non-contiguous fetched partition ids. -- This message was sent by Atlassian JIRA (v6.4.14#64029)