[jira] [Updated] (FLINK-32019) EARLIEST offset strategy for partitions discoveried later based on FLIP-288

2024-01-26 Thread Martijn Visser (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn Visser updated FLINK-32019:
---
Fix Version/s: kafka-3.1.0
   (was: kafka-4.0.0)

> EARLIEST offset strategy for partitions discoveried later based on FLIP-288
> ---
>
> Key: FLINK-32019
> URL: https://issues.apache.org/jira/browse/FLINK-32019
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Kafka
>Affects Versions: kafka-3.0.0
>Reporter: Hongshun Wang
>Assignee: Hongshun Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: kafka-3.1.0
>
>
> As described in 
> [FLIP-288|https://cwiki.apache.org/confluence/display/FLINK/FLIP-288%3A+Enable+Dynamic+Partition+Discovery+by+Default+in+Kafka+Source],
>  the strategy used for new partitions is the same as the initial offset 
> strategy, which is not reasonable.
> According to the semantics, if the startup strategy is latest, the consumed 
> data should include all data from the moment of startup, which also includes 
> all messages from new created partitions. However, the latest strategy 
> currently maybe used for new partitions, leading to the loss of some data 
> (thinking a new partition is created and might be discovered by Kafka source 
> several minutes later, and the message produced into the partition within the 
> gap might be dropped if we use for example "latest" as the initial offset 
> strategy).if the data from all new partitions is not read, it does not meet 
> the user's expectations.
> Other ploblems see final Section of 
> [FLIP-288|https://cwiki.apache.org/confluence/display/FLINK/FLIP-288%3A+Enable+Dynamic+Partition+Discovery+by+Default+in+Kafka+Source]:
>  {{User specifies OffsetsInitializer for new partition}} .
> Therefore, it’s better to provide an *EARLIEST* strategy for later discovered 
> partitions.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32019) EARLIEST offset strategy for partitions discoveried later based on FLIP-288

2023-05-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-32019:
---
Labels: pull-request-available  (was: )

> EARLIEST offset strategy for partitions discoveried later based on FLIP-288
> ---
>
> Key: FLINK-32019
> URL: https://issues.apache.org/jira/browse/FLINK-32019
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Kafka
>Affects Versions: kafka-3.0.0
>Reporter: Hongshun Wang
>Assignee: Hongshun Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: kafka-4.0.0
>
>
> As described in 
> [FLIP-288|https://cwiki.apache.org/confluence/display/FLINK/FLIP-288%3A+Enable+Dynamic+Partition+Discovery+by+Default+in+Kafka+Source],
>  the strategy used for new partitions is the same as the initial offset 
> strategy, which is not reasonable.
> According to the semantics, if the startup strategy is latest, the consumed 
> data should include all data from the moment of startup, which also includes 
> all messages from new created partitions. However, the latest strategy 
> currently maybe used for new partitions, leading to the loss of some data 
> (thinking a new partition is created and might be discovered by Kafka source 
> several minutes later, and the message produced into the partition within the 
> gap might be dropped if we use for example "latest" as the initial offset 
> strategy).if the data from all new partitions is not read, it does not meet 
> the user's expectations.
> Other ploblems see final Section of 
> [FLIP-288|https://cwiki.apache.org/confluence/display/FLINK/FLIP-288%3A+Enable+Dynamic+Partition+Discovery+by+Default+in+Kafka+Source]:
>  {{User specifies OffsetsInitializer for new partition}} .
> Therefore, it’s better to provide an *EARLIEST* strategy for later discovered 
> partitions.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)