[jira] [Commented] (SPARK-29223) Kafka source: offset by timestamp - allow specifying timestamp for "all partitions"
    [ https://issues.apache.org/jira/browse/SPARK-29223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17403341#comment-17403341 ]

Andrew Grigorev commented on SPARK-29223:
-----------------------------------------

Just an idea - couldn't this be implemented in a "timestamp field filter pushdown"-like way?

> Kafka source: offset by timestamp - allow specifying timestamp for "all partitions"
> -----------------------------------------------------------------------------------
>
>                 Key: SPARK-29223
>                 URL: https://issues.apache.org/jira/browse/SPARK-29223
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL, Structured Streaming
>    Affects Versions: 3.1.0
>            Reporter: Jungtaek Lim
>            Assignee: Jungtaek Lim
>            Priority: Minor
>             Fix For: 3.2.0
>
>
> This issue is a follow-up of SPARK-26848.
> In SPARK-26848, we decided to let end users set an individual timestamp
> per partition. But in many cases, specifying a timestamp expresses the
> intention to go back to a specific timestamp and reprocess records, which
> should apply to all topics and partitions.
> Given the format of
> `startingOffsetsByTimestamp`/`endingOffsetsByTimestamp`, it's not
> intuitive to provide an option to set a global timestamp across topics,
> but it's still intuitive to provide an option to set a global timestamp
> across partitions in a topic.
> This issue tracks the efforts to deal with this.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-29223) Kafka source: offset by timestamp - allow specifying timestamp for "all partitions"
    [ https://issues.apache.org/jira/browse/SPARK-29223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17348508#comment-17348508 ]

Apache Spark commented on SPARK-29223:
--------------------------------------

User 'HeartSaVioR' has created a pull request for this issue:
https://github.com/apache/spark/pull/32609

> Kafka source: offset by timestamp - allow specifying timestamp for "all partitions"
> -----------------------------------------------------------------------------------
>
>                 Key: SPARK-29223
>                 URL: https://issues.apache.org/jira/browse/SPARK-29223
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL, Structured Streaming
>    Affects Versions: 3.1.0
>            Reporter: Jungtaek Lim
>            Priority: Minor
[jira] [Commented] (SPARK-29223) Kafka source: offset by timestamp - allow specifying timestamp for "all partitions"
    [ https://issues.apache.org/jira/browse/SPARK-29223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16936287#comment-16936287 ]

Jungtaek Lim commented on SPARK-29223:
--------------------------------------

Working on this.

> Kafka source: offset by timestamp - allow specifying timestamp for "all partitions"
> -----------------------------------------------------------------------------------
>
>                 Key: SPARK-29223
>                 URL: https://issues.apache.org/jira/browse/SPARK-29223
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL, Structured Streaming
>    Affects Versions: 3.0.0
>            Reporter: Jungtaek Lim
>            Priority: Minor
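
For readers following the thread, here is a minimal sketch of what the issue description means by "a global timestamp across partitions in a topic". It assumes the `startingOffsetsByTimestamp` value is a JSON string mapping topic name to partition number to timestamp in epoch milliseconds, and that the partition count of each topic is already known. The helper name `offsets_by_timestamp_for_all_partitions` is hypothetical and is not part of Spark; this only illustrates the boilerplate a user would otherwise write by hand, not how the issue was resolved.

```python
import json


def offsets_by_timestamp_for_all_partitions(topic_partitions, timestamp_ms):
    """Build a per-partition timestamp spec from one shared timestamp.

    topic_partitions: dict of topic name -> number of partitions
                      (assumed to be known by the caller).
    timestamp_ms:     epoch milliseconds applied to every partition.
    """
    spec = {
        # Partition ids are JSON object keys, so they must be strings.
        topic: {str(p): timestamp_ms for p in range(num_partitions)}
        for topic, num_partitions in topic_partitions.items()
    }
    return json.dumps(spec)


# One timestamp repeated for each of the three partitions of "topicA".
option_value = offsets_by_timestamp_for_all_partitions({"topicA": 3}, 1000)
print(option_value)
# prints {"topicA": {"0": 1000, "1": 1000, "2": 1000}}
```

The resulting string would be passed as the value of the `startingOffsetsByTimestamp` (or `endingOffsetsByTimestamp`) option on a Kafka reader. The repetition per partition, and the need to know partition counts up front, is the inconvenience the issue sets out to remove.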