[jira] [Commented] (SPARK-29223) Kafka source: offset by timestamp - allow specifying timestamp for "all partitions"

2021-08-23 Thread Andrew Grigorev (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17403341#comment-17403341
 ] 

Andrew Grigorev commented on SPARK-29223:
-

Just an idea: couldn't this be implemented in a "timestamp field filter 
pushdown"-like way?

> Kafka source: offset by timestamp - allow specifying timestamp for "all 
> partitions"
> ---
>
> Key: SPARK-29223
> URL: https://issues.apache.org/jira/browse/SPARK-29223
> Project: Spark
> Issue Type: Improvement
> Components: SQL, Structured Streaming
> Affects Versions: 3.1.0
> Reporter: Jungtaek Lim
> Assignee: Jungtaek Lim
> Priority: Minor
> Fix For: 3.2.0
>
>
> This issue is a follow-up of SPARK-26848.
> In SPARK-26848, we decided to let end users set an individual timestamp 
> per partition. In many cases, however, specifying a timestamp expresses the 
> intention to go back to a specific point in time and reprocess records, 
> which should apply to all topics and partitions.
> Given the format of 
> `startingOffsetsByTimestamp`/`endingOffsetsByTimestamp`, an option that sets 
> a global timestamp across topics is not intuitive, but an option that sets 
> a global timestamp across all partitions in a topic still is.
> This issue tracks the effort to address this.
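
To make the motivation concrete, here is a minimal sketch of what a user has to do with the per-partition format alone: repeat the same timestamp for every partition of every topic. It assumes the JSON shape documented for `startingOffsetsByTimestamp` (topic name, then partition id as a string key, then a timestamp in milliseconds). The helper name `offsets_by_timestamp`, the topic `events`, its partition ids, and the timestamp value are all hypothetical and chosen only for illustration.

```python
import json


def offsets_by_timestamp(partitions_by_topic, timestamp_ms):
    """Build a startingOffsetsByTimestamp-style JSON string that repeats
    one timestamp for every listed partition of every listed topic.

    partitions_by_topic: dict mapping topic name -> iterable of partition ids
    timestamp_ms: timestamp in milliseconds, applied to all partitions
    """
    spec = {
        topic: {str(partition): timestamp_ms for partition in partitions}
        for topic, partitions in partitions_by_topic.items()
    }
    return json.dumps(spec)


# Hypothetical topic with three partitions, all rewound to the same timestamp.
option_value = offsets_by_timestamp({"events": [0, 1, 2]}, 1_600_000_000_000)
print(option_value)
```

The resulting string could then be passed to a Kafka reader as `.option("startingOffsetsByTimestamp", option_value)`. Note that the caller has to know every partition id up front, which is the inconvenience this issue sets out to remove.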



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-29223) Kafka source: offset by timestamp - allow specifying timestamp for "all partitions"

2021-05-20 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17348508#comment-17348508
 ] 

Apache Spark commented on SPARK-29223:
--

User 'HeartSaVioR' has created a pull request for this issue:
https://github.com/apache/spark/pull/32609

> Kafka source: offset by timestamp - allow specifying timestamp for "all 
> partitions"
> ---
>
> Key: SPARK-29223
> URL: https://issues.apache.org/jira/browse/SPARK-29223
> Project: Spark
> Issue Type: Improvement
> Components: SQL, Structured Streaming
> Affects Versions: 3.1.0
> Reporter: Jungtaek Lim
> Priority: Minor






[jira] [Commented] (SPARK-29223) Kafka source: offset by timestamp - allow specifying timestamp for "all partitions"

2019-09-23 Thread Jungtaek Lim (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-29223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16936287#comment-16936287
 ] 

Jungtaek Lim commented on SPARK-29223:
--

Working on this.

> Kafka source: offset by timestamp - allow specifying timestamp for "all 
> partitions"
> ---
>
> Key: SPARK-29223
> URL: https://issues.apache.org/jira/browse/SPARK-29223
> Project: Spark
> Issue Type: Improvement
> Components: SQL, Structured Streaming
> Affects Versions: 3.0.0
> Reporter: Jungtaek Lim
> Priority: Minor


