[ https://issues.apache.org/jira/browse/SPARK-37970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-37970:
------------------------------------
Assignee: (was: Apache Spark)
> Introduce a new interface on streaming data source to notify the latest seen
> offset
> -----------------------------------------------------------------------------------
>
> Key: SPARK-37970
> URL: https://issues.apache.org/jira/browse/SPARK-37970
> Project: Spark
> Issue Type: New Feature
> Components: Structured Streaming
> Affects Versions: 3.3.0
> Reporter: Jungtaek Lim
> Priority: Major
>
> We have found cases where a streaming data source would benefit from knowing
> the latest seen offset when a query restarts. One such case is letting the
> data source track the offset by itself when its external storage does not
> expose any API that provides the latest available offset.
> We propose a new interface on the streaming data source that tells Spark to
> pass the latest seen offset to the source whenever the query is restarted.
> On the first start of the query, the initial offset of the data source is
> still retrieved by calling initialOffset.
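
The proposed contract could be sketched roughly as follows. This is a minimal, self-contained illustration and not Spark's actual API: apart from initialOffset, which the description mentions, every name here (SketchOffset, SketchStream, LatestSeenOffsetAware, setLatestSeenOffset, SelfTrackingSource, SketchEngine) is a hypothetical stand-in, and the engine is reduced to a single start method.

```java
// Hypothetical stand-in for a streaming offset; not Spark's Offset class.
final class SketchOffset {
    final long value;
    SketchOffset(long value) { this.value = value; }
}

// Hypothetical stand-in for a streaming source that can report where to begin.
interface SketchStream {
    SketchOffset initialOffset();
}

// Hypothetical opt-in interface: a source implementing it asks the engine
// to hand over the latest seen offset whenever the query is restarted.
interface LatestSeenOffsetAware extends SketchStream {
    void setLatestSeenOffset(SketchOffset offset);
}

// A source that tracks its own position because, by assumption, its
// external storage has no API for "latest available offset".
final class SelfTrackingSource implements LatestSeenOffsetAware {
    private long latest = -1L; // -1 means "not positioned yet"

    @Override
    public SketchOffset initialOffset() {
        latest = 0L;
        return new SketchOffset(latest);
    }

    @Override
    public void setLatestSeenOffset(SketchOffset offset) {
        latest = offset.value; // resume counting from the restored position
    }

    // Simulates n new records arriving; the source advances its own counter.
    SketchOffset recordArrival(int n) {
        latest += n;
        return new SketchOffset(latest);
    }
}

// Greatly simplified engine-side logic for starting or restarting a query.
final class SketchEngine {
    // checkpointed is null on the first start of the query.
    static SketchOffset start(SketchStream stream, SketchOffset checkpointed) {
        if (checkpointed == null) {
            return stream.initialOffset(); // first start
        }
        if (stream instanceof LatestSeenOffsetAware) {
            ((LatestSeenOffsetAware) stream).setLatestSeenOffset(checkpointed);
        }
        return checkpointed; // restart resumes from the checkpoint
    }
}

final class LatestSeenOffsetDemo {
    public static void main(String[] args) {
        SelfTrackingSource firstRun = new SelfTrackingSource();
        SketchEngine.start(firstRun, null);
        SketchOffset seen = firstRun.recordArrival(5);

        // After a restart the source instance is new and has lost its state.
        SelfTrackingSource secondRun = new SelfTrackingSource();
        SketchEngine.start(secondRun, seen);
        System.out.println(secondRun.recordArrival(3).value);
    }
}
```

In this sketch a source that does not implement the opt-in interface is left untouched on restart, so existing sources would keep their current behavior.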
--
This message was sent by Atlassian Jira
(v8.20.1#820001)