Github user jose-torres commented on a diff in the pull request:
https://github.com/apache/spark/pull/20675#discussion_r171014505
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/streaming/ContinuousDataReader.java
---
@@ -33,4 +33,16 @@
* as a restart checkpoint.
*/
PartitionOffset getOffset();
+
+ /**
+ * Set the start offset for the current record, only used in task
retry. If setOffset keep
+ * default implementation, it means current ContinuousDataReader can't
support task level retry.
+ *
+ * @param offset last offset before task retry.
+ */
+ default void setOffset(PartitionOffset offset) {
--- End diff --
I think it might be better to create a new interface
ContinuousDataReaderFactory, and implement this there as something like
`createDataReaderWithOffset(PartitionOffset offset)`. That way the intended
lifecycle is explicit.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]