HyukjinKwon commented on code in PR #42430:
URL: https://github.com/apache/spark/pull/42430#discussion_r1290817310
##########
connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/streaming/DataStreamWriter.scala:
##########
@@ -247,6 +247,24 @@ final class DataStreamWriter[T] private[sql] (ds: Dataset[T]) extends Logging {
this
}
+ /**
+ * :: Experimental ::
+ *
+ * (Java-specific) Sets the output of the streaming query to be processed using the provided
+ * function. This is supported only in the micro-batch execution modes (that is, when the
+ * trigger is not continuous). In every micro-batch, the provided function will be called in
+ * every micro-batch with (i) the output rows as a Dataset and (ii) the batch identifier. The
+ * batchId can be used to deduplicate and transactionally write the output (that is, the
+ * provided Dataset) to external systems. The output Dataset is guaranteed to be exactly the
+ * same for the same batchId (assuming all operations are deterministic in the query).
+ *
+ * @since 2.5.0
Review Comment:
```suggestion
* @since 3.5.0
```
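
For context on the Scaladoc quoted above: the batchId-based deduplication it describes can be sketched in plain Java without any Spark dependency. This is only an illustration under stated assumptions, not code from this PR. `IdempotentSink` and `writeBatch` are hypothetical names, and the in-memory list stands in for an external system. In a real query, this logic would presumably run inside the function passed to the writer (`foreachBatch` in the Spark API).

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory sink (not part of Spark). It remembers which
// batch ids were already committed so that a replayed micro-batch is
// written at most once.
final class IdempotentSink {
    private final Set<Long> committedBatches = new HashSet<>();
    private final List<String> rows = new ArrayList<>();

    // Returns true if the batch was written, false if the same batchId
    // had already been committed and the write was skipped.
    synchronized boolean writeBatch(long batchId, List<String> batchRows) {
        if (!committedBatches.add(batchId)) {
            return false; // duplicate delivery of an already committed batch
        }
        rows.addAll(batchRows);
        return true;
    }

    // Snapshot of everything written so far.
    synchronized List<String> rows() {
        return new ArrayList<>(rows);
    }
}
```

Skipping the replay is safe only because the output for a given batchId is the same on every attempt, which is the guarantee the Scaladoc states for deterministic queries. A real external system would also need to record the batch id and the rows in one transaction.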
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]