Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/22009#discussion_r208437780
--- Diff: sql/core/src/main/java/org/apache/spark/sql/sources/v2/StreamingWriteSupportProvider.java ---
@@ -29,24 +28,24 @@
  * provide data writing ability for structured streaming.
  */
 @InterfaceStability.Evolving
-public interface StreamWriteSupport extends DataSourceV2, BaseStreamingSink {
+public interface StreamingWriteSupportProvider extends DataSourceV2, BaseStreamingSink {
-  /**
-   * Creates an optional {@link StreamWriter} to save the data to this data source. Data
-   * sources can return None if there is no writing needed to be done.
-   *
-   * @param queryId A unique string for the writing query. It's possible that there are many
-   *                writing queries running at the same time, and the returned
-   *                {@link DataSourceWriter} can use this id to distinguish itself from others.
-   * @param schema the schema of the data to be written.
-   * @param mode the output mode which determines what successive epoch output means to this
-   *             sink, please refer to {@link OutputMode} for more details.
-   * @param options the options for the returned data source writer, which is an immutable
-   *                case-insensitive string-to-string map.
-   */
-  StreamWriter createStreamWriter(
-      String queryId,
-      StructType schema,
-      OutputMode mode,
-      DataSourceOptions options);
+  /**
+   * Creates an optional {@link StreamingWriteSupport} to save the data to this data source. Data
+   * sources can return None if there is no writing needed to be done.
+   *
+   * @param queryId A unique string for the writing query. It's possible that there are many
+   *                writing queries running at the same time, and the returned
+   *                {@link StreamingWriteSupport} can use this id to distinguish itself from others.
+   * @param schema the schema of the data to be written.
+   * @param mode the output mode which determines what successive epoch output means to this
+   *             sink, please refer to {@link OutputMode} for more details.
+   * @param options the options for the returned data source writer, which is an immutable
+   *                case-insensitive string-to-string map.
+   */
+  StreamingWriteSupport createStreamingWritSupport(
+      String queryId,
--- End diff --
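To make the rename concrete, here is a minimal sketch of what an implementor would write against the new interface. The method name and parameters are taken verbatim from the diff above; `ConsoleLikeProvider` and `MyStreamingWriteSupport` are hypothetical names, not part of this PR, and the `StreamingWriteSupport` import path is assumed from the Spark 2.4-era package layout.

```java
import org.apache.spark.sql.sources.v2.DataSourceOptions;
import org.apache.spark.sql.sources.v2.StreamingWriteSupportProvider;
// Assumed package for the refactored streaming write interface.
import org.apache.spark.sql.sources.v2.writer.streaming.StreamingWriteSupport;
import org.apache.spark.sql.streaming.OutputMode;
import org.apache.spark.sql.types.StructType;

// Hypothetical sink, not part of this PR: shows how a data source would
// implement the renamed provider method.
public class ConsoleLikeProvider implements StreamingWriteSupportProvider {

  @Override
  public StreamingWriteSupport createStreamingWritSupport(
      String queryId,
      StructType schema,
      OutputMode mode,
      DataSourceOptions options) {
    // queryId lets the sink tell concurrent writing queries apart, e.g. when
    // keying per-query state. MyStreamingWriteSupport is a placeholder
    // implementation of StreamingWriteSupport defined elsewhere.
    return new MyStreamingWriteSupport(queryId, schema, mode, options);
  }
}
```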
For the batch API, I think we can remove the job id and ask data sources to generate a UUID themselves. But for streaming, I'm not sure. Maybe we need it for failure recovery or streaming restart, cc @jose-torres
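Roughly what I mean, as an illustration only (the class, paths, and names below are made up, not Spark APIs): a batch source can mint a fresh id per write job, while a streaming sink may want the stable, engine-provided query id so that a restarted query maps back onto the same sink-side state.

```java
import java.util.UUID;

// Illustration only, not code from this PR.
public final class WriteIdExample {

  // Batch: nothing needs to be recovered once the job finishes, so the data
  // source can simply generate a fresh id per write job.
  static String batchWriteId() {
    return UUID.randomUUID().toString();
  }

  // Streaming: the same query is expected to come back after a failure or a
  // restart, so keying sink-side state by the engine-provided query id keeps
  // the mapping stable across runs (hypothetical directory layout).
  static String streamingStateDir(String queryId) {
    return "/checkpoints/sink-state/" + queryId;
  }

  public static void main(String[] args) {
    System.out.println(batchWriteId());
    System.out.println(streamingStateDir("query-42"));
  }
}
```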