GitHub user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/20752#discussion_r172620679
--- Diff:
sql/core/src/main/java/org/apache/spark/sql/sources/v2/writer/streaming/StreamWriter.java
---
@@ -27,6 +28,9 @@
*
* Streaming queries are divided into intervals of data called epochs, with a monotonically
* increasing numeric ID. This writer handles commits and aborts for each successive epoch.
+ *
+ * Note that StreamWriter implementations should provide instances of
+ * {@link StreamingDataWriterFactory}.
--- End diff --
What do you think about removing `SupportsWriteInternalRow` and always
using `InternalRow`? On the read side, I think using `Row` and `UnsafeRow` is
a problem: https://issues.apache.org/jira/browse/SPARK-23325
I don't see the value of using `Row` instead of `InternalRow` for readers,
so maybe we should just simplify both the read and write paths.
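To make the suggestion concrete, here is a rough, self-contained sketch of what a single-row-type write path could look like. Every name in it (`RowLike`, `WriterLike`, `WriterFactoryLike`, `BufferingWriter`, `writeAll`) is a hypothetical stand-in invented for illustration, not one of the actual Spark interfaces. It also assumes that the only purpose of the mix-in is to select the row format.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SingleRowTypeSketch {
    /** Hypothetical stand-in for an internal row; NOT Spark's InternalRow. */
    public static final class RowLike {
        final Object[] values;
        public RowLike(Object... values) { this.values = values; }
    }

    /** Hypothetical writer that only ever sees the one row type. */
    public interface WriterLike {
        void write(RowLike row);
        int commit(); // returns the number of rows written, for illustration only
    }

    /** Hypothetical factory; no mix-in is needed to pick a row format. */
    public interface WriterFactoryLike {
        WriterLike createWriter(int partitionId);
    }

    /** Trivial writer that buffers rows in memory and reports the count on commit. */
    public static final class BufferingWriter implements WriterLike {
        private final List<RowLike> buffer = new ArrayList<>();
        @Override public void write(RowLike row) { buffer.add(row); }
        @Override public int commit() { return buffer.size(); }
    }

    /** Drives one writer for partition 0 and returns what commit() reports. */
    public static int writeAll(WriterFactoryLike factory, List<RowLike> rows) {
        WriterLike writer = factory.createWriter(0);
        for (RowLike row : rows) {
            writer.write(row);
        }
        return writer.commit();
    }

    public static void main(String[] args) {
        List<RowLike> rows = Arrays.asList(new RowLike(1, "a"), new RowLike(2, "b"));
        int committed = writeAll(partitionId -> new BufferingWriter(), rows);
        System.out.println(committed);
    }
}
```

With only one row type, the caller never has to check which row format a factory supports, which is the simplification being proposed for both paths.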
---