HeartSaVioR commented on a change in pull request #29767:
URL: https://github.com/apache/spark/pull/29767#discussion_r490214083
##########
File path: sql/core/src/main/scala/org/apache/spark/sql/streaming/DataStreamWriter.scala
##########
@@ -300,54 +301,44 @@ final class DataStreamWriter[T] private[sql](ds: Dataset[T]) {
"write files of Hive data source directly.")
}
- if (source == "memory") {
+ if (source == SOURCE_NAME_TABLE) {
+ assertNotPartitioned("table")
+
+ import df.sparkSession.sessionState.analyzer.CatalogAndIdentifier
+
+ import org.apache.spark.sql.connector.catalog.CatalogV2Implicits._
+ val CatalogAndIdentifier(catalog, identifier) =
df.sparkSession.sessionState.sqlParser
Review comment:
Looks like it requires handling `V1Table` after `loadTable` (for views), as well as a pattern match with `AsTableIdentifier(tableIdentifier)` (for temporary views).
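A minimal sketch of the kind of branching I mean, assuming it lives inside `DataStreamWriter.toTable` so that `df`, `tableName` and internals like `V1Table` are in scope; the temp-view check via `sessionState.catalog.isTempView` on the multi-part name and the placeholder error messages are my assumptions, not a concrete proposal:

```scala
import org.apache.spark.sql.connector.catalog.{SupportsWrite, V1Table}
import org.apache.spark.sql.connector.catalog.CatalogV2Implicits._
import df.sparkSession.sessionState.analyzer.{AsTableIdentifier, CatalogAndIdentifier}

val nameParts = df.sparkSession.sessionState.sqlParser.parseMultipartIdentifier(tableName)

val table: SupportsWrite = nameParts match {
  // Temporary view: there is no Table instance to load, so it needs its own
  // handling (or at least a clear error) instead of going through loadTable.
  case AsTableIdentifier(tableIdentifier)
      if df.sparkSession.sessionState.catalog.isTempView(nameParts) =>
    throw new UnsupportedOperationException(s"TODO: handle temp view $tableIdentifier")

  case CatalogAndIdentifier(catalog, identifier) =>
    catalog.asTableCatalog.loadTable(identifier) match {
      // A view or a V1 provider surfaces as V1Table: would need a fallback to the V1 Sink path.
      case _: V1Table =>
        throw new UnsupportedOperationException("TODO: fall back to the V1 Sink path")
      // Proper V2 table: the case the current change already covers.
      case t: SupportsWrite => t
      case other =>
        throw new UnsupportedOperationException(s"Cannot write streaming data to $other")
    }
}
```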
Either way, I see DataFrameWriter leverages `UnresolvedRelation` to defer resolution, whereas a streaming query doesn't add a writer node to the logical plan and instead passes the actual table instance (either `SupportsWrite` for V2 or `Sink` for V1) directly, so the situation looks a bit different.
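Rough illustration of the contrast (not the actual writer code; the table name is made up):

```scala
import org.apache.spark.sql.catalyst.analysis.UnresolvedRelation

// Batch: DataFrameWriter puts an UnresolvedRelation into the logical plan, so the
// analyzer later resolves it to whatever it is: catalog table, view, or temp view.
val deferredTarget = UnresolvedRelation(Seq("testcat", "ns", "t"))

// Streaming: there is no writer node in the logical plan; DataStreamWriter resolves the
// name itself and hands the concrete table (SupportsWrite) or Sink to the query, so any
// view / temp-view handling has to happen eagerly at that point.
```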
(Btw, this is an interesting one to test even on a batch query. I'd probably test by creating a temp view over a V2 table and trying to write to it. If that works for `DataFrameWriter.insertInto`, it's probably one thing `DataFrameWriterV2` may not support as of now, since it doesn't have a fallback to the V1 path.)
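Something along these lines for the batch experiment; catalog and table names are made up, and it assumes `testcat` is already registered as a V2 catalog (e.g. the test-only in-memory catalog) under `spark.sql.catalog.testcat`:

```scala
// Assumes spark.sql.catalog.testcat points to a V2 catalog implementation.
spark.sql("CREATE TABLE testcat.ns.t (id BIGINT, data STRING) USING foo")

// Temp view on top of the V2 table.
spark.table("testcat.ns.t").createOrReplaceTempView("v")

import spark.implicits._
val input = Seq((1L, "a")).toDF("id", "data")

// V1 writer API: resolution is deferred through UnresolvedRelation, so this is the
// path that might resolve the temp view down to the underlying V2 table.
input.write.insertInto("v")

// V2 writer API: no fallback to the V1 path, so this is the case that may fail today.
input.writeTo("v").append()
```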