HeartSaVioR commented on a change in pull request #29767:
URL: https://github.com/apache/spark/pull/29767#discussion_r490214083
##########
File path: sql/core/src/main/scala/org/apache/spark/sql/streaming/DataStreamWriter.scala
##########
@@ -300,54 +301,44 @@ final class DataStreamWriter[T] private[sql](ds: Dataset[T]) {
"write files of Hive data source directly.")
}
- if (source == "memory") {
+ if (source == SOURCE_NAME_TABLE) {
+ assertNotPartitioned("table")
+
+ import df.sparkSession.sessionState.analyzer.CatalogAndIdentifier
+
+ import org.apache.spark.sql.connector.catalog.CatalogV2Implicits._
+ val CatalogAndIdentifier(catalog, identifier) =
df.sparkSession.sessionState.sqlParser
Review comment:
Looks like it requires handling `V1Table` after `loadTable` (for views), as well as a pattern match with `AsTableIdentifier(tableIdentifier)` (for temporary views).
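A minimal sketch of the kind of branching I mean, assuming it lives inside `DataStreamWriter.toTable` so that `df`, `tableName` and internals like `V1Table` are in scope; the temp-view check via `sessionState.catalog.isTempView` on the multi-part name and the placeholder error messages are my assumptions, not a concrete proposal:

```scala
import org.apache.spark.sql.connector.catalog.{SupportsWrite, V1Table}
import org.apache.spark.sql.connector.catalog.CatalogV2Implicits._
import df.sparkSession.sessionState.analyzer.{AsTableIdentifier, CatalogAndIdentifier}

val nameParts = df.sparkSession.sessionState.sqlParser.parseMultipartIdentifier(tableName)

val table: SupportsWrite = nameParts match {
  // Temporary view: there is no Table instance to load, so it needs its own
  // handling (or at least a clear error) instead of going through loadTable.
  case AsTableIdentifier(tableIdentifier)
      if df.sparkSession.sessionState.catalog.isTempView(nameParts) =>
    throw new UnsupportedOperationException(s"TODO: handle temp view $tableIdentifier")

  case CatalogAndIdentifier(catalog, identifier) =>
    catalog.asTableCatalog.loadTable(identifier) match {
      // A view or a V1 provider surfaces as V1Table: would need a fallback to the V1 Sink path.
      case _: V1Table =>
        throw new UnsupportedOperationException("TODO: fall back to the V1 Sink path")
      // Proper V2 table: the case the current change already covers.
      case t: SupportsWrite => t
      case other =>
        throw new UnsupportedOperationException(s"Cannot write streaming data to $other")
    }
}
```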
Either way, I see DataFrameWriter leverages `UnresolvedRelation` to defer resolution, whereas a streaming query doesn't add a writer node to the logical plan and instead passes the actual table instance (either `SupportsWrite` for V2 or `Sink` for V1) directly, so the situation looks a bit different.
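Rough illustration of the contrast (not the actual writer code; the table name is made up):

```scala
import org.apache.spark.sql.catalyst.analysis.UnresolvedRelation

// Batch: DataFrameWriter puts an UnresolvedRelation into the logical plan, so the
// analyzer later resolves it to whatever it is: catalog table, view, or temp view.
val deferredTarget = UnresolvedRelation(Seq("testcat", "ns", "t"))

// Streaming: there is no writer node in the logical plan; DataStreamWriter resolves the
// name itself and hands the concrete table (SupportsWrite) or Sink to the query, so any
// view / temp-view handling has to happen eagerly at that point.
```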
(Btw, this is an interesting one to test even on a batch query. I'd probably test by creating a temp view over a V2 table and trying to write to it. If that works for `DataFrameWriter.insertInto`, it's probably one thing `DataFrameWriterV2` may not support as of now, since it doesn't have a fallback to the V1 path.)
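Something along these lines for the batch experiment; catalog and table names are made up, and it assumes `testcat` is already registered as a V2 catalog (e.g. the test-only in-memory catalog) under `spark.sql.catalog.testcat`:

```scala
// Assumes spark.sql.catalog.testcat points to a V2 catalog implementation.
spark.sql("CREATE TABLE testcat.ns.t (id BIGINT, data STRING) USING foo")

// Temp view on top of the V2 table.
spark.table("testcat.ns.t").createOrReplaceTempView("v")

import spark.implicits._
val input = Seq((1L, "a")).toDF("id", "data")

// V1 writer API: resolution is deferred through UnresolvedRelation, so this is the
// path that might resolve the temp view down to the underlying V2 table.
input.write.insertInto("v")

// V2 writer API: no fallback to the V1 path, so this is the case that may fail today.
input.writeTo("v").append()
```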