brkyvz commented on a change in pull request #26868: [SPARK-29665][SQL] refine the TableProvider interface
URL: https://github.com/apache/spark/pull/26868#discussion_r367729351
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala
 ##########
 @@ -257,6 +257,20 @@ final class DataFrameWriter[T] private[sql](ds: Dataset[T]) {
       val options = sessionOptions ++ extraOptions
       val dsOptions = new CaseInsensitiveStringMap(options.asJava)
 
 +      def getTable: Table = {
 +        // For file sources, it's expensive to infer the schema/partitioning on each write.
 +        // Here we pass the schema of the input query and the user-specified partitioning
 +        // to `getTable`. If the query schema is not compatible with the existing data, the
 +        // write can still succeed, but subsequent reads will fail.
 +        if (provider.isInstanceOf[FileDataSourceV2]) {
 +          import org.apache.spark.sql.connector.catalog.CatalogV2Implicits._
 +          val partitioning = partitioningColumns.getOrElse(Nil).asTransforms
 +          provider.getTable(df.schema, partitioning, dsOptions.asCaseSensitiveMap())
 
 Review comment:
   `df.schema.asNullable`?
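
    The suggestion is to relax nullability on the query schema before handing it to `getTable`, so the written schema never claims stronger non-null guarantees than the files may actually satisfy. A minimal sketch of that transformation, using a hypothetical `relax` helper to approximate what `asNullable` does (assumes spark-sql on the classpath; `relax` is not a real Spark API):

    ```scala
    import org.apache.spark.sql.types._

    // Recursively mark every field as nullable, including fields nested
    // inside structs, arrays, and maps. This mirrors the intent of
    // `df.schema.asNullable` in the review comment above.
    def relax(dt: DataType): DataType = dt match {
      case StructType(fields) =>
        StructType(fields.map(f => f.copy(dataType = relax(f.dataType), nullable = true)))
      case ArrayType(elementType, _) =>
        ArrayType(relax(elementType), containsNull = true)
      case MapType(keyType, valueType, _) =>
        MapType(relax(keyType), relax(valueType), valueContainsNull = true)
      case other => other
    }
    ```

    With this, a schema such as `StructType(Seq(StructField("id", LongType, nullable = false)))` comes back with `id` marked nullable, which is the shape the reviewer wants passed into `getTable`.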

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
