gengliangwang commented on a change in pull request #30273:
URL: https://github.com/apache/spark/pull/30273#discussion_r519891333
##########
File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala
##########
@@ -325,11 +325,12 @@ final class DataFrameWriter[T] private[sql](ds: Dataset[T]) {
val dsOptions = new CaseInsensitiveStringMap(finalOptions.asJava)
def getTable: Table = {
-      // For file source, it's expensive to infer schema/partition at each write. Here we pass
-      // the schema of input query and the user-specified partitioning to `getTable`. If the
+      // If the source accepts external table metadata, here we pass the schema of input query
+      // and the user-specified partitioning to `getTable`. This is for avoiding
+      // schema/partitioning inference, which can be very expensive. If the
       // query schema is not compatible with the existing data, the write can still success but
Review comment:
+1, thanks
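The logic described in the new comment can be sketched as follows. This is a minimal, self-contained model, not Spark's actual classes: the type names (`Table`, `FileSource`), the `supportsExternalMetadata`/`getTable` signatures, and the plain `List<String>` stand-ins for `StructType` and `Transform[]` are all illustrative assumptions. The point it demonstrates is the one the comment makes: when the source accepts external table metadata, the writer hands it the query schema and user partitioning directly, so the expensive inference path never runs.

```java
import java.util.List;
import java.util.Optional;

// Stand-in for a resolved table; schema/partitioning are simplified to strings.
class Table {
    final List<String> schema;
    final List<String> partitioning;
    Table(List<String> schema, List<String> partitioning) {
        this.schema = schema;
        this.partitioning = partitioning;
    }
}

// Hypothetical file source; NOT Spark's real TableProvider API.
class FileSource {
    static int inferenceRuns = 0; // counts how often the expensive path runs

    boolean supportsExternalMetadata() { return true; }

    // Expensive fallback: a real file source would scan files here.
    private List<String> inferSchema() {
        inferenceRuns++;
        return List.of("a", "b");
    }

    // If external metadata is supplied, use it as-is; otherwise infer.
    Table getTable(Optional<List<String>> schema, List<String> partitioning) {
        return new Table(schema.orElseGet(this::inferSchema), partitioning);
    }
}

public class WriterSketch {
    // Writer-side logic mirroring the commented code path: pass the input
    // query's schema and the user-specified partitioning when supported.
    static Table writeTable(FileSource source,
                            List<String> querySchema,
                            List<String> partitioning) {
        if (source.supportsExternalMetadata()) {
            return source.getTable(Optional.of(querySchema), partitioning);
        }
        return source.getTable(Optional.empty(), partitioning);
    }

    public static void main(String[] args) {
        Table t = writeTable(new FileSource(), List.of("x"), List.of("p"));
        System.out.println(t.schema + " inferenceRuns=" + FileSource.inferenceRuns);
    }
}
```

Note the trade-off the comment also flags: because the supplied schema is trusted without inference, an incompatible query schema is not caught at this point.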
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]