[ https://issues.apache.org/jira/browse/SPARK-27669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Eric Liang updated SPARK-27669:
-------------------------------
    Summary: Refactor DataFrameWriter to resolve datasources in a command  (was: Refactor DataFrameWriter to always go through Catalyst for analysis)

> Refactor DataFrameWriter to resolve datasources in a command
> ------------------------------------------------------------
>
>                 Key: SPARK-27669
>                 URL: https://issues.apache.org/jira/browse/SPARK-27669
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.4.3
>            Reporter: Eric Liang
>            Priority: Major
>
> Currently, DataFrameWriter.save() does a large amount of ad-hoc analysis
> (e.g., loading data source classes, validating options, and so on) before
> executing the command.
> The execution of this code falls outside the scope of any SQL execution,
> which is unfortunate since it means it's untracked by Spark (e.g., in the
> Spark UI), and also means df.write ops cannot be manipulated by custom
> catalyst rules prior to execution.
> These issues can be largely resolved by creating a command that represents
> df.write.save/saveAsTable().
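
The idea in the description can be illustrated with a toy model. This is a minimal sketch in plain Python, not Spark code: `SaveDataFrameCommand`, `resolve_source`, `add_default_compression`, `execute`, and the `KNOWN_SOURCES` registry are all hypothetical names invented for illustration. It only shows the shape of the proposal: the write is first captured as an unresolved command, rules may rewrite that command, and data source resolution happens inside a tracked execution rather than before it.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, List

# Hypothetical stand-in for a plan node that represents df.write.save().
# It is NOT a real Spark class; it only carries what the user asked for.
@dataclass(frozen=True)
class SaveDataFrameCommand:
    source: str                 # unresolved data source name, e.g. "parquet"
    options: Dict[str, str]     # writer options as given by the user
    resolved: bool = False      # set once the source has been looked up

# A rule maps one command to another, like a (much simplified) plan rewrite.
Rule = Callable[[SaveDataFrameCommand], SaveDataFrameCommand]

# Toy registry standing in for data source class loading (assumption).
KNOWN_SOURCES = {"parquet", "json"}

def resolve_source(cmd: SaveDataFrameCommand) -> SaveDataFrameCommand:
    """Resolve the data source as a rule instead of ad hoc in save()."""
    if cmd.source not in KNOWN_SOURCES:
        raise ValueError(f"unknown data source: {cmd.source}")
    return replace(cmd, resolved=True)

def add_default_compression(cmd: SaveDataFrameCommand) -> SaveDataFrameCommand:
    """Example of a custom rule that rewrites the write before it runs."""
    opts = dict(cmd.options)
    opts.setdefault("compression", "snappy")
    return replace(cmd, options=opts)

def execute(cmd: SaveDataFrameCommand, rules: List[Rule],
            events: List[str]) -> SaveDataFrameCommand:
    """Apply rules and 'run' the write inside one tracked execution."""
    events.append("execution started")
    for rule in rules:
        cmd = rule(cmd)
    events.append(f"write to {cmd.source} with {sorted(cmd.options)}")
    events.append("execution finished")
    return cmd
```

Because resolution is just another rule in this sketch, it is recorded between the "started" and "finished" events, and a user-supplied rule such as `add_default_compression` can run before it.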