[ https://issues.apache.org/jira/browse/SPARK-18243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xiao Li updated SPARK-18243:
----------------------------
    Assignee: Wenchen Fan

> Converge the insert path of Hive tables with data source tables
> ---------------------------------------------------------------
>
>                 Key: SPARK-18243
>                 URL: https://issues.apache.org/jira/browse/SPARK-18243
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>            Reporter: Reynold Xin
>            Assignee: Wenchen Fan
>             Fix For: 2.2.0
>
>
> Inserting data into Hive tables has its own implementation, distinct from
> the data source path: InsertIntoHiveTable, SparkHiveWriterContainer and
> SparkHiveDynamicPartitionWriterContainer.
> I think it should be possible to unify these with the data source
> implementation, InsertIntoHadoopFsRelationCommand. We can start by
> implementing an OutputWriterFactory/OutputWriter that uses Hive's SerDes
> to write data.
> Note one other major difference: data source tables write directly to the
> final destination without using a staging directory, and Spark itself then
> adds the partitions/tables to the catalog. Hive tables instead write to a
> staging directory and then call the Hive metastore's
> loadPartition/loadTable functions to load that data in.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
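The OutputWriterFactory/OutputWriter approach the ticket proposes can be sketched as follows. This is a minimal, self-contained illustration only: the trait shapes mirror the spirit of Spark's data source write path, but every name and signature below (including the `Row` stand-in and the `serialize` function standing in for a Hive SerDe call) is a simplifying assumption, not Spark's actual internal API.

```scala
// Stand-in for a row of data being written (Spark uses InternalRow).
case class Row(values: Seq[Any])

// Factory: created on the driver, shipped to executors; opens one
// writer per output file (or per dynamic partition).
trait OutputWriterFactory extends Serializable {
  def newWriter(path: String): OutputWriter
}

trait OutputWriter {
  def write(row: Row): Unit
  def close(): Unit
}

// The ticket's idea: an OutputWriter that serializes rows through a
// Hive SerDe instead of a built-in data source format. The SerDe is
// represented by a plain Row => String function to keep the sketch
// self-contained; in Spark it would be a Hive Serializer plus the
// table's Hive RecordWriter.
class HiveSerDeOutputWriter(path: String, serialize: Row => String)
    extends OutputWriter {
  private val buffer = scala.collection.mutable.ArrayBuffer.empty[String]

  def write(row: Row): Unit = buffer += serialize(row)

  def close(): Unit = {
    // In Spark this would flush through Hive's RecordWriter to `path`
    // and commit the file; here we only report what would be written.
    println(s"$path: ${buffer.size} rows serialized")
  }
}
```

With a writer like this plugged into InsertIntoHadoopFsRelationCommand, the same command could drive both data source and Hive-format writes; the remaining gap is the commit protocol, since Hive tables go through a staging directory plus loadPartition/loadTable rather than writing to the final destination.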