[ 
https://issues.apache.org/jira/browse/SPARK-18185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16618949#comment-16618949
 ] 

Deepanker edited comment on SPARK-18185 at 9/18/18 11:40 AM:
-------------------------------------------------------------

What is the difference between this Jira and this one: 
https://issues.apache.org/jira/browse/SPARK-20236

I tested this out with Spark 2.2: it only works for external tables, not for 
managed tables in Hive. Any reason why that is?

With 2.3 we can enable/disable this behaviour via the 
_spark.sql.sources.partitionOverwriteMode_ property, whereas previously it was 
the default?

*Update:* I got it. SPARK-20236 provides a feature flag to override this 
behaviour via the above-mentioned property, whereas this Jira fixes the INSERT 
OVERWRITE behaviour overall.

Although this still doesn't work for Hive managed tables, only for external 
tables. Is this behaviour intentional (as in, an external table is treated as a 
datasource table managed via Hive, whereas a managed table isn't)?
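
To make the managed vs. external distinction concrete, a sketch of the two 
kinds of table (the DDL and names here are hypothetical, not from this Jira):

{code:sql}
-- Managed table: Hive owns the data, stored under its warehouse directory.
CREATE TABLE managed_sales (amount DOUBLE) PARTITIONED BY (dt STRING);

-- External table: Hive only tracks metadata; the data lives at the given path.
CREATE EXTERNAL TABLE external_sales (amount DOUBLE)
PARTITIONED BY (dt STRING)
LOCATION '/data/external_sales';
{code}

In the test described above, the per-partition overwrite only took effect for 
tables of the second kind.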



> Should fix INSERT OVERWRITE TABLE of Datasource tables with dynamic partitions
> ------------------------------------------------------------------------------
>
>                 Key: SPARK-18185
>                 URL: https://issues.apache.org/jira/browse/SPARK-18185
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>            Reporter: Eric Liang
>            Assignee: Eric Liang
>            Priority: Major
>             Fix For: 2.1.0
>
>
> As of current 2.1, INSERT OVERWRITE with dynamic partitions against a 
> Datasource table will overwrite the entire table instead of only the updated 
> partitions as in Hive. It also doesn't respect custom partition locations.
> We should delete only the proper partitions, scan the metastore for affected 
> partitions with custom locations, and ensure that deletes/writes go to the 
> right locations for those as well.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
