[jira] [Commented] (HIVE-25849) Disable insert overwrite for bucket partitioned Iceberg tables

2022-01-07 Thread Marton Bod (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17470700#comment-17470700
 ] 

Marton Bod commented on HIVE-25849:
---

Pushed to master. Thanks [~szita] and [~pvary] for checking it.

> Disable insert overwrite for bucket partitioned Iceberg tables
> --
>
> Key: HIVE-25849
> URL: https://issues.apache.org/jira/browse/HIVE-25849
> Project: Hive
> Issue Type: Improvement
> Reporter: Marton Bod
> Assignee: Marton Bod
> Priority: Major
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Insert overwrite should be disabled when the target Iceberg table is 
> bucket partitioned, because it is very hard for a user to predict which 
> existing partitions will be overwritten: that depends on the bucket hash 
> values calculated for the rows of the new dataset. It is safer to disable 
> this operation and avoid unwanted data loss.
> Note: this is the same approach that Impala follows.
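
To illustrate the concern in the description, here is a minimal Python sketch of how an overwrite that replaces only the touched partitions behaves on a bucketed table. It is a simulation, not Hive or Iceberg code: the bucket count, the key format, and the helper names (bucket_of, load, insert_overwrite) are all hypothetical, and CRC32 stands in for the 32-bit Murmur3 hash that Iceberg's bucket transform actually uses.

```python
import zlib

NUM_BUCKETS = 4  # hypothetical bucket count, chosen only for illustration


def bucket_of(key: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Map a key to a bucket. CRC32 is a stand-in hash (assumption);
    Iceberg's bucket transform uses 32-bit Murmur3 instead."""
    return zlib.crc32(key.encode("utf-8")) % num_buckets


def load(rows):
    """Group rows by bucket, mimicking a bucket partitioned table."""
    table = {}
    for key in rows:
        table.setdefault(bucket_of(key), []).append(key)
    return table


def insert_overwrite(table, new_rows):
    """Simulate an overwrite that replaces every bucket receiving at least
    one new row and leaves all other buckets untouched."""
    result = dict(table)
    result.update(load(new_rows))
    return result


if __name__ == "__main__":
    table = load([f"user-{i}" for i in range(20)])
    after = insert_overwrite(table, ["user-100"])
    hit = bucket_of("user-100")
    lost = [k for k in table.get(hit, []) if k not in after[hit]]
    print(f"bucket {hit} was replaced; existing rows dropped: {lost}")
```

Which bucket is replaced, and therefore which existing rows disappear, is decided entirely by the hash of the incoming keys. A user writing the statement has no practical way to know that in advance, which is the data-loss risk the issue describes.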



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (HIVE-25849) Disable insert overwrite for bucket partitioned Iceberg tables

2022-01-06 Thread Marton Bod (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17469972#comment-17469972
 ] 

Marton Bod commented on HIVE-25849:
---

PR: [https://github.com/apache/hive/pull/2856/]

 



