[jira] [Commented] (HIVE-28935) Iceberg's Minor Compaction replicates records

Dmitriy Fingerman (Jira) Tue, 29 Apr 2025 13:07:04 -0700


    [ https://issues.apache.org/jira/browse/HIVE-28935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17948272#comment-17948272 ]


Dmitriy Fingerman commented on HIVE-28935:
------------------------------------------

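Here are the partitions and their PARTITION__HASH values from the q-test. Three partitions, spread across both partition specs, share the same hash value, 961 (shown in bold):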

 
||Partition||Partition Spec ID||PARTITION__HASH||
|{"key_bucket":0,"key_bucket_8":null}|0|*961*|
|{"key_bucket":3,"key_bucket_8":null}|0|1054|
|{"key_bucket":null,"key_bucket_8":0}|1|*961*|
|{"key_bucket":null,"key_bucket_8":3}|1|964|
|{"key_bucket":null,"key_bucket_8":4}|1|965|
|{"key_bucket":null,"key_bucket_8":null}|1|*961*|
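
The values in the table are consistent with a hash computed over the partition tuple the way Java's Arrays.hashCode / Objects.hash does it (start at 1, multiply by 31, add each element's hash, with null contributing 0). That is an inference from the numbers only, not something confirmed against the Hive source. Under that assumption, a bucket value of 0 and a null are indistinguishable, which would explain why three partitions land on 961. A minimal Python sketch that emulates the Java formula (the function name java_style_hash is hypothetical):

```python
def java_style_hash(values):
    """Emulate java.util.Arrays.hashCode(Object[]) for ints and None (null).

    Assumption: PARTITION__HASH is derived this way; this is inferred from
    the table values, not taken from the Hive code.
    """
    h = 1
    for v in values:
        # In Java, a null element contributes 0 and an Integer hashes to its value.
        element_hash = 0 if v is None else v
        h = (31 * h + element_hash) & 0xFFFFFFFF
    # Convert the unsigned 32-bit result to Java's signed int range.
    return h - (1 << 32) if h >= (1 << 31) else h


# (key_bucket, key_bucket_8) tuples from the table above.
partitions = [
    (0, None),
    (3, None),
    (None, 0),
    (None, 3),
    (None, 4),
    (None, None),
]

hashes = {p: java_style_hash(p) for p in partitions}
for partition, value in hashes.items():
    print(partition, value)
```

With this formula, (0, null), (null, 0) and (null, null) all evaluate to 31 * 31 = 961, while the other three partitions get distinct values that match the table.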

> Iceberg's Minor Compaction replicates records
> ---------------------------------------------
>
>                 Key: HIVE-28935
>                 URL: https://issues.apache.org/jira/browse/HIVE-28935
>             Project: Hive
>          Issue Type: Bug
>          Components: Iceberg integration
>            Reporter: Shohei Okumiya
>            Assignee: Dmitriy Fingerman
>            Priority: Critical
>
> I observed that some records in the current snapshot were replicated after 
> running a minor compaction. The issue reproduces when a minor compaction is 
> combined with a bucket transform and partition evolution. I have not yet 
> identified which of these factors causes it.
> The commit linked below is a reproduction: the result set of `SELECT * FROM 
> default.srcbucket_big ORDER BY id` changes after the compaction.
> [https://github.com/okumin/hive/commit/3949afb1d4b96714571123c5087e89ff078200cc#r156028049]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
