Re: [PR] HIVE-28935: Iceberg: fix partition filtering condition in compaction query [hive]

via GitHub Sat, 10 May 2025 08:48:09 -0700


difin commented on code in PR #5792:
URL: https://github.com/apache/hive/pull/5792#discussion_r2083224234



##########
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/compaction/IcebergQueryCompactor.java:
##########
@@ -96,16 +106,46 @@ public boolean run(CompactorContext context) throws IOException, HiveException,
         throw new HiveException(ErrorMsg.COMPACTION_NO_PARTITION);
       }
     } else {
-      long partitionHash = IcebergTableUtil.getPartitionHash(icebergTable, partSpec);
+      Pair<Integer, StructProjection> partSpecPair =
+          IcebergTableUtil.getPartitionSpecIdAndStruct(icebergTable, partSpec);

Review Comment:
   > partition struct in the PARTITIONS table == partition struct in the FILES table
   
   Correct.
   
   > partition struct in the PARTITIONS table == partition struct in the FILES table, so I still don't understand why we can't directly look up the FILES table?
   
   We don't have values that can be used in SQL conditions on the partition struct.
   
   We do not look up the `partitions` table using SQL. We use the Iceberg API to get the raw values, which after transformation can be used in SQL conditions on the partition struct.
   
   > From the above example, what do we pass in partitionPath: event_src_trunc=BBB/event_time_month=2024-08? Based on that can't we construct a proper named_struct filter?
   
   No, we can't construct it from the human-readable values.
   In this example, `ci.partName` = `event_src_trunc=BBB/event_time_month=2024-08`,
   but the struct in the files metadata table is `named_struct("event_src_trunc","AAA",event_time_month,655)`.
   
   `event_time_month` is an int with a value of 655.
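
For illustration, the value 655 is consistent with counting whole months from 1970-01 to 2024-08, which is how the Iceberg spec describes the `month` transform. The following is only a minimal sketch using `java.time`; the class and method names are hypothetical, and this is not the code path used in the PR, which obtains the raw values through the Iceberg API.

```java
import java.time.YearMonth;
import java.time.temporal.ChronoUnit;

// Hypothetical helper for illustration only; not part of Hive or Iceberg.
class MonthOrdinalSketch {

  // Returns the number of whole months between 1970-01 and the given
  // "yyyy-MM" label, i.e. the integer form of a month partition value.
  static long monthsSinceEpoch(String yearMonth) {
    return ChronoUnit.MONTHS.between(YearMonth.of(1970, 1), YearMonth.parse(yearMonth));
  }

  public static void main(String[] args) {
    // (2024 - 1970) * 12 + (8 - 1) = 655
    System.out.println(monthsSinceEpoch("2024-08"));
  }
}
```

The human-readable `2024-08` in `ci.partName` and the stored int therefore differ in form, so a string comparison against the partition struct would not match.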
   


