Re: [PR] [lake/iceberg] Iceberg encoding strategy [fluss]

via GitHub Mon, 21 Jul 2025 10:17:41 -0700


MehulBatra commented on PR #1350:
URL: https://github.com/apache/fluss/pull/1350#issuecomment-3097654330


   > > Hi @luoyuxia, I tried to take inspiration from
   > > https://github.com/apache/iceberg/blob/main/api/src/main/java/org/apache/iceberg/types/Conversions.java
   > > for encoding and
   > > https://github.com/apache/iceberg/blob/main/api/src/main/java/org/apache/iceberg/types/Types.java
   > > for types.
   > 
   > Thanks for the explanation. I think it makes sense to take inspiration from
   > Iceberg's encoding. But keep in mind that we use it for a different purpose:
   > 
   > * Iceberg uses it to store the binary values of field statistics, right?
   > * We use it to distribute records. In Fluss's implementation, we first
   >   encode the bucket columns and then use the encoded value to apply the bucket
   >   partition transform, which should align with Iceberg. But Iceberg applies the
   >   [bucket partition transform](https://github.com/apache/iceberg/blob/6bbbbf6f6f9ba6dc8cb273f8dcd00be1b5dfc399/api/src/main/java/org/apache/iceberg/transforms/Bucket.java#L98)
   >   to the literal value. So we don't really care how the bucket columns are
   >   encoded into a binary value. We only need to be able to get the literal
   >   value back from the binary value (that is, decode it). That means we can use
   >   any encoding. But it's fine with me to follow how Iceberg encodes the values.
   
   -> Agreed, Iceberg does use these to store partition stats and min/max values
   for filtering.
   -> The end goal is to encode the values in an Iceberg-compatible format. Can
   you elaborate on this part?
   > We only need to be able to get the literal value back from the binary
   > value (that is, decode it). That means we can use any encoding. But it's
   > fine with me to follow how Iceberg encodes the values.
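
To make the point under discussion concrete, here is a minimal sketch, not code from this PR or from Iceberg. It assumes a single `long` bucket column, encodes it as 8 little-endian bytes (the layout Iceberg's `Conversions` uses for long values), decodes it back to the literal, and only then computes the bucket. The class `BucketSketch` and its methods are hypothetical names; `murmur3Long` is a hand-written 32-bit Murmur3 (seed 0) meant to follow the hashing the Iceberg spec describes for the bucket transform, and it has not been checked against Iceberg's output here.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration only; not the Fluss or Iceberg implementation.
final class BucketSketch {

    // Encode a long as 8 little-endian bytes.
    public static byte[] encodeLong(long value) {
        return ByteBuffer.allocate(Long.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putLong(value)
                .array();
    }

    // Decode the 8 little-endian bytes back into the literal value.
    public static long decodeLong(byte[] bytes) {
        return ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getLong();
    }

    // 32-bit Murmur3 (x86 variant, seed 0) over the two 32-bit halves of a long.
    public static int murmur3Long(long value) {
        int h1 = 0;
        h1 = mixH1(h1, mixK1((int) value));          // low 32 bits
        h1 = mixH1(h1, mixK1((int) (value >>> 32))); // high 32 bits
        h1 ^= Long.BYTES;                            // input length in bytes
        h1 ^= h1 >>> 16;
        h1 *= 0x85ebca6b;
        h1 ^= h1 >>> 13;
        h1 *= 0xc2b2ae35;
        h1 ^= h1 >>> 16;
        return h1;
    }

    private static int mixK1(int k1) {
        k1 *= 0xcc9e2d51;
        k1 = Integer.rotateLeft(k1, 15);
        k1 *= 0x1b873593;
        return k1;
    }

    private static int mixH1(int h1, int k1) {
        h1 ^= k1;
        h1 = Integer.rotateLeft(h1, 13);
        return h1 * 5 + 0xe6546b64;
    }

    // The bucket is derived from the literal value, not from the encoded bytes.
    public static int bucket(long literal, int numBuckets) {
        return (murmur3Long(literal) & Integer.MAX_VALUE) % numBuckets;
    }

    public static void main(String[] args) {
        long key = 34L;
        byte[] encoded = encodeLong(key);   // what would be stored as the encoded key
        long literal = decodeLong(encoded); // decoding recovers the literal
        System.out.println(bucket(literal, 16) == bucket(key, 16));
    }
}
```

Because `bucket` only ever sees the decoded literal, any reversible encoding would yield the same bucket assignment; following Iceberg's byte layout is a convenience rather than a requirement in this sketch.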


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
