MehulBatra commented on PR #1350: URL: https://github.com/apache/fluss/pull/1350#issuecomment-3097654330
> > Hi @luoyuxia
> > I tried to take inspiration from
> > https://github.com/apache/iceberg/blob/main/api/src/main/java/org/apache/iceberg/types/Conversions.java for encoding
> > https://github.com/apache/iceberg/blob/main/api/src/main/java/org/apache/iceberg/types/Types.java for types
>
> Thanks for the explanation. I think it makes sense to be inspired by Iceberg's encoding. But keep in mind that we use it for a different purpose:
>
> * Iceberg uses it to store the binary values of field statistics, right?
> * We use it to distribute records. In Fluss's implementation, it'll first encode the bucket columns, and then use the encoded value to do the bucket partition transform, which should align with Iceberg. But Iceberg uses the literal value to do the [bucket partition transform](https://github.com/apache/iceberg/blob/6bbbbf6f6f9ba6dc8cb273f8dcd00be1b5dfc399/api/src/main/java/org/apache/iceberg/transforms/Bucket.java#L98).
>
> So it means we don't really care about how the bucket columns are encoded into a binary value. We only need to get the literal value back from the binary value (this will be called decoding the binary value). That means we can use any encoding. But it's fine with me to follow how Iceberg encodes the values.

-> Agreed, Iceberg does use these to store partition stats and min/max values for filtering.
-> The end goal is to encode the values in an Iceberg-compatible format. Can you add more on this?

> We only need to get the literal value back from the binary value (this will be called decoding the binary value). That means we can use any encoding. But it's fine with me to follow how Iceberg encodes the values.
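To make the bucket transform point concrete, here is a minimal Java sketch of how the Iceberg spec describes bucketing a literal value: hash the value with 32-bit Murmur3 (x86 variant, seed 0), then take `(hash & Integer.MAX_VALUE) % N`. Per the spec, `int` and `long` are hashed as 8-byte little-endian longs and strings as their UTF-8 bytes. The class and method names (`IcebergBucketSketch`, `hashLong`, `bucket`, and so on) are hypothetical and are not taken from Fluss or Iceberg, and the Murmur3 routine is a hand-written stand-in for the library hash Iceberg actually uses, so treat this as an illustration only.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Fluss or Iceberg.
class IcebergBucketSketch {

    // Plain 32-bit Murmur3 (x86 variant) over a byte array.
    static int murmur3_32(byte[] data, int seed) {
        final int c1 = 0xcc9e2d51;
        final int c2 = 0x1b873593;
        int h1 = seed;
        int len = data.length;
        int roundedEnd = len & 0xfffffffc; // largest multiple of 4 <= len

        // Body: mix in 4-byte little-endian blocks.
        for (int i = 0; i < roundedEnd; i += 4) {
            int k1 = (data[i] & 0xff)
                    | ((data[i + 1] & 0xff) << 8)
                    | ((data[i + 2] & 0xff) << 16)
                    | ((data[i + 3] & 0xff) << 24);
            k1 *= c1;
            k1 = Integer.rotateLeft(k1, 15);
            k1 *= c2;
            h1 ^= k1;
            h1 = Integer.rotateLeft(h1, 13);
            h1 = h1 * 5 + 0xe6546b64;
        }

        // Tail: mix in the remaining 1 to 3 bytes, if any.
        int tail = len & 3;
        int k1 = 0;
        if (tail >= 3) k1 ^= (data[roundedEnd + 2] & 0xff) << 16;
        if (tail >= 2) k1 ^= (data[roundedEnd + 1] & 0xff) << 8;
        if (tail >= 1) {
            k1 ^= (data[roundedEnd] & 0xff);
            k1 *= c1;
            k1 = Integer.rotateLeft(k1, 15);
            k1 *= c2;
            h1 ^= k1;
        }

        // Finalization (avalanche).
        h1 ^= len;
        h1 ^= h1 >>> 16;
        h1 *= 0x85ebca6b;
        h1 ^= h1 >>> 13;
        h1 *= 0xc2b2ae35;
        h1 ^= h1 >>> 16;
        return h1;
    }

    // long literal: hash its 8-byte little-endian representation.
    static int hashLong(long value) {
        byte[] bytes = ByteBuffer.allocate(8)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putLong(value)
                .array();
        return murmur3_32(bytes, 0);
    }

    // int literal: widened to long first, so int and long hash identically.
    static int hashInt(int value) {
        return hashLong((long) value);
    }

    // string literal: hash its UTF-8 bytes.
    static int hashString(String value) {
        return murmur3_32(value.getBytes(StandardCharsets.UTF_8), 0);
    }

    // Clear the sign bit, then take the remainder to get a bucket in [0, numBuckets).
    static int bucket(int hash, int numBuckets) {
        return (hash & Integer.MAX_VALUE) % numBuckets;
    }

    public static void main(String[] args) {
        System.out.println(bucket(hashInt(34), 16));
        System.out.println(bucket(hashString("iceberg"), 16));
    }
}
```

The hash input here is the literal value, not whatever binary form the bucket columns were stored in. That is why any encoding works on the Fluss side as long as decoding returns the same literal before the bucket is computed. The hash helpers can be checked against the hash test values listed in the Iceberg spec.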
