danny0405 commented on PR #12050: URL: https://github.com/apache/hudi/pull/12050#issuecomment-2401029138
> To calculate tight bound, we look at the colstats partition for the uncompacted or unclustered files and then merge the colstats with that of the compacted or clustered files.

Are you saying that, instead of using the native min_max range for columns in files generated by compaction and clustering, we recompute the column stats ranges from the source files? For example, if we have f1 with range [v1, v2] and f2 with range [v3, v4], then instead of using [v1, v4] as the range of the compacted file, we would still use the composition of [v1, v2] and [v3, v4]?
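
To make the f1/f2 example concrete, here is a minimal Python sketch of the two alternatives being asked about. It is not Hudi code: `ColumnRange`, `merge_ranges`, `envelope_may_contain`, and `composition_may_contain` are hypothetical names, the integers are made-up stand-ins for v1..v4, and it assumes v1 <= v2 < v3 <= v4 so that the merged range is [v1, v4].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnRange:
    """Hypothetical min/max stats for one column in one file."""
    min_value: int
    max_value: int


def merge_ranges(ranges):
    """Collapse several ranges into a single envelope [min of mins, max of maxes]."""
    return ColumnRange(
        min(r.min_value for r in ranges),
        max(r.max_value for r in ranges),
    )


def envelope_may_contain(envelope, value):
    """Pruning check against a single merged range."""
    return envelope.min_value <= value <= envelope.max_value


def composition_may_contain(ranges, value):
    """Pruning check against the individual source ranges kept side by side."""
    return any(r.min_value <= value <= r.max_value for r in ranges)


# Made-up values standing in for f1 = [v1, v2] and f2 = [v3, v4].
f1 = ColumnRange(1, 10)
f2 = ColumnRange(50, 60)

# Option A: one merged range for the compacted file, i.e. [v1, v4].
merged = merge_ranges([f1, f2])

# Option B: keep the composition of [v1, v2] and [v3, v4].
probe = 30  # falls in the gap between the two source ranges
envelope_hit = envelope_may_contain(merged, probe)
composition_hit = composition_may_contain([f1, f2], probe)

print(merged, envelope_hit, composition_hit)
```

With these values the merged range cannot rule out `probe = 30`, while the composition can, because 30 lies in the gap between the two source ranges. Whether the column stats index can actually store or evaluate such a composition is exactly the open question here; the sketch only illustrates the difference between the two options.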
