Re: [PR] [HUDI-8208] Fix partition stats bound when compacting or clustering [hudi]

via GitHub Tue, 08 Oct 2024 19:35:24 -0700


danny0405 commented on PR #12050:
URL: https://github.com/apache/hudi/pull/12050#issuecomment-2401029138


   > To calculate tight bound, we look at the colstats partition for the
   > uncompacted or unclustered files and then merge the colstats with that of the
   > compacted or clustered files.
   
   Are you saying that, instead of using the native min/max range of each column
   in the files generated by compaction and clustering, we recompute the column
   stats ranges from the source files? For example, if we have f1 with range
   [v1, v2] and f2 with range [v3, v4], then instead of using [v1, v4] as the range
   of the compacted file, we still use the composition of [v1, v2] and [v3, v4]?
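
To make the two options in the question concrete, here is a minimal sketch in Python. It is not Hudi code: the helper names `enclosing_range` and `may_contain` and the sample values are hypothetical, and it assumes v1 <= v2 < v3 <= v4 so that the two source ranges do not overlap.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # (min, max) of one column in one file


def enclosing_range(ranges: List[Range]) -> Range:
    """Collapse several ranges into the single [min, max] that covers them all."""
    return (min(lo for lo, _ in ranges), max(hi for _, hi in ranges))


def may_contain(ranges: List[Range], value: int) -> bool:
    """Return True if any of the given ranges could hold the value."""
    return any(lo <= value <= hi for lo, hi in ranges)


# Hypothetical sample values standing in for [v1, v2] and [v3, v4].
f1: Range = (1, 5)
f2: Range = (10, 20)

# Option A: one range for the compacted file, i.e. [v1, v4].
merged = enclosing_range([f1, f2])

# Option B: keep the composition of the two source ranges.
composed = [f1, f2]
```

Under these assumptions a lookup for a value in the gap between the two source ranges (7 here) cannot be pruned with the single enclosing range, but can be pruned with the composition. Whether the PR actually stores the composition is exactly what the question above asks; the sketch only illustrates the difference.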


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
