mapleFU commented on PR #8257:
URL: https://github.com/apache/arrow-rs/pull/8257#issuecomment-3266991890

   Generally, data at different levels would have different distributions, 
and, as a query optimizer also finds, data changes (such as frequent inserts 
or insert overwrite) might require re-sampling the data. So I think this 
runtime config would be different from the others.
   
   Also, z-order clustering or other clustering might change the 
distribution score. So currently I think a user config could set its own 
score, maybe a different score for just-ingested data (which might need fast 
writes) or well-clustered data (which might need good compression). 10% is a 
good intuition, but it's hard to show that it's the right value. When 
compressed size > uncompressed size, compression is definitely worse.
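
   A minimal sketch of what such a user-set score could look like. All names 
here (`DataProfile`, `CompressionPolicy`, `min_savings`) and the 20% / 5% 
values are hypothetical, chosen only for illustration; none of this is the 
existing arrow-rs or parquet API.

```rust
/// Hypothetical data profiles; not part of the arrow-rs API.
#[derive(Clone, Copy, Debug, PartialEq)]
enum DataProfile {
    /// Freshly ingested data, where write speed matters most.
    JustIngested,
    /// Clustered (e.g. z-ordered) data, where compressed size matters most.
    WellClustered,
}

/// Hypothetical user-configurable policy: the minimum fraction of bytes
/// that compression must save for the compressed form to be kept.
#[derive(Clone, Copy, Debug)]
struct CompressionPolicy {
    min_savings: f64,
}

impl CompressionPolicy {
    /// Illustrative defaults only; a real config would let the user set these.
    fn for_profile(profile: DataProfile) -> Self {
        match profile {
            DataProfile::JustIngested => Self { min_savings: 0.20 },
            DataProfile::WellClustered => Self { min_savings: 0.05 },
        }
    }

    /// Returns true if the compressed form should be kept.
    fn keep_compressed(&self, uncompressed_len: usize, compressed_len: usize) -> bool {
        // Compressed output that is not smaller is never worth keeping,
        // whatever the configured score is.
        if uncompressed_len == 0 || compressed_len >= uncompressed_len {
            return false;
        }
        let saved = (uncompressed_len - compressed_len) as f64 / uncompressed_len as f64;
        saved >= self.min_savings
    }
}
```

   With these assumed values, a page that shrinks from 1000 to 900 bytes 
(10% saved) would stay uncompressed under the `JustIngested` policy but be 
kept compressed under `WellClustered`, and a page that grows is rejected by 
both.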

