harshkumar-2005 commented on issue #47752:
URL: https://github.com/apache/arrow/issues/47752#issuecomment-3438786398
Hi 👋
While reviewing this section of the code, I noticed that a page is currently
marked as compressed as soon as the compressed size is smaller than the
uncompressed one:
```
if (compressor_temp_buffer_->size() < values->size()) {
  page_is_compressed = true;
}
```
This means even a very small gain (e.g., 1%) still causes the page to be
written in compressed form, so readers pay the decompression cost for almost
no space savings.
Would it make sense to introduce a configurable threshold (for example,
require at least a 5–10% reduction before setting page_is_compressed = true)?
If so, what ratio or configuration mechanism would you prefer: a fixed
constant, a build option, or a runtime setting?
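
To make the proposal concrete, here is a minimal sketch of what a threshold
check could look like. `ShouldStoreCompressed` and its `min_reduction`
parameter are hypothetical names used only for illustration; they are not part
of the existing Arrow/Parquet code, and the real change would need to fit
however the writer is configured.

```cpp
#include <cstdint>

// Hypothetical helper (not an existing Arrow API): decides whether a page
// should be written in compressed form.
// min_reduction is the required fractional size reduction, e.g. 0.05 for 5%.
// With min_reduction == 0.0 this matches the current behaviour, where any
// strictly smaller compressed size is accepted.
inline bool ShouldStoreCompressed(int64_t uncompressed_size,
                                  int64_t compressed_size,
                                  double min_reduction) {
  if (uncompressed_size <= 0) return false;  // empty page: nothing to gain
  if (compressed_size >= uncompressed_size) return false;  // no gain at all
  const double reduction =
      1.0 - static_cast<double>(compressed_size) /
                static_cast<double>(uncompressed_size);
  return reduction >= min_reduction;
}
```

Under this sketch, the existing condition could become something like
`ShouldStoreCompressed(values->size(), compressor_temp_buffer_->size(), threshold)`,
where `threshold` comes from whichever mechanism (constant, build option, or
runtime setting) is preferred.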