HaoYang670 commented on code in PR #1905:
URL: https://github.com/apache/arrow-rs/pull/1905#discussion_r901063650
##########
parquet/src/encodings/encoding.rs:
##########
@@ -307,12 +307,10 @@ impl<T: DataType> DictEncoder<T> {
#[inline]
fn bit_width(&self) -> u8 {
let num_entries = self.uniques.len();
- if num_entries == 0 {
- 0
- } else if num_entries == 1 {
- 1
+ if num_entries <= 1 {
+ num_entries as u8
} else {
- log2(num_entries as u64) as u8
+ num_required_bits(num_entries as u64 - 1)
Review Comment:
This works because the existing `log2(n)` returns the ceiling of the base-2 logarithm, which for `n >= 2` is always equal to `num_required_bits(n - 1)`.
For example,
`log2(8)` = 3 = `num_required_bits(7)`.
You can see this in the `log2` function:
```rust
pub fn log2(mut x: u64) -> i32 {
    if x == 1 {
        return 0;
    }
    x -= 1;
    let mut result = 0;
    while x > 0 {
        x >>= 1;
        result += 1;
    }
    result
}
```
I just moved the `x -= 1;` out of the function and into the call site.
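A quick way to sanity-check the equivalence is the sketch below. It copies `log2` from the snippet above and uses an assumed stand-in for `num_required_bits` (defined here as the position of the highest set bit, with 0 for an input of 0), which may differ from the crate's actual implementation. The helper `identity_holds_up_to` is a hypothetical name introduced only for this check.

```rust
/// Ceiling of log2(x), copied from the function quoted above.
pub fn log2(mut x: u64) -> i32 {
    if x == 1 {
        return 0;
    }
    x -= 1;
    let mut result = 0;
    while x > 0 {
        x >>= 1;
        result += 1;
    }
    result
}

/// Assumed stand-in for `num_required_bits`: the number of bits needed
/// to represent `x`, i.e. the index of the highest set bit plus one
/// (0 when `x` is 0). Not copied from the crate.
pub fn num_required_bits(x: u64) -> u8 {
    (64 - x.leading_zeros()) as u8
}

/// Hypothetical helper: returns true if both expressions agree for
/// every n in 2..=limit.
pub fn identity_holds_up_to(limit: u64) -> bool {
    (2..=limit).all(|n| log2(n) as u8 == num_required_bits(n - 1))
}
```

Under these assumptions, `identity_holds_up_to(1 << 16)` should return `true`. The range starts at 2 because the change handles `num_entries <= 1` in a separate branch, and `log2(0)` would underflow at `x -= 1`.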