etseidl commented on code in PR #37940:
URL: https://github.com/apache/arrow/pull/37940#discussion_r1341515246


##########
cpp/src/parquet/encoding.cc:
##########
@@ -2250,17 +2269,17 @@ void DeltaBitPackEncoder<DType>::FlushBlock() {
         std::min(values_per_mini_block_, values_current_block_);
 
     const uint32_t start = i * values_per_mini_block_;
-    const UT max_delta = *std::max_element(
+    const T max_delta = *std::max_element(
         deltas_.begin() + start, deltas_.begin() + start + values_current_mini_block);
 
     // The minimum number of bits required to write any of values in deltas_ vector.
     // See overflow comment above.
-    const auto bit_width = bit_width_data[i] =
-        bit_util::NumRequiredBits(max_delta - min_delta);
+    const auto bit_width = bit_width_data[i] = bit_util::NumRequiredBits(
+        static_cast<UT>(SafeSignedSubtractSigned(max_delta, min_delta)));

Review Comment:
   Yes, you are correct. If everything is left unsigned, we have `(unsigned)(0 - 1) == 0xffffffffU`. Then `min_delta == 1` and `max_delta == 0xffffffffU`, so `max_delta - min_delta == 0xfffffffeU`, which requires the full 32 bits to encode.
   
   The end result is still correct, but it uses more space than it needs to.
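
To make the arithmetic above concrete, here is a minimal standalone sketch. It assumes a mini-block whose only deltas are `1` and `-1`, which matches the values quoted above. The names `NumRequiredBitsSketch`, `WidthWithUnsignedCompare`, and `WidthWithSignedCompare` are hypothetical helpers written for this illustration; they are not the Arrow implementation and only mimic what `bit_util::NumRequiredBits` is expected to return.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for bit_util::NumRequiredBits: the number of bits
// needed to represent x (0 needs 0 bits, 1 needs 1 bit, 2 needs 2 bits, ...).
inline int NumRequiredBitsSketch(uint32_t x) {
  int bits = 0;
  while (x != 0) {
    ++bits;
    x >>= 1;
  }
  return bits;
}

// Min/max chosen by comparing the deltas as unsigned values.
// Precondition: deltas is non-empty.
inline int WidthWithUnsignedCompare(const std::vector<int32_t>& deltas) {
  std::vector<uint32_t> as_unsigned;
  as_unsigned.reserve(deltas.size());
  for (int32_t d : deltas) {
    as_unsigned.push_back(static_cast<uint32_t>(d));  // -1 becomes 0xffffffffU
  }
  const auto mm = std::minmax_element(as_unsigned.begin(), as_unsigned.end());
  return NumRequiredBitsSketch(*mm.second - *mm.first);
}

// Min/max chosen by comparing the deltas as signed values. The subtraction is
// still done in unsigned arithmetic so it cannot overflow; because max >= min,
// the wrapped result equals the true difference.
// Precondition: deltas is non-empty.
inline int WidthWithSignedCompare(const std::vector<int32_t>& deltas) {
  const auto mm = std::minmax_element(deltas.begin(), deltas.end());
  return NumRequiredBitsSketch(static_cast<uint32_t>(*mm.second) -
                               static_cast<uint32_t>(*mm.first));
}
```

Under these assumptions, `WidthWithUnsignedCompare({1, -1})` gives 32 because the range is `0xfffffffeU`, while `WidthWithSignedCompare({1, -1})` gives 2 because the range is only 2. Both widths can represent the deltas relative to their respective minimum, which is why the unsigned variant is correct but wasteful.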


