vertexclique commented on a change in pull request #7061:
URL: https://github.com/apache/arrow/pull/7061#discussion_r418999354



##########
File path: rust/arrow/src/util/bit_util.rs
##########
@@ -148,11 +148,17 @@ pub fn count_set_bits_offset(data: &[u8], offset: usize, length: usize) -> usize
 /// Returns the ceil of `value`/`divisor`
 #[inline]
 pub fn ceil(value: usize, divisor: usize) -> usize {
-    let mut result = value / divisor;
-    if value % divisor != 0 {
-        result += 1
-    };
-    result
+    if value == 0_usize {

Review comment:
       Oh, it is. While looking for zero-sized allocations I came across 
this, in the following chunk of code:
   ```
   impl BufferBuilderTrait<BooleanType> for BufferBuilder<BooleanType> {
       fn new(capacity: usize) -> Self {
           let byte_capacity = bit_util::ceil(capacity, 8);
           let actual_capacity = bit_util::round_upto_multiple_of_64(byte_capacity);
           let mut buffer = MutableBuffer::new(actual_capacity);
           buffer.set_null_bits(0, actual_capacity);
           Self {
               buffer,
               len: 0,
               _marker: PhantomData,
           }
       }
   ```
   
   `BufferBuilderTrait` runs this code on every reallocation. The current 
`ceil` is not Euclidean or C-style division in the sense of this paper: 
https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/divmodnote-letter.pdf
 So I thought it better to use the established C-style division in this case, 
which also improved things on that side.
   
   A separate PR should be opened to fix this in Parquet too.
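   
   To make the comparison concrete, here is a minimal sketch, not the code 
from this PR. `ceil_with_remainder` mirrors the removed lines in the diff. 
`ceil_single_division` is a hypothetical variant that starts with the same 
zero check as the added line and then uses a single C-style division; the 
rest of its body is an assumption. `round_up_to_multiple_of_64` is a 
hypothetical stand-in for `bit_util::round_upto_multiple_of_64`, assumed to 
round up to the next multiple of 64.
   ```rust
   /// Mirrors the removed lines: divide, then add one if there is a remainder.
   pub fn ceil_with_remainder(value: usize, divisor: usize) -> usize {
       let mut result = value / divisor;
       if value % divisor != 0 {
           result += 1;
       }
       result
   }

   /// Hypothetical variant: explicit zero check, then one truncating division.
   /// The zero check keeps `value - 1` from underflowing.
   pub fn ceil_single_division(value: usize, divisor: usize) -> usize {
       if value == 0 {
           0
       } else {
           1 + (value - 1) / divisor
       }
   }

   /// Hypothetical stand-in for `bit_util::round_upto_multiple_of_64`.
   /// Assumes `num` is small enough that `num + 63` does not overflow.
   pub fn round_up_to_multiple_of_64(num: usize) -> usize {
       (num + 63) & !63
   }
   ```
   
   Under these assumptions both `ceil` variants agree for every non-zero 
divisor, and a `capacity` of 0 gives a byte capacity of 0 that stays 0 after 
rounding, which is the zero-sized allocation case mentioned above.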





