jhorstmann commented on pull request #8598:
URL: https://github.com/apache/arrow/pull/8598#issuecomment-723326224


   When I introduced this initially in [ARROW-10040][1], one piece of feedback 
was that big-endian was not supported yet anyway, so it would not be necessary 
to worry about that now. I think it could be made to work rather easily by 
calling `to_le` in 2-3 places, if I had access to a big-endian test machine or 
CI pipeline.
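
To illustrate the endianness point, here is a minimal sketch, not the code from this PR. It assumes the bitmap is stored least-significant-bit first within each byte; `load_chunk` and `get_bit` are hypothetical helper names. `u64::from_le_bytes` gives the same result as a native-endian read followed by `to_le`, so bit `i` of the loaded word matches bit `i` of the bitmap on both little- and big-endian hosts.

```rust
use std::convert::TryInto;

/// Hypothetical helper: load the first 8 bitmap bytes as one u64 so that
/// bit `i` of the result is bit `i` of the bitmap on any host endianness.
fn load_chunk(bytes: &[u8]) -> u64 {
    let arr: [u8; 8] = bytes[..8].try_into().expect("need at least 8 bytes");
    // Interpret the bytes as little-endian regardless of the host byte order.
    u64::from_le_bytes(arr)
}

/// Reference implementation: read a single bit directly from the bytes.
fn get_bit(bytes: &[u8], i: usize) -> bool {
    (bytes[i / 8] >> (i % 8)) & 1 == 1
}

fn main() {
    // Bits 0, 2 and 63 are set.
    let bitmap = [0b0000_0101u8, 0, 0, 0, 0, 0, 0, 0b1000_0000];
    let chunk = load_chunk(&bitmap);
    for i in 0..64 {
        // The chunked view and the per-bit view must agree.
        assert_eq!((chunk >> i) & 1 == 1, get_bit(&bitmap, i));
    }
}
```

A plain native-endian load would only satisfy this check on little-endian machines, which is why the conversion is needed in the few places where bytes are reinterpreted as wider integers.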
   
   Adding a dependency that already implements the chunking and remainder logic 
is nice. I would have expected that to reduce the code size though.
   
   The `buffer_bit_ops` microbenchmark seems to be affected quite a bit:
   ```
   buffer_bit_ops and      time:   [1.1393 us 1.1413 us 1.1433 us]
                           change: [+889.05% +892.72% +896.41%] (p = 0.00 < 0.05)
                           Performance has regressed.
   ```
   
   The sum aggregation kernel is another heavy user of the bit slice functions 
and also regressed, by about 25%:
   ```
   sum nulls 512           time:   [305.83 ns 306.31 ns 306.82 ns]
                           change: [+25.194% +25.552% +25.936%] (p = 0.00 < 0.05)
                           Performance has regressed.
   ```
   
   Most benchmarks don't seem to be affected much, probably because there is 
some other overhead or they are not using the chunked functions. Cast kernels 
for example are implemented using iterators of optional values and so use a 
different code path.
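
The difference between the two code paths can be sketched roughly as follows. This is only an illustration under simplifying assumptions, not the actual kernels: the validity bitmap is modelled as a slice of `u64` words with one bit per value, and `sum_options` and `sum_chunked` are hypothetical names.

```rust
/// Iterator-style path: each element is an `Option`, so the validity bitmap
/// is never read in 64-bit chunks.
fn sum_options(values: &[Option<i64>]) -> i64 {
    values.iter().flatten().sum()
}

/// Chunked path: one 64-bit validity word covers 64 values, so it depends
/// directly on how fast those words can be produced.
fn sum_chunked(values: &[i64], validity: &[u64]) -> i64 {
    let mut total = 0i64;
    for (chunk, &mask) in values.chunks(64).zip(validity) {
        for (i, &v) in chunk.iter().enumerate() {
            // Bit `i` of the mask says whether value `i` of this chunk is valid.
            if (mask >> i) & 1 == 1 {
                total += v;
            }
        }
    }
    total
}

fn main() {
    let values: Vec<i64> = (0..100).collect();
    // Mark every even index as valid (bit pattern 0101... in each word).
    let validity = [0x5555_5555_5555_5555u64; 2];
    let options: Vec<Option<i64>> = values
        .iter()
        .map(|&v| if v % 2 == 0 { Some(v) } else { None })
        .collect();
    assert_eq!(sum_chunked(&values, &validity), sum_options(&options));
}
```

Only the second shape goes through the chunked bit slice functions, which would be consistent with the cast benchmarks staying flat while `sum nulls` regressed.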
   
    [1]: https://github.com/apache/arrow/pull/8262

