[GitHub] [arrow] jhorstmann commented on pull request #8598: ARROW-10500: [Rust] Refactor bit slice, bit view iterator for array buffers

GitBox Wed, 11 Nov 2020 07:11:30 -0800


jhorstmann commented on pull request #8598:
URL: https://github.com/apache/arrow/pull/8598#issuecomment-725477741



   > > I think we should address jhorstmann's measurements of performance 
regressions before this PR is merged.
   > 
   > I measured the performance. :upside_down_face: It is in the PR description.
   
   That's exactly where I took the benchmark results from. But yes, the 
regression in `buffer_bit_ops and` does not seem to have any big effect.
   
   I had one other comment about the separate testcases for big-endian 
architectures, or restricting tests to little-endian, that was not yet 
addressed:
   
   > I'm wondering whether this is really correct. The way I understood it, 
little/big endian only affects the layout of bytes in memory, not how 
individual bits are accessed in a number. In this test case the least 
significant bit of the first byte is zero and would be considered the first 
value if this were a boolean array or null bitmap. The same holds for the 4th 
least significant bit, which is where the slice here should start. This means 
the least significant bit of the chunk should be zero.
   
   Consider the following buffer of u8, used as bit-packed data, with the 
indices of bytes and bits written below
   
   ```
   00000000 00010000
          0        1
   76543210 76543210
   ```
   
   To get the value of the bit at index 12 (counting from zero) we would check 
bit (12%8) of byte (12/8). Viewing this as a larger type (u16 for 
simplification):
   
   ```
   0001000000000000
                  0
   111111
   5432109876543210
   ```
   
   To check the same bit we would need to check bit (12%16) of word (12/16). So 
the value as u16 would be 4096 and this should be independent of the 
machine-endianness. Endianness only influences how the u16 would be stored in 
memory, but our underlying data consists of u8 in memory.
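
The argument above can be checked with a small sketch. The helper names `get_bit_u8` and `get_bit_u16` are hypothetical and not taken from the PR, and `u16::from_le_bytes` is only one way to build the wider word from the underlying bytes; the PR itself may construct its chunks differently. Because the byte order is stated explicitly, the asserted values should hold on both little-endian and big-endian machines.

```rust
/// Reads bit `i` from a bit-packed byte buffer, least significant bit first.
/// (Hypothetical helper, for illustration only.)
fn get_bit_u8(bytes: &[u8], i: usize) -> bool {
    (bytes[i / 8] >> (i % 8)) & 1 == 1
}

/// Reads the same bit by viewing the buffer as 16-bit words.
/// `u16::from_le_bytes` fixes the byte order explicitly, so the result
/// does not depend on the endianness of the machine running the code.
/// (Hypothetical helper, for illustration only.)
fn get_bit_u16(bytes: &[u8], i: usize) -> bool {
    let start = (i / 16) * 2;
    let word = u16::from_le_bytes([bytes[start], bytes[start + 1]]);
    (word >> (i % 16)) & 1 == 1
}

fn main() {
    // The buffer from the diagram: byte 0 is all zeros, byte 1 has bit 4 set.
    let buffer = [0b0000_0000u8, 0b0001_0000u8];

    // Viewed as one 16-bit word, only bit 12 is set.
    let word = u16::from_le_bytes([buffer[0], buffer[1]]);
    assert_eq!(word, 4096);

    // Both views agree on every bit index, including index 12.
    assert!(get_bit_u8(&buffer, 12));
    assert!(get_bit_u16(&buffer, 12));
    for i in 0..16 {
        assert_eq!(get_bit_u8(&buffer, i), get_bit_u16(&buffer, i));
    }

    println!("word = {}", word);
}
```

Under these assumptions a test that expects 4096 for this buffer would not need a separate big-endian variant.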
   
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
