Hi Wes,

I am interested in this. In PR [1] we are exposing BitmapWordReader/Writer [2]
for external use, which may help with the 'batch-at-a-time' scenario.

[1] https://github.com/apache/arrow/pull/10487
[2] https://github.com/apache/arrow/blob/bcce18e5d4d83f0831de71b363ad91470376084c/cpp/src/arrow/util/bitmap_reader.h#L149-L231

On Wed, Jun 23, 2021 at 11:21 AM Wes McKinney <wesmck...@gmail.com> wrote:

> One project I was interested in getting to but haven't had the time
> was introducing branch-free code into vector_selection.cc and reducing
> the use of if-statements to try to improve performance.
>
> One way to do this is to take code that looks like this:
>
> if (BitUtil::GetBit(filter_data_, filter_offset_ + in_position)) {
>   BitUtil::SetBit(out_is_valid_, out_offset_ + out_position_);
>   out_data_[out_position_++] = values_data_[in_position];
> }
> ++in_position;
>
> and change it to a branch-free version
>
> bool advance = BitUtil::GetBit(filter_data_, filter_offset_ + in_position);
> BitUtil::SetBitTo(out_is_valid_, out_offset_ + out_position_, advance);
> out_data_[out_position_] = values_data_[in_position];
> out_position_ += advance; // may need static_cast<int> here
> ++in_position;
>
> Since more people are working on kernels and computing now, I thought
> this might be an interesting project for someone to explore and see
> what improvements are possible (and what the differences between e.g.
> x86 and ARM architecture are like when it comes to reducing
> branching). Another thing to look at might be batch-at-a-time
> bitpacking in the output bitmap versus bit-at-a-time.
>


-- 
Niranda Perera
https://niranda.dev/
@n1r44 <https://twitter.com/N1R44>
