js8544 opened a new pull request, #14535:
URL: https://github.com/apache/arrow/pull/14535

   The current Filter implementation always drops the values that are filtered 
out. Some use cases require the output array to have the same length as the 
input array, so this PR adds a new option, FilterOptions::KEEP_NULL, which 
keeps the filtered-out values as nulls.
   
   For example, with input [1, 2, 3] and filter [true, false, true], the 
current implementation outputs [1, 3], while the new option outputs 
[1, null, 3].
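
   The two behaviors can be sketched on plain Python lists. This is only an 
illustration of the semantics described above, not the Arrow implementation: 
the helper names filter_drop and filter_keep_null are hypothetical, None stands 
in for null, and the filter is assumed to contain no nulls.

```python
from typing import List, Optional


def filter_drop(values: List[int], mask: List[bool]) -> List[int]:
    """Current behavior: filtered-out values are removed from the output."""
    return [v for v, keep in zip(values, mask) if keep]


def filter_keep_null(values: List[int], mask: List[bool]) -> List[Optional[int]]:
    """KEEP_NULL-style behavior (hypothetical helper): filtered-out values
    become None, so the output has the same length as the input."""
    return [v if keep else None for v, keep in zip(values, mask)]


values = [1, 2, 3]
mask = [True, False, True]

dropped = filter_drop(values, mask)
kept = filter_keep_null(values, mask)

print(dropped)  # [1, 3]
print(kept)     # [1, None, 3]
```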
   
   This option is simpler to implement, since we only need to construct a new 
validity bitmap and can reuse the input buffers and child arrays. The exception 
is dense union arrays, which don't have validity bitmaps.
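
   A minimal sketch of the validity-bitmap idea, assuming the new bitmap is the 
bitwise AND of the input validity bitmap and the filter, and that the filter 
itself has no nulls. The helpers pack_bits, get_bit, and keep_null_validity are 
hypothetical names for illustration, not functions from this PR, and the dense 
union case is not covered.

```python
from typing import List


def pack_bits(bits: List[bool]) -> bytes:
    """Pack booleans into a least-significant-bit-first bitmap."""
    out = bytearray((len(bits) + 7) // 8)
    for i, bit in enumerate(bits):
        if bit:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)


def get_bit(bitmap: bytes, i: int) -> bool:
    """Read bit i from a least-significant-bit-first bitmap."""
    return bool((bitmap[i // 8] >> (i % 8)) & 1)


def keep_null_validity(validity: bytes, selection: bytes) -> bytes:
    """Assumed approach: a slot stays valid only if it was valid in the
    input AND it is selected by the filter."""
    return bytes(v & s for v, s in zip(validity, selection))


data = [1, 2, 3, 4]  # values buffer, reused as-is (never copied or compacted)
input_validity = pack_bits([True, True, False, True])  # third slot already null
selection = pack_bits([True, False, True, True])       # filter drops second slot

new_validity = keep_null_validity(input_validity, selection)
result = [data[i] if get_bit(new_validity, i) else None for i in range(len(data))]

print(result)  # [1, None, None, 4]
```

Only the bitmap is rebuilt here; the values list is read in place, which is the 
reuse of input buffers that the paragraph above describes.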
   
   According to the benchmark results, filtering with FilterOptions::KEEP_NULL 
is also faster in most cases. The exception is when the selection percentage is 
extremely small, where it is cheaper to copy over the few selected values.

