neilconway opened a new pull request, #20677: URL: https://github.com/apache/datafusion/pull/20677
## Which issue does this PR close?

N/A

## Rationale for this change

In #20374, `array_has` with a scalar needle was optimized to reconstruct matches more efficiently. Unfortunately, that code was incorrect for sliced arrays: `values()` returns the entire value buffer (including elements outside the visible slice), so we need to skip the corresponding indexes in the result bitmap.

We could fix this by just skipping indexes, but it seems more robust and efficient to avoid comparing the needle against elements outside the visible range in the first place.

`array_position` is related: it was not buggy, but it still did extra work for sliced arrays by comparing the needle against elements outside the visible range.

Benchmarking the revised code shows no performance regression for unsliced arrays.

## What changes are included in this PR?

* Fix `array_has` bug for sliced arrays with scalar needle
* Improve `array_has` and `array_position` to not compare against elements outside the visible range of a sliced array
* Add unit test for `array_has` bug
* Add unit test to increase confidence in `array_position` behavior for sliced arrays

## Are these changes tested?

Yes.

## Are there any user-facing changes?

No.
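## Illustration (sketch, not the PR's code)

To make the sliced-array problem from the rationale concrete, here is a minimal sketch using a hypothetical, simplified model of a list array. `SlicedList`, `array_has_scalar`, and `array_has_scalar_ignoring_slice` are illustrative names, not Arrow or DataFusion APIs, and the "ignoring slice" variant is only one way such a bug can arise, not the actual code from #20374. The model assumes `offsets` covers only the visible rows while `values` is the full, unsliced child buffer.

```rust
/// Hypothetical, simplified model of a sliced list array (not Arrow's API).
/// `offsets` holds the boundaries of the visible rows only; `values` is the
/// full child buffer, including elements outside the visible slice.
struct SlicedList<'a> {
    offsets: &'a [usize],
    values: &'a [i32],
}

/// For each visible row, report whether it contains `needle`, comparing the
/// needle only against values inside the visible range.
fn array_has_scalar(list: &SlicedList, needle: i32) -> Vec<bool> {
    let (first, last) = match (list.offsets.first(), list.offsets.last()) {
        (Some(&f), Some(&l)) => (f, l),
        _ => return Vec::new(),
    };
    // Compare against the visible part of the value buffer only.
    let matches: Vec<bool> = list.values[first..last]
        .iter()
        .map(|v| *v == needle)
        .collect();
    // Rebase each row boundary by `first` so it indexes into `matches`.
    list.offsets
        .windows(2)
        .map(|w| matches[w[0] - first..w[1] - first].iter().any(|m| *m))
        .collect()
}

/// Illustrative faulty variant: compares against the whole buffer and walks
/// the match bitmap from position 0, as if the array were not sliced.
fn array_has_scalar_ignoring_slice(list: &SlicedList, needle: i32) -> Vec<bool> {
    let matches: Vec<bool> = list.values.iter().map(|v| *v == needle).collect();
    let mut pos = 0;
    list.offsets
        .windows(2)
        .map(|w| {
            let len = w[1] - w[0];
            let hit = matches[pos..pos + len].iter().any(|m| *m);
            pos += len;
            hit
        })
        .collect()
}
```

With `values = [1, 2, 3, 4, 5, 6]` and visible `offsets = [2, 4, 6]` (rows `[3, 4]` and `[5, 6]`), searching for `1` with `array_has_scalar` yields `[false, false]`, while the slice-ignoring variant wrongly reports `[true, false]` because it matches an element that lies before the visible range.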
