jayzhan211 opened a new issue, #12163: URL: https://github.com/apache/datafusion/issues/12163
### Is your feature request related to a problem or challenge?

We found that computing with the Eq kernel is much faster than `RowConverter` for `array_has`. Other functions could potentially be sped up with the Eq kernel as well:

- [ ] array_has_all
- [ ] array_has_any
- [ ] array_intersect
- [ ] array_distinct (Maybe 🤔 ?)
- [ ] array_except (Maybe 🤔 ?)

### Describe the solution you'd like

The overall idea is to flatten the left-hand side list, iterate over the right-hand side elements, and apply the Eq kernel for each element. See #12062.

We can tell whether an element is in the left-hand side list by checking the `true_count` (and the `null_count` for null handling) of the resulting boolean array. The Eq kernel is vectorized, which is the key to the speedup.

## Example

### array_has_all

`array_has_all([1,2,3], [1,2]) -> true`

Iterate over [1,2], comparing [1,2,3] with 1 and [1,2,3] with 2. We get [true,false,false] and [false,true,false]. Both boolean arrays contain true, so we return true.

### array_has_any

`array_has_any([1,2,3], [1,4]) -> true`

Iterate over [1,4], comparing [1,2,3] with 1 and [1,2,3] with 4. We get [true,false,false] and [false,false,false]. Since the first boolean array contains true, we return true.

### array_intersect

`array_intersect([1,2,3], [1,2]) -> [1,2]`

The idea is the same as above: we know that both elements are contained in the list. The difference is that we need to return an array. We could build the expected array with `MutableArrayData`.

I'm not sure about distinct and except, but they are worth investigating as well.

For implementation details, `array_has` can be used as a reference.

### Describe alternatives you've considered

_No response_

### Additional context

_No response_

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
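As a rough illustration of the idea described in the issue, here is a minimal sketch in plain Rust. It is not the DataFusion or arrow-rs implementation: `eq_mask` and `true_count` are hypothetical stand-ins for the vectorized Eq kernel and the boolean array's true count, the lists are plain `i64` slices without nulls, and de-duplicating the intersect result is an assumption rather than something the issue states.

```rust
/// Hypothetical stand-in for a vectorized equality kernel: compares every
/// element of the flattened left-hand side against one needle and returns
/// a boolean mask of the same length.
fn eq_mask(values: &[i64], needle: i64) -> Vec<bool> {
    values.iter().map(|v| *v == needle).collect()
}

/// Hypothetical stand-in for the boolean array's true count.
fn true_count(mask: &[bool]) -> usize {
    mask.iter().filter(|b| **b).count()
}

/// True if every right-hand side element appears in the left-hand side.
fn array_has_all(haystack: &[i64], needles: &[i64]) -> bool {
    needles
        .iter()
        .all(|n| true_count(&eq_mask(haystack, *n)) > 0)
}

/// True if at least one right-hand side element appears in the left-hand side.
fn array_has_any(haystack: &[i64], needles: &[i64]) -> bool {
    needles
        .iter()
        .any(|n| true_count(&eq_mask(haystack, *n)) > 0)
}

/// Collects the right-hand side elements found in the left-hand side.
/// Assumption: duplicates are dropped and the order follows the right-hand side.
fn array_intersect(haystack: &[i64], needles: &[i64]) -> Vec<i64> {
    let mut out = Vec::new();
    for n in needles {
        if true_count(&eq_mask(haystack, *n)) > 0 && !out.contains(n) {
            out.push(*n);
        }
    }
    out
}
```

With the inputs from the examples, `array_has_all(&[1, 2, 3], &[1, 2])` and `array_has_any(&[1, 2, 3], &[1, 4])` both evaluate to `true`, and `array_intersect(&[1, 2, 3], &[1, 2])` yields `[1, 2]`. A real implementation would run the comparison on Arrow arrays and would also need the `null_count` check mentioned in the issue.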
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org