[ https://issues.apache.org/jira/browse/ARROW-18097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17627997#comment-17627997 ]
Jacek Pliszka commented on ARROW-18097:
---------------------------------------
Maybe there is no need for a new name, and is_in could be reused instead:
{code:python}
pc.is_in("a", arr)
{code}
> [C++] Add a "list_contains" kernel
> ----------------------------------
>
> Key: ARROW-18097
> URL: https://issues.apache.org/jira/browse/ARROW-18097
> Project: Apache Arrow
> Issue Type: New Feature
> Components: C++
> Reporter: Joris Van den Bossche
> Priority: Major
> Labels: compute, kernel
>
> Assume you have a list array:
> {code}
> arr = pa.array([["a", "b"], ["a", "c"], ["b", "c", "d"]])
> {code}
> And you want to know for each list if it contains a certain value (of the
> same type as the list's values). A "list_contains" function (or other name)
> would be useful for that:
> {code}
> pc.list_contains(arr, "a")
> # -> True, True, False
> {code}
> The current workaround that I found was flattening, checking equality, and
> then reducing again with groupby, but this is quite tedious:
> {code}
> >>> temp = pa.table({'index': pc.list_parent_indices(arr),
> ...                  'contains_value': pc.equal(pc.list_flatten(arr), "a")})
> >>> temp.group_by('index').aggregate(
> ...     [('contains_value', 'any')])['contains_value_any'].chunk(0)
> <pyarrow.lib.BooleanArray object at 0x7ffaf3f8de20>
> [
> true,
> true,
> false
> ]
> {code}
> But this also only works if there are no empty or missing list values.