[ https://issues.apache.org/jira/browse/ARROW-18097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17627997#comment-17627997 ]

Jacek Pliszka commented on ARROW-18097:
---------------------------------------

Maybe there is no need for a new name, and is_in could be reused instead:
{code:python}
pc.is_in("a", arr)
{code}

> [C++] Add a "list_contains" kernel
> ----------------------------------
>
>                 Key: ARROW-18097
>                 URL: https://issues.apache.org/jira/browse/ARROW-18097
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: C++
>            Reporter: Joris Van den Bossche
>            Priority: Major
>              Labels: compute, kernel
>
> Assume you have a list array:
> {code}
> arr = pa.array([["a", "b"], ["a", "c"], ["b", "c", "d"]])
> {code}
> And you want to know for each list if it contains a certain value (of the 
> same type as the list's values). A "list_contains" function (or other name) 
> would be useful for that:
> {code}
> pc.list_contains(arr, "a")
> # -> True, True, False
> {code}
> The current workaround that I found was flattening, checking equality, and 
> then reducing again with groupby, but this is quite tedious:
> {code}
> >>> temp = pa.table({
> ...     'index': pc.list_parent_indices(arr),
> ...     'contains_value': pc.equal(pc.list_flatten(arr), "a"),
> ... })
> >>> temp.group_by('index').aggregate([('contains_value', 'any')])['contains_value_any'].chunk(0)
> <pyarrow.lib.BooleanArray object at 0x7ffaf3f8de20>
> [
>   true,
>   true,
>   false
> ]
> {code}
> But this also only works if there are no empty or missing list values.
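
To illustrate the limitation noted at the end of the description, here is a minimal pure-Python sketch (no pyarrow involved) that models both the requested behaviour and the flatten/group-by workaround. The function names and the null handling (a missing list yields None, an empty list yields False) are assumptions made for illustration, not something the issue specifies.

```python
def list_contains(lists, value):
    """Model of the requested kernel (hypothetical semantics).

    Assumption: a missing (None) list yields None, an empty list yields False.
    """
    return [None if lst is None else value in lst for lst in lists]


def flatten_groupby_workaround(lists, value):
    """Model of the flatten -> equal -> group-by "any" workaround."""
    # Flattening only produces rows for elements that actually exist, so
    # empty and missing lists contribute no parent index at all.
    parent_indices = []
    matches = []
    for i, lst in enumerate(lists):
        if lst is None:
            continue
        for item in lst:
            parent_indices.append(i)
            matches.append(item == value)

    # Group by parent index and reduce each group with "any".
    grouped = {}
    for i, matched in zip(parent_indices, matches):
        grouped[i] = grouped.get(i, False) or matched

    # One result per group, not one result per input list.
    return [grouped[i] for i in sorted(grouped)]


lists = [["a", "b"], [], None, ["b", "c", "d"], ["a"]]
print(list_contains(lists, "a"))               # [True, False, None, False, True]
print(flatten_groupby_workaround(lists, "a"))  # [True, False, True]
```

The workaround returns three values for five input lists, so its output no longer lines up with the original array once an empty or missing list is present.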



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
