I think "index_in" computes the index the other way around: it gives,
for each value of the array, its index in the set. If I understand
the question correctly, Niranda is looking for the indices into the
array of the elements that are present in the set.

Something like that could be achieved by using "is_in", and then
getting the indices of the True values:

>>> pc.is_in(pa.array([1, 2, 3]), value_set=pa.array([1, 3]))
<pyarrow.lib.BooleanArray object at 0x7fcc96896a00>
[
  true,
  false,
  true
]

Getting the locations of the True values is what numpy calls
"nonzero". We have an open JIRA for adding this as a kernel
(https://issues.apache.org/jira/browse/ARROW-13035).

On Thu, 25 Nov 2021 at 11:17, Alessandro Molina
<[email protected]> wrote:
>
> I think index_in is what you are looking for
>
> >>> pc.index_in(pa.array([1, 2, 3]), value_set=pa.array([1, 3]))
> <pyarrow.lib.Int32Array object at 0x11e2a6580>
> [
>   0,
>   null,
>   1
> ]
>
> On Sat, Nov 20, 2021 at 4:49 AM Niranda Perera <[email protected]> 
> wrote:
>>
>> Hi all, is there a compute API for searching a value index (and a set of 
>> values) in an Array?
>> ex:
>> ```python
>> a = [1, 2, 2, 3, 4, 1]
>> values= pa.array([1, 2, 1])
>>
>> index = find_index(a, 1) # = [0, 5]
>> indices = find_indices(a, values) # = [0, 1, 2, 5]
>> ```
>> I am currently using `compute.is_in` and traversing the true indices of the 
>> result Bitmap. Is there a better way?
>>
>> Best
>> --
>> Niranda Perera
>> https://niranda.dev/
>> @n1r44
>>
