I think "index_in" does the lookup the other way around: it gives, for each value of the array, its index in the set. If I understand the question correctly, Niranda is instead looking for the indices into the array of the elements that are present in the set.

The lookup Niranda describes could be achieved by using "is_in", and then getting the indices of the True values:

>>> pc.is_in(pa.array([1, 2, 3]), value_set=pa.array([1, 3]))
<pyarrow.lib.BooleanArray object at 0x7fcc96896a00>
[
  true,
  false,
  true
]

Getting the locations of the True values is called "nonzero" in numpy, and we have an open JIRA for adding this as a kernel (https://issues.apache.org/jira/browse/ARROW-13035)

On Thu, 25 Nov 2021 at 11:17, Alessandro Molina <[email protected]> wrote:
>
> I think index_in is what you are looking for
>
> >>> pc.index_in(pa.array([1, 2, 3]), value_set=pa.array([1, 3]))
> <pyarrow.lib.Int32Array object at 0x11e2a6580>
> [
>   0,
>   null,
>   1
> ]
>
> On Sat, Nov 20, 2021 at 4:49 AM Niranda Perera <[email protected]>
> wrote:
>>
>> Hi all, is there a compute API for searching a value index (and a set of
>> values) in an Array?
>> ex:
>> ```python
>> a = [1, 2, 2, 3, 4, 1]
>> values= pa.array([1, 2, 1])
>>
>> index = find_index(a, 1) # = [0, 5]
>> indices = find_indices(a, values) # = [0, 1, 2, 5]
>> ```
>> I am currently using `compute.is_in` and traversing the true indices of the
>> result Bitmap. Is there a better way?
>>
>> Best
>> --
>> Niranda Perera
>> https://niranda.dev/
>> @n1r44
>>
