@Alessandro I opened a JIRA: <https://issues.apache.org/jira/browse/ARROW-14946>. Maybe we could discuss things further there.
On Wed, Dec 1, 2021 at 6:27 AM Alessandro Molina <[email protected]> wrote:

> Yes please, I think it makes sense and should be fairly straightforward
>
> On Mon, Nov 29, 2021 at 5:38 PM Niranda Perera <[email protected]> wrote:
>
>> Should I open a JIRA on this?
>>
>> On Mon, Nov 29, 2021, 10:52 Alessandro Molina <[email protected]> wrote:
>>
>>> Oh, oops, sorry, my fault, I understood the question reversed :D
>>>
>>> I think that if we had a compute function that returns the indices of a
>>> matching value, it could also be applied to masks to retrieve the indices
>>> of any "true" value, thus also solving your question if combined with is_in
>>> (or any other predicate at that point). That might be a reasonable addition
>>> to compute functions.
>>>
>>> On Sun, Nov 28, 2021 at 7:00 AM Niranda Perera <[email protected]> wrote:
>>>
>>>> Hi guys, sorry for the late reply.
>>>>
>>>> Yes, Joris is right. I want the converse (I think 😊) of index_in. I
>>>> was discussing this with Eduardo in zulip [1].
>>>>
>>>> I was hoping that I could do this.
>>>> ```
>>>> values = pa.array([1, 2, 2, 3, 4, 1])
>>>> to_find = pa.array([1, 2, 1])
>>>> indices = pc.index_in(to_find, value_set=values)
>>>> # expected = [0, 5, 1, 2, 0, 5], received = [0, 1, 0]
>>>> ```
>>>> So, index_in does not handle duplicated indices of values (I am
>>>> guessing it creates a hashmap of values, and not a multimap).
>>>>
>>>> One suggestion was to use `aggregations.index`. And I think that might
>>>> work iteratively, as follows. But I haven't tested this.
>>>> ```
>>>> indices = []
>>>> for f in to_find:
>>>>     idx = -1
>>>>     while True:
>>>>         # pc.index returns a scalar, so convert it to a Python int
>>>>         idx = pc.index(values, f, start=idx + 1, end=len(values)).as_py()
>>>>         if idx == -1:
>>>>             break
>>>>         else:
>>>>             indices.append(idx)
>>>> ```
>>>>
>>>> But I was wondering whether it would make sense to provide a method that
>>>> finds all indices of a value (the inner while loop)?
>>>>
>>>> Best
>>>>
>>>> [1]
>>>> https://ursalabs.zulipchat.com/#narrow/stream/180245-dev/topic/Find.20a.20value.20indices.20in.20an.20array/near/262351923
>>>>
>>>> On Thu, Nov 25, 2021 at 3:14 PM Joris Van den Bossche <[email protected]> wrote:
>>>>
>>>>> I think "index_in" does the indexing the other way around? It gives,
>>>>> for each value of the array, the index in the set. While if I
>>>>> understand the question correctly, Niranda is looking for the index
>>>>> into the array for elements that are present in the set.
>>>>>
>>>>> Something like that could be achieved by using "is_in", and then
>>>>> getting the indices of the True values:
>>>>>
>>>>> >>> pc.is_in(pa.array([1, 2, 3]), value_set=pa.array([1, 3]))
>>>>> <pyarrow.lib.BooleanArray object at 0x7fcc96896a00>
>>>>> [
>>>>>   true,
>>>>>   false,
>>>>>   true
>>>>> ]
>>>>>
>>>>> To get the location of the True values, in numpy this is called
>>>>> "nonzero", and we have an open JIRA for adding this as a kernel
>>>>> (https://issues.apache.org/jira/browse/ARROW-13035)
>>>>>
>>>>> On Thu, 25 Nov 2021 at 11:17, Alessandro Molina <[email protected]> wrote:
>>>>> >
>>>>> > I think index_in is what you are looking for
>>>>> >
>>>>> > >>> pc.index_in(pa.array([1, 2, 3]), value_set=pa.array([1, 3]))
>>>>> > <pyarrow.lib.Int32Array object at 0x11e2a6580>
>>>>> > [
>>>>> >   0,
>>>>> >   null,
>>>>> >   1
>>>>> > ]
>>>>> >
>>>>> > On Sat, Nov 20, 2021 at 4:49 AM Niranda Perera <[email protected]> wrote:
>>>>> >>
>>>>> >> Hi all, is there a compute API for searching a value index (and a
>>>>> >> set of values) in an Array?
>>>>> >> ex:
>>>>> >> ```python
>>>>> >> a = [1, 2, 2, 3, 4, 1]
>>>>> >> values = pa.array([1, 2, 1])
>>>>> >>
>>>>> >> index = find_index(a, 1)  # = [0, 5]
>>>>> >> indices = find_indices(a, values)  # = [0, 1, 2, 5]
>>>>> >> ```
>>>>> >> I am currently using `compute.is_in` and traversing the true
>>>>> >> indices of the result Bitmap. Is there a better way?
>>>>> >>
>>>>> >> Best
>>>>> >> --
>>>>> >> Niranda Perera
>>>>> >> https://niranda.dev/
>>>>> >> @n1r44
>>>>
>>>> --
>>>> Niranda Perera
>>>> https://niranda.dev/
>>>> @n1r44 <https://twitter.com/N1R44>

--
Niranda Perera
https://niranda.dev/
@n1r44 <https://twitter.com/N1R44>
