Thank you, Joris. I will look into pc.SetLookupOptions.

With Regards,
Vibhatha Abeykoon

On Thu, Nov 19, 2020 at 12:02 PM Joris Van den Bossche <[email protected]> wrote:

> Hi,
>
> The "is_in" docstring is not directly clear about it, but you need to
> pass the second argument as a keyword argument using "value_set" keyword
> name. Small example:
>
> In [19]: pc.is_in(pa.array(["a", "b", "c", "d"]),
>     ...:          value_set=pa.array(["a", "c"]))
> Out[19]:
> <pyarrow.lib.BooleanArray object at 0x7f508af95ac8>
> [
>   true,
>   false,
>   true,
>   false
> ]
>
> You can find this keyword in the keywords of pc.SetLookupOptions.
>
> Best,
> Joris
>
> On Wed, 18 Nov 2020 at 16:43, Vibhatha Abeykoon <[email protected]>
> wrote:
>
>> Hello,
>>
>> I am working on a dataset API on top of Arrow kernels. I am looking into
>> the usage of
>> *is_in* function in the compute API.
>>
>> I couldn't figure out how arguments are passed for a is_in check. A
>> simple scenario would be;
>>
>>
>> *cylon_tb.from_list([[2,1], [1,0]]*
>> *cylon_tb.isin([2])*
>>
>> Is this very similar to Pandas isin:
>> https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.isin.html
>> ? If not how could we use *is_in* op?
>>
>> With Regards,
>> Vibhatha Abeykoon
>>
