Re: [PySPark] How to check if value of one column is in array of another column

2023-01-18 Thread Oliver Ruebenacker
Awesome, thanks, this was exactly what I needed!
On Tue, Jan 17, 2023 at 5:23 PM Sean Owen wrote:
> I think you want array_contains:
>
> https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/api/pyspark.sql.functions.array_contains.html
>
> On Tue, Jan 17, 2023 at 4:18 PM Oliver

Re: [PySPark] How to check if value of one column is in array of another column

2023-01-17 Thread Sean Owen
I think you want array_contains:
https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/api/pyspark.sql.functions.array_contains.html
On Tue, Jan 17, 2023 at 4:18 PM Oliver Ruebenacker <oliv...@broadinstitute.org> wrote:
>
> Hello,
>
> I have data originally stored as JSON.

[PySPark] How to check if value of one column is in array of another column

2023-01-17 Thread Oliver Ruebenacker
Hello,
I have data originally stored as JSON. Column gene contains a string, column nearest an array of strings. How can I check whether the value of gene is an element of the array of nearest?
I tried:
  genes_joined.gene.isin(genes_joined.nearest)
But I get an error that says: pyspar