Hi All, I am using the OR operator "|" in a withColumn call on a DataFrame in PySpark. However, it appears to evaluate every condition even when the first condition is already true. Please find a sample below:
from pyspark.sql.functions import col, udf
from pyspark.sql.types import BooleanType

contains = udf(lambda s, arr: s in arr, BooleanType())
df.withColumn('match_flag', (col('list_names').isNull()) |
              (contains(col('name'), col('list_names'))))
Here, for rows where list_names is null, the UDF is still called and it throws an error: NoneType is not iterable.
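For reference, here is a minimal sketch of a null-safe variant of the check. Spark does not guarantee that the operands of "|" are evaluated left to right or short-circuited, so the sketch moves the null guard inside the Python function. The name contains_or_null is hypothetical, and the commented usage assumes the same df, name and list_names columns as in the sample:

```python
def contains_or_null(s, arr):
    """Null-safe membership test: True when arr is None, else whether s is in arr."""
    # The guard lives inside the function, so the result does not depend on
    # whether Spark happens to evaluate the isNull() condition first.
    return arr is None or s in arr


# Hypothetical usage with the DataFrame from the sample (requires pyspark):
#   from pyspark.sql.functions import col, udf
#   from pyspark.sql.types import BooleanType
#   contains = udf(contains_or_null, BooleanType())
#   df = df.withColumn('match_flag', contains(col('name'), col('list_names')))
```

With the guard inside the function, the separate isNull() condition may no longer be needed, since a null list_names already yields True.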
Any idea?
--
Regards,
Rishi Shah
