I threw together a quick example that replicates what you see, then looked at the physical plan:
from pyspark.sql.functions import *
from pyspark.sql.types import *
from pyspark.sql import Row

contains = udf(lambda s, arr: s in arr, BooleanType())

df = spark.createDataFrame([
    Row(list_names=None, name='a'),
    Row(list_names=['a', 'b', 'c', 'd'], name='a')])

df2 = df.withColumn('match_flag',
    col('list_names').isNull() | contains(col('name'), col('list_names')))

Running df2.show() raises the error you mentioned. However, if you look at the query plan you see the following:

== Physical Plan ==
*(1) Project [list_names#27, name#28, (isnull(list_names#27) || pythonUDF0#47) AS match_flag#32]
+- BatchEvalPython [<lambda>(name#28, list_names#27)], [list_names#27, name#28, pythonUDF0#47]
   +- Scan ExistingRDD[list_names#27,name#28]

Spark needs to evaluate the Python UDF in case it might be needed. My guess is that the architecture of the Python UDF pipeline requires the values to be processed together in a batch: the BatchEvalPython node runs the UDF for every row up front, and the result is stored in a column reference (pythonUDF0#47) that is then used in the WholeStageCodegen phase that follows the UDF evaluation:

[image: Screen Shot 2019-05-13 at 4.31.17 PM.png]

If you look at the code produced by codegen, it seems like the "or" condition might be optimized into a nested if..then..else statement, but I'm not experienced in digging into codegen output.

Hope this helps!

-Nick

Nicholas Szandor Hakobian, Ph.D.
Principal Data Scientist
Rally Health

On Mon, May 13, 2019 at 8:38 AM Rishi Shah <rishishah.s...@gmail.com> wrote:
> Hi All,
>
> I am using the "or" operator "|" in a withColumn clause on a DataFrame in
> pyspark. However, it looks like it always evaluates all the conditions
> regardless of the first condition being true. Please find a sample below:
>
> contains = udf(lambda s, arr: s in arr, BooleanType())
>
> df.withColumn('match_flag', (col('list_names').isNull()) |
>     (contains(col('name'), col('list_names'))))
>
> Here, where list_names is null, it starts to throw an error: NoneType is
> not iterable.
>
> Any idea?
>
> --
> Regards,
>
> Rishi Shah
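P.S. A workaround sketch of my own (not from the thread above, and the name null_safe_contains is mine): since BatchEvalPython evaluates the UDF for every row before the "|" expression is combined, the reliable fix is to make the lambda itself null-safe so it never trips over a null array.

```python
# Sketch: guard against a null (None) array inside the UDF body itself,
# because Spark's batch evaluation means the UDF will see the null rows
# regardless of the other side of the "|" expression.

def null_safe_contains(s, arr):
    # Return False instead of raising "NoneType is not iterable"
    # when the array column is null.
    if arr is None:
        return False
    return s in arr

# In PySpark this would be registered and used as:
#   contains = udf(null_safe_contains, BooleanType())
#   df.withColumn('match_flag',
#       col('list_names').isNull() | contains(col('name'), col('list_names')))

print(null_safe_contains('a', None))         # null array: no exception, False
print(null_safe_contains('a', ['a', 'b']))   # True
print(null_safe_contains('z', ['a', 'b']))   # False
```

Note that wrapping the UDF in when()/coalesce() is not guaranteed to skip evaluation either, since Spark does not promise short-circuit evaluation order for Python UDFs, so guarding inside the function is the safer option.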