[ https://issues.apache.org/jira/browse/SPARK-37752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon resolved SPARK-37752.
----------------------------------
Resolution: Not A Bug
> Python UDF fails when it should not get evaluated
> -------------------------------------------------
>
> Key: SPARK-37752
> URL: https://issues.apache.org/jira/browse/SPARK-37752
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.3.4
> Reporter: Ohad Raviv
> Priority: Minor
>
> I haven't checked newer versions yet.
> If I define the following in Python:
> {code:python}
> def udf1(col1):
>     print(col1[2])
>     return "blah"
>
> spark.udf.register("udf1", udf1)
> {code}
> and then use it in SQL:
> {code:sql}
> select case when length(c)>2 then udf1(c) end
> from (
>   select explode(array("123","234","12")) as c
> )
> {code}
> it fails with:
> {noformat}
>   File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/worker.py", line 253, in main
>     process()
>   File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/worker.py", line 248, in process
>     serializer.dump_stream(func(split_index, iterator), outfile)
>   File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/worker.py", line 155, in <lambda>
>     func = lambda _, it: map(mapper, it)
>   File "<string>", line 1, in <lambda>
>   File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/worker.py", line 76, in <lambda>
>     return lambda *a: f(*a)
>   File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/util.py", line 55, in wrapper
>     return f(*args, **kwargs)
>   File "<stdin>", line 3, in udf1
> IndexError: string index out of range
> {noformat}
> However, the UDF should not be evaluated at all for the out-of-range row,
> because the CASE WHEN only passes strings longer than 2 characters.
> The same scenario works fine when we define a Scala UDF instead.
> I will now check whether this also happens on newer versions.
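
One possible workaround, sketched here under the assumption (consistent with the "Not A Bug" resolution) that Spark does not guarantee the CASE WHEN condition is applied before the Python UDF runs: move the length check into the UDF itself so that it tolerates every input row. The names `udf1_safe` and `register` are hypothetical and do not appear in the original report.

```python
def udf1_safe(col1):
    """Guarded variant of udf1 (hypothetical name, not from the report)."""
    # Do not rely on the SQL-side CASE WHEN having filtered the row first:
    # return NULL for inputs that are missing or too short to index at [2].
    if col1 is None or len(col1) <= 2:
        return None
    print(col1[2])
    return "blah"


def register(spark):
    """Register the guarded UDF under the original name.

    `spark` is assumed to be an existing SparkSession; this helper is
    illustrative only and is not called here.
    """
    spark.udf.register("udf1", udf1_safe)
```

After calling `register(spark)` in a PySpark session, the query from the report can be run unchanged; rows such as "12" would then yield NULL from the UDF rather than raising IndexError, whichever order the expressions are evaluated in.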