Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/21467#discussion_r193302276
--- Diff: python/pyspark/sql/tests.py ---
@@ -4096,6 +4080,43 @@ def foo(df):
                 def foo(k, v, w):
                     return k
+    def test_stopiteration_in_udf(self):
+        from pyspark.sql.functions import udf, pandas_udf, PandasUDFType
+        from py4j.protocol import Py4JJavaError
+
+        def foo(x):
+            raise StopIteration()
+
+        def foofoo(x, y):
+            raise StopIteration()
+
+        exc_message = "Caught StopIteration thrown from user's code; failing the task"
+        df = self.spark.range(0, 100)
+
+        # plain udf (test for SPARK-23754)
+        self.assertRaisesRegexp(Py4JJavaError, exc_message, df.withColumn(
+            'v', udf(foo)('id')
+        ).collect)
--- End diff --
tiny nit: I would do:
```
self.assertRaisesRegexp(
    Py4JJavaError, exc_message, df.withColumn('v', udf(foo)('id')).collect)
```
or
```
self.assertRaisesRegexp(
    Py4JJavaError,
    exc_message,
    df.withColumn('v', udf(foo)('id')).collect)
```
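For context on why both forms pass `.collect` without calling it: the assertion takes the callable and invokes it itself, so the error raised at collect time is what gets matched against the message. A minimal sketch of that pattern without Spark, assuming a hypothetical `FakeFrame` stand-in (not a PySpark class) and a plain `RuntimeError` in place of `Py4JJavaError`. It uses `assertRaisesRegex`, the Python 3 spelling of `assertRaisesRegexp`:

```python
import unittest


class FakeFrame:
    """Hypothetical stand-in for a DataFrame; not part of PySpark."""

    def collect(self):
        # Mimic a failure that only surfaces when collect() is invoked.
        raise RuntimeError(
            "Caught StopIteration thrown from user's code; failing the task")


class CallableFormTest(unittest.TestCase):
    def test_collect_fails(self):
        exc_message = "Caught StopIteration thrown from user's code; failing the task"
        # Pass the bound method itself (no parentheses); the assertion calls it
        # and checks the raised exception's message against exc_message.
        self.assertRaisesRegex(
            RuntimeError, exc_message, FakeFrame().collect)


def run_sketch():
    """Run the single test above and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(CallableFormTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Writing `FakeFrame().collect()` with parentheses would raise before the assertion ever ran, which is why the test in the diff ends with `.collect` rather than `.collect()`.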
---