Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/21383#discussion_r191422043
--- Diff: python/pyspark/sql/udf.py ---
@@ -157,7 +157,17 @@ def _create_judf(self):
spark = SparkSession.builder.getOrCreate()
sc = spark.sparkContext
- wrapped_func = _wrap_function(sc, self.func, self.returnType)
+ func = fail_on_stopiteration(self.func)
+
+ # prevent inspect to fail
+ # e.g. inspect.getargspec(sum) raises
+ # TypeError: <built-in function sum> is not a Python function
+ try:
+ func._argspec = _get_argspec(self.func)
+ except TypeError:
--- End diff --
Eh, don't we have `self.evalType`? I thought we could simply check it. I
get that the current way is the "recommended" way to deal with this divergence
in Python, but let's explicitly scope it here to make it easier to fix later.
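For context, a minimal sketch (not from the PR) of why the diff wraps `_get_argspec` in a try/except: `inspect` cannot introspect every callable, and C-implemented builtins without usable signature metadata raise `TypeError`. The `safe_argspec` helper below is hypothetical, illustrating the same EAFP pattern:

```python
import inspect

def safe_argspec(func):
    """Return the argspec if inspect can introspect func, else None.

    Mirrors the EAFP pattern in the diff: some C-implemented
    callables cannot be introspected and raise TypeError.
    """
    try:
        return inspect.getfullargspec(func)
    except TypeError:
        return None

def my_udf(a, b=1):
    return a + b

# Ordinary Python functions are introspectable...
print(safe_argspec(my_udf).args)  # ['a', 'b']

# ...but e.g. min has overloaded C signatures that inspect
# cannot parse, so getfullargspec raises TypeError.
print(safe_argspec(min))  # None
```

Note the exact set of builtins that fail differs across Python versions (the `inspect.getargspec(sum)` failure quoted in the diff comment is from the Python 2 era; modern CPython exposes text signatures for many builtins), which is why the broad try/except is used rather than a type check.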
---