Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/20163#discussion_r159796028
--- Diff: python/pyspark/sql/udf.py ---
@@ -26,6 +26,28 @@
def _wrap_function(sc, func, returnType):
+    def coerce_to_str(v):
+        import datetime
+        if type(v) == datetime.date or type(v) == datetime.datetime:
+            return str(v)
+        else:
+            return v
+
+    # Pyrolite will unpickle both Python datetime.date and datetime.datetime objects
+    # into java.util.Calendar objects, so the type information on the Python side is lost.
+    # This is problematic when Spark SQL needs to cast such objects into Spark SQL string type,
+    # because the format of the string should be different, depending on the type of the input
+    # object. So for those two specific types we eagerly convert them to string here, where the
+    # Python type information is still intact.
+    if returnType == StringType():
--- End diff --
Is this to handle the case where a Python UDF returns a `date` or `datetime` but
marks the return type as string?
---
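For illustration, the helper in the diff can be exercised on its own. This is a
standalone sketch of `coerce_to_str` (outside of Spark, so the `returnType` check
and Pyrolite pickling are not reproduced here); it shows how a `date` and a
`datetime` stringify differently, which is exactly the distinction that would be
lost once Pyrolite turns both into `java.util.Calendar`:

```python
import datetime


def coerce_to_str(v):
    # Eagerly convert date/datetime to string while the Python type
    # information is still intact; leave every other value untouched.
    # type(v) == ... (rather than isinstance) keeps the two branches
    # explicit, since datetime.datetime is a subclass of datetime.date.
    if type(v) == datetime.date or type(v) == datetime.datetime:
        return str(v)
    else:
        return v


print(coerce_to_str(datetime.date(2018, 1, 1)))      # '2018-01-01'
print(coerce_to_str(datetime.datetime(2018, 1, 1)))  # '2018-01-01 00:00:00'
print(coerce_to_str(42))                             # non-temporal values pass through: 42
```

Note the two string formats differ, so coercing after the Java side has
collapsed both types into one representation would pick the wrong format for
one of them.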
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]