Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/18945#discussion_r140166654
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -1761,12 +1761,37 @@ def toPandas(self):
                raise ImportError("%s\n%s" % (e.message, msg))
            else:
                dtype = {}
    +            columns_with_null_int = {}
    +
    +            def null_handler(rows, columns_with_null_int):
    +                for row in rows:
    +                    row = row.asDict()
    +                    for column in columns_with_null_int:
    +                        val = row[column]
    +                        dt = dtype[column]
    +                        if val is not None:
--- End diff ---
Don't we want to fix the issue for the case where the pandas type is in
(np.int8, np.int16, np.int32) and the field is nullable? The `dtype` we get
will cause an exception later when converting a `None` to an int type such
as np.int16.
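To illustrate the failure mode being discussed, here is a minimal sketch (not the PR's actual code path): a nullable integer column that comes back from Spark with a `None` in it is stored by pandas as float64 with NaN, and forcing the narrow integer dtype inferred from the Spark schema then raises. The `rows` data and the float64 promotion workaround are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: a nullable IntegerType column fetched from Spark,
# containing a None. pandas stores this as float64 with NaN.
rows = [1, None, 3]
series = pd.Series(rows)

# Forcing the schema-derived dtype (e.g. np.int16) on a column that
# contains NaN raises, since NaN is not representable as an integer.
try:
    series.astype(np.int16)
    cast_failed = False
except ValueError:
    cast_failed = True

# One possible workaround (an assumption, not necessarily the PR's fix):
# keep the column as a floating dtype when it contains nulls, so the
# NaN survives the conversion.
promoted = series.astype(np.float64)
```

This is why a special-cased `dtype` (or a null handler like the one in the diff) is needed for nullable int columns before `toPandas` applies the per-column dtypes.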
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]