Github user edlee123 commented on the issue:
https://github.com/apache/spark/pull/18378
I see the rationale now. Thank you, everyone.
Github user edlee123 commented on the issue:
https://github.com/apache/spark/pull/18378
OK, I see: part of the rationale is performance (from the discussion
of astype above) and consistency with pyarrow:
https://arrow.apache.org/docs/python/pandas.html
I guess without
Github user edlee123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18378#discussion_r181567770
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -1750,6 +1761,24 @@ def _to_scala_map(sc, jm):
return sc._jvm.PythonUtils.toScalaMap(jm
Github user edlee123 commented on a diff in the pull request:
https://github.com/apache/spark/pull/18378#discussion_r181565606
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -1750,6 +1761,24 @@ def _to_scala_map(sc, jm):
return sc._jvm.PythonUtils.toScalaMap(jm