BryanCutler commented on a change in pull request #30393:
URL: https://github.com/apache/spark/pull/30393#discussion_r525846522
##########
File path: python/pyspark/sql/pandas/types.py
##########
@@ -306,3 +322,23 @@ def _check_series_convert_timestamps_tz_local(s, timezone):
`pandas.Series` where if it is a timestamp, has been converted to
tz-naive
"""
return _check_series_convert_timestamps_localize(s, timezone, None)
+
+
+def _convert_map_items_to_dict(s):
Review comment:
Note: these conversion functions exist because pyarrow expects map items
as a list of (key, value) pairs, and also produces that format when converting to
Pandas. The reason is that the Arrow spec allows duplicate keys within a
row and doesn't say exactly how they should be handled. By adding these
conversions, we match the non-arrow behavior for maps, which uses a dictionary
as input/output.
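
A minimal sketch of what these helpers could look like, assuming map values arrive
from pyarrow as lists of (key, value) tuples and are exposed to users as Python dicts
(the function names follow the file under review, but the bodies here are illustrative,
not the exact code in the PR):

```python
import pandas as pd


def _convert_map_items_to_dict(s):
    # Sketch: turn a Series of [(key, value), ...] items (pyarrow's map
    # representation) into a Series of Python dicts. Duplicate keys collapse
    # to the last value, which is the ambiguity the comment above refers to.
    return s.apply(lambda m: None if m is None else {k: v for k, v in m})


def _convert_dict_to_map_items(s):
    # Sketch: turn a Series of Python dicts back into [(key, value), ...]
    # items so pyarrow can build a map column from them.
    return s.apply(lambda d: None if d is None else list(d.items()))


# Example round trip
s = pd.Series([[("a", 1), ("b", 2)], None])
as_dicts = _convert_map_items_to_dict(s)        # {"a": 1, "b": 2}, None
back = _convert_dict_to_map_items(as_dicts)     # [("a", 1), ("b", 2)], None
```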