gaogaotiantian commented on code in PR #53730:
URL: https://github.com/apache/spark/pull/53730#discussion_r2674282915
##########
python/pyspark/sql/pandas/types.py:
##########
@@ -842,6 +842,32 @@ def _to_corrected_pandas_type(dt: DataType) -> Optional[Any]:
         return None
+def _to_corrected_pandas_ext_type(dt: DataType) -> Optional[Any]:
+    """
+    Convert spark datatype to a Pandas extension type which support nullable data.
+    """
+    import pandas as pd
+
+    if type(dt) == ByteType:
Review Comment:
I don't think performance matters much here, but we should call `type(dt)`
once and reuse the result, instead of calling it again in every branch.
Also, could we use `is` for the type comparison? `isinstance` is an
alternative if we do not need an exact type check. Comparing types with `==`
is discouraged by the linter; we ignore that rule for now (I will change it
in the future).
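
A minimal sketch of the suggestion above, not the code from the PR. The classes are hypothetical stand-ins for the `pyspark.sql.types` ones so that the snippet runs without Spark or pandas, `TinyByteType` is an invented subclass used only to show the difference between the two checks, and the returned strings are placeholders for whatever extension dtypes the real function maps to.

```python
from typing import Optional


# Hypothetical stand-ins for pyspark.sql.types classes (assumption: the real
# classes behave like ordinary Python classes for type comparison).
class DataType:
    pass


class ByteType(DataType):
    pass


class ShortType(DataType):
    pass


class TinyByteType(ByteType):
    """Invented subclass, only here to contrast `is` with `isinstance`."""


def to_ext_type_name(dt: DataType) -> Optional[str]:
    """Exact type check: look the type up once, then compare with `is`."""
    dt_type = type(dt)
    if dt_type is ByteType:
        return "Int8"  # placeholder for the pandas extension dtype
    if dt_type is ShortType:
        return "Int16"  # placeholder for the pandas extension dtype
    return None


def to_ext_type_name_lenient(dt: DataType) -> Optional[str]:
    """Non-exact check: `isinstance` also accepts subclasses."""
    if isinstance(dt, ByteType):
        return "Int8"
    if isinstance(dt, ShortType):
        return "Int16"
    return None
```

With the exact check, an instance of a subclass such as `TinyByteType` falls through to `None`; with `isinstance` it is treated as a `ByteType`. Which behavior is wanted depends on whether subclasses of the Spark types should map to the same pandas type.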
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]