gaogaotiantian commented on code in PR #54017:
URL: https://github.com/apache/spark/pull/54017#discussion_r2734977778
##########
python/pyspark/pandas/data_type_ops/datetime_ops.py:
##########
@@ -128,6 +129,13 @@ def prepare(self, col: pd.Series) -> pd.Series:
"""Prepare column when from_pandas."""
return col
+ def restore(self, col: pd.Series) -> pd.Series:
Review Comment:
So `restore` is a method that gets called by
`InternalFrame.restore_index`, which is the function that converts a
`pyspark.DataFrame` back to a `pandas.DataFrame`. `InternalFrame` keeps a
record of what the pandas dtype of each column should be and tries to restore
those types during the conversion. However, I assume that because we always
use `datetime64[ns]` for `TimestampType`, we never wrote a `restore` function
for `TimestampType`, so those columns always come back as `datetime64[ns]`.
With this newly added method, the column can be restored to the dtype it had
in the original pandas object.
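A rough sketch of the idea, assuming the new method follows the usual
`DataTypeOps` pattern of casting back to the dtype recorded on the ops object
(this is not the exact code in the diff, which is truncated above;
`DatetimeOpsSketch` is a made-up stand-in, and non-nanosecond dtypes need
pandas 2.x):

```python
import pandas as pd

class DatetimeOpsSketch:
    """Hypothetical stand-in for the DataTypeOps subclass touched here."""

    def __init__(self, dtype):
        # InternalFrame records the original pandas dtype at from_pandas time.
        self.dtype = dtype

    def restore(self, col: pd.Series) -> pd.Series:
        """Restore column when to_pandas: cast the datetime64[ns] data
        coming out of Spark back to the recorded dtype."""
        return col.astype(self.dtype)

# The datetime64[ns] column produced by the conversion...
ns_col = pd.Series(pd.to_datetime(["2024-01-01", "2024-01-02"]))
# ...gets cast back to the dtype the original pandas object had.
ops = DatetimeOpsSketch(pd.api.types.pandas_dtype("datetime64[us]"))
assert ops.restore(ns_col).dtype == "datetime64[us]"
```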
In practice, it means something like
`ps.from_pandas(pd.some_df_or_series_or_index())` ends up with the same column
dtype as the original pandas DataFrame/Series/Index.
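A hypothetical round trip showing the effect (assuming pandas 2.x
non-nanosecond resolution and the behavior this PR adds):

```python
import pandas as pd
import pyspark.pandas as ps

pser = pd.Series(pd.to_datetime(["2024-01-01", "2024-01-02"])).astype("datetime64[us]")
psser = ps.from_pandas(pser)

# Previously to_pandas() would always come back as datetime64[ns];
# with restore in place the original resolution is kept.
assert psser.to_pandas().dtype == pser.dtype  # datetime64[us]
```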