HyukjinKwon commented on a change in pull request #33876:
URL: https://github.com/apache/spark/pull/33876#discussion_r700695609
##########
File path: python/pyspark/sql/pandas/conversion.py
##########
@@ -369,7 +373,10 @@ def _convert_from_pandas(self, pdf, schema, timezone):
pdf[field.name] = s
else:
for column, series in pdf.iteritems():
- s = _check_series_convert_timestamps_tz_local(series, timezone)
+ s = series
+ should_localize = not self._is_timestamp_ntz_preferred()
+ if should_localize and is_datetime64tz_dtype(s.dtype) and s.dt.tz is not None:
Review comment:
Yes, it is being handled there:
https://github.com/apache/spark/blob/0494dc90af48ce7da0625485a4dc6917a244d580/python/pyspark/sql/pandas/types.py#L284-L292
For series whose dtype does not satisfy `is_datetime64tz_dtype`, we skip
localization because the timezone is unavailable. For example, when a pandas
series holds datetimes with the `object` dtype, the `.dt` accessor cannot be used:
```python
>>> import pandas as pd
>>> import datetime
>>> s = pd.Series([datetime.datetime.now()])
>>> s.astype("object").dt
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/.../pandas/core/generic.py", line 5487, in __getattr__
return object.__getattribute__(self, name)
File "/.../pandas/core/accessor.py", line 181, in __get__
accessor_obj = self._accessor(obj)
File "/.../pandas/core/indexes/accessors.py", line 506, in __new__
raise AttributeError("Can only use .dt accessor with datetimelike values")
AttributeError: Can only use .dt accessor with datetimelike values
```
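A minimal sketch of why the dtype check guards the `.dt.tz` access. It is not the code in this PR: `has_tz_aware_dtype` is a hypothetical helper, and it uses `isinstance(..., pd.DatetimeTZDtype)`, which I assume is equivalent to `is_datetime64tz_dtype` for series dtypes. The fixed timestamp is illustrative only.

```python
import datetime

import pandas as pd


def has_tz_aware_dtype(series: pd.Series) -> bool:
    """Hypothetical helper: True only when the series has a tz-aware datetime dtype."""
    # Assumed to match is_datetime64tz_dtype(series.dtype) for Series dtypes.
    return isinstance(series.dtype, pd.DatetimeTZDtype)


now = datetime.datetime(2021, 9, 2, 12, 0, 0)
naive = pd.Series([now])             # timezone-naive datetime64 dtype
aware = naive.dt.tz_localize("UTC")  # timezone-aware datetime64 dtype
as_object = naive.astype("object")   # object dtype; the .dt accessor raises here

for name, s in [("naive", naive), ("aware", aware), ("object", as_object)]:
    if has_tz_aware_dtype(s):
        # Only in this branch is it safe to read s.dt.tz.
        print(name, "-> timezone:", s.dt.tz)
    else:
        # Localization is skipped; .dt is never touched for object dtype.
        print(name, "-> skipped, dtype:", s.dtype)
```

Because the dtype check comes first in the `and` chain, `s.dt.tz` is only evaluated for tz-aware series, so the `AttributeError` above is never reached for `object` columns.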
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]