GitHub user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20559#discussion_r167391409
  
    --- Diff: python/pyspark/sql/types.py ---
    @@ -1766,15 +1781,13 @@ def _check_series_convert_timestamps_localize(s, from_timezone, to_timezone):
     
         import pandas as pd
         from pandas.api.types import is_datetime64tz_dtype, is_datetime64_dtype
    -    from_tz = from_timezone or 'tzlocal()'
    -    to_tz = to_timezone or 'tzlocal()'
    +    from_tz = from_timezone or _get_local_timezone()
    +    to_tz = to_timezone or _get_local_timezone()
         # TODO: handle nested timestamps, such as ArrayType(TimestampType())?
         if is_datetime64tz_dtype(s.dtype):
             return s.dt.tz_convert(to_tz).dt.tz_localize(None)
         elif is_datetime64_dtype(s.dtype) and from_tz != to_tz:
    -        # `s.dt.tz_localize('tzlocal()')` doesn't work properly when including NaT.
    -        return s.apply(lambda ts: ts.tz_localize(from_tz).tz_convert(to_tz).tz_localize(None)
    -                       if ts is not pd.NaT else pd.NaT)
    +        return s.dt.tz_localize(from_tz).dt.tz_convert(to_tz).dt.tz_localize(None)
    --- End diff --
    
    @ueshin, is it safe to remove `if ts is not pd.NaT else pd.NaT`? It seems there is still a small possibility that the timezone resolves to `tzlocal()`:
    
    https://github.com/pandas-dev/pandas/blob/0.19.x/pandas/tslib.pyx#L1760
    https://github.com/pandas-dev/pandas/blob/0.19.x/pandas/tslib.pyx#L54
    https://github.com/dateutil/dateutil/blob/2.6.1/dateutil/tz/tz.py#L1362
    https://github.com/dateutil/dateutil/blob/2.6.1/dateutil/tz/tz.py#L1408
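
One way to probe this might be a small check like the one below. It is only a sketch: `localize_vectorized`, `localize_guarded`, and `check_nat_safe` are hypothetical helper names, not part of PySpark, and the sample timestamps are made up. It compares the vectorized path from the `+` line against the element-wise path with the explicit `NaT` guard on a series that contains a `NaT`.

```python
import pandas as pd


def localize_vectorized(s, from_tz, to_tz):
    # Vectorized path, mirroring the "+" line of the diff.
    return s.dt.tz_localize(from_tz).dt.tz_convert(to_tz).dt.tz_localize(None)


def localize_guarded(s, from_tz, to_tz):
    # Element-wise path with the explicit NaT guard, mirroring the "-" lines.
    return s.apply(
        lambda ts: ts.tz_localize(from_tz).tz_convert(to_tz).tz_localize(None)
        if ts is not pd.NaT else pd.NaT)


def check_nat_safe(from_tz, to_tz):
    # Hypothetical sample data: two naive timestamps and one missing value.
    s = pd.Series(pd.to_datetime(
        ["2018-02-10 12:00:00", None, "2018-07-01 00:30:00"]))
    vectorized = localize_vectorized(s, from_tz, to_tz)
    guarded = localize_guarded(s, from_tz, to_tz)
    # NaT must stay in the same positions on both paths.
    same_nulls = (vectorized.isna().tolist()
                  == guarded.isna().tolist()
                  == s.isna().tolist())
    # The non-null results must agree as well.
    same_values = vectorized.dropna().tolist() == guarded.dropna().tolist()
    return same_nulls and same_values
```

Calling `check_nat_safe("America/Los_Angeles", "UTC")` exercises the named-timezone case. Passing a `dateutil.tz.tzlocal()` instance as `from_tz` would exercise the case in question, though the outcome there presumably depends on the installed pandas and dateutil versions.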

