[jira] [Updated] (ARROW-5912) [Python] conversion from datetime objects with mixed timezones should normalize to UTC

2020-01-07 Thread Wes McKinney (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-5912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-5912:

Fix Version/s: (was: 0.16.0)

> [Python] conversion from datetime objects with mixed timezones should 
> normalize to UTC
> --
>
> Key: ARROW-5912
> URL: https://issues.apache.org/jira/browse/ARROW-5912
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Reporter: Joris Van den Bossche
> Priority: Major
> Labels: beginner
>
> Currently, when a sequence contains datetime objects with mixed timezones, 
> each object is converted separately using its own local (wall-clock) time:
> {code:python}
> >>> import pandas as pd
> >>> import pyarrow as pa
> >>> ts_pd_paris = pd.Timestamp("1970-01-01 01:00", tz="Europe/Paris")
> >>> ts_pd_paris
> Timestamp('1970-01-01 01:00:00+0100', tz='Europe/Paris')
> >>> ts_pd_helsinki = pd.Timestamp("1970-01-01 02:00", tz="Europe/Helsinki")
> >>> ts_pd_helsinki
> Timestamp('1970-01-01 02:00:00+0200', tz='Europe/Helsinki')
> >>> a = pa.array([ts_pd_paris, ts_pd_helsinki])
> >>> a
> 
> [
>   1970-01-01 01:00:00.00,
>   1970-01-01 02:00:00.00
> ]
> >>> a.type
> TimestampType(timestamp[us])
> {code}
> Both timestamps refer to the same moment in time (the same value in UTC; in 
> pandas their stored {{value}} is also identical), but once converted to 
> pyarrow they are both tz-naive and no longer represent the same time. That is 
> unexpected and a likely source of bugs.
> I think a better option would be to normalize to UTC and return a tz-aware 
> TimestampArray with UTC as its timezone. 
> That is also what pandas does if you force the conversion to produce 
> datetimes (by default pandas keeps them in an object array, preserving the 
> different timezones).
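
The difference between the reported behaviour and the proposed one can be illustrated with the standard library alone. This is only a sketch of the idea, not pyarrow's conversion code: the helper names `drop_tz` and `normalize_to_utc` are hypothetical, and fixed offsets of +01:00 and +02:00 are assumed as stand-ins for Europe/Paris and Europe/Helsinki on 1970-01-01, matching the offsets shown in the report.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed offsets stand in for Europe/Paris (+01:00) and
# Europe/Helsinki (+02:00) on 1970-01-01, as shown in the report.
PARIS = timezone(timedelta(hours=1))
HELSINKI = timezone(timedelta(hours=2))


def drop_tz(values):
    """Mimic the reported behaviour: keep wall-clock time, drop the zone."""
    return [v.replace(tzinfo=None) for v in values]


def normalize_to_utc(values):
    """Proposed behaviour: convert every tz-aware value to UTC."""
    out = []
    for v in values:
        if v.tzinfo is None or v.utcoffset() is None:
            raise ValueError("tz-naive datetime cannot be normalized to UTC")
        out.append(v.astimezone(timezone.utc))
    return out


ts_paris = datetime(1970, 1, 1, 1, 0, tzinfo=PARIS)
ts_helsinki = datetime(1970, 1, 1, 2, 0, tzinfo=HELSINKI)

naive = drop_tz([ts_paris, ts_helsinki])
utc = normalize_to_utc([ts_paris, ts_helsinki])

# The naive values compare as different times, although the inputs
# describe one instant; the UTC-normalized values compare equal.
print(naive[0] == naive[1])
print(utc[0] == utc[1])
```

Rejecting tz-naive input in `normalize_to_utc` is a design choice made for this sketch; how a real conversion should treat a mix of tz-naive and tz-aware objects is not settled by the report.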



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (ARROW-5912) [Python] conversion from datetime objects with mixed timezones should normalize to UTC

2019-08-15 Thread lidavidm (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-5912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lidavidm updated ARROW-5912:

Labels: beginner  (was: )

> [Python] conversion from datetime objects with mixed timezones should 
> normalize to UTC
> --
>
> Key: ARROW-5912
> URL: https://issues.apache.org/jira/browse/ARROW-5912
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Reporter: Joris Van den Bossche
> Priority: Major
> Labels: beginner
> Fix For: 1.0.0





[jira] [Updated] (ARROW-5912) [Python] conversion from datetime objects with mixed timezones should normalize to UTC

2019-07-11 Thread Joris Van den Bossche (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-5912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joris Van den Bossche updated ARROW-5912:
Fix Version/s: 1.0.0

> [Python] conversion from datetime objects with mixed timezones should 
> normalize to UTC
> --
>
> Key: ARROW-5912
> URL: https://issues.apache.org/jira/browse/ARROW-5912
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Reporter: Joris Van den Bossche
> Priority: Major
> Fix For: 1.0.0


