jorisvandenbossche commented on issue #40122:
URL: https://github.com/apache/arrow/issues/40122#issuecomment-1954012388

   I think everything is behaving as expected, but the repr is confusing.
   
   
   
   > **Datetimes**
   > ...
   > Converting to pyarrow makes it 11 PM in UTC+1, which is a different time 
(not just the same time in a different coordinate system):
   
   So it actually did convert it to 11 PM in UTC (the UTC value is what is 
stored under the hood), which still represents midnight in UTC+1.  
   One might expect the repr to show wall time relative to the type's 
timezone, but the repr shows the UTC values, which is causing the confusion.
   
   In the latest pyarrow 15.0, we improved the repr a _little_ bit by adding 
"Z" as an indicator that what you see are UTC values:
   
   ```
   In [4]: a
   Out[4]: 
   <pyarrow.lib.TimestampArray object at 0x7f2fe814ebc0>
   [
     1999-12-31 23:00:00.000000Z
   ]
   
   In [5]: a.type
   Out[5]: TimestampType(timestamp[us, tz=+01:00])
   ```
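
To make the equivalence concrete, here is a minimal sketch using only the standard library `datetime` module (no pyarrow involved; the variable names are illustrative). It assumes the value from the report is midnight on 2000-01-01 at a fixed UTC+1 offset, and shows that the UTC value in the repr is the same instant:

```python
from datetime import datetime, timedelta, timezone

# Midnight on 2000-01-01 at a fixed UTC+1 offset (assumed from the report).
plus_one = timezone(timedelta(hours=1))
local_midnight = datetime(2000, 1, 1, 0, 0, tzinfo=plus_one)

# The same instant expressed in UTC, which is what the repr displays.
as_utc = local_midnight.astimezone(timezone.utc)

print(as_utc.isoformat())        # 1999-12-31T23:00:00+00:00
print(as_utc == local_midnight)  # True: aware datetimes compare by instant
```

Only the displayed coordinate system differs; the point in time is unchanged.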
   
   > **Times**
   > 
   > Creating a pyarrow array from a datetime.time strips timezone info:
   
   The Time type in Arrow format does not have a notion of timezones (see 
[flatbuffer 
spec](https://github.com/apache/arrow/blob/4dc3d04ae84d97d02443c0cef555a46535925c2b/format/Schema.fbs#L255-L274)),
 and pyarrow's conversion code for python->arrow currently just ignores the 
timezone. 
   This is not ideal, and maybe it would be better if pyarrow just raised an 
error instead? (indicating it doesn't support `datetime.time` objects with a 
timezone?)
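
As a rough illustration of what "raise instead of ignore" could look like, here is a sketch in plain Python. It is not pyarrow's actual conversion code, and `time_to_micros` is a hypothetical helper introduced only for this example:

```python
from datetime import time, timedelta, timezone


def time_to_micros(t: time) -> int:
    """Hypothetical helper: microseconds since midnight for a naive time.

    Raises ValueError for timezone-aware times instead of silently
    dropping the tzinfo.
    """
    if t.tzinfo is not None:
        raise ValueError(
            "datetime.time objects with a timezone are not supported"
        )
    seconds = (t.hour * 60 + t.minute) * 60 + t.second
    return seconds * 1_000_000 + t.microsecond


# A naive time converts as usual.
naive_micros = time_to_micros(time(1, 2, 3, 4))

# A timezone-aware time is rejected rather than stripped.
aware = time(1, 2, 3, tzinfo=timezone(timedelta(hours=1)))
try:
    time_to_micros(aware)
    rejected = False
except ValueError:
    rejected = True
```

Raising makes the unsupported case visible to the user, at the cost of breaking code that currently relies on the timezone being ignored.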
   
   
   
   
   
   
   > which seems to be intentional according to 
https://arrow.apache.org/docs/python/generated/pyarrow.Time64Array.html:
   > 
   > > Localized timestamps will currently be returned as UTC (pandas’s native 
representation). Timezone-naive data will be implicitly interpreted as UTC.
   
   Hmm, that note seems quite outdated. Also, it is about timestamps, not about 
times: it appears on the Time64Array doc page only because this `from_pandas` 
method is implemented on the base class and thus has the same docstring for 
all concrete Array subclasses. 
   
   

