jorisvandenbossche commented on issue #41268:
URL: https://github.com/apache/arrow/issues/41268#issuecomment-2063190391

   Casting a string to a timestamp is essentially parsing the string 
(`strptime`), and we currently don't allow parsing a non-tz-aware string to a 
tz-aware timestamp. Doing so would require guessing whether the string is in 
local wall time or in UTC, i.e. whether it is a tz localize or a tz convert 
operation, in pandas terms.
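
To make the ambiguity concrete, here is a minimal standard-library sketch (not pyarrow code) of the two possible readings of the same string. The fixed +01:00 offset named `BRUSSELS_WINTER` is an assumption standing in for Europe/Brussels on this January date, so that the example does not depend on a timezone database.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed +01:00 offset stands in for Europe/Brussels in January (CET).
BRUSSELS_WINTER = timezone(timedelta(hours=1), "CET")

# Parse the string without any timezone information.
naive = datetime.strptime("2024-01-01 05:00:00", "%Y-%m-%d %H:%M:%S")

# Reading 1 ("tz localize"): the string is local wall time in Brussels.
localized = naive.replace(tzinfo=BRUSSELS_WINTER)

# Reading 2 ("tz convert"): the string is UTC, then displayed in Brussels time.
converted = naive.replace(tzinfo=timezone.utc).astimezone(BRUSSELS_WINTER)

print(localized.isoformat())  # 2024-01-01T05:00:00+01:00
print(converted.isoformat())  # 2024-01-01T06:00:00+01:00
print(converted - localized)  # 1:00:00
```

Both results are valid tz-aware timestamps, but they are different instants one hour apart, which is why a parser cannot pick one without being told.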
   
   The other examples you give are parsing a non-tz-aware string to a 
non-tz-aware timestamp (no ambiguity, this works fine) and casting a 
non-tz-aware timestamp to a tz-aware timestamp. This last case is also 
potentially ambiguous, but the cast here is a very simple zero-copy cast that 
just changes the metadata of the timestamp type (to add a timezone), and thus 
treats the input as UTC. To interpret the input as local wall time instead, 
there is a specific kernel, `pc.assume_timezone`.
   
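   So parsing a non-tz-aware string to a tz-aware timestamp can always be done 
in two steps: first parse / cast to a non-tz-aware timestamp, then convert to a 
tz-aware timestamp, either with a cast (input treated as UTC) or with 
`pc.assume_timezone` (input treated as local wall time):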
   
   ```
   >>> import pyarrow as pa
   >>> import pyarrow.compute as pc
   >>> pa.array(["2024-01-01 05:00:00"]).cast(pa.timestamp("s")).cast(pa.timestamp("s", "Europe/Brussels"))
   <pyarrow.lib.TimestampArray object at 0x7f065c331960>
   [
     2024-01-01 05:00:00Z
   ]
   >>> pc.assume_timezone(pa.array(["2024-01-01 05:00:00"]).cast(pa.timestamp("s")), "Europe/Brussels")
   <pyarrow.lib.TimestampArray object at 0x7f065c2d26e0>
   [
     2024-01-01 04:00:00Z
   ]
   ```
   
   

