HyukjinKwon edited a comment on pull request #33875:
URL: https://github.com/apache/spark/pull/33875#issuecomment-909834254


   Just to clarify a bit more, the Arrow specification describes it as below 
(previously it was documented as a local datetime):
   
   > If a Timestamp column has no timezone value, its epoch is 
1970-01-01 00:00:00 (January 1st 1970, midnight) in an \*unknown\* timezone.
   
   > In particular, it is \*not\* possible to interpret an unset or empty 
timezone as the same as "UTC"
   
   which I believe is inspired by Python's naive vs. aware `datetime` objects: 
https://docs.python.org/3/library/datetime.html#aware-and-naive-objects:
   
   > Because naive `datetime` objects are treated by many `datetime` methods as 
local times
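
To make the naive vs. aware distinction concrete, here is a minimal sketch using only the Python standard library. The helper name `describe` and the sample date are made up for illustration and are not part of the PR.

```python
from datetime import datetime, timezone

def describe(dt):
    """Return 'naive' if dt carries no timezone info, else 'aware'."""
    return "naive" if dt.tzinfo is None else "aware"

# Same wall-clock fields, with and without a timezone attached.
naive = datetime(2021, 9, 1, 0, 0, 0)
aware = datetime(2021, 9, 1, 0, 0, 0, tzinfo=timezone.utc)

print(describe(naive), describe(aware))

# An aware datetime identifies a single instant, so its epoch value is fixed.
print(aware.timestamp())

# A naive datetime is treated as local time here, so this value depends on
# the timezone of the machine that runs the code.
print(naive.timestamp())

# Ordering a naive datetime against an aware one is rejected outright.
try:
    naive < aware
    mixed_ordering_allowed = True
except TypeError:
    mixed_ordering_allowed = False
print(mixed_ordering_allowed)
```

The point is only that a naive value does not pin down an instant until some timezone is assumed for it.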
   
   Here is Spark's take: 
   - With `TimestampType`, we will interpret `datetime` as a local time 
(a.k.a. `TIMESTAMP WITH LOCAL TIME ZONE`)
   - With `TimestampNTZType`, we will interpret `datetime` as being in an unknown 
timezone (a.k.a. `TIMESTAMP WITHOUT TIME ZONE`), and compute on it 
without regard to the local (session) timezone
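
The two interpretations can be sketched with plain Python `datetime` objects. This is only an illustration of the semantics, not Spark's implementation: the function names `interpret_as_ltz` and `interpret_as_ntz` are hypothetical, and the session timezone is modelled as a fixed UTC offset for simplicity.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical session timezones, modelled as fixed UTC offsets.
UTC = timezone.utc
PLUS_NINE = timezone(timedelta(hours=9))

def interpret_as_ltz(naive, session_tz):
    """TimestampType-like reading: the naive value is a local time in the
    session timezone, so it maps to one absolute instant (shown in UTC)."""
    return naive.replace(tzinfo=session_tz).astimezone(UTC)

def interpret_as_ntz(naive, session_tz):
    """TimestampNTZType-like reading: the session timezone is ignored and
    the wall-clock fields are kept exactly as given."""
    return naive

wall_clock = datetime(2021, 9, 1, 12, 0, 0)

# The instant changes with the session timezone...
print(interpret_as_ltz(wall_clock, UTC))        # 2021-09-01 12:00:00+00:00
print(interpret_as_ltz(wall_clock, PLUS_NINE))  # 2021-09-01 03:00:00+00:00

# ...while the timezone-free value does not.
print(interpret_as_ntz(wall_clock, UTC))        # 2021-09-01 12:00:00
print(interpret_as_ntz(wall_clock, PLUS_NINE))  # 2021-09-01 12:00:00
```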
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


