HyukjinKwon edited a comment on pull request #33875:
URL: https://github.com/apache/spark/pull/33875#issuecomment-909834254


   Just to clarify a bit more, the Arrow specification describes it as below (previously it was documented as a local datetime):
   
   > If a Timestamp column has an unset or empty timezone value, its epoch is 1970-01-01 00:00:00 (January 1st 1970, midnight) in an *unknown* timezone.
   
   > In particular, it is *not* possible to interpret an unset or empty timezone as the same as "UTC"
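   
   As a rough illustration of that type-level distinction, here is a minimal pyarrow sketch (the values and the microsecond unit are just for illustration):
   
   ```python
   import datetime
   import pyarrow as pa
   
   naive = datetime.datetime(2021, 9, 1, 12, 0)           # no tzinfo
   aware = naive.replace(tzinfo=datetime.timezone.utc)    # tzinfo attached
   
   # Arrow keeps the timezone (or its absence) on the type itself.
   no_tz   = pa.array([naive], type=pa.timestamp("us"))             # wall-clock time, unknown timezone
   with_tz = pa.array([aware], type=pa.timestamp("us", tz="UTC"))   # an absolute instant
   
   print(no_tz.type)    # timestamp[us]
   print(with_tz.type)  # timestamp[us, tz=UTC]
   ```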
   
   I believe this wording is inspired by the same distinction made elsewhere, e.g. naive vs. aware `datetime` objects in Python (https://docs.python.org/3/library/datetime.html#aware-and-naive-objects):
   
   > Because naive `datetime` objects are treated by many `datetime` methods as 
local times
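   
   A quick standard-library example of that behavior (the specific datetime here is arbitrary):
   
   ```python
   import datetime
   
   naive = datetime.datetime(2021, 9, 1, 12, 0)                                 # no tzinfo
   aware = datetime.datetime(2021, 9, 1, 12, 0, tzinfo=datetime.timezone.utc)   # tzinfo set
   
   # .timestamp() treats the naive value as a local time, so the result depends
   # on the machine's local timezone; the aware value maps to one fixed instant.
   print(naive.timestamp())  # varies with the local timezone
   print(aware.timestamp())  # 1630497600.0 everywhere
   ```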
   
   Here is Spark's take:
   - With `TimestampType`, we interpret a naive `datetime` as a local time (a.k.a. `TIMESTAMP WITH LOCAL TIME ZONE`).
   - With `TimestampNTZType`, we interpret a naive `datetime` as a time in an unknown timezone (a.k.a. `TIMESTAMP WITHOUT TIME ZONE`), and compute on it without consulting the local (session) timezone (see the sketch below).
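   
   To make that concrete, here is a rough standard-library sketch of the two interpretation rules (an illustration of the semantics only, with a made-up session timezone; this is not Spark's actual conversion code):
   
   ```python
   import datetime
   from zoneinfo import ZoneInfo  # Python 3.9+
   
   EPOCH_UTC = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)
   naive = datetime.datetime(2021, 9, 1, 12, 0)      # naive wall-clock value
   session_tz = ZoneInfo("America/Los_Angeles")      # hypothetical session timezone
   
   # TimestampType-style: attach the session timezone and store the instant as
   # microseconds since the UTC epoch, so the stored value follows the session timezone.
   ltz_micros = int((naive.replace(tzinfo=session_tz) - EPOCH_UTC).total_seconds() * 1_000_000)
   
   # TimestampNTZType-style: encode the wall-clock fields as-is (as if they were UTC),
   # so no timezone enters the computation at all.
   ntz_micros = int((naive.replace(tzinfo=datetime.timezone.utc) - EPOCH_UTC).total_seconds() * 1_000_000)
   
   print(ltz_micros)  # depends on the session timezone
   print(ntz_micros)  # the same for a given wall-clock value, regardless of timezone
   ```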
   

