[ 
https://issues.apache.org/jira/browse/ARROW-1425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16331258#comment-16331258
 ] 

Li Jin edited comment on ARROW-1425 at 1/18/18 9:29 PM:
--------------------------------------------------------

Here is my attempt to explain this issue (wip):

https://docs.google.com/document/d/1vfL8gLWKCgf7ZVLglnNffdvwjJjC4MqwnnoCuEJaRrU/edit?usp=sharing


was (Author: icexelloss):
Here is my attempt to explain this issue (wip):

https://docs.google.com/document/d/1vfL8gLWKCgf7ZVLglnNffdvwjJjC4MqwnnoCuEJaRrU/edit#heading=h.132ni22bywvl

> [Python] Document semantic differences between Spark timestamps and Arrow 
> timestamps
> ------------------------------------------------------------------------------------
>
>                 Key: ARROW-1425
>                 URL: https://issues.apache.org/jira/browse/ARROW-1425
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Python
>            Reporter: Wes McKinney
>            Assignee: Heimir Thor Sverrisson
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.9.0
>
>
> Spark treats non-timezone-aware timestamps as session local. This can be 
> problematic when using pyarrow, which may view the data coming from 
> toPandas() as time zone naive (but with fields as though they were UTC, not 
> session local). We should carefully document how to handle the data coming 
> from Spark properly to avoid problems.
> cc [~bryanc] [~holdenkarau]
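
To make the mismatch described above concrete, here is a minimal sketch using only the Python standard library. It does not call Spark or pyarrow; the fixed UTC-8 offset stands in for a hypothetical session time zone, and `SESSION_TZ` and `as_session_local` are illustrative names, not part of any library. It shows how the same naive fields read differently depending on whether they are taken as UTC or as session-local time.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical session time zone: a fixed UTC-8 offset (assumption for illustration).
SESSION_TZ = timezone(timedelta(hours=-8))


def as_session_local(naive_utc_fields: datetime) -> datetime:
    """Interpret naive fields as UTC, then express them in the session time zone."""
    if naive_utc_fields.tzinfo is not None:
        raise ValueError("expected a time zone naive datetime")
    # Attach UTC to the naive fields, then convert to the session offset.
    return naive_utc_fields.replace(tzinfo=timezone.utc).astimezone(SESSION_TZ)


# A naive value whose fields are actually UTC wall-clock time.
naive = datetime(2018, 1, 18, 21, 29, 0)
local = as_session_local(naive)

print(naive.isoformat())  # fields read as-is, no time zone attached
print(local.isoformat())  # same instant, shown in the session time zone
```

A reader who treats `naive` as session-local time would be off by the session offset (eight hours in this sketch), which is the kind of confusion the documentation should warn about.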



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
