[ https://issues.apache.org/jira/browse/ARROW-1425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16331258#comment-16331258 ]
Li Jin edited comment on ARROW-1425 at 1/18/18 9:29 PM:
--------------------------------------------------------
Here is my attempt to explain this issue (wip):
https://docs.google.com/document/d/1vfL8gLWKCgf7ZVLglnNffdvwjJjC4MqwnnoCuEJaRrU/edit?usp=sharing
was (Author: icexelloss):
Here is my attempt to explain this issue (wip):
https://docs.google.com/document/d/1vfL8gLWKCgf7ZVLglnNffdvwjJjC4MqwnnoCuEJaRrU/edit#heading=h.132ni22bywvl
> [Python] Document semantic differences between Spark timestamps and Arrow
> timestamps
> ------------------------------------------------------------------------------------
>
> Key: ARROW-1425
> URL: https://issues.apache.org/jira/browse/ARROW-1425
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Python
> Reporter: Wes McKinney
> Assignee: Heimir Thor Sverrisson
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Spark treats non-timezone-aware timestamps as session local. This can be
> problematic when using pyarrow, which may view the data coming from
> toPandas() as time zone naive (but with fields as though they were UTC, not
> session local). We should carefully document how to handle data coming from
> Spark to avoid problems.
> cc [~bryanc] [~holdenkarau]
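
To make the mismatch described above concrete, here is a minimal sketch using only the Python standard library. It does not call Spark or pyarrow; the fixed UTC-5 session time zone and the sample timestamp are assumptions chosen purely for illustration. It shows that the same naive fields name two different instants depending on whether they are read as UTC or as session local.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical session time zone (fixed UTC-5 offset), for illustration only.
SESSION_TZ = timezone(timedelta(hours=-5))

# A time-zone-naive timestamp: wall-clock fields with no zone attached.
naive = datetime(2018, 1, 18, 21, 29)

# Reading 1: the fields are UTC wall-clock values.
as_utc = naive.replace(tzinfo=timezone.utc)

# Reading 2: the fields are session-local wall-clock values.
as_session_local = naive.replace(tzinfo=SESSION_TZ)

# The two readings name different instants, separated by the session's UTC offset.
gap = as_session_local - as_utc

# Explicit conversion: take the UTC instant and express it as session-local naive fields.
local_fields = as_utc.astimezone(SESSION_TZ).replace(tzinfo=None)

print(gap)
print(local_fields)
```

If the receiving side assumes one reading while the producing side meant the other, every value is silently shifted by the session offset, which is why an explicit conversion step is likely worth documenting.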
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)