Github user BryanCutler commented on the issue:

    https://github.com/apache/spark/pull/18664
  
    > I don't think Scala/Java Timestamp encoder has the same issue
    
    Scala and Python handle timestamps the same way: both store them internally as
    time since `1970-01-01 00:00:00.0 UTC`, and conversion to/from that internal
    value is done with the local time zone, even when `SESSION_LOCAL_TIMEZONE` is
    set.  Here is an example: the timestamp is created and displayed using the
    session time zone (EST), but when collected it comes back in local time (PST).
    
    ```scala
    scala> spark.conf.set("spark.sql.session.timeZone", "America/New_York")
    
    scala> val ds = spark.range(3).withColumn("ts", current_timestamp())
    ds: org.apache.spark.sql.DataFrame = [id: bigint, ts: timestamp]
    
    scala> ds.show(truncate=false)
    +---+-----------------------+
    |id |ts                     |
    +---+-----------------------+
    |0  |2017-08-01 14:21:31.386|
    |1  |2017-08-01 14:21:31.386|
    |2  |2017-08-01 14:21:31.386|
    +---+-----------------------+
    
    scala> ds.select("ts").collect()
    res6: Array[org.apache.spark.sql.Row] = Array([2017-08-01 11:22:23.2], [2017-08-01 11:22:23.2], [2017-08-01 11:22:23.2])
    ```
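    
    A rough PySpark equivalent, for comparison (a sketch, not code from this PR;
    it assumes an existing `SparkSession` named `spark`, and the time zones in the
    comments are illustrative):
    
    ```python
    from pyspark.sql.functions import current_timestamp
    
    spark.conf.set("spark.sql.session.timeZone", "America/New_York")
    
    df = spark.range(3).withColumn("ts", current_timestamp())
    
    # show() renders the timestamps with the session time zone (EST here)
    df.show(truncate=False)
    
    # collect() returns datetime.datetime objects converted with the machine's
    # local time zone (PST in the example above), not the session time zone
    print(df.select("ts").collect())
    ```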
    
    For the two issues you mentioned:
    
    1) I'm fine with the Arrow data being stored with the session local time zone.
    I think it makes things a little more confusing for the user, since that time
    zone is not used when importing/exporting the data, but it doesn't cause any
    big problem.  The issues I'm bringing up are really about how Spark handles
    timestamps in general and not really related to what we do with the Arrow
    data, so hopefully they can be addressed later.
    
    2) Once Python receives the Arrow data, I think it's best to leave it as is
    for performance reasons, so that pyarrow can read the buffers directly without
    any further conversions.

