Github user BryanCutler commented on the issue:
https://github.com/apache/spark/pull/18664
> I don't think Scala/Java Timestamp encoder has the same issue
Scala and Python handle timestamps the same way: both store them internally
as time since `1970-01-01 00:00:00.0 UTC`, and conversion to/from the internal
value is done with the local time zone, even with `SESSION_LOCAL_TIMEZONE`
set. Here is an example: the timestamp is rendered in the session time zone
(EST) by `show()`, but when collected it comes back in the local time zone (PST).
```scala
scala> spark.conf.set("spark.sql.session.timeZone", "America/New_York")
scala> val ds = spark.range(3).withColumn("ts", current_timestamp())
ds: org.apache.spark.sql.DataFrame = [id: bigint, ts: timestamp]
scala> ds.show(truncate=false)
+---+-----------------------+
|id |ts |
+---+-----------------------+
|0 |2017-08-01 14:21:31.386|
|1 |2017-08-01 14:21:31.386|
|2 |2017-08-01 14:21:31.386|
+---+-----------------------+
scala> ds.select("ts").collect()
res6: Array[org.apache.spark.sql.Row] = Array([2017-08-01 11:22:23.2], [2017-08-01 11:22:23.2], [2017-08-01 11:22:23.2])
```
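For comparison, here is a minimal PySpark sketch of the same round trip (not from this PR; it assumes a running `SparkSession` named `spark` and a driver in Pacific time): `show()` renders with the session time zone, while `collect()` converts the internal microseconds with the driver's local zone.
```python
# Minimal sketch, assuming an existing SparkSession bound to `spark`.
from pyspark.sql.functions import current_timestamp

spark.conf.set("spark.sql.session.timeZone", "America/New_York")
df = spark.range(3).withColumn("ts", current_timestamp())

df.show(truncate=False)           # strings rendered in the session zone (EST)
print(df.select("ts").collect())  # datetimes converted with the driver's local zone (PST)
```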
For the 2 issues you mentioned:
1) I'm fine with Arrow data being stored with the session local timezone. I
think it makes things a little more confusing for the user, since that
timezone is not used when importing/exporting data, but it doesn't cause any
big problem. The issues I'm bringing up are really about how Spark handles
timestamps in general and not really related to what we do with Arrow data,
so hopefully they can be addressed later (see the first sketch after these
points for what the timezone annotation looks like on the Arrow side).
2) Once Python receives Arrow data, I think it's best to leave it as is for
performance reasons, so that pyarrow can just read the buffers without any
further conversions (see the second sketch below).
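On point 1, a minimal sketch (assuming pyarrow) of what storing Arrow data with the session local timezone amounts to: the timezone is only metadata on the type, and the stored values stay as raw microseconds since the UTC epoch, so no values are rewritten.
```python
import pyarrow as pa

# The timezone lives on the type, not in the data: the array holds one raw
# value of microseconds since the UTC epoch.
t = pa.timestamp('us', tz='America/New_York')
arr = pa.array([1501611691386000], type=t)  # 2017-08-01 18:21:31.386 UTC,
                                            # i.e. 14:21:31.386 in New York
print(arr.type)  # timestamp[us, tz=America/New_York]
```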
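On point 2, a hedged sketch of why leaving the received data as is is cheap: pyarrow can open an IPC stream over the raw bytes and hand back record batches without touching the timestamp values. `payload` here is an illustrative stand-in for whatever bytes Python receives, not the actual code path in this PR.
```python
import pyarrow as pa

def read_batches(payload: bytes):
    # open_stream wraps the Arrow buffers directly; timestamp columns keep
    # their raw microsecond values and no per-row conversion happens here.
    reader = pa.ipc.open_stream(payload)
    return list(reader)
```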