Github user BryanCutler commented on the issue:
https://github.com/apache/spark/pull/18664
@ueshin @icexelloss I'm not sure `SESSION_LOCAL_TIMEZONE` really makes
things better for this. The issue I see is that it does nothing when
importing/exporting data from Spark DataFrames; it only pertains to timestamp
data while in the context of Spark SQL. Let me show an example where my local
timezone is "America/Los_Angeles" and I set the session timezone to
"America/New_York".
```
In [1]: import datetime
   ...: from pyspark.sql.types import *
   ...: spark.conf.set("spark.sql.session.timeZone", "America/New_York")
   ...: spark.conf.set("spark.sql.execution.arrow.enable", "false")
   ...: dt = datetime.datetime(1970, 1, 1, 0, 0, 1)
   ...: TimestampType().toInternal(dt)

Out[1]: 28801000000  # this is still offset to "Los_Angeles"

In [2]: df = spark.createDataFrame([(dt,)],
   ...:     schema=StructType([StructField("ts", TimestampType(), True)]))
   ...: df.show()

+-------------------+
|                 ts|
+-------------------+
|1970-01-01 03:00:01|
+-------------------+
# this displays correctly

In [3]: df.collect()
Out[3]: [Row(ts=datetime.datetime(1970, 1, 1, 0, 0, 1))]
# Spark does not pass the tz info on collect

In [4]: df.toPandas()
Out[4]:
                   ts
0 1970-01-01 00:00:01
# if this had used Arrow with the session tz, it would display 1970-01-01 03:00:01
```
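To make the `Out[1]` value concrete: `TimestampType().toInternal` is essentially epoch seconds computed via the driver's *local* timezone, scaled to microseconds. A minimal sketch of that arithmetic (not Spark code; it assumes a Unix system where `time.tzset()` is available and forces the local timezone to "America/Los_Angeles" to match the session above):

```python
import datetime
import os
import time

# Force the local timezone to America/Los_Angeles, matching the session
# above (hypothetical reproduction; Spark itself uses the JVM/driver tz).
os.environ["TZ"] = "America/Los_Angeles"
time.tzset()

dt = datetime.datetime(1970, 1, 1, 0, 0, 1)

# time.mktime interprets the naive datetime in the local timezone:
# 1970-01-01 00:00:01 PST is 8 hours + 1 second past the UTC epoch.
micros = int(time.mktime(dt.timetuple())) * 1_000_000 + dt.microsecond
print(micros)  # 28801000000 = (8 * 3600 + 1) * 1_000_000
```

So the stored value already has the Los Angeles offset baked in, regardless of what the session timezone is set to.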
When importing the timestamp, it is still assumed to be in my local
timezone. `show()` displays it in the session timezone, but `collect()` and
`toPandas()` ignore the session timezone and just convert back to my local
timezone. So I think this doesn't really improve much and still leads to
inconsistent behavior. Do you guys agree?
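The 3-hour gap between what `show()` prints and what `collect()`/`toPandas()` return can be reproduced outside Spark with just the stdlib (a sketch using `zoneinfo`, Python 3.9+; the timezone names are the ones from the session above):

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# The naive value Spark hands back on collect()/toPandas(), interpreted in
# the driver's local timezone (America/Los_Angeles in the session above):
local = datetime(1970, 1, 1, 0, 0, 1, tzinfo=ZoneInfo("America/Los_Angeles"))

# Re-rendered in the session timezone, it matches what show() printed:
print(local.astimezone(ZoneInfo("America/New_York")))
# 1970-01-01 03:00:01-05:00
```

Same instant, two different wall-clock renderings, and only one of the two code paths applies the session timezone.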