Github user HyukjinKwon commented on the issue:
https://github.com/apache/spark/pull/18664
The baseline should be (as said above): internal optimisation should not
introduce any behaviour change, and in general we are discouraged from
changing the previous behaviour unless it has bugs.
So, the issue is that the previous behaviour without Arrow does not respect
`SESSION_LOCAL_TIMEZONE`, although it should. However, respecting it is a
behaviour change, so we should discuss it further and possibly introduce a
configuration to control this and prevent the behaviour change. Also, to be
clear, I assume we agree that not respecting `SESSION_LOCAL_TIMEZONE` is not
a bug but a behaviour to be changed as an improvement, right?
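To make the difference concrete, here is a minimal sketch in plain Python (not
Spark code). It assumes a timestamp is held as a single UTC instant and only
shows how the wall-clock value a user sees depends on which timezone is applied
at conversion time. The two offsets and the helper `to_wall_clock` are
hypothetical and chosen purely for illustration.

```python
from datetime import datetime, timedelta, timezone

# A timestamp held as one instant in UTC.
instant = datetime(2017, 7, 18, 12, 0, 0, tzinfo=timezone.utc)

# Hypothetical zones, fixed offsets for simplicity (no DST handling):
# the machine's local zone and the zone set as the session-local timezone.
system_tz = timezone(timedelta(hours=9))
session_tz = timezone(timedelta(hours=-7))


def to_wall_clock(ts, tz):
    """Render an instant as a naive wall-clock datetime in the given zone."""
    return ts.astimezone(tz).replace(tzinfo=None)


# Not respecting the session timezone: the system zone decides the result.
without_session_tz = to_wall_clock(instant, system_tz)

# Respecting the session timezone: the session zone decides the result.
with_session_tz = to_wall_clock(instant, session_tz)

print(without_session_tz)  # 2017-07-18 21:00:00
print(with_session_tz)     # 2017-07-18 05:00:00
```

The same stored instant yields two different naive values, which is why
switching from one conversion to the other is visible to users as a behaviour
change.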
How about first matching the behaviour with Arrow to the previous behaviour
without Arrow (not respecting `SESSION_LOCAL_TIMEZONE`), and then separately
making a PR to introduce a configuration to respect
`SESSION_LOCAL_TIMEZONE` for both?
I actually have some doubts about respecting `SESSION_LOCAL_TIMEZONE` when
we `collect()` or `toPandas()`, and I would like to move this discussion to
another PR, a JIRA or the mailing list if possible.