[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18664 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18664 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82836/ Test PASSed.

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18664 **[Test build #82836 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82836/testReport)** for PR 18664 at commit

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18664 **[Test build #82836 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82836/testReport)** for PR 18664 at commit

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-17 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18664 Jenkins, retest this please.

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18664 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82832/ Test FAILed.

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18664 Merged build finished. Test FAILed.

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18664 **[Test build #82832 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82832/testReport)** for PR 18664 at commit

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18664 **[Test build #82832 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82832/testReport)** for PR 18664 at commit

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18664 **[Test build #82826 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82826/testReport)** for PR 18664 at commit

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18664 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82826/ Test FAILed.

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18664 Merged build finished. Test FAILed.

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18664 **[Test build #82826 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82826/testReport)** for PR 18664 at commit

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18664 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82650/ Test FAILed.

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18664 Merged build finished. Test FAILed.

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18664 **[Test build #82650 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82650/testReport)** for PR 18664 at commit

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-11 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/18664 I think I sort of have things working now the way we discussed. Working with timestamps in `toPandas()` was pretty straightforward, but there are some differences with them in `pandas_udf` and

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18664 **[Test build #82650 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82650/testReport)** for PR 18664 at commit

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18664 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82641/ Test PASSed.

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18664 Merged build finished. Test PASSed.

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18664 **[Test build #82641 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82641/testReport)** for PR 18664 at commit

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-11 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/18664 Thanks @ueshin , I agree it is better to convert the timezone to Python system local first and then localize to make tz-naive in case the Python system local tz is different than
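The approach agreed on in this comment (convert to the Python system local timezone first, then drop the timezone to make the value tz-naive) can be illustrated with a minimal sketch. It uses only the standard-library `datetime` module rather than pandas or Arrow, and the helper name `to_naive_local` and the fixed UTC-8 offset standing in for the system local timezone are assumptions for illustration, not code from the PR.

```python
from datetime import datetime, timedelta, timezone


def to_naive_local(ts_utc, local_tz):
    """Shift an aware UTC timestamp into local_tz, then drop the tzinfo."""
    if ts_utc.tzinfo is None:
        raise ValueError("expected a timezone-aware timestamp")
    # Step 1: convert to the target (local) timezone.
    local_ts = ts_utc.astimezone(local_tz)
    # Step 2: strip the timezone so the result is tz-naive wall-clock time.
    return local_ts.replace(tzinfo=None)


# Hypothetical fixed-offset zone standing in for the Python system local timezone.
LOCAL_TZ = timezone(timedelta(hours=-8))

utc_ts = datetime(2017, 10, 11, 12, 0, 0, tzinfo=timezone.utc)
naive_local = to_naive_local(utc_ts, LOCAL_TZ)
print(naive_local)
```

With a pandas Series, the same two steps would presumably be `tz_convert` followed by `tz_localize(None)` through the `.dt` accessor.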

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18664 **[Test build #82641 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82641/testReport)** for PR 18664 at commit

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-11 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18664 That's a great explanation. I think you are right. Using `SQLConf.SESSION_LOCAL_TIMEZONE` makes much more sense to me now.

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-11 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18664 I disagree with using `DateTimeUtils.defaultTimeZone()` for the timezone. If `DateTimeUtils.defaultTimeZone()` is different from system timezone in Python, the return values are different between

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18664 Merged build finished. Test FAILed.

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18664 **[Test build #82613 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82613/testReport)** for PR 18664 at commit

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18664 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82613/ Test FAILed.

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18664 **[Test build #82613 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82613/testReport)** for PR 18664 at commit

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18664 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82612/ Test FAILed.

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18664 Merged build finished. Test FAILed.

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18664 **[Test build #82612 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82612/testReport)** for PR 18664 at commit

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-10 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/18664 @HyukjinKwon @ueshin , please take a look. This should handle timestamps with Arrow the same as without Arrow. I still need to add some tests for timestamps with `pandas_udf`s.

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18664 **[Test build #82612 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82612/testReport)** for PR 18664 at commit

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18664 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82611/ Test FAILed.

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18664 Build finished. Test FAILed.

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18664 **[Test build #82611 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82611/testReport)** for PR 18664 at commit

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-10 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18664 **[Test build #82611 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82611/testReport)** for PR 18664 at commit

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-10 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/18664 I'll work on doing (1) to have conversions in Python for Arrow to match Non-Arrow and we can see how that turns out.

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-10 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/18664 > I'm just wondering what if we use timestamp in nested types. Currently we don't support nested types but in the future? I'll try to take this into account, or at least add a note for

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-10 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/18664 > BTW, do you think it is possible to easily de-duplicate timezone handling for both with-Arrow and without-Arrow within Python side if we go for 1. in the separate PR? @HyukjinKwon ,

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-10 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18664 I'd say I prefer 1, too. I'm just wondering what if we use timestamp in nested types. Currently we don't support nested types but in the future?

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18664 > Write Arrow data with SESSION_LOCAL timestamp (as is currently in this PR) BTW, could we just use `DateTimeUtils.defaultTimeZone()` instead of `SQLConf.SESSION_LOCAL_TIMEZONE` if you

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-09 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18664 @BryanCutler, BTW, do you think it is possible to de-duplicate timezone handling within Python side if we go for 1.?

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-09 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18664 I think I prefer 1. Do you maybe have a preference @ueshin? I believe you are more insightful in this.

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-09 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/18664 Ok sounds good. Could I get some opinions on the best way to convert internal Spark timestamps since they are stored as UTC time? I think we have the following options: 1. Write

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-09 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18664 Yup, I think we already don't have timezone in `udf` too? I think we are fine as long as it keeps the existing behaviour. Let's not forget to handle all those cases when we deal with timezone

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-09 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/18664 @HyukjinKwon and @ueshin so with Arrow, the Pandas DataFrame from `toPandas()` timestamp columns will not have a timezone - are we going to do the same thing for `pandas_udf` Series? I was

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-08 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18664 I'm sorry for the delay. I agree with @HyukjinKwon's suggestion to keep the behavior of current `toPandas` without Arrow for now.

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-07 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18664 Yup, that's what I suggested. To me, it sounds like a few issues are conflated here, and I want to proceed separately with what is clear for now.

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-06 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/18664 Made [SPARK-1](https://issues.apache.org/jira/browse/SPARK-1) for the user doc; once we decide what to do with timestamps it can be completed.

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-06 Thread icexelloss
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/18664 Bryan, I haven't created it. Go ahead!

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-06 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/18664 Thanks all for the discussion. I think there are a lot of subtleties at play here, and what may or may not be considered a bug can depend on the user's intent. Regardless, I agree that there

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-06 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18664 I am okay with proceeding separately for dealing with timezone, and matching the behaviour with Arrow to the existing behaviour without Arrow here with respect to timezone. Less sure

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-06 Thread icexelloss
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/18664 If we all agree on the necessity of a design doc first, I can create a Jira and we can make progress there. What do you all think? @BryanCutler @gatorsmile @HyukjinKwon

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-06 Thread icexelloss
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/18664 I agree. I think we need a high-level document describing these differences so we can discuss them. I think we should be more careful about Arrow-version behavior before releasing support for timestamp

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-06 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18664 Yup, I admit there could be some exceptions (there have been actually) but that should still be the baseline we should basically pursue. Probably, we could treat this Arrow optimisation as an

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-06 Thread icexelloss
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/18664 > The baseline should be (as said above): Internal optimisation should not introduce any behaviour change, and we are discouraged from changing the previous behaviour unless it has bugs in general.

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-06 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/18664 The baseline should be (as said above): Internal optimisation should not introduce any behaviour change, and we are discouraged from changing the previous behaviour unless it has bugs in general.

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-06 Thread icexelloss
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/18664 cc @ueshin

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-06 Thread icexelloss
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/18664 Thanks @gatorsmile for the constructive feedback! I don't want to make this more complicated but I also want to make sure we are aware that there is also a difference between

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-06 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18664 I think all of the involved reviewers agree this is a pretty serious design issue. We are unable to change the behavior after we officially release it. Thus, we have to be very very careful

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-05 Thread icexelloss
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/18664 I agree with Bryan. I think we might want to rethink the assumption that toPandas result with arrow / without arrow should be 100% the same. For instance, non-Arrow doesn't respect

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-05 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/18664 @ueshin @HyukjinKwon , I think it would be critical for users to have timestamps working for Arrow. Just to recap, the remaining issue here was that `toPandas()` without Arrow does not have

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-08-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18664 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-08-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18664 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80646/ Test PASSed.

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-08-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18664 **[Test build #80646 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80646/testReport)** for PR 18664 at commit

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-08-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18664 **[Test build #80646 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80646/testReport)** for PR 18664 at commit

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-08-14 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/18664 I'm OK with that, @ueshin. I'll revert back to the PR you made, then remove the default value and throw an exception if there is a TimestampType and `timeZoneId` is `None`.
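The check described here (no default timezone, and an error when a timestamp column is present without one) might look roughly like the following Python sketch. The function name `require_time_zone`, the string type names, and the error message are hypothetical; the PR's actual change is not shown in this thread.

```python
def require_time_zone(field_types, time_zone_id=None):
    """Return time_zone_id, raising if a timestamp field lacks a timezone.

    field_types is a list of type names such as "int" or "timestamp";
    these names are placeholders, not Spark's real type objects.
    """
    has_timestamp = any(t == "timestamp" for t in field_types)
    if has_timestamp and time_zone_id is None:
        # No silent default: the caller must say which timezone applies.
        raise ValueError("timeZoneId is required when the schema has a timestamp field")
    return time_zone_id


# Schemas without timestamps do not need a timezone.
print(require_time_zone(["int", "string"]))
# Schemas with timestamps pass only when a timezone is supplied.
print(require_time_zone(["int", "timestamp"], "UTC"))
```

Failing early in this way keeps the timezone choice explicit at the call site instead of depending on a session-level default.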

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-08-13 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18664 @BryanCutler I'm sorry for the delay. I think it's too strict as an API to use `SparkSession` to apply timezone. How about throwing an exception instead of using

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-08-10 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/18664 Hi @ueshin , do you have an idea on how to proceed here?

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-08-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18664 Merged build finished. Test FAILed.

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-08-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18664 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80180/ Test FAILed.

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-08-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18664 **[Test build #80180 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80180/testReport)** for PR 18664 at commit

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-08-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18664 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80168/ Test PASSed.

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-08-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18664 Merged build finished. Test PASSed.

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-08-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18664 Merged build finished. Test PASSed.

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-08-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18664 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80167/ Test PASSed.

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-08-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18664 **[Test build #80168 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80168/testReport)** for PR 18664 at commit

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-08-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18664 **[Test build #80167 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80167/testReport)** for PR 18664 at commit

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-08-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18664 Merged build finished. Test FAILed.

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-08-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18664 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80169/ Test FAILed.

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-08-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18664 **[Test build #80169 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80169/testReport)** for PR 18664 at commit

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-08-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18664 **[Test build #80169 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80169/testReport)** for PR 18664 at commit

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-08-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18664 **[Test build #80168 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80168/testReport)** for PR 18664 at commit

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-08-02 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/18664 Sorry @ueshin, I forgot to push the changes described in my last comment, please take a look when you can.

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-08-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18664 **[Test build #80167 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80167/testReport)** for PR 18664 at commit

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18664 Merged build finished. Test FAILed.

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18664 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80136/ Test FAILed. ---

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-08-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18664 **[Test build #80136 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80136/testReport)** for PR 18664 at commit

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18664 Merged build finished. Test PASSed.

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-08-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18664 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80134/ Test PASSed. ---

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-08-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18664 **[Test build #80134 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80134/testReport)** for PR 18664 at commit

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-08-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18664 **[Test build #80136 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80136/testReport)** for PR 18664 at commit

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-08-01 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/18664 I merged your changes @ueshin, but having timezone as an Option this way makes me a little nervous. It will be easy for people to omit it, and doing so won't cause an immediate failure, but

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-08-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18664 **[Test build #80134 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80134/testReport)** for PR 18664 at commit

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-08-01 Thread icexelloss
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/18664 To Wes's concern, I think we are only dealing with values in UTC here; both Spark and Arrow internally represent timestamps as microseconds since epoch. To the two issues Bryan and

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-08-01 Thread wesm
Github user wesm commented on the issue: https://github.com/apache/spark/pull/18664 For item 2, in Arrow-land if the data is time zone aware, then it must be internally normalized to UTC. Conversions are therefore metadata-only operations and do not require any computation. The
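The point about UTC normalization can be illustrated with a minimal sketch in plain Python. It does not use the Arrow or Spark APIs; the helper names `to_utc_micros` and `with_zone` and the fixed `-07:00` offset are assumptions made for this illustration only. It shows that when the stored value is microseconds since the epoch in UTC, choosing a different time zone changes only how the value is displayed, not the stored integer.

```python
from datetime import datetime, timedelta, timezone

# Reference instant: 1970-01-01 00:00:00 UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_utc_micros(dt):
    """Return microseconds since the epoch (UTC) for a timezone-aware datetime.

    Hypothetical helper for illustration; not part of Spark or Arrow.
    """
    if dt.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return (dt - EPOCH) // timedelta(microseconds=1)


def with_zone(micros, tz):
    """Render a stored UTC value in the given zone; the stored integer is untouched."""
    return (EPOCH + timedelta(microseconds=micros)).astimezone(tz)


# A fixed -07:00 offset, chosen only as an example zone.
pacific = timezone(timedelta(hours=-7))

noon_utc = datetime(2017, 8, 1, 12, 0, tzinfo=timezone.utc)
same_instant = datetime(2017, 8, 1, 5, 0, tzinfo=pacific)

# Both wall-clock readings describe one instant, so they store the same value.
stored = to_utc_micros(noon_utc)
```

Here `to_utc_micros(same_instant)` equals `stored`, while `with_zone(stored, pacific)` and `with_zone(stored, timezone.utc)` show different wall-clock hours for that one stored value. The zone acts as display metadata attached to the data rather than a transformation of it.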

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-08-01 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/18664 > I don't think Scala/Java Timestamp encoder has the same issue
Scala and Python handle timestamps the same way: both store them internally as time from `1970-01-01 00:00:00.0 UTC` and

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-08-01 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18664 I don't think the Scala/Java Timestamp encoder has the same issue, because `java.sql.Timestamp` always holds the timestamp value from `1970-01-01 00:00:00.0 UTC` regardless of timezone, the same as Spark
