[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23021 Merged to master. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23021 I am merging this in. Maybe I am rushing it, but please allow me to go ahead since it's going to block other PySpark PRs. In the worst case, I am willing to revert and propose this again if

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23021 Merged build finished. Test PASSed.

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23021 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98808/ Test PASSed.

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23021 **[Test build #98808 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98808/testReport)** for PR 23021 at commit

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23021 adding @holdenk, @ueshin and @icexelloss as well.

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23021 adding @icexelloss as well.

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23021 > Did you test on python3 as well? Of course!

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23021 **[Test build #98808 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98808/testReport)** for PR 23021 at commit

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23021 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23021 Merged build finished. Test PASSed.

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23021 > Could you add some descriptions to run a single test file or a single test case if exists? Done!

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23021 Yup!

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/23021 One thing - I would put “pandas” right after test_ so you get the natural logical grouping with sorting by file name. On Tue, Nov 13, 2018 at 4:58 PM Hyukjin Kwon wrote:
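The naming point above can be illustrated with a small sketch. The file names below are hypothetical, chosen only to show the effect; they are not the names used in the PR. The helper checks whether every file with "pandas" in its name ends up adjacent once the listing is sorted by file name.

```python
def pandas_block_is_contiguous(names):
    """Return True if all names containing 'pandas' are adjacent after sorting."""
    ordered = sorted(names)
    positions = [i for i, name in enumerate(ordered) if "pandas" in name]
    if not positions:
        return True
    # Adjacent means the positions form one unbroken run of indices.
    return positions == list(range(positions[0], positions[-1] + 1))


# Hypothetical names with "pandas" at the end: the files scatter when sorted.
scattered = [
    "test_arrow.py",
    "test_group_pandas.py",
    "test_session.py",
    "test_udf_pandas.py",
    "test_utils.py",
]

# Hypothetical names with "pandas" right after "test_": the files group together.
prefixed = [
    "test_arrow.py",
    "test_pandas_group.py",
    "test_pandas_udf.py",
    "test_session.py",
    "test_utils.py",
]
```

With these made-up names, only the `prefixed` list keeps the pandas files next to each other in a sorted directory listing.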

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23021 I am going to push after testing and double checking. The line counts would look like this ``` 54 ./test_utils.py 199 ./test_catalog.py 503

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23021 > I'd break the pandas udf one into smaller pieces too, as you suggested. We should also investigate why the runtime didn't improve ... One suspicion from my investigation is, it

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23021 Yup, will break pandas one into smaller ones as well.

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23021 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98783/ Test PASSed.

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23021 Merged build finished. Test PASSed.

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23021 **[Test build #98783 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98783/testReport)** for PR 23021 at commit

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23021 Merged build finished. Test PASSed.

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23021 **[Test build #98782 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98782/testReport)** for PR 23021 at commit

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/23021 Great initiative! I'd break the pandas udf one into smaller pieces too, as you suggested. We should also investigate why the runtime didn't improve ...

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/23021 thanks for working on this @HyukjinKwon . Did you test on python3 as well? I'm surprised elapsed time is the same -- I would have expected it to be faster as the tests could run in

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23021 **[Test build #98783 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98783/testReport)** for PR 23021 at commit

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23021 Merged build finished. Test PASSed.

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23021 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23021 Elapsed time looks virtually the same. All tests look to be running fine. The last commit should show skipped tests fine as well. Should be ready for a look.

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23021 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23021 Merged build finished. Test PASSed.

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23021 **[Test build #98782 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98782/testReport)** for PR 23021 at commit

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23021 Merged build finished. Test FAILed.

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23021 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23021 Merged build finished. Test PASSed.

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23021 **[Test build #98769 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98769/testReport)** for PR 23021 at commit

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23021 Merged build finished. Test PASSed.

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23021 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98767/ Test PASSed.

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23021 **[Test build #98767 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98767/testReport)** for PR 23021 at commit

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23021 For your information, here's the line counts for each file: ``` 52 ./test_utils.py 197 ./test_catalog.py 43 ./test_group.py 318 ./test_session.py
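A per-file line count like the one quoted above could be produced with a short script. This is only a sketch under assumptions: the function name `line_counts` and the `test_*.py` glob pattern are illustrative, and nothing in the thread says how the numbers were actually generated.

```python
from pathlib import Path


def line_counts(directory, pattern="test_*.py"):
    """Map each matching file name to its number of lines, in name order."""
    counts = {}
    for path in sorted(Path(directory).glob(pattern)):
        # Count lines lazily so large files are not read into memory at once.
        with path.open(encoding="utf-8") as handle:
            counts[path.name] = sum(1 for _ in handle)
    return counts
```

Calling `line_counts(".")` from the directory holding the split test files would return a dictionary of file name to line count, which can then be printed in whatever layout is convenient.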

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23021 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23021 Merged build finished. Test PASSed.

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23021 **[Test build #98769 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98769/testReport)** for PR 23021 at commit

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23021 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23021 Merged build finished. Test PASSed.

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23021 FWIW, I at least double checked whether there are any tests missing, and whether they are actually being run (via coverage).

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23021 adding @rxin (derived from mailing list)

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23021 @BryanCutler and @squito, Here is the official first attempt to break `pyspark/sql/tests.py` into multiple small files. If there are no outstanding issues (for instance, if we are

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23021 **[Test build #98767 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98767/testReport)** for PR 23021 at commit

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23021 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23021 Merged build finished. Test PASSed.

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23021 Merged build finished. Test FAILed.

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23021 **[Test build #98765 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98765/testReport)** for PR 23021 at commit

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23021 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98765/ Test FAILed.

[GitHub] spark issue #23021: [SPARK-26032][PYTHON] Break large sql/tests.py files int...

2018-11-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23021 **[Test build #98765 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98765/testReport)** for PR 23021 at commit