[GitHub] spark pull request: [Minor][DOC] Minor typo fixes
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/12755 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-13289][MLLIB] Fix infinite distances be...
Github user flyjy commented on the pull request: https://github.com/apache/spark/pull/11812#issuecomment-215632760 @srowen The PR with unit testing passed after rebasing master
[GitHub] spark pull request: [Minor][DOC] Minor typo fixes
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/12755#issuecomment-215632671 Merging in master. Thanks!
[GitHub] spark pull request: [SPARK-7264][ML] Parallel lapply for sparkR
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/12426#issuecomment-215632614 @mengxr - We should add details about this in SparkR programming guide. Can you add this to the QA/docs JIRA we have for 2.0 ?
[GitHub] spark pull request: [SPARK-14978][PySpark] PySpark TrainValidation...
Github user taku-k commented on a diff in the pull request:
    https://github.com/apache/spark/pull/12767#discussion_r61536588
    --- Diff: python/pyspark/ml/tests.py ---
    @@ -616,6 +622,7 @@ def test_save_load(self):
             tvsModel.save(tvsModelPath)
             loadedModel = TrainValidationSplitModel.load(tvsModelPath)
             self.assertEqual(loadedModel.bestModel.uid, tvsModel.bestModel.uid)
    +        self.assertEqual(len(loadedModel.validationMetrics), len(tvsModel.validationMetrics))
    --- End diff --
    I agree!
[GitHub] spark pull request: [SPARK-14994][SQL] Remove execution hive from ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12770#issuecomment-215632358 **[Test build #2931 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2931/consoleFull)** for PR 12770 at commit [`b74f5a5`](https://github.com/apache/spark/commit/b74f5a5faf7c9c1128768194bbb8d0e6f378f6ad). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-13289][MLLIB] Fix infinite distances be...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11812#issuecomment-215632322 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-13289][MLLIB] Fix infinite distances be...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11812#issuecomment-215632323 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57311/ Test PASSed.
[GitHub] spark pull request: [SPARK-14988][PYTHON] SparkSession catalog and...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12765#issuecomment-215632276 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57306/ Test PASSed.
[GitHub] spark pull request: [SPARK-14988][PYTHON] SparkSession catalog and...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12765#issuecomment-215632274 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-13289][MLLIB] Fix infinite distances be...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11812#issuecomment-215632229 **[Test build #57311 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57311/consoleFull)** for PR 11812 at commit [`bb03e08`](https://github.com/apache/spark/commit/bb03e08ce237571f42691a06d5b62db89f57ab76). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14988][PYTHON] SparkSession catalog and...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12765#issuecomment-215632186 **[Test build #57306 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57306/consoleFull)** for PR 12765 at commit [`923b92a`](https://github.com/apache/spark/commit/923b92aee7220ec2f2960080853ce8af6d8f51a2). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14978][PySpark] PySpark TrainValidation...
Github user taku-k commented on a diff in the pull request:
    https://github.com/apache/spark/pull/12767#discussion_r61536420
    --- Diff: python/pyspark/ml/tuning.py ---
    @@ -613,7 +615,9 @@ def copy(self, extra=None):
             """
             if extra is None:
                 extra = dict()
    -        return TrainValidationSplitModel(self.bestModel.copy(extra))
    +        bestModel = self.bestModel.copy(extra)
    +        validationMetrics = self.validationMetrics
    +        return TrainValidationSplitModel(bestModel, validationMetrics)
    --- End diff --
    @vectorijk Thank you for your comments. I'm sure. I'll add `test_copy` method.
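For readers following the diff above, here is a minimal sketch of what a `test_copy`-style check might verify. It does not use pyspark: `StubModel`, `StubSplitModel`, and `check_copy` are hypothetical stand-ins invented for illustration, and only the body of `copy` follows the shape of the changed lines in the diff. The actual test added to the PR may look quite different.

```python
class StubModel:
    """Hypothetical stand-in for a fitted model; not a pyspark class."""

    def __init__(self, uid, params=None):
        self.uid = uid
        self.params = dict(params or {})

    def copy(self, extra=None):
        # Return a new instance with the extra params merged in, keeping the uid.
        merged = dict(self.params)
        merged.update(extra or {})
        return StubModel(self.uid, merged)


class StubSplitModel:
    """Hypothetical stand-in that mirrors the copy() shape shown in the diff."""

    def __init__(self, bestModel, validationMetrics=None):
        self.bestModel = bestModel
        self.validationMetrics = list(validationMetrics or [])

    def copy(self, extra=None):
        if extra is None:
            extra = dict()
        # Carry the validation metrics over instead of dropping them.
        bestModel = self.bestModel.copy(extra)
        validationMetrics = self.validationMetrics
        return StubSplitModel(bestModel, validationMetrics)


def check_copy():
    """Assert that copy() keeps the uid, applies extras, and keeps the metrics."""
    original = StubSplitModel(StubModel("lr_1", {"maxIter": 5}), [0.81, 0.77])
    copied = original.copy({"maxIter": 10})
    assert copied.bestModel.uid == original.bestModel.uid
    assert copied.bestModel.params["maxIter"] == 10
    assert original.bestModel.params["maxIter"] == 5
    assert len(copied.validationMetrics) == len(original.validationMetrics)
    return copied
```

Calling `check_copy()` runs the assertions and returns the copied object. Before the change in the diff, the equivalent of the last assertion would have failed, because the copy was built from the best model alone.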
[GitHub] spark pull request: [Minor][DOC] Minor typo fixes
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12755#issuecomment-215632083 **[Test build #2930 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2930/consoleFull)** for PR 12755 at commit [`ef6c1fb`](https://github.com/apache/spark/commit/ef6c1fbd69afe1cf8113727b323ed5275649d2bd). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14829][MLLIB] Deprecate GLM APIs using ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/12596
[GitHub] spark pull request: [Spark-14314][SparkR] Add model persistence to...
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/12680#issuecomment-215631757 @GayathriMurali Since the feature freeze deadline is coming and there are some follow-up tasks blocked by this PR and #12683, do you mind @yanboliang sending out new PRs based on yours? Thanks!
[GitHub] spark pull request: [SPARK-7264][ML] Parallel lapply for sparkR
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/12426
[GitHub] spark pull request: [SPARK-14829][MLLIB] Deprecate GLM APIs using ...
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/12596#issuecomment-215631602 LGTM Merging with master Thank you!
[GitHub] spark pull request: [SPARK-7264][ML] Parallel lapply for sparkR
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/12426#issuecomment-215631563 LGTM2. Merged into master. Thanks!
[GitHub] spark pull request: [SPARK-14994][SQL] Remove execution hive from ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12770#issuecomment-215630612 **[Test build #2931 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2931/consoleFull)** for PR 12770 at commit [`b74f5a5`](https://github.com/apache/spark/commit/b74f5a5faf7c9c1128768194bbb8d0e6f378f6ad).
[GitHub] spark pull request: [SPARK-14987] [SQL] inline hive-service (cli) ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12764#issuecomment-215630110 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-14987] [SQL] inline hive-service (cli) ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12764#issuecomment-215630103 **[Test build #57314 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57314/consoleFull)** for PR 12764 at commit [`234356a`](https://github.com/apache/spark/commit/234356a413c4d6af70db5e29cc0939ae358c2c06). * This patch **fails to build**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14987] [SQL] inline hive-service (cli) ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12764#issuecomment-215630111 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57314/ Test FAILed.
[GitHub] spark pull request: [SPARK-14987] [SQL] inline hive-service (cli) ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12764#issuecomment-215630046 **[Test build #57315 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57315/consoleFull)** for PR 12764 at commit [`fbfd06b`](https://github.com/apache/spark/commit/fbfd06badbb2f41cca9bdebb39b02675317976b7).
[GitHub] spark pull request: [SPARK-14994][SQL] Remove execution hive from ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12770#issuecomment-215629908 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-14994][SQL] Remove execution hive from ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12770#issuecomment-215629910 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57313/ Test FAILed.
[GitHub] spark pull request: [SPARK-14994][SQL] Remove execution hive from ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12770#issuecomment-215629896 **[Test build #57313 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57313/consoleFull)** for PR 12770 at commit [`b74f5a5`](https://github.com/apache/spark/commit/b74f5a5faf7c9c1128768194bbb8d0e6f378f6ad). * This patch **fails MiMa tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14987] [SQL] inline hive-service (cli) ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12764#issuecomment-215629614 **[Test build #57314 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57314/consoleFull)** for PR 12764 at commit [`234356a`](https://github.com/apache/spark/commit/234356a413c4d6af70db5e29cc0939ae358c2c06).
[GitHub] spark pull request: [SPARK-14996][SQL] Add TPCDS Benchmark Queries...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12771#issuecomment-215629419 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57307/ Test PASSed.
[GitHub] spark pull request: [SPARK-14996][SQL] Add TPCDS Benchmark Queries...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12771#issuecomment-215629417 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-14996][SQL] Add TPCDS Benchmark Queries...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12771#issuecomment-215629333 **[Test build #57307 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57307/consoleFull)** for PR 12771 at commit [`461ab81`](https://github.com/apache/spark/commit/461ab81adbc76f2d04ab5aed46b7ebb24cf5c7af). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * ` | and i_class in('personal', 'portable', 'reference', 'self-help')` * ` | and i_class in('accessories', 'classical', 'fragrances', 'pants')`
[GitHub] spark pull request: [SPARK-14994][SQL] Remove execution hive from ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12770#issuecomment-215629230 **[Test build #57313 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57313/consoleFull)** for PR 12770 at commit [`b74f5a5`](https://github.com/apache/spark/commit/b74f5a5faf7c9c1128768194bbb8d0e6f378f6ad).
[GitHub] spark pull request: [SPARK-14511][Build] Upgrade genjavadoc to lat...
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/12707#issuecomment-215628995 +1 on merging this first and fixing remaining issues later.
[GitHub] spark pull request: [SPARK-7264][ML] Parallel lapply for sparkR
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/12426#issuecomment-215628939 This looks pretty good to me. @mengxr @felixcheung any other comments ?
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-215628799 **[Test build #57312 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57312/consoleFull)** for PR 12493 at commit [`3efe9f5`](https://github.com/apache/spark/commit/3efe9f5f067bf66d35c1c8243d00f2f1fdb4e6f9).
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user shivaram commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-215628604 Jenkins, retest this please
[GitHub] spark pull request: [SPARK-14931][ML][PySpark] Mismatched default ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12738#issuecomment-215628414 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57308/ Test FAILed.
[GitHub] spark pull request: [HOTFIX][CORE] fix a concurrence issue in NewA...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12773#issuecomment-215628448 **[Test build #2926 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2926/consoleFull)** for PR 12773 at commit [`ea2ba20`](https://github.com/apache/spark/commit/ea2ba20a1141f1a5ac89a780906b0ef90b40fc80). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14981][SQL] Throws exception if DESC is...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12759#issuecomment-215628417 **[Test build #57310 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57310/consoleFull)** for PR 12759 at commit [`9b0d518`](https://github.com/apache/spark/commit/9b0d5187fb7c7dc9d9f648a3badf7130c6df6050).
[GitHub] spark pull request: [SPARK-14931][ML][PySpark] Mismatched default ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12738#issuecomment-215628413 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-13289][MLLIB] Fix infinite distances be...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/11812#issuecomment-215628418 **[Test build #57311 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57311/consoleFull)** for PR 11812 at commit [`bb03e08`](https://github.com/apache/spark/commit/bb03e08ce237571f42691a06d5b62db89f57ab76).
[GitHub] spark pull request: [SPARK-14931][ML][PySpark] Mismatched default ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12738#issuecomment-215628390 **[Test build #57308 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57308/consoleFull)** for PR 12738 at commit [`161aa91`](https://github.com/apache/spark/commit/161aa9177565e14af8136e884193423b82cccf6b). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14981][SQL] Throws exception if DESC is...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/12759#issuecomment-215628056 retest this please
[GitHub] spark pull request: [SPARK-14994][SQL] Remove execution hive from ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12770#issuecomment-215627923 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-14994][SQL] Remove execution hive from ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12770#issuecomment-215627896 **[Test build #57309 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57309/consoleFull)** for PR 12770 at commit [`2a84533`](https://github.com/apache/spark/commit/2a84533e18daf56bac1f7c278972082e7ebcd190). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14994][SQL] Remove execution hive from ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12770#issuecomment-215627926 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57309/ Test FAILed.
[GitHub] spark pull request: [SPARK-14997]Files in subdirectories are incor...
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12774#issuecomment-215627856 (I think "(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)" can be removed in the PR description)
[GitHub] spark pull request: [HOTFIX][CORE] fix a concurrence issue in NewA...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12773#issuecomment-215627564 **[Test build #2928 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2928/consoleFull)** for PR 12773 at commit [`ea2ba20`](https://github.com/apache/spark/commit/ea2ba20a1141f1a5ac89a780906b0ef90b40fc80). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [HOTFIX][CORE] fix a concurrence issue in NewA...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12773#issuecomment-215627327 **[Test build #2927 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2927/consoleFull)** for PR 12773 at commit [`ea2ba20`](https://github.com/apache/spark/commit/ea2ba20a1141f1a5ac89a780906b0ef90b40fc80). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14991][SQL] Remove HiveNativeCommand
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/12769#discussion_r61534645 --- Diff: sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 --- @@ -132,7 +132,6 @@ statement hiveNativeCommands --- End diff -- Also `hiveNativeCommands #executeNativeCommand`.
[GitHub] spark pull request: [SPARK-14991][SQL] Remove HiveNativeCommand
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/12769#discussion_r61534626 --- Diff: sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 --- @@ -132,7 +132,6 @@ statement hiveNativeCommands --- End diff -- Since we don't have `visitExecuteNativeCommand` now, do we still need this rule?
[GitHub] spark pull request: [SPARK-14997]Files in subdirectories are incor...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/12774#discussion_r61534612
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/fileSourceInterfaces.scala ---
@@ -376,14 +376,10 @@ class HDFSFileCatalog(
         HadoopFsRelation.shouldFilterOut(name)
       }
-      val (dirs, files) = statuses.partition(_.isDirectory)
+      val (_, files) = statuses.partition(_.isDirectory)
       // It uses [[LinkedHashSet]] since the order of files can affect the results. (SPARK-11500)
-      if (dirs.isEmpty) {
-        mutable.LinkedHashSet(files: _*)
-      } else {
-        mutable.LinkedHashSet(files: _*) ++ listLeafFiles(dirs.map(_.getPath))
-      }
+      mutable.LinkedHashSet(files: _*)
--- End diff --
Also, I believe there is another method in the `HadoopFsRelation` companion object that lists files in parallel. Above a threshold, that method is used instead of this one. I think it should also be corrected if this is really problematic.
[GitHub] spark pull request: [SPARK-14991][SQL] Remove HiveNativeCommand
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/12769
[GitHub] spark pull request: [HOTFIX][CORE] fix a concurrence issue in NewA...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/12773
[GitHub] spark pull request: [SPARK-14991][SQL] Remove HiveNativeCommand
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/12769#issuecomment-215627044 Merging in master.
[GitHub] spark pull request: [HOTFIX][CORE] fix a concurrence issue in NewA...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12773#issuecomment-215626991 **[Test build #2929 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2929/consoleFull)** for PR 12773 at commit [`ea2ba20`](https://github.com/apache/spark/commit/ea2ba20a1141f1a5ac89a780906b0ef90b40fc80). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [HOTFIX][CORE] fix a concurrence issue in NewA...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/12773#issuecomment-215626966 I'm going to merge this. Let's see how the tests pan out.
[GitHub] spark pull request: [SPARK-14991][SQL] Remove HiveNativeCommand
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12769#issuecomment-215626945 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57303/ Test PASSed.
[GitHub] spark pull request: [SPARK-14991][SQL] Remove HiveNativeCommand
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12769#issuecomment-215626944 Merged build finished. Test PASSed.
[GitHub] spark pull request: [HOTFIX][CORE] fix a concurrence issue in NewA...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12773#issuecomment-215627013 **[Test build #2925 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2925/consoleFull)** for PR 12773 at commit [`ea2ba20`](https://github.com/apache/spark/commit/ea2ba20a1141f1a5ac89a780906b0ef90b40fc80). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14987] [SQL] inline hive-service (cli) ...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/12764#issuecomment-215626891 Actually it seems like some files are missing? For example, ThriftBinaryCLIService.java is not there, but it is in Hive.
[GitHub] spark pull request: [SPARK-14991][SQL] Remove HiveNativeCommand
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12769#issuecomment-215626863 **[Test build #57303 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57303/consoleFull)** for PR 12769 at commit [`308a896`](https://github.com/apache/spark/commit/308a89682624967a0c65985585adb951cbad5d4c). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14487][SQL] User Defined Type registrat...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/12259#discussion_r61534345
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/types/UDTRegistration.scala ---
@@ -0,0 +1,89 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.types
+
+import scala.collection.mutable
+
+import org.apache.spark.SparkException
+import org.apache.spark.internal.Logging
+import org.apache.spark.util.Utils
+
+/**
+ * This object keeps the mappings between user classes and their User Defined Types (UDTs).
+ * Previously we use the annotation `SQLUserDefinedType` to register UDTs for user classes.
+ * However, by doing this, we add SparkSQL dependency on user classes. This object provides
+ * alterntive approach to register UDTs for user classes.
+ */
+private[spark]
+object UDTRegistration extends Serializable with Logging {
+
+  /** The mapping between the Class between UserDefinedType and user classes. */
+  private lazy val udtMap: mutable.Map[String, String] = mutable.Map(
--- End diff --
Yes, user UDTs can be registered in this map. However, since this is just a private API, we can expect to refactor it in 2.1 and then make it public.
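The registration idea discussed in this review comment (a mutable map from a user class name to the name of its UDT class, used instead of an annotation on the user class) can be sketched roughly as follows. This is only an illustration and not the code under review: the object name `UdtRegistrySketch`, the methods `register` and `lookup`, the keep-the-first-registration policy, and the `com.example` class names in the usage note are all assumptions made for the example.

```scala
import scala.collection.mutable

// Illustrative sketch only; not Spark's UDTRegistration.
object UdtRegistrySketch {
  // user class name -> UDT class name, the same Map[String, String] shape as `udtMap` in the diff
  private val udtMap: mutable.Map[String, String] = mutable.Map.empty

  // Hypothetical policy: the first registration for a user class wins; later ones are ignored.
  def register(userClass: String, udtClass: String): Unit = synchronized {
    if (!udtMap.contains(userClass)) {
      udtMap += (userClass -> udtClass)
    }
  }

  // Returns the registered UDT class name, if any.
  def lookup(userClass: String): Option[String] = synchronized {
    udtMap.get(userClass)
  }
}
```

With this sketch, calling `UdtRegistrySketch.register("com.example.Point", "com.example.PointUDT")` makes a later `lookup("com.example.Point")` return the UDT class name, and the user class itself needs no Spark SQL annotation.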
[GitHub] spark pull request: [HOTFIX][CORE] fix a concurrence issue in NewA...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12773#issuecomment-215626610 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57300/ Test FAILed.
[GitHub] spark pull request: [SPARK-14994][SQL] Remove execution hive from ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12770#issuecomment-215626624 **[Test build #57309 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57309/consoleFull)** for PR 12770 at commit [`2a84533`](https://github.com/apache/spark/commit/2a84533e18daf56bac1f7c278972082e7ebcd190).
[GitHub] spark pull request: [HOTFIX][CORE] fix a concurrence issue in NewA...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12773#issuecomment-215626609 Merged build finished. Test FAILed.
[GitHub] spark pull request: [HOTFIX][CORE] fix a concurrence issue in NewA...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12773#issuecomment-215626453 **[Test build #57300 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57300/consoleFull)** for PR 12773 at commit [`ea2ba20`](https://github.com/apache/spark/commit/ea2ba20a1141f1a5ac89a780906b0ef90b40fc80). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14997]Files in subdirectories are incor...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/12774#discussion_r61534248
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/fileSourceInterfaces.scala ---
@@ -376,14 +376,10 @@ class HDFSFileCatalog(
         HadoopFsRelation.shouldFilterOut(name)
       }
-      val (dirs, files) = statuses.partition(_.isDirectory)
+      val (_, files) = statuses.partition(_.isDirectory)
       // It uses [[LinkedHashSet]] since the order of files can affect the results. (SPARK-11500)
-      if (dirs.isEmpty) {
-        mutable.LinkedHashSet(files: _*)
-      } else {
-        mutable.LinkedHashSet(files: _*) ++ listLeafFiles(dirs.map(_.getPath))
-      }
+      mutable.LinkedHashSet(files: _*)
--- End diff --
Are you sure there is a difference between 1.6.1 and master? As far as I can see, this logic has not changed compared to [interfaces.scala#L467-L472](https://github.com/apache/spark/blob/branch-1.6/sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala#L467-L472). Also, does this still support reading [partitioned tables](http://spark.apache.org/docs/latest/sql-programming-guide.html#partition-discovery)?
[GitHub] spark pull request: [SPARK-14987] [SQL] inline hive-service (cli) ...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/12764#issuecomment-215625985 Whenever you change the dependencies you'd need to change the file I think.
[GitHub] spark pull request: [SPARK-14931][ML][PySpark] Mismatched default ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12738#issuecomment-215625153 **[Test build #57308 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57308/consoleFull)** for PR 12738 at commit [`161aa91`](https://github.com/apache/spark/commit/161aa9177565e14af8136e884193423b82cccf6b).
[GitHub] spark pull request: [SPARK-14998][SQL]fix ArrayIndexOutOfBoundsExc...
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12772#issuecomment-215624817 I think the title is incomplete. It would be nicer if the title included where the problem occurs (in.. where).
[GitHub] spark pull request: [SPARK-14997]Files in subdirectories are incor...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12774#issuecomment-215624123 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-14997]Files in subdirectories are incor...
GitHub user sbcd90 opened a pull request: https://github.com/apache/spark/pull/12774
[SPARK-14997]Files in subdirectories are incorrectly considered in sqlContext.read.json()
## What changes were proposed in this pull request?
This PR fixes the issue "Files in subdirectories are incorrectly considered in sqlContext.read.json()". For example:
```
xyz/file0.json
xyz/subdir1/file1.json
xyz/subdir2/file2.json
xyz/subdir1/subsubdir1/file3.json
sqlContext.read.json("xyz") should read only file0.json according to the behavior in Spark 1.6.1. However, in current master, all 4 files are read.
```
## How was this patch tested?
unit tests
(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/sbcd90/spark jsonReadIssue
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/12774.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #12774
commit a69329790648fc53d4cf8cc5be659f6ae1989046
Author: Subhobrata Dey
Date: 2016-04-29T04:25:53Z
[SPARK-14997]Files in subdirectories are incorrectly considered in sqlContext.read.json()
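The two behaviours contrasted in this PR description (reading only the files directly under `xyz` versus reading files at every depth) can be illustrated with a small local-filesystem sketch. It is not Spark's listing code: it uses `java.io.File` in place of the Hadoop `FileSystem` API, and the names `ListingSketch`, `makeLayout`, `topLevelFiles` and `leafFiles` are hypothetical. It only mirrors the `partition(_.isDirectory)` and `LinkedHashSet` shape visible in the diff under review.

```scala
import java.io.File
import java.nio.file.Files
import scala.collection.mutable

// Illustrative sketch only; local files stand in for Hadoop file statuses.
object ListingSketch {
  // Recreates the layout from the PR description under a fresh temporary directory.
  def makeLayout(): File = {
    val root = Files.createTempDirectory("xyz").toFile
    Seq("file0.json", "subdir1/file1.json", "subdir2/file2.json", "subdir1/subsubdir1/file3.json")
      .foreach { relative =>
        val f = new File(root, relative)
        f.getParentFile.mkdirs()
        f.createNewFile()
      }
    root
  }

  // Directory entries sorted by name; listFiles returns null for non-directories.
  private def children(dir: File): Seq[File] =
    Option(dir.listFiles).map(_.toSeq).getOrElse(Seq.empty).sortBy(_.getName)

  // Files directly under `dir` only; subdirectories are ignored.
  def topLevelFiles(dir: File): mutable.LinkedHashSet[String] = {
    val (_, files) = children(dir).partition(_.isDirectory)
    mutable.LinkedHashSet(files.map(_.getName): _*)
  }

  // Files at any depth: the top-level files plus a recursive descent into each subdirectory.
  def leafFiles(dir: File): mutable.LinkedHashSet[String] = {
    val (dirs, files) = children(dir).partition(_.isDirectory)
    mutable.LinkedHashSet(files.map(_.getName): _*) ++ dirs.flatMap(d => leafFiles(d))
  }
}
```

On the layout built by `makeLayout()`, `topLevelFiles` yields only `file0.json`, while `leafFiles` yields all four file names. Note that a top-level-only listing would also skip files stored in partition subdirectories.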
[GitHub] spark pull request: [Minor][DOC] Minor typo fixes
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12755#issuecomment-215623845 **[Test build #2930 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2930/consoleFull)** for PR 12755 at commit [`ef6c1fb`](https://github.com/apache/spark/commit/ef6c1fbd69afe1cf8113727b323ed5275649d2bd).
[GitHub] spark pull request: [Minor][DOC] Minor typo fixes
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/12755#issuecomment-215623807 LGTM pending jenkins.
[GitHub] spark pull request: [HOTFIX][CORE] fix a concurrence issue in NewA...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12773#issuecomment-215622662 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57299/ Test FAILed.
[GitHub] spark pull request: [HOTFIX][CORE] fix a concurrence issue in NewA...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12773#issuecomment-215622661 Merged build finished. Test FAILed.
[GitHub] spark pull request: [HOTFIX][CORE] fix a concurrence issue in NewA...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12773#issuecomment-215622600 **[Test build #57299 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57299/consoleFull)** for PR 12773 at commit [`ea2ba20`](https://github.com/apache/spark/commit/ea2ba20a1141f1a5ac89a780906b0ef90b40fc80). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [Spark-14976][Streaming] make StreamingContext...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12752#issuecomment-215622100 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57301/ Test PASSed.
[GitHub] spark pull request: [Spark-14976][Streaming] make StreamingContext...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12752#issuecomment-215622098 Merged build finished. Test PASSed.
[GitHub] spark pull request: [Spark-14976][Streaming] make StreamingContext...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12752#issuecomment-215621978 **[Test build #57301 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57301/consoleFull)** for PR 12752 at commit [`f1d14bd`](https://github.com/apache/spark/commit/f1d14bd0f9d1b1f572b5c850f67a51e094c9f331). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14996][SQL] Add TPCDS Benchmark Queries...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12771#issuecomment-215621546 **[Test build #57307 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57307/consoleFull)** for PR 12771 at commit [`461ab81`](https://github.com/apache/spark/commit/461ab81adbc76f2d04ab5aed46b7ebb24cf5c7af).
[GitHub] spark pull request: [SPARK-14996][SQL] Add TPCDS Benchmark Queries...
Github user sameeragarwal commented on the pull request: https://github.com/apache/spark/pull/12771#issuecomment-215621056 test this please
[GitHub] spark pull request: [SPARK-14988][PYTHON] SparkSession catalog and...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12765#issuecomment-215620953 **[Test build #57306 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57306/consoleFull)** for PR 12765 at commit [`923b92a`](https://github.com/apache/spark/commit/923b92aee7220ec2f2960080853ce8af6d8f51a2).
[GitHub] spark pull request: [SPARK-14939][SQL] Improve EliminateSorts opti...
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/12719#issuecomment-215620961 Right. Thank you so much for the enriching ideas! I'll update this PR with `FoldablePropagation`.
[GitHub] spark pull request: [SPARK-14988][PYTHON] SparkSession catalog and...
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/12765#issuecomment-215620780 retest this please
[GitHub] spark pull request: [SPARK-14939][SQL] Improve EliminateSorts opti...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/12719#issuecomment-215620606 The scope of `NullPropagation` is one operator, but we need a `FoldablePropagation` whose scope is the whole plan tree. Consider `Sort(a, Filter(true, Project(1 AS a)))`: we should be able to propagate the foldable information up.
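The whole-tree propagation described above can be sketched on a toy plan model. This is not Catalyst code and not the rule that was eventually written for this PR: the `Expr` and `Plan` classes, `propagate`, and `eliminateSorts` below are hypothetical stand-ins, and "foldable" is simplified to "is a literal". The sketch only shows the idea of carrying foldable aliases upward through `Filter` so that `Sort` can see them.

```scala
// Toy model only; every name here is an assumption made for illustration.
object FoldablePropagationSketch {
  sealed trait Expr
  case class Literal(value: Any) extends Expr
  case class Attr(name: String) extends Expr
  case class Alias(child: Expr, name: String) extends Expr

  sealed trait Plan
  case object Relation extends Plan
  case class Project(exprs: Seq[Expr], child: Plan) extends Plan
  case class Filter(cond: Expr, child: Plan) extends Plan
  case class Sort(order: Seq[Expr], child: Plan) extends Plan

  // Replaces attributes that refer to a known foldable alias.
  private def substitute(e: Expr, env: Map[String, Expr]): Expr = e match {
    case Attr(n) if env.contains(n) => env(n)
    case Alias(c, n)                => Alias(substitute(c, env), n)
    case other                      => other
  }

  // Bottom-up pass: returns the rewritten plan plus the foldable aliases
  // visible in its output, so the information crosses operator boundaries.
  def propagate(plan: Plan): (Plan, Map[String, Expr]) = plan match {
    case Relation => (Relation, Map.empty)
    case Project(exprs, child) =>
      val (newChild, env) = propagate(child)
      val newExprs = exprs.map {
        case Attr(n) if env.contains(n) => Alias(env(n), n) // keep the name
        case e                          => substitute(e, env)
      }
      val out: Map[String, Expr] =
        newExprs.collect { case Alias(l: Literal, n) => n -> l }.toMap
      (Project(newExprs, newChild), out)
    case Filter(cond, child) =>
      val (newChild, env) = propagate(child)
      (Filter(substitute(cond, env), newChild), env) // output passes through
    case Sort(order, child) =>
      val (newChild, env) = propagate(child)
      (Sort(order.map(substitute(_, env)), newChild), env)
  }

  // Simplified stand-in for the existing sort-elimination step:
  // literal sort keys are dropped, and an empty Sort is removed.
  def eliminateSorts(plan: Plan): Plan = plan match {
    case Sort(order, child) =>
      val kept = order.filterNot(_.isInstanceOf[Literal])
      if (kept.isEmpty) eliminateSorts(child) else Sort(kept, eliminateSorts(child))
    case Project(e, c) => Project(e, eliminateSorts(c))
    case Filter(f, c)  => Filter(f, eliminateSorts(c))
    case Relation      => Relation
  }

  // Sort(a, Filter(true, Project(1 AS a))) from the comment above.
  val example: Plan =
    Sort(Seq(Attr("a")), Filter(Literal(true), Project(Seq(Alias(Literal(1), "a")), Relation)))
}
```

In this model, `propagate(example)._1` rewrites the sort key `a` to the literal `1`, after which `eliminateSorts` drops the `Sort` and leaves `Filter(true, Project(1 AS a))`.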
[GitHub] spark pull request: [SPARK-14939][SQL] Improve EliminateSorts opti...
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/12719#issuecomment-215620522 Oh, I got it. Thanks. I will try to generalize. * Sort(_, Project(_)) * Project(_, Project(...)) And so on.
[GitHub] spark pull request: [SPARK-14939][SQL] Improve EliminateSorts opti...
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/12719#issuecomment-215620369 If that's just about how to handle `Sort(_, Project(_,_))` expressions in `EliminateSorts`, I can easily modify this PR according to your advice. After the foldables are moved up, the existing `case` statement eventually removes them.
[GitHub] spark pull request: [SPARK-14996][SQL] Add TPCDS Benchmark Queries...
Github user sameeragarwal commented on a diff in the pull request:

https://github.com/apache/spark/pull/12771#discussion_r61532083

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/TPCDSBenchmark.scala ---
@@ -0,0 +1,1225 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources.parquet
+
+import org.apache.spark.{SparkConf, SparkContext}
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.SQLContext
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.catalyst.analysis.UnresolvedRelation
+import org.apache.spark.util.Benchmark
+
+/**
+ * Benchmark to measure TPCDS query performance.
+ * To run this:
+ *  spark-submit --class --jars
+ */
+object TPCDSBenchmark {
+  val conf = new SparkConf()
+  conf.set("spark.sql.parquet.compression.codec", "snappy")
+  conf.set("spark.sql.shuffle.partitions", "4")
+  conf.set("spark.driver.memory", "3g")
+  conf.set("spark.executor.memory", "3g")
+  conf.set("spark.sql.autoBroadcastJoinThreshold", (20 * 1024 * 1024).toString)
+
+  val sc = new SparkContext("local[1]", "test-sql-context", conf)
+  val sqlContext = new SQLContext(sc)
+
+  // These queries a subset of the TPCDS benchmark queries and are taken from
+  // https://github.com/databricks/spark-sql-perf/blob/master/src/main/scala/com/databricks/spark/
+  // sql/perf/tpcds/ImpalaKitQueries.scala
+  val tpcds = Seq(
+    ("q19", """
+      |select
+      |  i_brand_id,
+      |  i_brand,
+      |  i_manufact_id,
+      |  i_manufact,
+      |  sum(ss_ext_sales_price) ext_price
+      |from
+      |  store_sales
+      |  join item on (store_sales.ss_item_sk = item.i_item_sk)
+      |  join store on (store_sales.ss_store_sk = store.s_store_sk)
+      |  join date_dim on (store_sales.ss_sold_date_sk = date_dim.d_date_sk)
+      |  join customer on (store_sales.ss_customer_sk = customer.c_customer_sk)
+      |  join customer_address on
+      |    (customer.c_current_addr_sk = customer_address.ca_address_sk)
+      |where
+      |  ss_sold_date_sk between 2451484 and 2451513
+      |  and d_moy = 11
+      |  and d_year = 1999
+      |  and i_manager_id = 7
+      |  and substr(ca_zip, 1, 5) <> substr(s_zip, 1, 5)
+      |group by
+      |  i_brand,
+      |  i_brand_id,
+      |  i_manufact_id,
+      |  i_manufact
+      |order by
+      |  ext_price desc,
+      |  i_brand,
+      |  i_brand_id,
+      |  i_manufact_id,
+      |  i_manufact
+      |limit 100
+    """.stripMargin),
+
+/*
+Java HotSpot(TM) 64-Bit Server VM 1.8.0_73-b02 on Mac OS X 10.11.4
+Intel(R) Core(TM) i7-4960HQ CPU @ 2.60GHz
+
+TPCDS Snappy (scale = 5):   Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
+-----------------------------------------------------------------------------------
+q19                               1710 / 1858          8.7         114.5       1.0X
+ */
+
+    ("q27", """
+      |select
+      |  i_item_id,
+      |  s_state,
+      |  avg(ss_quantity) agg1,
+      |  avg(ss_list_price) agg2,
+      |  avg(ss_coupon_amt) agg3,
+      |  avg(ss_sales_price) agg4
+      |from
+      |  store_sales
+      |  join store on (store_sales.ss_store_sk = store.s_store_sk)
+      |  join customer_demographics on
+      |    (store_sales.ss_cdemo_sk = customer_demographics.cd_demo_sk)
[GitHub] spark pull request: [SPARK-3767] [CORE] Support wildcard in Spark ...
Github user devaraj-kavali commented on the pull request: https://github.com/apache/spark/pull/12753#issuecomment-215619844 Thanks @rxin for checking this. I don't think @ is used anywhere else. Here we replace only occurrences of @execid@ in the 'spark.executor.extraJavaOptions' value and leave any other @ symbols as they are, so I don't think this causes any problem.
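As a rough illustration of the narrow substitution described above, the sketch below replaces the literal token `@execid@` only in the value of `spark.executor.extraJavaOptions` and leaves every other key, and every other `@`, untouched. It is not the code from this PR: the object name, the `substituteExecId` helper, and the use of a plain `Map` instead of `SparkConf` are assumptions made for the example.

```scala
// Illustration only; helper and object names are hypothetical.
object ExecIdSubstitution {
  val ExtraJavaOptionsKey = "spark.executor.extraJavaOptions"
  val Placeholder = "@execid@"

  // Replaces the placeholder in the extraJavaOptions value only.
  // String.replace performs a literal (non-regex) replacement.
  def substituteExecId(conf: Map[String, String], executorId: String): Map[String, String] =
    conf.map {
      case (ExtraJavaOptionsKey, v) => ExtraJavaOptionsKey -> v.replace(Placeholder, executorId)
      case other                    => other
    }

  def main(args: Array[String]): Unit = {
    val conf = Map(
      ExtraJavaOptionsKey -> "-Xloggc:/tmp/gc-@execid@.log -Downer=me@example.com",
      "spark.app.name"    -> "demo@execid@"
    )
    substituteExecId(conf, "7").foreach { case (k, v) => println(s"$k=$v") }
  }
}
```

In this example the GC log path becomes `/tmp/gc-7.log`, the stray `@` in `me@example.com` is kept, and `spark.app.name` is not rewritten because it is a different key.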
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-215619600 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57296/ Test FAILed.
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-215619599 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-12919][SPARKR] Implement dapply() on Da...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12493#issuecomment-215619544 **[Test build #57296 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57296/consoleFull)** for PR 12493 at commit [`3efe9f5`](https://github.com/apache/spark/commit/3efe9f5f067bf66d35c1c8243d00f2f1fdb4e6f9). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14939][SQL] Improve EliminateSorts opti...
Github user dongjoon-hyun commented on the pull request: https://github.com/apache/spark/pull/12719#issuecomment-215619413 Actually, `Sort` is a dead end; we cannot propagate any further up. So, in that case, removing looks more efficient. Do you mean a more generalized `FoldablePropagation`, like `NullPropagation`, that applies to operators other than `Sort` as well?
[GitHub] spark pull request: [SPARK-14987] [SQL] inline hive-service (cli) ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12764#issuecomment-215619043 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-14987] [SQL] inline hive-service (cli) ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12764#issuecomment-215619044 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57305/ Test FAILed.
[GitHub] spark pull request: [SPARK-14987] [SQL] inline hive-service (cli) ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12764#issuecomment-215619040 **[Test build #57305 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57305/consoleFull)** for PR 12764 at commit [`dc010bc`](https://github.com/apache/spark/commit/dc010bcb8520e16a6d174ec04df4b6e0f3a3589d). * This patch **fails build dependency tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-14987] [SQL] inline hive-service (cli) ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12764#issuecomment-215618895 **[Test build #57305 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57305/consoleFull)** for PR 12764 at commit [`dc010bc`](https://github.com/apache/spark/commit/dc010bcb8520e16a6d174ec04df4b6e0f3a3589d).
[GitHub] spark pull request: [SPARK-14939][SQL] Improve EliminateSorts opti...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/12719#issuecomment-215618718 `select 1 as a from tbl order by a` is equivalent to `select 1 as a from tbl order by 1`. When the child operator is a `Project` with foldable output and the parent operator references that foldable output, we should replace the attribute with the real foldable expression from the `Project` (and keep the alias to preserve the naming info).