[GitHub] spark issue #15554: [SPARK-16078] [SQL] Backport: from_utc_timestamp/to_utc_...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15554 Merged in. Can you close this? Thanks. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15558: [SPARK-17357][SPARK-6624][SQL] Convert filter predicate ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15558 Merged build finished. Test PASSed.
[GitHub] spark issue #15558: [SPARK-17357][SPARK-6624][SQL] Convert filter predicate ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15558 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67234/
[GitHub] spark issue #15558: [SPARK-17357][SPARK-6624][SQL] Convert filter predicate ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15558 **[Test build #67234 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67234/consoleFull)** for PR 15558 at commit [`5343947`](https://github.com/apache/spark/commit/5343947cfeb287e1f0e02e472cc2ada441c671a4). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15560: [SPARKR] fix warnings
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15560 Merged build finished. Test PASSed.
[GitHub] spark issue #15560: [SPARKR] fix warnings
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15560 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67238/
[GitHub] spark issue #15560: [SPARKR] fix warnings
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15560 **[Test build #67238 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67238/consoleFull)** for PR 15560 at commit [`b25b277`](https://github.com/apache/spark/commit/b25b277d8ebabb6aabd3c1c80bc9140c8670407f). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15559: [SPARK-18013][SPARKR] add crossJoin API
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15559 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67236/
[GitHub] spark issue #15559: [SPARK-18013][SPARKR] add crossJoin API
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15559 Merged build finished. Test PASSed.
[GitHub] spark issue #15559: [SPARK-18013][SPARKR] add crossJoin API
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15559 **[Test build #67236 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67236/consoleFull)** for PR 15559 at commit [`c43d169`](https://github.com/apache/spark/commit/c43d1690ba9b00828765bf8c3c588b7245eb4fb2). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15551: [SPARK-18012][SQL] Simplify WriterContainer
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15551 cc @tejasapatil fyi on the change
[GitHub] spark issue #15561: [SPARK-18012][SQL] Simplify WriterContainer follow-up
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15561 **[Test build #67240 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67240/consoleFull)** for PR 15561 at commit [`426ed1f`](https://github.com/apache/spark/commit/426ed1f3d8b680adb83b3983ddbd368612be1e5f).
[GitHub] spark issue #15428: [SPARK-17219][ML] enhanced NaN value handling in Bucketi...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15428 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67233/
[GitHub] spark issue #15428: [SPARK-17219][ML] enhanced NaN value handling in Bucketi...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15428 Merged build finished. Test PASSed.
[GitHub] spark pull request #15561: [SPARK-18012][SQL] Simplify WriterContainer follo...
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/15561 [SPARK-18012][SQL] Simplify WriterContainer follow-up ## What changes were proposed in this pull request? This patch is a follow-up to https://github.com/apache/spark/pull/15551 and makes a few cosmetic changes. ## How was this patch tested? N/A - should be covered by existing tests. You can merge this pull request into a Git repository by running: $ git pull https://github.com/rxin/spark SPARK-18012 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/15561.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #15561 commit 426ed1f3d8b680adb83b3983ddbd368612be1e5f Author: Reynold Xin Date: 2016-10-20T05:29:23Z [SPARK-18012][SQL] Simplify WriterContainer follow-up
[GitHub] spark issue #15428: [SPARK-17219][ML] enhanced NaN value handling in Bucketi...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15428 **[Test build #67233 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67233/consoleFull)** for PR 15428 at commit [`d1dd840`](https://github.com/apache/spark/commit/d1dd84035802f5aad620aae2953bed4842b49aa8). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #15551: [SPARK-18012][SQL] Simplify WriterContainer
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15551
[GitHub] spark issue #15551: [SPARK-18012][SQL] Simplify WriterContainer
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/15551 LGTM
[GitHub] spark issue #15377: [SPARK-17802] Improved caller context logging.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15377 **[Test build #67239 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67239/consoleFull)** for PR 15377 at commit [`9cffcf3`](https://github.com/apache/spark/commit/9cffcf3755e9f1dc31477f042ee246dabf2c6740).
[GitHub] spark issue #15560: [SPARKR] fix warnings
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15560 **[Test build #67238 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67238/consoleFull)** for PR 15560 at commit [`b25b277`](https://github.com/apache/spark/commit/b25b277d8ebabb6aabd3c1c80bc9140c8670407f).
[GitHub] spark pull request #15560: [SPARKR] fix warnings
GitHub user felixcheung opened a pull request: https://github.com/apache/spark/pull/15560 [SPARKR] fix warnings ## What changes were proposed in this pull request? Fix for a bunch of test warnings that were added recently. We need to investigate why warnings are not turning into errors. ## How was this patch tested? unit tests You can merge this pull request into a Git repository by running: $ git pull https://github.com/felixcheung/spark rwarnings Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/15560.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #15560 commit b25b277d8ebabb6aabd3c1c80bc9140c8670407f Author: Felix Cheung Date: 2016-10-20T05:09:18Z fix warnings
[GitHub] spark issue #15423: [SPARK-17860][SQL] SHOW COLUMN's database conflict check...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15423 **[Test build #67237 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67237/consoleFull)** for PR 15423 at commit [`15c568f`](https://github.com/apache/spark/commit/15c568f2cddf28b9542c7d96339b356ed5c578f7).
[GitHub] spark issue #15559: [SPARK-18013][SPARKR] add crossJoin API
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15559 **[Test build #67236 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67236/consoleFull)** for PR 15559 at commit [`c43d169`](https://github.com/apache/spark/commit/c43d1690ba9b00828765bf8c3c588b7245eb4fb2).
[GitHub] spark pull request #15559: [SPARK-18013][SPARKR] add crossJoin API
GitHub user felixcheung opened a pull request: https://github.com/apache/spark/pull/15559 [SPARK-18013][SPARKR] add crossJoin API ## What changes were proposed in this pull request? Add crossJoin, and do not default to a cross join if joinExpr is left out ## How was this patch tested? unit test You can merge this pull request into a Git repository by running: $ git pull https://github.com/felixcheung/spark rcrossjoin Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/15559.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #15559 commit c43d1690ba9b00828765bf8c3c588b7245eb4fb2 Author: Felix Cheung Date: 2016-10-20T04:56:14Z add crossJoin
[GitHub] spark issue #15544: [SPARK-17997] [SQL] Add an aggregation function for coun...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15544 **[Test build #67235 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67235/consoleFull)** for PR 15544 at commit [`90a17f6`](https://github.com/apache/spark/commit/90a17f6abe8ac88d5793ea9ccc0a804cfb4331c0).
[GitHub] spark pull request #15377: [SPARK-17802] Improved caller context logging.
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/15377#discussion_r84212380

--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -2432,6 +2432,26 @@ private[spark] object Utils extends Logging {
   }
 }

+private[util] object CallerContext {
+  private def callerContextClassExists(): Boolean = {
+    try {
+      // scalastyle:off classforname
+      Class.forName("org.apache.hadoop.ipc.CallerContext")
+      Class.forName("org.apache.hadoop.ipc.CallerContext$Builder")
+      // scalastyle:on classforname
+      true
+    } catch {
+      case _ : ClassNotFoundException => false
+    }
+  }
--- End diff --

The above code can be simplified with `Try { }.isSuccess`; please check `scala.util.Try`. Then these two methods can be merged into one expression.
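A minimal sketch of the simplification jerryshao suggests, assuming the same two Hadoop class names from the diff (the surrounding object name here is illustrative, not the patch's final shape):

```scala
import scala.util.Try

// Sketch: collapse the explicit try/catch around Class.forName into a
// single Try(...).isSuccess expression. Try catches NonFatal exceptions
// (including ClassNotFoundException) and turns them into Failure, so
// isSuccess is true exactly when both classes load.
object CallerContextCheck {
  // scalastyle:off classforname
  val callerContextSupported: Boolean = Try {
    Class.forName("org.apache.hadoop.ipc.CallerContext")
    Class.forName("org.apache.hadoop.ipc.CallerContext$Builder")
  }.isSuccess
  // scalastyle:on classforname
}
```

The value is computed once at object initialization, matching the intent of merging the check into one expression.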
[GitHub] spark issue #15550: [SPARK-18003][Spark Core] Fix bug of RDD zipWithIndex & ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15550 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67228/
[GitHub] spark issue #15550: [SPARK-18003][Spark Core] Fix bug of RDD zipWithIndex & ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15550 Merged build finished. Test PASSed.
[GitHub] spark issue #15550: [SPARK-18003][Spark Core] Fix bug of RDD zipWithIndex & ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15550 **[Test build #67228 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67228/consoleFull)** for PR 15550 at commit [`b83a606`](https://github.com/apache/spark/commit/b83a60671e5ab1df4a5b5602db125b5da18d3bd6). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15523: [SPARK-17981] [SPARK-17957] [SQL] Fix Incorrect Nullabil...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15523 Merged build finished. Test PASSed.
[GitHub] spark issue #15523: [SPARK-17981] [SPARK-17957] [SQL] Fix Incorrect Nullabil...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15523 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67229/
[GitHub] spark issue #15523: [SPARK-17981] [SPARK-17957] [SQL] Fix Incorrect Nullabil...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15523 **[Test build #67229 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67229/consoleFull)** for PR 15523 at commit [`52cb8fb`](https://github.com/apache/spark/commit/52cb8fb33e79b527008b68dccbaeb5bcc82f5feb). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #15552: [SPARK-18007][SparkR][ML] update SparkR MLP - add...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/15552#discussion_r84210373

--- Diff: mllib/src/main/scala/org/apache/spark/ml/r/MultilayerPerceptronClassifierWrapper.scala ---
@@ -73,6 +75,8 @@ private[r] object MultilayerPerceptronClassifierWrapper
   .setStepSize(stepSize)
   .setPredictionCol(PREDICTED_LABEL_COL)
 if (seed != null && seed.length > 0) mlp.setSeed(seed.toInt)
+if (initialWeights != null) mlp.setInitialWeights(Vectors.dense(initialWeights))
--- End diff --

That seems somewhat involved, so it is likely better to not duplicate the logic on the R side. We should check here if initialWeights != null and its length > 0.
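The guard felixcheung asks for mirrors the `seed != null && seed.length > 0` check on the line above it. A hedged, self-contained sketch of that pattern (helper name is hypothetical; the real patch would call `mlp.setInitialWeights(Vectors.dense(...))` inside the branch):

```scala
// Sketch: treat a null or empty weights array as "not supplied", so the
// estimator's default initialization is used instead of a zero-length vector.
def suppliedWeights(initialWeights: Array[Double]): Option[Array[Double]] =
  if (initialWeights != null && initialWeights.nonEmpty) Some(initialWeights)
  else None
```

With this shape, the wrapper only forwards weights when the caller actually provided some, matching how the seed parameter is handled.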
[GitHub] spark issue #15428: [SPARK-17219][ML] enhanced NaN value handling in Bucketi...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15428 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67232/
[GitHub] spark issue #15428: [SPARK-17219][ML] enhanced NaN value handling in Bucketi...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15428 Merged build finished. Test PASSed.
[GitHub] spark issue #15428: [SPARK-17219][ML] enhanced NaN value handling in Bucketi...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15428 **[Test build #67232 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67232/consoleFull)** for PR 15428 at commit [`0fb8d38`](https://github.com/apache/spark/commit/0fb8d38dca953b2fb1f05b8debb263ed26e0ed36). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15515: [SPARK-17970][SQL][WIP] store partition spec in metastor...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15515 Merged build finished. Test FAILed.
[GitHub] spark issue #15515: [SPARK-17970][SQL][WIP] store partition spec in metastor...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15515 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67230/
[GitHub] spark issue #15515: [SPARK-17970][SQL][WIP] store partition spec in metastor...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15515 **[Test build #67230 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67230/consoleFull)** for PR 15515 at commit [`4d93f48`](https://github.com/apache/spark/commit/4d93f48788d05ae6983f0f2feae6cc4368c039e7). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15551: [SPARK-18012][SQL] Simplify WriterContainer
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15551 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67227/ Test PASSed.
[GitHub] spark issue #15558: [SPARK-17357][SPARK-6624][SQL] Convert filter predicate ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15558 **[Test build #67234 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67234/consoleFull)** for PR 15558 at commit [`5343947`](https://github.com/apache/spark/commit/5343947cfeb287e1f0e02e472cc2ada441c671a4).
[GitHub] spark issue #15551: [SPARK-18012][SQL] Simplify WriterContainer
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15551 Merged build finished. Test PASSed.
[GitHub] spark issue #15551: [SPARK-18012][SQL] Simplify WriterContainer
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15551 **[Test build #67227 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67227/consoleFull)** for PR 15551 at commit [`0ca0c81`](https://github.com/apache/spark/commit/0ca0c817fcbf5c421b1af9c122fb8c93638a0383). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #15558: [SPARK-17357][SPARK-6624][SQL] Convert filter pre...
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/15558 [SPARK-17357][SPARK-6624][SQL] Convert filter predicate to CNF in Optimizer for pushdown ## What changes were proposed in this pull request? This PR is proposed to solve the problem that #14912 tried to solve before. Simply put, some predicates currently cannot be pushed down through operators because they are written as a disjunction of ORs. A simple example is (a > 10) || (b > 2 && c == 3). If a datasource has attributes a and b, this filtering predicate cannot be pushed down. If we convert it to CNF, (a > 10 || b > 2) && (a > 10 || c == 3), then we can push down (a > 10 || b > 2). Converting the predicate to CNF solves this formally, instead of the ad-hoc approach in #14912. We have previous PRs for CNF conversion, such as #8200. Most of the added tests in `CNFNormalizationSuite` are copied from #8200. ## How was this patch tested? Jenkins tests. Please review https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark before opening a pull request. You can merge this pull request into a Git repository by running: $ git pull https://github.com/viirya/spark-1 filter-cnf Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/15558.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #15558 commit baac6327b5a9c1a234e34da538a72d8ef87a9e35 Author: Liang-Chi Hsieh Date: 2016-10-06T14:47:34Z Convert filter predicate to CNF in Optimizer. commit c0637b26808aed386c4d937ebca44958e9f89c09 Author: Liang-Chi Hsieh Date: 2016-10-07T02:49:35Z Improve test. commit f0872fe8b208ddda6e2cb335f9c6a58a195a0960 Author: Liang-Chi Hsieh Date: 2016-10-07T02:50:08Z Improve test.
commit 62a23691be61f33fa079520e00b573b4ad4aaf3e Author: Liang-Chi Hsieh Date: 2016-10-19T15:35:01Z Merge remote-tracking branch 'upstream/master' into filter-cnf commit 5343947cfeb287e1f0e02e472cc2ada441c671a4 Author: Liang-Chi Hsieh Date: 2016-10-19T15:36:53Z Add comments.
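The OR-over-AND distribution the PR describes can be sketched with a toy predicate ADT. This is an illustrative sketch only — `Expr`, `Atom`, and `toCNF` are made-up names, not Catalyst's `Expression` API:

```scala
// Minimal CNF conversion over a toy predicate ADT (illustrative, not Catalyst).
sealed trait Expr
case class Atom(name: String) extends Expr
case class And(l: Expr, r: Expr) extends Expr
case class Or(l: Expr, r: Expr) extends Expr

// Distribute OR over AND until no AND remains under an OR:
// a || (b && c)  ==>  (a || b) && (a || c)
def toCNF(e: Expr): Expr = e match {
  case And(l, r) => And(toCNF(l), toCNF(r))
  case Or(l, r) => (toCNF(l), toCNF(r)) match {
    case (And(ll, lr), rr) => And(toCNF(Or(ll, rr)), toCNF(Or(lr, rr)))
    case (ll, And(rl, rr)) => And(toCNF(Or(ll, rl)), toCNF(Or(ll, rr)))
    case (ll, rr)          => Or(ll, rr)
  }
  case atom => atom
}

// The PR's example: (a > 10) || (b > 2 && c == 3)
val pred = Or(Atom("a > 10"), And(Atom("b > 2"), Atom("c == 3")))
// Becomes (a > 10 || b > 2) && (a > 10 || c == 3); the first conjunct
// references only a and b, so it can be pushed to the datasource.
println(toCNF(pred))
```

Note that each distribution step can double the number of clauses, which is why a production optimizer would typically bound the expansion.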
[GitHub] spark issue #15421: [SPARK-17811] SparkR cannot parallelize data.frame with ...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/15421 @falaki I meant the case where the exception is raised, but it sounds like it is the same test, just on a different R version. LGTM.
[GitHub] spark issue #15428: [SPARK-17219][ML] enhanced NaN value handling in Bucketi...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15428 **[Test build #67233 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67233/consoleFull)** for PR 15428 at commit [`d1dd840`](https://github.com/apache/spark/commit/d1dd84035802f5aad620aae2953bed4842b49aa8).
[GitHub] spark issue #15009: [SPARK-17443][SPARK-11035] Stop Spark Application if lau...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15009 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67225/ Test PASSed.
[GitHub] spark issue #15009: [SPARK-17443][SPARK-11035] Stop Spark Application if lau...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15009 Merged build finished. Test PASSed.
[GitHub] spark issue #15009: [SPARK-17443][SPARK-11035] Stop Spark Application if lau...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15009 **[Test build #67225 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67225/consoleFull)** for PR 15009 at commit [`92e4445`](https://github.com/apache/spark/commit/92e444578ba5dfea02f473101999a82fa8c8d325). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #15546: [SPARK-17982][SQL] SQLBuilder should wrap the gen...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/15546#discussion_r84205834 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/catalyst/SQLBuilder.scala --- @@ -138,6 +138,9 @@ class SQLBuilder private ( case g: Generate => generateToSQL(g) +case Limit(limitExpr, child @ Project(_, _)) => + s"(${toSQL(child)} LIMIT ${limitExpr.sql})" + case Limit(limitExpr, child) => --- End diff -- During testing, I found that the previous code is designed to add the "LIMIT" string **without** parentheses to handle most cases, and Spark does not allow doubled parentheses. - ORDER BY: Limit(_, Sort) - GROUP BY: Limit(_, Aggr) ... `Project` is the only case observed, in `CREATE VIEW` or `SELECT * FROM (SELECT ... LIMIT ..)`.
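The two match arms under discussion can be mimicked with a toy plan type. The `Plan`/`toSQL` names below are hypothetical stand-ins, not the real `SQLBuilder`; the sketch only shows why a `Project` child gets parentheses while other children get a bare `LIMIT` suffix:

```scala
// Toy rendition of the Limit handling discussed above (illustrative only).
sealed trait Plan
case class Table(name: String) extends Plan
case class Project(cols: Seq[String], child: Plan) extends Plan
case class Sort(child: Plan) extends Plan
case class Limit(n: Int, child: Plan) extends Plan

def toSQL(p: Plan): String = p match {
  case Table(n)                => s"SELECT * FROM $n"
  case Project(cols, Table(n)) => s"SELECT ${cols.mkString(", ")} FROM $n"
  case Sort(child)             => s"${toSQL(child)} ORDER BY 1"
  // A Project child is wrapped in parentheses so the result can nest
  // inside CREATE VIEW or a FROM clause without a double-paren error.
  case Limit(n, child @ Project(_, _)) => s"(${toSQL(child)} LIMIT $n)"
  // Other children (Sort, Aggregate, ...) just get a bare LIMIT suffix.
  case Limit(n, child)         => s"${toSQL(child)} LIMIT $n"
}

println(toSQL(Limit(10, Project(Seq("a"), Table("t")))))
println(toSQL(Limit(10, Sort(Table("t")))))
```

This also answers the review question below in miniature: the bare-`LIMIT` arm is safe only for plans whose SQL can legally end in `LIMIT n`, which is why the `Project` case is split out.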
[GitHub] spark issue #15428: [SPARK-17219][ML] enhanced NaN value handling in Bucketi...
Github user VinceShieh commented on the issue: https://github.com/apache/spark/pull/15428 Thanks for your valuable suggestions. @jkbradley @srowen
[GitHub] spark pull request #15428: [SPARK-17219][ML] enhanced NaN value handling in ...
Github user VinceShieh commented on a diff in the pull request: https://github.com/apache/spark/pull/15428#discussion_r84205467 --- Diff: python/pyspark/ml/feature.py --- @@ -1157,9 +1157,11 @@ class QuantileDiscretizer(JavaEstimator, HasInputCol, HasOutputCol, JavaMLReadab categorical features. The number of bins can be set using the :py:attr:`numBuckets` parameter. It is possible that the number of buckets used will be less than this value, for example, if --- End diff -- same as the comment above.
[GitHub] spark pull request #15428: [SPARK-17219][ML] enhanced NaN value handling in ...
Github user VinceShieh commented on a diff in the pull request: https://github.com/apache/spark/pull/15428#discussion_r84205458 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/QuantileDiscretizer.scala --- @@ -66,11 +67,13 @@ private[feature] trait QuantileDiscretizerBase extends Params /** * `QuantileDiscretizer` takes a column with continuous features and outputs a column with binned * categorical features. The number of bins can be set using the `numBuckets` parameter. It is - * possible that the number of buckets used will be less than this value, for example, if there - * are too few distinct values of the input to create enough distinct quantiles. Note also that - * NaN values are handled specially and placed into their own bucket. For example, if 4 buckets - * are used, then non-NaN data will be put into buckets(0-3), but NaNs will be counted in a special - * bucket(4). + * possible that the number of buckets used will be less than this value, for example, if there are --- End diff -- same as the comment above
[GitHub] spark issue #15428: [SPARK-17219][ML] enhanced NaN value handling in Bucketi...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15428 **[Test build #67232 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67232/consoleFull)** for PR 15428 at commit [`0fb8d38`](https://github.com/apache/spark/commit/0fb8d38dca953b2fb1f05b8debb263ed26e0ed36).
[GitHub] spark pull request #15428: [SPARK-17219][ML] enhanced NaN value handling in ...
Github user VinceShieh commented on a diff in the pull request: https://github.com/apache/spark/pull/15428#discussion_r84205414 --- Diff: docs/ml-features.md --- @@ -1104,9 +1104,11 @@ for more details on the API. `QuantileDiscretizer` takes a column with continuous features and outputs a column with binned categorical features. The number of bins is set by the `numBuckets` parameter. It is possible that the number of buckets used will be less than this value, for example, if there are too few -distinct values of the input to create enough distinct quantiles. Note also that NaN values are -handled specially and placed into their own bucket. For example, if 4 buckets are used, then -non-NaN data will be put into buckets[0-3], but NaNs will be counted in a special bucket[4]. +distinct values of the input to create enough distinct quantiles. Note also that QuantileDiscretizer --- End diff -- When the number of buckets requested by the user is greater than the number of distinct splits generated by Bucketizer, the returned number of buckets will be less than requested.
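The behavior the doc comment describes — fewer buckets than requested when quantile boundaries collide — can be illustrated with a small sketch. The real `QuantileDiscretizer` uses Spark's `approxQuantile`; the exact-percentile helper below and its name are made up for illustration:

```scala
// Toy illustration of why a quantile discretizer may return fewer buckets
// than requested: duplicate quantile boundaries collapse after dedup.
// (Illustrative sketch; Spark actually uses approxQuantile on a DataFrame.)
def quantileSplits(data: Seq[Double], numBuckets: Int): Seq[Double] = {
  val sorted = data.sorted
  // Exact percentile boundaries at 1/numBuckets, 2/numBuckets, ...
  val raw = (1 until numBuckets).map { i =>
    sorted(((i.toDouble / numBuckets) * (sorted.size - 1)).round.toInt)
  }
  // Pad with infinities and drop duplicate boundaries, as Bucketizer
  // requires strictly increasing splits.
  (Double.NegativeInfinity +: raw :+ Double.PositiveInfinity).distinct
}

// Only 2 distinct values in the data, but 4 buckets requested:
val splits = quantileSplits(Seq(1.0, 1.0, 1.0, 2.0, 2.0), numBuckets = 4)
println(splits.size - 1)  // 3 buckets produced, not the 4 requested
```

With k distinct boundaries surviving the dedup, the number of buckets is `splits.size - 1`, which is what the revised documentation is warning about.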
[GitHub] spark issue #15428: [SPARK-17219][ML] enhanced NaN value handling in Bucketi...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15428 **[Test build #67231 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67231/consoleFull)** for PR 15428 at commit [`c350e9f`](https://github.com/apache/spark/commit/c350e9fd0d20ad7114644d53df83231548171f89). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15428: [SPARK-17219][ML] enhanced NaN value handling in Bucketi...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15428 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67231/ Test FAILed.
[GitHub] spark issue #15428: [SPARK-17219][ML] enhanced NaN value handling in Bucketi...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15428 Merged build finished. Test FAILed.
[GitHub] spark issue #15551: [SPARK-18012][SQL] Simplify WriterContainer
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15551 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67226/ Test FAILed.
[GitHub] spark issue #15551: [SPARK-18012][SQL] Simplify WriterContainer
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15551 Merged build finished. Test FAILed.
[GitHub] spark issue #15428: [SPARK-17219][ML] enhanced NaN value handling in Bucketi...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15428 **[Test build #67231 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67231/consoleFull)** for PR 15428 at commit [`c350e9f`](https://github.com/apache/spark/commit/c350e9fd0d20ad7114644d53df83231548171f89).
[GitHub] spark issue #15551: [SPARK-18012][SQL] Simplify WriterContainer
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15551 **[Test build #67226 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67226/consoleFull)** for PR 15551 at commit [`cfba7bb`](https://github.com/apache/spark/commit/cfba7bbe05a511b960aba07794c654791009e07a). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #15555: [Minor][ML] Refactor clustering summary.
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/15555#discussion_r84204673 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala --- @@ -354,21 +354,41 @@ object KMeans extends DefaultParamsReadable[KMeans] { @Since("2.0.0") @Experimental class KMeansSummary private[clustering] ( -@Since("2.0.0") @transient val predictions: DataFrame, -@Since("2.0.0") val predictionCol: String, -@Since("2.0.0") val featuresCol: String, -@Since("2.0.0") val k: Int) extends Serializable { +predictions: DataFrame, +predictionCol: String, +featuresCol: String, +k: Int) + extends ClusteringSummary ( +predictions, +predictionCol, +featuresCol, +k + ) + +/** + * :: Experimental :: + * Summary of clustering. + * + * @param predictions [[DataFrame]] produced by model.transform() + * @param predictionCol Name for column of predicted clusters in `predictions` + * @param featuresCol Name for column of features in `predictions` + * @param k Number of clusters + */ +@Experimental --- End diff -- ```ClusteringSummary``` will be extended by summaries added in different versions, so I think we should not add a since version here. As for the suggestion of a new file: ```ClusteringSummary``` is a small class, so I think we can place it here temporarily.
[GitHub] spark pull request #15532: [SPARK-17989][SQL] Check ascendingOrder type in s...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15532
[GitHub] spark issue #15423: [SPARK-17860][SQL] SHOW COLUMN's database conflict check...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/15423 LGTM
[GitHub] spark pull request #15546: [SPARK-17982][SQL] SQLBuilder should wrap the gen...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/15546#discussion_r84204517 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/catalyst/SQLBuilder.scala --- @@ -138,6 +138,9 @@ class SQLBuilder private ( case g: Generate => generateToSQL(g) +case Limit(limitExpr, child @ Project(_, _)) => + s"(${toSQL(child)} LIMIT ${limitExpr.sql})" + case Limit(limitExpr, child) => --- End diff -- is it always safe to just add limit to any plan?
[GitHub] spark pull request #15423: [SPARK-17860][SQL] SHOW COLUMN's database conflic...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/15423#discussion_r84204534 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala --- @@ -1713,4 +1713,21 @@ class DDLSuite extends QueryTest with SharedSQLContext with BeforeAndAfterEach { assert(sql("show user functions").count() === 1L) } } + + test("show columns - negative test") { +// When case sensitivity is true, the user supplied database name in table identifier +// should match the supplied database name in case sensitive way. +withSQLConf(SQLConf.CASE_SENSITIVE.key -> "true") { + withTempDatabase { db => +val tabName = s"$db.showcolumn" +withTable(tabName) { + sql(s"CREATE TABLE $tabName(col1 int, col2 string) USING parquet ") + val message = intercept[AnalysisException] { + sql(s"SHOW COLUMNS IN $db.showcolumn FROM ${db.toUpperCase}") --- End diff -- nit: wrong indent here.
[GitHub] spark issue #15532: [SPARK-17989][SQL] Check ascendingOrder type in sort_arr...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15532 Thanks - merging in master/branch-2.0.
[GitHub] spark issue #15493: [SPARK-17946][PYSPARK] Python crossJoin API similar to S...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15493 Thanks!
[GitHub] spark issue #15550: [SPARK-18003][Spark Core] Fix bug of RDD zipWithIndex & ...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15550 Cool - LGTM! (I will merge once Jenkins comes back positive)
[GitHub] spark issue #15493: [SPARK-17946][PYSPARK] Python crossJoin API similar to S...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/15493 I will take that and add my note.
[GitHub] spark pull request #15539: [SPARK-17994] [SQL] Add back a file status cache ...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/15539#discussion_r84203885

--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveTablePerfStatsSuite.scala ---
@@ -103,11 +92,103 @@ class HiveDataFrameSuite extends QueryTest with TestHiveSingleton with SQLTestUt
          assert(HiveCatalogMetrics.METRIC_PARTITIONS_FETCHED.getCount() == 5)
          assert(HiveCatalogMetrics.METRIC_FILES_DISCOVERED.getCount() == 5)

-         // read all should be cached
+         // read all should not be cached
          HiveCatalogMetrics.reset()
          spark.sql("select * from test").count()
+         assert(HiveCatalogMetrics.METRIC_PARTITIONS_FETCHED.getCount() == 5)
+         assert(HiveCatalogMetrics.METRIC_FILES_DISCOVERED.getCount() == 5)
+
+         // cache should be disabled
+         assert(HiveCatalogMetrics.METRIC_FILE_CACHE_HITS.getCount() == 0)
+       }
+     }
+   }
+ }
+
+ test("lazy partition pruning with file status caching enabled") {
+   withSQLConf(
+       "spark.sql.hive.filesourcePartitionPruning" -> "true",
+       "spark.sql.hive.filesourcePartitionFileCacheSize" -> "999") {
+     withTable("test") {
+       withTempDir { dir =>
+         setupPartitionedTable("test", dir)
+         HiveCatalogMetrics.reset()
+         assert(spark.sql("select * from test where partCol1 = 999").count() == 0)
          assert(HiveCatalogMetrics.METRIC_PARTITIONS_FETCHED.getCount() == 0)
          assert(HiveCatalogMetrics.METRIC_FILES_DISCOVERED.getCount() == 0)
+         assert(HiveCatalogMetrics.METRIC_FILE_CACHE_HITS.getCount() == 0)
+
+         HiveCatalogMetrics.reset()
+         assert(spark.sql("select * from test where partCol1 < 2").count() == 2)
+         assert(HiveCatalogMetrics.METRIC_PARTITIONS_FETCHED.getCount() == 2)
+         assert(HiveCatalogMetrics.METRIC_FILES_DISCOVERED.getCount() == 2)
+         assert(HiveCatalogMetrics.METRIC_FILE_CACHE_HITS.getCount() == 0)
+
+         HiveCatalogMetrics.reset()
+         assert(spark.sql("select * from test where partCol1 < 3").count() == 3)
+         assert(HiveCatalogMetrics.METRIC_PARTITIONS_FETCHED.getCount() == 3)
+         assert(HiveCatalogMetrics.METRIC_FILES_DISCOVERED.getCount() == 1)
+         assert(HiveCatalogMetrics.METRIC_FILE_CACHE_HITS.getCount() == 2)
+
+         HiveCatalogMetrics.reset()
+         assert(spark.sql("select * from test").count() == 5)
+         assert(HiveCatalogMetrics.METRIC_PARTITIONS_FETCHED.getCount() == 5)
+         assert(HiveCatalogMetrics.METRIC_FILES_DISCOVERED.getCount() == 2)
+         assert(HiveCatalogMetrics.METRIC_FILE_CACHE_HITS.getCount() == 3)
+
+         HiveCatalogMetrics.reset()
+         assert(spark.sql("select * from test").count() == 5)
+         assert(HiveCatalogMetrics.METRIC_PARTITIONS_FETCHED.getCount() == 5)
+         assert(HiveCatalogMetrics.METRIC_FILES_DISCOVERED.getCount() == 0)
+         assert(HiveCatalogMetrics.METRIC_FILE_CACHE_HITS.getCount() == 5)
+       }
+     }
+   }
+ }
+
+ test("file status caching respects refresh table and refreshByPath") {
+   withSQLConf(
+       "spark.sql.hive.filesourcePartitionPruning" -> "true",
+       "spark.sql.hive.filesourcePartitionFileCacheSize" -> "999") {
+     withTable("test") {
+       withTempDir { dir =>
+         setupPartitionedTable("test", dir)
+         HiveCatalogMetrics.reset()
+         assert(spark.sql("select * from test").count() == 5)
+         assert(HiveCatalogMetrics.METRIC_FILES_DISCOVERED.getCount() == 5)
+         assert(HiveCatalogMetrics.METRIC_FILE_CACHE_HITS.getCount() == 0)
+
+         HiveCatalogMetrics.reset()
+         spark.sql("refresh table test")
+         assert(spark.sql("select * from test").count() == 5)
+         assert(HiveCatalogMetrics.METRIC_FILES_DISCOVERED.getCount() == 5)
+         assert(HiveCatalogMetrics.METRIC_FILE_CACHE_HITS.getCount() == 0)
+
+         spark.catalog.cacheTable("test")
+         HiveCatalogMetrics.reset()
+         spark.catalog.refreshByPath(dir.getAbsolutePath)
+         assert(spark.sql("select * from test").count() == 5)
+         assert(HiveCatalogMetrics.METRIC_FILES_DISCOVERED.getCount() == 5)
+         assert(HiveCatalogMetrics.METRIC_FILE_CACHE_HITS.getCount() == 0)
+       }
+     }
+   }
+ }
+
+ test("file status cache respects size limit") {
+   withSQLConf(
+       "spark.sql.hive.filesourcePartitionPruning" -> "true",
+       "spark.sql.hive.filesourcePartitionFileCacheSize" -> "1" /* 1 byte */) {
---
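The size-limit behavior exercised above (a 1-byte limit disables caching, a 999-byte limit retains entries until evicted) can be sketched as a minimal weight-bounded cache. This is an illustration with invented names (`FileStatusCache`, `FileStatusLite`), not Spark's actual implementation, and the per-entry weight heuristic is an assumption:

```scala
import scala.collection.mutable

// Hypothetical lightweight stand-in for Hadoop's FileStatus.
case class FileStatusLite(path: String, length: Long)

// Sketch of a size-bounded file status cache: once the estimated byte
// weight exceeds the limit, the oldest entries are evicted first.
class FileStatusCache(maxSizeInBytes: Long) {
  // Insertion-ordered map, so `head` is the oldest entry.
  private val entries = mutable.LinkedHashMap[String, Seq[FileStatusLite]]()
  private var currentWeight = 0L

  // Assumed heuristic: key bytes plus a fixed overhead per cached status.
  private def weight(k: String, v: Seq[FileStatusLite]): Long =
    k.length + v.size * 64L

  def put(path: String, statuses: Seq[FileStatusLite]): Unit = {
    entries.remove(path).foreach(old => currentWeight -= weight(path, old))
    entries.put(path, statuses)
    currentWeight += weight(path, statuses)
    while (currentWeight > maxSizeInBytes && entries.nonEmpty) {
      val (k, v) = entries.head
      entries.remove(k)
      currentWeight -= weight(k, v)
    }
  }

  def getIfPresent(path: String): Option[Seq[FileStatusLite]] = entries.get(path)

  def invalidateAll(): Unit = { entries.clear(); currentWeight = 0L }
}
```

With a 1-byte limit every insertion immediately evicts itself, which is why the test above expects zero cache hits in that configuration.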
[GitHub] spark pull request #15539: [SPARK-17994] [SQL] Add back a file status cache ...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/15539#discussion_r84203441

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala ---
@@ -231,11 +231,16 @@ private[hive] class HiveMetastoreCatalog(sparkSession: SparkSession) extends Log
     val sizeInBytes = metastoreRelation.statistics.sizeInBytes.toLong
     val fileCatalog = {
       val catalog = new TableFileCatalog(
-        sparkSession, db, table, Some(partitionSchema), sizeInBytes)
+        sparkSession, db, table, Some(partitionSchema), sizeInBytes,
+        fileStatusCacheSize = if (lazyPruningEnabled) {
--- End diff --

Shall we inline this logic in `TableFileCatalog`? `TableFileCatalog` can access `SparkSession`, so it can read these two flags itself: `filesourcePartitionPruning` and `filesourcePartitionFileCacheSize`.

--- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
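cloud-fan's suggestion, roughly, is that the catalog derive the cache size from its own session configuration instead of the caller computing and passing it. A minimal sketch of that shape, with a hypothetical `SimpleConf` standing in for Spark's `SQLConf` (names and defaults are assumptions):

```scala
// Hypothetical stand-in for a session configuration.
class SimpleConf(settings: Map[String, String]) {
  def get(key: String, default: String): String = settings.getOrElse(key, default)
}

// The inlined logic: the cache is only useful when partition file listing
// is deferred, so its size is zero unless lazy pruning is enabled.
def fileStatusCacheSize(conf: SimpleConf): Long = {
  val lazyPruningEnabled =
    conf.get("spark.sql.hive.filesourcePartitionPruning", "false").toBoolean
  if (lazyPruningEnabled) {
    conf.get("spark.sql.hive.filesourcePartitionFileCacheSize", "0").toLong
  } else {
    0L
  }
}
```

The design upside is that every constructor of `TableFileCatalog` gets consistent behavior for free, at the cost of coupling the class to the two config keys.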
[GitHub] spark pull request #15555: [Minor][ML] Refactor clustering summary.
Github user zhengruifeng commented on a diff in the pull request: https://github.com/apache/spark/pull/1#discussion_r84203200

--- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/KMeans.scala ---
@@ -354,21 +354,41 @@ object KMeans extends DefaultParamsReadable[KMeans] {
 @Since("2.0.0")
 @Experimental
 class KMeansSummary private[clustering] (
-    @Since("2.0.0") @transient val predictions: DataFrame,
-    @Since("2.0.0") val predictionCol: String,
-    @Since("2.0.0") val featuresCol: String,
-    @Since("2.0.0") val k: Int) extends Serializable {
+    predictions: DataFrame,
+    predictionCol: String,
+    featuresCol: String,
+    k: Int)
+  extends ClusteringSummary(
+    predictions,
+    predictionCol,
+    featuresCol,
+    k)
+
+/**
+ * :: Experimental ::
+ * Summary of clustering.
+ *
+ * @param predictions [[DataFrame]] produced by model.transform()
+ * @param predictionCol Name for column of predicted clusters in `predictions`
+ * @param featuresCol Name for column of features in `predictions`
+ * @param k Number of clusters
+ */
+@Experimental
--- End diff --

What about adding `@Since("2.1.0")` here? And should we create a new Scala file named `Clustering.scala` and move `ClusteringSummary` into it?
[GitHub] spark issue #15550: [SPARK-18003][Spark Core] Fix bug of RDD zipWithIndex & ...
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/15550

@tejasapatil I checked the other references to Scala's `zipWithIndex` and fixed the similar case in `RDD.zipWithUniqueId`, thanks!
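The `zipWithUniqueId` semantics being fixed alongside `zipWithIndex` can be illustrated in plain Scala (this is a local simulation of the documented id scheme, not Spark's code): the k-th item of partition p gets id `k * numPartitions + p`, which is unique without launching a job but not consecutive.

```scala
// Local sketch of RDD.zipWithUniqueId's id assignment. Using Long arithmetic
// here matters: Int-based index math is exactly the overflow class that bites
// with very large partitions, which is the bug family this PR addresses.
def zipWithUniqueIdLocal[T](partitions: Seq[Seq[T]]): Seq[Seq[(T, Long)]] = {
  val n = partitions.length
  partitions.zipWithIndex.map { case (part, p) =>
    // k-th element of partition p -> id k * n + p
    part.zipWithIndex.map { case (item, k) => (item, k.toLong * n + p) }
  }
}
```

For two partitions `[a, b, c]` and `[d, e]`, the ids come out as `0, 2, 4` and `1, 3`: unique across partitions, with gaps.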
[GitHub] spark issue #15550: [SPARK-18003][Spark Core] Fix bug of RDD zipWithIndex ge...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15550

**[Test build #67228 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67228/consoleFull)** for PR 15550 at commit [`b83a606`](https://github.com/apache/spark/commit/b83a60671e5ab1df4a5b5602db125b5da18d3bd6).
[GitHub] spark issue #15515: [SPARK-17970][SQL][WIP] store partition spec in metastor...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15515

**[Test build #67230 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67230/consoleFull)** for PR 15515 at commit [`4d93f48`](https://github.com/apache/spark/commit/4d93f48788d05ae6983f0f2feae6cc4368c039e7).
[GitHub] spark issue #15523: [SPARK-17981] [SPARK-17957] [SQL] Fix Incorrect Nullabil...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15523

**[Test build #67229 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67229/consoleFull)** for PR 15523 at commit [`52cb8fb`](https://github.com/apache/spark/commit/52cb8fb33e79b527008b68dccbaeb5bcc82f5feb).
[GitHub] spark issue #15523: [SPARK-17981] [SPARK-17957] [SQL] Fix Incorrect Nullabil...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/15523

The param name of the verification function is wrong. It should be `expectedNonNullableColumns`. Please check the test case again.
[GitHub] spark pull request #15539: [SPARK-17994] [SQL] Add back a file status cache ...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/15539#discussion_r84202841

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ListingFileCatalog.scala ---
@@ -64,11 +66,18 @@ class ListingFileCatalog(
   }

   override def refresh(): Unit = {
+    refresh0(true)
   }
+
+  private def refresh0(invalidateSharedCache: Boolean): Unit = {
--- End diff --

Is the cache shared?
[GitHub] spark issue #15555: [Minor][ML] Refactor clustering summary.
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/1

cc @zhengruifeng @srowen
[GitHub] spark issue #14847: [SPARK-17254][SQL] Add StopAfter physical plan for the f...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/14847

@ioana-delaney Thanks for the review! I've replied to a few points first; I will add the tests you mentioned later.

4. This feature is motivated by the bucketed (and sorted, of course) table JIRA. In that case the Filter can be applied to the sorted data, so we can leverage the ordering to optimize the filtering. Another case I can think of is cached data: when you cache sorted data, as I did in the current tests, the Filter won't be pushed down and will work on the sorted data directly.

5. Yeah. From the view of the `StopAfter` operator, it only cares about whether its child is sorted. If the bucketed table is sorted too, it can support it. Of course I will add a test for it.

6. Yes it does. Currently, if the bucketed table is inserted into or appended to, we can't guarantee its sort order, so it will be skipped.
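The optimization behind point 4 can be shown with a plain-Scala analogy (an illustration of the idea, not the `StopAfter` operator itself): when the data is sorted on the filtered column, an upper-bound predicate can stop scanning at the first failing row instead of testing every row.

```scala
// On sorted input, `x < upperBound` partitions the sequence into a prefix
// that satisfies the predicate and a suffix that cannot, so takeWhile
// (which stops at the first failure) returns the same rows as filter
// while examining only prefix.length + 1 elements.
def filterSorted(sorted: Seq[Int], upperBound: Int): Seq[Int] =
  sorted.takeWhile(_ < upperBound)
```

On unsorted input this rewrite would be incorrect, which is why the operator has to verify the child's sort order before applying it.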
[GitHub] spark issue #15541: [SPARK-17637][Scheduler]Packed scheduling for Spark task...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15541

Merged build finished. Test PASSed.
[GitHub] spark issue #15541: [SPARK-17637][Scheduler]Packed scheduling for Spark task...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15541

Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67224/
[GitHub] spark issue #15541: [SPARK-17637][Scheduler]Packed scheduling for Spark task...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15541

**[Test build #67224 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67224/consoleFull)** for PR 15541 at commit [`e81b279`](https://github.com/apache/spark/commit/e81b2796f7b0c0c4a52d7345f14e0ab4e0e143b3).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #15523: [SPARK-17981] [SPARK-17957] [SQL] Fix Incorrect Nullabil...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/15523

@gatorsmile A predicate like `IsNotNull(a + b + Rand())` will cause this change to wrongly set the nullability of `a` and `b` to true, won't it?
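The general guard viirya is pointing at can be sketched with a toy expression tree (hypothetical types, not Catalyst's): a common rule is to derive attribute nullability facts from an `IsNotNull` predicate only when the predicate is deterministic, since a `Rand()` inside the expression breaks any row-by-row guarantee the optimizer wants to reuse.

```scala
// Toy expression hierarchy; `deterministic` propagates bottom-up,
// as it does in Catalyst expressions.
sealed trait Expr { def deterministic: Boolean }
case class Attr(name: String) extends Expr { val deterministic = true }
case class Add(left: Expr, right: Expr) extends Expr {
  def deterministic: Boolean = left.deterministic && right.deterministic
}
case object Rand extends Expr { val deterministic = false }
case class IsNotNull(child: Expr) extends Expr {
  def deterministic: Boolean = child.deterministic
}

// Collect the attribute names an expression references.
def references(e: Expr): Set[String] = e match {
  case Attr(n)      => Set(n)
  case Add(l, r)    => references(l) ++ references(r)
  case IsNotNull(c) => references(c)
  case Rand         => Set.empty
}

// Only infer non-nullability from deterministic IsNotNull predicates;
// IsNotNull(a + b + Rand()) yields nothing.
def inferNonNullable(pred: Expr): Set[String] = pred match {
  case p @ IsNotNull(child) if p.deterministic => references(child)
  case _                                       => Set.empty
}
```

Whether this particular guard is what the PR needs is exactly the question being raised; the sketch only shows why non-determinism makes the naive inference unsound.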
[GitHub] spark issue #15545: [SPARK-17999][Kafka][SQL] Add getPreferredLocations for ...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/15545

@zsxwing, would you mind taking a look at this PR? Thanks a lot.
[GitHub] spark issue #15551: [SPARK-18012][SQL] Simplify WriterContainer
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15551

**[Test build #67227 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67227/consoleFull)** for PR 15551 at commit [`0ca0c81`](https://github.com/apache/spark/commit/0ca0c817fcbf5c421b1af9c122fb8c93638a0383).
[GitHub] spark pull request #15552: [SPARK-18007][SparkR][ML] update SparkR MLP - add...
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/15552#discussion_r84199689

--- Diff: mllib/src/main/scala/org/apache/spark/ml/r/MultilayerPerceptronClassifierWrapper.scala ---
@@ -73,6 +75,8 @@ private[r] object MultilayerPerceptronClassifierWrapper
       .setStepSize(stepSize)
       .setPredictionCol(PREDICTED_LABEL_COL)
     if (seed != null && seed.length > 0) mlp.setSeed(seed.toInt)
+    if (initialWeights != null) mlp.setInitialWeights(Vectors.dense(initialWeights))
--- End diff --

The length must be a specific value: for example, if layers = [a, b, c], the weight length is a * (b + 1) + b * (c + 1). This is checked inside the Scala MLP class, so does the R side need to recheck it?
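A small helper makes the expected weight-vector length concrete. This sketch (illustrative, not SparkR's or MLlib's actual check) counts `out * (in + 1)` parameters per layer transition, i.e. an `in × out` weight matrix plus one bias per output unit; for the symmetric example layers = [2, 3, 2] this agrees with the expression in the comment, both giving 17.

```scala
// Expected MLP weight vector length for a layer spec like Seq(2, 3, 2):
// each adjacent pair (in, out) contributes in*out weights + out biases.
def expectedWeightLength(layers: Seq[Int]): Int =
  layers.sliding(2).map { case Seq(in, out) => out * (in + 1) }.sum
```

Validating this on the R side too would let users get an immediate, readable error instead of one surfacing from deep inside the Scala MLP class.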
[GitHub] spark issue #15423: [SPARK-17860][SQL] SHOW COLUMN's database conflict check...
Github user dilipbiswal commented on the issue: https://github.com/apache/spark/pull/15423

@cloud-fan Hi Wenchen, I have added the test cases for temp views. Could you please take a look at this again? Thanks!
[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15539

Merged build finished. Test PASSed.
[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15539

Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67222/
[GitHub] spark issue #15539: [SPARK-17994] [SQL] Add back a file status cache for cat...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15539

**[Test build #67222 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67222/consoleFull)** for PR 15539 at commit [`35b565b`](https://github.com/apache/spark/commit/35b565bf7aa114068fc11fd50a0c51574be0cbe8).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request #15552: [SPARK-18007][SparkR][ML] update SparkR MLP - add...
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/15552#discussion_r84198787

--- Diff: R/pkg/R/mllib.R ---
@@ -706,10 +707,13 @@ setMethod("spark.mlp", signature(data = "SparkDataFrame"),
           if (!is.null(seed)) {
             seed <- as.character(as.integer(seed))
           }
+          if (!is.null(initialWeights)) {
+            initialWeights <- as.array(as.numeric(na.omit(initialWeights)))
--- End diff --

The Scala-side wrapper needs an `Array[Double]` param, so `as.array` is used here, like other parameters such as `layers`.
[GitHub] spark issue #15009: [SPARK-17443][SPARK-11035] Stop Spark Application if lau...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15009

Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67221/
[GitHub] spark issue #15009: [SPARK-17443][SPARK-11035] Stop Spark Application if lau...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15009

Merged build finished. Test PASSed.
[GitHub] spark issue #15009: [SPARK-17443][SPARK-11035] Stop Spark Application if lau...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15009

**[Test build #67221 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67221/consoleFull)** for PR 15009 at commit [`14050f5`](https://github.com/apache/spark/commit/14050f5d16b01d5081b456b92afbe78cd79b8d25).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #15551: [SPARK-18012][SQL] Simplify WriterContainer
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15551

**[Test build #67226 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/67226/consoleFull)** for PR 15551 at commit [`cfba7bb`](https://github.com/apache/spark/commit/cfba7bbe05a511b960aba07794c654791009e07a).
[GitHub] spark issue #15551: [SPARK-18012][SQL] Simplify WriterContainer
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15551

Merged build finished. Test FAILed.
[GitHub] spark issue #15551: [SPARK-18012][SQL] Simplify WriterContainer
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15551

Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/67223/