[GitHub] spark issue #15292: [SPARK-17719][SPARK-17776][SQL] Unify and tie up options...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15292 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15292: [SPARK-17719][SPARK-17776][SQL] Unify and tie up options...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15292 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66365/ Test PASSed.
[GitHub] spark issue #15292: [SPARK-17719][SPARK-17776][SQL] Unify and tie up options...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15292

**[Test build #66365 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66365/consoleFull)** for PR 15292 at commit [`3fa9f43`](https://github.com/apache/spark/commit/3fa9f43686f1195a9f86ab1bcda054119c332a20).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #15231: [SPARK-17658][SPARKR] read.df/write.df API taking path o...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/15231 I think it would also be great to add examples: one for read.df and one for write.df without the path parameter (like a JDBC one).
[GitHub] spark issue #15359: [Minor][ML] Avoid 2D array flatten in NB training.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15359 **[Test build #66377 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66377/consoleFull)** for PR 15359 at commit [`0d9a9c7`](https://github.com/apache/spark/commit/0d9a9c74ca714b1df3dde50f2c0386a4a974fa73).
[GitHub] spark pull request #15231: [SPARK-17658][SPARKR] read.df/write.df API taking...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/15231#discussion_r81904325

--- Diff: R/pkg/inst/tests/testthat/test_utils.R ---
```
@@ -167,10 +167,13 @@ test_that("convertToJSaveMode", {
 })

 test_that("captureJVMException", {
-  expect_error(tryCatch(callJStatic("org.apache.spark.sql.api.r.SQLUtils", "getSQLDataType",
+  method <- "getSQLDataType"
+  expect_error(tryCatch(callJStatic("org.apache.spark.sql.api.r.SQLUtils", method,
```
--- End diff --

let's change this test to `handledCallJStatic` too in a follow up?
[GitHub] spark issue #15359: [Minor][ML] Avoid 2D array flatten in NB training.
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/15359 cc @zhengruifeng @sethah
[GitHub] spark pull request #15359: [Minor][ML] Avoid 2D array flatten in NB training...
GitHub user yanboliang opened a pull request: https://github.com/apache/spark/pull/15359

[Minor][ML] Avoid 2D array flatten in NB training.

## What changes were proposed in this pull request?

Avoid 2D array flatten in ```NaiveBayes``` training, since the `flatten` method can be expensive (it creates another array and copies the data into it).

## How was this patch tested?

Existing tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/yanboliang/spark nb-theta

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/15358.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #15359

commit 0d9a9c74ca714b1df3dde50f2c0386a4a974fa73
Author: Yanbo Liang
Date: 2016-10-05T05:39:13Z

    Avoid 2D array flatten in NB training.
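The cost the PR description refers to can be illustrated with a small standalone sketch (the array shapes and names here are hypothetical, not taken from the actual `NaiveBayes` code): `flatten` on an `Array[Array[Double]]` allocates a new contiguous array and copies every element into it, whereas reading through the nested arrays directly touches the same values without that extra allocation.

```scala
// Sketch only: shows that flatten allocates and copies, while nested
// indexing reads the same values in place. Shapes are hypothetical.
object FlattenSketch {
  def main(args: Array[String]): Unit = {
    val rows: Array[Array[Double]] = Array(Array(1.0, 2.0), Array(3.0, 4.0))

    // flatten builds a brand-new Array[Double] of total size 4 and copies.
    val flat: Array[Double] = rows.flatten

    // Direct nested indexing yields the same sequence with no extra copy
    // of the underlying data.
    val direct = for (i <- rows.indices; j <- rows(i).indices) yield rows(i)(j)

    assert(flat.toSeq == direct)
    println(flat.mkString(","))  // 1.0,2.0,3.0,4.0
  }
}
```

When the per-label arrays are consumed only once, iterating them in place avoids the transient copy; `flatten` is worth it only if a contiguous layout is needed afterwards.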
[GitHub] spark issue #15258: [SPARK-17689][SQL][STREAMING] added excludeFiles option ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15258 **[Test build #66374 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66374/consoleFull)** for PR 15258 at commit [`01cb666`](https://github.com/apache/spark/commit/01cb6664ea9ea2da7bc861432c19e3ac14ede524).
[GitHub] spark issue #15262: [SPARK-17690][STREAMING][SQL] Add mini-dfs cluster based...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15262 **[Test build #66373 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66373/consoleFull)** for PR 15262 at commit [`3a1cd22`](https://github.com/apache/spark/commit/3a1cd221402f4ade6b496996b81665ad19ce3e86).
[GitHub] spark issue #15355: [SPARK-17782][STREAMING] Disable Kafka 010 pattern based...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15355 Merged build finished. Test PASSed.
[GitHub] spark issue #15355: [SPARK-17782][STREAMING] Disable Kafka 010 pattern based...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15355 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66361/ Test PASSed.
[GitHub] spark issue #14151: [SPARK-16496][SQL] Add wholetext as option for reading t...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14151 **[Test build #66375 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66375/consoleFull)** for PR 14151 at commit [`e263b15`](https://github.com/apache/spark/commit/e263b1508a77424b371a0796ea4f9c05bc1c0121).
[GitHub] spark issue #14087: [SPARK-16411][SQL][STREAMING] Add textFile to Structured...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14087 **[Test build #66376 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66376/consoleFull)** for PR 14087 at commit [`ecdf653`](https://github.com/apache/spark/commit/ecdf6539c8c19da3f019601309993fde634d6c22).
[GitHub] spark issue #15355: [SPARK-17782][STREAMING] Disable Kafka 010 pattern based...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15355

**[Test build #66361 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66361/consoleFull)** for PR 15355 at commit [`b7074d4`](https://github.com/apache/spark/commit/b7074d48159804035eaf00e1abed35e408684b42).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #15354: [SPARK-17764][SQL] Add `to_json` supporting to convert n...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15354 **[Test build #66372 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66372/consoleFull)** for PR 15354 at commit [`5f185e3`](https://github.com/apache/spark/commit/5f185e36aba86865e2cae772351e90fb8bec6492).
[GitHub] spark issue #12355: [SPARK-14344][SQL] Not creating meta files when summary-...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/12355

It seems to work fine now, so it does not appear to be a problem.

```scala
test("SPARK-14344 - write metadata") {
  withSQLConf(ParquetOutputFormat.ENABLE_JOB_SUMMARY -> "true") {
    withTempPath { dir =>
      val path = s"${dir.getCanonicalPath}/part-r-0.parquet"
      spark.range(10).write.parquet(path)
      val compressedFiles = new File(path).listFiles()
      assert(compressedFiles.exists(_.getName.endsWith("_common_metadata")))
    }
  }
  withSQLConf(ParquetOutputFormat.ENABLE_JOB_SUMMARY -> "false") {
    withTempPath { dir =>
      val path = s"${dir.getCanonicalPath}/part-r-0.parquet"
      spark.range(10).write.parquet(path)
      val compressedFiles = new File(path).listFiles()
      assert(!compressedFiles.exists(_.getName.endsWith("_common_metadata")))
    }
  }
}
```
[GitHub] spark issue #15249: [SPARK-17675] [CORE] Expand Blacklist for TaskSets
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15249 Merged build finished. Test PASSed.
[GitHub] spark issue #15249: [SPARK-17675] [CORE] Expand Blacklist for TaskSets
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15249 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66358/ Test PASSed.
[GitHub] spark issue #15249: [SPARK-17675] [CORE] Expand Blacklist for TaskSets
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15249

**[Test build #66358 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66358/consoleFull)** for PR 15249 at commit [`89d3c5e`](https://github.com/apache/spark/commit/89d3c5eb44939c38b0be14a6fc10c2139d0126ab).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #14452: [SPARK-16849][SQL] Improve subquery execution by dedupli...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14452 Merged build finished. Test PASSed.
[GitHub] spark issue #14452: [SPARK-16849][SQL] Improve subquery execution by dedupli...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14452

**[Test build #66360 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66360/consoleFull)** for PR 14452 at commit [`cebfbf5`](https://github.com/apache/spark/commit/cebfbf5e3dd7b2d2365e5152991ab7ff2c63dd90).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #15307: [SPARK-17731][SQL][STREAMING] Metrics for structured str...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15307 Merged build finished. Test FAILed.
[GitHub] spark issue #15307: [SPARK-17731][SQL][STREAMING] Metrics for structured str...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15307 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66362/ Test FAILed.
[GitHub] spark pull request #15357: [SPARK-17328][SQL] Fix NPE with EXPLAIN DESCRIBE ...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/15357#discussion_r81901737

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala ---
```
@@ -265,7 +265,9 @@ class SparkSqlAstBuilder(conf: SQLConf) extends AstBuilder {
     }
     val statement = plan(ctx.statement)
-    if (isExplainableStatement(statement)) {
+    if (statement == null) {
+      null // This is enough since ParseException will raise later.
```
--- End diff --

I added the testcase to intercept that. If it returns `null`, `ParseDriver.scala` recognizes it as a parsing error and raises `Unsupported SQL statement`.

```scala
override def parsePlan(sqlText: String): LogicalPlan = parse(sqlText) { parser =>
  astBuilder.visitSingleStatement(parser.singleStatement()) match {
    case plan: LogicalPlan => plan
    case _ =>
      val position = Origin(None, None)
      throw new ParseException(Option(sqlText), "Unsupported SQL statement", position, position)
  }
}
```
[GitHub] spark issue #15307: [SPARK-17731][SQL][STREAMING] Metrics for structured str...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15307

**[Test build #66362 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66362/consoleFull)** for PR 15307 at commit [`f5732a5`](https://github.com/apache/spark/commit/f5732a50da7f0df326f52ad9b85da3876ecfafbc).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #12693: [SPARK-14914] Fix Resource not closed after using, mostl...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/12693 @srowen I am willing to proceed with this too if you approve and @taoli91 is not responding.
[GitHub] spark issue #15357: [SPARK-17328][SQL] Fix NPE with EXPLAIN DESCRIBE TABLE
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15357 **[Test build #66371 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66371/consoleFull)** for PR 15357 at commit [`45e46a9`](https://github.com/apache/spark/commit/45e46a969919c3fb184a3678764fa094054d223a).
[GitHub] spark issue #12135: [SPARK-14352][SQL] approxQuantile should support multi c...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12135 Merged build finished. Test PASSed.
[GitHub] spark issue #12135: [SPARK-14352][SQL] approxQuantile should support multi c...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12135 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66359/ Test PASSed.
[GitHub] spark issue #12135: [SPARK-14352][SQL] approxQuantile should support multi c...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12135

**[Test build #66359 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66359/consoleFull)** for PR 12135 at commit [`89ed0cc`](https://github.com/apache/spark/commit/89ed0ccfe22e345655f33fb77b670c4d2309ecd7).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #12135: [SPARK-14352][SQL] approxQuantile should support multi c...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12135 Merged build finished. Test FAILed.
[GitHub] spark issue #12135: [SPARK-14352][SQL] approxQuantile should support multi c...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/12135 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66363/ Test FAILed.
[GitHub] spark issue #12135: [SPARK-14352][SQL] approxQuantile should support multi c...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12135

**[Test build #66363 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66363/consoleFull)** for PR 12135 at commit [`a475090`](https://github.com/apache/spark/commit/a475090f5424752a1cfe04983d964f6fb85181b0).
* This patch **fails PySpark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request #15357: [SPARK-17328][SQL] Fix NPE with EXPLAIN DESCRIBE ...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/15357#discussion_r81900796

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala ---
```
@@ -265,7 +265,9 @@ class SparkSqlAstBuilder(conf: SQLConf) extends AstBuilder {
     }
     val statement = plan(ctx.statement)
-    if (isExplainableStatement(statement)) {
+    if (statement == null) {
+      null // This is enough since ParseException will raise later.
```
--- End diff --

Sure!
[GitHub] spark pull request #15357: [SPARK-17328][SQL] Fix NPE with EXPLAIN DESCRIBE ...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/15357#discussion_r81900460

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala ---
```
@@ -265,7 +265,9 @@ class SparkSqlAstBuilder(conf: SQLConf) extends AstBuilder {
     }
     val statement = plan(ctx.statement)
-    if (isExplainableStatement(statement)) {
+    if (statement == null) {
+      null // This is enough since ParseException will raise later.
```
--- End diff --

Where will the `ParseException` be raised? Could you be a bit more specific in your comment? Maybe add a small test for this? Could you also add a few unit tests (not end-2-end) to SparkSqlParserSuite?
[GitHub] spark issue #15358: [SPARK-17783] [SQL] Hide Credentials in CREATE and DESC ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15358 **[Test build #66370 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66370/consoleFull)** for PR 15358 at commit [`d3cc470`](https://github.com/apache/spark/commit/d3cc47025df10012940f281af5db94c90fc83917).
[GitHub] spark pull request #15358: Hide Credentials in CREATE and DESC FORMATTED/EXT...
GitHub user gatorsmile opened a pull request: https://github.com/apache/spark/pull/15358 Hide Credentials in CREATE and DESC FORMATTED/EXTENDED a PERSISTENT/TEMP Table for JDBC ### What changes were proposed in this pull request? (Please fill in changes proposed in this fix) ### How was this patch tested? Added test cases You can merge this pull request into a Git repository by running: $ git pull https://github.com/gatorsmile/spark maskCredentials Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/15358.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #15358 commit d3cc47025df10012940f281af5db94c90fc83917 Author: gatorsmile Date: 2016-10-05T04:37:35Z fix.
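The PR description above is still the template placeholder, but the title and the `maskCredentials` branch name suggest the change redacts JDBC credentials before table metadata is displayed. A minimal sketch of that idea follows; the object name, the key set, and the mask string here are illustrative assumptions, not the PR's actual code:

```scala
// Hypothetical sketch: redact sensitive JDBC options before they are
// rendered in CREATE TABLE / DESC FORMATTED output. The set of keys
// Spark actually masks is defined by the PR itself.
object CredentialMasking {
  // Assumed list of sensitive option keys.
  private val sensitiveKeys = Set("password", "user")

  def maskCredentials(options: Map[String, String]): Map[String, String] =
    options.map { case (key, value) =>
      if (sensitiveKeys.contains(key.toLowerCase)) key -> "###" else key -> value
    }
}
```

For example, `CredentialMasking.maskCredentials(Map("url" -> "jdbc:...", "password" -> "secret"))` would keep the URL but replace the password value with the mask.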
[GitHub] spark issue #15354: [SPARK-17764][SQL] Add `to_json` supporting to convert n...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15354 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66356/ Test FAILed.
[GitHub] spark issue #15354: [SPARK-17764][SQL] Add `to_json` supporting to convert n...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15354 Merged build finished. Test FAILed.
[GitHub] spark issue #15354: [SPARK-17764][SQL] Add `to_json` supporting to convert n...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15354 **[Test build #66356 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66356/consoleFull)** for PR 15354 at commit [`eec0cd3`](https://github.com/apache/spark/commit/eec0cd32bde8564a080da425be48986055523e8c). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15314: [SPARK-17747][ML] WeightCol support non-double datatypes
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15314 Merged build finished. Test PASSed.
[GitHub] spark issue #15314: [SPARK-17747][ML] WeightCol support non-double datatypes
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15314 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66364/ Test PASSed.
[GitHub] spark issue #15314: [SPARK-17747][ML] WeightCol support non-double datatypes
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15314 **[Test build #66364 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66364/consoleFull)** for PR 15314 at commit [`423fd51`](https://github.com/apache/spark/commit/423fd5117e32e971e47a02728d6a863a726fc539). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #15307: [SPARK-17731][SQL][STREAMING] Metrics for structu...
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/15307#discussion_r81899064 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -525,8 +645,62 @@ class StreamExecution( case object TERMINATED extends State } -object StreamExecution { +object StreamExecution extends Logging { private val _nextId = new AtomicLong(0) + /** + * Get the number of input rows from the executed plan of the trigger + * @param triggerExecutionPlan Execution plan of the trigger + * @param triggerLogicalPlan Logical plan of the trigger, generated from the query logical plan + * @param sourceToDataframe Source to DataFrame returned by the source.getBatch in this trigger + */ + def getNumInputRowsFromTrigger( --- End diff -- I managed to improve the test code, so I removed this static method.
[GitHub] spark issue #15307: [SPARK-17731][SQL][STREAMING] Metrics for structured str...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15307 **[Test build #66369 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66369/consoleFull)** for PR 15307 at commit [`e708b3b`](https://github.com/apache/spark/commit/e708b3b86a69833169962713ce8bef88bcbdc2f7).
[GitHub] spark issue #15357: [SPARK-17328][SQL] Fix NPE with EXPLAIN DESCRIBE TABLE
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15357 **[Test build #66368 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66368/consoleFull)** for PR 15357 at commit [`4b60195`](https://github.com/apache/spark/commit/4b601951e4b3311501363ac4de864c4bf9a1a756).
[GitHub] spark pull request #15307: [SPARK-17731][SQL][STREAMING] Metrics for structu...
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/15307#discussion_r81899011 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -525,8 +645,62 @@ class StreamExecution( case object TERMINATED extends State } -object StreamExecution { +object StreamExecution extends Logging { private val _nextId = new AtomicLong(0) + /** + * Get the number of input rows from the executed plan of the trigger + * @param triggerExecutionPlan Execution plan of the trigger + * @param triggerLogicalPlan Logical plan of the trigger, generated from the query logical plan + * @param sourceToDataframe Source to DataFrame returned by the source.getBatch in this trigger + */ + def getNumInputRowsFromTrigger( + triggerExecutionPlan: SparkPlan, + triggerLogicalPlan: LogicalPlan, + sourceToDataframe: Map[Source, DataFrame]): Map[Source, Long] = { + +// We want to associate execution plan leaves to sources that generate them, so that we match +// the their metrics (e.g. numOutputRows) to the sources. To do this we do the following. +// Consider the translation from the streaming logical plan to the final executed plan. +// +// streaming logical plan (with sources) <==> trigger's logical plan <==> executed plan +// +// 1. We keep track of streaming sources associated with each leaf in the trigger's logical plan +//- Each logical plan leaf will be associated with a single streaming source. +//- There can be multiple logical plan leaves associated a streaming source. +//- There can be leaves not associated with any streaming source, because they were +// generated from a batch source (e.g. stream-batch joins) +// +// 2. Assuming that the executed plan has same number of leaves in the same order as that of +//the trigger logical plan, we associate executed plan leaves with corresponding +//streaming sources. +// +// 3. For each source, we sum the metrics of the associated execution plan leaves.
+// +val logicalPlanLeafToSource = sourceToDataframe.flatMap { case (source, df) => + df.logicalPlan.collectLeaves().map { leaf => leaf -> source } +} +val allLogicalPlanLeaves = triggerLogicalPlan.collectLeaves() // includes non-streaming sources +val allExecPlanLeaves = triggerExecutionPlan.collectLeaves() +if (allLogicalPlanLeaves.size == allExecPlanLeaves.size) { + val execLeafToSource = allLogicalPlanLeaves.zip(allExecPlanLeaves).flatMap { +case (lp, ep) => logicalPlanLeafToSource.get(lp).map { source => ep -> source } + } + val sourceToNumInputRows = execLeafToSource.map { case (execLeaf, source) => +val numRows = execLeaf.metrics.get("numOutputRows").map(_.value).getOrElse(0L) +source -> numRows + } + sourceToNumInputRows.groupBy(_._1).mapValues(_.map(_._2).sum) // sum up rows for each source +} else { + def toString[T](seq: Seq[T]): String = s"(size = ${seq.size}), ${seq.mkString(", ")}" + logWarning( +"Could not report metrics as number leaves in trigger logical plan did not match that" + --- End diff -- I added [`logPeriodicWarning`](https://github.com/apache/spark/pull/15307/commits/e708b3b86a69833169962713ce8bef88bcbdc2f7)
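The three-step scheme in the diff above (map logical plan leaves to their streaming sources, zip them positionally with the executed plan's leaves, then sum `numOutputRows` per source) can be sketched outside of Spark's types as follows. The `Leaf` case class and the string-keyed source map are simplified stand-ins for `LogicalPlan`/`SparkPlan` leaves, not Spark's actual classes:

```scala
// Illustrative sketch of the leaf-matching scheme described above.
case class Leaf(id: Int, numOutputRows: Long)

def numInputRowsPerSource(
    logicalLeaves: Seq[Leaf],       // leaves of the trigger's logical plan
    execLeaves: Seq[Leaf],          // leaves of the executed physical plan
    leafToSource: Map[Int, String]  // logical leaf id -> streaming source name
): Map[String, Long] = {
  // The positional zip is only valid if both plans expose the same
  // number of leaves in the same order -- the assumption step 2 relies on.
  require(logicalLeaves.size == execLeaves.size, "leaf counts must match")
  logicalLeaves.zip(execLeaves)
    // Drop leaves that came from batch sources (no entry in leafToSource).
    .flatMap { case (lp, ep) => leafToSource.get(lp.id).map(src => src -> ep.numOutputRows) }
    .groupBy(_._1)
    // A source can back several leaves, so sum their row counts.
    .map { case (src, rows) => src -> rows.map(_._2).sum }
}
```

The fallback branch in the real code (leaf counts differing) is what triggers the `logWarning` discussed in this review thread.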
[GitHub] spark issue #15044: [WIP][SQL][SPARK-17490] Optimize SerializeFromObject() f...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15044 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66357/ Test FAILed.
[GitHub] spark pull request #15357: [SPARK-17328][SQL] Fix NPE with EXPLAIN DESCRIBE ...
GitHub user dongjoon-hyun opened a pull request: https://github.com/apache/spark/pull/15357 [SPARK-17328][SQL] Fix NPE with EXPLAIN DESCRIBE TABLE ## What changes were proposed in this pull request? This PR fixes the following NPE scenario. **Reported Error Scenario** ``` scala> sql("EXPLAIN DESCRIBE TABLE x").show(truncate = false) INFO SparkSqlParser: Parsing command: EXPLAIN DESCRIBE TABLE x java.lang.NullPointerException ``` ## How was this patch tested? Pass the Jenkins test with a new test case. You can merge this pull request into a Git repository by running: $ git pull https://github.com/dongjoon-hyun/spark SPARK-17328 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/15357.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #15357 commit 4b601951e4b3311501363ac4de864c4bf9a1a756 Author: Dongjoon Hyun Date: 2016-10-05T04:24:27Z [SPARK-17328][SQL] Fix NPE with EXPLAIN DESCRIBE TABLE
[GitHub] spark issue #15044: [WIP][SQL][SPARK-17490] Optimize SerializeFromObject() f...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15044 Merged build finished. Test FAILed.
[GitHub] spark issue #15044: [WIP][SQL][SPARK-17490] Optimize SerializeFromObject() f...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15044 **[Test build #66357 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66357/consoleFull)** for PR 15044 at commit [`2b22d12`](https://github.com/apache/spark/commit/2b22d128ef4c51643cd4dcdbe17a1f3d28362a90). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15249: [SPARK-17675] [CORE] Expand Blacklist for TaskSets
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15249 **[Test build #66367 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66367/consoleFull)** for PR 15249 at commit [`a6c863f`](https://github.com/apache/spark/commit/a6c863f2462986b66a93f0beac3bb1f163afa50d).
[GitHub] spark issue #15044: [WIP][SQL][SPARK-17490] Optimize SerializeFromObject() f...
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/15044 Thanks. I need to update one file and will do so later today (Japan time). PR #13758 can also solve this issue without allocating UnsafeArrayData. I think that PR #13758 is a more generic solution with a smaller amount of changes. Which PR is preferable, #15044 or #13758?
[GitHub] spark issue #15249: [SPARK-17675] [CORE] Expand Blacklist for TaskSets
Github user squito commented on the issue: https://github.com/apache/spark/pull/15249 @kayousterhout @mridulm thanks for the feedback. I obviously still need to figure out the timeout thing, but otherwise I think I've addressed everything. Will do another pass in the morning.
[GitHub] spark pull request #15249: [SPARK-17675] [CORE] Expand Blacklist for TaskSet...
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/15249#discussion_r81898588 --- Diff: core/src/main/scala/org/apache/spark/scheduler/BlacklistTracker.scala --- @@ -0,0 +1,128 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.scheduler + +import org.apache.spark.SparkConf +import org.apache.spark.internal.config +import org.apache.spark.internal.Logging +import org.apache.spark.util.Utils + +private[scheduler] object BlacklistTracker extends Logging { + + private val DEFAULT_TIMEOUT = "1h" --- End diff -- (longer top-level comment responding to this)
[GitHub] spark issue #15249: [SPARK-17675] [CORE] Expand Blacklist for TaskSets
Github user squito commented on the issue: https://github.com/apache/spark/pull/15249 @mridulm on the questions about expiry from blacklists, you are not missing anything -- this explicitly does not do any timeouts at the taskset level (this is mentioned in the design doc). The timeout code you see is mostly just incremental stuff as a step towards https://github.com/apache/spark/pull/14079, but doesn't actually add any value here. The primary motivation for blacklisting that I've seen is actually quite different from the use case you are describing -- it's not to help deal w/ resource contention, but to deal w/ truly broken resources (a bad disk in all the cases I can think of). In fact, in these cases, 1 hour is really short -- users really want something more like 6-12 hours probably. But 1 hr really isn't so bad, it just means that the bad resources need to be "rediscovered" that often, with a scheduling hiccup while that happens. This is really different from the use case you are describing -- it's a form of back-off to deal w/ resource contention. I have actually talked to a couple of different folks about doing something like this recently and think it would be great, though I see problems with this approach, since it allows other tasks to still be scheduled on those executors, and also the time isn't relative to the task runtime, etc. Nonetheless, an issue here might be that the old option serves some purpose which is no longer supported. Do we need to add it back in? Just adding the logic for the timeouts again is pretty easy, though (a) I need to figure out the right place to do it so that it doesn't impact scheduling performance and, more importantly, (b) I really worry about being able to configure things so that blacklisting can actually handle totally broken resources. E.g., say that you set the timeout to 10s.
If your tasks take 1 minute each, then your one bad executor might cycle through the leftover tasks, fail them all, pass the timeout, and repeat that cycle a few times till you go over spark.task.maxFailures. I don't see a good way to deal w/ that while setting a sensible timeout for the entire application. Two other workarounds: (2) just enable the timeout per-task when the legacy configuration is used. Leave it undocumented. We don't change behavior then, but the configuration is kind of a mess, and it'll be a headache to continue to maintain this. (3) Add a timeout just to *taskset*-level blacklisting. So it's a behavior change from the existing blacklisting, which has a timeout per *task*. This removes the interaction w/ spark.task.maxFailures that we've always got to tiptoe around. I also think it might satisfy your use case even better. I still don't think it's a great solution to the problem, and we need something else for handling this sort of backoff better, so I don't feel great about it getting shoved into this feature. I'm thinking (3) is the best but will give it a bit more thought. Also @kayousterhout @tgravescs @markhamstra for opinions as well since this is a bigger design point to consider.
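Option (3) above, a taskset-level blacklist whose entries expire after a timeout, can be sketched as follows. This is purely illustrative; the class name, method names, and structure are assumptions for this discussion, not the actual BlacklistTracker code in the PR:

```scala
import scala.collection.mutable

// Hypothetical sketch of option (3): per-taskset blacklist entries that
// expire after timeoutMs, decoupled from spark.task.maxFailures.
// The clock is injectable so the expiry logic is testable.
class TaskSetBlacklist(timeoutMs: Long,
                       clock: () => Long = () => System.currentTimeMillis()) {
  // executor id -> time the executor was blacklisted for this taskset
  private val blacklistedAt = mutable.Map.empty[String, Long]

  def addFailure(executorId: String): Unit =
    blacklistedAt(executorId) = clock()

  // An executor stays blacklisted only while its entry is younger than the timeout.
  def isBlacklisted(executorId: String): Boolean =
    blacklistedAt.get(executorId).exists(t => clock() - t < timeoutMs)
}
```

With a taskset-level expiry like this, a broken executor is re-offered tasks after the timeout, which is exactly the "rediscovery" cycle (and its interaction with maxFailures) that the comment above worries about.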
[GitHub] spark issue #15307: [SPARK-17731][SQL][STREAMING] Metrics for structured str...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15307 Merged build finished. Test PASSed.
[GitHub] spark issue #15307: [SPARK-17731][SQL][STREAMING] Metrics for structured str...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15307 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66352/ Test PASSed.
[GitHub] spark issue #15307: [SPARK-17731][SQL][STREAMING] Metrics for structured str...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15307 **[Test build #66352 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66352/consoleFull)** for PR 15307 at commit [`02603c7`](https://github.com/apache/spark/commit/02603c7f56c8722d9003d09e40889084122ba40d). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15356: [BUILD] Closing some stale PRs
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/15356 FYI, for 15294, the JIRA is set as `Won't fix`.
[GitHub] spark issue #15356: [BUILD] Closing some stale PRs
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/15356 Just to defend myself: it seems PRs such as 15339, opened from one branch against another branch, leave a failure mark for each commit in the branches ([branch-2.0](https://github.com/apache/spark/commits/branch-2.0)). Could you please take a look, @srowen?
[GitHub] spark issue #15351: [SPARK-17612][SQL][branch-2.0] Support `DESCRIBE table P...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15351 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66354/ Test FAILed.
[GitHub] spark issue #15356: [BUILD] Closing some stale PRs
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15356 **[Test build #66366 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66366/consoleFull)** for PR 15356 at commit [`a307b5e`](https://github.com/apache/spark/commit/a307b5e40d59e5ce40a0c3986a6db1553acea50a).
[GitHub] spark issue #15351: [SPARK-17612][SQL][branch-2.0] Support `DESCRIBE table P...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15351 Merged build finished. Test FAILed.
[GitHub] spark issue #15351: [SPARK-17612][SQL][branch-2.0] Support `DESCRIBE table P...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15351 **[Test build #66354 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66354/consoleFull)** for PR 15351 at commit [`bb6d6c1`](https://github.com/apache/spark/commit/bb6d6c1d689d096e9c7ec123b74ae364978d8d1c). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #15356: [BUILD] Closing some stale PRs
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/15356 [BUILD] Closing some stale PRs

## What changes were proposed in this pull request?

This PR proposes to close some stale PRs and ones suggested to be closed by committer(s) or obviously inappropriate PRs.

Closes #13458
Closes #14565
Closes #15078
Closes #15278
Closes #15294
Closes #15339

## How was this patch tested?

N/A

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/HyukjinKwon/spark closing-prs

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/15356.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #15356

commit a307b5e40d59e5ce40a0c3986a6db1553acea50a
Author: hyukjinkwon
Date: 2016-10-05T04:00:39Z

Closing some stale PRs
[GitHub] spark issue #15307: [SPARK-17731][SQL][STREAMING] Metrics for structured str...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15307 Merged build finished. Test FAILed.
[GitHub] spark issue #15307: [SPARK-17731][SQL][STREAMING] Metrics for structured str...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15307 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66353/ Test FAILed.
[GitHub] spark issue #15307: [SPARK-17731][SQL][STREAMING] Metrics for structured str...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15307 **[Test build #66353 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66353/consoleFull)** for PR 15307 at commit [`9fd6815`](https://github.com/apache/spark/commit/9fd681536bf8200af4b448f87e8cdbf17df2c0ba). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15292: [SPARK-17719][SPARK-17776][SQL] Unify and tie up options...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15292 **[Test build #66365 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66365/consoleFull)** for PR 15292 at commit [`3fa9f43`](https://github.com/apache/spark/commit/3fa9f43686f1195a9f86ab1bcda054119c332a20).
[GitHub] spark issue #15348: [SPARK-17758][SQL] Last returns wrong result in case of ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15348 Merged build finished. Test PASSed.
[GitHub] spark issue #15348: [SPARK-17758][SQL] Last returns wrong result in case of ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15348 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66349/ Test PASSed.
[GitHub] spark issue #15348: [SPARK-17758][SQL] Last returns wrong result in case of ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15348 **[Test build #66349 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66349/consoleFull)** for PR 15348 at commit [`5ae49ae`](https://github.com/apache/spark/commit/5ae49ae2d0f98a79712abd5ccad262ea9e0f9b5e). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15348: [SPARK-17758][SQL] Last returns wrong result in case of ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15348 Merged build finished. Test PASSed.
[GitHub] spark issue #15348: [SPARK-17758][SQL] Last returns wrong result in case of ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15348 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66351/ Test PASSed.
[GitHub] spark issue #15348: [SPARK-17758][SQL] Last returns wrong result in case of ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15348 **[Test build #66351 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66351/consoleFull)** for PR 15348 at commit [`8b442de`](https://github.com/apache/spark/commit/8b442debd33f6e985aa4ca536e2e8607db3ba477). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #15249: [SPARK-17675] [CORE] Expand Blacklist for TaskSet...
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/15249#discussion_r81896656

--- Diff: core/src/main/scala/org/apache/spark/scheduler/BlacklistTracker.scala ---
@@ -0,0 +1,130 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.scheduler
+
+import org.apache.spark.SparkConf
+import org.apache.spark.internal.config
+import org.apache.spark.internal.Logging
+import org.apache.spark.util.Utils
+
+private[scheduler] object BlacklistTracker extends Logging {
+
+  private val DEFAULT_TIMEOUT = "1h"
+
+  /**
+   * Returns true if the blacklist is enabled, based on checking the configuration in the following
+   * order:
+   * 1. Is it specifically enabled or disabled?
+   * 2. Is it enabled via the legacy timeout conf?
+   * 3. Use the default for the spark-master:
+   *    - off for local mode
+   *    - on for distributed modes (including local-cluster)
+   */
+  def isBlacklistEnabled(conf: SparkConf): Boolean = {
+    conf.get(config.BLACKLIST_ENABLED) match {
+      case Some(isEnabled) =>
+        isEnabled
+      case None =>
+        // if they've got a non-zero setting for the legacy conf, always enable the blacklist,
+        // otherwise, use the default based on the cluster-mode (off for local-mode, on otherwise).
+        val legacyKey = config.BLACKLIST_LEGACY_TIMEOUT_CONF.key
+        conf.get(config.BLACKLIST_LEGACY_TIMEOUT_CONF) match {
+          case Some(legacyTimeout) =>
+            if (legacyTimeout == 0) {
+              logWarning(s"Turning off blacklisting due to legacy configuration:" +
+                s" $legacyKey == 0")
+              false
+            } else {
+              // mostly this is necessary just for tests, since real users that want the blacklist
+              // will get it anyway by default
+              logWarning(s"Turning on blacklisting due to legacy configuration:" +
+                s" $legacyKey > 0")
+              true
+            }
+          case None =>
+            // local-cluster is *not* considered local for these purposes, we still want the
+            // blacklist enabled by default
+            !Utils.isLocalMaster(conf)
+        }
+    }
+  }
+
+  def getBlacklistTimeout(conf: SparkConf): Long = {
+    conf.get(config.BLACKLIST_TIMEOUT_CONF).getOrElse {
+      conf.get(config.BLACKLIST_LEGACY_TIMEOUT_CONF).getOrElse {
+        Utils.timeStringAsMs(DEFAULT_TIMEOUT)
+      }
+    }
+  }
+
+  /**
+   * Verify that blacklist configurations are consistent; if not, throw an exception. Should only
+   * be called if blacklisting is enabled.
+   *
+   * The configuration for the blacklist is expected to adhere to a few invariants. Default
+   * values follow these rules of course, but users may unwittingly change one configuration
+   * without making the corresponding adjustment elsewhere. This ensures we fail-fast when
+   * there are such misconfigurations.
+   */
+  def validateBlacklistConfs(conf: SparkConf): Unit = {
+
+    def mustBePos(k: String, v: String): Unit = {
+      throw new IllegalArgumentException(s"$k was $v, but must be > 0.")
+    }
+
+    // undocumented escape hatch for validation -- just for tests that want to run in an "unsafe"
+    // configuration.
+    if (!conf.get("spark.blacklist.testing.skipValidation", "false").toBoolean) {
+
+      Seq(
+        config.MAX_TASK_ATTEMPTS_PER_EXECUTOR,
+        config.MAX_TASK_ATTEMPTS_PER_NODE,
+        config.MAX_FAILURES_PER_EXEC_STAGE,
+        config.MAX_FAILED_EXEC_PER_NODE_STAGE
+      ).foreach { config =>
+        val v = conf.get(config)
+        if (v <= 0) {
+          mustBePos(config.key, v.toString)
+        }
+      }
+
+      val timeout = getBlacklistTimeout(conf)
+      if (timeout <= 0)
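The validation helper in the diff above fails fast on the first non-positive setting rather than letting a misconfiguration surface later at runtime. The same pattern can be sketched without Spark's `ConfigEntry` machinery; in the sketch below a plain `Map` stands in for `SparkConf`, and the key names are illustrative only, not Spark's actual configuration keys.

```scala
// Minimal, Spark-free sketch of fail-fast config validation.
// A plain Map stands in for SparkConf; key names are illustrative.
object BlacklistConfCheck {
  private def mustBePos(k: String, v: Long): Nothing =
    throw new IllegalArgumentException(s"$k was $v, but must be > 0.")

  /** Throw on the first non-positive value so misconfigurations surface at startup. */
  def validate(conf: Map[String, Long]): Unit = {
    conf.foreach { case (key, value) =>
      if (value <= 0) mustBePos(key, value)
    }
  }
}
```

Calling `validate(Map("maxTaskAttemptsPerExecutor" -> 0L))` throws `IllegalArgumentException`, while all-positive settings pass silently, which is the fail-fast behavior the review comment is discussing.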
[GitHub] spark issue #15246: [MINOR][SQL] Use resource path for test_script.sh
Github user weiqingy commented on the issue: https://github.com/apache/spark/pull/15246 Retest this please.
[GitHub] spark issue #15314: [SPARK-17747][ML] WeightCol support non-double datatypes
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15314 **[Test build #66364 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66364/consoleFull)** for PR 15314 at commit [`423fd51`](https://github.com/apache/spark/commit/423fd5117e32e971e47a02728d6a863a726fc539).
[GitHub] spark issue #15246: [MINOR][SQL] Use resource path for test_script.sh
Github user weiqingy commented on the issue: https://github.com/apache/spark/pull/15246 The changes should be safe for `org.apache.spark.streaming.kafka010.DirectKafkaStreamSuite.pattern based subscription`; I'll re-trigger the tests.
[GitHub] spark issue #15307: [SPARK-17731][SQL][STREAMING] Metrics for structured str...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15307 Merged build finished. Test FAILed.
[GitHub] spark issue #15307: [SPARK-17731][SQL][STREAMING] Metrics for structured str...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15307 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66350/ Test FAILed.
[GitHub] spark issue #15307: [SPARK-17731][SQL][STREAMING] Metrics for structured str...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15307 **[Test build #66350 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66350/consoleFull)** for PR 15307 at commit [`bbd0d8b`](https://github.com/apache/spark/commit/bbd0d8bacae529cdb5e43b5165e3c687c5c9ec05). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #14828: [SPARK-17258][SQL] Parse scientific decimal literals as ...
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/14828 @gatorsmile does this LGTY?
[GitHub] spark issue #15355: [SPARK-17782][STREAMING] Disable Kafka 010 pattern based...
Github user koeninger commented on the issue: https://github.com/apache/spark/pull/15355 I have generally been unable to reproduce these kinds of test failures in my local environment, and don't have access to the build server, so trying to fix without a repro is pretty much shooting randomly in the dark. It does seem unfortunate to me that we're effectively running full integration tests on every PR, even if a patch has changed something (e.g. MLlib) that couldn't possibly affect the modules in /external
[GitHub] spark issue #15307: [SPARK-17731][SQL][STREAMING] Metrics for structured str...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15307 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66348/ Test PASSed.
[GitHub] spark issue #15307: [SPARK-17731][SQL][STREAMING] Metrics for structured str...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15307 Merged build finished. Test PASSed.
[GitHub] spark issue #15307: [SPARK-17731][SQL][STREAMING] Metrics for structured str...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15307 **[Test build #66348 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66348/consoleFull)** for PR 15307 at commit [`05f22d7`](https://github.com/apache/spark/commit/05f22d7974f410289028bfa4df1d2f6036f5023e). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #15307: [SPARK-17731][SQL][STREAMING] Metrics for structu...
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/15307#discussion_r81895371

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala ---
@@ -525,8 +645,62 @@ class StreamExecution(
   case object TERMINATED extends State
 }

-object StreamExecution {
+object StreamExecution extends Logging {
   private val _nextId = new AtomicLong(0)

+  /**
+   * Get the number of input rows from the executed plan of the trigger
+   * @param triggerExecutionPlan Execution plan of the trigger
+   * @param triggerLogicalPlan Logical plan of the trigger, generated from the query logical plan
+   * @param sourceToDataframe Source to DataFrame returned by the source.getBatch in this trigger
+   */
+  def getNumInputRowsFromTrigger(
+      triggerExecutionPlan: SparkPlan,
+      triggerLogicalPlan: LogicalPlan,
+      sourceToDataframe: Map[Source, DataFrame]): Map[Source, Long] = {
+
+    // We want to associate execution plan leaves to sources that generate them, so that we match
+    // their metrics (e.g. numOutputRows) to the sources. To do this we do the following.
+    // Consider the translation from the streaming logical plan to the final executed plan.
+    //
+    //   streaming logical plan (with sources) <==> trigger's logical plan <==> executed plan
+    //
+    // 1. We keep track of streaming sources associated with each leaf in the trigger's logical plan
+    //    - Each logical plan leaf will be associated with a single streaming source.
+    //    - There can be multiple logical plan leaves associated with a streaming source.
+    //    - There can be leaves not associated with any streaming source, because they were
+    //      generated from a batch source (e.g. stream-batch joins)
+    //
+    // 2. Assuming that the executed plan has the same number of leaves in the same order as that of
+    //    the trigger logical plan, we associate executed plan leaves with corresponding
+    //    streaming sources.
+    //
+    // 3. For each source, we sum the metrics of the associated execution plan leaves.
+    //
+    val logicalPlanLeafToSource = sourceToDataframe.flatMap { case (source, df) =>
+      df.logicalPlan.collectLeaves().map { leaf => leaf -> source }
+    }
+    val allLogicalPlanLeaves = triggerLogicalPlan.collectLeaves() // includes non-streaming sources
+    val allExecPlanLeaves = triggerExecutionPlan.collectLeaves()
+    if (allLogicalPlanLeaves.size == allExecPlanLeaves.size) {
+      val execLeafToSource = allLogicalPlanLeaves.zip(allExecPlanLeaves).flatMap {
+        case (lp, ep) => logicalPlanLeafToSource.get(lp).map { source => ep -> source }
+      }
+      val sourceToNumInputRows = execLeafToSource.map { case (execLeaf, source) =>
+        val numRows = execLeaf.metrics.get("numOutputRows").map(_.value).getOrElse(0L)
+        source -> numRows
+      }
+      sourceToNumInputRows.groupBy(_._1).mapValues(_.map(_._2).sum) // sum up rows for each source
+    } else {
+      def toString[T](seq: Seq[T]): String = s"(size = ${seq.size}), ${seq.mkString(", ")}"
+      logWarning(
+        "Could not report metrics as number leaves in trigger logical plan did not match that" +
--- End diff --

A warning printed once can get lost in the logs. I think it's worth printing it every minute or so, so that if we have to debug it's easy to find, rather than trying to look for the logs from when the query started. Furthermore, I don't want to add new field flags/timestamps in StreamExecution to keep track of whether the log has been printed once/last minute. So I am thinking of adding a small utility trait that has a method `logWarningEvery(period, ...)`.
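The `logWarningEvery(period, ...)` trait tdas describes could be sketched roughly as below. All names here are hypothetical (this is not what was eventually merged into Spark), and `Console.err.println` stands in for the real `Logging.logWarning`; the idea is simply to remember when the last warning was emitted and drop calls arriving within the period.

```scala
import java.util.concurrent.atomic.AtomicLong

// Hypothetical sketch of a rate-limited warning utility: remember the timestamp
// of the last emitted warning and silently drop calls that arrive within `periodMs`.
trait RateLimitedLogging {
  private val lastWarningTime = new AtomicLong(0L)

  /**
   * Emit the warning at most once per `periodMs`. The message is by-name, so it
   * is only built when actually logged. Returns true if the warning was emitted.
   */
  def logWarningEvery(periodMs: Long, msg: => String): Boolean = {
    val now = System.currentTimeMillis()
    val last = lastWarningTime.get()
    if (now - last >= periodMs && lastWarningTime.compareAndSet(last, now)) {
      Console.err.println(s"WARN: $msg") // stand-in for Logging.logWarning
      true
    } else {
      false
    }
  }
}
```

With a one-minute period, the first call logs and an immediate second call is dropped, which addresses the "warning gets lost in the logs" concern without adding bookkeeping fields to `StreamExecution` itself.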
[GitHub] spark issue #15314: [SPARK-17747][ML] WeightCol support non-double datatypes
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15314 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66355/ Test FAILed.
[GitHub] spark issue #15314: [SPARK-17747][ML] WeightCol support non-double datatypes
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15314 Merged build finished. Test FAILed.
[GitHub] spark issue #15314: [SPARK-17747][ML] WeightCol support non-double datatypes
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15314 **[Test build #66355 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66355/consoleFull)** for PR 15314 at commit [`07c156a`](https://github.com/apache/spark/commit/07c156a2b1c3ca60ff1fc4582c9024c333e3a064). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #15249: [SPARK-17675] [CORE] Expand Blacklist for TaskSet...
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/15249#discussion_r81895140

--- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetBlacklist.scala ---
@@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.scheduler
+
+import scala.collection.mutable.{HashMap, HashSet}
+
+import org.apache.spark.SparkConf
+import org.apache.spark.internal.config
+import org.apache.spark.internal.Logging
+import org.apache.spark.util.Clock
+
+/**
+ * Handles blacklisting executors and nodes within a taskset. This includes blacklisting specific
+ * (task, executor) / (task, node) pairs, and also completely blacklisting executors and nodes
+ * for the entire taskset.
+ *
+ * THREADING: As a helper to [[TaskSetManager]], this class is designed to only be called from code
+ * with a lock on the TaskScheduler (e.g. its event handlers). It should not be called from other
+ * threads.
+ */
+private[scheduler] class TaskSetBlacklist(val conf: SparkConf, val stageId: Int, val clock: Clock)
+  extends Logging {
+
+  private val MAX_TASK_ATTEMPTS_PER_EXECUTOR = conf.get(config.MAX_TASK_ATTEMPTS_PER_EXECUTOR)
+  private val MAX_TASK_ATTEMPTS_PER_NODE = conf.get(config.MAX_TASK_ATTEMPTS_PER_NODE)
+  private val MAX_FAILURES_PER_EXEC_STAGE = conf.get(config.MAX_FAILURES_PER_EXEC_STAGE)
+  private val MAX_FAILED_EXEC_PER_NODE_STAGE = conf.get(config.MAX_FAILED_EXEC_PER_NODE_STAGE)
+  private val TIMEOUT_MILLIS = BlacklistTracker.getBlacklistTimeout(conf)
+
+  /**
+   * A map from each executor to the task failures on that executor.
+   */
+  val execToFailures: HashMap[String, ExecutorFailuresInTaskSet] = new HashMap()
+
+  /**
+   * Map from node to all executors on it with failures. Needed because we want to know about
+   * executors on a node even after they have died.
+   */
+  private val nodeToExecsWithFailures: HashMap[String, HashSet[String]] = new HashMap()
+  private val nodeToBlacklistedTasks: HashMap[String, HashSet[Int]] = new HashMap()
+  private val blacklistedExecs: HashSet[String] = new HashSet()
+  private val blacklistedNodes: HashSet[String] = new HashSet()
+
+  /**
+   * Return true if this executor is blacklisted for the given task. This does *not*
+   * need to return true if the executor is blacklisted for the entire stage.
+   * That is to keep this method as fast as possible in the inner-loop of the
+   * scheduler, where those filters will have already been applied.
+   */
+  def isExecutorBlacklistedForTask(
+      executorId: String,
+      index: Int): Boolean = {
+    execToFailures.get(executorId)
+      .map { execFailures =>
+        val count = execFailures.taskToFailureCountAndExpiryTime.get(index).map(_._1).getOrElse(0)
+        count >= MAX_TASK_ATTEMPTS_PER_EXECUTOR
+      }
+      .getOrElse(false)
+  }
+
+  def isNodeBlacklistedForTask(
+      node: String,
+      index: Int): Boolean = {
+    nodeToBlacklistedTasks.get(node)
+      .map(_.contains(index))
+      .getOrElse(false)
+  }
+
+  /**
+   * Return true if this executor is blacklisted for the given stage. Completely ignores
+   * anything to do with the node the executor is on. That
+   * is to keep this method as fast as possible in the inner-loop of the scheduler, where those
+   * filters will already have been applied.
+   */
+  def isExecutorBlacklistedForTaskSet(executorId: String): Boolean = {
+    blacklistedExecs.contains(executorId)
+  }
+
+  def isNodeBlacklistedForTaskSet(node: String): Boolean = {
+    blacklistedNodes.contains(node)
+  }
--- End diff --

I know it's verbose but I'd prefer to keep it. Especially once application-level blacklisting is added (https://github.com/apache/spark/pull/14079), there are lots of different…
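The core of the per-(task, executor) check in the diff above is a failure count compared against a threshold. The following is a simplified, self-contained sketch of that idea; the object name, the flat `executorId -> (taskIndex -> failureCount)` map, and the hard-coded threshold are illustrative assumptions, not the actual Spark classes (the real code tracks expiry times and richer per-executor state).

```scala
import scala.collection.mutable.HashMap

// Hypothetical, simplified model of per-(task, executor) blacklisting:
// an executor becomes blacklisted for a task index once its failure count
// for that index reaches a configured maximum number of attempts.
object TaskBlacklistSketch {
  // Stands in for conf.get(config.MAX_TASK_ATTEMPTS_PER_EXECUTOR).
  val maxTaskAttemptsPerExecutor = 1

  // executorId -> (taskIndex -> failureCount); the real code keeps an
  // ExecutorFailuresInTaskSet per executor instead of a bare map.
  private val execToTaskFailures = new HashMap[String, HashMap[Int, Int]]()

  def recordFailure(executorId: String, index: Int): Unit = {
    val taskFailures = execToTaskFailures.getOrElseUpdate(executorId, new HashMap())
    taskFailures(index) = taskFailures.getOrElse(index, 0) + 1
  }

  // Mirrors the shape of isExecutorBlacklistedForTask in the diff:
  // unknown executors and unfailed tasks default to "not blacklisted".
  def isExecutorBlacklistedForTask(executorId: String, index: Int): Boolean = {
    execToTaskFailures.get(executorId)
      .map(_.getOrElse(index, 0) >= maxTaskAttemptsPerExecutor)
      .getOrElse(false)
  }

  def main(args: Array[String]): Unit = {
    recordFailure("exec-1", 0)
    println(isExecutorBlacklistedForTask("exec-1", 0)) // true: threshold of 1 reached
    println(isExecutorBlacklistedForTask("exec-1", 1)) // false: no failures for task 1
    println(isExecutorBlacklistedForTask("exec-2", 0)) // false: unknown executor
  }
}
```

This also illustrates why the method can stay fast in the scheduler's inner loop: it is a single map lookup plus a comparison, with stage-level filtering handled separately.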
[GitHub] spark issue #15307: [SPARK-17731][SQL][STREAMING] Metrics for structured str...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15307 **[Test build #66362 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66362/consoleFull)** for PR 15307 at commit [`f5732a5`](https://github.com/apache/spark/commit/f5732a50da7f0df326f52ad9b85da3876ecfafbc). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #12135: [SPARK-14352][SQL] approxQuantile should support multi c...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12135 **[Test build #66363 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66363/consoleFull)** for PR 12135 at commit [`a475090`](https://github.com/apache/spark/commit/a475090f5424752a1cfe04983d964f6fb85181b0).
[GitHub] spark issue #15355: [SPARK-17782][STREAMING] Disable Kafka 010 pattern based...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15355 **[Test build #66361 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66361/consoleFull)** for PR 15355 at commit [`b7074d4`](https://github.com/apache/spark/commit/b7074d48159804035eaf00e1abed35e408684b42).
[GitHub] spark issue #15355: [SPARK-17782][STREAMING] Disable Kafka 010 pattern based ...
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/15355 cc @koeninger any idea why this is flaky?
[GitHub] spark pull request #15355: [SPARK-17782][STREAMING] Disable Kafka 010 pattern...
GitHub user hvanhovell opened a pull request: https://github.com/apache/spark/pull/15355

[SPARK-17782][STREAMING] Disable Kafka 010 pattern based subscription test.

## What changes were proposed in this pull request?

This PR disables the `pattern based subscription` test in the Kafka 010 DirectKafkaStreamSuite, because it is flaky.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/hvanhovell/spark SPARK-17782

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/15355.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #15355

commit b7074d48159804035eaf00e1abed35e408684b42
Author: Herman van Hovell
Date: 2016-10-05T03:12:08Z

    Disable Kafka 010 pattern based subscription test.
[GitHub] spark issue #14452: [SPARK-16849][SQL] Improve subquery execution by dedupli...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14452 **[Test build #66360 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66360/consoleFull)** for PR 14452 at commit [`cebfbf5`](https://github.com/apache/spark/commit/cebfbf5e3dd7b2d2365e5152991ab7ff2c63dd90).