[GitHub] spark issue #17166: [SPARK-19820] [core] Allow reason to be specified for ta...
Github user kayousterhout commented on the issue: https://github.com/apache/spark/pull/17166 The task killed messages should be informative, and I don't think we should sacrifice informative messages just so they can be shown concisely in the stage summary view. I think it's much better to have an informative message that a user needs to click one link to see than it is to force the user to look in the logs to figure out what's going on (especially since behavior around tasks getting killed can be very confusing). If others feel strongly I can be convinced to add this per-task info to the summary view, but I'm not convinced of the need to sacrifice clarity for conciseness. @mridum @rxin what do you two think here? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #17166: [SPARK-19820] [core] Allow reason to be specified for ta...
Github user ericl commented on the issue: https://github.com/apache/spark/pull/17166 Drilling down into the detail view is kind of cumbersome -- I think it's most useful to have a good summary at the progress bar, and then the user can refer to logs for detailed per-task debugging.
[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r106088842 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -51,6 +51,11 @@ case class CostBasedJoinReorder(conf: CatalystConf) extends Rule[LogicalPlan] wi def reorder(plan: LogicalPlan, output: AttributeSet): LogicalPlan = { val (items, conditions) = extractInnerJoins(plan) +// Find the star schema joins. Currently, it returns the star join with the largest +// fact table. In the future, it can return more than one star join (e.g. F1-D1-D2 +// and F2-D3-D4). +val starJoinPlans = StarSchemaDetection(conf).findStarJoins(items, conditions.toSeq) --- End diff -- @ron8hu We already ran TPC-DS with star schema and the results are documented in the design doc. I don't think there is a question about its value. I am familiar with Pat Selinger's paper since I've been working in the IBM DB2 optimizer for several years. What Zhenhua and I are discussing here is how to integrate the star join plans with his new DP planning. There is no competing planning algorithm that needs to be tested.
[GitHub] spark pull request #17299: [SPARK-19817][SS] Make it clear that `timeZone` i...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17299
[GitHub] spark issue #17299: [SPARK-19817][SS] Make it clear that `timeZone` is a gen...
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/17299 Thanks! Merging to master.
[GitHub] spark issue #17186: [SPARK-19846][SQL] Add a flag to disable constraint prop...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17186 **[Test build #74581 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74581/testReport)** for PR 17186 at commit [`d3b0a72`](https://github.com/apache/spark/commit/d3b0a7237c5f70ea64a786ecf63edac13617d284).
[GitHub] spark issue #17166: [SPARK-19820] [core] Allow reason to be specified for ta...
Github user kayousterhout commented on the issue: https://github.com/apache/spark/pull/17166 Why not just "X killed" in the stage summary? It seems like overkill to put the reasons for all of the killings there, now that I'm seeing the screenshot, since they're already in the detail view (and the whole reason we have the detail view is to show per-task info).
[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r106086454 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -51,6 +51,11 @@ case class CostBasedJoinReorder(conf: CatalystConf) extends Rule[LogicalPlan] wi def reorder(plan: LogicalPlan, output: AttributeSet): LogicalPlan = { val (items, conditions) = extractInnerJoins(plan) +// Find the star schema joins. Currently, it returns the star join with the largest +// fact table. In the future, it can return more than one star join (e.g. F1-D1-D2 +// and F2-D3-D4). +val starJoinPlans = StarSchemaDetection(conf).findStarJoins(items, conditions.toSeq) --- End diff -- The dynamic programming solution proposed by Pat was also used in DB2. She is the mother of DB2. We know the limits of this solution; it is unable to solve all the issues. [The above suggestion](https://github.com/apache/spark/pull/15363#issuecomment-285187051) by Ioana (who is the senior DB2 compiler expert) is in the right direction, IMO.
[GitHub] spark issue #15945: [SPARK-12978][SQL] Merge unnecessary partial aggregates
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15945 **[Test build #74580 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74580/testReport)** for PR 15945 at commit [`11d2757`](https://github.com/apache/spark/commit/11d2757fd1299e2499b2739013cc454392f1a524).
[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...
Github user ron8hu commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r106084556 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -51,6 +51,11 @@ case class CostBasedJoinReorder(conf: CatalystConf) extends Rule[LogicalPlan] wi def reorder(plan: LogicalPlan, output: AttributeSet): LogicalPlan = { val (items, conditions) = extractInnerJoins(plan) +// Find the star schema joins. Currently, it returns the star join with the largest +// fact table. In the future, it can return more than one star join (e.g. F1-D1-D2 +// and F2-D3-D4). +val starJoinPlans = StarSchemaDetection(conf).findStarJoins(items, conditions.toSeq) --- End diff -- As discussed earlier, we only need to perform the join reorder algorithm once. CostBasedJoinReorder implements the dynamic programming algorithm published in the classic paper "Access Path Selection in a Relational Database System" by Patricia Selinger. The same algorithm was used in PostgreSQL. To my understanding, it is a generic algorithm that can work on both star and non-star schemas. For example, it is capable of generating a bushy tree if that is optimal; that is, it is not limited to left-deep trees only. I suggest that we identify the strengths of the star join reorder algorithm and where it can address the deficiencies of the dynamic programming algorithm, then add the necessary code to address those deficiencies. There is no need to add code that does the same job twice without added value. Perhaps running the TPC-DS benchmark queries and inspecting the generated query plans can help us identify the strengths and weaknesses of both algorithms.
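The Selinger-style dynamic program Ron refers to can be sketched as follows. This is a toy illustration only, with an invented cost model (cumulative output cardinality) and invented helper names; it is not Spark's `CostBasedJoinReorder`, but it shows how enumerating all splits of each relation subset allows bushy plans, not just left-deep ones:

```python
from itertools import combinations

def dp_join_order(rels, card, sel):
    """Toy Selinger-style DP over relation subsets.

    rels: list of relation names
    card: dict name -> base row count
    sel:  dict frozenset({a, b}) -> join selectivity (1.0 if no predicate)

    Returns (cost, plan_tree, output_rows) for joining all relations.
    Because every left/right split of each subset is considered,
    bushy trees are allowed, not just left-deep ones.
    """
    best = {}  # frozenset of relations -> (cost, plan, output rows)
    for r in rels:
        best[frozenset([r])] = (0.0, r, float(card[r]))
    for size in range(2, len(rels) + 1):
        for subset in combinations(rels, size):
            s = frozenset(subset)
            # Enumerate all ways to split s into two non-empty sides.
            for k in range(1, size // 2 + 1):
                for left in combinations(subset, k):
                    l = frozenset(left)
                    r_side = s - l
                    lc, lp, lrows = best[l]
                    rc, rp, rrows = best[r_side]
                    # Combine the pairwise selectivities across the split.
                    f = 1.0
                    for a in l:
                        for b in r_side:
                            f *= sel.get(frozenset([a, b]), 1.0)
                    rows = lrows * rrows * f
                    cost = lc + rc + rows  # toy cost: sum of output sizes
                    if s not in best or cost < best[s][0]:
                        best[s] = (cost, (lp, rp), rows)
    return best[frozenset(rels)]
```

For a small star schema (fact table `F`, dimensions `D1`, `D2`) the DP finds the cheapest order among all splits, which is the behavior being contrasted with a dedicated star-join heuristic.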
[GitHub] spark pull request #17274: [SPARK-19925][SPARKR] Fix SparkR spark.getSparkFi...
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/17274#discussion_r106083228 --- Diff: R/pkg/inst/tests/testthat/test_context.R --- @@ -177,6 +177,13 @@ test_that("add and get file to be downloaded with Spark job on every node", { spark.addFile(path) download_path <- spark.getSparkFiles(filename) expect_equal(readLines(download_path), words) + + # Test spark.getSparkFiles works well on executors. + seq <- seq(from = 1, to = 10, length.out = 5) + f <- function(seq) { readLines(spark.getSparkFiles(filename)) } + results <- spark.lapply(seq, f) + for (i in 1:5) { expect_equal(results[[i]], words) } + --- End diff -- * It fails when I run ```run-tests.sh``` on my machine. * It succeeds when I paste this code into the SparkR console. * It succeeds when I paste this code into a text file and submit it with ```bin/spark-submit test.R``` (local mode) or ```bin/spark-submit --master yarn test.R``` (yarn mode). So I think it may be caused by the test suite infrastructure, but I'm not familiar with that part.
[GitHub] spark issue #17299: [SPARK-19817][SS] Make it clear that `timeZone` is a gen...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17299 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74578/ Test PASSed.
[GitHub] spark issue #17299: [SPARK-19817][SS] Make it clear that `timeZone` is a gen...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17299 Merged build finished. Test PASSed.
[GitHub] spark issue #17299: [SPARK-19817][SS] Make it clear that `timeZone` is a gen...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17299 **[Test build #74578 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74578/testReport)** for PR 17299 at commit [`0b13c08`](https://github.com/apache/spark/commit/0b13c0850472b5172a9f428f923b99a168a79e00). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17166: [SPARK-19820] [core] Allow reason to be specified for ta...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17166 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74576/ Test PASSed.
[GitHub] spark issue #17166: [SPARK-19820] [core] Allow reason to be specified for ta...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17166 Merged build finished. Test PASSed.
[GitHub] spark issue #17166: [SPARK-19820] [core] Allow reason to be specified for ta...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17166 **[Test build #74576 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74576/testReport)** for PR 17166 at commit [`72b28cb`](https://github.com/apache/spark/commit/72b28cb0dc7aacf7cde1b5e49f05da49cb5de276). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * ` case class KillTask(taskId: Long, executor: String, interruptThread: Boolean, reason: String)`
[GitHub] spark issue #15009: [SPARK-17443][SPARK-11035] Stop Spark Application if lau...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15009 **[Test build #74579 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74579/testReport)** for PR 15009 at commit [`c17f15f`](https://github.com/apache/spark/commit/c17f15f3994ba0cba4be63519f33ce4429adf489).
[GitHub] spark issue #15945: [SPARK-12978][SQL] Merge unnecessary partial aggregates
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15945 Merged build finished. Test PASSed.
[GitHub] spark issue #15945: [SPARK-12978][SQL] Merge unnecessary partial aggregates
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15945 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74577/ Test PASSed.
[GitHub] spark issue #15945: [SPARK-12978][SQL] Merge unnecessary partial aggregates
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15945 **[Test build #74577 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74577/testReport)** for PR 15945 at commit [`870222e`](https://github.com/apache/spark/commit/870222e8ec1b6e7aa32a0260a045192323ba8d30). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15604: [SPARK-18066] [CORE] [TESTS] Add Pool usage policies tes...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15604 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74575/ Test PASSed.
[GitHub] spark issue #15604: [SPARK-18066] [CORE] [TESTS] Add Pool usage policies tes...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15604 Merged build finished. Test PASSed.
[GitHub] spark issue #15604: [SPARK-18066] [CORE] [TESTS] Add Pool usage policies tes...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15604 **[Test build #74575 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74575/testReport)** for PR 15604 at commit [`17e11f0`](https://github.com/apache/spark/commit/17e11f0c56e2a581766c06bd52695c2b05bcfcb2). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17166: [SPARK-19820] [core] Allow reason to be specified for ta...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17166 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74572/ Test PASSed.
[GitHub] spark issue #17166: [SPARK-19820] [core] Allow reason to be specified for ta...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17166 Merged build finished. Test PASSed.
[GitHub] spark issue #17166: [SPARK-19820] [core] Allow reason to be specified for ta...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17166 **[Test build #74572 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74572/testReport)** for PR 17166 at commit [`31967d1`](https://github.com/apache/spark/commit/31967d185852870d8edecb855ea1aafb7bd04dd1). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * ` case class KillTask(taskId: Long, executor: String, interruptThread: Boolean, reason: String)`
[GitHub] spark pull request #15009: [SPARK-17443][SPARK-11035] Stop Spark Application...
Github user kishorvpatil commented on a diff in the pull request: https://github.com/apache/spark/pull/15009#discussion_r106080005 --- Diff: resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnClusterSuite.scala --- @@ -252,20 +307,55 @@ class YarnClusterSuite extends BaseYarnClusterSuite { handle.getAppId() should startWith ("application_") handle.disconnect() + var applicationId = ConverterUtils.toApplicationId(handle.getAppId) + var yarnClient: YarnClient = getYarnClient eventually(timeout(30 seconds), interval(100 millis)) { handle.getState() should be (SparkAppHandle.State.LOST) +var status = yarnClient.getApplicationReport(applicationId).getFinalApplicationStatus +status should be (FinalApplicationStatus.KILLED) } + } finally { handle.kill() } } - test("timeout to get SparkContext in cluster mode triggers failure") { -val timeout = 2000 -val finalState = runSpark(false, mainClassName(SparkContextTimeoutApp.getClass), - appArgs = Seq((timeout * 4).toString), - extraConf = Map(AM_MAX_WAIT_TIME.key -> timeout.toString)) -finalState should be (SparkAppHandle.State.FAILED) + test("monitor app using launcher library for thread without auto shutdown") { +val env = new JHashMap[String, String]() +env.put("YARN_CONF_DIR", hadoopConfDir.getAbsolutePath()) + +val propsFile = createConfFile() +val handle = new SparkLauncher(env) .setSparkHome(sys.props("spark.test.home")) .setConf("spark.ui.enabled", "false") .setPropertiesFile(propsFile) .setMaster("yarn") .setDeployMode("cluster") .launchAsThread(true) .setAppResource(SparkLauncher.NO_RESOURCE) .setMainClass(mainClassName(YarnLauncherTestApp.getClass)) .startApplication() + +try { + eventually(timeout(30 seconds), interval(100 millis)) { +handle.getState() should be (SparkAppHandle.State.RUNNING) + } + + handle.getAppId() should not be (null) + handle.getAppId() should startWith ("application_") + handle.disconnect() + + var applicationId = ConverterUtils.toApplicationId(handle.getAppId) + var yarnClient: YarnClient = getYarnClient + eventually(timeout(30 seconds), interval(100 millis)) { +handle.getState() should be (SparkAppHandle.State.LOST) +var status = yarnClient.getApplicationReport(applicationId).getYarnApplicationState +status should not be (YarnApplicationState.KILLED) --- End diff -- The condition checks that the status should `not` be KILLED: it could be running or successful, but not killed.
[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more reada...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17267 LGTM cc @davies @holdenk
[GitHub] spark issue #17299: [SPARK-19817][SS] Make it clear that `timeZone` is a gen...
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/17299 LGTM, thank you for your follow-up!
[GitHub] spark issue #17297: [SPARK-14649][CORE] DagScheduler should not run duplicat...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17297 Merged build finished. Test FAILed.
[GitHub] spark issue #17297: [SPARK-14649][CORE] DagScheduler should not run duplicat...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17297 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74566/ Test FAILed.
[GitHub] spark issue #17297: [SPARK-14649][CORE] DagScheduler should not run duplicat...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17297 **[Test build #74566 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74566/testReport)** for PR 17297 at commit [`0bcc69a`](https://github.com/apache/spark/commit/0bcc69a7a3094ddaa8c915be1e4a198a354f8b6b). * This patch **fails from timeout after a configured wait of `250m`**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `case class TasksAborted(stageId: Int, tasks: Seq[Task[_]]) extends DAGSchedulerEvent`
[GitHub] spark issue #17300: [SPARK-19956][Core]Optimize a location order of blocks w...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17300 Can one of the admins verify this patch?
[GitHub] spark pull request #17300: [SPARK-19956][Core]Optimize a location order of b...
GitHub user ConeyLiu opened a pull request: https://github.com/apache/spark/pull/17300 [SPARK-19956][Core]Optimize a location order of blocks with topology information ## What changes were proposed in this pull request? When calling BlockManager's getLocations method, we only compare the hosts of the data blocks. Non-local blocks are chosen at random, which may select a block in a different rack. This patch therefore additionally sorts the locations by rack. ## How was this patch tested? New test case. Please review http://spark.apache.org/contributing.html before opening a pull request. You can merge this pull request into a Git repository by running: $ git pull https://github.com/ConeyLiu/spark blockmanager Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/17300.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #17300 commit 926ea2551487f170b24f06a0d6c11879103b05b5 Author: Xianyang Liu Date: 2017-03-15T02:42:49Z optimize a location order of blocks with topology information
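The rack-aware ordering proposed in this PR can be sketched as follows. This is a simplified Python illustration, not Spark's actual BlockManager API: the function name, the `rack_of` lookup, and the host names are all hypothetical.

```python
def sort_locations(locations, local_host, local_rack, rack_of):
    """Order candidate block locations by locality: the local host first,
    then hosts in the same rack, then everything else.
    `rack_of` maps a host name to its rack (topology information)."""
    local = [h for h in locations if h == local_host]
    same_rack = [h for h in locations if h != local_host and rack_of(h) == local_rack]
    off_rack = [h for h in locations if h != local_host and rack_of(h) != local_rack]
    return local + same_rack + off_rack

# Illustrative topology: host1 and host2 share rack1, host3 is in rack2.
racks = {"host1": "rack1", "host2": "rack1", "host3": "rack2"}
print(sort_locations(["host3", "host2", "host1"], "host1", "rack1", racks.get))
# -> ['host1', 'host2', 'host3']
```

Without the rack tier, a random pick among the non-local hosts could return `host3` even though `host2` sits in the same rack.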
[GitHub] spark issue #17267: [SPARK-19926][PYSPARK] Make pyspark exception more reada...
Github user uncleGen commented on the issue: https://github.com/apache/spark/pull/17267 ping @viirya
[GitHub] spark issue #17166: [SPARK-19820] [core] Allow reason to be specified for ta...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17166 Merged build finished. Test PASSed.
[GitHub] spark issue #17166: [SPARK-19820] [core] Allow reason to be specified for ta...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17166 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74570/ Test PASSed.
[GitHub] spark issue #17166: [SPARK-19820] [core] Allow reason to be specified for ta...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17166 **[Test build #74570 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74570/testReport)** for PR 17166 at commit [`6c90289`](https://github.com/apache/spark/commit/6c902898b7b29f5f32a1618cfbf06e39c8fc3f0f). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #17232: [SPARK-18112] [SQL] Support reading data from Hiv...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17232
[GitHub] spark issue #17232: [SPARK-18112] [SQL] Support reading data from Hive 2.1 m...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/17232 thanks, merging to master!
[GitHub] spark pull request #17178: [SPARK-19828][R] Support array type in from_json ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17178
[GitHub] spark issue #17178: [SPARK-19828][R] Support array type in from_json in R
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/17178 thanks! merged to master.
[GitHub] spark issue #17268: [SPARK-19932][SS] Also save event time into StateStore f...
Github user lw-lin commented on the issue: https://github.com/apache/spark/pull/17268 Thank you @marmbrus for the detailed explanation! > For that reason, I think its safest to require the user to explicitly include the timestamp. Yea, let me update this in this direction.
[GitHub] spark issue #15125: [SPARK-5484][GraphX] Periodically do checkpoint in Prege...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/15125 @jkbradley @ankurdave would you like to review?
[GitHub] spark issue #17297: [SPARK-14649][CORE] DagScheduler should not run duplicat...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17297 Merged build finished. Test FAILed.
[GitHub] spark issue #17297: [SPARK-14649][CORE] DagScheduler should not run duplicat...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17297 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74562/ Test FAILed.
[GitHub] spark issue #17297: [SPARK-14649][CORE] DagScheduler should not run duplicat...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17297 **[Test build #74562 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74562/testReport)** for PR 17297 at commit [`f127150`](https://github.com/apache/spark/commit/f1271506d0f5d5d037cee91cc91d42ddb14a8038). * This patch **fails from timeout after a configured wait of `250m`**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `case class TasksAborted(stageId: Int, tasks: Seq[Task[_]]) extends DAGSchedulerEvent`
[GitHub] spark issue #17251: [SPARK-19910][SQL] `stack` should not reject NULL values...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/17251 @cloud-fan . Sorry, but could you review this `stack` PR once again? :)
[GitHub] spark pull request #15363: [SPARK-17791][SQL] Join reordering using star sch...
Github user ioana-delaney commented on a diff in the pull request: https://github.com/apache/spark/pull/15363#discussion_r106074890 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala --- @@ -51,6 +51,11 @@ case class CostBasedJoinReorder(conf: CatalystConf) extends Rule[LogicalPlan] wi def reorder(plan: LogicalPlan, output: AttributeSet): LogicalPlan = { val (items, conditions) = extractInnerJoins(plan) +// Find the star schema joins. Currently, it returns the star join with the largest +// fact table. In the future, it can return more than one star join (e.g. F1-D1-D2 +// and F2-D3-D4). +val starJoinPlans = StarSchemaDetection(conf).findStarJoins(items, conditions.toSeq) --- End diff -- @wzhfy I've looked into moving the star reordering to the end of the optimization phase. Star reordering uses the existing ```ReorderJoin.createOrderedJoin``` method to construct the final plan once a star join is discovered. This method only handles specific types of plans, and doesn't recognize the plan layout in the last phase of the Optimizer. Writing a new join reordering method for this purpose would not make much sense, since star joins are to be used by the existing planning strategies. I suggest keeping the current logic and, next, I can look into integrating the star plans with your new DP planning. Once that's tested, we can probably remove the star schema call from the ```ReorderJoin``` planning rule. Please let me know what you think.
[GitHub] spark issue #17266: [SPARK-19912][SQL] String literals should be escaped for...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/17266 For the non-error-message cases, an incorrect result is also a problem in this issue.
[GitHub] spark issue #17241: [SPARK-19877][SQL] Restrict the nested level of a view
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/17241 ping @hvanhovell @gatorsmile
[GitHub] spark issue #17232: [SPARK-18112] [SQL] Support reading data from Hive 2.1 m...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17232 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74574/ Test PASSed.
[GitHub] spark issue #17266: [SPARK-19912][SQL] String literals should be escaped for...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/17266 The following is the error message. Since we are not escaping in the Spark master, the behavior (incorrect filtering or this error message) is the same as in the master branch of Spark. ``` java.lang.RuntimeException: Caught Hive MetaException attempting to get partition metadata by filter from Hive. You can set the Spark configuration setting spark.sql.hive.manageFilesourcePartitions to false to work around this problem, however this will result in degraded performance. Please report a bug: https://issues.apache.org/jira/browse/SPARK at org.apache.spark.sql.hive.client.Shim_v0_13.getPartitionsByFilter(HiveShim.scala:612) ... Caused by: java.lang.reflect.InvocationTargetException: org.apache.hadoop.hive.metastore.api.MetaException: Error parsing partition filter : line 1:8 mismatched character '' expecting '"' at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ... at org.apache.spark.sql.hive.client.Shim_v0_13.getPartitionsByFilter(HiveShim.scala:599) ... 103 more Caused by: org.apache.hadoop.hive.metastore.api.MetaException: Error parsing partition filter : line 1:8 mismatched character '' expecting '"' at org.apache.hadoop.hive.metastore.ObjectStore.getFilterParser(ObjectStore.java:2569) at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilterInternal(ObjectStore.java:2512) at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilter(ObjectStore.java:2335) ... org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_filter(HiveMetaStore.java:4442) ... org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitionsByFilter(HiveMetaStoreClient.java:1103) ... at org.apache.hadoop.hive.ql.metadata.Hive.getPartitionsByFilter(Hive.java:2254) ... 108 more ``` HIVE-11723 seems to resolve that in SemanticAnalyzer. So, I need to try that soon.
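The failure mode above — an unescaped quote character truncating the pushed-down partition filter and producing the `mismatched character '' expecting '"'` parse error — can be illustrated with a minimal escaping helper. This is a hedged sketch: the function name is hypothetical and Hive's real filter grammar is more involved than a single quoting rule.

```python
def quote_partition_value(value: str) -> str:
    """Wrap a partition value for a Hive-style filter string,
    escaping embedded double quotes (illustrative only, not
    Hive's or Spark's actual escaping code)."""
    escaped = value.replace('"', '\\"')
    return f'"{escaped}"'

# Passing a value like a"b through unescaped would end the quoted
# literal early; escaping keeps the filter string parseable.
print(quote_partition_value('a"b'))  # prints "a\"b"
```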
[GitHub] spark issue #17232: [SPARK-18112] [SQL] Support reading data from Hive 2.1 m...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17232 Merged build finished. Test PASSed.
[GitHub] spark issue #17232: [SPARK-18112] [SQL] Support reading data from Hive 2.1 m...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17232 **[Test build #74574 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74574/testReport)** for PR 17232 at commit [`80f33da`](https://github.com/apache/spark/commit/80f33da13dd6e3bd9820ab6fdd641404f0ad2a0b). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #17255: [SPARK-19918][SQL] Use TextFileFormat in implemen...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17255
[GitHub] spark pull request #17277: [SPARK-19887][SQL] dynamic partition keys can be ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/17277
[GitHub] spark issue #17255: [SPARK-19918][SQL] Use TextFileFormat in implementation ...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/17255 thanks, merging to master!
[GitHub] spark pull request #17255: [SPARK-19918][SQL] Use TextFileFormat in implemen...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17255#discussion_r106072933 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JsonInferSchema.scala --- @@ -40,18 +40,11 @@ private[sql] object JsonInferSchema { json: RDD[T], configOptions: JSONOptions, createParser: (JsonFactory, T) => JsonParser): StructType = { -require(configOptions.samplingRatio > 0, - s"samplingRatio (${configOptions.samplingRatio}) should be greater than 0") val shouldHandleCorruptRecord = configOptions.permissive val columnNameOfCorruptRecord = configOptions.columnNameOfCorruptRecord -val schemaData = if (configOptions.samplingRatio > 0.99) { - json -} else { - json.sample(withReplacement = false, configOptions.samplingRatio, 1) -} --- End diff -- I think it's fine
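For readers following the diff, the removed `samplingRatio` logic corresponds roughly to this pattern: validate the ratio, use the full dataset when the ratio is close to 1, otherwise sample without replacement before inferring the schema. A simplified Python sketch (names illustrative, not Spark's API):

```python
import random

def select_schema_data(records, sampling_ratio, seed=1):
    """Mirror of the removed Scala logic: reject non-positive ratios,
    skip sampling when the ratio is effectively 1.0, otherwise take
    a Bernoulli sample (each record kept with probability `sampling_ratio`)."""
    if sampling_ratio <= 0:
        raise ValueError(f"samplingRatio ({sampling_ratio}) should be greater than 0")
    if sampling_ratio > 0.99:
        return list(records)
    rng = random.Random(seed)  # fixed seed, as in the original (seed = 1)
    return [r for r in records if rng.random() < sampling_ratio]
```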
[GitHub] spark issue #17232: [SPARK-18112] [SQL] Support reading data from Hive 2.1 m...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/17232 LGTM
[GitHub] spark issue #17299: [SPARK-19817][SS] Make it clear that `timeZone` is a gen...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/17299 LGTM
[GitHub] spark issue #17299: [SPARK-19817][SS] Make it clear that `timeZone` is a gen...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17299 **[Test build #74578 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74578/testReport)** for PR 17299 at commit [`0b13c08`](https://github.com/apache/spark/commit/0b13c0850472b5172a9f428f923b99a168a79e00).
[GitHub] spark issue #17299: [SPARK-19817][SS] Make it clear that `timeZone` is a gen...
Github user lw-lin commented on the issue: https://github.com/apache/spark/pull/17299 This is the fix for the streaming counterpart (i.e. Structured Streaming). @ueshin @gatorsmile would you take a look? Thanks!
[GitHub] spark pull request #17299: [SPARK-19817][SS] Make it clear that `timeZone` i...
GitHub user lw-lin opened a pull request: https://github.com/apache/spark/pull/17299 [SPARK-19817][SS] Make it clear that `timeZone` is a general option in DataStreamReader/Writer ## What changes were proposed in this pull request? As the timezone setting can also affect partition values and applies to all formats, we should make this clear. ## How was this patch tested? N/A You can merge this pull request into a Git repository by running: $ git pull https://github.com/lw-lin/spark timezone Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/17299.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #17299 commit 0b13c0850472b5172a9f428f923b99a168a79e00 Author: Liwei Lin Date: 2017-03-15T02:06:06Z Initial commit
[GitHub] spark issue #17298: [SPARK-19094][WIP][PySpark] Plumb through logging for IJ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17298 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74568/ Test FAILed.
[GitHub] spark issue #17298: [SPARK-19094][WIP][PySpark] Plumb through logging for IJ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17298 Merged build finished. Test FAILed.
[GitHub] spark issue #17164: [SPARK-16844][SQL] Support codegen for sort-based aggrea...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17164 Merged build finished. Test PASSed.
[GitHub] spark issue #17298: [SPARK-19094][WIP][PySpark] Plumb through logging for IJ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17298 **[Test build #74568 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74568/testReport)** for PR 17298 at commit [`15d999b`](https://github.com/apache/spark/commit/15d999bb901aa0a0eef73ff50f2ba3d24c4d3f72). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17164: [SPARK-16844][SQL] Support codegen for sort-based aggrea...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17164 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74569/ Test PASSed.
[GitHub] spark issue #17164: [SPARK-16844][SQL] Support codegen for sort-based aggrea...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17164 **[Test build #74569 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74569/testReport)** for PR 17164 at commit [`5baa928`](https://github.com/apache/spark/commit/5baa928d758eaf4c6711c4a8d67611995ca3af25).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `trait AggregateCodegenHelper`
  * `abstract class AggregateExec extends UnaryExecNode`
[GitHub] spark issue #17237: [SPARK-19852][PYSPARK][ML] Update Python API setHandleIn...
Github user VinceShieh commented on the issue: https://github.com/apache/spark/pull/17237 Sure. No problem!
[GitHub] spark pull request #17223: [SPARK-19881][SQL] Support Dynamic Partition Inse...
Github user dongjoon-hyun closed the pull request at: https://github.com/apache/spark/pull/17223
[GitHub] spark issue #17223: [SPARK-19881][SQL] Support Dynamic Partition Inserts par...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/17223 I'll close this PR and JIRA issue.
[GitHub] spark issue #17223: [SPARK-19881][SQL] Support Dynamic Partition Inserts par...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/17223 I see, so that's the reason not to support it. Thank you, @cloud-fan.
[GitHub] spark issue #17266: [SPARK-19912][SQL] String literals should be escaped for...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/17266 Can we say something more in the error message? We should explain that it's a Hive bug.
[GitHub] spark issue #15945: [SPARK-12978][SQL] Merge unnecessary partial aggregates
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15945 **[Test build #74577 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74577/testReport)** for PR 15945 at commit [`870222e`](https://github.com/apache/spark/commit/870222e8ec1b6e7aa32a0260a045192323ba8d30).
[GitHub] spark issue #12436: [SPARK-14649][CORE] DagScheduler should not run duplicat...
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/12436 @jisookim0513 - created a new PR - https://github.com/apache/spark/pull/17297
[GitHub] spark issue #17166: [SPARK-19820] [core] Allow reason to be specified for ta...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17166 **[Test build #74576 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74576/testReport)** for PR 17166 at commit [`72b28cb`](https://github.com/apache/spark/commit/72b28cb0dc7aacf7cde1b5e49f05da49cb5de276).
[GitHub] spark issue #17177: [SPARK-19834][SQL] csv escape of quote escape
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/17177 @ep1804 @jbax Thank you. I will cc and inform you both when I see a PR bumping the version up to 2.4.0 (or, more likely, when I open one myself).
[GitHub] spark pull request #17177: [SPARK-19834][SQL] csv escape of quote escape
Github user ep1804 closed the pull request at: https://github.com/apache/spark/pull/17177
[GitHub] spark issue #17177: [SPARK-19834][SQL] csv escape of quote escape
Github user ep1804 commented on the issue: https://github.com/apache/spark/pull/17177 I agree with you, @HyukjinKwon; this PR will be closed for now and re-opened later. And thank you for the notice, @jbax!
[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r106066177

--- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -710,7 +710,11 @@ private[spark] class TaskSetManager(
       logInfo(s"Killing attempt ${attemptInfo.attemptNumber} for task ${attemptInfo.id} " +
         s"in stage ${taskSet.id} (TID ${attemptInfo.taskId}) on ${attemptInfo.host} " +
         s"as the attempt ${info.attemptNumber} succeeded on ${info.host}")
-      sched.backend.killTask(attemptInfo.taskId, attemptInfo.executorId, true)
+      sched.backend.killTask(
+        attemptInfo.taskId,
+        attemptInfo.executorId,
+        interruptThread = true,
+        reason = "another attempt succeeded")
--- End diff --

I added two screenshots to the PR description. In the second scenario having a verbose reason is fine, but in the stage summary view long or many distinct reasons would overflow the progress bar. We could probably fix the CSS to allow slightly longer / more reasons, but even that wouldn't be great if each task had a different reason.
[GitHub] spark pull request #17287: [SPARK-19945][SQL]add test suite for SessionCatal...
Github user windpiger commented on a diff in the pull request: https://github.com/apache/spark/pull/17287#discussion_r106065119

--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalogSuite.scala ---
@@ -76,468 +118,499 @@ class SessionCatalogSuite extends PlanTest {
   }

   test("create databases using invalid names") {
-    val catalog = new SessionCatalog(newEmptyCatalog())
-    testInvalidName(name => catalog.createDatabase(newDb(name), ignoreIfExists = true))
+    withSessionCatalog(EMPTY) { catalog =>
+      testInvalidName(
+        name => catalog.createDatabase(newDb(name), ignoreIfExists = true))
+    }
   }

   test("get database when a database exists") {
-    val catalog = new SessionCatalog(newBasicCatalog())
-    val db1 = catalog.getDatabaseMetadata("db1")
-    assert(db1.name == "db1")
-    assert(db1.description.contains("db1"))
+    withSessionCatalog() { catalog =>
+      val db1 = catalog.getDatabaseMetadata("db1")
+      assert(db1.name == "db1")
+      assert(db1.description.contains("db1"))
+    }
   }

   test("get database should throw exception when the database does not exist") {
-    val catalog = new SessionCatalog(newBasicCatalog())
-    intercept[NoSuchDatabaseException] {
-      catalog.getDatabaseMetadata("db_that_does_not_exist")
+    withSessionCatalog() { catalog =>
+      intercept[NoSuchDatabaseException] {
+        catalog.getDatabaseMetadata("db_that_does_not_exist")
+      }
     }
   }

   test("list databases without pattern") {
-    val catalog = new SessionCatalog(newBasicCatalog())
-    assert(catalog.listDatabases().toSet == Set("default", "db1", "db2", "db3"))
+    withSessionCatalog() { catalog =>
+      assert(catalog.listDatabases().toSet == Set("default", "db1", "db2", "db3"))
+    }
   }

   test("list databases with pattern") {
-    val catalog = new SessionCatalog(newBasicCatalog())
-    assert(catalog.listDatabases("db").toSet == Set.empty)
-    assert(catalog.listDatabases("db*").toSet == Set("db1", "db2", "db3"))
-    assert(catalog.listDatabases("*1").toSet == Set("db1"))
-    assert(catalog.listDatabases("db2").toSet == Set("db2"))
+    withSessionCatalog() { catalog =>
+      assert(catalog.listDatabases("db").toSet == Set.empty)
+      assert(catalog.listDatabases("db*").toSet == Set("db1", "db2", "db3"))
+      assert(catalog.listDatabases("*1").toSet == Set("db1"))
+      assert(catalog.listDatabases("db2").toSet == Set("db2"))
+    }
   }

   test("drop database") {
-    val catalog = new SessionCatalog(newBasicCatalog())
-    catalog.dropDatabase("db1", ignoreIfNotExists = false, cascade = false)
-    assert(catalog.listDatabases().toSet == Set("default", "db2", "db3"))
+    withSessionCatalog() { catalog =>
+      catalog.dropDatabase("db1", ignoreIfNotExists = false, cascade = false)
+      assert(catalog.listDatabases().toSet == Set("default", "db2", "db3"))
+    }
   }

   test("drop database when the database is not empty") {
     // Throw exception if there are functions left
-    val externalCatalog1 = newBasicCatalog()
-    val sessionCatalog1 = new SessionCatalog(externalCatalog1)
-    externalCatalog1.dropTable("db2", "tbl1", ignoreIfNotExists = false, purge = false)
-    externalCatalog1.dropTable("db2", "tbl2", ignoreIfNotExists = false, purge = false)
-    intercept[AnalysisException] {
-      sessionCatalog1.dropDatabase("db2", ignoreIfNotExists = false, cascade = false)
+    withSessionCatalogAndExternal() { (catalog, externalCatalog) =>
+      externalCatalog.dropTable("db2", "tbl1", ignoreIfNotExists = false, purge = false)
+      externalCatalog.dropTable("db2", "tbl2", ignoreIfNotExists = false, purge = false)
+      intercept[AnalysisException] {
+        catalog.dropDatabase("db2", ignoreIfNotExists = false, cascade = false)
+      }
     }
-
-    // Throw exception if there are tables left
-    val externalCatalog2 = newBasicCatalog()
-    val sessionCatalog2 = new SessionCatalog(externalCatalog2)
-    externalCatalog2.dropFunction("db2", "func1")
-    intercept[AnalysisException] {
-      sessionCatalog2.dropDatabase("db2", ignoreIfNotExists = false, cascade = false)
+    withSessionCatalogAndExternal() { (catalog, externalCatalog) =>
+      // Throw exception if there are tables left
+      externalCatalog.dropFunction("db2", "func1")
+      intercept[AnalysisException] {
+        catalog.dropDatabase("db2", ignoreIfNotExists = false, cascade = false)
+      }
     }
-    // When cascade is true, it should drop them
-    val externalCatalog3 =
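The refactor in this diff replaces per-test catalog construction with a "loan pattern" helper that builds the fixture, lends it to the test body, and keeps setup (and any teardown) in one place. A minimal sketch of the same pattern, written here in Python for illustration — the `Catalog` class and names are stand-ins, not Spark's `SessionCatalog` API:

```python
import fnmatch
from contextlib import contextmanager

class Catalog:
    """Illustrative stand-in for SessionCatalog."""
    def __init__(self, dbs):
        self.dbs = set(dbs)

    def list_databases(self, pattern="*"):
        # Glob-style matching, like listDatabases("db*") in the suite above.
        return {db for db in self.dbs if fnmatch.fnmatch(db, pattern)}

@contextmanager
def with_catalog(initial=("default", "db1", "db2", "db3")):
    catalog = Catalog(initial)   # shared setup lives in one place
    try:
        yield catalog            # "loan" the fixture to the test body
    finally:
        pass                     # teardown/reset would go here in a real suite

# Each test body only states its assertions, mirroring the withSessionCatalog() refactor.
with with_catalog() as catalog:
    assert catalog.list_databases("db*") == {"db1", "db2", "db3"}
    assert catalog.list_databases("*1") == {"db1"}
```

The benefit is the same as in the Scala diff: when the fixture's construction changes, only the helper changes, not every test.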
[GitHub] spark issue #15604: [SPARK-18066] [CORE] [TESTS] Add Pool usage policies tes...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15604 **[Test build #74575 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74575/testReport)** for PR 15604 at commit [`17e11f0`](https://github.com/apache/spark/commit/17e11f0c56e2a581766c06bd52695c2b05bcfcb2).
[GitHub] spark issue #15604: [SPARK-18066] [CORE] [TESTS] Add Pool usage policies tes...
Github user kayousterhout commented on the issue: https://github.com/apache/spark/pull/15604 @erenavsarogullari please file a JIRA when you see test failures instead of ignoring them. I updated https://issues.apache.org/jira/browse/SPARK-19803 for the first failure, but please file a JIRA for the second one.
[GitHub] spark issue #15604: [SPARK-18066] [CORE] [TESTS] Add Pool usage policies tes...
Github user kayousterhout commented on the issue: https://github.com/apache/spark/pull/15604 Jenkins retest this please
[GitHub] spark issue #15945: [SPARK-12978][SQL] Merge unnecessary partial aggregates
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15945 **[Test build #74573 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74573/testReport)** for PR 15945 at commit [`ea586cf`](https://github.com/apache/spark/commit/ea586cf6fb4464101c22ac98c4a5f5e08dfc5dbf).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #15945: [SPARK-12978][SQL] Merge unnecessary partial aggregates
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15945 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74573/ Test FAILed.
[GitHub] spark issue #15945: [SPARK-12978][SQL] Merge unnecessary partial aggregates
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15945 Merged build finished. Test FAILed.
[GitHub] spark issue #17232: [SPARK-18112] [SQL] Support reading data from Hive 2.1 m...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17232 **[Test build #74574 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74574/testReport)** for PR 17232 at commit [`80f33da`](https://github.com/apache/spark/commit/80f33da13dd6e3bd9820ab6fdd641404f0ad2a0b).
[GitHub] spark issue #17223: [SPARK-19881][SQL] Support Dynamic Partition Inserts par...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/17223 Since the Hive client is shared among all sessions, we can't set Hive conf dynamically if we want to keep session isolation. I think we should treat Hive conf entries as static SQL confs and throw an exception when users try to change them.
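The "static conf" behavior described above can be sketched as follows. This is a hedged illustration only (Spark's real handling lives in its SQL conf machinery and differs in detail); the key name is a hypothetical example:

```python
# Illustrative only: reject runtime SET for keys marked "static", because a
# client shared across sessions cannot hold a different value per session.
STATIC_KEYS = {"hive.exec.scratchdir"}  # hypothetical example of a static key

session_conf = {}

def set_conf(key, value):
    if key in STATIC_KEYS:
        raise RuntimeError(f"Cannot modify the value of a static config: {key}")
    session_conf[key] = value

set_conf("spark.sql.shuffle.partitions", "400")    # per-session: allowed
try:
    set_conf("hive.exec.scratchdir", "/tmp/other")  # static: rejected
except RuntimeError as e:
    print(e)
```

The design choice here is to fail loudly at SET time rather than silently accept a value that cannot actually take effect for one session without leaking into all of them.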
[GitHub] spark issue #15945: [SPARK-12978][SQL] Merge unnecessary partial aggregates
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15945 **[Test build #74573 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74573/testReport)** for PR 15945 at commit [`ea586cf`](https://github.com/apache/spark/commit/ea586cf6fb4464101c22ac98c4a5f5e08dfc5dbf).
[GitHub] spark issue #15945: [SPARK-12978][SQL] Merge unnecessary partial aggregates
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15945 Merged build finished. Test FAILed.
[GitHub] spark issue #15945: [SPARK-12978][SQL] Merge unnecessary partial aggregates
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15945 **[Test build #74571 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74571/testReport)** for PR 15945 at commit [`8e5d522`](https://github.com/apache/spark/commit/8e5d5226f7bb9bdf32cf93742f80fd44052f085e).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #15945: [SPARK-12978][SQL] Merge unnecessary partial aggregates
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15945 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74571/ Test FAILed.
[GitHub] spark issue #17166: [SPARK-19820] [core] Allow reason to be specified for ta...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17166 **[Test build #74572 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74572/testReport)** for PR 17166 at commit [`31967d1`](https://github.com/apache/spark/commit/31967d185852870d8edecb855ea1aafb7bd04dd1).
[GitHub] spark issue #16788: [SPARK-16742] Kerberos impersonation support
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/16788

>Trying to put it differently: if Spark had its own, secure method for distributing the initial set of delegation tokens needed by the executors (+ AM in case of YARN), then the YARN backend wouldn't need to use amContainer.setTokens at all. What I'm suggesting here is that this method be the base of the Mesos / Kerberos integration; and later we could change YARN to also use it.

>This particular code is pretty self-contained and is the base of what you need here to bootstrap things. Moving it to "core" wouldn't be that hard, I think. The main thing would be to work on how the initial set of tokens is sent to executors, since that is the only thing YARN does for Spark right now.

Agreed, I'm also thinking about it. The main reason that currently only Spark on YARN supports delegation tokens (DTs) is that YARN helps propagate DTs during bootstrapping. If Spark had a common solution for this, it could support accessing kerberized services under different cluster managers. One simple way, as I prototyped before, is to pass the serialized credentials as an executor launch command argument; when the executor launches, it deserializes the credentials and sets them on the UGI.
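The bootstrap idea in the last paragraph — serialize credentials on the driver, pass them as a launch argument, restore them on the executor — can be sketched as below. This is an assumption-laden illustration in Python: the token dict stands in for Hadoop's `Credentials`, and a real executor would restore the tokens into its UGI, not a plain dict.

```python
import base64

def to_launch_arg(tokens):
    """Driver side: pack {alias: token_bytes} into one command-line argument."""
    blob = ";".join(
        f"{alias}:{base64.b64encode(data).decode('ascii')}"
        for alias, data in tokens.items()
    )
    return base64.b64encode(blob.encode("utf-8")).decode("ascii")

def from_launch_arg(arg):
    """Executor side: decode the argument back into {alias: token_bytes}."""
    blob = base64.b64decode(arg).decode("utf-8")
    tokens = {}
    for entry in filter(None, blob.split(";")):
        alias, enc = entry.split(":", 1)
        tokens[alias] = base64.b64decode(enc)
    return tokens

creds = {"hdfs-dt": b"opaque-token-bytes"}
arg = to_launch_arg(creds)        # driver: appended to the executor launch command
restored = from_launch_arg(arg)   # executor: done at startup, before tasks run
assert restored == creds
```

Note the caveat implicit in the discussion above: a launch-command argument may be visible to other users on the host, which is why a secure distribution channel is the real design question here.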
[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r106063119

--- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -710,7 +710,11 @@ private[spark] class TaskSetManager(
       logInfo(s"Killing attempt ${attemptInfo.attemptNumber} for task ${attemptInfo.id} " +
         s"in stage ${taskSet.id} (TID ${attemptInfo.taskId}) on ${attemptInfo.host} " +
         s"as the attempt ${info.attemptNumber} succeeded on ${info.host}")
-      sched.backend.killTask(attemptInfo.taskId, attemptInfo.executorId, true)
+      sched.backend.killTask(
+        attemptInfo.taskId,
+        attemptInfo.executorId,
+        interruptThread = true,
+        reason = "another attempt succeeded")
--- End diff --

Can you post a screenshot of the relevant part of the UI? Is the problem just that the HTML properties don't allow columns to wrap?
[GitHub] spark pull request #17166: [SPARK-19820] [core] Allow reason to be specified...
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/17166#discussion_r106063061

--- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala ---
@@ -239,8 +244,9 @@ private[spark] class Executor(
      */
     @volatile var task: Task[Any] = _

-    def kill(interruptThread: Boolean): Unit = {
-      logInfo(s"Executor is trying to kill $taskName (TID $taskId)")
+    def kill(interruptThread: Boolean, reason: String): Unit = {
+      logInfo(s"Executor is trying to kill $taskName (TID $taskId), reason: $reason")
--- End diff --

Hm, good question. Looks fine.