[GitHub] spark pull request: [SPARK-5135][SQL] Add support for describe [ex...
Github user OopsOutOfMemory commented on the pull request: https://github.com/apache/spark/pull/3935#issuecomment-70220078 @rxin The rebase may have gone wrong, because it changed 278 files. How do I revert it? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-5186] [MLLIB] Vector.equals and Vector....
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3997#issuecomment-70220103 [Test build #25645 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25645/consoleFull) for PR 3997 at commit [`bdf8789`](https://github.com/apache/spark/commit/bdf8789b5ee33eb0bda465e68cc31892b049c381). * This patch merges cleanly.
[GitHub] spark pull request: SPARK-5217 Spark UI should report pending stag...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4043#issuecomment-70220093 [Test build #25644 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25644/consoleFull) for PR 4043 at commit [`c7b3332`](https://github.com/apache/spark/commit/c7b33329eda2093adc415dacc7941f9ca6d4fbe8). * This patch merges cleanly.
[GitHub] spark pull request: SPARK-5217 Spark UI should report pending stag...
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4043#issuecomment-70220172 @ScrapCodes Rather than change this to `LinkedHashMap`, can you just check if it contains it before removing it? It might not be obvious to developers that `remove` has that specific behavior. I think it's better to just be explicit.
[GitHub] spark pull request: [SPARK-5214][Core] Add EventLoop and change DA...
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4016#issuecomment-70220476 I think if we have a stated goal of moving away from akka (which would eventually move all of our event loops into this structure) then it makes some sense to do it incrementally. For me this is the main long term goal here, but not something we've really discussed much as a project. So maybe we keep this on ice and try to get some consensus on that goal, and if we do all want to go that direction, we can consider this as part of an incremental roll out.
[GitHub] spark pull request: SPARK-5217 Spark UI should report pending stag...
Github user ScrapCodes commented on the pull request: https://github.com/apache/spark/pull/4043#issuecomment-70220772 LinkedHashMap was introduced to maintain the insertion order of the stage IDs (a HashMap would give an arbitrary order). All maps in Scala return an `Option` from `remove`, and even in Java, calling remove on a nonexistent key returns null. Do you still want me to change it? Also, maybe renaming pendingStages to pendingStagesMap would avoid that confusion? And should the same apply to the already existing field activeStages?
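The two behaviours described in this comment can be sketched in a few lines of Scala. This is only an illustration under stated assumptions: the object `PendingStagesDemo`, the method `build`, and the map from a stage id to a description string are hypothetical and are not taken from the pull request.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the listener's pending-stages map (not Spark code).
object PendingStagesDemo {
  // Build a map whose keys are inserted out of numeric order on purpose.
  def build(): mutable.LinkedHashMap[Int, String] = {
    val pending = mutable.LinkedHashMap[Int, String]()
    pending(3) = "stage 3"
    pending(1) = "stage 1"
    pending(2) = "stage 2"
    pending
  }

  def main(args: Array[String]): Unit = {
    val pending = build()
    // A LinkedHashMap iterates in insertion order, not key order.
    println(pending.keys.toList)
    // remove returns Some(value) for a present key and None for an absent one;
    // it does not throw, so no contains check is needed for safety.
    println(pending.remove(1))
    println(pending.remove(42))
    println(pending.keys.toList)
  }
}
```

Whether an explicit `contains` check still reads more clearly to other developers is a style question that the sketch does not settle.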
[GitHub] spark pull request: [SPARK-4874] [CORE] Collect record count metri...
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/4067#issuecomment-70220867 ![screen shot 2015-01-16 at 12 04 04 am](https://cloud.githubusercontent.com/assets/6590087/5773199/bcd70ffc-9d13-11e4-86c1-7140a73f9e92.png) Shows a stage that has input metrics (reading from a file) and writes data for the next stage. ![screen shot 2015-01-16 at 12 04 15 am](https://cloud.githubusercontent.com/assets/6590087/5773207/e252d90a-9d13-11e4-86d6-d1e77500785c.png) Shows a stage that has both shuffle reading and writing, with no input or output metrics. ![screen shot 2015-01-16 at 12 04 25 am](https://cloud.githubusercontent.com/assets/6590087/5773215/048fdef0-9d14-11e4-877f-229922bfe967.png) Shows a stage that outputs to a file.
[GitHub] spark pull request: [SPARK-5135][SQL] Add support for describe [ex...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/3935#issuecomment-70220923 Not sure - do you have a backup? Maybe just take a diff and apply the diff on master.
[GitHub] spark pull request: SPARK-5217 Spark UI should report pending stag...
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4043#issuecomment-70220952 ah I see - if the existing remove call is safe, then I think it's fine.
[GitHub] spark pull request: [SPARK-4859][Core][Streaming] Refactor LiveLis...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4006#issuecomment-70221214 [Test build #25646 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25646/consoleFull) for PR 4006 at commit [`4e85ffc`](https://github.com/apache/spark/commit/4e85ffcec4a8ceb1c379eb92e4932010f405e5c9). * This patch merges cleanly.
[GitHub] spark pull request: SPARK-2450 Adds executor log links to Web UI
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/3486#issuecomment-70221523 @JoshRosen Now that #3696 and then #3711 are merged, I rebased and fixed the merge issues, so I think this PR is ready to be reviewed again.
[GitHub] spark pull request: SPARK-2450 Adds executor log links to Web UI
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3486#issuecomment-70221659 [Test build #25647 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25647/consoleFull) for PR 3486 at commit [`5d5c95a`](https://github.com/apache/spark/commit/5d5c95a9a028878646ff958b065080bb9c93f893). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4879] [WIP] Use driver to coordinate Ha...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4066#issuecomment-7016 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25640/ Test FAILed.
[GitHub] spark pull request: [SPARK-4879] [WIP] Use driver to coordinate Ha...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4066#issuecomment-7012 [Test build #25640 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25640/consoleFull) for PR 4066 at commit [`8c64d12`](https://github.com/apache/spark/commit/8c64d12d2e4f5b7b377cce0f49c941870958cdef). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `case class AskPermissionToCommitOutput(`
[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-70223642 [Test build #25648 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25648/consoleFull) for PR 4059 at commit [`c1d4c71`](https://github.com/apache/spark/commit/c1d4c71ec5049e27a0ecb9851efbeee14aca7c7e). * This patch merges cleanly.
[GitHub] spark pull request: SPARK-5217 Spark UI should report pending stag...
Github user ScrapCodes commented on a diff in the pull request: https://github.com/apache/spark/pull/4043#discussion_r23068344
--- Diff: core/src/main/scala/org/apache/spark/ui/jobs/AllStagesPage.scala ---
@@ -37,12 +37,18 @@ private[ui] class AllStagesPage(parent: StagesTab) extends WebUIPage() {
     val numCompletedStages = listener.numCompletedStages
     val failedStages = listener.failedStages.reverse.toSeq
     val numFailedStages = listener.numFailedStages
+    val pendingStages = listener.pendingStages.values.toSeq
+    val numWaitingStages = pendingStages.size
     val now = System.currentTimeMillis
     val activeStagesTable =
       new StageTableBase(activeStages.sortBy(_.submissionTime).reverse,
         parent.basePath, parent.listener, isFairScheduler = parent.isFairScheduler,
         killEnabled = parent.killEnabled)
+    val pendingStagesTable =
+      new StageTableBase(pendingStages.sortBy(_.submissionTime),
--- End diff --
@pwendell Can I use Sorting.stableSort here?
[GitHub] spark pull request: [SPARK-5186] [MLLIB] Vector.equals and Vector....
Github user hhbyyh commented on the pull request: https://github.com/apache/spark/pull/3997#issuecomment-70224754 Oh, I didn't see your comment before the PR. @mengxr I did find a lot of code in Spark using this pattern (first assign to a local variable, then access it), and we can follow it. To be honest, I'm not sure I completely understand the reason. Is it to save a memory access? I ran some local perf tests and saw no obvious difference. Would you please share some insight? Thanks a lot!
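For readers unfamiliar with the pattern being asked about, here is a minimal sketch of what "first assign locally, then access" looks like in Scala. The class `DenseVec` and its methods are hypothetical and are not Spark's `Vector` code. The sketch only shows the shape of the pattern; it makes no claim about the reviewer's reasons or about any performance difference.

```scala
// Hypothetical class for illustration only; not Spark's Vector implementation.
final class DenseVec(val values: Array[Double]) {
  // Reads the member `values` directly on every loop iteration.
  def sumViaField: Double = {
    var s = 0.0
    var i = 0
    while (i < values.length) {
      s += values(i)
      i += 1
    }
    s
  }

  // Copies the member into local vals once, then only touches the locals.
  def sumViaLocal: Double = {
    val localValues = values
    val n = localValues.length
    var s = 0.0
    var i = 0
    while (i < n) {
      s += localValues(i)
      i += 1
    }
    s
  }
}

object LocalValDemo {
  def main(args: Array[String]): Unit = {
    val v = new DenseVec(Array(1.0, 2.0, 3.5))
    // Both variants compute the same result; only the access pattern differs.
    println(v.sumViaField)
    println(v.sumViaLocal)
  }
}
```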
[GitHub] spark pull request: [SPARK-4874] [CORE] Collect record count metri...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4067#issuecomment-70224780 **[Test build #25638 timed out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25638/consoleFull)** for PR 4067 at commit [`1572054`](https://github.com/apache/spark/commit/157205446fbf474877195d7a30ada86e31e836c2) after a configured wait of `120m`.
[GitHub] spark pull request: [SPARK-4874] [CORE] Collect record count metri...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4067#issuecomment-70224784 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25638/ Test FAILed.
[GitHub] spark pull request: [SPARK-4874] [CORE] Collect record count metri...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4067#issuecomment-70225110 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25641/ Test PASSed.
[GitHub] spark pull request: [SPARK-4874] [CORE] Collect record count metri...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4067#issuecomment-70225105 [Test build #25641 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25641/consoleFull) for PR 4067 at commit [`3c2d021`](https://github.com/apache/spark/commit/3c2d021ae4879f844ce3a5f1fc761b015ab4b5a9). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class AfterNextInterceptingIterator[A](sub: Iterator[A]) extends Iterator[A] `
[GitHub] spark pull request: [SPARK-4984][CORE][UI] Adding a pop-up contain...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3819#issuecomment-70226661 [Test build #25649 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25649/consoleFull) for PR 3819 at commit [`0bca96d`](https://github.com/apache/spark/commit/0bca96d02b475febfcbdd3aaeb0e082102e9802e). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4984][CORE][UI] Adding a pop-up contain...
Github user scwf commented on the pull request: https://github.com/apache/spark/pull/3819#issuecomment-70226767 With a read-only text field, it looks like the screenshot below. It also supports copying the description. ![image](https://cloud.githubusercontent.com/assets/7018048/5773882/a58c1d94-9da2-11e4-91d5-eff2e1efb3b6.png)
[GitHub] spark pull request: [SPARK-5186] [MLLIB] Vector.equals and Vector....
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3997#issuecomment-70226850 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25645/ Test PASSed.
[GitHub] spark pull request: [SPARK-5186] [MLLIB] Vector.equals and Vector....
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3997#issuecomment-70226841 [Test build #25645 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25645/consoleFull) for PR 3997 at commit [`bdf8789`](https://github.com/apache/spark/commit/bdf8789b5ee33eb0bda465e68cc31892b049c381). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: SPARK-5217 Spark UI should report pending stag...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4043#issuecomment-70227033 [Test build #25644 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25644/consoleFull) for PR 4043 at commit [`c7b3332`](https://github.com/apache/spark/commit/c7b33329eda2093adc415dacc7941f9ca6d4fbe8). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: SPARK-5217 Spark UI should report pending stag...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4043#issuecomment-70227040 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25644/ Test PASSed.
[GitHub] spark pull request: [SPARK-5193][SQL] Remove Spark SQL Java-specif...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4065#issuecomment-70227554 [Test build #25643 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25643/consoleFull) for PR 4065 at commit [`83735da`](https://github.com/apache/spark/commit/83735da66edd8c820653602ce8ca14d2c9c25b5b). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-5193][SQL] Remove Spark SQL Java-specif...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4065#issuecomment-70227559 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25643/ Test FAILed.
[GitHub] spark pull request: SPARK-4687. [WIP] Add an addDirectory API
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3670#issuecomment-70227656 [Test build #25650 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25650/consoleFull) for PR 3670 at commit [`12cac14`](https://github.com/apache/spark/commit/12cac148043a8cd6289566f807e4f24b8261cb38). * This patch merges cleanly.
[GitHub] spark pull request: SPARK-4687. [WIP] Add an addDirectory API
Github user sryza commented on the pull request: https://github.com/apache/spark/pull/3670#issuecomment-70227794 OK, here's a version that's ready for review. It still needs a little more doc, polish, and a test or two, but I would like to get validation on the approach.
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4008#issuecomment-70227990 [Test build #25642 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25642/consoleFull) for PR 4008 at commit [`d202e6e`](https://github.com/apache/spark/commit/d202e6e1a8970c3e1f6660a03bc5774c2e40db54). * This patch **passes all tests**. * This patch **does not merge cleanly**. * This patch adds the following public classes _(experimental)_: * `trait TaskKilledListener extends EventListener ` * `class TaskKilledListenerException(errorMessages: Seq[String]) extends Exception `
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4008#issuecomment-70227997 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25642/ Test PASSed.
[GitHub] spark pull request: [SPARK-4859][Core][Streaming] Refactor LiveLis...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4006#issuecomment-70228082 [Test build #25646 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25646/consoleFull) for PR 4006 at commit [`4e85ffc`](https://github.com/apache/spark/commit/4e85ffcec4a8ceb1c379eb92e4932010f405e5c9). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4859][Core][Streaming] Refactor LiveLis...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4006#issuecomment-70228091 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25646/ Test PASSed.
[GitHub] spark pull request: [SPARK-5135][SQL] Add support for describe [ex...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3935#issuecomment-70228217 [Test build #25651 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25651/consoleFull) for PR 3935 at commit [`9efdf35`](https://github.com/apache/spark/commit/9efdf353af71b8b3a15f5a0e837e580237cfcd50). * This patch merges cleanly.
[GitHub] spark pull request: SPARK-5217 Spark UI should report pending stag...
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/4043#discussion_r23070055 --- Diff: core/src/main/scala/org/apache/spark/ui/jobs/AllStagesPage.scala --- @@ -37,12 +37,18 @@ private[ui] class AllStagesPage(parent: StagesTab) extends WebUIPage() { val numCompletedStages = listener.numCompletedStages val failedStages = listener.failedStages.reverse.toSeq val numFailedStages = listener.numFailedStages + val pendingStages = listener.pendingStages.values.toSeq + val numWaitingStages = pendingStages.size val now = System.currentTimeMillis val activeStagesTable = new StageTableBase(activeStages.sortBy(_.submissionTime).reverse, parent.basePath, parent.listener, isFairScheduler = parent.isFairScheduler, killEnabled = parent.killEnabled) + val pendingStagesTable = +new StageTableBase(pendingStages.sortBy(_.submissionTime), --- End diff -- Why not just keep the sorting the same as it was? I think the submission time is unlikely to be tied in most cases. It would be good to just make it consistent with the existing ones.
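To make the ordering question concrete, here is a minimal plain-Python sketch with no Spark dependency. `Stage`, `stage_id`, `submission_time`, and `newest_first` are hypothetical names introduced for illustration; they are not Spark classes or methods. The sketch assumes that "consistent with the existing ones" means the same sort-then-reverse by submission time that the active-stages table uses in the diff.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Stage:
    # Hypothetical stand-in for a stage record; not Spark's own class.
    stage_id: int
    submission_time: Optional[int]  # epoch millis, None if not submitted yet


def newest_first(stages: List[Stage]) -> List[Stage]:
    # Sort ascending by submission time with missing times first, then
    # reverse, mirroring `sortBy(_.submissionTime).reverse` from the diff.
    def key(s: Stage):
        return (s.submission_time is not None, s.submission_time or 0)

    return sorted(stages, key=key)[::-1]


pending = [Stage(1, 100), Stage(2, None), Stage(3, 300)]
ordered_ids = [s.stage_id for s in newest_first(pending)]
print(ordered_ids)
```

With this ordering the most recently submitted stage comes first and stages without a submission time come last, which is the layout the active-stages table already gives.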
[GitHub] spark pull request: SPARK-5217 Spark UI should report pending stag...
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/4043#discussion_r23070066 --- Diff: core/src/main/scala/org/apache/spark/ui/jobs/AllStagesPage.scala --- @@ -37,12 +37,18 @@ private[ui] class AllStagesPage(parent: StagesTab) extends WebUIPage() { val numCompletedStages = listener.numCompletedStages val failedStages = listener.failedStages.reverse.toSeq val numFailedStages = listener.numFailedStages + val pendingStages = listener.pendingStages.values.toSeq --- End diff -- Can you move this up next to `activeStages` so that the declarations are grouped properly?
[GitHub] spark pull request: SPARK-5217 Spark UI should report pending stag...
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/4043#discussion_r23070082 --- Diff: core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala --- @@ -17,7 +17,7 @@ package org.apache.spark.ui.jobs -import scala.collection.mutable.{HashMap, HashSet, ListBuffer} +import scala.collection.mutable.{HashMap, HashSet, ListBuffer, LinkedHashMap} --- End diff -- If it's safe to do `remove` on a HashMap, I think it's fine to revert this back to using a HashMap. It's more consistent in that case.
[GitHub] spark pull request: SPARK-2450 Adds exeuctor log links to Web UI
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3486#issuecomment-70228666 [Test build #25647 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25647/consoleFull) for PR 3486 at commit [`5d5c95a`](https://github.com/apache/spark/commit/5d5c95a9a028878646ff958b065080bb9c93f893). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * ` case class RegisterExecutor(executorId: String, hostPort: String, cores: Int,`
[GitHub] spark pull request: SPARK-2450 Adds exeuctor log links to Web UI
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3486#issuecomment-70228673 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25647/ Test PASSed.
[GitHub] spark pull request: [SPARK-5135][SQL] Add support for describe [ex...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3935#issuecomment-70229207 [Test build #25651 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25651/consoleFull) for PR 3935 at commit [`9efdf35`](https://github.com/apache/spark/commit/9efdf353af71b8b3a15f5a0e837e580237cfcd50). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `case class DDLDescribeCommand(`
[GitHub] spark pull request: [SPARK-5135][SQL] Add support for describe [ex...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3935#issuecomment-70229216 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25651/ Test FAILed.
[GitHub] spark pull request: [SPARK-4874] [CORE] Collect record count metri...
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4067#issuecomment-70229214 What about combining the input size and records in the same column? Overall this would help limit the growth in the number of columns. The title could be "Input Size / Records".
[GitHub] spark pull request: [SPARK-4923][REPL] Add Developer API to REPL t...
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/4034#issuecomment-70229681 Jenkins ok to test. Jenkins, test this please.
[GitHub] spark pull request: [SPARK-4923][REPL] Add Developer API to REPL t...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4034#issuecomment-70229882 [Test build #25652 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25652/consoleFull) for PR 4034 at commit [`053ca75`](https://github.com/apache/spark/commit/053ca75965f496dca7839e5a3602f2008fa13098). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-5135][SQL] Add support for describe [ex...
Github user scwf commented on the pull request: https://github.com/apache/spark/pull/3935#issuecomment-70230293 @OopsOutOfMemory try it like this: `git checkout describe`; `git pull apachesparkrepo`; `git fetch apachesparkrepo master:newdescribe`; `git diff newdescribe > describe.patch`; `git checkout newdescribe`; `git apply describe.patch`; `git push --force your-github-repo newdescribe:describe`
[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-70230901 [Test build #25648 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25648/consoleFull) for PR 4059 at commit [`c1d4c71`](https://github.com/apache/spark/commit/c1d4c71ec5049e27a0ecb9851efbeee14aca7c7e). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class GaussianMixtureModel(object):` * `class GaussianMixtureEM(object):`
[GitHub] spark pull request: [SPARK-5012][MLLib][PySpark]Python API for Gau...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4059#issuecomment-70230909 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25648/ Test PASSed.
[GitHub] spark pull request: [SPARK-4874] [CORE] Collect record count metri...
Github user ksakellis commented on the pull request: https://github.com/apache/spark/pull/4067#issuecomment-70230969 If we do that, we wouldn't be able to sort on the number of records and on bytes independently.
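The sorting concern can be illustrated with a small plain-Python sketch. The task names and numbers are made up for illustration and do not come from the PR. It shows that ordering by bytes and ordering by records can disagree, so a single merged "Input Size / Records" column would have to commit to one of the two keys.

```python
# Hypothetical per-task metrics: (task_id, input_bytes, input_records).
tasks = [("t1", 4096, 10), ("t2", 1024, 500), ("t3", 2048, 50)]

# Two separate columns allow two independent descending sorts.
by_bytes = [t[0] for t in sorted(tasks, key=lambda t: t[1], reverse=True)]
by_records = [t[0] for t in sorted(tasks, key=lambda t: t[2], reverse=True)]

# A merged cell shows both values but can only be ordered by one of them.
merged_cells = [f"{b} B / {r}" for _, b, r in tasks]

print(by_bytes)
print(by_records)
print(merged_cells)
```

Here the two orderings are different, which is the case a merged column cannot serve with a single sort.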
[GitHub] spark pull request: SPARK-3996: Shade Jetty in Spark deliverables.
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3130#issuecomment-70231376 Hey All, I don't see where spark-submit fits into the issue of having a conflicting jetty library. If you have an application that requires a conflicting version of jetty, it doesn't matter one way or the other whether you use spark-submit. There are certainly applications that do not use spark-submit, and I don't think we have any issue with that. Databricks Cloud is like this; it embeds Spark. I think we'll always support that; it will just be more work for those applications to deal with setting up classpaths, etc. in the proper way. The issue here is just a genuine dependency conflict between a user application and Spark. We don't have a general-purpose way of solving those.
[GitHub] spark pull request: [SPARK-5124][Core][WIP] A standard internal RP...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3974#issuecomment-70232206 [Test build #25653 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25653/consoleFull) for PR 3974 at commit [`c3359f0`](https://github.com/apache/spark/commit/c3359f09e551ab79c829b7b24215b70921709045). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-5135][SQL] Add support for describe [ex...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3935#issuecomment-70232782 [Test build #25654 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25654/consoleFull) for PR 3935 at commit [`9d22708`](https://github.com/apache/spark/commit/9d22708ee3360bc0f9f06a5428f01ff3c7acb48d). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-5135][SQL] Add support for describe [ex...
Github user scwf commented on the pull request: https://github.com/apache/spark/pull/3935#issuecomment-70233294 @OopsOutOfMemory, you need to revert the unnecessary changes.
[GitHub] spark pull request: [SPARK-4984][CORE][UI] Adding a pop-up contain...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3819#issuecomment-70234271 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25649/ Test PASSed.
[GitHub] spark pull request: [SPARK-4984][CORE][UI] Adding a pop-up contain...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3819#issuecomment-70234263 [Test build #25649 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25649/consoleFull) for PR 3819 at commit [`0bca96d`](https://github.com/apache/spark/commit/0bca96d02b475febfcbdd3aaeb0e082102e9802e). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-733] Add documentation on use of accumu...
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/4022#discussion_r23072648 --- Diff: docs/programming-guide.md --- @@ -1316,7 +1316,35 @@ For accumulator updates performed inside <b>actions only</b>, Spark guarantees t will only be applied once, i.e. restarted tasks will not update the value. In transformations, users should be aware of that each task's update may be applied more than once if tasks or job stages are re-executed. +In addition, accumulators do not maintain lineage for the operations that use them. Consequently, accumulator updates are not guaranteed to be executed when made within a lazy transformation like `map()`. Unless something has triggered the evaluation of the lazy transformation that updates the value of the accumlator, subsequent operations will not themselves trigger that evaluation and the value of the accumulator will remain unchanged. The below code fragment demonstrates this issue: --- End diff -- I found this worded a bit confusingly: what would it mean for an accumulator to maintain lineage? I think this is from @JoshRosen's PR description, but IMO it might be better to remove that particular phrasing. What about a slight re-wording: ``` Accumulators do not change the lazy evaluation model of Spark. Their value is only updated once the RDD in which they are being modified is computed as part of an action. The below code fragment demonstrates this property: ``` I also didn't call it an issue because it's just a property of how they work; I don't think it's necessarily a bug.
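The property described in the suggested wording can be sketched without Spark at all. The following is only an analogy in plain Python, relying on the fact that the built-in `map` is lazy in Python 3. `Counter` and `tag` are hypothetical names and are not Spark's accumulator API, and forcing the iterator with `list` merely plays the role of an action.

```python
class Counter:
    # Hypothetical stand-in for an accumulator; not Spark's Accumulator class.
    def __init__(self):
        self.value = 0

    def add(self, n):
        self.value += n


acc = Counter()


def tag(x):
    acc.add(1)  # side effect inside the "transformation"
    return x * 2


data = [1, 2, 3]
mapped = map(tag, data)    # lazy: tag has not been called yet
value_before = acc.value   # the counter is still untouched here
result = list(mapped)      # forcing evaluation plays the role of an action
value_after = acc.value    # updated once per element

print(value_before, value_after, result)
```

The counter only changes once something consumes the mapped sequence, which mirrors the point that an accumulator's value is updated only when the RDD that modifies it is computed as part of an action.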
[GitHub] spark pull request: SPARK-4687. [WIP] Add an addDirectory API
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3670#issuecomment-70235440 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25650/ Test FAILed.
[GitHub] spark pull request: SPARK-4687. [WIP] Add an addDirectory API
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3670#issuecomment-70235428 [Test build #25650 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25650/consoleFull) for PR 3670 at commit [`12cac14`](https://github.com/apache/spark/commit/12cac148043a8cd6289566f807e4f24b8261cb38). * This patch **fails MiMa tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-1507][YARN]specify # cores for Applicat...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/4018#issuecomment-70236672 I tested this patch. With no configs specified, the AM used 1 vCore in both yarn client and cluster mode. With `spark.driver.cores=3` and `spark.yarn.am.cores=4`, the AM used 4 vCores in client mode and 3 in cluster mode. With `--driver-cores 5`, the AM used 5 vCores in cluster mode. If we have no problems with the name, I think it is good to go.
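The reported results can be summarised as a small lookup rule. The sketch below is a plain-Python model of the behaviour described in the comment, not the patch's actual code. `am_vcores` is a hypothetical helper, and it assumes that `--driver-cores` is equivalent to setting `spark.driver.cores`.

```python
def am_vcores(conf: dict, deploy_mode: str) -> int:
    # Model of the reported behaviour only; not Spark's implementation.
    # Client mode reads spark.yarn.am.cores, cluster mode reads
    # spark.driver.cores, and both fall back to 1 vCore when unset.
    if deploy_mode == "client":
        key = "spark.yarn.am.cores"
    else:
        key = "spark.driver.cores"
    return int(conf.get(key, 1))


no_conf = {}
both_set = {"spark.driver.cores": "3", "spark.yarn.am.cores": "4"}
# Assumption: `--driver-cores 5` ends up as spark.driver.cores=5.
driver_cores_flag = {"spark.driver.cores": "5"}

print(am_vcores(no_conf, "client"), am_vcores(no_conf, "cluster"))
print(am_vcores(both_set, "client"), am_vcores(both_set, "cluster"))
print(am_vcores(driver_cores_flag, "cluster"))
```

The three calls reproduce the three test cases from the comment: the default of 1 vCore, the 4 versus 3 split between client and cluster mode, and 5 vCores in cluster mode.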
[GitHub] spark pull request: [SPARK-4923][REPL] Add Developer API to REPL t...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4034#issuecomment-70237633 [Test build #25652 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25652/consoleFull) for PR 4034 at commit [`053ca75`](https://github.com/apache/spark/commit/053ca75965f496dca7839e5a3602f2008fa13098). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class SparkILoop(` * ` * @param id The id (variable name, method name, class name, etc) whose` * ` * Retrieves the class representing the id (variable name, method name,` * ` * @param id The id (variable name, method name, class name, etc) whose` * ` * @return Some containing term name (id) class if exists, else None` * ` * @param id The id (variable name, method name, class name, etc) whose` * ` * @param id The id (variable name, method name, class name, etc) whose` * ` * Retrieves the runtime class and type representing the id (variable name,` * ` * @param id The id (variable name, method name, class name, etc) whose` * ` * @param id The id (variable name, method name, class name, etc) whose`
[GitHub] spark pull request: [SPARK-4923][REPL] Add Developer API to REPL t...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4034#issuecomment-70237640 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25652/ Test PASSed.
[GitHub] spark pull request: [SPARK-5135][SQL] Add support for describe [ex...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3935#issuecomment-70237708 [Test build #25655 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25655/consoleFull) for PR 3935 at commit [`5b7ae19`](https://github.com/apache/spark/commit/5b7ae19dfdc4410f1018193e0b1701abf799c439). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-5135][SQL] Add support for describe [ex...
Github user OopsOutOfMemory commented on the pull request: https://github.com/apache/spark/pull/3935#issuecomment-70237945 Thanks @scwf @rxin, the conflicts are resolved cleanly and this is now up to date.
[GitHub] spark pull request: [SPARK-5124][Core][WIP] A standard internal RP...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3974#issuecomment-70239690 [Test build #25653 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25653/consoleFull) for PR 3974 at commit [`c3359f0`](https://github.com/apache/spark/commit/c3359f09e551ab79c829b7b24215b70921709045). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * ` class ClientActor(override val rpcEnv: RpcEnv) extends NetworkRpcEndpoint with Logging ` * `class ExecutorActor(override val rpcEnv: RpcEnv, executorId: String) extends RpcEndpoint ` * `trait RpcEnv ` * `trait RpcEndpoint ` * `trait NetworkRpcEndpoint extends RpcEndpoint ` * `trait RpcEndpointRef ` * `case class RpcAddress(host: String, port: Int) ` * ` case class RegisterExecutor(executorId: String, hostPort: String, cores: Int,` * `class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, val rpcEnv: RpcEnv)` * ` class DriverActor(override val rpcEnv: RpcEnv, sparkProperties: Seq[(String, String)])` * `class BlockManagerMasterActor(override val rpcEnv: RpcEnv, val isLocal: Boolean, conf: SparkConf,`
[GitHub] spark pull request: [SPARK-5124][Core][WIP] A standard internal RP...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3974#issuecomment-70239691 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25653/ Test PASSed.
[GitHub] spark pull request: SPARK-5217 Spark UI should report pending stag...
Github user ScrapCodes commented on a diff in the pull request: https://github.com/apache/spark/pull/4043#discussion_r23074849 --- Diff: core/src/main/scala/org/apache/spark/ui/jobs/AllStagesPage.scala --- @@ -37,12 +37,18 @@ private[ui] class AllStagesPage(parent: StagesTab) extends WebUIPage() { val numCompletedStages = listener.numCompletedStages val failedStages = listener.failedStages.reverse.toSeq val numFailedStages = listener.numFailedStages + val pendingStages = listener.pendingStages.values.toSeq + val numWaitingStages = pendingStages.size val now = System.currentTimeMillis val activeStagesTable = new StageTableBase(activeStages.sortBy(_.submissionTime).reverse, parent.basePath, parent.listener, isFairScheduler = parent.isFairScheduler, killEnabled = parent.killEnabled) + val pendingStagesTable = +new StageTableBase(pendingStages.sortBy(_.submissionTime), --- End diff -- I was thinking about usability: the most interesting stage, the one that is next to be executed, would appear last. Anyway, I will change it.
[GitHub] spark pull request: SPARK-5217 Spark UI should report pending stag...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4043#issuecomment-70240729 [Test build #25656 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25656/consoleFull) for PR 4043 at commit [`15cdda4`](https://github.com/apache/spark/commit/15cdda44960f94e8218e6d922eb8c61fdfbb). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-5135][SQL] Add support for describe [ex...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3935#issuecomment-70243143 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25654/ Test PASSed.
[GitHub] spark pull request: [SPARK-5135][SQL] Add support for describe [ex...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3935#issuecomment-70243136 [Test build #25654 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25654/consoleFull) for PR 3935 at commit [`9d22708`](https://github.com/apache/spark/commit/9d22708ee3360bc0f9f06a5428f01ff3c7acb48d). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `case class DDLDescribeCommand(`
[GitHub] spark pull request: [SPARK-5135][SQL] Add support for describe [ex...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3935#issuecomment-70244543 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25655/ Test PASSed.
[GitHub] spark pull request: [SPARK-5135][SQL] Add support for describe [ex...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3935#issuecomment-70244538 [Test build #25655 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25655/consoleFull) for PR 3935 at commit [`5b7ae19`](https://github.com/apache/spark/commit/5b7ae19dfdc4410f1018193e0b1701abf799c439). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `case class DDLDescribeCommand(`
[GitHub] spark pull request: SPARK-5217 Spark UI should report pending stag...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4043#issuecomment-70247730 [Test build #25656 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25656/consoleFull) for PR 4043 at commit [`15cdda4`](https://github.com/apache/spark/commit/15cdda44960f94e8218e6d922eb8c61fdfbb). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: SPARK-5217 Spark UI should report pending stag...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4043#issuecomment-70247735 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25656/ Test PASSed.
[GitHub] spark pull request: [SPARK-5282][mllib]: fix int overflow in the w...
GitHub user hhbyyh opened a pull request: https://github.com/apache/spark/pull/4069 [SPARK-5282][mllib]: fix int overflow in the warning JIRA: https://issues.apache.org/jira/browse/SPARK-5282 fix the possible int overflow in the memory computation warning You can merge this pull request into a Git repository by running: $ git pull https://github.com/hhbyyh/spark addscStop Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/4069.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #4069 commit 7afac23e476ec2d28fe38fcf4b9377593cd7bfc3 Author: Yuhao Yang hhb...@gmail.com Date: 2015-01-17T12:31:22Z 5282: fix int overflow in the warning
[GitHub] spark pull request: [SPARK-5282][mllib]: RowMatrix easily gets int...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4069#issuecomment-70248153 [Test build #25657 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25657/consoleFull) for PR 4069 at commit [`7afac23`](https://github.com/apache/spark/commit/7afac23e476ec2d28fe38fcf4b9377593cd7bfc3). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4630][Core]Dynamically determine optima...
GitHub user lianhuiwang opened a pull request: https://github.com/apache/spark/pull/4070 [SPARK-4630][Core]Dynamically determine optimal number of partitions
Stages in an application process different amounts of data. If the user does not set numPartitions for any stage, Spark uses the same defaultParallelism as the number of partitions. In DAGScheduler, the number of tasks a stage runs equals the number of partitions of that stage, so this number is usually the same for every stage. If a stage has too few partitions, each task has to process a large amount of data and slows down due to spilling or GC. If a stage has too many partitions, scheduling becomes expensive. To improve application performance, we need to determine the optimal number of partitions according to the stage's input data size. There are two steps:
1. Estimate the number of the stage's partitions. From the input data size of the stage's parent stages and the spark.reduce.per.partition.bytes configuration, we can determine the number of the stage's partitions. How do we get the parent stages' input data size? If a stage has no parent, we get its input size by accumulating the lengths of its input paths. Else if it has parents but they are not available, we use its parents' input data size. Else if its parents are available, we simply compute its parents' shuffle data size as its input size.
2. Update the stage's Partitioner. First, update the partitions of the parents' shuffleDep, which makes each shuffleMapTask write the wanted number of partition files. Then update the stage's information, particularly the stage's RDD, so that the stage's shuffleRDD can correctly pull data from the map tasks.
TODO:
1. Consider spark.shuffle.memoryFraction when determining the spark.reduce.per.partition.bytes configuration.
2. When the stage is the final stage, resultHandler's value cannot be returned to SparkContext because the partitions have been changed.
3. When the number of a stage's tasks has changed, report the stage's new information to the UI.
@ksakellis @sryza @JoshRosen @rxin
You can merge this pull request into a Git repository by running: $ git pull https://github.com/lianhuiwang/spark SPARK-4630 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/4070.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #4070 commit fc652a5aefe4d694d0c3374dce6b3c4415c7be56 Author: lianhuiwang lianhuiwan...@gmail.com Date: 2015-01-16T12:38:03Z Dynamically determine optimal number of partitions
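Editorial sketch (not part of the message above): the estimation step in the PR description could look roughly like the following. This is not code from the pull request. The object and method names are hypothetical, and the rounding rule (round up, with at least one partition) is an assumption, since the description only says that the partition count is derived from the input size and the spark.reduce.per.partition.bytes setting.

```scala
object PartitionEstimate {
  /** Hypothetical helper: derive a partition count from a stage's input size.
    *
    * @param inputBytes        estimated input size of the stage, in bytes
    * @param bytesPerPartition target bytes per partition (the role the PR gives
    *                          to spark.reduce.per.partition.bytes)
    */
  def estimatePartitions(inputBytes: Long, bytesPerPartition: Long): Int = {
    require(inputBytes >= 0, "inputBytes must be non-negative")
    require(bytesPerPartition > 0, "bytesPerPartition must be positive")
    // Ceiling division in Long arithmetic (assumed rounding rule).
    val remainder = if (inputBytes % bytesPerPartition == 0) 0L else 1L
    val n = inputBytes / bytesPerPartition + remainder
    // Clamp to [1, Int.MaxValue] so an empty input still gets one partition.
    math.max(1L, math.min(n, Int.MaxValue.toLong)).toInt
  }
}
```

With this sketch, an input of 1000 bytes and a target of 256 bytes per partition yields 4 partitions, and an empty input yields 1.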
[GitHub] spark pull request: [SPARK-4630][Core]Dynamically determine optima...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4070#issuecomment-70249049 [Test build #25658 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25658/consoleFull) for PR 4070 at commit [`fc652a5`](https://github.com/apache/spark/commit/fc652a5aefe4d694d0c3374dce6b3c4415c7be56). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4630][Core]Dynamically determine optima...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4070#issuecomment-70249150 [Test build #25658 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25658/consoleFull) for PR 4070 at commit [`fc652a5`](https://github.com/apache/spark/commit/fc652a5aefe4d694d0c3374dce6b3c4415c7be56). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class HashPartitioner(var partitions: Int) extends Partitioner `
[GitHub] spark pull request: [SPARK-4630][Core]Dynamically determine optima...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4070#issuecomment-70249152 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25658/ Test FAILed.
[GitHub] spark pull request: [SPARK-4630][Core]Dynamically determine optima...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4070#issuecomment-70249983 [Test build #25659 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25659/consoleFull) for PR 4070 at commit [`668926c`](https://github.com/apache/spark/commit/668926cc681320b1d2fec191c9e7917384dc4803). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4630][Core]Dynamically determine optima...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4070#issuecomment-70250150 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25659/ Test FAILed.
[GitHub] spark pull request: [SPARK-4630][Core]Dynamically determine optima...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4070#issuecomment-70250149 [Test build #25659 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25659/consoleFull) for PR 4070 at commit [`668926c`](https://github.com/apache/spark/commit/668926cc681320b1d2fec191c9e7917384dc4803). * This patch **fails to build**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class HashPartitioner(var partitions: Int) extends Partitioner `
[GitHub] spark pull request: [SPARK-4630][Core]Dynamically determine optima...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4070#issuecomment-70250923 [Test build #25660 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25660/consoleFull) for PR 4070 at commit [`8b7216f`](https://github.com/apache/spark/commit/8b7216fc539ec656cdcf79b04d34da3b8b708423). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-5282][mllib]: RowMatrix easily gets int...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4069#issuecomment-70255261 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25657/ Test PASSed.
[GitHub] spark pull request: [SPARK-5282][mllib]: RowMatrix easily gets int...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4069#issuecomment-70255251 [Test build #25657 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25657/consoleFull) for PR 4069 at commit [`7afac23`](https://github.com/apache/spark/commit/7afac23e476ec2d28fe38fcf4b9377593cd7bfc3). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: Spark 3883: SSL support for HttpServer and Akk...
Github user jacek-lewandowski commented on the pull request: https://github.com/apache/spark/pull/3571#issuecomment-70258127 @vanzin I've applied your comments. I've tested on a 3-node Spark cluster and tried submitting an application from OSX to a Linux cluster in both client and cluster deploy modes (in standalone mode). Everything seems to be working fine.
[GitHub] spark pull request: Spark 3883: SSL support for HttpServer and Akk...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3571#issuecomment-70258426 [Test build #25661 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25661/consoleFull) for PR 3571 at commit [`271d166`](https://github.com/apache/spark/commit/271d16663020386fdc02a6314ee293c9a51bd452). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4630][Core]Dynamically determine optima...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4070#issuecomment-70258988 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25660/ Test FAILed.
[GitHub] spark pull request: [SPARK-4630][Core]Dynamically determine optima...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4070#issuecomment-70258982 [Test build #25660 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25660/consoleFull) for PR 4070 at commit [`8b7216f`](https://github.com/apache/spark/commit/8b7216fc539ec656cdcf79b04d34da3b8b708423). * This patch **fails MiMa tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class HashPartitioner(var partitions: Int) extends Partitioner `
[GitHub] spark pull request: [SPARK-4630][Core]Dynamically determine optima...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4070#issuecomment-70263224 [Test build #25662 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25662/consoleFull) for PR 4070 at commit [`622e45c`](https://github.com/apache/spark/commit/622e45c2bebea7a6206fbf29569e766f57a7f17b). * This patch merges cleanly.
[GitHub] spark pull request: Merge pull request #1 from apache/master
GitHub user yuyongyang800 opened a pull request: https://github.com/apache/spark/pull/4071 Merge pull request #1 from apache/master Update You can merge this pull request into a Git repository by running: $ git pull https://github.com/yuyongyang800/spark master Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/4071.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #4071 commit 5fbac656768b808329f8c5ec01620dfa82fd0806 Author: yuyongyang800 yuyongyang...@gmail.com Date: 2015-01-14T03:05:21Z Merge pull request #1 from apache/master Update
[GitHub] spark pull request: [SPARK-5282][mllib]: RowMatrix easily gets int...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/4069#discussion_r23084566
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala ---
@@ -131,7 +131,7 @@ class RowMatrix(
 throw new IllegalArgumentException(s"Argument with more than 65535 cols: $cols")
 }
 if (cols > 10000) {
- val mem = cols * cols * 8
+ val mem = (math.pow(cols, 2) * 8).formatted("%,.0f")
--- End diff --
I know it's minor, but I think it might be simpler to report megabytes in the warning message, since the reported size is at least 800MB. Then this can be simply `val memMB = (cols.toLong * cols) / 125000`; that multiplication can't overflow.
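Editorial sketch (not part of the message above): the overflow under discussion and the suggested Long-based fix can be reproduced in isolation. The object and method names below are hypothetical. Only the two expressions, `cols * cols * 8` and `(cols.toLong * cols) / 125000`, come from the thread.

```scala
object MemWarning {
  // Original expression: all operands are Int, so the product wraps around
  // silently once it exceeds Int.MaxValue (2147483647).
  def memBytesInt(cols: Int): Int = cols * cols * 8

  // Suggested form: widen to Long before multiplying. Dividing by 125000
  // converts "cols * cols * 8 bytes" into megabytes (8 / 1000000 = 1 / 125000).
  def memMB(cols: Int): Long = (cols.toLong * cols) / 125000
}
```

For cols = 20000 the Int version returns a negative number, because 20000 * 20000 * 8 = 3,200,000,000 does not fit in an Int, while `memMB(10000)` gives the 800 MB figure mentioned in the comment. Since cols is capped at 65535 by the earlier check, the Long product stays far below Long.MaxValue.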
[GitHub] spark pull request: Merge pull request #1 from apache/master
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4071#issuecomment-70265345 Can one of the admins verify this patch?
[GitHub] spark pull request: Merge pull request #1 from apache/master
Github user yuyongyang800 closed the pull request at: https://github.com/apache/spark/pull/4071
[GitHub] spark pull request: Spark 3883: SSL support for HttpServer and Akk...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3571#issuecomment-70268752 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25661/ Test PASSed.
[GitHub] spark pull request: Spark 3883: SSL support for HttpServer and Akk...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3571#issuecomment-70268740 [Test build #25661 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25661/consoleFull) for PR 3571 at commit [`271d166`](https://github.com/apache/spark/commit/271d16663020386fdc02a6314ee293c9a51bd452). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-5249] Added type specific set functions...
Github user AdamGS commented on the pull request: https://github.com/apache/spark/pull/4042#issuecomment-70269203 It's just a small thing that bothers me, so I decided to try getting it into the code base. Should I push the fixed version (two set/setIfMissing methods)?
[GitHub] spark pull request: [SPARK-5186] [MLLIB] Vector.equals and Vector....
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3997#issuecomment-70269269 [Test build #25663 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25663/consoleFull) for PR 3997 at commit [`985e160`](https://github.com/apache/spark/commit/985e16088196c2fb808c63c67ac7ee420edb4efa). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-5186] [MLLIB] Vector.equals and Vector....
Github user hhbyyh commented on the pull request: https://github.com/apache/spark/pull/3997#issuecomment-70270369 @mengxr I did some further testing and found that the perf difference is caused by the heap access. Thanks for pointing it out. I sent an update containing the locality improvement. @srowen you may notice I changed the implementation back a little and now use the allEqual inspired by @mengxr :)