[GitHub] spark pull request: [SPARK-4952][Core]Handle ConcurrentModificatio...
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/3788

[SPARK-4952][Core] Handle ConcurrentModificationExceptions in SparkEnv.environmentDetails

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/witgo/spark SPARK-4952

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3788.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3788

commit d903529819f090288f6acfb666873f9ac01990be
Author: GuoQiang Li wi...@qq.com
Date: 2014-12-24T07:56:59Z

    Handle ConcurrentModificationExceptions in SparkEnv.environmentDetails

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.

---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4952][Core]Handle ConcurrentModificatio...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3788#issuecomment-68034841

[Test build #24778 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24778/consoleFull) for PR 3788 at commit [`d903529`](https://github.com/apache/spark/commit/d903529819f090288f6acfb666873f9ac01990be).

* This patch merges cleanly.
[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3778#issuecomment-68035013

[Test build #24773 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24773/consoleFull) for PR 3778 at commit [`8c0316f`](https://github.com/apache/spark/commit/8c0316f8454f0ac8268f98d9a4c9cc29baedbf5b).

* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `abstract class CombinePredicate extends BinaryPredicate`
  * `case class And(left: Expression, right: Expression) extends CombinePredicate`
  * `case class Or(left: Expression, right: Expression) extends CombinePredicate`
  * `implicit class CombinePredicateExtension(source: CombinePredicate)`
  * `implicit class ExpressionCookies(expression: Expression)`
[GitHub] spark pull request: [WIP][SPARK-4937][SQL] Adding optimization to ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3778#issuecomment-68035016

Test FAILed.
Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24773/
Test FAILed.
[GitHub] spark pull request: [SPARK-4937][SQL] Normalizes conjunctions and ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3784#issuecomment-68035062

[Test build #24779 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24779/consoleFull) for PR 3784 at commit [`4ab3a58`](https://github.com/apache/spark/commit/4ab3a58fe8a86bc8f08fa0007d88022b3021e0e6).

* This patch merges cleanly.
[GitHub] spark pull request: [Minor] Fix a typo of type parameter in JavaUt...
GitHub user sarutak opened a pull request: https://github.com/apache/spark/pull/3789

[Minor] Fix a typo of type parameter in JavaUtils.scala

In JavaUtils.scala, there is a typo in a type parameter. In addition, the type information is removed at compile time by erasure. This issue is really minor, so I haven't filed a JIRA ticket.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sarutak/spark fix-typo-in-javautils

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3789.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3789

commit 99f6f6342b98156a5f7771b0dd0d50c4e0f21a8c
Author: Kousuke Saruta saru...@oss.nttdata.co.jp
Date: 2014-12-24T08:05:51Z

    Fixed a typo of type parameter in JavaUtils.scala
[GitHub] spark pull request: [Minor] Fix a typo of type parameter in JavaUt...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3789#issuecomment-68035307

[Test build #24780 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24780/consoleFull) for PR 3789 at commit [`99f6f63`](https://github.com/apache/spark/commit/99f6f6342b98156a5f7771b0dd0d50c4e0f21a8c).

* This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4723] [CORE] To abort the stages which ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3786#issuecomment-68035714

Test PASSed.
Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24767/
Test PASSed.
[GitHub] spark pull request: [SPARK-4723] [CORE] To abort the stages which ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3786#issuecomment-68035711

[Test build #24767 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24767/consoleFull) for PR 3786 at commit [`003774a`](https://github.com/apache/spark/commit/003774ab2dea5c0f6fd70e68c385178cc235d1c2).

* This patch **passes all tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4949] [SPARK-4949] shutdownCallback in ...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/3781#discussion_r22248980

--- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/SparkDeploySchedulerBackend.scala ---

    @@ -31,16 +31,17 @@ private[spark] class SparkDeploySchedulerBackend(
       with AppClientListener
       with Logging {

    -  var client: AppClient = null
    -  var stopping = false
    -  var shutdownCallback: (SparkDeploySchedulerBackend) => Unit = _
    -  @volatile var appId: String = _
    +  private var client: AppClient = null
    +  private var stopping = false
    +  private val shutdownCallbackLock = new Object()
    +  private var shutdownCallback: (SparkDeploySchedulerBackend) => Unit = _
    +  @volatile private var appId: String = _

    -  val registrationLock = new Object()
    -  var registrationDone = false
    +  private val registrationLock = new Object()

--- End diff --

On the one hand, this sounds like it could be an `AtomicBoolean`. On the other hand, this whole mechanism could be replaced by something more robust from `java.util.concurrent`.
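[Editor's note] One concrete reading of the `java.util.concurrent` suggestion above is to replace the lock-plus-flag pair with a `CountDownLatch`. The sketch below is illustrative only; the class and method names are hypothetical and do not appear in the PR.

```scala
import java.util.concurrent.CountDownLatch

// Hypothetical sketch: a CountDownLatch standing in for the
// registrationLock/registrationDone pair. countDown() is idempotent and
// await() returns immediately once the latch reaches zero, so no explicit
// boolean flag or synchronized block is needed.
class RegistrationBarrier {
  private val registered = new CountDownLatch(1)

  // Called from the AppClient callback thread when registration succeeds.
  def notifyRegistered(): Unit = registered.countDown()

  // Called from the scheduler thread; blocks until registration completes.
  def waitForRegistration(): Unit = registered.await()
}
```

A latch also avoids the missed-notification hazard of a bare `wait`/`notify` pair: a thread that calls `await()` after the count has already reached zero simply proceeds.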
[GitHub] spark pull request: [SPARK-4949] [SPARK-4949] shutdownCallback in ...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/3781#discussion_r22248985

--- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/SparkDeploySchedulerBackend.scala ---

    @@ -82,8 +83,11 @@ private[spark] class SparkDeploySchedulerBackend(
         stopping = true
         super.stop()
         client.stop()
    -    if (shutdownCallback != null) {
    -      shutdownCallback(this)
    +
    +    shutdownCallbackLock.synchronized {

--- End diff --

This doesn't work since `shutdownCallbackLock` may be `null`.
[GitHub] spark pull request: [SPARK-4949] [SPARK-4949] shutdownCallback in ...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/3781#discussion_r22249000

--- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/SparkDeploySchedulerBackend.scala ---

    @@ -31,16 +31,17 @@ private[spark] class SparkDeploySchedulerBackend(
       with AppClientListener
       with Logging {

    -  var client: AppClient = null
    -  var stopping = false
    -  var shutdownCallback: (SparkDeploySchedulerBackend) => Unit = _
    -  @volatile var appId: String = _
    +  private var client: AppClient = null
    +  private var stopping = false
    +  private val shutdownCallbackLock = new Object()

--- End diff --

Same for the new lock.
[GitHub] spark pull request: [SPARK-4953][Doc] Fix the description of build...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3787#issuecomment-68036584

[Test build #24769 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24769/consoleFull) for PR 3787 at commit [`264e4e0`](https://github.com/apache/spark/commit/264e4e0ce01e5f41eb60413249219ff98864dc0c).

* This patch **passes all tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4953][Doc] Fix the description of build...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3787#issuecomment-68036586

Test PASSed.
Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24769/
Test PASSed.
[GitHub] spark pull request: [SPARK-4409][MLlib] Additional Linear Algebra ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3319#issuecomment-68036616

[Test build #24774 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24774/consoleFull) for PR 3319 at commit [`04c4829`](https://github.com/apache/spark/commit/04c4829d8364a36314485d6bdceed5ab93c67398).

* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4409][MLlib] Additional Linear Algebra ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3319#issuecomment-68036621

Test FAILed.
Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24774/
Test FAILed.
[GitHub] spark pull request: [SPARK-3586][streaming]Support nested director...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2765#issuecomment-68036640

[Test build #24770 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24770/consoleFull) for PR 2765 at commit [`ce86bcc`](https://github.com/apache/spark/commit/ce86bcc5be8a790245787f75dfd2cba51ab50f55).

* This patch **fails MiMa tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3586][streaming]Support nested director...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2765#issuecomment-68036643

Test FAILed.
Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24770/
Test FAILed.
[GitHub] spark pull request: [SPARK-4953][Doc] Fix the description of build...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/3787#discussion_r22249140

--- Diff: docs/building-spark.md ---

    @@ -60,20 +60,29 @@
     mvn -Dhadoop.version=2.0.0-mr1-cdh4.2.0 -DskipTests clean package
     mvn -Phadoop-0.23 -Dhadoop.version=0.23.7 -DskipTests clean package
     {% endhighlight %}

    -For Apache Hadoop 2.x, 0.23.x, Cloudera CDH, and other Hadoop versions with YARN, you can enable the yarn profile and optionally set the yarn.version property if it is different from hadoop.version. As of Spark 1.3, Spark only supports YARN versions 2.2.0 and later.
    +For Apache Hadoop 2.2.0 and later and Cloudera CDH 5 with YARN, you can enable the yarn profile and optionally set the yarn.version property if it is different from hadoop.version. As of Spark 1.3, Spark only supports YARN versions 2.2.0 and later.

     Examples:

     {% highlight bash %}
     # Apache Hadoop 2.2.X
    -mvn -Pyarn -Phadoop-2.2 -Dhadoop.version=2.2.0 -DskipTests clean package
    +mvn -Pyarn -Phadoop-2.2 -Dhadoop.version=2.2.X -DskipTests clean package

--- End diff --

This is wrong, since 2.2.X is not a version. This is intended to be an executable example.
[GitHub] spark pull request: [SPARK-4953][Doc] Fix the description of build...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/3787#discussion_r22249130

--- Diff: docs/building-spark.md ---

    @@ -60,20 +60,29 @@
     mvn -Dhadoop.version=2.0.0-mr1-cdh4.2.0 -DskipTests clean package
     mvn -Phadoop-0.23 -Dhadoop.version=0.23.7 -DskipTests clean package
     {% endhighlight %}

    -For Apache Hadoop 2.x, 0.23.x, Cloudera CDH, and other Hadoop versions with YARN, you can enable the yarn profile and optionally set the yarn.version property if it is different from hadoop.version. As of Spark 1.3, Spark only supports YARN versions 2.2.0 and later.
    +For Apache Hadoop 2.2.0 and later and Cloudera CDH 5 with YARN, you can enable the yarn profile and optionally set the yarn.version property if it is different from hadoop.version. As of Spark 1.3, Spark only supports YARN versions 2.2.0 and later.

--- End diff --

This is not only applicable to CDH *5* and later, so I'd revert that addition. What was removed with `yarn-alpha` was not really Hadoop 0.23 support, although it kind of lines up with that. Why not remove this whole qualifying "For Apache Hadoop ..." phrase altogether? Also, do you mean Spark 1.2? What are you referring to in 1.3 otherwise?
[GitHub] spark pull request: [SPARK-4953][Doc] Fix the description of build...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3787#issuecomment-68036986

[Test build #24771 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24771/consoleFull) for PR 3787 at commit [`9ab0c24`](https://github.com/apache/spark/commit/9ab0c24e440972ba861ceae75767847fbce96f91).

* This patch **passes all tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `case class ApplicationFinished(id: String)`
[GitHub] spark pull request: [SPARK-4953][Doc] Fix the description of build...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3787#issuecomment-68036994

Test PASSed.
Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24771/
Test PASSed.
[GitHub] spark pull request: [SPARK-4953][Doc] Fix the description of build...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/3787#discussion_r22249225

--- Diff: docs/building-spark.md ---

    @@ -60,20 +60,29 @@
     {% highlight bash %}
     # Apache Hadoop 2.2.X
    -mvn -Pyarn -Phadoop-2.2 -Dhadoop.version=2.2.0 -DskipTests clean package
    +mvn -Pyarn -Phadoop-2.2 -Dhadoop.version=2.2.X -DskipTests clean package

     # Apache Hadoop 2.3.X
    -mvn -Pyarn -Phadoop-2.3 -Dhadoop.version=2.3.0 -DskipTests clean package
    +mvn -Pyarn -Phadoop-2.3 -Dhadoop.version=2.3.X -DskipTests clean package

     # Apache Hadoop 2.4.X or 2.5.X
     mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=VERSION -DskipTests clean package

    +# Cloudera CDH 5.0.X
    +mvn -Pyarn -Phadoop-2.3 -Dhadoop.version=2.3.0-cdh5.0.X -DskipTests clean package
    +
    +# Cloudera CDH 5.1.X
    +mvn -Pyarn -Phadoop-2.3 -Dhadoop.version=2.3.0-cdh5.1.X -DskipTests clean package
    +
    +# Cloudera CDEH 5.2.X or 5.3.X

--- End diff --

This has a typo in CDEH, and these examples are also not runnable. I don't see much value in elaborating this example three more times. (As a related aside, I would like to see less, not more, vendor stuff in Spark anyway. Adding just this text unduly favors Cloudera a tiny bit; the alternative is to write a bunch of other vendor combos here, which is going to turn into at least a maintenance headache. I already disagree with maintaining vendor versioning info in the project POM.)
[GitHub] spark pull request: [SPARK-4951][Core] Fix the issue that a busy e...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3783#issuecomment-68037376

[Test build #24772 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24772/consoleFull) for PR 3783 at commit [`105ba3a`](https://github.com/apache/spark/commit/105ba3acea521a77122a016faa6674793d1ff696).

* This patch **passes all tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4951][Core] Fix the issue that a busy e...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3783#issuecomment-68037381

Test PASSed.
Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24772/
Test PASSed.
[GitHub] spark pull request: [Minor] Fix a typo of type parameter in JavaUt...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/3789#discussion_r22249340

--- Diff: core/src/main/scala/org/apache/spark/api/java/JavaUtils.scala ---

    @@ -80,7 +80,7 @@ private[spark] object JavaUtils {
         prev match {
           case Some(k) =>
             underlying match {
    -          case mm: mutable.Map[a, _] =>
    +          case mm: mutable.Map[_, _] =>

--- End diff --

Should this really be `A`, to express the relation to the generic bound? Although `underlying` must already have keys of type `A`, it looks like that was the intent.
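[Editor's note] To illustrate the erasure point behind this change: the type arguments in such a pattern are not checked at runtime, so the named-parameter and wildcard forms match exactly the same values. This is a standalone sketch, not the PR's code; `describe` is a hypothetical helper.

```scala
import scala.collection.mutable

// Because of type erasure, the runtime check below only asks "is this a
// mutable.Map?"; the type arguments in the pattern are unchecked, which is
// why `mutable.Map[a, _]` and `mutable.Map[_, _]` behave identically here.
// The wildcard form merely makes that explicit (and avoids a compiler
// warning about an unchecked type parameter).
def describe(underlying: Any): String = underlying match {
  case mm: mutable.Map[_, _] => s"mutable map with ${mm.size} entries"
  case _                     => "not a mutable map"
}
```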
[GitHub] spark pull request: [SPARK-4937][SQL] Normalizes conjunctions and ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3784#issuecomment-68037960

[Test build #24776 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24776/consoleFull) for PR 3784 at commit [`3cf7937`](https://github.com/apache/spark/commit/3cf7937bf2c2631b3a313e5873d7f7d0b853203f).

* This patch **passes all tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4937][SQL] Normalizes conjunctions and ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3784#issuecomment-68037961

Test PASSed.
Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24776/
Test PASSed.
[GitHub] spark pull request: [SPARK-4952][Core]Handle ConcurrentModificatio...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/3788#discussion_r22249654

--- Diff: core/src/main/scala/org/apache/spark/SparkEnv.scala ---

    @@ -395,7 +395,7 @@ object SparkEnv extends Logging {
         val sparkProperties = (conf.getAll ++ schedulerMode).sorted

         // System properties that are not java classpaths
    -    val systemProperties = System.getProperties.iterator.toSeq
    +    val systemProperties = Utils.getSystemProperties.toSeq

--- End diff --

It wasn't clear to me at first whether this is the culprit, but it looks so, since the underlying object being modified is a `java.util.Properties`. The defensive copy made in `Utils` should be thread-safe in the sense that `Hashtable.clone()` is `synchronized`. LGTM.
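[Editor's note] The defensive-copy pattern this review describes can be sketched as follows. This is a reconstruction, not Spark's actual `Utils.getSystemProperties`: cloning the `Properties` object (via the synchronized `Hashtable.clone()`) yields a private snapshot that can be iterated safely even while other threads call `System.setProperty`.

```scala
import java.util.Properties
import scala.collection.JavaConverters._

// Sketch of a defensive copy of the system properties. Hashtable.clone()
// is synchronized, so the copy itself cannot race a concurrent
// System.setProperty call; iterating the snapshot afterwards is safe
// because no other thread holds a reference to it.
def systemPropertiesSnapshot(): Map[String, String] = {
  val snapshot = System.getProperties.clone().asInstanceOf[Properties]
  snapshot.stringPropertyNames().asScala
    .map(name => name -> snapshot.getProperty(name))
    .toMap
}
```

By contrast, iterating `System.getProperties` directly walks the live, shared `Hashtable`, which is exactly how a concurrent `setProperty` can trigger a `ConcurrentModificationException` mid-iteration.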
[GitHub] spark pull request: [SPARK-4409][MLlib] Additional Linear Algebra ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3319#issuecomment-68038204 [Test build #24775 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24775/consoleFull) for PR 3319 at commit [`b0354f6`](https://github.com/apache/spark/commit/b0354f616f7f49ee9b19f6b8e5d0dc775b05dba2). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4409][MLlib] Additional Linear Algebra ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3319#issuecomment-68038211 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24775/
[GitHub] spark pull request: [SPARK-4937][SQL] Normalizes conjunctions and ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3784#issuecomment-68038291 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24777/
[GitHub] spark pull request: [SPARK-4937][SQL] Normalizes conjunctions and ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3784#issuecomment-68038290 [Test build #24777 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24777/consoleFull) for PR 3784 at commit [`0e51101`](https://github.com/apache/spark/commit/0e511019c6ee3faadd81f860adde5ee7bc6e4778). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4954][Core] add spark version infomatio...
GitHub user liyezhang556520 opened a pull request: https://github.com/apache/spark/pull/3790 [SPARK-4954][Core] add spark version infomation in log for standalone mode

The Master and Worker Spark version may not be the same as the driver's Spark version, because the Spark jar file might be replaced for a new application without restarting the Spark cluster. We should therefore log the Spark version in both the Master and Worker logs.

You can merge this pull request into a Git repository by running: $ git pull https://github.com/liyezhang556520/spark version4Standalone Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3790.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3790 commit e05e1e3e0bf747adc1f0c3e4c6461e92b5368c23 Author: Zhang, Liye liye.zh...@intel.com Date: 2014-12-24T08:46:06Z add spark version infomation in log for standalone mode
[GitHub] spark pull request: [SPARK-4954][Core] add spark version infomatio...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3790#issuecomment-68038435 [Test build #24781 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24781/consoleFull) for PR 3790 at commit [`e05e1e3`](https://github.com/apache/spark/commit/e05e1e3e0bf747adc1f0c3e4c6461e92b5368c23). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4937][SQL] Normalizes conjunctions and ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3784#issuecomment-68039132 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24779/
[GitHub] spark pull request: [SPARK-4937][SQL] Normalizes conjunctions and ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3784#issuecomment-68039127 [Test build #24779 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24779/consoleFull) for PR 3784 at commit [`4ab3a58`](https://github.com/apache/spark/commit/4ab3a58fe8a86bc8f08fa0007d88022b3021e0e6). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4952][Core]Handle ConcurrentModificatio...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3788#issuecomment-68039339 [Test build #24778 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24778/consoleFull) for PR 3788 at commit [`d903529`](https://github.com/apache/spark/commit/d903529819f090288f6acfb666873f9ac01990be). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4952][Core]Handle ConcurrentModificatio...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3788#issuecomment-68039346 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24778/
[GitHub] spark pull request: SPARK-4159 [CORE] Maven build doesn't run JUni...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/3651#issuecomment-68039924 @JoshRosen I found that removing the `SPARK_HOME` config doesn't seem to matter; the REPL and YARN tests still pass. OK to remove that config in this PR, do you think?
[GitHub] spark pull request: [Minor] Fix a typo of type parameter in JavaUt...
Github user sarutak commented on a diff in the pull request: https://github.com/apache/spark/pull/3789#discussion_r22250287

--- Diff: core/src/main/scala/org/apache/spark/api/java/JavaUtils.scala ---
@@ -80,7 +80,7 @@ private[spark] object JavaUtils {
     prev match {
       case Some(k) => underlying match {
-        case mm: mutable.Map[a, _] =>
+        case mm: mutable.Map[_, _] =>
--- End diff --

I thought it should be `A`, but the type parameter at that position is meaningless because erasure removes type information at compile time. Even so, should we use `A` for readability?
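sarutak's observation is the standard JVM erasure rule: type arguments such as the `a` in `mutable.Map[a, _]` do not exist at runtime, so a pattern match can only check the raw class, never the type parameter. A small Java illustration of the same rule:

```java
import java.util.ArrayList;
import java.util.List;

public class ErasureDemo {
    public static void main(String[] args) {
        List<String> strings = new ArrayList<>();
        List<Integer> ints = new ArrayList<>();
        // After erasure both are just ArrayList: the element type parameter
        // is gone at runtime, so a class/instanceof check can only see the
        // raw class, never the type argument.
        System.out.println(strings.getClass() == ints.getClass());  // prints "true"
    }
}
```

This is why Scala's `Map[a, _]` and `Map[_, _]` compile to the same runtime check, and why choosing `A` over `_` is purely a readability question.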
[GitHub] spark pull request: [Minor] Fix a typo of type parameter in JavaUt...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3789#issuecomment-68040214 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24780/
[GitHub] spark pull request: [Minor] Fix a typo of type parameter in JavaUt...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3789#issuecomment-68040208 [Test build #24780 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24780/consoleFull) for PR 3789 at commit [`99f6f63`](https://github.com/apache/spark/commit/99f6f6342b98156a5f7771b0dd0d50c4e0f21a8c). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4949]shutdownCallback in SparkDeploySch...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3781#issuecomment-68040463 [Test build #24782 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24782/consoleFull) for PR 3781 at commit [`1b60fd1`](https://github.com/apache/spark/commit/1b60fd19bd0ef79e72fe568cf03d6976c7c32f97). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4913] Fix incorrect event log path
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/3755#issuecomment-68040630 Thanks for your suggestion too.
[GitHub] spark pull request: [SPARK-4953][Doc] Fix the description of build...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3787#issuecomment-68041088 [Test build #24783 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24783/consoleFull) for PR 3787 at commit [`ee9c355`](https://github.com/apache/spark/commit/ee9c355dfc516dc612906be33c6baa9090cade0b). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-2458] Make failed application log visib...
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/3467#issuecomment-68043299 Thank you for your comments! I'm going to do it in a few days!
[GitHub] spark pull request: [SPARK-4954][Core] add spark version infomatio...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3790#issuecomment-68043368 [Test build #24781 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24781/consoleFull) for PR 3790 at commit [`e05e1e3`](https://github.com/apache/spark/commit/e05e1e3e0bf747adc1f0c3e4c6461e92b5368c23). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4954][Core] add spark version infomatio...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3790#issuecomment-68043372 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24781/
[GitHub] spark pull request: [SPARK-4949]shutdownCallback in SparkDeploySch...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3781#issuecomment-68045308 [Test build #24782 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24782/consoleFull) for PR 3781 at commit [`1b60fd1`](https://github.com/apache/spark/commit/1b60fd19bd0ef79e72fe568cf03d6976c7c32f97). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4949]shutdownCallback in SparkDeploySch...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3781#issuecomment-68045313 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24782/
[GitHub] spark pull request: [SPARK-4951][Core] Fix the issue that a busy e...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3783#issuecomment-68045928 [Test build #24784 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24784/consoleFull) for PR 3783 at commit [`05f6238`](https://github.com/apache/spark/commit/05f6238e988a54aada24ce85272212717fdc8c4e). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4953][Doc] Fix the description of build...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3787#issuecomment-68045940 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24783/
[GitHub] spark pull request: [SPARK-4953][Doc] Fix the description of build...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3787#issuecomment-68045936 [Test build #24783 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24783/consoleFull) for PR 3787 at commit [`ee9c355`](https://github.com/apache/spark/commit/ee9c355dfc516dc612906be33c6baa9090cade0b). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3629][Doc] improve spark on yarn doc
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/2813#issuecomment-68045956 @ssjssh could you reorganize your PR into 2 commits: one for the additions and modifications, and the other for moving the text?
[GitHub] spark pull request: [SPARK-4951][Core] Fix the issue that a busy e...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3783#issuecomment-68050039 [Test build #24784 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24784/consoleFull) for PR 3783 at commit [`05f6238`](https://github.com/apache/spark/commit/05f6238e988a54aada24ce85272212717fdc8c4e). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4951][Core] Fix the issue that a busy e...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3783#issuecomment-68050042 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24784/
[GitHub] spark pull request: [SPARK-4858] Add an option to turn off a progr...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3709#issuecomment-68050938 [Test build #24785 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24785/consoleFull) for PR 3709 at commit [`0681403`](https://github.com/apache/spark/commit/06814035d454a9b7e444b0dc657a572a6ae2f899). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4858] Add an option to turn off a progr...
Github user maropu commented on the pull request: https://github.com/apache/spark/pull/3709#issuecomment-68050965 Fixed, please test it.
[GitHub] spark pull request: [SPARK-4723] [CORE] To abort the stages which ...
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/3786#issuecomment-68051186 I don't like the approach of saying "for some reason, something happens" and then putting in a patch to address what happens, instead of identifying and correcting the reason it happens. If anything, patching the effect in that way can make identifying the underlying cause more difficult. Maybe we'll end up using something like `maxStageRetryAttempts`, but I don't want to do so until we clearly understand why it is needed.
[GitHub] spark pull request: [SPARK-4950] Delete obsolete mapReduceTripelet...
Github user ankurdave commented on the pull request: https://github.com/apache/spark/pull/3782#issuecomment-68052655 We wanted to retain binary compatibility for the Pregel API, which prevented adding the TripletFields parameter. Instead it might be better to add a second version of the Pregel API with several changes: manually-specified TripletFields, aggregateMessages-style API, and custom vertex activeness (#1217).
[GitHub] spark pull request: [SPARK-4386] Improve performance when writing ...
Github user MickDavies commented on the pull request: https://github.com/apache/spark/pull/3254#issuecomment-68053039 @jimfcarroll - that's exactly the change I made. Performance improvements are very substantial for wide tables; as I said, in the case I was looking at it was 6x as fast, and more significant still if you consider just the processing in Spark. Thanks for checking in the improvement.
[GitHub] spark pull request: [SPARK-4858] Add an option to turn off a progr...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3709#issuecomment-68055149 [Test build #24785 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24785/consoleFull) for PR 3709 at commit [`0681403`](https://github.com/apache/spark/commit/06814035d454a9b7e444b0dc657a572a6ae2f899). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4858] Add an option to turn off a progr...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3709#issuecomment-68055152 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24785/
[GitHub] spark pull request: Vectors.sparse() add support to unsorted indic...
GitHub user hzlyx opened a pull request: https://github.com/apache/spark/pull/3791 Vectors.sparse() add support to unsorted indices

With the original method, when the indices are not strictly increasing, the sparse vector can be created without any warning or error, but when the apply() method is used, only zero will be returned.

You can merge this pull request into a Git repository by running: $ git pull https://github.com/hzlyx/spark master Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3791.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3791 commit fa3df2406cea3b73e3058510d0a6c5e4098dc22a Author: Yuxi Liao liaoy...@huawei.com Date: 2014-12-24T14:54:15Z Vectors.sparse() add support to unsorted indices For original method, when the indices is not strictly increasing, the sparse vector can be created without any warning or error. But when use apply() method, only zero will be returned.
[GitHub] spark pull request: [MLlib]Vectors.sparse() add support to unsorte...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/3791#discussion_r22257875

--- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/Vectors.scala ---

```
@@ -173,11 +173,13 @@ object Vectors {
    * Creates a sparse vector providing its index array and value array.
    *
    * @param size vector size.
-   * @param indices index array, must be strictly increasing.
-   * @param values value array, must have the same length as indices.
+   * @param indices index array.
+   * @param values value array.
    */
-  def sparse(size: Int, indices: Array[Int], values: Array[Double]): Vector =
-    new SparseVector(size, indices, values)
+  def sparse(size: Int, indices: Array[Int], values: Array[Double]): Vector = {
+    val (newIndices, newValues) = indices.zip(values).sortBy(_._1).unzip
```

End diff -- This is non-trivial overhead to introduce every time a vector is made, when the common case is that the indices are sorted, and the other cases are really caller error. I'd still suggest merely checking the sorting. There are lots of one-liners, but this may be among the most efficient: `require((1 until indices.length).forall(i => indices(i-1) <= indices(i)))`
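The trade-off srowen describes (an O(n) validity check versus an O(n log n) sort-and-copy on every construction) can be sketched outside Scala as well. The class and method below are illustrative, not Spark code; the check mirrors the suggested `require` one-liner, using strict increase since the original Scaladoc demands it.

```java
// Illustrative sketch (not Spark's implementation): validate sparse-vector
// indices instead of sorting them. A single linear pass, no allocation.
public class SparseIndexCheck {

    /** True if indices are strictly increasing, as valid sparse indices must be. */
    static boolean isStrictlyIncreasing(int[] indices) {
        for (int i = 1; i < indices.length; i++) {
            if (indices[i - 1] >= indices[i]) {
                return false;  // out of order, or a duplicate index
            }
        }
        return true;  // vacuously true for empty or single-element arrays
    }

    public static void main(String[] args) {
        System.out.println(isStrictlyIncreasing(new int[]{0, 2, 5}));  // true
        System.out.println(isStrictlyIncreasing(new int[]{2, 0, 5}));  // false
    }
}
```

The check fails fast on caller error instead of silently reordering, which is exactly the behavior the review argues for.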
[GitHub] spark pull request: [MLlib]Vectors.sparse() add support to unsorte...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3791#issuecomment-68057633 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-2687] [yarn]amClient should remove Cont...
Github user lianhuiwang closed the pull request at: https://github.com/apache/spark/pull/3245
[GitHub] spark pull request: [SPARK-2687] [yarn]amClient should remove Cont...
Github user lianhuiwang commented on the pull request: https://github.com/apache/spark/pull/3245#issuecomment-68059290 OK, I will close this PR.
[GitHub] spark pull request: [SPARK-4195][Core]retry to fetch blocks's resu...
Github user lianhuiwang commented on the pull request: https://github.com/apache/spark/pull/3061#issuecomment-68059363 OK. I will close this PR.
[GitHub] spark pull request: [SPARK-4195][Core]retry to fetch blocks's resu...
Github user lianhuiwang closed the pull request at: https://github.com/apache/spark/pull/3061
[GitHub] spark pull request: Added setMinCount to Word2Vec.scala
Github user ganonp commented on the pull request: https://github.com/apache/spark/pull/3693#issuecomment-68062042 Sorry, I didn't mean to commit that norm method for this pull request. That said, I think it makes sense for norm to be public, or at least a d=2 version of norm.
[GitHub] spark pull request: SPARK-4454 Fix race condition in DAGScheduler
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/3345#issuecomment-68068199 Ah yes, I see now. Thanks for coming back to this one, Josh. `DAGScheduler#getPreferredLocs` is definitely broken. You're correct that the access to and potential update of `cacheLocs` needs to be routed through the actor. But because of the need to return the preferred locations, this will be a little different from the fire-and-forget messages that are currently sent to the `eventProcessActor`, and will need to use the [`ask`](http://doc.akka.io/docs/akka/2.3.4/scala/actors.html#Ask__Send-And-Receive-Future) pattern instead. Something that also concerns me in looking at the usages of `SparkContext#getPreferredLocs` in `CoalescedRDD` and `PartitionerAwareUnionRDD` is that they both have a `currPrefLocs` method with a comment saying it is supposed to "Get the *current* preferred locations from the DAGScheduler". I'm not sure just what the expectation or requirement there for "current" is -- current when the RDD is defined, when actions are run on it, something else? This feels like a potential race condition to me, and I am wondering whether it might make sense to make this getting of current preferred locations as lazy as possible and resolved during the execution of a job. That's just speculation as to the need for or desirability of that laziness, but I think it deserves a look.
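The ask-versus-tell distinction markhamstra raises can be illustrated without Akka: instead of a fire-and-forget message, the caller enqueues a request that carries its own reply future, and the single event-processing thread (which alone owns mutable state such as `cacheLocs`) completes it. This is a hypothetical Java sketch of the pattern, not Spark's actual scheduler code; all names are illustrative.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch of the "ask" pattern: a request message carrying its reply slot.
public class AskPatternSketch {

    static final class Ask {
        final String partition;
        final CompletableFuture<String> reply;
        Ask(String partition, CompletableFuture<String> reply) {
            this.partition = partition;
            this.reply = reply;
        }
    }

    private final BlockingQueue<Ask> mailbox = new LinkedBlockingQueue<>();
    private final Thread eventLoop = new Thread(() -> {
        try {
            while (true) {
                Ask ask = mailbox.take();
                // The single event-loop thread owns all mutable scheduler
                // state, so reading it here is race-free by construction.
                ask.reply.complete("prefLocs(" + ask.partition + ")");
            }
        } catch (InterruptedException e) {
            // shutting down
        }
    });

    public AskPatternSketch() {
        eventLoop.setDaemon(true);
        eventLoop.start();
    }

    /** Ask-style call: returns a future the caller can block on or compose. */
    public CompletableFuture<String> getPreferredLocs(String partition) {
        CompletableFuture<String> reply = new CompletableFuture<>();
        mailbox.add(new Ask(partition, reply));
        return reply;
    }

    public static void main(String[] args) {
        System.out.println(new AskPatternSketch().getPreferredLocs("rdd0-p3").join());
    }
}
```

The design point is the same as Akka's `ask`: the caller gets a future back, while the state mutation still happens on exactly one thread.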
[GitHub] spark pull request: SPARK-3655 GroupByKeyAndSortValues
Github user markhamstra commented on the pull request: https://github.com/apache/spark/pull/3632#issuecomment-68069421 The reason for separate classes is to cleanly segregate the available/supportable functionality. Not every `PairRDD` has keys that can be ordered, so `sortByKey` shouldn't be part of `PairRDD`. When keys can be ordered, there is often a natural ordering that is already implicitly in scope. When that is true, we don't want to force the user to explicitly provide an `Ordering` -- e.g. if you have an `RDD[(Int, Foo)]`, then `rdd.sortByKey()` should just work. If you want a different ordering, you just need to bring a new implicit `Ordering` for that key type into scope. Things aren't as cleanly separated in the Java API because of the lack of support for implicits there, but that doesn't mean we should abandon the separation between `PairRDD` and `OrderedRDD` on the Scala side, or start dirtying up `PairRDD.scala` when we want to provide new methods for RDDs whose keys and values can both be ordered. I really think we want to repeat the pattern of `OrderedRDD` for this `DoublyOrderedRDD` -- or whatever better name you can come up with. The biggest quirk I can see right now is if the types of both keys and values are the same but you want to order them one way when sorting by key and a different way when doing the secondary sort on values. That won't work with implicits, since there can only be one implicit `Ordering` for the type in scope at a time. The problem could either be avoided by using distinct types for the key and value roles, or a method signature with explicit orderings could be added to address this corner case.
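The explicit-orderings escape hatch suggested at the end of the comment above can be sketched in Java, where comparators are always explicit anyway: even when keys and values share a type, two different comparators can be passed side by side. Names below are illustrative, not any Spark API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Sketch of "sort by key, then secondary-sort values" with explicit orderings.
public class SecondarySortSketch {

    /** Sort pairs by key using keyOrd, breaking ties on value using valOrd. */
    static <K, V> void sortByKeyThenValue(List<Map.Entry<K, V>> pairs,
                                          Comparator<? super K> keyOrd,
                                          Comparator<? super V> valOrd) {
        Comparator<Map.Entry<K, V>> cmp =
            Comparator.<Map.Entry<K, V>, K>comparing(Map.Entry::getKey, keyOrd)
                      .thenComparing(Map.Entry::getValue, valOrd);
        pairs.sort(cmp);
    }

    public static void main(String[] args) {
        // Keys and values are both Strings, yet we order keys ascending and
        // values descending -- the corner case implicits cannot express.
        List<Map.Entry<String, String>> pairs = new ArrayList<>(List.of(
            Map.entry("a", "x"), Map.entry("b", "m"), Map.entry("a", "z")));
        sortByKeyThenValue(pairs, Comparator.naturalOrder(),
                           Comparator.<String>naturalOrder().reversed());
        System.out.println(pairs);  // [a=z, a=x, b=m]
    }
}
```

Because both orderings are ordinary parameters, no implicit-resolution conflict arises even when key and value types coincide.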
[GitHub] spark pull request: Added Java serialization util functions back i...
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/3792 Added Java serialization util functions back in network/common/util/JavaUtils You can merge this pull request into a Git repository by running: $ git pull https://github.com/rxin/spark java-ser Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3792.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3792 commit 2a2ad9d6fcb98bbb7ffca1c0a5273f4ff8cb53a6 Author: Reynold Xin r...@databricks.com Date: 2014-12-24T19:24:31Z Added Java serialization util functions back in network/common/util/JavaUtils.
[GitHub] spark pull request: [Minor] Fix a typo of type parameter in JavaUt...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/3789#discussion_r22262419

--- Diff: core/src/main/scala/org/apache/spark/api/java/JavaUtils.scala ---

```
@@ -80,7 +80,7 @@ private[spark] object JavaUtils {
       prev match {
         case Some(k) =>
           underlying match {
-            case mm: mutable.Map[a, _] =>
+            case mm: mutable.Map[_, _] =>
```

End diff -- yea, for readability let's use A
[GitHub] spark pull request: Added Java serialization util functions back i...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/3792#issuecomment-68070712 cc @aarondav
[GitHub] spark pull request: Added Java serialization util functions back i...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3792#issuecomment-68070795 [Test build #24786 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24786/consoleFull) for PR 3792 at commit [`2a2ad9d`](https://github.com/apache/spark/commit/2a2ad9d6fcb98bbb7ffca1c0a5273f4ff8cb53a6). * This patch merges cleanly.
[GitHub] spark pull request: Added Java serialization util functions back i...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/3792#discussion_r22262724

--- Diff: network/common/src/main/java/org/apache/spark/network/util/JavaUtils.java ---

```
@@ -41,6 +41,34 @@ public class JavaUtils {
   private static final Logger logger = LoggerFactory.getLogger(JavaUtils.class);

+  /** Deserialize a byte array using Java serialization. */
+  public static <T> T deserialize(byte[] bytes) {
+    try {
+      ObjectInputStream is = new ObjectInputStream(new ByteArrayInputStream(bytes));
+      Object out = is.readObject();
+      is.close();
+      return (T) out;
+    } catch (ClassNotFoundException e) {
+      throw new RuntimeException("Could not deserialize object", e);
```

End diff -- Yeah, pretty standard formulation. Nit suggestion: don't throw the general `RuntimeException` but something marginally more specific like `IllegalStateException`.
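For reference, the standard serialize/deserialize pair under review, with the suggested `IllegalStateException` swapped in, looks roughly like this. It is a sketch in the spirit of the patch, not the exact Spark `JavaUtils` code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// Sketch of Java-serialization helpers, per the review suggestion to throw
// IllegalStateException rather than the overly general RuntimeException.
public class SerUtils {

    /** Serialize an object to a byte array using Java serialization. */
    public static byte[] serialize(Object obj) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            ObjectOutputStream oos = new ObjectOutputStream(bos);
            oos.writeObject(obj);
            oos.close();
            return bos.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException("Could not serialize object", e);
        }
    }

    /** Deserialize a byte array using Java serialization. */
    @SuppressWarnings("unchecked")
    public static <T> T deserialize(byte[] bytes) {
        // try-with-resources closes the stream even if readObject throws.
        try (ObjectInputStream is =
                 new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (T) is.readObject();
        } catch (ClassNotFoundException | IOException e) {
            throw new IllegalStateException("Could not deserialize object", e);
        }
    }

    public static void main(String[] args) {
        String round = deserialize(serialize("hello"));
        System.out.println(round);  // hello
    }
}
```

Using try-with-resources rather than a bare `close()` also guarantees the stream is released when deserialization fails partway through.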
[GitHub] spark pull request: [SPARK-4877] Allow user first classes to exten...
Github user stephenh commented on the pull request: https://github.com/apache/spark/pull/3725#issuecomment-68071870 Cool, sounds good. FWIW, there are a few things to do after this gets in: a) document that if userClassPathFirst=true, the user's uberjar should not include any Spark or Scala code (or else they'll get class cast exceptions because the parent scala.Function will be different from the child scala.Function); b) either accept Marcelo's PR as-is (which, among other things, applies the user-first classloader to driver code) or pull out just the driver part of his PR until the rest gets in (I've done this for our local Spark build); c) as a few others have said, adapt the filtering logic from Jetty/Hadoop that will prefer scala.* and org.apache.spark.* (and a few others) from the parent classloader all the time, even if the user's uberjar does accidentally include them (at that point, the documentation added in a) could be removed). I listed these in order from smallest to largest, with the idea that, unless someone beats me to it (which would be great :-)), I'll progressively work through each one.
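The filtering idea in (c) boils down to a predicate over class names that is consulted before ever delegating to the child (user-first) loader. A minimal hypothetical sketch follows; the prefix list is modeled on Jetty/Hadoop-style filters and is not Spark's actual list.

```java
// Sketch of parent-first filtering: even with userClassPathFirst=true, these
// framework packages must always resolve from the parent classloader so that
// e.g. scala.Function is one class, not two. Prefix list is illustrative.
public class ClassLoaderFilter {

    private static final String[] PARENT_FIRST = {
        "java.", "javax.", "sun.", "scala.", "org.apache.spark."
    };

    /** True if this class must be loaded from the parent classloader. */
    static boolean isParentFirst(String className) {
        for (String prefix : PARENT_FIRST) {
            if (className.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isParentFirst("scala.Function1"));   // true
        System.out.println(isParentFirst("com.example.MyJob")); // false
    }
}
```

A child-first `ClassLoader` would call this predicate in `loadClass` and fall through to `super.loadClass` whenever it returns true, which is exactly what makes an accidentally bundled scala.* class harmless.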
[GitHub] spark pull request: [EC2] Update default Spark version to 1.2.0
GitHub user nchammas opened a pull request: https://github.com/apache/spark/pull/3793 [EC2] Update default Spark version to 1.2.0 Now that 1.2.0 is out, let's update the default Spark version. You can merge this pull request into a Git repository by running: $ git pull https://github.com/nchammas/spark patch-1 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3793.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3793 commit ec0e904608eaa65bbbf35b2558a0116387abaecf Author: Nicholas Chammas nicholas.cham...@gmail.com Date: 2014-12-24T20:10:02Z [EC2] Update default Spark version to 1.2.0
[GitHub] spark pull request: [EC2] Update default Spark version to 1.2.0
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/3793#issuecomment-68072359 cc @JoshRosen
[GitHub] spark pull request: [EC2] Update default Spark version to 1.2.0
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3793#issuecomment-68072413 [Test build #24787 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24787/consoleFull) for PR 3793 at commit [`ec0e904`](https://github.com/apache/spark/commit/ec0e904608eaa65bbbf35b2558a0116387abaecf). * This patch merges cleanly.
[GitHub] spark pull request: [EC2] Update default Spark version to 1.2.0
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3793#issuecomment-68073160 It looks like this was already done in `branch-1.2`, but it doesn't hurt to do it in master: https://github.com/apache/spark/commit/dfb8c65b730fdf60540e91cd74fbaa2764a2a2bc If it's not already there, we should add this to the "Preparing a Release" checklist on the wiki.
[GitHub] spark pull request: [SPARK-4890] Upgrade Boto to 2.34.0; automatic...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3737#issuecomment-68073531 @nchammas Thanks for raising those concerns. The `--help` issue might not be too hard to fix (we may be able to do some lazy-loading of `boto`). For read-only mounts, I don't see a great solution: I don't want to continue bundling a zip file in the Spark source, since the boto download is huge (even after compression). Maybe we could package it when making binary distributions, though.
[GitHub] spark pull request: [EC2] Update default Spark version to 1.2.0
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/3793#issuecomment-68073619 Hmm, master is [already on 1.3.0](https://github.com/apache/spark/blob/199e59aacd540e17b31f38e0e32a3618870e9055/docs/_config.yml#L16) in that config file in dfb8c65.
[GitHub] spark pull request: [SPARK-4501][Core] - Create build/mvn to autom...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/3707#discussion_r22263467

--- Diff: build/mvn ---

```
@@ -0,0 +1,130 @@
+#!/usr/bin/env bash
+
+# Determine the current working directory
+_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
+
+# Installs any application tarball given a URL, the expected tarball name,
+# and, optionally, a checkable binary path to determine if the binary has
+# already been installed
+## Arg1 - URL
+## Arg2 - Tarball Name
+## Arg3 - Checkable Binary
+install_app() {
+  local remote_tarball="$1/$2"
+  local local_tarball="${_DIR}/$2"
+  local binary="${_DIR}/$3"
+
+  # setup `curl` and `wget` silent options if we're running on Jenkins
+  local curl_opts=""
+  local wget_opts=""
+  if [ -n "$AMPLAB_JENKINS" ]; then
+    curl_opts="-s"
+    wget_opts="--quiet"
+  else
+    curl_opts="--progress-bar"
+    wget_opts="--progress=bar:force"
+  fi
+
+  if [ -z "$3" -o ! -f "$binary" ]; then
+    # check if we already have the tarball
+    # check if we have curl installed
+    # download application
+    [ ! -f "${local_tarball}" ] && [ -n "`which curl 2>/dev/null`" ] && \
+      echo "exec: curl ${curl_opts} ${remote_tarball}" && \
+      curl ${curl_opts} "${remote_tarball}" > "${local_tarball}"
+    # if the file still doesn't exist, lets try `wget` and cross our fingers
+    [ ! -f "${local_tarball}" ] && [ -n "`which wget 2>/dev/null`" ] && \
+      echo "exec: wget ${wget_opts} ${remote_tarball}" && \
+      wget ${wget_opts} -O "${local_tarball}" "${remote_tarball}"
+    # if both were unsuccessful, exit
+    [ ! -f "${local_tarball}" ] && \
+      echo -n "ERROR: Cannot download $2 with cURL or wget; " && \
+      echo "please install manually and try again." && \
+      exit 2
+    cd "${_DIR}" && tar -xzf "$2"
+    rm -rf "$local_tarball"
+  fi
+}
+
+# Install maven under the build/ folder
+install_mvn() {
+  install_app \
+    "http://apache.claz.org/maven/maven-3/3.2.3/binaries" \
+    "apache-maven-3.2.3-bin.tar.gz" \
+    "apache-maven-3.2.3/bin/mvn"
+  MVN_BIN="${_DIR}/apache-maven-3.2.3/bin/mvn"
+}
+
+# Install zinc under the build/ folder
+install_zinc() {
+  local zinc_path="zinc-0.3.5.3/bin/zinc"
+  [ ! -f "${zinc_path}" ] && ZINC_INSTALL_FLAG=1
+  install_app \
+    "http://downloads.typesafe.com/zinc/0.3.5.3" \
+    "zinc-0.3.5.3.tgz" \
+    "${zinc_path}"
+  ZINC_BIN="${_DIR}/${zinc_path}"
+}
+
+# Determine the Scala version from the root pom.xml file, set the Scala URL,
+# and, with that, download the specific version of Scala necessary under
+# the build/ folder
+install_scala() {
+  # determine the Scala version used in Spark
+  local scala_version=`grep "scala.version" "${_DIR}/../pom.xml" | \
```

End diff -- That would probably be less brittle, but I guess it would introduce another dependency which we'd have to install in this script (since we want it to be a one-click installer). Since `xmlstarlet` binaries aren't portable, the installation logic might be complex since we'd need to have some platform-specific logic to download the right binaries.
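As a middle ground between the brittle `grep` and an `xmlstarlet` dependency, note that the JVM the build already requires ships an XML parser of its own. The following is a hypothetical sketch (not part of this PR) of reading `<scala.version>` with the JDK's built-in APIs; the sample pom string omits Maven's XML namespace for brevity.

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

// Sketch: extract the scala.version property from a pom.xml with real XML
// parsing rather than grep. Assumes a namespace-free pom for simplicity.
public class PomVersion {

    static String scalaVersion(String pomXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(pomXml.getBytes("UTF-8")));
            // XPath element names may contain dots, so this matches
            // <scala.version> directly under <properties>.
            return XPathFactory.newInstance().newXPath()
                .evaluate("/project/properties/scala.version", doc);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse pom.xml", e);
        }
    }

    public static void main(String[] args) {
        String pom = "<project><properties>"
            + "<scala.version>2.10.4</scala.version>"
            + "</properties></project>";
        System.out.println(scalaVersion(pom));  // 2.10.4
    }
}
```

This avoids platform-specific binaries entirely, though the real pom's default namespace would need a namespace-aware lookup or a `local-name()` XPath.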
[GitHub] spark pull request: Added Java serialization util functions back i...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3792#issuecomment-68073817 [Test build #24786 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24786/consoleFull) for PR 3792 at commit [`2a2ad9d`](https://github.com/apache/spark/commit/2a2ad9d6fcb98bbb7ffca1c0a5273f4ff8cb53a6). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: Added Java serialization util functions back i...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3792#issuecomment-68073819 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24786/ Test PASSed.
[GitHub] spark pull request: SPARK-4159 [CORE] Maven build doesn't run JUni...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3651#issuecomment-68074101 I did a quick `git grep` through the codebase to find uses of `SPARK_HOME`, and it looks like there are only a few places where it's read. SparkContext, which uses it as a fallback if `spark.home` is not set:

```
core/src/main/scala/org/apache/spark/SparkContext.scala-   * Get Spark's home location from either a value set through the constructor,
core/src/main/scala/org/apache/spark/SparkContext.scala-   * or the spark.home Java property, or the SPARK_HOME environment variable
core/src/main/scala/org/apache/spark/SparkContext.scala-   * (in that order of preference). If neither of these is set, return None.
core/src/main/scala/org/apache/spark/SparkContext.scala-   */
core/src/main/scala/org/apache/spark/SparkContext.scala-  private[spark] def getSparkHome(): Option[String] = {
core/src/main/scala/org/apache/spark/SparkContext.scala:    conf.getOption("spark.home").orElse(Option(System.getenv("SPARK_HOME")))
core/src/main/scala/org/apache/spark/SparkContext.scala-  }
core/src/main/scala/org/apache/spark/SparkContext.scala-
core/src/main/scala/org/apache/spark/SparkContext.scala-  /**
core/src/main/scala/org/apache/spark/SparkContext.scala-   * Set the thread-local property for overriding the call sites
core/src/main/scala/org/apache/spark/SparkContext.scala-   * of actions and RDDs.
```

PythonUtils, with no fallback:

```
core/src/main/scala/org/apache/spark/api/python/PythonUtils.scala-private[spark] object PythonUtils {
core/src/main/scala/org/apache/spark/api/python/PythonUtils.scala-  /** Get the PYTHONPATH for PySpark, either from SPARK_HOME, if it is set, or from our JAR */
core/src/main/scala/org/apache/spark/api/python/PythonUtils.scala-  def sparkPythonPath: String = {
core/src/main/scala/org/apache/spark/api/python/PythonUtils.scala-    val pythonPath = new ArrayBuffer[String]
core/src/main/scala/org/apache/spark/api/python/PythonUtils.scala:    for (sparkHome <- sys.env.get("SPARK_HOME")) {
core/src/main/scala/org/apache/spark/api/python/PythonUtils.scala-      pythonPath += Seq(sparkHome, "python").mkString(File.separator)
core/src/main/scala/org/apache/spark/api/python/PythonUtils.scala-      pythonPath += Seq(sparkHome, "python", "lib", "py4j-0.8.2.1-src.zip").mkString(File.separator)
core/src/main/scala/org/apache/spark/api/python/PythonUtils.scala-    }
core/src/main/scala/org/apache/spark/api/python/PythonUtils.scala-    pythonPath ++= SparkContext.jarOfObject(this)
core/src/main/scala/org/apache/spark/api/python/PythonUtils.scala-    pythonPath.mkString(File.pathSeparator)
```

FaultToleranceTest, which isn't actually run in our tests (since it needs a bunch of manual Docker setup to work):

```
core/src/main/scala/org/apache/spark/deploy/FaultToleranceTest.scala-  val zk = SparkCuratorUtil.newClient(conf)
core/src/main/scala/org/apache/spark/deploy/FaultToleranceTest.scala-
core/src/main/scala/org/apache/spark/deploy/FaultToleranceTest.scala-  var numPassed = 0
core/src/main/scala/org/apache/spark/deploy/FaultToleranceTest.scala-  var numFailed = 0
core/src/main/scala/org/apache/spark/deploy/FaultToleranceTest.scala-
core/src/main/scala/org/apache/spark/deploy/FaultToleranceTest.scala:  val sparkHome = System.getenv("SPARK_HOME")
core/src/main/scala/org/apache/spark/deploy/FaultToleranceTest.scala-  assertTrue(sparkHome != null, "Run with a valid SPARK_HOME")
core/src/main/scala/org/apache/spark/deploy/FaultToleranceTest.scala-
core/src/main/scala/org/apache/spark/deploy/FaultToleranceTest.scala-  val containerSparkHome = "/opt/spark"
core/src/main/scala/org/apache/spark/deploy/FaultToleranceTest.scala-  val dockerMountDir = "%s:%s".format(sparkHome, containerSparkHome)
core/src/main/scala/org/apache/spark/deploy/FaultToleranceTest.scala-
```

SparkSubmitArguments, which uses this without a fallback:

```
core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala-   */
core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala-  private def mergeSparkProperties(): Unit = {
core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala-    // Use common defaults file, if not specified by user
core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala-    if (propertiesFile == null) {
core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala-      val sep = File.separator
core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala:      val sparkHomeConfig = env.get("SPARK_HOME").map(sparkHome => s"${sparkHome}${sep}conf")
core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala-
```
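The resolution order quoted for `SparkContext.getSparkHome` (the `spark.home` property wins, then the `SPARK_HOME` environment variable, else `None`) can be sketched as a pure function. The helper and its parameters below are illustrative; real code would read `System.getProperty` and `System.getenv` directly.

```java
import java.util.Optional;

// Sketch of the getSparkHome fallback chain: property first, env second.
public class SparkHomeLookup {

    /** Resolve Spark's home: spark.home property, else SPARK_HOME env, else empty. */
    static Optional<String> getSparkHome(String sparkHomeProp, String sparkHomeEnv) {
        return Optional.ofNullable(sparkHomeProp)
                       .or(() -> Optional.ofNullable(sparkHomeEnv));
    }

    public static void main(String[] args) {
        System.out.println(getSparkHome("/opt/custom", "/opt/spark")); // property wins
        System.out.println(getSparkHome(null, "/opt/spark"));          // env fallback
        System.out.println(getSparkHome(null, null));                  // empty
    }
}
```

Passing both sources in as parameters keeps the precedence logic trivially testable, which is part of why callers with no fallback (like PythonUtils above) are riskier.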
[GitHub] spark pull request: [SPARK-3398] [SPARK-4325] [EC2] Use EC2 status...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3195#issuecomment-68074192 Doh! Unfortunately, there's no way for us to go back and edit commit messages without messing up the git history. In the future, I'll be more careful when verifying commit messages before merging (I should make a review checklist for these sorts of steps...)
[GitHub] spark pull request: [SPARK-4006] Block Manager - Double Register C...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2854#issuecomment-68074240 Since it's not obvious what's failing, I guess I'll have to log into Jenkins and look at the logs.
[GitHub] spark pull request: [SPARK-4859][Core][Streaming] Improve LiveList...
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/3710#discussion_r22263727

--- Diff: streaming/src/main/scala/org/apache/spark/streaming/scheduler/StreamingListenerBus.scala ---

```
@@ -64,36 +71,40 @@ private[spark] class StreamingListenerBus() extends Logging {
   }
 
   def addListener(listener: StreamingListener) {
-    listeners += listener
+    listeners.add(listener)
   }
 
   def post(event: StreamingListenerEvent) {
+    if (stopped) {
+      // Drop further events to make `StreamingListenerShutdown` be delivered ASAP
+      logError("StreamingListenerBus has been stopped! Drop " + event)
+      return
+    }
     val eventAdded = eventQueue.offer(event)
-    if (!eventAdded && !queueFullErrorMessageLogged) {
+    if (!eventAdded && queueFullErrorMessageLogged.compareAndSet(false, true)) {
       logError("Dropping StreamingListenerEvent because no remaining room in event queue. " +
         "This likely means one of the StreamingListeners is too slow and cannot keep up with the " +
         "rate at which events are being started by the scheduler.")
-      queueFullErrorMessageLogged = true
     }
   }
 
-  /**
-   * Waits until there are no more events in the queue, or until the specified time has elapsed.
-   * Used for testing only. Returns true if the queue has emptied and false is the specified time
-   * elapsed before the queue emptied.
-   */
-  def waitUntilEmpty(timeoutMillis: Int): Boolean = {
-    val finishTime = System.currentTimeMillis + timeoutMillis
-    while (!eventQueue.isEmpty) {
-      if (System.currentTimeMillis > finishTime) {
-        return false
+  def stop(): Unit = {
+    stopped = true
+    // Should not call `post`, or `StreamingListenerShutdown` may be dropped.
+    eventQueue.put(StreamingListenerShutdown)
+    listenerThread.join()
+  }
+
+  private def foreachListener(f: StreamingListener => Unit): Unit = {
+    val iter = listeners.iterator
```

--- End diff --

Nit: Can you add a comment mentioning why you used an iterator, so that this does not regress in the future. This is quite subtle.
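The subtlety tdas asks to document is the snapshot behavior of `CopyOnWriteArrayList.iterator()` (the listener list's underlying type): the iterator sees the list as it was when obtained, so concurrent additions neither appear mid-iteration nor throw `ConcurrentModificationException`. A small illustrative sketch, with made-up listener names rather than Spark's actual types:

```java
import java.util.Iterator;
import java.util.concurrent.CopyOnWriteArrayList;

public class SnapshotIteration {
    // Deliver to every listener visible at the moment the iterator is taken.
    // Adds made during delivery are safe but invisible to this iteration.
    public static int deliverCount(CopyOnWriteArrayList<String> listeners) {
        int delivered = 0;
        Iterator<String> iter = listeners.iterator(); // snapshot taken here
        while (iter.hasNext()) {
            String listener = iter.next();
            // Concurrent modification: no ConcurrentModificationException,
            // and "-clone" entries are not seen by this in-flight iterator.
            listeners.add(listener + "-clone");
            delivered++;
        }
        return delivered;
    }

    public static void main(String[] args) {
        CopyOnWriteArrayList<String> listeners = new CopyOnWriteArrayList<>();
        listeners.add("a");
        listeners.add("b");
        System.out.println(deliverCount(listeners)); // 2: snapshot ignores the adds
        System.out.println(listeners.size());        // 4: the adds did land
    }
}
```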
[GitHub] spark pull request: SPARK-4159 [CORE] Maven build doesn't run JUni...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/3651#issuecomment-68074359 @JoshRosen Yes, sounds about right to me. I rebased and pushed one more commit to remove the special `SPARK_HOME` setting in these modules too.
[GitHub] spark pull request: [SPARK-4859][Core][Streaming] Improve LiveList...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/3710#discussion_r22263766

--- Diff: streaming/src/main/scala/org/apache/spark/streaming/scheduler/StreamingListenerBus.scala ---

```
@@ -64,36 +71,40 @@ private[spark] class StreamingListenerBus() extends Logging {
   }
 
   def addListener(listener: StreamingListener) {
-    listeners += listener
+    listeners.add(listener)
   }
 
   def post(event: StreamingListenerEvent) {
+    if (stopped) {
+      // Drop further events to make `StreamingListenerShutdown` be delivered ASAP
+      logError("StreamingListenerBus has been stopped! Drop " + event)
+      return
+    }
     val eventAdded = eventQueue.offer(event)
-    if (!eventAdded && !queueFullErrorMessageLogged) {
+    if (!eventAdded && queueFullErrorMessageLogged.compareAndSet(false, true)) {
       logError("Dropping StreamingListenerEvent because no remaining room in event queue. " +
         "This likely means one of the StreamingListeners is too slow and cannot keep up with the " +
         "rate at which events are being started by the scheduler.")
-      queueFullErrorMessageLogged = true
     }
   }
 
-  /**
-   * Waits until there are no more events in the queue, or until the specified time has elapsed.
-   * Used for testing only. Returns true if the queue has emptied and false is the specified time
-   * elapsed before the queue emptied.
-   */
-  def waitUntilEmpty(timeoutMillis: Int): Boolean = {
-    val finishTime = System.currentTimeMillis + timeoutMillis
-    while (!eventQueue.isEmpty) {
-      if (System.currentTimeMillis > finishTime) {
-        return false
+  def stop(): Unit = {
+    stopped = true
+    // Should not call `post`, or `StreamingListenerShutdown` may be dropped.
+    eventQueue.put(StreamingListenerShutdown)
+    listenerThread.join()
+  }
+
+  private def foreachListener(f: StreamingListener => Unit): Unit = {
+    val iter = listeners.iterator
```

--- End diff --

If this change is to avoid an implicit Java -> Scala collections conversion, why not replace the `JavaConversions` implicits with the more explicit `JavaConverters` instead, so that you have to manually write `.asJava` or `.asScala`? That, in addition to a comment, would make it more obvious if we're re-introducing those conversions.
[GitHub] spark pull request: [SPARK-4859][Core][Streaming] Improve LiveList...
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/3710#discussion_r22263767

--- Diff: streaming/src/main/scala/org/apache/spark/streaming/scheduler/StreamingListenerBus.scala ---

```
@@ -64,36 +71,40 @@ private[spark] class StreamingListenerBus() extends Logging {
   }
 
   def addListener(listener: StreamingListener) {
-    listeners += listener
+    listeners.add(listener)
   }
 
   def post(event: StreamingListenerEvent) {
+    if (stopped) {
+      // Drop further events to make `StreamingListenerShutdown` be delivered ASAP
+      logError("StreamingListenerBus has been stopped! Drop " + event)
+      return
+    }
     val eventAdded = eventQueue.offer(event)
-    if (!eventAdded && !queueFullErrorMessageLogged) {
+    if (!eventAdded && queueFullErrorMessageLogged.compareAndSet(false, true)) {
       logError("Dropping StreamingListenerEvent because no remaining room in event queue. " +
         "This likely means one of the StreamingListeners is too slow and cannot keep up with the " +
         "rate at which events are being started by the scheduler.")
-      queueFullErrorMessageLogged = true
     }
   }
 
-  /**
-   * Waits until there are no more events in the queue, or until the specified time has elapsed.
-   * Used for testing only. Returns true if the queue has emptied and false is the specified time
-   * elapsed before the queue emptied.
-   */
-  def waitUntilEmpty(timeoutMillis: Int): Boolean = {
-    val finishTime = System.currentTimeMillis + timeoutMillis
-    while (!eventQueue.isEmpty) {
-      if (System.currentTimeMillis > finishTime) {
-        return false
+  def stop(): Unit = {
+    stopped = true
+    // Should not call `post`, or `StreamingListenerShutdown` may be dropped.
+    eventQueue.put(StreamingListenerShutdown)
```

--- End diff --

Why is this `put` still there? Wouldn't this block / throw an error if the queue is full?
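The distinction behind tdas's question is the standard `java.util.BlockingQueue` contract: `offer` fails fast and returns `false` when the queue is full, while `put` blocks until space frees up. A short sketch of that contract, with an illustrative capacity and event names (not Spark's actual queue size):

```java
import java.util.concurrent.ArrayBlockingQueue;

public class QueueSemantics {
    // `offer` never blocks: it returns false instead when the queue is full.
    public static boolean tryPost(ArrayBlockingQueue<String> queue, String event) {
        return queue.offer(event);
    }

    public static void main(String[] args) {
        ArrayBlockingQueue<String> queue = new ArrayBlockingQueue<>(1);
        System.out.println(tryPost(queue, "event-1")); // true: queue had room
        System.out.println(tryPost(queue, "event-2")); // false: queue is full
        // By contrast, queue.put("shutdown") would block right here until a
        // consumer drained the queue -- which is the hang being asked about.
    }
}
```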
[GitHub] spark pull request: SPARK-4159 [CORE] Maven build doesn't run JUni...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3651#issuecomment-68074482 [Test build #24788 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24788/consoleFull) for PR 3651 at commit [`2e8a0af`](https://github.com/apache/spark/commit/2e8a0afeef77df8fd9c7df406812878b22c67aa7). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4859][Core][Streaming] Improve LiveList...
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/3710#discussion_r22263810

--- Diff: streaming/src/main/scala/org/apache/spark/streaming/scheduler/StreamingListenerBus.scala ---

```
@@ -39,18 +45,19 @@ private[spark] class StreamingListenerBus() extends Logging {
         val event = eventQueue.take
```

--- End diff --

This does not use `stopped` like the `LiveListenerBus`. I know that introducing `eventLock` and using `eventQueue.poll` instead of `eventQueue.take` like the `LiveListenerBus` is too much for this PR. But at least we can eliminate the bug related to `StreamingListenerShutdown` by using `stopped` instead of `StreamingListenerShutdown`.
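The alternative tdas describes — a `stopped` flag with a timed `poll` instead of a blocking `take` plus a poison-pill shutdown event — can be sketched as follows. This is a simplified illustration, not Spark's implementation; the method names, timeout, and event type are all made up:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class StoppableDrain {
    // Consume events until stop is requested AND the queue is drained.
    // Polling with a timeout means the loop re-checks the flag periodically
    // instead of blocking forever in take(), so no shutdown event is needed.
    static int drain(BlockingQueue<String> queue, AtomicBoolean stopped)
            throws InterruptedException {
        int processed = 0;
        while (!stopped.get() || !queue.isEmpty()) {
            String event = queue.poll(10, TimeUnit.MILLISECONDS);
            if (event != null) {
                processed++; // a real bus would dispatch to listeners here
            }
        }
        return processed;
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(4);
        queue.put("e1");
        queue.put("e2");
        AtomicBoolean stopped = new AtomicBoolean(true); // stop requested up front
        System.out.println(drain(queue, stopped)); // drains both events, then exits
    }
}
```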
[GitHub] spark pull request: [SPARK-4859][Core][Streaming] Improve LiveList...
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/3710#discussion_r22263834

--- Diff: core/src/main/scala/org/apache/spark/scheduler/LiveListenerBus.scala ---

```
@@ -35,8 +36,9 @@ private[spark] class LiveListenerBus extends SparkListenerBus with Logging {
    * an OOM exception) if it's perpetually being added to more quickly than it's being drained.
    */
```

--- End diff --

Please update the comment about `SparkListenerShutdown` in the documentation of this class. Also, we should probably remove the declaration of `SparkListenerShutdown` and any references to it. git-grep to check.
[GitHub] spark pull request: [SPARK-4859][Core][Streaming] Improve LiveList...
Github user tdas commented on the pull request: https://github.com/apache/spark/pull/3710#issuecomment-68074729 I left a few more comments. There are clear inconsistencies between `LiveListenerBus` and `StreamingListenerBus`, which can only be solved by actually having `StreamingListenerBus` inherit from `LiveListenerBus`. Since that is to be a different PR, I would suggest that we at least try to maintain feature parity between them, even with duplicate code. E.g., the bug with posting the Shutdown event should be solved for both classes.