[GitHub] spark pull request: [SPARK-4508] [SQL] build native date type to c...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3381#issuecomment-64322393 [Test build #23827 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23827/consoleFull) for PR 3381 at commit [`0c6cab6`](https://github.com/apache/spark/commit/0c6cab68aa7b4fad87968afa70888ed5b0fe3bbf). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4573] [SQL] Add SettableStructObjectIns...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3429#issuecomment-64322381 [Test build #23826 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23826/consoleFull) for PR 3429 at commit [`932940d`](https://github.com/apache/spark/commit/932940d17efee84b83103b1da918c107226aa643). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4581][MLlib] Refactorize StandardScaler...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3435#issuecomment-64322398 [Test build #23825 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23825/consoleFull) for PR 3435 at commit [`85885a9`](https://github.com/apache/spark/commit/85885a98125e41a323212499eb7a8c6895f8c252). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4592] Avoid duplicate worker registrati...
GitHub user andrewor14 opened a pull request: https://github.com/apache/spark/pull/3447 [SPARK-4592] Avoid duplicate worker registrations in standalone mode

**Symptom.** On failover, the Master may receive duplicate registrations from the same worker, causing the worker to exit.

**Cause.** This commit https://github.com/apache/spark/commit/4afe9a4852ebeb4cc77322a14225cd3dec165f3f adds logic for the worker to re-register with the master in case of failures. However, the following race condition may occur:
(1) Master A fails and Worker attempts to reconnect to all masters
(2) Master B takes over and notifies Worker
(3) Worker responds by registering with Master B
(4) Meanwhile, Worker's previous reconnection attempt reaches Master B, causing the same Worker to register with Master B twice

**Fix.** Instead of attempting to register with all known masters, the worker should re-register with only the one that it has been communicating with. Then, when it is finally notified of the change in master, the worker gives up on the old master and communicates with the new one.

**Caveat.** Even this fix is subject to more obscure race conditions. For instance, if Master B fails and Master A recovers immediately, then Master A may still observe duplicate worker registrations. However, this and the other potential race conditions summarized in [SPARK-4592](https://issues.apache.org/jira/browse/SPARK-4592) are much less likely than the one described above, which is deterministically reproducible.
You can merge this pull request into a Git repository by running: $ git pull https://github.com/andrewor14/spark standalone-failover Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3447.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3447 commit b6f269e6460ecc441c319b5e92437e47d141c361 Author: Andrew Or and...@databricks.com Date: 2014-11-25T07:40:00Z Avoid duplicate worker registrations The gist is that we only reconnect to the master we've been communicating with instead of making a registration request to all known masters. More details in the code comments. commit 1fce6a9343d6f563dac0c793480420c6511091ac Author: Andrew Or and...@databricks.com Date: 2014-11-25T08:06:04Z Active master actor could be null in the beginning If a worker cannot initially reach a master, then it will attempt a retry. In this case, the active master actor must be null. This commit removes an assert that falsely assumes the contrary.
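The retry rule described in this pull request can be sketched in plain Scala as follows. This is only an illustrative model, not the code from the patch: `WorkerState`, `retryTargets`, and `onMasterChanged` are hypothetical names, masters are represented as plain strings rather than actors, and falling back to all known masters when no master has been reached yet is an assumption based on the second commit message.

```scala
object FailoverSketch {
  // Hypothetical, simplified worker state; the real Worker is an actor
  // and tracks considerably more than this.
  final case class WorkerState(knownMasters: Seq[String], activeMaster: Option[String])

  /** Masters the worker contacts when it retries registration. */
  def retryTargets(state: WorkerState): Seq[String] = state.activeMaster match {
    // The worker has been talking to a master: retry only that one, so a
    // stale attempt cannot reach a newly elected master as a duplicate.
    case Some(master) => Seq(master)
    // Assumption: no master was ever reached (initial registration),
    // so the worker still tries every known master.
    case None => state.knownMasters
  }

  /** When notified of a new leader, give up on the old master and use the new one. */
  def onMasterChanged(state: WorkerState, newMaster: String): WorkerState =
    state.copy(activeMaster = Some(newMaster))
}
```

In this model, a worker whose active master is A retries only A after A fails, so Master B first hears from the worker in response to its own notification. The rarer races mentioned in the caveat are not covered by this sketch.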
[GitHub] spark pull request: [SPARK-4593] [SQL] return null when divider is...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3443#issuecomment-64323518 [Test build #23821 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23821/consoleFull) for PR 3443 at commit [`2dfe50f`](https://github.com/apache/spark/commit/2dfe50f607a6dced1554d07daa3a8d2e9664ffa9). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4593] [SQL] return null when divider is...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3443#issuecomment-64323521 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23821/ Test PASSed.
[GitHub] spark pull request: [SPARK-4573] [SQL] Add SettableStructObjectIns...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3429#issuecomment-64323707 [Test build #23829 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23829/consoleFull) for PR 3429 at commit [`f1b6749`](https://github.com/apache/spark/commit/f1b67499656b0170866405b248876c1ba4652822). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4592] Avoid duplicate worker registrati...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3447#issuecomment-64323736 [Test build #23828 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23828/consoleFull) for PR 3447 at commit [`1fce6a9`](https://github.com/apache/spark/commit/1fce6a9343d6f563dac0c793480420c6511091ac). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3736] Workers reconnect when disassocia...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2828#issuecomment-64323860 Andrew's got a patch for this: #3447
[GitHub] spark pull request: [SPARK-4594][SQL] Improvement the broadcast fo...
GitHub user Leolh opened a pull request: https://github.com/apache/spark/pull/3448 [SPARK-4594][SQL] Improvement the broadcast for HiveConf https://issues.apache.org/jira/browse/SPARK-4594 Every time we need to get a table from Hive, HadoopTableReader broadcasts the HiveConf to the cluster. Actually, within one application there is a single HiveConf, so I think we can keep it in HiveContext and reuse it for every query. Although it is only 50 KB, this is useful for JDBC users and streaming + SQL apps. You can merge this pull request into a Git repository by running: $ git pull https://github.com/Leolh/spark spark-4594 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3448.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3448 commit 4a21f4efb58108d00fc9eddd1192937a83c77c1e Author: Leolh leosand...@gmail.com Date: 2014-11-17T07:38:19Z Update MetadataCleaner.scala Fix a little mistake about delaySeconds.
[GitHub] spark pull request: [SPARK-4594][SQL] Improvement the broadcast fo...
Github user Leolh closed the pull request at: https://github.com/apache/spark/pull/3448
[GitHub] spark pull request: [SPARK-4597] Use proper exception and reset va...
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/3449 [SPARK-4597] Use proper exception and reset variable `File.exists()` and `File.mkdirs()` throw only `SecurityException`, not `IOException`. Also, when an exception is thrown, `dir` should be reset to `null`. You can merge this pull request into a Git repository by running: $ git pull https://github.com/viirya/spark-1 fix_createtempdir Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3449.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3449 commit 36cacbd1f2f5cfa4f2cb0814650ba439e2cff3f3 Author: Liang-Chi Hsieh vii...@gmail.com Date: 2014-11-25T08:12:54Z Use proper exception and reset variable.
[GitHub] spark pull request: [SPARK-4526][MLLIB]GradientDescent get a wrong...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3399#issuecomment-64324351 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23820/ Test PASSed.
[GitHub] spark pull request: [SPARK-4526][MLLIB]GradientDescent get a wrong...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3399#issuecomment-64324344 [Test build #23820 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23820/consoleFull) for PR 3399 at commit [`13cb228`](https://github.com/apache/spark/commit/13cb228a4e059f39997a6cd235ff87279b0cf854). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4597] Use proper exception and reset va...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3449#issuecomment-64324372 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-4597] Use proper exception and reset va...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3449#issuecomment-64324781 Jenkins, this is ok to test.
[GitHub] spark pull request: [SPARK-4592] Avoid duplicate worker registrati...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3447#issuecomment-64324812 [Test build #23830 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23830/consoleFull) for PR 3447 at commit [`83b321c`](https://github.com/apache/spark/commit/83b321cc02e4dfb47541c7dd13f65f98012316ef). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4597] Use proper exception and reset va...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3449#issuecomment-64325487 [Test build #23831 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23831/consoleFull) for PR 3449 at commit [`36cacbd`](https://github.com/apache/spark/commit/36cacbd1f2f5cfa4f2cb0814650ba439e2cff3f3). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4573] [SQL] Add SettableStructObjectIns...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3429#issuecomment-64325735 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23826/ Test FAILed.
[GitHub] spark pull request: [SPARK-4573] [SQL] Add SettableStructObjectIns...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3429#issuecomment-64325733 [Test build #23826 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23826/consoleFull) for PR 3429 at commit [`932940d`](https://github.com/apache/spark/commit/932940d17efee84b83103b1da918c107226aa643). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [Spark-4512] [SQL] Unresolved Attribute Except...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3386#issuecomment-64325757 [Test build #23832 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23832/consoleFull) for PR 3386 at commit [`e0047a0`](https://github.com/apache/spark/commit/e0047a02183c52fa637e2808d94fc4b98fbe18c8). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4233] [SQL] WIP:Simplify the UDAF API (...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3247#issuecomment-64325755 [Test build #23833 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23833/consoleFull) for PR 3247 at commit [`bb1eb2d`](https://github.com/apache/spark/commit/bb1eb2dfd14142defb43efb39cda8cb1cce460e4). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4233] [SQL] WIP:Simplify the UDAF API (...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3247#issuecomment-64325852 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23833/ Test FAILed.
[GitHub] spark pull request: [SPARK-4233] [SQL] WIP:Simplify the UDAF API (...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3247#issuecomment-64325848 [Test build #23833 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23833/consoleFull) for PR 3247 at commit [`bb1eb2d`](https://github.com/apache/spark/commit/bb1eb2dfd14142defb43efb39cda8cb1cce460e4).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `class RandomForestModel(JavaModelWrapper):`
   * `class RandomForest(object):`
   * `case class UnresolvedFunction(`
   * `abstract class AggregateFunction`
   * `trait AggregateExpression extends Expression`
   * `case class MinFunction(aggr: BoundReference, base: Min) extends AggregateFunction`
   * `case class Min(child: Expression, distinct: Boolean = false, override val distinctLike: Boolean = true) extends UnaryExpression with AggregateExpression`
   * `case class AverageFunction(count: BoundReference, sum: BoundReference, base: Average) extends AggregateFunction`
   * `case class Average(child: Expression, distinct: Boolean = false) extends UnaryExpression with AggregateExpression`
   * `case class Max(child: Expression) extends UnaryExpression with AggregateExpression`
   * `case class MaxFunction(expr: Expression, base: AggregateExpression) extends AggregateFunction`
   * `case class Count(child: Expression) extends UnaryExpression with AggregateExpression`
   * `case class CountDistinct(expressions: Seq[Expression]) extends UnaryExpression with AggregateExpression`
   * `case class CollectHashSet(expressions: Seq[Expression]) extends UnaryExpression with AggregateExpression`
   * `case class CombineSetsAndCount(inputSet: Expression) extends UnaryExpression with AggregateExpression`
   * `case class ApproxCountDistinctPartition(child: Expression, relativeSD: Double) extends UnaryExpression with AggregateExpression`
   * `case class ApproxCountDistinctMerge(child: Expression, relativeSD: Double) extends UnaryExpression with AggregateExpression`
   * `case class ApproxCountDistinct(child: Expression, relativeSD: Double = 0.05) extends UnaryExpression with AggregateExpression`
   * `case class Sum(child: Expression) extends UnaryExpression with AggregateExpression`
   * `case class SumDistinct(child: Expression) extends UnaryExpression with AggregateExpression`
   * `case class First(child: Expression) extends UnaryExpression with AggregateExpression`
   * `case class Last(child: Expression) extends UnaryExpression with AggregateExpression`
   * `sealed case class AggregateFunctionBind(`
   * `sealed trait Aggregate`
   * `sealed trait PreShuffle extends Aggregate`
   * `sealed trait PostShuffle extends Aggregate`
   * `case class AggregatePreShuffle(`
   * `case class AggregatePostShuffle(`
   * `case class DistinctAggregate(`
   * `class DefaultSource extends RelationProvider`
   * `case class ParquetRelation2(path: String)(@transient val sqlContext: SQLContext)`
   * `abstract class CatalystScan extends BaseRelation`
[GitHub] spark pull request: [SPARK-4597] Use proper exception and reset va...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/3449#discussion_r20849000
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -262,7 +262,7 @@ private[spark] object Utils extends Logging {
         if (dir.exists() || !dir.mkdirs()) {
           dir = null
         }
-      } catch { case e: IOException => ; }
+      } catch { case e: SecurityException => dir = null; }
--- End diff --
It looks like these two methods can't throw `IOException` after all, is that the gist of it? `mkdirs` just returns `false` if it fails, hm. https://docs.oracle.com/javase/7/docs/api/java/io/File.html#mkdirs() `dir = null` is a good bug fix. I might have changed this to not even assign `dir` and hold the new `File` in a temp variable until the checks succeeded. This looks equivalent though.
[GitHub] spark pull request: [SPARK-4597] Use proper exception and reset va...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/3449#discussion_r20849117
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -262,7 +262,7 @@ private[spark] object Utils extends Logging {
         if (dir.exists() || !dir.mkdirs()) {
           dir = null
         }
-      } catch { case e: IOException => ; }
+      } catch { case e: SecurityException => dir = null; }
--- End diff --
Yes. The only exception they would throw is `SecurityException`.
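The alternative raised in the review above (hold the new `File` in a temporary value and assign `dir` only once the checks succeed) could look roughly like the sketch below. It is not the code in `Utils.scala`: the object name, the `root` and `maxAttempts` parameters, the `spark-` prefix, and the retry loop are assumptions made for illustration. Only `File.exists()`, `File.mkdirs()`, and the `SecurityException` handling come from the discussion.

```scala
import java.io.{File, IOException}
import java.util.UUID

object TempDirSketch {
  /** Hypothetical helper, not Spark's actual implementation. */
  def createTempDir(root: String = System.getProperty("java.io.tmpdir"),
                    maxAttempts: Int = 10): File = {
    var attempts = 0
    var dir: File = null
    while (dir == null) {
      attempts += 1
      if (attempts > maxAttempts) {
        throw new IOException(
          s"Failed to create a temp directory under $root after $maxAttempts attempts")
      }
      // Keep the candidate in a local value; `dir` is assigned only after
      // the checks succeed, so it never needs to be reset to null.
      val candidate = new File(root, "spark-" + UUID.randomUUID.toString)
      try {
        if (!candidate.exists() && candidate.mkdirs()) {
          dir = candidate
        }
      } catch {
        // exists() and mkdirs() are documented to throw only SecurityException;
        // `dir` is still null here, so the loop simply tries another name.
        case _: SecurityException =>
      }
    }
    dir
  }
}
```

Because `dir` is written only on success, the `catch` branch has nothing to undo, which is the equivalence the reviewer points out.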
[GitHub] spark pull request: [SQL] enable empty aggr test case
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3445#issuecomment-64326886 [Test build #23823 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23823/consoleFull) for PR 3445 at commit [`982575e`](https://github.com/apache/spark/commit/982575e58835c84d1bb57a0f471141edce4532db). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SQL] enable empty aggr test case
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3445#issuecomment-64326891 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23823/ Test PASSed.
[GitHub] spark pull request: [SPARK-4573] [SQL] Add SettableStructObjectIns...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3429#issuecomment-64327084 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23829/ Test FAILed.
[GitHub] spark pull request: [SPARK-4573] [SQL] Add SettableStructObjectIns...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3429#issuecomment-64327080 [Test build #23829 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23829/consoleFull) for PR 3429 at commit [`f1b6749`](https://github.com/apache/spark/commit/f1b67499656b0170866405b248876c1ba4652822). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4599] [Build] [SQL] add hive profile in...
GitHub user adrian-wang opened a pull request: https://github.com/apache/spark/pull/3450 [SPARK-4599] [Build] [SQL] add hive profile in root pom This is what it was after #2685, but it seems to have been reset by #3159. You can merge this pull request into a Git repository by running: $ git pull https://github.com/adrian-wang/spark profile Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3450.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3450 commit cbb2d330c9b47c993acb5dce5f65cf6b493374bd Author: Daoyuan Wang daoyuan.w...@intel.com Date: 2014-11-25T08:52:07Z add hive profile in root pom
[GitHub] spark pull request: [SPARK-4599] [Build] [SQL] add hive profile in...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3450#issuecomment-64327986 [Test build #23834 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23834/consoleFull) for PR 3450 at commit [`cbb2d33`](https://github.com/apache/spark/commit/cbb2d330c9b47c993acb5dce5f65cf6b493374bd). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4595][Core] Fix MetricsServlet not work...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3444#issuecomment-64328090 [Test build #23822 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23822/consoleFull) for PR 3444 at commit [`f779fe0`](https://github.com/apache/spark/commit/f779fe010f79ae70dc9e76cb9abd9edda6d2e16a). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4599] [Build] [SQL] add hive profile in...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/3450#discussion_r20849854 --- Diff: pom.xml --- @@ -1394,7 +1394,7 @@ </dependencies> </profile> <profile> - <id>hive-thriftserver</id> + <id>hive</id> --- End diff -- What's the purpose of this? there are already profiles covering activation of Hive itself below; this profile is supposed to be about thriftserver and is so named.
[GitHub] spark pull request: [SPARK-4595][Core] Fix MetricsServlet not work...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3444#issuecomment-64328095 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23822/ Test PASSed.
[GitHub] spark pull request: [SPARK-4599] [Build] [SQL] add hive profile in...
Github user adrian-wang commented on a diff in the pull request: https://github.com/apache/spark/pull/3450#discussion_r20850064 --- Diff: pom.xml --- @@ -1394,7 +1394,7 @@ </dependencies> </profile> <profile> - <id>hive-thriftserver</id> + <id>hive</id> --- End diff -- We can have hive-thriftserver always included when we have -Phive
[GitHub] spark pull request: [SPARK-4508] [SQL] build native date type to c...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3381#issuecomment-64328732 [Test build #23827 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23827/consoleFull) for PR 3381 at commit [`0c6cab6`](https://github.com/apache/spark/commit/0c6cab68aa7b4fad87968afa70888ed5b0fe3bbf). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `final class Date extends Ordered[Date] with Serializable `
[GitHub] spark pull request: [SPARK-4508] [SQL] build native date type to c...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3381#issuecomment-64328742 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23827/ Test PASSed.
[GitHub] spark pull request: [Spark-4512] [SQL] Unresolved Attribute Except...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3386#issuecomment-64329427 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23832/ Test FAILed.
[GitHub] spark pull request: [Spark-4512] [SQL] Unresolved Attribute Except...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3386#issuecomment-64329419 [Test build #23832 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23832/consoleFull) for PR 3386 at commit [`e0047a0`](https://github.com/apache/spark/commit/e0047a02183c52fa637e2808d94fc4b98fbe18c8). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4592] Avoid duplicate worker registrati...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3447#issuecomment-64330152 [Test build #23835 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23835/consoleFull) for PR 3447 at commit [`79286dc`](https://github.com/apache/spark/commit/79286dc3e027d138bf13ef55f190b95844afae0e). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4596][MLLib] Refactorize Normalizer to ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3446#issuecomment-64330374 [Test build #23824 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23824/consoleFull) for PR 3446 at commit [`e20a2b9`](https://github.com/apache/spark/commit/e20a2b97fcec7fcda11d5845674a25c1aace414f). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4596][MLLib] Refactorize Normalizer to ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3446#issuecomment-64330381 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23824/ Test PASSed.
[GitHub] spark pull request: [SPARK-4581][MLlib] Refactorize StandardScaler...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3435#issuecomment-64331011 [Test build #23825 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23825/consoleFull) for PR 3435 at commit [`85885a9`](https://github.com/apache/spark/commit/85885a98125e41a323212499eb7a8c6895f8c252). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4581][MLlib] Refactorize StandardScaler...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3435#issuecomment-64331017 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23825/ Test PASSed.
[GitHub] spark pull request: Branch 1.0
GitHub user lowryact opened a pull request: https://github.com/apache/spark/pull/3451 Branch 1.0 You can merge this pull request into a Git repository by running: $ git pull https://github.com/apache/spark branch-1.0 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3451.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3451 commit 16e3910a0512cd53ad0c9c71ef20a3ee0f10c34f Author: Matei Zaharia ma...@databricks.com Date: 2014-06-06T06:01:48Z SPARK-2043: ExternalAppendOnlyMap doesn't always find matching keys The current implementation reads one key with the next hash code as it finishes reading the keys with the current hash code, which may cause it to miss some matches of the next key. This can cause operations like join to give the wrong result when reduce tasks spill to disk and there are hash collisions, as values won't be matched together. This PR fixes it by not reading in that next key, using a peeking iterator instead. Author: Matei Zaharia ma...@databricks.com Closes #986 from mateiz/spark-2043 and squashes the following commits: 0959514 [Matei Zaharia] Added unit test for having many hash collisions 892debb [Matei Zaharia] SPARK-2043: don't read a key with the next hash code in ExternalAppendOnlyMap, instead use a buffered iterator to only read values with the current hash code. (cherry picked from commit b45c13e7d798f97b92f1a6329528191b8d779c4f) Signed-off-by: Matei Zaharia ma...@databricks.com commit d3717bea951888fe64cc2a0119d23b641b030735 Author: Michael Armbrust mich...@databricks.com Date: 2014-06-06T06:20:59Z [SPARK-2050][SQL] LIKE, RLIKE and IN in HQL should not be case sensitive. Author: Michael Armbrust mich...@databricks.com Closes #989 from marmbrus/caseSensitiveFuncitons and squashes the following commits: 681de54 [Michael Armbrust] LIKE, RLIKE and IN in HQL should not be case sensitive. 
(cherry picked from commit 41db44c428a10f4453462d002d226798bb8fbdda) Signed-off-by: Reynold Xin r...@apache.org commit d7467484ff08a5f9a566d3a7b21bab426ff89127 Author: Michael Armbrust mich...@databricks.com Date: 2014-06-06T18:31:37Z [SPARK-2050 - 2][SQL] DIV and BETWEEN should not be case sensitive. Followup: #989 Author: Michael Armbrust mich...@databricks.com Closes #994 from marmbrus/caseSensitiveFunctions2 and squashes the following commits: 9d9c8ed [Michael Armbrust] Fix DIV and BETWEEN. (cherry picked from commit 8d210560be8b143e48abfbaca347f383b5aa4798) Signed-off-by: Michael Armbrust mich...@databricks.com commit 39cfa9c0be34d4baf9de4eb9f9191c7b406c4d59 Author: Michael Armbrust mich...@databricks.com Date: 2014-06-07T21:20:33Z [SPARK-1994][SQL] Weird data corruption bug when running Spark SQL on data in HDFS Basically there is a race condition (possibly a scala bug?) when these values are recomputed on all of the slaves that results in an incorrect projection being generated (possibly because the GUID uniqueness contract is broken?). In general we should probably enforce that all expression planing occurs on the driver, as is now occurring here. Author: Michael Armbrust mich...@databricks.com Closes #1004 from marmbrus/fixAggBug and squashes the following commits: e0c116c [Michael Armbrust] Compute aggregate expression during planning instead of lazily on workers. (cherry picked from commit a6c72ab16e7a3027739ab419819f5222e270838e) Signed-off-by: Reynold Xin r...@apache.org commit 3f8450ec67fe84c290d725d4ebfcf9f5a7b0b109 Author: maji2014 ma...@asiainfo-linkage.com Date: 2014-06-08T22:14:27Z Update run-example Old code can only be ran under spark_home and use bin/run-example. Error ./run-example: line 55: ./bin/spark-submit: No such file or directory appears when running in other place. 
So change this Author: maji2014 ma...@asiainfo-linkage.com Closes #1011 from maji2014/master and squashes the following commits: 2cc1af6 [maji2014] Update run-example Closes #988. (cherry picked from commit e9261d0866a610eab29fa332726186b534d1018f) Signed-off-by: Patrick Wendell pwend...@gmail.com commit 502a8f795551007db8a390c4eb7cfde7ca7742fb Author: Neville Li nevi...@spotify.com Date: 2014-06-09T06:18:27Z [SPARK-2067] use relative path for Spark logo in UI Author: Neville Li nevi...@spotify.com Closes #1006 from nevillelyh/gh/SPARK-2067 and squashes the following commits: 9ee64cf [Neville Li] [SPARK-2067] use relative path for Spark logo in UI (cherry picked from commit 15ddbef414d5fd6d4672936ba3c747b5fb7ab52b) Signed-off-by: Patrick Wendell
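The SPARK-2043 commit message quoted above describes the fix as using a peeking (buffered) iterator so that a key with the next hash code is not consumed early. A minimal sketch of that idea in plain Scala follows. `PeekingRead` and `readSameHash` are hypothetical names chosen for illustration, and this is an assumption about the general technique, not the actual `ExternalAppendOnlyMap` code.

```scala
import scala.collection.BufferedIterator

// Hypothetical helper for illustration; not part of Spark.
object PeekingRead {
  // Consumes and returns the leading run of pairs whose keys share the hash code
  // of the pair at the head of the iterator. The first pair with a different hash
  // code is only peeked at via `head`, so it stays available for the next call.
  def readSameHash[K, V](it: BufferedIterator[(K, V)]): List[(K, V)] = {
    val out = List.newBuilder[(K, V)]
    if (it.hasNext) {
      val hash = it.head._1.hashCode
      while (it.hasNext && it.head._1.hashCode == hash) {
        out += it.next()
      }
    }
    out.result()
  }
}
```

For example, the strings `"Aa"` and `"BB"` have the same `hashCode`, so on an input ordered by hash code such as `Seq(("C", 3), ("Aa", 1), ("BB", 2)).iterator.buffered`, the first call returns only the `"C"` pair and the second call returns both colliding pairs together.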
[GitHub] spark pull request: [SPARK-4594][SQL] Improvement the broadcast fo...
GitHub user Leolh opened a pull request: https://github.com/apache/spark/pull/3452 [SPARK-4594][SQL] Improvement the broadcast for HiveConf Every time we need to get a table from Hive, HadoopTableReader broadcasts the HiveConf to the cluster. Actually, within one application there is a single HiveConf, so I think we can keep it in HiveContext and reuse it for every query. Although it is only 50 KB, this is useful for JDBC users and streaming + SQL apps. You can merge this pull request into a Git repository by running: $ git pull https://github.com/Leolh/spark spark-4594 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3452.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3452 commit 0a4997e00f0a4eccb810a70bee5646669617dfc5 Author: leo leo@leo.localdomain Date: 2014-11-25T09:02:55Z make hiveconf broadcast as singal
[GitHub] spark pull request: [SPARK-4594][SQL] Improvement the broadcast fo...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3452#issuecomment-64331829 Can one of the admins verify this patch?
[GitHub] spark pull request: Branch 1.0
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3451#issuecomment-64331834 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-4592] Avoid duplicate worker registrati...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3447#issuecomment-64332172 [Test build #23828 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23828/consoleFull) for PR 3447 at commit [`1fce6a9`](https://github.com/apache/spark/commit/1fce6a9343d6f563dac0c793480420c6511091ac). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: Branch 1.0
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/3451#issuecomment-64332167 This PR looks messed up or opened in error, can it be closed?
[GitHub] spark pull request: [SPARK-4592] Avoid duplicate worker registrati...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3447#issuecomment-64332179 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23828/ Test PASSed.
[GitHub] spark pull request: [SPARK-4508] [SQL] build native date type to c...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/3381#issuecomment-64332371 @adrian-wang Yea, Jenkins compiles Spark SQL with both Hive 0.12.0 and 0.13.1, and then runs SQL tests against 0.13.1.
[GitHub] spark pull request: [SPARK-4592] Avoid duplicate worker registrati...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3447#issuecomment-64336089 [Test build #23830 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23830/consoleFull) for PR 3447 at commit [`83b321c`](https://github.com/apache/spark/commit/83b321cc02e4dfb47541c7dd13f65f98012316ef). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4592] Avoid duplicate worker registrati...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3447#issuecomment-64336176 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23830/ Test PASSed.
[GitHub] spark pull request: [Spark-4509] Revert EC2 tag-based cluster memb...
GitHub user mengxr opened a pull request: https://github.com/apache/spark/pull/3453 [Spark-4509] Revert EC2 tag-based cluster membership patch This PR reverts changes related to tag-based cluster membership. As discussed in SPARK-3332, we didn't figure out a safe strategy to use tags to determine cluster membership, because tagging is not atomic. The following changes are reverted: SPARK-2333: 94053a7b766788bb62e2dbbf352ccbcc75f71fc0 SPARK-3213: 7faf755ae4f0cf510048e432340260a6e609066d SPARK-3608: 78d4220fa0bf2f9ee663e34bbf3544a5313b02f0. I tested launch, login, and destroy. It is easy to check the diff by comparing it to Josh's patch for branch-1.1: https://github.com/apache/spark/pull/2225/files @JoshRosen I sent the PR to master. It might be easier for us to keep master and branch-1.2 the same at this time. We can always re-apply the patch once we figure out a stable solution. You can merge this pull request into a Git repository by running: $ git pull https://github.com/mengxr/spark SPARK-4509 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3453.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3453 commit 35963a1ba2a94e18adf07d6879aeb53379cd8f14 Author: Xiangrui Meng m...@databricks.com Date: 2014-11-25T09:11:28Z Revert SPARK-3608 Break if the instance tag naming succeeds This reverts commit 78d4220fa0bf2f9ee663e34bbf3544a5313b02f0. commit 4298ea52d4ffddd3a209684a991918f3114f44d4 Author: Xiangrui Meng m...@databricks.com Date: 2014-11-25T09:16:59Z revert 7faf755ae4f0cf510048e432340260a6e609066d commit f0b708bb125ebde0f65a1d6130a5168793ca8a66 Author: Xiangrui Meng m...@databricks.com Date: 2014-11-25T09:21:38Z revert 94053a7b766788bb62e2dbbf352ccbcc75f71fc0
[GitHub] spark pull request: [SPARK-4581][MLlib] Refactorize StandardScaler...
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/3435#discussion_r20852499 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/StandardScaler.scala --- @@ -97,30 +97,57 @@ class StandardScalerModel private[mllib] ( override def transform(vector: Vector): Vector = { require(mean.size == vector.size) if (withMean) { - vector.toBreeze match { - case dv: BDV[Double] => - val output = vector.toBreeze.copy - var i = 0 - while (i < output.length) { - output(i) = (output(i) - mean(i)) * (if (withStd) factor(i) else 1.0) - i += 1 + // By default, Scala generates Java methods for member variables. So every time when + // the member variables are accessed, `invokespecial` will be called which is expensive. + // This can be avoid by having a local reference of `shift`. + val localShift = shift --- End diff -- `shift` only holds a reference to `mean.values`. We don't really need to define it as a member and make it lazy. It should give the same performance if we only define it inside the if branch.
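The review comment above suggests defining the local reference inside the `if` branch instead of keeping a lazy member. A minimal sketch of that shape in plain Scala follows. `ScalerSketch` is a hypothetical stand-in that uses plain arrays instead of MLlib vectors, so it only illustrates the structure being discussed and is not the code in the PR.

```scala
// Hypothetical stand-in for the model in the diff; plain arrays replace MLlib vectors.
class ScalerSketch(
    mean: Array[Double],
    factor: Array[Double],
    withMean: Boolean,
    withStd: Boolean) {

  def transform(values: Array[Double]): Array[Double] = {
    require(mean.length == values.length)
    val output = values.clone()
    if (withMean) {
      // Local references are created only in the branch that needs them,
      // so the hot loop below reads local variables rather than members.
      val localShift = mean
      val localFactor = factor
      var i = 0
      while (i < output.length) {
        output(i) = (output(i) - localShift(i)) * (if (withStd) localFactor(i) else 1.0)
        i += 1
      }
    } else if (withStd) {
      val localFactor = factor
      var i = 0
      while (i < output.length) {
        output(i) *= localFactor(i)
        i += 1
      }
    }
    output
  }
}
```

Whether this actually matches the performance of the member-based version would need to be measured; the sketch only shows that no extra member or `lazy val` is required to get a local reference.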
[GitHub] spark pull request: [SPARK-4597] Use proper exception and reset va...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3449#issuecomment-64344035 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23831/ Test PASSed.
[GitHub] spark pull request: [SPARK-4597] Use proper exception and reset va...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3449#issuecomment-64344002 [Test build #23831 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23831/consoleFull) for PR 3449 at commit [`36cacbd`](https://github.com/apache/spark/commit/36cacbd1f2f5cfa4f2cb0814650ba439e2cff3f3). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4596][MLLib] Refactorize Normalizer to ...
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/3446#issuecomment-64345544 LGTM. Merged into master and branch-1.2. Thanks! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [Spark-4509] Revert EC2 tag-based cluster memb...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3453#issuecomment-64347503 [Test build #23836 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23836/consoleFull) for PR 3453 at commit [`f0b708b`](https://github.com/apache/spark/commit/f0b708bb125ebde0f65a1d6130a5168793ca8a66). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4543] Javadoc failure for network-commo...
Github user ueshin commented on the pull request: https://github.com/apache/spark/pull/3405#issuecomment-64351835 Hi, is this related to #3058? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4526][MLLIB]GradientDescent get a wrong...
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/3399#issuecomment-64354201 LGTM. Merged into master and branch-1.2. The JIRA number should be SPARK-4530 instead of SPARK-4526. Could you update the title and then close this PR? Thanks! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4583] [mllib] LogLoss for GradientBoost...
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/3439#discussion_r20853259 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/tree/loss/SquaredError.scala --- @@ -49,18 +48,17 @@ object SquaredError extends Loss { } /** - * Method to calculate error of the base learner for the gradient boosting calculation. + * Method to calculate loss of the base learner for the gradient boosting calculation. * Note: This method is not used by the gradient boosting algorithm but is useful for debugging * purposes. - * @param model Model of the weak learner. + * @param model Ensemble model * @param data Training dataset: RDD of [[org.apache.spark.mllib.regression.LabeledPoint]]. - * @return + * @return Mean squared error of model on data --- End diff -- `MSE` is not usually defined with multiplier `1/2`. Shall we use a different name here, for example, `mean squared loss` or `average loss`?
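To make the naming concern concrete, here is a small Python sketch comparing the conventional mean squared error with a variant scaled by `1/2`. It assumes, as the comment implies, that the loss under review carries that `1/2` multiplier; both function names are hypothetical and nothing here is MLlib code.

```python
def mean_squared_error(labels, predictions):
    """Conventional MSE: the average of squared residuals."""
    if not labels or len(labels) != len(predictions):
        raise ValueError("need two non-empty sequences of equal length")
    total = sum((y - p) ** 2 for y, p in zip(labels, predictions))
    return total / len(labels)


def mean_half_squared_loss(labels, predictions):
    """Average of (1/2) * residual^2; assumed form of the loss under review."""
    return 0.5 * mean_squared_error(labels, predictions)
```

The two values always differ by a factor of two, so documenting the second one as "mean squared error" would mislead a reader who compares it against another tool's MSE.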
[GitHub] spark pull request: [SPARK-4599] [Build] [SQL] add hive profile in...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3450#issuecomment-64376530 [Test build #23834 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23834/consoleFull) for PR 3450 at commit [`cbb2d33`](https://github.com/apache/spark/commit/cbb2d330c9b47c993acb5dce5f65cf6b493374bd). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4599] [Build] [SQL] add hive profile in...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3450#issuecomment-64376615 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23834/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4592] Avoid duplicate worker registrati...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3447#issuecomment-64381651 [Test build #23835 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23835/consoleFull) for PR 3447 at commit [`79286dc`](https://github.com/apache/spark/commit/79286dc3e027d138bf13ef55f190b95844afae0e). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4592] Avoid duplicate worker registrati...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3447#issuecomment-64381659 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23835/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4530][MLLIB]GradientDescent get a wrong...
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/3399#issuecomment-64381732 @mengxr The title has been updated. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-1450 [EC2] Specify the default zone in t...
GitHub user srowen opened a pull request: https://github.com/apache/spark/pull/3454 SPARK-1450 [EC2] Specify the default zone in the EC2 script help This looks like a one-liner, so I took a shot at it. There can be no fixed default availability zone since the names are different per region. But the default behavior can be documented: ``` if opts.zone == "": opts.zone = random.choice(conn.get_all_zones()).name ``` You can merge this pull request into a Git repository by running: $ git pull https://github.com/srowen/spark SPARK-1450 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3454.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3454 commit 9193cf3f8e9ae0e2d30a7c50a7be06440a006f91 Author: Sean Owen so...@cloudera.com Date: 2014-11-25T10:48:23Z Document that --zone defaults to a single random zone
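The default described above can be sketched without an EC2 connection. In this minimal Python illustration, `resolve_zone` and its parameters are hypothetical, and a plain list of zone names stands in for whatever the region reports; only `random.choice` is taken from the quoted snippet.

```python
import random


def resolve_zone(requested_zone, available_zones, rng=random):
    """Return the requested zone, or one random available zone if none was given.

    Hypothetical helper: available_zones stands in for the zone names
    that the EC2 connection would report for the current region.
    """
    if requested_zone == "":
        if not available_zones:
            raise ValueError("no availability zones to choose from")
        return rng.choice(available_zones)
    return requested_zone
```

Because zone names differ per region, the choice has to be made at run time from the zones that actually exist, which is why no fixed default can be listed in the help text.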
[GitHub] spark pull request: SPARK-1450 [EC2] Specify the default zone in t...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3454#issuecomment-64382910 [Test build #23837 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23837/consoleFull) for PR 3454 at commit [`9193cf3`](https://github.com/apache/spark/commit/9193cf3f8e9ae0e2d30a7c50a7be06440a006f91). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: Delete unnecessary function
GitHub user KaiXinXiaoLei reopened a pull request: https://github.com/apache/spark/pull/3224 Delete unnecessary function When building Spark with sbt, the function `runAlternateBoot` in sbt/sbt-launch-lib.bash is not used, and it is not used by Spark code either. So I think this function is not necessary. The sbt.boot.properties option can be configured on the command line when building Spark, e.g.: sbt/sbt assembly -Dsbt.boot.properties=$bootpropsfile. You can merge this pull request into a Git repository by running: $ git pull https://github.com/KaiXinXiaoLei/spark deleteFunction Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3224.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3224 commit efe36d4dda1c56b027afb6604f0996f019995c89 Author: KaiXinXiaoLei huleil...@huawei.com Date: 2014-11-12T09:45:21Z Delete unnecessary function
[GitHub] spark pull request: Delete unnecessary function
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3224#issuecomment-64384870 Can one of the admins verify this patch? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: Delete unnecessary function
Github user KaiXinXiaoLei commented on the pull request: https://github.com/apache/spark/pull/3224#issuecomment-64385026 The file from https://github.com/sbt/sbt-launcher-package has changed, and the function `runAlternateBoot` has been deleted in the upstream project. I think the Spark project should delete this function from sbt/sbt-launch-lib.bash. Thanks.
[GitHub] spark pull request: Delete unnecessary function
Github user KaiXinXiaoLei commented on the pull request: https://github.com/apache/spark/pull/3224#issuecomment-64385336 The file from https://github.com/sbt/sbt-launcher-package has changed, and the function `runAlternateBoot` has been deleted in the upstream project. I think the Spark project should delete this function from sbt/sbt-launch-lib.bash. Thanks.
[GitHub] spark pull request: [Spark-4509] Revert EC2 tag-based cluster memb...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3453#issuecomment-64386611 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23836/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [Spark-4509] Revert EC2 tag-based cluster memb...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3453#issuecomment-64386599 [Test build #23836 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23836/consoleFull) for PR 3453 at commit [`f0b708b`](https://github.com/apache/spark/commit/f0b708bb125ebde0f65a1d6130a5168793ca8a66). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4535][Streaming] Fix the error in comme...
Github user tdas commented on the pull request: https://github.com/apache/spark/pull/3400#issuecomment-64389631 @watermen Thanks, these are great fixes.
[GitHub] spark pull request: [SPARK-4535][Streaming] Fix the error in comme...
Github user tdas commented on the pull request: https://github.com/apache/spark/pull/3400#issuecomment-64389589 Jenkins, this is ok to test. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4535][Streaming] Fix the error in comme...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3400#issuecomment-64390206 [Test build #23838 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23838/consoleFull) for PR 3400 at commit [`75d795c`](https://github.com/apache/spark/commit/75d795ce9f64ca1cde1e9c360987db7e2ca41337). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-1450 [EC2] Specify the default zone in t...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3454#issuecomment-64391330 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23837/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: SPARK-1450 [EC2] Specify the default zone in t...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3454#issuecomment-64391321 [Test build #23837 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23837/consoleFull) for PR 3454 at commit [`9193cf3`](https://github.com/apache/spark/commit/9193cf3f8e9ae0e2d30a7c50a7be06440a006f91). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4601][Streaming] Set correct call site ...
GitHub user tdas opened a pull request: https://github.com/apache/spark/pull/3455 [SPARK-4601][Streaming] Set correct call site for streaming jobs so that it is displayed correctly on the Spark UI When running the NetworkWordCount, the description of the word count jobs is set to "getCallsite at DStream:xxx". It should instead be the line number in the streaming application of the output operation that led to the job being created. The cause is that the call site is set incorrectly in the thread that launches the jobs. This PR fixes that. You can merge this pull request into a Git repository by running: $ git pull https://github.com/tdas/spark streaming-callsite-fix Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3455.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3455 commit 69fc26fd870361c345f6482737d6192949ab46b4 Author: Tathagata Das tathagata.das1...@gmail.com Date: 2014-11-25T13:08:32Z Set correct call site for streaming jobs so that it is displayed correctly on the Spark UI
[GitHub] spark pull request: [SPARK-4601][Streaming] Set correct call site ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3455#issuecomment-64398664 [Test build #23839 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23839/consoleFull) for PR 3455 at commit [`69fc26f`](https://github.com/apache/spark/commit/69fc26fd870361c345f6482737d6192949ab46b4). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4598] use pagination to show tasktable
GitHub user XuTingjun opened a pull request: https://github.com/apache/spark/pull/3456 [SPARK-4598] use pagination to show tasktable When an application has too many tasks, a tasktable that holds all of them costs a lot of memory. With pagination, the tasktable shows only a subset of the tasks at a time, which reduces memory usage. You can merge this pull request into a Git repository by running: $ git pull https://github.com/XuTingjun/spark patch-1 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3456.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3456 commit 5d75a606bd51e9cb07eddef4e4fd555cabab1b5a Author: meiyoula 1039320...@qq.com Date: 2014-11-25T13:20:12Z Update StagePage.scala
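The idea of showing only one slice of the task list per request can be sketched as follows. This is a generic Python illustration under stated assumptions, not the change made to StagePage.scala: `paginate`, `page` and `page_size` are hypothetical names, and a plain list stands in for the task data.

```python
def paginate(items, page, page_size):
    """Return the items belonging to a 1-based page number.

    Hypothetical helper: only page_size items are materialized for
    rendering, regardless of how long the full list is.
    """
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    start = (page - 1) * page_size
    return items[start:start + page_size]


def page_count(total_items, page_size):
    """Number of pages needed to show total_items (ceiling division)."""
    if page_size < 1:
        raise ValueError("page_size must be positive")
    return (total_items + page_size - 1) // page_size
```

A page past the end simply yields an empty list, so the caller can render an empty table instead of failing.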
[GitHub] spark pull request: [SPARK-4598] use pagination to show tasktable
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3456#issuecomment-64398877 Can one of the admins verify this patch? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4535][Streaming] Fix the error in comme...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3400#issuecomment-64398984 [Test build #23838 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23838/consoleFull) for PR 3400 at commit [`75d795c`](https://github.com/apache/spark/commit/75d795ce9f64ca1cde1e9c360987db7e2ca41337). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4535][Streaming] Fix the error in comme...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3400#issuecomment-64398987 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23838/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4381][Streaming]Add warning log when us...
Github user tdas commented on the pull request: https://github.com/apache/spark/pull/3244#issuecomment-64399467 Since no unit tests cover this change, I tested it manually. It works as expected, so I am merging this.
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/3409#discussion_r20864416 --- Diff: conf/spark-defaults.conf.template --- @@ -8,3 +8,4 @@ # spark.serializer org.apache.spark.serializer.KryoSerializer # spark.driver.memory 5g # spark.executor.extraJavaOptions -XX:+PrintGCDetails -Dkey=value -Dnumbers=one two three +# spark.yarn.am.extraJavaOptions -XX:+PrintGCDetails -Dkey=value -Dnumbers=one two three --- End diff -- I would remove this from here since it's YARN-specific. Can you also please document it in docs/running-on-yarn.md? We should be clear about which mode this applies to, since it will only differ from the driver extraJavaOptions in client mode, correct?
[GitHub] spark pull request: [SPARK-4534][Core]JavaSparkContext create new ...
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/3403#issuecomment-64405670 I agree with Sandy on this. Can you close this until we get SPARK-2089 working? Then we can make sure it works with the Java API as well.
[GitHub] spark pull request: [SPARK-4344][DOCS] adding documentation on spa...
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/3209#issuecomment-64405811 test this please --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4344][DOCS] adding documentation on spa...
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/3209#issuecomment-64405912 @vanzin I think I'll pull this in and you will have to remove it again in https://github.com/apache/spark/pull/3233 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4344][DOCS] adding documentation on spa...
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/3209#issuecomment-64406285 I pulled this into both master and branch-1.2 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4451]force to kill process after 5 seco...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3316#issuecomment-64406710 [Test build #23840 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23840/consoleFull) for PR 3316 at commit [`88bd312`](https://github.com/apache/spark/commit/88bd312a52efc539719afb4221e469a495305ce0). * This patch merges cleanly.
[GitHub] spark pull request: [Spark-4512] [SQL] Unresolved Attribute Except...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3386#issuecomment-64407794 [Test build #23841 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23841/consoleFull) for PR 3386 at commit [`6e720af`](https://github.com/apache/spark/commit/6e720af49428817c6f48d5e161b34e182a31b872). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4601][Streaming] Set correct call site ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3455#issuecomment-64409916 [Test build #23839 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23839/consoleFull) for PR 3455 at commit [`69fc26f`](https://github.com/apache/spark/commit/69fc26fd870361c345f6482737d6192949ab46b4). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4601][Streaming] Set correct call site ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3455#issuecomment-64409929 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23839/ Test PASSed.
[GitHub] spark pull request: [SPARK-4196][SPARK-4602] Fix serialization iss...
GitHub user tdas opened a pull request: https://github.com/apache/spark/pull/3457
[SPARK-4196][SPARK-4602] Fix serialization issue in PairDStreamFunctions.saveAsNewAPIHadoopFiles
Solves two JIRAs in one shot:
- Makes the ForeachDStream created by saveAsNewAPIHadoopFiles serializable for checkpoints
- Makes the default configuration object used by saveAsNewAPIHadoopFiles Spark's Hadoop configuration
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/tdas/spark savefiles-fix
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3457.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3457
commit b382ea9facd1cd70b254319811fe9600503e0286
Author: Tathagata Das tathagata.das1...@gmail.com
Date: 2014-11-25T15:05:29Z
Fix serialization issue in PairDStreamFunctions.saveAsNewAPIHadoopFiles.
[GitHub] spark pull request: [SPARK-4196][SPARK-4602][Streaming] Fix serial...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/3457#discussion_r20868146
--- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/PairDStreamFunctions.scala ---
@@ -702,11 +699,14 @@ class PairDStreamFunctions[K, V](self: DStream[(K,V)])
       keyClass: Class[_],
       valueClass: Class[_],
       outputFormatClass: Class[_ <: NewOutputFormat[_, _]],
-      conf: Configuration = new Configuration
+      conf: Configuration = ssc.sparkContext.hadoopConfiguration
     ) {
+    // Wrap this in SerializableWritable so that ForeachDStream can be serialized for checkpoints
+    val serializableConf = new SerializableWritable(conf)
--- End diff --
Ah you're already on it, yep, that looks good.
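For readers following the diff above, here is a minimal sketch of the general idea behind wrapping a non-serializable configuration object so that it survives Java serialization (as checkpointing requires). It is not the code from the pull request and does not use Spark or Hadoop: `FakeConf`, `SerializableConf`, and `Demo.roundTrip` are hypothetical names invented for this illustration, with `FakeConf` standing in for a configuration class that can write and read its own state but is not `java.io.Serializable`.

```scala
import java.io._

// Hypothetical stand-in for a configuration class: NOT java.io.Serializable,
// but able to write and read its own state through DataOutput/DataInput.
class FakeConf(var settings: Map[String, String] = Map.empty) {
  def write(out: DataOutput): Unit = {
    out.writeInt(settings.size)
    settings.foreach { case (k, v) =>
      out.writeUTF(k)
      out.writeUTF(v)
    }
  }

  def readFields(in: DataInput): Unit = {
    val n = in.readInt()
    settings = (0 until n).map { _ =>
      val k = in.readUTF()
      val v = in.readUTF()
      k -> v
    }.toMap
  }
}

// Hypothetical wrapper: the wrapped value is @transient, so default Java
// serialization skips it; writeObject/readObject save and restore it by hand.
class SerializableConf(@transient var value: FakeConf) extends Serializable {
  private def writeObject(out: ObjectOutputStream): Unit = {
    out.defaultWriteObject()
    value.write(out)
  }

  private def readObject(in: ObjectInputStream): Unit = {
    in.defaultReadObject()
    value = new FakeConf()
    value.readFields(in)
  }
}

object Demo {
  // Serializes an object to bytes and reads it back, as a checkpoint would.
  def roundTrip[T <: AnyRef](obj: T): T = {
    val bytes = new ByteArrayOutputStream()
    val oos = new ObjectOutputStream(bytes)
    try oos.writeObject(obj) finally oos.close()
    val ois = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
    try ois.readObject().asInstanceOf[T] finally ois.close()
  }
}
```

Passing a bare `FakeConf` to `Demo.roundTrip` would fail with `java.io.NotSerializableException`, while a `SerializableConf` wrapping the same object round-trips with its settings intact. A closure that captures only the wrapper can therefore be serialized.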