[GitHub] spark pull request: [SPARK-4508] [SQL] build native date type to c...

2014-11-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3381#issuecomment-64322393
  
  [Test build #23827 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23827/consoleFull)
 for   PR 3381 at commit 
[`0c6cab6`](https://github.com/apache/spark/commit/0c6cab68aa7b4fad87968afa70888ed5b0fe3bbf).
 * This patch merges cleanly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-4573] [SQL] Add SettableStructObjectIns...

2014-11-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3429#issuecomment-64322381
  
  [Test build #23826 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23826/consoleFull)
 for   PR 3429 at commit 
[`932940d`](https://github.com/apache/spark/commit/932940d17efee84b83103b1da918c107226aa643).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4581][MLlib] Refactorize StandardScaler...

2014-11-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3435#issuecomment-64322398
  
  [Test build #23825 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23825/consoleFull)
 for   PR 3435 at commit 
[`85885a9`](https://github.com/apache/spark/commit/85885a98125e41a323212499eb7a8c6895f8c252).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4592] Avoid duplicate worker registrati...

2014-11-25 Thread andrewor14
GitHub user andrewor14 opened a pull request:

https://github.com/apache/spark/pull/3447

[SPARK-4592] Avoid duplicate worker registrations in standalone mode

**Symptom.** On failover, the Master may receive duplicate registrations 
from the same worker, causing the worker to exit.

**Cause.** This commit 
https://github.com/apache/spark/commit/4afe9a4852ebeb4cc77322a14225cd3dec165f3f 
adds logic for the worker to re-register with the master in case of failures. 
However, the following race condition may occur:

(1) Master A fails and Worker attempts to reconnect to all masters
(2) Master B takes over and notifies Worker
(3) Worker responds by registering with Master B
(4) Meanwhile, Worker's previous reconnection attempt reaches Master B, 
causing the same Worker to register with Master B twice

**Fix.** Instead of attempting to register with all known masters, the 
worker should re-register with only the one that it has been communicating 
with. Then, when it is finally notified of the change in master, the worker 
gives up on the old master and communicates with the new one.

**Caveat.** Even this fix is subject to more obscure race conditions. For 
instance, if Master B fails and Master A recovers immediately, then Master A 
may still observe duplicate worker registrations. However, this, and other 
potential race conditions summarized in 
[SPARK-4592](https://issues.apache.org/jira/browse/SPARK-4592), are much, much 
less likely than the one described above, which is deterministically 
reproducible.
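
The race in steps (1) to (4) and the effect of the fix can be illustrated with a small single-threaded model. This is only a sketch under stated assumptions, not Spark's Worker or Master code: `MasterModel`, `WorkerModel`, and every method name below are hypothetical, and message delivery is simulated by direct method calls.

```scala
// Hypothetical model of the failover race; none of these names come from Spark.
object FailoverSketch {
  // A master counts the registration requests it receives per worker id.
  final class MasterModel {
    private var registrations = Map.empty[String, Int]
    def register(workerId: String): Unit =
      registrations = registrations.updated(workerId, registrationCount(workerId) + 1)
    def registrationCount(workerId: String): Int = registrations.getOrElse(workerId, 0)
  }

  final class WorkerModel(val id: String, allMasters: Seq[MasterModel], initial: MasterModel) {
    private var active: MasterModel = initial

    // Old behaviour: on failure, send a registration request to every known master.
    def reconnectToAll(): Unit = allMasters.foreach(_.register(id))

    // Proposed behaviour: retry only the master the worker has been talking to.
    def reconnectToActive(): Unit = active.register(id)

    // Steps (2) and (3): the new master announces itself and the worker registers with it.
    def masterChanged(newMaster: MasterModel): Unit = {
      active = newMaster
      newMaster.register(id)
    }
  }

  // Returns how many registrations Master B receives from the worker.
  def registrationsAtB(useFix: Boolean): Int = {
    val a = new MasterModel
    val b = new MasterModel
    val worker = new WorkerModel("worker-1", Seq(a, b), initial = a)
    // (1) Master A fails; the worker attempts to reconnect.
    if (useFix) worker.reconnectToActive() else worker.reconnectToAll()
    // (2)-(3) Master B takes over, notifies the worker, and the worker registers.
    worker.masterChanged(b)
    b.registrationCount("worker-1")
  }
}
```

In this model, `FailoverSketch.registrationsAtB(useFix = false)` counts two registrations at Master B, while `useFix = true` counts one. The model does not cover the rarer races mentioned in the caveat.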

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/andrewor14/spark standalone-failover

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/3447.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3447


commit b6f269e6460ecc441c319b5e92437e47d141c361
Author: Andrew Or and...@databricks.com
Date:   2014-11-25T07:40:00Z

Avoid duplicate worker registrations

The gist is that we only reconnect to the master we've been
communicating with instead of making a registration request
to all known masters. More details in the code comments.

commit 1fce6a9343d6f563dac0c793480420c6511091ac
Author: Andrew Or and...@databricks.com
Date:   2014-11-25T08:06:04Z

Active master actor could be null in the beginning

If a worker cannot initially reach a master, then it will attempt
a retry. In this case, the active master actor must be null. This
commit removes an assert that falsely assumes the contrary.







[GitHub] spark pull request: [SPARK-4593] [SQL] return null when divider is...

2014-11-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3443#issuecomment-64323518
  
  [Test build #23821 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23821/consoleFull)
 for   PR 3443 at commit 
[`2dfe50f`](https://github.com/apache/spark/commit/2dfe50f607a6dced1554d07daa3a8d2e9664ffa9).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4593] [SQL] return null when divider is...

2014-11-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3443#issuecomment-64323521
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23821/





[GitHub] spark pull request: [SPARK-4573] [SQL] Add SettableStructObjectIns...

2014-11-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3429#issuecomment-64323707
  
  [Test build #23829 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23829/consoleFull)
 for   PR 3429 at commit 
[`f1b6749`](https://github.com/apache/spark/commit/f1b67499656b0170866405b248876c1ba4652822).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4592] Avoid duplicate worker registrati...

2014-11-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3447#issuecomment-64323736
  
  [Test build #23828 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23828/consoleFull)
 for   PR 3447 at commit 
[`1fce6a9`](https://github.com/apache/spark/commit/1fce6a9343d6f563dac0c793480420c6511091ac).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-3736] Workers reconnect when disassocia...

2014-11-25 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/2828#issuecomment-64323860
  
Andrew's got a patch for this: #3447





[GitHub] spark pull request: [SPARK-4594][SQL] Improvement the broadcast fo...

2014-11-25 Thread Leolh
GitHub user Leolh opened a pull request:

https://github.com/apache/spark/pull/3448

[SPARK-4594][SQL] Improvement the broadcast for HiveConf

https://issues.apache.org/jira/browse/SPARK-4594
Every time we need to get a table from Hive, HadoopTableReader broadcasts 
the HiveConf to the cluster. Within one application there is only a single 
HiveConf, so I think we can keep it in HiveContext and reuse it for every 
query. Although it is just 50 KB, this is useful for JDBC users and for 
streaming + SQL apps.
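
The idea of broadcasting the configuration once per application instead of once per table read can be sketched without Spark. In the sketch below, `ContextModel`, `ConfModel`, `BroadcastModel`, and `broadcast` are hypothetical stand-ins (they are not HiveContext, HiveConf, or `SparkContext.broadcast`), and the counter exists only to make the number of broadcasts visible.

```scala
// Hypothetical stand-ins; this does not use Spark or Hive classes.
object BroadcastOnceSketch {
  final case class ConfModel(settings: Map[String, String])
  final case class BroadcastModel[T](value: T)

  final class ContextModel(conf: ConfModel) {
    var broadcastCount = 0

    // Stand-in for shipping a value to the cluster.
    private def broadcast[T](value: T): BroadcastModel[T] = {
      broadcastCount += 1
      BroadcastModel(value)
    }

    // Per-read behaviour: every table read broadcasts the configuration again.
    def readTablePerRead(table: String): BroadcastModel[ConfModel] = broadcast(conf)

    // Cached behaviour: the first use broadcasts, later reads reuse the handle.
    private lazy val cachedConf: BroadcastModel[ConfModel] = broadcast(conf)
    def readTableCached(table: String): BroadcastModel[ConfModel] = cachedConf
  }
}
```

Caching like this is only safe if the configuration does not change between queries, which is the assumption the description above makes.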

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/Leolh/spark spark-4594

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/3448.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3448


commit 4a21f4efb58108d00fc9eddd1192937a83c77c1e
Author: Leolh leosand...@gmail.com
Date:   2014-11-17T07:38:19Z

Update MetadataCleaner.scala

Fix a little mistake about delaySeconds .







[GitHub] spark pull request: [SPARK-4594][SQL] Improvement the broadcast fo...

2014-11-25 Thread Leolh
Github user Leolh closed the pull request at:

https://github.com/apache/spark/pull/3448





[GitHub] spark pull request: [SPARK-4597] Use proper exception and reset va...

2014-11-25 Thread viirya
GitHub user viirya opened a pull request:

https://github.com/apache/spark/pull/3449

[SPARK-4597] Use proper exception and reset variable

`File.exists()` and `File.mkdirs()` can only throw `SecurityException`, not 
`IOException`. In addition, when an exception is thrown, `dir` should be 
reset to `null`.
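
A failed `mkdirs()` is reported through its `Boolean` return value, not through an exception. A minimal sketch of that behaviour follows; the object and method names are made up for illustration, and the temporary parent directory comes from `java.nio.file.Files`, which is not what the patched code uses.

```scala
import java.io.File
import java.nio.file.Files

object MkdirsSketch {
  // Returns (first, second): the results of calling mkdirs() twice on the same path.
  def mkdirsTwice(): (Boolean, Boolean) = {
    val parent = Files.createTempDirectory("mkdirs-sketch").toFile
    val child = new File(parent, "child")
    try {
      val first = child.mkdirs()  // creates the directory
      val second = child.mkdirs() // already exists: reports false, throws nothing
      (first, second)
    } finally {
      // Clean up the directories created for the demonstration.
      child.delete()
      parent.delete()
    }
  }
}
```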

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/viirya/spark-1 fix_createtempdir

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/3449.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3449


commit 36cacbd1f2f5cfa4f2cb0814650ba439e2cff3f3
Author: Liang-Chi Hsieh vii...@gmail.com
Date:   2014-11-25T08:12:54Z

Use proper exception and reset variable.







[GitHub] spark pull request: [SPARK-4526][MLLIB]GradientDescent get a wrong...

2014-11-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3399#issuecomment-64324351
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23820/





[GitHub] spark pull request: [SPARK-4526][MLLIB]GradientDescent get a wrong...

2014-11-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3399#issuecomment-64324344
  
  [Test build #23820 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23820/consoleFull)
 for   PR 3399 at commit 
[`13cb228`](https://github.com/apache/spark/commit/13cb228a4e059f39997a6cd235ff87279b0cf854).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4597] Use proper exception and reset va...

2014-11-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3449#issuecomment-64324372
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SPARK-4597] Use proper exception and reset va...

2014-11-25 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/3449#issuecomment-64324781
  
Jenkins, this is ok to test.





[GitHub] spark pull request: [SPARK-4592] Avoid duplicate worker registrati...

2014-11-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3447#issuecomment-64324812
  
  [Test build #23830 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23830/consoleFull)
 for   PR 3447 at commit 
[`83b321c`](https://github.com/apache/spark/commit/83b321cc02e4dfb47541c7dd13f65f98012316ef).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4597] Use proper exception and reset va...

2014-11-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3449#issuecomment-64325487
  
  [Test build #23831 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23831/consoleFull)
 for   PR 3449 at commit 
[`36cacbd`](https://github.com/apache/spark/commit/36cacbd1f2f5cfa4f2cb0814650ba439e2cff3f3).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4573] [SQL] Add SettableStructObjectIns...

2014-11-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3429#issuecomment-64325735
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23826/





[GitHub] spark pull request: [SPARK-4573] [SQL] Add SettableStructObjectIns...

2014-11-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3429#issuecomment-64325733
  
  [Test build #23826 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23826/consoleFull)
 for   PR 3429 at commit 
[`932940d`](https://github.com/apache/spark/commit/932940d17efee84b83103b1da918c107226aa643).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [Spark-4512] [SQL] Unresolved Attribute Except...

2014-11-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3386#issuecomment-64325757
  
  [Test build #23832 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23832/consoleFull)
 for   PR 3386 at commit 
[`e0047a0`](https://github.com/apache/spark/commit/e0047a02183c52fa637e2808d94fc4b98fbe18c8).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4233] [SQL] WIP:Simplify the UDAF API (...

2014-11-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3247#issuecomment-64325755
  
  [Test build #23833 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23833/consoleFull)
 for   PR 3247 at commit 
[`bb1eb2d`](https://github.com/apache/spark/commit/bb1eb2dfd14142defb43efb39cda8cb1cce460e4).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4233] [SQL] WIP:Simplify the UDAF API (...

2014-11-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3247#issuecomment-64325852
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23833/





[GitHub] spark pull request: [SPARK-4233] [SQL] WIP:Simplify the UDAF API (...

2014-11-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3247#issuecomment-64325848
  
  [Test build #23833 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23833/consoleFull)
 for   PR 3247 at commit 
[`bb1eb2d`](https://github.com/apache/spark/commit/bb1eb2dfd14142defb43efb39cda8cb1cce460e4).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class RandomForestModel(JavaModelWrapper):`
  * `class RandomForest(object):`
  * `case class UnresolvedFunction(`
  * `abstract class AggregateFunction `
  * `trait AggregateExpression extends Expression `
  * `case class MinFunction(aggr: BoundReference, base: Min) extends AggregateFunction `
  * `case class Min(child: Expression, distinct: Boolean = false, override val distinctLike: Boolean = true) extends UnaryExpression with AggregateExpression `
  * `case class AverageFunction(count: BoundReference, sum: BoundReference, base: Average) extends AggregateFunction `
  * `case class Average(child: Expression, distinct: Boolean = false) extends UnaryExpression with AggregateExpression `
  * `case class Max(child: Expression) extends UnaryExpression with AggregateExpression `
  * `case class MaxFunction(expr: Expression, base: AggregateExpression) extends AggregateFunction `
  * `case class Count(child: Expression) extends UnaryExpression with AggregateExpression `
  * `case class CountDistinct(expressions: Seq[Expression]) extends UnaryExpression with AggregateExpression `
  * `case class CollectHashSet(expressions: Seq[Expression]) extends UnaryExpression with AggregateExpression `
  * `case class CombineSetsAndCount(inputSet: Expression) extends UnaryExpression with AggregateExpression `
  * `case class ApproxCountDistinctPartition(child: Expression, relativeSD: Double) extends UnaryExpression with AggregateExpression`
  * `case class ApproxCountDistinctMerge(child: Expression, relativeSD: Double) extends UnaryExpression with AggregateExpression `
  * `case class ApproxCountDistinct(child: Expression, relativeSD: Double = 0.05) extends UnaryExpression with AggregateExpression `
  * `case class Sum(child: Expression)  extends UnaryExpression with AggregateExpression`
  * `case class SumDistinct(child: Expression) extends UnaryExpression with AggregateExpression`
  * `case class First(child: Expression)  extends UnaryExpression with AggregateExpression`
  * `case class Last(child: Expression)  extends UnaryExpression with AggregateExpression`
  * `sealed case class AggregateFunctionBind(`
  * `sealed trait Aggregate `
  * `sealed trait PreShuffle extends Aggregate `
  * `sealed trait PostShuffle extends Aggregate `
  * `case class AggregatePreShuffle(`
  * `case class AggregatePostShuffle(`
  * `case class DistinctAggregate(`
  * `class DefaultSource extends RelationProvider `
  * `case class ParquetRelation2(path: String)(@transient val sqlContext: SQLContext)`
  * `abstract class CatalystScan extends BaseRelation `






[GitHub] spark pull request: [SPARK-4597] Use proper exception and reset va...

2014-11-25 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/3449#discussion_r20849000
  
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -262,7 +262,7 @@ private[spark] object Utils extends Logging {
 if (dir.exists() || !dir.mkdirs()) {
   dir = null
 }
-  } catch { case e: IOException => ; }
+  } catch { case e: SecurityException => dir = null; }
--- End diff --

It looks like these two methods can't throw `IOException` after all, is 
that the gist of it? `mkdirs` just returns `false` if it fails, hm. 
https://docs.oracle.com/javase/7/docs/api/java/io/File.html#mkdirs()

`dir = null` is a good bug fix. I might have changed this to not even 
assign `dir` and hold the new `File` in a temp variable until the checks 
succeeded. This looks equivalent though.
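
The temp-variable alternative mentioned here could look roughly like the sketch below. It is not the code in `Utils.scala`: the surrounding retry loop, `maxAttempts`, the `sketch-` name prefix, and the object name are assumptions made for illustration.

```scala
import java.io.{File, IOException}
import java.util.UUID

object TempDirSketch {
  // Tries up to maxAttempts times to create a fresh directory under root.
  def createTempDir(root: String, maxAttempts: Int = 10): File = {
    var dir: File = null
    var attempts = 0
    while (dir == null) {
      attempts += 1
      if (attempts > maxAttempts) {
        throw new IOException(
          s"Failed to create a temp directory under $root after $maxAttempts attempts")
      }
      try {
        // Hold the candidate in a local value; dir is assigned only after the checks pass.
        val candidate = new File(root, "sketch-" + UUID.randomUUID.toString)
        if (!candidate.exists() && candidate.mkdirs()) {
          dir = candidate
        }
      } catch {
        // exists() and mkdirs() declare only SecurityException; dir is still null, so retry.
        case _: SecurityException =>
      }
    }
    dir
  }
}
```

Because `dir` is never assigned before the checks succeed, there is nothing to reset in the `catch` clause, which is why the two approaches end up equivalent.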





[GitHub] spark pull request: [SPARK-4597] Use proper exception and reset va...

2014-11-25 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/3449#discussion_r20849117
  
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -262,7 +262,7 @@ private[spark] object Utils extends Logging {
 if (dir.exists() || !dir.mkdirs()) {
   dir = null
 }
-  } catch { case e: IOException => ; }
+  } catch { case e: SecurityException => dir = null; }
--- End diff --

Yes. The only exception they would throw is `SecurityException`.





[GitHub] spark pull request: [SQL] enable empty aggr test case

2014-11-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3445#issuecomment-64326886
  
  [Test build #23823 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23823/consoleFull)
 for   PR 3445 at commit 
[`982575e`](https://github.com/apache/spark/commit/982575e58835c84d1bb57a0f471141edce4532db).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SQL] enable empty aggr test case

2014-11-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3445#issuecomment-64326891
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23823/





[GitHub] spark pull request: [SPARK-4573] [SQL] Add SettableStructObjectIns...

2014-11-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3429#issuecomment-64327084
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23829/
Test FAILed.





[GitHub] spark pull request: [SPARK-4573] [SQL] Add SettableStructObjectIns...

2014-11-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3429#issuecomment-64327080
  
  [Test build #23829 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23829/consoleFull)
 for   PR 3429 at commit 
[`f1b6749`](https://github.com/apache/spark/commit/f1b67499656b0170866405b248876c1ba4652822).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4599] [Build] [SQL] add hive profile in...

2014-11-25 Thread adrian-wang
GitHub user adrian-wang opened a pull request:

https://github.com/apache/spark/pull/3450

[SPARK-4599] [Build] [SQL] add hive profile in root pom

This restores the state the root pom was in after #2685, which seems to have been reset by #3159.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/adrian-wang/spark profile

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/3450.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3450


commit cbb2d330c9b47c993acb5dce5f65cf6b493374bd
Author: Daoyuan Wang daoyuan.w...@intel.com
Date:   2014-11-25T08:52:07Z

add hive profile in root pom







[GitHub] spark pull request: [SPARK-4599] [Build] [SQL] add hive profile in...

2014-11-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3450#issuecomment-64327986
  
  [Test build #23834 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23834/consoleFull)
 for   PR 3450 at commit 
[`cbb2d33`](https://github.com/apache/spark/commit/cbb2d330c9b47c993acb5dce5f65cf6b493374bd).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4595][Core] Fix MetricsServlet not work...

2014-11-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3444#issuecomment-64328090
  
  [Test build #23822 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23822/consoleFull)
 for   PR 3444 at commit 
[`f779fe0`](https://github.com/apache/spark/commit/f779fe010f79ae70dc9e76cb9abd9edda6d2e16a).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4599] [Build] [SQL] add hive profile in...

2014-11-25 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/3450#discussion_r20849854
  
--- Diff: pom.xml ---
@@ -1394,7 +1394,7 @@
   </dependencies>
 </profile>
 <profile>
-  <id>hive-thriftserver</id>
+  <id>hive</id>
--- End diff --

What's the purpose of this? There are already profiles covering activation 
of Hive itself below; this profile is supposed to be about thriftserver and is 
named accordingly.





[GitHub] spark pull request: [SPARK-4595][Core] Fix MetricsServlet not work...

2014-11-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3444#issuecomment-64328095
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23822/
Test PASSed.





[GitHub] spark pull request: [SPARK-4599] [Build] [SQL] add hive profile in...

2014-11-25 Thread adrian-wang
Github user adrian-wang commented on a diff in the pull request:

https://github.com/apache/spark/pull/3450#discussion_r20850064
  
--- Diff: pom.xml ---
@@ -1394,7 +1394,7 @@
   </dependencies>
 </profile>
 <profile>
-  <id>hive-thriftserver</id>
+  <id>hive</id>
--- End diff --

We can have hive-thriftserver always included when we have -Phive





[GitHub] spark pull request: [SPARK-4508] [SQL] build native date type to c...

2014-11-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3381#issuecomment-64328732
  
  [Test build #23827 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23827/consoleFull)
 for   PR 3381 at commit 
[`0c6cab6`](https://github.com/apache/spark/commit/0c6cab68aa7b4fad87968afa70888ed5b0fe3bbf).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `final class Date extends Ordered[Date] with Serializable `






[GitHub] spark pull request: [SPARK-4508] [SQL] build native date type to c...

2014-11-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3381#issuecomment-64328742
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23827/
Test PASSed.





[GitHub] spark pull request: [Spark-4512] [SQL] Unresolved Attribute Except...

2014-11-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3386#issuecomment-64329427
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23832/
Test FAILed.





[GitHub] spark pull request: [Spark-4512] [SQL] Unresolved Attribute Except...

2014-11-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3386#issuecomment-64329419
  
  [Test build #23832 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23832/consoleFull)
 for   PR 3386 at commit 
[`e0047a0`](https://github.com/apache/spark/commit/e0047a02183c52fa637e2808d94fc4b98fbe18c8).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4592] Avoid duplicate worker registrati...

2014-11-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3447#issuecomment-64330152
  
  [Test build #23835 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23835/consoleFull)
 for   PR 3447 at commit 
[`79286dc`](https://github.com/apache/spark/commit/79286dc3e027d138bf13ef55f190b95844afae0e).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4596][MLLib] Refactorize Normalizer to ...

2014-11-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3446#issuecomment-64330374
  
  [Test build #23824 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23824/consoleFull)
 for   PR 3446 at commit 
[`e20a2b9`](https://github.com/apache/spark/commit/e20a2b97fcec7fcda11d5845674a25c1aace414f).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4596][MLLib] Refactorize Normalizer to ...

2014-11-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3446#issuecomment-64330381
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23824/
Test PASSed.





[GitHub] spark pull request: [SPARK-4581][MLlib] Refactorize StandardScaler...

2014-11-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3435#issuecomment-64331011
  
  [Test build #23825 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23825/consoleFull)
 for   PR 3435 at commit 
[`85885a9`](https://github.com/apache/spark/commit/85885a98125e41a323212499eb7a8c6895f8c252).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4581][MLlib] Refactorize StandardScaler...

2014-11-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3435#issuecomment-64331017
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23825/
Test PASSed.





[GitHub] spark pull request: Branch 1.0

2014-11-25 Thread lowryact
GitHub user lowryact opened a pull request:

https://github.com/apache/spark/pull/3451

Branch 1.0



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/apache/spark branch-1.0

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/3451.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3451


commit 16e3910a0512cd53ad0c9c71ef20a3ee0f10c34f
Author: Matei Zaharia ma...@databricks.com
Date:   2014-06-06T06:01:48Z

SPARK-2043: ExternalAppendOnlyMap doesn't always find matching keys

The current implementation reads one key with the next hash code as it 
finishes reading the keys with the current hash code, which may cause it to 
miss some matches of the next key. This can cause operations like join to give 
the wrong result when reduce tasks spill to disk and there are hash collisions, 
as values won't be matched together. This PR fixes it by not reading in that 
next key, using a peeking iterator instead.

Author: Matei Zaharia ma...@databricks.com

Closes #986 from mateiz/spark-2043 and squashes the following commits:

0959514 [Matei Zaharia] Added unit test for having many hash collisions
892debb [Matei Zaharia] SPARK-2043: don't read a key with the next hash 
code in ExternalAppendOnlyMap, instead use a buffered iterator to only read 
values with the current hash code.

(cherry picked from commit b45c13e7d798f97b92f1a6329528191b8d779c4f)
Signed-off-by: Matei Zaharia ma...@databricks.com
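
The peeking approach described in the SPARK-2043 commit message above can be
sketched roughly as follows. This is a minimal illustration, not the actual
ExternalAppendOnlyMap code: the object name PeekingMerge and the method
readSameHash are hypothetical, and the input is assumed to be already sorted
by key hash code. The point is that `head` inspects the next entry without
consuming it, so the first entry of the following hash group is never lost.

```scala
import scala.collection.BufferedIterator

// Hypothetical sketch; names are illustrative and not taken from Spark.
object PeekingMerge {
  // Returns all consecutive entries whose key shares the hash code of the
  // next entry. Assumes `it` yields entries sorted by key hash code.
  def readSameHash[K, V](it: BufferedIterator[(K, V)]): List[(K, V)] = {
    if (!it.hasNext) {
      Nil
    } else {
      val hash = it.head._1.hashCode
      val group = List.newBuilder[(K, V)]
      // `head` only peeks; `next()` is called once we know the entry belongs
      // to the current hash group, so the following group stays untouched.
      while (it.hasNext && it.head._1.hashCode == hash) {
        group += it.next()
      }
      group.result()
    }
  }

  def main(args: Array[String]): Unit = {
    // "Aa" and "BB" are distinct strings with the same String hash code.
    val entries = List("Aa" -> 1, "BB" -> 2, "C" -> 3)
      .sortBy(_._1.hashCode)
      .iterator
      .buffered
    while (entries.hasNext) {
      println(readSameHash(entries))
    }
  }
}
```

Each call returns one complete hash group, so colliding keys such as "Aa" and
"BB" are always seen together by whatever merges their values.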

commit d3717bea951888fe64cc2a0119d23b641b030735
Author: Michael Armbrust mich...@databricks.com
Date:   2014-06-06T06:20:59Z

[SPARK-2050][SQL] LIKE, RLIKE and IN in HQL should not be case sensitive.

Author: Michael Armbrust mich...@databricks.com

Closes #989 from marmbrus/caseSensitiveFuncitons and squashes the following 
commits:

681de54 [Michael Armbrust] LIKE, RLIKE and IN in HQL should not be case 
sensitive.

(cherry picked from commit 41db44c428a10f4453462d002d226798bb8fbdda)
Signed-off-by: Reynold Xin r...@apache.org

commit d7467484ff08a5f9a566d3a7b21bab426ff89127
Author: Michael Armbrust mich...@databricks.com
Date:   2014-06-06T18:31:37Z

[SPARK-2050 - 2][SQL] DIV and BETWEEN should not be case sensitive.

Followup: #989

Author: Michael Armbrust mich...@databricks.com

Closes #994 from marmbrus/caseSensitiveFunctions2 and squashes the 
following commits:

9d9c8ed [Michael Armbrust] Fix DIV and BETWEEN.

(cherry picked from commit 8d210560be8b143e48abfbaca347f383b5aa4798)
Signed-off-by: Michael Armbrust mich...@databricks.com

commit 39cfa9c0be34d4baf9de4eb9f9191c7b406c4d59
Author: Michael Armbrust mich...@databricks.com
Date:   2014-06-07T21:20:33Z

[SPARK-1994][SQL] Weird data corruption bug when running Spark SQL on data 
in HDFS

Basically there is a race condition (possibly a scala bug?) when these 
values are recomputed on all of the slaves that results in an incorrect 
projection being generated (possibly because the GUID uniqueness contract is 
broken?).

In general we should probably enforce that all expression planning occurs on 
the driver, as is now occurring here.

Author: Michael Armbrust mich...@databricks.com

Closes #1004 from marmbrus/fixAggBug and squashes the following commits:

e0c116c [Michael Armbrust] Compute aggregate expression during planning 
instead of lazily on workers.

(cherry picked from commit a6c72ab16e7a3027739ab419819f5222e270838e)
Signed-off-by: Reynold Xin r...@apache.org

commit 3f8450ec67fe84c290d725d4ebfcf9f5a7b0b109
Author: maji2014 ma...@asiainfo-linkage.com
Date:   2014-06-08T22:14:27Z

Update run-example

The old code can only be run from spark_home using bin/run-example. When it is 
run from any other place, the error "./run-example: line 55: ./bin/spark-submit: 
No such file or directory" appears. This change fixes that.

Author: maji2014 ma...@asiainfo-linkage.com

Closes #1011 from maji2014/master and squashes the following commits:

2cc1af6 [maji2014] Update run-example

Closes #988.
(cherry picked from commit e9261d0866a610eab29fa332726186b534d1018f)

Signed-off-by: Patrick Wendell pwend...@gmail.com

commit 502a8f795551007db8a390c4eb7cfde7ca7742fb
Author: Neville Li nevi...@spotify.com
Date:   2014-06-09T06:18:27Z

[SPARK-2067] use relative path for Spark logo in UI

Author: Neville Li nevi...@spotify.com

Closes #1006 from nevillelyh/gh/SPARK-2067 and squashes the following 
commits:

9ee64cf [Neville Li] [SPARK-2067] use relative path for Spark logo in UI
(cherry picked from commit 15ddbef414d5fd6d4672936ba3c747b5fb7ab52b)

Signed-off-by: Patrick Wendell 

[GitHub] spark pull request: [SPARK-4594][SQL] Improvement the broadcast fo...

2014-11-25 Thread Leolh
GitHub user Leolh opened a pull request:

https://github.com/apache/spark/pull/3452

[SPARK-4594][SQL] Improvement the broadcast for HiveConf

Every time we need to get a table from Hive, HadoopTableReader broadcasts the 
HiveConf to the cluster. Within one application there is only a single 
HiveConf, so I think we can keep it in HiveContext and reuse it for every 
query. Although it is only about 50 KB, this is useful for JDBC users and 
streaming + SQL apps.
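
The idea of broadcasting once per context can be sketched in plain Scala as
below. This is only an illustration under stated assumptions: Conf,
Broadcasted, Context, and readTable are hypothetical stand-ins and are not
the Spark or Hive classes touched by this pull request, and the broadcast is
simulated by a counter.

```scala
// Hypothetical sketch; none of these types are Spark or Hive classes.
object BroadcastOnceSketch {
  final case class Conf(settings: Map[String, String])
  final case class Broadcasted[T](value: T)

  final class Context(conf: Conf) {
    // Counts simulated broadcasts, for illustration only.
    var broadcastCount = 0

    private def broadcast(c: Conf): Broadcasted[Conf] = {
      broadcastCount += 1
      Broadcasted(c)
    }

    // Evaluated on first use, then reused by every query on this context.
    lazy val sharedConf: Broadcasted[Conf] = broadcast(conf)

    // Each table read reuses the shared handle instead of broadcasting again.
    def readTable(name: String): String =
      s"$name read with ${sharedConf.value.settings.size} setting(s)"
  }

  def main(args: Array[String]): Unit = {
    val ctx = new Context(Conf(Map("k" -> "v")))
    ctx.readTable("a")
    ctx.readTable("b")
    println(s"broadcasts performed: ${ctx.broadcastCount}")
  }
}
```

A lazy member keeps the cost at zero for contexts that never read a table,
at the price of holding the handle for the lifetime of the context.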

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/Leolh/spark spark-4594

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/3452.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3452


commit 0a4997e00f0a4eccb810a70bee5646669617dfc5
Author: leo leo@leo.localdomain
Date:   2014-11-25T09:02:55Z

make hiveconf broadcast as singal







[GitHub] spark pull request: [SPARK-4594][SQL] Improvement the broadcast fo...

2014-11-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3452#issuecomment-64331829
  
Can one of the admins verify this patch?





[GitHub] spark pull request: Branch 1.0

2014-11-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3451#issuecomment-64331834
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SPARK-4592] Avoid duplicate worker registrati...

2014-11-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3447#issuecomment-64332172
  
  [Test build #23828 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23828/consoleFull)
 for   PR 3447 at commit 
[`1fce6a9`](https://github.com/apache/spark/commit/1fce6a9343d6f563dac0c793480420c6511091ac).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: Branch 1.0

2014-11-25 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/3451#issuecomment-64332167
  
This PR looks messed up or opened in error. Can it be closed?





[GitHub] spark pull request: [SPARK-4592] Avoid duplicate worker registrati...

2014-11-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3447#issuecomment-64332179
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23828/
Test PASSed.





[GitHub] spark pull request: [SPARK-4508] [SQL] build native date type to c...

2014-11-25 Thread liancheng
Github user liancheng commented on the pull request:

https://github.com/apache/spark/pull/3381#issuecomment-64332371
  
@adrian-wang Yea, Jenkins compiles Spark SQL with both Hive 0.12.0 and 
0.13.1, and then runs SQL tests against 0.13.1.





[GitHub] spark pull request: [SPARK-4592] Avoid duplicate worker registrati...

2014-11-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3447#issuecomment-64336089
  
  [Test build #23830 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23830/consoleFull)
 for   PR 3447 at commit 
[`83b321c`](https://github.com/apache/spark/commit/83b321cc02e4dfb47541c7dd13f65f98012316ef).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4592] Avoid duplicate worker registrati...

2014-11-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3447#issuecomment-64336176
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23830/
Test PASSed.





[GitHub] spark pull request: [Spark-4509] Revert EC2 tag-based cluster memb...

2014-11-25 Thread mengxr
GitHub user mengxr opened a pull request:

https://github.com/apache/spark/pull/3453

[Spark-4509] Revert EC2 tag-based cluster membership patch

This PR reverts changes related to tag-based cluster membership. As 
discussed in SPARK-3332, we didn't figure out a safe strategy to use tags to 
determine cluster membership, because tagging is not atomic. The following 
changes are reverted:

SPARK-2333: 94053a7b766788bb62e2dbbf352ccbcc75f71fc0
SPARK-3213: 7faf755ae4f0cf510048e432340260a6e609066d
SPARK-3608: 78d4220fa0bf2f9ee663e34bbf3544a5313b02f0.

I tested launch, login, and destroy. It is easy to check the diff by 
comparing it to Josh's patch for branch-1.1:

https://github.com/apache/spark/pull/2225/files

@JoshRosen I sent the PR to master. It might be easier for us to keep 
master and branch-1.2 the same at this time. We can always re-apply the patch 
once we figure out a stable solution.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mengxr/spark SPARK-4509

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/3453.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3453


commit 35963a1ba2a94e18adf07d6879aeb53379cd8f14
Author: Xiangrui Meng m...@databricks.com
Date:   2014-11-25T09:11:28Z

Revert SPARK-3608 Break if the instance tag naming succeeds

This reverts commit 78d4220fa0bf2f9ee663e34bbf3544a5313b02f0.

commit 4298ea52d4ffddd3a209684a991918f3114f44d4
Author: Xiangrui Meng m...@databricks.com
Date:   2014-11-25T09:16:59Z

revert 7faf755ae4f0cf510048e432340260a6e609066d

commit f0b708bb125ebde0f65a1d6130a5168793ca8a66
Author: Xiangrui Meng m...@databricks.com
Date:   2014-11-25T09:21:38Z

revert 94053a7b766788bb62e2dbbf352ccbcc75f71fc0







[GitHub] spark pull request: [SPARK-4581][MLlib] Refactorize StandardScaler...

2014-11-25 Thread mengxr
Github user mengxr commented on a diff in the pull request:

https://github.com/apache/spark/pull/3435#discussion_r20852499
  
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/feature/StandardScaler.scala ---
@@ -97,30 +97,57 @@ class StandardScalerModel private[mllib] (
   override def transform(vector: Vector): Vector = {
     require(mean.size == vector.size)
     if (withMean) {
-      vector.toBreeze match {
-        case dv: BDV[Double] =>
-          val output = vector.toBreeze.copy
-          var i = 0
-          while (i < output.length) {
-            output(i) = (output(i) - mean(i)) * (if (withStd) factor(i) else 1.0)
-            i += 1
+      // By default, Scala generates Java methods for member variables. So every time when
+      // the member variables are accessed, `invokespecial` will be called which is expensive.
+      // This can be avoid by having a local reference of `shift`.
+      val localShift = shift
--- End diff --

`shift` only holds a reference to `mean.values`. We don't really need to 
define it as a member and make it lazy. It should give the same performance if 
we only define it inside the if branch.
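As a rough illustration of the suggestion above, here is a Python analogue (not the Scala code under review). The reference to the mean values is bound to a local name inside the branch that needs it, rather than being kept as a separate lazy member. The class `ScalerSketch` and its field names are hypothetical stand-ins for `StandardScalerModel`, and the behaviour of the non-centering branches is an assumption.

```python
from typing import List


class ScalerSketch:
    """Hypothetical stand-in for StandardScalerModel; not Spark's API."""

    def __init__(self, mean: List[float], factor: List[float],
                 with_mean: bool, with_std: bool) -> None:
        self.mean = mean
        self.factor = factor
        self.with_mean = with_mean
        self.with_std = with_std

    def transform(self, vector: List[float]) -> List[float]:
        if len(self.mean) != len(vector):
            raise ValueError("vector size must match the size of mean")
        if self.with_mean:
            # Local reference defined only inside the branch that uses it,
            # instead of a lazily initialised member.
            shift = self.mean
            scale = self.factor if self.with_std else [1.0] * len(vector)
            return [(x - s) * f for x, s, f in zip(vector, shift, scale)]
        if self.with_std:
            # Assumed behaviour: scale only, no centering.
            return [x * f for x, f in zip(vector, self.factor)]
        return list(vector)


model = ScalerSketch(mean=[1.0, 2.0], factor=[0.5, 2.0],
                     with_mean=True, with_std=True)
scaled = model.transform([3.0, 5.0])
```

Whether hoisting the reference makes a measurable difference on the JVM would need to be benchmarked; the sketch only shows where the local binding would live.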





[GitHub] spark pull request: [SPARK-4597] Use proper exception and reset va...

2014-11-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3449#issuecomment-64344035
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23831/
Test PASSed.





[GitHub] spark pull request: [SPARK-4597] Use proper exception and reset va...

2014-11-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3449#issuecomment-64344002
  
  [Test build #23831 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23831/consoleFull)
 for   PR 3449 at commit 
[`36cacbd`](https://github.com/apache/spark/commit/36cacbd1f2f5cfa4f2cb0814650ba439e2cff3f3).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4596][MLLib] Refactorize Normalizer to ...

2014-11-25 Thread mengxr
Github user mengxr commented on the pull request:

https://github.com/apache/spark/pull/3446#issuecomment-64345544
  
LGTM. Merged into master and branch-1.2. Thanks!





[GitHub] spark pull request: [Spark-4509] Revert EC2 tag-based cluster memb...

2014-11-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3453#issuecomment-64347503
  
  [Test build #23836 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23836/consoleFull)
 for   PR 3453 at commit 
[`f0b708b`](https://github.com/apache/spark/commit/f0b708bb125ebde0f65a1d6130a5168793ca8a66).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4543] Javadoc failure for network-commo...

2014-11-25 Thread ueshin
Github user ueshin commented on the pull request:

https://github.com/apache/spark/pull/3405#issuecomment-64351835
  
Hi, is this related to #3058?





[GitHub] spark pull request: [SPARK-4526][MLLIB]GradientDescent get a wrong...

2014-11-25 Thread mengxr
Github user mengxr commented on the pull request:

https://github.com/apache/spark/pull/3399#issuecomment-64354201
  
LGTM. Merged into master and branch-1.2. The JIRA number should be 
SPARK-4530 instead of SPARK-4526. Could you update the title and then close 
this PR? Thanks!





[GitHub] spark pull request: [SPARK-4583] [mllib] LogLoss for GradientBoost...

2014-11-25 Thread mengxr
Github user mengxr commented on a diff in the pull request:

https://github.com/apache/spark/pull/3439#discussion_r20853259
  
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/tree/loss/SquaredError.scala ---
@@ -49,18 +48,17 @@ object SquaredError extends Loss {
   }
 
   /**
-   * Method to calculate error of the base learner for the gradient boosting calculation.
+   * Method to calculate loss of the base learner for the gradient boosting calculation.
    * Note: This method is not used by the gradient boosting algorithm but is useful for debugging
    * purposes.
-   * @param model Model of the weak learner.
+   * @param model Ensemble model
    * @param data Training dataset: RDD of [[org.apache.spark.mllib.regression.LabeledPoint]].
-   * @return
+   * @return  Mean squared error of model on data
--- End diff --

`MSE` is not usually defined with the multiplier `1/2`. Shall we use a 
different name here, for example, `mean squared loss` or `average loss`?
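To make the naming concern concrete, here is a small Python sketch. It assumes, as the comment implies, that the loss in the patch carries a `1/2` multiplier; the function names are illustrative and are not taken from MLlib.

```python
from typing import Sequence


def mean_squared_error(labels: Sequence[float],
                       predictions: Sequence[float]) -> float:
    """Conventional MSE: the average of squared residuals, no 1/2 factor."""
    n = len(labels)
    return sum((y - p) ** 2 for y, p in zip(labels, predictions)) / n


def mean_half_squared_loss(labels: Sequence[float],
                           predictions: Sequence[float]) -> float:
    """Average of (1/2) * residual^2, i.e. exactly half of the MSE."""
    return 0.5 * mean_squared_error(labels, predictions)


labels = [1.0, 2.0, 3.0]
predictions = [1.0, 4.0, 5.0]
mse = mean_squared_error(labels, predictions)
half = mean_half_squared_loss(labels, predictions)
```

The two quantities differ by a constant factor of two, so documenting the second one as "mean squared error" would mislead anyone comparing it against MSE values computed elsewhere.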





[GitHub] spark pull request: [SPARK-4599] [Build] [SQL] add hive profile in...

2014-11-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3450#issuecomment-64376530
  
  [Test build #23834 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23834/consoleFull)
 for   PR 3450 at commit 
[`cbb2d33`](https://github.com/apache/spark/commit/cbb2d330c9b47c993acb5dce5f65cf6b493374bd).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4599] [Build] [SQL] add hive profile in...

2014-11-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3450#issuecomment-64376615
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23834/
Test PASSed.





[GitHub] spark pull request: [SPARK-4592] Avoid duplicate worker registrati...

2014-11-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3447#issuecomment-64381651
  
  [Test build #23835 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23835/consoleFull)
 for   PR 3447 at commit 
[`79286dc`](https://github.com/apache/spark/commit/79286dc3e027d138bf13ef55f190b95844afae0e).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4592] Avoid duplicate worker registrati...

2014-11-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3447#issuecomment-64381659
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23835/
Test PASSed.





[GitHub] spark pull request: [SPARK-4530][MLLIB]GradientDescent get a wrong...

2014-11-25 Thread witgo
Github user witgo commented on the pull request:

https://github.com/apache/spark/pull/3399#issuecomment-64381732
  
@mengxr  The title has been updated.





[GitHub] spark pull request: SPARK-1450 [EC2] Specify the default zone in t...

2014-11-25 Thread srowen
GitHub user srowen opened a pull request:

https://github.com/apache/spark/pull/3454

SPARK-1450 [EC2] Specify the default zone in the EC2 script help

This looks like a one-liner, so I took a shot at it. There can be no fixed 
default availability zone since the names are different per region. But the 
default behavior can be documented:

```
if opts.zone == "":
    opts.zone = random.choice(conn.get_all_zones()).name
```
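The default described above can be sketched in isolation as follows. This is a minimal illustration, not the EC2 script itself: `Zone` is a hypothetical stand-in for the objects returned by `conn.get_all_zones()`, and `resolve_zone` is a helper name invented for the example.

```python
import random
from collections import namedtuple

# Hypothetical stand-in for the zone objects returned by conn.get_all_zones().
Zone = namedtuple("Zone", ["name"])


def resolve_zone(requested, available, rng=random):
    """Return the requested zone, or one random available zone if none was given."""
    if requested == "":
        return rng.choice(available).name
    return requested


zones = [Zone("us-east-1a"), Zone("us-east-1b"), Zone("us-east-1c")]
picked = resolve_zone("", zones, random.Random(0))  # seeded only for repeatability
explicit = resolve_zone("us-east-1b", zones)
```

Because zone names differ per region, the help text can only describe this behavior ("a single random zone") rather than name a fixed default.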

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/srowen/spark SPARK-1450

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/3454.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3454


commit 9193cf3f8e9ae0e2d30a7c50a7be06440a006f91
Author: Sean Owen so...@cloudera.com
Date:   2014-11-25T10:48:23Z

Document that --zone defaults to a single random zone







[GitHub] spark pull request: SPARK-1450 [EC2] Specify the default zone in t...

2014-11-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3454#issuecomment-64382910
  
  [Test build #23837 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23837/consoleFull)
 for   PR 3454 at commit 
[`9193cf3`](https://github.com/apache/spark/commit/9193cf3f8e9ae0e2d30a7c50a7be06440a006f91).
 * This patch merges cleanly.





[GitHub] spark pull request: Delete unnecessary function

2014-11-25 Thread KaiXinXiaoLei
GitHub user KaiXinXiaoLei reopened a pull request:

https://github.com/apache/spark/pull/3224

Delete unnecessary function

When building Spark with sbt, the function "runAlternateBoot" in 
sbt/sbt-launch-lib.bash is not used, and nothing else in the Spark code uses 
it either, so I think this function is unnecessary. The sbt.boot.properties 
option can instead be set on the command line when building Spark, e.g.: 
sbt/sbt assembly -Dsbt.boot.properties=$bootpropsfile

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/KaiXinXiaoLei/spark deleteFunction

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/3224.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3224


commit efe36d4dda1c56b027afb6604f0996f019995c89
Author: KaiXinXiaoLei huleil...@huawei.com
Date:   2014-11-12T09:45:21Z

Delete unnecessary function







[GitHub] spark pull request: Delete unnecessary function

2014-11-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3224#issuecomment-64384870
  
Can one of the admins verify this patch?





[GitHub] spark pull request: Delete unnecessary function

2014-11-25 Thread KaiXinXiaoLei
Github user KaiXinXiaoLei commented on the pull request:

https://github.com/apache/spark/pull/3224#issuecomment-64385026
  
The file from https://github.com/sbt/sbt-launcher-package has changed, and 
the function "runAlternateBoot" has been deleted in the upstream project. I 
think the Spark project should also delete this function from 
sbt/sbt-launch-lib.bash. Thanks.





[GitHub] spark pull request: Delete unnecessary function

2014-11-25 Thread KaiXinXiaoLei
Github user KaiXinXiaoLei commented on the pull request:

https://github.com/apache/spark/pull/3224#issuecomment-64385336
  
The file from https://github.com/sbt/sbt-launcher-package has changed, and 
the function "runAlternateBoot" has been deleted in the upstream project. I 
think the Spark project should also delete this function from 
sbt/sbt-launch-lib.bash. Thanks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [Spark-4509] Revert EC2 tag-based cluster memb...

2014-11-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3453#issuecomment-64386611
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23836/
Test PASSed.





[GitHub] spark pull request: [Spark-4509] Revert EC2 tag-based cluster memb...

2014-11-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3453#issuecomment-64386599
  
  [Test build #23836 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23836/consoleFull)
 for   PR 3453 at commit 
[`f0b708b`](https://github.com/apache/spark/commit/f0b708bb125ebde0f65a1d6130a5168793ca8a66).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4535][Streaming] Fix the error in comme...

2014-11-25 Thread tdas
Github user tdas commented on the pull request:

https://github.com/apache/spark/pull/3400#issuecomment-64389631
  
@watermen Thanks, these are great fixes.





[GitHub] spark pull request: [SPARK-4535][Streaming] Fix the error in comme...

2014-11-25 Thread tdas
Github user tdas commented on the pull request:

https://github.com/apache/spark/pull/3400#issuecomment-64389589
  
Jenkins, this is ok to test.





[GitHub] spark pull request: [SPARK-4535][Streaming] Fix the error in comme...

2014-11-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3400#issuecomment-64390206
  
  [Test build #23838 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23838/consoleFull)
 for   PR 3400 at commit 
[`75d795c`](https://github.com/apache/spark/commit/75d795ce9f64ca1cde1e9c360987db7e2ca41337).
 * This patch merges cleanly.





[GitHub] spark pull request: SPARK-1450 [EC2] Specify the default zone in t...

2014-11-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3454#issuecomment-64391330
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23837/
Test PASSed.





[GitHub] spark pull request: SPARK-1450 [EC2] Specify the default zone in t...

2014-11-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3454#issuecomment-64391321
  
  [Test build #23837 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23837/consoleFull)
 for   PR 3454 at commit 
[`9193cf3`](https://github.com/apache/spark/commit/9193cf3f8e9ae0e2d30a7c50a7be06440a006f91).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4601][Streaming] Set correct call site ...

2014-11-25 Thread tdas
GitHub user tdas opened a pull request:

https://github.com/apache/spark/pull/3455

[SPARK-4601][Streaming] Set correct call site for streaming jobs so that it 
is displayed correctly on the Spark UI

When running NetworkWordCount, the description of each word count job is 
set to "getCallsite at DStream:xxx". It should instead be set to the line 
number in the streaming application of the output operation that led to the 
job being created. The wrong description appears because the call site is set 
incorrectly in the thread that launches the jobs. This PR fixes that.
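The general idea can be sketched in Python. This is an analogy under stated assumptions, not Spark's implementation: the call site is captured where the user creates the output operation and is then handed to the thread that launches the job, instead of being computed inside that thread. All names here (`OutputOp`, `run_job`, `user_app`) are hypothetical.

```python
import sys
import threading

# Per-thread "current call site", loosely analogous to a thread-local job property.
_call_site = threading.local()


def set_call_site(site):
    _call_site.value = site


def get_call_site():
    return getattr(_call_site, "value", "unknown")


class OutputOp:
    """Hypothetical output operation that remembers where user code created it."""

    def __init__(self):
        caller = sys._getframe(1)  # frame of the code that constructed this object
        self.creation_site = f"{caller.f_code.co_name}:{caller.f_lineno}"


def run_job(op, descriptions):
    # Runs in the job-launching thread: reuse the site captured in user code
    # rather than deriving one from this thread's own stack.
    set_call_site(op.creation_site)
    descriptions.append(get_call_site())


def user_app():
    return OutputOp()


op = user_app()
descriptions = []
worker = threading.Thread(target=run_job, args=(op, descriptions))
worker.start()
worker.join()
```

After the worker finishes, `descriptions[0]` names `user_app` and the line where the operation was created, which is the kind of description the UI should display.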

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tdas/spark streaming-callsite-fix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/3455.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3455


commit 69fc26fd870361c345f6482737d6192949ab46b4
Author: Tathagata Das tathagata.das1...@gmail.com
Date:   2014-11-25T13:08:32Z

Set correct call site for streaming jobs so that it is displayed correctly 
on the Spark UI







[GitHub] spark pull request: [SPARK-4601][Streaming] Set correct call site ...

2014-11-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3455#issuecomment-64398664
  
  [Test build #23839 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23839/consoleFull)
 for   PR 3455 at commit 
[`69fc26f`](https://github.com/apache/spark/commit/69fc26fd870361c345f6482737d6192949ab46b4).
 * This patch merges cleanly.





[GitHub] spark pull request: [SPARK-4598] use pagination to show tasktable

2014-11-25 Thread XuTingjun
GitHub user XuTingjun opened a pull request:

https://github.com/apache/spark/pull/3456

[SPARK-4598] use pagination to show tasktable

When an application has a very large number of tasks, rendering the task 
table with all of them uses a lot of memory. With pagination, the task table 
shows only one page of tasks at a time, which reduces memory usage.
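A minimal Python sketch of the pagination idea follows. The helper names `paginate` and `page_count` and the page size of 100 are assumptions made for illustration; they are not taken from StagePage.scala.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def paginate(rows: Sequence[T], page: int, page_size: int) -> List[T]:
    """Return the rows of a 1-based page; pages past the end yield an empty list."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    start = (page - 1) * page_size
    return list(rows[start:start + page_size])


def page_count(total_rows: int, page_size: int) -> int:
    """Number of pages needed to show total_rows rows (ceiling division)."""
    return (total_rows + page_size - 1) // page_size


tasks = [f"task-{i}" for i in range(250)]
first = paginate(tasks, page=1, page_size=100)
last = paginate(tasks, page=3, page_size=100)
```

Note that slicing only limits what is rendered per request. The full task list still has to be held somewhere, so the saving applies to the generated page rather than to the stored task data.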

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/XuTingjun/spark patch-1

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/3456.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3456


commit 5d75a606bd51e9cb07eddef4e4fd555cabab1b5a
Author: meiyoula 1039320...@qq.com
Date:   2014-11-25T13:20:12Z

Update StagePage.scala







[GitHub] spark pull request: [SPARK-4598] use pagination to show tasktable

2014-11-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3456#issuecomment-64398877
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SPARK-4535][Streaming] Fix the error in comme...

2014-11-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3400#issuecomment-64398984
  
  [Test build #23838 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23838/consoleFull)
 for   PR 3400 at commit 
[`75d795c`](https://github.com/apache/spark/commit/75d795ce9f64ca1cde1e9c360987db7e2ca41337).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4535][Streaming] Fix the error in comme...

2014-11-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3400#issuecomment-64398987
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23838/
Test PASSed.





[GitHub] spark pull request: [SPARK-4381][Streaming]Add warning log when us...

2014-11-25 Thread tdas
Github user tdas commented on the pull request:

https://github.com/apache/spark/pull/3244#issuecomment-64399467
  
Since no unit tests cover this change, I tested it manually. It works as 
expected. Merging this.



[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...

2014-11-25 Thread tgravescs
Github user tgravescs commented on a diff in the pull request:

https://github.com/apache/spark/pull/3409#discussion_r20864416
  
--- Diff: conf/spark-defaults.conf.template ---
@@ -8,3 +8,4 @@
 # spark.serializer                 org.apache.spark.serializer.KryoSerializer
 # spark.driver.memory              5g
 # spark.executor.extraJavaOptions  -XX:+PrintGCDetails -Dkey=value -Dnumbers=one two three
+# spark.yarn.am.extraJavaOptions   -XX:+PrintGCDetails -Dkey=value -Dnumbers=one two three
--- End diff --

I would remove this from here since it's YARN-specific.
Can you also please document it in docs/running-on-yarn.md? We should 
be clear on which mode this applies to, since this will only differ from 
the driver's extraJavaOptions in client mode, correct?
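
To make the question about modes concrete, here is a minimal sketch in plain Scala. It assumes the reading above is right: in client mode the Application Master is a separate JVM and would take its options from spark.yarn.am.extraJavaOptions, while in cluster mode the driver runs inside the Application Master and the driver setting applies. `AmJavaOptions`, `DeployMode`, `amOptionsKey`, and `amOptions` are hypothetical names for illustration only and are not part of Spark.

```scala
// Hypothetical helper, not Spark code: models which setting would supply
// JVM options for the YARN Application Master under the assumption above.
object AmJavaOptions {
  sealed trait DeployMode
  case object Client extends DeployMode
  case object Cluster extends DeployMode

  // Assumption: only client mode has an Application Master separate from the driver.
  def amOptionsKey(mode: DeployMode): String = mode match {
    case Client  => "spark.yarn.am.extraJavaOptions"
    case Cluster => "spark.driver.extraJavaOptions"
  }

  // Look up the options string in a plain key/value view of the configuration.
  def amOptions(conf: Map[String, String], mode: DeployMode): Option[String] =
    conf.get(amOptionsKey(mode))
}
```

If this reading holds, the new key only matters for client-mode users, which is why the YARN docs would be the natural place for it.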



[GitHub] spark pull request: [SPARK-4534][Core]JavaSparkContext create new ...

2014-11-25 Thread tgravescs
Github user tgravescs commented on the pull request:

https://github.com/apache/spark/pull/3403#issuecomment-64405670
  
I agree with Sandy on this. Can you close this until we get SPARK-2089 
working? Then we can make sure it works with the Java API as well.



[GitHub] spark pull request: [SPARK-4344][DOCS] adding documentation on spa...

2014-11-25 Thread tgravescs
Github user tgravescs commented on the pull request:

https://github.com/apache/spark/pull/3209#issuecomment-64405811
  
test this please



[GitHub] spark pull request: [SPARK-4344][DOCS] adding documentation on spa...

2014-11-25 Thread tgravescs
Github user tgravescs commented on the pull request:

https://github.com/apache/spark/pull/3209#issuecomment-64405912
  
@vanzin I think I'll pull this in and you will have to remove it again in 
https://github.com/apache/spark/pull/3233



[GitHub] spark pull request: [SPARK-4344][DOCS] adding documentation on spa...

2014-11-25 Thread tgravescs
Github user tgravescs commented on the pull request:

https://github.com/apache/spark/pull/3209#issuecomment-64406285
  
I pulled this into both master and branch-1.2



[GitHub] spark pull request: [SPARK-4451]force to kill process after 5 seco...

2014-11-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3316#issuecomment-64406710
  
  [Test build #23840 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23840/consoleFull)
 for   PR 3316 at commit 
[`88bd312`](https://github.com/apache/spark/commit/88bd312a52efc539719afb4221e469a495305ce0).
 * This patch merges cleanly.



[GitHub] spark pull request: [Spark-4512] [SQL] Unresolved Attribute Except...

2014-11-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3386#issuecomment-64407794
  
  [Test build #23841 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23841/consoleFull)
 for   PR 3386 at commit 
[`6e720af`](https://github.com/apache/spark/commit/6e720af49428817c6f48d5e161b34e182a31b872).
 * This patch merges cleanly.



[GitHub] spark pull request: [SPARK-4601][Streaming] Set correct call site ...

2014-11-25 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3455#issuecomment-64409916
  
  [Test build #23839 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/23839/consoleFull)
 for   PR 3455 at commit 
[`69fc26f`](https://github.com/apache/spark/commit/69fc26fd870361c345f6482737d6192949ab46b4).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-4601][Streaming] Set correct call site ...

2014-11-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3455#issuecomment-64409929
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23839/
Test PASSed.



[GitHub] spark pull request: [SPARK-4196][SPARK-4602] Fix serialization iss...

2014-11-25 Thread tdas
GitHub user tdas opened a pull request:

https://github.com/apache/spark/pull/3457

[SPARK-4196][SPARK-4602] Fix serialization issue in 
PairDStreamFunctions.saveAsNewAPIHadoopFiles

Solves two JIRAs in one shot
- Makes the ForeachDStream created by saveAsNewAPIHadoopFiles serializable 
for checkpoints
- Makes the default configuration object used by saveAsNewAPIHadoopFiles 
Spark's Hadoop configuration
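
The first point can be illustrated with a minimal, self-contained sketch of the wrapping idea, using only java.io. This is not the patch itself: `FakeConf` is a hypothetical stand-in for a Hadoop configuration object that is not java.io.Serializable, and `SerializableConf` and `CheckpointSketch` are likewise illustrative names. The assumption is that the wrapper writes and reads the wrapped state by hand so that any object holding the wrapper can be Java-serialized.

```scala
import java.io._

// Hypothetical stand-in for a configuration object that is NOT Serializable.
class FakeConf(val settings: Map[String, String])

// Wrapper sketch: the wrapped field is transient and its state is written by hand.
class SerializableConf(@transient var value: FakeConf) extends Serializable {
  private def writeObject(out: ObjectOutputStream): Unit = {
    out.defaultWriteObject()
    out.writeInt(value.settings.size)
    value.settings.foreach { case (k, v) =>
      out.writeUTF(k)
      out.writeUTF(v)
    }
  }

  private def readObject(in: ObjectInputStream): Unit = {
    in.defaultReadObject()
    val n = in.readInt()
    val entries = (0 until n).map(_ => in.readUTF() -> in.readUTF())
    value = new FakeConf(entries.toMap)
  }
}

object CheckpointSketch {
  // Serialize and deserialize an object in memory, roughly what a checkpoint requires.
  def roundTrip[T <: AnyRef](obj: T): T = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    try out.writeObject(obj) finally out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
    try in.readObject().asInstanceOf[T] finally in.close()
  }
}
```

Passing the bare `FakeConf` to `roundTrip` fails with a NotSerializableException, while the wrapped value survives the round trip with its settings intact.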

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tdas/spark savefiles-fix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/3457.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3457


commit b382ea9facd1cd70b254319811fe9600503e0286
Author: Tathagata Das tathagata.das1...@gmail.com
Date:   2014-11-25T15:05:29Z

Fix serialization issue in PairDStreamFunctions.saveAsNewAPIHadoopFiles.





[GitHub] spark pull request: [SPARK-4196][SPARK-4602][Streaming] Fix serial...

2014-11-25 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/3457#discussion_r20868146
  
--- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/PairDStreamFunctions.scala ---
@@ -702,11 +699,14 @@ class PairDStreamFunctions[K, V](self: DStream[(K,V)])
   keyClass: Class[_],
   valueClass: Class[_],
   outputFormatClass: Class[_ <: NewOutputFormat[_, _]],
-  conf: Configuration = new Configuration
+  conf: Configuration = ssc.sparkContext.hadoopConfiguration
 ) {
+// Wrap this in SerializableWritable so that ForeachDStream can be serialized for checkpoints
+val serializableConf = new SerializableWritable(conf)
--- End diff --

Ah you're already on it, yep, that looks good.

