[GitHub] spark pull request: [FIX][DOC] Fix broken links in ml-guide.md
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/3601 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [FIX][DOC] Fix broken links in ml-guide.md
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/3601#issuecomment-65623870 Merged into master and branch-1.2.
[GitHub] spark pull request: SPARK-4743 - Use SparkEnv.serializer instead o...
GitHub user IvanVergiliev opened a pull request: https://github.com/apache/spark/pull/3605 SPARK-4743 - Use SparkEnv.serializer instead of closureSerializer in aggregateByKey and foldByKey You can merge this pull request into a Git repository by running: $ git pull https://github.com/IvanVergiliev/spark change-serializer Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3605.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3605 commit a49b7cf7b507d7dbd3aba587eeb99125ce3e8203 Author: Ivan Vergiliev i...@leanplum.com Date: 2014-12-04T12:08:12Z Use serializer instead of closureSerializer in aggregate/foldByKey.
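For context on the change above: aggregateByKey and foldByKey ship a serialized copy of the zero value and deserialize it per key so each key starts from a fresh, independent copy; the PR's point is which serializer performs that round trip (SparkEnv's data serializer rather than the closure serializer). A minimal, self-contained sketch of the copy-by-serialization pattern, using plain Java serialization as a stand-in for whatever serializer is configured:

```scala
import java.io._

// Serialize a value once; each deserialization yields an independent copy.
def serialize[T <: java.io.Serializable](value: T): Array[Byte] = {
  val bytes = new ByteArrayOutputStream()
  val out = new ObjectOutputStream(bytes)
  out.writeObject(value)
  out.close()
  bytes.toByteArray
}

def freshCopy[T <: java.io.Serializable](blob: Array[Byte]): T = {
  val in = new ObjectInputStream(new ByteArrayInputStream(blob))
  in.readObject().asInstanceOf[T]
}

// The "zero value" is serialized once up front...
val zeroBlob = serialize(new java.util.ArrayList[Int]())
// ...and deserialized per key, so mutating one copy cannot leak into another.
val copy1 = freshCopy[java.util.ArrayList[Int]](zeroBlob)
val copy2 = freshCopy[java.util.ArrayList[Int]](zeroBlob)
copy1.add(1)
```

In Spark the serializer used here is pluggable (e.g. Kryo), which is exactly why routing through the data serializer rather than the closure serializer matters.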
[GitHub] spark pull request: SPARK-4743 - Use SparkEnv.serializer instead o...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3605#issuecomment-65624217 Can one of the admins verify this patch?
[GitHub] spark pull request: [FIX][DOC] Fix broken links in ml-guide.md
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3601#issuecomment-65625554 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24138/ Test PASSed.
[GitHub] spark pull request: [FIX][DOC] Fix broken links in ml-guide.md
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3601#issuecomment-65625544 [Test build #24138 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24138/consoleFull) for PR 3601 at commit [`c559768`](https://github.com/apache/spark/commit/c559768a78cbfab84038b6d2489b923f24ed79a7). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3586][streaming]Support nested director...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2765#issuecomment-65619648 [Test build #24137 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24137/consoleFull) for PR 2765 at commit [`cba8a2e`](https://github.com/apache/spark/commit/cba8a2e6cf11741867561ce1c0d7d2eda66033c6). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3586][streaming]Support nested director...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2765#issuecomment-65619652 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24137/ Test FAILed.
[GitHub] spark pull request: [SPARK-4494] IDFModel.transform() add support ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3603#issuecomment-65619901 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24136/ Test PASSed.
[GitHub] spark pull request: [SPARK-4735]Spark SQL UDF doesn't support 0 ar...
GitHub user potix2 opened a pull request: https://github.com/apache/spark/pull/3604 [SPARK-4735]Spark SQL UDF doesn't support 0 arguments I fixed the udf bug. https://issues.apache.org/jira/browse/SPARK-4735 You can merge this pull request into a Git repository by running: $ git pull https://github.com/potix2/spark bugfix-4735 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3604.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3604 commit 025537a1ec966fa34330fbbc1ab29c2d3d9943cf Author: Katsunori Kanda ka...@amoad.com Date: 2014-12-04T11:52:06Z Add UdfRegistration.registerFunction() for Function0
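For readers unfamiliar with the arity terminology in the commit message: a 0-argument UDF is simply a Scala Function0, and the fix adds a registerFunction overload for that arity. A toy registry sketch of what registering and invoking a zero-argument function looks like (illustrative only; this is not Spark's UdfRegistration API):

```scala
// Hypothetical registry keyed by UDF name, holding zero-argument functions.
val udfs = scala.collection.mutable.Map[String, () => Any]()

def registerFunction0(name: String, f: () => Any): Unit = udfs(name) = f

// A Function0 takes no parameters; the SPARK-4735 bug was that this
// arity had no registration path at all.
registerFunction0("pi", () => 3.14159)
val result = udfs("pi")()
```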
[GitHub] spark pull request: [SPARK-4494] IDFModel.transform() add support ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3603#issuecomment-65619897 [Test build #24136 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24136/consoleFull) for PR 3603 at commit [`d25e49b`](https://github.com/apache/spark/commit/d25e49b01ad5e160366b5e4512ff0826f3cf2740). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4735]Spark SQL UDF doesn't support 0 ar...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3604#issuecomment-65621779 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-4744] [SQL] Short circuit evaluation fo...
GitHub user chenghao-intel opened a pull request: https://github.com/apache/spark/pull/3606 [SPARK-4744] [SQL] Short circuit evaluation for AND OR in CodeGen You can merge this pull request into a Git repository by running: $ git pull https://github.com/chenghao-intel/spark codegen_short_circuit Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3606.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3606 commit f466303f872bcfd5e056f626f41108b748011680 Author: Cheng Hao hao.ch...@intel.com Date: 2014-12-04T12:47:11Z short circuit for AND OR
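A sketch of the semantics the PR title refers to, in plain Scala rather than Spark's actual generated code: with short-circuit AND, the right child is only evaluated when the left child is true, which skips work in generated predicates. The counter below just makes the skipped evaluation observable:

```scala
// Count how often the right-hand predicate is actually evaluated.
var rightEvals = 0
def left(x: Int): Boolean = x > 0
def right(x: Int): Boolean = { rightEvals += 1; x % 2 == 0 }

// Eager form: both children are always evaluated before combining.
def eagerAnd(x: Int): Boolean = { val l = left(x); val r = right(x); l && r }

// Short-circuit form: the right child is skipped when the left is false.
def shortAnd(x: Int): Boolean = left(x) && right(x)

shortAnd(-1) // left is false, so right is never called
```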
[GitHub] spark pull request: [SPARK-4744] [SQL] Short circuit evaluation fo...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3606#issuecomment-65628448 [Test build #24139 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24139/consoleFull) for PR 3606 at commit [`f466303`](https://github.com/apache/spark/commit/f466303f872bcfd5e056f626f41108b748011680). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3928][SQL] Support wildcard matches on ...
Github user tkyaw commented on the pull request: https://github.com/apache/spark/pull/3407#issuecomment-65630704 Made the following changes as suggested: (1) Changed the PR title and commit message. (2) Updated the parquetFile doc message. (3) Added a test case.
[GitHub] spark pull request: [SPARK-3928][SQL] Support wildcard matches on ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3407#issuecomment-65630826 [Test build #24140 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24140/consoleFull) for PR 3407 at commit [`ceded32`](https://github.com/apache/spark/commit/ceded32aa2a487af41678807e56f32448af38096). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4697][YARN]System properties should ove...
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/3557#issuecomment-65632939 Did you test it to see if they still work?
[GitHub] spark pull request: Spark 3883: SSL support for HttpServer and Akk...
Github user jacek-lewandowski commented on the pull request: https://github.com/apache/spark/pull/3571#issuecomment-65633106 @pwendell is it possible to access the log file somehow? I don't know how to replicate the problems - what operating system is Jenkins running on?
[GitHub] spark pull request: [SPARK-1953]yarn client mode Application Maste...
GitHub user WangTaoTheTonic opened a pull request: https://github.com/apache/spark/pull/3607 [SPARK-1953]yarn client mode Application Master memory size is same as driver memory size

Ways to set the Application Master's memory in yarn-client mode:
1. `--am-memory MEM` in SparkSubmit args
2. `spark.yarn.appMaster.memory` in SparkConf or System Properties
3. `SPARK_YARN_AM_MEMORY` in the system env
4. default value 512m

Note: this argument is only available in yarn-client mode; in yarn-cluster mode the AM memory is set to the driver memory size.

You can merge this pull request into a Git repository by running: $ git pull https://github.com/WangTaoTheTonic/spark SPARK4181 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3607.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3607 commit 0566bb89318a848ee6d2f551430d9fd135a22c7d Author: WangTaoTheTonic barneystin...@aliyun.com Date: 2014-12-04T13:42:47Z yarn client mode Application Master memory size is same as driver memory size
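The resolution order described above can be sketched with the same Option.orElse chain the patch adds to SparkSubmitArguments; the function below is illustrative rather than Spark code, with the property and env names taken from the PR description:

```scala
// Resolve the AM memory in the PR's stated order:
// CLI flag > conf property > env variable > 512m default.
def resolveAmMemory(
    cliArg: Option[String],
    sparkProps: Map[String, String],
    env: Map[String, String]): String = {
  cliArg
    .orElse(sparkProps.get("spark.yarn.appMaster.memory"))
    .orElse(env.get("SPARK_YARN_AM_MEMORY"))
    .getOrElse("512m")
}
```

Each `orElse` only consults the next source when every earlier one returned None, which is what gives the chain its precedence semantics.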
[GitHub] spark pull request: [SPARK-1953]yarn client mode Application Maste...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/3607#issuecomment-65635264 @tgravescs
[GitHub] spark pull request: [SPARK-4744] [SQL] Short circuit evaluation fo...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3606#issuecomment-65635600 [Test build #24139 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24139/consoleFull) for PR 3606 at commit [`f466303`](https://github.com/apache/spark/commit/f466303f872bcfd5e056f626f41108b748011680). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4744] [SQL] Short circuit evaluation fo...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3606#issuecomment-65635609 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24139/ Test PASSed.
[GitHub] spark pull request: [SPARK-1953]yarn client mode Application Maste...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3607#issuecomment-65635774 [Test build #24141 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24141/consoleFull) for PR 3607 at commit [`0566bb8`](https://github.com/apache/spark/commit/0566bb89318a848ee6d2f551430d9fd135a22c7d). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3586][streaming]Support nested director...
Github user wangxiaojing commented on the pull request: https://github.com/apache/spark/pull/2765#issuecomment-65637021 @liancheng
[GitHub] spark pull request: [SPARK-3586][streaming]Support nested director...
Github user wangxiaojing commented on the pull request: https://github.com/apache/spark/pull/2765#issuecomment-65637093 @liancheng
[GitHub] spark pull request: [SPARK-2188] Support sbt/sbt for Windows
Github user tsudukim commented on the pull request: https://github.com/apache/spark/pull/3591#issuecomment-65638788 @pwendell Thank you for your comment. I quite agree that Windows scripts like .cmd or .bat are very high-cost to maintain, but this time I used PowerShell, which is a scripting language, unlike .cmd or .bat. You can see the script inside. The Linux version and the PowerShell version have the same structure (functions, variables, ...), so I think it's easier to read and modify. And yes, I use Windows for daily Spark development. Between sbt and maven, sbt is much better for trial-and-error development, as you know. The reason I want sbt is the same as the reason we use sbt rather than maven for development on Linux. I also use maven as a final check, but sbt is more useful for continuous development. About cygwin: I'm not using cygwin. A cygwin environment is so polluted by cygwin functions and variables that the behavior of Windows becomes strange. That's critical for enterprise systems.
[GitHub] spark pull request: [SPARK-3928][SQL] Support wildcard matches on ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3407#issuecomment-65639449 [Test build #24140 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24140/consoleFull) for PR 3407 at commit [`ceded32`](https://github.com/apache/spark/commit/ceded32aa2a487af41678807e56f32448af38096). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3928][SQL] Support wildcard matches on ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3407#issuecomment-65639456 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24140/ Test PASSed.
[GitHub] spark pull request: [SPARK-4697][YARN]System properties should ove...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/3557#issuecomment-65644328 @tgravescs OK, if you mean whether the app name could still be shown correctly, I will test it.
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/3409#discussion_r21309157

--- Diff: yarn/common/src/main/scala/org/apache/spark/deploy/yarn/ClientBase.scala ---
@@ -360,6 +360,10 @@ private[spark] trait ClientBase extends Logging {
      }
    }

+    // include yarn am specific java options
+    sparkConf.getOption("spark.yarn.am.extraJavaOptions")
+      .foreach(opts => javaOpts += opts)
--- End diff --

So currently this affects both cluster and client mode, since driver.extraJavaOptions applies in cluster mode. I think we should make this only apply in client mode. Otherwise we should define precedence between it and driver.extraJavaOptions in driver mode, or potentially error if they aren't set. It seems most straightforward to only have it apply in client mode, but I'm open to thoughts - @vanzin @andrewor14. Also we should run this through the same check as SparkConf does for spark.executor.extraJavaOptions, to make sure no spark configs or -Xmx is set.
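The validation the reviewer asks for (mirroring the existing spark.executor.extraJavaOptions check) could look roughly like this; the function name and exact checks are assumptions for illustration, not Spark's actual code:

```scala
// Hypothetical validator: reject java option strings that try to set
// spark configs (-Dspark...) or the max heap (-Xmx), which must instead
// go through proper configs. Returns the reason on rejection.
def validateJavaOpts(opts: String): Either[String, String] = {
  if (opts.contains("-Dspark"))
    Left("spark configs must be set with --conf, not java options")
  else if (opts.contains("-Xmx"))
    Left("use the memory config, not -Xmx, to set the heap size")
  else
    Right(opts)
}
```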
[GitHub] spark pull request: SPARK-3779. yarn spark.yarn.applicationMaster....
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/3471#discussion_r21309273

--- Diff: docs/running-on-yarn.md ---
@@ -22,10 +22,12 @@ Most of the configs are the same for Spark on YARN as for other deployment modes
 <table class="table">
 <tr><th>Property Name</th><th>Default</th><th>Meaning</th></tr>
 <tr>
-  <td><code>spark.yarn.applicationMaster.waitTries</code></td>
-  <td>10</td>
+  <td><code>spark.yarn.applicationMaster.waitTime</code></td>
--- End diff --

Can we rename it to spark.yarn.am.waitTime? (to be consistent with PR 3409 and possibly others)
[GitHub] spark pull request: SPARK-3779. yarn spark.yarn.applicationMaster....
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/3471#discussion_r21309453

--- Diff: yarn/common/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala ---
@@ -329,8 +329,10 @@ private[spark] class ApplicationMaster(args: ApplicationMasterArguments,
     sparkContextRef.synchronized {
       var count = 0
--- End diff --

count isn't needed anymore
[GitHub] spark pull request: SPARK-3779. yarn spark.yarn.applicationMaster....
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/3471#discussion_r21309571

--- Diff: yarn/common/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala ---
@@ -353,13 +355,13 @@ private[spark] class ApplicationMaster(args: ApplicationMasterArguments,
     val hostport = args.userArgs(0)
     val (driverHost, driverPort) = Utils.parseHostPort(hostport)

-    // spark driver should already be up since it launched us, but we don't want to
+    // Spark driver should already be up since it launched us, but we don't want to
     // wait forever, so wait 100 seconds max to match the cluster mode setting.
-    // Leave this config unpublished for now. SPARK-3779 to investigating changing
-    // this config to be time based.
-    val numTries = sparkConf.getInt("spark.yarn.applicationMaster.waitTries", 1000)
+    val waitTime = 100
+    val totalWaitTime = sparkConf.getInt("spark.yarn.applicationMaster.waitTime", 10)
+    val deadline = System.currentTimeMillis + totalWaitTime

-    while (!driverUp && !finished && count < numTries) {
       try {
         count = count + 1
--- End diff --

count isn't needed anymore
[GitHub] spark pull request: SPARK-3779. yarn spark.yarn.applicationMaster....
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/3471#discussion_r21309688

--- Diff: yarn/common/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala ---
@@ -353,13 +355,13 @@ private[spark] class ApplicationMaster(args: ApplicationMasterArguments,
     val hostport = args.userArgs(0)
     val (driverHost, driverPort) = Utils.parseHostPort(hostport)

-    // spark driver should already be up since it launched us, but we don't want to
+    // Spark driver should already be up since it launched us, but we don't want to
     // wait forever, so wait 100 seconds max to match the cluster mode setting.
-    // Leave this config unpublished for now. SPARK-3779 to investigating changing
-    // this config to be time based.
-    val numTries = sparkConf.getInt("spark.yarn.applicationMaster.waitTries", 1000)
+    val waitTime = 100
+    val totalWaitTime = sparkConf.getInt("spark.yarn.applicationMaster.waitTime", 10)
+    val deadline = System.currentTimeMillis + totalWaitTime

-    while (!driverUp && !finished && count < numTries) {
+    while (!driverUp && !finished && System.currentTimeMillis < deadline + waitTime) {
--- End diff --

why are we adding waitTime to deadline here?
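The diff above is moving from a retry-count loop to a time-based wait, which is what the review comments push toward. A standalone sketch of the deadline-based form (this simulates the driver check with a callback and is not Spark's actual code; SPARK-3779's real version also has to pick sane units and defaults):

```scala
// Poll until the condition holds or the deadline passes.
// Returns whether the condition ever became true.
def waitForDriver(totalWaitTimeMs: Long, pollMs: Long)(driverUp: () => Boolean): Boolean = {
  val deadline = System.currentTimeMillis + totalWaitTimeMs
  var up = driverUp()
  while (!up && System.currentTimeMillis < deadline) {
    Thread.sleep(pollMs)
    up = driverUp()
  }
  up
}
```

Computing the deadline once up front is what makes the total wait independent of how long each poll takes, which is the advantage over counting tries.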
[GitHub] spark pull request: [SPARK-1953]yarn client mode Application Maste...
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/3607#discussion_r21309773
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala ---
@@ -107,6 +108,10 @@ private[spark] class SparkSubmitArguments(args: Seq[String], env: Map[String, St
       .orElse(sparkProperties.get("spark.driver.memory"))
       .orElse(env.get("SPARK_DRIVER_MEMORY"))
       .orNull
+    amMemory = Option(amMemory)
+      .orElse(sparkProperties.get("spark.yarn.appMaster.memory"))
--- End diff --
can we call this spark.yarn.am.memory
[GitHub] spark pull request: [SPARK-1953]yarn client mode Application Maste...
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/3607#discussion_r21309762
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala ---
@@ -107,6 +108,10 @@ private[spark] class SparkSubmitArguments(args: Seq[String], env: Map[String, St
       .orElse(sparkProperties.get("spark.driver.memory"))
       .orElse(env.get("SPARK_DRIVER_MEMORY"))
       .orNull
+    amMemory = Option(amMemory)
+      .orElse(sparkProperties.get("spark.yarn.appMaster.memory"))
+      .orElse(env.get("SPARK_YARN_AM_MEMORY"))
--- End diff --
env variables are only for backwards compatibility; we shouldn't add them for new configs, so can you please remove it.
[GitHub] spark pull request: [SPARK-1953]yarn client mode Application Maste...
Github user WangTaoTheTonic commented on a diff in the pull request: https://github.com/apache/spark/pull/3607#discussion_r21310482
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala ---
@@ -107,6 +108,10 @@ private[spark] class SparkSubmitArguments(args: Seq[String], env: Map[String, St
       .orElse(sparkProperties.get("spark.driver.memory"))
       .orElse(env.get("SPARK_DRIVER_MEMORY"))
       .orNull
+    amMemory = Option(amMemory)
+      .orElse(sparkProperties.get("spark.yarn.appMaster.memory"))
--- End diff --
Ok, I used appMaster because we already have an item called spark.yarn.appMasterEnv.*, but spark.yarn.am.memory looks simpler.
[GitHub] spark pull request: [SPARK-1953]yarn client mode Application Maste...
Github user WangTaoTheTonic commented on a diff in the pull request: https://github.com/apache/spark/pull/3607#discussion_r21310507
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala ---
@@ -107,6 +108,10 @@ private[spark] class SparkSubmitArguments(args: Seq[String], env: Map[String, St
       .orElse(sparkProperties.get("spark.driver.memory"))
       .orElse(env.get("SPARK_DRIVER_MEMORY"))
       .orNull
+    amMemory = Option(amMemory)
+      .orElse(sparkProperties.get("spark.yarn.appMaster.memory"))
+      .orElse(env.get("SPARK_YARN_AM_MEMORY"))
--- End diff --
Got it.
[GitHub] spark pull request: [SPARK-1953]yarn client mode Application Maste...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3607#issuecomment-65648737 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24141/ Test PASSed.
[GitHub] spark pull request: [SPARK-1953]yarn client mode Application Maste...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3607#issuecomment-65648726 [Test build #24141 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24141/consoleFull) for PR 3607 at commit [`0566bb8`](https://github.com/apache/spark/commit/0566bb89318a848ee6d2f551430d9fd135a22c7d). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-1953]yarn client mode Application Maste...
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/3607#discussion_r21310970
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala ---
@@ -107,6 +108,10 @@ private[spark] class SparkSubmitArguments(args: Seq[String], env: Map[String, St
       .orElse(sparkProperties.get("spark.driver.memory"))
       .orElse(env.get("SPARK_DRIVER_MEMORY"))
       .orNull
+    amMemory = Option(amMemory)
+      .orElse(sparkProperties.get("spark.yarn.appMaster.memory"))
--- End diff --
Yeah, there are various PRs up right now with AM-related configs; trying to be consistent and use .am.
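The precedence chain being reviewed is just `Option#orElse` composition: an explicit CLI value wins, then a Spark property, then (for legacy configs only) an environment variable. A minimal standalone sketch, with illustrative names rather than the actual SparkSubmitArguments fields:

```scala
// Resolve a setting the way the diff does: CLI value, then spark
// property, then env var fallback; null when nothing is set.
def resolve(cliValue: String,
            sparkProperties: Map[String, String],
            key: String,
            envValue: Option[String]): String =
  Option(cliValue)
    .orElse(sparkProperties.get(key))
    .orElse(envValue)
    .orNull
```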
[GitHub] spark pull request: [SPARK-1953]yarn client mode Application Maste...
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/3607#discussion_r21311166
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala ---
@@ -279,6 +285,10 @@ private[spark] class SparkSubmitArguments(args: Seq[String], env: Map[String, St
         driverExtraLibraryPath = value
         parse(tail)
+      case ("--am-memory") :: value :: tail =>
--- End diff --
I am a little bit on the fence here about having this config in spark-submit. I'm not sure if it will cause more confusion since it only applies to client mode. I'm wondering if perhaps we just add the config for now. @vanzin @andrewor14 thoughts on that since you both commented on the am.extraJavaOptions pr
[GitHub] spark pull request: [SPARK-4730][YARN] Warn against deprecated YAR...
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/3590#discussion_r21311665
--- Diff: yarn/common/src/main/scala/org/apache/spark/scheduler/cluster/YarnClientSchedulerBackend.scala ---
@@ -78,11 +79,25 @@ private[spark] class YarnClientSchedulerBackend(
       ("--queue", "SPARK_YARN_QUEUE", "spark.yarn.queue"),
       ("--name", "SPARK_YARN_APP_NAME", "spark.app.name")
     )
+    // Warn against the following deprecated environment variables: env var -> suggestion
+    val deprecatedEnvVars = Map(
+      "SPARK_MASTER_MEMORY" -> "SPARK_DRIVER_MEMORY or --driver-memory through spark-submit",
+      "SPARK_WORKER_INSTANCES" -> "SPARK_WORKER_INSTANCES or --num-executors through spark-submit",
+      "SPARK_WORKER_MEMORY" -> "SPARK_EXECUTOR_MEMORY or --executor-memory through spark-submit",
+      "SPARK_WORKER_CORES" -> "SPARK_EXECUTOR_CORES or --executor-cores through spark-submit")
--- End diff --
aren't essentially all of the env variables deprecated? I know we have warnings on some throughout the code but haven't checked for all of them.
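A warning map like the one in this diff could be driven by a small helper; the sketch below uses assumed names and returns the messages instead of logging them (the real patch would go through the YARN backend's logger, and `env` stands in for `sys.env`):

```scala
// Map each legacy environment variable to its suggested replacement
// and produce one deprecation notice per variable that is actually set.
val deprecatedEnvVars = Map(
  "SPARK_MASTER_MEMORY" -> "SPARK_DRIVER_MEMORY or --driver-memory through spark-submit",
  "SPARK_WORKER_MEMORY" -> "SPARK_EXECUTOR_MEMORY or --executor-memory through spark-submit",
  "SPARK_WORKER_CORES"  -> "SPARK_EXECUTOR_CORES or --executor-cores through spark-submit")

def deprecationWarnings(env: Map[String, String]): Seq[String] =
  deprecatedEnvVars.collect {
    case (envVar, suggestion) if env.contains(envVar) =>
      s"NOTE: $envVar is deprecated. Use $suggestion instead."
  }.toSeq
```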
[GitHub] spark pull request: [SPARK-4461][YARN] pass extra java options to ...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/3409#issuecomment-65651760 Agree with @tgravescs, adding spark.yarn.am.extraClassPath and spark.yarn.am.extraLibraryPath together would be better. @zhzhan You can also check https://issues.apache.org/jira/browse/SPARK-4181.
[GitHub] spark pull request: [SPARK-4685] Include all spark.ml and spark.ml...
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/3598#issuecomment-65664523 LGTM in retrospect
[GitHub] spark pull request: SPARK-3779. yarn spark.yarn.applicationMaster....
Github user sryza commented on the pull request: https://github.com/apache/spark/pull/3471#issuecomment-65675297 Thanks for the feedback, Tom. Updated the patch to reflect your and Wang Tao's comments. I left out adding MS to the config name because it's inconsistent with all Spark's existing configs. I agree that it would have been better to start out including the units in config names, but I think it'll be confusing to have different conventions for different configs here.
[GitHub] spark pull request: SPARK-3779. yarn spark.yarn.applicationMaster....
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3471#issuecomment-65675709 [Test build #24142 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24142/consoleFull) for PR 3471 at commit [`ce6dff2`](https://github.com/apache/spark/commit/ce6dff2be37e1ab40925f2b60182565386245438). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4683][SQL] Add a beeline.cmd to run on ...
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3599#issuecomment-65677001 Thanks Cheng, I'll pull this in.
[GitHub] spark pull request: [SPARK-4683][SQL] Add a beeline.cmd to run on ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/3599
[GitHub] spark pull request: [STREAMING] Add redis pub/sub streaming suppor...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2348#issuecomment-65678290 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-1953]yarn client mode Application Maste...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/3607#discussion_r21324538
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala ---
@@ -279,6 +285,10 @@ private[spark] class SparkSubmitArguments(args: Seq[String], env: Map[String, St
         driverExtraLibraryPath = value
         parse(tail)
+      case ("--am-memory") :: value :: tail =>
--- End diff --
I'd prefer not to add this to SparkSubmit. I've never seen someone have to fiddle with that value, so my guess is that this is such an uncommon need that those who want to use it wouldn't be bothered by the more verbose --conf approach. Also, should probably add a memory overhead config too.
[GitHub] spark pull request: [SPARK-2199] [mllib] topic modeling
Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/1269#issuecomment-65682412 @akopich Thanks for the responses! Follow-ups:
(1) Users implementing their own regularizers
You're right that this would be nice to have. If we include it, then perhaps we can spend more time to have a clean API for regularizers which can be re-used in other algorithms which try to optimize an objective. (E.g., ideally, Dirichlet regularization would be implemented such that any algorithm which needed a Dirichlet regularizer (or prior) could reuse your code.) Here are some thoughts on that:
* All of the regularizers here operate on each Matrix element individually. If that will be the case for all useful regularizers, then the regularizer API could operate per-element (to be simpler), and the code using the regularizers could iterate over all elements as needed.
* The regularizer API should use general terminology, such as:
  * penalty(param: Double): Double
  * gradient(param: Double): Double
Alternatively, we could use this development path to avoid having to decide on the API right now:
* In this PR, keep the regularizer types public, but make all of their internals private[mllib]. This way, the API choices are kept private for now.
* In a later PR, the regularizer API can be refined and made public so that users can implement their own pluggable types.
(2) Regular and Robust in the same class
You should be able to extract the functionality you need. E.g., if the newIteration method knows that it has a DocumentParameters instance, it can call getNewTheta(). If the actual type of the instance is RobustDocumentParameters, then RobustDocumentParameters.getNewTheta() will be called. But I would recommend having both classes implement getNewTheta() with the same visibility (private/public), and also to use the "override" keyword in RobustDocumentParameters. You should be able to abstract any other needed functionality similarly.
(3) PLSA and RobustPLSA code duplication
Looking more closely, I think you're right about it being hard to abstract further. (I'll let you know if I have ideas.)
(4) Float vs. Double
True, it may be worth the trouble to use Float to save on memory and communication. I don't know enough about PLSA to know how important numerical precision is in general. Your approach sounds reasonable then. One alternative would be to use Breeze matrices (but not in public APIs!), but I'd only suggest that if it will simplify or shorten code.
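The per-element API proposed in point (1) can be made concrete with a small trait. This is a sketch of the proposal, not spark.mllib code; the trait name and the L2 instance are illustrative:

```scala
// Per-element regularizer API: penalty and gradient take one matrix
// element at a time, so any optimizer that iterates over elements
// (a Dirichlet prior, an L2 penalty, ...) can reuse implementations.
trait ElementwiseRegularizer {
  def penalty(param: Double): Double
  def gradient(param: Double): Double
}

// L2 penalty as the simplest concrete instance: lambda/2 * x^2.
class L2Regularizer(lambda: Double) extends ElementwiseRegularizer {
  override def penalty(param: Double): Double = 0.5 * lambda * param * param
  override def gradient(param: Double): Double = lambda * param
}
```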
[GitHub] spark pull request: Spark Core - [SPARK-3620] - Refactor of SparkS...
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/2516#issuecomment-65682500 @pwendell I was less interested in the refactoring part than in formalizing the precedence for the options in a more obvious manner in the code. Right now that's a little confusing. But yeah, this patch is rather large, and a lot has changed since it was last updated...
[GitHub] spark pull request: Spark 3883: SSL support for HttpServer and Akk...
Github user vanzin commented on the pull request: https://github.com/apache/spark/pull/3571#issuecomment-65682860 @jacek-lewandowski from a quick look at the diff, it seems you didn't change anything w.r.t. the configuration. In master, there's no need to add a new config file nor all the different ways of loading it - all daemons should be loading spark-defaults.conf and so you could just use SparkConf for everything like I suggested in the old PR. Did you have a chance to look at that?
[GitHub] spark pull request: [SPARK-4397] Move object RDD to the front of R...
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/3580#issuecomment-65682925 Good catch on the return types. Would be great if we can make ScalaStyle complain about those.
[GitHub] spark pull request: [SPARK-4697][YARN]System properties should ove...
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/3557#discussion_r21328217
--- Diff: yarn/common/src/main/scala/org/apache/spark/scheduler/cluster/YarnClientSchedulerBackend.scala ---
@@ -79,10 +79,10 @@ private[spark] class YarnClientSchedulerBackend(
       ("--name", "SPARK_YARN_APP_NAME", "spark.app.name")
     )
     optionTuples.foreach { case (optionName, envVar, sparkProp) =>
-      if (System.getenv(envVar) != null) {
-        extraArgs += (optionName, System.getenv(envVar))
-      } else if (sc.getConf.contains(sparkProp)) {
+      if (sc.getConf.contains(sparkProp)) {
         extraArgs += (optionName, sc.getConf.get(sparkProp))
+      } else if (System.getenv(envVar) != null) {
+        extraArgs += (optionName, System.getenv(envVar))
--- End diff --
The method you're modifying (`getExtraClientArguments`) is the one that defines the `--name` argument for `ClientArguments`. And you're inverting the priority here, so that `spark.app.name` > `SPARK_YARN_APP_NAME`. So basically, since `spark.app.name` is mandatory, `SPARK_YARN_APP_NAME` becomes useless. But please test it; make sure both work, both in client and cluster mode. Something might have changed since those fixes went in, although I kinda doubt it.
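The inverted priority vanzin describes (Spark property consulted first, environment variable only as a fallback) can be isolated in a small pure function for testing. Names are illustrative; `conf` and `env` stand in for `sc.getConf` and `System.getenv`, and the real code appends to a mutable `extraArgs` buffer instead of returning a sequence:

```scala
// For each (cli option, env var, spark property) tuple, emit the option
// with the property value when the property is set, otherwise fall back
// to the environment variable, otherwise emit nothing.
def extraArgsFor(optionTuples: Seq[(String, String, String)],
                 conf: Map[String, String],
                 env: Map[String, String]): Seq[String] =
  optionTuples.flatMap { case (optionName, envVar, sparkProp) =>
    conf.get(sparkProp).orElse(env.get(envVar)).toSeq.flatMap(v => Seq(optionName, v))
  }
```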
[GitHub] spark pull request: [SPARK-2188] Support sbt/sbt for Windows
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3591#issuecomment-65687621 Our `sbt` shell scripts are from the [sbt-launcher-package](https://github.com/sbt/sbt-launcher-package) project. Do you think we should try to submit this change upstream first?
[GitHub] spark pull request: SPARK-3779. yarn spark.yarn.applicationMaster....
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3471#issuecomment-65687913 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24142/ Test PASSed.
[GitHub] spark pull request: SPARK-3779. yarn spark.yarn.applicationMaster....
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3471#issuecomment-65687902 [Test build #24142 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24142/consoleFull) for PR 3471 at commit [`ce6dff2`](https://github.com/apache/spark/commit/ce6dff2be37e1ab40925f2b60182565386245438). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4253]Ignore spark.driver.host in yarn-c...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/3112
[GitHub] spark pull request: [SPARK-4253]Ignore spark.driver.host in yarn-c...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3112#issuecomment-65693674 Thanks for testing this out! I also tested it with my own integration test and it now passes, so this looks good to me. I'm going to merge this into `master` and `branch-1.2`. I'll edit the commit message to reflect the bug description from JIRA.
[GitHub] spark pull request: [SPARK-4253]Ignore spark.driver.host in yarn-c...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3112#issuecomment-65694771 I've also backported this to `branch-1.1`.
[GitHub] spark pull request: [HOTFIX] Fixing two issues with the release sc...
GitHub user pwendell opened a pull request: https://github.com/apache/spark/pull/3608 [HOTFIX] Fixing two issues with the release script. 1. The version replacement was still producing some false changes. 2. Uploads to the staging repo specifically. You can merge this pull request into a Git repository by running: $ git pull https://github.com/pwendell/spark release-script Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3608.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3608 commit 3c63294a3109b13ad570a7a6056eedae0558f029 Author: Patrick Wendell pwend...@gmail.com Date: 2014-11-28T22:10:13Z Fixing two issues with the release script: 1. The version replacement was still producing some false changes. 2. Uploads to the staging repo specifically.
[GitHub] spark pull request: [HOTFIX] Fixing two issues with the release sc...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/3608
[GitHub] spark pull request: [HOTFIX] Fixing two issues with the release sc...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3608#issuecomment-65698083 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24143/ Test FAILed.
[GitHub] spark pull request: [SPARK-3431] [WIP] Parallelize test execution
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3564#issuecomment-65700600 [Test build #24145 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24145/consoleFull) for PR 3564 at commit [`ef705a4`](https://github.com/apache/spark/commit/ef705a4536dbdc3c46e0a05a18098624b5d6be5c). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3431] [WIP] Parallelize test execution
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3564#issuecomment-65701539 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24144/ Test FAILed.
[GitHub] spark pull request: [SPARK-3431] [WIP] Parallelize test execution
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3564#issuecomment-65703187 [Test build #24145 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24145/consoleFull) for PR 3564 at commit [`ef705a4`](https://github.com/apache/spark/commit/ef705a4536dbdc3c46e0a05a18098624b5d6be5c). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3431] [WIP] Parallelize test execution
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3564#issuecomment-65703195 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24145/ Test FAILed.
[GitHub] spark pull request: [SPARK-3431] [WIP] Parallelize test execution
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3564#issuecomment-65704277 [Test build #24146 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24146/consoleFull) for PR 3564 at commit [`bf1d46f`](https://github.com/apache/spark/commit/bf1d46f1bb5ac343a2e47b3e9880d2f35b013603). * This patch merges cleanly.
[GitHub] spark pull request: Update DecisionTree.scala
GitHub user emtl97 opened a pull request: https://github.com/apache/spark/pull/3609 Update DecisionTree.scala Hello, hope you are well. We've been using DecisionTree at Samsung and hope to help in some small way. I was interested in setting the seed for the sampling (i.e. at line 988); we're in the process of creating tests for our code, and being able to set the seed is helpful. To that end, the sample method here depends on a PartitionwiseSampledRDD, and I think the `compute` method there uses a different seed from the one that can be passed to the constructor of PartitionwiseSampledRDD; it uses `split.seed` (below). Hope we can discuss more! Thank you. Best wishes, Ed

```scala
override def compute(splitIn: Partition, context: TaskContext): Iterator[U] = {
  val split = splitIn.asInstanceOf[PartitionwiseSampledRDDPartition]
  val thisSampler = sampler.clone
  thisSampler.setSeed(split.seed)
  thisSampler.sample(firstParent[T].iterator(split.prev, context))
}
```

You can merge this pull request into a Git repository by running: $ git pull https://github.com/emtl97/spark patch-1 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3609.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3609 commit b2f6926cc5d4e5020b6811f6952101d6882877e1 Author: Ed sigm...@yahoo.com Date: 2014-12-04T21:11:41Z Update DecisionTree.scala
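The seeding behavior under discussion can be modeled outside Spark. Below is a minimal Python sketch (hypothetical names; the real `PartitionwiseSampledRDD` derives a per-partition `split.seed`, not shown here) of why deriving each partition's seed from a single user-settable base seed makes sampling reproducible:

```python
import random

def sample_partitions(partitions, fraction, base_seed):
    """Sample each partition with its own RNG, seeding it from a
    user-supplied base seed plus the partition index (a hypothetical
    per-partition seeding scheme, for illustration only)."""
    sampled = []
    for idx, part in enumerate(partitions):
        rng = random.Random(base_seed + idx)  # deterministic per-partition seed
        sampled.append([x for x in part if rng.random() < fraction])
    return sampled

parts = [list(range(10)), list(range(10, 20))]
a = sample_partitions(parts, 0.5, base_seed=42)
b = sample_partitions(parts, 0.5, base_seed=42)
assert a == b  # same base seed => identical samples on every run
```

With a fixed base seed, tests can assert on exact sample contents; without one, each run draws a different sample and such tests become flaky.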
[GitHub] spark pull request: Update DecisionTree.scala
Github user emtl97 closed the pull request at: https://github.com/apache/spark/pull/3609
[GitHub] spark pull request: [SPARK-3431] [WIP] Parallelize test execution
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3564#issuecomment-65706465 [Test build #24146 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24146/consoleFull) for PR 3564 at commit [`bf1d46f`](https://github.com/apache/spark/commit/bf1d46f1bb5ac343a2e47b3e9880d2f35b013603). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3431] [WIP] Parallelize test execution
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3564#issuecomment-65706473 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24146/ Test FAILed.
[GitHub] spark pull request: [SPARK-3431] [WIP] Parallelize test execution
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3564#issuecomment-65707887 [Test build #24147 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24147/consoleFull) for PR 3564 at commit [`ab127b7`](https://github.com/apache/spark/commit/ab127b798dbfa9399833d546e627f9651b060918). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4349] Checking if parallel collection p...
Github user mccheah closed the pull request at: https://github.com/apache/spark/pull/3275
[GitHub] spark pull request: [SPARK-4349] Checking if parallel collection p...
Github user mccheah commented on the pull request: https://github.com/apache/spark/pull/3275#issuecomment-65709183 We want a more generic fix than this. I'll push something new which will be completely different, addressing the issue further down in the stack.
[GitHub] spark pull request: [SPARK-3431] [WIP] Parallelize test execution
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3564#issuecomment-65710986 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/24147/ Test FAILed.
[GitHub] spark pull request: [SPARK-3431] [WIP] Parallelize test execution
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3564#issuecomment-65710977 [Test build #24147 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/24147/consoleFull) for PR 3564 at commit [`ab127b7`](https://github.com/apache/spark/commit/ab127b798dbfa9399833d546e627f9651b060918). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: SPARK-4749: Allow initializing KMeans clusters...
GitHub user nxwhite-str opened a pull request: https://github.com/apache/spark/pull/3610 SPARK-4749: Allow initializing KMeans clusters using a seed This implements the functionality for SPARK-4749 and provides unit tests in Scala and PySpark. You can merge this pull request into a Git repository by running: $ git pull https://github.com/nxwhite-str/spark master Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/3610.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3610 commit 35c188463798729b65ca74549984cb765ac1e9c9 Author: nate.crosswhite nate.crosswh...@stresearch.com Date: 2014-12-04T19:12:29Z Add kmeans initial seed to pyspark API commit 616d11187128ca5bb1ecce1bfe3ca2df16529f61 Author: nate.crosswhite nate.crosswh...@stresearch.com Date: 2014-12-04T19:13:12Z Merge remote-tracking branch 'upstream/master' commit 5d087b40e14db51b1eeb44e462e04d5e718338be Author: nate.crosswhite nate.crosswh...@stresearch.com Date: 2014-12-04T21:25:49Z Adding KMeans train with seed and Scala unit test commit 9156a5782c254bbf765954fffcee1ca34d5d0b7f Author: nate.crosswhite nate.crosswh...@stresearch.com Date: 2014-12-04T21:28:32Z Merge remote-tracking branch 'upstream/master'
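What a user-supplied seed buys here can be sketched without Spark: with the same seed, the initial cluster centers come out identical on every run, which is what makes the accompanying unit tests deterministic. A minimal Python illustration (plain seeded random choice for clarity; MLlib's actual default initializer is k-means||, which is not reproduced here):

```python
import random

def init_centers(points, k, seed):
    """Pick k initial cluster centers using a seeded RNG.
    A conceptual sketch of seeded initialization, not MLlib's algorithm."""
    rng = random.Random(seed)
    return rng.sample(points, k)

pts = [(float(i), float(i % 3)) for i in range(12)]
c1 = init_centers(pts, 3, seed=7)
c2 = init_centers(pts, 3, seed=7)
assert c1 == c2  # same seed => same initial centers, so tests can be exact
```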
[GitHub] spark pull request: SPARK-4749: Allow initializing KMeans clusters...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3610#issuecomment-65712520 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-4745] Fix get_existing_cluster() functi...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/3596#discussion_r21341088 --- Diff: ec2/spark_ec2.py --- @@ -504,9 +504,9 @@ def get_existing_cluster(conn, opts, cluster_name, die_on_error=True): active = [i for i in res.instances if is_active(i)] for inst in active: group_names = [g.name for g in inst.groups] -if group_names == [cluster_name + "-master"]: +if cluster_name + "-master" in group_names: --- End diff -- Minor nit, but I think adding parentheses here would make the operator precedence clearer. I'm going to do this myself while merging.
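The behavior difference between the two checks, and the purely cosmetic parentheses, can be seen in plain Python. `+` binds tighter than `in`, so the parentheses do not change evaluation; they only make the grouping obvious to readers:

```python
# Hypothetical sample data: an instance that belongs to two security groups.
group_names = ["my-cluster-master", "my-cluster-slaves"]
cluster_name = "my-cluster"

# Old check: equality against a single-element list. It fails whenever the
# instance is in more than one group, even if the master group is present.
old = group_names == [cluster_name + "-master"]

# New check: membership. Parentheses added per the review comment;
# "+" already binds tighter than "in", so they are for readability only.
new = (cluster_name + "-master") in group_names

assert old is False
assert new is True
```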
[GitHub] spark pull request: [SPARK-4745] Fix get_existing_cluster() functi...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3596#issuecomment-65713681 LGTM. I tested this out myself and it works, so I'm going to merge this into `master`, `branch-1.2` and `branch-1.1`.
[GitHub] spark pull request: [SPARK-4745] Fix get_existing_cluster() functi...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/3596
[GitHub] spark pull request: Add a Note on jsonFile having separate JSON ob...
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/3517#discussion_r21341182 --- Diff: docs/sql-programming-guide.md --- @@ -621,7 +621,7 @@ val sqlContext = new org.apache.spark.sql.SQLContext(sc) // A JSON dataset is pointed to by path. // The path can be either a single text file or a directory storing text files. -val path = "examples/src/main/resources/people.json" +val path = "examples/src/main/resources/people.txt" --- End diff -- We need to move the file too and update the other places that reference it: ``` examples/src/main/java/org/apache/spark/examples/sql/JavaSparkSQL.java: String path = "examples/src/main/resources/people.json"; examples/src/main/python/sql.py: path = os.path.join(os.environ['SPARK_HOME'], "examples/src/main/resources/people.json") ```
[GitHub] spark pull request: Add a Note on jsonFile having separate JSON ob...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/3517#issuecomment-65714075 LGTM once my comment is addressed. Thanks!
[GitHub] spark pull request: Add a Note on jsonFile having separate JSON ob...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3517#issuecomment-65714207 One thought: will the changed example file name / location be confusing for people reading documentation versions that don't match their Spark version?
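For context on the note this thread is adding: `jsonFile` reads line-delimited JSON, where each line is a separate, self-contained JSON object, rather than one pretty-printed document spanning multiple lines. That format can be sketched in plain Python with the stdlib (no Spark needed; the sample records echo the guide's people dataset):

```python
import json

# One self-contained JSON object per line -- the shape jsonFile expects.
raw = '{"name": "Michael"}\n{"name": "Andy", "age": 30}\n{"name": "Justin", "age": 19}'
records = [json.loads(line) for line in raw.splitlines()]
assert len(records) == 3
assert records[1]["age"] == 30
```

A single multi-line object such as a pretty-printed `{"name": "Michael"}` spread over three lines would fail this line-by-line parse, which is exactly the pitfall the documentation note warns about.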
[GitHub] spark pull request: [SPARK-4618][SQL] Make foreign DDL commands op...
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/3470#discussion_r21341852 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala --- @@ -37,7 +37,7 @@ import org.apache.spark.sql.catalyst.expressions.{Expression, Attribute} @DeveloperApi trait RelationProvider { /** Returns a new base relation with the given parameters. */ - def createRelation(sqlContext: SQLContext, parameters: Map[String, String]): BaseRelation + def createRelation(sqlContext: SQLContext, parameters: CaseInsensitiveMap): BaseRelation --- End diff -- `spark.sql.caseSensitive` is about identifiers (i.e., attributes and table names). I'd say this is more analogous to keyword case insensitivity. I don't know any database that doesn't treat `SELECT` and `select` the same, so I'm not sure that should be configurable. You can still pass your `CaseInsensitiveMap` in and it will have the desired effect. Just don't change the function signature.
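The idea of a `CaseInsensitiveMap` for option keys can be sketched in a few lines of Python (an illustration of the concept under discussion, not Spark's implementation): keys are lower-cased on insertion and on lookup, so callers can pass the map where a plain map is expected without changing the signature.

```python
class CaseInsensitiveMap(dict):
    """Minimal sketch of a case-insensitive option map:
    lower-case keys on the way in and on every lookup."""

    def __init__(self, data=None):
        super().__init__()
        for k, v in (data or {}).items():
            self[k] = v

    def __setitem__(self, key, value):
        super().__setitem__(key.lower(), value)

    def __getitem__(self, key):
        return super().__getitem__(key.lower())

    def __contains__(self, key):
        return super().__contains__(key.lower())

# Hypothetical DDL options: "Path", "PATH", and "path" all hit the same entry.
opts = CaseInsensitiveMap({"Path": "/data/people.json"})
assert opts["PATH"] == "/data/people.json"
assert "path" in opts
```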
[GitHub] spark pull request: [SPARK-4459] Change groupBy type parameter fro...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/3327#discussion_r21341919 --- Diff: core/src/test/java/org/apache/spark/JavaAPISuite.java ---

```java
  @Test
  public void groupByOnPairRDD() {
    JavaRDD<Integer> rdd = sc.parallelize(Arrays.asList(1, 1, 2, 3, 5, 8, 13));
    Function<scala.Tuple2<Integer, Integer>, Boolean> areOdd =
      new Function<scala.Tuple2<Integer, Integer>, Boolean>() {
        @Override
        public Boolean call(scala.Tuple2<Integer, Integer> x) {
          return x._1 % 2 == 0 && x._2 % 2 == 0;
        }
      };
    JavaPairRDD<Integer, Integer> pairrdd = rdd.zip(rdd);
    JavaPairRDD<Boolean, Iterable<scala.Tuple2<Integer, Integer>>> oddsAndEvens = pairrdd.groupBy(areOdd);
    Assert.assertEquals(2, oddsAndEvens.count());
    Assert.assertEquals(2, Iterables.size(oddsAndEvens.lookup(true).get(0)));   // Evens
    Assert.assertEquals(5, Iterables.size(oddsAndEvens.lookup(false).get(0)));  // Odds

    oddsAndEvens = pairrdd.groupBy(areOdd, 1);
    Assert.assertEquals(2, oddsAndEvens.count());
    Assert.assertEquals(2, Iterables.size(oddsAndEvens.lookup(true).get(0)));   // Evens
    Assert.assertEquals(5, Iterables.size(oddsAndEvens.lookup(false).get(0)));  // Odds
  }

  @Test
  public void keyByOnPairRDD() {
    JavaRDD<Integer> rdd = sc.parallelize(Arrays.asList(1, 1, 2, 3, 5, 8, 13));
    Function<scala.Tuple2<Integer, Integer>, String> areOdd =
      new Function<scala.Tuple2<Integer, Integer>, String>() {
        @Override
        public String call(scala.Tuple2<Integer, Integer> x) {
          return "" + (x._1 + "" + x._2);
        }
      };
```

--- End diff -- The spacing here is messy. Also, `"" + x` is messy; just do `x.toString()` instead if you want to convert an object into a string.
[GitHub] spark pull request: [SPARK-4459] Change groupBy type parameter fro...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/3327#discussion_r21342049 --- Diff: core/src/test/java/org/apache/spark/JavaAPISuite.java ---

```java
  @Test
  public void keyByOnPairRDD() {
    JavaRDD<Integer> rdd = sc.parallelize(Arrays.asList(1, 1, 2, 3, 5, 8, 13));
    Function<scala.Tuple2<Integer, Integer>, String> areOdd =
      new Function<scala.Tuple2<Integer, Integer>, String>() {
```

--- End diff -- Also, why is this named `areOdd`? That's not what this function is doing.
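The semantics the new tests exercise — zipping an RDD with itself and grouping the resulting pairs by a boolean function — can be modeled in plain Python (hypothetical helper names; no Spark required), which also confirms the expected 2/5 split asserted above:

```python
from collections import defaultdict

def group_by(items, f):
    """Bucket each item under f(item) -- a plain-Python model of
    the groupBy semantics the JavaAPISuite test exercises."""
    buckets = defaultdict(list)
    for item in items:
        buckets[f(item)].append(item)
    return dict(buckets)

rdd = [1, 1, 2, 3, 5, 8, 13]
pair_rdd = list(zip(rdd, rdd))  # like rdd.zip(rdd)

# True when both components are even (so the name "areOdd" is indeed backwards).
both_even = lambda p: p[0] % 2 == 0 and p[1] % 2 == 0

grouped = group_by(pair_rdd, both_even)
assert len(grouped[True]) == 2   # (2, 2) and (8, 8)
assert len(grouped[False]) == 5  # the five odd pairs
```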
[GitHub] spark pull request: [SPARK-4668] Fix some documentation typos.
Github user ryan-williams commented on the pull request: https://github.com/apache/spark/pull/3523#issuecomment-65716929 added some more documentation typo fixes and added parameter names to a few unmarked booleans, for clarity
[GitHub] spark pull request: [SPARK-4459] Change groupBy type parameter fro...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/3327#discussion_r21342647 --- Diff: core/src/test/java/org/apache/spark/JavaAPISuite.java ---

```java
    Function<scala.Tuple2<Integer, Integer>, Boolean> areOdd =
      new Function<scala.Tuple2<Integer, Integer>, Boolean>() {
```

--- End diff -- Also, you could just write `Tuple2` instead of `scala.Tuple2`.
[GitHub] spark pull request: [SPARK-4459] Change groupBy type parameter fro...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3327#issuecomment-65719085 I downloaded this and tested it out; everything looks fine, modulo these formatting issues. I've fixed the style issues myself and am going to merge this into `master`, `branch-1.2`, and `branch-1.1`. Thanks for fixing this! (I ran the MiMa tests locally and this passes)
[GitHub] spark pull request: [SPARK-4459] Change groupBy type parameter fro...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/3327
[GitHub] spark pull request: [SPARK-4459] Change groupBy type parameter fro...
Github user alokito commented on the pull request: https://github.com/apache/spark/pull/3327#issuecomment-65719809 Thanks for fixing the style issues, I meant to but it's been a tough week.
[GitHub] spark pull request: [SPARK-4652][DOCS] Add docs about spark-git-re...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/3513#discussion_r21344431

--- Diff: docs/ec2-scripts.md ---

    @@ -85,6 +85,11 @@ another.
         specified version of Spark. The `version` can be a version number
         (e.g. 0.7.3) or a specific git hash. By default, a recent version
         will be used.
    +-   `--spark-git-repo=repository url` enables you to run your
    +    development version on EC2 cluster. You need to set
    +    `--spark-version` as git commit hash such as 317e114 not
    +    original release version number. By default, this repository is
    +    set [apache mirror](https://github.com/apache/spark).

--- End diff --

Apache needs to be capitalized here. I'd also swap the order of these sentences so that the default for `--spark-git-repo` appears first, followed by the sentence describing how `--spark-version` needs to be set when using this option.
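For context, the option pairing the diff documents can be sketched as the following hypothetical spark-ec2 invocation. The cluster name, key name, and key path are made up for illustration; the git hash 317e114 is the example used in the diff itself:

```shell
# Launch a cluster built from a custom repository at a specific commit.
# --spark-version must be a git commit hash (not a release number)
# when --spark-git-repo is given. Key names and cluster name are
# placeholders, not from the original document.
./spark-ec2 -k my-key -i ~/.ssh/my-key.pem \
  --spark-git-repo=https://github.com/apache/spark \
  --spark-version=317e114 \
  launch my-dev-cluster
```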
[GitHub] spark pull request: [SPARK-4652][DOCS] Add docs about spark-git-re...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/3513#issuecomment-65721200 This looks good to me. There's one minor sentence-ordering and capitalization issue that I'd like to fix, but I'll do it myself on merge. I'm going to merge this into `master`, `branch-1.2`, and `branch-1.1`. Thanks!
[GitHub] spark pull request: [SPARK-4652][DOCS] Add docs about spark-git-re...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/3513
[GitHub] spark pull request: [SPARK-4005][CORE] handle message replies in r...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/2853#discussion_r21345364

--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManagerMasterActor.scala ---

    @@ -351,23 +350,23 @@ class BlockManagerMasterActor(val isLocal: Boolean, conf: SparkConf, listenerBus
           storageLevel: StorageLevel,
           memSize: Long,
           diskSize: Long,
    -      tachyonSize: Long) {
    +      tachyonSize: Long): Boolean = {
    +    var updated = true
         if (!blockManagerInfo.contains(blockManagerId)) {
           if (blockManagerId.isDriver && !isLocal) {
             // We intentionally do not register the master (except in local mode),
             // so we should not indicate failure.
    -        sender ! true
    +        // do nothing here, updated == true.

--- End diff --

Why not just `return true` here, and `return false` in the other branch so that we can eliminate the mutable `updated` variable?
[GitHub] spark pull request: [SPARK-4005][CORE] handle message replies in r...
Github user JoshRosen commented on a diff in the pull request: https://github.com/apache/spark/pull/2853#discussion_r21345387

--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManagerMasterActor.scala ---

    @@ -391,7 +390,7 @@ class BlockManagerMasterActor(val isLocal: Boolean, conf: SparkConf, listenerBus
           if (locations.size == 0) {
             blockLocations.remove(blockId)
           }
    -      sender ! true
    +      updated

--- End diff --

Similarly, why not just `return true` here?
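Taken together, the two review comments suggest replacing the mutable flag with direct returns from each branch. A minimal, self-contained Scala sketch of the before/after is below; the names and control flow are illustrative only, not the actual `BlockManagerMasterActor` code:

```scala
// Hypothetical standalone sketch of the suggested refactor: eliminate a
// mutable `updated` flag by returning directly from each branch.
object UpdateSketch {
  // Stand-in for blockManagerInfo; empty here, so `contains` is always false.
  private val known = scala.collection.mutable.Set[String]()

  // Before: track the result in a var and fall through to it at the end.
  def updateWithFlag(id: String, isDriver: Boolean, isLocal: Boolean): Boolean = {
    var updated = true
    if (!known.contains(id)) {
      if (isDriver && !isLocal) {
        // do nothing here, updated == true
      } else {
        updated = false
      }
    }
    updated
  }

  // After: return from each branch directly, no mutable state.
  def updateWithReturns(id: String, isDriver: Boolean, isLocal: Boolean): Boolean = {
    if (!known.contains(id)) {
      if (isDriver && !isLocal) return true
      return false
    }
    true
  }
}
```

Both versions compute the same result; the second simply makes each branch's outcome explicit at the point where it is decided, which is what the review is asking for.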