[GitHub] spark issue #15122: [SPARK-17569] Make StructuredStreaming FileStreamSource ...
Github user petermaxlee commented on the issue: https://github.com/apache/spark/pull/15122 I looked into this. I think there are two ways you can intercept calls to HDFS. The first way is slightly hacky but pretty simple: FileSystem.addFileSystemForTesting is a package-private method that can be used to inject a mock file system. You can create an implementation of FilterFileSystem and register it for the "file" scheme; all accesses to the local file system will then go through your implementation. Of course, you could also use a mocking library, but that is not as clean, since FilterFileSystem is a public class. The second way is more robust and does not depend on any private APIs: create an implementation of FilterFileSystem that delegates to LocalFileSystem, e.g. call it MockFileSystem, whose getScheme returns "mockfs". You can then use mockfs:// paths when passing them to structured streaming. This is probably the more robust, generic solution. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
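The delegation pattern described above (a filter file system that wraps another file system and intercepts every call) can be sketched in plain Python; the class and method names below are illustrative stand-ins, not the actual Hadoop `FilterFileSystem` API:

```python
# Sketch of the delegation pattern behind FilterFileSystem. LocalFS and
# MockFS are hypothetical illustrations, not Hadoop classes.

class LocalFS:
    """Stands in for the real local file system."""
    def open(self, path):
        return f"contents of {path}"

class MockFS:
    """Wraps another file system and records every access, the way a
    FilterFileSystem subclass registered under a custom scheme would."""
    scheme = "mockfs"

    def __init__(self, underlying):
        self.underlying = underlying
        self.accesses = []

    def open(self, path):
        self.accesses.append(path)          # intercept the call...
        return self.underlying.open(path)   # ...then delegate

fs = MockFS(LocalFS())
fs.open("/tmp/input.txt")
assert fs.accesses == ["/tmp/input.txt"]
```

A test can then assert on `fs.accesses` to verify which paths the code under test touched, without ever hitting a real cluster.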
[GitHub] spark issue #15133: [SPARK-17578][Docs] Add spark.app.name default value for...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15133 Merged build finished. Test PASSed.
[GitHub] spark issue #15133: [SPARK-17578][Docs] Add spark.app.name default value for...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15133 **[Test build #65552 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65552/consoleFull)** for PR 15133 at commit [`339d5d4`](https://github.com/apache/spark/commit/339d5d4f7afb110e17b01e3355fb68ef6d12200d). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15133: [SPARK-17578][Docs] Add spark.app.name default value for...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15133 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65552/
[GitHub] spark issue #14959: [SPARK-17387][PYSPARK] Creating SparkContext() from pyth...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14959 Merged build finished. Test FAILed.
[GitHub] spark issue #14959: [SPARK-17387][PYSPARK] Creating SparkContext() from pyth...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14959 **[Test build #65551 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65551/consoleFull)** for PR 14959 at commit [`d85bf36`](https://github.com/apache/spark/commit/d85bf36850b7e97056889fbd273749e1d8144cc6). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #14959: [SPARK-17387][PYSPARK] Creating SparkContext() from pyth...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14959 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65551/
[GitHub] spark issue #15133: [SPARK-17578][Docs] Add spark.app.name default value for...
Github user phalodi commented on the issue: https://github.com/apache/spark/pull/15133 Also update this one accordingly: the default value of the app name is now random for both the session and the context. ![random](https://cloud.githubusercontent.com/assets/8075390/18613106/c5e3b420-7d8f-11e6-8763-9d7d16d2eafa.png)
[GitHub] spark issue #15133: [SPARK-17578][Docs] Add spark.app.name default value for...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15133 **[Test build #65552 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65552/consoleFull)** for PR 15133 at commit [`339d5d4`](https://github.com/apache/spark/commit/339d5d4f7afb110e17b01e3355fb68ef6d12200d).
[GitHub] spark pull request #15090: [SPARK-17073] [SQL] generate column-level statist...
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/15090#discussion_r79297872 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala --- @@ -98,8 +98,12 @@ class SparkSqlAstBuilder(conf: SQLConf) extends AstBuilder { ctx.identifier != null && ctx.identifier.getText.toLowerCase == "noscan") { AnalyzeTableCommand(visitTableIdentifier(ctx.tableIdentifier).toString) -} else { +} else if (ctx.identifierSeq() == null) { --- End diff -- As mentioned in [the comment](https://github.com/apache/spark/pull/15090#r78687294), we are going to change the "ANALYZE" syntax in SqlBase.g4, i.e. make the identifierSeq non-optional, which is different from Hive. Is this ok? @rxin @hvanhovell
[GitHub] spark issue #14959: [SPARK-17387][PYSPARK] Creating SparkContext() from pyth...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14959 **[Test build #65551 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65551/consoleFull)** for PR 14959 at commit [`d85bf36`](https://github.com/apache/spark/commit/d85bf36850b7e97056889fbd273749e1d8144cc6).
[GitHub] spark issue #14959: [SPARK-17387][PYSPARK] Creating SparkContext() from pyth...
Github user zjffdu commented on the issue: https://github.com/apache/spark/pull/14959 @vanzin Thanks for your reviews. I just updated the PR, but I don't understand what the following statement means. Can you explain it? Thanks ``` Especially since the Scala SparkContext clones the original user config - and if I read your code correctly, you're not doing that here. ```
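The point about cloning can be illustrated with plain dictionaries (illustrative stand-ins, not the real SparkConf API): if the context mutates a config object shared with the caller, the caller's original config silently changes too.

```python
# Why cloning the user's config matters. Both functions are hypothetical
# sketches; only the clone-vs-no-clone distinction is the point.

def make_context_no_clone(conf):
    conf["spark.app.id"] = "app-123"   # context adds its own entries
    return conf

def make_context_with_clone(conf):
    conf = dict(conf)                  # clone first, as the Scala SparkContext does
    conf["spark.app.id"] = "app-123"
    return conf

user_conf = {"spark.app.name": "demo"}
make_context_no_clone(user_conf)
assert "spark.app.id" in user_conf      # leaked back into the caller's conf

user_conf = {"spark.app.name": "demo"}
make_context_with_clone(user_conf)
assert "spark.app.id" not in user_conf  # caller's original left untouched
```

Without the clone, reusing the same config to build a second context would pick up the first context's internal entries.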
[GitHub] spark issue #15134: [SPARK-17580][CORE]Add random UUID as app name while app...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15134 **[Test build #65550 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65550/consoleFull)** for PR 15134 at commit [`7a5946d`](https://github.com/apache/spark/commit/7a5946d90e1d1816964baf724b4e3422ade99b3d).
[GitHub] spark issue #15133: [SPARK-17578][Docs] Add spark.app.name default value for...
Github user phalodi commented on the issue: https://github.com/apache/spark/pull/15133 @andrewor14 What do you think about https://github.com/apache/spark/pull/15134? There we add a random UUID as the app name when creating a SparkContext if no name is defined.
[GitHub] spark pull request #15134: [SPARK-17580][CORE]Add random UUID as app name wh...
GitHub user phalodi opened a pull request: https://github.com/apache/spark/pull/15134 [SPARK-17580][CORE] Add random UUID as app name when app name is not defined while creating SparkContext ## What changes were proposed in this pull request? Assign a random UUID as the app name when no app name is defined while creating the SparkContext. SparkSession already behaves this way, so this makes the behaviour consistent. ## How was this patch tested? Ran all test cases. You can merge this pull request into a Git repository by running: $ git pull https://github.com/phalodi/spark SPARK-17580 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/15134.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #15134 commit 7a5946d90e1d1816964baf724b4e3422ade99b3d Author: sandy Date: 2016-09-18T05:17:07Z add random UUID as app name while app name not define while creating spark context
[GitHub] spark issue #14971: [SPARK-17410] [SPARK-17284] Move Hive-generated Stats In...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14971 Merged build finished. Test FAILed.
[GitHub] spark issue #15122: [SPARK-17569] Make StructuredStreaming FileStreamSource ...
Github user petermaxlee commented on the issue: https://github.com/apache/spark/pull/15122 Can you test this by deleting the file on purpose, and see what kind of exceptions are thrown?
[GitHub] spark issue #14971: [SPARK-17410] [SPARK-17284] Move Hive-generated Stats In...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14971 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65548/
[GitHub] spark issue #14971: [SPARK-17410] [SPARK-17284] Move Hive-generated Stats In...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14971 **[Test build #65548 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65548/consoleFull)** for PR 14971 at commit [`3376bd6`](https://github.com/apache/spark/commit/3376bd6a57a65fa004abd43237f8f3c87f07064a). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15127: [SPARK-17571][SQL] AssertOnQuery.condition should always...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15127 **[Test build #65549 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65549/consoleFull)** for PR 15127 at commit [`d013acf`](https://github.com/apache/spark/commit/d013acf3b8a258d12dbe61a2d348ccfc4f099fb6).
[GitHub] spark issue #15129: [SPARK-17546] [DEPLOY] start-* scripts should use hostna...
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15129 LGTM.
[GitHub] spark pull request #15051: [SPARK-17499][SparkR][ML][MLLib] make the default...
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/15051#discussion_r79297392 --- Diff: R/pkg/R/mllib.R --- @@ -694,8 +694,14 @@ setMethod("predict", signature(object = "KMeansModel"), #' } #' @note spark.mlp since 2.1.0 setMethod("spark.mlp", signature(data = "SparkDataFrame"), - function(data, blockSize = 128, layers = c(3, 5, 2), solver = "l-bfgs", maxIter = 100, - tol = 0.5, stepSize = 1, seed = 1) { + function(data, layers, blockSize = 128, solver = "l-bfgs", maxIter = 100, + tol = 1E-6, stepSize = 0.03, seed = 0x7FFF) { +if (length(layers) <= 1) { + stop("layers vector require length > 0.") +} +if (any(sapply(layers, function(e) !is.numeric(e { --- End diff -- Oh, it's a clever way of using `as.integer(x) != x` to check whether a value is an integer. Here the mlp requires `layers` to be an integer vector; is it better to force the user to pass an integer vector, calling `stop` if they don't, or just to print a warning?
[GitHub] spark pull request #15051: [SPARK-17499][SparkR][ML][MLLib] make the default...
Github user shivaram commented on a diff in the pull request: https://github.com/apache/spark/pull/15051#discussion_r79297229 --- Diff: R/pkg/R/mllib.R --- @@ -694,8 +694,14 @@ setMethod("predict", signature(object = "KMeansModel"), #' } #' @note spark.mlp since 2.1.0 setMethod("spark.mlp", signature(data = "SparkDataFrame"), - function(data, blockSize = 128, layers = c(3, 5, 2), solver = "l-bfgs", maxIter = 100, - tol = 0.5, stepSize = 1, seed = 1) { + function(data, layers, blockSize = 128, solver = "l-bfgs", maxIter = 100, + tol = 1E-6, stepSize = 0.03, seed = 0x7FFF) { +if (length(layers) <= 1) { + stop("layers vector require length > 0.") +} +if (any(sapply(layers, function(e) !is.numeric(e { --- End diff -- You can use `numToInt` from https://github.com/apache/spark/blob/master/R/pkg/R/utils.R#L368 -- It'll print a warning if it's not an integer
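For readers unfamiliar with the R idiom under discussion, a Python analogue of the `as.integer(x) != x` check is below (`is_integral` is a hypothetical helper name, not part of SparkR):

```python
# Python analogue of R's `as.integer(x) != x` trick: a numeric value is
# "really" an integer exactly when truncating it loses nothing.

def is_integral(x):
    return float(x) == int(x)

layers = [3, 5.0, 2]            # 5.0 counts as integral
assert all(is_integral(e) for e in layers)
assert not is_integral(2.5)     # a fractional layer size would be rejected
```

The review question is whether such a check should `stop` (raise) or merely warn when the user passes a non-integral value.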
[GitHub] spark issue #15131: [SPARK-17577][SparkR] SparkR support add files to Spark ...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/15131 @shivaram Thanks for cc'ing me. I will try to take a close look at it today.
[GitHub] spark issue #13513: [SPARK-15698][SQL][Streaming] Add the ability to remove ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13513 Merged build finished. Test PASSed.
[GitHub] spark issue #13513: [SPARK-15698][SQL][Streaming] Add the ability to remove ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13513 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65547/
[GitHub] spark issue #13513: [SPARK-15698][SQL][Streaming] Add the ability to remove ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13513 **[Test build #65547 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65547/consoleFull)** for PR 13513 at commit [`be1abfa`](https://github.com/apache/spark/commit/be1abfa0e902fca3ed945bfbb6e0573909d55e2b). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #14600: [SPARK-15899] [SQL] Fix the construction of the file pat...
Github user Praveenmail2him commented on the issue: https://github.com/apache/spark/pull/14600 Can anyone post sample usage for Spark 2.0? I'm still facing this exception.
[GitHub] spark pull request #15090: [SPARK-17073] [SQL] generate column-level statist...
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/15090#discussion_r79296707 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeColumnCommand.scala --- @@ -0,0 +1,159 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.execution.command + +import scala.collection.mutable + +import org.apache.spark.sql._ +import org.apache.spark.sql.catalyst.{InternalRow, TableIdentifier} +import org.apache.spark.sql.catalyst.analysis.EliminateSubqueryAliases +import org.apache.spark.sql.catalyst.catalog.{CatalogRelation, CatalogTable} +import org.apache.spark.sql.catalyst.expressions._ +import org.apache.spark.sql.catalyst.expressions.aggregate._ +import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, BasicColStats, Statistics} +import org.apache.spark.sql.execution.datasources.LogicalRelation +import org.apache.spark.sql.types._ + + +/** + * Analyzes the given columns of the given table in the current database to generate statistics, + * which will be used in query optimizations.
 + */ +case class AnalyzeColumnCommand( +tableIdent: TableIdentifier, +columnNames: Seq[String]) extends RunnableCommand { + + override def run(sparkSession: SparkSession): Seq[Row] = { +val sessionState = sparkSession.sessionState +val relation = EliminateSubqueryAliases(sessionState.catalog.lookupRelation(tableIdent)) + +// check correctness of column names +val validColumns = mutable.MutableList[NamedExpression]() +val resolver = sessionState.conf.resolver +columnNames.foreach { col => + val exprOption = relation.resolve(col.split("\\."), resolver) + if (exprOption.isEmpty) { +throw new AnalysisException(s"Invalid column name: $col") + } + if (validColumns.map(_.exprId).contains(exprOption.get.exprId)) { +throw new AnalysisException(s"Duplicate column name: $col") + } + validColumns += exprOption.get +} + +relation match { + case catalogRel: CatalogRelation => +updateStats(catalogRel.catalogTable, + AnalyzeTableCommand.calculateTotalSize(sparkSession, catalogRel.catalogTable)) + + case logicalRel: LogicalRelation if logicalRel.catalogTable.isDefined => +updateStats(logicalRel.catalogTable.get, logicalRel.relation.sizeInBytes) + + case otherRelation => +throw new AnalysisException("ANALYZE TABLE is not supported for " + + s"${otherRelation.nodeName}.") +} + +def updateStats(catalogTable: CatalogTable, newTotalSize: Long): Unit = { + // Collect statistics per column. + // The first element in the result will be the overall row count, the following elements + // will be structs containing all column stats. + // The layout of each struct follows the layout of the BasicColStats.
+ val ndvMaxErr = sessionState.conf.ndvMaxError + val expressions = Count(Literal(1)).toAggregateExpression() +: +validColumns.map(ColumnStatsStruct(_, ndvMaxErr)) + val namedExpressions = expressions.map(e => Alias(e, e.toString)()) + val statsRow = Dataset.ofRows(sparkSession, Aggregate(Nil, namedExpressions, relation)) +.queryExecution.toRdd.collect().head + + // unwrap the result + val rowCount = statsRow.getLong(0) + val colStats = validColumns.zipWithIndex.map { case (expr, i) => +val colInfo = statsRow.getStruct(i + 1, ColumnStatsStruct.statsNumber) +val colStats = ColumnStatsStruct.unwrapRow(expr, colInfo) +(expr.name, colStats) + }.toMap + + val statistics = +Statistics(sizeInBytes = newTotalSize, rowCount = Some(rowCount), basicColStats = colStats) + sessionState.catalog.alterTable(catalogTable.copy(stats = Some(statistics))) +
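The result layout described in the diff above (a single aggregate row whose first element is the overall row count, followed by one stats struct per analyzed column) can be sketched in Python, with tuples standing in for the structs. This is an illustration of the layout only, not the actual Spark implementation:

```python
# One aggregate pass produces [row_count, col0_stats, col1_stats, ...],
# where each per-column "struct" here is (num_nulls, min, max, ndv).

rows = [(1, "a"), (2, None), (2, "b")]

def column_stats(values):
    non_null = [v for v in values if v is not None]
    num_nulls = len(values) - len(non_null)
    return (num_nulls, min(non_null), max(non_null), len(set(non_null)))

stats_row = [len(rows)] + [column_stats([r[i] for r in rows]) for i in range(2)]
assert stats_row[0] == 3                    # overall row count comes first
assert stats_row[1] == (0, 1, 2, 2)         # column 0: nulls, min, max, ndv
assert stats_row[2] == (1, "a", "b", 2)     # column 1
```

In the real command the per-column struct is built by aggregate expressions (and ndv is approximate, via HyperLogLog++), but the unwrapping logic follows this positional layout.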
[GitHub] spark issue #14971: [SPARK-17410] [SPARK-17284] Move Hive-generated Stats In...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14971 **[Test build #65548 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65548/consoleFull)** for PR 14971 at commit [`3376bd6`](https://github.com/apache/spark/commit/3376bd6a57a65fa004abd43237f8f3c87f07064a).
[GitHub] spark pull request #15090: [SPARK-17073] [SQL] generate column-level statist...
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/15090#discussion_r7929 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/StatisticsSuite.scala --- @@ -101,4 +101,47 @@ class StatisticsSuite extends QueryTest with SharedSQLContext { checkTableStats(tableName, expectedRowCount = Some(2)) } } + + test("test column-level statistics for data source table created in InMemoryCatalog") { +def checkColStats(colStats: BasicColStats, expectedColStats: BasicColStats): Unit = { + assert(colStats.dataType == expectedColStats.dataType) + assert(colStats.numNulls == expectedColStats.numNulls) + assert(colStats.max == expectedColStats.max) + assert(colStats.min == expectedColStats.min) + if (expectedColStats.ndv.isDefined) { +// ndv is an approximate value, so we just make sure we have the value +assert(colStats.ndv.get >= 0) --- End diff -- How to get the standard deviations?
[GitHub] spark pull request #15090: [SPARK-17073] [SQL] generate column-level statist...
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/15090#discussion_r79296668 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -563,6 +563,13 @@ object SQLConf { .timeConf(TimeUnit.MILLISECONDS) .createWithDefault(10L) + val NDV_MAX_ERROR = +SQLConfigBuilder("spark.sql.ndv.maxError") --- End diff -- OK
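The `spark.sql.ndv.maxError` setting discussed above bounds the relative error of the approximate distinct count (ndv). The earlier test question about how loosely to assert on `ndv` comes down to what such a bound means; here is a minimal sketch of a relative-error check (a hypothetical helper for illustration, not Spark's API, shown in Python):

```python
def within_ndv_bound(estimate: int, exact: int, max_error: float) -> bool:
    """Check an approximate distinct-count estimate against a
    relative-error bound: |estimate - exact| <= max_error * exact."""
    return abs(estimate - exact) <= max_error * exact


# with a 5% bound, 97 is an acceptable estimate of 100 distinct values
print(within_ndv_bound(97, 100, 0.05))   # True
print(within_ndv_bound(90, 100, 0.05))   # False
```

A test could assert this bound instead of the weaker `ndv >= 0` check, given a known exact count for the fixture data.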
[GitHub] spark pull request #15090: [SPARK-17073] [SQL] generate column-level statist...
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/15090#discussion_r79296634 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala --- @@ -87,19 +87,23 @@ class SparkSqlAstBuilder(conf: SQLConf) extends AstBuilder { } /** - * Create an [[AnalyzeTableCommand]] command. This currently only implements the NOSCAN - * option (other options are passed on to Hive) e.g.: - * {{{ - * ANALYZE TABLE table COMPUTE STATISTICS NOSCAN; - * }}} + * Create an [[AnalyzeTableCommand]] command or an [[AnalyzeColumnCommand]] command. */ override def visitAnalyze(ctx: AnalyzeContext): LogicalPlan = withOrigin(ctx) { if (ctx.partitionSpec == null && ctx.identifier != null && ctx.identifier.getText.toLowerCase == "noscan") { - AnalyzeTableCommand(visitTableIdentifier(ctx.tableIdentifier).toString) + AnalyzeTableCommand(visitTableIdentifier(ctx.tableIdentifier)) +} else if (ctx.identifierSeq() == null) { --- End diff -- Yeah, good idea
[GitHub] spark pull request #15090: [SPARK-17073] [SQL] generate column-level statist...
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/15090#discussion_r79296629 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsColumnSuite.scala --- @@ -0,0 +1,228 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.hive + +import java.sql.{Date, Timestamp} + +import org.apache.spark.sql.{AnalysisException, Row} +import org.apache.spark.sql.catalyst.plans.logical.BasicColStats +import org.apache.spark.sql.execution.command.AnalyzeColumnCommand +import org.apache.spark.sql.types._ + +class StatisticsColumnSuite extends StatisticsTest { + + test("parse analyze column commands") { +val table = "table" +assertAnalyzeCommand( + s"ANALYZE TABLE $table COMPUTE STATISTICS FOR COLUMNS key, value", + classOf[AnalyzeColumnCommand]) + +val noColumnError = intercept[AnalysisException] { + sql(s"ANALYZE TABLE $table COMPUTE STATISTICS FOR COLUMNS") +} +assert(noColumnError.message == "Need to specify the columns to analyze. 
Usage: " + + "ANALYZE TABLE tbl COMPUTE STATISTICS FOR COLUMNS key, value") + +withTable(table) { + sql(s"CREATE TABLE $table (key INT, value STRING)") + val invalidColError = intercept[AnalysisException] { +sql(s"ANALYZE TABLE $table COMPUTE STATISTICS FOR COLUMNS k") + } + assert(invalidColError.message == s"Invalid column name: k") + + val duplicateColError = intercept[AnalysisException] { +sql(s"ANALYZE TABLE $table COMPUTE STATISTICS FOR COLUMNS key, value, key") + } + assert(duplicateColError.message == s"Duplicate column name: key") + + withSQLConf("spark.sql.caseSensitive" -> "true") { +val invalidErr = intercept[AnalysisException] { + sql(s"ANALYZE TABLE $table COMPUTE STATISTICS FOR COLUMNS keY") +} +assert(invalidErr.message == s"Invalid column name: keY") + } + + withSQLConf("spark.sql.caseSensitive" -> "false") { +val duplicateErr = intercept[AnalysisException] { + sql(s"ANALYZE TABLE $table COMPUTE STATISTICS FOR COLUMNS key, value, vaLue") +} +assert(duplicateErr.message == s"Duplicate column name: vaLue") + } +} + } + + test("basic statistics for integral type columns") { +val rdd = sparkContext.parallelize(Seq("1", null, "2", "3", null)).map { i => + if (i != null) Row(i.toByte, i.toShort, i.toInt, i.toLong) else Row(i, i, i, i) +} +val schema = StructType( + StructField(name = "c1", dataType = ByteType, nullable = true) :: +StructField(name = "c2", dataType = ShortType, nullable = true) :: +StructField(name = "c3", dataType = IntegerType, nullable = true) :: +StructField(name = "c4", dataType = LongType, nullable = true) :: Nil) +val expectedBasicStats = BasicColStats( + dataType = ByteType, numNulls = 2, max = Some(3), min = Some(1), ndv = Some(3)) --- End diff -- Can you explain more about this?
[GitHub] spark issue #15051: [SPARK-17499][SparkR][ML][MLLib] make the default params...
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/15051 @felixcheung yeah, in fact 0x7FFF is not ideal because it is itself also a valid seed. And there is another problem: in Scala the seed is a `long`, but on the R side there seems to be no `long` type, so the seed value range on the R side is already smaller than on the Scala side. I think this is a minor problem, though, because an `int`-range seed is large enough to be used.
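The range mismatch described above can be stated concretely: R integers are 32-bit while a Scala seed is a 64-bit `Long`, so every R-side seed is representable on the Scala side, but not vice versa. A small illustration of the arithmetic (plain Python, names are for illustration only):

```python
# Illustration: R-side seed range vs Scala-side seed range.
INT32_MAX = 2**31 - 1  # largest value an R integer can hold
INT64_MAX = 2**63 - 1  # largest value of a Scala Long seed


def fits_in_scala_seed(r_seed: int) -> bool:
    """Any R-side (32-bit) integer seed is representable as a Scala Long."""
    return -INT64_MAX - 1 <= r_seed <= INT64_MAX
```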
[GitHub] spark pull request #15051: [SPARK-17499][SparkR][ML][MLLib] make the default...
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/15051#discussion_r79295910 --- Diff: R/pkg/R/mllib.R --- @@ -694,8 +694,14 @@ setMethod("predict", signature(object = "KMeansModel"), #' } #' @note spark.mlp since 2.1.0 setMethod("spark.mlp", signature(data = "SparkDataFrame"), - function(data, blockSize = 128, layers = c(3, 5, 2), solver = "l-bfgs", maxIter = 100, - tol = 0.5, stepSize = 1, seed = 1) { + function(data, layers, blockSize = 128, solver = "l-bfgs", maxIter = 100, + tol = 1E-6, stepSize = 0.03, seed = 0x7FFF) { +if (length(layers) <= 1) { + stop("layers vector require length > 0.") +} +if (any(sapply(layers, function(e) !is.numeric(e)))) { --- End diff -- layers should be integer, but in R it seems we can't distinguish a numeric vector from an integer vector? For both `layers<-c(1,2)` and `layers<-c(1.0, 2.0)`, `is.integer(layers[i])` returns `false` and `as.integer(layers)` succeeds for both, so is there some good way to check that it is an integer vector and not just a numeric vector?
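One common answer to the question above is to accept numeric values but verify each has no fractional part; in R that check is typically `all(layers == as.integer(layers))`. Sketched here in Python (as an illustration of the check, not SparkR's actual validation):

```python
def is_integerish(values):
    """True if every element is numeric with no fractional part,
    so [1, 2] and [1.0, 2.0] pass while [1.5] does not."""
    try:
        return all(float(v).is_integer() for v in values)
    except (TypeError, ValueError):
        # non-numeric elements (e.g. strings) fail the check
        return False
```

This sidesteps the storage-type question entirely: a doubles vector whose values are all whole numbers is accepted, which is usually what callers writing `c(3, 5, 2)` expect.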
[GitHub] spark issue #13513: [SPARK-15698][SQL][Streaming] Add the ability to remove ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13513 **[Test build #65547 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65547/consoleFull)** for PR 13513 at commit [`be1abfa`](https://github.com/apache/spark/commit/be1abfa0e902fca3ed945bfbb6e0573909d55e2b).
[GitHub] spark issue #14971: [SPARK-17410] [SPARK-17284] Move Hive-generated Stats In...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14971 Merged build finished. Test FAILed.
[GitHub] spark issue #14971: [SPARK-17410] [SPARK-17284] Move Hive-generated Stats In...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14971 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65546/ Test FAILed.
[GitHub] spark issue #14971: [SPARK-17410] [SPARK-17284] Move Hive-generated Stats In...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14971 **[Test build #65546 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65546/consoleFull)** for PR 14971 at commit [`2f40c7f`](https://github.com/apache/spark/commit/2f40c7f5532c8b6e66c786f3b1506bd4efdcf711). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #14643: [SPARK-17057][ML] ProbabilisticClassifierModels' predict...
Github user zhengruifeng commented on the issue: https://github.com/apache/spark/pull/14643 @srowen You can take it over.
[GitHub] spark issue #14971: [SPARK-17410] [SPARK-17284] Move Hive-generated Stats In...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14971 **[Test build #65546 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65546/consoleFull)** for PR 14971 at commit [`2f40c7f`](https://github.com/apache/spark/commit/2f40c7f5532c8b6e66c786f3b1506bd4efdcf711).
[GitHub] spark issue #15051: [SPARK-17499][SparkR][ML][MLLib] make the default params...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/15051 LGTM - just a question above and this: would 0x7FFF be a good placeholder value - is it possible to set seed to this in Scala?
[GitHub] spark pull request #15051: [SPARK-17499][SparkR][ML][MLLib] make the default...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/15051#discussion_r79294205 --- Diff: R/pkg/R/mllib.R --- @@ -694,8 +694,14 @@ setMethod("predict", signature(object = "KMeansModel"), #' } #' @note spark.mlp since 2.1.0 setMethod("spark.mlp", signature(data = "SparkDataFrame"), - function(data, blockSize = 128, layers = c(3, 5, 2), solver = "l-bfgs", maxIter = 100, - tol = 0.5, stepSize = 1, seed = 1) { + function(data, layers, blockSize = 128, solver = "l-bfgs", maxIter = 100, + tol = 1E-6, stepSize = 0.03, seed = 0x7FFF) { +if (length(layers) <= 1) { + stop("layers vector require length > 0.") +} +if (any(sapply(layers, function(e) !is.numeric(e)))) { --- End diff -- just double checking - should layers be integer or numeric?
[GitHub] spark pull request #14338: [SPARK-16701] Make parameters configurable in Blo...
Github user lovexi closed the pull request at: https://github.com/apache/spark/pull/14338
[GitHub] spark issue #15093: [SPARK-17480][SQL][FOLLOWUP] Fix more instances which ca...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15093 Yep, saw that. I re-merged this, and yes during conflict resolution QuantileSummaries.scala comes up as a file added only in the master branch, but when I choose to not take the change in the IDE, I see it actually resulted in adding an empty file. I made sure that was not part of the commit and pushed again. Looks as intended now: https://github.com/apache/spark/commit/5fd354b2d628130a74c9d01adc7ab6bef65fbd9a
[GitHub] spark issue #15093: [SPARK-17480][SQL][FOLLOWUP] Fix more instances which ca...
Github user tdas commented on the issue: https://github.com/apache/spark/pull/15093 I reverted already!
[GitHub] spark issue #15093: [SPARK-17480][SQL][FOLLOWUP] Fix more instances which ca...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15093 Oh weird! no idea why that happened. Yeah I'll take care of it from here.
[GitHub] spark issue #15093: [SPARK-17480][SQL][FOLLOWUP] Fix more instances which ca...
Github user tdas commented on the issue: https://github.com/apache/spark/pull/15093 @HyukjinKwon @srowen This PR when merged into branch-2.0 somehow created an empty file QuantileSummaries.scala that is failing the lint test as the Apache license header does not exist - Commit - https://github.com/apache/spark/commit/a3bba372abce926351335d0a2936b70988f19b23 Empty file - https://github.com/apache/spark/blob/a3bba372abce926351335d0a2936b70988f19b23/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/QuantileSummaries.scala I am not sure how exactly backporting a patch led to an empty file, but this does not seem right. I am reverting this commit in branch 2.0. Please make a new PR to fix this in branch 2.0 correctly.
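Stray zero-byte source files like the one above fail the license-header lint only after the fact; a pre-push scan can catch them earlier. A minimal sketch (a hypothetical helper, not part of Spark's tooling; shown in Python):

```python
import os


def find_empty_sources(root_dir, suffix=".scala"):
    """Return paths of zero-byte source files under root_dir,
    e.g. a stray empty QuantileSummaries.scala left by a bad merge."""
    return [
        os.path.join(dirpath, name)
        for dirpath, _, names in os.walk(root_dir)
        for name in names
        if name.endswith(suffix)
        and os.path.getsize(os.path.join(dirpath, name)) == 0
    ]
```

Running this over the repo root before pushing a backport would have flagged the empty file that the revert above had to clean up.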
[GitHub] spark issue #15131: [SPARK-17577][SparkR] SparkR support add files to Spark ...
Github user shivaram commented on the issue: https://github.com/apache/spark/pull/15131 It looks like `addFile` isn't working on Windows because we try to convert the windows file path into a URI and that fails. Not sure what the fix is in this case. cc @HyukjinKwon who worked on this for `hadoopFile`
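For reference, converting a Windows path into a file URI is sensitive to the drive letter and backslash separators; naive concatenation like `"file://" + path` produces a malformed URI. Python's `pathlib` shows the expected shape of a correct conversion (illustrative only; the SparkR fix itself would live in R/Scala):

```python
from pathlib import PureWindowsPath

# PureWindowsPath works on any OS, so this conversion can be
# exercised even on a non-Windows machine.
uri = PureWindowsPath(r"C:\tmp\data.txt").as_uri()
print(uri)  # file:///C:/tmp/data.txt
```

Note the three slashes and forward-slash separators: that is the form a URI parser accepts, and the likely point of failure when the raw `C:\...` path is handed to a URI constructor.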
[GitHub] spark issue #15133: [SPARK-17578][Docs] Add spark.app.name default value for...
Github user andrewor14 commented on the issue: https://github.com/apache/spark/pull/15133 Yeah `SparkSession` will be the new thing moving forward. `SparkContext` is kind of just a legacy thing.
[GitHub] spark issue #15133: [SPARK-17578][Docs] Add spark.app.name default value for...
Github user phalodi commented on the issue: https://github.com/apache/spark/pull/15133 @andrewor14 So, as you suggest, we should also change this in the SparkContext code, because right now we must set an app name when creating a SparkContext. If we generate a random UUID as the default app name when creating a SparkContext too, it will be consistent across all cases.
[GitHub] spark issue #15133: [SPARK-17578][Docs] Add spark.app.name default value for...
Github user andrewor14 commented on the issue: https://github.com/apache/spark/pull/15133 We should probably just make it a random UUID in all cases to be consistent. I don't know if people check whether `spark.app.name` is set, so that might be a backward compatibility concern (though one that we kind of already broke with `SparkSession`).
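The random-UUID fallback discussed above is straightforward to sketch; this hypothetical helper (names are illustrative, not Spark's API; shown in Python) captures the proposed behavior of defaulting `spark.app.name` only when the user has not set it:

```python
import uuid


def default_app_name(configured=None):
    """Use the configured spark.app.name when present; otherwise
    fall back to a unique generated name, as SparkSession does."""
    if configured is not None:
        return configured
    return f"spark-{uuid.uuid4()}"
```

The backward-compatibility concern in the comment is exactly the `configured is None` branch: code that inspects `spark.app.name` would now see a generated value instead of an unset key.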
[GitHub] spark issue #15132: [SPARK-17510][STREAMING][KAFKA] config max rate on a per...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15132 Merged build finished. Test PASSed.
[GitHub] spark issue #15132: [SPARK-17510][STREAMING][KAFKA] config max rate on a per...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15132 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65544/ Test PASSed.
[GitHub] spark issue #15132: [SPARK-17510][STREAMING][KAFKA] config max rate on a per...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15132 **[Test build #65544 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65544/consoleFull)** for PR 15132 at commit [`9ff922b`](https://github.com/apache/spark/commit/9ff922bead9805b5d0b7dcb8f9d910e7202ed67b). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #15043: [SPARK-17491] Close serialization stream to fix w...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15043
[GitHub] spark issue #15043: [SPARK-17491] Close serialization stream to fix wrong an...
Github user JoshRosen commented on the issue: https://github.com/apache/spark/pull/15043 I believe that this latest test failure is caused by a known flaky PySpark test, so I'm going to merge this now and will monitor tests afterwards.
[GitHub] spark issue #15131: [SPARK-17577][SparkR] SparkR support add files to Spark ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15131 Merged build finished. Test FAILed.
[GitHub] spark issue #15131: [SPARK-17577][SparkR] SparkR support add files to Spark ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15131 **[Test build #65539 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65539/consoleFull)** for PR 15131 at commit [`d3dd380`](https://github.com/apache/spark/commit/d3dd3808e88b3f4ba5af683eb7d7709fcc2710f7). * This patch **fails SparkR unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15131: [SPARK-17577][SparkR] SparkR support add files to Spark ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15131 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65539/ Test FAILed.
[GitHub] spark issue #15133: [SPARK-17578][Docs] Add spark.app.name default value for...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15133 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65545/ Test PASSed.
[GitHub] spark issue #15133: [SPARK-17578][Docs] Add spark.app.name default value for...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15133 Merged build finished. Test PASSed.
[GitHub] spark issue #15133: [SPARK-17578][Docs] Add spark.app.name default value for...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15133 **[Test build #65545 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65545/consoleFull)** for PR 15133 at commit [`eade2e2`](https://github.com/apache/spark/commit/eade2e2d5fbb757616a1265d1f2e196fe8799dd9). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15043: [SPARK-17491] Close serialization stream to fix wrong an...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15043 Merged build finished. Test FAILed.
[GitHub] spark issue #15043: [SPARK-17491] Close serialization stream to fix wrong an...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15043 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65541/ Test FAILed.
[GitHub] spark issue #15043: [SPARK-17491] Close serialization stream to fix wrong an...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15043 **[Test build #65541 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65541/consoleFull)** for PR 15043 at commit [`0d70774`](https://github.com/apache/spark/commit/0d70774e1db04edb46b312efc4b1646d7201fb03). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15133: [SPARK-17578][Docs] Add spark.app.name default value for...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15133 **[Test build #65545 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65545/consoleFull)** for PR 15133 at commit [`eade2e2`](https://github.com/apache/spark/commit/eade2e2d5fbb757616a1265d1f2e196fe8799dd9).
[GitHub] spark pull request #15133: [SPARK-17578][Docs] Add spark.app.name default va...
GitHub user phalodi opened a pull request: https://github.com/apache/spark/pull/15133 [SPARK-17578][Docs] Add spark.app.name default value for spark session ## What changes were proposed in this pull request? Modify spark.app.name configuration for spark session ## How was this patch tested? run all test cases and generate documentation ![appname](https://cloud.githubusercontent.com/assets/8075390/18609970/9eba2f2c-7d2c-11e6-8d3b-e45691db59b9.png) You can merge this pull request into a Git repository by running: $ git pull https://github.com/phalodi/spark SPARK-17578 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/15133.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #15133 commit eade2e2d5fbb757616a1265d1f2e196fe8799dd9 Author: sandy Date: 2016-09-17T17:43:41Z add spark.app.name default value for spark session
[GitHub] spark issue #15073: [SPARK-17518] [SQL] Block Users to Specify the Internal ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15073 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65538/ Test PASSed.
[GitHub] spark issue #15073: [SPARK-17518] [SQL] Block Users to Specify the Internal ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15073 Merged build finished. Test PASSed.
[GitHub] spark issue #15073: [SPARK-17518] [SQL] Block Users to Specify the Internal ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15073 **[Test build #65538 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65538/consoleFull)** for PR 15073 at commit [`ef174c1`](https://github.com/apache/spark/commit/ef174c1fde3b872a2374d8b47b5a28eeb8a13321). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15051: [SPARK-17499][SparkR][ML][MLLib] make the default params...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15051 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65542/ Test PASSed.
[GitHub] spark issue #15051: [SPARK-17499][SparkR][ML][MLLib] make the default params...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15051 Merged build finished. Test PASSed.
[GitHub] spark issue #15051: [SPARK-17499][SparkR][ML][MLLib] make the default params...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15051 **[Test build #65542 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65542/consoleFull)** for PR 15051 at commit [`ce2c2f7`](https://github.com/apache/spark/commit/ce2c2f743e912225416a1f28b0e90d5d88ddaf49). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15093: [SPARK-17480][SQL][FOLLOWUP] Fix more instances which ca...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15093 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65536/ Test PASSed.
[GitHub] spark issue #15093: [SPARK-17480][SQL][FOLLOWUP] Fix more instances which ca...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15093 Merged build finished. Test PASSed.
[GitHub] spark issue #15093: [SPARK-17480][SQL][FOLLOWUP] Fix more instances which ca...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15093 **[Test build #65536 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65536/consoleFull)** for PR 15093 at commit [`8a3d293`](https://github.com/apache/spark/commit/8a3d293302ba87629a7a7247a7c3912e294e3752). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #14981: [SPARK-17418] Remove Kinesis artifacts from Spark releas...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14981 I am referring to http://mail-archives.apache.org/mod_mbox/www-legal-discuss/201609.mbox/ I don't think it is up to us being 'flexible' or not. I also don't actually see that a source vs binary distinction is drawn here either. Indeed there is a question whether even that is permitted. But I do not see any conclusive argument that this isn't permitted.
[GitHub] spark issue #14981: [SPARK-17418] Remove Kinesis artifacts from Spark releas...
Github user lresende commented on the issue: https://github.com/apache/spark/pull/14981 The pointer is exactly your quote on the e-mail to legal-discuss: http://www.apache.org/legal/resolved.html#prohibited says: - CAN APACHE PROJECTS RELY ON COMPONENTS UNDER PROHIBITED LICENSES? **Apache projects cannot distribute any such components**. As with the previous question on platforms, the component can be relied on if the component's licence terms do not affect the Apache product's licensing. For example, using a GPL'ed tool during the build is OK. CAN APACHE PROJECTS RELY ON COMPONENTS WHOSE LICENSING AFFECTS THE APACHE PRODUCT? Apache projects cannot distribute any such components. **However, if the component is only needed for optional features, a project can provide the user with instructions on how to obtain and install the non-included work**. Optional means that the component is not required for standard use of the product or for the product to achieve a desirable level of quality. The question to ask yourself in this situation is: === And I am being flexible here, and agreeing that it is ok to have the source distribution with the kinesis and ganglia modules, as long as we don't publish them into maven and require the users to build with the respective profiles in order to gain access to these modules in their application.
[GitHub] spark issue #15132: [SPARK-17510][STREAMING][KAFKA] config max rate on a per...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15132 **[Test build #65544 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65544/consoleFull)** for PR 15132 at commit [`9ff922b`](https://github.com/apache/spark/commit/9ff922bead9805b5d0b7dcb8f9d910e7202ed67b).
[GitHub] spark issue #15097: [SPARK-17540][SparkR][Spark Core] fix SparkR array serde...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/15097 Please add tests for this?
[GitHub] spark issue #15132: [SPARK-17510][STREAMING][KAFKA] config max rate on a per...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15132 Merged build finished. Test FAILed.
[GitHub] spark issue #15132: [SPARK-17510][STREAMING][KAFKA] config max rate on a per...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15132 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65543/ Test FAILed.
[GitHub] spark issue #15132: [SPARK-17510][STREAMING][KAFKA] config max rate on a per...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15132 **[Test build #65543 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65543/consoleFull)** for PR 15132 at commit [`9fa1a4f`](https://github.com/apache/spark/commit/9fa1a4f8c8d1027b9c39d087299eeac1ffa11348). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15132: [SPARK-17510][STREAMING][KAFKA] config max rate on a per...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15132 **[Test build #65543 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65543/consoleFull)** for PR 15132 at commit [`9fa1a4f`](https://github.com/apache/spark/commit/9fa1a4f8c8d1027b9c39d087299eeac1ffa11348).
[GitHub] spark issue #14981: [SPARK-17418] Remove Kinesis artifacts from Spark releas...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14981 That isn't the conclusion I took from the discussion on legal-discuss - do you have a pointer? I took that it was at best ambiguous but not obviously prohibited to distribute these because they are optional wrt Spark.
[GitHub] spark pull request #15132: [SPARK-17510][STREAMING][KAFKA] config max rate o...
GitHub user koeninger opened a pull request: https://github.com/apache/spark/pull/15132 [SPARK-17510][STREAMING][KAFKA] config max rate on a per-partition basis ## What changes were proposed in this pull request? Allow configuration of max rate on a per-topicpartition basis. ## How was this patch tested? Unit tests. The reporter (Jeff Nadler) said he could test on his workload, so let's wait on that report. You can merge this pull request into a Git repository by running: $ git pull https://github.com/koeninger/spark-1 SPARK-17510 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/15132.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #15132 commit b282fe1ba1245170e426b62fe7c543b2a26a6488 Author: cody koeninger Date: 2016-09-17T16:32:41Z [SPARK-17510][STREAMING][KAFKA] allow max rate on a per-partition basis commit 9fa1a4f8c8d1027b9c39d087299eeac1ffa11348 Author: cody koeninger Date: 2016-09-17T16:45:58Z [SPARK-17510][STREAMING][KAFKA] test max rate on a per-partition basis
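The per-topicpartition cap described in the PR amounts to bounding how many offsets each partition may contribute to a single batch. The following is a minimal, hypothetical sketch of that idea (the function and parameter names are illustrative, not the PR's actual code):

```python
def clamp_offsets(current, latest, max_rate_per_partition, batch_seconds):
    """Bound how far each topic-partition may advance in one batch.

    current / latest: dicts mapping (topic, partition) -> offset.
    max_rate_per_partition: per-partition caps in records/second, with an
    optional "default" entry used for partitions that have no explicit cap.
    """
    default_rate = max_rate_per_partition.get("default", float("inf"))
    bounded = {}
    for tp, end in latest.items():
        start = current[tp]
        rate = max_rate_per_partition.get(tp, default_rate)
        # A partition may consume at most rate * batch_seconds records,
        # and never past the latest available offset.
        bounded[tp] = min(end, start + rate * batch_seconds)
    return bounded
```

For a 2-second batch, a partition capped at 50 records/sec advances by at most 100 offsets, while uncapped partitions fall back to the default rate.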
[GitHub] spark issue #14981: [SPARK-17418] Remove Kinesis artifacts from Spark releas...
Github user lresende commented on the issue: https://github.com/apache/spark/pull/14981 Yes, and this is the intent. It's ok to have these in the source release (similar to ganglia) but we don't publish them in the maven repository, and they become available only if people go and build them directly locally.
[GitHub] spark issue #15051: [SPARK-17499][SparkR][ML][MLLib] make the default params...
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/15051 @felixcheung Now I added some tests using default parameters and compared the output predictions with the results generated by the Scala-side code. Thanks!
[GitHub] spark issue #15051: [SPARK-17499][SparkR][ML][MLLib] make the default params...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15051 **[Test build #65542 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65542/consoleFull)** for PR 15051 at commit [`ce2c2f7`](https://github.com/apache/spark/commit/ce2c2f743e912225416a1f28b0e90d5d88ddaf49).
[GitHub] spark issue #15131: [SPARK-17577][SparkR] SparkR support add files to Spark ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15131 Merged build finished. Test FAILed.
[GitHub] spark issue #15131: [SPARK-17577][SparkR] SparkR support add files to Spark ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15131 **[Test build #65540 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65540/consoleFull)** for PR 15131 at commit [`5c49428`](https://github.com/apache/spark/commit/5c49428738d8817f43f23c60f85850864845e7b9). * This patch **fails SparkR unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15131: [SPARK-17577][SparkR] SparkR support add files to Spark ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15131 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65540/ Test FAILed.
[GitHub] spark issue #15043: [SPARK-17491] Close serialization stream to fix wrong an...
Github user JoshRosen commented on the issue: https://github.com/apache/spark/pull/15043 I agree that we should add more off-heap tests, but I'd like to do it in another patch so that we can get this one merged faster to unblock the 2.0.1 RC. In terms of testing off-heap, I think that one of the best high-level tests / asserts would be to strengthen the `releaseUnrollMemory()` checks so that inappropriately releasing unroll memory _during_ a task throws an exception during tests. Today there are some circumstances where unroll memory can only be released at the end of a task (such as an iterator backed by an unrolled block that is only partially consumed before the task ends), so the calls to release unroll memory have been tolerant of too much memory being released (it just releases `min(actualMemory, requestedToRelease)`). However, this is only appropriate to do at the end of the task so we should strengthen the asserts to only allow it there; this would have caught the memory mode mixup that I fixed here. I'm going to retest this and if it passes tests then I'll merge to master and branch-2.0. I'll add the new tests described above in a followup.
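The tolerant-release behavior described above, releasing `min(actualMemory, requestedToRelease)`, being safe only at end of task, can be illustrated with a small sketch. This is a hypothetical model for exposition, not Spark's actual `MemoryStore` code:

```python
class UnrollMemoryPool:
    """Toy model of per-task unroll-memory bookkeeping."""

    def __init__(self):
        self.held = 0  # bytes of unroll memory currently held by the task

    def acquire(self, n):
        self.held += n

    def release(self, requested, task_ended=False):
        if requested > self.held and not task_ended:
            # The stricter check proposed in the comment: over-releasing
            # mid-task signals a bookkeeping (e.g. memory-mode) bug.
            raise AssertionError(
                f"released {requested} but only {self.held} held mid-task")
        # The tolerant behavior: never free more than is actually held.
        freed = min(self.held, requested)
        self.held -= freed
        return freed
```

At end of task an over-large release request is silently clamped (a partially consumed unrolled iterator may leave memory that is only freed then); mid-task the same request fails fast.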
[GitHub] spark issue #14981: [SPARK-17418] Remove Kinesis artifacts from Spark releas...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/14981 The issue is that this also removes the non-assembly artifact from the release. That does not seem to be strictly needed license-wise. It is easy and tidy though.
[GitHub] spark issue #15043: [SPARK-17491] Close serialization stream to fix wrong an...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15043 **[Test build #65541 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65541/consoleFull)** for PR 15043 at commit [`0d70774`](https://github.com/apache/spark/commit/0d70774e1db04edb46b312efc4b1646d7201fb03).
[GitHub] spark issue #14981: [SPARK-17418] Remove Kinesis artifacts from Spark releas...
Github user lresende commented on the issue: https://github.com/apache/spark/pull/14981 @srowen @rxin My understanding is that the mvn deploy is what takes care of actually publishing the files to the maven staging repository:

```
$MVN -DzincPort=$ZINC_PORT --settings $tmp_settings -DskipTests $PUBLISH_PROFILES deploy
./dev/change-scala-version.sh 2.10
$MVN -DzincPort=$ZINC_PORT -Dscala-2.10 --settings $tmp_settings \
  -DskipTests $PUBLISH_PROFILES clean deploy
```

So, the suggested fix to remove Kinesis from $PUBLISH_PROFILES should take care of making sure Kinesis won't show up in the maven staging repository for the release. @srowen Do you have other concerns?
[GitHub] spark issue #15043: [SPARK-17491] Close serialization stream to fix wrong an...
Github user JoshRosen commented on the issue: https://github.com/apache/spark/pull/15043 Jenkins, retest this please.
[GitHub] spark issue #13324: [SPARK-15559][PYTHON][STREAMING] Add hash method for Top...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13324 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65537/ Test PASSed.