[GitHub] spark issue #16787: [SPARK-19448][SQL]optimize some duplication functions in...

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16787
  
**[Test build #72622 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72622/testReport)**
 for PR 16787 at commit 
[`df3597e`](https://github.com/apache/spark/commit/df3597ec28e71dc82c56f87464e6c12f3862ca95).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16854: [WIP][SPARK-15463][SQL] Add an API to load DataFrame fro...

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16854
  
**[Test build #72623 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72623/testReport)**
 for PR 16854 at commit 
[`a7e8c2b`](https://github.com/apache/spark/commit/a7e8c2bfaf98c27885907caa21cce7e93d4afd1b).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16858: [SPARK-19464][BUILD][HOTFIX] run-tests should use hadoop...

2017-02-08 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/16858
  
Thanks but the R test removal is unnecessary but probably ideally should be 
added back at some point. it was just parsing a sample URL - we should have one 
to make sure we don't break with the release share URL in which the sub 
directory structure is different from nightly snapshot builds.

It's not urgent.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16837: [SPARK-19359][SQL] renaming partition should not leave u...

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16837
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16837: [SPARK-19359][SQL] renaming partition should not leave u...

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16837
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72617/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16837: [SPARK-19359][SQL] renaming partition should not leave u...

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16837
  
**[Test build #72617 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72617/testReport)**
 for PR 16837 at commit 
[`2c4d3c7`](https://github.com/apache/spark/commit/2c4d3c71e971bd039936c43143718ba7091d6113).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16750: [SPARK-18937][SQL] Timezone support in CSV/JSON p...

2017-02-08 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/16750#discussion_r100225911
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/JsonExpressionsSuite.scala
 ---
@@ -357,30 +361,70 @@ class JsonExpressionsSuite extends SparkFunSuite with 
ExpressionEvalHelper {
 val jsonData = """{"a" 1}"""
 val schema = StructType(StructField("a", IntegerType) :: Nil)
 checkEvaluation(
-  JsonToStruct(schema, Map.empty, Literal(jsonData)),
+  JsonToStruct(schema, Map.empty, Literal(jsonData), gmtId),
   null
 )
 
 // Other modes should still return `null`.
 checkEvaluation(
-  JsonToStruct(schema, Map("mode" -> ParseModes.PERMISSIVE_MODE), 
Literal(jsonData)),
+  JsonToStruct(schema, Map("mode" -> ParseModes.PERMISSIVE_MODE), 
Literal(jsonData), gmtId),
   null
 )
   }
 
   test("from_json null input column") {
 val schema = StructType(StructField("a", IntegerType) :: Nil)
 checkEvaluation(
-  JsonToStruct(schema, Map.empty, Literal.create(null, StringType)),
+  JsonToStruct(schema, Map.empty, Literal.create(null, StringType), 
gmtId),
   null
 )
   }
 
+  test("from_json with timestamp") {
+val schema = StructType(StructField("t", TimestampType) :: Nil)
+
+val jsonData1 = """{"t": "2016-01-01T00:00:00.123Z"}"""
+var c = Calendar.getInstance(DateTimeUtils.TimeZoneGMT)
+c.set(2016, 0, 1, 0, 0, 0)
+c.set(Calendar.MILLISECOND, 123)
+checkEvaluation(
+  JsonToStruct(schema, Map.empty, Literal(jsonData1), gmtId),
+  InternalRow.fromSeq(c.getTimeInMillis * 1000L :: Nil)
+)
+checkEvaluation(
+  JsonToStruct(schema, Map.empty, Literal(jsonData1), Option("PST")),
+  InternalRow.fromSeq(c.getTimeInMillis * 1000L :: Nil)
--- End diff --

FYI, it's because the input json string includes timezone string `"Z"`, 
which means GMT.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16415: [SPARK-19063][ML]Speedup and optimize the GradientBooste...

2017-02-08 Thread zdh2292390
Github user zdh2292390 commented on the issue:

https://github.com/apache/spark/pull/16415
  
@jkbradley @srowen   Hi, sorry to bother, it's been a lot time since last 
discussion. Can you please tell me if there is a problem in my latest commit?  
Thank you very much!  Expect reply.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16750: [SPARK-18937][SQL] Timezone support in CSV/JSON p...

2017-02-08 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/16750#discussion_r100225669
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/JsonExpressionsSuite.scala
 ---
@@ -357,30 +361,70 @@ class JsonExpressionsSuite extends SparkFunSuite with 
ExpressionEvalHelper {
 val jsonData = """{"a" 1}"""
 val schema = StructType(StructField("a", IntegerType) :: Nil)
 checkEvaluation(
-  JsonToStruct(schema, Map.empty, Literal(jsonData)),
+  JsonToStruct(schema, Map.empty, Literal(jsonData), gmtId),
   null
 )
 
 // Other modes should still return `null`.
 checkEvaluation(
-  JsonToStruct(schema, Map("mode" -> ParseModes.PERMISSIVE_MODE), 
Literal(jsonData)),
+  JsonToStruct(schema, Map("mode" -> ParseModes.PERMISSIVE_MODE), 
Literal(jsonData), gmtId),
   null
 )
   }
 
   test("from_json null input column") {
 val schema = StructType(StructField("a", IntegerType) :: Nil)
 checkEvaluation(
-  JsonToStruct(schema, Map.empty, Literal.create(null, StringType)),
+  JsonToStruct(schema, Map.empty, Literal.create(null, StringType), 
gmtId),
   null
 )
   }
 
+  test("from_json with timestamp") {
+val schema = StructType(StructField("t", TimestampType) :: Nil)
+
+val jsonData1 = """{"t": "2016-01-01T00:00:00.123Z"}"""
+var c = Calendar.getInstance(DateTimeUtils.TimeZoneGMT)
+c.set(2016, 0, 1, 0, 0, 0)
+c.set(Calendar.MILLISECOND, 123)
+checkEvaluation(
+  JsonToStruct(schema, Map.empty, Literal(jsonData1), gmtId),
+  InternalRow.fromSeq(c.getTimeInMillis * 1000L :: Nil)
+)
+checkEvaluation(
+  JsonToStruct(schema, Map.empty, Literal(jsonData1), Option("PST")),
+  InternalRow.fromSeq(c.getTimeInMillis * 1000L :: Nil)
--- End diff --

I'm sorry, I should have added a comment.
I'll add soon.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16750: [SPARK-18937][SQL] Timezone support in CSV/JSON p...

2017-02-08 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/16750#discussion_r100225659
  
--- Diff: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/JsonExpressionsSuite.scala
 ---
@@ -357,30 +361,70 @@ class JsonExpressionsSuite extends SparkFunSuite with 
ExpressionEvalHelper {
 val jsonData = """{"a" 1}"""
 val schema = StructType(StructField("a", IntegerType) :: Nil)
 checkEvaluation(
-  JsonToStruct(schema, Map.empty, Literal(jsonData)),
+  JsonToStruct(schema, Map.empty, Literal(jsonData), gmtId),
   null
 )
 
 // Other modes should still return `null`.
 checkEvaluation(
-  JsonToStruct(schema, Map("mode" -> ParseModes.PERMISSIVE_MODE), 
Literal(jsonData)),
+  JsonToStruct(schema, Map("mode" -> ParseModes.PERMISSIVE_MODE), 
Literal(jsonData), gmtId),
   null
 )
   }
 
   test("from_json null input column") {
 val schema = StructType(StructField("a", IntegerType) :: Nil)
 checkEvaluation(
-  JsonToStruct(schema, Map.empty, Literal.create(null, StringType)),
+  JsonToStruct(schema, Map.empty, Literal.create(null, StringType), 
gmtId),
   null
 )
   }
 
+  test("from_json with timestamp") {
+val schema = StructType(StructField("t", TimestampType) :: Nil)
+
+val jsonData1 = """{"t": "2016-01-01T00:00:00.123Z"}"""
+var c = Calendar.getInstance(DateTimeUtils.TimeZoneGMT)
+c.set(2016, 0, 1, 0, 0, 0)
+c.set(Calendar.MILLISECOND, 123)
+checkEvaluation(
+  JsonToStruct(schema, Map.empty, Literal(jsonData1), gmtId),
+  InternalRow.fromSeq(c.getTimeInMillis * 1000L :: Nil)
--- End diff --

Thanks, I'll use it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16859: [SPARK-17714][Core][test-maven][test-hadoop2.6]Avoid usi...

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16859
  
**[Test build #3570 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3570/testReport)**
 for PR 16859 at commit 
[`1c88474`](https://github.com/apache/spark/commit/1c8847494c29d4b51182ecfeebb5cc85e000e7a1).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16664: [SPARK-18120 ][SQL] Call QueryExecutionListener c...

2017-02-08 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/16664#discussion_r100225242
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala ---
@@ -218,7 +247,14 @@ final class DataFrameWriter[T] private[sql](ds: 
Dataset[T]) {
   bucketSpec = getBucketSpec,
   options = extraOptions.toMap)
 
-dataSource.write(mode, df)
+val destination = source match {
+  case "jdbc" => extraOptions.get(JDBCOptions.JDBC_TABLE_NAME)
+  case _ => extraOptions.get("path")
--- End diff --

But not all of the keys are exposed as public APIs as far as I can see.

 e.g. calling the `save` method adds a "path" key to the option map, but is 
that key name a public API? I don't consider the key name a public API in this 
case (you can change it and existing applications will keep working).

Similarly for the JDBC URL and table names. The public method is the API, 
not necessarily the keys used internally.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16664: [SPARK-18120 ][SQL] Call QueryExecutionListener c...

2017-02-08 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/16664#discussion_r100224399
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala ---
@@ -218,7 +247,14 @@ final class DataFrameWriter[T] private[sql](ds: 
Dataset[T]) {
   bucketSpec = getBucketSpec,
   options = extraOptions.toMap)
 
-dataSource.write(mode, df)
+val destination = source match {
+  case "jdbc" => extraOptions.get(JDBCOptions.JDBC_TABLE_NAME)
+  case _ => extraOptions.get("path")
--- End diff --

As we already use `Map` as the user-facing API for `DataFrameWriter`, I 
think we can't change it. To make it consistent, I think it's reasonable to 
still pass `Map` to listeners. I agree it's bad that listeners to know these 
magic keys, but it's already the case of `DataFrameWriter`/


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16787: [SPARK-19448][SQL]optimize some duplication functions in...

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16787
  
**[Test build #72622 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72622/testReport)**
 for PR 16787 at commit 
[`df3597e`](https://github.com/apache/spark/commit/df3597ec28e71dc82c56f87464e6c12f3862ca95).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16638: [SPARK-19115] [SQL] Supporting Create External Table Lik...

2017-02-08 Thread ouyangxiaochen
Github user ouyangxiaochen commented on the issue:

https://github.com/apache/spark/pull/16638
  
Should I delete my local master repository firstly,and fork a new one 
again? @cloud-fan


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16736: [SPARK-19265][SQL][Follow-up] Configurable `tableRelatio...

2017-02-08 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/16736
  
LGTM, pending test


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16862: [SPARK-19520][streaming] Do not encrypt data written to ...

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16862
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72614/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16862: [SPARK-19520][streaming] Do not encrypt data written to ...

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16862
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16862: [SPARK-19520][streaming] Do not encrypt data written to ...

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16862
  
**[Test build #72614 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72614/testReport)**
 for PR 16862 at commit 
[`bdbe267`](https://github.com/apache/spark/commit/bdbe267617f14392455881862571561724f7d5de).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16797: [SPARK-19455][SQL] Add option for case-insensitive Parqu...

2017-02-08 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/16797
  
@budde Spark does support mixed-case-schema tables, and it has always been. 
It's because we write table schema to metastore case-preserving, via table 
properties. When we read a table, we get schema from metastore and assume it's 
the schema of the table data files. So the data file schema must match the 
table schema, or Spark will fail, it has always been.

However, there is one exception. There are 2 kinds of tables in Spark: data 
source tables and hive serde tables(we have different SQL syntax to create 
them). Data source tables are totally managed by Spark, we read/write data 
files directly and only use hive metastore as a persistent layer, which means 
data source tables are not compatible with hive, hive can't read/write it.

For hive serde tables, it should be compatible with hive and we use hive 
api to read/write it. For any table, as long as hive can read it, Spark can 
read it. However, the exception is, for parquet and orc formats, we will read 
data files directly, as an optimization(reading using hive api is slow). Before 
Spark 2.1, we save schema to hive metastore directly, which means schema will 
be lowercased. w.r.t. this, ideally we should not support mixed-case-schema 
parquet/orc data files for this kind of table, or the data schema will mismatch 
the table schema. But we supported it, with the cost of runtime schema 
inference.

This problem was solved in Spark 2.1, by writing table schema to metastore 
case-preserving for hive serde tables. Now we can say that, the data schema 
must match the table schema, or Spark should fail.

Then comes to this problem: for parquet/orc format hive serde tables 
created by Spark prior to 2.1, the data file schema may not match the table 
schema, but we need to still support it for compatibility.

That's why I prefer the migration command approach, it keeps the concept 
clean: data schema must match table schema.

Like you said, users can still create a hive table with mixed-case-schema 
parquet/orc files, by hive or other systems like presto. This table is readable 
for hive, and for Spark prior to 2.1, because of the runtime schema inference. 
But this is not intentional, and Spark should not support it as the data file 
schema and table schema mismatch. We can make the migration command cover this 
case too.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16760: [SPARK-18872][SQL][TESTS] New test cases for EXISTS subq...

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16760
  
**[Test build #72621 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72621/testReport)**
 for PR 16760 at commit 
[`2473e0c`](https://github.com/apache/spark/commit/2473e0c440a9d1cd761ae6d704d0aa02c63afd83).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16760: [SPARK-18872][SQL][TESTS] New test cases for EXISTS subq...

2017-02-08 Thread dilipbiswal
Github user dilipbiswal commented on the issue:

https://github.com/apache/spark/pull/16760
  
retest this please


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16859: [SPARK-17714][Core][maven][test-hadoop2.6]Avoid using Ex...

2017-02-08 Thread vanzin
Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/16859
  
btw I think you want `[test-maven]` to run the builder on maven. (last one 
ran on sbt.)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16386
  
**[Test build #72620 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72620/testReport)**
 for PR 16386 at commit 
[`f71a465`](https://github.com/apache/spark/commit/f71a465cf07fb9c043b2ccd86fa57e8e8ea9dc00).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16386: [SPARK-18352][SQL] Support parsing multiline json...

2017-02-08 Thread NathanHowell
Github user NathanHowell commented on a diff in the pull request:

https://github.com/apache/spark/pull/16386#discussion_r100219153
  
--- Diff: 
core/src/main/scala/org/apache/spark/input/PortableDataStream.scala ---
@@ -194,5 +195,8 @@ class PortableDataStream(
   }
 
   def getPath(): String = path
+
+  @Since("2.2.0")
--- End diff --

Done, pushed in f71a465cf07fb9c043b2ccd86fa57e8e8ea9dc00


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16736: [SPARK-19265][SQL][Follow-up] Configurable `tableRelatio...

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16736
  
**[Test build #72619 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72619/testReport)**
 for PR 16736 at commit 
[`f29c9d7`](https://github.com/apache/spark/commit/f29c9d77a683c1a63abac92f19210eadcb68682e).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16864: [SPARK-19527][Core] Approximate Size of Intersection of ...

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16864
  
Can one of the admins verify this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16859: [SPARK-17714][Core][maven][test-hadoop2.6]Avoid u...

2017-02-08 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/16859#discussion_r100218630
  
--- Diff: 
common/network-common/src/main/java/org/apache/spark/network/protocol/MessageDecoder.java
 ---
@@ -35,6 +35,12 @@
 
   private static final Logger logger = 
LoggerFactory.getLogger(MessageDecoder.class);
 
+  public static final MessageDecoder INSTANCE = new MessageDecoder();
+
+  private MessageDecoder() {
+super();
--- End diff --

nit: not really necessary (also in the other class)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16864: [SPARK-19527][Core] Approximate Size of Intersect...

2017-02-08 Thread Bcpoole
GitHub user Bcpoole opened a pull request:

https://github.com/apache/spark/pull/16864

[SPARK-19527][Core] Approximate Size of Intersection of Bloom Filters

**What changes were proposed in this pull request?**

Added functions to get the Swamidass & Baldi (2007) approximation for 
number of items in a Bloom filter and the intersections of two filters. Added 
an exception type IncompatibleUnionException mimicing 
IncompatibleMergeException. As needed for the intersection approximation, there 
is a function that create the union of two Bloom filters (no mutations).

**How was this patch tested?**

Manual Tests

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/Bcpoole/spark 
approxItemsInBloomFilterIntersection

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/16864.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16864


commit 7a3ad46ff86bd3d2d47f6a56bace1a0c4fd171c8
Author: Bcpoole 
Date:   2017-02-09T01:11:07Z

Swamidass & Baldi approx. items in intersection of two Bloom filters. Also 
function to create union (non-mutation) of two Bloom filters.

commit b9680c57b2f8b1d93c28884de9a7ebbe52505f6c
Author: Bcpoole 
Date:   2017-02-09T01:42:36Z

Changed createUnionBloomFilter & approxItemsInIntersection to be instance 
instead of static functions

commit 501ad7e22101b00862c0c77ef8c38e1b166d33a4
Author: Bcpoole 
Date:   2017-02-09T01:53:50Z

Updated abstract class to reflect changes in previous commit




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16664: [SPARK-18120 ][SQL] Call QueryExecutionListener c...

2017-02-08 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/16664#discussion_r100218129
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala ---
@@ -218,7 +247,14 @@ final class DataFrameWriter[T] private[sql](ds: 
Dataset[T]) {
   bucketSpec = getBucketSpec,
   options = extraOptions.toMap)
 
-dataSource.write(mode, df)
+val destination = source match {
+  case "jdbc" => extraOptions.get(JDBCOptions.JDBC_TABLE_NAME)
+  case _ => extraOptions.get("path")
--- End diff --

So that comment is about different types of queries that might have 
different extra information. I still think a generic map is the wrong idea for 
fixing that.

You could, for example, have this parameter be `Any` (or some tagging 
interface, e.g. `trait QueryExecutionParams` or some such), and the listener 
can then match to find the right type. That is extensible (new types can be 
added without breaking existing ones) and wouldn't require this API to change 
in the future.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16856: [SPARK-19516][DOC] update public doc to use SparkSession...

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16856
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72618/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16856: [SPARK-19516][DOC] update public doc to use SparkSession...

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16856
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16856: [SPARK-19516][DOC] update public doc to use SparkSession...

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16856
  
**[Test build #72618 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72618/testReport)**
 for PR 16856 at commit 
[`a5b91fd`](https://github.com/apache/spark/commit/a5b91fd5e261f27df9e088f5ee16929a7489be6d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16386: [SPARK-18352][SQL] Support parsing multiline json...

2017-02-08 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/16386#discussion_r100217653
  
--- Diff: 
core/src/main/scala/org/apache/spark/input/PortableDataStream.scala ---
@@ -194,5 +195,8 @@ class PortableDataStream(
   }
 
   def getPath(): String = path
+
+  @Since("2.2.0")
--- End diff --

SGTM, can you take a look at other public methods in this class and add 
since tag for them? or it looks weird that only one method has since tag...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15009: [SPARK-17443][SPARK-11035] Stop Spark Application if lau...

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15009
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15009: [SPARK-17443][SPARK-11035] Stop Spark Application if lau...

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15009
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72610/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15009: [SPARK-17443][SPARK-11035] Stop Spark Application if lau...

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15009
  
**[Test build #72610 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72610/testReport)**
 for PR 15009 at commit 
[`2707d21`](https://github.com/apache/spark/commit/2707d219f3f2c3fa1e89553809cf3a8d118fc084).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16715: [Spark-18080][ML] Python API & Examples for Locality Sen...

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16715
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72615/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16863: Swamidass & Baldi Approximations

2017-02-08 Thread Bcpoole
Github user Bcpoole closed the pull request at:

https://github.com/apache/spark/pull/16863


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16715: [Spark-18080][ML] Python API & Examples for Locality Sen...

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16715
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16715: [Spark-18080][ML] Python API & Examples for Locality Sen...

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16715
  
**[Test build #72615 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72615/testReport)**
 for PR 16715 at commit 
[`4bc670c`](https://github.com/apache/spark/commit/4bc670cf4e512953019d97cfd57413158f31377a).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16863: Swamidass & Baldi Approximations

2017-02-08 Thread uncleGen
Github user uncleGen commented on the issue:

https://github.com/apache/spark/pull/16863
  
Please review http://spark.apache.org/contributing.html before opening a 
pull request.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16856: [SPARK-19516][DOC] update public doc to use SparkSession...

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16856
  
**[Test build #72618 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72618/testReport)**
 for PR 16856 at commit 
[`a5b91fd`](https://github.com/apache/spark/commit/a5b91fd5e261f27df9e088f5ee16929a7489be6d).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16863: Swamidass & Baldi Approximations

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16863
  
Can one of the admins verify this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16674: [SPARK-19331][SQL][TESTS] Improve the test coverage of S...

2017-02-08 Thread jiangxb1987
Github user jiangxb1987 commented on the issue:

https://github.com/apache/spark/pull/16674
  
@gatorsmile @cloud-fan Could you please look at this when you have time? 
Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16863: Swamidass & Baldi Approximations

2017-02-08 Thread Bcpoole
GitHub user Bcpoole reopened a pull request:

https://github.com/apache/spark/pull/16863

Swamidass & Baldi Approximations

## What changes were proposed in this pull request?

Added functions to get the Swamidass & Baldi (2007) approximation for 
number of items in a Bloom filter and the intersections of two filters. Added 
an exception type IncompatibleUnionException mimicing 
IncompatibleMergeException. As needed for the intersection approximation, there 
is a function that create the union of two Bloom filters (no mutations).

## How was this patch tested?

Manual Tests

Please review http://spark.apache.org/contributing.html before opening a 
pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/Bcpoole/spark 
approxItemsInBloomFilterIntersection

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/16863.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16863


commit 7a3ad46ff86bd3d2d47f6a56bace1a0c4fd171c8
Author: Bcpoole 
Date:   2017-02-09T01:11:07Z

Swamidass & Baldi approx. items in intersection of two Bloom filters. Also 
function to create union (non-mutation) of two Bloom filters.

commit b9680c57b2f8b1d93c28884de9a7ebbe52505f6c
Author: Bcpoole 
Date:   2017-02-09T01:42:36Z

Changed createUnionBloomFilter & approxItemsInIntersection to be instance 
instead of static functions




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16856: [SPARK-19516][DOC] update public doc to use SparkSession...

2017-02-08 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/16856
  
several questions:

1. as `SparkSession` becomes the new entry point, now we require users to 
include spark-sql module in their sbt file or pom.xml, which seems weird if 
users only want to write a very simple spark application without SQL 
functionality. Shall we rename spark-sql to a more general name like 
`spark-structure`?
2. when it comes to java application with RDD, it's inconvenient to build 
`SparkSession` and then create `JavaSparkContext`, users can create 
`JavaSparkContext` directly.

cc @rxin @yhuai


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16856: [SPARK-19516][DOC] update public doc to use Spark...

2017-02-08 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/16856#discussion_r100213922
  
--- Diff: docs/programming-guide.md ---
@@ -244,13 +239,13 @@ use IPython, set the `PYSPARK_DRIVER_PYTHON` variable 
to `ipython` when running
 $ PYSPARK_DRIVER_PYTHON=ipython ./bin/pyspark
 {% endhighlight %}
 
-To use the Jupyter notebook (previously known as the IPython notebook), 
--- End diff --

sorry my editor removes unnecessary whitespace automatically...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16856: [SPARK-19516][DOC] update public doc to use Spark...

2017-02-08 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/16856#discussion_r100213837
  
--- Diff: docs/programming-guide.md ---
@@ -77,9 +76,9 @@ In addition, if you wish to access an HDFS cluster, you 
need to add a dependency
 Finally, you need to import some Spark classes into your program. Add the 
following lines:
 
 {% highlight scala %}
-import org.apache.spark.api.java.JavaSparkContext
-import org.apache.spark.api.java.JavaRDD
-import org.apache.spark.SparkConf
+import org.apache.spark.api.java.JavaSparkContext;
--- End diff --

hmm, shouldn't it be java code?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16804: [SPARK-19459][SQL] Add Hive datatype (char/varcha...

2017-02-08 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/16804#discussion_r100213581
  
--- Diff: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/OrcSourceSuite.scala ---
@@ -152,14 +152,39 @@ abstract class OrcSuite extends QueryTest with 
TestHiveSingleton with BeforeAndA
 assert(new OrcOptions(Map("Orc.Compress" -> "NONE")).compressionCodec 
== "NONE")
   }
 
-  test("SPARK-18220: read Hive orc table with varchar column") {
+  test("SPARK-19459/SPARK-18220: read char/varchar column written by 
Hive") {
 val hiveClient = 
spark.sharedState.externalCatalog.asInstanceOf[HiveExternalCatalog].client
+val location = Utils.createTempDir().toURI
--- End diff --

shall we remove this temp dir in the finally block?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15435: [SPARK-17139][ML] Add model summary for MultinomialLogis...

2017-02-08 Thread WeichenXu123
Github user WeichenXu123 commented on the issue:

https://github.com/apache/spark/pull/15435
  
I read  jkbradley's thoughts here, so I will modify this as following:

first we need 4 traits, using the following hierarchy:
LogisticRegressionSummary
LogisticRegressionTrainingSummary: LogisticRegressionSummary
BinaryLogisticRegressionSummary: LogisticRegressionSummary
BinaryLogisticRegressionTrainingSummary: LogisticRegressionTrainingSummary, 
BinaryLogisticRegressionSummary

and the public method such as `def summary` only return trait type listed 
above.

and then implement 4 concrete classes:
`LogisticRegressionSummaryImpl` (multiclass case) 
`LogisticRegressionTrainingSummaryImpl` (multiclass case)
`BinaryLogisticRegressionSummaryImpl` (binary case).
`BinaryLogisticRegressionTrainingSummaryImpl` (binary case).

Is that right ? @jkbradley @sethah 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16863: Swamidass & Baldi Approximations

2017-02-08 Thread Bcpoole
Github user Bcpoole closed the pull request at:

https://github.com/apache/spark/pull/16863


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16837: [SPARK-19359][SQL] renaming partition should not leave u...

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16837
  
**[Test build #72617 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72617/testReport)**
 for PR 16837 at commit 
[`2c4d3c7`](https://github.com/apache/spark/commit/2c4d3c71e971bd039936c43143718ba7091d6113).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16826: Fork SparkSession with option to inherit a copy of the S...

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16826
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72606/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16863: Swamidass & Baldi Approximations

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16863
  
Can one of the admins verify this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16826: Fork SparkSession with option to inherit a copy of the S...

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16826
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16800: [SPARK-19456][SparkR]:Add LinearSVC R API

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16800
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72613/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16826: Fork SparkSession with option to inherit a copy of the S...

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16826
  
**[Test build #72606 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72606/testReport)**
 for PR 16826 at commit 
[`a343d8a`](https://github.com/apache/spark/commit/a343d8af9c577158042e4af9f8832f46aeecd509).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16863: Swamidass & Baldi Approximations

2017-02-08 Thread Bcpoole
GitHub user Bcpoole opened a pull request:

https://github.com/apache/spark/pull/16863

Swamidass & Baldi Approximations

## What changes were proposed in this pull request?

Added functions to get the Swamidass & Baldi (2007) approximation for 
number of items in a Bloom filter and the intersections of two filters. Added 
an exception type IncompatibleUnionException mimicing 
IncompatibleMergeException. As needed for the intersection approximation, there 
is a function that create the union of two Bloom filters (no mutations).

## How was this patch tested?

Manual Tests

Please review http://spark.apache.org/contributing.html before opening a 
pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/Bcpoole/spark 
approxItemsInBloomFilterIntersection

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/16863.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16863


commit 7a3ad46ff86bd3d2d47f6a56bace1a0c4fd171c8
Author: Bcpoole 
Date:   2017-02-09T01:11:07Z

Swamidass & Baldi approx. items in intersection of two Bloom filters. Also 
function to create union (non-mutation) of two Bloom filters.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16800: [SPARK-19456][SparkR]:Add LinearSVC R API

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16800
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16736: [SPARK-19265][SQL][Follow-up] Configurable `table...

2017-02-08 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/16736#discussion_r100211955
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/internal/SQLConfEntrySuite.scala 
---
@@ -171,4 +171,14 @@ class SQLConfEntrySuite extends SparkFunSuite {
   buildConf(key).stringConf.createOptional
 }
   }
+
+  test("StaticSQLConf.FILESOURCE_TABLE_RELATION_CACHE_SIZE") {
+val confEntry = StaticSQLConf.FILESOURCE_TABLE_RELATION_CACHE_SIZE
+assert(conf.getConf(confEntry) === 1000)
+conf.setConf(confEntry, -1)
--- End diff --

please also test `setConfString`, which should check the value


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16800: [SPARK-19456][SparkR]:Add LinearSVC R API

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16800
  
**[Test build #72613 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72613/testReport)**
 for PR 16800 at commit 
[`bbc72e1`](https://github.com/apache/spark/commit/bbc72e1b02d6b1125cd125986beb5236a30b9d7d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16787: [SPARK-19448][SQL]optimize some duplication functions in...

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16787
  
**[Test build #72616 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72616/testReport)**
 for PR 16787 at commit 
[`bf09f15`](https://github.com/apache/spark/commit/bf09f15ca7c90138312eb73b819131adf16ac040).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16787: [SPARK-19448][SQL]optimize some duplication functions in...

2017-02-08 Thread windpiger
Github user windpiger commented on the issue:

https://github.com/apache/spark/pull/16787
  
retest this please


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16715: [Spark-18080][ML] Python API & Examples for Locality Sen...

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16715
  
**[Test build #72615 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72615/testReport)**
 for PR 16715 at commit 
[`4bc670c`](https://github.com/apache/spark/commit/4bc670cf4e512953019d97cfd57413158f31377a).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16861: Added function to get union of 2 Bloom filters (n...

2017-02-08 Thread Bcpoole
Github user Bcpoole closed the pull request at:

https://github.com/apache/spark/pull/16861


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16862: [SPARK-19520][streaming] Do not encrypt data written to ...

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16862
  
**[Test build #72614 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72614/testReport)**
 for PR 16862 at commit 
[`bdbe267`](https://github.com/apache/spark/commit/bdbe267617f14392455881862571561724f7d5de).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16736: [SPARK-19265][SQL][Follow-up] Configurable `tableRelatio...

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16736
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72605/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16736: [SPARK-19265][SQL][Follow-up] Configurable `tableRelatio...

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16736
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16736: [SPARK-19265][SQL][Follow-up] Configurable `tableRelatio...

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16736
  
**[Test build #72605 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72605/testReport)**
 for PR 16736 at commit 
[`314f6f8`](https://github.com/apache/spark/commit/314f6f8de6990b1c3bfddea503490a1797e25117).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16862: [SPARK-19520][streaming] Do not encrypt data writ...

2017-02-08 Thread vanzin
GitHub user vanzin opened a pull request:

https://github.com/apache/spark/pull/16862

[SPARK-19520][streaming] Do not encrypt data written to the WAL.

Spark's I/O encryption uses an ephemeral key for each driver instance.
So driver B cannot decrypt data written by driver A since it doesn't
have the correct key.

The write ahead log is used for recovery, thus needs to be readable by
a different driver. So it cannot be encrypted by Spark's I/O encryption
code.

The BlockManager APIs used by the WAL code to write the data automatically
encrypt data, so changes are needed so that callers can to opt out of
encryption.

Aside from that, the "putBytes" API in the BlockManager does not do
encryption, so a separate situation arised where the WAL would write
unencrypted data to the BM and, when those blocks were read, decryption
would fail. So the WAL code needs to ask the BM to encrypt that data
when encryption is enabled; this code is not optimal since it results
in a (temporary) second copy of the data block in memory, but should be
OK for now until a more performant solution is added. The non-encryption
case should not be affected.

Tested with new unit tests, and by running streaming apps that do
recovery using the WAL data with I/O encryption turned on.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vanzin/spark SPARK-19520

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/16862.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16862


commit bdbe267617f14392455881862571561724f7d5de
Author: Marcelo Vanzin 
Date:   2017-02-08T22:18:14Z

[SPARK-19520][streaming] Do not encrypt data written to the WAL.

Spark's I/O encryption uses an ephemeral key for each driver instance.
So driver B cannot decrypt data written by driver A since it doesn't
have the correct key.

The write ahead log is used for recovery, thus needs to be readable by
a different driver. So it cannot be encrypted by Spark's I/O encryption
code.

The BlockManager APIs used by the WAL code to write the data automatically
encrypt data, so changes are needed so that callers can to opt out of
encryption.

Aside from that, the "putBytes" API in the BlockManager does not do
encryption, so a separate situation arised where the WAL would write
unencrypted data to the BM and, when those blocks were read, decryption
would fail. So the WAL code needs to ask the BM to encrypt that data
when encryption is enabled; this code is not optimal since it results
in a (temporary) second copy of the data block in memory, but should be
OK for now until a more performant solution is added. The non-encryption
case should not be affected.

Tested with new unit tests, and by running streaming apps that do
recovery using the WAL data with I/O encryption turned on.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16862: [SPARK-19520][streaming] Do not encrypt data written to ...

2017-02-08 Thread vanzin
Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/16862
  
@tdas @zsxwing 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16804: [SPARK-19459][SQL] Add Hive datatype (char/varchar) to S...

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16804
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72604/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16804: [SPARK-19459][SQL] Add Hive datatype (char/varchar) to S...

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16804
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16804: [SPARK-19459][SQL] Add Hive datatype (char/varchar) to S...

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16804
  
**[Test build #72604 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72604/testReport)**
 for PR 16804 at commit 
[`e7ca0ea`](https://github.com/apache/spark/commit/e7ca0ead843f2c9650e690fe649be18fa6389e48).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16386
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15415
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72611/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15415
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16386
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72603/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15415
  
**[Test build #72611 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72611/testReport)**
 for PR 15415 at commit 
[`049e1a3`](https://github.com/apache/spark/commit/049e1a326daee4c55edb6d65090fafd229b93b6a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16386: [SPARK-18352][SQL] Support parsing multiline json files

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16386
  
**[Test build #72603 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72603/testReport)**
 for PR 16386 at commit 
[`30fb509`](https://github.com/apache/spark/commit/30fb509c71ed1f919e0198f2366aa817d96fc0ca).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16825: [SPARK-19481][REPL][maven]Avoid to leak SparkCont...

2017-02-08 Thread zsxwing
Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/16825#discussion_r100206338
  
--- Diff: repl/src/main/scala/org/apache/spark/repl/Signaling.scala ---
@@ -28,15 +28,17 @@ private[repl] object Signaling extends Logging {
* when no jobs are currently running.
* This makes it possible to interrupt a running shell job by pressing 
Ctrl+C.
*/
-  def cancelOnInterrupt(ctx: SparkContext): Unit = 
SignalUtils.register("INT") {
-if (!ctx.statusTracker.getActiveJobIds().isEmpty) {
-  logWarning("Cancelling all active jobs, this can take a while. " +
-"Press Ctrl+C again to exit now.")
-  ctx.cancelAllJobs()
-  true
-} else {
-  false
-}
+  def cancelOnInterrupt(): Unit = SignalUtils.register("INT") {
--- End diff --

It's used by REPL to cancel the running job if any.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16825: [SPARK-19481][REPL][maven]Avoid to leak SparkCont...

2017-02-08 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/16825#discussion_r100206125
  
--- Diff: repl/src/main/scala/org/apache/spark/repl/Signaling.scala ---
@@ -28,15 +28,17 @@ private[repl] object Signaling extends Logging {
* when no jobs are currently running.
* This makes it possible to interrupt a running shell job by pressing 
Ctrl+C.
*/
-  def cancelOnInterrupt(ctx: SparkContext): Unit = 
SignalUtils.register("INT") {
-if (!ctx.statusTracker.getActiveJobIds().isEmpty) {
-  logWarning("Cancelling all active jobs, this can take a while. " +
-"Press Ctrl+C again to exit now.")
-  ctx.cancelAllJobs()
-  true
-} else {
-  false
-}
+  def cancelOnInterrupt(): Unit = SignalUtils.register("INT") {
--- End diff --

Who is using this one? Is this a breaking change?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16830: [MINOR][CORE] Fix incorrect documentation of WritableCon...

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16830
  
**[Test build #3569 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3569/testReport)**
 for PR 16830 at commit 
[`c300ff6`](https://github.com/apache/spark/commit/c300ff6e0a2d802c474b2af5e1bea9afb8101a2c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16852: [SPARK-19512][SQL] codegen for compare structs fails

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16852
  
**[Test build #3568 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3568/testReport)**
 for PR 16852 at commit 
[`9a8d853`](https://github.com/apache/spark/commit/9a8d8537748f38a4276188b3f60f6852010e3387).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16861: Added function to get union of 2 Bloom filters (no mutat...

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16861
  
Can one of the admins verify this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16800: [SPARK-19456][SparkR]:Add LinearSVC R API

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16800
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72608/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16800: [SPARK-19456][SparkR]:Add LinearSVC R API

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16800
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16800: [SPARK-19456][SparkR]:Add LinearSVC R API

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16800
  
**[Test build #72608 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72608/testReport)**
 for PR 16800 at commit 
[`a180126`](https://github.com/apache/spark/commit/a1801261335cf08fc87f17d76979fda6fdcc1969).
 * This patch **fails SparkR unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16861: Added function to get union of 2 Bloom filters (n...

2017-02-08 Thread Bcpoole
GitHub user Bcpoole opened a pull request:

https://github.com/apache/spark/pull/16861

Added function to get union of 2 Bloom filters (no mutation). Added S…

…wamidass & Baldi approximations for number of items in a Bloom filter 
and for the intersection of 2 Bloom filters given those Bloom filters.

## What changes were proposed in this pull request?

(Please fill in changes proposed in this fix)

## How was this patch tested?

(Please explain how this patch was tested. E.g. unit tests, integration 
tests, manual tests)
(If this patch involves UI changes, please attach a screenshot; otherwise, 
remove this)

Please review http://spark.apache.org/contributing.html before opening a 
pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/Bcpoole/spark intersectionOfBloomFilters

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/16861.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16861


commit 934766b6527f512d8d81d2533899583b98d0219c
Author: Bcpoole 
Date:   2017-02-08T23:59:41Z

Added function to get union of 2 Bloom filters (no mutation). Added 
Swamidass & Baldi approximations for number of items in a Bloom filter and for 
the intersection of 2 Bloom filters given those Bloom filters.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16800: [SPARK-19456][SparkR]:Add LinearSVC R API

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16800
  
**[Test build #72613 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72613/testReport)**
 for PR 16800 at commit 
[`bbc72e1`](https://github.com/apache/spark/commit/bbc72e1b02d6b1125cd125986beb5236a30b9d7d).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16860: [SPARK-18613][ML] make spark.mllib LDA dependencies in s...

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16860
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16860: [SPARK-18613][ML] make spark.mllib LDA dependencies in s...

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16860
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72607/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16860: [SPARK-18613][ML] make spark.mllib LDA dependencies in s...

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16860
  
**[Test build #72607 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72607/testReport)**
 for PR 16860 at commit 
[`ce8abb3`](https://github.com/apache/spark/commit/ce8abb308c0b555754d332712d922a7aa9a8b220).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16715: [Spark-18080][ML] Python API & Examples for Locality Sen...

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16715
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16715: [Spark-18080][ML] Python API & Examples for Locality Sen...

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16715
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72609/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16715: [Spark-18080][ML] Python API & Examples for Locality Sen...

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16715
  
**[Test build #72609 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72609/testReport)**
 for PR 16715 at commit 
[`8e5468f`](https://github.com/apache/spark/commit/8e5468f6946a8f2c051746ddb0c0e65586bd1eed).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16715: [Spark-18080][ML] Python API & Examples for Locality Sen...

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16715
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72612/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16715: [Spark-18080][ML] Python API & Examples for Locality Sen...

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16715
  
**[Test build #72612 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72612/testReport)**
 for PR 16715 at commit 
[`6e85e1a`](https://github.com/apache/spark/commit/6e85e1a04b02dea26e82c8bc77151b8e389f4fe5).
 * This patch **fails Python style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #16715: [Spark-18080][ML] Python API & Examples for Locality Sen...

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16715
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



<    1   2   3   4   5   6   7   >