[GitHub] spark issue #16352: [SPARK-18947][SQL] SQLContext.tableNames should not call...

2016-12-20 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/16352
  
LGTM again pending test


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16352: [SPARK-18947][SQL] SQLContext.tableNames should n...

2016-12-20 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/16352#discussion_r93390578
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/api/r/SQLUtils.scala 
---
@@ -276,11 +276,12 @@ private[sql] object SQLUtils extends Logging {
   }
 
   def getTableNames(sparkSession: SparkSession, databaseName: String): 
Array[String] = {
-databaseName match {
-  case n: String if n != null && n.trim.nonEmpty =>
-sparkSession.catalog.listTables(n).collect().map(_.name)
+val db = databaseName match {
+  case _ if databaseName != null && databaseName.trim.nonEmpty =>
+databaseName.trim
--- End diff --

: ) Yeah





[GitHub] spark issue #16369: [SPARK-18956][SQL][PySpark] Reuse existing SparkSession ...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16369
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70460/
Test PASSed.





[GitHub] spark issue #16369: [SPARK-18956][SQL][PySpark] Reuse existing SparkSession ...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16369
  
**[Test build #70460 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70460/testReport)**
 for PR 16369 at commit 
[`0e96618`](https://github.com/apache/spark/commit/0e96618a9e6530cf6e43204dc7f80965bc759cae).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16356: [SPARK-18949] [SQL] Add recoverPartitions API to Catalog

2016-12-20 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/16356
  
Sure, let me do it now.





[GitHub] spark issue #16352: [SPARK-18947][SQL] SQLContext.tableNames should not call...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16352
  
**[Test build #70466 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70466/testReport)**
 for PR 16352 at commit 
[`95d6f89`](https://github.com/apache/spark/commit/95d6f89623fd29458b6363a40d2bfcdf7af6902d).





[GitHub] spark issue #15996: [SPARK-18567][SQL] Simplify CreateDataSourceTableAsSelec...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15996
  
**[Test build #70465 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70465/testReport)**
 for PR 15996 at commit 
[`59f06ce`](https://github.com/apache/spark/commit/59f06ce86338b11b74164de632cee518bf513697).





[GitHub] spark pull request #16352: [SPARK-18947][SQL] SQLContext.tableNames should n...

2016-12-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/16352#discussion_r93388755
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/api/r/SQLUtils.scala 
---
@@ -276,11 +276,12 @@ private[sql] object SQLUtils extends Logging {
   }
 
   def getTableNames(sparkSession: SparkSession, databaseName: String): 
Array[String] = {
-databaseName match {
-  case n: String if n != null && n.trim.nonEmpty =>
-sparkSession.catalog.listTables(n).collect().map(_.name)
+val db = databaseName match {
+  case _ if databaseName != null && databaseName.trim.nonEmpty =>
+databaseName.trim
--- End diff --

OK, let me keep the previous behavior, although it's weird (it checks
`...trim.nonEmpty` but does not use the trimmed database name).
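
To make the behavior under discussion concrete, here is a minimal sketch in plain Scala, without Spark. `TableNameSketch` and `resolveDatabase` are hypothetical names used only for illustration, and the fallback to the current database for a null or blank name is an assumption, since the quoted diff does not show that branch.

```scala
object TableNameSketch {
  // Hypothetical helper, not part of Spark: picks the database whose tables get listed.
  // Assumption: a null or blank name falls back to the current database.
  def resolveDatabase(databaseName: String, currentDatabase: String): String =
    if (databaseName != null && databaseName.trim.nonEmpty) {
      // Previous behavior: the guard trims, but the untrimmed name is passed on.
      databaseName
    } else {
      currentDatabase
    }
}
```

Under these assumptions, a name with trailing whitespace passes the guard and is handed on unchanged, so the catalog lookup sees the untrimmed string.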





[GitHub] spark pull request #16371: [SPARK-18932][SQL] Support partial aggregation fo...

2016-12-20 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/16371#discussion_r93388333
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Percentile.scala
 ---
@@ -33,10 +33,9 @@ import org.apache.spark.util.collection.OpenHashMap
  * The Percentile aggregate function returns the exact percentile(s) of 
numeric column `expr` at
  * the given percentage(s) with value range in [0.0, 1.0].
  *
- * The operator is bound to the slower sort based aggregation path because 
the number of elements
--- End diff --

`TypedImperativeAggregate` isn't bound to sort-based aggregation. 
`ObjectHashAggregateExec` now supports `TypedImperativeAggregate` in hash-based 
aggregation.





[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16371
  
**[Test build #70464 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70464/testReport)**
 for PR 16371 at commit 
[`68d1e98`](https://github.com/apache/spark/commit/68d1e98a049da996feb202660e6c6b15f94183b7).





[GitHub] spark pull request #16356: [SPARK-18949] [SQL] Add recoverPartitions API to ...

2016-12-20 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/16356





[GitHub] spark issue #16356: [SPARK-18949] [SQL] Add recoverPartitions API to Catalog

2016-12-20 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/16356
  
Can you send a pr for branch-2.1?






[GitHub] spark issue #16356: [SPARK-18949] [SQL] Add recoverPartitions API to Catalog

2016-12-20 Thread rxin
Github user rxin commented on the issue:

https://github.com/apache/spark/pull/16356
  
Merging in master/branch-2.1.






[GitHub] spark pull request #13909: [SPARK-16213][SQL] Reduce runtime overhead of a p...

2016-12-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/13909#discussion_r93387942
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
 ---
@@ -56,33 +58,89 @@ case class CreateArray(children: Seq[Expression]) 
extends Expression {
   }
 
   override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
-val arrayClass = classOf[GenericArrayData].getName
-val values = ctx.freshName("values")
-ctx.addMutableState("Object[]", values, s"this.$values = null;")
+val array = ctx.freshName("array")
 
-ev.copy(code = s"""
-  this.$values = new Object[${children.size}];""" +
+val et = dataType.elementType
+val evals = children.map(e => e.genCode(ctx))
+val isPrimitiveArray = ctx.isPrimitiveType(et)
+val primitiveTypeName = if (isPrimitiveArray) 
ctx.primitiveTypeName(et) else ""
+val (preprocess, arrayData, arrayWriter) =
+  GenArrayData.getCodeArrayData(ctx, et, children.size, 
isPrimitiveArray, array)
+
+ev.copy(code =
--- End diff --

Can you refactor it a little bit? The logic has become more complicated, and 
it's hard to read when it is all placed inside `ev.copy(...)`.
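
One possible shape for such a refactoring, sketched in plain Scala: build the generated code in named steps first, and keep the final `copy` call trivial. The `ExprCode` case class below is a simplified stand-in, not Spark's real class, and `genArrayCode` and its parameters are hypothetical names chosen for illustration.

```scala
object RefactorSketch {
  // Simplified stand-in for Spark's ExprCode; NOT the real class.
  final case class ExprCode(code: String, isNull: String, value: String)

  // Hypothetical helper: assembles the generated Java source step by step,
  // so that the final ev.copy(...) call only wires in the finished string.
  def genArrayCode(ev: ExprCode, elementCodes: Seq[String], arrayName: String): ExprCode = {
    val allocate = s"Object[] $arrayName = new Object[${elementCodes.size}];"
    val assignments = elementCodes.zipWithIndex.map { case (c, i) =>
      s"$arrayName[$i] = $c;"
    }
    val code = (allocate +: assignments).mkString("\n")
    ev.copy(code = code)
  }
}
```

Naming each intermediate piece keeps the string interpolation out of the `copy` call, which is the readability concern raised here; the actual patch may of course structure this differently.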





[GitHub] spark pull request #15996: [SPARK-18567][SQL] Simplify CreateDataSourceTable...

2016-12-20 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/15996#discussion_r93387828
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala ---
@@ -364,48 +366,157 @@ final class DataFrameWriter[T] private[sql](ds: 
Dataset[T]) {
   throw new AnalysisException("Cannot create hive serde table with 
saveAsTable API")
 }
 
-val tableExists = 
df.sparkSession.sessionState.catalog.tableExists(tableIdent)
-
-(tableExists, mode) match {
-  case (true, SaveMode.Ignore) =>
-// Do nothing
-
-  case (true, SaveMode.ErrorIfExists) =>
-throw new AnalysisException(s"Table $tableIdent already exists.")
-
-  case _ =>
-val existingTable = if (tableExists) {
-  
Some(df.sparkSession.sessionState.catalog.getTableMetadata(tableIdent))
-} else {
-  None
-}
-val storage = if (tableExists) {
-  existingTable.get.storage
-} else {
-  DataSource.buildStorageFormatFromOptions(extraOptions.toMap)
-}
-val tableType = if (tableExists) {
-  existingTable.get.tableType
-} else if (storage.locationUri.isDefined) {
-  CatalogTableType.EXTERNAL
-} else {
-  CatalogTableType.MANAGED
+val catalog = df.sparkSession.sessionState.catalog
+val db = tableIdent.database.getOrElse(catalog.getCurrentDatabase)
+val tableIdentWithDB = tableIdent.copy(database = Some(db))
+val tableName = tableIdentWithDB.unquotedString
+
+catalog.getTableMetadataOption(tableIdentWithDB) match {
+  // If the table already exists...
+  case Some(existingTable) =>
+mode match {
+  case SaveMode.Ignore => // Do nothing
+
+  case SaveMode.ErrorIfExists =>
+throw new AnalysisException(s"Table $tableName already exists. 
You can set SaveMode " +
+  "to SaveMode.Append to insert data into the table or set 
SaveMode to " +
+  "SaveMode.Overwrite to overwrite the existing data.")
+
+  case SaveMode.Append =>
+if (existingTable.tableType == CatalogTableType.VIEW) {
+  throw new AnalysisException("Saving data into a view is not 
allowed.")
+}
+
+if (existingTable.provider.get == DDLUtils.HIVE_PROVIDER) {
+  throw new AnalysisException(s"Saving data in the Hive serde 
table $tableName is " +
+"not supported yet. Please use the insertInto() API as an 
alternative.")
+}
+
+// Check if the specified data source match the data source of 
the existing table.
+val existingProvider = 
DataSource.lookupDataSource(existingTable.provider.get)
+val specifiedProvider = DataSource.lookupDataSource(source)
+// TODO: Check that options from the resolved relation match 
the relation that we are
+// inserting into (i.e. using the same compression).
+if (existingProvider != specifiedProvider) {
+  throw new AnalysisException(s"The format of the existing 
table $tableName is " +
+s"`${existingProvider.getSimpleName}`. It doesn't match 
the specified format " +
+s"`${specifiedProvider.getSimpleName}`.")
+}
+
+if (df.schema.length != existingTable.schema.length) {
+  throw new AnalysisException(
+s"The column number of the existing table $tableName" +
+  s"(${existingTable.schema.catalogString}) doesn't match 
the data schema" +
+  s"(${df.schema.catalogString})")
+}
+
+val resolver = df.sparkSession.sessionState.conf.resolver
+val tableCols = existingTable.schema.map(_.name)
+
+// As we are inserting into an existing table, we should 
respect the existing schema and
+// adjust the column order of the given dataframe according to 
it, or throw exception
+// if the column names do not match.
+val adjustedColumns = tableCols.map { col =>
+  df.queryExecution.analyzed.resolve(Seq(col), 
resolver).getOrElse {
+val inputColumns = df.schema.map(_.name).mkString(", ")
+throw new AnalysisException(
+  s"cannot resolve '$col' given input columns: 
[$inputColumns]")
+  }
+}
+
+// Check if the specified partition columns match the existing 
table.
+val specifiedPartCols = CatalogUtils.normalizePartCols(
+  tableName, tableCols, 

[GitHub] spark issue #13909: [SPARK-16213][SQL] Reduce runtime overhead of a program ...

2016-12-20 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/13909
  
Sorry for the delay. Yeah, it looks like we can't reuse the byte array of unsafe 
data in expressions, because it may get cached unexpectedly and lead to wrong results.

I'm a little concerned about the hacks in `BufferHolder` and the array 
writer. The code is tightly coupled with the unsafe row writer, and we have to hack it so 
that we can write an unsafe array directly. What if we actually write an unsafe 
row with a single array field and return the array column? Then we don't need 
the hacks, but we waste some bits on the row format overhead, which seems 
acceptable.





[GitHub] spark issue #16351: [SPARK-18943][SQL] Avoid per-record type dispatch in CSV...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16351
  
**[Test build #70463 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70463/testReport)**
 for PR 16351 at commit 
[`192bc6e`](https://github.com/apache/spark/commit/192bc6e59c3b9e7f5c782d3c9059e67d0e4550ec).





[GitHub] spark issue #16351: [SPARK-18943][SQL] Avoid per-record type dispatch in CSV...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16351
  
**[Test build #70462 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70462/testReport)**
 for PR 16351 at commit 
[`c75bd05`](https://github.com/apache/spark/commit/c75bd050925dac6efc3276f4aafef00135778f88).





[GitHub] spark issue #16337: [SPARK-18871][SQL] New test cases for IN/NOT IN subquery

2016-12-20 Thread kevinyu98
Github user kevinyu98 commented on the issue:

https://github.com/apache/spark/pull/16337
  
Nat will run these against DB2 and provide the results.





[GitHub] spark issue #16351: [SPARK-18943][SQL] Avoid per-record type dispatch in CSV...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16351
  
**[Test build #70461 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70461/testReport)**
 for PR 16351 at commit 
[`3289726`](https://github.com/apache/spark/commit/3289726ffbf4ffbbda36d935f37a0dfcc946b20e).





[GitHub] spark issue #16369: [SPARK-18956][SQL][PySpark] Reuse existing SparkSession ...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16369
  
**[Test build #70460 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70460/testReport)**
 for PR 16369 at commit 
[`0e96618`](https://github.com/apache/spark/commit/0e96618a9e6530cf6e43204dc7f80965bc759cae).





[GitHub] spark issue #16337: [SPARK-18871][SQL] New test cases for IN/NOT IN subquery

2016-12-20 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/16337
  
What are the reference query results we can compare against? For example, from DB2 
or Hive?





[GitHub] spark issue #16356: [SPARK-18949] [SQL] Add recoverPartitions API to Catalog

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16356
  
**[Test build #3511 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3511/testReport)**
 for PR 16356 at commit 
[`451ab05`](https://github.com/apache/spark/commit/451ab0598d59bb5df9a222df931b1be127c3082a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16371: [SPARK-18932][SQL] Support partial aggregation for colle...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16371
  
**[Test build #70459 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70459/testReport)**
 for PR 16371 at commit 
[`6e8aa82`](https://github.com/apache/spark/commit/6e8aa82d95e52c6c469aaa4a8e1cfc0105576e69).





[GitHub] spark pull request #16371: [SPARK-18932][SQL] Support partial aggregation fo...

2016-12-20 Thread viirya
GitHub user viirya opened a pull request:

https://github.com/apache/spark/pull/16371

[SPARK-18932][SQL] Support partial aggregation for collect_set/collect_list

## What changes were proposed in this pull request?

Currently, the collect_set/collect_list aggregation expressions don't support 
partial aggregation. This patch enables partial aggregation for them.
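
As a rough illustration of what partial aggregation means for a collect-style function, the sketch below uses plain Scala collections rather than Spark. All names (`PartialCollectSketch`, `update`, `merge`, `collectSet`) are hypothetical and are not taken from the patch: each partition first builds its own partial set, and the partial sets are merged afterwards.

```scala
object PartialCollectSketch {
  // Fold one input value into a partition-local buffer.
  def update[T](buffer: Set[T], value: T): Set[T] = buffer + value

  // Combine two partition-local buffers into one.
  def merge[T](left: Set[T], right: Set[T]): Set[T] = left ++ right

  // Partial step per partition, followed by a final merge of the partial results.
  def collectSet[T](partitions: Seq[Seq[T]]): Set[T] = {
    val partials = partitions.map { rows =>
      rows.foldLeft(Set.empty[T])((buf, v) => update(buf, v))
    }
    partials.foldLeft(Set.empty[T])((l, r) => merge(l, r))
  }
}
```

The point of the two-step shape is that only the already-deduplicated partial buffers need to be combined in the final step, instead of every input row.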

## How was this patch tested?

N/A

Please review http://spark.apache.org/contributing.html before opening a 
pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/viirya/spark-1 collect-partial-support

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/16371.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16371


commit 6e8aa82d95e52c6c469aaa4a8e1cfc0105576e69
Author: Liang-Chi Hsieh 
Date:   2016-12-21T06:53:39Z

Support partial mode for collect.







[GitHub] spark issue #12775: [SPARK-14958][Core] Failed task not handled when there's...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/12775
  
**[Test build #70458 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70458/testReport)**
 for PR 12775 at commit 
[`9778cef`](https://github.com/apache/spark/commit/9778cefce3e152d559e53cd4e2f5a113e561f0ff).





[GitHub] spark issue #16351: [SPARK-18943][SQL] Avoid per-record type dispatch in CSV...

2016-12-20 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/16351
  
Thank you @cloud-fan, will add some in the PR description soon after 
cleaning up.





[GitHub] spark pull request #16352: [SPARK-18947][SQL] SQLContext.tableNames should n...

2016-12-20 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/16352#discussion_r93385438
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/api/r/SQLUtils.scala 
---
@@ -276,11 +276,12 @@ private[sql] object SQLUtils extends Logging {
   }
 
   def getTableNames(sparkSession: SparkSession, databaseName: String): 
Array[String] = {
-databaseName match {
-  case n: String if n != null && n.trim.nonEmpty =>
-sparkSession.catalog.listTables(n).collect().map(_.name)
+val db = databaseName match {
+  case _ if databaseName != null && databaseName.trim.nonEmpty =>
+databaseName.trim
--- End diff --

uh... not sure whether we should support trimming. So far, when we do 
something like
```Scala
session.tableNames("default ")
```

It reports the error:
```
Database 'default ' not found;
```





[GitHub] spark issue #12775: [SPARK-14958][Core] Failed task not handled when there's...

2016-12-20 Thread kayousterhout
Github user kayousterhout commented on the issue:

https://github.com/apache/spark/pull/12775
  
Jenkins, retest this please





[GitHub] spark issue #16337: [SPARK-18871][SQL] New test cases for IN/NOT IN subquery

2016-12-20 Thread kevinyu98
Github user kevinyu98 commented on the issue:

https://github.com/apache/spark/pull/16337
  
Hello all,
I have divided the test cases into small groups based on the discussion. This 
PR will be the first PR for the IN subquery; it covers the simple and group-by 
cases. 

These are the run times on my local MacBook:

$ build/sbt "~sql/test-only *SQLQueryTestSuite -- -z in-group-by.sql"
[info] Run completed in 23 seconds, 876 milliseconds.
[info] Total number of tests run: 1
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 1, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.

$ build/sbt "~sql/test-only *SQLQueryTestSuite -- -z simple-in.sql"
[info] Run completed in 9 seconds, 986 milliseconds.
[info] Total number of tests run: 1
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 1, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
 







[GitHub] spark pull request #16367: [SPARK-18903][SPARKR] Add API to get SparkUI URL

2016-12-20 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/16367#discussion_r93384251
  
--- Diff: R/pkg/R/sparkR.R ---
@@ -410,6 +410,30 @@ sparkR.session <- function(
   sparkSession
 }
 
+#' Get the URL of the SparkUI instance for the current active SparkSession
+#'
+#' Get the URL of the SparkUI instance for the current active SparkSession.
--- End diff --

Actually, no: the first is the title, and the second (after the empty line) is 
the description. We have that in some of our docs where there isn't really any 
more to say ;)





[GitHub] spark issue #16337: [SPARK-18871][SQL] New test cases for IN/NOT IN subquery

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16337
  
**[Test build #70457 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70457/testReport)**
 for PR 16337 at commit 
[`9c584fb`](https://github.com/apache/spark/commit/9c584fb2c1bd99cdf4c0f5a222bc7aec4b003227).





[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2016-12-20 Thread wangmiao1981
Github user wangmiao1981 commented on the issue:

https://github.com/apache/spark/pull/16355
  
@imatiach-msft Can you add a test case?





[GitHub] spark issue #16355: [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm faili...

2016-12-20 Thread wangmiao1981
Github user wangmiao1981 commented on the issue:

https://github.com/apache/spark/pull/16355
  
Jenkins, test this please.





[GitHub] spark issue #16370: [SPARK-18960][SQL][SS] Avoid double reading file which i...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16370
  
**[Test build #70456 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70456/testReport)**
 for PR 16370 at commit 
[`1d248c3`](https://github.com/apache/spark/commit/1d248c30bb6872494b82fe16a584b9b801058c58).





[GitHub] spark pull request #16367: [SPARK-18903][SPARKR] Add API to get SparkUI URL

2016-12-20 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request:

https://github.com/apache/spark/pull/16367#discussion_r93381426
  
--- Diff: R/pkg/R/sparkR.R ---
@@ -410,6 +410,30 @@ sparkR.session <- function(
   sparkSession
 }
 
+#' Get the URL of the SparkUI instance for the current active SparkSession
+#'
+#' Get the URL of the SparkUI instance for the current active SparkSession.
--- End diff --

Duplicate line?





[GitHub] spark issue #16370: [SPARK-18960][SQL][SS] Avoid double reading file which i...

2016-12-20 Thread uncleGen
Github user uncleGen commented on the issue:

https://github.com/apache/spark/pull/16370
  
cc @zsxwing 





[GitHub] spark pull request #16370: [SPARK-18960][SQL][SS] Avoid double reading file ...

2016-12-20 Thread uncleGen
GitHub user uncleGen opened a pull request:

https://github.com/apache/spark/pull/16370

[SPARK-18960][SQL][SS] Avoid double reading file which is being copied.

## What changes were proposed in this pull request?

In HDFS, when we copy a file into the target directory, there will be a temporary 
`._COPY_` file for a period of time. The duration depends on the file size. If we 
do not skip this file, we may read the same data twice.
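
As a rough illustration of the skip described above, the sketch below filters out paths whose names end in a temporary-copy suffix before they are treated as new input. The object name, the helper names, and the default suffix are assumptions made for this example (the description writes the marker as `._COPY_`; the sketch takes the suffix as a parameter rather than fixing it). This is not the code from the patch.

```scala
object CopyingFileFilterSketch {
  // Assumed default suffix for files that are still being copied; it is a
  // parameter below so it can be set to whatever the file system really uses.
  val DefaultTempSuffix = "._COPYING_"

  // True when the last path component looks like an in-progress copy.
  def isBeingCopied(path: String, suffix: String = DefaultTempSuffix): Boolean =
    path.substring(path.lastIndexOf('/') + 1).endsWith(suffix)

  // Keep only the paths that are safe to read.
  def readablePaths(paths: Seq[String], suffix: String = DefaultTempSuffix): Seq[String] =
    paths.filterNot(isBeingCopied(_, suffix))
}
```

With the assumed default, `readablePaths(Seq("/in/a.txt", "/in/b.txt._COPYING_"))` keeps only `/in/a.txt`, so the half-copied file is not read once as a temporary file and again under its final name.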

## How was this patch tested?
Updated the unit test.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/uncleGen/spark SPARK-18960

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/16370.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16370


commit 1d248c30bb6872494b82fe16a584b9b801058c58
Author: uncleGen 
Date:   2016-12-21T03:36:04Z

cp







[GitHub] spark issue #16351: [SPARK-18943][SQL] Avoid per-record type dispatch in CSV...

2016-12-20 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/16351
  
Mostly LGTM. Do you have any performance numbers for this optimization?
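
For readers unfamiliar with the idea in the PR title, here is a minimal sketch of avoiding per-record type dispatch: the match on the column type runs once per column when a converter function is built, and each record then only applies the prebuilt functions. `FieldType`, its cases, and `convertRow` are hypothetical simplifications for this example, not Spark's types, and the sketch makes no claim about the actual speedup.

```scala
object ConverterSketch {
  // Hypothetical, simplified column types (not Spark's DataType hierarchy).
  sealed trait FieldType
  case object IntField extends FieldType
  case object DoubleField extends FieldType
  case object StringField extends FieldType

  type ValueConverter = String => Any

  // The type match happens here, once per column.
  def makeConverter(t: FieldType): ValueConverter = t match {
    case IntField => (s: String) => s.toInt
    case DoubleField => (s: String) => s.toDouble
    case StringField => (s: String) => s
  }

  def makeConverters(schema: Seq[FieldType]): Array[ValueConverter] =
    schema.map(makeConverter).toArray

  // Per record, only the prebuilt functions are applied; no type match.
  def convertRow(converters: Array[ValueConverter], tokens: Array[String]): Array[Any] = {
    val out = new Array[Any](tokens.length)
    var i = 0
    while (i < tokens.length) {
      out(i) = converters(i)(tokens(i))
      i += 1
    }
    out
  }
}
```

A caller would build the converters once with `makeConverters(schema)` and reuse the resulting array for every row.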





[GitHub] spark pull request #16351: [SPARK-18943][SQL] Avoid per-record type dispatch...

2016-12-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/16351#discussion_r93381128
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVInferSchema.scala ---
@@ -215,84 +215,133 @@ private[csv] object CSVInferSchema {
 }
 
 private[csv] object CSVTypeCast {
+  // A `ValueConverter` is responsible for converting the given value to a desired type.
+  private type ValueConverter = String => Any
 
   /**
-   * Casts given string datum to specified type.
-   * Currently we do not support complex types (ArrayType, MapType, StructType).
+   * Create converters which cast each given string datum to each specified type in given schema.
+   * Currently, we do not support complex types (`ArrayType`, `MapType`, `StructType`).
    *
-   * For string types, this is simply the datum. For other types.
+   * For string types, this is simply the datum.
+   * For other types, this is converted into the value according to the type.
    * For other nullable types, returns null if it is null or equals to the value specified
    * in `nullValue` option.
    *
-   * @param datum string value
-   * @param name field name in schema.
-   * @param castType data type to cast `datum` into.
-   * @param nullable nullability for the field.
+   * @param schema schema that contains data types to cast the given value into.
    * @param options CSV options.
    */
-  def castTo(
+  def makeConverters(
+  schema: StructType,
+  options: CSVOptions = CSVOptions()): Array[ValueConverter] = {
+schema.map(f => makeConverter(f.name, f.dataType, f.nullable, options)).toArray
+  }
+
+  /**
+   * Create a converter which converts the string value to a value according to a desired type.
+   */
+  def makeConverter(
+   name: String,
+   dataType: DataType,
+   nullable: Boolean = true,
+   options: CSVOptions = CSVOptions()): ValueConverter = dataType match {
+case _: ByteType => (d: String) =>
+  nullSafeDatum(d, name, nullable, options) { case datum =>
--- End diff --

nit: nullSafeDatum(d, name, nullable, options)(_.toByte)
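
The nit above replaces a pattern-matching block with a placeholder lambda. A small sketch of why the two forms are interchangeable follows; the `nullSafeDatum` here is a hypothetical, simplified stand-in with a reduced parameter list, since the real helper's body is not shown in this thread.

```scala
object NullSafeSketch {
  // Hypothetical stand-in: returns null for a null or null-marker datum in a
  // nullable field, and otherwise applies the supplied converter.
  def nullSafeDatum(datum: String, nullable: Boolean, nullValue: String)(
      converter: String => Any): Any = {
    if (datum == null || datum == nullValue) {
      if (!nullable) {
        throw new RuntimeException("null value found in a non-nullable field")
      }
      null
    } else {
      converter(datum)
    }
  }

  // Form used in the diff: a one-case pattern-matching function.
  val verbose: String => Any = (d: String) =>
    nullSafeDatum(d, nullable = true, nullValue = "") { case datum => datum.toByte }

  // Form suggested in the review: a placeholder lambda.
  val concise: String => Any = (d: String) =>
    nullSafeDatum(d, nullable = true, nullValue = "")(_.toByte)
}
```

Both values have type `String => Any` and behave identically; the second simply avoids naming the argument.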





[GitHub] spark pull request #16351: [SPARK-18943][SQL] Avoid per-record type dispatch...

2016-12-20 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/16351#discussion_r93381049
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVInferSchema.scala ---
@@ -215,84 +215,133 @@ private[csv] object CSVInferSchema {
 }
 
 private[csv] object CSVTypeCast {
+  // A `ValueConverter` is responsible for converting the given value to a desired type.
+  private type ValueConverter = String => Any
 
   /**
-   * Casts given string datum to specified type.
-   * Currently we do not support complex types (ArrayType, MapType, StructType).
+   * Create converters which cast each given string datum to each specified type in given schema.
+   * Currently, we do not support complex types (`ArrayType`, `MapType`, `StructType`).
    *
-   * For string types, this is simply the datum. For other types.
+   * For string types, this is simply the datum.
+   * For other types, this is converted into the value according to the type.
    * For other nullable types, returns null if it is null or equals to the value specified
    * in `nullValue` option.
    *
-   * @param datum string value
-   * @param name field name in schema.
-   * @param castType data type to cast `datum` into.
-   * @param nullable nullability for the field.
+   * @param schema schema that contains data types to cast the given value into.
    * @param options CSV options.
    */
-  def castTo(
+  def makeConverters(
+  schema: StructType,
+  options: CSVOptions = CSVOptions()): Array[ValueConverter] = {
+schema.map(f => makeConverter(f.name, f.dataType, f.nullable, options)).toArray
+  }
+
+  /**
+   * Create a converter which converts the string value to a value according to a desired type.
+   */
+  def makeConverter(
+   name: String,
+   dataType: DataType,
+   nullable: Boolean = true,
+   options: CSVOptions = CSVOptions()): ValueConverter = dataType match {
+case _: ByteType => (d: String) =>
+  nullSafeDatum(d, name, nullable, options) { case datum =>
+datum.toByte
+  }
+
+case _: ShortType => (d: String) =>
+  nullSafeDatum(d, name, nullable, options) { case datum =>
+datum.toShort
+  }
+
+case _: IntegerType => (d: String) =>
+  nullSafeDatum(d, name, nullable, options) { case datum =>
+datum.toInt
+  }
+
+case _: LongType => (d: String) =>
+  nullSafeDatum(d, name, nullable, options) { case datum =>
+datum.toLong
+  }
+
+case _: FloatType => (d: String) =>
+  nullSafeDatum(d, name, nullable, options) {
+case options.nanValue => Float.NaN
+case options.negativeInf => Float.NegativeInfinity
+case options.positiveInf => Float.PositiveInfinity
+case datum =>
+  Try(datum.toFloat)
+    .getOrElse(NumberFormat.getInstance(Locale.US).parse(datum).floatValue())
+  }
+
+case _: DoubleType => (d: String) =>
+  nullSafeDatum(d, name, nullable, options) {
+case options.nanValue => Double.NaN
+case options.negativeInf => Double.NegativeInfinity
+case options.positiveInf => Double.PositiveInfinity
+case datum =>
+  Try(datum.toDouble)
+    .getOrElse(NumberFormat.getInstance(Locale.US).parse(datum).doubleValue())
+  }
+
+case _: BooleanType => (d: String) =>
+  nullSafeDatum(d, name, nullable, options) { case datum =>
+datum.toBoolean
+  }
+
+case dt: DecimalType => (d: String) =>
+  nullSafeDatum(d, name, nullable, options) { case datum =>
+val value = new BigDecimal(datum.replaceAll(",", ""))
+Decimal(value, dt.precision, dt.scale)
+  }
+
+case _: TimestampType => (d: String) =>
+  nullSafeDatum(d, name, nullable, options) { case datum =>
+// This one will lose microseconds parts.
+// See https://issues.apache.org/jira/browse/SPARK-10681.
+Try(options.timestampFormat.parse(datum).getTime * 1000L)
+  .getOrElse {
+  // If it fails to parse, then tries the way used in 2.0 and 1.x for backwards
+  // compatibility.
+  DateTimeUtils.stringToTime(datum).getTime * 1000L
+}
+  }
+
+case _: DateType => (d: String) =>
+  nullSafeDatum(d, name, nullable, options) { case datum =>
+// This one will lose microseconds parts.
+// See https://issues.apache.org/jira/browse/SPARK-10681.x
+Try(DateTimeUtils.millisToDays(options.dateFormat.parse(datum).getTime))
+

[GitHub] spark issue #16240: [SPARK-16792][SQL] Dataset containing a Case Class with ...

2016-12-20 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/16240
  
How about we assign priorities to the implicit rules, as in 
http://stackoverflow.com/questions/1886953/is-there-a-way-to-control-which-implicit-conversion-will-be-the-default-used
 ?

I think we should prefer the `Seq` encoder over the `Product` encoder for 
`Seq with Product`.
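
One common way to assign such priorities in Scala is the low-priority-implicits pattern: implicits inherited from a parent trait lose to those defined directly in the inheriting object when both apply. The sketch below illustrates this with a hypothetical `Enc` type class; the names are invented for the example and are not Spark's `Encoder` API or the change made in this PR.

```scala
object ImplicitPrioritySketch {
  // Hypothetical stand-in for an encoder type class.
  trait Enc[T] { def name: String }

  // Inherited implicits are less preferred when both candidates are eligible.
  trait LowPriorityEncoders {
    implicit def productEnc[T <: Product]: Enc[T] =
      new Enc[T] { val name = "product" }
  }

  // Defined in the derived object, so it is preferred over productEnc.
  object Encoders extends LowPriorityEncoders {
    implicit def seqEnc[T <: Seq[_]]: Enc[T] =
      new Enc[T] { val name = "seq" }
  }

  final case class Point(x: Int, y: Int)

  // Reports which encoder the compiler selected for the argument's type.
  def encoderFor[T](value: T)(implicit enc: Enc[T]): String = enc.name
}
```

After `import ImplicitPrioritySketch.Encoders._`, `encoderFor(List(1, 2))` resolves to the `Seq` encoder, while a plain case class such as `Point(1, 2)` still falls back to the `Product` encoder.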





[GitHub] spark issue #12775: [SPARK-14958][Core] Failed task not handled when there's...

2016-12-20 Thread lirui-intel
Github user lirui-intel commented on the issue:

https://github.com/apache/spark/pull/12775
  
I don't think the failure is related, and it can't be reproduced locally.





[GitHub] spark issue #15212: [SPARK-17645][MLLIB][ML]add feature selector method base...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15212
  
Merged build finished. Test PASSed.





[GitHub] spark issue #15212: [SPARK-17645][MLLIB][ML]add feature selector method base...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15212
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70454/
Test PASSed.





[GitHub] spark issue #15212: [SPARK-17645][MLLIB][ML]add feature selector method base...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15212
  
**[Test build #70454 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70454/testReport)**
 for PR 15212 at commit 
[`83a429e`](https://github.com/apache/spark/commit/83a429e9907aac389d45aa1b6a23f432216e0382).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16323: [SPARK-18911] [SQL] Define CatalogStatistics to interact...

2016-12-20 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/16323
  
SGTM





[GitHub] spark issue #15996: [SPARK-18567][SQL] Simplify CreateDataSourceTableAsSelec...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15996
  
**[Test build #70455 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70455/testReport)**
 for PR 15996 at commit 
[`7481150`](https://github.com/apache/spark/commit/748115047175420c842b6743ab33489882f18104).





[GitHub] spark issue #16282: [SPARK-18588][SS][Kafka]Create a new KafkaConsumer when ...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16282
  
Merged build finished. Test FAILed.





[GitHub] spark issue #16282: [SPARK-18588][SS][Kafka]Create a new KafkaConsumer when ...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16282
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70444/
Test FAILed.





[GitHub] spark issue #16282: [SPARK-18588][SS][Kafka]Create a new KafkaConsumer when ...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16282
  
**[Test build #70444 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70444/testReport)**
 for PR 16282 at commit 
[`9080acd`](https://github.com/apache/spark/commit/9080acd43f7568ba1b084ee892144a92f4cfa376).
 * This patch **fails from timeout after a configured wait of \`250m\`**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16350: [SPARK-18700][SQL][BACKPORT-2.0] Add StripedLock for eac...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16350
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70452/
Test PASSed.





[GitHub] spark issue #16350: [SPARK-18700][SQL][BACKPORT-2.0] Add StripedLock for eac...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16350
  
Merged build finished. Test PASSed.





[GitHub] spark issue #16350: [SPARK-18700][SQL][BACKPORT-2.0] Add StripedLock for eac...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16350
  
**[Test build #70452 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70452/consoleFull)**
 for PR 16350 at commit 
[`8dd0169`](https://github.com/apache/spark/commit/8dd01693c5fca8a724fe0e9f1ada0f7bdaf1f5f6).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16360: [SPARK-18234][SS] Made update mode public

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16360
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70450/
Test PASSed.





[GitHub] spark issue #16360: [SPARK-18234][SS] Made update mode public

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16360
  
Merged build finished. Test PASSed.





[GitHub] spark issue #16360: [SPARK-18234][SS] Made update mode public

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16360
  
**[Test build #70450 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70450/testReport)**
 for PR 16360 at commit 
[`628c6c2`](https://github.com/apache/spark/commit/628c6c2e801b8cce6a47608c41068a0a085698ed).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16369: [SPARK-18956][SQL][PySpark] Reuse existing SparkSession ...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16369
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70453/
Test FAILed.





[GitHub] spark issue #16369: [SPARK-18956][SQL][PySpark] Reuse existing SparkSession ...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16369
  
Merged build finished. Test FAILed.





[GitHub] spark issue #16369: [SPARK-18956][SQL][PySpark] Reuse existing SparkSession ...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16369
  
**[Test build #70453 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70453/testReport)**
 for PR 16369 at commit 
[`651ce53`](https://github.com/apache/spark/commit/651ce532423a728ec2a995c9f4149b71d2c0203c).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16323: [SPARK-18911] [SQL] Define CatalogStatistics to interact...

2016-12-20 Thread wzhfy
Github user wzhfy commented on the issue:

https://github.com/apache/spark/pull/16323
  
Since adding a switch for CBO is not trivial, I want to do it in a 
separate PR and let this one deal only with decoupling Statistics from 
CatalogTable. Do you agree? @cloud-fan 





[GitHub] spark pull request #16323: [SPARK-18911] [SQL] Define CatalogStatistics to i...

2016-12-20 Thread wzhfy
Github user wzhfy commented on a diff in the pull request:

https://github.com/apache/spark/pull/16323#discussion_r93376249
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala
 ---
@@ -198,6 +200,10 @@ case class CatalogTable(
   locationUri, inputFormat, outputFormat, serde, compressed, 
properties))
   }
 
+  def withStats(cboStatsEnabled: Boolean): CatalogTable = {
--- End diff --

Thanks. I think the first one is better; the second one will lead to many 
if-else branches on the caller side.





[GitHub] spark issue #15212: [SPARK-17645][MLLIB][ML]add feature selector method base...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15212
  
**[Test build #70454 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70454/testReport)**
 for PR 15212 at commit 
[`83a429e`](https://github.com/apache/spark/commit/83a429e9907aac389d45aa1b6a23f432216e0382).





[GitHub] spark issue #16304: [SPARK-18894][SS] Fix event time watermark delay thresho...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16304
  
Merged build finished. Test PASSed.





[GitHub] spark issue #16304: [SPARK-18894][SS] Fix event time watermark delay thresho...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16304
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70449/
Test PASSed.





[GitHub] spark issue #16304: [SPARK-18894][SS] Fix event time watermark delay thresho...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16304
  
**[Test build #70449 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70449/testReport)**
 for PR 16304 at commit 
[`29f0037`](https://github.com/apache/spark/commit/29f0037631399bf2226ff3c2e630e4927e177eb4).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16356: [SPARK-18949] [SQL] Add recoverPartitions API to Catalog

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16356
  
**[Test build #3511 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3511/testReport)**
 for PR 16356 at commit 
[`451ab05`](https://github.com/apache/spark/commit/451ab0598d59bb5df9a222df931b1be127c3082a).





[GitHub] spark issue #16296: [SPARK-18885][SQL] unify CREATE TABLE syntax for data so...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16296
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70447/
Test PASSed.





[GitHub] spark issue #16296: [SPARK-18885][SQL] unify CREATE TABLE syntax for data so...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16296
  
Merged build finished. Test PASSed.





[GitHub] spark issue #16296: [SPARK-18885][SQL] unify CREATE TABLE syntax for data so...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16296
  
**[Test build #70447 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70447/testReport)**
 for PR 16296 at commit 
[`7b5f226`](https://github.com/apache/spark/commit/7b5f226b94f3e6b830f3d92b33599dd3f57f8dbd).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class DetermineHiveSerde(conf: SQLConf) extends Rule[LogicalPlan] `





[GitHub] spark issue #16369: [SPARK-18956][SQL][PySpark] Reuse existing SparkSession ...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16369
  
**[Test build #70453 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70453/testReport)**
 for PR 16369 at commit 
[`651ce53`](https://github.com/apache/spark/commit/651ce532423a728ec2a995c9f4149b71d2c0203c).





[GitHub] spark pull request #16369: [SPARK-18956][SQL][PySpark] Reuse existing SparkS...

2016-12-20 Thread viirya
GitHub user viirya opened a pull request:

https://github.com/apache/spark/pull/16369

[SPARK-18956][SQL][PySpark] Reuse existing SparkSession while creating new 
SQLContext instances

## What changes were proposed in this pull request?

Reuse the existing SparkSession when creating new SQLContext instances in 
PySpark.
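
The reuse pattern described above can be sketched roughly as follows. This is only an illustration of the idea, written in Scala rather than PySpark: `ToySession`, `ToyContext`, and `ReuseDemo` are hypothetical stand-ins, not Spark classes, and the actual patch may differ.

```scala
// Toy stand-ins for illustration only; these are NOT Spark classes.
final class ToySession(val id: Int)

object ToySession {
  private var active: Option[ToySession] = None
  private var nextId = 0

  // Return the session that already exists, or create and remember one.
  def getOrCreate(): ToySession = synchronized {
    active.getOrElse {
      nextId += 1
      val created = new ToySession(nextId)
      active = Some(created)
      created
    }
  }
}

// A context that reuses the shared session unless one is passed explicitly.
final class ToyContext(val session: ToySession = ToySession.getOrCreate())

object ReuseDemo {
  def main(args: Array[String]): Unit = {
    val first = new ToyContext()
    val second = new ToyContext()
    // Both contexts share one session instance.
    println(first.session eq second.session) // prints true
  }
}
```

The point of the default argument is that callers who do not care about the session get the existing one instead of silently creating a second.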

## How was this patch tested?

N/A

Please review http://spark.apache.org/contributing.html before opening a 
pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/viirya/spark-1 reuse-sparksession-pyspark

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/16369.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16369


commit 651ce532423a728ec2a995c9f4149b71d2c0203c
Author: Liang-Chi Hsieh 
Date:   2016-12-21T04:40:53Z

Reuse existing SparkSession while creating new SQLContext instances.







[GitHub] spark issue #16359: [SPARK-18951] Upgrade com.thoughtworks.paranamer/paranam...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16359
  
Merged build finished. Test PASSed.





[GitHub] spark issue #16359: [SPARK-18951] Upgrade com.thoughtworks.paranamer/paranam...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16359
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70445/
Test PASSed.





[GitHub] spark issue #16359: [SPARK-18951] Upgrade com.thoughtworks.paranamer/paranam...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16359
  
**[Test build #70445 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70445/testReport)**
 for PR 16359 at commit 
[`c502aeb`](https://github.com/apache/spark/commit/c502aeb123641f634c107c0ad8c0f1986fea8ee1).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16352: [SPARK-18947][SQL] SQLContext.tableNames should not call...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16352
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70446/
Test PASSed.





[GitHub] spark issue #16352: [SPARK-18947][SQL] SQLContext.tableNames should not call...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16352
  
Merged build finished. Test PASSed.





[GitHub] spark issue #16352: [SPARK-18947][SQL] SQLContext.tableNames should not call...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16352
  
**[Test build #70446 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70446/testReport)**
 for PR 16352 at commit 
[`1f69b38`](https://github.com/apache/spark/commit/1f69b381f0a916e98200cad596dfea534ec08ffc).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16366: [SPARK-18953][CORE][WEB UI] Do now show the link to a de...

2016-12-20 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/16366
  
We also show worker info in `driverRow`. Although it doesn't show worker 
state, I am wondering if we can also check worker state, and disable the link 
and add a suffix like `(DEAD)` if a worker is dead?


https://github.com/apache/spark/blob/39e2bad6a866d27c3ca594d15e574a1da3ee84cc/core/src/main/scala/org/apache/spark/deploy/master/ui/MasterPage.scala#L249
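
A minimal sketch of the suggestion, assuming a simplified worker record and plain strings in place of the XML nodes that `MasterPage` builds. `WorkerLinkSketch`, `Worker`, and `workerCell` are hypothetical names, and the `WorkerState` enumeration here is a local stand-in rather than Spark's own:

```scala
object WorkerLinkSketch {
  // Local stand-in for the worker states discussed in this thread.
  object WorkerState extends Enumeration {
    val ALIVE, DEAD, DECOMMISSIONED, UNKNOWN = Value
  }

  // Hypothetical, trimmed-down worker record.
  final case class Worker(id: String, webUiAddress: String, state: WorkerState.Value)

  // Render the worker cell: plain text with a "(DEAD)" suffix for a dead
  // worker, a link to the worker UI otherwise.
  def workerCell(w: Worker): String = w.state match {
    case WorkerState.DEAD => s"${w.id} (DEAD)"
    case _                => s"""<a href="${w.webUiAddress}">${w.id}</a>"""
  }
}
```

Whether other non-alive states should also lose the link is left open here; the match would only need another case.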





[GitHub] spark issue #16350: [SPARK-18700][SQL][BACKPORT-2.0] Add StripedLock for eac...

2016-12-20 Thread xuanyuanking
Github user xuanyuanking commented on the issue:

https://github.com/apache/spark/pull/16350
  
Done: deleted the UT and the metrics. :)





[GitHub] spark issue #16350: [SPARK-18700][SQL][BACKPORT-2.0] Add StripedLock for eac...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16350
  
**[Test build #70452 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70452/consoleFull)**
 for PR 16350 at commit 
[`8dd0169`](https://github.com/apache/spark/commit/8dd01693c5fca8a724fe0e9f1ada0f7bdaf1f5f6).





[GitHub] spark issue #16366: [SPARK-18953][CORE][WEB UI] Do now show the link to a de...

2016-12-20 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/16366
  
One question: should we do the same thing for `WorkerState.DECOMMISSIONED` 
and `WorkerState.UNKNOWN`?





[GitHub] spark issue #16282: [SPARK-18588][SS][Kafka]Create a new KafkaConsumer when ...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16282
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70439/
Test FAILed.





[GitHub] spark issue #16282: [SPARK-18588][SS][Kafka]Create a new KafkaConsumer when ...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16282
  
Merged build finished. Test FAILed.





[GitHub] spark issue #16282: [SPARK-18588][SS][Kafka]Create a new KafkaConsumer when ...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16282
  
**[Test build #70439 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70439/testReport)**
 for PR 16282 at commit 
[`7c789e8`](https://github.com/apache/spark/commit/7c789e80255fbcb400eb2f62c959fed7ccb93455).
 * This patch **fails from timeout after a configured wait of \`250m\`**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #15018: [SPARK-17455][MLlib] Improve PAVA implementation in Isot...

2016-12-20 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/15018
  
@neggert I am fine with throwing an error.





[GitHub] spark issue #16368: [SPARK-18958][SPARKR] R API toJSON on DataFrame

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16368
  
Merged build finished. Test PASSed.





[GitHub] spark issue #16368: [SPARK-18958][SPARKR] R API toJSON on DataFrame

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16368
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70451/
Test PASSed.





[GitHub] spark issue #16368: [SPARK-18958][SPARKR] R API toJSON on DataFrame

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16368
  
**[Test build #70451 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70451/testReport)**
 for PR 16368 at commit 
[`e2031ea`](https://github.com/apache/spark/commit/e2031ea36cc46c46fd3c0a20d8708eb313c78a28).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request #16323: [SPARK-18911] [SQL] Define CatalogStatistics to i...

2016-12-20 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/16323#discussion_r93370436
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala
 ---
@@ -198,6 +200,10 @@ case class CatalogTable(
   locationUri, inputFormat, outputFormat, serde, compressed, 
properties))
   }
 
+  def withStats(cboStatsEnabled: Boolean): CatalogTable = {
--- End diff --

I can think of two approaches:

1.

We can keep the current naive version of `statistics` and add a new 
`statistics` function that takes the conf.

The default implementation of the new `statistics` function simply returns 
the naive version of `statistics`.

In `Join` or `Aggregate`, we can include more complex logic in the new 
`statistics` to return either the naive calculation or an estimation-based 
result.

The caller always calls the new `statistics` function and passes in the 
current conf.

2.

Add a new `statisticsCBO` that doesn't take the conf, because it is called 
only when CBO is enabled. The caller then decides whether to call the non-CBO 
version, `statistics`, or the CBO version, `statisticsCBO`.
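
A rough sketch of the first approach, under loud assumptions: `Conf`, `Statistics`, `Plan`, `Leaf`, and `Join` below are simplified stand-ins rather than the real Catalyst classes, and the "CBO" branch uses a placeholder formula rather than a real estimator.

```scala
object StatsSketch {
  // Hypothetical stand-ins; not the real Catalyst types.
  final case class Conf(cboEnabled: Boolean)
  final case class Statistics(sizeInBytes: BigInt)

  trait Plan {
    // Existing naive estimate.
    def statistics: Statistics
    // New conf-aware overload; by default it falls back to the naive one.
    def statistics(conf: Conf): Statistics = statistics
  }

  final case class Leaf(sizeInBytes: BigInt) extends Plan {
    def statistics: Statistics = Statistics(sizeInBytes)
  }

  final case class Join(left: Plan, right: Plan) extends Plan {
    // Naive estimate: product of the children's sizes.
    def statistics: Statistics =
      Statistics(left.statistics.sizeInBytes * right.statistics.sizeInBytes)

    // Conf-aware version: the branching lives here, not in the caller.
    override def statistics(conf: Conf): Statistics =
      if (conf.cboEnabled) {
        // Placeholder only: a real estimator would use column statistics.
        val l = left.statistics(conf).sizeInBytes
        val r = right.statistics(conf).sizeInBytes
        Statistics(l.max(r))
      } else statistics
  }
}
```

With this shape a caller writes `plan.statistics(conf)` everywhere; under the second approach each call site would need its own `if (conf.cboEnabled) plan.statisticsCBO else plan.statistics`.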







[GitHub] spark issue #16366: [SPARK-18953][CORE][WEB UI] Do now show the link to a de...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16366
  
Merged build finished. Test PASSed.





[GitHub] spark issue #16366: [SPARK-18953][CORE][WEB UI] Do now show the link to a de...

2016-12-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16366
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70442/
Test PASSed.





[GitHub] spark issue #16366: [SPARK-18953][CORE][WEB UI] Do now show the link to a de...

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16366
  
**[Test build #70442 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70442/testReport)**
 for PR 16366 at commit 
[`4e5d5f2`](https://github.com/apache/spark/commit/4e5d5f2ae4b13ec172c0ab81f15d86af8149596d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #16350: [SPARK-18700][SQL][BACKPORT-2.0] Add StripedLock for eac...

2016-12-20 Thread ericl
Github user ericl commented on the issue:

https://github.com/apache/spark/pull/16350
  
yeah, i don't think we need the unit test for 2.0





[GitHub] spark issue #16362: [SPARK-18954][Tests]Fix flaky test: o.a.s.streaming.Basi...

2016-12-20 Thread tdas
Github user tdas commented on the issue:

https://github.com/apache/spark/pull/16362
  
LGTM. Did you run it many times?





[GitHub] spark issue #16343: [FLAKY-TEST] InputStreamsSuite.socket input stream

2016-12-20 Thread tdas
Github user tdas commented on the issue:

https://github.com/apache/spark/pull/16343
  
Why didn't `eventually` with an assert on the size of the collected data work?
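
For readers unfamiliar with the pattern being asked about, here is a minimal sketch. It uses a hand-rolled `eventually` helper as a stand-in for the ScalaTest one so that it runs without dependencies; the helper, the `collected` queue, and the producer thread are all assumptions made for illustration and are not taken from the test suite under discussion.

```scala
import java.util.concurrent.ConcurrentLinkedQueue

object EventuallySketch {
  // Simplified stand-in for ScalaTest's `eventually`: retry the block until
  // its assertion passes or the timeout elapses, then rethrow the failure.
  def eventually[T](timeoutMs: Long, intervalMs: Long)(block: => T): T = {
    val deadline = System.nanoTime() + timeoutMs * 1000000L
    var result: Option[T] = None
    while (result.isEmpty) {
      try result = Some(block)
      catch {
        case e: AssertionError =>
          if (System.nanoTime() >= deadline) throw e
          Thread.sleep(intervalMs)
      }
    }
    result.get
  }

  def main(args: Array[String]): Unit = {
    val collected = new ConcurrentLinkedQueue[String]()
    // Background producer standing in for a stream that delivers data late.
    val producer = new Thread(() => {
      (1 to 3).foreach { i => Thread.sleep(50); collected.add(s"record-$i") }
    })
    producer.start()
    // Poll on the size of the collected data instead of sleeping a fixed time.
    eventually(timeoutMs = 5000, intervalMs = 20) {
      assert(collected.size == 3)
    }
    producer.join()
  }
}
```

Unlike this helper, which retries only on `AssertionError`, the ScalaTest version retries on any exception thrown by the block.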





[GitHub] spark pull request #16314: [SPARK-18900][FLAKY-TEST] StateStoreSuite.mainten...

2016-12-20 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/16314





[GitHub] spark issue #16314: [SPARK-18900][FLAKY-TEST] StateStoreSuite.maintenance

2016-12-20 Thread tdas
Github user tdas commented on the issue:

https://github.com/apache/spark/pull/16314
  
Merging this to master and 2.1





[GitHub] spark issue #16368: [SPARK-18958][SPARKR] R API toJSON on DataFrame

2016-12-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16368
  
**[Test build #70451 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70451/testReport)** for PR 16368 at commit [`e2031ea`](https://github.com/apache/spark/commit/e2031ea36cc46c46fd3c0a20d8708eb313c78a28).





[GitHub] spark pull request #16368: [SPARK-18958][SPARKR] R API toJSON on DataFrame

2016-12-20 Thread felixcheung
GitHub user felixcheung opened a pull request:

https://github.com/apache/spark/pull/16368

[SPARK-18958][SPARKR] R API toJSON on DataFrame

## What changes were proposed in this pull request?

This would make it easier to integrate with other components that expect JSON format.

## How was this patch tested?

manual, unit tests


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/felixcheung/spark rJSON

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/16368.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16368


commit 886efe9b962bb22eed469bb0f853ee280eb06a45
Author: Felix Cheung 
Date:   2016-12-21T01:31:48Z

add toJSON DataFrame API

commit e2031ea36cc46c46fd3c0a20d8708eb313c78a28
Author: Felix Cheung 
Date:   2016-12-21T03:08:46Z

fix test






