[GitHub] spark pull request #23184: [SPARK-26227][R] from_[csv|json] should accept sc...

2018-12-01 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/23184#discussion_r238057238
  
--- Diff: R/pkg/R/functions.R ---
@@ -2254,40 +2255,48 @@ setMethod("date_format", signature(y = "Column", x 
= "character"),
 column(jc)
   })
 
+setClassUnion("characterOrstructTypeOrColumn", c("character", 
"structType", "Column"))
+
 #' @details
 #' \code{from_json}: Parses a column containing a JSON string into a 
Column of \code{structType}
 #' with the specified \code{schema} or array of \code{structType} if 
\code{as.json.array} is set
 #' to \code{TRUE}. If the string is unparseable, the Column will contain 
the value NA.
 #'
 #' @rdname column_collection_functions
 #' @param as.json.array indicating if input string is JSON array of 
objects or a single object.
-#' @aliases from_json from_json,Column,characterOrstructType-method
+#' @aliases from_json from_json,Column,characterOrstructTypeOrColumn-method
 #' @examples
 #'
 #' \dontrun{
 #' df2 <- sql("SELECT named_struct('date', cast('2000-01-01' as date)) as 
d")
 #' df2 <- mutate(df2, d2 = to_json(df2$d, dateFormat = 'dd/MM/'))
 #' schema <- structType(structField("date", "string"))
 #' head(select(df2, from_json(df2$d2, schema, dateFormat = 'dd/MM/')))
-
 #' df2 <- sql("SELECT named_struct('name', 'Bob') as people")
 #' df2 <- mutate(df2, people_json = to_json(df2$people))
 #' schema <- structType(structField("name", "string"))
 #' head(select(df2, from_json(df2$people_json, schema)))
-#' head(select(df2, from_json(df2$people_json, "name STRING")))}
+#' head(select(df2, from_json(df2$people_json, "name STRING")))
+#' head(select(df2, from_json(df2$people_json, 
schema_of_json(head(df2)$people_json}
 #' @note from_json since 2.2.0
-setMethod("from_json", signature(x = "Column", schema = 
"characterOrstructType"),
+setMethod("from_json", signature(x = "Column", schema = 
"characterOrstructTypeOrColumn"),
   function(x, schema, as.json.array = FALSE, ...) {
 if (is.character(schema)) {
-  schema <- structType(schema)
+  jschema <- structType(schema)$jobj
+} else if (class(schema) == "structType") {
+  jschema <- schema$jobj
+} else {
+  jschema <- schema@jc
 }
 
 if (as.json.array) {
-  jschema <- 
callJStatic("org.apache.spark.sql.types.DataTypes",
- "createArrayType",
- schema$jobj)
-} else {
-  jschema <- schema$jobj
+  # This case is R-specifically different. Unlike Scala and 
Python side,
--- End diff --

If so, the provided schema is wrapped in an array type. The test cases are ... here:
https://github.com/apache/spark/pull/23184/files#diff-d4011863c8b176830365b2f224a84bf2R1707


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #23186: [SPARK-26230][SQL]FileIndex: if case sensitive, validate...

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23186
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5607/
Test PASSed.


---




[GitHub] spark issue #23186: [SPARK-26230][SQL]FileIndex: if case sensitive, validate...

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23186
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #23190: [MINOR][SQL]throw SparkOutOfMemoryError intead of SparkE...

2018-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23190
  
**[Test build #99541 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99541/testReport)**
 for PR 23190 at commit 
[`3376524`](https://github.com/apache/spark/commit/33765248d2afdccf4e3cedf96200791ad48ef6be).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #23190: [MINOR][SQL]throw SparkOutOfMemoryError intead of SparkE...

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23190
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99541/
Test FAILed.


---




[GitHub] spark issue #23190: [MINOR][SQL]throw SparkOutOfMemoryError intead of SparkE...

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23190
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #23151: [SPARK-26180][CORE][TEST] Reuse withTempDir function to ...

2018-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23151
  
**[Test build #99533 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99533/testReport)**
 for PR 23151 at commit 
[`6d2f2d3`](https://github.com/apache/spark/commit/6d2f2d36f0e8dcdbe54e82865f7bd1b10dd323bd).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #23189: [SPARK-26235][Core] Change log level for ClassNotFoundEx...

2018-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23189
  
**[Test build #99545 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99545/testReport)**
 for PR 23189 at commit 
[`1662a8b`](https://github.com/apache/spark/commit/1662a8b41564b57eac37e9155eaa51dff24c99dc).


---




[GitHub] spark issue #23189: [SPARK-26235][Core] Change log level for ClassNotFoundEx...

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23189
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5606/
Test PASSed.


---




[GitHub] spark pull request #23184: [SPARK-26227][R] from_[csv|json] should accept sc...

2018-12-01 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/23184#discussion_r238057161
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/api/r/SQLUtils.scala 
---
@@ -225,4 +225,10 @@ private[sql] object SQLUtils extends Logging {
 }
 sparkSession.sessionState.catalog.listTables(db).map(_.table).toArray
   }
+
+  def createArrayType(elementType: DataType): ArrayType = 
DataTypes.createArrayType(elementType)
--- End diff --

Yeah, I remember that. I thought this case was a bit different from the other 
instances, actually. It reduces code complexity on the R side because the R side 
directly calls these overloaded methods. For instance, it is currently 
called like this:

```r
jschema <- callJStatic("org.apache.spark.sql.api.r.SQLUtils",
                       "createArrayType",
                       jschema)
```

But if we remove those, it would have to look like this:

```r
if (class(schema) == "dataType") {
  jschema <- callJStatic("org.apache.spark.sql.types.DataTypes",
 "createArrayType",
 schema$jobj)
} else {
  jschema <- callJStatic("org.apache.spark.sql.api.r.SQLUtils",
 "createArrayType",
 schema$jobj)
}
```

Let me try to remove this one anyway.
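
A minimal sketch of the trade-off being discussed, using hypothetical stand-in types rather than Spark's real `DataType`, `ArrayType`, and `Column` (the `resolvedType` field in particular is an assumption made only for illustration). With both overloads on one object, the caller uses a single method name and the argument type selects the implementation, so no branching is needed at the call site:

```scala
object OverloadSketch {
  // Hypothetical stand-ins for the Spark types named in the diff.
  sealed trait DataType
  case object StringType extends DataType
  final case class ArrayType(elementType: DataType, containsNull: Boolean) extends DataType
  // Assumption: a column that already knows the type its expression evaluates to.
  final case class Column(resolvedType: DataType)

  // Overload 1: wrap an explicit data type in an array type.
  def createArrayType(elementType: DataType): ArrayType =
    ArrayType(elementType, containsNull = true)

  // Overload 2: take the element type from a column instead.
  def createArrayType(elementType: Column): ArrayType =
    ArrayType(elementType.resolvedType, containsNull = true)
}
```

In this sketch the overload is picked at compile time. The R bridge resolves the target method from the runtime argument types instead, but the effect for the caller is the same: one call covers both schema kinds.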


---




[GitHub] spark pull request #23184: [SPARK-26227][R] from_[csv|json] should accept sc...

2018-12-01 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/23184#discussion_r238057208
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/api/r/SQLUtils.scala 
---
@@ -225,4 +225,10 @@ private[sql] object SQLUtils extends Logging {
 }
 sparkSession.sessionState.catalog.listTables(db).map(_.table).toArray
   }
+
+  def createArrayType(elementType: DataType): ArrayType = 
DataTypes.createArrayType(elementType)
+
+  def createArrayType(elementType: Column): ArrayType = {
+new ArrayType(ExprUtils.evalTypeExpr(elementType.expr), true)
--- End diff --

Yup, that sounds more correct.


---




[GitHub] spark issue #23184: [SPARK-26227][R] from_[csv|json] should accept schema_of...

2018-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23184
  
**[Test build #99547 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99547/testReport)**
 for PR 23184 at commit 
[`c731ad1`](https://github.com/apache/spark/commit/c731ad181cc0b69f263b8334d1e1498240121b0c).


---




[GitHub] spark issue #23194: [MINOR][SQL] Combine the same codes in test cases

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23194
  
Can one of the admins verify this patch?


---




[GitHub] spark issue #23194: [MINOR][SQL] Combine the same codes in test cases

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23194
  
Can one of the admins verify this patch?


---




[GitHub] spark issue #23151: [SPARK-26180][CORE][TEST] Reuse withTempDir function to ...

2018-12-01 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/23151
  
Merged to master.


---




[GitHub] spark issue #23177: [SPARK-26212][Build][test-maven] Upgrade maven version t...

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23177
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #23177: [SPARK-26212][Build][test-maven] Upgrade maven version t...

2018-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23177
  
**[Test build #99538 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99538/testReport)**
 for PR 23177 at commit 
[`aa25833`](https://github.com/apache/spark/commit/aa258334170cc2ba009603b4547b4184b3736881).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #23178: [SPARK-26216][SQL] Do not use case class as publi...

2018-12-01 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/23178#discussion_r238057361
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/expressions/UserDefinedFunction.scala
 ---
@@ -38,114 +38,106 @@ import org.apache.spark.sql.types.DataType
  * @since 1.3.0
  */
 @Stable
--- End diff --

yea actually I was wondering about the same thing.


---




[GitHub] spark issue #23191: [SPARK-26219][CORE][branch-2.4] Executor summary should ...

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23191
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #23191: [SPARK-26219][CORE][branch-2.4] Executor summary should ...

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23191
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99537/
Test PASSed.


---




[GitHub] spark issue #23191: [SPARK-26219][CORE][branch-2.4] Executor summary should ...

2018-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23191
  
**[Test build #99537 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99537/testReport)**
 for PR 23191 at commit 
[`8656936`](https://github.com/apache/spark/commit/8656936df9f5fd4f51c967aac3201c0988c05b5c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #23173: [SPARK-26208][SQL] add headers to empty csv files...

2018-12-01 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/23173#discussion_r238058594
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVFileFormat.scala
 ---
@@ -171,15 +171,21 @@ private[csv] class CsvOutputWriter(
 
   private var univocityGenerator: Option[UnivocityGenerator] = None
 
-  override def write(row: InternalRow): Unit = {
-val gen = univocityGenerator.getOrElse {
-  val charset = Charset.forName(params.charset)
-  val os = CodecStreams.createOutputStreamWriter(context, new 
Path(path), charset)
-  val newGen = new UnivocityGenerator(dataSchema, os, params)
-  univocityGenerator = Some(newGen)
-  newGen
-}
+  if (params.headerFlag) {
+val gen = getGen()
+gen.writeHeaders()
+  }
 
+  private def getGen(): UnivocityGenerator = univocityGenerator.getOrElse {
+val charset = Charset.forName(params.charset)
+val os = CodecStreams.createOutputStreamWriter(context, new 
Path(path), charset)
+val newGen = new UnivocityGenerator(dataSchema, os, params)
+univocityGenerator = Some(newGen)
+newGen
+  }
+
+  override def write(row: InternalRow): Unit = {
+val gen = getGen()
--- End diff --

Wait .. is this going to create `UnivocityGenerator` for each record?


---




[GitHub] spark issue #23184: [SPARK-26227][R] from_[csv|json] should accept schema_of...

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23184
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5608/
Test PASSed.


---




[GitHub] spark pull request #23173: [SPARK-26208][SQL] add headers to empty csv files...

2018-12-01 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/23173#discussion_r238058669
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVFileFormat.scala
 ---
@@ -171,15 +171,21 @@ private[csv] class CsvOutputWriter(
 
   private var univocityGenerator: Option[UnivocityGenerator] = None
 
-  override def write(row: InternalRow): Unit = {
-val gen = univocityGenerator.getOrElse {
-  val charset = Charset.forName(params.charset)
-  val os = CodecStreams.createOutputStreamWriter(context, new 
Path(path), charset)
-  val newGen = new UnivocityGenerator(dataSchema, os, params)
-  univocityGenerator = Some(newGen)
-  newGen
-}
+  if (params.headerFlag) {
+val gen = getGen()
+gen.writeHeaders()
+  }
 
+  private def getGen(): UnivocityGenerator = univocityGenerator.getOrElse {
+val charset = Charset.forName(params.charset)
+val os = CodecStreams.createOutputStreamWriter(context, new 
Path(path), charset)
+val newGen = new UnivocityGenerator(dataSchema, os, params)
+univocityGenerator = Some(newGen)
+newGen
+  }
+
+  override def write(row: InternalRow): Unit = {
+val gen = getGen()
--- End diff --

Ah, it's `getOrElse`. Okay but still can we simplify this logic? Looks a 
bit confusing. For instance, I think we can do this with lazy val.
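
A minimal sketch of the `lazy val` idea, assuming a hypothetical `Generator` class in place of `UnivocityGenerator` and a simplified `Writer` in place of `CsvOutputWriter` (neither mirrors the real Spark classes). It shows that the generator is built once on first access, whether that access comes from the header write or from the first record:

```scala
object LazyGeneratorSketch {
  // Hypothetical stand-in for UnivocityGenerator: it only buffers lines.
  final class Generator {
    private val out = new StringBuilder
    def writeHeaders(): Unit = out.append("a,b\n")
    def write(row: String): Unit = out.append(row).append('\n')
    def result: String = out.toString
  }

  final class Writer(header: Boolean) {
    // Counts how often the generator is constructed.
    var generatorsCreated = 0

    // Initialized on first access and cached afterwards.
    private lazy val gen: Generator = {
      generatorsCreated += 1
      new Generator
    }

    // Forces `gen` at construction time, so headers are written even if
    // no record ever arrives.
    if (header) gen.writeHeaders()

    def write(row: String): Unit = gen.write(row)
    def result: String = gen.result
  }
}
```

One design consideration with this form: a `lazy val` offers no standard way to ask whether it has been initialized, so a `close()` method that should skip an untouched generator may still be easier to write with the `Option` field.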


---




[GitHub] spark issue #23184: [SPARK-26227][R] from_[csv|json] should accept schema_of...

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23184
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #23151: [SPARK-26180][CORE][TEST] Reuse withTempDir function to ...

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23151
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99533/
Test PASSed.


---




[GitHub] spark issue #23194: [MINOR][SQL] Combine the same codes in test cases

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23194
  
Can one of the admins verify this patch?


---




[GitHub] spark pull request #23194: [MINOR][SQL] Combine the same codes in test cases

2018-12-01 Thread CarolinePeng
GitHub user CarolinePeng opened a pull request:

https://github.com/apache/spark/pull/23194

[MINOR][SQL] Combine the same codes in test cases

## What changes were proposed in this pull request?

In DDLSuite, four test cases contain the same code; extracting it into a 
function removes the duplication.

## How was this patch tested?

existing tests.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/CarolinePeng/spark Update_temp

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/23194.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #23194


commit eff036c031fe96e9c609b30c882be3ac3bd64e86
Author: 彭灿00244106 <00244106@...>
Date:   2018-12-01T07:51:39Z

update some codes




---




[GitHub] spark issue #23151: [SPARK-26180][CORE][TEST] Reuse withTempDir function to ...

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23151
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #23189: [SPARK-26235][Core] Change log level for ClassNotFoundEx...

2018-12-01 Thread gengliangwang
Github user gengliangwang commented on the issue:

https://github.com/apache/spark/pull/23189
  
retest this please.


---




[GitHub] spark issue #23177: [SPARK-26212][Build][test-maven] Upgrade maven version t...

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23177
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99538/
Test PASSed.


---




[GitHub] spark pull request #23151: [SPARK-26180][CORE][TEST] Reuse withTempDir funct...

2018-12-01 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/23151


---




[GitHub] spark issue #23189: [SPARK-26235][Core] Change log level for ClassNotFoundEx...

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23189
  
Merged build finished. Test PASSed.


---




[GitHub] spark pull request #23184: [SPARK-26227][R] from_[csv|json] should accept sc...

2018-12-01 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/23184#discussion_r238056921
  
--- Diff: R/pkg/R/functions.R ---
@@ -202,8 +202,9 @@ NULL
 #'  \itemize{
 #'  \item \code{from_json}: a structType object to use as the 
schema to use
 #'  when parsing the JSON string. Since Spark 2.3, the 
DDL-formatted string is
-#'  also supported for the schema.
-#'  \item \code{from_csv}: a DDL-formatted string
+#'  also supported for the schema. Since Spark 3.0, 
\code{schema_of_json} or
+#'  a string literal can also be accepted.
--- End diff --

Basically, yeah, the same. The only difference is that it also accepts a string 
literal (not only a string). Let me try to clarify it.


---




[GitHub] spark pull request #23184: [SPARK-26227][R] from_[csv|json] should accept sc...

2018-12-01 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/23184#discussion_r238056981
  
--- Diff: R/pkg/R/functions.R ---
@@ -2254,40 +2255,48 @@ setMethod("date_format", signature(y = "Column", x 
= "character"),
 column(jc)
   })
 
+setClassUnion("characterOrstructTypeOrColumn", c("character", 
"structType", "Column"))
--- End diff --

Yup, I agree. Would you mind if I do this separately? I roughly checked with 
`grep`, and it looks like this:

```
./pkg/R/DataFrame.R:setClassUnion("characterOrstructType", c("character", "structType"))
./pkg/R/DataFrame.R:setClassUnion("numericOrcharacter", c("numeric", "character"))
./pkg/R/DataFrame.R:setClassUnion("characterOrColumn", c("character", "Column"))
./pkg/R/DataFrame.R:setClassUnion("numericOrColumn", c("numeric", "Column"))
```


---




[GitHub] spark issue #23186: [SPARK-26230][SQL]FileIndex: if case sensitive, validate...

2018-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23186
  
**[Test build #99546 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99546/testReport)**
 for PR 23186 at commit 
[`313366d`](https://github.com/apache/spark/commit/313366d58075daba055c392682abaa01fbb574ee).


---




[GitHub] spark pull request #23184: [SPARK-26227][R] from_[csv|json] should accept sc...

2018-12-01 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/23184#discussion_r238058149
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/api/r/SQLUtils.scala 
---
@@ -225,4 +225,10 @@ private[sql] object SQLUtils extends Logging {
 }
 sparkSession.sessionState.catalog.listTables(db).map(_.table).toArray
   }
+
+  def createArrayType(elementType: DataType): ArrayType = 
DataTypes.createArrayType(elementType)
+
+  def createArrayType(elementType: Column): ArrayType = {
+new ArrayType(ExprUtils.evalTypeExpr(elementType.expr), true)
--- End diff --

Oops, it actually doesn't look right. `elementType.expr.nullable` returns the 
nullability of the expression itself (such as `schema_of_json` or a string 
literal), not anything related to its input schema.


---




[GitHub] spark issue #23173: [SPARK-26208][SQL] add headers to empty csv files when h...

2018-12-01 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/23173
  
Looks fine to me too.


---




[GitHub] spark issue #23160: [SPARK-26196][WebUI] Total tasks title in the stage page...

2018-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23160
  
**[Test build #99542 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99542/testReport)**
 for PR 23160 at commit 
[`71aff34`](https://github.com/apache/spark/commit/71aff3447b4c1a80e168a18202864191df189709).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22683: [SPARK-25696] The storage memory displayed on spark Appl...

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22683
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #22683: [SPARK-25696] The storage memory displayed on spark Appl...

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22683
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99544/
Test FAILed.


---




[GitHub] spark issue #23160: [SPARK-26196][WebUI] Total tasks title in the stage page...

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23160
  
Merged build finished. Test PASSed.


---




[GitHub] spark pull request #23150: [SPARK-26178][SQL] Use java.time API for parsing ...

2018-12-01 Thread MaxGekk
Github user MaxGekk commented on a diff in the pull request:

https://github.com/apache/spark/pull/23150#discussion_r238075585
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala
 ---
@@ -1107,7 +,7 @@ class CSVSuite extends QueryTest with 
SharedSQLContext with SQLTestUtils with Te
 
   test("SPARK-18699 put malformed records in a `columnNameOfCorruptRecord` 
field") {
 Seq(false, true).foreach { multiLine =>
-  val schema = new StructType().add("a", IntegerType).add("b", 
TimestampType)
+  val schema = new StructType().add("a", IntegerType).add("b", 
DateType)
--- End diff --

I changed the type because the supposedly valid date `"1983-08-04"` cannot be 
parsed with the default timestamp pattern.
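
A small sketch of why the strict `java.time` parser rejects that value, assuming the default timestamp pattern is `yyyy-MM-dd'T'HH:mm:ss.SSSXXX` and the default date pattern is `yyyy-MM-dd` (both pattern strings are assumptions here, not taken from the diff). `DateTimeFormatter` requires the whole input to match the pattern, so a bare date fails against a pattern that also expects a time and zone offset:

```scala
import java.time.format.DateTimeFormatter
import scala.util.Try

object DatePatternSketch {
  // Assumed defaults; adjust if the options under test differ.
  val timestampPattern = "yyyy-MM-dd'T'HH:mm:ss.SSSXXX"
  val datePattern = "yyyy-MM-dd"

  // True if `text` can be parsed in full with `pattern`.
  def parsesAs(pattern: String, text: String): Boolean =
    Try(DateTimeFormatter.ofPattern(pattern).parse(text)).isSuccess
}
```

Under these assumptions, `parsesAs(datePattern, "1983-08-04")` succeeds while `parsesAs(timestampPattern, "1983-08-04")` does not, which matches the switch from `TimestampType` to `DateType` in the test schema.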


---




[GitHub] spark issue #23196: [SPARK-26243][SQL] Use java.time API for parsing timesta...

2018-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23196
  
**[Test build #99555 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99555/testReport)**
 for PR 23196 at commit 
[`4646ded`](https://github.com/apache/spark/commit/4646dededae832185a35a85244baab6507d28f0d).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #23173: [SPARK-26208][SQL] add headers to empty csv files...

2018-12-01 Thread koertkuipers
Github user koertkuipers commented on a diff in the pull request:

https://github.com/apache/spark/pull/23173#discussion_r238077135
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVFileFormat.scala
 ---
@@ -171,15 +171,21 @@ private[csv] class CsvOutputWriter(
 
   private var univocityGenerator: Option[UnivocityGenerator] = None
 
-  override def write(row: InternalRow): Unit = {
-val gen = univocityGenerator.getOrElse {
-  val charset = Charset.forName(params.charset)
-  val os = CodecStreams.createOutputStreamWriter(context, new 
Path(path), charset)
-  val newGen = new UnivocityGenerator(dataSchema, os, params)
-  univocityGenerator = Some(newGen)
-  newGen
-}
+  if (params.headerFlag) {
+val gen = getGen()
+gen.writeHeaders()
+  }
 
+  private def getGen(): UnivocityGenerator = univocityGenerator.getOrElse {
+val charset = Charset.forName(params.charset)
+val os = CodecStreams.createOutputStreamWriter(context, new 
Path(path), charset)
+val newGen = new UnivocityGenerator(dataSchema, os, params)
+univocityGenerator = Some(newGen)
+newGen
+  }
+
+  override def write(row: InternalRow): Unit = {
+val gen = getGen()
--- End diff --

I will revert this change to `lazy val` for now, since it doesn't have anything 
to do with this pull request or JIRA: the `Option` approach was introduced in 
another pull request.


---




[GitHub] spark issue #23196: [SPARK-26243][SQL] Use java.time API for parsing timesta...

2018-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23196
  
**[Test build #99558 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99558/testReport)**
 for PR 23196 at commit 
[`f326042`](https://github.com/apache/spark/commit/f326042aa1aff540d06c79fd73395204d846f3ea).


---




[GitHub] spark issue #23190: [MINOR][SQL]throw SparkOutOfMemoryError intead of SparkE...

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23190
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99557/
Test FAILed.


---




[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...

2018-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23150
  
**[Test build #99556 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99556/testReport)**
 for PR 23150 at commit 
[`00509d3`](https://github.com/apache/spark/commit/00509d3a94e0679505cd9fde78e38a3a15d11bde).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #23192: [SPARK-26241][SQL] Add queryId to IncrementalExec...

2018-12-01 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/23192


---




[GitHub] spark pull request #23193: [SPARK-26226][SQL] Track optimization phase for s...

2018-12-01 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/23193


---




[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...

2018-12-01 Thread shahidki31
Github user shahidki31 commented on the issue:

https://github.com/apache/spark/pull/23088
  
Hi @srowen. Yes, currently the disk store case needs more optimized code.

> While it makes some sense I have two concerns: different answers based on 
disk vs memory store which shouldn't really affect things. But would a user 
ever have both and see both side by side and be confused?

This configurable store is only for the history server, so a user can configure 
only one of them at a time there. But the live UI (which opens from the Yarn 
UI) also goes through the same code flow, where it uses 'ElementTrackingStore', 
which is also in-memory. So if a user configures the disk store for the history 
server and opens both the live UI and the in-progress history UI, the summary 
metrics will be different.

> changing the way the indexing works, so that you can index by specific 
metrics for successful and failed tasks differently, would be tricky, and also 
would require changing the disk store version (to invalidate old stores).

I think @vanzin's suggestion seems to work, but I need time to try it and 
test it. Maybe we can add a "TODO" for the diskStore case or open a separate 
JIRA for that.

> Second is, that seems like it should still entail pushing down all the 
quantile logic into the KVStore, to be clean, right? and that's a bigger change.

Thanks @srowen for the suggestion. Probably @vanzin can answer this well.

I have modified the code for the InMemory case. The disk store still uses the 
old code.
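
To illustrate why the two stores can show different summary metrics, here is a rough sketch under stated assumptions. It assumes the difference comes down to whether failed tasks are included in the quantile input. `Task`, `TaskSummarySketch`, and the `"SUCCESS"` status string are hypothetical and are not the Spark status store API.

```scala
// Hypothetical task record; field names are illustrative only.
final case class Task(id: Long, status: String, durationMs: Long)

object TaskSummarySketch {
  // Picks the element at index floor(q * n), capped at the last element.
  def quantile(sorted: IndexedSeq[Long], q: Double): Long = {
    require(sorted.nonEmpty, "need at least one value")
    val idx = math.min((q * sorted.length).toLong, sorted.length - 1L).toInt
    sorted(idx)
  }

  // Returns sorted durations, optionally keeping only successful tasks.
  def durations(tasks: Seq[Task], successOnly: Boolean): IndexedSeq[Long] = {
    val kept = if (successOnly) tasks.filter(_.status == "SUCCESS") else tasks
    kept.map(_.durationMs).sorted.toIndexedSeq
  }
}
```

With three successful tasks (100, 200, 300 ms) and two failed ones (1, 2 ms), the median over all tasks lands on 100 ms while the median over successful tasks lands on 200 ms, so a user looking at both UIs side by side would see different numbers.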


---




[GitHub] spark issue #23154: [SPARK-26195][SQL] Correct exception messages in some cl...

2018-12-01 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/23154
  
thanks, merging to master!


---




[GitHub] spark pull request #23120: [SPARK-26151][SQL] Return partial results for bad...

2018-12-01 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/23120#discussion_r238083349
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityParser.scala ---
@@ -243,21 +243,27 @@ class UnivocityParser(
 () => getPartialResult(),
 new RuntimeException("Malformed CSV record"))
 } else {
-  try {
-// When the length of the returned tokens is identical to the length of the parsed schema,
-// we just need to convert the tokens that correspond to the required columns.
-var i = 0
-while (i < requiredSchema.length) {
+  // When the length of the returned tokens is identical to the length of the parsed schema,
+  // we just need to convert the tokens that correspond to the required columns.
+  var badRecordException: Option[Throwable] = None
+  var i = 0
+  while (i < requiredSchema.length) {
+try {
   row(i) = valueConverters(i).apply(getToken(tokens, i))
-  i += 1
+} catch {
+  case NonFatal(e) =>
+badRecordException = badRecordException.orElse(Some(e))
 }
+i += 1
+  }
+
+  if (badRecordException.isEmpty) {
 row
-  } catch {
-case NonFatal(e) =>
-  // For corrupted records with the number of tokens same as the schema,
-  // CSV reader doesn't support partial results. All fields other than the field
-  // configured by `columnNameOfCorruptRecord` are set to `null`.
-  throw BadRecordException(() => getCurrentInput, () => None, e)
+  } else {
+// For corrupted records with the number of tokens same as the schema,
+// CSV reader doesn't support partial results. All fields other than the field
+// configured by `columnNameOfCorruptRecord` are set to `null`.
--- End diff --

what do you mean here?
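
For context on the loop in the diff above, the pattern can be sketched in isolation as follows. This is a simplified illustration and not the `UnivocityParser` code: `PartialRowSketch` and its `convert` signature are hypothetical. Every field is converted, the first non-fatal failure is remembered, and the fields that did convert stay in the row.

```scala
import scala.util.control.NonFatal

object PartialRowSketch {
  // Converts every token; keeps only the first failure and leaves that field null.
  def convert(
      tokens: Array[String],
      converters: Array[String => Any]): (Array[Any], Option[Throwable]) = {
    val row = new Array[Any](converters.length)
    var badRecordException: Option[Throwable] = None
    var i = 0
    while (i < converters.length) {
      try {
        row(i) = converters(i).apply(tokens(i))
      } catch {
        case NonFatal(e) =>
          // orElse keeps an earlier exception if one was already recorded.
          badRecordException = badRecordException.orElse(Some(e))
      }
      i += 1
    }
    (row, badRecordException)
  }
}
```

A caller can then return the row when the exception is empty, or report the partially filled row together with the recorded exception otherwise.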


---




[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23088
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99561/
Test PASSed.


---




[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23088
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...

2018-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23088
  
**[Test build #99561 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99561/testReport)**
 for PR 23088 at commit 
[`dc95355`](https://github.com/apache/spark/commit/dc9535547c353509930ea340780611f3129da962).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #23164: [SPARK-26198][SQL] Fix Metadata serialize null values th...

2018-12-01 Thread wangyum
Github user wangyum commented on the issue:

https://github.com/apache/spark/pull/23164
  
I used it here: 
https://github.com/apache/spark/compare/master...wangyum:default-value?expand=1#diff-9847f5cef7cf7fbc5830fbc6b779ee10R1827


---




[GitHub] spark pull request #23150: [SPARK-26178][SQL] Use java.time API for parsing ...

2018-12-01 Thread MaxGekk
Github user MaxGekk commented on a diff in the pull request:

https://github.com/apache/spark/pull/23150#discussion_r238075664
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala ---
@@ -622,10 +623,11 @@ class CSVSuite extends QueryTest with SharedSQLContext with SQLTestUtils with Te
 val options = Map(
   "header" -> "true",
   "inferSchema" -> "false",
-  "dateFormat" -> "dd/MM/ hh:mm")
+  "dateFormat" -> "dd/MM/ HH:mm")
--- End diff --

According to the `DateTimeFormatter` pattern letters:
```
h   clock-hour-of-am-pm (1-12)  number  12
H   hour-of-day (0-23)          number  0
```
but the real data is outside the range allowed for `h`:
```
date
26/08/2015 18:00
27/10/2014 18:30
28/01/2016 20:00
```
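
A small sketch of the difference, using `java.time` directly. The `dd/MM/yyyy` date part is my assumption, inferred from the sample rows, since the pattern in the archived diff appears truncated. `HourPatternSketch` is a hypothetical name.

```scala
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter
import scala.util.Try

object HourPatternSketch {
  // One of the sample values from the test data quoted above.
  private val sample = "26/08/2015 18:00"

  // 'hh' is clock-hour-of-am-pm (1-12); 18 is out of range, so parsing fails.
  val twelveHour: Try[LocalDateTime] =
    Try(LocalDateTime.parse(sample, DateTimeFormatter.ofPattern("dd/MM/yyyy hh:mm")))

  // 'HH' is hour-of-day (0-23), which accepts 18.
  val twentyFourHour: Try[LocalDateTime] =
    Try(LocalDateTime.parse(sample, DateTimeFormatter.ofPattern("dd/MM/yyyy HH:mm")))
}
```

Here `twelveHour` is a `Failure` and `twentyFourHour` is a `Success`, which matches the switch from `hh` to `HH` in the test options.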


---




[GitHub] spark issue #23196: [SPARK-26243][SQL] Use java.time API for parsing timesta...

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23196
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99555/
Test FAILed.


---




[GitHub] spark issue #23196: [SPARK-26243][SQL] Use java.time API for parsing timesta...

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23196
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23150
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99556/
Test PASSed.


---




[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23150
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #23187: [SPARK-26211][SQL][TEST][FOLLOW-UP] Combine test cases f...

2018-12-01 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/23187
  
thanks, merging to master!


---




[GitHub] spark issue #23190: [MINOR][SQL]throw SparkOutOfMemoryError intead of SparkE...

2018-12-01 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/23190
  
retest this please


---




[GitHub] spark issue #23190: [MINOR][SQL]throw SparkOutOfMemoryError intead of SparkE...

2018-12-01 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/23190
  
LGTM


---




[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23088
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...

2018-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23088
  
**[Test build #99562 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99562/testReport)**
 for PR 23088 at commit 
[`64fbe5d`](https://github.com/apache/spark/commit/64fbe5d7b845b6351e2dae2af231d2be37ca13b8).


---




[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23088
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5621/
Test PASSed.


---




[GitHub] spark pull request #23154: [SPARK-26195][SQL] Correct exception messages in ...

2018-12-01 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/23154


---




[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...

2018-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23088
  
**[Test build #99564 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99564/testReport)**
 for PR 23088 at commit 
[`7956b27`](https://github.com/apache/spark/commit/7956b27bd6b19065d367d96cd5e2b448507c7dc4).


---




[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...

2018-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23088
  
**[Test build #99564 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99564/testReport)**
 for PR 23088 at commit 
[`7956b27`](https://github.com/apache/spark/commit/7956b27bd6b19065d367d96cd5e2b448507c7dc4).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #23150: [SPARK-26178][SQL] Use java.time API for parsing ...

2018-12-01 Thread MaxGekk
Github user MaxGekk commented on a diff in the pull request:

https://github.com/apache/spark/pull/23150#discussion_r238075711
  
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/csv/UnivocityParserSuite.scala ---
@@ -86,62 +85,74 @@ class UnivocityParserSuite extends SparkFunSuite with SQLHelper {
 // null.
 Seq(true, false).foreach { b =>
   val options = new CSVOptions(Map("nullValue" -> "null"), false, "GMT")
-  val converter =
-parser.makeConverter("_1", StringType, nullable = b, options = options)
+  val parser = new UnivocityParser(StructType(Seq.empty), options)
+  val converter = parser.makeConverter("_1", StringType, nullable = b)
   assert(converter.apply("") == UTF8String.fromString(""))
 }
   }
 
   test("Throws exception for empty string with non null type") {
-  val options = new CSVOptions(Map.empty[String, String], false, "GMT")
+val options = new CSVOptions(Map.empty[String, String], false, "GMT")
+val parser = new UnivocityParser(StructType(Seq.empty), options)
 val exception = intercept[RuntimeException]{
-  parser.makeConverter("_1", IntegerType, nullable = false, options = options).apply("")
+  parser.makeConverter("_1", IntegerType, nullable = false).apply("")
 }
 assert(exception.getMessage.contains("null value found but field _1 is not nullable."))
   }
 
   test("Types are cast correctly") {
 val options = new CSVOptions(Map.empty[String, String], false, "GMT")
-assert(parser.makeConverter("_1", ByteType, options = options).apply("10") == 10)
-assert(parser.makeConverter("_1", ShortType, options = options).apply("10") == 10)
-assert(parser.makeConverter("_1", IntegerType, options = options).apply("10") == 10)
-assert(parser.makeConverter("_1", LongType, options = options).apply("10") == 10)
-assert(parser.makeConverter("_1", FloatType, options = options).apply("1.00") == 1.0)
-assert(parser.makeConverter("_1", DoubleType, options = options).apply("1.00") == 1.0)
-assert(parser.makeConverter("_1", BooleanType, options = options).apply("true") == true)
-
-val timestampsOptions =
+var parser = new UnivocityParser(StructType(Seq.empty), options)
+assert(parser.makeConverter("_1", ByteType).apply("10") == 10)
+assert(parser.makeConverter("_1", ShortType).apply("10") == 10)
+assert(parser.makeConverter("_1", IntegerType).apply("10") == 10)
+assert(parser.makeConverter("_1", LongType).apply("10") == 10)
+assert(parser.makeConverter("_1", FloatType).apply("1.00") == 1.0)
+assert(parser.makeConverter("_1", DoubleType).apply("1.00") == 1.0)
+assert(parser.makeConverter("_1", BooleanType).apply("true") == true)
+
+var timestampsOptions =
   new CSVOptions(Map("timestampFormat" -> "dd/MM/ hh:mm"), false, "GMT")
+parser = new UnivocityParser(StructType(Seq.empty), timestampsOptions)
 val customTimestamp = "31/01/2015 00:00"
-val expectedTime = timestampsOptions.timestampFormat.parse(customTimestamp).getTime
-val castedTimestamp =
-  parser.makeConverter("_1", TimestampType, nullable = true, options = timestampsOptions)
+var format = FastDateFormat.getInstance(
+  timestampsOptions.timestampFormat, timestampsOptions.timeZone, timestampsOptions.locale)
+val expectedTime = format.parse(customTimestamp).getTime
+val castedTimestamp = parser.makeConverter("_1", TimestampType, nullable = true)
 .apply(customTimestamp)
 assert(castedTimestamp == expectedTime * 1000L)
 
 val customDate = "31/01/2015"
 val dateOptions = new CSVOptions(Map("dateFormat" -> "dd/MM/"), false, "GMT")
-val expectedDate = dateOptions.dateFormat.parse(customDate).getTime
-val castedDate =
-  parser.makeConverter("_1", DateType, nullable = true, options = dateOptions)
-.apply(customTimestamp)
-assert(castedDate == DateTimeUtils.millisToDays(expectedDate))
+parser = new UnivocityParser(StructType(Seq.empty), dateOptions)
+format = FastDateFormat.getInstance(
+  dateOptions.dateFormat, dateOptions.timeZone, dateOptions.locale)
+val expectedDate = format.parse(customDate).getTime
+val castedDate = parser.makeConverter("_1", DateType, nullable = true)
+.apply(customDate)
+assert(castedDate == DateTimeUtils.millisToDays(expectedDate, TimeZone.getTimeZone("GMT")))
 
 val timestamp = "2015-01-01 00:00:00"
-assert(parser.makeConverter("_1", TimestampType, options = options).apply(timestamp) ==
-  DateTimeUtils.stringToTime(timestamp).getTime  * 1000L)
-assert(parser.makeConverter("_1", DateType, options = 

[GitHub] spark issue #23196: [SPARK-26243][SQL] Use java.time API for parsing timesta...

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23196
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5617/
Test PASSed.


---




[GitHub] spark issue #23178: [SPARK-26216][SQL] Do not use case class as public API (...

2018-12-01 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/23178
  
thanks for the review, merging to master!


---




[GitHub] spark pull request #23178: [SPARK-26216][SQL] Do not use case class as publi...

2018-12-01 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/23178#discussion_r238083055
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/UserDefinedFunction.scala ---
@@ -38,114 +38,106 @@ import org.apache.spark.sql.types.DataType
  * @since 1.3.0
  */
 @Stable
--- End diff --

It's not a new API anyway, so it would be weird to change `@since` to 3.0.


---




[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23088
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23088
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99562/
Test FAILed.


---




[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...

2018-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23088
  
**[Test build #99562 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99562/testReport)**
 for PR 23088 at commit 
[`64fbe5d`](https://github.com/apache/spark/commit/64fbe5d7b845b6351e2dae2af231d2be37ca13b8).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #23072: [SPARK-19827][R]spark.ml R API for PIC

2018-12-01 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/23072#discussion_r238087240
  
--- Diff: examples/src/main/scala/org/apache/spark/examples/ml/FPGrowthExample.scala ---
@@ -64,4 +64,3 @@ object FPGrowthExample {
 spark.stop()
   }
 }
-// scalastyle:on println
--- End diff --

yes, println is not used


---




[GitHub] spark issue #22683: [SPARK-25696] The storage memory displayed on spark Appl...

2018-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22683
  
**[Test build #99565 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99565/testReport)**
 for PR 22683 at commit 
[`5188c54`](https://github.com/apache/spark/commit/5188c54fcf33c24dac341c044f7ffa75c272bf52).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #23190: [MINOR][SQL]throw SparkOutOfMemoryError intead of SparkE...

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23190
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #23190: [MINOR][SQL]throw SparkOutOfMemoryError intead of SparkE...

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23190
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5619/
Test FAILed.


---




[GitHub] spark pull request #23130: [SPARK-26161][SQL] Ignore empty files in load

2018-12-01 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/23130


---




[GitHub] spark pull request #23178: [SPARK-26216][SQL] Do not use case class as publi...

2018-12-01 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/23178


---




[GitHub] spark issue #23120: [SPARK-26151][SQL] Return partial results for bad CSV re...

2018-12-01 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/23120
  
retest this please


---




[GitHub] spark issue #23120: [SPARK-26151][SQL] Return partial results for bad CSV re...

2018-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23120
  
**[Test build #99563 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99563/testReport)**
 for PR 23120 at commit 
[`8f2d69d`](https://github.com/apache/spark/commit/8f2d69d848b8242c529118436249019016069ca2).


---




[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23088
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5623/
Test PASSed.


---




[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23088
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22683: [SPARK-25696] The storage memory displayed on spark Appl...

2018-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22683
  
**[Test build #99565 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99565/testReport)**
 for PR 22683 at commit 
[`5188c54`](https://github.com/apache/spark/commit/5188c54fcf33c24dac341c044f7ffa75c272bf52).


---




[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23088
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99564/
Test PASSed.


---




[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23088
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #23190: [MINOR][SQL]throw SparkOutOfMemoryError intead of SparkE...

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23190
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5616/
Test PASSed.


---




[GitHub] spark issue #23190: [MINOR][SQL]throw SparkOutOfMemoryError intead of SparkE...

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23190
  
Merged build finished. Test PASSed.


---




[GitHub] spark pull request #23187: [SPARK-26211][SQL][TEST][FOLLOW-UP] Combine test ...

2018-12-01 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/23187


---




[GitHub] spark issue #23190: [MINOR][SQL]throw SparkOutOfMemoryError intead of SparkE...

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23190
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99560/
Test FAILed.


---




[GitHub] spark issue #23196: [SPARK-26243][SQL] Use java.time API for parsing timesta...

2018-12-01 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/23196
  
**[Test build #99558 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99558/testReport)**
 for PR 23196 at commit 
[`f326042`](https://github.com/apache/spark/commit/f326042aa1aff540d06c79fd73395204d846f3ea).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #23173: [SPARK-26208][SQL] add headers to empty csv files when h...

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23173
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #23173: [SPARK-26208][SQL] add headers to empty csv files when h...

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23173
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99559/
Test PASSed.


---




[GitHub] spark issue #23190: [MINOR][SQL]throw SparkOutOfMemoryError intead of SparkE...

2018-12-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/23190
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #23130: [SPARK-26161][SQL] Ignore empty files in load

2018-12-01 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/23130
  
We don't need to block it, but @MaxGekk if you have time, it would be great to 
answer https://github.com/apache/spark/pull/23130#issuecomment-442491582

thanks, merging to master!


---



