[GitHub] spark pull request #23184: [SPARK-26227][R] from_[csv|json] should accept sc...
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/23184#discussion_r238057238
--- Diff: R/pkg/R/functions.R ---
@@ -2254,40 +2255,48 @@ setMethod("date_format", signature(y = "Column", x = "character"),
             column(jc)
           })
 
+setClassUnion("characterOrstructTypeOrColumn", c("character", "structType", "Column"))
+
 #' @details
 #' \code{from_json}: Parses a column containing a JSON string into a Column of \code{structType}
 #' with the specified \code{schema} or array of \code{structType} if \code{as.json.array} is set
 #' to \code{TRUE}. If the string is unparseable, the Column will contain the value NA.
 #'
 #' @rdname column_collection_functions
 #' @param as.json.array indicating if input string is JSON array of objects or a single object.
-#' @aliases from_json from_json,Column,characterOrstructType-method
+#' @aliases from_json from_json,Column,characterOrstructTypeOrColumn-method
 #' @examples
 #'
 #' \dontrun{
 #' df2 <- sql("SELECT named_struct('date', cast('2000-01-01' as date)) as d")
 #' df2 <- mutate(df2, d2 = to_json(df2$d, dateFormat = 'dd/MM/'))
 #' schema <- structType(structField("date", "string"))
 #' head(select(df2, from_json(df2$d2, schema, dateFormat = 'dd/MM/')))
-
 #' df2 <- sql("SELECT named_struct('name', 'Bob') as people")
 #' df2 <- mutate(df2, people_json = to_json(df2$people))
 #' schema <- structType(structField("name", "string"))
 #' head(select(df2, from_json(df2$people_json, schema)))
-#' head(select(df2, from_json(df2$people_json, "name STRING")))}
+#' head(select(df2, from_json(df2$people_json, "name STRING")))
+#' head(select(df2, from_json(df2$people_json, schema_of_json(head(df2)$people_json}
 #' @note from_json since 2.2.0
-setMethod("from_json", signature(x = "Column", schema = "characterOrstructType"),
+setMethod("from_json", signature(x = "Column", schema = "characterOrstructTypeOrColumn"),
           function(x, schema, as.json.array = FALSE, ...) {
             if (is.character(schema)) {
-              schema <- structType(schema)
+              jschema <- structType(schema)$jobj
+            } else if (class(schema) == "structType") {
+              jschema <- schema$jobj
+            } else {
+              jschema <- schema@jc
             }
             if (as.json.array) {
-              jschema <- callJStatic("org.apache.spark.sql.types.DataTypes",
-                                     "createArrayType",
-                                     schema$jobj)
-            } else {
-              jschema <- schema$jobj
+              # This case is R-specifically different. Unlike Scala and Python side,
--- End diff --
If so, the provided schema is wrapped by Array. The test cases are ... here https://github.com/apache/spark/pull/23184/files#diff-d4011863c8b176830365b2f224a84bf2R1707
---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #23186: [SPARK-26230][SQL]FileIndex: if case sensitive, validate...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23186 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5607/ Test PASSed.
[GitHub] spark issue #23186: [SPARK-26230][SQL]FileIndex: if case sensitive, validate...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23186 Merged build finished. Test PASSed.
[GitHub] spark issue #23190: [MINOR][SQL]throw SparkOutOfMemoryError intead of SparkE...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23190 **[Test build #99541 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99541/testReport)** for PR 23190 at commit [`3376524`](https://github.com/apache/spark/commit/33765248d2afdccf4e3cedf96200791ad48ef6be). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #23190: [MINOR][SQL]throw SparkOutOfMemoryError intead of SparkE...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23190 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99541/ Test FAILed.
[GitHub] spark issue #23190: [MINOR][SQL]throw SparkOutOfMemoryError intead of SparkE...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23190 Merged build finished. Test FAILed.
[GitHub] spark issue #23151: [SPARK-26180][CORE][TEST] Reuse withTempDir function to ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23151 **[Test build #99533 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99533/testReport)** for PR 23151 at commit [`6d2f2d3`](https://github.com/apache/spark/commit/6d2f2d36f0e8dcdbe54e82865f7bd1b10dd323bd). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #23189: [SPARK-26235][Core] Change log level for ClassNotFoundEx...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23189 **[Test build #99545 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99545/testReport)** for PR 23189 at commit [`1662a8b`](https://github.com/apache/spark/commit/1662a8b41564b57eac37e9155eaa51dff24c99dc).
[GitHub] spark issue #23189: [SPARK-26235][Core] Change log level for ClassNotFoundEx...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23189 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5606/ Test PASSed.
[GitHub] spark pull request #23184: [SPARK-26227][R] from_[csv|json] should accept sc...
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/23184#discussion_r238057161
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/api/r/SQLUtils.scala ---
@@ -225,4 +225,10 @@ private[sql] object SQLUtils extends Logging {
     }
     sparkSession.sessionState.catalog.listTables(db).map(_.table).toArray
   }
+
+  def createArrayType(elementType: DataType): ArrayType = DataTypes.createArrayType(elementType)
--- End diff --
Yeah, I remember that. I thought this case was a bit different from the other instances, actually. It reduces code complexity on the R side because the R side calls these overloaded methods directly. For instance, it is currently called like this:
```r
jschema <- callJStatic("org.apache.spark.sql.api.r.SQLUtils",
                       "createArrayType",
                       jschema)
```
but if we remove those, it would have to look like:
```r
if (class(schema) == "dataType") {
  jschema <- callJStatic("org.apache.spark.sql.types.DataTypes",
                         "createArrayType",
                         schema$jobj)
} else {
  jschema <- callJStatic("org.apache.spark.sql.api.r.SQLUtils",
                         "createArrayType",
                         schema$jobj)
}
```
Let me try to remove this one anyway.
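For readers following the thread, the trade-off described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `TypeArg`, `ColumnArg`, and `ArrayOf` are hypothetical stand-ins for Spark's `DataType`, `Column`, and `ArrayType`, and the method bodies are placeholders rather than the code in `SQLUtils`.

```scala
object OverloadSketch {
  // Hypothetical stand-ins for DataType, Column and ArrayType (not Spark classes).
  final case class TypeArg(name: String)
  final case class ColumnArg(ddl: String)
  final case class ArrayOf(element: String, containsNull: Boolean)

  // One method name, two overloads: the caller does not branch on the argument kind.
  def createArrayType(elementType: TypeArg): ArrayOf =
    ArrayOf(elementType.name, containsNull = true)

  def createArrayType(elementType: ColumnArg): ArrayOf =
    ArrayOf(elementType.ddl, containsNull = true)
}
```

With both overloads behind one name, a caller can pass either kind of schema argument to `createArrayType` without its own `if`/`else`, which appears to be the simplification the comment refers to.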
[GitHub] spark pull request #23184: [SPARK-26227][R] from_[csv|json] should accept sc...
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/23184#discussion_r238057208
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/api/r/SQLUtils.scala ---
@@ -225,4 +225,10 @@ private[sql] object SQLUtils extends Logging {
     }
     sparkSession.sessionState.catalog.listTables(db).map(_.table).toArray
   }
+
+  def createArrayType(elementType: DataType): ArrayType = DataTypes.createArrayType(elementType)
+
+  def createArrayType(elementType: Column): ArrayType = {
+    new ArrayType(ExprUtils.evalTypeExpr(elementType.expr), true)
--- End diff --
Yup, that sounds more correct.
[GitHub] spark issue #23184: [SPARK-26227][R] from_[csv|json] should accept schema_of...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23184 **[Test build #99547 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99547/testReport)** for PR 23184 at commit [`c731ad1`](https://github.com/apache/spark/commit/c731ad181cc0b69f263b8334d1e1498240121b0c).
[GitHub] spark issue #23194: [MINOR][SQL] Combine the same codes in test cases
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23194 Can one of the admins verify this patch?
[GitHub] spark issue #23151: [SPARK-26180][CORE][TEST] Reuse withTempDir function to ...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23151 Merged to master.
[GitHub] spark issue #23177: [SPARK-26212][Build][test-maven] Upgrade maven version t...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23177 Merged build finished. Test PASSed.
[GitHub] spark issue #23177: [SPARK-26212][Build][test-maven] Upgrade maven version t...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23177 **[Test build #99538 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99538/testReport)** for PR 23177 at commit [`aa25833`](https://github.com/apache/spark/commit/aa258334170cc2ba009603b4547b4184b3736881). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #23178: [SPARK-26216][SQL] Do not use case class as publi...
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/23178#discussion_r238057361
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/UserDefinedFunction.scala ---
@@ -38,114 +38,106 @@ import org.apache.spark.sql.types.DataType
  * @since 1.3.0
  */
 @Stable
--- End diff --
Yeah, actually I was wondering about the same thing.
[GitHub] spark issue #23191: [SPARK-26219][CORE][branch-2.4] Executor summary should ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23191 Merged build finished. Test PASSed.
[GitHub] spark issue #23191: [SPARK-26219][CORE][branch-2.4] Executor summary should ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23191 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99537/ Test PASSed.
[GitHub] spark issue #23191: [SPARK-26219][CORE][branch-2.4] Executor summary should ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23191 **[Test build #99537 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99537/testReport)** for PR 23191 at commit [`8656936`](https://github.com/apache/spark/commit/8656936df9f5fd4f51c967aac3201c0988c05b5c). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #23173: [SPARK-26208][SQL] add headers to empty csv files...
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/23173#discussion_r238058594
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVFileFormat.scala ---
@@ -171,15 +171,21 @@ private[csv] class CsvOutputWriter(
   private var univocityGenerator: Option[UnivocityGenerator] = None
 
-  override def write(row: InternalRow): Unit = {
-    val gen = univocityGenerator.getOrElse {
-      val charset = Charset.forName(params.charset)
-      val os = CodecStreams.createOutputStreamWriter(context, new Path(path), charset)
-      val newGen = new UnivocityGenerator(dataSchema, os, params)
-      univocityGenerator = Some(newGen)
-      newGen
-    }
+  if (params.headerFlag) {
+    val gen = getGen()
+    gen.writeHeaders()
+  }
 
+  private def getGen(): UnivocityGenerator = univocityGenerator.getOrElse {
+    val charset = Charset.forName(params.charset)
+    val os = CodecStreams.createOutputStreamWriter(context, new Path(path), charset)
+    val newGen = new UnivocityGenerator(dataSchema, os, params)
+    univocityGenerator = Some(newGen)
+    newGen
+  }
+
+  override def write(row: InternalRow): Unit = {
+    val gen = getGen()
--- End diff --
Wait... is this going to create a `UnivocityGenerator` for each record?
[GitHub] spark issue #23184: [SPARK-26227][R] from_[csv|json] should accept schema_of...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23184 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5608/ Test PASSed.
[GitHub] spark pull request #23173: [SPARK-26208][SQL] add headers to empty csv files...
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/23173#discussion_r238058669
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVFileFormat.scala ---
@@ -171,15 +171,21 @@ private[csv] class CsvOutputWriter(
   private var univocityGenerator: Option[UnivocityGenerator] = None
 
-  override def write(row: InternalRow): Unit = {
-    val gen = univocityGenerator.getOrElse {
-      val charset = Charset.forName(params.charset)
-      val os = CodecStreams.createOutputStreamWriter(context, new Path(path), charset)
-      val newGen = new UnivocityGenerator(dataSchema, os, params)
-      univocityGenerator = Some(newGen)
-      newGen
-    }
+  if (params.headerFlag) {
+    val gen = getGen()
+    gen.writeHeaders()
+  }
 
+  private def getGen(): UnivocityGenerator = univocityGenerator.getOrElse {
+    val charset = Charset.forName(params.charset)
+    val os = CodecStreams.createOutputStreamWriter(context, new Path(path), charset)
+    val newGen = new UnivocityGenerator(dataSchema, os, params)
+    univocityGenerator = Some(newGen)
+    newGen
+  }
+
+  override def write(row: InternalRow): Unit = {
+    val gen = getGen()
--- End diff --
Ah, it's `getOrElse`. Okay, but can we still simplify this logic? It looks a bit confusing. For instance, I think we can do this with a `lazy val`.
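As a rough illustration of the `lazy val` suggestion above (a sketch only, not the code that was merged): `SketchGenerator` and `SketchOutputWriter` are hypothetical stand-ins for `UnivocityGenerator` and `CsvOutputWriter`, and the header text is made up. The point is that a `lazy val` is initialized on first access and cached afterwards, so the explicit `Option` plus `getOrElse` bookkeeping is not needed.

```scala
import java.io.Writer

// Hypothetical stand-in for UnivocityGenerator: writes comma-separated lines.
final class SketchGenerator(out: Writer) {
  def writeHeaders(): Unit = out.write("a,b\n")
  def write(row: Seq[String]): Unit = out.write(row.mkString(",") + "\n")
}

// Hypothetical stand-in for CsvOutputWriter.
final class SketchOutputWriter(out: Writer, header: Boolean) {
  // Counts how often the generator is constructed (for demonstration only).
  var generatorsCreated = 0

  // Built on first access, then reused for every later call.
  private lazy val gen: SketchGenerator = {
    generatorsCreated += 1
    new SketchGenerator(out)
  }

  // Writing the header in the constructor forces the lazy val once.
  if (header) gen.writeHeaders()

  def write(row: Seq[String]): Unit = gen.write(row)
}
```

Writing two rows through `new SketchOutputWriter(new java.io.StringWriter, header = true)` constructs the generator once. One design note: when `header` is false and no row is ever written, the generator is never built, which is presumably why the original code deferred construction in the first place.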
[GitHub] spark issue #23184: [SPARK-26227][R] from_[csv|json] should accept schema_of...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23184 Merged build finished. Test PASSed.
[GitHub] spark issue #23151: [SPARK-26180][CORE][TEST] Reuse withTempDir function to ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23151 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99533/ Test PASSed.
[GitHub] spark pull request #23194: [MINOR][SQL] Combine the same codes in test cases
GitHub user CarolinePeng opened a pull request:
https://github.com/apache/spark/pull/23194
[MINOR][SQL] Combine the same codes in test cases
## What changes were proposed in this pull request?
In DDLSuite, four test cases contain the same code; extracting it into a function removes the duplication.
## How was this patch tested?
Existing tests.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/CarolinePeng/spark Update_temp
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/23194.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
This closes #23194
commit eff036c031fe96e9c609b30c882be3ac3bd64e86
Author: å½ç¿00244106 <00244106@...>
Date: 2018-12-01T07:51:39Z
update some codes
[GitHub] spark issue #23151: [SPARK-26180][CORE][TEST] Reuse withTempDir function to ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23151 Merged build finished. Test PASSed.
[GitHub] spark issue #23189: [SPARK-26235][Core] Change log level for ClassNotFoundEx...
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/23189 retest this please.
[GitHub] spark issue #23177: [SPARK-26212][Build][test-maven] Upgrade maven version t...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23177 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99538/ Test PASSed.
[GitHub] spark pull request #23151: [SPARK-26180][CORE][TEST] Reuse withTempDir funct...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/23151
[GitHub] spark issue #23189: [SPARK-26235][Core] Change log level for ClassNotFoundEx...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23189 Merged build finished. Test PASSed.
[GitHub] spark pull request #23184: [SPARK-26227][R] from_[csv|json] should accept sc...
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/23184#discussion_r238056921
--- Diff: R/pkg/R/functions.R ---
@@ -202,8 +202,9 @@ NULL
 #' \itemize{
 #' \item \code{from_json}: a structType object to use as the schema to use
 #' when parsing the JSON string. Since Spark 2.3, the DDL-formatted string is
-#' also supported for the schema.
-#' \item \code{from_csv}: a DDL-formatted string
+#' also supported for the schema. Since Spark 3.0, \code{schema_of_json} or
+#' a string literal can also be accepted.
--- End diff --
Basically, yeah, it's the same. The only difference is that it also accepts a string literal (not only a string). Let me try to clarify it.
[GitHub] spark pull request #23184: [SPARK-26227][R] from_[csv|json] should accept sc...
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/23184#discussion_r238056981
--- Diff: R/pkg/R/functions.R ---
@@ -2254,40 +2255,48 @@ setMethod("date_format", signature(y = "Column", x = "character"),
             column(jc)
           })
 
+setClassUnion("characterOrstructTypeOrColumn", c("character", "structType", "Column"))
--- End diff --
Yup, I agree. Would you mind if I do this separately? I roughly checked with `grep`, and it looks like this:
```
./pkg/R/DataFrame.R:setClassUnion("characterOrstructType", c("character", "structType"))
./pkg/R/DataFrame.R:setClassUnion("numericOrcharacter", c("numeric", "character"))
./pkg/R/DataFrame.R:setClassUnion("characterOrColumn", c("character", "Column"))
./pkg/R/DataFrame.R:setClassUnion("numericOrColumn", c("numeric", "Column"))
```
[GitHub] spark issue #23186: [SPARK-26230][SQL]FileIndex: if case sensitive, validate...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23186 **[Test build #99546 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99546/testReport)** for PR 23186 at commit [`313366d`](https://github.com/apache/spark/commit/313366d58075daba055c392682abaa01fbb574ee).
[GitHub] spark pull request #23184: [SPARK-26227][R] from_[csv|json] should accept sc...
Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/23184#discussion_r238058149
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/api/r/SQLUtils.scala ---
@@ -225,4 +225,10 @@ private[sql] object SQLUtils extends Logging {
     }
     sparkSession.sessionState.catalog.listTables(db).map(_.table).toArray
   }
+
+  def createArrayType(elementType: DataType): ArrayType = DataTypes.createArrayType(elementType)
+
+  def createArrayType(elementType: Column): ArrayType = {
+    new ArrayType(ExprUtils.evalTypeExpr(elementType.expr), true)
--- End diff --
Oops, actually it doesn't look correct. `elementType.expr.nullable` returns the nullability of the expression itself (such as `schema_of_json` or a string literal), not anything related to its input schema.
[GitHub] spark issue #23173: [SPARK-26208][SQL] add headers to empty csv files when h...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/23173 Looks fine to me too.
[GitHub] spark issue #23160: [SPARK-26196][WebUI] Total tasks title in the stage page...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23160 **[Test build #99542 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99542/testReport)** for PR 23160 at commit [`71aff34`](https://github.com/apache/spark/commit/71aff3447b4c1a80e168a18202864191df189709). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22683: [SPARK-25696] The storage memory displayed on spark Appl...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22683 Merged build finished. Test FAILed.
[GitHub] spark issue #22683: [SPARK-25696] The storage memory displayed on spark Appl...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22683 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99544/ Test FAILed.
[GitHub] spark issue #23160: [SPARK-26196][WebUI] Total tasks title in the stage page...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23160 Merged build finished. Test PASSed.
[GitHub] spark pull request #23150: [SPARK-26178][SQL] Use java.time API for parsing ...
Github user MaxGekk commented on a diff in the pull request:
https://github.com/apache/spark/pull/23150#discussion_r238075585
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala ---
@@ -1107,7 +,7 @@ class CSVSuite extends QueryTest with SharedSQLContext with SQLTestUtils with Te
   test("SPARK-18699 put malformed records in a `columnNameOfCorruptRecord` field") {
     Seq(false, true).foreach { multiLine =>
-      val schema = new StructType().add("a", IntegerType).add("b", TimestampType)
+      val schema = new StructType().add("a", IntegerType).add("b", DateType)
--- End diff --
I changed the type because the supposedly valid date `"1983-08-04"` cannot be parsed with the default timestamp pattern.
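The parsing behaviour mentioned above can be reproduced with plain `java.time`, independent of Spark. This is a minimal sketch: the timestamp pattern `yyyy-MM-dd'T'HH:mm:ss.SSSXXX` is assumed here for illustration (the thread does not spell out the default), and `DatePatternSketch` with its helper methods is a hypothetical name, not part of the pull request.

```scala
import java.time.{LocalDate, OffsetDateTime}
import java.time.format.DateTimeFormatter
import scala.util.Try

object DatePatternSketch {
  // Assumed patterns, for illustration only.
  val timestampPattern = "yyyy-MM-dd'T'HH:mm:ss.SSSXXX"
  val datePattern = "yyyy-MM-dd"

  // True when the text fully matches the timestamp pattern.
  def parsesAsTimestamp(s: String): Boolean =
    Try(OffsetDateTime.parse(s, DateTimeFormatter.ofPattern(timestampPattern))).isSuccess

  // True when the text fully matches the date-only pattern.
  def parsesAsDate(s: String): Boolean =
    Try(LocalDate.parse(s, DateTimeFormatter.ofPattern(datePattern))).isSuccess
}
```

Under these assumptions, `"1983-08-04"` parses as a date but not as a timestamp, because the formatter expects the time and offset fields to be present in the input.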
[GitHub] spark issue #23196: [SPARK-26243][SQL] Use java.time API for parsing timesta...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23196 **[Test build #99555 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99555/testReport)** for PR 23196 at commit [`4646ded`](https://github.com/apache/spark/commit/4646dededae832185a35a85244baab6507d28f0d). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #23173: [SPARK-26208][SQL] add headers to empty csv files...
Github user koertkuipers commented on a diff in the pull request:
https://github.com/apache/spark/pull/23173#discussion_r238077135
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVFileFormat.scala ---
@@ -171,15 +171,21 @@ private[csv] class CsvOutputWriter(
   private var univocityGenerator: Option[UnivocityGenerator] = None
 
-  override def write(row: InternalRow): Unit = {
-    val gen = univocityGenerator.getOrElse {
-      val charset = Charset.forName(params.charset)
-      val os = CodecStreams.createOutputStreamWriter(context, new Path(path), charset)
-      val newGen = new UnivocityGenerator(dataSchema, os, params)
-      univocityGenerator = Some(newGen)
-      newGen
-    }
+  if (params.headerFlag) {
+    val gen = getGen()
+    gen.writeHeaders()
+  }
 
+  private def getGen(): UnivocityGenerator = univocityGenerator.getOrElse {
+    val charset = Charset.forName(params.charset)
+    val os = CodecStreams.createOutputStreamWriter(context, new Path(path), charset)
+    val newGen = new UnivocityGenerator(dataSchema, os, params)
+    univocityGenerator = Some(newGen)
+    newGen
+  }
+
+  override def write(row: InternalRow): Unit = {
+    val gen = getGen()
--- End diff --
I will revert this change to lazy val for now, since it doesn't have anything to do with this pull request or JIRA: the Option approach was introduced in another pull request.
[GitHub] spark issue #23196: [SPARK-26243][SQL] Use java.time API for parsing timesta...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23196 **[Test build #99558 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99558/testReport)** for PR 23196 at commit [`f326042`](https://github.com/apache/spark/commit/f326042aa1aff540d06c79fd73395204d846f3ea).
[GitHub] spark issue #23190: [MINOR][SQL]throw SparkOutOfMemoryError intead of SparkE...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23190 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99557/ Test FAILed.
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23150 **[Test build #99556 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99556/testReport)** for PR 23150 at commit [`00509d3`](https://github.com/apache/spark/commit/00509d3a94e0679505cd9fde78e38a3a15d11bde). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #23192: [SPARK-26241][SQL] Add queryId to IncrementalExec...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/23192
[GitHub] spark pull request #23193: [SPARK-26226][SQL] Track optimization phase for s...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/23193
[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...
Github user shahidki31 commented on the issue: https://github.com/apache/spark/pull/23088

Hi @srowen. Yes, currently the disk store case needs more optimized code.

> While it makes some sense I have two concerns: different answers based on disk vs memory store which shouldn't really affect things. But would a user ever have both and see both side by side and be confused?

This configurable store is only for the history server, so a user can configure only one of them at a time there. But the live UI (which opens from the YARN UI) also goes through the same code path, where it uses 'ElementTrackingStore', which is also in-memory. So if a user configures the disk store for the history server and opens both the live UI and the in-progress history UI, the summary metrics will be different.

> changing the way the indexing works, so that you can index by specific metrics for successful and failed tasks differently, would be tricky, and also would require changing the disk store version (to invalidate old stores).

I think @vanzin's suggestion should work, but I need time to try it and test it. Maybe we can add a "TODO" for the disk store case, or open a separate JIRA for it.

> Second is, that seems like it should still entail pushing down all the quantile logic into the KVStore, to be clean, right? and that's a bigger change.

Thanks @srowen for the suggestion. @vanzin can probably answer this better. I have modified the code for the in-memory case; the disk store still uses the old code.
[GitHub] spark issue #23154: [SPARK-26195][SQL] Correct exception messages in some cl...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23154 thanks, merging to master!
[GitHub] spark pull request #23120: [SPARK-26151][SQL] Return partial results for bad...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23120#discussion_r238083349
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/UnivocityParser.scala ---
@@ -243,21 +243,27 @@ class UnivocityParser(
           () => getPartialResult(),
           new RuntimeException("Malformed CSV record"))
       } else {
-      try {
-        // When the length of the returned tokens is identical to the length of the parsed schema,
-        // we just need to convert the tokens that correspond to the required columns.
-        var i = 0
-        while (i < requiredSchema.length) {
+      // When the length of the returned tokens is identical to the length of the parsed schema,
+      // we just need to convert the tokens that correspond to the required columns.
+      var badRecordException: Option[Throwable] = None
+      var i = 0
+      while (i < requiredSchema.length) {
+        try {
           row(i) = valueConverters(i).apply(getToken(tokens, i))
-          i += 1
+        } catch {
+          case NonFatal(e) =>
+            badRecordException = badRecordException.orElse(Some(e))
         }
+        i += 1
+      }
+
+      if (badRecordException.isEmpty) {
         row
-      } catch {
-        case NonFatal(e) =>
-          // For corrupted records with the number of tokens same as the schema,
-          // CSV reader doesn't support partial results. All fields other than the field
-          // configured by `columnNameOfCorruptRecord` are set to `null`.
-          throw BadRecordException(() => getCurrentInput, () => None, e)
+      } else {
+        // For corrupted records with the number of tokens same as the schema,
+        // CSV reader doesn't support partial results. All fields other than the field
+        // configured by `columnNameOfCorruptRecord` are set to `null`.
--- End diff --
what do you mean here?
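For readers following the diff in the message above, the pattern it introduces (convert every field, remember only the first failure, and decide afterwards whether the row is clean) can be sketched outside Spark roughly as follows. This is a minimal illustration, not Spark's code: `PartialRowDemo`, `Result`, and `convert` are hypothetical names, and catching `RuntimeException` is a simplification standing in for Scala's `NonFatal`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical stand-in for the per-field conversion loop; not Spark's API.
class PartialRowDemo {

    // A possibly partial row plus the first conversion failure, if any.
    static final class Result {
        final Object[] row;
        final Optional<Throwable> firstError;

        Result(Object[] row, Optional<Throwable> firstError) {
            this.row = row;
            this.firstError = firstError;
        }
    }

    // Converts every token; a failing field stays null and the loop continues.
    static Result convert(String[] tokens, List<Function<String, Object>> converters) {
        Object[] row = new Object[converters.size()];
        Optional<Throwable> firstError = Optional.empty();
        for (int i = 0; i < converters.size(); i++) {
            try {
                row[i] = converters.get(i).apply(tokens[i]);
            } catch (RuntimeException e) {
                // Keep only the first failure, mirroring orElse(Some(e)) in the diff.
                if (!firstError.isPresent()) {
                    firstError = Optional.of(e);
                }
            }
        }
        return new Result(row, firstError);
    }

    public static void main(String[] args) {
        Function<String, Object> toInt = Integer::valueOf;
        Result r = convert(new String[] {"1", "oops", "3"}, Arrays.asList(toInt, toInt, toInt));
        System.out.println(Arrays.toString(r.row));
        System.out.println("bad record: " + r.firstError.isPresent());
    }
}
```

The caller can then return the row when `firstError` is empty, or report the bad record together with the fields that did convert.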
[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23088 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99561/ Test PASSed.
[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23088 Merged build finished. Test PASSed.
[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23088 **[Test build #99561 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99561/testReport)** for PR 23088 at commit [`dc95355`](https://github.com/apache/spark/commit/dc9535547c353509930ea340780611f3129da962). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #23164: [SPARK-26198][SQL] Fix Metadata serialize null values th...
Github user wangyum commented on the issue: https://github.com/apache/spark/pull/23164 I used it here: https://github.com/apache/spark/compare/master...wangyum:default-value?expand=1#diff-9847f5cef7cf7fbc5830fbc6b779ee10R1827
[GitHub] spark pull request #23150: [SPARK-26178][SQL] Use java.time API for parsing ...
Github user MaxGekk commented on a diff in the pull request: https://github.com/apache/spark/pull/23150#discussion_r238075664
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala ---
@@ -622,10 +623,11 @@ class CSVSuite extends QueryTest with SharedSQLContext with SQLTestUtils with Te
     val options = Map(
       "header" -> "true",
       "inferSchema" -> "false",
-      "dateFormat" -> "dd/MM/ hh:mm")
+      "dateFormat" -> "dd/MM/ HH:mm")
--- End diff --
According to ISO 8601:
```
h       clock-hour-of-am-pm (1-12)  number            12
H       hour-of-day (0-23)          number            0
```
but the hours in the real data fall outside the range allowed for `h`:
```
date
26/08/2015 18:00
27/10/2014 18:30
28/01/2016 20:00
```
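The difference between the two pattern letters can be checked directly against `java.time.format.DateTimeFormatter`, which is where the quoted letter table appears. The sketch below is only an illustration: `HourPatternDemo` and `tryParse` are hypothetical names, and the `yyyy` in the patterns is an assumption, since the archived diff shows the format only as `dd/MM/ hh:mm`.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Optional;

// Hypothetical helper comparing the "hh" and "HH" pattern letters.
class HourPatternDemo {

    // Returns the parsed value, or empty if the text does not fit the pattern.
    static Optional<LocalDateTime> tryParse(String pattern, String text) {
        try {
            return Optional.of(LocalDateTime.parse(text, DateTimeFormatter.ofPattern(pattern)));
        } catch (DateTimeParseException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        String[] samples = {"26/08/2015 18:00", "27/10/2014 18:30", "28/01/2016 20:00"};
        for (String sample : samples) {
            // "HH" is hour-of-day (0-23), so these samples are in range.
            System.out.println("HH: " + tryParse("dd/MM/yyyy HH:mm", sample));
            // "hh" is clock-hour-of-am-pm (1-12); 18 and 20 are rejected, and
            // without an AM/PM field the hour could not become a time of day anyway.
            System.out.println("hh: " + tryParse("dd/MM/yyyy hh:mm", sample));
        }
    }
}
```

With the default resolver style, the `hh` variant yields an empty result for every sample row, which is consistent with the test switching to `HH`.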
[GitHub] spark issue #23196: [SPARK-26243][SQL] Use java.time API for parsing timesta...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23196 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99555/ Test FAILed.
[GitHub] spark issue #23196: [SPARK-26243][SQL] Use java.time API for parsing timesta...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23196 Merged build finished. Test FAILed.
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23150 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99556/ Test PASSed.
[GitHub] spark issue #23150: [SPARK-26178][SQL] Use java.time API for parsing timesta...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23150 Merged build finished. Test PASSed.
[GitHub] spark issue #23187: [SPARK-26211][SQL][TEST][FOLLOW-UP] Combine test cases f...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23187 thanks, merging to master!
[GitHub] spark issue #23190: [MINOR][SQL]throw SparkOutOfMemoryError intead of SparkE...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23190 retest this please
[GitHub] spark issue #23190: [MINOR][SQL]throw SparkOutOfMemoryError intead of SparkE...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23190 LGTM
[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23088 Merged build finished. Test PASSed.
[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23088 **[Test build #99562 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99562/testReport)** for PR 23088 at commit [`64fbe5d`](https://github.com/apache/spark/commit/64fbe5d7b845b6351e2dae2af231d2be37ca13b8).
[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23088 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5621/ Test PASSed.
[GitHub] spark pull request #23154: [SPARK-26195][SQL] Correct exception messages in ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/23154
[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23088 **[Test build #99564 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99564/testReport)** for PR 23088 at commit [`7956b27`](https://github.com/apache/spark/commit/7956b27bd6b19065d367d96cd5e2b448507c7dc4).
[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23088 **[Test build #99564 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99564/testReport)** for PR 23088 at commit [`7956b27`](https://github.com/apache/spark/commit/7956b27bd6b19065d367d96cd5e2b448507c7dc4). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #23150: [SPARK-26178][SQL] Use java.time API for parsing ...
Github user MaxGekk commented on a diff in the pull request: https://github.com/apache/spark/pull/23150#discussion_r238075711
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/csv/UnivocityParserSuite.scala ---
@@ -86,62 +85,74 @@ class UnivocityParserSuite extends SparkFunSuite with SQLHelper {
     // null.
     Seq(true, false).foreach { b =>
       val options = new CSVOptions(Map("nullValue" -> "null"), false, "GMT")
-      val converter =
-        parser.makeConverter("_1", StringType, nullable = b, options = options)
+      val parser = new UnivocityParser(StructType(Seq.empty), options)
+      val converter = parser.makeConverter("_1", StringType, nullable = b)
       assert(converter.apply("") == UTF8String.fromString(""))
     }
   }
   test("Throws exception for empty string with non null type") {
-      val options = new CSVOptions(Map.empty[String, String], false, "GMT")
+    val options = new CSVOptions(Map.empty[String, String], false, "GMT")
+    val parser = new UnivocityParser(StructType(Seq.empty), options)
     val exception = intercept[RuntimeException]{
-      parser.makeConverter("_1", IntegerType, nullable = false, options = options).apply("")
+      parser.makeConverter("_1", IntegerType, nullable = false).apply("")
     }
     assert(exception.getMessage.contains("null value found but field _1 is not nullable."))
   }
   test("Types are cast correctly") {
     val options = new CSVOptions(Map.empty[String, String], false, "GMT")
-    assert(parser.makeConverter("_1", ByteType, options = options).apply("10") == 10)
-    assert(parser.makeConverter("_1", ShortType, options = options).apply("10") == 10)
-    assert(parser.makeConverter("_1", IntegerType, options = options).apply("10") == 10)
-    assert(parser.makeConverter("_1", LongType, options = options).apply("10") == 10)
-    assert(parser.makeConverter("_1", FloatType, options = options).apply("1.00") == 1.0)
-    assert(parser.makeConverter("_1", DoubleType, options = options).apply("1.00") == 1.0)
-    assert(parser.makeConverter("_1", BooleanType, options = options).apply("true") == true)
-
-    val timestampsOptions =
+    var parser = new UnivocityParser(StructType(Seq.empty), options)
+    assert(parser.makeConverter("_1", ByteType).apply("10") == 10)
+    assert(parser.makeConverter("_1", ShortType).apply("10") == 10)
+    assert(parser.makeConverter("_1", IntegerType).apply("10") == 10)
+    assert(parser.makeConverter("_1", LongType).apply("10") == 10)
+    assert(parser.makeConverter("_1", FloatType).apply("1.00") == 1.0)
+    assert(parser.makeConverter("_1", DoubleType).apply("1.00") == 1.0)
+    assert(parser.makeConverter("_1", BooleanType).apply("true") == true)
+
+    var timestampsOptions =
       new CSVOptions(Map("timestampFormat" -> "dd/MM/ hh:mm"), false, "GMT")
+    parser = new UnivocityParser(StructType(Seq.empty), timestampsOptions)
     val customTimestamp = "31/01/2015 00:00"
-    val expectedTime = timestampsOptions.timestampFormat.parse(customTimestamp).getTime
-    val castedTimestamp =
-      parser.makeConverter("_1", TimestampType, nullable = true, options = timestampsOptions)
+    var format = FastDateFormat.getInstance(
+      timestampsOptions.timestampFormat, timestampsOptions.timeZone, timestampsOptions.locale)
+    val expectedTime = format.parse(customTimestamp).getTime
+    val castedTimestamp = parser.makeConverter("_1", TimestampType, nullable = true)
         .apply(customTimestamp)
     assert(castedTimestamp == expectedTime * 1000L)
     val customDate = "31/01/2015"
     val dateOptions = new CSVOptions(Map("dateFormat" -> "dd/MM/"), false, "GMT")
-    val expectedDate = dateOptions.dateFormat.parse(customDate).getTime
-    val castedDate =
-      parser.makeConverter("_1", DateType, nullable = true, options = dateOptions)
-        .apply(customTimestamp)
-    assert(castedDate == DateTimeUtils.millisToDays(expectedDate))
+    parser = new UnivocityParser(StructType(Seq.empty), dateOptions)
+    format = FastDateFormat.getInstance(
+      dateOptions.dateFormat, dateOptions.timeZone, dateOptions.locale)
+    val expectedDate = format.parse(customDate).getTime
+    val castedDate = parser.makeConverter("_1", DateType, nullable = true)
+        .apply(customDate)
+    assert(castedDate == DateTimeUtils.millisToDays(expectedDate, TimeZone.getTimeZone("GMT")))
     val timestamp = "2015-01-01 00:00:00"
-    assert(parser.makeConverter("_1", TimestampType, options = options).apply(timestamp) ==
-      DateTimeUtils.stringToTime(timestamp).getTime * 1000L)
-    assert(parser.makeConverter("_1", DateType, options =
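The assertions in the test above compare converter output against epoch milliseconds scaled to microseconds (`expectedTime * 1000L`) and against a day count for the GMT zone. A rough sketch of that arithmetic using only java.time follows; `EpochUnitsDemo` and its methods are hypothetical names, `Math.floorDiv` is merely a GMT-only stand-in for Spark's `DateTimeUtils.millisToDays`, and the `yyyy` in the pattern is an assumption because the archived diff shows the format as `dd/MM/ hh:mm`.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical illustration of the unit conversions the test asserts on.
class EpochUnitsDemo {
    static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000;

    // Parses a local date-time and interprets it in GMT/UTC.
    static long toEpochMillisGmt(String pattern, String text) {
        LocalDateTime local = LocalDateTime.parse(text, DateTimeFormatter.ofPattern(pattern));
        return local.toInstant(ZoneOffset.UTC).toEpochMilli();
    }

    // Timestamps in the test are compared in microseconds since the epoch.
    static long millisToMicros(long millis) {
        return millis * 1000L;
    }

    // Days since the epoch for GMT; floorDiv keeps pre-1970 instants on the right day.
    static long millisToDaysGmt(long millis) {
        return Math.floorDiv(millis, MILLIS_PER_DAY);
    }

    public static void main(String[] args) {
        long millis = toEpochMillisGmt("dd/MM/yyyy HH:mm", "31/01/2015 00:00");
        System.out.println("micros: " + millisToMicros(millis));
        System.out.println("days:   " + millisToDaysGmt(millis));
    }
}
```

For zones other than GMT the day boundary shifts by the zone offset, which is presumably why the updated assertion passes an explicit time zone.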
[GitHub] spark issue #23196: [SPARK-26243][SQL] Use java.time API for parsing timesta...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23196 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5617/ Test PASSed.
[GitHub] spark issue #23178: [SPARK-26216][SQL] Do not use case class as public API (...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23178 thanks for the review, merging to master!
[GitHub] spark pull request #23178: [SPARK-26216][SQL] Do not use case class as publi...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/23178#discussion_r238083055 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/expressions/UserDefinedFunction.scala --- @@ -38,114 +38,106 @@ import org.apache.spark.sql.types.DataType * @since 1.3.0 */ @Stable --- End diff -- It's not a new API anyway, so it would be weird to change the `@since` version to 3.0.
[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23088 Merged build finished. Test FAILed.
[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23088 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99562/ Test FAILed.
[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23088 **[Test build #99562 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99562/testReport)** for PR 23088 at commit [`64fbe5d`](https://github.com/apache/spark/commit/64fbe5d7b845b6351e2dae2af231d2be37ca13b8). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #23072: [SPARK-19827][R]spark.ml R API for PIC
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/23072#discussion_r238087240 --- Diff: examples/src/main/scala/org/apache/spark/examples/ml/FPGrowthExample.scala --- @@ -64,4 +64,3 @@ object FPGrowthExample { spark.stop() } } -// scalastyle:on println --- End diff -- yes, println is not used
[GitHub] spark issue #22683: [SPARK-25696] The storage memory displayed on spark Appl...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22683 **[Test build #99565 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99565/testReport)** for PR 22683 at commit [`5188c54`](https://github.com/apache/spark/commit/5188c54fcf33c24dac341c044f7ffa75c272bf52). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #23190: [MINOR][SQL]throw SparkOutOfMemoryError intead of SparkE...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23190 Merged build finished. Test FAILed.
[GitHub] spark issue #23190: [MINOR][SQL]throw SparkOutOfMemoryError intead of SparkE...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23190 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5619/ Test FAILed.
[GitHub] spark pull request #23130: [SPARK-26161][SQL] Ignore empty files in load
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/23130
[GitHub] spark pull request #23178: [SPARK-26216][SQL] Do not use case class as publi...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/23178
[GitHub] spark issue #23120: [SPARK-26151][SQL] Return partial results for bad CSV re...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23120 retest this please
[GitHub] spark issue #23120: [SPARK-26151][SQL] Return partial results for bad CSV re...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23120 **[Test build #99563 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99563/testReport)** for PR 23120 at commit [`8f2d69d`](https://github.com/apache/spark/commit/8f2d69d848b8242c529118436249019016069ca2).
[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23088 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5623/ Test PASSed.
[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23088 Merged build finished. Test PASSed.
[GitHub] spark issue #22683: [SPARK-25696] The storage memory displayed on spark Appl...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22683 **[Test build #99565 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99565/testReport)** for PR 22683 at commit [`5188c54`](https://github.com/apache/spark/commit/5188c54fcf33c24dac341c044f7ffa75c272bf52).
[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23088 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99564/ Test PASSed.
[GitHub] spark issue #23088: [SPARK-26119][CORE][WEBUI]Task summary table should cont...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23088 Merged build finished. Test PASSed.
[GitHub] spark issue #23190: [MINOR][SQL]throw SparkOutOfMemoryError intead of SparkE...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23190 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5616/ Test PASSed.
[GitHub] spark issue #23190: [MINOR][SQL]throw SparkOutOfMemoryError intead of SparkE...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23190 Merged build finished. Test PASSed.
[GitHub] spark pull request #23187: [SPARK-26211][SQL][TEST][FOLLOW-UP] Combine test ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/23187
[GitHub] spark issue #23190: [MINOR][SQL]throw SparkOutOfMemoryError intead of SparkE...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23190 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99560/ Test FAILed.
[GitHub] spark issue #23196: [SPARK-26243][SQL] Use java.time API for parsing timesta...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/23196 **[Test build #99558 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99558/testReport)** for PR 23196 at commit [`f326042`](https://github.com/apache/spark/commit/f326042aa1aff540d06c79fd73395204d846f3ea). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #23173: [SPARK-26208][SQL] add headers to empty csv files when h...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23173 Merged build finished. Test PASSed.
[GitHub] spark issue #23173: [SPARK-26208][SQL] add headers to empty csv files when h...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23173 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99559/ Test PASSed.
[GitHub] spark issue #23190: [MINOR][SQL]throw SparkOutOfMemoryError intead of SparkE...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/23190 Merged build finished. Test FAILed.
[GitHub] spark issue #23130: [SPARK-26161][SQL] Ignore empty files in load
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/23130 We don't need to block it, but @MaxGekk, if you have time, it would be great to answer https://github.com/apache/spark/pull/23130#issuecomment-442491582 thanks, merging to master!