[spark] branch master updated (a364cc0 -> 9b885ae)
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from a364cc0  [SPARK-38336][SQL] Support INSERT INTO commands into tables with DEFAULT columns
     add 9b885ae  [SPARK-38701][SQL] Inline `IllegalStateException` out from `QueryExecutionErrors`

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/catalog/interface.scala     |  4 +--
 .../expressions/codegen/CodeGenerator.scala        |  7 ++--
 .../sql/catalyst/expressions/jsonExpressions.scala |  4 +--
 .../catalyst/expressions/namedExpressions.scala    |  3 +-
 .../catalyst/optimizer/DecorrelateInnerQuery.scala |  3 +-
 .../spark/sql/errors/QueryExecutionErrors.scala    | 41 +-
 .../execution/datasources/DataSourceUtils.scala    |  4 +--
 7 files changed, 15 insertions(+), 51 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-38336][SQL] Support INSERT INTO commands into tables with DEFAULT columns
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new a364cc0  [SPARK-38336][SQL] Support INSERT INTO commands into tables with DEFAULT columns

a364cc0 is described below

commit a364cc01716279d19eb5b43bde4961b04f5102af
Author: Daniel Tenedorio
AuthorDate: Thu Mar 31 14:29:58 2022 +0800

    [SPARK-38336][SQL] Support INSERT INTO commands into tables with DEFAULT columns

    ### What changes were proposed in this pull request?

    Extend INSERT INTO statements to support omitting default values or referring
    to them explicitly with the DEFAULT keyword, in which case the Spark analyzer
    will automatically insert the appropriate corresponding values in the right
    places.

    Example:
    ```
    CREATE TABLE T(a INT DEFAULT 4, b INT NOT NULL DEFAULT 5);
    INSERT INTO T VALUES (1);
    INSERT INTO T VALUES (1, DEFAULT);
    INSERT INTO T VALUES (DEFAULT, 6);
    SELECT * FROM T;
    (1, 5)
    (1, 5)
    (4, 6)
    ```

    ### Why are the changes needed?

    This helps users issue INSERT INTO statements with less effort, and helps
    people creating or updating tables to add custom optional columns for use in
    specific circumstances as desired.

    ### How was this patch tested?

    This change is covered by new and existing unit test coverage as well as new
    INSERT INTO query test cases covering a variety of positive and negative
    scenarios.

    Closes #35982 from dtenedor/default-columns-insert-into.
    Authored-by: Daniel Tenedorio
    Signed-off-by: Gengliang Wang
---
 .../spark/sql/catalyst/parser/SqlBaseParser.g4     |   1 +
 .../spark/sql/catalyst/analysis/Analyzer.scala     |   1 +
 .../catalyst/analysis/ResolveDefaultColumns.scala  | 250
 .../spark/sql/catalyst/parser/AstBuilder.scala     |   5 +
 .../sql/catalyst/rules/RuleIdCollection.scala      |   1 +
 ...lumns.scala => ResolveDefaultColumnsUtil.scala} |  52 +++--
 .../spark/sql/errors/QueryParsingErrors.scala      |   5 +
 .../org/apache/spark/sql/internal/SQLConf.scala    |  14 ++
 .../org/apache/spark/sql/SQLInsertTestSuite.scala  |  30 ++-
 .../org/apache/spark/sql/sources/InsertSuite.scala | 258 -
 .../org/apache/spark/sql/hive/InsertSuite.scala    |  91
 11 files changed, 638 insertions(+), 70 deletions(-)

diff --git a/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4 b/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4
index 17c3395..872ea53 100644
--- a/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4
+++ b/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4
@@ -322,6 +322,7 @@ partitionSpec
 partitionVal
     : identifier (EQ constant)?
+    | identifier EQ DEFAULT
     ;

 namespace
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
index f69f17d..bd437c3 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
@@ -311,6 +311,7 @@ class Analyzer(override val catalogManager: CatalogManager)
       ResolveAggregateFunctions ::
       TimeWindowing ::
       SessionWindowing ::
+      ResolveDefaultColumns(this, v1SessionCatalog) ::
       ResolveInlineTables ::
       ResolveLambdaVariables ::
       ResolveTimeZone ::
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveDefaultColumns.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveDefaultColumns.scala
new file mode 100644
index 000..f4502c9
--- /dev/null
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveDefaultColumns.scala
@@ -0,0 +1,250 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.analysis
+
+imp
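The rewrite described in the commit message (padding omitted trailing columns and replacing explicit DEFAULT references with the column defaults) can be sketched roughly as below. This is a simplified, hypothetical model of the analysis step, not Spark's actual `ResolveDefaultColumns` rule; the `DEFAULT` sentinel and `resolveRow` helper are illustrative names only.

```java
import java.util.ArrayList;
import java.util.List;

public class DefaultFill {
    // Hypothetical marker for an explicit DEFAULT keyword in a VALUES list.
    static final Object DEFAULT = new Object();

    // Pads a VALUES row with column defaults and replaces explicit DEFAULT
    // references with the corresponding default expression values.
    static List<Object> resolveRow(List<Object> row, List<Object> defaults) {
        List<Object> out = new ArrayList<>();
        for (int i = 0; i < defaults.size(); i++) {
            // Columns omitted at the end of the row behave like DEFAULT.
            Object v = i < row.size() ? row.get(i) : DEFAULT;
            out.add(v == DEFAULT ? defaults.get(i) : v);
        }
        return out;
    }

    public static void main(String[] args) {
        // Mirrors: CREATE TABLE T(a INT DEFAULT 4, b INT NOT NULL DEFAULT 5)
        List<Object> defaults = List.of(4, 5);
        System.out.println(resolveRow(List.of((Object) 1), defaults));          // [1, 5]
        System.out.println(resolveRow(List.of((Object) 1, DEFAULT), defaults)); // [1, 5]
        System.out.println(resolveRow(List.of(DEFAULT, (Object) 6), defaults)); // [4, 6]
    }
}
```

The three calls reproduce the example output in the commit message: (1, 5), (1, 5), (4, 6).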
[spark] branch branch-3.3 updated: [SPARK-38650][SQL] Better ParseException message for char types without length
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch branch-3.3
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.3 by this push:
     new b9055a4  [SPARK-38650][SQL] Better ParseException message for char types without length

b9055a4 is described below

commit b9055a48a3150bcba2bc1708a57bfb48761ea8a1
Author: Xinyi Yu
AuthorDate: Thu Mar 31 13:24:59 2022 +0800

    [SPARK-38650][SQL] Better ParseException message for char types without length

    ### What changes were proposed in this pull request?

    This PR improves the error messages for the char / varchar / character data
    types without length. It also adds related test cases.

    Details: We support char and varchar types, but when users input the type
    without a length, the message is confusing and not helpful at all:
    ```
    > SELECT cast('a' as CHAR)

    DataType char is not supported.(line 1, pos 19)

    == SQL ==
    SELECT cast('a' AS CHAR)
    ---^^^
    ```
    After this change, the message would be:
    ```
    DataType char requires a length parameter, for example char(10). Please specify the length.

    == SQL ==
    SELECT cast('a' AS CHAR)
    ---^^^
    ```

    ### Why are the changes needed?

    To improve error messages for better usability.

    ### Does this PR introduce _any_ user-facing change?

    If error messages are considered as user-facing changes, then yes. It
    improves the messages as above.

    ### How was this patch tested?

    It's tested by newly added unit tests.

    Closes #35966 from anchovYu/better-msg-for-char.
    Authored-by: Xinyi Yu
    Signed-off-by: Wenchen Fan
    (cherry picked from commit d678ed488d176c89df7bff39c4f8b4675232b667)
    Signed-off-by: Wenchen Fan
---
 core/src/main/resources/error/error-classes.json            |  4
 .../org/apache/spark/sql/catalyst/parser/AstBuilder.scala   |  2 ++
 .../org/apache/spark/sql/errors/QueryParsingErrors.scala    |  4
 .../apache/spark/sql/catalyst/parser/ErrorParserSuite.scala | 13 +
 4 files changed, 23 insertions(+)

diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json
index e159e7c..d9e2e74 100644
--- a/core/src/main/resources/error/error-classes.json
+++ b/core/src/main/resources/error/error-classes.json
@@ -138,6 +138,10 @@
     "message" : [ "PARTITION clause cannot contain a non-partition column name: %s" ],
     "sqlState" : "42000"
   },
+  "PARSE_CHAR_MISSING_LENGTH" : {
+    "message" : [ "DataType %s requires a length parameter, for example %s(10). Please specify the length." ],
+    "sqlState" : "42000"
+  },
   "PARSE_EMPTY_STATEMENT" : {
     "message" : [ "Syntax error, unexpected empty statement" ],
     "sqlState" : "42000"
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
index 9266388..3a22c5e 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
@@ -2671,6 +2671,8 @@ class AstBuilder extends SqlBaseParserBaseVisitor[AnyRef] with SQLConfHelper wit
         DecimalType(precision.getText.toInt, scale.getText.toInt)
       case ("void", Nil) => NullType
       case ("interval", Nil) => CalendarIntervalType
+      case (dt @ ("character" | "char" | "varchar"), Nil) =>
+        throw QueryParsingErrors.charTypeMissingLengthError(dt, ctx)
       case (dt, params) =>
         val dtStr = if (params.nonEmpty) s"$dt(${params.mkString(",")})" else dt
         throw QueryParsingErrors.dataTypeUnsupportedError(dtStr, ctx)
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala
index c092958..e41c4cd 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala
@@ -220,6 +220,10 @@ object QueryParsingErrors {
     new ParseException(s"DataType $dataType is not supported.", ctx)
   }

+  def charTypeMissingLengthError(dataType: String, ctx: PrimitiveDataTypeContext): Throwable = {
+    new ParseException("PARSE_CHAR_MISSING_LENGTH", Array(dataType, dataType), ctx)
+  }
+
   def partitionTransformNotExpectedError(
       name: String, describe: String, ctx: ApplyTransformContext): Throwable = {
     new ParseException(s"Expected a column reference for transform $name: $describe", ctx)
diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ErrorParserSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/cataly
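The new error path resolves the `PARSE_CHAR_MISSING_LENGTH` template from `error-classes.json` and formats it with the data type name passed twice (once for the message, once for the `%s(10)` example). A minimal sketch of that lookup-and-format step is below; the `ERROR_CLASSES` map is a hypothetical stand-in for the JSON resource, and the real `ParseException` machinery does more (SQL state, query context).

```java
import java.util.Map;

public class ErrorClassDemo {
    // Hypothetical in-memory stand-in for error-classes.json.
    static final Map<String, String> ERROR_CLASSES = Map.of(
        "PARSE_CHAR_MISSING_LENGTH",
        "DataType %s requires a length parameter, for example %s(10). Please specify the length.");

    // Resolves an error-class template and formats it with the given args.
    static String getMessage(String errorClass, Object[] args) {
        return String.format(ERROR_CLASSES.get(errorClass), args);
    }

    public static void main(String[] args) {
        // Mirrors charTypeMissingLengthError("char", ctx): the data type name
        // fills both %s placeholders.
        System.out.println(getMessage("PARSE_CHAR_MISSING_LENGTH", new Object[]{"char", "char"}));
        // -> DataType char requires a length parameter, for example char(10). Please specify the length.
    }
}
```

Passing `Array(dataType, dataType)` in the Scala code corresponds to supplying both placeholders from the same type name, which is why the message can show a concrete `char(10)` example.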
[spark] branch master updated: [SPARK-38650][SQL] Better ParseException message for char types without length
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new d678ed4  [SPARK-38650][SQL] Better ParseException message for char types without length

d678ed4 is described below

commit d678ed488d176c89df7bff39c4f8b4675232b667
Author: Xinyi Yu
AuthorDate: Thu Mar 31 13:24:59 2022 +0800

    [SPARK-38650][SQL] Better ParseException message for char types without length

    ### What changes were proposed in this pull request?

    This PR improves the error messages for the char / varchar / character data
    types without length. It also adds related test cases.

    Details: We support char and varchar types, but when users input the type
    without a length, the message is confusing and not helpful at all:
    ```
    > SELECT cast('a' as CHAR)

    DataType char is not supported.(line 1, pos 19)

    == SQL ==
    SELECT cast('a' AS CHAR)
    ---^^^
    ```
    After this change, the message would be:
    ```
    DataType char requires a length parameter, for example char(10). Please specify the length.

    == SQL ==
    SELECT cast('a' AS CHAR)
    ---^^^
    ```

    ### Why are the changes needed?

    To improve error messages for better usability.

    ### Does this PR introduce _any_ user-facing change?

    If error messages are considered as user-facing changes, then yes. It
    improves the messages as above.

    ### How was this patch tested?

    It's tested by newly added unit tests.

    Closes #35966 from anchovYu/better-msg-for-char.
    Authored-by: Xinyi Yu
    Signed-off-by: Wenchen Fan
---
 core/src/main/resources/error/error-classes.json            |  4
 .../org/apache/spark/sql/catalyst/parser/AstBuilder.scala   |  2 ++
 .../org/apache/spark/sql/errors/QueryParsingErrors.scala    |  4
 .../apache/spark/sql/catalyst/parser/ErrorParserSuite.scala | 13 +
 4 files changed, 23 insertions(+)

diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json
index e159e7c..d9e2e74 100644
--- a/core/src/main/resources/error/error-classes.json
+++ b/core/src/main/resources/error/error-classes.json
@@ -138,6 +138,10 @@
     "message" : [ "PARTITION clause cannot contain a non-partition column name: %s" ],
     "sqlState" : "42000"
   },
+  "PARSE_CHAR_MISSING_LENGTH" : {
+    "message" : [ "DataType %s requires a length parameter, for example %s(10). Please specify the length." ],
+    "sqlState" : "42000"
+  },
   "PARSE_EMPTY_STATEMENT" : {
     "message" : [ "Syntax error, unexpected empty statement" ],
     "sqlState" : "42000"
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
index 01e627f..c7925c9 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
@@ -2679,6 +2679,8 @@ class AstBuilder extends SqlBaseParserBaseVisitor[AnyRef] with SQLConfHelper wit
         DecimalType(precision.getText.toInt, scale.getText.toInt)
       case ("void", Nil) => NullType
       case ("interval", Nil) => CalendarIntervalType
+      case (dt @ ("character" | "char" | "varchar"), Nil) =>
+        throw QueryParsingErrors.charTypeMissingLengthError(dt, ctx)
       case (dt, params) =>
         val dtStr = if (params.nonEmpty) s"$dt(${params.mkString(",")})" else dt
         throw QueryParsingErrors.dataTypeUnsupportedError(dtStr, ctx)
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala
index d40f276..b13a530 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala
@@ -221,6 +221,10 @@ object QueryParsingErrors {
     new ParseException(s"DataType $dataType is not supported.", ctx)
   }

+  def charTypeMissingLengthError(dataType: String, ctx: PrimitiveDataTypeContext): Throwable = {
+    new ParseException("PARSE_CHAR_MISSING_LENGTH", Array(dataType, dataType), ctx)
+  }
+
   def partitionTransformNotExpectedError(
       name: String, describe: String, ctx: ApplyTransformContext): Throwable = {
     new ParseException(s"Expected a column reference for transform $name: $describe", ctx)
diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ErrorParserSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ErrorParserSuite.scala
index 20e17a8..c42f725 100644
--- a/sql/catalyst/src/test/scala/org/apache/spar
[spark] branch branch-3.3 updated: [SPARK-38698][SQL] Provide query context in runtime error of Divide/Div/Reminder/Pmod
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a commit to branch branch-3.3
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.3 by this push:
     new 20d545c  [SPARK-38698][SQL] Provide query context in runtime error of Divide/Div/Reminder/Pmod

20d545c is described below

commit 20d545c01594b03e0815823e8dca600fd2c1de55
Author: Gengliang Wang
AuthorDate: Thu Mar 31 13:18:45 2022 +0800

    [SPARK-38698][SQL] Provide query context in runtime error of Divide/Div/Reminder/Pmod

    ### What changes were proposed in this pull request?

    Provide SQL query context in the following runtime errors:
    - Divide: divide-by-0 error, including numeric types and ANSI interval types
    - Integral Divide: divide-by-0 error and overflow error
    - Remainder: divide-by-0 error
    - Pmod: divide-by-0 error

    Example 1:
    ```
    == SQL(line 1, position 7) ==
    select smallint('100') / bigint('0')
           ^
    ```
    Example 2:
    ```
    == SQL(line 1, position 7) ==
    select interval '2' year / 0
           ^
    ```

    ### Why are the changes needed?

    Provide the SQL query context of runtime errors to users, so that they can
    understand them better.

    ### Does this PR introduce _any_ user-facing change?

    Yes, it improves the runtime error message of Divide/Div/Remainder/Pmod.

    ### How was this patch tested?

    UT

    Closes #36013 from gengliangwang/divideError.
    Authored-by: Gengliang Wang
    Signed-off-by: Gengliang Wang
    (cherry picked from commit e96883d98c32cef04d6015d9937979d663c1e754)
    Signed-off-by: Gengliang Wang
---
 core/src/main/resources/error/error-classes.json   |   2 +-
 .../org/apache/spark/SparkThrowableSuite.scala     |   4 +-
 .../sql/catalyst/expressions/arithmetic.scala      |  19 +--
 .../catalyst/expressions/intervalExpressions.scala | 172 +++--
 .../spark/sql/catalyst/util/IntervalUtils.scala    |   2 +-
 .../spark/sql/errors/QueryExecutionErrors.scala    |   8 +-
 .../expressions/ArithmeticExpressionSuite.scala    |  46 ++
 .../sql-tests/results/ansi/interval.sql.out        |  18 +++
 .../resources/sql-tests/results/interval.sql.out   |  18 +++
 .../sql-tests/results/postgreSQL/case.sql.out      |   9 ++
 .../sql-tests/results/postgreSQL/int8.sql.out      |   9 ++
 .../results/postgreSQL/select_having.sql.out       |   3 +
 .../results/udf/postgreSQL/udf-case.sql.out        |   9 ++
 .../udf/postgreSQL/udf-select_having.sql.out       |   3 +
 14 files changed, 227 insertions(+), 95 deletions(-)

diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json
index cd47d50..e159e7c 100644
--- a/core/src/main/resources/error/error-classes.json
+++ b/core/src/main/resources/error/error-classes.json
@@ -33,7 +33,7 @@
     "sqlState" : "22008"
   },
   "DIVIDE_BY_ZERO" : {
-    "message" : [ "divide by zero. To return NULL instead, use 'try_divide'. If necessary set %s to false (except for ANSI interval type) to bypass this error." ],
+    "message" : [ "divide by zero. To return NULL instead, use 'try_divide'. If necessary set %s to false (except for ANSI interval type) to bypass this error.%s" ],
     "sqlState" : "22012"
   },
   "DUPLICATE_KEY" : {
diff --git a/core/src/test/scala/org/apache/spark/SparkThrowableSuite.scala b/core/src/test/scala/org/apache/spark/SparkThrowableSuite.scala
index 47df19f..f1eb27c 100644
--- a/core/src/test/scala/org/apache/spark/SparkThrowableSuite.scala
+++ b/core/src/test/scala/org/apache/spark/SparkThrowableSuite.scala
@@ -124,9 +124,9 @@ class SparkThrowableSuite extends SparkFunSuite {
   }

     // Does not fail with too many args (expects 0 args)
-    assert(getMessage("DIVIDE_BY_ZERO", Array("foo", "bar")) ==
+    assert(getMessage("DIVIDE_BY_ZERO", Array("foo", "bar", "baz")) ==
       "divide by zero. To return NULL instead, use 'try_divide'. If necessary set foo to false " +
-      "(except for ANSI interval type) to bypass this error.")
+      "(except for ANSI interval type) to bypass this error.bar")
   }

   test("Error message is formatted") {
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala
index 7251e47..c6d66d8 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala
@@ -457,10 +457,10 @@ trait DivModLike extends BinaryArithmetic {
     } else {
       if (isZero(input2)) {
         // when we reach here, failOnError must be true.
-        throw QueryExecutionErrors.divideByZeroError()
+        throw QueryExecutionErrors.divideByZeroError(o
[spark] branch master updated: [SPARK-38698][SQL] Provide query context in runtime error of Divide/Div/Reminder/Pmod
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new e96883d  [SPARK-38698][SQL] Provide query context in runtime error of Divide/Div/Reminder/Pmod

e96883d is described below

commit e96883d98c32cef04d6015d9937979d663c1e754
Author: Gengliang Wang
AuthorDate: Thu Mar 31 13:18:45 2022 +0800

    [SPARK-38698][SQL] Provide query context in runtime error of Divide/Div/Reminder/Pmod

    ### What changes were proposed in this pull request?

    Provide SQL query context in the following runtime errors:
    - Divide: divide-by-0 error, including numeric types and ANSI interval types
    - Integral Divide: divide-by-0 error and overflow error
    - Remainder: divide-by-0 error
    - Pmod: divide-by-0 error

    Example 1:
    ```
    == SQL(line 1, position 7) ==
    select smallint('100') / bigint('0')
           ^
    ```
    Example 2:
    ```
    == SQL(line 1, position 7) ==
    select interval '2' year / 0
           ^
    ```

    ### Why are the changes needed?

    Provide the SQL query context of runtime errors to users, so that they can
    understand them better.

    ### Does this PR introduce _any_ user-facing change?

    Yes, it improves the runtime error message of Divide/Div/Remainder/Pmod.

    ### How was this patch tested?

    UT

    Closes #36013 from gengliangwang/divideError.
    Authored-by: Gengliang Wang
    Signed-off-by: Gengliang Wang
---
 core/src/main/resources/error/error-classes.json   |   2 +-
 .../org/apache/spark/SparkThrowableSuite.scala     |   4 +-
 .../sql/catalyst/expressions/arithmetic.scala      |  19 +--
 .../catalyst/expressions/intervalExpressions.scala | 172 +++--
 .../spark/sql/catalyst/util/IntervalUtils.scala    |   2 +-
 .../spark/sql/errors/QueryExecutionErrors.scala    |   8 +-
 .../expressions/ArithmeticExpressionSuite.scala    |  46 ++
 .../sql-tests/results/ansi/interval.sql.out        |  18 +++
 .../resources/sql-tests/results/interval.sql.out   |  18 +++
 .../sql-tests/results/postgreSQL/case.sql.out      |   9 ++
 .../sql-tests/results/postgreSQL/int8.sql.out      |   9 ++
 .../results/postgreSQL/select_having.sql.out       |   3 +
 .../results/udf/postgreSQL/udf-case.sql.out        |   9 ++
 .../udf/postgreSQL/udf-select_having.sql.out       |   3 +
 14 files changed, 227 insertions(+), 95 deletions(-)

diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json
index cd47d50..e159e7c 100644
--- a/core/src/main/resources/error/error-classes.json
+++ b/core/src/main/resources/error/error-classes.json
@@ -33,7 +33,7 @@
     "sqlState" : "22008"
   },
   "DIVIDE_BY_ZERO" : {
-    "message" : [ "divide by zero. To return NULL instead, use 'try_divide'. If necessary set %s to false (except for ANSI interval type) to bypass this error." ],
+    "message" : [ "divide by zero. To return NULL instead, use 'try_divide'. If necessary set %s to false (except for ANSI interval type) to bypass this error.%s" ],
     "sqlState" : "22012"
   },
   "DUPLICATE_KEY" : {
diff --git a/core/src/test/scala/org/apache/spark/SparkThrowableSuite.scala b/core/src/test/scala/org/apache/spark/SparkThrowableSuite.scala
index 47df19f..f1eb27c 100644
--- a/core/src/test/scala/org/apache/spark/SparkThrowableSuite.scala
+++ b/core/src/test/scala/org/apache/spark/SparkThrowableSuite.scala
@@ -124,9 +124,9 @@ class SparkThrowableSuite extends SparkFunSuite {
   }

     // Does not fail with too many args (expects 0 args)
-    assert(getMessage("DIVIDE_BY_ZERO", Array("foo", "bar")) ==
+    assert(getMessage("DIVIDE_BY_ZERO", Array("foo", "bar", "baz")) ==
       "divide by zero. To return NULL instead, use 'try_divide'. If necessary set foo to false " +
-      "(except for ANSI interval type) to bypass this error.")
+      "(except for ANSI interval type) to bypass this error.bar")
   }

   test("Error message is formatted") {
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala
index 7251e47..c6d66d8 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala
@@ -457,10 +457,10 @@ trait DivModLike extends BinaryArithmetic {
     } else {
       if (isZero(input2)) {
         // when we reach here, failOnError must be true.
-        throw QueryExecutionErrors.divideByZeroError()
+        throw QueryExecutionErrors.divideByZeroError(origin.context)
       }
       if (checkDivideOverflow && input1 == Long.MinValue && input2 == -1) {
-
[spark] branch branch-3.3 updated: [SPARK-38706][CORE] Use URI in `FallbackStorage.copy`
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.3
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.3 by this push:
     new 48839f6  [SPARK-38706][CORE] Use URI in `FallbackStorage.copy`

48839f6 is described below

commit 48839f6cad14a3462c278d1a3c10b35dde1adcc3
Author: William Hyun
AuthorDate: Wed Mar 30 21:15:50 2022 -0700

    [SPARK-38706][CORE] Use URI in `FallbackStorage.copy`

    ### What changes were proposed in this pull request?

    This PR aims to use URI in the `FallbackStorage.copy` method.

    ### Why are the changes needed?

    Like the case of SPARK-38652, the current fallback feature is broken with
    `S3A` due to Hadoop 3.3.2's `org.apache.hadoop.fs.PathIOException`.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Manually start one master and executor and decommission the executor.
    ```
    spark.decommission.enabled                        true
    spark.storage.decommission.enabled                true
    spark.storage.decommission.shuffleBlocks.enabled  true
    spark.storage.decommission.fallbackStorage.path   s3a://spark/storage/
    ```
    ```
    $ curl -v -X POST -d "host=hostname" http://hostname:8080/workers/kill/
    ```

    Closes #36017 from williamhyun/fallbackstorage.
    Authored-by: William Hyun
    Signed-off-by: Dongjoon Hyun
    (cherry picked from commit 60d09213105f235793f3418d79e6755561a19b15)
    Signed-off-by: Dongjoon Hyun
---
 core/src/main/scala/org/apache/spark/storage/FallbackStorage.scala | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/core/src/main/scala/org/apache/spark/storage/FallbackStorage.scala b/core/src/main/scala/org/apache/spark/storage/FallbackStorage.scala
index 0c1206c..e644ffe 100644
--- a/core/src/main/scala/org/apache/spark/storage/FallbackStorage.scala
+++ b/core/src/main/scala/org/apache/spark/storage/FallbackStorage.scala
@@ -63,14 +63,14 @@ private[storage] class FallbackStorage(conf: SparkConf) extends Logging {
     if (indexFile.exists()) {
       val hash = JavaUtils.nonNegativeHash(indexFile.getName)
       fallbackFileSystem.copyFromLocalFile(
-        new Path(indexFile.getAbsolutePath),
+        new Path(Utils.resolveURI(indexFile.getAbsolutePath)),
         new Path(fallbackPath, s"$appId/$shuffleId/$hash/${indexFile.getName}"))

       val dataFile = r.getDataFile(shuffleId, mapId)
       if (dataFile.exists()) {
         val hash = JavaUtils.nonNegativeHash(dataFile.getName)
         fallbackFileSystem.copyFromLocalFile(
-          new Path(dataFile.getAbsolutePath),
+          new Path(Utils.resolveURI(dataFile.getAbsolutePath)),
           new Path(fallbackPath, s"$appId/$shuffleId/$hash/${dataFile.getName}"))
       }
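The fix replaces a bare local path with a resolved URI: a scheme-less string like `/tmp/shuffle_0_0.index` can be misinterpreted when the destination file system is remote (e.g. `s3a://`), while an explicit `file:` URI is unambiguous. A simplified stand-in for the resolution step is sketched below; this is not Spark's actual `Utils.resolveURI`, which handles more cases (Windows paths, fragments).

```java
import java.io.File;
import java.net.URI;

public class ResolveUriDemo {
    // Simplified stand-in for Spark's Utils.resolveURI: strings that already
    // carry a scheme pass through; scheme-less strings are resolved against
    // the local file system so they gain an explicit "file:" scheme.
    static URI resolveURI(String path) {
        URI uri = URI.create(path);
        if (uri.getScheme() != null) {
            return uri;  // already an absolute URI, e.g. s3a://...
        }
        return new File(path).getAbsoluteFile().toURI();
    }

    public static void main(String[] args) {
        // A bare local path becomes a file: URI, so a Hadoop Path built from
        // it cannot be mistaken for a path on the fallback file system.
        System.out.println(resolveURI("/tmp/shuffle_0_0.index").getScheme());  // file
        System.out.println(resolveURI("s3a://spark/storage/x").getScheme());   // s3a
    }
}
```

This mirrors why wrapping `indexFile.getAbsolutePath` in `Utils.resolveURI` avoids the Hadoop 3.3.2 `PathIOException` described above.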
[spark] branch master updated: [SPARK-38706][CORE] Use URI in `FallbackStorage.copy`
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 60d0921  [SPARK-38706][CORE] Use URI in `FallbackStorage.copy`

60d0921 is described below

commit 60d09213105f235793f3418d79e6755561a19b15
Author: William Hyun
AuthorDate: Wed Mar 30 21:15:50 2022 -0700

    [SPARK-38706][CORE] Use URI in `FallbackStorage.copy`

    ### What changes were proposed in this pull request?

    This PR aims to use URI in the `FallbackStorage.copy` method.

    ### Why are the changes needed?

    Like the case of SPARK-38652, the current fallback feature is broken with
    `S3A` due to Hadoop 3.3.2's `org.apache.hadoop.fs.PathIOException`.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Manually start one master and executor and decommission the executor.
    ```
    spark.decommission.enabled                        true
    spark.storage.decommission.enabled                true
    spark.storage.decommission.shuffleBlocks.enabled  true
    spark.storage.decommission.fallbackStorage.path   s3a://spark/storage/
    ```
    ```
    $ curl -v -X POST -d "host=hostname" http://hostname:8080/workers/kill/
    ```

    Closes #36017 from williamhyun/fallbackstorage.
    Authored-by: William Hyun
    Signed-off-by: Dongjoon Hyun
---
 core/src/main/scala/org/apache/spark/storage/FallbackStorage.scala | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/core/src/main/scala/org/apache/spark/storage/FallbackStorage.scala b/core/src/main/scala/org/apache/spark/storage/FallbackStorage.scala
index 0c1206c..e644ffe 100644
--- a/core/src/main/scala/org/apache/spark/storage/FallbackStorage.scala
+++ b/core/src/main/scala/org/apache/spark/storage/FallbackStorage.scala
@@ -63,14 +63,14 @@ private[storage] class FallbackStorage(conf: SparkConf) extends Logging {
     if (indexFile.exists()) {
       val hash = JavaUtils.nonNegativeHash(indexFile.getName)
       fallbackFileSystem.copyFromLocalFile(
-        new Path(indexFile.getAbsolutePath),
+        new Path(Utils.resolveURI(indexFile.getAbsolutePath)),
         new Path(fallbackPath, s"$appId/$shuffleId/$hash/${indexFile.getName}"))

       val dataFile = r.getDataFile(shuffleId, mapId)
       if (dataFile.exists()) {
         val hash = JavaUtils.nonNegativeHash(dataFile.getName)
         fallbackFileSystem.copyFromLocalFile(
-          new Path(dataFile.getAbsolutePath),
+          new Path(Utils.resolveURI(dataFile.getAbsolutePath)),
           new Path(fallbackPath, s"$appId/$shuffleId/$hash/${dataFile.getName}"))
       }
[spark] branch master updated (ef8fb9b -> 26e93f9)
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from ef8fb9b  [SPARK-38694][TESTS] Simplify Java UT code with Junit `assertThrows` Api
     add 26e93f9  [SPARK-38705][SQL] Use function identifier in create and drop function command

No new revisions were added by this update.

Summary of changes:
 .../catalyst/analysis/ResolveSessionCatalog.scala  |  8 +++
 .../spark/sql/execution/SparkSqlParser.scala       |  8 +++
 .../spark/sql/execution/command/functions.scala    | 25 ++
 .../sql/execution/command/DDLParserSuite.scala     |  6 +++---
 4 files changed, 21 insertions(+), 26 deletions(-)
[spark] branch master updated: [SPARK-38694][TESTS] Simplify Java UT code with Junit `assertThrows` Api
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new ef8fb9b  [SPARK-38694][TESTS] Simplify Java UT code with Junit `assertThrows` Api

ef8fb9b is described below

commit ef8fb9b9d84b6adfe5a4e03b6e775e709d624144
Author: yangjie01
AuthorDate: Wed Mar 30 18:32:37 2022 -0500

    [SPARK-38694][TESTS] Simplify Java UT code with Junit `assertThrows` Api

    ### What changes were proposed in this pull request?

    There are some code patterns in Spark Java UTs:
    ```java
    @Test
    public void testAuthReplay() throws Exception {
      try {
        doSomeOperation();
        fail("Should have failed");
      } catch (Exception e) {
        assertTrue(doExceptionCheck(e));
      }
    }
    ```
    or
    ```java
    @Test(expected = SomeException.class)
    public void testAuthReplay() throws Exception {
      try {
        doSomeOperation();
        fail("Should have failed");
      } catch (Exception e) {
        assertTrue(doExceptionCheck(e));
        throw e;
      }
    }
    ```
    This PR uses the JUnit `assertThrows` API to simplify these patterns.

    ### Why are the changes needed?

    Simplify code.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Pass GA.

    Closes #36008 from LuciferYang/SPARK-38694.
Authored-by: yangjie01 Signed-off-by: Sean Owen --- .../spark/util/kvstore/InMemoryStoreSuite.java | 21 +- .../apache/spark/util/kvstore/LevelDBSuite.java| 21 +- .../apache/spark/util/kvstore/RocksDBSuite.java| 21 +- .../spark/network/crypto/AuthIntegrationSuite.java | 39 +- .../spark/network/crypto/TransportCipherSuite.java | 21 +- .../apache/spark/network/sasl/SparkSaslSuite.java | 41 +- .../server/OneForOneStreamManagerSuite.java| 23 +- .../spark/network/sasl/SaslIntegrationSuite.java | 37 +- .../network/shuffle/ExternalBlockHandlerSuite.java | 14 +- .../shuffle/ExternalShuffleBlockResolverSuite.java | 17 +- .../shuffle/ExternalShuffleSecuritySuite.java | 16 +- .../shuffle/OneForOneBlockFetcherSuite.java| 14 +- .../shuffle/RemoteBlockPushResolverSuite.java | 464 + .../apache/spark/unsafe/types/UTF8StringSuite.java | 8 +- .../apache/spark/launcher/SparkLauncherSuite.java | 15 +- .../shuffle/sort/PackedRecordPointerSuite.java | 14 +- .../unsafe/map/AbstractBytesToBytesMapSuite.java | 40 +- .../java/test/org/apache/spark/JavaAPISuite.java | 16 +- .../spark/launcher/CommandBuilderUtilsSuite.java | 7 +- .../apache/spark/launcher/LauncherServerSuite.java | 14 +- .../JavaRandomForestClassifierSuite.java | 8 +- .../regression/JavaRandomForestRegressorSuite.java | 8 +- .../spark/ml/util/JavaDefaultReadWriteSuite.java | 8 +- .../expressions/RowBasedKeyValueBatchSuite.java| 60 +-- .../spark/sql/JavaBeanDeserializationSuite.java| 15 +- .../spark/sql/JavaColumnExpressionSuite.java | 16 +- 26 files changed, 317 insertions(+), 661 deletions(-) diff --git a/common/kvstore/src/test/java/org/apache/spark/util/kvstore/InMemoryStoreSuite.java b/common/kvstore/src/test/java/org/apache/spark/util/kvstore/InMemoryStoreSuite.java index 198b6e8..b2acd1a 100644 --- a/common/kvstore/src/test/java/org/apache/spark/util/kvstore/InMemoryStoreSuite.java +++ b/common/kvstore/src/test/java/org/apache/spark/util/kvstore/InMemoryStoreSuite.java @@ -34,24 +34,14 @@ public class InMemoryStoreSuite 
{
     t.id = "id";
     t.name = "name";
-    try {
-      store.read(CustomType1.class, t.key);
-      fail("Expected exception for non-existent object.");
-    } catch (NoSuchElementException nsee) {
-      // Expected.
-    }
+    assertThrows(NoSuchElementException.class, () -> store.read(CustomType1.class, t.key));

     store.write(t);

     assertEquals(t, store.read(t.getClass(), t.key));
     assertEquals(1L, store.count(t.getClass()));

     store.delete(t.getClass(), t.key);
-    try {
-      store.read(t.getClass(), t.key);
-      fail("Expected exception for deleted object.");
-    } catch (NoSuchElementException nsee) {
-      // Expected.
-    }
+    assertThrows(NoSuchElementException.class, () -> store.read(t.getClass(), t.key));
   }

   @Test
@@ -78,12 +68,7 @@ public class InMemoryStoreSuite {
     store.delete(t1.getClass(), t1.key);
     assertEquals(t2, store.read(t2.getClass(), t2.key));
     store.delete(t2.getClass(), t2.key);
-    try {
-      store.read(t2.getClass(), t2.key);
-      fail("Expected exception for deleted object.");
-    } catch (NoSuchElementException nsee) {
-      // Expected.
-    }
+    assertThrows(NoSuchElementException.class, () -> store.read(t2.getClass(), t2.key));
   }

   @Test
diff --git
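The try/fail/catch blocks removed above all encode the same contract that `assertThrows` provides: the action must throw the expected exception type, and the caught exception is returned for further checks. A minimal stdlib-only model of that contract (JUnit itself is not used here; `AssertThrowsDemo` is a hypothetical name for illustration):

```java
import java.util.NoSuchElementException;

public class AssertThrowsDemo {
    // Minimal model of JUnit's assertThrows: run the action, verify it throws
    // an exception of the expected type, and hand that exception back so the
    // caller can make further assertions on it.
    static <T extends Throwable> T assertThrows(Class<T> expected, Runnable action) {
        try {
            action.run();
        } catch (Throwable t) {
            if (expected.isInstance(t)) {
                return expected.cast(t);
            }
            throw new AssertionError("Unexpected exception type: " + t.getClass(), t);
        }
        throw new AssertionError("Expected " + expected.getName() + " but nothing was thrown");
    }

    public static void main(String[] args) {
        // The old try/fail/catch pattern collapses into a single call:
        NoSuchElementException e = assertThrows(NoSuchElementException.class,
            () -> { throw new NoSuchElementException("missing key"); });
        System.out.println(e.getMessage()); // missing key
    }
}
```

This is why the refactoring shortens each test site to one line: the boilerplate control flow lives once, behind the helper.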
[spark] branch branch-3.0 updated: [SPARK-38652][K8S] `uploadFileUri` should preserve file scheme
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new dd54888 [SPARK-38652][K8S] `uploadFileUri` should preserve file scheme
dd54888 is described below

commit dd54888e546cecf9d542ce4448c78c77e55655f3
Author: Dongjoon Hyun
AuthorDate: Wed Mar 30 08:26:41 2022 -0700

[SPARK-38652][K8S] `uploadFileUri` should preserve file scheme

### What changes were proposed in this pull request?
This PR replaces `new Path(fileUri.getPath)` with `new Path(fileUri)`. By using the `Path` class constructor with a URI parameter, we can preserve the file scheme.

### Why are the changes needed?
If we use the `Path` class constructor with a `String` parameter, it loses the file scheme information. Although the original code has worked so far, it fails on Apache Hadoop 3.3.2 and breaks the dependency upload feature, which is covered by the K8s Minikube integration tests.

```scala
test("uploadFileUri") {
  val fileUri = org.apache.spark.util.Utils.resolveURI("/tmp/1.txt")
  assert(new Path(fileUri).toString == "file:/private/tmp/1.txt")
  assert(new Path(fileUri.getPath).toString == "/private/tmp/1.txt")
}
```

### Does this PR introduce _any_ user-facing change?
No, this prevents a regression in Apache Spark 3.3.0 instead.

### How was this patch tested?
Pass the CIs. In addition, this PR and #36009 will recover the K8s IT `DepsTestsSuite`.

Closes #36010 from dongjoon-hyun/SPARK-38652.
Authored-by: Dongjoon Hyun Signed-off-by: Dongjoon Hyun (cherry picked from commit cab8aa1c4fe66c4cb1b69112094a203a04758f76) Signed-off-by: Dongjoon Hyun --- .../src/main/scala/org/apache/spark/deploy/k8s/KubernetesUtils.scala| 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/KubernetesUtils.scala b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/KubernetesUtils.scala index c49f4a1..69ea071 100644 --- a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/KubernetesUtils.scala +++ b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/KubernetesUtils.scala @@ -282,7 +282,7 @@ private[spark] object KubernetesUtils extends Logging { fs.mkdirs(new Path(s"${uploadPath}/${randomDirName}")) val targetUri = s"${uploadPath}/${randomDirName}/${fileUri.getPath.split("/").last}" log.info(s"Uploading file: ${fileUri.getPath} to dest: $targetUri...") -uploadFileToHadoopCompatibleFS(new Path(fileUri.getPath), new Path(targetUri), fs) +uploadFileToHadoopCompatibleFS(new Path(fileUri), new Path(targetUri), fs) targetUri } catch { case e: Exception => - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
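The scheme loss described in the commit message is visible with plain `java.net.URI` as well, without involving Hadoop's `Path` class at all: `getPath()` returns only the path component of the URI, dropping the `file:` scheme. A small sketch of that behavior:

```java
import java.net.URI;

public class UriSchemeDemo {
    public static void main(String[] args) {
        URI fileUri = URI.create("file:/tmp/1.txt");

        // Keeping the whole URI preserves the scheme...
        System.out.println(fileUri.toString()); // file:/tmp/1.txt

        // ...while getPath() yields only the path component, with "file:" gone.
        // Anything later constructed from this String can no longer tell a
        // local file apart from a path on the default filesystem.
        System.out.println(fileUri.getPath());  // /tmp/1.txt
    }
}
```

This is exactly why `new Path(fileUri)` (URI-based constructor) is safer than `new Path(fileUri.getPath)` (String-based constructor) in the fix above.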
[spark] branch branch-3.1 updated: [SPARK-38652][K8S] `uploadFileUri` should preserve file scheme
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch branch-3.1 in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.1 by this push:
     new 4b3f8d4 [SPARK-38652][K8S] `uploadFileUri` should preserve file scheme
4b3f8d4 is described below

commit 4b3f8d4a7b8929c34ce373f1bbc22062bbbf114e
Author: Dongjoon Hyun
AuthorDate: Wed Mar 30 08:26:41 2022 -0700

[SPARK-38652][K8S] `uploadFileUri` should preserve file scheme

### What changes were proposed in this pull request?
This PR replaces `new Path(fileUri.getPath)` with `new Path(fileUri)`. By using the `Path` class constructor with a URI parameter, we can preserve the file scheme.

### Why are the changes needed?
If we use the `Path` class constructor with a `String` parameter, it loses the file scheme information. Although the original code has worked so far, it fails on Apache Hadoop 3.3.2 and breaks the dependency upload feature, which is covered by the K8s Minikube integration tests.

```scala
test("uploadFileUri") {
  val fileUri = org.apache.spark.util.Utils.resolveURI("/tmp/1.txt")
  assert(new Path(fileUri).toString == "file:/private/tmp/1.txt")
  assert(new Path(fileUri.getPath).toString == "/private/tmp/1.txt")
}
```

### Does this PR introduce _any_ user-facing change?
No, this prevents a regression in Apache Spark 3.3.0 instead.

### How was this patch tested?
Pass the CIs. In addition, this PR and #36009 will recover the K8s IT `DepsTestsSuite`.

Closes #36010 from dongjoon-hyun/SPARK-38652.
Authored-by: Dongjoon Hyun Signed-off-by: Dongjoon Hyun (cherry picked from commit cab8aa1c4fe66c4cb1b69112094a203a04758f76) Signed-off-by: Dongjoon Hyun --- .../src/main/scala/org/apache/spark/deploy/k8s/KubernetesUtils.scala| 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/KubernetesUtils.scala b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/KubernetesUtils.scala index 7e5edd9..86a2daa 100644 --- a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/KubernetesUtils.scala +++ b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/KubernetesUtils.scala @@ -289,7 +289,7 @@ private[spark] object KubernetesUtils extends Logging { fs.mkdirs(new Path(s"${uploadPath}/${randomDirName}")) val targetUri = s"${uploadPath}/${randomDirName}/${fileUri.getPath.split("/").last}" log.info(s"Uploading file: ${fileUri.getPath} to dest: $targetUri...") -uploadFileToHadoopCompatibleFS(new Path(fileUri.getPath), new Path(targetUri), fs) +uploadFileToHadoopCompatibleFS(new Path(fileUri), new Path(targetUri), fs) targetUri } catch { case e: Exception => - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.2 updated: [SPARK-38652][K8S] `uploadFileUri` should preserve file scheme
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch branch-3.2 in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.2 by this push:
     new 01600ae [SPARK-38652][K8S] `uploadFileUri` should preserve file scheme
01600ae is described below

commit 01600aeded50f4e9751ef48de5e7ebf2493bfb2c
Author: Dongjoon Hyun
AuthorDate: Wed Mar 30 08:26:41 2022 -0700

[SPARK-38652][K8S] `uploadFileUri` should preserve file scheme

### What changes were proposed in this pull request?
This PR replaces `new Path(fileUri.getPath)` with `new Path(fileUri)`. By using the `Path` class constructor with a URI parameter, we can preserve the file scheme.

### Why are the changes needed?
If we use the `Path` class constructor with a `String` parameter, it loses the file scheme information. Although the original code has worked so far, it fails on Apache Hadoop 3.3.2 and breaks the dependency upload feature, which is covered by the K8s Minikube integration tests.

```scala
test("uploadFileUri") {
  val fileUri = org.apache.spark.util.Utils.resolveURI("/tmp/1.txt")
  assert(new Path(fileUri).toString == "file:/private/tmp/1.txt")
  assert(new Path(fileUri.getPath).toString == "/private/tmp/1.txt")
}
```

### Does this PR introduce _any_ user-facing change?
No, this prevents a regression in Apache Spark 3.3.0 instead.

### How was this patch tested?
Pass the CIs. In addition, this PR and #36009 will recover the K8s IT `DepsTestsSuite`.

Closes #36010 from dongjoon-hyun/SPARK-38652.
Authored-by: Dongjoon Hyun Signed-off-by: Dongjoon Hyun (cherry picked from commit cab8aa1c4fe66c4cb1b69112094a203a04758f76) Signed-off-by: Dongjoon Hyun --- .../src/main/scala/org/apache/spark/deploy/k8s/KubernetesUtils.scala| 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/KubernetesUtils.scala b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/KubernetesUtils.scala index 0c8d964..9ab6d30 100644 --- a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/KubernetesUtils.scala +++ b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/KubernetesUtils.scala @@ -320,7 +320,7 @@ object KubernetesUtils extends Logging { fs.mkdirs(new Path(s"${uploadPath}/${randomDirName}")) val targetUri = s"${uploadPath}/${randomDirName}/${fileUri.getPath.split("/").last}" log.info(s"Uploading file: ${fileUri.getPath} to dest: $targetUri...") -uploadFileToHadoopCompatibleFS(new Path(fileUri.getPath), new Path(targetUri), fs) +uploadFileToHadoopCompatibleFS(new Path(fileUri), new Path(targetUri), fs) targetUri } catch { case e: Exception => - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (6b29b28 -> cab8aa1)
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 6b29b28 [SPARK-38696][BUILD] Add `commons-collections` back for hadoop-3 profile add cab8aa1 [SPARK-38652][K8S] `uploadFileUri` should preserve file scheme No new revisions were added by this update. Summary of changes: .../src/main/scala/org/apache/spark/deploy/k8s/KubernetesUtils.scala| 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.3 updated: [SPARK-38652][K8S] `uploadFileUri` should preserve file scheme
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch branch-3.3 in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.3 by this push:
     new d503629 [SPARK-38652][K8S] `uploadFileUri` should preserve file scheme
d503629 is described below

commit d503629d444f72a536d85ad1b7208f583ad30974
Author: Dongjoon Hyun
AuthorDate: Wed Mar 30 08:26:41 2022 -0700

[SPARK-38652][K8S] `uploadFileUri` should preserve file scheme

### What changes were proposed in this pull request?
This PR replaces `new Path(fileUri.getPath)` with `new Path(fileUri)`. By using the `Path` class constructor with a URI parameter, we can preserve the file scheme.

### Why are the changes needed?
If we use the `Path` class constructor with a `String` parameter, it loses the file scheme information. Although the original code has worked so far, it fails on Apache Hadoop 3.3.2 and breaks the dependency upload feature, which is covered by the K8s Minikube integration tests.

```scala
test("uploadFileUri") {
  val fileUri = org.apache.spark.util.Utils.resolveURI("/tmp/1.txt")
  assert(new Path(fileUri).toString == "file:/private/tmp/1.txt")
  assert(new Path(fileUri.getPath).toString == "/private/tmp/1.txt")
}
```

### Does this PR introduce _any_ user-facing change?
No, this prevents a regression in Apache Spark 3.3.0 instead.

### How was this patch tested?
Pass the CIs. In addition, this PR and #36009 will recover the K8s IT `DepsTestsSuite`.

Closes #36010 from dongjoon-hyun/SPARK-38652.
Authored-by: Dongjoon Hyun Signed-off-by: Dongjoon Hyun (cherry picked from commit cab8aa1c4fe66c4cb1b69112094a203a04758f76) Signed-off-by: Dongjoon Hyun --- .../src/main/scala/org/apache/spark/deploy/k8s/KubernetesUtils.scala| 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/KubernetesUtils.scala b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/KubernetesUtils.scala index a05d07a..b9cf111 100644 --- a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/KubernetesUtils.scala +++ b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/KubernetesUtils.scala @@ -320,7 +320,7 @@ object KubernetesUtils extends Logging { fs.mkdirs(new Path(s"${uploadPath}/${randomDirName}")) val targetUri = s"${uploadPath}/${randomDirName}/${fileUri.getPath.split("/").last}" log.info(s"Uploading file: ${fileUri.getPath} to dest: $targetUri...") -uploadFileToHadoopCompatibleFS(new Path(fileUri.getPath), new Path(targetUri), fs) +uploadFileToHadoopCompatibleFS(new Path(fileUri), new Path(targetUri), fs) targetUri } catch { case e: Exception => - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.3 updated: [SPARK-38696][BUILD] Add `commons-collections` back for hadoop-3 profile
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch branch-3.3 in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.3 by this push:
     new bc6a645 [SPARK-38696][BUILD] Add `commons-collections` back for hadoop-3 profile
bc6a645 is described below

commit bc6a64556f19280e953cf3001889f0e2b0135022
Author: Dongjoon Hyun
AuthorDate: Wed Mar 30 07:26:21 2022 -0700

[SPARK-38696][BUILD] Add `commons-collections` back for hadoop-3 profile

### What changes were proposed in this pull request?
- SPARK-37968 replaced `commons-collections 3.x` with `commons-collections4`.
- SPARK-37600 upgraded to Apache Hadoop 3.3.2.

This PR adds `commons-collections` back for the `hadoop-3` profile because Apache Hadoop 3.3.2 still uses it. Since SPARK-37968 didn't remove it from the `hadoop-2` profile, this is a regression in the `hadoop-3` profile only.

### Why are the changes needed?
[HADOOP-17139](https://issues.apache.org/jira/browse/HADOOP-17139) added a `commons-collections` usage in 3.3.2. Without this patch, a `ClassNotFound` error happens while using the S3A filesystem.

https://github.com/apache/hadoop/blob/f91452b289aea1418f56d242c046b58d9f214a1d/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/CopyFromLocalOperation.java#L41

```
import org.apache.commons.collections.comparators.ReverseComparator;
```

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
This is a kind of dependency recovery. It should pass the CIs.

Closes #36009 from dongjoon-hyun/SPARK-38696.
Authored-by: Dongjoon Hyun Signed-off-by: Dongjoon Hyun (cherry picked from commit 6b29b28deffd11edd65b69e0f5c79ed51d483b66) Signed-off-by: Dongjoon Hyun --- core/pom.xml | 4 dev/deps/spark-deps-hadoop-3-hive-2.3 | 1 + pom.xml | 10 -- 3 files changed, 13 insertions(+), 2 deletions(-) diff --git a/core/pom.xml b/core/pom.xml index a753a59..e696853 100644 --- a/core/pom.xml +++ b/core/pom.xml @@ -193,6 +193,10 @@ commons-io + commons-collections + commons-collections + + org.apache.commons commons-collections4 diff --git a/dev/deps/spark-deps-hadoop-3-hive-2.3 b/dev/deps/spark-deps-hadoop-3-hive-2.3 index 600caf9..d1593d7 100644 --- a/dev/deps/spark-deps-hadoop-3-hive-2.3 +++ b/dev/deps/spark-deps-hadoop-3-hive-2.3 @@ -38,6 +38,7 @@ chill-java/0.10.0//chill-java-0.10.0.jar chill_2.12/0.10.0//chill_2.12-0.10.0.jar commons-cli/1.5.0//commons-cli-1.5.0.jar commons-codec/1.15//commons-codec-1.15.jar +commons-collections/3.2.2//commons-collections-3.2.2.jar commons-collections4/4.4//commons-collections4-4.4.jar commons-compiler/3.0.16//commons-compiler-3.0.16.jar commons-compress/1.21//commons-compress-1.21.jar diff --git a/pom.xml b/pom.xml index 6daeb70..a8a6a13 100644 --- a/pom.xml +++ b/pom.xml @@ -160,7 +160,8 @@ 4.4.14 3.6.1 -4.4 +3.2.2 +4.4 2.12.15 2.12 2.0.2 @@ -621,9 +622,14 @@ ${commons.math3.version} +commons-collections +commons-collections +${commons.collections.version} + + org.apache.commons commons-collections4 -${commons.collections.version} +${commons.collections4.version} commons-beanutils - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
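For readability, here is a reconstruction of what the flattened `pom.xml` hunk above amounts to after the change: the two collections artifacts are managed side by side, each pinned by its own version property (fragment only, not a complete `pom.xml`):

```xml
<!-- Version properties: commons-collections 3.x restored alongside
     commons-collections4, as in the diff above. -->
<properties>
  <commons.collections.version>3.2.2</commons.collections.version>
  <commons.collections4.version>4.4</commons.collections4.version>
</properties>

<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>commons-collections</groupId>
      <artifactId>commons-collections</artifactId>
      <version>${commons.collections.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.commons</groupId>
      <artifactId>commons-collections4</artifactId>
      <version>${commons.collections4.version}</version>
    </dependency>
  </dependencies>
</dependencyManagement>
```

Note the renaming: the existing `commons.collections.version` property (previously `4.4`) now refers to the 3.x line, and a new `commons.collections4.version` property carries `4.4`.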
[spark] branch master updated: [SPARK-38696][BUILD] Add `commons-collections` back for hadoop-3 profile
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 6b29b28 [SPARK-38696][BUILD] Add `commons-collections` back for hadoop-3 profile
6b29b28 is described below

commit 6b29b28deffd11edd65b69e0f5c79ed51d483b66
Author: Dongjoon Hyun
AuthorDate: Wed Mar 30 07:26:21 2022 -0700

[SPARK-38696][BUILD] Add `commons-collections` back for hadoop-3 profile

### What changes were proposed in this pull request?
- SPARK-37968 replaced `commons-collections 3.x` with `commons-collections4`.
- SPARK-37600 upgraded to Apache Hadoop 3.3.2.

This PR adds `commons-collections` back for the `hadoop-3` profile because Apache Hadoop 3.3.2 still uses it. Since SPARK-37968 didn't remove it from the `hadoop-2` profile, this is a regression in the `hadoop-3` profile only.

### Why are the changes needed?
[HADOOP-17139](https://issues.apache.org/jira/browse/HADOOP-17139) added a `commons-collections` usage in 3.3.2. Without this patch, a `ClassNotFound` error happens while using the S3A filesystem.

https://github.com/apache/hadoop/blob/f91452b289aea1418f56d242c046b58d9f214a1d/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/CopyFromLocalOperation.java#L41

```
import org.apache.commons.collections.comparators.ReverseComparator;
```

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
This is a kind of dependency recovery. It should pass the CIs.

Closes #36009 from dongjoon-hyun/SPARK-38696.
Authored-by: Dongjoon Hyun Signed-off-by: Dongjoon Hyun --- core/pom.xml | 4 dev/deps/spark-deps-hadoop-3-hive-2.3 | 1 + pom.xml | 10 -- 3 files changed, 13 insertions(+), 2 deletions(-) diff --git a/core/pom.xml b/core/pom.xml index 24294a2..c1e494f 100644 --- a/core/pom.xml +++ b/core/pom.xml @@ -193,6 +193,10 @@ commons-io + commons-collections + commons-collections + + org.apache.commons commons-collections4 diff --git a/dev/deps/spark-deps-hadoop-3-hive-2.3 b/dev/deps/spark-deps-hadoop-3-hive-2.3 index 27b0ec5..08fa631 100644 --- a/dev/deps/spark-deps-hadoop-3-hive-2.3 +++ b/dev/deps/spark-deps-hadoop-3-hive-2.3 @@ -38,6 +38,7 @@ chill-java/0.10.0//chill-java-0.10.0.jar chill_2.12/0.10.0//chill_2.12-0.10.0.jar commons-cli/1.5.0//commons-cli-1.5.0.jar commons-codec/1.15//commons-codec-1.15.jar +commons-collections/3.2.2//commons-collections-3.2.2.jar commons-collections4/4.4//commons-collections4-4.4.jar commons-compiler/3.0.16//commons-compiler-3.0.16.jar commons-compress/1.21//commons-compress-1.21.jar diff --git a/pom.xml b/pom.xml index 6c5b321..d45d6ca 100644 --- a/pom.xml +++ b/pom.xml @@ -160,7 +160,8 @@ 4.4.14 3.6.1 -4.4 +3.2.2 +4.4 2.12.15 2.12 2.0.2 @@ -620,9 +621,14 @@ ${commons.math3.version} +commons-collections +commons-collections +${commons.collections.version} + + org.apache.commons commons-collections4 -${commons.collections.version} +${commons.collections4.version} commons-beanutils - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.3 updated: [SPARK-38676][SQL] Provide SQL query context in runtime error message of Add/Subtract/Multiply
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a commit to branch branch-3.3 in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.3 by this push:
     new 7c523ea [SPARK-38676][SQL] Provide SQL query context in runtime error message of Add/Subtract/Multiply
7c523ea is described below

commit 7c523eaff5c6c07f14832e7c2de80229656d6f9c
Author: Gengliang Wang
AuthorDate: Wed Mar 30 17:17:54 2022 +0800

[SPARK-38676][SQL] Provide SQL query context in runtime error message of Add/Subtract/Multiply

### What changes were proposed in this pull request?
Provide the SQL query context in the runtime errors of Add/Subtract/Multiply if the data type is Int or Long. This is the first PR for improving the runtime error messages. There is more to be done after we decide what the error message should look like.

Before the changes:
```
> SELECT i.f1 - 100, i.f1 * smallint('2') AS x FROM INT4_TBL i
java.lang.ArithmeticException
integer overflow. If necessary set spark.sql.ansi.enabled to false (except for ANSI interval type) to bypass this error.
```
After the changes:
```
> SELECT i.f1 - 100, i.f1 * smallint('2') AS x FROM INT4_TBL i
java.lang.ArithmeticException
integer overflow. If necessary set spark.sql.ansi.enabled to false (except for ANSI interval type) to bypass this error.
== SQL(line 1, position 25) ==
SELECT '' AS five, i.f1, i.f1 * smallint('2') AS x FROM INT4_TBL i
```

### Why are the changes needed?
Provide the SQL query context of runtime errors to users, so that they can understand it better.

### Does this PR introduce _any_ user-facing change?
Yes, it improves the runtime error messages of Add/Subtract/Multiply.

### How was this patch tested?
UT

Closes #35992 from gengliangwang/runtimeError.
Authored-by: Gengliang Wang Signed-off-by: Gengliang Wang (cherry picked from commit 9f6aad407724997dc04a7689bb870d1d3ddd5526) Signed-off-by: Gengliang Wang --- .../sql/catalyst/expressions/arithmetic.scala | 31 ++--- .../apache/spark/sql/catalyst/trees/TreeNode.scala | 75 +- .../apache/spark/sql/catalyst/util/MathUtils.scala | 22 ++- .../spark/sql/errors/QueryExecutionErrors.scala| 8 ++- .../expressions/ArithmeticExpressionSuite.scala| 63 ++ .../spark/sql/catalyst/trees/TreeNodeSuite.scala | 32 + .../sql-tests/results/postgreSQL/int4.sql.out | 18 ++ .../sql-tests/results/postgreSQL/int8.sql.out | 12 8 files changed, 246 insertions(+), 15 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala index 88a3861..7251e47 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala @@ -268,19 +268,16 @@ abstract class BinaryArithmetic extends BinaryOperator with NullIntolerant { |${ev.value} = (${CodeGenerator.javaType(dataType)})($tmpResult); """.stripMargin }) -case IntegerType | LongType => +case IntegerType | LongType if failOnError && exactMathMethod.isDefined => nullSafeCodeGen(ctx, ev, (eval1, eval2) => { -val operation = if (failOnError && exactMathMethod.isDefined) { - val mathUtils = MathUtils.getClass.getCanonicalName.stripSuffix("$") - s"$mathUtils.${exactMathMethod.get}($eval1, $eval2)" -} else { - s"$eval1 $symbol $eval2" -} +val errorContext = ctx.addReferenceObj("errCtx", origin.context) +val mathUtils = MathUtils.getClass.getCanonicalName.stripSuffix("$") s""" - |${ev.value} = $operation; + |${ev.value} = $mathUtils.${exactMathMethod.get}($eval1, $eval2, $errorContext); """.stripMargin }) -case DoubleType | FloatType => + +case IntegerType | LongType | DoubleType | FloatType => // 
When Double/Float overflows, there can be 2 cases: // - precision loss: according to SQL standard, the number is truncated; // - returns (+/-)Infinite: same behavior also other DBs have (e.g. Postgres) @@ -333,6 +330,10 @@ case class Add( MathUtils.addExact(input1.asInstanceOf[Long], input2.asInstanceOf[Long]) case _: YearMonthIntervalType => MathUtils.addExact(input1.asInstanceOf[Int], input2.asInstanceOf[Int]) +case _: IntegerType if failOnError => + MathUtils.addExact(input1.asInstanceOf[Int], input2.asInstanceOf[Int], origin.context) +case _: LongType if failOnError => + MathUtils.addExact(input1.asInstanceOf[Long], input2.asInst
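The `integer overflow` error that this change annotates with query context originates in exact integer arithmetic. The diff above routes Int/Long `Add` through `MathUtils.addExact`; a minimal sketch of the underlying JDK behavior that such exact methods delegate to (Spark's `MathUtils` itself is not used here):

```java
public class OverflowDemo {
    public static void main(String[] args) {
        // Plain two's-complement arithmetic wraps around silently, which is
        // the non-ANSI behavior.
        int wrapped = Integer.MAX_VALUE + 1;
        System.out.println(wrapped); // -2147483648

        // Math.addExact throws instead of wrapping; this is the kind of
        // ArithmeticException surfaced under spark.sql.ansi.enabled=true,
        // now augmented with the SQL query context by this change.
        try {
            Math.addExact(Integer.MAX_VALUE, 1);
        } catch (ArithmeticException e) {
            System.out.println(e.getMessage()); // integer overflow
        }
    }
}
```

Passing `origin.context` into the exact-math helpers is what lets the error message point back at the exact line and position of the offending expression.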
[spark] branch master updated (a445536 -> 9f6aad4)
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from a445536 [SPARK-38349][SS] No need to filter events when sessionwindow gapDuration greater than 0 add 9f6aad4 [SPARK-38676][SQL] Provide SQL query context in runtime error message of Add/Subtract/Multiply No new revisions were added by this update. Summary of changes: .../sql/catalyst/expressions/arithmetic.scala | 31 ++--- .../apache/spark/sql/catalyst/trees/TreeNode.scala | 75 +- .../apache/spark/sql/catalyst/util/MathUtils.scala | 22 ++- .../spark/sql/errors/QueryExecutionErrors.scala| 8 ++- .../expressions/ArithmeticExpressionSuite.scala| 63 ++ .../spark/sql/catalyst/trees/TreeNodeSuite.scala | 32 + .../sql-tests/results/postgreSQL/int4.sql.out | 18 ++ .../sql-tests/results/postgreSQL/int8.sql.out | 12 8 files changed, 246 insertions(+), 15 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org