[spark] branch branch-3.4 updated: [SPARK-42233][SQL] Improve error message for `PIVOT_AFTER_GROUP_BY`
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch branch-3.4
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.4 by this push:
     new 6b156ac067b [SPARK-42233][SQL] Improve error message for `PIVOT_AFTER_GROUP_BY`
6b156ac067b is described below

commit 6b156ac067b33f342beac5b8443f521ee44ba87f
Author: itholic
AuthorDate: Mon Jan 30 16:50:42 2023 +0300

    [SPARK-42233][SQL] Improve error message for `PIVOT_AFTER_GROUP_BY`

    ### What changes were proposed in this pull request?

    This PR proposes to improve the error message for `PIVOT_AFTER_GROUP_BY` to give users a better error message.

    ### Why are the changes needed?

    The current error message only shows the cause, not a solution. We should provide a proper solution as well in the error message.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    The existing CI should pass.

    Closes #39793 from itholic/PIVOT_AFTER_GROUP_BY.

    Authored-by: itholic
    Signed-off-by: Max Gekk
    (cherry picked from commit c1bee1058667b631ed4e027ebb9791698023e9c9)
    Signed-off-by: Max Gekk
---
 core/src/main/resources/error/error-classes.json | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json
index 8975fe279c2..af5e17d56d4 100644
--- a/core/src/main/resources/error/error-classes.json
+++ b/core/src/main/resources/error/error-classes.json
@@ -1523,7 +1523,7 @@
   },
   "PIVOT_AFTER_GROUP_BY" : {
     "message" : [
-      "PIVOT clause following a GROUP BY clause."
+      "PIVOT clause following a GROUP BY clause. Consider pushing the GROUP BY into a subquery."
     ]
   },
   "PIVOT_TYPE" : {

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-42233][SQL] Improve error message for `PIVOT_AFTER_GROUP_BY`
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new c1bee105866 [SPARK-42233][SQL] Improve error message for `PIVOT_AFTER_GROUP_BY`
c1bee105866 is described below

commit c1bee1058667b631ed4e027ebb9791698023e9c9
Author: itholic
AuthorDate: Mon Jan 30 16:50:42 2023 +0300

    [SPARK-42233][SQL] Improve error message for `PIVOT_AFTER_GROUP_BY`

    ### What changes were proposed in this pull request?

    This PR proposes to improve the error message for `PIVOT_AFTER_GROUP_BY` to give users a better error message.

    ### Why are the changes needed?

    The current error message only shows the cause, not a solution. We should provide a proper solution as well in the error message.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    The existing CI should pass.

    Closes #39793 from itholic/PIVOT_AFTER_GROUP_BY.

    Authored-by: itholic
    Signed-off-by: Max Gekk
---
 core/src/main/resources/error/error-classes.json | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json
index 67e324db1dc..e6dcdf1bed8 100644
--- a/core/src/main/resources/error/error-classes.json
+++ b/core/src/main/resources/error/error-classes.json
@@ -1557,7 +1557,7 @@
   },
   "PIVOT_AFTER_GROUP_BY" : {
     "message" : [
-      "PIVOT clause following a GROUP BY clause."
+      "PIVOT clause following a GROUP BY clause. Consider pushing the GROUP BY into a subquery."
     ]
   },
   "PIVOT_TYPE" : {
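The commit above only changes the message text, but the check it belongs to is a clause-ordering rule. As a rough illustration (not Spark's actual analyzer — the clause-list model and function names here are hypothetical), the rule and the new, more actionable message can be sketched as:

```python
# Hypothetical sketch of a clause-order check like the one behind
# PIVOT_AFTER_GROUP_BY. Spark's real check works on a resolved logical
# plan, not on a list of clause keywords.

PIVOT_AFTER_GROUP_BY_MSG = (
    "PIVOT clause following a GROUP BY clause. "
    "Consider pushing the GROUP BY into a subquery."
)

def check_clause_order(clauses: list) -> None:
    """Raise if a PIVOT clause appears after a GROUP BY clause."""
    if "GROUP BY" in clauses and "PIVOT" in clauses:
        if clauses.index("PIVOT") > clauses.index("GROUP BY"):
            raise ValueError(f"[PIVOT_AFTER_GROUP_BY] {PIVOT_AFTER_GROUP_BY_MSG}")

# A query with PIVOT but no GROUP BY is fine:
check_clause_order(["SELECT", "FROM", "PIVOT"])
```

The suggested fix in the new message is to compute the GROUP BY aggregation in a subquery first, then apply PIVOT to its result.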
[spark] branch master updated: [SPARK-41490][SQL] Assign name to _LEGACY_ERROR_TEMP_2441
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new 04517fc803e [SPARK-41490][SQL] Assign name to _LEGACY_ERROR_TEMP_2441
04517fc803e is described below

commit 04517fc803e400388acd1d70dd0634125c78d91f
Author: itholic
AuthorDate: Mon Jan 30 16:22:24 2023 +0300

    [SPARK-41490][SQL] Assign name to _LEGACY_ERROR_TEMP_2441

    ### What changes were proposed in this pull request?

    This PR proposes to assign the name "UNSUPPORTED_EXPR_FOR_OPERATOR" to _LEGACY_ERROR_TEMP_2441.

    ### Why are the changes needed?

    We should assign a proper name to _LEGACY_ERROR_TEMP_*.

    ### Does this PR introduce _any_ user-facing change?

    No

    ### How was this patch tested?

    `./build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite*"`

    Closes #39700 from itholic/LEGACY_2441.

    Lead-authored-by: itholic
    Co-authored-by: Haejoon Lee <44108233+itho...@users.noreply.github.com>
    Signed-off-by: Max Gekk
---
 core/src/main/resources/error/error-classes.json           | 12 ++--
 .../apache/spark/sql/catalyst/analysis/CheckAnalysis.scala |  5 ++---
 .../spark/sql/catalyst/analysis/ResolveSubquerySuite.scala |  6 +-
 .../sql-tests/results/postgreSQL/window_part3.sql.out      | 10 --
 4 files changed, 17 insertions(+), 16 deletions(-)

diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json
index 653b1ebd013..67e324db1dc 100644
--- a/core/src/main/resources/error/error-classes.json
+++ b/core/src/main/resources/error/error-classes.json
@@ -1459,6 +1459,12 @@
     },
     "sqlState" : "0A000"
   },
+  "UNSUPPORTED_EXPR_FOR_OPERATOR" : {
+    "message" : [
+      "A query operator contains one or more unsupported expressions. Consider to rewrite it to avoid window functions, aggregate functions, and generator functions in the WHERE clause.",
+      "Invalid expressions: [<invalidExprSqls>]"
+    ]
+  },
   "UNSUPPORTED_FEATURE" : {
     "message" : [
       "The feature is not supported:"
@@ -5280,12 +5286,6 @@
       "in operator ."
     ]
   },
-  "_LEGACY_ERROR_TEMP_2441" : {
-    "message" : [
-      "The query operator `<operator>` contains one or more unsupported expression types Aggregate, Window or Generate.",
-      "Invalid expressions: [<invalidExprSqls>]."
-    ]
-  },
   "_LEGACY_ERROR_TEMP_2443" : {
     "message" : [
       "Multiple definitions of observed metrics named '': ."
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
index 276bf714a34..bccced3dff6 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
@@ -724,11 +724,10 @@ trait CheckAnalysis extends PredicateHelper with LookupCatalog with QueryErrorsB
       case other if PlanHelper.specialExpressionsInUnsupportedOperator(other).nonEmpty =>
         val invalidExprSqls =
-          PlanHelper.specialExpressionsInUnsupportedOperator(other).map(_.sql)
+          PlanHelper.specialExpressionsInUnsupportedOperator(other).map(toSQLExpr)
         other.failAnalysis(
-          errorClass = "_LEGACY_ERROR_TEMP_2441",
+          errorClass = "UNSUPPORTED_EXPR_FOR_OPERATOR",
           messageParameters = Map(
-            "operator" -> other.nodeName,
             "invalidExprSqls" -> invalidExprSqls.mkString(", ")))

       // This should not happen, resolved Project or Aggregate should restore or resolve
diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ResolveSubquerySuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ResolveSubquerySuite.scala
index 7b99153acf9..67265fe6f3b 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ResolveSubquerySuite.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ResolveSubquerySuite.scala
@@ -187,7 +187,11 @@ class ResolveSubquerySuite extends AnalysisTest {
   test("lateral join with unsupported expressions") {
     val plan = lateralJoin(t1, t0.select(($"a" + $"b").as("c")),
       condition = Some(sum($"a") === sum($"c")))
-    assertAnalysisError(plan, Seq("Invalid expressions: [sum(a), sum(c)]"))
+    assertAnalysisErrorClass(
+      plan
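The check renamed above scans an operator for "special" expressions (aggregates, window functions, generators) and reports them through a single `invalidExprSqls` parameter, dropping the old `operator` parameter. A rough Python analogue (the function set and SQL quoting below are illustrative, not Spark's real `PlanHelper`):

```python
# Hypothetical mini-model of CheckAnalysis's UNSUPPORTED_EXPR_FOR_OPERATOR
# check: expressions are (function_name, sql_text) pairs, and any expression
# whose function is "special" is invalid inside, e.g., a WHERE clause.

SPECIAL_FUNCS = {"sum", "avg", "count", "rank", "explode"}  # illustrative subset

def special_expression_sqls(exprs):
    """Return the SQL text of every unsupported ('special') expression."""
    return [sql for func, sql in exprs if func in SPECIAL_FUNCS]

def check_operator(exprs):
    """Fail analysis when an operator contains unsupported expressions."""
    invalid = special_expression_sqls(exprs)
    if invalid:
        raise ValueError(
            "[UNSUPPORTED_EXPR_FOR_OPERATOR] Invalid expressions: ["
            + ", ".join(invalid) + "]")

# Only the aggregate shows up in the report; plain columns are fine:
assert special_expression_sqls([("sum", '"sum(a)"'), ("col", "a")]) == ['"sum(a)"']
```

This mirrors why the updated test asserts on the error class plus parameters instead of matching the raw message string.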
[spark] branch master updated: [SPARK-41489][SQL] Assign name to _LEGACY_ERROR_TEMP_2415
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new d2fc1992058 [SPARK-41489][SQL] Assign name to _LEGACY_ERROR_TEMP_2415
d2fc1992058 is described below

commit d2fc19920588f2f6c83c31a9519702f9416190fe
Author: itholic
AuthorDate: Sun Jan 29 08:45:14 2023 +0300

    [SPARK-41489][SQL] Assign name to _LEGACY_ERROR_TEMP_2415

    ### What changes were proposed in this pull request?

    This PR proposes to assign the name "DATATYPE_MISMATCH.FILTER_NOT_BOOLEAN" to _LEGACY_ERROR_TEMP_2415.

    ### Why are the changes needed?

    We should assign a proper name to _LEGACY_ERROR_TEMP_*.

    ### Does this PR introduce _any_ user-facing change?

    No

    ### How was this patch tested?

    `./build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite*"`

    Closes #39701 from itholic/LEGACY_2415.

    Authored-by: itholic
    Signed-off-by: Max Gekk
---
 core/src/main/resources/error/error-classes.json               | 10 +-
 .../apache/spark/sql/catalyst/analysis/CheckAnalysis.scala     |  7 ---
 .../spark/sql/catalyst/analysis/AnalysisErrorSuite.scala       |  5 +++--
 .../apache/spark/sql/catalyst/analysis/AnalysisSuite.scala     | 14 ++
 .../optimizer/ReplaceNullWithFalseInPredicateSuite.scala       | 11 +++
 5 files changed, 33 insertions(+), 14 deletions(-)

diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json
index ae766de3e20..936f996f3a4 100644
--- a/core/src/main/resources/error/error-classes.json
+++ b/core/src/main/resources/error/error-classes.json
@@ -265,6 +265,11 @@
         "Input to should all be the same type, but it's ."
       ]
     },
+    "FILTER_NOT_BOOLEAN" : {
+      "message" : [
+        "Filter expression <filter> of type <type> is not a boolean."
+      ]
+    },
     "HASH_MAP_TYPE" : {
       "message" : [
         "Input to the function cannot contain elements of the \"MAP\" type. In Spark, same maps may have different hashcode, thus hash expressions are prohibited on \"MAP\" elements. To restore previous behavior set \"spark.sql.legacy.allowHashOnMapType\" to \"true\"."
@@ -5175,11 +5180,6 @@
       "Event time must be defined on a window or a timestamp, but is of type ."
     ]
   },
-  "_LEGACY_ERROR_TEMP_2415" : {
-    "message" : [
-      "filter expression '<filter>' of type <type> is not a boolean."
-    ]
-  },
   "_LEGACY_ERROR_TEMP_2416" : {
     "message" : [
       "join condition '' of type is not a boolean."
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
index d5ef71adc4f..276bf714a34 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
@@ -355,10 +355,11 @@ trait CheckAnalysis extends PredicateHelper with LookupCatalog with QueryErrorsB
           }
         case f: Filter if f.condition.dataType != BooleanType =>
           f.failAnalysis(
-            errorClass = "_LEGACY_ERROR_TEMP_2415",
+            errorClass = "DATATYPE_MISMATCH.FILTER_NOT_BOOLEAN",
             messageParameters = Map(
-              "filter" -> f.condition.sql,
-              "type" -> f.condition.dataType.catalogString))
+              "sqlExpr" -> f.expressions.map(toSQLExpr).mkString(","),
+              "filter" -> toSQLExpr(f.condition),
+              "type" -> toSQLType(f.condition.dataType)))

         case j @ Join(_, _, _, Some(condition), _) if condition.dataType != BooleanType =>
           j.failAnalysis(
diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisErrorSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisErrorSuite.scala
index faa8c1f4558..56bb8b0ccc2 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisErrorSuite.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisErrorSuite.scala
@@ -349,10 +349,11 @@ class AnalysisErrorSuite extends AnalysisTest {
     "UNRESOLVED_COLUMN.WITH_SUGGESTION",
     Map("objectName" -> "`b`", "proposal" -> "`a`, `c`, `a3`"))

-  errorTest(
+  errorClassTest(
     "non-boolean filters",
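The renamed check fires whenever a `Filter`'s condition is not of boolean type. A small Python sketch of the idea (the `to_sql_expr` / `to_sql_type` helpers are crude stand-ins for Spark's `toSQLExpr` / `toSQLType` quoting, and the exact message wording is an assumption):

```python
# Illustrative sketch of DATATYPE_MISMATCH.FILTER_NOT_BOOLEAN: a filter
# condition represented as (sql_text, data_type) must be boolean-typed.

def to_sql_expr(expr: str) -> str:
    return f'"{expr}"'      # Spark wraps expressions in double quotes

def to_sql_type(t: str) -> str:
    return f'"{t.upper()}"'  # Spark upper-cases and quotes type names

def check_filter(condition_sql: str, condition_type: str) -> None:
    """Raise unless the filter condition's type is boolean."""
    if condition_type.lower() != "boolean":
        raise ValueError(
            "[DATATYPE_MISMATCH.FILTER_NOT_BOOLEAN] Filter expression "
            f"{to_sql_expr(condition_sql)} of type {to_sql_type(condition_type)} "
            "is not a boolean.")

check_filter("a > 1", "boolean")   # a boolean predicate passes the check
```

Carrying the quoted expression and type as named parameters (`filter`, `type`) is what lets the new `errorClassTest` assert on structured parameters instead of the rendered string.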
[spark] branch master updated: [SPARK-41931][SQL] Better error message for incomplete complex type definition
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new 0ef7afe0dc3 [SPARK-41931][SQL] Better error message for incomplete complex type definition
0ef7afe0dc3 is described below

commit 0ef7afe0dc3723b97b750c071a908f363e514a26
Author: Runyao Chen
AuthorDate: Fri Jan 27 18:06:32 2023 +0300

    [SPARK-41931][SQL] Better error message for incomplete complex type definition

    ### What changes were proposed in this pull request?

    This PR improves error messages for `ARRAY` / `MAP` / `STRUCT` types without element type specification. A new error class `INCOMPLETE_TYPE_DEFINITION` with subclasses (`ARRAY`, `MAP`, and `STRUCT`) is introduced.

    **Details**

    In the case where we do `CAST AS` or `CREATE` a complex type without specifying its element type, e.g.

    ```
    CREATE TABLE t (col ARRAY)
    ```

    an `[UNSUPPORTED_DATATYPE] Unsupported data type "ARRAY"` error would be thrown, while we do support the `ARRAY` type and just require it to be typed. This PR proposes a better error message like

    ```
    The definition of `ARRAY` type is incomplete. You must provide an element type. For example: `ARRAY`.
    ```

    ### Why are the changes needed?

    The previous error message for incomplete complex types is confusing. A `UNSUPPORTED_DATATYPE` error would be thrown, while we do support complex types. We just require complex types to have their element types specified. We need a clear error message with an example in this case.

    ### Does this PR introduce _any_ user-facing change?

    Yes, this PR changes the error message which is user-facing.

    Error message before this PR:
    ```
    spark-sql> SELECT CAST(array(1, 2, 3) AS ARRAY);
    [UNSUPPORTED_DATATYPE] Unsupported data type "ARRAY"(line 1, pos 30)
    ```
    Error message after this PR:
    ```
    [INCOMPLETE_TYPE_DEFINITION.ARRAY] Incomplete complex type: The definition of `ARRAY` type is incomplete. You must provide an element type. For example: `ARRAY`.
    ```
    Similarly for MAP and STRUCT types.

    ### How was this patch tested?

    Added unit tests covering CAST and CREATE with ARRAY / STRUCT / MAP types and their nested combinations.

    Closes #39711 from RunyaoChen/better_error_msg_nested_type.

    Lead-authored-by: Runyao Chen
    Co-authored-by: RunyaoChen
    Signed-off-by: Max Gekk
---
 core/src/main/resources/error/error-classes.json    | 23 +++
 .../spark/sql/catalyst/parser/AstBuilder.scala      |  2 +
 .../spark/sql/errors/QueryParsingErrors.scala       | 21 +++
 .../spark/sql/errors/QueryParsingErrorsSuite.scala  | 72 ++
 4 files changed, 118 insertions(+)

diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json
index e6876751a22..ae766de3e20 100644
--- a/core/src/main/resources/error/error-classes.json
+++ b/core/src/main/resources/error/error-classes.json
@@ -592,6 +592,29 @@
       "Detected an incompatible DataSourceRegister. Please remove the incompatible library from classpath or upgrade it. Error: "
     ]
   },
+  "INCOMPLETE_TYPE_DEFINITION" : {
+    "message" : [
+      "Incomplete complex type:"
+    ],
+    "subClass" : {
+      "ARRAY" : {
+        "message" : [
+          "The definition of \"ARRAY\" type is incomplete. You must provide an element type. For example: \"ARRAY\"."
+        ]
+      },
+      "MAP" : {
+        "message" : [
+          "The definition of \"MAP\" type is incomplete. You must provide a key type and a value type. For example: \"MAP\"."
+        ]
+      },
+      "STRUCT" : {
+        "message" : [
+          "The definition of \"STRUCT\" type is incomplete. You must provide at least one field type. For example: \"STRUCT\"."
+        ]
+      }
+    },
+    "sqlState" : "42K01"
+  },
   "INCONSISTENT_BEHAVIOR_CROSS_VERSION" : {
     "message" : [
       "You may get a different result due to the upgrading to"
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
index c6e50f3f514..d2a1cb1eb16 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
@@ -2889,6 +2889,8 @@ class AstBuilder extends SqlBaseParserBaseVisitor[AnyRef] with S
[spark] branch master updated: [SPARK-42158][SQL] Integrate `_LEGACY_ERROR_TEMP_1003` into `FIELD_NOT_FOUND`
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new f373df8a757 [SPARK-42158][SQL] Integrate `_LEGACY_ERROR_TEMP_1003` into `FIELD_NOT_FOUND`
f373df8a757 is described below

commit f373df8a757e36ea84275c637087045d6cca3939
Author: itholic
AuthorDate: Fri Jan 27 10:40:47 2023 +0300

    [SPARK-42158][SQL] Integrate `_LEGACY_ERROR_TEMP_1003` into `FIELD_NOT_FOUND`

    ### What changes were proposed in this pull request?

    This PR proposes to integrate `_LEGACY_ERROR_TEMP_1003` into `FIELD_NOT_FOUND`.

    ### Why are the changes needed?

    We should deduplicate similar error classes into a single error class by merging them.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Fixed existing UTs.

    Closes #39706 from itholic/LEGACY_1003.

    Authored-by: itholic
    Signed-off-by: Max Gekk
---
 core/src/main/resources/error/error-classes.json    |  5 --
 .../spark/sql/catalyst/analysis/Analyzer.scala      |  3 +-
 .../spark/sql/errors/QueryCompilationErrors.scala   |  8 ++-
 .../spark/sql/connector/AlterTableTests.scala       | 18 +-
 .../connector/V2CommandsCaseSensitivitySuite.scala  | 64 +++---
 5 files changed, 65 insertions(+), 33 deletions(-)

diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json
index 5d2e184874a..e6876751a22 100644
--- a/core/src/main/resources/error/error-classes.json
+++ b/core/src/main/resources/error/error-classes.json
@@ -2031,11 +2031,6 @@
       "Try moving this class out of its parent class."
     ]
   },
-  "_LEGACY_ERROR_TEMP_1003" : {
-    "message" : [
-      "Couldn't find the reference column for <after> at <parentName>."
-    ]
-  },
   "_LEGACY_ERROR_TEMP_1004" : {
     "message" : [
       "Window specification is not defined in the WINDOW clause."
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
index f0c22471afa..6f27c97ddf9 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
@@ -4053,9 +4053,8 @@ class Analyzer(override val catalogManager: CatalogManager)
           case Some(colName) =>
             ResolvedFieldPosition(ColumnPosition.after(colName))
           case None =>
-            val name = if (resolvedParentName.isEmpty) "root" else resolvedParentName.quoted
             throw QueryCompilationErrors.referenceColNotFoundForAlterTableChangesError(
-              after, name)
+              col.colName, allFields)
         }
       case _ => ResolvedFieldPosition(u.position)
     }
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
index c415fb91c5d..1a8c42b599e 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
@@ -295,10 +295,12 @@ private[sql] object QueryCompilationErrors extends QueryErrorsBase {
   }

   def referenceColNotFoundForAlterTableChangesError(
-      after: TableChange.After, parentName: String): Throwable = {
+      fieldName: String, fields: Array[String]): Throwable = {
     new AnalysisException(
-      errorClass = "_LEGACY_ERROR_TEMP_1003",
-      messageParameters = Map("after" -> after.toString, "parentName" -> parentName))
+      errorClass = "FIELD_NOT_FOUND",
+      messageParameters = Map(
+        "fieldName" -> toSQLId(fieldName),
+        "fields" -> fields.mkString(", ")))
   }

   def windowSpecificationNotDefinedError(windowName: String): Throwable = {
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/connector/AlterTableTests.scala b/sql/core/src/test/scala/org/apache/spark/sql/connector/AlterTableTests.scala
index b69a0628f3e..2047212a4ea 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/connector/AlterTableTests.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/connector/AlterTableTests.scala
@@ -160,7 +160,11 @@ trait AlterTableTests extends SharedSparkSession with QueryErrorsBase {
     val e1 = intercept[AnalysisException](
       sql(s"ALTER TABLE $t ADD COLUMN c string AFTER non_exist"))
-    assert(e1.getMessage().co
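The merged error builds its parameters by quoting the missing field name as a SQL identifier and joining the known fields into one list. A small sketch of those two helpers in Python (the backtick-quoting rule mirrors Spark's `toSQLId` for a single identifier; the message layout is illustrative):

```python
# Illustrative sketch of the FIELD_NOT_FOUND parameter construction that
# replaced the old "after"/"parentName" pair of _LEGACY_ERROR_TEMP_1003.

def to_sql_id(name: str) -> str:
    """Quote an identifier with backticks, escaping embedded backticks."""
    return "`" + name.replace("`", "``") + "`"

def field_not_found(field_name: str, fields: list) -> str:
    """Render an error message naming the missing field and the known fields."""
    return (f"[FIELD_NOT_FOUND] No such struct field {to_sql_id(field_name)} "
            f"in {', '.join(fields)}.")

msg = field_not_found("non_exist", ["data", "point"])
```

Putting the available fields into the message is what makes the merged error more actionable than the legacy "couldn't find the reference column" text.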
[spark] branch master updated: [SPARK-41948][SQL] Fix NPE for error classes: CANNOT_PARSE_JSON_FIELD
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new cc1674d66ef [SPARK-41948][SQL] Fix NPE for error classes: CANNOT_PARSE_JSON_FIELD
cc1674d66ef is described below

commit cc1674d66ef34f540aa7bd5c7e465605e264e040
Author: panbingkun
AuthorDate: Mon Jan 23 15:15:59 2023 +0300

    [SPARK-41948][SQL] Fix NPE for error classes: CANNOT_PARSE_JSON_FIELD

    ### What changes were proposed in this pull request?

    The PR aims to fix an NPE for the error class CANNOT_PARSE_JSON_FIELD.

    ### Why are the changes needed?

    1. When I wanted to delete the redundant 'toString()' in the code block shown here:
    https://user-images.githubusercontent.com/15246973/211269145-0f087bb1-dc93-480c-9f9d-afde5ac1c8de.png
    I found that the UT (`select from_json('[1, \"2\", 3]', 'array')`) failed.

    Why could it succeed before the deletion? `parse.getCurrentName.toString()` => null.toString() => throws NPE, but the following logic can cover it:
    https://github.com/apache/spark/blob/15a0f55246bee7b043bd6081f53744fbf74403eb/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala#L569-L573

    But obviously this is not our original intention. After the deletion, an IllegalArgumentException is thrown instead: `parse.getCurrentName` => throws java.lang.IllegalArgumentException as follows:

    ```
    Caused by: java.lang.IllegalArgumentException: Cannot resolve variable 'fieldName' (enableSubstitutionInVariables=false).
        at org.apache.commons.text.StringSubstitutor.substitute(StringSubstitutor.java:1532)
        at org.apache.commons.text.StringSubstitutor.substitute(StringSubstitutor.java:1389)
        at org.apache.commons.text.StringSubstitutor.replace(StringSubstitutor.java:893)
        at org.apache.spark.ErrorClassesJsonReader.getErrorMessage(ErrorClassesJSONReader.scala:51)
        ... 140 more
    ```

    The above code can't handle IllegalArgumentException, so the UT failed. So, we should consider the case where `parse.getCurrentName` is null.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Pass GA. Existing UT.

    Closes #39466 from panbingkun/SPARK-41948.

    Authored-by: panbingkun
    Signed-off-by: Max Gekk
---
 .../main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala | 4 ++--
 sql/core/src/test/scala/org/apache/spark/sql/JsonFunctionsSuite.scala | 3 ++-
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala
index 8128c460602..9c8c764cf92 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala
@@ -1443,8 +1443,8 @@ private[sql] object QueryExecutionErrors extends QueryErrorsBase {
     new SparkRuntimeException(
       errorClass = "CANNOT_PARSE_JSON_FIELD",
       messageParameters = Map(
-        "fieldName" -> parser.getCurrentName.toString(),
-        "fieldValue" -> parser.getText.toString(),
+        "fieldName" -> toSQLValue(parser.getCurrentName, StringType),
+        "fieldValue" -> parser.getText,
         "jsonType" -> jsonType.toString(),
         "dataType" -> toSQLType(dataType)))
   }
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/JsonFunctionsSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/JsonFunctionsSuite.scala
index 6e16533eb30..57c54e88229 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/JsonFunctionsSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/JsonFunctionsSuite.scala
@@ -27,6 +27,7 @@
 import org.apache.commons.lang3.exception.ExceptionUtils

 import org.apache.spark.{SparkException, SparkRuntimeException}
 import org.apache.spark.sql.catalyst.InternalRow
 import org.apache.spark.sql.catalyst.expressions.{Literal, StructsToJson}
+import org.apache.spark.sql.catalyst.expressions.Cast._
 import org.apache.spark.sql.functions._
 import org.apache.spark.sql.internal.SQLConf
 import org.apache.spark.sql.test.SharedSparkSession
@@ -785,7 +786,7 @@ class JsonFunctionsSuite extends QueryTest with SharedSparkSession {
   exception = ExceptionUtils.getRootCause(exception).asInstanceOf[SparkRuntimeException],
   errorClass = "CANNOT_PARSE_JSON_FIELD",
   parameters = Map(
-    "fieldName" -> "a",
+    "fieldName" -> toSQLValue("a",
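The root cause of the NPE above is calling `.toString()` on a field name that can legitimately be null (a top-level JSON array has no current field name). The fix routes the value through `toSQLValue`, which is null-safe. A rough Python analogue of that behavior for string values (the quoting convention mirrors how Spark renders string literals in error messages; this is a sketch, not Spark's actual implementation):

```python
# Null-safe value formatting: None becomes the SQL NULL literal instead of
# crashing the error-message construction the way null.toString() did.

def to_sql_value(v):
    """Render a value as a SQL string literal; None renders as NULL."""
    if v is None:
        return "NULL"
    return "'" + str(v).replace("'", "''") + "'"

# Parsing a top-level JSON array leaves the current field name unset (None):
assert to_sql_value(None) == "NULL"   # old code: null.toString() -> NPE
assert to_sql_value("a") == "'a'"
```

This is why the updated test expects `toSQLValue("a", StringType)` (i.e. `'a'`) rather than the bare string `a` for the `fieldName` parameter.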
[spark] branch master updated: [SPARK-41575][SQL] Assign name to _LEGACY_ERROR_TEMP_2054
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new aaee89a12fd [SPARK-41575][SQL] Assign name to _LEGACY_ERROR_TEMP_2054
aaee89a12fd is described below

commit aaee89a12fd9b8ca3c57fa4283a51ce229dd7b71
Author: itholic
AuthorDate: Tue Jan 10 16:25:15 2023 +0300

    [SPARK-41575][SQL] Assign name to _LEGACY_ERROR_TEMP_2054

    ### What changes were proposed in this pull request?

    This PR proposes to assign the name "TASK_WRITE_FAILED" to _LEGACY_ERROR_TEMP_2054.

    ### Why are the changes needed?

    We should assign a proper name to _LEGACY_ERROR_TEMP_*.

    ### Does this PR introduce _any_ user-facing change?

    No

    ### How was this patch tested?

    `./build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite*"`

    Closes #39394 from itholic/LEGACY_2054.

    Authored-by: itholic
    Signed-off-by: Max Gekk
---
 core/src/main/resources/error/error-classes.json   | 10 +--
 .../spark/sql/errors/QueryExecutionErrors.scala    |  6 +-
 .../execution/datasources/FileFormatWriter.scala   |  2 +-
 .../apache/spark/sql/CharVarcharTestSuite.scala    | 82 +++---
 .../org/apache/spark/sql/sources/InsertSuite.scala | 16 +++--
 .../spark/sql/HiveCharVarcharTestSuite.scala       | 27 +++
 6 files changed, 104 insertions(+), 39 deletions(-)

diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json
index a3acb940585..edf46a0fe09 100644
--- a/core/src/main/resources/error/error-classes.json
+++ b/core/src/main/resources/error/error-classes.json
@@ -1187,6 +1187,11 @@
     ],
     "sqlState" : "42000"
   },
+  "TASK_WRITE_FAILED" : {
+    "message" : [
+      "Task failed while writing rows to <path>."
+    ]
+  },
   "TEMP_TABLE_OR_VIEW_ALREADY_EXISTS" : {
     "message" : [
       "Cannot create the temporary view because it already exists.",
@@ -3728,11 +3733,6 @@
       "buildReader is not supported for "
     ]
   },
-  "_LEGACY_ERROR_TEMP_2054" : {
-    "message" : [
-      "Task failed while writing rows. <message>"
-    ]
-  },
   "_LEGACY_ERROR_TEMP_2055" : {
     "message" : [
       "",
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala
index 17fc38812f8..9598933d941 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala
@@ -782,10 +782,10 @@ private[sql] object QueryExecutionErrors extends QueryErrorsBase {
     messageParameters = Map("format" -> format))
   }

-  def taskFailedWhileWritingRowsError(cause: Throwable): Throwable = {
+  def taskFailedWhileWritingRowsError(path: String, cause: Throwable): Throwable = {
     new SparkException(
-      errorClass = "_LEGACY_ERROR_TEMP_2054",
-      messageParameters = Map("message" -> cause.getMessage),
+      errorClass = "TASK_WRITE_FAILED",
+      messageParameters = Map("path" -> path),
       cause = cause)
   }

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala
index 6285095c647..5c4d662c145 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala
@@ -423,7 +423,7 @@ object FileFormatWriter extends Logging {
         // We throw the exception and let Executor throw ExceptionFailure to abort the job.
         throw new TaskOutputFileAlreadyExistException(f)
       case t: Throwable =>
-        throw QueryExecutionErrors.taskFailedWhileWritingRowsError(t)
+        throw QueryExecutionErrors.taskFailedWhileWritingRowsError(description.path, t)
     }
   }

diff --git a/sql/core/src/test/scala/org/apache/spark/sql/CharVarcharTestSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/CharVarcharTestSuite.scala
index 95c2e5085d9..c0ceebaa9a6 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/CharVarcharTestSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/CharVarcharTestSuite.scala
@@ -178,26 +178,6 @@ trait CharVarcharTestSuite extends QueryTest with SQLTestUtils {
     }
   }

-  test("char/varchar type values length check: partitioned columns of other types") {
-    Seq("CHAR(5)", "VARCHAR(5)").for
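Note the shape of the change: instead of copying `cause.getMessage` into the error text, the new `TASK_WRITE_FAILED` error carries the output path as its parameter and chains the original failure as the cause. A Python sketch of that exception-wrapping pattern (class and function names here are hypothetical, not Spark API):

```python
# Sketch: wrap a low-level write failure in a named error that records the
# output path, while preserving the original exception as the cause.

class SparkTaskWriteError(RuntimeError):
    def __init__(self, path, cause):
        super().__init__(
            f"[TASK_WRITE_FAILED] Task failed while writing rows to {path}.")
        self.__cause__ = cause  # keep the real failure for debugging

def write_partition(path):
    """Stand-in for a partition writer that fails mid-write."""
    try:
        raise OSError("disk quota exceeded")  # simulated write failure
    except OSError as e:
        raise SparkTaskWriteError(path, e)
```

Chaining the cause rather than inlining its message keeps the top-level error stable and machine-matchable while losing no diagnostic detail.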
[spark] branch master updated: [SPARK-41947][CORE][DOCS] Update the contents of error class guidelines
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 786594734bd [SPARK-41947][CORE][DOCS] Update the contents of error class guidelines 786594734bd is described below commit 786594734bd79017ebd42eb117b62958afad07bb Author: itholic AuthorDate: Mon Jan 9 23:24:09 2023 +0300 [SPARK-41947][CORE][DOCS] Update the contents of error class guidelines ### What changes were proposed in this pull request? This PR proposes to update error class guidelines for `core/src/main/resources/error/README.md`. ### Why are the changes needed? Because some of contents are out of date, and no longer valid for current behavior. ### Does this PR introduce _any_ user-facing change? No. It fixed the developer guidelines for error class. ### How was this patch tested? The existing CI should pass. Closes #39464 from itholic/SPARK-41947. Authored-by: itholic Signed-off-by: Max Gekk --- core/src/main/resources/error/README.md | 16 +--- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/core/src/main/resources/error/README.md b/core/src/main/resources/error/README.md index 23e62cd25fb..8ea9e37c27f 100644 --- a/core/src/main/resources/error/README.md +++ b/core/src/main/resources/error/README.md @@ -8,9 +8,9 @@ and message parameters rather than an arbitrary error message. 1. Check if the error is an internal error. Internal errors are bugs in the code that we do not expect users to encounter; this does not include unsupported operations. If true, use the error class `INTERNAL_ERROR` and skip to step 4. -2. Check if an appropriate error class already exists in `error-class.json`. +2. Check if an appropriate error class already exists in `error-classes.json`. If true, use the error class and skip to step 4. -3. Add a new class to `error-class.json`; keep in mind the invariants below. +3. 
Add a new class to `error-classes.json`; keep in mind the invariants below. 4. Check if the exception type already extends `SparkThrowable`. If true, skip to step 6. 5. Mix `SparkThrowable` into the exception. @@ -24,10 +24,10 @@ Throw with arbitrary error message: ### After -`error-class.json` +`error-classes.json` "PROBLEM_BECAUSE": { - "message": ["Problem %s because %s"], + "message": ["Problem because "], "sqlState": "X" } @@ -35,16 +35,18 @@ Throw with arbitrary error message: class SparkTestException( errorClass: String, -messageParameters: Seq[String]) +messageParameters: Map[String, String]) extends TestException(SparkThrowableHelper.getMessage(errorClass, messageParameters)) with SparkThrowable { - def getErrorClass: String = errorClass + override def getMessageParameters: java.util.Map[String, String] = messageParameters.asJava + + override def getErrorClass: String = errorClass } Throw with error class and message parameters: -throw new SparkTestException("PROBLEM_BECAUSE", Seq("A", "B")) +throw new SparkTestException("PROBLEM_BECAUSE", Map("problem" -> "A", "cause" -> "B")) ## Access fields - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
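The guideline change above moves from positional `%s` parameters in a `Seq` to named parameters in a `Map`. The idea can be illustrated with a small, Spark-independent Python sketch. This is a hedged approximation: the function name and the `<name>` placeholder syntax are assumptions inferred from the `Map("problem" -> "A", "cause" -> "B")` call shown in the diff (the `<problem>`/`<cause>` placeholders in the JSON template were likely stripped by the mail renderer as HTML tags); the real implementation lives in Spark's `SparkThrowableHelper`.

```python
import re

def format_error_message(template_lines, parameters):
    """Render an error-class message template by substituting named
    <placeholder> tokens from a parameter map (hypothetical sketch of
    what SparkThrowableHelper.getMessage provides)."""
    message = "\n".join(template_lines)

    def substitute(match):
        name = match.group(1)
        if name not in parameters:
            raise KeyError(f"missing message parameter: <{name}>")
        return parameters[name]

    return re.sub(r"<([A-Za-z0-9_]+)>", substitute, message)

# Named parameters replace the old positional Seq("A", "B") style:
print(format_error_message(["Problem <problem> because <cause>"],
                           {"problem": "A", "cause": "B"}))
# → Problem A because B
```

Named parameters make call sites self-documenting and let the template reorder arguments without breaking callers, which is the motivation behind the `Map[String, String]` signature in the updated README example.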
[spark] branch master updated: [SPARK-41780][SQL] Should throw INVALID_PARAMETER_VALUE.PATTERN when the parameters `regexp` is invalid
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 15a0f55246b [SPARK-41780][SQL] Should throw INVALID_PARAMETER_VALUE.PATTERN when the parameters `regexp` is invalid 15a0f55246b is described below commit 15a0f55246bee7b043bd6081f53744fbf74403eb Author: panbingkun AuthorDate: Mon Jan 9 11:37:54 2023 +0300 [SPARK-41780][SQL] Should throw INVALID_PARAMETER_VALUE.PATTERN when the parameters `regexp` is invalid ### What changes were proposed in this pull request? In the PR, I propose to throw error classes - `INVALID_PARAMETER_VALUE.PATTERN` when the parameters `regexp` in regexp_replace & regexp_extract & rlike is invalid. ### Why are the changes needed? Clear error prompt should improve user experience with Spark SQL. The original error prompt is: https://user-images.githubusercontent.com/15246973/210493673-c1de9927-9a18-4f9d-a94c-48735b6c5e5a.png";> Valid: [a\\d]{0,2} Invalid: [a\\d]{0, 2} ![image](https://user-images.githubusercontent.com/15246973/210494925-cb6c8043-de02-4c8e-9b40-225350422340.png) ### Does this PR introduce _any_ user-facing change? Yes. ### How was this patch tested? Add new UT. Pass GA. Closes #39383 from panbingkun/SPARK-41780. 
Authored-by: panbingkun Signed-off-by: Max Gekk --- .../catalyst/expressions/regexpExpressions.scala | 20 ++-- .../expressions/RegexpExpressionsSuite.scala | 29 - .../apache/spark/sql/StringFunctionsSuite.scala| 38 ++ 3 files changed, 76 insertions(+), 11 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala index c86dcfb3b96..29510bc3852 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala @@ -57,7 +57,12 @@ abstract class StringRegexExpression extends BinaryExpression null } else { // Let it raise exception if couldn't compile the regex string -Pattern.compile(escape(str)) +try { + Pattern.compile(escape(str)) +} catch { + case e: PatternSyntaxException => +throw QueryExecutionErrors.invalidPatternError(prettyName, e.getPattern, e) +} } protected def pattern(str: String) = if (cache == null) compile(str) else cache @@ -634,7 +639,12 @@ case class RegExpReplace(subject: Expression, regexp: Expression, rep: Expressio if (!p.equals(lastRegex)) { // regex value changed lastRegex = p.asInstanceOf[UTF8String].clone() - pattern = Pattern.compile(lastRegex.toString) + try { +pattern = Pattern.compile(lastRegex.toString) + } catch { +case e: PatternSyntaxException => + throw QueryExecutionErrors.invalidPatternError(prettyName, e.getPattern, e) + } } if (!r.equals(lastReplacementInUTF8)) { // replacement string changed @@ -688,7 +698,11 @@ case class RegExpReplace(subject: Expression, regexp: Expression, rep: Expressio if (!$regexp.equals($termLastRegex)) { // regex value changed $termLastRegex = $regexp.clone(); -$termPattern = $classNamePattern.compile($termLastRegex.toString()); +try { + $termPattern = $classNamePattern.compile($termLastRegex.toString()); +} 
catch (java.util.regex.PatternSyntaxException e) { + throw QueryExecutionErrors.invalidPatternError("$prettyName", e.getPattern(), e); +} } if (!$rep.equals($termLastReplacementInUTF8)) { // replacement string changed diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/RegexpExpressionsSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/RegexpExpressionsSuite.scala index 8b5e303849c..af051a1a9bc 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/RegexpExpressionsSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/RegexpExpressionsSuite.scala @@ -279,14 +279,27 @@ class RegexpExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper { checkLiteralRow("abc" rlike _, "^bc", false) checkLiteralRow("abc" rlike _, "^ab", true) checkLiteralRow("abc" rlike _, "^bc", false) - -intercept[java.util.regex.PatternSyntaxException] { - evaluateWithoutCodegen("ac" rlike "**") -} -intercept[java.util.regex.PatternS
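The patch wraps `Pattern.compile` so a raw `PatternSyntaxException` surfaces as the structured `INVALID_PARAMETER_VALUE.PATTERN` error. The wrapping pattern is language-agnostic; below is a hedged Python sketch. Note one engine difference: Python's `re` tolerates the spaced quantifier `{0, 2}` by treating the braces literally, unlike Java's regex engine, so the invalid example here uses `**` (the same pattern the Spark test suite uses).

```python
import re

def compile_or_report(func_name, pattern):
    """Compile a regex, converting the engine's error into an
    INVALID_PARAMETER_VALUE.PATTERN-style message. Sketch only: Spark's
    real path goes through QueryExecutionErrors.invalidPatternError."""
    try:
        return re.compile(pattern)
    except re.error as e:
        raise ValueError(
            "[INVALID_PARAMETER_VALUE.PATTERN] The value of parameter(s) "
            f"`regexp` in `{func_name}` is invalid: '{pattern}' ({e})") from e

compile_or_report("rlike", r"[a\d]{0,2}")  # valid pattern compiles fine
try:
    compile_or_report("rlike", "**")       # invalid: nothing to repeat
except ValueError as err:
    print(err)
```

Attaching the original exception as the cause (`from e` here, the extra `Throwable` argument in the Scala change) preserves the engine's position/diagnostic information for debugging.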
[spark] branch master updated: [SPARK-41581][SQL] Update `_LEGACY_ERROR_TEMP_1230` as `INTERNAL_ERROR`
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 6b92cda04e6 [SPARK-41581][SQL] Update `_LEGACY_ERROR_TEMP_1230` as `INTERNAL_ERROR` 6b92cda04e6 is described below commit 6b92cda04e618f82711587d027fa20601e094418 Author: itholic AuthorDate: Mon Jan 9 10:41:49 2023 +0300 [SPARK-41581][SQL] Update `_LEGACY_ERROR_TEMP_1230` as `INTERNAL_ERROR` ### What changes were proposed in this pull request? This PR proposes to update `_LEGACY_ERROR_TEMP_1230`, as `INTERNAL_ERROR`. ### Why are the changes needed? We should assign proper name to _LEGACY_ERROR_TEMP_* ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? `./build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite*` Closes #39282 from itholic/LEGACY_1230. Authored-by: itholic Signed-off-by: Max Gekk --- core/src/main/resources/error/error-classes.json| 5 - .../apache/spark/sql/errors/QueryCompilationErrors.scala| 10 -- .../scala/org/apache/spark/sql/types/DecimalSuite.scala | 13 - 3 files changed, 12 insertions(+), 16 deletions(-) diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json index 5409507c3c8..a3acb940585 100644 --- a/core/src/main/resources/error/error-classes.json +++ b/core/src/main/resources/error/error-classes.json @@ -2944,11 +2944,6 @@ " can only support precision up to ." ] }, - "_LEGACY_ERROR_TEMP_1230" : { -"message" : [ - "Negative scale is not allowed: . You can use =true to enable legacy mode to allow it." -] - }, "_LEGACY_ERROR_TEMP_1231" : { "message" : [ " is not a valid partition column in table ." 
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala index 2ced0b8ac7a..25005a1f609 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala @@ -21,7 +21,7 @@ import scala.collection.mutable import org.apache.hadoop.fs.Path -import org.apache.spark.{SparkThrowable, SparkThrowableHelper} +import org.apache.spark.{SparkException, SparkThrowable, SparkThrowableHelper} import org.apache.spark.sql.AnalysisException import org.apache.spark.sql.catalyst.{FunctionIdentifier, QualifiedTableName, TableIdentifier} import org.apache.spark.sql.catalyst.analysis.{CannotReplaceMissingTableException, FunctionAlreadyExistsException, NamespaceAlreadyExistsException, NoSuchFunctionException, NoSuchNamespaceException, NoSuchPartitionException, NoSuchTableException, ResolvedTable, Star, TableAlreadyExistsException, UnresolvedRegex} @@ -2242,11 +2242,9 @@ private[sql] object QueryCompilationErrors extends QueryErrorsBase { } def negativeScaleNotAllowedError(scale: Int): Throwable = { -new AnalysisException( - errorClass = "_LEGACY_ERROR_TEMP_1230", - messageParameters = Map( -"scale" -> scale.toString, -"config" -> LEGACY_ALLOW_NEGATIVE_SCALE_OF_DECIMAL_ENABLED.key)) +SparkException.internalError(s"Negative scale is not allowed: ${scale.toString}." 
+ + s" Set the config ${toSQLConf(LEGACY_ALLOW_NEGATIVE_SCALE_OF_DECIMAL_ENABLED.key)}" + + " to \"true\" to allow it.") } def invalidPartitionColumnKeyInTableError(key: String, tblName: String): Throwable = { diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/types/DecimalSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/types/DecimalSuite.scala index 73944d9dff9..465c25118fa 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/types/DecimalSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/types/DecimalSuite.scala @@ -19,8 +19,7 @@ package org.apache.spark.sql.types import org.scalatest.PrivateMethodTester -import org.apache.spark.{SparkArithmeticException, SparkFunSuite, SparkNumberFormatException} -import org.apache.spark.sql.AnalysisException +import org.apache.spark.{SparkArithmeticException, SparkException, SparkFunSuite, SparkNumberFormatException} import org.apache.spark.sql.catalyst.plans.SQLHelper import org.apache.spark.sql.internal.SQLConf import org.apache.spark.sql.types.Decimal._ @@ -111,9 +110,13 @@ class DecimalSuite extends SparkFunSuite with PrivateMethodTester with SQLHelper test("SPARK-30252: Negative scale is not allowed by default") {
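The change above reclassifies a negative decimal scale as an internal error that carries a config hint. A minimal Python sketch of that guard follows; the literal config key is an assumption, since the diff only references the Scala constant `LEGACY_ALLOW_NEGATIVE_SCALE_OF_DECIMAL_ENABLED`.

```python
class InternalError(RuntimeError):
    """Stand-in for SparkException.internalError."""

# Assumed key string; the diff shows only the constant name.
LEGACY_NEGATIVE_SCALE_KEY = "spark.sql.legacy.allowNegativeScaleOfDecimal.enabled"

def check_scale(scale, conf):
    """Reject a negative decimal scale unless the legacy flag is enabled."""
    if scale < 0 and not conf.get(LEGACY_NEGATIVE_SCALE_KEY, False):
        raise InternalError(
            f"Negative scale is not allowed: {scale}. "
            f'Set the config "{LEGACY_NEGATIVE_SCALE_KEY}" to "true" to allow it.')
    return scale

check_scale(2, {})                                  # positive scale is fine
check_scale(-1, {LEGACY_NEGATIVE_SCALE_KEY: True})  # allowed in legacy mode
```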
[spark] branch master updated (514449b7cbf -> a641dc4954d)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git from 514449b7cbf [SPARK-41899][CONNECT][PYTHON] `createDataFrame` should respect user provided DDL schema add a641dc4954d [SPARK-41889][SQL] Attach root cause to invalidPatternError & refactor error classes INVALID_PARAMETER_VALUE No new revisions were added by this update. Summary of changes: core/src/main/resources/error/error-classes.json | 24 - .../catalyst/expressions/regexpExpressions.scala | 5 ++--- .../spark/sql/errors/QueryCompilationErrors.scala | 7 +++--- .../spark/sql/errors/QueryExecutionErrors.scala| 25 -- .../expressions/RegexpExpressionsSuite.scala | 19 ++-- .../sql-tests/results/postgreSQL/text.sql.out | 5 ++--- .../sql-tests/results/regexp-functions.sql.out | 18 .../sql/errors/QueryCompilationErrorsSuite.scala | 9 .../sql/errors/QueryExecutionErrorsSuite.scala | 19 9 files changed, 78 insertions(+), 53 deletions(-)

[spark] branch master updated: [SPARK-41580][SQL] Assign name to _LEGACY_ERROR_TEMP_2137
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 4d3bc8f5b55 [SPARK-41580][SQL] Assign name to _LEGACY_ERROR_TEMP_2137 4d3bc8f5b55 is described below commit 4d3bc8f5b55969f7c954991239ff43f9faba1346 Author: itholic AuthorDate: Thu Jan 5 10:58:14 2023 +0500 [SPARK-41580][SQL] Assign name to _LEGACY_ERROR_TEMP_2137 ### What changes were proposed in this pull request? This PR proposes to assign name to _LEGACY_ERROR_TEMP_2137, "INVALID_JSON_ROOT_FIELD". ### Why are the changes needed? We should assign proper name to _LEGACY_ERROR_TEMP_* ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? `./build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite*` Closes #39305 from itholic/LEGACY_2137. Authored-by: itholic Signed-off-by: Max Gekk --- core/src/main/resources/error/error-classes.json | 10 +- .../org/apache/spark/sql/errors/QueryExecutionErrors.scala | 2 +- .../spark/sql/execution/datasources/json/JsonSuite.scala | 14 +++--- 3 files changed, 17 insertions(+), 9 deletions(-) diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json index 12f4b0f9c37..29cafdcc1b6 100644 --- a/core/src/main/resources/error/error-classes.json +++ b/core/src/main/resources/error/error-classes.json @@ -760,6 +760,11 @@ "The identifier is invalid. Please, consider quoting it with back-quotes as ``." ] }, + "INVALID_JSON_ROOT_FIELD" : { +"message" : [ + "Cannot convert JSON root field to target Spark type." +] + }, "INVALID_JSON_SCHEMA_MAP_TYPE" : { "message" : [ "Input schema can only contain STRING as a key type for a MAP." 
@@ -4110,11 +4115,6 @@ "Failed to parse an empty string for data type " ] }, - "_LEGACY_ERROR_TEMP_2137" : { -"message" : [ - "Root converter returned null" -] - }, "_LEGACY_ERROR_TEMP_2138" : { "message" : [ "Cannot have circular references in bean class, but got the circular reference of class " diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala index 227e86994f5..0c92d56ed04 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala @@ -1457,7 +1457,7 @@ private[sql] object QueryExecutionErrors extends QueryErrorsBase { def rootConverterReturnNullError(): SparkRuntimeException = { new SparkRuntimeException( - errorClass = "_LEGACY_ERROR_TEMP_2137", + errorClass = "INVALID_JSON_ROOT_FIELD", messageParameters = Map.empty) } diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala index 0d2c98316e7..a4b7df9af42 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala @@ -25,11 +25,12 @@ import java.time.{Duration, Instant, LocalDate, LocalDateTime, Period, ZoneId} import java.util.Locale import com.fasterxml.jackson.core.JsonFactory +import org.apache.commons.lang3.exception.ExceptionUtils import org.apache.hadoop.fs.{Path, PathFilter} import org.apache.hadoop.io.SequenceFile.CompressionType import org.apache.hadoop.io.compress.GzipCodec -import org.apache.spark.{SparkConf, SparkException, SparkUpgradeException, TestUtils} +import org.apache.spark.{SparkConf, SparkException, SparkRuntimeException, SparkUpgradeException, TestUtils} import 
org.apache.spark.rdd.RDD import org.apache.spark.sql.{functions => F, _} import org.apache.spark.sql.catalyst.json._ @@ -3192,10 +3193,17 @@ abstract class JsonSuite } test("SPARK-36379: proceed parsing with root nulls in permissive mode") { -assert(intercept[SparkException] { +val exception = intercept[SparkException] { spark.read.option("mode", "failfast") .schema("a string").json(Seq("""[{"a": "str"}, null]""").toDS).collect() -}.getMessage.contains("Malformed records are detected")) +} +assert(exception.getMessage.contains("Malformed records are d
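The rewritten test above asserts that a `null` root record fails in FAILFAST mode with the new `INVALID_JSON_ROOT_FIELD` class, while PERMISSIVE mode proceeds. A loose, Spark-free Python model of the two modes (the mode names match Spark's; everything else is illustrative):

```python
import json

def parse_records(text, mode="PERMISSIVE"):
    """Parse a JSON array of records; a null root either fails fast or is
    dropped, loosely modeling Spark's FAILFAST vs PERMISSIVE JSON modes."""
    rows = []
    for record in json.loads(text):
        if record is None:
            if mode == "FAILFAST":
                raise ValueError("[INVALID_JSON_ROOT_FIELD] Cannot convert "
                                 "JSON root field to target Spark type.")
            continue  # permissive mode proceeds past the null root
        rows.append(record)
    return rows

print(parse_records('[{"a": "str"}, null]'))  # → [{'a': 'str'}]
```

In Spark the FAILFAST failure is wrapped in a `SparkException` ("Malformed records are detected"), with the `SparkRuntimeException` from `rootConverterReturnNullError` attached as the root cause, which is what the updated test unwraps.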
[spark] branch master updated: [SPARK-41576][SQL] Assign name to _LEGACY_ERROR_TEMP_2051
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 76d7c857078 [SPARK-41576][SQL] Assign name to _LEGACY_ERROR_TEMP_2051 76d7c857078 is described below commit 76d7c8570788c773720c6e143e496647dfe9ebe0 Author: itholic AuthorDate: Thu Jan 5 10:47:46 2023 +0500 [SPARK-41576][SQL] Assign name to _LEGACY_ERROR_TEMP_2051 ### What changes were proposed in this pull request? This PR proposes to assign name to _LEGACY_ERROR_TEMP_2051, "DATA_SOURCE_NOT_FOUND". ### Why are the changes needed? We should assign proper name to _LEGACY_ERROR_TEMP_* ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? `./build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite*` Closes #39281 from itholic/LEGACY_2051. Authored-by: itholic Signed-off-by: Max Gekk --- core/src/main/resources/error/error-classes.json | 10 +- .../org/apache/spark/sql/errors/QueryExecutionErrors.scala | 4 ++-- .../apache/spark/sql/execution/datasources/DataSource.scala | 2 +- .../org/apache/spark/sql/execution/command/DDLSuite.scala| 12 .../apache/spark/sql/sources/ResolvedDataSourceSuite.scala | 9 +++-- 5 files changed, 23 insertions(+), 14 deletions(-) diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json index 120925f5254..12f4b0f9c37 100644 --- a/core/src/main/resources/error/error-classes.json +++ b/core/src/main/resources/error/error-classes.json @@ -441,6 +441,11 @@ ], "sqlState" : "42000" }, + "DATA_SOURCE_NOT_FOUND" : { +"message" : [ + "Failed to find the data source: . Please find packages at `https://spark.apache.org/third-party-projects.html`."; +] + }, "DATETIME_OVERFLOW" : { "message" : [ "Datetime operation overflow: ." 
@@ -3696,11 +3701,6 @@ "Expected exactly one path to be specified, but got: " ] }, - "_LEGACY_ERROR_TEMP_2051" : { -"message" : [ - "Failed to find data source: . Please find packages at https://spark.apache.org/third-party-projects.html"; -] - }, "_LEGACY_ERROR_TEMP_2052" : { "message" : [ " was removed in Spark 2.0. Please check if your library is compatible with Spark 2.0" diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala index 44a1972272f..227e86994f5 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala @@ -731,10 +731,10 @@ private[sql] object QueryExecutionErrors extends QueryErrorsBase { messageParameters = Map("paths" -> allPaths.mkString(", "))) } - def failedToFindDataSourceError( + def dataSourceNotFoundError( provider: String, error: Throwable): SparkClassNotFoundException = { new SparkClassNotFoundException( - errorClass = "_LEGACY_ERROR_TEMP_2051", + errorClass = "DATA_SOURCE_NOT_FOUND", messageParameters = Map("provider" -> provider), cause = error) } diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala index edbdd6bbc67..9bb5191dc01 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala @@ -643,7 +643,7 @@ object DataSource extends Logging { } else if (provider1.toLowerCase(Locale.ROOT) == "kafka") { throw QueryCompilationErrors.failedToFindKafkaDataSourceError(provider1) } else { - throw QueryExecutionErrors.failedToFindDataSourceError(provider1, error) + throw QueryExecutionErrors.dataSourceNotFoundError(provider1, error) } } } 
catch { diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala index 6cc37a41210..f5d17b142e2 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala @@ -24,7 +24,7
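The renamed `DATA_SOURCE_NOT_FOUND` error fires during provider resolution in `DataSource.lookupDataSource`. A simplified Python sketch of that lookup; the provider set below is illustrative, since the real implementation discovers `DataSourceRegister` implementations via `ServiceLoader`:

```python
# Illustrative subset; Spark discovers providers dynamically.
BUILTIN_PROVIDERS = {"parquet", "json", "csv", "orc", "text", "jdbc"}

class DataSourceNotFound(Exception):
    pass

def lookup_data_source(provider):
    """Resolve a short data-source name or raise the renamed error class
    with its third-party-packages hint."""
    if provider.lower() in BUILTIN_PROVIDERS:
        return provider.lower()
    raise DataSourceNotFound(
        f"[DATA_SOURCE_NOT_FOUND] Failed to find the data source: {provider}. "
        "Please find packages at "
        "`https://spark.apache.org/third-party-projects.html`.")

print(lookup_data_source("Parquet"))  # → parquet
```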
[spark] branch master updated: [SPARK-41573][SQL] Assign name to _LEGACY_ERROR_TEMP_2136
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new f352f103ed5 [SPARK-41573][SQL] Assign name to _LEGACY_ERROR_TEMP_2136 f352f103ed5 is described below commit f352f103ed512806abb3f642571a0c595b8b0509 Author: itholic AuthorDate: Thu Jan 5 00:21:32 2023 +0500 [SPARK-41573][SQL] Assign name to _LEGACY_ERROR_TEMP_2136 ### What changes were proposed in this pull request? This PR proposes to assign name to _LEGACY_ERROR_TEMP_2136, "CANNOT_PARSE_JSON_FIELD". ### Why are the changes needed? We should assign proper name to _LEGACY_ERROR_TEMP_* ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? `./build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite*` Closes #39284 from itholic/LEGACY_2136. Authored-by: itholic Signed-off-by: Max Gekk --- core/src/main/resources/error/error-classes.json| 10 +- .../spark/sql/catalyst/json/JacksonParser.scala | 2 +- .../spark/sql/errors/QueryExecutionErrors.scala | 8 .../org/apache/spark/sql/JsonFunctionsSuite.scala | 21 ++--- 4 files changed, 24 insertions(+), 17 deletions(-) diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json index a7b120ef427..120925f5254 100644 --- a/core/src/main/resources/error/error-classes.json +++ b/core/src/main/resources/error/error-classes.json @@ -75,6 +75,11 @@ ], "sqlState" : "42000" }, + "CANNOT_PARSE_JSON_FIELD" : { +"message" : [ + "Cannot parse the field name and the value of the JSON token type to target Spark data type " +] + }, "CANNOT_PARSE_PROTOBUF_DESCRIPTOR" : { "message" : [ "Error parsing file descriptor byte[] into Descriptor object" @@ -4105,11 +4110,6 @@ "Failed to parse an empty string for data type " ] }, - "_LEGACY_ERROR_TEMP_2136" : { -"message" : [ - "Failed to parse field name , field value , [] 
to target spark data type []." -] - }, "_LEGACY_ERROR_TEMP_2137" : { "message" : [ "Root converter returned null" diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala index ee21a1e2b76..3fe26e87499 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala @@ -430,7 +430,7 @@ class JacksonParser( case token => // We cannot parse this token based on the given data type. So, we throw a // RuntimeException and this exception will be caught by `parse` method. - throw QueryExecutionErrors.failToParseValueForDataTypeError(parser, token, dataType) + throw QueryExecutionErrors.cannotParseJSONFieldError(parser, token, dataType) } /** diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala index 3e234cfee2c..44a1972272f 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala @@ -1444,15 +1444,15 @@ private[sql] object QueryExecutionErrors extends QueryErrorsBase { "dataType" -> dataType.catalogString)) } - def failToParseValueForDataTypeError(parser: JsonParser, token: JsonToken, dataType: DataType) + def cannotParseJSONFieldError(parser: JsonParser, jsonType: JsonToken, dataType: DataType) : SparkRuntimeException = { new SparkRuntimeException( - errorClass = "_LEGACY_ERROR_TEMP_2136", + errorClass = "CANNOT_PARSE_JSON_FIELD", messageParameters = Map( "fieldName" -> parser.getCurrentName.toString(), "fieldValue" -> parser.getText.toString(), -"token" -> token.toString(), -"dataType" -> dataType.toString())) +"jsonType" -> jsonType.toString(), +"dataType" -> 
toSQLType(dataType))) } def rootConverterReturnNullError(): SparkRuntimeException = { diff --git a/sql/core/src/test/scala/org/apache/spark/sql/JsonFunctionsSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/JsonFunctionsSuite.scala index 399665c0de6..0f282336d58 100644 --- a/sql/core/src/test/scala/org/apa
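The new `CANNOT_PARSE_JSON_FIELD` message reports the field name, raw value, JSON token type, and target type. A hedged Python analogue of that conversion failure (type names are simplified; Spark uses Jackson's `JsonToken` and quotes the target type via `toSQLType`):

```python
def convert_field(name, value, target_type):
    """Convert one parsed JSON value to a target type, raising a
    CANNOT_PARSE_JSON_FIELD-style error on failure (illustrative)."""
    converters = {"INT": int, "DOUBLE": float, "STRING": str}
    try:
        return converters[target_type](value)
    except (ValueError, TypeError) as e:
        raise ValueError(
            f"[CANNOT_PARSE_JSON_FIELD] Cannot parse the field name {name}, "
            f"the value {value!r} of the JSON token type "
            f"{type(value).__name__} to target Spark data type "
            f'"{target_type}".') from e

print(convert_field("a", "123", "INT"))  # → 123
```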
[spark] branch master updated (b7a0fc4b7bd -> 4d6856e913c)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git from b7a0fc4b7bd [SPARK-41658][CONNECT][TESTS] Enable doctests in pyspark.sql.connect.functions add 4d6856e913c [SPARK-41311][SQL][TESTS] Rewrite test RENAME_SRC_PATH_NOT_FOUND to trigger the error from user space No new revisions were added by this update. Summary of changes: .../sql/errors/QueryExecutionErrorsSuite.scala | 54 +- 1 file changed, 31 insertions(+), 23 deletions(-)
[spark] branch master updated (2cf11cdb04f -> 973b8ffc828)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git from 2cf11cdb04f [SPARK-41854][PYTHON][BUILD] Automatic reformat/check python/setup.py add 973b8ffc828 [SPARK-41807][CORE] Remove non-existent error class: UNSUPPORTED_FEATURE.DISTRIBUTE_BY No new revisions were added by this update. Summary of changes: core/src/main/resources/error/error-classes.json | 5 - 1 file changed, 5 deletions(-)
[spark] branch master updated: [SPARK-41571][SQL] Assign name to _LEGACY_ERROR_TEMP_2310
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 470beda2231 [SPARK-41571][SQL] Assign name to _LEGACY_ERROR_TEMP_2310 470beda2231 is described below commit 470beda2231c89d9cbd609bcf1e83d84c80a7f06 Author: itholic AuthorDate: Mon Jan 2 11:53:27 2023 +0500 [SPARK-41571][SQL] Assign name to _LEGACY_ERROR_TEMP_2310 ### What changes were proposed in this pull request? This PR proposes to assign name to _LEGACY_ERROR_TEMP_2310, "WRITE_STREAM_NOT_ALLOWED". ### Why are the changes needed? We should assign proper name to _LEGACY_ERROR_TEMP_* ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? `./build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite*` Closes #39285 from itholic/LEGACY_2310. Authored-by: itholic Signed-off-by: Max Gekk --- R/pkg/tests/fulltests/test_streaming.R | 3 +-- core/src/main/resources/error/error-classes.json | 10 +- sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala | 2 +- .../org/apache/spark/sql/test/DataFrameReaderWriterSuite.scala | 8 +--- 4 files changed, 12 insertions(+), 11 deletions(-) diff --git a/R/pkg/tests/fulltests/test_streaming.R b/R/pkg/tests/fulltests/test_streaming.R index cc84a985423..8804471e640 100644 --- a/R/pkg/tests/fulltests/test_streaming.R +++ b/R/pkg/tests/fulltests/test_streaming.R @@ -140,8 +140,7 @@ test_that("Non-streaming DataFrame", { expect_false(isStreaming(c)) expect_error(write.stream(c, "memory", queryName = "people", outputMode = "complete"), - paste0(".*(writeStream : analysis error - 'writeStream' can be called only on ", - "streaming Dataset/DataFrame).*")) + paste0("Error in writeStream : analysis error - \\[WRITE_STREAM_NOT_ALLOWED\\].*")) }) test_that("Unsupported operation", { diff --git a/core/src/main/resources/error/error-classes.json 
b/core/src/main/resources/error/error-classes.json index 4003fab0685..4687d04bf71 100644 --- a/core/src/main/resources/error/error-classes.json +++ b/core/src/main/resources/error/error-classes.json @@ -1618,6 +1618,11 @@ ], "sqlState" : "42000" }, + "WRITE_STREAM_NOT_ALLOWED" : { +"message" : [ + "`writeStream` can be called only on streaming Dataset/DataFrame." +] + }, "WRONG_NUM_ARGS" : { "message" : [ "Invalid number of arguments for the function ." @@ -4907,11 +4912,6 @@ "cannot resolve in MERGE command given columns []" ] }, - "_LEGACY_ERROR_TEMP_2310" : { -"message" : [ - "'writeStream' can be called only on streaming Dataset/DataFrame" -] - }, "_LEGACY_ERROR_TEMP_2311" : { "message" : [ "'writeTo' can not be called on streaming Dataset/DataFrame" diff --git a/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala b/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala index 5f6512d4e4b..c8e2a48859d 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala @@ -3875,7 +3875,7 @@ class Dataset[T] private[sql]( def writeStream: DataStreamWriter[T] = { if (!isStreaming) { logicalPlan.failAnalysis( -errorClass = "_LEGACY_ERROR_TEMP_2310", +errorClass = "WRITE_STREAM_NOT_ALLOWED", messageParameters = Map.empty) } new DataStreamWriter[T](this) diff --git a/sql/core/src/test/scala/org/apache/spark/sql/test/DataFrameReaderWriterSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/test/DataFrameReaderWriterSuite.scala index 3f2414d2178..17a003dfe8f 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/test/DataFrameReaderWriterSuite.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/test/DataFrameReaderWriterSuite.scala @@ -162,9 +162,11 @@ class DataFrameReaderWriterSuite extends QueryTest with SharedSparkSession with .writeStream .start() } -Seq("'writeStream'", "only", "streaming Dataset/DataFrame").foreach { s => - 
assert(e.getMessage.toLowerCase(Locale.ROOT).contains(s.toLowerCase(Locale.ROOT))) -} +checkError( + exception = e, + errorClass = "WRITE_STREAM_NOT_ALLOWED", + parameters = Map.empty +) } test("resolve default source") {
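The guard behind `WRITE_STREAM_NOT_ALLOWED` is a simple streaming check at the top of `Dataset.writeStream`. A toy Python model of it (the class and return value are stand-ins, not Spark API):

```python
class Dataset:
    """Tiny stand-in for Spark's Dataset, just enough to show the guard."""
    def __init__(self, is_streaming):
        self.is_streaming = is_streaming

    def write_stream(self):
        if not self.is_streaming:
            raise RuntimeError("[WRITE_STREAM_NOT_ALLOWED] `writeStream` can "
                               "be called only on streaming Dataset/DataFrame.")
        return "DataStreamWriter"  # placeholder for the real writer object

print(Dataset(is_streaming=True).write_stream())  # → DataStreamWriter
```

Putting the error-class name in square brackets at the front of the message is what lets the updated R test match on `\[WRITE_STREAM_NOT_ALLOWED\]` instead of a free-form sentence.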
[spark] branch master updated: [SPARK-41796][TESTS] Test the error class: UNSUPPORTED_SUBQUERY_EXPRESSION_CATEGORY.UNSUPPORTED_CORRELATED_REFERENCE_DATA_TYPE
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new b0751ed22b9 [SPARK-41796][TESTS] Test the error class: UNSUPPORTED_SUBQUERY_EXPRESSION_CATEGORY.UNSUPPORTED_CORRELATED_REFERENCE_DATA_TYPE b0751ed22b9 is described below commit b0751ed22b94a93a5a60a20b24a88ca77d67c694 Author: panbingkun AuthorDate: Sun Jan 1 21:45:56 2023 +0500 [SPARK-41796][TESTS] Test the error class: UNSUPPORTED_SUBQUERY_EXPRESSION_CATEGORY.UNSUPPORTED_CORRELATED_REFERENCE_DATA_TYPE ### What changes were proposed in this pull request? This PR aims to modify a test for the error class UNSUPPORTED_SUBQUERY_EXPRESSION_CATEGORY.UNSUPPORTED_CORRELATED_REFERENCE_DATA_TYPE in SubquerySuite. ### Why are the changes needed? The changes improve test coverage, and document expected error messages in tests. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Pass GA. Update existed UT. Closes #39320 from panbingkun/SPARK-41796. Authored-by: panbingkun Signed-off-by: Max Gekk --- .../scala/org/apache/spark/sql/SubquerySuite.scala | 20 ++-- 1 file changed, 14 insertions(+), 6 deletions(-) diff --git a/sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala index 65dd911df31..3d4a629f7a9 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala @@ -2452,16 +2452,24 @@ class SubquerySuite extends QueryTest Row(2)) // Cannot use non-orderable data type in one row subquery that cannot be collapsed. 
-val error = intercept[AnalysisException] { + checkError( +exception = intercept[AnalysisException] { sql( -""" - |select ( +"""select ( | select concat(a, a) from | (select upper(x['a'] + rand()) as a) |) from v1 - |""".stripMargin).collect() -} -assert(error.getMessage.contains("Correlated column reference 'v1.x' cannot be map type")) + |""".stripMargin + ).collect() +}, +errorClass = "UNSUPPORTED_SUBQUERY_EXPRESSION_CATEGORY." + + "UNSUPPORTED_CORRELATED_REFERENCE_DATA_TYPE", +parameters = Map("expr" -> "v1.x", "dataType" -> "map"), +context = ExpectedContext( + fragment = "select upper(x['a'] + rand()) as a", + start = 39, + stop = 72) + ) } }
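The `checkError` pattern in the diff above asserts on an error class plus a parameters map instead of matching raw message text. Conceptually, Spark's error framework keeps message templates (as in `error-classes.json`) with `<param>` placeholders that are filled from that map. The following is a minimal illustrative sketch in Python; the registry and the `format_error` helper are hypothetical, not Spark's actual implementation:

```python
import re

# Hypothetical, trimmed-down registry in the spirit of error-classes.json.
ERROR_CLASSES = {
    "UNSUPPORTED_SUBQUERY_EXPRESSION_CATEGORY."
    "UNSUPPORTED_CORRELATED_REFERENCE_DATA_TYPE":
        "Correlated column reference '<expr>' cannot be <dataType> type.",
}

def format_error(error_class: str, parameters: dict) -> str:
    """Fill <name> placeholders in the template with values from `parameters`."""
    template = ERROR_CLASSES[error_class]
    return re.sub(r"<(\w+)>", lambda m: parameters[m.group(1)], template)

msg = format_error(
    "UNSUPPORTED_SUBQUERY_EXPRESSION_CATEGORY."
    "UNSUPPORTED_CORRELATED_REFERENCE_DATA_TYPE",
    {"expr": "v1.x", "dataType": "map"},
)
# msg == "Correlated column reference 'v1.x' cannot be map type."
```

Asserting on the class name and the parameters map keeps tests stable even when the human-readable template wording is later reworded.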
[spark] branch master updated: [SPARK-41578][SQL] Assign name to _LEGACY_ERROR_TEMP_2141
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 7823f84942a [SPARK-41578][SQL] Assign name to _LEGACY_ERROR_TEMP_2141 7823f84942a is described below commit 7823f84942acd1a1a6abc5c1f9045317795d00fb Author: itholic AuthorDate: Fri Dec 30 12:18:50 2022 +0500 [SPARK-41578][SQL] Assign name to _LEGACY_ERROR_TEMP_2141 ### What changes were proposed in this pull request? This PR proposes to assign name to _LEGACY_ERROR_TEMP_2141, "ENCODER_NOT_FOUND". ### Why are the changes needed? We should assign proper name to _LEGACY_ERROR_TEMP_* ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? `./build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite*` Closes #39279 from itholic/LEGACY_2141. Authored-by: itholic Signed-off-by: Max Gekk --- core/src/main/resources/error/error-classes.json | 11 ++- .../spark/sql/catalyst/ScalaReflection.scala | 2 +- .../spark/sql/errors/QueryExecutionErrors.scala| 8 +-- .../encoders/EncoderErrorMessageSuite.scala| 80 ++ .../catalyst/encoders/ExpressionEncoderSuite.scala | 13 ++-- 5 files changed, 52 insertions(+), 62 deletions(-) diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json index 21b7c467b64..67398a30180 100644 --- a/core/src/main/resources/error/error-classes.json +++ b/core/src/main/resources/error/error-classes.json @@ -459,6 +459,11 @@ "The index 0 is invalid. An index shall be either < 0 or > 0 (the first element has index 1)." ] }, + "ENCODER_NOT_FOUND" : { +"message" : [ + "Not found an encoder of the type to Spark SQL internal representation. 
Consider to change the input type to one of supported at https://spark.apache.org/docs/latest/sql-ref-datatypes.html."; +] + }, "FAILED_EXECUTE_UDF" : { "message" : [ "Failed to execute user defined function (: () => )" @@ -4116,12 +4121,6 @@ "" ] }, - "_LEGACY_ERROR_TEMP_2141" : { -"message" : [ - "No Encoder found for ", - "" -] - }, "_LEGACY_ERROR_TEMP_2142" : { "message" : [ "Attributes for type is not supported" diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala index 0a8a823216f..e02e42cea1a 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala @@ -779,7 +779,7 @@ object ScalaReflection extends ScalaReflection { } ProductEncoder(ClassTag(getClassFromType(t)), params) case _ => -throw QueryExecutionErrors.cannotFindEncoderForTypeError(tpe.toString, path) +throw QueryExecutionErrors.cannotFindEncoderForTypeError(tpe.toString) } } } diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala index cef4acafe07..3e234cfee2c 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala @@ -1483,13 +1483,11 @@ private[sql] object QueryExecutionErrors extends QueryErrorsBase { "walkedTypePath" -> walkedTypePath.toString())) } - def cannotFindEncoderForTypeError( - tpe: String, walkedTypePath: WalkedTypePath): SparkUnsupportedOperationException = { + def cannotFindEncoderForTypeError(typeName: String): SparkUnsupportedOperationException = { new SparkUnsupportedOperationException( - errorClass = "_LEGACY_ERROR_TEMP_2141", + errorClass = "ENCODER_NOT_FOUND", messageParameters = Map( 
-"tpe" -> tpe, -"walkedTypePath" -> walkedTypePath.toString())) +"typeName" -> typeName)) } def attributesForTypeUnsupportedError(schema: Schema): SparkUnsupportedOperationException = { diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/EncoderErrorMessageSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/EncoderErrorMessageSuite.scala index 8c766ef8299..501dfa58305 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/EncoderErrorMessageSuite.scala ++
[spark] branch master updated: [SPARK-41729][CORE][SQL] Rename `_LEGACY_ERROR_TEMP_0011` to `UNSUPPORTED_FEATURE.COMBINATION_QUERY_RESULT_CLAUSES`
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new e5508443f66 [SPARK-41729][CORE][SQL] Rename `_LEGACY_ERROR_TEMP_0011` to `UNSUPPORTED_FEATURE.COMBINATION_QUERY_RESULT_CLAUSES` e5508443f66 is described below commit e5508443f66d92fe5106bcdf7f2a868164c62c9c Author: yangjie01 AuthorDate: Wed Dec 28 11:36:47 2022 +0500 [SPARK-41729][CORE][SQL] Rename `_LEGACY_ERROR_TEMP_0011` to `UNSUPPORTED_FEATURE.COMBINATION_QUERY_RESULT_CLAUSES` ### What changes were proposed in this pull request? In the PR, I propose to assign the name `UNSUPPORTED_FEATURE.COMBINATION_QUERY_RESULT_CLAUSES` to the error class `_LEGACY_ERROR_TEMP_0011`. ### Why are the changes needed? Proper names of error classes should improve user experience with Spark SQL. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Pass GA Closes #39235 from LuciferYang/SPARK-41729. Authored-by: yangjie01 Signed-off-by: Max Gekk --- core/src/main/resources/error/error-classes.json | 10 +- .../scala/org/apache/spark/sql/errors/QueryParsingErrors.scala | 2 +- .../apache/spark/sql/catalyst/parser/ErrorParserSuite.scala| 2 +- .../org/apache/spark/sql/catalyst/parser/PlanParserSuite.scala | 8 4 files changed, 11 insertions(+), 11 deletions(-) diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json index 2f144251e5d..21b7c467b64 100644 --- a/core/src/main/resources/error/error-classes.json +++ b/core/src/main/resources/error/error-classes.json @@ -1331,6 +1331,11 @@ "Catalog does not support ." ] }, + "COMBINATION_QUERY_RESULT_CLAUSES" : { +"message" : [ + "Combination of ORDER BY/SORT BY/DISTRIBUTE BY/CLUSTER BY." +] + }, "DESC_TABLE_COLUMN_PARTITION" : { "message" : [ "DESC TABLE COLUMN for a specific partition." 
@@ -1645,11 +1650,6 @@ "There must be at least one WHEN clause in a MERGE statement." ] }, - "_LEGACY_ERROR_TEMP_0011" : { -"message" : [ - "Combination of ORDER BY/SORT BY/DISTRIBUTE BY/CLUSTER BY is not supported." -] - }, "_LEGACY_ERROR_TEMP_0012" : { "message" : [ "DISTRIBUTE BY is not supported." diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala index 773a79a3f3f..ef59dfa5517 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala @@ -78,7 +78,7 @@ private[sql] object QueryParsingErrors extends QueryErrorsBase { } def combinationQueryResultClausesUnsupportedError(ctx: QueryOrganizationContext): Throwable = { -new ParseException(errorClass = "_LEGACY_ERROR_TEMP_0011", ctx) +new ParseException(errorClass = "UNSUPPORTED_FEATURE.COMBINATION_QUERY_RESULT_CLAUSES", ctx) } def distributeByUnsupportedError(ctx: QueryOrganizationContext): Throwable = { diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ErrorParserSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ErrorParserSuite.scala index a985992abba..7cf853b0812 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ErrorParserSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ErrorParserSuite.scala @@ -34,7 +34,7 @@ class ErrorParserSuite extends AnalysisTest { test("semantic errors") { checkError( exception = parseException("select *\nfrom r\norder by q\ncluster by q"), - errorClass = "_LEGACY_ERROR_TEMP_0011", + errorClass = "UNSUPPORTED_FEATURE.COMBINATION_QUERY_RESULT_CLAUSES", parameters = Map.empty, context = ExpectedContext(fragment = "order by q\ncluster by q", start = 16, stop = 38)) } diff --git 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/PlanParserSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/PlanParserSuite.scala index 035e6231178..c25f218fe1b 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/PlanParserSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/PlanParserSuite.scala @@ -376,7 +376,7 @@ class PlanParserSuite extends AnalysisTest { val sql1 = s"$baseSql order
[spark] branch master updated: [SPARK-41666][PYTHON] Support parameterized SQL by `sql()`
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new a1c727f3867 [SPARK-41666][PYTHON] Support parameterized SQL by `sql()` a1c727f3867 is described below commit a1c727f386724156f680953fa34ec51bb35348a4 Author: Max Gekk AuthorDate: Fri Dec 23 12:30:30 2022 +0300 [SPARK-41666][PYTHON] Support parameterized SQL by `sql()` ### What changes were proposed in this pull request? In the PR, I propose to extend the `sql()` method in PySpark to support parameterized SQL queries, see https://github.com/apache/spark/pull/38864, and add new parameter - `args` of the type `Dict[str, str]`. This parameter maps named parameters that can occur in the input SQL query to SQL literals like 1, INTERVAL '1-1' YEAR TO MONTH, DATE'2022-12-22' (see [the doc ](https://spark.apache.org/docs/latest/sql-ref-literals.html)of supported literals). For example: ```python >>> spark.sql("SELECT * FROM range(10) WHERE id > :minId", args = {"minId" : "7"}) id 0 8 1 9 ``` Closes #39159 ### Why are the changes needed? To achieve feature parity with Scala/Java API, and provide PySpark users the same feature. ### Does this PR introduce _any_ user-facing change? No, it shouldn't. ### How was this patch tested? Checked the examples locally, and running the tests: ``` $ python/run-tests --modules=pyspark-sql --parallelism=1 ``` Closes #39183 from MaxGekk/parameterized-sql-pyspark-dict. 
Authored-by: Max Gekk Signed-off-by: Max Gekk --- core/src/main/resources/error/error-classes.json | 6 +++--- .../source/migration_guide/pyspark_3.3_to_3.4.rst | 2 ++ python/pyspark/pandas/sql_formatter.py | 20 +-- python/pyspark/sql/session.py | 23 ++ 4 files changed, 42 insertions(+), 9 deletions(-) diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json index ff235e80dbb..95db9005d02 100644 --- a/core/src/main/resources/error/error-classes.json +++ b/core/src/main/resources/error/error-classes.json @@ -813,7 +813,7 @@ }, "INVALID_SQL_ARG" : { "message" : [ - "The argument of `sql()` is invalid. Consider to replace it by a SQL literal statement." + "The argument of `sql()` is invalid. Consider to replace it by a SQL literal." ] }, "INVALID_SQL_SYNTAX" : { @@ -1164,7 +1164,7 @@ }, "UNBOUND_SQL_PARAMETER" : { "message" : [ - "Found the unbound parameter: . Please, fix `args` and provide a mapping of the parameter to a SQL literal statement." + "Found the unbound parameter: . Please, fix `args` and provide a mapping of the parameter to a SQL literal." ] }, "UNCLOSED_BRACKETED_COMMENT" : { @@ -5225,4 +5225,4 @@ "grouping() can only be used with GroupingSets/Cube/Rollup" ] } -} \ No newline at end of file +} diff --git a/python/docs/source/migration_guide/pyspark_3.3_to_3.4.rst b/python/docs/source/migration_guide/pyspark_3.3_to_3.4.rst index b3baa8345aa..ca942c54979 100644 --- a/python/docs/source/migration_guide/pyspark_3.3_to_3.4.rst +++ b/python/docs/source/migration_guide/pyspark_3.3_to_3.4.rst @@ -39,3 +39,5 @@ Upgrading from PySpark 3.3 to 3.4 * In Spark 3.4, the ``Series.concat`` sort parameter will be respected to follow pandas 1.4 behaviors. * In Spark 3.4, the ``DataFrame.__setitem__`` will make a copy and replace pre-existing arrays, which will NOT be over-written to follow pandas 1.4 behaviors. 
+ +* In Spark 3.4, the ``SparkSession.sql`` and the Pandas on Spark API ``sql`` have got new parameter ``args`` which provides binding of named parameters to their SQL literals. diff --git a/python/pyspark/pandas/sql_formatter.py b/python/pyspark/pandas/sql_formatter.py index 45c615161d9..9103366c192 100644 --- a/python/pyspark/pandas/sql_formatter.py +++ b/python/pyspark/pandas/sql_formatter.py @@ -17,7 +17,7 @@ import os import string -from typing import Any, Optional, Union, List, Sequence, Mapping, Tuple +from typing import Any, Dict, Optional, Union, List, Sequence, Mapping, Tuple import uuid import warnings @@ -43,6 +43,7 @@ _CAPTURE_SCOPES = 3 def sql( query: str, index_col: Optional[Union[str, List[str]]] = None, +args: Dict[str, str] = {}, **kwargs: Any, ) -> DataFrame: """ @@ -57,6 +58,8 @@ def sql( * pandas Series * string +Also the method can bind named parameters to SQL literals from `args`. + Parameters
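The new `args` parameter maps named markers such as `:minId` in the query text to SQL literals. The real binding is done against the parsed plan (which is what makes it injection-safe), but the marker-to-literal mapping itself can be sketched as plain text substitution. `bind_params` below is a hypothetical illustration, not PySpark's implementation:

```python
import re

def bind_params(query: str, args: dict) -> str:
    """Replace named :param markers with the literal text supplied in `args`.

    Illustrative only: Spark binds parameters in the parsed plan rather than
    rewriting the query string, so this sketch has none of the real safety
    properties. An unbound marker is the UNBOUND_SQL_PARAMETER case.
    """
    def repl(match):
        name = match.group(1)
        if name not in args:
            raise KeyError(f"UNBOUND_SQL_PARAMETER: {name}")
        return args[name]
    return re.sub(r":(\w+)", repl, query)

bound = bind_params("SELECT * FROM range(10) WHERE id > :minId", {"minId": "7"})
# bound == "SELECT * FROM range(10) WHERE id > 7"
```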
[spark] branch master updated: [SPARK-41565][SQL] Add the error class `UNRESOLVED_ROUTINE`
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new cd832e546fc [SPARK-41565][SQL] Add the error class `UNRESOLVED_ROUTINE` cd832e546fc is described below commit cd832e546fc58c522d4afa90fc781c3be2cc527e Author: Max Gekk AuthorDate: Wed Dec 21 16:33:02 2022 +0300 [SPARK-41565][SQL] Add the error class `UNRESOLVED_ROUTINE` ### What changes were proposed in this pull request? In the PR, I propose to remove the error classes `_LEGACY_ERROR_TEMP_1041`, `_LEGACY_ERROR_TEMP_1242` and `_LEGACY_ERROR_TEMP_1243`, and use new one `UNRESOLVED_ROUTINE` instead. Closes #38870 ### Why are the changes needed? To improve user experience with Spark SQL, and unify representation of error messages. ### Does this PR introduce _any_ user-facing change? Yes, the PR changes an user-facing error message. ### How was this patch tested? By running the modified test suites: ``` $ build/sbt "core/testOnly *SparkThrowableSuite" $ build/sbt "test:testOnly *SQLQuerySuite" $ build/sbt "test:testOnly *UDFSuite" $ build/sbt "test:testOnly *HiveUDFSuite" $ build/sbt "test:testOnly *HiveQuerySuite" $ build/sbt "test:testOnly *JDBCV2Suite" $ build/sbt "test:testOnly *DDLSuite" $ build/sbt "test:testOnly *DataSourceV2FunctionSuite" $ build/sbt "test:testOnly *LookupFunctionsSuite" $ PYSPARK_PYTHON=python3 build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite" ``` Closes #39095 from MaxGekk/unresolved-routine-error-class. 
Lead-authored-by: Max Gekk Co-authored-by: Serge Rielau Signed-off-by: Max Gekk --- core/src/main/resources/error/error-classes.json | 21 --- .../spark/sql/catalyst/analysis/Analyzer.scala | 15 ++-- .../sql/catalyst/analysis/CheckAnalysis.scala | 7 ++-- .../sql/catalyst/analysis/FunctionRegistry.scala | 2 +- .../spark/sql/errors/QueryCompilationErrors.scala | 41 ++ .../catalyst/analysis/LookupFunctionsSuite.scala | 10 +++--- .../apache/spark/sql/internal/CatalogImpl.scala| 5 ++- .../double-quoted-identifiers-disabled.sql.out | 13 --- .../ansi/double-quoted-identifiers-enabled.sql.out | 26 -- .../sql-tests/results/ansi/interval.sql.out| 28 --- .../results/double-quoted-identifiers.sql.out | 13 --- .../sql-tests/results/inline-table.sql.out | 7 ++-- .../resources/sql-tests/results/interval.sql.out | 28 --- .../results/postgreSQL/window_part3.sql.out| 7 ++-- .../sql-tests/results/udf/udf-inline-table.sql.out | 7 ++-- .../scala/org/apache/spark/sql/SQLQuerySuite.scala | 14 ++-- .../test/scala/org/apache/spark/sql/UDFSuite.scala | 19 ++ .../sql/connector/DataSourceV2FunctionSuite.scala | 15 ++-- .../spark/sql/execution/command/DDLSuite.scala | 34 -- .../org/apache/spark/sql/jdbc/JDBCV2Suite.scala| 32 - .../spark/sql/hive/execution/HiveQuerySuite.scala | 16 - .../spark/sql/hive/execution/HiveUDFSuite.scala| 17 ++--- .../spark/sql/hive/execution/SQLQuerySuite.scala | 13 +-- 23 files changed, 244 insertions(+), 146 deletions(-) diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json index e6ae5678993..989df84ed53 100644 --- a/core/src/main/resources/error/error-classes.json +++ b/core/src/main/resources/error/error-classes.json @@ -1266,6 +1266,12 @@ }, "sqlState" : "42000" }, + "UNRESOLVED_ROUTINE" : { +"message" : [ + "Cannot resolve function on search path ." 
+], +"sqlState" : "42000" + }, "UNSUPPORTED_DATATYPE" : { "message" : [ "Unsupported data type " @@ -2060,11 +2066,6 @@ "Gap duration expression used in session window must be CalendarIntervalType, but got ." ] }, - "_LEGACY_ERROR_TEMP_1041" : { -"message" : [ - "Undefined function ." -] - }, "_LEGACY_ERROR_TEMP_1045" : { "message" : [ "ALTER TABLE SET LOCATION does not support partition for v2 tables." @@ -2920,16 +2921,6 @@ "CREATE-TABLE-AS-SELECT cannot create table with location to a non-empty directory . To allow overwriting the existing non-empty directory, set '' to true." ] }, - "_LEGACY_ERROR_TEMP_1242" : { -"message" : [ - "Undefined function: . This function is neither a built-in/temporary fun
[spark] branch master updated: [SPARK-41568][SQL] Assign name to _LEGACY_ERROR_TEMP_1236
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 2440f699797 [SPARK-41568][SQL] Assign name to _LEGACY_ERROR_TEMP_1236 2440f699797 is described below commit 2440f6997978ca033579a311caea561140ef76d5 Author: panbingkun AuthorDate: Tue Dec 20 21:16:43 2022 +0300 [SPARK-41568][SQL] Assign name to _LEGACY_ERROR_TEMP_1236 ### What changes were proposed in this pull request? In the PR, I propose to assign the name `UNSUPPORTED_FEATURE.ANALYZE_VIEW` to the error class `_LEGACY_ERROR_TEMP_1236`. ### Why are the changes needed? Proper names of error classes should improve user experience with Spark SQL. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Pass GA. Closes #39119 from panbingkun/LEGACY_ERROR_TEMP_1236. Lead-authored-by: panbingkun Co-authored-by: Maxim Gekk Signed-off-by: Max Gekk --- core/src/main/resources/error/error-classes.json | 10 - .../spark/sql/errors/QueryCompilationErrors.scala | 2 +- .../spark/sql/StatisticsCollectionSuite.scala | 24 +- 3 files changed, 20 insertions(+), 16 deletions(-) diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json index 30b0a5ce8f3..b5e846a8a89 100644 --- a/core/src/main/resources/error/error-classes.json +++ b/core/src/main/resources/error/error-classes.json @@ -1309,6 +1309,11 @@ "The ANALYZE TABLE FOR COLUMNS command does not support the type of the column in the table ." ] }, + "ANALYZE_VIEW" : { +"message" : [ + "The ANALYZE TABLE command does not support views." +] + }, "CATALOG_OPERATION" : { "message" : [ "Catalog does not support ." @@ -2895,11 +2900,6 @@ "Partition spec is invalid. The spec () must match the partition spec () defined in table ''." 
] }, - "_LEGACY_ERROR_TEMP_1236" : { -"message" : [ - "ANALYZE TABLE is not supported on views." -] - }, "_LEGACY_ERROR_TEMP_1237" : { "message" : [ "The list of partition columns with values in partition specification for table '' in database '' is not a prefix of the list of partition columns defined in the table schema. Expected a prefix of [], but got []." diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala index 2ddd0704565..b0cf8f6876c 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala @@ -2302,7 +2302,7 @@ private[sql] object QueryCompilationErrors extends QueryErrorsBase { def analyzeTableNotSupportedOnViewsError(): Throwable = { new AnalysisException( - errorClass = "_LEGACY_ERROR_TEMP_1236", + errorClass = "UNSUPPORTED_FEATURE.ANALYZE_VIEW", messageParameters = Map.empty) } diff --git a/sql/core/src/test/scala/org/apache/spark/sql/StatisticsCollectionSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/StatisticsCollectionSuite.scala index dda1cc5b52b..2ab8bb25a8b 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/StatisticsCollectionSuite.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/StatisticsCollectionSuite.scala @@ -63,22 +63,26 @@ class StatisticsCollectionSuite extends StatisticsCollectionTestBase with Shared } test("analyzing views is not supported") { -def assertAnalyzeUnsupported(analyzeCommand: String): Unit = { - val err = intercept[AnalysisException] { -sql(analyzeCommand) - } - assert(err.message.contains("ANALYZE TABLE is not supported")) -} - val tableName = "tbl" withTable(tableName) { spark.range(10).write.saveAsTable(tableName) val viewName = "view" withView(viewName) { sql(s"CREATE VIEW $viewName AS SELECT * FROM $tableName") - 
-assertAnalyzeUnsupported(s"ANALYZE TABLE $viewName COMPUTE STATISTICS") -assertAnalyzeUnsupported(s"ANALYZE TABLE $viewName COMPUTE STATISTICS FOR COLUMNS id") +checkError( + exception = intercept[AnalysisException] { +sql(s"ANALYZE TABLE $viewName COMPUTE STATISTICS") + }, + errorClass = "UNSUPPORTED_FEATURE.ANALYZE_VIEW", + parameters = Ma
[spark] branch master updated: [SPARK-41582][CORE][SQL] Reuse `INVALID_TYPED_LITERAL` instead of `_LEGACY_ERROR_TEMP_0022`
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 9840a0327a3 [SPARK-41582][CORE][SQL] Reuse `INVALID_TYPED_LITERAL` instead of `_LEGACY_ERROR_TEMP_0022` 9840a0327a3 is described below commit 9840a0327a3f242877759c97d2e7bbf8b4ac1072 Author: yangjie01 AuthorDate: Tue Dec 20 18:15:08 2022 +0300 [SPARK-41582][CORE][SQL] Reuse `INVALID_TYPED_LITERAL` instead of `_LEGACY_ERROR_TEMP_0022` ### What changes were proposed in this pull request? This pr aims reuse `INVALID_TYPED_LITERAL` instead of `_LEGACY_ERROR_TEMP_0022`. ### Why are the changes needed? Proper names of error classes to improve user experience with Spark SQL. ### Does this PR introduce _any_ user-facing change? Yes, the PR changes user-facing error message. ### How was this patch tested? Pass GitHub Actions Closes #39122 from LuciferYang/SPARK-41582. Authored-by: yangjie01 Signed-off-by: Max Gekk --- core/src/main/resources/error/error-classes.json | 5 - .../spark/sql/catalyst/parser/AstBuilder.scala | 130 ++--- .../spark/sql/errors/QueryParsingErrors.scala | 9 -- .../catalyst/parser/ExpressionParserSuite.scala| 8 +- .../sql-tests/results/ansi/literals.sql.out| 6 +- .../resources/sql-tests/results/literals.sql.out | 6 +- 6 files changed, 77 insertions(+), 87 deletions(-) diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json index 68034a5221e..30b0a5ce8f3 100644 --- a/core/src/main/resources/error/error-classes.json +++ b/core/src/main/resources/error/error-classes.json @@ -1663,11 +1663,6 @@ "Function trim doesn't support with type . Please use BOTH, LEADING or TRAILING as trim type." ] }, - "_LEGACY_ERROR_TEMP_0022" : { -"message" : [ - "." -] - }, "_LEGACY_ERROR_TEMP_0023" : { "message" : [ "Numeric literal does not fit in range [, ] for type ." 
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala index 545d5d97d88..ea752a420d5 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala @@ -2379,76 +2379,72 @@ class AstBuilder extends SqlBaseParserBaseVisitor[AnyRef] with SQLConfHelper wit specialTs.getOrElse(toLiteral(stringToTimestamp(_, zoneId), TimestampType)) } -try { - valueType match { -case "DATE" => - val zoneId = getZoneId(conf.sessionLocalTimeZone) - val specialDate = convertSpecialDate(value, zoneId).map(Literal(_, DateType)) - specialDate.getOrElse(toLiteral(stringToDate, DateType)) -case "TIMESTAMP_NTZ" => - convertSpecialTimestampNTZ(value, getZoneId(conf.sessionLocalTimeZone)) -.map(Literal(_, TimestampNTZType)) -.getOrElse(toLiteral(stringToTimestampWithoutTimeZone, TimestampNTZType)) -case "TIMESTAMP_LTZ" => - constructTimestampLTZLiteral(value) -case "TIMESTAMP" => - SQLConf.get.timestampType match { -case TimestampNTZType => - convertSpecialTimestampNTZ(value, getZoneId(conf.sessionLocalTimeZone)) -.map(Literal(_, TimestampNTZType)) -.getOrElse { - val containsTimeZonePart = - DateTimeUtils.parseTimestampString(UTF8String.fromString(value))._2.isDefined - // If the input string contains time zone part, return a timestamp with local time - // zone literal. 
- if (containsTimeZonePart) { -constructTimestampLTZLiteral(value) - } else { -toLiteral(stringToTimestampWithoutTimeZone, TimestampNTZType) - } +valueType match { + case "DATE" => +val zoneId = getZoneId(conf.sessionLocalTimeZone) +val specialDate = convertSpecialDate(value, zoneId).map(Literal(_, DateType)) +specialDate.getOrElse(toLiteral(stringToDate, DateType)) + case "TIMESTAMP_NTZ" => +convertSpecialTimestampNTZ(value, getZoneId(conf.sessionLocalTimeZone)) + .map(Literal(_, TimestampNTZType)) + .getOrElse(toLiteral(stringToTimestampWithoutTimeZone, TimestampNTZType)) + case "TIMESTAMP_LTZ" => +constructTimestampLTZLiteral(value) + case "TIMESTAMP" => +
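The restructured typed-literal handling in `AstBuilder` above dispatches on `valueType` (DATE, TIMESTAMP_NTZ, TIMESTAMP_LTZ, ...) and, per the commit, surfaces parse failures as `INVALID_TYPED_LITERAL` instead of the legacy `_LEGACY_ERROR_TEMP_0022`. A rough Python analogue of that dispatch shape (a hypothetical helper, not the AstBuilder code; only two types are modeled and the real code builds Catalyst `Literal`s):

```python
import datetime

# Hypothetical mini-dispatcher mirroring the shape of typed-literal parsing.
PARSERS = {
    "DATE": datetime.date.fromisoformat,
    "TIMESTAMP_NTZ": datetime.datetime.fromisoformat,
}

def parse_typed_literal(value_type: str, value: str):
    parser = PARSERS.get(value_type.upper())
    if parser is None:
        raise ValueError(f"unsupported typed literal: {value_type}")
    try:
        return parser(value)
    except ValueError:
        # Spark reports this case under the error class INVALID_TYPED_LITERAL.
        raise ValueError(f"INVALID_TYPED_LITERAL: {value_type} '{value}'")
```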
[spark] branch branch-3.3 updated: [SPARK-41538][SQL] Metadata column should be appended at the end of project list
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch branch-3.3 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.3 by this push: new b23198ee6d7 [SPARK-41538][SQL] Metadata column should be appended at the end of project list b23198ee6d7 is described below commit b23198ee6d76cc0486ae810a1d37f0474b74c27c Author: Gengliang Wang AuthorDate: Fri Dec 16 10:43:17 2022 +0300 [SPARK-41538][SQL] Metadata column should be appended at the end of project list ### What changes were proposed in this pull request? For the following query: ``` CREATE TABLE table_1 ( a ARRAY, s STRUCT) USING parquet; CREATE VIEW view_1 (id) AS WITH source AS ( SELECT * FROM table_1 ), renamed AS ( SELECT s.id FROM source ) SELECT id FROM renamed; with foo AS ( SELECT 'a' as id ), bar AS ( SELECT 'a' as id ) SELECT 1 FROM foo FULL OUTER JOIN bar USING(id) FULL OUTER JOIN view_1 USING(id) WHERE foo.id IS NOT NULL ``` There will be the following error: ``` class org.apache.spark.sql.types.ArrayType cannot be cast to class org.apache.spark.sql.types.StructType (org.apache.spark.sql.types.ArrayType and org.apache.spark.sql.types.StructType are in unnamed module of loader 'app') java.lang.ClassCastException: class org.apache.spark.sql.types.ArrayType cannot be cast to class org.apache.spark.sql.types.StructType (org.apache.spark.sql.types.ArrayType and org.apache.spark.sql.types.StructType are in unnamed module of loader 'app') at org.apache.spark.sql.catalyst.expressions.GetStructField.childSchema$lzycompute(complexTypeExtractors.scala:108) at org.apache.spark.sql.catalyst.expressions.GetStructField.childSchema(complexTypeExtractors.scala:108) ``` This is caused by the inconsistent metadata column positions in the following two nodes: * Table relation: at the ending position * Project list: at the beginning position 
https://user-images.githubusercontent.com/1097932/207992343-438714bc-e1d1-46f7-9a79-84ab83dd299f.png When the InlineCTE rule executes, the metadata column in the project is wrongly combined with the table output. https://user-images.githubusercontent.com/1097932/207992431-f4cfc774-4cab-4728-b109-2ebff94e5fe2.png Thus the column `a ARRAY` is casted as `s STRUCT` and cause the error. This PR is to fix the issue by putting the Metadata column at the end of project list, so that it is consistent with the table relation. ### Why are the changes needed? Bug fix ### Does this PR introduce _any_ user-facing change? Yes, it fixes a bug in the analysis rule `AddMetadataColumns` ### How was this patch tested? New test case Closes #39081 from gengliangwang/fixMetadata. Authored-by: Gengliang Wang Signed-off-by: Max Gekk (cherry picked from commit 172f719fffa84a2528628e08627a02cf8d1fe8a8) Signed-off-by: Max Gekk --- .../spark/sql/catalyst/analysis/Analyzer.scala | 2 +- .../scala/org/apache/spark/sql/SQLQuerySuite.scala | 39 ++ 2 files changed, 40 insertions(+), 1 deletion(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala index 0c68dd8839d..c6429077b07 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala @@ -970,7 +970,7 @@ class Analyzer(override val catalogManager: CatalogManager) case s: ExposesMetadataColumns => s.withMetadataColumns() case p: Project => val newProj = p.copy( - projectList = p.metadataOutput ++ p.projectList, + projectList = p.projectList ++ p.metadataOutput, child = addMetadataCol(p.child)) newProj.copyTagsFrom(p) newProj diff --git a/sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala index 5b42d05c237..66f9700e8ac
100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala @@ -4572,6 +4572,45 @@ class SQLQuerySuite extends QueryTest with SharedSparkSession with AdaptiveSpark sql("SELECT /*+ hash(t2) */ * FROM t1 join t2 on c1 = c2") } } + + test("SPARK-41538: Metadata column should be appended at the end of project") { +val tableName = "table_1" +val viewName = "view_1" +withTable(tabl
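The one-line fix in the `AddMetadataColumns` diff moves metadata columns from the front of the project list (`p.metadataOutput ++ p.projectList`) to the end (`p.projectList ++ p.metadataOutput`). The order matters because the project's output is lined up positionally with the relation's output, and the relation appends its metadata columns last; a small illustrative sketch of the misalignment (the list-of-(name, type) encoding and helper are hypothetical):

```python
# The relation exposes metadata columns at the END of its output.
table_output = [("a", "array<int>"), ("s", "struct<id:int>"), ("_metadata", "struct")]

def add_metadata(project_list, metadata, prepend):
    # Hypothetical stand-in for how AddMetadataColumns extends the project list.
    return (metadata + project_list) if prepend else (project_list + metadata)

project = [("a", "array<int>"), ("s", "struct<id:int>")]
metadata = [("_metadata", "struct")]

# Before the fix: prepending shifts every column by one position, so "a"
# lines up with "_metadata" and types no longer agree (the ClassCastException).
broken = add_metadata(project, metadata, prepend=True)
mismatches = [p for p, t in zip(broken, table_output) if p[1] != t[1]]

# After the fix: appending keeps positions aligned with the relation output.
fixed = add_metadata(project, metadata, prepend=False)
aligned = all(p[1] == t[1] for p, t in zip(fixed, table_output))
```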
[spark] branch master updated (066870d938c -> 172f719fffa)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git from 066870d938c [SPARK-41518][SQL] Assign a name to the error class `_LEGACY_ERROR_TEMP_2422` add 172f719fffa [SPARK-41538][SQL] Metadata column should be appended at the end of project list No new revisions were added by this update. Summary of changes: .../spark/sql/catalyst/analysis/Analyzer.scala | 2 +- .../scala/org/apache/spark/sql/SQLQuerySuite.scala | 39 ++ 2 files changed, 40 insertions(+), 1 deletion(-)
[spark] branch master updated: [SPARK-41518][SQL] Assign a name to the error class `_LEGACY_ERROR_TEMP_2422`
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 066870d938c [SPARK-41518][SQL] Assign a name to the error class `_LEGACY_ERROR_TEMP_2422` 066870d938c is described below commit 066870d938cf7fb2f088c2a7f6a036de6fb5b7d2 Author: Max Gekk AuthorDate: Fri Dec 16 10:20:38 2022 +0300 [SPARK-41518][SQL] Assign a name to the error class `_LEGACY_ERROR_TEMP_2422` ### What changes were proposed in this pull request? In the PR, I propose to assign the new name `MISSING_GROUP_BY` to the legacy error class `_LEGACY_ERROR_TEMP_2422`, improve its error message format, and regenerate the SQL golden files. ### Why are the changes needed? To improve user experience with Spark SQL, and unify the representation of error messages. ### Does this PR introduce _any_ user-facing change? Yes, it changes a user-facing error message. ### How was this patch tested? By running the affected test suites: ``` $ PYSPARK_PYTHON=python3 build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite" ``` Closes #39061 from MaxGekk/error-class-_LEGACY_ERROR_TEMP_2422.
Authored-by: Max Gekk Signed-off-by: Max Gekk --- core/src/main/resources/error/error-classes.json | 10 - .../sql/catalyst/analysis/CheckAnalysis.scala | 14 - .../sql-tests/results/group-by-filter.sql.out | 12 --- .../resources/sql-tests/results/group-by.sql.out | 24 -- .../results/postgreSQL/select_having.sql.out | 6 +- .../negative-cases/invalid-correlation.sql.out | 12 --- .../results/udaf/udaf-group-by-ordinal.sql.out | 6 +- .../sql-tests/results/udaf/udaf-group-by.sql.out | 12 --- .../udf/postgreSQL/udf-select_having.sql.out | 6 +- .../sql-tests/results/udf/udf-group-by.sql.out | 18 +--- 10 files changed, 37 insertions(+), 83 deletions(-) diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json index ab4a93798a7..7af794b9ef9 100644 --- a/core/src/main/resources/error/error-classes.json +++ b/core/src/main/resources/error/error-classes.json @@ -863,6 +863,11 @@ ], "sqlState" : "42000" }, + "MISSING_GROUP_BY" : { +"message" : [ + "The query does not include a GROUP BY clause. Add GROUP BY or turn it into the window functions using OVER clauses." +] + }, "MISSING_STATIC_PARTITION_COLUMN" : { "message" : [ "Unknown static partition column: " @@ -5116,11 +5121,6 @@ "nondeterministic expression should not appear in the arguments of an aggregate function." ] }, - "_LEGACY_ERROR_TEMP_2422" : { -"message" : [ - "grouping expressions sequence is empty, and '' is not an aggregate function. Wrap '' in windowing function(s) or wrap '' in first() (or first_value) if you don't care which value you get." -] - }, "_LEGACY_ERROR_TEMP_2423" : { "message" : [ "Correlated scalar subquery '' is neither present in the group by, nor in an aggregate function. Add it to group by using ordinal position or wrap it in first() (or first_value) if you don't care which value you get." 
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala index 11b2d6671c7..2c57c2b9bab 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala @@ -402,16 +402,10 @@ trait CheckAnalysis extends PredicateHelper with LookupCatalog with QueryErrorsB messageParameters = Map("sqlExpr" -> expr.sql)) } } - case e: Attribute if groupingExprs.isEmpty => -// Collect all [[AggregateExpressions]]s. -val aggExprs = aggregateExprs.filter(_.collect { - case a: AggregateExpression => a -}.nonEmpty) -e.failAnalysis( - errorClass = "_LEGACY_ERROR_TEMP_2422", - messageParameters = Map( -"sqlExpr" -> e.sql, -"aggExprs" -> aggExprs.map(_.sql).mkString("(", ", ", ")"))) + case _: Attribute if groupingExprs.isEmpty => +operator.failAnalysis( +
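The remediation suggested by the new MISSING_GROUP_BY message — add a GROUP BY, or switch to a window function with an OVER clause — can be tried outside Spark with SQLite, which ships with Python (window functions need a SQLite build ≥ 3.25). The table and column names here are invented for the example.

```python
import sqlite3

# Demonstrates both fixes the error message suggests, using SQLite
# rather than Spark SQL; the principle is the same.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT, salary INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?, ?)",
                 [("ann", 10), ("bob", 20), ("eve", 30)])

# Mixing a plain column with an aggregate needs either GROUP BY ...
grouped = conn.execute(
    "SELECT name, MAX(salary) FROM emp GROUP BY name").fetchall()

# ... or a window function via OVER, which keeps one row per input row.
windowed = conn.execute(
    "SELECT name, MAX(salary) OVER () FROM emp").fetchall()

print(grouped)   # one row per group
print(windowed)  # every row, each paired with the global maximum
```

The windowed form is what the rewritten error message points users to when they do not actually want to collapse rows.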
[spark] branch master updated (e03f86d84bd -> 92440151c9e)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git from e03f86d84bd [SPARK-41542][CONNECT][TESTS] Set parallelism as 1 for coverage report in Spark Connect add 92440151c9e [SPARK-41508][CORE][SQL] Rename `_LEGACY_ERROR_TEMP_1180` to `UNEXPECTED_INPUT_TYPE` and remove `_LEGACY_ERROR_TEMP_1179` No new revisions were added by this update. Summary of changes: core/src/main/resources/error/error-classes.json | 17 +++--- .../sql/catalyst/analysis/FunctionRegistry.scala | 12 +-- .../plans/logical/basicLogicalOperators.scala | 19 +++ .../spark/sql/errors/QueryCompilationErrors.scala | 27 +++ .../sql-tests/results/postgreSQL/int8.sql.out | 7 ++-- .../results/table-valued-functions.sql.out | 38 ++ 6 files changed, 49 insertions(+), 71 deletions(-)
[spark] branch master updated (0bd8c856c74 -> cd117fbd402)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git from 0bd8c856c74 [SPARK-41465][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1235 add cd117fbd402 [SPARK-41350][SQL][FOLLOWUP] Allow simple name access of join hidden columns after alias No new revisions were added by this update. Summary of changes: .../catalyst/expressions/namedExpressions.scala| 7 +++- .../resources/sql-tests/inputs/natural-join.sql| 2 + .../test/resources/sql-tests/inputs/using-join.sql | 8 .../sql-tests/results/natural-join.sql.out | 10 + .../resources/sql-tests/results/using-join.sql.out | 44 ++ 5 files changed, 69 insertions(+), 2 deletions(-)
[spark] branch master updated: [SPARK-41465][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1235
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 0bd8c856c74 [SPARK-41465][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1235 0bd8c856c74 is described below commit 0bd8c856c748a73f0bb1fecdeae05bf6f2e4063e Author: panbingkun AuthorDate: Thu Dec 15 21:21:44 2022 +0300 [SPARK-41465][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1235 ### What changes were proposed in this pull request? In the PR, I propose to assign the name `ANALYZE_UNSUPPORTED_COLUMN_TYPE` to the error class `_LEGACY_ERROR_TEMP_1235`. ### Why are the changes needed? Proper names of error classes should improve user experience with Spark SQL. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Add new UT. Pass GA. Closes #39003 from panbingkun/LEGACY_ERROR_TEMP_1235. Authored-by: panbingkun Signed-off-by: Max Gekk --- core/src/main/resources/error/error-classes.json | 10 +- .../spark/sql/errors/QueryCompilationErrors.scala | 8 .../apache/spark/sql/StatisticsCollectionSuite.scala | 9 + .../org/apache/spark/sql/hive/StatisticsSuite.scala | 19 +++ 4 files changed, 33 insertions(+), 13 deletions(-) diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json index b7bf07a0e48..a60a24d14c6 100644 --- a/core/src/main/resources/error/error-classes.json +++ b/core/src/main/resources/error/error-classes.json @@ -1289,6 +1289,11 @@ "The ANALYZE TABLE FOR COLUMNS command can operate on temporary views that have been cached already. Consider to cache the view ." ] }, + "ANALYZE_UNSUPPORTED_COLUMN_TYPE" : { +"message" : [ + "The ANALYZE TABLE FOR COLUMNS command does not support the type of the column in the table ." +] + }, "CATALOG_OPERATION" : { "message" : [ "Catalog does not support ." 
@@ -2892,11 +2897,6 @@ "Partition spec is invalid. The spec () must match the partition spec () defined in table ''." ] }, - "_LEGACY_ERROR_TEMP_1235" : { -"message" : [ - "Column in table is of type , and Spark does not support statistics collection on this column type." -] - }, "_LEGACY_ERROR_TEMP_1236" : { "message" : [ "ANALYZE TABLE is not supported on views." diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala index a5ff2084ca8..18ac6b7bcf8 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala @@ -2298,11 +2298,11 @@ private[sql] object QueryCompilationErrors extends QueryErrorsBase { tableIdent: TableIdentifier, dataType: DataType): Throwable = { new AnalysisException( - errorClass = "_LEGACY_ERROR_TEMP_1235", + errorClass = "UNSUPPORTED_FEATURE.ANALYZE_UNSUPPORTED_COLUMN_TYPE", messageParameters = Map( -"name" -> name, -"tableIdent" -> tableIdent.toString, -"dataType" -> dataType.toString)) +"columnType" -> toSQLType(dataType), +"columnName" -> toSQLId(name), +"tableName" -> toSQLId(tableIdent.toString))) } def analyzeTableNotSupportedOnViewsError(): Throwable = { diff --git a/sql/core/src/test/scala/org/apache/spark/sql/StatisticsCollectionSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/StatisticsCollectionSuite.scala index 95d9245c57d..dda1cc5b52b 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/StatisticsCollectionSuite.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/StatisticsCollectionSuite.scala @@ -76,6 +76,7 @@ class StatisticsCollectionSuite extends StatisticsCollectionTestBase with Shared val viewName = "view" withView(viewName) { sql(s"CREATE VIEW $viewName AS SELECT * FROM $tableName") + assertAnalyzeUnsupported(s"ANALYZE TABLE $viewName COMPUTE 
STATISTICS") assertAnalyzeUnsupported(s"ANALYZE TABLE $viewName COMPUTE STATISTICS FOR COLUMNS id") } @@ -128,11 +129,11 @@ class StatisticsCollectionSuite extends StatisticsCollectionTestBase with Shared exception = intercept[AnalysisException] { sql(s"ANALYZ
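For context, the new message parameters above are built with the quoting helpers `toSQLId` and `toSQLType`. Below is a rough Python sketch of the output shapes those helpers produce; the real Scala implementations in `QueryErrorsBase` handle more cases (already-quoted parts, real `DataType` objects), so treat this only as an approximation.

```python
# Approximate behavior of the error-message quoting helpers:
# identifiers get backticks per dot-separated part, types get
# upper-cased double-quoted SQL names.
def to_sql_id(name: str) -> str:
    # `a.b.c` -> `a`.`b`.`c`
    return ".".join(f"`{part}`" for part in name.split("."))

def to_sql_type(data_type: str) -> str:
    return f'"{data_type.upper()}"'

print(to_sql_id("spark_catalog.default.tbl"))  # `spark_catalog`.`default`.`tbl`
print(to_sql_type("map<int,int>"))             # "MAP<INT,INT>"
```

This consistent quoting is what the error-class migrations above are standardizing across messages.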
[spark] branch master updated (724bbfdce87 -> a09a2736866)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git from 724bbfdce87 Revert "[SPARK-41521][BUILD][K8S] Upgrade `kubernetes-client` to 6.3.0" add a09a2736866 [MINOR][SQL][TESTS] Fix Typos 'e1 -> e2' No new revisions were added by this update. Summary of changes: sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
[spark] branch master updated: [SPARK-41271][SQL] Support parameterized SQL queries by `sql()`
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 35fa5e6716e [SPARK-41271][SQL] Support parameterized SQL queries by `sql()` 35fa5e6716e is described below commit 35fa5e6716e59b004851b61f7fbfbdace15f46b7 Author: Max Gekk AuthorDate: Thu Dec 15 09:14:46 2022 +0300 [SPARK-41271][SQL] Support parameterized SQL queries by `sql()` ### What changes were proposed in this pull request? In the PR, I propose to extend SparkSession API and override the `sql` method by: ```scala def sql(sqlText: String, args: Map[String, String]): DataFrame ``` which accepts a map with: - keys are parameters names, - values are SQL literal values. And the first argument `sqlText` might have named parameters in the positions of constants like literal values. For example: ```scala spark.sql( sqlText = "SELECT * FROM tbl WHERE date > :startDate LIMIT :maxRows", args = Map( "startDate" -> "DATE'2022-12-01'", "maxRows" -> "100")) ``` The new `sql()` method parses the input SQL statement and provided parameter values, and replaces the named parameters by the literal values. And then it eagerly runs DDL/DML commands, but not for SELECT queries. Closes #38712 ### Why are the changes needed? 1. To improve user experience with Spark SQL via - Using Spark as remote service (microservice). - Write SQL code that will power reports, dashboards, charts and other data presentation solutions that need to account for criteria modifiable by users through an interface. - Build a generic integration layer based on the SQL API. The goal is to expose managed data to a wide application ecosystem with a microservice architecture. It is only natural in such a setup to ask for modular and reusable SQL code, that can be executed repeatedly with different parameter values. 2. 
To achieve feature parity with other systems that support named parameters: - Redshift: https://docs.aws.amazon.com/redshift/latest/mgmt/data-api.html#data-api-calling - BigQuery: https://cloud.google.com/bigquery/docs/parameterized-queries#api - MS DBSQL: https://learn.microsoft.com/en-us/azure/databricks/sql/user/queries/query-parameters ### Does this PR introduce _any_ user-facing change? No, this is an extension of the existing APIs. ### How was this patch tested? By running new tests: ``` $ build/sbt "core/testOnly *SparkThrowableSuite" $ build/sbt "test:testOnly *PlanParserSuite" $ build/sbt "test:testOnly *AnalysisSuite" $ build/sbt "test:testOnly *ParametersSuite" ``` Closes #38864 from MaxGekk/parameterized-sql-2. Lead-authored-by: Max Gekk Co-authored-by: Maxim Gekk Signed-off-by: Max Gekk --- core/src/main/resources/error/error-classes.json | 10 +++ .../spark/sql/catalyst/parser/SqlBaseParser.g4 | 1 + .../sql/catalyst/analysis/CheckAnalysis.scala | 5 ++ .../sql/catalyst/expressions/parameters.scala | 64 ++ .../spark/sql/catalyst/parser/AstBuilder.scala | 7 ++ .../spark/sql/catalyst/trees/TreePatterns.scala| 1 + .../sql/catalyst/analysis/AnalysisSuite.scala | 14 .../sql/catalyst/parser/PlanParserSuite.scala | 26 .../scala/org/apache/spark/sql/SparkSession.scala | 40 +-- .../org/apache/spark/sql/ParametersSuite.scala | 78 ++ .../org/apache/spark/sql/test/SQLTestUtils.scala | 2 +- .../benchmark/InsertIntoHiveTableBenchmark.scala | 4 +- .../ObjectHashAggregateExecBenchmark.scala | 4 +- 13 files changed, 246 insertions(+), 10 deletions(-) diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json index f66d6998e26..b7bf07a0e48 100644 --- a/core/src/main/resources/error/error-classes.json +++ b/core/src/main/resources/error/error-classes.json @@ -806,6 +806,11 @@ } } }, + "INVALID_SQL_ARG" : { +"message" : [ + "The argument of `sql()` is invalid. Consider to replace it by a SQL literal statement." 
+] + }, "INVALID_SQL_SYNTAX" : { "message" : [ "Invalid SQL syntax: " @@ -1147,6 +1152,11 @@ "Unable to convert SQL type to Protobuf type ." ] }, + "UNBOUND_SQL_PARAMETER" : { +"message" : [ + "Found the unbound parameter: . Please, fix `args` and provide a mapping of the parameter to a SQL literal statement." +] + }, "UNCLOSED
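Spark parses the statement and binds the parameters during analysis rather than doing string substitution, but the user-visible contract of `sql(sqlText, args)` described in the commit message above can be approximated with a naive `:name` replacement. The sketch below is illustrative only — it would also rewrite colons inside string literals, which the parser-based approach avoids.

```python
import re

# Naive sketch of named-parameter binding: replace each :name
# placeholder with the SQL literal string supplied for it.
def bind_named_params(sql_text: str, args: dict) -> str:
    def replace(match):
        name = match.group(1)
        if name not in args:
            # Loosely mirrors the new UNBOUND_SQL_PARAMETER error class.
            raise ValueError(f"unbound parameter: :{name}")
        return args[name]
    return re.sub(r":(\w+)", replace, sql_text)

query = bind_named_params(
    "SELECT * FROM tbl WHERE date > :startDate LIMIT :maxRows",
    {"startDate": "DATE'2022-12-01'", "maxRows": "100"})
print(query)  # SELECT * FROM tbl WHERE date > DATE'2022-12-01' LIMIT 100
```

Passing a parameter that the statement never binds, or referencing one that `args` lacks, is exactly the situation the new `INVALID_SQL_ARG` and `UNBOUND_SQL_PARAMETER` error classes report.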
[spark] branch master updated (4e8980e6ae9 -> 5b5083484cd)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git from 4e8980e6ae9 [SPARK-41409][CORE][SQL] Rename `_LEGACY_ERROR_TEMP_1043` to `WRONG_NUM_ARGS.WITHOUT_SUGGESTION` add 5b5083484cd [SPARK-41248][SQL] Add "spark.sql.json.enablePartialResults" to enable/disable JSON partial results No new revisions were added by this update. Summary of changes: .../spark/sql/catalyst/json/JacksonParser.scala| 10 +- .../org/apache/spark/sql/internal/SQLConf.scala| 11 ++ sql/core/benchmarks/JsonBenchmark-results.txt | 155 ++--- .../org/apache/spark/sql/JsonFunctionsSuite.scala | 67 +++-- .../sql/execution/datasources/json/JsonSuite.scala | 25 +++- 5 files changed, 158 insertions(+), 110 deletions(-)
[spark] branch master updated: [SPARK-41409][CORE][SQL] Rename `_LEGACY_ERROR_TEMP_1043` to `WRONG_NUM_ARGS.WITHOUT_SUGGESTION`
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 4e8980e6ae9 [SPARK-41409][CORE][SQL] Rename `_LEGACY_ERROR_TEMP_1043` to `WRONG_NUM_ARGS.WITHOUT_SUGGESTION` 4e8980e6ae9 is described below commit 4e8980e6ae9a513bb4c990944841a9db073013ea Author: yangjie01 AuthorDate: Wed Dec 14 08:22:33 2022 +0300 [SPARK-41409][CORE][SQL] Rename `_LEGACY_ERROR_TEMP_1043` to `WRONG_NUM_ARGS.WITHOUT_SUGGESTION` ### What changes were proposed in this pull request? This PR introduces sub-classes of `WRONG_NUM_ARGS`: - WITHOUT_SUGGESTION - WITH_SUGGESTION It then replaces the existing `WRONG_NUM_ARGS` with `WRONG_NUM_ARGS.WITH_SUGGESTION` and renames the error class `_LEGACY_ERROR_TEMP_1043` to `WRONG_NUM_ARGS.WITHOUT_SUGGESTION`. ### Why are the changes needed? Proper names of error classes improve user experience with Spark SQL. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Add new test case Closes #38940 from LuciferYang/legacy-1043.
Lead-authored-by: yangjie01 Co-authored-by: YangJie Signed-off-by: Max Gekk --- core/src/main/resources/error/error-classes.json| 21 ++--- .../spark/sql/errors/QueryCompilationErrors.scala | 8 .../resources/sql-tests/results/ansi/date.sql.out | 2 +- .../sql-tests/results/ansi/string-functions.sql.out | 4 ++-- .../results/ceil-floor-with-scale-param.sql.out | 4 ++-- .../sql-tests/results/csv-functions.sql.out | 2 +- .../test/resources/sql-tests/results/date.sql.out | 2 +- .../sql-tests/results/datetime-legacy.sql.out | 2 +- .../sql-tests/results/json-functions.sql.out| 8 .../results/sql-compatibility-functions.sql.out | 2 +- .../sql-tests/results/string-functions.sql.out | 4 ++-- .../results/table-valued-functions.sql.out | 2 +- .../sql-tests/results/timestamp-ntz.sql.out | 2 +- .../resources/sql-tests/results/udaf/udaf.sql.out | 2 +- .../sql-tests/results/udf/udf-udaf.sql.out | 2 +- .../apache/spark/sql/DataFrameFunctionsSuite.scala | 2 +- .../org/apache/spark/sql/DateFunctionsSuite.scala | 2 +- .../scala/org/apache/spark/sql/SQLQuerySuite.scala | 2 +- .../org/apache/spark/sql/StringFunctionsSuite.scala | 2 +- .../test/scala/org/apache/spark/sql/UDFSuite.scala | 11 ++- .../sql/errors/QueryCompilationErrorsSuite.scala| 13 + .../spark/sql/hive/execution/HiveUDAFSuite.scala| 2 +- 22 files changed, 57 insertions(+), 44 deletions(-) diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json index e1df3db4291..f66d6998e26 100644 --- a/core/src/main/resources/error/error-classes.json +++ b/core/src/main/resources/error/error-classes.json @@ -1548,8 +1548,20 @@ }, "WRONG_NUM_ARGS" : { "message" : [ - "The requires parameters but the actual number is ." -] + "Invalid number of arguments for the function ." +], +"subClass" : { + "WITHOUT_SUGGESTION" : { +"message" : [ + "Please, refer to 'https://spark.apache.org/docs/latest/sql-ref-functions.html' for a fix." 
+] + }, + "WITH_SUGGESTION" : { +"message" : [ + "Consider to change the number of arguments because the function requires parameters but the actual number is ." +] + } +} }, "_LEGACY_ERROR_TEMP_0001" : { "message" : [ @@ -2018,11 +2030,6 @@ "Undefined function ." ] }, - "_LEGACY_ERROR_TEMP_1043" : { -"message" : [ - "Invalid arguments for function ." -] - }, "_LEGACY_ERROR_TEMP_1045" : { "message" : [ "ALTER TABLE SET LOCATION does not support partition for v2 tables." diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala index b329f6689d4..a5ff2084ca8 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala @@ -640,7 +640,7 @@ private[sql] object QueryCompilationErrors extends QueryErrorsBase { def invalidFunctionArgumentsError( name: String, expectedNum: String, actualNum: Int): Throwable = { new AnalysisException( - errorClass = "WRONG_NUM_ARGS", + errorClass = "WRONG_NUM_ARGS.WITH_S
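A hedged sketch of how a dotted error class such as `WRONG_NUM_ARGS.WITH_SUGGESTION` might be resolved against an error-classes.json-style dictionary. This is not Spark's actual resolver, and the `<param>` placeholder names are approximated from the diff above (the archive stripped the original angle-bracketed tokens from the quoted messages).

```python
# Minimal stand-in for a slice of error-classes.json; placeholder
# names (<functionName> etc.) are approximations.
ERROR_CLASSES = {
    "WRONG_NUM_ARGS": {
        "message": ["Invalid number of arguments for the function <functionName>."],
        "subClass": {
            "WITHOUT_SUGGESTION": {"message": [
                "Please, refer to 'https://spark.apache.org/docs/latest/sql-ref-functions.html' for a fix."]},
            "WITH_SUGGESTION": {"message": [
                "Consider to change the number of arguments because the function"
                " requires <expectedNum> parameters but the actual number is <actualNum>."]},
        },
    },
}

def format_error(error_class: str, params: dict) -> str:
    # Split "MAIN.SUB", concatenate main + sub templates, fill params.
    main, _, sub = error_class.partition(".")
    entry = ERROR_CLASSES[main]
    parts = list(entry["message"])
    if sub:
        parts += entry["subClass"][sub]["message"]
    text = " ".join(parts)
    for key, value in params.items():
        text = text.replace(f"<{key}>", value)
    return text

msg = format_error("WRONG_NUM_ARGS.WITH_SUGGESTION",
                   {"functionName": "`format_string`",
                    "expectedNum": "> 0", "actualNum": "0"})
print(msg)
```

The sub-class split lets one main class carry both the generic wording and a variant with a concrete suggestion, which is exactly the structure the diff introduces.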
[spark] branch master updated: [SPARK-41062][SQL] Rename `UNSUPPORTED_CORRELATED_REFERENCE` to `CORRELATED_REFERENCE`
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new e29ada0c13e [SPARK-41062][SQL] Rename `UNSUPPORTED_CORRELATED_REFERENCE` to `CORRELATED_REFERENCE` e29ada0c13e is described below commit e29ada0c13e71aaad0566ef67591a33d4c58fe2a Author: itholic AuthorDate: Tue Dec 13 21:48:11 2022 +0300 [SPARK-41062][SQL] Rename `UNSUPPORTED_CORRELATED_REFERENCE` to `CORRELATED_REFERENCE` ### What changes were proposed in this pull request? This PR proposes to rename `UNSUPPORTED_CORRELATED_REFERENCE` to `CORRELATED_REFERENCE`. Also, show `sqlExprs` rather than `treeNode`, which is more useful information for users. ### Why are the changes needed? The sub-error class name duplicates its main class, `UNSUPPORTED_SUBQUERY_EXPRESSION_CATEGORY`. We should make all error class names clear and brief. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? ``` ./build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite*" ``` Closes #38576 from itholic/SPARK-41062.
Lead-authored-by: itholic Co-authored-by: Haejoon Lee <44108233+itho...@users.noreply.github.com> Signed-off-by: Max Gekk --- core/src/main/resources/error/error-classes.json| 10 +- .../apache/spark/sql/catalyst/analysis/CheckAnalysis.scala | 7 --- .../spark/sql/catalyst/analysis/ResolveSubquerySuite.scala | 13 - .../subquery/negative-cases/invalid-correlation.sql.out | 4 ++-- .../src/test/scala/org/apache/spark/sql/SubquerySuite.scala | 12 +--- 5 files changed, 24 insertions(+), 22 deletions(-) diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json index 25362d5893f..e1df3db4291 100644 --- a/core/src/main/resources/error/error-classes.json +++ b/core/src/main/resources/error/error-classes.json @@ -1471,6 +1471,11 @@ "A correlated outer name reference within a subquery expression body was not found in the enclosing query: " ] }, + "CORRELATED_REFERENCE" : { +"message" : [ + "Expressions referencing the outer query are not supported outside of WHERE/HAVING clauses: " +] + }, "LATERAL_JOIN_CONDITION_NON_DETERMINISTIC" : { "message" : [ "Lateral join condition cannot be non-deterministic: " @@ -1496,11 +1501,6 @@ "Non-deterministic lateral subqueries are not supported when joining with outer relations that produce more than one row" ] }, - "UNSUPPORTED_CORRELATED_REFERENCE" : { -"message" : [ - "Expressions referencing the outer query are not supported outside of WHERE/HAVING clauses" -] - }, "UNSUPPORTED_CORRELATED_REFERENCE_DATA_TYPE" : { "message" : [ "Correlated column reference '' cannot be type" diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala index e7e153a319d..5303364710c 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala @@ 
-1089,11 +1089,12 @@ trait CheckAnalysis extends PredicateHelper with LookupCatalog with QueryErrorsB // 2. Expressions containing outer references on plan nodes other than allowed operators. def failOnInvalidOuterReference(p: LogicalPlan): Unit = { p.expressions.foreach(checkMixedReferencesInsideAggregateExpr) - if (!canHostOuter(p) && p.expressions.exists(containsOuter)) { + val exprs = stripOuterReferences(p.expressions.filter(expr => containsOuter(expr))) + if (!canHostOuter(p) && !exprs.isEmpty) { p.failAnalysis( errorClass = - "UNSUPPORTED_SUBQUERY_EXPRESSION_CATEGORY.UNSUPPORTED_CORRELATED_REFERENCE", - messageParameters = Map("treeNode" -> planToString(p))) +"UNSUPPORTED_SUBQUERY_EXPRESSION_CATEGORY.CORRELATED_REFERENCE", + messageParameters = Map("sqlExprs" -> exprs.map(toSQLExpr).mkString(","))) } } diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ResolveSubquerySuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ResolveSubquerySuite.scala index 577f663d8b1..7b99153acf9 100
[spark] branch master updated (0e2d604fd33 -> 3809ccdca6e)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git from 0e2d604fd33 [SPARK-41406][SQL] Refactor error message for `NUM_COLUMNS_MISMATCH` to make it more generic add 3809ccdca6e [SPARK-41478][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1234 No new revisions were added by this update. Summary of changes: core/src/main/resources/error/error-classes.json | 10 +- .../spark/sql/errors/QueryCompilationErrors.scala | 4 ++-- .../spark/sql/StatisticsCollectionSuite.scala | 23 +- .../apache/spark/sql/execution/SQLViewSuite.scala | 11 +++ 4 files changed, 28 insertions(+), 20 deletions(-)
[spark] branch master updated: [SPARK-41406][SQL] Refactor error message for `NUM_COLUMNS_MISMATCH` to make it more generic
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 0e2d604fd33 [SPARK-41406][SQL] Refactor error message for `NUM_COLUMNS_MISMATCH` to make it more generic 0e2d604fd33 is described below commit 0e2d604fd33c8236cfa8ae243eeaec42d3176a06 Author: panbingkun AuthorDate: Tue Dec 13 14:02:36 2022 +0300 [SPARK-41406][SQL] Refactor error message for `NUM_COLUMNS_MISMATCH` to make it more generic ### What changes were proposed in this pull request? The pr aims to refactor error message for `NUM_COLUMNS_MISMATCH` to make it more generic. ### Why are the changes needed? The changes improve the error framework. ### Does this PR introduce _any_ user-facing change? Yes. ### How was this patch tested? Update existed UT. Pass GA. Closes #38937 from panbingkun/SPARK-41406. Authored-by: panbingkun Signed-off-by: Max Gekk --- core/src/main/resources/error/error-classes.json | 2 +- .../sql/catalyst/analysis/CheckAnalysis.scala | 4 +- .../plans/logical/basicLogicalOperators.scala | 4 +- .../resources/sql-tests/results/except-all.sql.out | 6 +- .../sql-tests/results/intersect-all.sql.out| 6 +- .../native/widenSetOperationTypes.sql.out | 140 ++--- .../sql-tests/results/udf/udf-except-all.sql.out | 6 +- .../results/udf/udf-intersect-all.sql.out | 6 +- .../spark/sql/DataFrameSetOperationsSuite.scala| 9 +- .../scala/org/apache/spark/sql/SQLQuerySuite.scala | 22 +++- 10 files changed, 110 insertions(+), 95 deletions(-) diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json index e76328e970d..6faaf0af35f 100644 --- a/core/src/main/resources/error/error-classes.json +++ b/core/src/main/resources/error/error-classes.json @@ -943,7 +943,7 @@ }, "NUM_COLUMNS_MISMATCH" : { "message" : [ - " can only be performed on tables with the same number of 
columns, but the first table has columns and the table has columns." + " can only be performed on inputs with the same number of columns, but the first input has columns and the input has columns." ] }, "ORDER_BY_POS_OUT_OF_RANGE" : { diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala index 12dac5c632a..be812adaaa1 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala @@ -552,7 +552,7 @@ trait CheckAnalysis extends PredicateHelper with LookupCatalog with QueryErrorsB errorClass = "NUM_COLUMNS_MISMATCH", messageParameters = Map( "operator" -> toSQLStmt(operator.nodeName), -"refNumColumns" -> ref.length.toString, +"firstNumColumns" -> ref.length.toString, "invalidOrdinalNum" -> ordinalNumber(ti + 1), "invalidNumColumns" -> child.output.length.toString)) } @@ -565,7 +565,7 @@ trait CheckAnalysis extends PredicateHelper with LookupCatalog with QueryErrorsB e.failAnalysis( errorClass = "_LEGACY_ERROR_TEMP_2430", messageParameters = Map( - "operator" -> operator.nodeName, + "operator" -> toSQLStmt(operator.nodeName), "ci" -> ordinalNumber(ci), "ti" -> ordinalNumber(ti + 1), "dt1" -> dt1.catalogString, diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala index 60586e4166c..878ad91c088 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala @@ -342,7 +342,7 @@ case class Intersect( right: LogicalPlan, isAll: Boolean) extends SetOperation(left, right) { - override def nodeName: 
String = getClass.getSimpleName + ( if ( isAll ) "All" else "" ) + override def nodeName: String = getClass.getSimpleName + ( if ( isAll ) " All" else "" ) final
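The `NUM_COLUMNS_MISMATCH` condition is not Spark-specific: SQLite, which ships with Python, rejects set operations whose inputs have different column counts, which makes for a quick runnable illustration of the error the reworded message describes.

```python
import sqlite3

# Set operations require the same number of columns on every input.
conn = sqlite3.connect(":memory:")

# Matching column counts: fine.
ok = conn.execute("SELECT 1, 2 UNION SELECT 3, 4").fetchall()

# Mismatched column counts: rejected, SQLite's analogue of Spark's
# NUM_COLUMNS_MISMATCH error class.
err = None
try:
    conn.execute("SELECT 1, 2 UNION SELECT 3")
except sqlite3.OperationalError as e:
    err = e
print(err)  # e.g. a "same number of result columns" complaint
```

Spark's reworded message generalizes the same rule from "tables" to arbitrary inputs (subqueries, inline VALUES, and so on).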
[spark] branch master updated (af8dd411aa9 -> 9b69331602e)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git from af8dd411aa9 [SPARK-33782][K8S][CORE] Place spark.files, spark.jars and spark.files under the current working directory on the driver in K8S cluster mode add 9b69331602e [SPARK-41481][CORE][SQL] Reuse `INVALID_TYPED_LITERAL` instead of `_LEGACY_ERROR_TEMP_0020` No new revisions were added by this update. Summary of changes: core/src/main/resources/error/error-classes.json | 5 -- .../spark/sql/catalyst/parser/AstBuilder.scala | 2 +- .../spark/sql/errors/QueryParsingErrors.scala | 7 --- .../catalyst/parser/ExpressionParserSuite.scala| 21 +--- .../sql-tests/results/ansi/interval.sql.out| 60 ++ .../resources/sql-tests/results/interval.sql.out | 60 ++ 6 files changed, 96 insertions(+), 59 deletions(-)
[spark] branch master updated (5d52bb36d3b -> cd2f78657ce)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git from 5d52bb36d3b [SPARK-41486][SQL][TESTS] Upgrade `MySQL` docker image to 8.0.31 to support `ARM64` test add cd2f78657ce [SPARK-41463][SQL][TESTS] Ensure error class names contain only capital letters, numbers and underscores No new revisions were added by this update. Summary of changes: .../test/scala/org/apache/spark/SparkThrowableSuite.scala| 12 1 file changed, 12 insertions(+)
[spark] branch master updated: [SPARK-41443][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1061
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 92655db9fc6 [SPARK-41443][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1061 92655db9fc6 is described below commit 92655db9fc69410022052b6e662488285a322490 Author: panbingkun AuthorDate: Sat Dec 10 19:27:26 2022 +0300 [SPARK-41443][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1061 ### What changes were proposed in this pull request? In the PR, I propose to assign the name COLUMN_NOT_FOUND to the error class _LEGACY_ERROR_TEMP_1061. ### Why are the changes needed? Proper names of error classes should improve user experience with Spark SQL. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Add new UT. Pass GA. Closes #38972 from panbingkun/LEGACY_ERROR_TEMP_1061. Authored-by: panbingkun Signed-off-by: Max Gekk --- core/src/main/resources/error/error-classes.json | 10 .../spark/sql/errors/QueryCompilationErrors.scala | 14 ++- .../catalyst/analysis/ResolveSessionCatalog.scala | 2 +- .../execution/command/AnalyzeColumnCommand.scala | 2 +- .../spark/sql/execution/command/tables.scala | 2 +- .../spark/sql/StatisticsCollectionSuite.scala | 29 -- .../execution/command/v1/DescribeTableSuite.scala | 28 +++-- .../apache/spark/sql/hive/StatisticsSuite.scala| 21 ++-- 8 files changed, 76 insertions(+), 32 deletions(-) diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json index a8738994e17..3f091f090fc 100644 --- a/core/src/main/resources/error/error-classes.json +++ b/core/src/main/resources/error/error-classes.json @@ -109,6 +109,11 @@ "The column already exists. Consider to choose another name or rename the existing column." ] }, + "COLUMN_NOT_FOUND" : { +"message" : [ + "The column cannot be found. 
Verify the spelling and correctness of the column name according to the SQL config ." +] + }, "CONCURRENT_QUERY" : { "message" : [ "Another instance of this query was just started by a concurrent session." @@ -2092,11 +2097,6 @@ " does not support nested column: ." ] }, - "_LEGACY_ERROR_TEMP_1061" : { -"message" : [ - "Column does not exist." -] - }, "_LEGACY_ERROR_TEMP_1065" : { "message" : [ "`` is not a valid name for tables/databases. Valid names only contain alphabet characters, numbers and _." diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala index ed08e33829e..b507045f8c6 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala @@ -795,12 +795,6 @@ private[sql] object QueryCompilationErrors extends QueryErrorsBase { "column" -> quoted)) } - def columnDoesNotExistError(colName: String): Throwable = { -new AnalysisException( - errorClass = "_LEGACY_ERROR_TEMP_1061", - messageParameters = Map("colName" -> colName)) - } - def renameTempViewToExistingViewError(newName: String): Throwable = { new TableAlreadyExistsException(newName) } @@ -2281,6 +2275,14 @@ private[sql] object QueryCompilationErrors extends QueryErrorsBase { messageParameters = Map("columnName" -> toSQLId(columnName))) } + def columnNotFoundError(colName: String): Throwable = { +new AnalysisException( + errorClass = "COLUMN_NOT_FOUND", + messageParameters = Map( +"colName" -> toSQLId(colName), +"caseSensitiveConfig" -> toSQLConf(SQLConf.CASE_SENSITIVE.key))) + } + def noSuchTableError(db: String, table: String): Throwable = { new NoSuchTableException(db = db, table = table) } diff --git a/sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala 
b/sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala index 4afcf5b7514..7b2d5015840 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala @@ -155,7 +155,7 @@ class ResolveSessionCatalog(val catalogManager: CatalogManager) case Descri
[spark] branch master updated: [SPARK-41417][CORE][SQL] Rename `_LEGACY_ERROR_TEMP_0019` to `INVALID_TYPED_LITERAL`
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 6972341b06e [SPARK-41417][CORE][SQL] Rename `_LEGACY_ERROR_TEMP_0019` to `INVALID_TYPED_LITERAL` 6972341b06e is described below commit 6972341b06eae40dda787306e2d1bde062501617 Author: yangjie01 AuthorDate: Sat Dec 10 09:50:08 2022 +0300 [SPARK-41417][CORE][SQL] Rename `_LEGACY_ERROR_TEMP_0019` to `INVALID_TYPED_LITERAL` ### What changes were proposed in this pull request? This pr aims rename `_LEGACY_ERROR_TEMP_0019` to `INVALID_TYPED_LITERAL` ### Why are the changes needed? Proper names of error classes to improve user experience with Spark SQL. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Pass GitHub Actions Closes #38954 from LuciferYang/SPARK-41417. Authored-by: yangjie01 Signed-off-by: Max Gekk --- core/src/main/resources/error/error-classes.json | 11 +-- .../spark/sql/errors/QueryParsingErrors.scala | 7 +- .../catalyst/parser/ExpressionParserSuite.scala| 31 +--- .../resources/sql-tests/results/ansi/date.sql.out | 21 +++--- .../sql-tests/results/ansi/literals.sql.out| 14 ++-- .../sql-tests/results/ansi/timestamp.sql.out | 21 +++--- .../test/resources/sql-tests/results/date.sql.out | 21 +++--- .../sql-tests/results/datetime-legacy.sql.out | 42 ++- .../resources/sql-tests/results/literals.sql.out | 14 ++-- .../sql-tests/results/postgreSQL/date.sql.out | 84 -- .../resources/sql-tests/results/timestamp.sql.out | 21 +++--- .../results/timestampNTZ/timestamp-ansi.sql.out| 21 +++--- .../results/timestampNTZ/timestamp.sql.out | 21 +++--- 13 files changed, 192 insertions(+), 137 deletions(-) diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json index 19ab5ada2b5..a8738994e17 100644 --- 
a/core/src/main/resources/error/error-classes.json +++ b/core/src/main/resources/error/error-classes.json @@ -813,6 +813,12 @@ } } }, + "INVALID_TYPED_LITERAL" : { +"message" : [ + "The value of the typed literal is invalid: ." +], +"sqlState" : "42000" + }, "INVALID_WHERE_CONDITION" : { "message" : [ "The WHERE condition contains invalid expressions: .", @@ -1599,11 +1605,6 @@ "Function trim doesn't support with type . Please use BOTH, LEADING or TRAILING as trim type." ] }, - "_LEGACY_ERROR_TEMP_0019" : { -"message" : [ - "Cannot parse the value: ." -] - }, "_LEGACY_ERROR_TEMP_0020" : { "message" : [ "Cannot parse the INTERVAL value: ." diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala index 018e9a12e01..ad6f72986d6 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala @@ -211,8 +211,11 @@ private[sql] object QueryParsingErrors extends QueryErrorsBase { def cannotParseValueTypeError( valueType: String, value: String, ctx: TypeConstructorContext): Throwable = { new ParseException( - errorClass = "_LEGACY_ERROR_TEMP_0019", - messageParameters = Map("valueType" -> valueType, "value" -> value), + errorClass = "INVALID_TYPED_LITERAL", + messageParameters = Map( +"valueType" -> toSQLType(valueType), +"value" -> toSQLValue(value, StringType) + ), ctx) } diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ExpressionParserSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ExpressionParserSuite.scala index 884e782736c..01c9907cb8c 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ExpressionParserSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ExpressionParserSuite.scala @@ -521,8 +521,12 @@ class 
ExpressionParserSuite extends AnalysisTest { Literal(Timestamp.valueOf("2016-03-11 20:54:00.000"))) checkError( exception = parseException("timestamP_LTZ '2016-33-11 20:54:00.000'"), -errorClass = "_LEGACY_ERROR_TEMP_0019", -parameters = Map("valueType" -> "TIMESTAMP_LTZ", "value" -> "2016-33-11 20:54:00.000"), +errorCl
[spark] branch master updated (fc3c0f1008d -> 928eab666da)
maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

 from fc3c0f1008d [SPARK-41450][BUILD] Fix shading in `core` module
  add 928eab666da [SPARK-41462][SQL] Date and timestamp type can up cast to TimestampNTZ

No new revisions were added by this update.

Summary of changes:
 .../scala/org/apache/spark/sql/catalyst/expressions/Cast.scala | 3 +++
 .../apache/spark/sql/catalyst/expressions/CastSuiteBase.scala  | 9 +
 2 files changed, 12 insertions(+)
[spark] branch master updated: [SPARK-41435][SQL] Change to call `invalidFunctionArgumentsError` for `curdate()` when `expressions` is not empty
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new d5e32757429 [SPARK-41435][SQL] Change to call `invalidFunctionArgumentsError` for `curdate()` when `expressions` is not empty d5e32757429 is described below commit d5e327574290e1da92d109081c500782d5a3bc21 Author: yangjie01 AuthorDate: Thu Dec 8 15:40:18 2022 +0300 [SPARK-41435][SQL] Change to call `invalidFunctionArgumentsError` for `curdate()` when `expressions` is not empty ### What changes were proposed in this pull request? This pr change to call `invalidFunctionArgumentsError` instead of `invalidFunctionArgumentNumberError ` for `curdate()` when `expressions` is not empty, then `curdate()` will throw `AnalysisException` with error class `WRONG_NUM_ARGS` when input args it not empty. ### Why are the changes needed? `WRONG_NUM_ARGS` is a more appropriate error class ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Add new test case Closes #38960 from LuciferYang/curdate-err-msg. 
Authored-by: yangjie01 Signed-off-by: Max Gekk --- .../catalyst/expressions/datetimeExpressions.scala | 4 ++-- .../src/test/resources/sql-tests/inputs/date.sql | 1 + .../resources/sql-tests/results/ansi/date.sql.out | 23 ++ .../test/resources/sql-tests/results/date.sql.out | 23 ++ .../sql-tests/results/datetime-legacy.sql.out | 23 ++ .../org/apache/spark/sql/DateFunctionsSuite.scala | 13 6 files changed, 85 insertions(+), 2 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala index e8bad46e84a..3e89dfe39ce 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala @@ -171,8 +171,8 @@ object CurDateExpressionBuilder extends ExpressionBuilder { if (expressions.isEmpty) { CurrentDate() } else { - throw QueryCompilationErrors.invalidFunctionArgumentNumberError( -Seq.empty, funcName, expressions.length) + throw QueryCompilationErrors.invalidFunctionArgumentsError( +funcName, "0", expressions.length) } } } diff --git a/sql/core/src/test/resources/sql-tests/inputs/date.sql b/sql/core/src/test/resources/sql-tests/inputs/date.sql index ab57c7c754c..163855069f0 100644 --- a/sql/core/src/test/resources/sql-tests/inputs/date.sql +++ b/sql/core/src/test/resources/sql-tests/inputs/date.sql @@ -19,6 +19,7 @@ select date'2021-4294967297-11'; select current_date = current_date; -- under ANSI mode, `current_date` can't be a function name. 
select current_date() = current_date(); +select curdate(1); -- conversions between date and unix_date (number of days from epoch) select DATE_FROM_UNIX_DATE(0), DATE_FROM_UNIX_DATE(1000), DATE_FROM_UNIX_DATE(null); diff --git a/sql/core/src/test/resources/sql-tests/results/ansi/date.sql.out b/sql/core/src/test/resources/sql-tests/results/ansi/date.sql.out index 9ddbaec4f99..d0f5b02c916 100644 --- a/sql/core/src/test/resources/sql-tests/results/ansi/date.sql.out +++ b/sql/core/src/test/resources/sql-tests/results/ansi/date.sql.out @@ -135,6 +135,29 @@ struct<(current_date() = current_date()):boolean> true +-- !query +select curdate(1) +-- !query schema +struct<> +-- !query output +org.apache.spark.sql.AnalysisException +{ + "errorClass" : "WRONG_NUM_ARGS", + "messageParameters" : { +"actualNum" : "1", +"expectedNum" : "0", +"functionName" : "`curdate`" + }, + "queryContext" : [ { +"objectType" : "", +"objectName" : "", +"startIndex" : 8, +"stopIndex" : 17, +"fragment" : "curdate(1)" + } ] +} + + -- !query select DATE_FROM_UNIX_DATE(0), DATE_FROM_UNIX_DATE(1000), DATE_FROM_UNIX_DATE(null) -- !query schema diff --git a/sql/core/src/test/resources/sql-tests/results/date.sql.out b/sql/core/src/test/resources/sql-tests/results/date.sql.out index 9e427adb052..434e3c7abd3 100644 --- a/sql/core/src/test/resources/sql-tests/results/date.sql.out +++ b/sql/core/src/test/resources/sql-tests/results/date.sql.out @@ -121,6 +121,29 @@ struct<(current_date() = current_date()):boolean> true +-- !query +select curdate(1) +-- !query schema +struct<> +-- !query output +org.apache.spark.sql.AnalysisE
[spark] branch master updated: [SPARK-41390][SQL] Update the script used to generate `register` function in `UDFRegistration`
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 11cebdbdd0e [SPARK-41390][SQL] Update the script used to generate `register` function in `UDFRegistration` 11cebdbdd0e is described below commit 11cebdbdd0e6d83cbde5f1cb5e4802a7dd5ada48 Author: yangjie01 AuthorDate: Mon Dec 5 23:11:23 2022 +0300 [SPARK-41390][SQL] Update the script used to generate `register` function in `UDFRegistration` ### What changes were proposed in this pull request? SPARK-35065 use `QueryCompilationErrors.invalidFunctionArgumentsError` instead of `throw new AnalysisException(...)` for `register` function in `UDFRegistration`, but the script used to generate `register` function has not been updated, so this pr update the script. ### Why are the changes needed? Update the script used to generate `register` function in `UDFRegistration` ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Manually checked the results of the script. Closes #38916 from LuciferYang/register-func-script. Authored-by: yangjie01 Signed-off-by: Max Gekk --- sql/core/src/main/scala/org/apache/spark/sql/UDFRegistration.scala | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/sql/core/src/main/scala/org/apache/spark/sql/UDFRegistration.scala b/sql/core/src/main/scala/org/apache/spark/sql/UDFRegistration.scala index 99820336477..80550dc21d2 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/UDFRegistration.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/UDFRegistration.scala @@ -145,8 +145,7 @@ class UDFRegistration private[sql] (functionRegistry: FunctionRegistry) extends | def builder(e: Seq[Expression]) = if (e.length == $x) { |finalUdf.createScalaUDF(e) | } else { -|throw new AnalysisException("Invalid number of arguments for function " + name + -| ". 
Expected: $x; Found: " + e.length) +|throw QueryCompilationErrors.invalidFunctionArgumentsError(name, "$x", e.length) | } | functionRegistry.createOrReplaceTempFunction(name, builder, "scala_udf") | finalUdf @@ -171,8 +170,7 @@ class UDFRegistration private[sql] (functionRegistry: FunctionRegistry) extends | def builder(e: Seq[Expression]) = if (e.length == $i) { |ScalaUDF(func, replaced, e, Nil, udfName = Some(name)) | } else { -|throw new AnalysisException("Invalid number of arguments for function " + name + -| ". Expected: $i; Found: " + e.length) +|throw QueryCompilationErrors.invalidFunctionArgumentsError(name, "$i", e.length) | } | functionRegistry.createOrReplaceTempFunction(name, builder, "java_udf") |}""".stripMargin) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
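Since the email above is about a code-generation script, a toy version helps make the change concrete. The sketch below is our own simplification, not the actual Spark script: it renders the updated `builder` template for one arity, including the `QueryCompilationErrors.invalidFunctionArgumentsError` call the PR switches to.

```python
# Toy version of the register-function generator discussed above. The real
# script emits many Scala overloads; here we only render the updated `builder`
# template for a single arity to show the new error call.
BUILDER_TEMPLATE = """\
def builder(e: Seq[Expression]) = if (e.length == {n}) {{
  finalUdf.createScalaUDF(e)
}} else {{
  throw QueryCompilationErrors.invalidFunctionArgumentsError(name, "{n}", e.length)
}}"""

def render_builder(n: int) -> str:
    """Render the Scala builder snippet for an n-argument UDF."""
    return BUILDER_TEMPLATE.format(n=n)

print(render_builder(3))
```

Keeping the generator in sync with the hand-edited output (the point of SPARK-41390) avoids the template silently regenerating the old `new AnalysisException(...)` form.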
[spark] branch master updated: [SPARK-41389][CORE][SQL] Reuse `WRONG_NUM_ARGS` instead of `_LEGACY_ERROR_TEMP_1044`
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 1996a94b09f [SPARK-41389][CORE][SQL] Reuse `WRONG_NUM_ARGS` instead of `_LEGACY_ERROR_TEMP_1044` 1996a94b09f is described below commit 1996a94b09fe1f450eb33ddb23b16af090bc4d1b Author: yangjie01 AuthorDate: Mon Dec 5 18:04:51 2022 +0300 [SPARK-41389][CORE][SQL] Reuse `WRONG_NUM_ARGS` instead of `_LEGACY_ERROR_TEMP_1044` ### What changes were proposed in this pull request? This pr aims to reuse error class `WRONG_NUM_ARGS` instead of `_LEGACY_ERROR_TEMP_1044`. ### Why are the changes needed? Proper names of error classes to improve user experience with Spark SQL. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Pass Github Actions. Closes #38913 from LuciferYang/SPARK-41389. Authored-by: yangjie01 Signed-off-by: Max Gekk --- core/src/main/resources/error/error-classes.json| 5 - .../org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala | 5 +++-- .../scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala | 6 -- .../resources/sql-tests/results/sql-compatibility-functions.sql.out | 6 -- 4 files changed, 7 insertions(+), 15 deletions(-) diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json index 7d5c272a77f..19ab5ada2b5 100644 --- a/core/src/main/resources/error/error-classes.json +++ b/core/src/main/resources/error/error-classes.json @@ -2011,11 +2011,6 @@ "Invalid arguments for function ." ] }, - "_LEGACY_ERROR_TEMP_1044" : { -"message" : [ - "Function accepts only one argument." -] - }, "_LEGACY_ERROR_TEMP_1045" : { "message" : [ "ALTER TABLE SET LOCATION does not support partition for v2 tables." 
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala index 3817f00d09d..be16eaec6ac 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala @@ -896,8 +896,9 @@ object FunctionRegistry { name: String, dataType: DataType): (String, (ExpressionInfo, FunctionBuilder)) = { val builder = (args: Seq[Expression]) => { - if (args.size != 1) { -throw QueryCompilationErrors.functionAcceptsOnlyOneArgumentError(name) + val argSize = args.size + if (argSize != 1) { +throw QueryCompilationErrors.invalidFunctionArgumentsError(name, "1", argSize) } Cast(args.head, dataType) } diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala index 2e20d7aec8d..ed08e33829e 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala @@ -663,12 +663,6 @@ private[sql] object QueryCompilationErrors extends QueryErrorsBase { } } - def functionAcceptsOnlyOneArgumentError(name: String): Throwable = { -new AnalysisException( - errorClass = "_LEGACY_ERROR_TEMP_1044", - messageParameters = Map("name" -> name)) - } - def alterV2TableSetLocationWithPartitionNotSupportedError(): Throwable = { new AnalysisException( errorClass = "_LEGACY_ERROR_TEMP_1045", diff --git a/sql/core/src/test/resources/sql-tests/results/sql-compatibility-functions.sql.out b/sql/core/src/test/resources/sql-tests/results/sql-compatibility-functions.sql.out index e0d5874d058..319ac059385 100644 --- a/sql/core/src/test/resources/sql-tests/results/sql-compatibility-functions.sql.out +++ 
b/sql/core/src/test/resources/sql-tests/results/sql-compatibility-functions.sql.out @@ -94,9 +94,11 @@ struct<> -- !query output org.apache.spark.sql.AnalysisException { - "errorClass" : "_LEGACY_ERROR_TEMP_1044", + "errorClass" : "WRONG_NUM_ARGS", "messageParameters" : { -"name" : "string" +"actualNum" : "2", +"expectedNum" : "1", +"functionName" : "`string`" }, "queryContext" : [ { "objectType" : "", - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
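A minimal model of the check after this change may help: the single-argument cast builder now reports a `WRONG_NUM_ARGS`-style error carrying the expected and actual counts instead of a bespoke "accepts only one argument" message. The data shapes and names below are ours; Spark raises `AnalysisException` through its error-class framework.

```python
# Minimal model of the arity check in FunctionRegistry after SPARK-41389.
# The error payload mirrors the JSON shape shown in the test output above.
def cast_builder(name, args, expected=1):
    if len(args) != expected:
        raise ValueError({
            "errorClass": "WRONG_NUM_ARGS",
            "messageParameters": {
                "functionName": f"`{name}`",
                "expectedNum": str(expected),
                "actualNum": str(len(args)),
            },
        })
    return ("Cast", args[0])

try:
    cast_builder("string", ["col", "extra"])
except ValueError as e:
    print(e.args[0]["messageParameters"]["actualNum"])  # prints 2
```

Reusing one error class for every arity mismatch is what lets the SQL test output above carry structured `expectedNum`/`actualNum` parameters instead of free-form text.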
[spark] branch master updated: [SPARK-41373][SQL][ERROR] Rename CAST_WITH_FUN_SUGGESTION to CAST_WITH_FUNC_SUGGESTION
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 811921be3ba [SPARK-41373][SQL][ERROR] Rename CAST_WITH_FUN_SUGGESTION to CAST_WITH_FUNC_SUGGESTION 811921be3ba is described below commit 811921be3bacb2edb1d382257561429a0a604adb Author: Rui Wang AuthorDate: Sun Dec 4 00:44:11 2022 +0300 [SPARK-41373][SQL][ERROR] Rename CAST_WITH_FUN_SUGGESTION to CAST_WITH_FUNC_SUGGESTION ### What changes were proposed in this pull request? Rename CAST_WITH_FUN_SUGGESTION to CAST_WITH_FUNC_SUGGESTION. This is just `_FUN_SUGGESTION` could has other meaning. `CAST_WITH_FUNC_SUGGESTION` is more clear. I didn't choose to rename this it `CAST_WITH_SUGGESTION` because there is a `CAST_WITH_CONF_SUGGESTION` so we need to differentiate. ### Why are the changes needed? Better error message name. ### Does this PR introduce _any_ user-facing change? NO ### How was this patch tested? Existing UT. Closes #38892 from amaliujia/improve_error_message. Authored-by: Rui Wang Signed-off-by: Max Gekk --- core/src/main/resources/error/error-classes.json | 2 +- .../org/apache/spark/sql/catalyst/expressions/Cast.scala | 2 +- .../spark/sql/catalyst/expressions/CastSuiteBase.scala | 12 ++-- .../spark/sql/catalyst/expressions/CastWithAnsiOnSuite.scala | 4 ++-- 4 files changed, 10 insertions(+), 10 deletions(-) diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json index 347b9a14862..7d5c272a77f 100644 --- a/core/src/main/resources/error/error-classes.json +++ b/core/src/main/resources/error/error-classes.json @@ -197,7 +197,7 @@ "If you have to cast to , you can set as ." ] }, - "CAST_WITH_FUN_SUGGESTION" : { + "CAST_WITH_FUNC_SUGGESTION" : { "message" : [ "cannot cast to .", "To convert values from to , you can use the functions instead." 
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala index a302298d99c..23152adc0ca 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala @@ -419,7 +419,7 @@ object Cast extends QueryErrorsBase { fallbackConf: Option[(String, String)]): DataTypeMismatch = { def withFunSuggest(names: String*): DataTypeMismatch = { DataTypeMismatch( -errorSubClass = "CAST_WITH_FUN_SUGGESTION", +errorSubClass = "CAST_WITH_FUNC_SUGGESTION", messageParameters = Map( "srcType" -> toSQLType(from), "targetType" -> toSQLType(to), diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuiteBase.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuiteBase.scala index 6d972a8482a..68b3d5c8446 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuiteBase.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuiteBase.scala @@ -545,7 +545,7 @@ abstract class CastSuiteBase extends SparkFunSuite with ExpressionEvalHelper { protected def checkInvalidCastFromNumericType(to: DataType): Unit = { cast(1.toByte, to).checkInputDataTypes() == DataTypeMismatch( -errorSubClass = "CAST_WITH_FUN_SUGGESTION", +errorSubClass = "CAST_WITH_FUNC_SUGGESTION", messageParameters = Map( "srcType" -> toSQLType(Literal(1.toByte).dataType), "targetType" -> toSQLType(to), @@ -554,7 +554,7 @@ abstract class CastSuiteBase extends SparkFunSuite with ExpressionEvalHelper { ) cast(1.toShort, to).checkInputDataTypes() == DataTypeMismatch( -errorSubClass = "CAST_WITH_FUN_SUGGESTION", +errorSubClass = "CAST_WITH_FUNC_SUGGESTION", messageParameters = Map( "srcType" -> toSQLType(Literal(1.toShort).dataType), "targetType" -> toSQLType(to), @@ -563,7 +563,7 @@ 
abstract class CastSuiteBase extends SparkFunSuite with ExpressionEvalHelper { ) cast(1, to).checkInputDataTypes() == DataTypeMismatch( -errorSubClass = "CAST_WITH_FUN_SUGGESTION", +errorSubClass = "CAST_WITH_FUNC_SUGGESTION", messageParameters = Map( "srcType" -> toSQLType(Lit
[spark] branch master updated (0f1c515179e -> 3fc8a902673)
maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

 from 0f1c515179e [SPARK-41345][CONNECT] Add Hint to Connect Proto
  add 3fc8a902673 [SPARK-41348][SQL][TESTS] Refactor `UnsafeArrayWriterSuite` to check error class

No new revisions were added by this update.

Summary of changes:
 .../expressions/codegen/UnsafeArrayWriterSuite.scala | 17 +
 1 file changed, 13 insertions(+), 4 deletions(-)
[spark] branch master updated: [SPARK-41314][SQL] Assign a name to the error class `_LEGACY_ERROR_TEMP_1094`
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new e00f14ff521 [SPARK-41314][SQL] Assign a name to the error class `_LEGACY_ERROR_TEMP_1094` e00f14ff521 is described below commit e00f14ff5216e194fe39ef38d2c9414a22ef696a Author: yangjie01 AuthorDate: Thu Dec 1 11:49:42 2022 +0300 [SPARK-41314][SQL] Assign a name to the error class `_LEGACY_ERROR_TEMP_1094` ### What changes were proposed in this pull request? This pr aims to rename error class `_LEGACY_ERROR_TEMP_1094` to `INVALID_SCHEMA.NON_STRUCT_TYPE`. ### Why are the changes needed? Proper names of error classes to improve user experience with Spark SQL. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Add new tests to check `INVALID_SCHEMA.NON_STRUCT_TYPE` Closes #38856 from LuciferYang/SPARK-41314. Authored-by: yangjie01 Signed-off-by: Max Gekk --- core/src/main/resources/error/error-classes.json | 10 .../spark/sql/catalyst/expressions/ExprUtils.scala | 2 +- .../spark/sql/errors/QueryCompilationErrors.scala | 9 --- .../resources/sql-tests/inputs/csv-functions.sql | 1 + .../sql-tests/results/csv-functions.sql.out| 22 .../org/apache/spark/sql/CsvFunctionsSuite.scala | 29 ++ 6 files changed, 64 insertions(+), 9 deletions(-) diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json index 65b6dc68d12..347b9a14862 100644 --- a/core/src/main/resources/error/error-classes.json +++ b/core/src/main/resources/error/error-classes.json @@ -782,6 +782,11 @@ "The input expression must be string literal and not null." ] }, + "NON_STRUCT_TYPE" : { +"message" : [ + "The input expression should be evaluated to struct type, but got ." 
+] + }, "PARSE_ERROR" : { "message" : [ "Cannot parse the schema:", @@ -2211,11 +2216,6 @@ "Cannot read table property '' as it's corrupted.." ] }, - "_LEGACY_ERROR_TEMP_1094" : { -"message" : [ - "Schema should be struct type but got ." -] - }, "_LEGACY_ERROR_TEMP_1097" : { "message" : [ "The field for corrupt records must be string type and nullable." diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ExprUtils.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ExprUtils.scala index fbe3d5eb458..2fa970bac0c 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ExprUtils.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ExprUtils.scala @@ -46,7 +46,7 @@ object ExprUtils extends QueryErrorsBase { def evalSchemaExpr(exp: Expression): StructType = { val dataType = evalTypeExpr(exp) if (!dataType.isInstanceOf[StructType]) { - throw QueryCompilationErrors.schemaIsNotStructTypeError(dataType) + throw QueryCompilationErrors.schemaIsNotStructTypeError(exp, dataType) } dataType.asInstanceOf[StructType] } diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala index fc9a08104b4..2e20d7aec8d 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala @@ -1010,10 +1010,13 @@ private[sql] object QueryCompilationErrors extends QueryErrorsBase { messageParameters = Map("inputSchema" -> toSQLExpr(exp))) } - def schemaIsNotStructTypeError(dataType: DataType): Throwable = { + def schemaIsNotStructTypeError(exp: Expression, dataType: DataType): Throwable = { new AnalysisException( - errorClass = "_LEGACY_ERROR_TEMP_1094", - messageParameters = Map("dataType" -> dataType.toString)) + errorClass = 
"INVALID_SCHEMA.NON_STRUCT_TYPE", + messageParameters = Map( +"inputSchema" -> toSQLExpr(exp), +"dataType" -> toSQLType(dataType) + )) } def keyValueInMapNotStringError(m: CreateMap): Throwable = { diff --git a/sql/core/src/test/resources/sql-tests/inputs/csv-functions.sql b/sql/core/src/test/resources/sql-tests/inputs/csv-functions.sql index a1a4bc9de3f..01d436534a1 100644 --- a/sql/core/src/test/resources/sql-tests/inputs/csv-functions.sql +++ b/sql/core/src/test/resour
[spark] branch master updated: [SPARK-41228][SQL] Rename & Improve error message for `COLUMN_NOT_IN_GROUP_BY_CLAUSE`
maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 5badb2446fa [SPARK-41228][SQL] Rename & Improve error message for `COLUMN_NOT_IN_GROUP_BY_CLAUSE`

5badb2446fa is described below

commit 5badb2446fa2b51e8ea239ced6c9b44178b2f1fa
Author: itholic
AuthorDate: Thu Dec 1 09:18:17 2022 +0300

[SPARK-41228][SQL] Rename & Improve error message for `COLUMN_NOT_IN_GROUP_BY_CLAUSE`

### What changes were proposed in this pull request?

This PR proposes to rename `COLUMN_NOT_IN_GROUP_BY_CLAUSE` to `MISSING_AGGREGATION`. Also, improve its error message.

### Why are the changes needed?

The current error class name and its error message don't illustrate the error cause and resolution correctly.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

```
./build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite*"
```

Closes #38769 from itholic/SPARK-41128.
Authored-by: itholic Signed-off-by: Max Gekk --- core/src/main/resources/error/error-classes.json | 13 +++-- .../sql/tests/pandas/test_pandas_udf_grouped_agg.py | 2 +- .../apache/spark/sql/errors/QueryCompilationErrors.scala | 7 +-- .../spark/sql/catalyst/analysis/AnalysisErrorSuite.scala | 7 +-- .../src/test/resources/sql-tests/results/extract.sql.out | 2 ++ .../resources/sql-tests/results/group-by-filter.sql.out | 10 ++ .../src/test/resources/sql-tests/results/group-by.sql.out | 15 +-- .../test/resources/sql-tests/results/grouping_set.sql.out | 5 +++-- .../sql-tests/results/postgreSQL/create_view.sql.out | 5 +++-- .../sql-tests/results/udaf/udaf-group-by-ordinal.sql.out | 15 +-- .../sql-tests/results/udaf/udaf-group-by.sql.out | 15 +-- .../resources/sql-tests/results/udf/udf-group-by.sql.out | 15 +-- .../org/apache/spark/sql/execution/SQLViewSuite.scala | 5 +++-- 13 files changed, 71 insertions(+), 45 deletions(-) diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json index a79c02e1f1d..65b6dc68d12 100644 --- a/core/src/main/resources/error/error-classes.json +++ b/core/src/main/resources/error/error-classes.json @@ -109,12 +109,6 @@ "The column already exists. Consider to choose another name or rename the existing column." ] }, - "COLUMN_NOT_IN_GROUP_BY_CLAUSE" : { -"message" : [ - "The expression is neither present in the group by, nor is it an aggregate function. Add to group by or wrap in `first()` (or `first_value()`) if you don't care which value you get." -], -"sqlState" : "42000" - }, "CONCURRENT_QUERY" : { "message" : [ "Another instance of this query was just started by a concurrent session." @@ -830,6 +824,13 @@ "Malformed Protobuf messages are detected in message deserialization. Parse Mode: . To process malformed protobuf message as null result, try setting the option 'mode' as 'PERMISSIVE'." 
] }, + "MISSING_AGGREGATION" : { +"message" : [ + "The non-aggregating expression is based on columns which are not participating in the GROUP BY clause.", + "Add the columns or the expression to the GROUP BY, aggregate the expression, or use if you do not care which of the values within a group is returned." +], +"sqlState" : "42000" + }, "MISSING_STATIC_PARTITION_COLUMN" : { "message" : [ "Unknown static partition column: " diff --git a/python/pyspark/sql/tests/pandas/test_pandas_udf_grouped_agg.py b/python/pyspark/sql/tests/pandas/test_pandas_udf_grouped_agg.py index 6f475624b74..aa844fc5fd5 100644 --- a/python/pyspark/sql/tests/pandas/test_pandas_udf_grouped_agg.py +++ b/python/pyspark/sql/tests/pandas/test_pandas_udf_grouped_agg.py @@ -475,7 +475,7 @@ class GroupedAggPandasUDFTests(ReusedSQLTestCase): mean_udf = self.pandas_agg_mean_udf with QuietTest(self.sc): -with self.assertRaisesRegex(AnalysisException, "nor.*aggregate function"): +with self.assertRaisesRegex(AnalysisException, "[MISSING_AGGREGATION]"): df.groupby(df.id).agg(plus_one(df.v)).collect() with QuietTest(self.sc): diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scal
[spark] branch master updated (ce41ca0848e -> c5f189c5365)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git from ce41ca0848e [SPARK-41343][CONNECT] Move FunctionName parsing to server side add c5f189c5365 [SPARK-41237][SQL] Reuse the error class `UNSUPPORTED_DATATYPE` for `_LEGACY_ERROR_TEMP_0030` No new revisions were added by this update. Summary of changes: R/pkg/tests/fulltests/test_sparkSQL.R| 6 +++--- R/pkg/tests/fulltests/test_streaming.R | 2 +- R/pkg/tests/fulltests/test_utils.R | 2 +- core/src/main/resources/error/error-classes.json | 5 - .../org/apache/spark/sql/errors/QueryParsingErrors.scala | 4 ++-- .../apache/spark/sql/catalyst/parser/DDLParserSuite.scala| 4 ++-- .../spark/sql/catalyst/parser/DataTypeParserSuite.scala | 12 ++-- .../apache/spark/sql/catalyst/parser/ErrorParserSuite.scala | 4 ++-- .../test/resources/sql-tests/results/csv-functions.sql.out | 1 - .../test/resources/sql-tests/results/postgreSQL/with.sql.out | 10 ++ .../sql/execution/datasources/jdbc/JdbcUtilsSuite.scala | 4 ++-- .../datasources/v2/jdbc/JDBCTableCatalogSuite.scala | 4 ++-- .../scala/org/apache/spark/sql/jdbc/JDBCWriteSuite.scala | 4 ++-- 13 files changed, 33 insertions(+), 29 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-41309][SQL] Reuse `INVALID_SCHEMA.NON_STRING_LITERAL` instead of `_LEGACY_ERROR_TEMP_1093`
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new a47869af7fa [SPARK-41309][SQL] Reuse `INVALID_SCHEMA.NON_STRING_LITERAL` instead of `_LEGACY_ERROR_TEMP_1093` a47869af7fa is described below commit a47869af7fa82b708520da123fa0446214f601c2 Author: yangjie01 AuthorDate: Tue Nov 29 19:36:59 2022 +0300 [SPARK-41309][SQL] Reuse `INVALID_SCHEMA.NON_STRING_LITERAL` instead of `_LEGACY_ERROR_TEMP_1093` ### What changes were proposed in this pull request? This PR aims to reuse `INVALID_SCHEMA.NON_STRING_LITERAL` instead of `_LEGACY_ERROR_TEMP_1093`. ### Why are the changes needed? Proper names of error classes to improve user experience with Spark SQL. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Pass GitHub Actions Closes #38830 from LuciferYang/SPARK-41309. Lead-authored-by: yangjie01 Co-authored-by: YangJie Signed-off-by: Max Gekk --- core/src/main/resources/error/error-classes.json| 5 - .../apache/spark/sql/catalyst/expressions/ExprUtils.scala | 2 +- .../apache/spark/sql/errors/QueryCompilationErrors.scala| 6 -- .../test/scala/org/apache/spark/sql/CsvFunctionsSuite.scala | 13 - .../scala/org/apache/spark/sql/JsonFunctionsSuite.scala | 13 - 5 files changed, 17 insertions(+), 22 deletions(-) diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json index 89728777201..cddb0848765 100644 --- a/core/src/main/resources/error/error-classes.json +++ b/core/src/main/resources/error/error-classes.json @@ -2215,11 +2215,6 @@ "Cannot read table property '' as it's corrupted.." ] }, - "_LEGACY_ERROR_TEMP_1093" : { -"message" : [ - "Schema should be specified in DDL format as a string literal or output of the schema_of_json/schema_of_csv functions instead of ."
-] - }, "_LEGACY_ERROR_TEMP_1094" : { "message" : [ "Schema should be struct type but got ." diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ExprUtils.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ExprUtils.scala index e9084442b22..fbe3d5eb458 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ExprUtils.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ExprUtils.scala @@ -39,7 +39,7 @@ object ExprUtils extends QueryErrorsBase { } } else { - throw QueryCompilationErrors.schemaNotFoldableError(exp) + throw QueryCompilationErrors.unexpectedSchemaTypeError(exp) } } diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala index ce99bf4aa47..e5b1c3c100d 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala @@ -1009,12 +1009,6 @@ private[sql] object QueryCompilationErrors extends QueryErrorsBase { messageParameters = Map("inputSchema" -> toSQLExpr(exp))) } - def schemaNotFoldableError(exp: Expression): Throwable = { -new AnalysisException( - errorClass = "_LEGACY_ERROR_TEMP_1093", - messageParameters = Map("expr" -> exp.sql)) - } - def schemaIsNotStructTypeError(dataType: DataType): Throwable = { new AnalysisException( errorClass = "_LEGACY_ERROR_TEMP_1094", diff --git a/sql/core/src/test/scala/org/apache/spark/sql/CsvFunctionsSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/CsvFunctionsSuite.scala index 940eaaed6ac..ab4c148da04 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/CsvFunctionsSuite.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/CsvFunctionsSuite.scala @@ -357,11 +357,14 @@ class CsvFunctionsSuite extends QueryTest with SharedSparkSession { 
Seq("""1,"a"""").toDS().select(from_csv($"value", schema, options)), Row(Row(1, "a"))) -val errMsg = intercept[AnalysisException] { - Seq(("1", "i int")).toDF("csv", "schema") -.select(from_csv($"csv", $"schema", options)).collect() -}.getMessage -assert(errMsg.contains("Schema should be specified in DDL format as a string literal")) +checkError( + exception = intercept[AnalysisException] { +Seq(("1", &quo
[spark] branch master updated: [SPARK-41180][SQL] Reuse `INVALID_SCHEMA` instead of `_LEGACY_ERROR_TEMP_1227`
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new bdb4d5e4da5 [SPARK-41180][SQL] Reuse `INVALID_SCHEMA` instead of `_LEGACY_ERROR_TEMP_1227` bdb4d5e4da5 is described below commit bdb4d5e4da558775df2be712dd8760d5f5f14747 Author: yangjie01 AuthorDate: Mon Nov 28 20:26:27 2022 +0300 [SPARK-41180][SQL] Reuse `INVALID_SCHEMA` instead of `_LEGACY_ERROR_TEMP_1227` ### What changes were proposed in this pull request? This PR aims to rename `_LEGACY_ERROR_TEMP_1227` to `INVALID_SCHEMA`. ### Why are the changes needed? Proper names of error classes to improve user experience with Spark SQL. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Pass GitHub Actions Closes #38754 from LuciferYang/SPARK-41180. Lead-authored-by: yangjie01 Co-authored-by: YangJie Signed-off-by: Max Gekk --- core/src/main/resources/error/error-classes.json | 23 --- project/MimaExcludes.scala | 5 +- .../spark/sql/catalyst/expressions/ExprUtils.scala | 2 +- .../spark/sql/errors/QueryCompilationErrors.scala | 23 --- .../apache/spark/sql/errors/QueryErrorsBase.scala | 4 ++ .../org/apache/spark/sql/types/DataType.scala | 13 ++-- .../scala/org/apache/spark/sql/functions.scala | 1 - .../sql-tests/results/csv-functions.sql.out| 12 ++-- .../sql-tests/results/json-functions.sql.out | 12 ++-- .../org/apache/spark/sql/CsvFunctionsSuite.scala | 4 +- .../apache/spark/sql/DataFrameFunctionsSuite.scala | 4 +- .../org/apache/spark/sql/JsonFunctionsSuite.scala | 73 -- 12 files changed, 115 insertions(+), 61 deletions(-) diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json index 9f4337d0618..89728777201 100644 --- a/core/src/main/resources/error/error-classes.json +++ b/core/src/main/resources/error/error-classes.json @@ -780,8 +780,21 @@
}, "INVALID_SCHEMA" : { "message" : [ - "The expression is not a valid schema string." -] + "The input schema is not a valid schema string." +], +"subClass" : { + "NON_STRING_LITERAL" : { +"message" : [ + "The input expression must be string literal and not null." +] + }, + "PARSE_ERROR" : { +"message" : [ + "Cannot parse the schema:", + "" +] + } +} }, "INVALID_SQL_SYNTAX" : { "message" : [ @@ -2844,12 +2857,6 @@ "The SQL config '' was removed in the version . " ] }, - "_LEGACY_ERROR_TEMP_1227" : { -"message" : [ - "", - "Failed fallback parsing: " -] - }, "_LEGACY_ERROR_TEMP_1228" : { "message" : [ "Decimal scale () cannot be greater than precision ()." diff --git a/project/MimaExcludes.scala b/project/MimaExcludes.scala index d8f87a504fa..eed79d1f204 100644 --- a/project/MimaExcludes.scala +++ b/project/MimaExcludes.scala @@ -120,7 +120,10 @@ object MimaExcludes { ProblemFilters.exclude[IncompatibleMethTypeProblem]("org.apache.spark.storage.ShuffleBlockFetcherIterator#FetchRequest.apply"), // [SPARK-41072][SS] Add the error class STREAM_FAILED to StreamingQueryException - ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.sql.streaming.StreamingQueryException.this") + ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.sql.streaming.StreamingQueryException.this"), + +// [SPARK-41180][SQL] Reuse INVALID_SCHEMA instead of _LEGACY_ERROR_TEMP_1227 + ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.sql.types.DataType.parseTypeWithFallback") ) // Defulat exclude rules diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ExprUtils.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ExprUtils.scala index 3e10b820aa6..e9084442b22 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ExprUtils.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ExprUtils.scala @@ -35,7 +35,7 @@ object ExprUtils extends 
QueryErrorsBase { case s: UTF8String if s != null => val dataType = DataType.fromDDL(s.toString) CharVarcharUtils.failIfHasCharVarchar(dataType) -case _ => throw QueryCompilationErrors.invalidSchemaStringError(exp) +case _ => throw QueryCompilationEr
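The two `INVALID_SCHEMA` sub-classes introduced above split schema failures into "not a string literal" (`NON_STRING_LITERAL`) and "a string literal that does not parse" (`PARSE_ERROR`). A hedged Python sketch of that dispatch follows — the DDL "parser" here is a deliberately tiny stand-in, nothing like Spark's `DataType.fromDDL`:

```python
# Rough sketch (NOT Spark's ExprUtils / DataType.fromDDL) of the two
# INVALID_SCHEMA sub-errors: the schema argument must be a string literal,
# and that literal must parse as a DDL schema.
class InvalidSchema(Exception):
    def __init__(self, sub_class, **params):
        super().__init__(f"[INVALID_SCHEMA.{sub_class}] {params}")
        self.sub_class = sub_class

KNOWN_TYPES = {"int", "string", "double", "boolean"}   # toy type list

def eval_schema_expr(expr):
    if not isinstance(expr, str):                      # e.g. a column ref, not a literal
        raise InvalidSchema("NON_STRING_LITERAL", inputSchema=repr(expr))
    fields = [f.strip().split() for f in expr.split(",")]
    if not all(len(f) == 2 and f[1] in KNOWN_TYPES for f in fields):
        raise InvalidSchema("PARSE_ERROR", inputSchema=expr)
    return fields

print(eval_schema_expr("i int, s string"))   # [['i', 'int'], ['s', 'string']]
```

In the real change, passing a non-literal column as the schema to `from_csv`/`from_json` now reports `NON_STRING_LITERAL` instead of the legacy template.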
[spark] branch master updated: [SPARK-41293][SQL][TESTS] Code cleanup for `assertXXX` methods in `ExpressionTypeCheckingSuite`
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new c3de4ca1477 [SPARK-41293][SQL][TESTS] Code cleanup for `assertXXX` methods in `ExpressionTypeCheckingSuite` c3de4ca1477 is described below commit c3de4ca14772fa6dff703b662a561a9e65e23d9e Author: yangjie01 AuthorDate: Mon Nov 28 14:55:28 2022 +0300 [SPARK-41293][SQL][TESTS] Code cleanup for `assertXXX` methods in `ExpressionTypeCheckingSuite` ### What changes were proposed in this pull request? This PR does some code cleanup for the `assertXXX` methods in `ExpressionTypeCheckingSuite`: 1. Reuse `analysisException` instead of the duplicated `intercept[AnalysisException](assertSuccess(expr))` in the `assertErrorForXXX` methods. 2. Remove the `assertError` method that is no longer used. 3. Change the access scope of the `assertErrorForXXX` methods to `private`, since they are only used in `ExpressionTypeCheckingSuite`. ### Why are the changes needed? Code cleanup. ### Does this PR introduce _any_ user-facing change? No, tests only. ### How was this patch tested? Pass GitHub Actions Closes #38820 from LuciferYang/SPARK-41293.
Authored-by: yangjie01 Signed-off-by: Max Gekk --- .../analysis/ExpressionTypeCheckingSuite.scala | 41 ++ 1 file changed, 11 insertions(+), 30 deletions(-) diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ExpressionTypeCheckingSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ExpressionTypeCheckingSuite.scala index d406ec8f74a..6202d1e367a 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ExpressionTypeCheckingSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ExpressionTypeCheckingSuite.scala @@ -44,65 +44,46 @@ class ExpressionTypeCheckingSuite extends SparkFunSuite with SQLHelper with Quer intercept[AnalysisException](assertSuccess(expr)) } - def assertError(expr: Expression, errorMessage: String): Unit = { -val e = intercept[AnalysisException] { - assertSuccess(expr) -} -assert(e.getMessage.contains( - s"cannot resolve '${expr.sql}' due to data type mismatch:")) -assert(e.getMessage.contains(errorMessage)) - } - - def assertSuccess(expr: Expression): Unit = { + private def assertSuccess(expr: Expression): Unit = { val analyzed = testRelation.select(expr.as("c")).analyze SimpleAnalyzer.checkAnalysis(analyzed) } - def assertErrorForBinaryDifferingTypes( + private def assertErrorForBinaryDifferingTypes( expr: Expression, messageParameters: Map[String, String]): Unit = { checkError( - exception = intercept[AnalysisException] { -assertSuccess(expr) - }, + exception = analysisException(expr), errorClass = "DATATYPE_MISMATCH.BINARY_OP_DIFF_TYPES", parameters = messageParameters) } - def assertErrorForOrderingTypes( + private def assertErrorForOrderingTypes( expr: Expression, messageParameters: Map[String, String]): Unit = { checkError( - exception = intercept[AnalysisException] { -assertSuccess(expr) - }, + exception = analysisException(expr), errorClass = "DATATYPE_MISMATCH.INVALID_ORDERING_TYPE", parameters = messageParameters) } - def 
assertErrorForDataDifferingTypes( + private def assertErrorForDataDifferingTypes( expr: Expression, messageParameters: Map[String, String]): Unit = { checkError( - exception = intercept[AnalysisException] { -assertSuccess(expr) - }, + exception = analysisException(expr), errorClass = "DATATYPE_MISMATCH.DATA_DIFF_TYPES", parameters = messageParameters) } - def assertErrorForWrongNumParameters( + private def assertErrorForWrongNumParameters( expr: Expression, messageParameters: Map[String, String]): Unit = { checkError( - exception = intercept[AnalysisException] { -assertSuccess(expr) - }, + exception = analysisException(expr), errorClass = "DATATYPE_MISMATCH.WRONG_NUM_ARGS", parameters = messageParameters) } - def assertForWrongType(expr: Expression, messageParameters: Map[String, String]): Unit = { + private def assertForWrongType(expr: Expression, messageParameters: Map[String, String]): Unit = { checkError( - exception = intercept[AnalysisException] { -assertSuccess(expr) - }, + exception = analysisException(expr), errorClass = "DATATYPE_MISMATCH.BINARY_OP_WRONG_TYPE", parameters = messageParameters) } -
[spark] branch master updated: [SPARK-41272][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_2019
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new d979736a9eb [SPARK-41272][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_2019 d979736a9eb is described below commit d979736a9eb754725d33fd5baca88a1c1a8c23ce Author: panbingkun AuthorDate: Mon Nov 28 12:01:02 2022 +0300 [SPARK-41272][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_2019 ### What changes were proposed in this pull request? In the PR, I propose to assign the name `NULL_MAP_KEY` to the error class `_LEGACY_ERROR_TEMP_2019`. ### Why are the changes needed? Proper names of error classes should improve user experience with Spark SQL. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Pass GA. Closes #38808 from panbingkun/LEGACY_2019. Authored-by: panbingkun Signed-off-by: Max Gekk --- core/src/main/resources/error/error-classes.json | 10 ++--- .../spark/sql/errors/QueryExecutionErrors.scala| 2 +- .../catalyst/encoders/ExpressionEncoderSuite.scala | 20 +++--- .../expressions/CollectionExpressionsSuite.scala | 10 ++--- .../catalyst/expressions/ComplexTypeSuite.scala| 11 +++--- .../expressions/ExpressionEvalHelper.scala | 43 +- .../expressions/ObjectExpressionsSuite.scala | 10 ++--- .../catalyst/util/ArrayBasedMapBuilderSuite.scala | 8 +++- .../apache/spark/sql/DataFrameFunctionsSuite.scala | 38 ++- 9 files changed, 113 insertions(+), 39 deletions(-) diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json index 1246e870e0d..9f4337d0618 100644 --- a/core/src/main/resources/error/error-classes.json +++ b/core/src/main/resources/error/error-classes.json @@ -895,6 +895,11 @@ "The comparison result is null. 
If you want to handle null as 0 (equal), you can set \"spark.sql.legacy.allowNullComparisonResultInArraySort\" to \"true\"." ] }, + "NULL_MAP_KEY" : { +"message" : [ + "Cannot use null as map key." +] + }, "NUMERIC_OUT_OF_SUPPORTED_RANGE" : { "message" : [ "The value cannot be interpreted as a numeric since it has more than 38 digits." @@ -3504,11 +3509,6 @@ "class `` is not supported by `MapObjects` as resulting collection." ] }, - "_LEGACY_ERROR_TEMP_2019" : { -"message" : [ - "Cannot use null as map key!" -] - }, "_LEGACY_ERROR_TEMP_2020" : { "message" : [ "Couldn't find a valid constructor on " diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala index 5db54f7f4cf..15dfa581c59 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala @@ -444,7 +444,7 @@ private[sql] object QueryExecutionErrors extends QueryErrorsBase { def nullAsMapKeyNotAllowedError(): SparkRuntimeException = { new SparkRuntimeException( - errorClass = "_LEGACY_ERROR_TEMP_2019", + errorClass = "NULL_MAP_KEY", messageParameters = Map.empty) } diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala index 9b481b13fee..e9336405a53 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala @@ -24,7 +24,7 @@ import java.util.Arrays import scala.collection.mutable.ArrayBuffer import scala.reflect.runtime.universe.TypeTag -import org.apache.spark.SparkArithmeticException +import org.apache.spark.{SparkArithmeticException, SparkRuntimeException} import 
org.apache.spark.sql.{Encoder, Encoders} import org.apache.spark.sql.catalyst.{FooClassWithEnum, FooEnum, OptionalData, PrimitiveData, ScroogeLikeExample} import org.apache.spark.sql.catalyst.analysis.AnalysisTest @@ -539,14 +539,24 @@ class ExpressionEncoderSuite extends CodegenInterpretedPlanTest with AnalysisTes test("null check for map key: String") { val toRow = ExpressionEncoder[Map[String, Int]]().createSerializer() -val e = intercept[RuntimeException](toRow(Map(("a", 1), (null, 2 -assert(e.getMessage.contains("Cannot use null as
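The new `NULL_MAP_KEY` error is raised at runtime when a map is built with a null key, as exercised in the `ExpressionEncoderSuite` and `ArrayBasedMapBuilderSuite` changes above. A minimal analogue in Python — illustrative only, not Spark's map builder:

```python
# Hedged analogue (NOT Spark code) of the check behind NULL_MAP_KEY:
# building a map with a null key now fails with the named error class
# instead of _LEGACY_ERROR_TEMP_2019.
class SparkRuntimeException(Exception):
    def __init__(self, error_class):
        super().__init__(error_class)
        self.error_class = error_class

def build_map(pairs):
    result = {}
    for key, value in pairs:
        if key is None:
            # Message text: "Cannot use null as map key."
            raise SparkRuntimeException("NULL_MAP_KEY")
        result[key] = value   # in this sketch, the last value for a duplicate key wins
    return result

print(build_map([("a", 1), ("b", 2)]))   # {'a': 1, 'b': 2}
```

A caller of the sketch would catch `SparkRuntimeException` and inspect `error_class`, mirroring how the migrated tests assert on the error class rather than on message substrings.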
[spark] branch master updated (ed3775704bb -> e4b5eec6e27)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git from ed3775704bb [MINOR][SQL][TESTS] Restore the code style check of `QueryExecutionErrorsSuite` add e4b5eec6e27 [SPARK-38728][SQL] Test the error class: FAILED_RENAME_PATH No new revisions were added by this update. Summary of changes: .../sql/errors/QueryExecutionErrorsSuite.scala | 35 ++ 1 file changed, 35 insertions(+)
[spark] branch master updated: [MINOR][SQL][TESTS] Restore the code style check of `QueryExecutionErrorsSuite`
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new ed3775704bb [MINOR][SQL][TESTS] Restore the code style check of `QueryExecutionErrorsSuite` ed3775704bb is described below commit ed3775704bbdc9a9c479dc06565c8bf8c4d9640c Author: yangjie01 AuthorDate: Sun Nov 27 15:03:35 2022 +0300 [MINOR][SQL][TESTS] Restore the code style check of `QueryExecutionErrorsSuite` ### What changes were proposed in this pull request? https://github.com/apache/spark/blob/9af216d7ac26f0ec916833c2e80a01aef8933529/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryExecutionErrorsSuite.scala#L451-L454 As the code above shows, line 451 of `QueryExecutionErrorsSuite.scala` turns off all scalastyle checks while line 454 only turns the `throwerror` check back on, so the code after line 454 of `QueryExecutionErrorsSuite.scala` is not style-checked for anything except `throwerror`. This PR restores the code style check and fixes an existing `File line length exceeds 100 characters.` case. ### Why are the changes needed? Restore the code style check of `QueryExecutionErrorsSuite`. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Pass GitHub Actions Closes #38812 from LuciferYang/minor-checkstyle.
Authored-by: yangjie01 Signed-off-by: Max Gekk --- .../org/apache/spark/sql/errors/QueryExecutionErrorsSuite.scala | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryExecutionErrorsSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryExecutionErrorsSuite.scala index aa0f720d4de..807188bee3a 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryExecutionErrorsSuite.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryExecutionErrorsSuite.scala @@ -448,7 +448,7 @@ class QueryExecutionErrorsSuite override def getResources(name: String): java.util.Enumeration[URL] = { if (name.equals("META-INF/services/org.apache.spark.sql.sources.DataSourceRegister")) { - // scalastyle:off + // scalastyle:off throwerror throw new ServiceConfigurationError(s"Illegal configuration-file syntax: $name", new NoClassDefFoundError("org.apache.spark.sql.sources.HadoopFsRelationProvider")) // scalastyle:on throwerror @@ -632,7 +632,8 @@ class QueryExecutionErrorsSuite }, errorClass = "UNSUPPORTED_DATATYPE", parameters = Map( -"typeName" -> "StructType()[1.1] failure: 'TimestampType' expected but 'S' found\n\nStructType()\n^" +"typeName" -> + "StructType()[1.1] failure: 'TimestampType' expected but 'S' found\n\nStructType()\n^" ), sqlState = "0A000") }
[spark] branch master updated: [MINOR][SQL] Fix the pretty name of the `AnyValue` expression
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 9af216d7ac2 [MINOR][SQL] Fix the pretty name of the `AnyValue` expression 9af216d7ac2 is described below commit 9af216d7ac26f0ec916833c2e80a01aef8933529 Author: Max Gekk AuthorDate: Sun Nov 27 10:33:26 2022 +0300 [MINOR][SQL] Fix the pretty name of the `AnyValue` expression ### What changes were proposed in this pull request? In the PR, I propose to override the `prettyName` method of the `AnyValue` expression and set it to `any_value` by default, as in `FunctionRegistry`: https://github.com/apache/spark/blob/40b7d29e14cfa96984c5b0a231a75b210dd85a7e/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala#L466 ### Why are the changes needed? To avoid confusing users with a non-existent function name, and to print the correct name in errors. ### Does this PR introduce _any_ user-facing change? Yes, it could be. ### How was this patch tested? By running the affected test suite: ``` $ build/sbt "sql/testOnly *ExpressionsSchemaSuite" ``` Closes #38805 from MaxGekk/any_value-pretty-name.
Authored-by: Max Gekk Signed-off-by: Max Gekk --- .../apache/spark/sql/catalyst/expressions/aggregate/AnyValue.scala| 4 sql/core/src/test/resources/sql-functions/sql-expression-schema.md| 2 +- 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/AnyValue.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/AnyValue.scala index 47559b90e9c..9fbca1629c9 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/AnyValue.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/AnyValue.scala @@ -17,6 +17,7 @@ package org.apache.spark.sql.catalyst.expressions.aggregate +import org.apache.spark.sql.catalyst.analysis.FunctionRegistry import org.apache.spark.sql.catalyst.expressions._ import org.apache.spark.sql.catalyst.trees.UnaryLike import org.apache.spark.sql.types._ @@ -61,4 +62,7 @@ case class AnyValue(child: Expression, ignoreNulls: Boolean) override protected def withNewChildInternal(newChild: Expression): AnyValue = copy(child = newChild) override def inputTypes: Seq[AbstractDataType] = Seq(AnyDataType, BooleanType) + + override def prettyName: String = +getTagValue(FunctionRegistry.FUNC_ALIAS).getOrElse("any_value") } diff --git a/sql/core/src/test/resources/sql-functions/sql-expression-schema.md b/sql/core/src/test/resources/sql-functions/sql-expression-schema.md index 482c72679bb..8d47878de15 100644 --- a/sql/core/src/test/resources/sql-functions/sql-expression-schema.md +++ b/sql/core/src/test/resources/sql-functions/sql-expression-schema.md @@ -349,7 +349,7 @@ | org.apache.spark.sql.catalyst.expressions.XxHash64 | xxhash64 | SELECT xxhash64('Spark', array(123), 2) | struct | | org.apache.spark.sql.catalyst.expressions.Year | year | SELECT year('2016-07-30') | struct | | org.apache.spark.sql.catalyst.expressions.ZipWith | zip_with | SELECT zip_with(array(1, 2, 3), 
array('a', 'b', 'c'), (x, y) -> (y, x)) | struct>> | -| org.apache.spark.sql.catalyst.expressions.aggregate.AnyValue | any_value | SELECT any_value(col) FROM VALUES (10), (5), (20) AS tab(col) | struct | +| org.apache.spark.sql.catalyst.expressions.aggregate.AnyValue | any_value | SELECT any_value(col) FROM VALUES (10), (5), (20) AS tab(col) | struct | | org.apache.spark.sql.catalyst.expressions.aggregate.ApproximatePercentile | approx_percentile | SELECT approx_percentile(col, array(0.5, 0.4, 0.1), 100) FROM VALUES (0), (1), (2), (10) AS tab(col) | struct> | | org.apache.spark.sql.catalyst.expressions.aggregate.ApproximatePercentile | percentile_approx | SELECT percentile_approx(col, array(0.5, 0.4, 0.1), 100) FROM VALUES (0), (1), (2), (10) AS tab(col) | struct> | | org.apache.spark.sql.catalyst.expressions.aggregate.Average | avg | SELECT avg(col) FROM VALUES (1), (2), (3) AS tab(col) | struct |
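The fix above makes `AnyValue.prettyName` prefer the `FUNC_ALIAS` tag set by `FunctionRegistry` and otherwise fall back to `any_value`, instead of the class-derived lowercase name. A small Python sketch of that fallback pattern, with hypothetical names:

```python
# Sketch (assumption-level, mirroring the Scala change above) of the
# prettyName fix: prefer a FUNC_ALIAS tag set at registration time,
# otherwise fall back to the canonical "any_value".
class Expression:
    def __init__(self):
        self._tags = {}

    def set_tag(self, key, value):
        self._tags[key] = value

    def get_tag(self, key):
        return self._tags.get(key)

class AnyValue(Expression):
    def pretty_name(self):
        # Without this override, a class-derived default would print
        # a name like "anyvalue", which no SQL function actually has.
        return self.get_tag("FUNC_ALIAS") or "any_value"

e = AnyValue()
print(e.pretty_name())                    # any_value
e.set_tag("FUNC_ALIAS", "some_alias")     # hypothetical alias, for illustration
print(e.pretty_name())                    # some_alias
```

This matters mostly for error messages and generated schemas, where the expression's pretty name is what users see.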
[spark] branch master updated: [SPARK-41181][SQL] Migrate the map options errors onto error classes
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 0ae82d99d13 [SPARK-41181][SQL] Migrate the map options errors onto error classes

0ae82d99d13 is described below

commit 0ae82d99d13988086a297920d45a766115a70578
Author: panbingkun
AuthorDate: Fri Nov 25 09:03:49 2022 +0300

    [SPARK-41181][SQL] Migrate the map options errors onto error classes

    ### What changes were proposed in this pull request?
    The pr aims to migrate the map options errors onto error classes.

    ### Why are the changes needed?
    The changes improve the error framework.

    ### Does this PR introduce _any_ user-facing change?
    No.

    ### How was this patch tested?
    Pass GA.

    Closes #38730 from panbingkun/SPARK-41181.

    Authored-by: panbingkun
    Signed-off-by: Max Gekk
---
 core/src/main/resources/error/error-classes.json   | 27 +
 .../spark/sql/errors/QueryCompilationErrors.scala  |  6 +-
 .../sql-tests/results/csv-functions.sql.out        | 13 +++--
 .../sql-tests/results/json-functions.sql.out       | 12 ++--
 .../org/apache/spark/sql/JsonFunctionsSuite.scala  | 66 --
 5 files changed, 81 insertions(+), 43 deletions(-)

diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json
index 55a56712554..1246e870e0d 100644
--- a/core/src/main/resources/error/error-classes.json
+++ b/core/src/main/resources/error/error-classes.json
@@ -735,6 +735,23 @@
       "The JOIN with LATERAL correlation is not allowed because an OUTER subquery cannot correlate to its join partner. Remove the LATERAL correlation or use an INNER JOIN, or LEFT OUTER JOIN instead."
     ]
   },
+  "INVALID_OPTIONS" : {
+    "message" : [
+      "Invalid options:"
+    ],
+    "subClass" : {
+      "NON_MAP_FUNCTION" : {
+        "message" : [
+          "Must use the `map()` function for options."
+        ]
+      },
+      "NON_STRING_TYPE" : {
+        "message" : [
+          "A type of keys and values in `map()` must be string, but got ."
+        ]
+      }
+    }
+  },
   "INVALID_PANDAS_UDF_PLACEMENT" : {
     "message" : [
       "The group aggregate pandas UDF cannot be invoked together with as other, non-pandas aggregate functions."
@@ -2190,16 +2207,6 @@
       "Schema should be struct type but got ."
     ]
   },
-  "_LEGACY_ERROR_TEMP_1095" : {
-    "message" : [
-      "A type of keys and values in map() must be string, but got ."
-    ]
-  },
-  "_LEGACY_ERROR_TEMP_1096" : {
-    "message" : [
-      "Must use a map() function for options."
-    ]
-  },
   "_LEGACY_ERROR_TEMP_1097" : {
     "message" : [
       "The field for corrupt records must be string type and nullable."

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
index fa22c36f841..486bd21b844 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
@@ -1013,13 +1013,13 @@ private[sql] object QueryCompilationErrors extends QueryErrorsBase {

   def keyValueInMapNotStringError(m: CreateMap): Throwable = {
     new AnalysisException(
-      errorClass = "_LEGACY_ERROR_TEMP_1095",
-      messageParameters = Map("map" -> m.dataType.catalogString))
+      errorClass = "INVALID_OPTIONS.NON_STRING_TYPE",
+      messageParameters = Map("mapType" -> toSQLType(m.dataType)))
   }

   def nonMapFunctionNotAllowedError(): Throwable = {
     new AnalysisException(
-      errorClass = "_LEGACY_ERROR_TEMP_1096",
+      errorClass = "INVALID_OPTIONS.NON_MAP_FUNCTION",
       messageParameters = Map.empty)
   }

diff --git a/sql/core/src/test/resources/sql-tests/results/csv-functions.sql.out b/sql/core/src/test/resources/sql-tests/results/csv-functions.sql.out
index 0b5a63c28e4..200ddd837e1 100644
--- a/sql/core/src/test/resources/sql-tests/results/csv-functions.sql.out
+++ b/sql/core/src/test/resources/sql-tests/results/csv-functions.sql.out
@@ -66,7 +66,7 @@ struct<>
-- !query output
org.apache.spark.sql.AnalysisException
{
-  "errorClass" : "_LEGACY_ERROR_TEMP_1096",
+  "errorClass" : "INVALID_OPTIONS.NON_MAP_FUNCTION",
  "queryContext" : [ {
    "objectType" : "",
    "objectName" : ""
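The validation split introduced by `INVALID_OPTIONS` above can be sketched as a small standalone routine. This is an illustrative simplification, not Spark's actual implementation; the names `OptionsError`, `validateOptions`, and the type-name strings are hypothetical stand-ins, and only the two failure modes mirror the new sub-classes:

```scala
// Hypothetical sketch of the INVALID_OPTIONS validation: the options argument
// must be built with map(), and both its keys and values must be strings.
sealed trait OptionsError
case object NonMapFunction extends OptionsError                 // INVALID_OPTIONS.NON_MAP_FUNCTION
case class NonStringType(mapType: String) extends OptionsError  // INVALID_OPTIONS.NON_STRING_TYPE

def validateOptions(isCreateMap: Boolean, keyType: String, valueType: String): Option[OptionsError] =
  if (!isCreateMap) {
    Some(NonMapFunction)
  } else if (keyType != "string" || valueType != "string") {
    Some(NonStringType(s"map<$keyType, $valueType>"))
  } else {
    None
  }

// e.g. from_json(col, schema, map(1, 'a')): keys are not strings
assert(validateOptions(isCreateMap = true, "int", "string") == Some(NonStringType("map<int, string>")))
// e.g. passing named_struct(...) instead of map(...)
assert(validateOptions(isCreateMap = false, "string", "string") == Some(NonMapFunction))
assert(validateOptions(isCreateMap = true, "string", "string").isEmpty)
```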
[spark] branch master updated (074444bd71f -> 1f90e416314)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

    from 07bd71f      [SPARK-41179][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1092
    add  1f90e416314  [SPARK-41182][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1102

No new revisions were added by this update.

Summary of changes:
 core/src/main/resources/error/error-classes.json  | 10 ++--
 .../spark/sql/errors/QueryCompilationErrors.scala |  6 +--
 .../resources/sql-tests/results/extract.sql.out   | 56 +++---
 3 files changed, 35 insertions(+), 37 deletions(-)
[spark] branch master updated: [SPARK-41179][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1092
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 07bd71f [SPARK-41179][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1092

07bd71f is described below

commit 07bd71f088d1a5acb6f2ecf92d71ed06ef21
Author: panbingkun
AuthorDate: Thu Nov 24 09:17:45 2022 +0300

    [SPARK-41179][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1092

    ### What changes were proposed in this pull request?
    In the PR, I propose to assign the name `INVALID_SCHEMA` to the error class `_LEGACY_ERROR_TEMP_1092`.

    ### Why are the changes needed?
    Proper names of error classes should improve user experience with Spark SQL.

    ### Does this PR introduce _any_ user-facing change?
    No.

    ### How was this patch tested?
    Pass GA.

    Closes #38710 from panbingkun/SPARK-41179.

    Authored-by: panbingkun
    Signed-off-by: Max Gekk
---
 core/src/main/resources/error/error-classes.json        | 10 +-
 .../spark/sql/errors/QueryCompilationErrors.scala       |  4 ++--
 .../resources/sql-tests/results/csv-functions.sql.out   |  4 ++--
 .../resources/sql-tests/results/json-functions.sql.out  |  4 ++--
 .../scala/org/apache/spark/sql/CsvFunctionsSuite.scala  | 11 +++
 .../org/apache/spark/sql/DataFrameFunctionsSuite.scala  | 17 +
 .../scala/org/apache/spark/sql/JsonFunctionsSuite.scala | 17 +
 7 files changed, 48 insertions(+), 19 deletions(-)

diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json
index a89fffde51d..c58f9b9fb38 100644
--- a/core/src/main/resources/error/error-classes.json
+++ b/core/src/main/resources/error/error-classes.json
@@ -756,6 +756,11 @@
       " is not a Protobuf message type"
     ]
   },
+  "INVALID_SCHEMA" : {
+    "message" : [
+      "The expression is not a valid schema string."
+    ]
+  },
   "INVALID_SQL_SYNTAX" : {
     "message" : [
       "Invalid SQL syntax: "
@@ -2170,11 +2175,6 @@
       "Cannot read table property '' as it's corrupted.."
     ]
   },
-  "_LEGACY_ERROR_TEMP_1092" : {
-    "message" : [
-      "The expression '' is not a valid schema string."
-    ]
-  },
   "_LEGACY_ERROR_TEMP_1093" : {
     "message" : [
       "Schema should be specified in DDL format as a string literal or output of the schema_of_json/schema_of_csv functions instead of ."

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
index f52a0345bce..7772dd5e9a3 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
@@ -995,8 +995,8 @@ private[sql] object QueryCompilationErrors extends QueryErrorsBase {

   def invalidSchemaStringError(exp: Expression): Throwable = {
     new AnalysisException(
-      errorClass = "_LEGACY_ERROR_TEMP_1092",
-      messageParameters = Map("expr" -> exp.sql))
+      errorClass = "INVALID_SCHEMA",
+      messageParameters = Map("expr" -> toSQLExpr(exp)))
   }

   def schemaNotFoldableError(exp: Expression): Throwable = {

diff --git a/sql/core/src/test/resources/sql-tests/results/csv-functions.sql.out b/sql/core/src/test/resources/sql-tests/results/csv-functions.sql.out
index c2be9ed7d0b..0b5a63c28e4 100644
--- a/sql/core/src/test/resources/sql-tests/results/csv-functions.sql.out
+++ b/sql/core/src/test/resources/sql-tests/results/csv-functions.sql.out
@@ -22,9 +22,9 @@ struct<>
-- !query output
org.apache.spark.sql.AnalysisException
{
-  "errorClass" : "_LEGACY_ERROR_TEMP_1092",
+  "errorClass" : "INVALID_SCHEMA",
  "messageParameters" : {
-    "expr" : "1"
+    "expr" : "\"1\""
  },
  "queryContext" : [ {
    "objectType" : "",

diff --git a/sql/core/src/test/resources/sql-tests/results/json-functions.sql.out b/sql/core/src/test/resources/sql-tests/results/json-functions.sql.out
index 3c98cc6e856..ab1465350d8 100644
--- a/sql/core/src/test/resources/sql-tests/results/json-functions.sql.out
+++ b/sql/core/src/test/resources/sql-tests/results/json-functions.sql.out
@@ -148,9 +148,9 @@ struct<>
-- !query output
org.apache.spark.sql.AnalysisException
{
-  "errorClass" : "_LEGACY_ERROR_TEMP_1092",
+  "errorClass" : "INVALID_SCHEMA",
  "m
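The user-visible difference in the test output above is that the offending expression is now rendered through a quoting helper (`toSQLExpr` in Spark), so `"expr" : "1"` becomes `"expr" : "\"1\""`. A minimal sketch of that rendering, with `quoteExpr` as a hypothetical stand-in for the real helper:

```scala
// Hypothetical stand-in for toSQLExpr-style rendering: error messages quote
// the SQL text of the offending expression in double quotes.
def quoteExpr(sql: String): String = "\"" + sql + "\""

assert(quoteExpr("1") == "\"1\"")
assert(quoteExpr("noSuchSchema") == "\"noSuchSchema\"")
```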
[spark] branch master updated (57f3f0fdd3a -> 9f0aa27a24d)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

    from 57f3f0fdd3a  [MINOR][SQL] Fix error message for `UNEXPECTED_INPUT_TYPE`
    add  9f0aa27a24d  [SPARK-41176][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1042

No new revisions were added by this update.

Summary of changes:
 core/src/main/resources/error/error-classes.json   | 10 +++---
 .../spark/sql/errors/QueryCompilationErrors.scala  | 13 ---
 .../results/ansi/string-functions.sql.out          | 16 -
 .../results/ceil-floor-with-scale-param.sql.out    | 16 -
 .../sql-tests/results/csv-functions.sql.out        |  8 ++---
 .../sql-tests/results/json-functions.sql.out       | 32 -
 .../sql-tests/results/string-functions.sql.out     | 16 -
 .../results/table-valued-functions.sql.out         |  2 +-
 .../sql-tests/results/timestamp-ntz.sql.out        |  8 ++---
 .../resources/sql-tests/results/udaf/udaf.sql.out  |  8 ++---
 .../sql-tests/results/udf/udf-udaf.sql.out         |  8 ++---
 .../apache/spark/sql/DataFrameFunctionsSuite.scala | 37 ++-
 .../scala/org/apache/spark/sql/SQLQuerySuite.scala | 18 --
 .../apache/spark/sql/StringFunctionsSuite.scala    | 21 ---
 .../test/scala/org/apache/spark/sql/UDFSuite.scala | 41 +-
 .../spark/sql/hive/execution/HiveUDAFSuite.scala   |  9 ++---
 16 files changed, 166 insertions(+), 97 deletions(-)
[spark] branch master updated: [MINOR][SQL] Fix error message for `UNEXPECTED_INPUT_TYPE`
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 57f3f0fdd3a [MINOR][SQL] Fix error message for `UNEXPECTED_INPUT_TYPE`

57f3f0fdd3a is described below

commit 57f3f0fdd3acd136ddf4904193bfa4e7102a255c
Author: itholic
AuthorDate: Thu Nov 24 08:52:37 2022 +0300

    [MINOR][SQL] Fix error message for `UNEXPECTED_INPUT_TYPE`

    ### What changes were proposed in this pull request?
    This PR proposes to correct the minor syntax on the error message for `UNEXPECTED_INPUT_TYPE`.

    ### Why are the changes needed?
    The error message should start with an upper-case character and be clear to read.

    ### Does this PR introduce _any_ user-facing change?
    No.

    ### How was this patch tested?
    ```
    ./build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite*"
    ```

    Closes #38766 from itholic/minor-UNEXPECTED_INPUT_TYPE.

    Lead-authored-by: itholic
    Co-authored-by: Haejoon Lee <44108233+itho...@users.noreply.github.com>
    Signed-off-by: Max Gekk
---
 core/src/main/resources/error/error-classes.json | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json
index f2e7783efdd..239f43ce6e8 100644
--- a/core/src/main/resources/error/error-classes.json
+++ b/core/src/main/resources/error/error-classes.json
@@ -364,7 +364,7 @@
   },
   "UNEXPECTED_INPUT_TYPE" : {
     "message" : [
-      "parameter requires type, however, is of type."
+      "Parameter requires the type, however has the type ."
     ]
   },
   "UNEXPECTED_NULL" : {
[spark] branch master updated (c3f8c973d44 -> b77ced58b44)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

    from c3f8c973d44  [SPARK-41174][CORE][SQL] Propagate an error class to users for invalid `format` of `to_binary()`
    add  b77ced58b44  [SPARK-41131][SQL] Improve error message for `UNRESOLVED_MAP_KEY.WITHOUT_SUGGESTION`

No new revisions were added by this update.

Summary of changes:
 core/src/main/resources/error/error-classes.json | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
[spark] branch master updated: [SPARK-41174][CORE][SQL] Propagate an error class to users for invalid `format` of `to_binary()`
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new c3f8c973d44 [SPARK-41174][CORE][SQL] Propagate an error class to users for invalid `format` of `to_binary()`

c3f8c973d44 is described below

commit c3f8c973d448b4d9be7502985aededdd7b81d164
Author: yangjie01
AuthorDate: Wed Nov 23 17:25:06 2022 +0300

    [SPARK-41174][CORE][SQL] Propagate an error class to users for invalid `format` of `to_binary()`

    ### What changes were proposed in this pull request?
    This pr overrides the `checkInputDataTypes()` method of the `ToBinary` function to propagate an error class to users for an invalid `format`.

    ### Why are the changes needed?
    Migration onto error classes unifies Spark SQL error messages.

    ### Does this PR introduce _any_ user-facing change?
    Yes. The PR changes user-facing error messages.

    ### How was this patch tested?
    Pass GitHub Actions.

    Closes #38737 from LuciferYang/SPARK-41174.

    Authored-by: yangjie01
    Signed-off-by: Max Gekk
---
 core/src/main/resources/error/error-classes.json   |  5 ++
 .../catalyst/expressions/stringExpressions.scala   | 85 +++---
 .../expressions/StringExpressionsSuite.scala       | 15 
 .../sql-tests/inputs/string-functions.sql          |  4 +
 .../results/ansi/string-functions.sql.out          | 70 +++---
 .../sql-tests/results/string-functions.sql.out     | 70 +++---
 6 files changed, 204 insertions(+), 45 deletions(-)

diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json
index afe08f044c7..5bac5ae71f2 100644
--- a/core/src/main/resources/error/error-classes.json
+++ b/core/src/main/resources/error/error-classes.json
@@ -234,6 +234,11 @@
       "Input to the function cannot contain elements of the \"MAP\" type. In Spark, same maps may have different hashcode, thus hash expressions are prohibited on \"MAP\" elements. To restore previous behavior set \"spark.sql.legacy.allowHashOnMapType\" to \"true\"."
     ]
   },
+  "INVALID_ARG_VALUE" : {
+    "message" : [
+      "The value must to be a literal of , but got ."
+    ]
+  },
   "INVALID_JSON_MAP_KEY_TYPE" : {
     "message" : [
       "Input schema can only contain STRING as a key type for a MAP."

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
index 60b56f4fef7..3a1db2ce1b8 100755
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
@@ -2620,39 +2620,30 @@ case class ToBinary(
     nullOnInvalidFormat: Boolean = false) extends RuntimeReplaceable
   with ImplicitCastInputTypes {

-  override lazy val replacement: Expression = format.map { f =>
-    assert(f.foldable && (f.dataType == StringType || f.dataType == NullType))
+  @transient lazy val fmt: String = format.map { f =>
     val value = f.eval()
     if (value == null) {
-      Literal(null, BinaryType)
+      null
     } else {
-      value.asInstanceOf[UTF8String].toString.toLowerCase(Locale.ROOT) match {
-        case "hex" => Unhex(expr, failOnError = true)
-        case "utf-8" | "utf8" => Encode(expr, Literal("UTF-8"))
-        case "base64" => UnBase64(expr, failOnError = true)
-        case _ if nullOnInvalidFormat => Literal(null, BinaryType)
-        case other => throw QueryCompilationErrors.invalidStringLiteralParameter(
-          "to_binary",
-          "format",
-          other,
-          Some(
-            "The value has to be a case-insensitive string literal of " +
-            "'hex', 'utf-8', 'utf8', or 'base64'."))
-      }
+      value.asInstanceOf[UTF8String].toString.toLowerCase(Locale.ROOT)
     }
-  }.getOrElse(Unhex(expr, failOnError = true))
+  }.getOrElse("hex")
+
+  override lazy val replacement: Expression = if (fmt == null) {
+    Literal(null, BinaryType)
+  } else {
+    fmt match {
+      case "hex" => Unhex(expr, failOnError = true)
+      case "utf-8" | "utf8" => Encode(expr, Literal("UTF-8"))
+      case "base64" => UnBase64(expr, failOnError = true)
+      case _ => Literal(null, BinaryType)
+    }
+  }

  def this(expr: Expression) = this(e
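The reworked `to_binary()` format handling above can be modeled as a small pure function: the format string is evaluated once, lower-cased, defaults to `"hex"`, and an unrecognized value is reported to the caller instead of throwing ad hoc. This is a simplified sketch mirroring the shape of the patch, not Spark's exact code; `resolveFormat` is a hypothetical name:

```scala
// Simplified model of to_binary() format resolution after SPARK-41174:
// Right(fmt) is a recognized format, Left(other) is an invalid one that the
// caller maps to NULL or to an INVALID_ARG_VALUE-style error.
def resolveFormat(format: Option[String]): Either[String, String] = {
  val fmt = format.map(_.toLowerCase(java.util.Locale.ROOT)).getOrElse("hex")
  fmt match {
    case "hex" | "base64"  => Right(fmt)
    case "utf-8" | "utf8"  => Right("utf-8")
    case other             => Left(other)
  }
}

assert(resolveFormat(None) == Right("hex"))          // default when no format is given
assert(resolveFormat(Some("UTF8")) == Right("utf-8")) // case-insensitive aliases
assert(resolveFormat(Some("zip")) == Left("zip"))     // invalid: caller decides the outcome
```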
[spark] branch master updated (2dfb81f898c -> 291315853b8)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

    from 2dfb81f898c  [SPARK-41223][BUILD] Upgrade slf4j to 2.0.4
    add  291315853b8  [SPARK-41221][SQL] Add the error class `INVALID_FORMAT`

No new revisions were added by this update.

Summary of changes:
 core/src/main/resources/error/error-classes.json   | 114 ++---
 .../sql/catalyst/analysis/CheckAnalysis.scala      |   7 ++
 .../sql/catalyst/analysis/TypeCheckResult.scala    |  13 +++
 .../spark/sql/catalyst/analysis/package.scala      |   9 +-
 .../spark/sql/catalyst/util/ToNumberParser.scala   |  67 +---
 .../spark/sql/errors/QueryCompilationErrors.scala  |   9 +-
 .../expressions/RegexpExpressionsSuite.scala       |  35 ---
 .../expressions/StringExpressionsSuite.scala       | 110 ++--
 .../sql-tests/results/postgreSQL/numeric.sql.out   |  10 +-
 .../sql-tests/results/postgreSQL/strings.sql.out   |  32 +++---
 .../scala/org/apache/spark/sql/SQLQuerySuite.scala |   8 +-
 11 files changed, 211 insertions(+), 203 deletions(-)
[spark] branch master updated: [SPARK-41206][SQL][FOLLOWUP] Make result of `checkColumnNameDuplication` stable to fix `COLUMN_ALREADY_EXISTS` check failed with Scala 2.13
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new e42d3836af9 [SPARK-41206][SQL][FOLLOWUP] Make result of `checkColumnNameDuplication` stable to fix `COLUMN_ALREADY_EXISTS` check failed with Scala 2.13

e42d3836af9 is described below

commit e42d3836af9eea881868c80f3c2cbc29e1d7b4f1
Author: yangjie01
AuthorDate: Wed Nov 23 09:13:56 2022 +0300

    [SPARK-41206][SQL][FOLLOWUP] Make result of `checkColumnNameDuplication` stable to fix `COLUMN_ALREADY_EXISTS` check failed with Scala 2.13

    ### What changes were proposed in this pull request?
    This pr adds a sort before `columnAlreadyExistsError` is thrown to make the result of `SchemaUtils#checkColumnNameDuplication` stable.

    ### Why are the changes needed?
    Fix the `COLUMN_ALREADY_EXISTS` check failing with Scala 2.13.

    ### Does this PR introduce _any_ user-facing change?
    No.

    ### How was this patch tested?
    - Pass GA
    - Manual test:
    ```
    dev/change-scala-version.sh 2.13
    build/sbt clean "sql/testOnly org.apache.spark.sql.DataFrameSuite" -Pscala-2.13
    build/sbt "sql/testOnly org.apache.spark.sql.execution.datasources.json.JsonV1Suite" -Pscala-2.13
    build/sbt "sql/testOnly org.apache.spark.sql.execution.datasources.json.JsonV2Suite" -Pscala-2.13
    build/sbt "sql/testOnly org.apache.spark.sql.execution.datasources.json.JsonLegacyTimeParserSuite" -Pscala-2.13
    ```
    All tests passed.

    Closes #38764 from LuciferYang/SPARK-41206.

    Authored-by: yangjie01
    Signed-off-by: Max Gekk
---
 sql/catalyst/src/main/scala/org/apache/spark/sql/util/SchemaUtils.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/util/SchemaUtils.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/util/SchemaUtils.scala
index aac96a9b56c..d202900381a 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/util/SchemaUtils.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/util/SchemaUtils.scala
@@ -107,7 +107,7 @@ private[spark] object SchemaUtils {
     val names = if (caseSensitiveAnalysis) columnNames else columnNames.map(_.toLowerCase)
     // scalastyle:on caselocale
     if (names.distinct.length != names.length) {
-      val columnName = names.groupBy(identity).collectFirst {
+      val columnName = names.groupBy(identity).toSeq.sortBy(_._1).collectFirst {
         case (x, ys) if ys.length > 1 => x
       }.get
       throw QueryCompilationErrors.columnAlreadyExistsError(columnName)
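The underlying issue in the patch above is that `groupBy` returns an unordered `Map`, so iterating it with `collectFirst` can surface a different duplicate column name across runs and Scala versions; sorting the grouped `(name, occurrences)` pairs first pins the reported column down. A self-contained sketch of the fixed logic (the helper name `firstDuplicate` is illustrative, not Spark's):

```scala
// Sketch of why the fix sorts before collectFirst: without toSeq.sortBy,
// the "first" duplicate depends on Map iteration order, which differs
// between Scala 2.12 and 2.13. With the sort, the result is deterministic.
def firstDuplicate(names: Seq[String]): Option[String] =
  names.groupBy(identity).toSeq.sortBy(_._1).collectFirst {
    case (name, occurrences) if occurrences.length > 1 => name
  }

// Both "a" and "b" are duplicated; the sort guarantees "a" is reported.
assert(firstDuplicate(Seq("b", "a", "b", "a")) == Some("a"))
assert(firstDuplicate(Seq("x", "y")).isEmpty)
```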
[spark] branch master updated: [SPARK-40948][SQL][FOLLOWUP] Restore PATH_NOT_FOUND
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 17816170316 [SPARK-40948][SQL][FOLLOWUP] Restore PATH_NOT_FOUND

17816170316 is described below

commit 178161703161ccf49b37baf9a667630865367950
Author: itholic
AuthorDate: Wed Nov 23 08:38:20 2022 +0300

    [SPARK-40948][SQL][FOLLOWUP] Restore PATH_NOT_FOUND

    ### What changes were proposed in this pull request?
    The original PR to introduce the error class `PATH_NOT_FOUND` was reverted since it breaks the tests in a different test env. This PR proposes to restore it back.

    ### Why are the changes needed?
    Restoring the reverted changes with a proper fix.

    ### Does this PR introduce _any_ user-facing change?
    No.

    ### How was this patch tested?
    The existing CI should pass.

    Closes #38575 from itholic/SPARK-40948-followup.

    Authored-by: itholic
    Signed-off-by: Max Gekk
---
 R/pkg/tests/fulltests/test_sparkSQL.R              | 14 +---
 core/src/main/resources/error/error-classes.json   | 10 +++---
 .../spark/sql/errors/QueryCompilationErrors.scala  |  2 +-
 .../org/apache/spark/sql/DataFrameSuite.scala      | 37 --
 .../execution/datasources/DataSourceSuite.scala    | 28 +---
 5 files changed, 52 insertions(+), 39 deletions(-)

diff --git a/R/pkg/tests/fulltests/test_sparkSQL.R b/R/pkg/tests/fulltests/test_sparkSQL.R
index 534ec07abac..d2b6220b2e7 100644
--- a/R/pkg/tests/fulltests/test_sparkSQL.R
+++ b/R/pkg/tests/fulltests/test_sparkSQL.R
@@ -3990,12 +3990,16 @@ test_that("Call DataFrameWriter.load() API in Java without path and check argume
   expect_error(read.df(source = "json"),
                paste("Error in load : analysis error - Unable to infer schema for JSON.",
                      "It must be specified manually"))
-  expect_error(read.df("arbitrary_path"), "Error in load : analysis error - Path does not exist")
-  expect_error(read.json("arbitrary_path"), "Error in json : analysis error - Path does not exist")
-  expect_error(read.text("arbitrary_path"), "Error in text : analysis error - Path does not exist")
-  expect_error(read.orc("arbitrary_path"), "Error in orc : analysis error - Path does not exist")
+  expect_error(read.df("arbitrary_path"),
+               "Error in load : analysis error - \\[PATH_NOT_FOUND\\].*")
+  expect_error(read.json("arbitrary_path"),
+               "Error in json : analysis error - \\[PATH_NOT_FOUND\\].*")
+  expect_error(read.text("arbitrary_path"),
+               "Error in text : analysis error - \\[PATH_NOT_FOUND\\].*")
+  expect_error(read.orc("arbitrary_path"),
+               "Error in orc : analysis error - \\[PATH_NOT_FOUND\\].*")
   expect_error(read.parquet("arbitrary_path"),
-               "Error in parquet : analysis error - Path does not exist")
+               "Error in parquet : analysis error - \\[PATH_NOT_FOUND\\].*")
   # Arguments checking in R side.
   expect_error(read.df(path = c(3)),

diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json
index 77d155bfc21..12c97c2108a 100644
--- a/core/src/main/resources/error/error-classes.json
+++ b/core/src/main/resources/error/error-classes.json
@@ -912,6 +912,11 @@
     ],
     "sqlState" : "42000"
   },
+  "PATH_NOT_FOUND" : {
+    "message" : [
+      "Path does not exist: ."
+    ]
+  },
   "PIVOT_VALUE_DATA_TYPE_MISMATCH" : {
     "message" : [
       "Invalid pivot value '': value data type does not match pivot column data type "
@@ -2332,11 +2337,6 @@
       "Unable to infer schema for . It must be specified manually."
     ]
   },
-  "_LEGACY_ERROR_TEMP_1130" : {
-    "message" : [
-      "Path does not exist: ."
-    ]
-  },
   "_LEGACY_ERROR_TEMP_1131" : {
     "message" : [
       "Data source does not support output mode."

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
index 63c912c15a1..0f245597efd 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
@@ -1378,7 +1378,7 @@ private[sql] object QueryCompilationErrors extends QueryErrorsBase {

  def dataPathNotExistError(path: String): Throwa
[spark] branch master updated: [SPARK-41135][SQL] Rename `UNSUPPORTED_EMPTY_LOCATION` to `INVALID_EMPTY_LOCATION`
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 3bff4f6339f [SPARK-41135][SQL] Rename `UNSUPPORTED_EMPTY_LOCATION` to `INVALID_EMPTY_LOCATION`

3bff4f6339f is described below

commit 3bff4f6339f54d19362a0c03ef2b396e47881fd8
Author: itholic
AuthorDate: Tue Nov 22 13:14:13 2022 +0300

    [SPARK-41135][SQL] Rename `UNSUPPORTED_EMPTY_LOCATION` to `INVALID_EMPTY_LOCATION`

    ### What changes were proposed in this pull request?
    This PR proposes to rename `UNSUPPORTED_EMPTY_LOCATION` to `INVALID_EMPTY_LOCATION`.

    ### Why are the changes needed?
    An error class and its message should be clear/brief, and should not be ambiguously specific when it illustrates things that are possibly supported in the future.

    ### Does this PR introduce _any_ user-facing change?
    Error message changes. From
    ```
    "Unsupported empty location."
    ```
    to
    ```
    "The location name cannot be empty string, but `...` was given."
    ```

    ### How was this patch tested?
    ```
    $ build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite*"
    $ build/sbt "core/testOnly *SparkThrowableSuite"
    ```

    Closes #38650 from itholic/SPARK-41135.

    Authored-by: itholic
    Signed-off-by: Max Gekk
---
 core/src/main/resources/error/error-classes.json               | 10 +-
 .../org/apache/spark/sql/errors/QueryExecutionErrors.scala     |  6 +++---
 .../spark/sql/catalyst/analysis/ResolveSessionCatalog.scala    |  4 ++--
 .../sql/execution/datasources/v2/DataSourceV2Strategy.scala    |  4 ++--
 .../execution/command/AlterNamespaceSetLocationSuiteBase.scala |  4 ++--
 .../spark/sql/execution/command/CreateNamespaceSuiteBase.scala |  4 ++--
 6 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json
index ae76a52e40f..77d155bfc21 100644
--- a/core/src/main/resources/error/error-classes.json
+++ b/core/src/main/resources/error/error-classes.json
@@ -676,6 +676,11 @@
     ],
     "sqlState" : "42000"
   },
+  "INVALID_EMPTY_LOCATION" : {
+    "message" : [
+      "The location name cannot be empty string, but `` was given."
+    ]
+  },
   "INVALID_FIELD_NAME" : {
     "message" : [
       "Field name is invalid: is not a struct."
@@ -1181,11 +1186,6 @@
       }
     }
   },
-  "UNSUPPORTED_EMPTY_LOCATION" : {
-    "message" : [
-      "Unsupported empty location."
-    ]
-  },
   "UNSUPPORTED_FEATURE" : {
     "message" : [
       "The feature is not supported:"

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala
index 6081d9f32a5..5db54f7f4cf 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala
@@ -2806,10 +2806,10 @@ private[sql] object QueryExecutionErrors extends QueryErrorsBase {
         "size" -> elementSize.toString))
   }

-  def unsupportedEmptyLocationError(): SparkIllegalArgumentException = {
+  def invalidEmptyLocationError(location: String): SparkIllegalArgumentException = {
     new SparkIllegalArgumentException(
-      errorClass = "UNSUPPORTED_EMPTY_LOCATION",
-      messageParameters = Map.empty)
+      errorClass = "INVALID_EMPTY_LOCATION",
+      messageParameters = Map("location" -> location))
   }

   def malformedProtobufMessageDetectedInMessageParsingError(e: Throwable): Throwable = {

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala b/sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala
index d00d07150b0..d7e26b04ce4 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala
@@ -134,7 +134,7 @@ class ResolveSessionCatalog(val catalogManager: CatalogManager)
     case SetNamespaceLocation(DatabaseInSessionCatalog(db), location) if conf.useV1Command =>
       if (StringUtils.isEmpty(location)) {
-        throw QueryExecutionErrors.unsupportedEmptyLocationError()
+        throw QueryExecutionErrors.invalidEmptyLocationError(location)
       }
       AlterDatabaseSetLocationCommand(db, location)

@@ -243,7 +243,7 @@ class ResolveSessionCatalog(val catalogMana
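The shape of the renamed check can be sketched in isolation: an empty location is rejected up front, and the error now carries the offending value so the message can echo it (whereas `UNSUPPORTED_EMPTY_LOCATION` took no parameters). This is an illustrative model, not Spark's code; `InvalidEmptyLocation` and `checkNamespaceLocation` are hypothetical names:

```scala
// Illustrative model of the INVALID_EMPTY_LOCATION guard: the error carries
// the offending location value, mirroring messageParameters = Map("location" -> ...).
final case class InvalidEmptyLocation(location: String) extends IllegalArgumentException(
  s"[INVALID_EMPTY_LOCATION] The location name cannot be empty string, but `$location` was given.")

def checkNamespaceLocation(location: String): Unit =
  if (location == null || location.isEmpty) throw InvalidEmptyLocation(String.valueOf(location))

// A non-empty path passes; checkNamespaceLocation("") would throw.
checkNamespaceLocation("/warehouse/db1")
assert(scala.util.Try(checkNamespaceLocation("")).isFailure)
```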
[spark] branch master updated (40b7d29e14c -> a80899f8bef)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

    from 40b7d29e14c  [SPARK-41217][SQL] Add the error class `FAILED_FUNCTION_CALL`
    add  a80899f8bef  [SPARK-41206][SQL] Rename the error class `_LEGACY_ERROR_TEMP_1233` to `COLUMN_ALREADY_EXISTS`

No new revisions were added by this update.

Summary of changes:
 .../connect/planner/SparkConnectProtoSuite.scala   | 12 ++--
 core/src/main/resources/error/error-classes.json   | 10 +--
 .../spark/sql/catalyst/analysis/Analyzer.scala     |  3 +-
 .../sql/catalyst/analysis/CheckAnalysis.scala      |  1 -
 .../spark/sql/catalyst/analysis/ResolveUnion.scala |  2 -
 .../spark/sql/errors/QueryCompilationErrors.scala  |  8 +--
 .../apache/spark/sql/util/PartitioningUtils.scala  |  3 +-
 .../org/apache/spark/sql/util/SchemaUtils.scala    | 40 ---
 .../apache/spark/sql/util/SchemaUtilsSuite.scala   | 52 +++---
 .../main/scala/org/apache/spark/sql/Dataset.scala  |  2 -
 .../spark/sql/execution/command/CommandCheck.scala |  2 +-
 .../spark/sql/execution/command/tables.scala       |  1 -
 .../apache/spark/sql/execution/command/views.scala |  3 +-
 .../sql/execution/datasources/DataSource.scala     | 10 +--
 .../InsertIntoHadoopFsRelationCommand.scala        |  1 -
 .../execution/datasources/PartitioningUtils.scala  |  3 +-
 .../sql/execution/datasources/jdbc/JdbcUtils.scala |  8 +--
 .../spark/sql/execution/datasources/rules.scala    |  9 +--
 .../sql/execution/datasources/v2/FileTable.scala   |  6 +-
 .../sql/execution/datasources/v2/FileWrite.scala   |  3 +-
 .../spark/sql/DataFrameSetOperationsSuite.scala    | 21 +++---
 .../org/apache/spark/sql/DataFrameSuite.scala      | 67 +-
 .../apache/spark/sql/NestedDataSourceSuite.scala   |  5 +-
 .../org/apache/spark/sql/SQLInsertTestSuite.scala  |  7 +-
 .../spark/sql/StatisticsCollectionSuite.scala      | 10 +--
 .../spark/sql/connector/AlterTableTests.scala      | 34 +
 .../spark/sql/connector/DataSourceV2SQLSuite.scala | 80 --
 .../connector/V2CommandsCaseSensitivitySuite.scala | 21 --
 .../spark/sql/execution/command/DDLSuite.scala     | 80 +-
 .../datasources/jdbc/JdbcUtilsSuite.scala          |  6 +-
 .../sql/execution/datasources/json/JsonSuite.scala | 10 +--
 .../org/apache/spark/sql/jdbc/JDBCWriteSuite.scala | 10 +--
 .../spark/sql/sources/PartitionedWriteSuite.scala  |  9 ++-
 .../spark/sql/streaming/FileStreamSinkSuite.scala  | 10 +--
 .../sql/test/DataFrameReaderWriterSuite.scala      | 65 ++
 .../hive/execution/InsertIntoHiveDirCommand.scala  |  1 -
 .../org/apache/spark/sql/hive/InsertSuite.scala    |  9 ++-
 .../spark/sql/hive/execution/HiveDDLSuite.scala    | 16 +++--
 38 files changed, 326 insertions(+), 314 deletions(-)
[spark] branch master updated (d453598a428 -> 40b7d29e14c)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

    from d453598a428  [SPARK-40809][CONNECT][FOLLOW-UP] Do not use Buffer to make Scala 2.13 test pass
    add  40b7d29e14c  [SPARK-41217][SQL] Add the error class `FAILED_FUNCTION_CALL`

No new revisions were added by this update.

Summary of changes:
 core/src/main/resources/error/error-classes.json   |  5 ++
 .../sql/catalyst/analysis/FunctionRegistry.scala   |  9 +--
 .../spark/sql/errors/QueryCompilationErrors.scala  | 13 ++-
 .../results/ansi/string-functions.sql.out          | 16 +++-
 .../sql-tests/results/csv-functions.sql.out        | 93 +-
 .../resources/sql-tests/results/extract.sql.out    | 75 +++--
 .../sql-tests/results/json-functions.sql.out       | 93 +-
 .../sql-tests/results/postgreSQL/int8.sql.out      |  2 +-
 .../sql-tests/results/string-functions.sql.out     | 16 +++-
 .../results/table-valued-functions.sql.out         |  2 +-
 10 files changed, 269 insertions(+), 55 deletions(-)
[spark] branch master updated: [SPARK-41172][SQL] Migrate the ambiguous ref error to an error class
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 62f8ce40ddb [SPARK-41172][SQL] Migrate the ambiguous ref error to an error class

62f8ce40ddb is described below

commit 62f8ce40ddbf76ce86fd5e51cc73c67d66e12f48
Author: panbingkun
AuthorDate: Sat Nov 19 20:31:38 2022 +0300

    [SPARK-41172][SQL] Migrate the ambiguous ref error to an error class

    ### What changes were proposed in this pull request?
    The pr aims to migrate the ambiguous ref error to an error class.

    ### Why are the changes needed?
    The changes improve the error framework.

    ### Does this PR introduce _any_ user-facing change?
    No.

    ### How was this patch tested?
    Pass GA.

    Closes #38721 from panbingkun/SPARK-41172.

    Authored-by: panbingkun
    Signed-off-by: Max Gekk
---
 core/src/main/resources/error/error-classes.json   | 5 +
 .../spark/sql/catalyst/expressions/package.scala   | 5 +-
 .../spark/sql/errors/QueryCompilationErrors.scala  | 9 ++
 .../sql/catalyst/analysis/AnalysisSuite.scala      | 5 +-
 .../catalyst/analysis/ResolveSubquerySuite.scala   | 4 +-
 .../expressions/AttributeResolutionSuite.scala     | 30 +++--
 .../results/columnresolution-negative.sql.out      | 135 +++--
 .../sql-tests/results/postgreSQL/join.sql.out      | 30 -
 .../results/postgreSQL/select_implicit.sql.out     | 45 ++-
 .../results/udf/postgreSQL/udf-join.sql.out        | 30 -
 .../udf/postgreSQL/udf-select_implicit.sql.out     | 45 ++-
 .../spark/sql/DataFrameNaFunctionsSuite.scala      | 42 +--
 .../org/apache/spark/sql/DataFrameStatSuite.scala  | 52 ++--
 .../execution/command/PlanResolutionSuite.scala    | 22 ++--
 .../execution/datasources/orc/OrcFilterSuite.scala | 20 ++-
 15 files changed, 406 insertions(+), 73 deletions(-)

diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json
index fe340c517a2..4da9d2f9fbc 100644
--- a/core/src/main/resources/error/error-classes.json
+++ b/core/src/main/resources/error/error-classes.json
@@ -5,6 +5,11 @@
     ],
     "sqlState" : "42000"
   },
+  "AMBIGUOUS_REFERENCE" : {
+    "message" : [
+      "Reference <name> is ambiguous, could be: <referenceNames>."
+    ]
+  },
   "ARITHMETIC_OVERFLOW" : {
     "message" : [
       ". If necessary set to \"false\" to bypass this error."

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/package.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/package.scala
index 7913f396120..ededac3d917 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/package.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/package.scala
@@ -21,9 +21,9 @@ import java.util.Locale

 import com.google.common.collect.Maps

-import org.apache.spark.sql.AnalysisException
 import org.apache.spark.sql.catalyst.analysis.{Resolver, UnresolvedAttribute}
 import org.apache.spark.sql.catalyst.util.MetadataColumnHelper
+import org.apache.spark.sql.errors.QueryCompilationErrors
 import org.apache.spark.sql.types.{StructField, StructType}

 /**
@@ -368,8 +368,7 @@ package object expressions {
       case ambiguousReferences =>
         // More than one match.
-        val referenceNames = ambiguousReferences.map(_.qualifiedName).mkString(", ")
-        throw new AnalysisException(s"Reference '$name' is ambiguous, could be: $referenceNames.")
+        throw QueryCompilationErrors.ambiguousReferenceError(name, ambiguousReferences)
     }
   }
 }

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
index 22b4cfdb3c6..cbdbb6adc11 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
@@ -1834,6 +1834,15 @@ private[sql] object QueryCompilationErrors extends QueryErrorsBase {
         "n" -> numMatches.toString))
   }

+  def ambiguousReferenceError(name: String, ambiguousReferences: Seq[Attribute]): Throwable = {
+    new AnalysisException(
+      errorClass = "AMBIGUOUS_REFERENCE",
+      messageParameters = Map(
+        "name" -> toSQLId(name),
+        "referenceNames" ->
+          ambiguousReferences.map(ar => toSQLId(ar.qualifiedName)).sorted.mkString("[", ", ", "]")))
+  }
+
   def cannotUseIntervalTypeInTableSchemaError(): Throwable
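The commit above replaces a hand-built string with an error-class lookup: the message lives in `error-classes.json` as a template with `<param>` placeholders, identifiers are backtick-quoted (`toSQLId`), and the ambiguous candidates are sorted into a bracketed list. A minimal, language-agnostic sketch of that mechanism (the helper names are illustrative, not Spark's implementation):

```python
import json

# Miniature error-class registry in the style of error-classes.json.
ERROR_CLASSES = json.loads("""
{
  "AMBIGUOUS_REFERENCE" : {
    "message" : ["Reference <name> is ambiguous, could be: <referenceNames>."]
  }
}
""")

def to_sql_id(name: str) -> str:
    """Backtick-quote each part of a dotted identifier, like Spark's toSQLId."""
    return ".".join(f"`{part}`" for part in name.split("."))

def format_error(error_class: str, params: dict) -> str:
    """Render an error class by substituting <param> placeholders."""
    template = "\n".join(ERROR_CLASSES[error_class]["message"])
    for key, value in params.items():
        template = template.replace(f"<{key}>", value)
    return f"[{error_class}] {template}"

# As in ambiguousReferenceError: quote, sort, and bracket the candidates.
candidates = sorted(to_sql_id(c) for c in ["t2.name", "t1.name"])
msg = format_error("AMBIGUOUS_REFERENCE", {
    "name": to_sql_id("name"),
    "referenceNames": "[" + ", ".join(candidates) + "]",
})
print(msg)
```

Keeping the template in data rather than code is what lets later commits rename classes or reword messages without touching every call site.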
[spark] branch master updated: [SPARK-41175][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1078
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new e62596a09f3 [SPARK-41175][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1078

e62596a09f3 is described below

commit e62596a09f323bfe0f8592ba7a3c45674ce04ac6
Author: panbingkun
AuthorDate: Sat Nov 19 09:02:33 2022 +0300

    [SPARK-41175][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1078

    ### What changes were proposed in this pull request?
    In the PR, I propose to assign the name `CANNOT_LOAD_FUNCTION_CLASS` to the error class _LEGACY_ERROR_TEMP_1078.

    ### Why are the changes needed?
    Proper names of error classes should improve user experience with Spark SQL.

    ### Does this PR introduce _any_ user-facing change?
    No.

    ### How was this patch tested?
    By running the affected test suites:
    > $ build/sbt "catalyst/testOnly *SessionCatalogSuite"

    Closes #38696 from panbingkun/SPARK-41175.

    Authored-by: panbingkun
    Signed-off-by: Max Gekk
---
 core/src/main/resources/error/error-classes.json          | 10 +-
 .../apache/spark/sql/errors/QueryCompilationErrors.scala  | 4 ++--
 .../spark/sql/catalyst/catalog/SessionCatalogSuite.scala  | 15 +++
 .../test/resources/sql-tests/results/udaf/udaf.sql.out    | 4 ++--
 .../test/resources/sql-tests/results/udf/udf-udaf.sql.out | 4 ++--
 .../test/scala/org/apache/spark/sql/SQLQuerySuite.scala   | 14 ++
 .../org/apache/spark/sql/execution/command/DDLSuite.scala | 14 ++
 7 files changed, 46 insertions(+), 19 deletions(-)

diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json
index a2d9fa071d0..fe340c517a2 100644
--- a/core/src/main/resources/error/error-classes.json
+++ b/core/src/main/resources/error/error-classes.json
@@ -48,6 +48,11 @@
     ],
     "sqlState" : "42000"
   },
+  "CANNOT_LOAD_FUNCTION_CLASS" : {
+    "message" : [
+      "Cannot load class <className> when registering the function <functionName>, please make sure it is on the classpath."
+    ]
+  },
   "CANNOT_LOAD_PROTOBUF_CLASS" : {
     "message" : [
       "Could not load Protobuf class with name . ."
     ]
   },
@@ -2075,11 +2080,6 @@
       "Partition spec is invalid. ."
     ]
   },
-  "_LEGACY_ERROR_TEMP_1078" : {
-    "message" : [
-      "Can not load class '<className>' when registering the function '<func>', please make sure it is on the classpath."
-    ]
-  },
   "_LEGACY_ERROR_TEMP_1079" : {
     "message" : [
       "Resource Type '' is not supported."

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
index e6ce12756ca..22b4cfdb3c6 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
@@ -899,10 +899,10 @@ private[sql] object QueryCompilationErrors extends QueryErrorsBase {
   def cannotLoadClassWhenRegisteringFunctionError(
       className: String, func: FunctionIdentifier): Throwable = {
     new AnalysisException(
-      errorClass = "_LEGACY_ERROR_TEMP_1078",
+      errorClass = "CANNOT_LOAD_FUNCTION_CLASS",
       messageParameters = Map(
         "className" -> className,
-        "func" -> func.toString))
+        "functionName" -> toSQLId(func.toString)))
   }

   def resourceTypeNotSupportedError(resourceType: String): Throwable = {

diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalogSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalogSuite.scala
index f86d12474d6..a7254865c1e 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalogSuite.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalogSuite.scala
@@ -1477,6 +1477,21 @@ abstract class SessionCatalogSuite extends AnalysisTest with Eventually {
     assert(
       catalog.lookupFunction(
         FunctionIdentifier("temp1"), arguments) === Literal(arguments.length))
+
+    checkError(
+      exception = intercept[AnalysisException] {
+        catalog.registerFunction(
+          CatalogFunction(FunctionIdentifier("temp2", None),
+            "function_class_cannot_load", Seq.empty[FunctionResource]),
+          overrideIfExists = false,
+          None)
+      },
+      errorClass = "CANNOT_LO
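The pattern this commit formalizes — fail at registration time with a *named* error class carrying structured parameters, rather than an ad-hoc string — can be sketched outside Spark. The names below (`register_function`, `SparkError`) are illustrative stand-ins, not Spark APIs; Python's import machinery plays the role of the JVM classpath:

```python
import importlib

class SparkError(Exception):
    """Stand-in for an error-class-aware exception: class name + parameters."""
    def __init__(self, error_class: str, params: dict):
        self.error_class = error_class
        self.params = params
        super().__init__(f"[{error_class}] {params}")

def register_function(name: str, class_name: str):
    """Try to resolve the implementing class; fail with a named error class."""
    try:
        module, _, cls = class_name.rpartition(".")
        getattr(importlib.import_module(module), cls)
    except (ImportError, AttributeError, ValueError):
        raise SparkError("CANNOT_LOAD_FUNCTION_CLASS", {
            "className": class_name,
            "functionName": f"`{name}`",   # identifiers are backtick-quoted
        })

try:
    register_function("temp2", "no.such.FunctionClass")
except SparkError as e:
    print(e.error_class)  # CANNOT_LOAD_FUNCTION_CLASS
```

The structured `(error_class, params)` pair is what makes the `checkError(...)` assertion in the migrated test possible, instead of matching a rendered message string.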
[spark] branch master updated: [SPARK-41173][SQL] Move `require()` out from the constructors of string expressions
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new b96ddce77aa [SPARK-41173][SQL] Move `require()` out from the constructors of string expressions

b96ddce77aa is described below

commit b96ddce77aa3f17eb0dea95083a9ac35d6077a94
Author: yangjie01
AuthorDate: Fri Nov 18 22:14:32 2022 +0300

    [SPARK-41173][SQL] Move `require()` out from the constructors of string expressions

    ### What changes were proposed in this pull request?
    This pr aims to move `require()` out from the constructors of string expressions, include `ConcatWs` and `FormatString`. The args number checking logic moved into `checkInputDataTypes()`.

    ### Why are the changes needed?
    Migration onto error classes unifies Spark SQL error messages.

    ### Does this PR introduce _any_ user-facing change?
    Yes. The PR changes user-facing error messages.

    ### How was this patch tested?
    Pass GitHub Actions

    Closes #38705 from LuciferYang/SPARK-41173.

    Authored-by: yangjie01
    Signed-off-by: Max Gekk
---
 .../catalyst/expressions/stringExpressions.scala   | 35 +++---
 .../results/ansi/string-functions.sql.out          | 34 +++--
 .../sql-tests/results/string-functions.sql.out     | 34 +++--
 3 files changed, 95 insertions(+), 8 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
index 45bed3e2387..60b56f4fef7 100755
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
@@ -67,8 +67,6 @@ import org.apache.spark.unsafe.types.{ByteArray, UTF8String}
 case class ConcatWs(children: Seq[Expression])
   extends Expression with ImplicitCastInputTypes {

-  require(children.nonEmpty, s"$prettyName requires at least one argument.")
-
   override def prettyName: String = "concat_ws"

   /** The 1st child (separator) is str, and rest are either str or array of str. */
@@ -82,6 +80,21 @@ case class ConcatWs(children: Seq[Expression])
   override def nullable: Boolean = children.head.nullable
   override def foldable: Boolean = children.forall(_.foldable)

+  override def checkInputDataTypes(): TypeCheckResult = {
+    if (children.isEmpty) {
+      DataTypeMismatch(
+        errorSubClass = "WRONG_NUM_ARGS",
+        messageParameters = Map(
+          "functionName" -> toSQLId(prettyName),
+          "expectedNum" -> "> 0",
+          "actualNum" -> children.length.toString
+        )
+      )
+    } else {
+      super.checkInputDataTypes()
+    }
+  }
+
   override def eval(input: InternalRow): Any = {
     val flatInputs = children.flatMap { child =>
       child.eval(input) match {
@@ -1662,8 +1675,7 @@ case class StringRPad(str: Expression, len: Expression, pad: Expression = Litera
 // scalastyle:on line.size.limit
 case class FormatString(children: Expression*) extends Expression with ImplicitCastInputTypes {

-  require(children.nonEmpty, s"$prettyName() should take at least 1 argument")
-  if (!SQLConf.get.getConf(SQLConf.ALLOW_ZERO_INDEX_IN_FORMAT_STRING)) {
+  if (children.nonEmpty && !SQLConf.get.getConf(SQLConf.ALLOW_ZERO_INDEX_IN_FORMAT_STRING)) {
     checkArgumentIndexNotZero(children(0))
   }

@@ -1675,6 +1687,21 @@ case class FormatString(children: Expression*) extends Expression with ImplicitC
   override def inputTypes: Seq[AbstractDataType] =
     StringType :: List.fill(children.size - 1)(AnyDataType)

+  override def checkInputDataTypes(): TypeCheckResult = {
+    if (children.isEmpty) {
+      DataTypeMismatch(
+        errorSubClass = "WRONG_NUM_ARGS",
+        messageParameters = Map(
+          "functionName" -> toSQLId(prettyName),
+          "expectedNum" -> "> 0",
+          "actualNum" -> children.length.toString
+        )
+      )
+    } else {
+      super.checkInputDataTypes()
+    }
+  }
+
   override def eval(input: InternalRow): Any = {
     val pattern = children(0).eval(input)
     if (pattern == null) {

diff --git a/sql/core/src/test/resources/sql-tests/results/ansi/string-functions.sql.out b/sql/core/src/test/resources/sql-tests/results/ansi/string-functions.sql.out
index 5b82cfa957d..41f1922f8bd 100644
--- a/sql/core/src/test/resources/sql-tests/results/ansi/string-functions.sql.out
+++ b/sql/core/src/test/resources/sql-tests/results/ansi/string-functions.sql.out
@@ -5,7 +5,22 @@
 select concat_ws
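The key change above is *where* the arity check runs: a `require()` in the constructor throws before analysis can produce a structured error, while `checkInputDataTypes()` returns a `DataTypeMismatch` value that the analyzer turns into a proper error message. A minimal sketch of that shape (here `DataTypeMismatch` is just a plain dict, not Spark's type):

```python
def check_input_data_types(function_name: str, args: list):
    """Return a WRONG_NUM_ARGS mismatch for an empty argument list, else None.

    Returning a value (instead of raising in the constructor) lets the caller
    decide how to render and report the failure.
    """
    if not args:
        return {
            "errorSubClass": "WRONG_NUM_ARGS",
            "messageParameters": {
                "functionName": f"`{function_name}`",
                "expectedNum": "> 0",
                "actualNum": str(len(args)),
            },
        }
    return None  # success: defer to the parent class's type checks

result = check_input_data_types("concat_ws", [])
print(result["errorSubClass"])  # WRONG_NUM_ARGS
```

This is also why `FormatString`'s constructor gains the `children.nonEmpty &&` guard: the constructor must no longer assume at least one child, since the empty case is now reported later by the check.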
[spark] branch master updated: [SPARK-41166][SQL][TESTS] Check errorSubClass of DataTypeMismatch in *ExpressionSuites
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new e7520fc58e1 [SPARK-41166][SQL][TESTS] Check errorSubClass of DataTypeMismatch in *ExpressionSuites

e7520fc58e1 is described below

commit e7520fc58e18c45e43e07dc63f1f03cfd4da0fcc
Author: panbingkun
AuthorDate: Fri Nov 18 13:30:48 2022 +0300

    [SPARK-41166][SQL][TESTS] Check errorSubClass of DataTypeMismatch in *ExpressionSuites

    ### What changes were proposed in this pull request?
    The pr aims to check errorSubClass of DataTypeMismatch in `*ExpressionSuites`.

    ### Why are the changes needed?
    The changes improve the error framework.

    ### Does this PR introduce _any_ user-facing change?
    No.

    ### How was this patch tested?
    Pass GA.

    Closes #38688 from panbingkun/SPARK-41166.

    Authored-by: panbingkun
    Signed-off-by: Max Gekk
---
 .../expressions/CallMethodViaReflectionSuite.scala | 30 -
 .../sql/catalyst/expressions/CastSuiteBase.scala   | 71 ++--
 .../catalyst/expressions/CastWithAnsiOnSuite.scala | 118 +++-
 .../expressions/CollectionExpressionsSuite.scala   | 32 +-
 .../catalyst/expressions/ComplexTypeSuite.scala    | 52 -
 .../expressions/GeneratorExpressionSuite.scala     | 36 +-
 .../expressions/JsonExpressionsSuite.scala         | 14 ++-
 .../expressions/MiscExpressionsSuite.scala         | 13 ++-
 .../expressions/StringExpressionsSuite.scala       | 81 --
 .../aggregate/AggregateExpressionSuite.scala       | 121 +
 .../ApproxCountDistinctForIntervalsSuite.scala     | 26 -
 .../aggregate/ApproximatePercentileSuite.scala     | 23 +++-
 .../expressions/aggregate/PercentileSuite.scala    | 62 ++-
 13 files changed, 610 insertions(+), 69 deletions(-)

diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CallMethodViaReflectionSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CallMethodViaReflectionSuite.scala
index c8b99f6f026..e65b81ee166 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CallMethodViaReflectionSuite.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CallMethodViaReflectionSuite.scala
@@ -97,10 +97,34 @@ class CallMethodViaReflectionSuite extends SparkFunSuite with ExpressionEvalHelp
   }

   test("input type checking") {
-    assert(CallMethodViaReflection(Seq.empty).checkInputDataTypes().isFailure)
-    assert(CallMethodViaReflection(Seq(Literal(staticClassName))).checkInputDataTypes().isFailure)
+    assert(CallMethodViaReflection(Seq.empty).checkInputDataTypes() ==
+      DataTypeMismatch(
+        errorSubClass = "WRONG_NUM_ARGS",
+        messageParameters = Map(
+          "functionName" -> "`reflect`",
+          "expectedNum" -> "> 1",
+          "actualNum" -> "0")
+      )
+    )
+    assert(CallMethodViaReflection(Seq(Literal(staticClassName))).checkInputDataTypes() ==
+      DataTypeMismatch(
+        errorSubClass = "WRONG_NUM_ARGS",
+        messageParameters = Map(
+          "functionName" -> "`reflect`",
+          "expectedNum" -> "> 1",
+          "actualNum" -> "1")
+      )
+    )
     assert(CallMethodViaReflection(
-      Seq(Literal(staticClassName), Literal(1))).checkInputDataTypes().isFailure)
+      Seq(Literal(staticClassName), Literal(1))).checkInputDataTypes() ==
+      DataTypeMismatch(
+        errorSubClass = "NON_FOLDABLE_INPUT",
+        messageParameters = Map(
+          "inputName" -> "method",
+          "inputType" -> "\"STRING\"",
+          "inputExpr" -> "\"1\"")
+      )
+    )
     assert(createExpr(staticClassName, "method1").checkInputDataTypes().isSuccess)
   }

diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuiteBase.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuiteBase.scala
index a60491b0ab8..6d972a8482a 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuiteBase.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CastSuiteBase.scala
@@ -29,6 +29,7 @@ import org.apache.spark.sql.Row
 import org.apache.spark.sql.catalyst.InternalRow
 import org.apache.spark.sql.catalyst.analysis.TypeCheckResult.DataTypeMismatch
 import org.apache.spark.sql.catalyst.analysis.TypeCoercion.numericPrecedence
+import org.apache.spark.sql.catalyst.expressions.Cast._
 import org.apa
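The test migration above replaces opaque `isFailure` checks (and, in other suites, whole-string message comparisons) with structural assertions on the error class and its parameters, so tests stay stable when message wording changes. A sketch of such a helper, loosely mirroring Spark's `checkError` (the exception type here is an illustrative stand-in):

```python
class SparkError(Exception):
    """Stand-in for an exception that carries an error class and parameters."""
    def __init__(self, error_class, params):
        self.error_class, self.params = error_class, params
        super().__init__(error_class)

def check_error(exc: SparkError, error_class: str, parameters: dict):
    """Assert on structure, not on the rendered message string."""
    assert exc.error_class == error_class, exc.error_class
    assert exc.params == parameters, exc.params

err = SparkError(
    "DATATYPE_MISMATCH.WRONG_NUM_ARGS",
    {"functionName": "`reflect`", "expectedNum": "> 1", "actualNum": "0"})
check_error(
    err, "DATATYPE_MISMATCH.WRONG_NUM_ARGS",
    {"functionName": "`reflect`", "expectedNum": "> 1", "actualNum": "0"})
print("ok")
```

If only the template text in `error-classes.json` changes, this assertion still passes; a full-string comparison would have to be updated in every suite.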
[spark] branch master updated (12a77bb22f1 -> bcbc88377ff)
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

    from 12a77bb22f1 [SPARK-41107][PYTHON][INFRA][TESTS] Install memory-profiler in the CI
     add bcbc88377ff [SPARK-41130][SQL] Rename `OUT_OF_DECIMAL_TYPE_RANGE` to `NUMERIC_OUT_OF_SUPPORTED_RANGE`

No new revisions were added by this update.

Summary of changes:
 core/src/main/resources/error/error-classes.json             | 10 +-
 .../org/apache/spark/sql/errors/QueryExecutionErrors.scala   | 2 +-
 .../spark/sql/catalyst/expressions/CastWithAnsiOnSuite.scala | 4 ++--
 .../test/scala/org/apache/spark/sql/types/DecimalSuite.scala | 4 ++--
 4 files changed, 10 insertions(+), 10 deletions(-)
[spark] branch master updated (f24f8489f80 -> 23fcd25b870)
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

    from f24f8489f80 [SPARK-41106][SQL] Reduce collection conversion when create AttributeMap
     add 23fcd25b870 [SPARK-41133][SQL] Integrate `UNSCALED_VALUE_TOO_LARGE_FOR_PRECISION` into `NUMERIC_VALUE_OUT_OF_RANGE`

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/avro/AvroLogicalTypeSuite.scala      | 17 ++-
 core/src/main/resources/error/error-classes.json   | 7 +--
 .../spark/sql/errors/QueryExecutionErrors.scala    | 17 ++-
 .../scala/org/apache/spark/sql/types/Decimal.scala | 2 +-
 .../org/apache/spark/sql/types/DecimalSuite.scala  | 24 --
 5 files changed, 44 insertions(+), 23 deletions(-)
[spark] branch master updated: [SPARK-41139][SQL] Improve error class: `PYTHON_UDF_IN_ON_CLAUSE`
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new fea905acea2 [SPARK-41139][SQL] Improve error class: `PYTHON_UDF_IN_ON_CLAUSE`

fea905acea2 is described below

commit fea905acea2e8eedb10f86d4cea6565f19066023
Author: itholic
AuthorDate: Wed Nov 16 19:13:52 2022 +0300

    [SPARK-41139][SQL] Improve error class: `PYTHON_UDF_IN_ON_CLAUSE`

    ### What changes were proposed in this pull request?
    This PR proposes to improve the error message and test for `PYTHON_UDF_IN_ON_CLAUSE`.

    ### Why are the changes needed?
    The current error message is not clear enough to let the user understand and solve the problem. We can provide more information to improve the usability. Also, we should test the error class with `checkError` for better testability.

    ### Does this PR introduce _any_ user-facing change?
    The error message is improved with additional detailed information.

    From
    ```
    Python UDF in the ON clause of a JOIN.
    ```
    To
    ```
    Python UDF in the ON clause of a JOIN. In case of an INNNER JOIN consider rewriting to a CROSS JOIN with a WHERE clause.
    ```

    ### How was this patch tested?
    Manually tested for fixed test case.

    Closes #38657 from itholic/SPARK-41139.

    Authored-by: itholic
    Signed-off-by: Max Gekk
---
 core/src/main/resources/error/error-classes.json              | 2 +-
 .../optimizer/ExtractPythonUDFFromJoinConditionSuite.scala    | 8 +---
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json
index 32083c23df8..d5d6e938ad1 100644
--- a/core/src/main/resources/error/error-classes.json
+++ b/core/src/main/resources/error/error-classes.json
@@ -1248,7 +1248,7 @@
     },
     "PYTHON_UDF_IN_ON_CLAUSE" : {
       "message" : [
-        "Python UDF in the ON clause of a JOIN."
+        "Python UDF in the ON clause of a JOIN. In case of an INNNER JOIN consider rewriting to a CROSS JOIN with a WHERE clause."
       ]
     },
     "REPEATED_PIVOT" : {

diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/ExtractPythonUDFFromJoinConditionSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/ExtractPythonUDFFromJoinConditionSuite.scala
index 0b215818d36..854a3e8f7a7 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/ExtractPythonUDFFromJoinConditionSuite.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/ExtractPythonUDFFromJoinConditionSuite.scala
@@ -187,9 +187,11 @@ class ExtractPythonUDFFromJoinConditionSuite extends PlanTest {
         condition = Some(unevaluableJoinCond))
       Optimize.execute(query.analyze)
     }
-    assert(e.message ==
-      "[UNSUPPORTED_FEATURE.PYTHON_UDF_IN_ON_CLAUSE] The feature is not supported: " +
-      s"""Python UDF in the ON clause of a ${joinType.sql} JOIN.""")
+    checkError(
+      exception = e,
+      errorClass = "UNSUPPORTED_FEATURE.PYTHON_UDF_IN_ON_CLAUSE",
+      parameters = Map("joinType" -> joinType.sql)
+    )

     val query2 = testRelationLeft.join(
       testRelationRight,
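The improved message suggests rewriting `INNER JOIN ... ON udf(...)` as a `CROSS JOIN` plus a `WHERE` filter. The rewrite is valid because, for inner joins only, filtering the cross product with the join predicate yields exactly the join result. A sketch of that equivalence on plain Python data (no Spark involved; `udf_cond` stands in for the Python UDF in the ON clause):

```python
left = [1, 2, 3]
right = [2, 3, 4]
udf_cond = lambda a, b: a == b  # stand-in for a Python UDF in the ON clause

# INNER JOIN ... ON udf_cond(a, b)
inner_join = [(a, b) for a in left for b in right if udf_cond(a, b)]

# CROSS JOIN ... WHERE udf_cond(a, b)
cross_product = [(a, b) for a in left for b in right]   # CROSS JOIN
cross_then_where = [(a, b) for (a, b) in cross_product  # WHERE
                    if udf_cond(a, b)]

print(inner_join == cross_then_where)  # True
```

For outer joins the equivalence breaks (unmatched rows must be null-padded), which is why the hint is scoped to the inner-join case.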
[spark] branch master updated (0f7eaeee644 -> e3aa2fca385)
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

    from 0f7eaeee644 [SPARK-40809][CONNECT][FOLLOW] Support `alias()` in Python client
     add e3aa2fca385 [SPARK-41158][SQL][TESTS] Use `checkError()` to check `DATATYPE_MISMATCH` in `DataFrameFunctionsSuite`

No new revisions were added by this update.

Summary of changes:
 .../apache/spark/sql/DataFrameFunctionsSuite.scala | 496 +
 1 file changed, 405 insertions(+), 91 deletions(-)
[spark] branch master updated (f3400e4fdac -> b8b90ad5880)
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

    from f3400e4fdac [SPARK-40875][CONNECT][FOLLOW] Retain Group expressions in aggregate
     add b8b90ad5880 [SPARK-40755][SQL] Migrate type check failures of number formatting onto error classes

No new revisions were added by this update.

Summary of changes:
 core/src/main/resources/error/error-classes.json   | 40 
 .../expressions/numberFormatExpressions.scala      | 24 ++-
 .../spark/sql/catalyst/util/ToNumberParser.scala   | 82 +
 .../expressions/StringExpressionsSuite.scala       | 202 -
 .../sql-tests/results/postgreSQL/numeric.sql.out   | 14 +-
 5 files changed, 275 insertions(+), 87 deletions(-)
[spark] branch master updated (8806086b952 -> 8b5fee7ea03)
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

    from 8806086b952 [SPARK-41128][CONNECT][PYTHON] Implement `DataFrame.fillna` and `DataFrame.na.fill`
     add 8b5fee7ea03 [SPARK-41140][SQL] Rename the error class `_LEGACY_ERROR_TEMP_2440` to `INVALID_WHERE_CONDITION`

No new revisions were added by this update.

Summary of changes:
 core/src/main/resources/error/error-classes.json   | 13 ++--
 .../sql/catalyst/analysis/CheckAnalysis.scala      | 6 +++---
 .../sql/catalyst/analysis/AnalysisErrorSuite.scala | 7 +--
 .../resources/sql-tests/results/group-by.sql.out   | 24 +++---
 .../sql-tests/results/udf/udf-group-by.sql.out     | 18 
 5 files changed, 35 insertions(+), 33 deletions(-)
[spark] branch master updated (91af7e8f562 -> 3099e15e4c2)
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

    from 91af7e8f562 [SPARK-41072][SQL][SS] Add the error class `STREAM_FAILED` to `StreamingQueryException`
     add 3099e15e4c2 [SPARK-41137][SQL] Rename `LATERAL_JOIN_OF_TYPE` to `INVALID_LATERAL_JOIN_TYPE`

No new revisions were added by this update.

Summary of changes:
 core/src/main/resources/error/error-classes.json                 | 10 +-
 .../scala/org/apache/spark/sql/errors/QueryParsingErrors.scala   | 2 +-
 .../org/apache/spark/sql/errors/QueryParsingErrorsSuite.scala    | 3 +--
 3 files changed, 7 insertions(+), 8 deletions(-)
[spark] branch master updated (490c6dbdf86 -> 91af7e8f562)
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

    from 490c6dbdf86 [SPARK-41134][SQL] Improve error message of internal errors
     add 91af7e8f562 [SPARK-41072][SQL][SS] Add the error class `STREAM_FAILED` to `StreamingQueryException`

No new revisions were added by this update.

Summary of changes:
 core/src/main/resources/error/error-classes.json   | 5 
 project/MimaExcludes.scala                         | 5 +++-
 .../sql/execution/streaming/StreamExecution.scala  | 17 +++-
 .../sql/streaming/StreamingQueryException.scala    | 30 --
 .../sql/errors/QueryExecutionErrorsSuite.scala     | 16 +++-
 5 files changed, 62 insertions(+), 11 deletions(-)
[spark] branch master updated: [SPARK-41109][CORE][FOLLOWUP] Re-order error class to fix `SparkThrowableSuite`
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 5ff060ec5b6 [SPARK-41109][CORE][FOLLOWUP] Re-order error class to fix `SparkThrowableSuite`

5ff060ec5b6 is described below

commit 5ff060ec5b66c3c0cb74308db3b6556008b740f1
Author: yangjie01
AuthorDate: Mon Nov 14 19:21:21 2022 +0300

    [SPARK-41109][CORE][FOLLOWUP] Re-order error class to fix `SparkThrowableSuite`

    ### What changes were proposed in this pull request?
    This pr aims to fix error class order of `ESC_IN_THE_MIDDLE` and `ESC_AT_THE_END` to make GA task passed.

    ### Why are the changes needed?
    Fix GA test task failed.

    ### Does this PR introduce _any_ user-facing change?
    No

    ### How was this patch tested?
    - Pass GA
    - Manual test:
    ```
    build/sbt "core/testOnly *SparkThrowableSuite"
    ```

    **Before**
    ```
    [info] - Error classes are correctly formatted *** FAILED *** (91 milliseconds)
    [info]   "...ass" : {
    [info]     "ESC_[AT_THE_END" : {
    [info]       "message" : [
    [info]         "the escape character is not allowed to end with."
    [info]       ]
    [info]     },
    [info]     "ESC_IN_THE_MIDDLE" : {
    [info]       "message" : [
    [info]         "the escape character is not allowed to precede ]."
    [info]       ]
    [info]     }..." did not equal "...ass" : {
    [info]     "ESC_[IN_THE_MIDDLE" : {
    [info]       "message" : [
    [info]         "the escape character is not allowed to precede ."
    [info]       ]
    [info]     },
    [info]     "ESC_AT_THE_END" : {
    [info]       "message" : [
    [info]         "the escape character is not allowed to end with]."
    [info]       ]
    [info]     }..." (SparkThrowableSuite.scala:98)
    ```

    **After**
    ```
    [info] SparkThrowableSuite:
    [info] - No duplicate error classes (39 milliseconds)
    [info] - Error classes are correctly formatted (61 milliseconds)
    [info] - SQLSTATE invariants (13 milliseconds)
    [info] - Message invariants (15 milliseconds)
    [info] - Message format invariants (33 milliseconds)
    [info] - Round trip (33 milliseconds)
    [info] - Check if error class is missing (32 milliseconds)
    [info] - Check if message parameters match message format (8 milliseconds)
    [info] - Error message is formatted (1 millisecond)
    [info] - Try catching legacy SparkError (1 millisecond)
    [info] - Try catching SparkError with error class (1 millisecond)
    [info] - Try catching internal SparkError (1 millisecond)
    [info] - Get message in the specified format (15 milliseconds)
    [info] - overwrite error classes (190 milliseconds)
    [info] - prohibit dots in error class names (57 milliseconds)
    [info] Run completed in 2 seconds, 317 milliseconds.
    [info] Total number of tests run: 15
    [info] Suites: completed 1, aborted 0
    [info] Tests: succeeded 15, failed 0, canceled 0, ignored 0, pending 0
    [info] All tests passed.
    [success] Total time: 16 s, completed 2022-11-14 19:34:11
    ```

    Closes #38658 from LuciferYang/SPARK-41109-FOLLOWUP.

    Authored-by: yangjie01
    Signed-off-by: Max Gekk
---
 core/src/main/resources/error/error-classes.json | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json
index 7f43cc2deda..28e19bfdff4 100644
--- a/core/src/main/resources/error/error-classes.json
+++ b/core/src/main/resources/error/error-classes.json
@@ -643,14 +643,14 @@
       "The pattern is invalid."
     ],
     "subClass" : {
-      "ESC_IN_THE_MIDDLE" : {
+      "ESC_AT_THE_END" : {
         "message" : [
-          "the escape character is not allowed to precede ."
+          "the escape character is not allowed to end with."
         ]
       },
-      "ESC_AT_THE_END" : {
+      "ESC_IN_THE_MIDDLE" : {
         "message" : [
-          "the escape character is not allowed to end with."
+          "the escape character is not allowed to precede ."
         ]
       }
     }
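The suite failed above because the new subclass was inserted out of alphabetical order: `SparkThrowableSuite` checks, roughly, that `error-classes.json` matches its own canonical re-serialization, which keys error classes in sorted order. A small sketch of that ordering invariant (the JSON fragment is a reduced stand-in for the real file):

```python
import json

raw = """
{
  "ESC_AT_THE_END" : {"message" : ["the escape character is not allowed to end with."]},
  "ESC_IN_THE_MIDDLE" : {"message" : ["the escape character is not allowed to precede."]}
}
"""
classes = json.loads(raw)
keys = list(classes)  # json.loads preserves the document's key order
assert keys == sorted(keys), f"error classes out of order: {keys}"
print("ordering ok")
```

With the two subclasses swapped back into alphabetical order, as in the diff above, the check passes.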
[spark] branch master updated (46246c929ea -> cda7b70a81e)
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

    from 46246c929ea [SPARK-41109][SQL] Rename the error class _LEGACY_ERROR_TEMP_1216 to INVALID_LIKE_PATTERN
     add cda7b70a81e [SPARK-41098][SQL] Rename `GROUP_BY_POS_REFERS_AGG_EXPR` to `GROUP_BY_POS_AGGREGATE`

No new revisions were added by this update.

Summary of changes:
 core/src/main/resources/error/error-classes.json                  | 8 
 .../org/apache/spark/sql/errors/QueryCompilationErrors.scala      | 2 +-
 .../src/test/resources/sql-tests/results/group-by-ordinal.sql.out | 8 
 3 files changed, 9 insertions(+), 9 deletions(-)
[spark] branch master updated (6e0202ea8c8 -> 46246c929ea)
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

    from 6e0202ea8c8 [SPARK-41126][K8S] `entrypoint.sh` should use its WORKDIR instead of `/tmp` directory
     add 46246c929ea [SPARK-41109][SQL] Rename the error class _LEGACY_ERROR_TEMP_1216 to INVALID_LIKE_PATTERN

No new revisions were added by this update.

Summary of changes:
 core/src/main/resources/error/error-classes.json   | 22 +++
 .../spark/sql/catalyst/util/StringUtils.scala      |  8 +++---
 .../spark/sql/errors/QueryCompilationErrors.scala  | 13 +++--
 .../sql-tests/results/postgreSQL/strings.sql.out   | 32 +++---
 .../scala/org/apache/spark/sql/SQLQuerySuite.scala | 20 --
 5 files changed, 64 insertions(+), 31 deletions(-)
[spark] branch master updated (4f614b3f699 -> 7cbf7dd148d)
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

    from 4f614b3f699 [SPARK-41005][CONNECT][FOLLOWUP] Collect should use `submitJob` instead of `runJob`
     add 7cbf7dd148d [SPARK-40372][SQL] Migrate failures of array type checks onto error classes

No new revisions were added by this update.

Summary of changes:
 core/src/main/resources/error/error-classes.json   |  15 +-
 .../expressions/collectionOperations.scala         | 203 +++---
 .../resources/sql-tests/results/ansi/array.sql.out |  20 +-
 .../resources/sql-tests/results/ansi/map.sql.out   |   4 +-
 .../test/resources/sql-tests/results/array.sql.out |  20 +-
 .../test/resources/sql-tests/results/map.sql.out   |   4 +-
 .../apache/spark/sql/DataFrameFunctionsSuite.scala | 408 +++--
 7 files changed, 486 insertions(+), 188 deletions(-)
[spark] branch master updated: [SPARK-41044][SQL] Convert DATATYPE_MISMATCH.UNSPECIFIED_FRAME to INTERNAL_ERROR
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new c80de3b67ae [SPARK-41044][SQL] Convert DATATYPE_MISMATCH.UNSPECIFIED_FRAME to INTERNAL_ERROR c80de3b67ae is described below commit c80de3b67ae05d8c17d9afef9655ad2e76bfd05f Author: panbingkun AuthorDate: Fri Nov 11 12:27:28 2022 +0300 [SPARK-41044][SQL] Convert DATATYPE_MISMATCH.UNSPECIFIED_FRAME to INTERNAL_ERROR ### What changes were proposed in this pull request? The pr aims to convert DATATYPE_MISMATCH.UNSPECIFIED_FRAME to INTERNAL_ERROR. ### Why are the changes needed? 1. When I work on https://issues.apache.org/jira/browse/SPARK-41021, I can't found the path to trigger it from the user's perspective, then we should convert it to an internal error.(https://github.com/apache/spark/pull/38520/files#r1015171962) 2. The changes improve the error framework. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? 1. Update existed UT. 2. Pass GA. Closes #38555 from panbingkun/convert_UNSPECIFIED_FRAME_to_INNER_ERROR. Authored-by: panbingkun Signed-off-by: Max Gekk --- core/src/main/resources/error/error-classes.json | 5 - .../spark/sql/catalyst/expressions/windowExpressions.scala | 8 +--- .../catalyst/analysis/ExpressionTypeCheckingSuite.scala| 14 -- 3 files changed, 13 insertions(+), 14 deletions(-) diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json index 63978e6be66..5d1fdbbdc05 100644 --- a/core/src/main/resources/error/error-classes.json +++ b/core/src/main/resources/error/error-classes.json @@ -349,11 +349,6 @@ "cannot find a static method that matches the argument types in " ] }, - "UNSPECIFIED_FRAME" : { -"message" : [ - "Cannot use an UnspecifiedFrame. This should have been converted during analysis." 
-] - }, "UNSUPPORTED_INPUT_TYPE" : { "message" : [ "The input of can't be type data." diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala index 353ab22b5a5..c32bf4d4d45 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala @@ -19,6 +19,7 @@ package org.apache.spark.sql.catalyst.expressions import java.util.Locale +import org.apache.spark.SparkException import org.apache.spark.sql.catalyst.analysis.{TypeCheckResult, UnresolvedException} import org.apache.spark.sql.catalyst.analysis.TypeCheckResult.{DataTypeMismatch, TypeCheckSuccess} import org.apache.spark.sql.catalyst.dsl.expressions._ @@ -57,8 +58,8 @@ case class WindowSpecDefinition( frameSpecification = newChildren.last.asInstanceOf[WindowFrame]) override lazy val resolved: Boolean = -childrenResolved && checkInputDataTypes().isSuccess && - frameSpecification.isInstanceOf[SpecifiedWindowFrame] +childrenResolved && frameSpecification.isInstanceOf[SpecifiedWindowFrame] && + checkInputDataTypes().isSuccess override def nullable: Boolean = true override def dataType: DataType = throw QueryExecutionErrors.dataTypeOperationUnsupportedError @@ -66,7 +67,8 @@ case class WindowSpecDefinition( override def checkInputDataTypes(): TypeCheckResult = { frameSpecification match { case UnspecifiedFrame => -DataTypeMismatch(errorSubClass = "UNSPECIFIED_FRAME") +throw SparkException.internalError("Cannot use an UnspecifiedFrame. 
" + + "This should have been converted during analysis.") case f: SpecifiedWindowFrame if f.frameType == RangeFrame && !f.isUnbounded && orderSpec.isEmpty => DataTypeMismatch(errorSubClass = "RANGE_FRAME_WITHOUT_ORDER") diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ExpressionTypeCheckingSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ExpressionTypeCheckingSuite.scala index 256cf439b65..d406ec8f74a 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ExpressionTypeCheckingSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ExpressionTypeCheckingSuite.scala @@ -17,7 +17,7 @@ package org.apache.s
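The rule this commit applies — a state that user input can never reach becomes an internal error, while a condition the user can trigger stays a user-facing error class — can be sketched in Python. The classes and function below are illustrative stand-ins, not Spark's actual API:

```python
class SparkInternalError(Exception):
    """Stand-in for SparkException.internalError: a bug in the engine, not a user mistake."""

class DataTypeMismatch(Exception):
    """Stand-in for a user-facing DATATYPE_MISMATCH error class."""

def check_window_frame(frame, order_spec=()):
    if frame == "UnspecifiedFrame":
        # Analysis should have rewritten this frame; reaching it means a planner bug,
        # so it is reported as an internal error rather than a data-type mismatch.
        raise SparkInternalError("Cannot use an UnspecifiedFrame. "
                                 "This should have been converted during analysis.")
    if frame == "RangeFrame" and not order_spec:
        # A user *can* write a RANGE frame without ORDER BY, so this stays user-facing.
        raise DataTypeMismatch("RANGE_FRAME_WITHOUT_ORDER")
    return "ok"
```

The `resolved` reordering in the diff serves the same goal: `frameSpecification.isInstanceOf[SpecifiedWindowFrame]` is checked before `checkInputDataTypes()`, so the internal-error branch is only hit when resolution genuinely went wrong.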
[spark] branch master updated: [SPARK-41095][SQL] Convert unresolved operators to internal errors
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 38897b1ce96 [SPARK-41095][SQL] Convert unresolved operators to internal errors 38897b1ce96 is described below commit 38897b1ce96a1e24629875619c165b4fd1fa2d8f Author: Max Gekk AuthorDate: Fri Nov 11 09:37:10 2022 +0300 [SPARK-41095][SQL] Convert unresolved operators to internal errors ### What changes were proposed in this pull request? In the PR, I propose to interpret the `unresolved operator` issue as an internal error, and throw `SparkException` w/ the error class `INTERNAL_ERROR`. ### Why are the changes needed? The issues that leads to `unresolved operator` should be solved earlier w/ proper user-facing errors. If we reach the point when we cannot resolve an operator, we should interpret this as Spark SQL internal error. ### Does this PR introduce _any_ user-facing change? No, in most regular cases. ### How was this patch tested? By running the affected and modified test suites: ``` $ PYSPARK_PYTHON=python3 build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite" $ build/sbt "test:testOnly *AnalysisErrorSuite" ``` Closes #38582 from MaxGekk/unresolved-op-internal-error. 
Authored-by: Max Gekk Signed-off-by: Max Gekk --- core/src/main/resources/error/error-classes.json | 5 - core/src/main/scala/org/apache/spark/SparkException.scala| 10 -- .../apache/spark/sql/catalyst/analysis/CheckAnalysis.scala | 8 +--- .../spark/sql/catalyst/analysis/AnalysisErrorSuite.scala | 12 4 files changed, 21 insertions(+), 14 deletions(-) diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json index 2b626ba5761..63978e6be66 100644 --- a/core/src/main/resources/error/error-classes.json +++ b/core/src/main/resources/error/error-classes.json @@ -5107,11 +5107,6 @@ "Invalid expressions: []" ] }, - "_LEGACY_ERROR_TEMP_2442" : { -"message" : [ - "unresolved operator " -] - }, "_LEGACY_ERROR_TEMP_2443" : { "message" : [ "Multiple definitions of observed metrics named '': " diff --git a/core/src/main/scala/org/apache/spark/SparkException.scala b/core/src/main/scala/org/apache/spark/SparkException.scala index 03938444e12..2f05b2ad6a7 100644 --- a/core/src/main/scala/org/apache/spark/SparkException.scala +++ b/core/src/main/scala/org/apache/spark/SparkException.scala @@ -68,11 +68,17 @@ class SparkException( } object SparkException { - def internalError(msg: String): SparkException = { + def internalError(msg: String, context: Array[QueryContext], summary: String): SparkException = { new SparkException( errorClass = "INTERNAL_ERROR", messageParameters = Map("message" -> msg), - cause = null) + cause = null, + context, + summary) + } + + def internalError(msg: String): SparkException = { +internalError(msg, context = Array.empty[QueryContext], summary = "") } def internalError(msg: String, cause: Throwable): SparkException = { diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala index f88ef522f34..285c7396124 100644 --- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala @@ -18,6 +18,7 @@ package org.apache.spark.sql.catalyst.analysis import scala.collection.mutable +import org.apache.spark.SparkException import org.apache.spark.sql.AnalysisException import org.apache.spark.sql.catalyst.expressions._ import org.apache.spark.sql.catalyst.expressions.SubExprUtils._ @@ -713,9 +714,10 @@ trait CheckAnalysis extends PredicateHelper with LookupCatalog with QueryErrorsB extendedCheckRules.foreach(_(plan)) plan.foreachUp { case o if !o.resolved => -o.failAnalysis( - errorClass = "_LEGACY_ERROR_TEMP_2442", - messageParameters = Map("operator" -> o.simpleString(SQLConf.get.maxToStringFields))) +throw SparkException.internalError( + msg = s"Found the unresolved operator: ${o.simpleString(SQLConf.get.maxToStringFields)}", + context = o.origin.getQueryContext, + summary = o.origin.context.summary) case _ => } diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisErrorSuite.scala b/sql/catalyst/src/test/
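As the diff shows, `SparkException.internalError` is itself built on the error-class machinery: `INTERNAL_ERROR` is an error class whose single message parameter is the message text. A minimal Python sketch of that substitution (the registry fragment is illustrative):

```python
# Illustrative registry fragment: INTERNAL_ERROR's template is just its message.
ERROR_CLASSES = {
    "INTERNAL_ERROR": "<message>",
}

def internal_error(msg):
    # Mirrors SparkException.internalError: fill the <message> parameter and
    # prefix the error class name, as SparkThrowable messages are rendered.
    template = ERROR_CLASSES["INTERNAL_ERROR"].replace("<message>", msg)
    return Exception(f"[INTERNAL_ERROR] {template}")

err = internal_error("Found the unresolved operator: 'Project ['a]")
```

The new overload in the commit additionally threads a `QueryContext` array and summary through, so internal errors can still point at the offending SQL fragment.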
[spark] branch master updated (4a36c151ea1 -> 649c87780e5)
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

    from 4a36c151ea1 [SPARK-41108][CONNECT] Control the max size of arrow batch
     add 649c87780e5 [SPARK-41059][SQL] Rename `_LEGACY_ERROR_TEMP_2420` to `NESTED_AGGREGATE_FUNCTION`

No new revisions were added by this update.

Summary of changes:
 core/src/main/resources/error/error-classes.json         | 10 +-
 .../spark/sql/catalyst/analysis/CheckAnalysis.scala      |  2 +-
 .../spark/sql/catalyst/analysis/AnalysisErrorSuite.scala | 16 +---
 .../src/test/resources/sql-tests/results/pivot.sql.out   |  2 +-
 .../results/postgreSQL/aggregates_part3.sql.out          |  2 +-
 .../results/udf/postgreSQL/udf-aggregates_part3.sql.out  |  2 +-
 .../resources/sql-tests/results/udf/udf-pivot.sql.out    |  2 +-
 7 files changed, 19 insertions(+), 17 deletions(-)
[spark] branch master updated: [SPARK-41055][SQL] Rename `_LEGACY_ERROR_TEMP_2424` to `GROUP_BY_AGGREGATE`
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 6aac34315de [SPARK-41055][SQL] Rename `_LEGACY_ERROR_TEMP_2424` to `GROUP_BY_AGGREGATE` 6aac34315de is described below commit 6aac34315de2ee3d48fe2e1819a02600b3b22d22 Author: itholic AuthorDate: Thu Nov 10 19:30:11 2022 +0300 [SPARK-41055][SQL] Rename `_LEGACY_ERROR_TEMP_2424` to `GROUP_BY_AGGREGATE` ### What changes were proposed in this pull request? This PR proposes to rename `_LEGACY_ERROR_TEMP_2424` to `GROUP_BY_AGGREGATE` ### Why are the changes needed? To use proper error class name. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? ``` ./build/sbt “sql/testOnly org.apache.spark.sql.SQLQueryTestSuite*” ``` Closes #38569 from itholic/SPARK-41055. Lead-authored-by: itholic Co-authored-by: Haejoon Lee <44108233+itho...@users.noreply.github.com> Signed-off-by: Max Gekk --- core/src/main/resources/error/error-classes.json | 10 +- .../spark/sql/catalyst/analysis/CheckAnalysis.scala| 2 +- .../test/resources/sql-tests/results/group-by.sql.out | 2 +- .../sql-tests/results/udf/udf-group-by.sql.out | 2 +- .../org/apache/spark/sql/DataFrameAggregateSuite.scala | 11 +++ .../org/apache/spark/sql/DataFramePivotSuite.scala | 18 ++ 6 files changed, 25 insertions(+), 20 deletions(-) diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json index 7c33c1059ae..dcc6effb30f 100644 --- a/core/src/main/resources/error/error-classes.json +++ b/core/src/main/resources/error/error-classes.json @@ -469,6 +469,11 @@ "Grouping sets size cannot be greater than " ] }, + "GROUP_BY_AGGREGATE" : { +"message" : [ + "Aggregate functions are not allowed in GROUP BY, but found ." 
+] + }, "GROUP_BY_POS_OUT_OF_RANGE" : { "message" : [ "GROUP BY position is not in select list (valid range is [1, ])." @@ -5008,11 +5013,6 @@ "Correlated scalar subquery '' is neither present in the group by, nor in an aggregate function. Add it to group by using ordinal position or wrap it in first() (or first_value) if you don't care which value you get." ] }, - "_LEGACY_ERROR_TEMP_2424" : { -"message" : [ - "aggregate functions are not allowed in GROUP BY, but found " -] - }, "_LEGACY_ERROR_TEMP_2425" : { "message" : [ "expression cannot be used as a grouping expression because its data type is not an orderable data type." diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala index 9e41bcebe47..1ce1fcd0144 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala @@ -413,7 +413,7 @@ trait CheckAnalysis extends PredicateHelper with LookupCatalog with QueryErrorsB def checkValidGroupingExprs(expr: Expression): Unit = { if (expr.exists(_.isInstanceOf[AggregateExpression])) { expr.failAnalysis( - errorClass = "_LEGACY_ERROR_TEMP_2424", + errorClass = "GROUP_BY_AGGREGATE", messageParameters = Map("sqlExpr" -> expr.sql)) } diff --git a/sql/core/src/test/resources/sql-tests/results/group-by.sql.out b/sql/core/src/test/resources/sql-tests/results/group-by.sql.out index 6ccc0c34ff0..1075a6ab887 100644 --- a/sql/core/src/test/resources/sql-tests/results/group-by.sql.out +++ b/sql/core/src/test/resources/sql-tests/results/group-by.sql.out @@ -213,7 +213,7 @@ struct<> -- !query output org.apache.spark.sql.AnalysisException { - "errorClass" : "_LEGACY_ERROR_TEMP_2424", + "errorClass" : "GROUP_BY_AGGREGATE", "messageParameters" : { "sqlExpr" : "count(testdata.b)" }, diff --git 
a/sql/core/src/test/resources/sql-tests/results/udf/udf-group-by.sql.out b/sql/core/src/test/resources/sql-tests/results/udf/udf-group-by.sql.out index 4d336adc412..093cdcac25a 100644 --- a/sql/core/src/test/resources/sql-tests/results/udf/udf-group-by.sql.out +++ b/sql/core/src/test/resources/sql-tests/results/udf/udf-group-by.sql.out @@ -190,7 +190,7 @@ struct<> -- !query output org.apa
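The check behind the renamed `GROUP_BY_AGGREGATE` class walks each grouping expression's tree and rejects it if any node is an aggregate, as `CheckAnalysis.checkValidGroupingExprs` does with `expr.exists(_.isInstanceOf[AggregateExpression])`. A minimal Python sketch of that traversal (toy classes, not Catalyst's):

```python
class Expr:
    """Toy expression node with Catalyst-style exists() traversal."""
    def __init__(self, sql, *children):
        self.sql, self.children = sql, children

    def exists(self, pred):
        # True if the predicate holds for this node or any descendant.
        return pred(self) or any(c.exists(pred) for c in self.children)

class AggregateExpression(Expr):
    pass

def check_valid_grouping_expr(expr):
    if expr.exists(lambda e: isinstance(e, AggregateExpression)):
        raise ValueError("[GROUP_BY_AGGREGATE] Aggregate functions are not "
                         f"allowed in GROUP BY, but found {expr.sql}.")
```

So `GROUP BY a` passes, while `GROUP BY count(b)` — or any expression that merely contains an aggregate, like `a + count(b)` — fails with the new error class, matching the updated `group-by.sql.out` expectation.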
[spark] branch master updated: [SPARK-41092][SQL] Do not use identifier to match interval units
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 012d99d3a3d [SPARK-41092][SQL] Do not use identifier to match interval units 012d99d3a3d is described below commit 012d99d3a3d92819d5a26cddcfd566c46380b952 Author: Wenchen Fan AuthorDate: Thu Nov 10 11:36:21 2022 +0300 [SPARK-41092][SQL] Do not use identifier to match interval units ### What changes were proposed in this pull request? The current antlr-based SQL parser is pretty fragile due to the fact that we make the antlr parser rule pretty flexible and push more parsing logic to the Scala side (`AstBuilder`). A tiny change to the antlr parser rule may break the parser unexpectedly. As an example, in https://github.com/apache/spark/pull/38404 , we added a new parser rule to extend the INSERT syntax, and it breaks interval literal. `select b + interval '1 month' from values (1, 1)` can't be parsed after https://g [...] This PR makes the interval literal parser rule stricter. Now it lists all the allowed interval units instead of matching an identifier. This fixes the issue we hit in https://github.com/apache/spark/pull/38404 . In the future, we can revisit other parser rules and try to rely on antlr more to do the parsing work. ### Why are the changes needed? fix parser issues we hit in https://github.com/apache/spark/pull/38404 ### Does this PR introduce _any_ user-facing change? The error message is changed a little bit for `SELECT INTERVAL 1 wrong_unit`. ### How was this patch tested? existing tests Closes #38583 from cloud-fan/parser. 
Authored-by: Wenchen Fan Signed-off-by: Max Gekk --- docs/sql-ref-ansi-compliance.md| 11 +++ .../spark/sql/catalyst/parser/SqlBaseLexer.g4 | 11 +++ .../spark/sql/catalyst/parser/SqlBaseParser.g4 | 36 -- .../sql-tests/results/ansi/interval.sql.out| 15 +++-- .../resources/sql-tests/results/interval.sql.out | 15 +++-- 5 files changed, 66 insertions(+), 22 deletions(-) diff --git a/docs/sql-ref-ansi-compliance.md b/docs/sql-ref-ansi-compliance.md index a59d145d551..1501e14c604 100644 --- a/docs/sql-ref-ansi-compliance.md +++ b/docs/sql-ref-ansi-compliance.md @@ -407,6 +407,7 @@ Below is a list of all the keywords in Spark SQL. |DATEADD|non-reserved|non-reserved|non-reserved| |DATEDIFF|non-reserved|non-reserved|non-reserved| |DAY|non-reserved|non-reserved|non-reserved| +|DAYS|non-reserved|non-reserved|non-reserved| |DAYOFYEAR|non-reserved|non-reserved|non-reserved| |DBPROPERTIES|non-reserved|non-reserved|non-reserved| |DEFAULT|non-reserved|non-reserved|non-reserved| @@ -456,6 +457,7 @@ Below is a list of all the keywords in Spark SQL. |GROUPING|non-reserved|non-reserved|reserved| |HAVING|reserved|non-reserved|reserved| |HOUR|non-reserved|non-reserved|non-reserved| +|HOURS|non-reserved|non-reserved|non-reserved| |IF|non-reserved|non-reserved|not a keyword| |IGNORE|non-reserved|non-reserved|non-reserved| |IMPORT|non-reserved|non-reserved|non-reserved| @@ -495,13 +497,19 @@ Below is a list of all the keywords in Spark SQL. 
|MATCHED|non-reserved|non-reserved|non-reserved| |MERGE|non-reserved|non-reserved|non-reserved| |MICROSECOND|non-reserved|non-reserved|non-reserved| +|MICROSECONDS|non-reserved|non-reserved|non-reserved| |MILLISECOND|non-reserved|non-reserved|non-reserved| +|MILLISECONDS|non-reserved|non-reserved|non-reserved| |MINUTE|non-reserved|non-reserved|non-reserved| +|MINUTES|non-reserved|non-reserved|non-reserved| |MINUS|non-reserved|strict-non-reserved|non-reserved| |MONTH|non-reserved|non-reserved|non-reserved| +|MONTHS|non-reserved|non-reserved|non-reserved| |MSCK|non-reserved|non-reserved|non-reserved| |NAMESPACE|non-reserved|non-reserved|non-reserved| |NAMESPACES|non-reserved|non-reserved|non-reserved| +|NANOSECOND|non-reserved|non-reserved|non-reserved| +|NANOSECONDS|non-reserved|non-reserved|non-reserved| |NATURAL|reserved|strict-non-reserved|reserved| |NO|non-reserved|non-reserved|reserved| |NOT|reserved|non-reserved|reserved| @@ -565,6 +573,7 @@ Below is a list of all the keywords in Spark SQL. |SCHEMA|non-reserved|non-reserved|non-reserved| |SCHEMAS|non-reserved|non-reserved|non-reserved| |SECOND|non-reserved|non-reserved|non-reserved| +|SECONDS|non-reserved|non-reserved|non-reserved| |SELECT|reserved|non-reserved|reserved| |SEMI|non-reserved|strict-non-reserved|non-reserved| |SEPARATED|non-reserved|non-reserved|non-reserved| @@ -631,10 +640,12 @@ Below is a list of all the keywords in Spark SQL. |VIEW|non-reserved|non-reserved|non-reserved| |VIEWS|non-reserved|non-reserved|non-reserved| |WEEK|non-reserved|non-reserved|non-reserved| +|WEEKS|non-reserv
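The core idea of the commit — list every allowed interval unit in the grammar instead of matching an arbitrary identifier, so that bad units fail at parse time — can be sketched with a tiny hand-written parser. The unit set below is assembled from the keywords added in the diff and is illustrative, not Spark's full grammar:

```python
# Strict unit matching: anything outside this set is a parse error, instead of
# being accepted as an identifier and rejected later (or silently misparsed).
INTERVAL_UNITS = {
    "year", "years", "month", "months", "week", "weeks", "day", "days",
    "hour", "hours", "minute", "minutes", "second", "seconds",
    "millisecond", "milliseconds", "microsecond", "microseconds",
    "nanosecond", "nanoseconds",
}

def parse_interval(literal):
    value, unit = literal.split()
    if unit.lower() not in INTERVAL_UNITS:
        raise SyntaxError(f"unknown interval unit '{unit}'")
    return int(value), unit.lower()
```

With an identifier-based rule, `interval 1 wrong_unit` only fails once the Scala-side `AstBuilder` inspects the token; with an explicit keyword list, the parser itself rejects it, which is why the error message for `SELECT INTERVAL 1 wrong_unit` changes slightly.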
[spark] branch master updated: [SPARK-41038][SQL] Rename `MULTI_VALUE_SUBQUERY_ERROR` to `SCALAR_SUBQUERY_TOO_MANY_ROWS`
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 0205478b9d3 [SPARK-41038][SQL] Rename `MULTI_VALUE_SUBQUERY_ERROR` to `SCALAR_SUBQUERY_TOO_MANY_ROWS` 0205478b9d3 is described below commit 0205478b9d35d62450fd7c9ade520087fd2979a7 Author: itholic AuthorDate: Wed Nov 9 19:14:32 2022 +0300 [SPARK-41038][SQL] Rename `MULTI_VALUE_SUBQUERY_ERROR` to `SCALAR_SUBQUERY_TOO_MANY_ROWS` ### What changes were proposed in this pull request? This PR proposes to rename the `MULTI_VALUE_SUBQUERY_ERROR` to `SCALAR_SUBQUERY_TOO_MANY_ROWS`. ### Why are the changes needed? The current error class name `MULTI_VALUE_SUBQUERY_ERROR` is not clear enough to brief the error situation. `SCALAR_SUBQUERY_TOO_MANY_ROWS` would be more readable since the "scalar subquery" is the industrial term. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? ``` ./build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite*" ``` Closes #38551 from itholic/SPARK-41038. 
Authored-by: itholic Signed-off-by: Max Gekk --- core/src/main/resources/error/error-classes.json | 10 +- .../org/apache/spark/sql/errors/QueryExecutionErrors.scala | 2 +- .../subquery/scalar-subquery/scalar-subquery-select.sql.out| 2 +- .../org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala | 4 ++-- .../apache/spark/sql/errors/QueryExecutionErrorsSuite.scala| 4 ++-- 5 files changed, 11 insertions(+), 11 deletions(-) diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json index 9c914b86bb1..7c33c1059ae 100644 --- a/core/src/main/resources/error/error-classes.json +++ b/core/src/main/resources/error/error-classes.json @@ -690,11 +690,6 @@ "Not allowed to implement multiple UDF interfaces, UDF class " ] }, - "MULTI_VALUE_SUBQUERY_ERROR" : { -"message" : [ - "More than one row returned by a subquery used as an expression." -] - }, "NON_LAST_MATCHED_CLAUSE_OMIT_CONDITION" : { "message" : [ "When there are more than one MATCHED clauses in a MERGE statement, only the last MATCHED clause can omit the condition." @@ -878,6 +873,11 @@ ], "sqlState" : "42000" }, + "SCALAR_SUBQUERY_TOO_MANY_ROWS" : { +"message" : [ + "More than one row returned by a subquery used as an expression." 
+] + }, "SCHEMA_ALREADY_EXISTS" : { "message" : [ "Cannot create schema because it already exists.", diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala index 73664e64c22..828f52fe71d 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala @@ -2766,7 +2766,7 @@ private[sql] object QueryExecutionErrors extends QueryErrorsBase { def multipleRowSubqueryError(context: SQLQueryContext): Throwable = { new SparkException( - errorClass = "MULTI_VALUE_SUBQUERY_ERROR", + errorClass = "SCALAR_SUBQUERY_TOO_MANY_ROWS", messageParameters = Map.empty, cause = null, context = getQueryContext(context), diff --git a/sql/core/src/test/resources/sql-tests/results/subquery/scalar-subquery/scalar-subquery-select.sql.out b/sql/core/src/test/resources/sql-tests/results/subquery/scalar-subquery/scalar-subquery-select.sql.out index 38ab365ef69..0012251d7eb 100644 --- a/sql/core/src/test/resources/sql-tests/results/subquery/scalar-subquery/scalar-subquery-select.sql.out +++ b/sql/core/src/test/resources/sql-tests/results/subquery/scalar-subquery/scalar-subquery-select.sql.out @@ -424,7 +424,7 @@ struct<> -- !query output org.apache.spark.SparkException { - "errorClass" : "MULTI_VALUE_SUBQUERY_ERROR", + "errorClass" : "SCALAR_SUBQUERY_TOO_MANY_ROWS", "queryContext" : [ { "objectType" : "", "objectName" : "", diff --git a/sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala index c9c66395a3b..25faa34b697 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/connector
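The semantics behind the renamed class are simple: a scalar subquery used as an expression may produce at most one row. A minimal Python sketch (function name is illustrative):

```python
def scalar_subquery(rows):
    # A scalar subquery must yield at most one row; more than one is the
    # SCALAR_SUBQUERY_TOO_MANY_ROWS error, and zero rows evaluate to NULL.
    if len(rows) > 1:
        raise ValueError("[SCALAR_SUBQUERY_TOO_MANY_ROWS] More than one row "
                         "returned by a subquery used as an expression.")
    return rows[0] if rows else None
```

The rename helps because "scalar subquery" is the standard term for this construct, so the error class name now states both the object and the violated constraint.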
[spark] branch master updated: [SPARK-40798][SQL][TESTS][FOLLOW-UP] Improve test coverage
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new ef545d6ce57 [SPARK-40798][SQL][TESTS][FOLLOW-UP] Improve test coverage ef545d6ce57 is described below commit ef545d6ce579db1070d260426ab8cbf6e2853c28 Author: ulysses-you AuthorDate: Wed Nov 9 18:07:40 2022 +0300 [SPARK-40798][SQL][TESTS][FOLLOW-UP] Improve test coverage ### What changes were proposed in this pull request? Add ansi test in `org.apache.spark.sql.execution.command.v2.AlterTableAddPartitionSuite` ### Why are the changes needed? Improve test coverage with both ansi on/off ### Does this PR introduce _any_ user-facing change? no ### How was this patch tested? Pass CI Closes #38580 from ulysses-you/test. Authored-by: ulysses-you Signed-off-by: Max Gekk --- .../command/v2/AlterTableAddPartitionSuite.scala | 30 +- 1 file changed, 24 insertions(+), 6 deletions(-) diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/command/v2/AlterTableAddPartitionSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/command/v2/AlterTableAddPartitionSuite.scala index c33d9b0101a..09ebd4af4ec 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/execution/command/v2/AlterTableAddPartitionSuite.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/command/v2/AlterTableAddPartitionSuite.scala @@ -17,6 +17,7 @@ package org.apache.spark.sql.execution.command.v2 +import org.apache.spark.SparkNumberFormatException import org.apache.spark.sql.{AnalysisException, Row} import org.apache.spark.sql.catalyst.analysis.PartitionsAlreadyExistException import org.apache.spark.sql.execution.command @@ -129,12 +130,29 @@ class AlterTableAddPartitionSuite withNamespaceAndTable("ns", "tbl") { t => sql(s"CREATE TABLE $t (c int) $defaultUsing PARTITIONED BY (p int)") - withSQLConf( - 
SQLConf.SKIP_TYPE_VALIDATION_ON_ALTER_PARTITION.key -> "true", - SQLConf.ANSI_ENABLED.key -> "false") { -sql(s"ALTER TABLE $t ADD PARTITION (p='aaa')") -checkPartitions(t, Map("p" -> defaultPartitionName)) -sql(s"ALTER TABLE $t DROP PARTITION (p=null)") + withSQLConf(SQLConf.SKIP_TYPE_VALIDATION_ON_ALTER_PARTITION.key -> "true") { +withSQLConf(SQLConf.ANSI_ENABLED.key -> "true") { + checkError( +exception = intercept[SparkNumberFormatException] { + sql(s"ALTER TABLE $t ADD PARTITION (p='aaa')") +}, +errorClass = "CAST_INVALID_INPUT", +parameters = Map( + "ansiConfig" -> "\"spark.sql.ansi.enabled\"", + "expression" -> "'aaa'", + "sourceType" -> "\"STRING\"", + "targetType" -> "\"INT\""), +context = ExpectedContext( + fragment = s"ALTER TABLE $t ADD PARTITION (p='aaa')", + start = 0, + stop = 35 + t.length)) +} + +withSQLConf(SQLConf.ANSI_ENABLED.key -> "false") { + sql(s"ALTER TABLE $t ADD PARTITION (p='aaa')") + checkPartitions(t, Map("p" -> defaultPartitionName)) + sql(s"ALTER TABLE $t DROP PARTITION (p=null)") +} } } } - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
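The behavior the added ANSI branch pins down — casting an invalid string partition value fails under ANSI mode but degrades to NULL (and hence the default partition) when ANSI is off — can be sketched as a single function parameterized by the flag. Names here are illustrative; the error text echoes the `CAST_INVALID_INPUT` expectation in the test:

```python
def cast_string_to_int(value, ansi_enabled):
    try:
        return int(value)
    except ValueError:
        if ansi_enabled:
            # ANSI mode: invalid input is a hard error, as in the new checkError branch.
            raise ValueError(f"[CAST_INVALID_INPUT] The value '{value}' of the type "
                             '"STRING" cannot be cast to "INT".')
        return None  # non-ANSI mode: invalid input becomes NULL
```

Covering both modes then amounts to running the same scenario twice, once per flag value, which is exactly what the nested `withSQLConf(SQLConf.ANSI_ENABLED.key -> ...)` blocks in the diff do.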
[spark] branch master updated: [SPARK-41009][SQL] Rename the error class `_LEGACY_ERROR_TEMP_1070` to `LOCATION_ALREADY_EXISTS`
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 3e8191f7267 [SPARK-41009][SQL] Rename the error class `_LEGACY_ERROR_TEMP_1070` to `LOCATION_ALREADY_EXISTS` 3e8191f7267 is described below commit 3e8191f726721bf74c8dbcb3ea73a216f6bf0517 Author: Max Gekk AuthorDate: Wed Nov 9 12:33:13 2022 +0300 [SPARK-41009][SQL] Rename the error class `_LEGACY_ERROR_TEMP_1070` to `LOCATION_ALREADY_EXISTS` ### What changes were proposed in this pull request? In the PR, I propose to assign the proper name `LOCATION_ALREADY_EXISTS ` to the legacy error class `_LEGACY_ERROR_TEMP_1070 `, and modify test suite to use `checkError()` which checks the error class name, context and etc. ### Why are the changes needed? Proper name improves user experience w/ Spark SQL. ### Does this PR introduce _any_ user-facing change? Yes, the PR changes an user-facing error message. ### How was this patch tested? By running the modified test suites: ``` $ build/sbt "core/testOnly *SparkThrowableSuite" $ build/sbt -Phive-2.3 -Phive-thriftserver "test:testOnly *AlterTableRenameSuite" $ build/sbt -Phive-2.3 -Phive-thriftserver "test:testOnly *HiveCatalogedDDLSuite" ``` Closes #38490 from MaxGekk/location-already-exists. 
Authored-by: Max Gekk Signed-off-by: Max Gekk --- core/src/main/resources/error/error-classes.json | 10 +++--- .../sql/catalyst/catalog/SessionCatalog.scala | 8 ++--- .../spark/sql/errors/QueryCompilationErrors.scala | 10 -- .../spark/sql/errors/QueryExecutionErrors.scala| 8 + .../spark/sql/execution/command/DDLSuite.scala | 42 -- .../command/v1/AlterTableRenameSuite.scala | 17 + 6 files changed, 51 insertions(+), 44 deletions(-) diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json index 71703e7efd9..9c914b86bb1 100644 --- a/core/src/main/resources/error/error-classes.json +++ b/core/src/main/resources/error/error-classes.json @@ -669,6 +669,11 @@ } } }, + "LOCATION_ALREADY_EXISTS" : { +"message" : [ + "Cannot name the managed table as , as its associated location already exists. Please pick a different table name, or remove the existing location first." +] + }, "MALFORMED_PROTOBUF_MESSAGE" : { "message" : [ "Malformed Protobuf messages are detected in message deserialization. Parse Mode: . To process malformed protobuf message as null result, try setting the option 'mode' as 'PERMISSIVE'." @@ -1949,11 +1954,6 @@ "CREATE EXTERNAL TABLE must be accompanied by LOCATION." ] }, - "_LEGACY_ERROR_TEMP_1070" : { -"message" : [ - "Can not the managed table(''). The associated location('') already exists." -] - }, "_LEGACY_ERROR_TEMP_1071" : { "message" : [ "Some existing schema fields () are not present in the new schema. We don't support dropping columns yet." 
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
index bf712f9681e..06214613299 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
@@ -40,7 +40,7 @@ import org.apache.spark.sql.catalyst.plans.logical.{LogicalPlan, Project, Subque
 import org.apache.spark.sql.catalyst.trees.{CurrentOrigin, Origin}
 import org.apache.spark.sql.catalyst.util.{CharVarcharUtils, StringUtils}
 import org.apache.spark.sql.connector.catalog.CatalogManager
-import org.apache.spark.sql.errors.QueryCompilationErrors
+import org.apache.spark.sql.errors.{QueryCompilationErrors, QueryExecutionErrors}
 import org.apache.spark.sql.internal.SQLConf
 import org.apache.spark.sql.internal.StaticSQLConf.GLOBAL_TEMP_DATABASE
 import org.apache.spark.sql.types.StructType
@@ -411,8 +411,7 @@ class SessionCatalog(
       val fs = tableLocation.getFileSystem(hadoopConf)
       if (fs.exists(tableLocation) && fs.listStatus(tableLocation).nonEmpty) {
-        throw QueryCompilationErrors.cannotOperateManagedTableWithExistingLocationError(
-          "create", table.identifier, tableLocation)
+        throw QueryExecutionErrors.locationAlreadyExists(table.identifier, tableLocation)
       }
     }
   }
@@ -1912,8 +1911,7 @@ class
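All of the commits in this digest edit `error-classes.json`, where each error class maps to a message template whose named placeholders Spark fills in when the error is raised (the archive above stripped the angle-bracket placeholders out of the JSON). A minimal Python sketch of that lookup-and-substitute mechanism follows; the `format_error` helper, the `<name>` placeholder syntax, and the sample registry entry are illustrative only, not Spark's actual implementation:

```python
# Sketch: an error-class registry maps a class name plus named parameters
# to a final message. The class name mirrors the commit above; the helper
# and placeholder syntax are hypothetical, not Spark's code.
ERROR_CLASSES = {
    "LOCATION_ALREADY_EXISTS": {
        "message": [
            "Cannot name the managed table as <identifier>, as its associated",
            "location <location> already exists.",
        ]
    },
}

def format_error(error_class: str, params: dict) -> str:
    """Join the template lines and substitute each <name> placeholder."""
    template = " ".join(ERROR_CLASSES[error_class]["message"])
    for key, value in params.items():
        template = template.replace(f"<{key}>", value)
    return template

msg = format_error(
    "LOCATION_ALREADY_EXISTS",
    {"identifier": "`spark_catalog`.`default`.`t`", "location": "/tmp/t"},
)
print(msg)
```

Keeping the parameters structured (rather than string-concatenated into the message) is what makes renames like `_LEGACY_ERROR_TEMP_1070` → `LOCATION_ALREADY_EXISTS` cheap: only the registry key and template change, not every raise site.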
[spark] branch master updated: [SPARK-41041][SQL] Integrate `_LEGACY_ERROR_TEMP_1279` into `TABLE_OR_VIEW_ALREADY_EXISTS`
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 6b88d55b14d [SPARK-41041][SQL] Integrate `_LEGACY_ERROR_TEMP_1279` into `TABLE_OR_VIEW_ALREADY_EXISTS`

6b88d55b14d is described below

commit 6b88d55b14df1f9d15ba921569239cde86071e7d
Author: itholic
AuthorDate: Wed Nov 9 11:54:34 2022 +0300

    [SPARK-41041][SQL] Integrate `_LEGACY_ERROR_TEMP_1279` into `TABLE_OR_VIEW_ALREADY_EXISTS`

    ### What changes were proposed in this pull request?
    This PR proposes to integrate `_LEGACY_ERROR_TEMP_1279` into `TABLE_OR_VIEW_ALREADY_EXISTS`.

    ### Why are the changes needed?
    They are duplicates; both report that the view already exists.

    ### Does this PR introduce _any_ user-facing change?
    No.

    ### How was this patch tested?
    ```
    ./build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite*"
    ```

    Closes #38552 from itholic/SPARK-41041.
    Authored-by: itholic
    Signed-off-by: Max Gekk
---
 core/src/main/resources/error/error-classes.json                   | 5 -
 .../scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala | 4 ++--
 2 files changed, 2 insertions(+), 7 deletions(-)

diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json
index 57fe79ef184..71703e7efd9 100644
--- a/core/src/main/resources/error/error-classes.json
+++ b/core/src/main/resources/error/error-classes.json
@@ -2972,11 +2972,6 @@
       " is not a view."
     ]
   },
-  "_LEGACY_ERROR_TEMP_1279" : {
-    "message" : [
-      "View already exists. If you want to update the view definition, please use ALTER VIEW AS or CREATE OR REPLACE VIEW AS."
-    ]
-  },
   "_LEGACY_ERROR_TEMP_1280" : {
     "message" : [
       "It is not allowed to create a persisted view from the Dataset API."
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
index 139ea236e49..67ceafbf03d 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
@@ -2667,8 +2667,8 @@ private[sql] object QueryCompilationErrors extends QueryErrorsBase {
   def viewAlreadyExistsError(name: TableIdentifier): Throwable = {
     new AnalysisException(
-      errorClass = "_LEGACY_ERROR_TEMP_1279",
-      messageParameters = Map("name" -> name.toString))
+      errorClass = "TABLE_OR_VIEW_ALREADY_EXISTS",
+      messageParameters = Map("relationName" -> name.toString))
   }

   def createPersistedViewFromDatasetAPINotAllowedError(): Throwable = {

---

To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
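Several of these renames also migrate test suites to `checkError()`, which asserts on the structured error class and parameter map instead of on the rendered message string, so tests survive later wording changes. A rough Python analogue of that testing pattern (the class and helper names here are illustrative, not Spark's API):

```python
# Rough analogue of Spark's checkError() testing pattern: assert on the
# structured error-class name and parameters rather than the full message
# text. The exception type and helper are hypothetical stand-ins.
class AnalysisError(Exception):
    def __init__(self, error_class: str, message_parameters: dict):
        super().__init__(f"[{error_class}] {message_parameters}")
        self.error_class = error_class
        self.message_parameters = message_parameters

def check_error(exc: AnalysisError, error_class: str, parameters: dict) -> None:
    # Compare the machine-readable fields, not the rendered string.
    assert exc.error_class == error_class, exc.error_class
    assert exc.message_parameters == parameters, exc.message_parameters

err = AnalysisError(
    "TABLE_OR_VIEW_ALREADY_EXISTS",
    {"relationName": "`spark_catalog`.`default`.`v`"},
)
check_error(err, "TABLE_OR_VIEW_ALREADY_EXISTS",
            {"relationName": "`spark_catalog`.`default`.`v`"})
```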
[spark] branch master updated: [SPARK-41043][SQL] Rename the error class `_LEGACY_ERROR_TEMP_2429` to `NUM_COLUMNS_MISMATCH`
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new be74ee79d5f [SPARK-41043][SQL] Rename the error class `_LEGACY_ERROR_TEMP_2429` to `NUM_COLUMNS_MISMATCH`

be74ee79d5f is described below

commit be74ee79d5f6a9bad02f5254fa3c32308ea7263f
Author: Max Gekk
AuthorDate: Tue Nov 8 23:25:19 2022 +0300

    [SPARK-41043][SQL] Rename the error class `_LEGACY_ERROR_TEMP_2429` to `NUM_COLUMNS_MISMATCH`

    ### What changes were proposed in this pull request?
    In the PR, I propose to assign the proper name `NUM_COLUMNS_MISMATCH` to the legacy error class `_LEGACY_ERROR_TEMP_2429`, and to modify the test suites to use `checkError()`, which checks the error class name, context, etc.

    ### Why are the changes needed?
    The proper name improves user experience with Spark SQL.

    ### Does this PR introduce _any_ user-facing change?
    Yes, the PR changes a user-facing error message.

    ### How was this patch tested?
    By running the modified tests:
    ```
    $ build/sbt "test:testOnly *DataFrameSetOperationsSuite"
    $ PYSPARK_PYTHON=python3 build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite"
    ```

    Closes #38537 from MaxGekk/columns-num-mismatch.
    Authored-by: Max Gekk
    Signed-off-by: Max Gekk
---
 core/src/main/resources/error/error-classes.json        | 10 +-
 .../spark/sql/catalyst/analysis/CheckAnalysis.scala     | 14 +++---
 .../resources/sql-tests/results/except-all.sql.out      | 10 +-
 .../resources/sql-tests/results/intersect-all.sql.out   | 10 +-
 .../sql-tests/results/udf/udf-except-all.sql.out        | 10 +-
 .../sql-tests/results/udf/udf-intersect-all.sql.out     | 10 +-
 .../apache/spark/sql/DataFrameSetOperationsSuite.scala  | 18 +++---
 7 files changed, 43 insertions(+), 39 deletions(-)

diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json
index 5107dd1778a..57fe79ef184 100644
--- a/core/src/main/resources/error/error-classes.json
+++ b/core/src/main/resources/error/error-classes.json
@@ -758,6 +758,11 @@
     ],
     "sqlState" : "22005"
   },
+  "NUM_COLUMNS_MISMATCH" : {
+    "message" : [
+      " can only be performed on tables with the same number of columns, but the first table has columns and the table has columns."
+    ]
+  },
   "ORDER_BY_POS_OUT_OF_RANGE" : {
     "message" : [
       "ORDER BY position is not in select list (valid range is [1, ])."
     ]
   },
@@ -5033,11 +5038,6 @@
       "The sum of the LIMIT clause and the OFFSET clause must not be greater than the maximum 32-bit integer value (2,147,483,647) but found limit = , offset = ."
     ]
   },
-  "_LEGACY_ERROR_TEMP_2429" : {
-    "message" : [
-      " can only be performed on tables with the same number of columns, but the first table has columns and the table has columns."
-    ]
-  },
   "_LEGACY_ERROR_TEMP_2430" : {
     "message" : [
       " can only be performed on tables with compatible column types. The column of the table is type which is not compatible with at the same column of the first table."
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
index 544bb3cc301..9e41bcebe47 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
@@ -29,7 +29,7 @@ import org.apache.spark.sql.catalyst.trees.TreeNodeTag
 import org.apache.spark.sql.catalyst.trees.TreePattern.UNRESOLVED_WINDOW_EXPRESSION
 import org.apache.spark.sql.catalyst.util.{CharVarcharUtils, StringUtils, TypeUtils}
 import org.apache.spark.sql.connector.catalog.{LookupCatalog, SupportsPartitionManagement}
-import org.apache.spark.sql.errors.QueryCompilationErrors
+import org.apache.spark.sql.errors.{QueryCompilationErrors, QueryErrorsBase}
 import org.apache.spark.sql.internal.SQLConf
 import org.apache.spark.sql.types._
 import org.apache.spark.sql.util.SchemaUtils
@@ -38,7 +38,7 @@ import org.apache.spark.util.Utils

 /**
  * Throws user facing errors when passed invalid queries that fail to analyze.
  */
-trait CheckAnalysis extends PredicateHelper with LookupCatalog {
+trait CheckAnalysis extends PredicateHelper with LookupCatalog with QueryErrorsBase {

   protected def isView(nameParts: Seq[String]): Boolean

@@ -541,12 +541,12 @@ trait CheckAnalysis extends PredicateHelper with LookupCatalog {
       // Che
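The `NUM_COLUMNS_MISMATCH` check above fires when set operations such as `EXCEPT ALL` or `INTERSECT ALL` combine relations with different column counts. A small illustrative sketch of that validation follows; the function names and the way schemas are modeled (plain lists of column names) are assumptions for the example, not `CheckAnalysis`'s actual code:

```python
# Illustrative version of the NUM_COLUMNS_MISMATCH validation: every child
# relation of a set operation must have the same number of columns as the
# first one. Schemas are modeled here as plain lists of column names.
def ordinal(n: int) -> str:
    # Tiny helper for the error wording; English ordinals up to "third".
    return {1: "first", 2: "second", 3: "third"}.get(n, f"{n}th")

def check_num_columns(operator: str, relations: list) -> None:
    first = len(relations[0])
    for i, rel in enumerate(relations[1:], start=2):
        if len(rel) != first:
            raise ValueError(
                f"[NUM_COLUMNS_MISMATCH] {operator} can only be performed on "
                f"tables with the same number of columns, but the first table "
                f"has {first} columns and the {ordinal(i)} table has "
                f"{len(rel)} columns."
            )

check_num_columns("EXCEPT ALL", [["a", "b"], ["x", "y"]])  # same arity: OK
try:
    check_num_columns("INTERSECT ALL", [["a", "b"], ["x"]])
except ValueError as e:
    print(e)
```

Raising with the error-class name embedded (rather than free-form text) mirrors the point of the rename: callers and tests can match on the stable class name.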
[spark] branch master updated (0d435411ec5 -> fabea7101ea)
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

    from 0d435411ec5 [SPARK-41029][SQL] Optimize constructor use of `GenericArrayData` for Scala 2.13
     add fabea7101ea [SPARK-41042][SQL] Rename `PARSE_CHAR_MISSING_LENGTH` to `DATATYPE_MISSING_SIZE`

No new revisions were added by this update.

Summary of changes:
 core/src/main/resources/error/error-classes.json            | 12 ++--
 .../org/apache/spark/sql/errors/QueryParsingErrors.scala    |  2 +-
 .../apache/spark/sql/catalyst/parser/ErrorParserSuite.scala |  6 +++---
 3 files changed, 10 insertions(+), 10 deletions(-)
[spark] branch master updated: [SPARK-40973][SQL] Rename `_LEGACY_ERROR_TEMP_0055` to `UNCLOSED_BRACKETED_COMMENT`
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 31b923d50fa [SPARK-40973][SQL] Rename `_LEGACY_ERROR_TEMP_0055` to `UNCLOSED_BRACKETED_COMMENT`

31b923d50fa is described below

commit 31b923d50fa6176312f7217069c0055cd778788f
Author: itholic
AuthorDate: Tue Nov 8 13:34:27 2022 +0300

    [SPARK-40973][SQL] Rename `_LEGACY_ERROR_TEMP_0055` to `UNCLOSED_BRACKETED_COMMENT`

    ### What changes were proposed in this pull request?
    This PR proposes to introduce the new error class `UNCLOSED_BRACKETED_COMMENT` by updating the existing legacy temp error class `_LEGACY_ERROR_TEMP_0055`.

    ### Why are the changes needed?
    We should use an appropriate error class name that matches the error message.

    ### Does this PR introduce _any_ user-facing change?
    No.

    ### How was this patch tested?
    The existing CI should pass.

    Closes #38447 from itholic/LEGACY_0055.
    Authored-by: itholic
    Signed-off-by: Max Gekk
---
 core/src/main/resources/error/error-classes.json               | 10 +-
 .../org/apache/spark/sql/catalyst/parser/ParseDriver.scala     |  5 -
 .../scala/org/apache/spark/sql/errors/QueryParsingErrors.scala | 10 +++---
 .../org/apache/spark/sql/catalyst/parser/PlanParserSuite.scala |  4 ++--
 sql/core/src/test/resources/sql-tests/results/comments.sql.out |  4 ++--
 .../org/apache/spark/sql/execution/SparkSqlParserSuite.scala   |  4 ++--
 .../org/apache/spark/sql/hive/thriftserver/CliSuite.scala      |  3 ++-
 7 files changed, 24 insertions(+), 16 deletions(-)

diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json
index 107bf5ebd5a..e28e5208784 100644
--- a/core/src/main/resources/error/error-classes.json
+++ b/core/src/main/resources/error/error-classes.json
@@ -938,6 +938,11 @@
       "Unable to convert SQL type to Protobuf type ."
     ]
   },
+  "UNCLOSED_BRACKETED_COMMENT" : {
+    "message" : [
+      "Found an unclosed bracketed comment. Please, append */ at the end of the comment."
+    ]
+  },
   "UNKNOWN_PROTOBUF_MESSAGE_TYPE" : {
     "message" : [
       "Attempting to treat as a Message, but it was "
@@ -1567,11 +1572,6 @@
       "It is not allowed to add database prefix `` for the TEMPORARY view name."
     ]
   },
-  "_LEGACY_ERROR_TEMP_0055" : {
-    "message" : [
-      "Unclosed bracketed comment"
-    ]
-  },
   "_LEGACY_ERROR_TEMP_0056" : {
     "message" : [
       "Invalid time travel spec: ."

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/ParseDriver.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/ParseDriver.scala
index 10a213373ad..727d35d5c91 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/ParseDriver.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/ParseDriver.scala
@@ -429,7 +429,10 @@ case class UnclosedCommentProcessor(
     val failedToken = tokenStream.get(tokenStream.size() - 2)
     assert(failedToken.getType() == SqlBaseParser.BRACKETED_COMMENT)
     val position = Origin(Option(failedToken.getLine), Option(failedToken.getCharPositionInLine))
-    throw QueryParsingErrors.unclosedBracketedCommentError(command, position)
+    throw QueryParsingErrors.unclosedBracketedCommentError(
+      command = command,
+      start = Origin(Option(failedToken.getStartIndex)),
+      stop = Origin(Option(failedToken.getStopIndex)))
   }
 }

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala
index 1fce265bece..0fcf6edcbdf 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala
@@ -601,9 +601,13 @@ private[sql] object QueryParsingErrors extends QueryErrorsBase {
       ctx)
   }

-  def unclosedBracketedCommentError(command: String, position: Origin): Throwable = {
-    new ParseException(Some(command), "Unclosed bracketed comment", position, position,
-      Some("_LEGACY_ERROR_TEMP_0055"))
+  def unclosedBracketedCommentError(command: String, start: Origin, stop: Origin): Throwable = {
+    new ParseException(
+      command = Some(command),
+      start = start,
+      stop = stop,
+      errorClass = "UNCLOSED_BRACKETED_COMMENT",
+      messageParameters = Map.empty)
   }

   def invalidTimeTravelSpec(reason: String, ctx: ParserRuleContext): Throwable = {
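The `UNCLOSED_BRACKETED_COMMENT` error above is raised when a SQL text opens a `/* ... */` bracketed comment and never closes it. A small illustrative scanner in the same spirit, written as a sketch (it tracks nesting depth but, as a stated simplification, ignores comment markers that appear inside string literals):

```python
# Illustrative scanner for unclosed bracketed comments: track the nesting
# depth of /* ... */ pairs and report when the text ends at depth > 0.
# Simplification: markers inside string literals are not special-cased.
def has_unclosed_bracketed_comment(sql: str) -> bool:
    depth = 0
    i = 0
    while i < len(sql) - 1:
        pair = sql[i:i + 2]
        if pair == "/*":
            depth += 1
            i += 2
        elif pair == "*/" and depth > 0:
            depth -= 1
            i += 2
        else:
            i += 1
    return depth > 0

print(has_unclosed_bracketed_comment("SELECT 1 /* comment */"))   # closed
print(has_unclosed_bracketed_comment("SELECT 1 /* /* nested */")) # unclosed
```

Tracking depth rather than just the last `/*` matters because bracketed comments can nest, so a single `*/` does not necessarily close the outermost comment.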
[spark] branch master updated (75643f4e9b0 -> 3bbf0f35b7a)
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

    from 75643f4e9b0 [SPARK-41015][SQL][PROTOBUF] UnitTest null check for data generator
     add 3bbf0f35b7a [SPARK-41027][SQL] Use `UNEXPECTED_INPUT_TYPE` instead of `MAP_FROM_ENTRIES_WRONG_TYPE`

No new revisions were added by this update.

Summary of changes:
 core/src/main/resources/error/error-classes.json              |  5 -
 .../spark/sql/catalyst/expressions/collectionOperations.scala |  9 +
 .../sql/catalyst/expressions/CollectionExpressionsSuite.scala | 10 ++
 .../scala/org/apache/spark/sql/DataFrameFunctionsSuite.scala  |  9 +
 4 files changed, 16 insertions(+), 17 deletions(-)