[jira] [Resolved] (SPARK-47265) Enable test of v2 data sources in `FileBasedDataSourceSuite`

2024-03-03 Thread BingKun Pan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BingKun Pan resolved SPARK-47265.
-
Resolution: Won't Fix

> Enable test of v2 data sources in `FileBasedDataSourceSuite`
> 
>
> Key: SPARK-47265
> URL: https://issues.apache.org/jira/browse/SPARK-47265
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL, Tests
>Affects Versions: 4.0.0
>Reporter: BingKun Pan
>Priority: Minor
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47265) Enable test of v2 data sources in `FileBasedDataSourceSuite`

2024-03-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-47265:
---
Labels: pull-request-available  (was: )

> Enable test of v2 data sources in `FileBasedDataSourceSuite`
> 
>
> Key: SPARK-47265
> URL: https://issues.apache.org/jira/browse/SPARK-47265
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL, Tests
>Affects Versions: 4.0.0
>Reporter: BingKun Pan
>Priority: Minor
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-47265) Enable test of v2 data sources in `FileBasedDataSourceSuite`

2024-03-03 Thread BingKun Pan (Jira)
BingKun Pan created SPARK-47265:
---

 Summary: Enable test of v2 data sources in 
`FileBasedDataSourceSuite`
 Key: SPARK-47265
 URL: https://issues.apache.org/jira/browse/SPARK-47265
 Project: Spark
  Issue Type: Improvement
  Components: SQL, Tests
Affects Versions: 4.0.0
Reporter: BingKun Pan






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-46973) Skip V2 table lookup when a table is in V1 table cache

2024-03-03 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-46973.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 45176
[https://github.com/apache/spark/pull/45176]

> Skip V2 table lookup when a table is in V1 table cache
> --
>
> Key: SPARK-46973
> URL: https://issues.apache.org/jira/browse/SPARK-46973
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Allison Wang
>Assignee: Allison Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Improve v2 table lookup performance when a table is already in the v1 table 
> cache.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-46973) Skip V2 table lookup when a table is in V1 table cache

2024-03-03 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-46973:
---

Assignee: Allison Wang

> Skip V2 table lookup when a table is in V1 table cache
> --
>
> Key: SPARK-46973
> URL: https://issues.apache.org/jira/browse/SPARK-46973
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Allison Wang
>Assignee: Allison Wang
>Priority: Major
>  Labels: pull-request-available
>
> Improve v2 table lookup performance when a table is already in the v1 table 
> cache.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-46834) Aggregate support for strings with collation

2024-03-03 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-46834.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 45290
[https://github.com/apache/spark/pull/45290]

> Aggregate support for strings with collation
> 
>
> Key: SPARK-46834
> URL: https://issues.apache.org/jira/browse/SPARK-46834
> Project: Spark
>  Issue Type: Task
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Aleksandar Tomic
>Assignee: Aleksandar Tomic
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-46834) Aggregate support for strings with collation

2024-03-03 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-46834:
---

Assignee: Aleksandar Tomic

> Aggregate support for strings with collation
> 
>
> Key: SPARK-46834
> URL: https://issues.apache.org/jira/browse/SPARK-46834
> Project: Spark
>  Issue Type: Task
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Aleksandar Tomic
>Assignee: Aleksandar Tomic
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-47264) Thrift server support

2024-03-03 Thread Aleksandar Tomic (Jira)
Aleksandar Tomic created SPARK-47264:


 Summary: Thrift server support
 Key: SPARK-47264
 URL: https://issues.apache.org/jira/browse/SPARK-47264
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 4.0.0
Reporter: Aleksandar Tomic


Thrift server tests fail with following error:
repo: build/sbt -Phive-thriftserver "hive-thriftserver/testOnly 
org.apache.spark.sql.hive.thriftserver.ThriftServerQueryTestSuite -- -z 
collations.sql

error:
```
```
[info] - collations.sql *** FAILED *** (1 second, 144 milliseconds)

[13016](https://github.com/dbatomic/spark/actions/runs/8129693575/job/22217101481#step:10:13017)[info]
 Expected "[aaa aaa]", but got "[java.lang.IllegalArgumentException

[13017](https://github.com/dbatomic/spark/actions/runs/8129693575/job/22217101481#step:10:13018)[info]
 Unrecognized type name: string COLLATE 'UCS_BASIC_LCASE']" Result did not 
match for query #7

[13018](https://github.com/dbatomic/spark/actions/runs/8129693575/job/22217101481#step:10:13019)[info]
 select * from t1 where ucs_basic = 'aaa' (ThriftServerQueryTestSuite.scala:226)

[13019](https://github.com/dbatomic/spark/actions/runs/8129693575/job/22217101481#step:10:13020)[info]
 org.scalatest.exceptions.TestFailedException:

[13020](https://github.com/dbatomic/spark/actions/runs/8129693575/job/22217101481#step:10:13021)[info]
 at org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:472)

[13021](https://github.com/dbatomic/spark/actions/runs/8129693575/job/22217101481#step:10:13022)[info]
 at org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:471)

[13022](https://github.com/dbatomic/spark/actions/runs/8129693575/job/22217101481#step:10:13023)[info]
 at 
org.scalatest.funsuite.AnyFunSuite.newAssertionFailedException(AnyFunSuite.scala:1564)

[13023](https://github.com/dbatomic/spark/actions/runs/8129693575/job/22217101481#step:10:13024)[info]
 at org.scalatest.Assertions.assertResult(Assertions.scala:847)

[13024](https://github.com/dbatomic/spark/actions/runs/8129693575/job/22217101481#step:10:13025)[info]
 at org.scalatest.Assertions.assertResult$(Assertions.scala:842)

[13025](https://github.com/dbatomic/spark/actions/runs/8129693575/job/22217101481#step:10:13026)[info]
 at org.scalatest.funsuite.AnyFunSuite.assertResult(AnyFunSuite.scala:1564)

[13026](https://github.com/dbatomic/spark/actions/runs/8129693575/job/22217101481#step:10:13027)[info]
 at 
org.apache.spark.sql.hive.thriftserver.ThriftServerQueryTestSuite.$anonfun$runQueries$6(ThriftServerQueryTestSuite.scala:226)
```



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-47263) Assign classes to DEFAULT value errors

2024-03-03 Thread Max Gekk (Jira)
Max Gekk created SPARK-47263:


 Summary: Assign classes to DEFAULT value errors
 Key: SPARK-47263
 URL: https://issues.apache.org/jira/browse/SPARK-47263
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 4.0.0
Reporter: Max Gekk


Choose a proper name for the error class *_LEGACY_ERROR_TEMP_22[38-40]* defined 
in {*}core/src/main/resources/error/error-classes.json{*}. The name should be 
short but complete (look at the example in error-classes.json).

Add a test which triggers the error from user code if such test still doesn't 
exist. Check exception fields by using {*}checkError(){*}. The last function 
checks valuable error fields only, and avoids dependencies from error text 
message. In this way, tech editors can modify error format in 
error-classes.json, and don't worry of Spark's internal tests. Migrate other 
tests that might trigger the error onto checkError().

If you cannot reproduce the error from user space (using SQL query), replace 
the error by an internal error, see {*}SparkException.internalError(){*}.

Improve the error message format in error-classes.json if the current is not 
clear. Propose a solution to users how to avoid and fix such kind of errors.

Please, look at the PR below as examples:
 * [https://github.com/apache/spark/pull/38685]
 * [https://github.com/apache/spark/pull/38656]
 * [https://github.com/apache/spark/pull/38490]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47263) Assign classes to DEFAULT value errors

2024-03-03 Thread Max Gekk (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk updated SPARK-47263:
-
Description: 
Choose a proper name for the error class *_LEGACY_ERROR_TEMP_13[44-46]* defined 
in {*}core/src/main/resources/error/error-classes.json{*}. The name should be 
short but complete (look at the example in error-classes.json).

Add a test which triggers the error from user code if such test still doesn't 
exist. Check exception fields by using {*}checkError(){*}. The last function 
checks valuable error fields only, and avoids dependencies from error text 
message. In this way, tech editors can modify error format in 
error-classes.json, and don't worry of Spark's internal tests. Migrate other 
tests that might trigger the error onto checkError().

If you cannot reproduce the error from user space (using SQL query), replace 
the error by an internal error, see {*}SparkException.internalError(){*}.

Improve the error message format in error-classes.json if the current is not 
clear. Propose a solution to users how to avoid and fix such kind of errors.

Please, look at the PR below as examples:
 * [https://github.com/apache/spark/pull/38685]
 * [https://github.com/apache/spark/pull/38656]
 * [https://github.com/apache/spark/pull/38490]

  was:
Choose a proper name for the error class *_LEGACY_ERROR_TEMP_22[38-40]* defined 
in {*}core/src/main/resources/error/error-classes.json{*}. The name should be 
short but complete (look at the example in error-classes.json).

Add a test which triggers the error from user code if such test still doesn't 
exist. Check exception fields by using {*}checkError(){*}. The last function 
checks valuable error fields only, and avoids dependencies from error text 
message. In this way, tech editors can modify error format in 
error-classes.json, and don't worry of Spark's internal tests. Migrate other 
tests that might trigger the error onto checkError().

If you cannot reproduce the error from user space (using SQL query), replace 
the error by an internal error, see {*}SparkException.internalError(){*}.

Improve the error message format in error-classes.json if the current is not 
clear. Propose a solution to users how to avoid and fix such kind of errors.

Please, look at the PR below as examples:
 * [https://github.com/apache/spark/pull/38685]
 * [https://github.com/apache/spark/pull/38656]
 * [https://github.com/apache/spark/pull/38490]


> Assign classes to DEFAULT value errors
> --
>
> Key: SPARK-47263
> URL: https://issues.apache.org/jira/browse/SPARK-47263
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Max Gekk
>Priority: Minor
>  Labels: starter
>
> Choose a proper name for the error class *_LEGACY_ERROR_TEMP_13[44-46]* 
> defined in {*}core/src/main/resources/error/error-classes.json{*}. The name 
> should be short but complete (look at the example in error-classes.json).
> Add a test which triggers the error from user code if such test still doesn't 
> exist. Check exception fields by using {*}checkError(){*}. The last function 
> checks valuable error fields only, and avoids dependencies from error text 
> message. In this way, tech editors can modify error format in 
> error-classes.json, and don't worry of Spark's internal tests. Migrate other 
> tests that might trigger the error onto checkError().
> If you cannot reproduce the error from user space (using SQL query), replace 
> the error by an internal error, see {*}SparkException.internalError(){*}.
> Improve the error message format in error-classes.json if the current is not 
> clear. Propose a solution to users how to avoid and fix such kind of errors.
> Please, look at the PR below as examples:
>  * [https://github.com/apache/spark/pull/38685]
>  * [https://github.com/apache/spark/pull/38656]
>  * [https://github.com/apache/spark/pull/38490]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47262) Assign classes to Parquet converter errors

2024-03-03 Thread Max Gekk (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk updated SPARK-47262:
-
Description: 
Choose a proper name for the error class *_LEGACY_ERROR_TEMP_22[38-40]* defined 
in {*}core/src/main/resources/error/error-classes.json{*}. The name should be 
short but complete (look at the example in error-classes.json).

Add a test which triggers the error from user code if such test still doesn't 
exist. Check exception fields by using {*}checkError(){*}. The last function 
checks valuable error fields only, and avoids dependencies from error text 
message. In this way, tech editors can modify error format in 
error-classes.json, and don't worry of Spark's internal tests. Migrate other 
tests that might trigger the error onto checkError().

If you cannot reproduce the error from user space (using SQL query), replace 
the error by an internal error, see {*}SparkException.internalError(){*}.

Improve the error message format in error-classes.json if the current is not 
clear. Propose a solution to users how to avoid and fix such kind of errors.

Please, look at the PR below as examples:
 * [https://github.com/apache/spark/pull/38685]
 * [https://github.com/apache/spark/pull/38656]
 * [https://github.com/apache/spark/pull/38490]

  was:
Choose a proper name for the error class *_LEGACY_ERROR_TEMP_11[72-74]* defined 
in {*}core/src/main/resources/error/error-classes.json{*}. The name should be 
short but complete (look at the example in error-classes.json).

Add a test which triggers the error from user code if such test still doesn't 
exist. Check exception fields by using {*}checkError(){*}. The last function 
checks valuable error fields only, and avoids dependencies from error text 
message. In this way, tech editors can modify error format in 
error-classes.json, and don't worry of Spark's internal tests. Migrate other 
tests that might trigger the error onto checkError().

If you cannot reproduce the error from user space (using SQL query), replace 
the error by an internal error, see {*}SparkException.internalError(){*}.

Improve the error message format in error-classes.json if the current is not 
clear. Propose a solution to users how to avoid and fix such kind of errors.

Please, look at the PR below as examples:
 * [https://github.com/apache/spark/pull/38685]
 * [https://github.com/apache/spark/pull/38656]
 * [https://github.com/apache/spark/pull/38490]


> Assign classes to Parquet converter errors
> --
>
> Key: SPARK-47262
> URL: https://issues.apache.org/jira/browse/SPARK-47262
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Max Gekk
>Priority: Minor
>  Labels: starter
>
> Choose a proper name for the error class *_LEGACY_ERROR_TEMP_22[38-40]* 
> defined in {*}core/src/main/resources/error/error-classes.json{*}. The name 
> should be short but complete (look at the example in error-classes.json).
> Add a test which triggers the error from user code if such test still doesn't 
> exist. Check exception fields by using {*}checkError(){*}. The last function 
> checks valuable error fields only, and avoids dependencies from error text 
> message. In this way, tech editors can modify error format in 
> error-classes.json, and don't worry of Spark's internal tests. Migrate other 
> tests that might trigger the error onto checkError().
> If you cannot reproduce the error from user space (using SQL query), replace 
> the error by an internal error, see {*}SparkException.internalError(){*}.
> Improve the error message format in error-classes.json if the current is not 
> clear. Propose a solution to users how to avoid and fix such kind of errors.
> Please, look at the PR below as examples:
>  * [https://github.com/apache/spark/pull/38685]
>  * [https://github.com/apache/spark/pull/38656]
>  * [https://github.com/apache/spark/pull/38490]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47261) Assign classes to Parquet type errors

2024-03-03 Thread Max Gekk (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk updated SPARK-47261:
-
Description: 
Choose a proper name for the error class *_LEGACY_ERROR_TEMP_11[72-74]* defined 
in {*}core/src/main/resources/error/error-classes.json{*}. The name should be 
short but complete (look at the example in error-classes.json).

Add a test which triggers the error from user code if such test still doesn't 
exist. Check exception fields by using {*}checkError(){*}. The last function 
checks valuable error fields only, and avoids dependencies from error text 
message. In this way, tech editors can modify error format in 
error-classes.json, and don't worry of Spark's internal tests. Migrate other 
tests that might trigger the error onto checkError().

If you cannot reproduce the error from user space (using SQL query), replace 
the error by an internal error, see {*}SparkException.internalError(){*}.

Improve the error message format in error-classes.json if the current is not 
clear. Propose a solution to users how to avoid and fix such kind of errors.

Please, look at the PR below as examples:
 * [https://github.com/apache/spark/pull/38685]
 * [https://github.com/apache/spark/pull/38656]
 * [https://github.com/apache/spark/pull/38490]

  was:
Choose a proper name for the error class *_LEGACY_ERROR_TEMP_32[49-51]* defined 
in {*}core/src/main/resources/error/error-classes.json{*}. The name should be 
short but complete (look at the example in error-classes.json).

Add a test which triggers the error from user code if such test still doesn't 
exist. Check exception fields by using {*}checkError(){*}. The last function 
checks valuable error fields only, and avoids dependencies from error text 
message. In this way, tech editors can modify error format in 
error-classes.json, and don't worry of Spark's internal tests. Migrate other 
tests that might trigger the error onto checkError().

If you cannot reproduce the error from user space (using SQL query), replace 
the error by an internal error, see {*}SparkException.internalError(){*}.

Improve the error message format in error-classes.json if the current is not 
clear. Propose a solution to users how to avoid and fix such kind of errors.

Please, look at the PR below as examples:
 * [https://github.com/apache/spark/pull/38685]
 * [https://github.com/apache/spark/pull/38656]
 * [https://github.com/apache/spark/pull/38490]


> Assign classes to Parquet type errors
> -
>
> Key: SPARK-47261
> URL: https://issues.apache.org/jira/browse/SPARK-47261
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Max Gekk
>Priority: Minor
>  Labels: starter
>
> Choose a proper name for the error class *_LEGACY_ERROR_TEMP_11[72-74]* 
> defined in {*}core/src/main/resources/error/error-classes.json{*}. The name 
> should be short but complete (look at the example in error-classes.json).
> Add a test which triggers the error from user code if such test still doesn't 
> exist. Check exception fields by using {*}checkError(){*}. The last function 
> checks valuable error fields only, and avoids dependencies from error text 
> message. In this way, tech editors can modify error format in 
> error-classes.json, and don't worry of Spark's internal tests. Migrate other 
> tests that might trigger the error onto checkError().
> If you cannot reproduce the error from user space (using SQL query), replace 
> the error by an internal error, see {*}SparkException.internalError(){*}.
> Improve the error message format in error-classes.json if the current is not 
> clear. Propose a solution to users how to avoid and fix such kind of errors.
> Please, look at the PR below as examples:
>  * [https://github.com/apache/spark/pull/38685]
>  * [https://github.com/apache/spark/pull/38656]
>  * [https://github.com/apache/spark/pull/38490]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-47262) Assign classes to Parquet converter errors

2024-03-03 Thread Max Gekk (Jira)
Max Gekk created SPARK-47262:


 Summary: Assign classes to Parquet converter errors
 Key: SPARK-47262
 URL: https://issues.apache.org/jira/browse/SPARK-47262
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 4.0.0
Reporter: Max Gekk


Choose a proper name for the error class *_LEGACY_ERROR_TEMP_11[72-74]* defined 
in {*}core/src/main/resources/error/error-classes.json{*}. The name should be 
short but complete (look at the example in error-classes.json).

Add a test which triggers the error from user code if such test still doesn't 
exist. Check exception fields by using {*}checkError(){*}. The last function 
checks valuable error fields only, and avoids dependencies from error text 
message. In this way, tech editors can modify error format in 
error-classes.json, and don't worry of Spark's internal tests. Migrate other 
tests that might trigger the error onto checkError().

If you cannot reproduce the error from user space (using SQL query), replace 
the error by an internal error, see {*}SparkException.internalError(){*}.

Improve the error message format in error-classes.json if the current is not 
clear. Propose a solution to users how to avoid and fix such kind of errors.

Please, look at the PR below as examples:
 * [https://github.com/apache/spark/pull/38685]
 * [https://github.com/apache/spark/pull/38656]
 * [https://github.com/apache/spark/pull/38490]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-47261) Assign classes to Parquet type errors

2024-03-03 Thread Max Gekk (Jira)
Max Gekk created SPARK-47261:


 Summary: Assign classes to Parquet type errors
 Key: SPARK-47261
 URL: https://issues.apache.org/jira/browse/SPARK-47261
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 4.0.0
Reporter: Max Gekk


Choose a proper name for the error class *_LEGACY_ERROR_TEMP_32[49-51]* defined 
in {*}core/src/main/resources/error/error-classes.json{*}. The name should be 
short but complete (look at the example in error-classes.json).

Add a test which triggers the error from user code if such test still doesn't 
exist. Check exception fields by using {*}checkError(){*}. The last function 
checks valuable error fields only, and avoids dependencies from error text 
message. In this way, tech editors can modify error format in 
error-classes.json, and don't worry of Spark's internal tests. Migrate other 
tests that might trigger the error onto checkError().

If you cannot reproduce the error from user space (using SQL query), replace 
the error by an internal error, see {*}SparkException.internalError(){*}.

Improve the error message format in error-classes.json if the current is not 
clear. Propose a solution to users how to avoid and fix such kind of errors.

Please, look at the PR below as examples:
 * [https://github.com/apache/spark/pull/38685]
 * [https://github.com/apache/spark/pull/38656]
 * [https://github.com/apache/spark/pull/38490]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47260) Assign classes to Row to JSON errors

2024-03-03 Thread Max Gekk (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk updated SPARK-47260:
-
Description: 
Choose a proper name for the error class *_LEGACY_ERROR_TEMP_32[49-51]* defined 
in {*}core/src/main/resources/error/error-classes.json{*}. The name should be 
short but complete (look at the example in error-classes.json).

Add a test which triggers the error from user code if such test still doesn't 
exist. Check exception fields by using {*}checkError(){*}. The last function 
checks valuable error fields only, and avoids dependencies from error text 
message. In this way, tech editors can modify error format in 
error-classes.json, and don't worry of Spark's internal tests. Migrate other 
tests that might trigger the error onto checkError().

If you cannot reproduce the error from user space (using SQL query), replace 
the error by an internal error, see {*}SparkException.internalError(){*}.

Improve the error message format in error-classes.json if the current is not 
clear. Propose a solution to users how to avoid and fix such kind of errors.

Please, look at the PR below as examples:
 * [https://github.com/apache/spark/pull/38685]
 * [https://github.com/apache/spark/pull/38656]
 * [https://github.com/apache/spark/pull/38490]

  was:
Choose a proper name for the error class *_LEGACY_ERROR_TEMP_32[08-14]* defined 
in {*}core/src/main/resources/error/error-classes.json{*}. The name should be 
short but complete (look at the example in error-classes.json).

Add a test which triggers the error from user code if such test still doesn't 
exist. Check exception fields by using {*}checkError(){*}. The last function 
checks valuable error fields only, and avoids dependencies from error text 
message. In this way, tech editors can modify error format in 
error-classes.json, and don't worry of Spark's internal tests. Migrate other 
tests that might trigger the error onto checkError().

If you cannot reproduce the error from user space (using SQL query), replace 
the error by an internal error, see {*}SparkException.internalError(){*}.

Improve the error message format in error-classes.json if the current is not 
clear. Propose a solution to users how to avoid and fix such kind of errors.

Please, look at the PR below as examples:
 * [https://github.com/apache/spark/pull/38685]
 * [https://github.com/apache/spark/pull/38656]
 * [https://github.com/apache/spark/pull/38490]


>  Assign classes to Row to JSON errors
> -
>
> Key: SPARK-47260
> URL: https://issues.apache.org/jira/browse/SPARK-47260
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Max Gekk
>Priority: Minor
>  Labels: starter
>
> Choose a proper name for the error class *_LEGACY_ERROR_TEMP_32[49-51]* 
> defined in {*}core/src/main/resources/error/error-classes.json{*}. The name 
> should be short but complete (look at the example in error-classes.json).
> Add a test which triggers the error from user code if such test still doesn't 
> exist. Check exception fields by using {*}checkError(){*}. The last function 
> checks valuable error fields only, and avoids dependencies from error text 
> message. In this way, tech editors can modify error format in 
> error-classes.json, and don't worry of Spark's internal tests. Migrate other 
> tests that might trigger the error onto checkError().
> If you cannot reproduce the error from user space (using SQL query), replace 
> the error by an internal error, see {*}SparkException.internalError(){*}.
> Improve the error message format in error-classes.json if the current is not 
> clear. Propose a solution to users how to avoid and fix such kind of errors.
> Please, look at the PR below as examples:
>  * [https://github.com/apache/spark/pull/38685]
>  * [https://github.com/apache/spark/pull/38656]
>  * [https://github.com/apache/spark/pull/38490]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-47260) Assign classes to Row to JSON errors

2024-03-03 Thread Max Gekk (Jira)
Max Gekk created SPARK-47260:


 Summary:  Assign classes to Row to JSON errors
 Key: SPARK-47260
 URL: https://issues.apache.org/jira/browse/SPARK-47260
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 4.0.0
Reporter: Max Gekk


Choose a proper name for the error class *_LEGACY_ERROR_TEMP_32[08-14]* defined 
in {*}core/src/main/resources/error/error-classes.json{*}. The name should be 
short but complete (look at the example in error-classes.json).

Add a test which triggers the error from user code if such test still doesn't 
exist. Check exception fields by using {*}checkError(){*}. The last function 
checks valuable error fields only, and avoids dependencies from error text 
message. In this way, tech editors can modify error format in 
error-classes.json, and don't worry of Spark's internal tests. Migrate other 
tests that might trigger the error onto checkError().

If you cannot reproduce the error from user space (using SQL query), replace 
the error by an internal error, see {*}SparkException.internalError(){*}.

Improve the error message format in error-classes.json if the current is not 
clear. Propose a solution to users how to avoid and fix such kind of errors.

Please, look at the PR below as examples:
 * [https://github.com/apache/spark/pull/38685]
 * [https://github.com/apache/spark/pull/38656]
 * [https://github.com/apache/spark/pull/38490]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47259) Assign classes to interval errors

2024-03-03 Thread Max Gekk (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk updated SPARK-47259:
-
Description: 
Choose a proper name for the error class *_LEGACY_ERROR_TEMP_32[08-14]* defined 
in {*}core/src/main/resources/error/error-classes.json{*}. The name should be 
short but complete (look at the example in error-classes.json).

Add a test which triggers the error from user code if such test still doesn't 
exist. Check exception fields by using {*}checkError(){*}. The last function 
checks valuable error fields only, and avoids dependencies from error text 
message. In this way, tech editors can modify error format in 
error-classes.json, and don't worry of Spark's internal tests. Migrate other 
tests that might trigger the error onto checkError().

If you cannot reproduce the error from user space (using SQL query), replace 
the error by an internal error, see {*}SparkException.internalError(){*}.

Improve the error message format in error-classes.json if the current is not 
clear. Propose a solution to users how to avoid and fix such kind of errors.

Please, look at the PR below as examples:
 * [https://github.com/apache/spark/pull/38685]
 * [https://github.com/apache/spark/pull/38656]
 * [https://github.com/apache/spark/pull/38490]

  was:
Choose a proper name for the error class *_LEGACY_ERROR_TEMP_127[0-5]* defined 
in {*}core/src/main/resources/error/error-classes.json{*}. The name should be 
short but complete (look at the example in error-classes.json).

Add a test which triggers the error from user code if such test still doesn't 
exist. Check exception fields by using {*}checkError(){*}. The last function 
checks valuable error fields only, and avoids dependencies from error text 
message. In this way, tech editors can modify error format in 
error-classes.json, and don't worry of Spark's internal tests. Migrate other 
tests that might trigger the error onto checkError().

If you cannot reproduce the error from user space (using SQL query), replace 
the error by an internal error, see {*}SparkException.internalError(){*}.

Improve the error message format in error-classes.json if the current is not 
clear. Propose a solution to users how to avoid and fix such kind of errors.

Please, look at the PR below as examples:
 * [https://github.com/apache/spark/pull/38685]
 * [https://github.com/apache/spark/pull/38656]
 * [https://github.com/apache/spark/pull/38490]


> Assign classes to interval errors
> -
>
> Key: SPARK-47259
> URL: https://issues.apache.org/jira/browse/SPARK-47259
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Max Gekk
>Priority: Minor
>  Labels: starter
>
> Choose a proper name for the error class *_LEGACY_ERROR_TEMP_32[08-14]* 
> defined in {*}core/src/main/resources/error/error-classes.json{*}. The name 
> should be short but complete (look at the example in error-classes.json).
> Add a test which triggers the error from user code if such test still doesn't 
> exist. Check exception fields by using {*}checkError(){*}. The last function 
> checks valuable error fields only, and avoids dependencies from error text 
> message. In this way, tech editors can modify error format in 
> error-classes.json, and don't worry of Spark's internal tests. Migrate other 
> tests that might trigger the error onto checkError().
> If you cannot reproduce the error from user space (using SQL query), replace 
> the error by an internal error, see {*}SparkException.internalError(){*}.
> Improve the error message format in error-classes.json if the current is not 
> clear. Propose a solution to users how to avoid and fix such kind of errors.
> Please, look at the PR below as examples:
>  * [https://github.com/apache/spark/pull/38685]
>  * [https://github.com/apache/spark/pull/38656]
>  * [https://github.com/apache/spark/pull/38490]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-47259) Assign classes to interval errors

2024-03-03 Thread Max Gekk (Jira)
Max Gekk created SPARK-47259:


 Summary: Assign classes to interval errors
 Key: SPARK-47259
 URL: https://issues.apache.org/jira/browse/SPARK-47259
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 4.0.0
Reporter: Max Gekk


Choose a proper name for the error class *_LEGACY_ERROR_TEMP_127[0-5]* defined 
in {*}core/src/main/resources/error/error-classes.json{*}. The name should be 
short but complete (look at the example in error-classes.json).

Add a test which triggers the error from user code if such test still doesn't 
exist. Check exception fields by using {*}checkError(){*}. The last function 
checks valuable error fields only, and avoids dependencies from error text 
message. In this way, tech editors can modify error format in 
error-classes.json, and don't worry of Spark's internal tests. Migrate other 
tests that might trigger the error onto checkError().

If you cannot reproduce the error from user space (using SQL query), replace 
the error by an internal error, see {*}SparkException.internalError(){*}.

Improve the error message format in error-classes.json if the current is not 
clear. Propose a solution to users how to avoid and fix such kind of errors.

Please, look at the PR below as examples:
 * [https://github.com/apache/spark/pull/38685]
 * [https://github.com/apache/spark/pull/38656]
 * [https://github.com/apache/spark/pull/38490]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-47258) Assign error classes to SHOW CREATE TABLE errors

2024-03-03 Thread Max Gekk (Jira)
Max Gekk created SPARK-47258:


 Summary: Assign error classes to SHOW CREATE TABLE errors
 Key: SPARK-47258
 URL: https://issues.apache.org/jira/browse/SPARK-47258
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 4.0.0
Reporter: Max Gekk


Choose a proper name for the error class *_LEGACY_ERROR_TEMP_105[3-4]* defined 
in {*}core/src/main/resources/error/error-classes.json{*}. The name should be 
short but complete (look at the example in error-classes.json).

Add a test which triggers the error from user code if such test still doesn't 
exist. Check exception fields by using {*}checkError(){*}. The last function 
checks valuable error fields only, and avoids dependencies from error text 
message. In this way, tech editors can modify error format in 
error-classes.json, and don't worry of Spark's internal tests. Migrate other 
tests that might trigger the error onto checkError().

If you cannot reproduce the error from user space (using SQL query), replace 
the error by an internal error, see {*}SparkException.internalError(){*}.

Improve the error message format in error-classes.json if the current is not 
clear. Propose a solution to users how to avoid and fix such kind of errors.

Please, look at the PR below as examples:
 * [https://github.com/apache/spark/pull/38685]
 * [https://github.com/apache/spark/pull/38656]
 * [https://github.com/apache/spark/pull/38490]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47258) Assign error classes to SHOW CREATE TABLE errors

2024-03-03 Thread Max Gekk (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk updated SPARK-47258:
-
Description: 
Choose a proper name for the error class *_LEGACY_ERROR_TEMP_127[0-5]* defined 
in {*}core/src/main/resources/error/error-classes.json{*}. The name should be 
short but complete (look at the example in error-classes.json).

Add a test which triggers the error from user code if such test still doesn't 
exist. Check exception fields by using {*}checkError(){*}. The last function 
checks valuable error fields only, and avoids dependencies from error text 
message. In this way, tech editors can modify error format in 
error-classes.json, and don't worry of Spark's internal tests. Migrate other 
tests that might trigger the error onto checkError().

If you cannot reproduce the error from user space (using SQL query), replace 
the error by an internal error, see {*}SparkException.internalError(){*}.

Improve the error message format in error-classes.json if the current is not 
clear. Propose a solution to users how to avoid and fix such kind of errors.

Please, look at the PR below as examples:
 * [https://github.com/apache/spark/pull/38685]
 * [https://github.com/apache/spark/pull/38656]
 * [https://github.com/apache/spark/pull/38490]

  was:
Choose a proper name for the error class *_LEGACY_ERROR_TEMP_105[3-4]* defined 
in {*}core/src/main/resources/error/error-classes.json{*}. The name should be 
short but complete (look at the example in error-classes.json).

Add a test which triggers the error from user code if such test still doesn't 
exist. Check exception fields by using {*}checkError(){*}. The last function 
checks valuable error fields only, and avoids dependencies from error text 
message. In this way, tech editors can modify error format in 
error-classes.json, and don't worry of Spark's internal tests. Migrate other 
tests that might trigger the error onto checkError().

If you cannot reproduce the error from user space (using SQL query), replace 
the error by an internal error, see {*}SparkException.internalError(){*}.

Improve the error message format in error-classes.json if the current is not 
clear. Propose a solution to users how to avoid and fix such kind of errors.

Please, look at the PR below as examples:
 * [https://github.com/apache/spark/pull/38685]
 * [https://github.com/apache/spark/pull/38656]
 * [https://github.com/apache/spark/pull/38490]


> Assign error classes to SHOW CREATE TABLE errors
> 
>
> Key: SPARK-47258
> URL: https://issues.apache.org/jira/browse/SPARK-47258
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Max Gekk
>Priority: Minor
>  Labels: starter
>
> Choose a proper name for the error class *_LEGACY_ERROR_TEMP_127[0-5]* 
> defined in {*}core/src/main/resources/error/error-classes.json{*}. The name 
> should be short but complete (look at the example in error-classes.json).
> Add a test which triggers the error from user code if such test still doesn't 
> exist. Check exception fields by using {*}checkError(){*}. The last function 
> checks valuable error fields only, and avoids dependencies from error text 
> message. In this way, tech editors can modify error format in 
> error-classes.json, and don't worry of Spark's internal tests. Migrate other 
> tests that might trigger the error onto checkError().
> If you cannot reproduce the error from user space (using SQL query), replace 
> the error by an internal error, see {*}SparkException.internalError(){*}.
> Improve the error message format in error-classes.json if the current is not 
> clear. Propose a solution to users how to avoid and fix such kind of errors.
> Please, look at the PR below as examples:
>  * [https://github.com/apache/spark/pull/38685]
>  * [https://github.com/apache/spark/pull/38656]
>  * [https://github.com/apache/spark/pull/38490]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-47257) Assign error classes to ALTER COLUMN errors

2024-03-03 Thread Max Gekk (Jira)
Max Gekk created SPARK-47257:


 Summary: Assign error classes to ALTER COLUMN errors
 Key: SPARK-47257
 URL: https://issues.apache.org/jira/browse/SPARK-47257
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 4.0.0
Reporter: Max Gekk


Choose a proper name for the error class *_LEGACY_ERROR_TEMP_102[4-7]* defined 
in {*}core/src/main/resources/error/error-classes.json{*}. The name should be 
short but complete (look at the example in error-classes.json).

Add a test which triggers the error from user code if such test still doesn't 
exist. Check exception fields by using {*}checkError(){*}. The last function 
checks valuable error fields only, and avoids dependencies from error text 
message. In this way, tech editors can modify error format in 
error-classes.json, and don't worry of Spark's internal tests. Migrate other 
tests that might trigger the error onto checkError().

If you cannot reproduce the error from user space (using SQL query), replace 
the error by an internal error, see {*}SparkException.internalError(){*}.

Improve the error message format in error-classes.json if the current is not 
clear. Propose a solution to users how to avoid and fix such kind of errors.

Please, look at the PR below as examples:
 * [https://github.com/apache/spark/pull/38685]
 * [https://github.com/apache/spark/pull/38656]
 * [https://github.com/apache/spark/pull/38490]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47257) Assign error classes to ALTER COLUMN errors

2024-03-03 Thread Max Gekk (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk updated SPARK-47257:
-
Description: 
Choose a proper name for the error class *_LEGACY_ERROR_TEMP_105[3-4]* defined 
in {*}core/src/main/resources/error/error-classes.json{*}. The name should be 
short but complete (look at the example in error-classes.json).

Add a test which triggers the error from user code if such test still doesn't 
exist. Check exception fields by using {*}checkError(){*}. The last function 
checks valuable error fields only, and avoids dependencies from error text 
message. In this way, tech editors can modify error format in 
error-classes.json, and don't worry of Spark's internal tests. Migrate other 
tests that might trigger the error onto checkError().

If you cannot reproduce the error from user space (using SQL query), replace 
the error by an internal error, see {*}SparkException.internalError(){*}.

Improve the error message format in error-classes.json if the current is not 
clear. Propose a solution to users how to avoid and fix such kind of errors.

Please, look at the PR below as examples:
 * [https://github.com/apache/spark/pull/38685]
 * [https://github.com/apache/spark/pull/38656]
 * [https://github.com/apache/spark/pull/38490]

  was:
Choose a proper name for the error class *_LEGACY_ERROR_TEMP_102[4-7]* defined 
in {*}core/src/main/resources/error/error-classes.json{*}. The name should be 
short but complete (look at the example in error-classes.json).

Add a test which triggers the error from user code if such test still doesn't 
exist. Check exception fields by using {*}checkError(){*}. The last function 
checks valuable error fields only, and avoids dependencies from error text 
message. In this way, tech editors can modify error format in 
error-classes.json, and don't worry of Spark's internal tests. Migrate other 
tests that might trigger the error onto checkError().

If you cannot reproduce the error from user space (using SQL query), replace 
the error by an internal error, see {*}SparkException.internalError(){*}.

Improve the error message format in error-classes.json if the current is not 
clear. Propose a solution to users how to avoid and fix such kind of errors.

Please, look at the PR below as examples:
 * [https://github.com/apache/spark/pull/38685]
 * [https://github.com/apache/spark/pull/38656]
 * [https://github.com/apache/spark/pull/38490]


> Assign error classes to ALTER COLUMN errors
> ---
>
> Key: SPARK-47257
> URL: https://issues.apache.org/jira/browse/SPARK-47257
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Max Gekk
>Priority: Minor
>  Labels: starter
>
> Choose a proper name for the error class *_LEGACY_ERROR_TEMP_105[3-4]* 
> defined in {*}core/src/main/resources/error/error-classes.json{*}. The name 
> should be short but complete (look at the example in error-classes.json).
> Add a test which triggers the error from user code if such test still doesn't 
> exist. Check exception fields by using {*}checkError(){*}. The last function 
> checks valuable error fields only, and avoids dependencies from error text 
> message. In this way, tech editors can modify error format in 
> error-classes.json, and don't worry of Spark's internal tests. Migrate other 
> tests that might trigger the error onto checkError().
> If you cannot reproduce the error from user space (using SQL query), replace 
> the error by an internal error, see {*}SparkException.internalError(){*}.
> Improve the error message format in error-classes.json if the current is not 
> clear. Propose a solution to users how to avoid and fix such kind of errors.
> Please, look at the PR below as examples:
>  * [https://github.com/apache/spark/pull/38685]
>  * [https://github.com/apache/spark/pull/38656]
>  * [https://github.com/apache/spark/pull/38490]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-47256) Assign error classes to FILTER expression errors

2024-03-03 Thread Max Gekk (Jira)
Max Gekk created SPARK-47256:


 Summary: Assign error classes to FILTER expression errors
 Key: SPARK-47256
 URL: https://issues.apache.org/jira/browse/SPARK-47256
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 4.0.0
Reporter: Max Gekk


Choose a proper name for the error class *_LEGACY_ERROR_TEMP_324[7-9]* defined 
in {*}core/src/main/resources/error/error-classes.json{*}. The name should be 
short but complete (look at the example in error-classes.json).

Add a test which triggers the error from user code if such test still doesn't 
exist. Check exception fields by using {*}checkError(){*}. The last function 
checks valuable error fields only, and avoids dependencies from error text 
message. In this way, tech editors can modify error format in 
error-classes.json, and don't worry of Spark's internal tests. Migrate other 
tests that might trigger the error onto checkError().

If you cannot reproduce the error from user space (using SQL query), replace 
the error by an internal error, see {*}SparkException.internalError(){*}.

Improve the error message format in error-classes.json if the current is not 
clear. Propose a solution to users how to avoid and fix such kind of errors.

Please, look at the PR below as examples:
 * [https://github.com/apache/spark/pull/38685]
 * [https://github.com/apache/spark/pull/38656]
 * [https://github.com/apache/spark/pull/38490]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47256) Assign error classes to FILTER expression errors

2024-03-03 Thread Max Gekk (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk updated SPARK-47256:
-
Description: 
Choose a proper name for the error class *_LEGACY_ERROR_TEMP_102[4-7]* defined 
in {*}core/src/main/resources/error/error-classes.json{*}. The name should be 
short but complete (look at the example in error-classes.json).

Add a test which triggers the error from user code if such test still doesn't 
exist. Check exception fields by using {*}checkError(){*}. The last function 
checks valuable error fields only, and avoids dependencies from error text 
message. In this way, tech editors can modify error format in 
error-classes.json, and don't worry of Spark's internal tests. Migrate other 
tests that might trigger the error onto checkError().

If you cannot reproduce the error from user space (using SQL query), replace 
the error by an internal error, see {*}SparkException.internalError(){*}.

Improve the error message format in error-classes.json if the current is not 
clear. Propose a solution to users how to avoid and fix such kind of errors.

Please, look at the PR below as examples:
 * [https://github.com/apache/spark/pull/38685]
 * [https://github.com/apache/spark/pull/38656]
 * [https://github.com/apache/spark/pull/38490]

  was:
Choose a proper name for the error class *_LEGACY_ERROR_TEMP_324[7-9]* defined 
in {*}core/src/main/resources/error/error-classes.json{*}. The name should be 
short but complete (look at the example in error-classes.json).

Add a test which triggers the error from user code if such test still doesn't 
exist. Check exception fields by using {*}checkError(){*}. The last function 
checks valuable error fields only, and avoids dependencies from error text 
message. In this way, tech editors can modify error format in 
error-classes.json, and don't worry of Spark's internal tests. Migrate other 
tests that might trigger the error onto checkError().

If you cannot reproduce the error from user space (using SQL query), replace 
the error by an internal error, see {*}SparkException.internalError(){*}.

Improve the error message format in error-classes.json if the current is not 
clear. Propose a solution to users how to avoid and fix such kind of errors.

Please, look at the PR below as examples:
 * [https://github.com/apache/spark/pull/38685]
 * [https://github.com/apache/spark/pull/38656]
 * [https://github.com/apache/spark/pull/38490]


> Assign error classes to FILTER expression errors
> 
>
> Key: SPARK-47256
> URL: https://issues.apache.org/jira/browse/SPARK-47256
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Max Gekk
>Priority: Minor
>  Labels: starter
>
> Choose a proper name for the error class *_LEGACY_ERROR_TEMP_102[4-7]* 
> defined in {*}core/src/main/resources/error/error-classes.json{*}. The name 
> should be short but complete (look at the example in error-classes.json).
> Add a test which triggers the error from user code if such test still doesn't 
> exist. Check exception fields by using {*}checkError(){*}. The last function 
> checks valuable error fields only, and avoids dependencies from error text 
> message. In this way, tech editors can modify error format in 
> error-classes.json, and don't worry of Spark's internal tests. Migrate other 
> tests that might trigger the error onto checkError().
> If you cannot reproduce the error from user space (using SQL query), replace 
> the error by an internal error, see {*}SparkException.internalError(){*}.
> Improve the error message format in error-classes.json if the current is not 
> clear. Propose a solution to users how to avoid and fix such kind of errors.
> Please, look at the PR below as examples:
>  * [https://github.com/apache/spark/pull/38685]
>  * [https://github.com/apache/spark/pull/38656]
>  * [https://github.com/apache/spark/pull/38490]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47255) Assign names to the error classes _LEGACY_ERROR_TEMP_324[7-9]

2024-03-03 Thread Max Gekk (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk updated SPARK-47255:
-
Description: 
Choose a proper name for the error class *_LEGACY_ERROR_TEMP_324[7-9]* defined 
in {*}core/src/main/resources/error/error-classes.json{*}. The name should be 
short but complete (look at the example in error-classes.json).

Add a test which triggers the error from user code if such test still doesn't 
exist. Check exception fields by using {*}checkError(){*}. The last function 
checks valuable error fields only, and avoids dependencies from error text 
message. In this way, tech editors can modify error format in 
error-classes.json, and don't worry of Spark's internal tests. Migrate other 
tests that might trigger the error onto checkError().

If you cannot reproduce the error from user space (using SQL query), replace 
the error by an internal error, see {*}SparkException.internalError(){*}.

Improve the error message format in error-classes.json if the current is not 
clear. Propose a solution to users how to avoid and fix such kind of errors.

Please, look at the PR below as examples:
 * [https://github.com/apache/spark/pull/38685]
 * [https://github.com/apache/spark/pull/38656]
 * [https://github.com/apache/spark/pull/38490]

  was:
Choose a proper name for the error class *_LEGACY_ERROR_TEMP_325[1-9]* defined 
in {*}core/src/main/resources/error/error-classes.json{*}. The name should be 
short but complete (look at the example in error-classes.json).

Add a test which triggers the error from user code if such test still doesn't 
exist. Check exception fields by using {*}checkError(){*}. The last function 
checks valuable error fields only, and avoids dependencies from error text 
message. In this way, tech editors can modify error format in 
error-classes.json, and don't worry of Spark's internal tests. Migrate other 
tests that might trigger the error onto checkError().

If you cannot reproduce the error from user space (using SQL query), replace 
the error by an internal error, see {*}SparkException.internalError(){*}.

Improve the error message format in error-classes.json if the current is not 
clear. Propose a solution to users how to avoid and fix such kind of errors.

Please, look at the PR below as examples:
 * [https://github.com/apache/spark/pull/38685]
 * [https://github.com/apache/spark/pull/38656]
 * [https://github.com/apache/spark/pull/38490]


> Assign names to the error classes _LEGACY_ERROR_TEMP_324[7-9]
> -
>
> Key: SPARK-47255
> URL: https://issues.apache.org/jira/browse/SPARK-47255
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Max Gekk
>Priority: Minor
>  Labels: starter
>
> Choose a proper name for the error class *_LEGACY_ERROR_TEMP_324[7-9]* 
> defined in {*}core/src/main/resources/error/error-classes.json{*}. The name 
> should be short but complete (look at the example in error-classes.json).
> Add a test which triggers the error from user code if such test still doesn't 
> exist. Check exception fields by using {*}checkError(){*}. The last function 
> checks valuable error fields only, and avoids dependencies from error text 
> message. In this way, tech editors can modify error format in 
> error-classes.json, and don't worry of Spark's internal tests. Migrate other 
> tests that might trigger the error onto checkError().
> If you cannot reproduce the error from user space (using SQL query), replace 
> the error by an internal error, see {*}SparkException.internalError(){*}.
> Improve the error message format in error-classes.json if the current is not 
> clear. Propose a solution to users how to avoid and fix such kind of errors.
> Please, look at the PR below as examples:
>  * [https://github.com/apache/spark/pull/38685]
>  * [https://github.com/apache/spark/pull/38656]
>  * [https://github.com/apache/spark/pull/38490]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-47255) Assign names to the error classes _LEGACY_ERROR_TEMP_324[7-9]

2024-03-03 Thread Max Gekk (Jira)
Max Gekk created SPARK-47255:


 Summary: Assign names to the error classes 
_LEGACY_ERROR_TEMP_324[7-9]
 Key: SPARK-47255
 URL: https://issues.apache.org/jira/browse/SPARK-47255
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 4.0.0
Reporter: Max Gekk


Choose a proper name for the error class *_LEGACY_ERROR_TEMP_325[1-9]* defined 
in {*}core/src/main/resources/error/error-classes.json{*}. The name should be 
short but complete (look at the example in error-classes.json).

Add a test which triggers the error from user code if such test still doesn't 
exist. Check exception fields by using {*}checkError(){*}. The last function 
checks valuable error fields only, and avoids dependencies from error text 
message. In this way, tech editors can modify error format in 
error-classes.json, and don't worry of Spark's internal tests. Migrate other 
tests that might trigger the error onto checkError().

If you cannot reproduce the error from user space (using SQL query), replace 
the error by an internal error, see {*}SparkException.internalError(){*}.

Improve the error message format in error-classes.json if the current is not 
clear. Propose a solution to users how to avoid and fix such kind of errors.

Please, look at the PR below as examples:
 * [https://github.com/apache/spark/pull/38685]
 * [https://github.com/apache/spark/pull/38656]
 * [https://github.com/apache/spark/pull/38490]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-47254) Assign names to the error classes _LEGACY_ERROR_TEMP_325[1-9]

2024-03-03 Thread Max Gekk (Jira)
Max Gekk created SPARK-47254:


 Summary: Assign names to the error classes 
_LEGACY_ERROR_TEMP_325[1-9]
 Key: SPARK-47254
 URL: https://issues.apache.org/jira/browse/SPARK-47254
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.5.0
Reporter: Max Gekk


Choose a proper name for the error class *_LEGACY_ERROR_TEMP_2000* defined in 
{*}core/src/main/resources/error/error-classes.json{*}. The name should be 
short but complete (look at the example in error-classes.json).

Add a test which triggers the error from user code if such test still doesn't 
exist. Check exception fields by using {*}checkError(){*}. The last function 
checks valuable error fields only, and avoids dependencies from error text 
message. In this way, tech editors can modify error format in 
error-classes.json, and don't worry of Spark's internal tests. Migrate other 
tests that might trigger the error onto checkError().

If you cannot reproduce the error from user space (using SQL query), replace 
the error by an internal error, see {*}SparkException.internalError(){*}.

Improve the error message format in error-classes.json if the current is not 
clear. Propose a solution to users how to avoid and fix such kind of errors.

Please, look at the PR below as examples:
 * [https://github.com/apache/spark/pull/38685]
 * [https://github.com/apache/spark/pull/38656]
 * [https://github.com/apache/spark/pull/38490]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47254) Assign names to the error classes _LEGACY_ERROR_TEMP_325[1-9]

2024-03-03 Thread Max Gekk (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk updated SPARK-47254:
-
Affects Version/s: 4.0.0
   (was: 3.5.0)

> Assign names to the error classes _LEGACY_ERROR_TEMP_325[1-9]
> -
>
> Key: SPARK-47254
> URL: https://issues.apache.org/jira/browse/SPARK-47254
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Max Gekk
>Priority: Minor
>  Labels: starter
>
> Choose a proper name for the error class *_LEGACY_ERROR_TEMP_2000* defined in 
> {*}core/src/main/resources/error/error-classes.json{*}. The name should be 
> short but complete (look at the example in error-classes.json).
> Add a test which triggers the error from user code if such test still doesn't 
> exist. Check exception fields by using {*}checkError(){*}. The last function 
> checks valuable error fields only, and avoids dependencies from error text 
> message. In this way, tech editors can modify error format in 
> error-classes.json, and don't worry of Spark's internal tests. Migrate other 
> tests that might trigger the error onto checkError().
> If you cannot reproduce the error from user space (using SQL query), replace 
> the error by an internal error, see {*}SparkException.internalError(){*}.
> Improve the error message format in error-classes.json if the current is not 
> clear. Propose a solution to users how to avoid and fix such kind of errors.
> Please, look at the PR below as examples:
>  * [https://github.com/apache/spark/pull/38685]
>  * [https://github.com/apache/spark/pull/38656]
>  * [https://github.com/apache/spark/pull/38490]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47254) Assign names to the error classes _LEGACY_ERROR_TEMP_325[1-9]

2024-03-03 Thread Max Gekk (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk updated SPARK-47254:
-
Description: 
Choose a proper name for the error class *_LEGACY_ERROR_TEMP_325[1-9]* defined 
in {*}core/src/main/resources/error/error-classes.json{*}. The name should be 
short but complete (look at the example in error-classes.json).

Add a test which triggers the error from user code if such test still doesn't 
exist. Check exception fields by using {*}checkError(){*}. The last function 
checks valuable error fields only, and avoids dependencies from error text 
message. In this way, tech editors can modify error format in 
error-classes.json, and don't worry of Spark's internal tests. Migrate other 
tests that might trigger the error onto checkError().

If you cannot reproduce the error from user space (using SQL query), replace 
the error by an internal error, see {*}SparkException.internalError(){*}.

Improve the error message format in error-classes.json if the current is not 
clear. Propose a solution to users how to avoid and fix such kind of errors.

Please, look at the PR below as examples:
 * [https://github.com/apache/spark/pull/38685]
 * [https://github.com/apache/spark/pull/38656]
 * [https://github.com/apache/spark/pull/38490]

  was:
Choose a proper name for the error class *_LEGACY_ERROR_TEMP_2000* defined in 
{*}core/src/main/resources/error/error-classes.json{*}. The name should be 
short but complete (look at the example in error-classes.json).

Add a test which triggers the error from user code if such test still doesn't 
exist. Check exception fields by using {*}checkError(){*}. The last function 
checks valuable error fields only, and avoids dependencies from error text 
message. In this way, tech editors can modify error format in 
error-classes.json, and don't worry of Spark's internal tests. Migrate other 
tests that might trigger the error onto checkError().

If you cannot reproduce the error from user space (using SQL query), replace 
the error by an internal error, see {*}SparkException.internalError(){*}.

Improve the error message format in error-classes.json if the current is not 
clear. Propose a solution to users how to avoid and fix such kind of errors.

Please, look at the PR below as examples:
 * [https://github.com/apache/spark/pull/38685]
 * [https://github.com/apache/spark/pull/38656]
 * [https://github.com/apache/spark/pull/38490]


> Assign names to the error classes _LEGACY_ERROR_TEMP_325[1-9]
> -
>
> Key: SPARK-47254
> URL: https://issues.apache.org/jira/browse/SPARK-47254
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Max Gekk
>Priority: Minor
>  Labels: starter
>
> Choose a proper name for the error class *_LEGACY_ERROR_TEMP_325[1-9]* 
> defined in {*}core/src/main/resources/error/error-classes.json{*}. The name 
> should be short but complete (look at the example in error-classes.json).
> Add a test which triggers the error from user code if such test still doesn't 
> exist. Check exception fields by using {*}checkError(){*}. The last function 
> checks valuable error fields only, and avoids dependencies from error text 
> message. In this way, tech editors can modify error format in 
> error-classes.json, and don't worry of Spark's internal tests. Migrate other 
> tests that might trigger the error onto checkError().
> If you cannot reproduce the error from user space (using SQL query), replace 
> the error by an internal error, see {*}SparkException.internalError(){*}.
> Improve the error message format in error-classes.json if the current is not 
> clear. Propose a solution to users how to avoid and fix such kind of errors.
> Please, look at the PR below as examples:
>  * [https://github.com/apache/spark/pull/38685]
>  * [https://github.com/apache/spark/pull/38656]
>  * [https://github.com/apache/spark/pull/38490]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47230) The column pruning is not working as expected for nested struct in an array

2024-03-03 Thread Ling Qin (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ling Qin updated SPARK-47230:
-
Description: 
See the code snippet below, when explode an array of struct and select one 
field in the struct,  some unexpected behaviour observed:
 * If the field in the struct is in the select clause, not in the where clause, 
the column pruning works as expected.
 * If the field in the struct is in the select clause and in the where clause, 
the column pruning not working.
 * If the field in the struct is not even selected, the column pruning not 
working

{code:java}
from pyspark.sql import SparkSession
from pyspark.sql.types import IntegerType, StructField, StructType, ArrayType
import random

spark = SparkSession.builder.appName("example").getOrCreate()# Create an RDD 
with an array of structs, each array having a random size between 5 to 10
rdd = spark.range(1000).rdd.map(lambda x: (x.id + 3, [(x.id + i, x.id - i) for 
i in range(1, random.randint(5, 11))]))

# Define a new schema

schema = StructType([
    StructField("a", IntegerType(), True),
    StructField("b", ArrayType(StructType([
        StructField("x", IntegerType(), True),
        StructField("y", IntegerType(), True)
    ])), True)
])

# Create a DataFrame with the new schema
df = spark.createDataFrame(rdd, schema=schema)

# Write the DataFrame to a parquet file
df.repartition(1).write.mode('overwrite').parquet('test.parquet')

# Read the parquet file back into a DataFrame
df = spark.read.parquet('test.parquet') 
df.createOrReplaceTempView("df_view")
spark.conf.set('spark.sql.optimizer.nestedSchemaPruning.enabled', 'true')

# case 1, as expected
sql_query = """
SELECT a, EXPLODE(b.x) AS bb
FROM df_view
"""
spark.sql(sql_query).explain()

# ReadSchema: struct>>

# case 2, as expected
sql_query = """
SELECT a, bb.x 
FROM df_view 
lateral view explode(b) as bb
"""
spark.sql(sql_query).explain()

# ReadSchema: struct>>

# case 3, bug: should only read b.x
sql_query = """
SELECT a, bb.x 
FROM df_view 
lateral view explode(b) as bb
where bb.x is not null
"""
spark.sql(sql_query).explain()

#ReadSchema: struct>>

#case 4, bug? seems no need to read both a and b
sql_query = """
SELECT a
FROM df_view 
lateral view explode(b) as bb
"""
spark.sql(sql_query).explain()

#ReadSchema: struct>>{code}
 
 
 

  was:
See the code snippet below, when explode an array of struct and select one 
field in the struct,  some unexpected behaviour observed:
 * If the field in the struct is in the select clause, not in the where clause, 
the column pruning works as expected.
 * If the field in the struct is in the select clause and in the where clause, 
the column pruning not working.
 * If the field in the struct is not even selected, the column pruning not 
working

{code:java}
from pyspark.sql import SparkSession
from pyspark.sql.types import IntegerType, StructField, StructType, ArrayType
import random

spark = SparkSession.builder.appName("example").getOrCreate()# Create an RDD 
with an array of structs, each array having a random size between 5 to 10
rdd = spark.range(1000).rdd.map(lambda x: (x.id + 3, [(x.id + i, x.id - i) for 
i in range(1, random.randint(5, 11))]))

# Define a new schema

schema = StructType([
    StructField("a", IntegerType(), True),
    StructField("b", ArrayType(StructType([
        StructField("x", IntegerType(), True),
        StructField("y", IntegerType(), True)
    ])), True)
])

# Create a DataFrame with the new schema
df = spark.createDataFrame(rdd, schema=schema)

# Write the DataFrame to a parquet file
df.repartition(1).write.mode('overwrite').parquet('test.parquet')

# Read the parquet file back into a DataFrame
df = spark.read.parquet('test.parquet')

spark.conf.set('spark.sql.optimizer.nestedSchemaPruning.enabled', 'true')

# case 1, as expected
sql_query = """
SELECT a, EXPLODE(b.x) AS bb
FROM df_view
"""
spark.sql(sql_query).explain()

# ReadSchema: struct>>

# case 2, as expected
sql_query = """
SELECT a, bb.x 
FROM df_view 
lateral view explode(b) as bb
"""
spark.sql(sql_query).explain()

# ReadSchema: struct>>

# case 3, bug: should only read b.x
sql_query = """
SELECT a, bb.x 
FROM df_view 
lateral view explode(b) as bb
where bb.x is not null
"""
spark.sql(sql_query).explain()

#ReadSchema: struct>>

#case 4, bug? seems no need to read both a and b
sql_query = """
SELECT a
FROM df_view 
lateral view explode(b) as bb
"""
spark.sql(sql_query).explain()

#ReadSchema: struct>>{code}
 
 
 


> The column pruning is not working as expected for nested struct in an array
> ---
>
> Key: SPARK-47230
> URL: https://issues.apache.org/jira/browse/SPARK-47230
> Project: Spark
>  Issue Type: Bug
>  Components: Optimizer, Spark Core, Spark Shell
>Affects Versions: 3.3.0, 3.4.0, 3.5.0, 3.5.1
>