[jira] [Created] (SPARK-48954) Rename unreleased try_remainder() function to try_mod()
Serge Rielau created SPARK-48954:
Summary: Rename unreleased try_remainder() function to try_mod()
Key: SPARK-48954
URL: https://issues.apache.org/jira/browse/SPARK-48954
Project: Spark
Issue Type: Bug
Components: Spark Core
Affects Versions: 4.0.0
Reporter: Serge Rielau
Fix For: 4.0.0

The try_remainder() function is the try_* version of `%` and `mod`. Given that there is no `remainder()` function and no other product appears to offer a try_remainder(), we want to rename try_remainder() to try_mod() before it is released. See the sketch below.
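A quick sketch of the expected semantics, assuming try_mod() mirrors the other try_* functions by returning NULL where `%`/mod would raise an error (outputs are illustrative):
{code:sql}
-- Under ANSI mode, mod raises DIVIDE_BY_ZERO for a zero divisor
SELECT mod(7, 0);      -- error: [DIVIDE_BY_ZERO]

-- The try_* variant tolerates the error and returns NULL instead
SELECT try_mod(7, 0);  -- NULL
SELECT try_mod(7, 3);  -- 1
{code}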
[jira] [Created] (SPARK-48929) View fails with internal error after upgrade causes expected syntax error.
Serge Rielau created SPARK-48929:
Summary: View fails with internal error after upgrade causes expected syntax error.
Key: SPARK-48929
URL: https://issues.apache.org/jira/browse/SPARK-48929
Project: Spark
Issue Type: Bug
Components: Spark Core
Affects Versions: 4.0.0
Reporter: Serge Rielau
Fix For: 4.0.0

On an older Spark version:
CREATE VIEW v AS SELECT 1 ! IN (2);
SELECT * FROM v;
=> true
Upgrade to Spark 4:
SELECT * FROM v;
=> Internal error

This makes the problem hard to debug. Rather than treating a failure to parse a view's text as an internal error, we should assume that something such as an upgrade broke the view and expose the actual error.
[jira] [Created] (SPARK-48031) Add schema evolution options to views
Serge Rielau created SPARK-48031:
Summary: Add schema evolution options to views
Key: SPARK-48031
URL: https://issues.apache.org/jira/browse/SPARK-48031
Project: Spark
Issue Type: Improvement
Components: Spark Core
Affects Versions: 4.0.0
Reporter: Serge Rielau

We want to provide the ability for views to react to changes in the query resolution in ways other than simply failing. For example, we want a view to be able to compensate for type changes by casting the query result to the view's column types, or to adopt changes in column arity. A sketch of one possible syntax follows below.
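A minimal sketch of what such an option could look like; the WITH SCHEMA EVOLUTION clause below illustrates the idea and is not committed syntax:
{code:sql}
-- Hypothetical clause: the view adopts type and arity changes from the query
CREATE OR REPLACE VIEW v WITH SCHEMA EVOLUTION AS SELECT c1, c2 FROM t;

-- If t.c1 later changes from INT to BIGINT, the view re-resolves and
-- evolves its column type instead of failing
SELECT * FROM v;
{code}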
[jira] [Created] (SPARK-47907) Put removal of '!' as a synonym for 'NOT' on a keyword level under a config
Serge Rielau created SPARK-47907:
Summary: Put removal of '!' as a synonym for 'NOT' on a keyword level under a config
Key: SPARK-47907
URL: https://issues.apache.org/jira/browse/SPARK-47907
Project: Spark
Issue Type: Bug
Components: Spark Core
Affects Versions: 4.0.0
Reporter: Serge Rielau

Recently we dissolved the lexer equivalence between '!' and 'NOT': '!' is a prefix operator and a synonym for NOT in that position only, while NOT is used in many more places in the grammar. Given that there are a handful of known scenarios where users have exploited the undocumented loophole, it is best to add a config. Usage found so far:
`c1 ! IN(1, 2)`
`c1 ! BETWEEN 1 AND 2`
`c1 ! LIKE 'a%'`
But there are worse cases:
`c1 IS ! NULL`
`CREATE TABLE T(c1 INT ! NULL)`
or even
`CREATE TABLE IF ! EXISTS T(c1 INT)`
A sketch of the escape hatch is shown below.
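A sketch of how the escape hatch could be used; the config name follows the usual spark.sql.legacy.* pattern but is an assumption here, not the final name:
{code:sql}
-- Assumed config name; re-enables '!' as a keyword-level synonym for NOT
SET spark.sql.legacy.bangEqualsNot=true;

-- Parses again under the legacy setting
SELECT 1 WHERE 1 ! IN (2, 3);
{code}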
[jira] [Created] (SPARK-47802) Revert mapping ( star ) to named_struct ( star )
Serge Rielau created SPARK-47802:
Summary: Revert mapping ( star ) to named_struct ( star )
Key: SPARK-47802
URL: https://issues.apache.org/jira/browse/SPARK-47802
Project: Spark
Issue Type: Bug
Components: Spark Core
Affects Versions: 4.0.0
Reporter: Serge Rielau

Turning a star within parentheses into named_struct(star), as opposed to ignoring the parentheses, turns out to be riskier than anticipated. Given that this was done solely for consistency with (c1, c2, ...), it is best not to go there at all. The two readings are illustrated below.
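For context, a small illustration of the two readings that were at stake (annotations are illustrative):
{code:sql}
CREATE TABLE t(c1 INT, c2 INT);

-- A parenthesized column list is sugar for a struct:
SELECT (c1, c2) FROM t;   -- named_struct('c1', c1, 'c2', c2)

-- The reverted change would have treated (*) the same way;
-- historically the parentheses are simply ignored:
SELECT (*) FROM t;        -- same as SELECT * FROM t
{code}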
[jira] [Created] (SPARK-47789) Review and improve error message texts
Serge Rielau created SPARK-47789:
Summary: Review and improve error message texts
Key: SPARK-47789
URL: https://issues.apache.org/jira/browse/SPARK-47789
Project: Spark
Issue Type: Improvement
Components: Spark Core
Affects Versions: 4.0.0
Reporter: Serge Rielau

The error-classes.json content could use some TLC: fix formatting, improve grammar, and make other editorial improvements.
[jira] [Updated] (SPARK-47783) Refresh error-states.json
[ https://issues.apache.org/jira/browse/SPARK-47783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Serge Rielau updated SPARK-47783:
Summary: Refresh error-states.json (was: Refresh error-state.sql)

> Refresh error-states.json
> Key: SPARK-47783
> URL: https://issues.apache.org/jira/browse/SPARK-47783
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 4.0.0
> Reporter: Serge Rielau
> Priority: Major
>
> We want to add more SQLSTATEs to the menu to prevent collisions and do some general cleanup.
[jira] [Created] (SPARK-47783) Refresh error-state.sql
Serge Rielau created SPARK-47783:
Summary: Refresh error-state.sql
Key: SPARK-47783
URL: https://issues.apache.org/jira/browse/SPARK-47783
Project: Spark
Issue Type: Improvement
Components: Spark Core
Affects Versions: 4.0.0
Reporter: Serge Rielau

We want to add more SQLSTATEs to the menu to prevent collisions and do some general cleanup.
[jira] [Created] (SPARK-47719) Change default of spark.sql.legacy.timeParserPolicy from EXCEPTION to CORRECTED
Serge Rielau created SPARK-47719:
Summary: Change default of spark.sql.legacy.timeParserPolicy from EXCEPTION to CORRECTED
Key: SPARK-47719
URL: https://issues.apache.org/jira/browse/SPARK-47719
Project: Spark
Issue Type: Improvement
Components: Spark Core
Affects Versions: 4.0.0
Reporter: Serge Rielau

spark.sql.legacy.timeParserPolicy was introduced in Spark 3.0 and has defaulted to EXCEPTION since. Changing it to CORRECTED for Spark 4.0 will reduce errors, and enough time has passed for the change to be prudent. See the sketch below.
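For illustration, the kind of query the policy guards; the annotated results reflect the commonly cited example and are illustrative:
{code:sql}
-- The legacy parser silently accepted trailing text; the corrected parser
-- does not. Under the EXCEPTION default, this raises an upgrade error that
-- asks the user to choose a policy:
SELECT to_date('2020-01-27T20:06:11.847', 'yyyy-MM-dd');

-- With the proposed default, the corrected parser is used directly:
SET spark.sql.legacy.timeParserPolicy=CORRECTED;
SELECT to_date('2020-01-27T20:06:11.847', 'yyyy-MM-dd');  -- NULL
{code}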
[jira] [Created] (SPARK-47637) Use errorCapturingIdentifier rule in more places to improve error messages
Serge Rielau created SPARK-47637:
Summary: Use errorCapturingIdentifier rule in more places to improve error messages
Key: SPARK-47637
URL: https://issues.apache.org/jira/browse/SPARK-47637
Project: Spark
Issue Type: Improvement
Components: Spark Core
Affects Versions: 4.0.0
Reporter: Serge Rielau

errorCapturingIdentifier parses identifiers that contain '-' so that INVALID_IDENTIFIER can be raised instead of SYNTAX_ERROR for non-delimited identifiers containing a hyphen. It is meant to be used wherever the context is not that of an expression. This Jira replaces a few missed identifiers with that rule. The difference is illustrated below.
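An illustration of the error-message difference the rule buys; the message texts are abbreviated and illustrative:
{code:sql}
CREATE TABLE my-table(c1 INT);
-- With the plain identifier rule:
--   [PARSE_SYNTAX_ERROR] Syntax error at or near '-'
-- With errorCapturingIdentifier:
--   [INVALID_IDENTIFIER] The unquoted identifier my-table is invalid ...
--   back quote it as: `my-table`
{code}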
[jira] [Created] (SPARK-47571) date_format() java.lang.ArithmeticException: long overflow for large dates
Serge Rielau created SPARK-47571:
Summary: date_format() java.lang.ArithmeticException: long overflow for large dates
Key: SPARK-47571
URL: https://issues.apache.org/jira/browse/SPARK-47571
Project: Spark
Issue Type: Bug
Components: Spark Core
Affects Versions: 3.4.0
Reporter: Serge Rielau

The following works for CAST, but not for DATE_FORMAT():

select cast(cast('5881580' AS DATE) AS STRING);
+5881580-01-01

spark-sql (default)> select date_format(cast('5881580' AS DATE), 'yyy-mm-dd');
24/03/26 11:08:23 ERROR SparkSQLDriver: Failed in [select date_format(cast('5881580' AS DATE), 'yyy-mm-dd')]
java.lang.ArithmeticException: long overflow
 at java.base/java.lang.Math.multiplyExact(Math.java:1004)
 at org.apache.spark.sql.catalyst.util.SparkDateTimeUtils.instantToMicros(SparkDateTimeUtils.scala:122)
 at org.apache.spark.sql.catalyst.util.SparkDateTimeUtils.instantToMicros$(SparkDateTimeUtils.scala:116)
 at org.apache.spark.sql.catalyst.util.DateTimeUtils$.instantToMicros(DateTimeUtils.scala:41)
 at org.apache.spark.sql.catalyst.util.SparkDateTimeUtils.daysToMicros(SparkDateTimeUtils.scala:174)
 at org.apache.spark.sql.catalyst.util.SparkDateTimeUtils.daysToMicros$(SparkDateTimeUtils.scala:172)
 at org.apache.spark.sql.catalyst.util.DateTimeUtils$.daysToMicros(DateTimeUtils.scala:41)
 at org.apache.spark.sql.catalyst.expressions.Cast.$anonfun$castToTimestamp$14(Cast.scala:642)
 at scala.runtime.java8.JFunction1$mcJI$sp.apply(JFunction1$mcJI$sp.scala:17)
 at org.apache.spark.sql.catalyst.expressions.Cast.buildCast(Cast.scala:557)
 at org.apache.spark.sql.catalyst.expressions.Cast.$anonfun$castToTimestamp$13(Cast.scala:642)
 at org.apache.spark.sql.catalyst.expressions.Cast.nullSafeEval(Cast.scala:1170)
 at org.apache.spark.sql.catalyst.expressions.UnaryExpression.eval(Expression.scala:558)
[jira] [Updated] (SPARK-47492) Relax definition of whitespace in lexer
[ https://issues.apache.org/jira/browse/SPARK-47492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Serge Rielau updated SPARK-47492:
Summary: Relax definition of whitespace in lexer (was: Wide definition of whitespace in lexer)

> Relax definition of whitespace in lexer
> Key: SPARK-47492
> URL: https://issues.apache.org/jira/browse/SPARK-47492
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 4.0.0
> Reporter: Serge Rielau
> Priority: Major
>
> There have been multiple incidents where queries "copied" in from other sources resulted in "weird" syntax errors that ultimately boiled down to whitespace characters the lexer does not recognize as such.
[jira] [Created] (SPARK-47492) Wide definition of whitespace in lexer
Serge Rielau created SPARK-47492:
Summary: Wide definition of whitespace in lexer
Key: SPARK-47492
URL: https://issues.apache.org/jira/browse/SPARK-47492
Project: Spark
Issue Type: Improvement
Components: Spark Core
Affects Versions: 4.0.0
Reporter: Serge Rielau

There have been multiple incidents where queries "copied" in from other sources resulted in "weird" syntax errors that ultimately boiled down to whitespace characters the lexer does not recognize as such. A sketch of the failure mode follows below.
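A sketch of the failure mode; the non-breaking space (U+00A0), written here as <NBSP>, stands in for any whitespace-like character the lexer currently rejects:
{code:sql}
-- The character between 1 and FROM is U+00A0, often picked up when copying
-- SQL from web pages or rich-text documents:
SELECT 1<NBSP>FROM t;
-- => [PARSE_SYNTAX_ERROR] with no visible cause in the query text
{code}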
[jira] [Created] (SPARK-47467) Error message regressed when creating hive table with illegal column name
Serge Rielau created SPARK-47467:
Summary: Error message regressed when creating hive table with illegal column name
Key: SPARK-47467
URL: https://issues.apache.org/jira/browse/SPARK-47467
Project: Spark
Issue Type: Bug
Components: Spark Core
Affects Versions: 4.0.0
Reporter: Serge Rielau

The following statement used to result in:

CREATE TABLE test5(`北京, ` INT) USING HIVE;
[INVALID_HIVE_COLUMN_NAME|https://docs.databricks.com/error-messages/error-classes.html#invalid_hive_column_name] Cannot create the table `hive_metastore`.`srielau`.`test5` having the column `北京, ` whose name contains invalid characters ',' in Hive metastore. SQLSTATE: 42K05

Now it results in:

org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: MetaException(message:org.apache.hadoop.hive.serde2.SerDeException org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe: columns has 2 elements while columns.types has 1 elements!)
 at org.apache.spark.sql.hive.HiveExternalCatalog.$anonfun$withClient$2(HiveExternalCatalog.scala:168)
 at org.apache.spark.sql.hive.HiveExternalCatalog.maybeSynchronized(HiveExternalCatalog.scala:115)
 at org.apache.spark.sql.hive.HiveExternalCatalog.$anonfun$withClient$1(HiveExternalCatalog.scala:153)
 at com.databricks.backend.daemon.driver.ProgressReporter$.withStatusCode(ProgressReporter.scala:405)
 at com.databricks.backend.daemon.driver.ProgressReporter$.withStatusCode(ProgressReporter.scala:391)
 at com.databricks.spark.util.SparkDatabricksProgressReporter$.withStatusCode(ProgressReporter.scala:34)
 at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:152)
 at org.apache.spark.sql.hive.HiveExternalCatalog.createTable(HiveExternalCatalog.scala:312)
 at org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.$anonfun$createTable$1(ExternalCatalogWithListener.scala:122)
 at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
 at org.apache.spark.sql.catalyst.MetricKeyUtils$.measure(MetricKey.scala:661)
 at org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.$anonfun$profile$1(ExternalCatalogWithListener.scala:54)
 at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:94)
 at org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.profile(ExternalCatalogWithListener.scala:53)

This may be related to: https://github.com/apache/spark/pull/45180
[jira] [Created] (SPARK-47427) Support trailing commas in select list
Serge Rielau created SPARK-47427:
Summary: Support trailing commas in select list
Key: SPARK-47427
URL: https://issues.apache.org/jira/browse/SPARK-47427
Project: Spark
Issue Type: Improvement
Components: Spark Core
Affects Versions: 4.0.0
Reporter: Serge Rielau

DuckDB has popularized allowing trailing commas in the SELECT list. The benefit of this ability is that it is easy to add, remove, or comment out expressions in the select list:
{noformat}
SELECT c1, /* c2 */ FROM T;
vs
SELECT c1 /* , c2 */ FROM T;
{noformat}
Recently Snowflake adopted this usability feature as well.
[jira] [Created] (SPARK-47382) SPARK_JOB_CANCELLED is mislabeled as a system error
Serge Rielau created SPARK-47382:
Summary: SPARK_JOB_CANCELLED is mislabeled as a system error
Key: SPARK-47382
URL: https://issues.apache.org/jira/browse/SPARK-47382
Project: Spark
Issue Type: Bug
Components: Spark Core
Affects Versions: 4.0.0
Reporter: Serge Rielau

This relates to: https://github.com/apache/spark/pull/43926
The proper SQLSTATE should be 57014: "processing was canceled as requested".
[jira] [Created] (SPARK-47344) Enhance error message for invalid identifiers that need backticks
Serge Rielau created SPARK-47344:
Summary: Enhance error message for invalid identifiers that need backticks
Key: SPARK-47344
URL: https://issues.apache.org/jira/browse/SPARK-47344
Project: Spark
Issue Type: Improvement
Components: Spark Core
Affects Versions: 4.0.0
Reporter: Serge Rielau

We detect patterns like "my-tab" and raise a meaningful INVALID_IDENTIFIER error when such an identifier is not surrounded by backticks. In this ticket we want to extend this effort beyond dashes, as sketched below.
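A sketch of inputs that could get the improved message; which characters end up covered is exactly the scope of this ticket, so the second case is an assumption:
{code:sql}
-- Already handled: hyphens
SELECT * FROM my-tab;   -- [INVALID_IDENTIFIER] ... back quote it as `my-tab`

-- Candidate beyond dashes (assumed): embedded spaces
SELECT * FROM my tab;   -- today: a generic [PARSE_SYNTAX_ERROR]
{code}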
[jira] [Created] (SPARK-47308) LATERAL regresses correlation name resolution
Serge Rielau created SPARK-47308:
Summary: LATERAL regresses correlation name resolution
Key: SPARK-47308
URL: https://issues.apache.org/jira/browse/SPARK-47308
Project: Spark
Issue Type: Bug
Components: Spark Core
Affects Versions: 3.5.0, 3.3.0
Reporter: Serge Rielau

{code:java}
CREATE TABLE persons(name STRING);
INSERT INTO persons VALUES('Table: persons');
CREATE OR REPLACE TABLE women(name STRING);
INSERT INTO women VALUES('Table: women');

-- This works:
SELECT (SELECT max(folk.id)
        FROM persons AS men(id), (SELECT name) AS folk(id))
FROM women;
Table: women

-- This does not:
SELECT (SELECT max(folk.id)
        FROM persons AS men(id), LATERAL (SELECT name) AS folk(id))
FROM women;
[UNRESOLVED_COLUMN.WITHOUT_SUGGESTION] A column, variable, or function parameter with name `name` cannot be resolved. SQLSTATE: 42703;
{code}
This is weird: LATERAL should be strictly additive to name resolution rules.
[jira] [Created] (SPARK-47192) Convert _LEGACY_ERROR_TEMP_0035 (unsupported hive feature)
Serge Rielau created SPARK-47192:
Summary: Convert _LEGACY_ERROR_TEMP_0035 (unsupported hive feature)
Key: SPARK-47192
URL: https://issues.apache.org/jira/browse/SPARK-47192
Project: Spark
Issue Type: Improvement
Components: Spark Core
Affects Versions: 4.0.0
Reporter: Serge Rielau

Old:
> GRANT ROLE;
_LEGACY_ERROR_TEMP_0035
Operation not allowed: grant role. (line 1, pos 0)

New:
error class: HIVE_OPERATION_NOT_SUPPORTED
The Hive operation <operation> is not supported. (line 1, pos 0)
sqlstate: 0A000
[jira] [Created] (SPARK-47033) EXECUTE IMMEDIATE USING does not recognize session variable names
Serge Rielau created SPARK-47033:
Summary: EXECUTE IMMEDIATE USING does not recognize session variable names
Key: SPARK-47033
URL: https://issues.apache.org/jira/browse/SPARK-47033
Project: Spark
Issue Type: Bug
Components: Spark Core
Affects Versions: 4.0.0
Reporter: Serge Rielau

{noformat}
DECLARE parm = 'Hello';

EXECUTE IMMEDIATE 'SELECT :parm' USING parm;
[ALL_PARAMETERS_MUST_BE_NAMED] Using name parameterized queries requires all parameters to be named. Parameters missing names: "parm". SQLSTATE: 07001

EXECUTE IMMEDIATE 'SELECT :parm' USING parm AS parm;
Hello
{noformat}
Variables are like column references: they act as their own aliases and thus should not be required to be named to associate with a named parameter of the same name. Note that, unlike for PySpark, this should be case insensitive (not yet verified).
[jira] [Created] (SPARK-46993) Allow session variables in more places such as from_json for schema
Serge Rielau created SPARK-46993:
Summary: Allow session variables in more places such as from_json for schema
Key: SPARK-46993
URL: https://issues.apache.org/jira/browse/SPARK-46993
Project: Spark
Issue Type: Improvement
Components: Spark Core
Affects Versions: 3.4.2
Reporter: Serge Rielau

It appears we do not allow session variables to provide the schema for from_json(). This is likely a generic restriction related to constant folding. A sketch of the desired usage follows below.
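A sketch of the desired usage; today this is expected to fail because the schema argument must be a constant string:
{code:sql}
DECLARE schema_str = 'a INT, b STRING';

-- Desired: the session variable is constant-folded and accepted as the schema
SELECT from_json('{"a": 1, "b": "x"}', schema_str);
{code}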
[jira] [Created] (SPARK-46908) Extend SELECT * support outside of select list
Serge Rielau created SPARK-46908:
Summary: Extend SELECT * support outside of select list
Key: SPARK-46908
URL: https://issues.apache.org/jira/browse/SPARK-46908
Project: Spark
Issue Type: Improvement
Components: Spark Core
Affects Versions: 4.0.0
Reporter: Serge Rielau

Traditionally * is confined to the select list, and there to the top level of expressions. Spark does, in an undocumented fashion, support * in function argument lists within the SELECT list. Here we want to expand upon this capability by adding the WHERE clause (Filter) as well as a couple more scenarios such as row value constructors and the IN operator, as sketched below.
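A sketch of the proposed extensions; the exact semantics are part of the work, so the annotations are illustrative:
{code:sql}
CREATE TABLE t(c1 INT, c2 INT);

-- Already supported (undocumented): * expanding into a function argument list
SELECT hash(*) FROM t;            -- hash(c1, c2)

-- Proposed: * in the WHERE clause, e.g. in a row value constructor with IN
SELECT 1 FROM t WHERE (*) IN (SELECT c1, c2 FROM t);
{code}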
[jira] [Commented] (SPARK-46810) Clarify error class terminology
[ https://issues.apache.org/jira/browse/SPARK-46810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17811631#comment-17811631 ]

Serge Rielau commented on SPARK-46810:
Yes, I prefer option 1. Agreement from [~maxgekk] can't hurt.

> Clarify error class terminology
> Key: SPARK-46810
> URL: https://issues.apache.org/jira/browse/SPARK-46810
> Project: Spark
> Issue Type: Improvement
> Components: Documentation, SQL
> Affects Versions: 4.0.0
> Reporter: Nicholas Chammas
> Priority: Minor
> Labels: pull-request-available
>
> We use inconsistent terminology when talking about error classes. I'd like to get some clarity on that before contributing any potential improvements to this part of the documentation.
> Consider [INCOMPLETE_TYPE_DEFINITION|https://spark.apache.org/docs/3.5.0/sql-error-conditions-incomplete-type-definition-error-class.html]. It has several key pieces of hierarchical information that have inconsistent names throughout our documentation and codebase:
> * 42
> ** K01
> *** INCOMPLETE_TYPE_DEFINITION
> **** ARRAY
> **** MAP
> **** STRUCT
> What are the names of these different levels of information?
> Some examples of inconsistent terminology:
> * [Over here|https://spark.apache.org/docs/latest/sql-error-conditions-sqlstates.html#class-42-syntax-error-or-access-rule-violation] we call 42 the "class". Yet on the main page for INCOMPLETE_TYPE_DEFINITION we call that an "error class". So what exactly is a class, the 42 or the INCOMPLETE_TYPE_DEFINITION?
> * [Over here|https://github.com/apache/spark/blob/26d3eca0a8d3303d0bb9450feb6575ed145bbd7e/common/utils/src/main/resources/error/README.md#L122] we call K01 the "subclass". But [over here|https://github.com/apache/spark/blob/26d3eca0a8d3303d0bb9450feb6575ed145bbd7e/common/utils/src/main/resources/error/error-classes.json#L1452-L1467] we call the ARRAY, MAP, and STRUCT the subclasses. And on the main page for INCOMPLETE_TYPE_DEFINITION we call those same things "derived error classes". So what exactly is a subclass?
> * [On this page|https://spark.apache.org/docs/3.5.0/sql-error-conditions.html#incomplete_type_definition] we call INCOMPLETE_TYPE_DEFINITION an "error condition", though in other places we refer to it as an "error class".
> I don't think we should leave this status quo as-is. I see a couple of ways to fix this.
> h1. Option 1: INCOMPLETE_TYPE_DEFINITION becomes an "Error Condition"
> One solution is to use the following terms:
> * Error class: 42
> * Error sub-class: K01
> * Error state: 42K01
> * Error condition: INCOMPLETE_TYPE_DEFINITION
> * Error sub-condition: ARRAY, MAP, STRUCT
> Pros:
> * This terminology seems (to me at least) the most natural and intuitive.
> * It may also match the SQL standard.
> Cons:
> * We use {{errorClass}} [all over our codebase|https://github.com/apache/spark/blob/15c9ec7ca3b66ec413b7964a374cb9508a80/common/utils/src/main/scala/org/apache/spark/SparkException.scala#L30] – literally in thousands of places – to refer to strings like INCOMPLETE_TYPE_DEFINITION.
> ** It's probably not practical to update all these usages to say {{errorCondition}} instead, so if we go with this approach there will be a divide between the terminology we use in user-facing documentation vs. what the code base uses.
> ** We can perhaps rename the existing {{error-classes.json}} to {{error-conditions.json}} but clarify the reason for this divide between code and user docs in the documentation for {{ErrorClassesJsonReader}}.
> h1. Option 2: 42 becomes an "Error Category"
> Another approach is to use the following terminology:
> * Error category: 42
> * Error sub-category: K01
> * Error state: 42K01
> * Error class: INCOMPLETE_TYPE_DEFINITION
> * Error sub-classes: ARRAY, MAP, STRUCT
> Pros:
> * We continue to use "error class" as we do today in our code base.
> * The change from calling "42" a class to a category is low impact and may not show up in user-facing documentation at all. (See my side note below.)
> Cons:
> * These terms may not align with the SQL standard.
> * We will have to retire the term "error condition", which we have [already used|https://github.com/apache/spark/blob/e7fb0ad68f73d0c1996b19c9e139d70dcc97a8c4/docs/sql-error-conditions.md] in user-facing documentation.
> Side note: In either case, I believe talking about "42" and "K01" – regardless of what we end up calling them – in front of users is not helpful. I don't think anybody cares what "42" by itself means, or what "K01" by itself means. Accordingly, we should limit how much we talk about these concepts in the user-facing documentation.
[jira] [Commented] (SPARK-46810) Clarify error class terminology
[ https://issues.apache.org/jira/browse/SPARK-46810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17811606#comment-17811606 ]

Serge Rielau commented on SPARK-46810:
[~nchammas] ISO/IEC 9075-2:2016(E), 24.1 SQLSTATE:

The character string value returned in an SQLSTATE parameter comprises a 2-character class code followed by a 3-character subclass code, each with an implementation-defined character set that has a one-octet character encoding form and is restricted to <simple Latin upper case letter>s and <digit>s. Table 38, "SQLSTATE class and subclass codes", specifies the class code for each condition and the subclass code or codes for each class code.

Class codes that begin with one of the <digit>s '0', '1', '2', '3', or '4' or one of the <simple Latin upper case letter>s 'A', 'B', 'C', 'D', 'E', 'F', 'G', or 'H' are returned only for conditions defined in ISO/IEC 9075 or in any other International Standard. The range of such class codes is called standard-defined classes. Some such class codes are reserved for use by specific International Standards, as specified elsewhere in this Clause. Subclass codes associated with such classes that also begin with one of those 13 characters are returned only for conditions defined in ISO/IEC 9075 or some other International Standard. The range of such subclass codes is called standard-defined subclasses. Subclass codes associated with such classes that begin with one of the <digit>s '5', '6', '7', '8', or '9' or one of the <simple Latin upper case letter>s 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', or 'Z' are reserved for implementation-defined conditions and are called implementation-defined subclasses.

Class codes that begin with one of the <digit>s '5', '6', '7', '8', or '9' or one of the <simple Latin upper case letter>s 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', or 'Z' are reserved for implementation-defined exception conditions and are called implementation-defined classes. All subclass codes except '000', which means no subclass, associated with such classes are reserved for implementation-defined conditions and are called implementation-defined subclasses. An implementation-defined completion condition shall be indicated by returning an implementation-defined subclass in conjunction with one of the classes successful completion, warning, or no data.

I'm fine with the renaming of error class to error condition and subcondition.

> Clarify error class terminology
> Key: SPARK-46810
> URL: https://issues.apache.org/jira/browse/SPARK-46810
> Project: Spark
> Issue Type: Improvement
> Components: Documentation, SQL
> Affects Versions: 4.0.0
> Reporter: Nicholas Chammas
> Priority: Minor
> Labels: pull-request-available
>
> We use inconsistent terminology when talking about error classes. I'd like to get some clarity on that before contributing any potential improvements to this part of the documentation.
> Consider [INCOMPLETE_TYPE_DEFINITION|https://spark.apache.org/docs/3.5.0/sql-error-conditions-incomplete-type-definition-error-class.html]. It has several key pieces of hierarchical information that have inconsistent names throughout our documentation and codebase:
> * 42
> ** K01
> *** INCOMPLETE_TYPE_DEFINITION
> **** ARRAY
> **** MAP
> **** STRUCT
> What are the names of these different levels of information?
> Some examples of inconsistent terminology:
> * [Over here|https://spark.apache.org/docs/latest/sql-error-conditions-sqlstates.html#class-42-syntax-error-or-access-rule-violation] we call 42 the "class". Yet on the main page for INCOMPLETE_TYPE_DEFINITION we call that an "error class". So what exactly is a class, the 42 or the INCOMPLETE_TYPE_DEFINITION?
> * [Over here|https://github.com/apache/spark/blob/26d3eca0a8d3303d0bb9450feb6575ed145bbd7e/common/utils/src/main/resources/error/README.md#L122] we call K01 the "subclass". But [over here|https://github.com/apache/spark/blob/26d3eca0a8d3303d0bb9450feb6575ed145bbd7e/common/utils/src/main/resources/error/error-classes.json#L1452-L1467] we call the ARRAY, MAP, and STRUCT the subclasses. And on the main page for INCOMPLETE_TYPE_DEFINITION we call those same things "derived error classes". So what exactly is a subclass?
> * [On this page|https://spark.apache.org/docs/3.5.0/sql-error-conditions.html#incomplete_type_definition] we call INCOMPLETE_TYPE_DEFINITION an "error condition", though in other places we refer to it as an "error class".
> I don't think we should leave this status quo as-is. I see a couple of ways to fix this.
> h1. Option 1: INCOMPLETE_TYPE_DEFINITION becomes an "Error Condition"
> One solution is to use the following terms:
> * Error class: 42
> * Error sub-class: K01
> * Error state: 42K01
> * Error condition: INCOMPLETE_TYPE_DEFINITION
[jira] [Created] (SPARK-46782) Bad SQLSTATE "ID001" for INVALID_INVERSE_DISTRIBUTION_FUNCTION
Serge Rielau created SPARK-46782:
Summary: Bad SQLSTATE "ID001" for INVALID_INVERSE_DISTRIBUTION_FUNCTION
Key: SPARK-46782
URL: https://issues.apache.org/jira/browse/SPARK-46782
Project: Spark
Issue Type: Bug
Components: Spark Core
Affects Versions: 4.0.0
Reporter: Serge Rielau

INVALID_INVERSE_DISTRIBUTION_FUNCTION uses SQLSTATE class "ID", which is undefined, and it is heavy-handed to consume a whole class for one function. In Spark we use the K** subclass range to expand the existing classes with Spark-private states. Since this error appears to be raised at compile time, I propose "42" as the class; the next free slot in 42K** is 42K0K.
[jira] [Created] (SPARK-46625) IDENTIFIER clause does not work with CTE reference
Serge Rielau created SPARK-46625:
Summary: IDENTIFIER clause does not work with CTE reference
Key: SPARK-46625
URL: https://issues.apache.org/jira/browse/SPARK-46625
Project: Spark
Issue Type: Bug
Components: Spark Core
Affects Versions: 3.3.4
Reporter: Serge Rielau

IDENTIFIER clause does not pick up a CTE:

DECLARE agg = 'max';
DECLARE col = 'c1';
DECLARE tab = 'T';

WITH S(c1, c2) AS (VALUES(1, 2), (2, 3)),
     T(c1, c2) AS (VALUES ('a', 'b'), ('c', 'd'))
SELECT IDENTIFIER(agg)(IDENTIFIER(col)) FROM IDENTIFIER(tab);

[TABLE_OR_VIEW_NOT_FOUND] The table or view `T` cannot be found. Verify the spelling and correctness of the schema and catalog. If you did not qualify the name with a schema, verify the current_schema() output, or qualify the name with the correct schema and catalog. To tolerate the error on drop use DROP VIEW IF EXISTS or DROP TABLE IF EXISTS. SQLSTATE: 42P01; line 3 pos 45;
'Project [unresolvedalias(expressionwithunresolvedidentifier('agg, org.apache.spark.sql.catalyst.parser.AstBuilder$$Lambda$2785/0x009002071490@126688a7))]
+- 'UnresolvedRelation [T], [], false
[jira] [Created] (SPARK-46410) Assign error classes/subclasses to JdbcUtils.classifyException
Serge Rielau created SPARK-46410:
Summary: Assign error classes/subclasses to JdbcUtils.classifyException
Key: SPARK-46410
URL: https://issues.apache.org/jira/browse/SPARK-46410
Project: Spark
Issue Type: Improvement
Components: Spark Core
Affects Versions: 3.5.0
Reporter: Serge Rielau

This is a follow-up to SPARK-46393. We should raise distinct error classes for the different kinds of invokers of JdbcUtils.classifyException.
[jira] [Created] (SPARK-46372) "Invalid call to toAttribute on unresolved object" instead of UNRESOLVED_COLUMN.WITH_SUGGESTION on INSERT statement
Serge Rielau created SPARK-46372:
Summary: "Invalid call to toAttribute on unresolved object" instead of UNRESOLVED_COLUMN.WITH_SUGGESTION on INSERT statement
Key: SPARK-46372
URL: https://issues.apache.org/jira/browse/SPARK-46372
Project: Spark
Issue Type: Bug
Components: Spark Core
Affects Versions: 3.5.0
Reporter: Serge Rielau

CREATE TABLE rec(n INT, sm INT);

SELECT n + 1, n + 1 + sm FROM rec WHERE rec = 8;
[UNRESOLVED_COLUMN.WITH_SUGGESTION] A column, variable, or function parameter with name `rec` cannot be resolved. Did you mean one of the following? [`n`, `sm`]. SQLSTATE: 42703; line 1 pos 40;

But when placed in an INSERT:

INSERT INTO rec SELECT n + 1, n + 1 + sm FROM rec WHERE rec = 8;
Invalid call to toAttribute on unresolved object

1. This appears to be a system error, and we should raise it as such.
2. Clearly we missed, or didn't get to, the point of raising the proper error.

Stacktrace:
{quote}
scala> spark.sql("INSERT INTO rec SELECT n + 1, n + 1 + sm FROM rec WHERE rec = 8").show();
23/12/11 18:12:25 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 2.3.0
23/12/11 18:12:25 WARN ObjectStore: setMetaStoreSchemaVersion called but recording version is disabled: version = 2.3.0, comment = Set by MetaStore serge.rielau@10.240.1.53
org.apache.spark.sql.catalyst.analysis.UnresolvedException: Invalid call to toAttribute on unresolved object
 at org.apache.spark.sql.catalyst.analysis.UnresolvedAlias.toAttribute(unresolved.scala:707)
 at org.apache.spark.sql.catalyst.plans.logical.Project.$anonfun$output$1(basicLogicalOperators.scala:74)
 at scala.collection.immutable.List.map(List.scala:246)
 at scala.collection.immutable.List.map(List.scala:79)
 at org.apache.spark.sql.catalyst.plans.logical.Project.output(basicLogicalOperators.scala:74)
 at org.apache.spark.sql.hive.HiveAnalysis$$anonfun$apply$3.applyOrElse(HiveStrategies.scala:166)
 at org.apache.spark.sql.hive.HiveAnalysis$$anonfun$apply$3.applyOrElse(HiveStrategies.scala:161)
 at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsDownWithPruning$2(AnalysisHelper.scala:170)
 at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(origin.scala:83)
 at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsDownWithPruning$1(AnalysisHelper.scala:170)
 at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:323)
 at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsDownWithPruning(AnalysisHelper.scala:168)
 at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsDownWithPruning$(AnalysisHelper.scala:164)
 at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsDownWithPruning(LogicalPlan.scala:33)
 at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsWithPruning(AnalysisHelper.scala:99)
 at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsWithPruning$(AnalysisHelper.scala:96)
 at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsWithPruning(LogicalPlan.scala:33)
 at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperators(AnalysisHelper.scala:76)
 at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperators$(AnalysisHelper.scala:75)
 at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperators(LogicalPlan.scala:33)
 at org.apache.spark.sql.hive.HiveAnalysis$.apply(HiveStrategies.scala:161)
 at org.apache.spark.sql.hive.HiveAnalysis$.apply(HiveStrategies.scala:160)
 at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$2(RuleExecutor.scala:222)
 at scala.collection.LinearSeqOps.foldLeft(LinearSeq.scala:183)
 at scala.collection.LinearSeqOps.foldLeft$(LinearSeq.scala:179)
 at scala.collection.immutable.List.foldLeft(List.scala:79)
 at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1(RuleExecutor.scala:219)
 at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1$adapted(RuleExecutor.scala:211)
 at scala.collection.immutable.List.foreach(List.scala:333)
 at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:211)
 at org.apache.spark.sql.catalyst.analysis.Analyzer.org$apache$spark$sql$catalyst$analysis$Analyzer$$executeSameContext(Analyzer.scala:224)
 at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$execute$1(Analyzer.scala:220)
 at org.apache.spark.sql.catalyst.analysis.AnalysisContext$.withNewAnalysisContext(Analyzer.scala:176)
 at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:220)
{quote}
[jira] [Created] (SPARK-46141) Change default of spark.sql.legacy.ctePrecedencePolicy from EXCEPTION to CORRECTED
Serge Rielau created SPARK-46141:
Summary: Change default of spark.sql.legacy.ctePrecedencePolicy from EXCEPTION to CORRECTED
Key: SPARK-46141
URL: https://issues.apache.org/jira/browse/SPARK-46141
Project: Spark
Issue Type: Improvement
Components: Spark Core
Affects Versions: 4.0.0
Reporter: Serge Rielau

spark.sql.legacy.ctePrecedencePolicy has been around for years and defaults to EXCEPTION. It is high time that we change it to CORRECTED. The conflict case it governs is sketched below.
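For illustration, the classic conflict the policy governs; the per-policy results annotated below follow the documented behavior and are illustrative:
{code:sql}
WITH t AS (SELECT 1 AS c)
SELECT * FROM (
  WITH t AS (SELECT 2 AS c)
  SELECT * FROM t
);
-- EXCEPTION (current default): fails and asks the user to pick a policy
-- CORRECTED (proposed default): inner definition wins => 2
-- LEGACY: outer definition wins => 1
{code}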
[jira] [Created] (SPARK-46068) Improve error message when using a string literal where only an identifier can go
Serge Rielau created SPARK-46068:
Summary: Improve error message when using a string literal where only an identifier can go
Key: SPARK-46068
URL: https://issues.apache.org/jira/browse/SPARK-46068
Project: Spark
Issue Type: Improvement
Components: Spark Core
Affects Versions: 3.4.2
Reporter: Serge Rielau

The following:
{noformat}
spark-sql (default)> select * from "t";
[PARSE_SYNTAX_ERROR] Syntax error at or near '"t"'. SQLSTATE: 42601 (line 1, pos 14)

== SQL ==
select * from "t"
--------------^^^
{noformat}
... is confusing if one is used to double quotes for identifiers. Similarly, it is easy to mix up ' and `. So we would like to return an error that clearly states that a string literal was given where an identifier was expected. We can also propose using spark.sql.ansi.double_quoted_identifiers in that case.
[jira] [Updated] (SPARK-45595) Expose SQLSTATE in error message
[ https://issues.apache.org/jira/browse/SPARK-45595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Serge Rielau updated SPARK-45595:
Summary: Expose SQLSTATE in error message (was: Expose SQLSTATRE in errormessage)

> Expose SQLSTATE in error message
> Key: SPARK-45595
> URL: https://issues.apache.org/jira/browse/SPARK-45595
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 4.0.0
> Reporter: Serge Rielau
> Priority: Major
>
> When using spark.sql.error.messageFormat in MINIMAL or STANDARD mode the SQLSTATE is exposed. We want to extend this to PRETTY mode, now that all errors have SQLSTATEs.
> We propose to trail the SQLSTATE after the text message, so it does not take away from the reading experience of the message, while still being easily found by tooling or humans:
> [<error class>] <message> SQLSTATE: <SQLSTATE>
> Example:
> [DIVIDE_BY_ZERO] Division by zero. Use `try_divide` to tolerate divisor being 0 and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error. SQLSTATE: 22013
> == SQL(line 1, position 8) ==
> SELECT 1/0
>        ^^^
> Other options considered have been:
> [DIVIDE_BY_ZERO](22013) Division by zero. Use `try_divide` to tolerate divisor being 0 and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
> == SQL(line 1, position 8) ==
> SELECT 1/0
>        ^^^
> and
> [DIVIDE_BY_ZERO] Division by zero. Use `try_divide` to tolerate divisor being 0 and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
> == SQL(line 1, position 8) ==
> SELECT 1/0
>        ^^^
> SQLSTATE: 22013
[jira] [Created] (SPARK-45595) Expose SQLSTATRE in errormessage
Serge Rielau created SPARK-45595:
Summary: Expose SQLSTATRE in errormessage
Key: SPARK-45595
URL: https://issues.apache.org/jira/browse/SPARK-45595
Project: Spark
Issue Type: Improvement
Components: Spark Core
Affects Versions: 4.0.0
Reporter: Serge Rielau

When using spark.sql.error.messageFormat in MINIMAL or STANDARD mode the SQLSTATE is exposed. We want to extend this to PRETTY mode, now that all errors have SQLSTATEs.
We propose to trail the SQLSTATE after the text message, so it does not take away from the reading experience of the message, while still being easily found by tooling or humans:

[<error class>] <message> SQLSTATE: <SQLSTATE>

Example:

[DIVIDE_BY_ZERO] Division by zero. Use `try_divide` to tolerate divisor being 0 and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error. SQLSTATE: 22013
== SQL(line 1, position 8) ==
SELECT 1/0
       ^^^

Other options considered have been:

[DIVIDE_BY_ZERO](22013) Division by zero. Use `try_divide` to tolerate divisor being 0 and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
== SQL(line 1, position 8) ==
SELECT 1/0
       ^^^

and

[DIVIDE_BY_ZERO] Division by zero. Use `try_divide` to tolerate divisor being 0 and return NULL instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
== SQL(line 1, position 8) ==
SELECT 1/0
       ^^^
SQLSTATE: 22013
[jira] [Created] (SPARK-45581) Make SQLSTATEs mandatory
Serge Rielau created SPARK-45581:
Summary: Make SQLSTATEs mandatory
Key: SPARK-45581
URL: https://issues.apache.org/jira/browse/SPARK-45581
Project: Spark
Issue Type: Improvement
Components: Spark Core
Affects Versions: 4.0.0
Reporter: Serge Rielau

All well-defined (non-_LEGACY) error classes have been issued SQLSTATEs. To keep things clean, it is time to enforce that any new error class must come with a SQLSTATE going forward.
[jira] [Resolved] (SPARK-45490) Replace: _LEGACY_ERROR_TEMP_2151 with a proper error class
[ https://issues.apache.org/jira/browse/SPARK-45490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Serge Rielau resolved SPARK-45490.
Resolution: Cannot Reproduce

Seems to have been implemented as: EXPRESSION_DECODING_FAILED

> Replace: _LEGACY_ERROR_TEMP_2151 with a proper error class
> Key: SPARK-45490
> URL: https://issues.apache.org/jira/browse/SPARK-45490
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 4.0.0
> Reporter: Serge Rielau
> Priority: Major
>
> {code:java}
> def expressionDecodingError(e: Exception, expressions: Seq[Expression]): SparkRuntimeException = {
>   new SparkRuntimeException(
>     errorClass = "_LEGACY_ERROR_TEMP_2151",
>     messageParameters = Map(
>       "e" -> e.toString(),
>       "expressions" -> expressions.map(
>         _.simpleString(SQLConf.get.maxToStringFields)).mkString("\n")),
>     cause = e)
> }
> {code}
[jira] [Created] (SPARK-45493) Replace: _LEGACY_ERROR_TEMP_2187 with a better error message
Serge Rielau created SPARK-45493:
Summary: Replace: _LEGACY_ERROR_TEMP_2187 with a better error message
Key: SPARK-45493
URL: https://issues.apache.org/jira/browse/SPARK-45493
Project: Spark
Issue Type: Improvement
Components: Spark Core
Affects Versions: 4.0.0
Reporter: Serge Rielau

{code:java}
def convertHiveTableToCatalogTableError(
    e: SparkException,
    dbName: String,
    tableName: String): Throwable = {
  new SparkException(
    errorClass = "_LEGACY_ERROR_TEMP_2187",
    messageParameters = Map(
      "message" -> e.getMessage,
      "dbName" -> dbName,
      "tableName" -> tableName),
    cause = e)
}
{code}
[jira] [Created] (SPARK-45492) Replace: _LEGACY_ERROR_TEMP_2152 with a better error class
Serge Rielau created SPARK-45492:
Summary: Replace: _LEGACY_ERROR_TEMP_2152 with a better error class
Key: SPARK-45492
URL: https://issues.apache.org/jira/browse/SPARK-45492
Project: Spark
Issue Type: Improvement
Components: Spark Core
Affects Versions: 4.0.0
Reporter: Serge Rielau

{code:java}
def expressionEncodingError(e: Exception, expressions: Seq[Expression]): SparkRuntimeException = {
  new SparkRuntimeException(
    errorClass = "_LEGACY_ERROR_TEMP_2152",
    messageParameters = Map(
      "e" -> e.toString(),
      "expressions" -> expressions.map(
        _.simpleString(SQLConf.get.maxToStringFields)).mkString("\n")),
    cause = e)
}
{code}
[jira] [Created] (SPARK-45491) Replace: _LEGACY_ERROR_TEMP_2196 with a better error class
Serge Rielau created SPARK-45491:
Summary: Replace: _LEGACY_ERROR_TEMP_2196 with a better error class
Key: SPARK-45491
URL: https://issues.apache.org/jira/browse/SPARK-45491
Project: Spark
Issue Type: Improvement
Components: Spark Core
Affects Versions: 4.0.0
Reporter: Serge Rielau

{code:java}
def cannotFetchTablesOfDatabaseError(dbName: String, e: Exception): Throwable = {
  new SparkException(
    errorClass = "_LEGACY_ERROR_TEMP_2196",
    messageParameters = Map(
      "dbName" -> dbName),
    cause = e)
}
{code}
[jira] [Created] (SPARK-45490) Replace: _LEGACY_ERROR_TEMP_2151 with a proper error class
Serge Rielau created SPARK-45490:
Summary: Replace: _LEGACY_ERROR_TEMP_2151 with a proper error class
Key: SPARK-45490
URL: https://issues.apache.org/jira/browse/SPARK-45490
Project: Spark
Issue Type: Improvement
Components: Spark Core
Affects Versions: 4.0.0
Reporter: Serge Rielau

{code:java}
def expressionDecodingError(e: Exception, expressions: Seq[Expression]): SparkRuntimeException = {
  new SparkRuntimeException(
    errorClass = "_LEGACY_ERROR_TEMP_2151",
    messageParameters = Map(
      "e" -> e.toString(),
      "expressions" -> expressions.map(
        _.simpleString(SQLConf.get.maxToStringFields)).mkString("\n")),
    cause = e)
}
{code}
[jira] [Created] (SPARK-45489) Replace: _LEGACY_ERROR_TEMP_2134 with a regular error class
Serge Rielau created SPARK-45489:
Summary: Replace: _LEGACY_ERROR_TEMP_2134 with a regular error class
Key: SPARK-45489
URL: https://issues.apache.org/jira/browse/SPARK-45489
Project: Spark
Issue Type: Improvement
Components: Spark Core
Affects Versions: 4.0.0
Reporter: Serge Rielau

This is a frequently seen error we should convert:

def cannotParseStringAsDataTypeError(pattern: String, value: String, dataType: DataType): SparkRuntimeException = {
  new SparkRuntimeException(
    errorClass = "_LEGACY_ERROR_TEMP_2134",
    messageParameters = Map(
      "value" -> toSQLValue(value),
      "pattern" -> toSQLValue(pattern),
      "dataType" -> dataType.toString))
}
[jira] [Created] (SPARK-45487) Replace: _LEGACY_ERROR_TEMP_3007
Serge Rielau created SPARK-45487:
Summary: Replace: _LEGACY_ERROR_TEMP_3007
Key: SPARK-45487
URL: https://issues.apache.org/jira/browse/SPARK-45487
Project: Spark
Issue Type: Improvement
Components: Spark Core
Affects Versions: 4.0.0
Reporter: Serge Rielau

def checkpointRDDBlockIdNotFoundError(rddBlockId: RDDBlockId): Throwable = {
  new SparkException(
    errorClass = "_LEGACY_ERROR_TEMP_3007",
    messageParameters = Map("rddBlockId" -> s"$rddBlockId"),
    cause = null)
}

This error condition appears to be quite common, so we should convert it to a proper error class.
[jira] [Created] (SPARK-45367) Add errorclass and sqlstate for: _LEGACY_ERROR_TEMP_1273
Serge Rielau created SPARK-45367:
Summary: Add errorclass and sqlstate for: _LEGACY_ERROR_TEMP_1273
Key: SPARK-45367
URL: https://issues.apache.org/jira/browse/SPARK-45367
Project: Spark
Issue Type: Improvement
Components: Spark Core
Affects Versions: 3.5.0
Reporter: Serge Rielau

This seems to be a very common error.
[jira] [Created] (SPARK-45132) Fix IDENTIFIER clause for functions
Serge Rielau created SPARK-45132:
Summary: Fix IDENTIFIER clause for functions
Key: SPARK-45132
URL: https://issues.apache.org/jira/browse/SPARK-45132
Project: Spark
Issue Type: Bug
Components: Spark Core
Affects Versions: 3.5.0
Reporter: Serge Rielau

Due to a quirk in the grammar, IDENTIFIER('foo')(<arguments>) does not resolve depending on the argument list.
Example:
SELECT IDENTIFIER('abs')(-1) works, but
SELECT IDENTIFIER('abs')(c1) FROM VALUES(-1) AS T(c1) does not.
[jira] [Comment Edited] (SPARK-44840) array_insert() gives wrong results for negative index
[ https://issues.apache.org/jira/browse/SPARK-44840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17756574#comment-17756574 ]

Serge Rielau edited comment on SPARK-44840 at 8/20/23 7:41 PM:

[~srowen] There is no standard as such. However, there are multiple reasons not to be compatible with Snowflake:
1. Precedence: SUBSTR('Hello', 1, 1) => 'H', SUBSTR('Hello', -1, 1) => 'o' (not 'l').
2. Array access has been a mixed bag for us (some 0-based, some 1-based), but we have tried to move towards 1-based as well; e.g., element_at() is 1-based, and we use -1 (!) to get the last element.
3. Snowflake had no choice but to use -1 for the second-to-last element because 1 is their second element. Because they are 0-based, they are unable to use array_insert() to append an element (short of passing (length - 1) as the parameter). So the proposal is objectively more powerful.

was (Author: JIRAUSER288374):
[~srowen] There is no standard as such. However, there are multiple reasons not to be compatible with Snowflake:
1. Precedence: SUBSTR('Hello', 1, 1) => 'H', SUBSTR('Hello', -1, 1) => 'o' (not 'l').
2. Array access has been a mixed bag for us (some 0-based, some 1-based), but we have tried to move towards 1-based as well; e.g., element_at() is 1-based, and we use -1 (!) to get the last element.
3. Snowflake had no choice but to use 1 for the second-to-last element because 1 is their second element. Because they are 0-based, they are unable to use array_insert() to append an element (short of passing (length - 1) as the parameter). So the proposal is objectively more powerful.

> array_insert() gives wrong results for negative index
> Key: SPARK-44840
> URL: https://issues.apache.org/jira/browse/SPARK-44840
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 3.4.0
> Reporter: Serge Rielau
> Assignee: Max Gekk
> Priority: Major
>
> Unlike in Snowflake, we decided that array_insert() is 1-based. This means 1 is the first element in an array and -1 is the last. This matches the behavior of functions such as substr() and element_at().
> {code:java}
> > SELECT array_insert(array('a', 'b', 'c'), 1, 'z');
> ["z","a","b","c"]
> > SELECT array_insert(array('a', 'b', 'c'), 0, 'z');
> Error
> > SELECT array_insert(array('a', 'b', 'c'), -1, 'z');
> ["a","b","c","z"]
> > SELECT array_insert(array('a', 'b', 'c'), 5, 'z');
> ["a","b","c",NULL,"z"]
> > SELECT array_insert(array('a', 'b', 'c'), -5, 'z');
> ["z",NULL,"a","b","c"]
> > SELECT array_insert(array('a', 'b', 'c'), 2, cast(NULL AS STRING));
> ["a",NULL,"b","c"]
> {code}
[jira] [Commented] (SPARK-44840) array_insert() gives wrong results for negative index
[ https://issues.apache.org/jira/browse/SPARK-44840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17756574#comment-17756574 ]

Serge Rielau commented on SPARK-44840:

[~srowen] There is no standard as such. However, there are multiple reasons not to be compatible with Snowflake:
1. Precedence: SUBSTR('Hello', 1, 1) => 'H', SUBSTR('Hello', -1, 1) => 'o' (not 'l').
2. Array access has been a mixed bag for us (some 0-based, some 1-based), but we have tried to move towards 1-based as well; e.g., element_at() is 1-based, and we use -1 (!) to get the last element.
3. Snowflake had no choice but to use 1 for the second-to-last element because 1 is their second element. Because they are 0-based, they are unable to use array_insert() to append an element (short of passing (length - 1) as the parameter). So the proposal is objectively more powerful.

> array_insert() gives wrong results for negative index
> Key: SPARK-44840
> URL: https://issues.apache.org/jira/browse/SPARK-44840
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 3.4.0
> Reporter: Serge Rielau
> Assignee: Max Gekk
> Priority: Major
>
> Unlike in Snowflake, we decided that array_insert() is 1-based. This means 1 is the first element in an array and -1 is the last. This matches the behavior of functions such as substr() and element_at().
> {code:java}
> > SELECT array_insert(array('a', 'b', 'c'), 1, 'z');
> ["z","a","b","c"]
> > SELECT array_insert(array('a', 'b', 'c'), 0, 'z');
> Error
> > SELECT array_insert(array('a', 'b', 'c'), -1, 'z');
> ["a","b","c","z"]
> > SELECT array_insert(array('a', 'b', 'c'), 5, 'z');
> ["a","b","c",NULL,"z"]
> > SELECT array_insert(array('a', 'b', 'c'), -5, 'z');
> ["z",NULL,"a","b","c"]
> > SELECT array_insert(array('a', 'b', 'c'), 2, cast(NULL AS STRING));
> ["a",NULL,"b","c"]
> {code}
[jira] [Created] (SPARK-44840) array_insert() gives wrong results for negative index
Serge Rielau created SPARK-44840: Summary: array_insert() gives wrong results for negative index Key: SPARK-44840 URL: https://issues.apache.org/jira/browse/SPARK-44840 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 3.4.0 Reporter: Serge Rielau Unlike in Snowflake, we decided that array_insert() is 1-based. This means 1 is the first element in an array and -1 is the last. This matches the behavior of functions such as substr() and element_at(). {code:java} > SELECT array_insert(array('a', 'b', 'c'), 1, 'z'); ["z","a","b","c"] > SELECT array_insert(array('a', 'b', 'c'), 0, 'z'); Error > SELECT array_insert(array('a', 'b', 'c'), -1, 'z'); ["a","b","c","z"] > SELECT array_insert(array('a', 'b', 'c'), 5, 'z'); ["a","b","c",NULL,"z"] > SELECT array_insert(array('a', 'b', 'c'), -5, 'z'); ["z",NULL,"a","b","c"] > SELECT array_insert(array('a', 'b', 'c'), 2, cast(NULL AS STRING)); ["a",NULL,"b","c"] {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-44838) Enhance raise_error() to exploit the new error framework
Serge Rielau created SPARK-44838: Summary: Enhance raise_error() to exploit the new error framework Key: SPARK-44838 URL: https://issues.apache.org/jira/browse/SPARK-44838 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 3.5.0 Reporter: Serge Rielau raise_error() and assert_true() do not presently utilize the new error framework. We want to generalize raise_error() to take an error class, SQLSTATE, and message parameters as arguments to compose a well-formed error condition. The existing assert_true() and raise_error() versions should return an error class as well. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
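A minimal sketch of what the generalized call might look like; the error class name, the map-based parameters, and the exact signature are assumptions, not a settled design:
{code:java}
-- Hypothetical invocation of the generalized raise_error() (signature not final):
SELECT raise_error('MY_ERROR_CLASS', map('param1', 'value1'));
-- Existing single-argument form, for comparison:
SELECT raise_error('custom error message');
{code}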
[jira] [Updated] (SPARK-44780) Document SQL Session variables
[ https://issues.apache.org/jira/browse/SPARK-44780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Serge Rielau updated SPARK-44780: - Attachment: Screenshot 2023-08-11 at 10.22.55 PM.png Screenshot 2023-08-11 at 10.24.33 PM.png Screenshot 2023-08-11 at 10.26.54 PM.png > Document SQL Session variables > -- > > Key: SPARK-44780 > URL: https://issues.apache.org/jira/browse/SPARK-44780 > Project: Spark > Issue Type: Task > Components: Spark Core >Affects Versions: 3.4.2 >Reporter: Serge Rielau >Priority: Major > Attachments: Screenshot 2023-08-11 at 10.22.55 PM.png, Screenshot > 2023-08-11 at 10.24.33 PM.png, Screenshot 2023-08-11 at 10.26.54 PM.png > > > SQL Session variables have been added with: SPARK-42849. > Here we add the docs for it. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-44780) Document SQL Session variables
[ https://issues.apache.org/jira/browse/SPARK-44780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Serge Rielau updated SPARK-44780: - Summary: Document SQL Session variables (was: Docuement SQL Session variables) > Document SQL Session variables > -- > > Key: SPARK-44780 > URL: https://issues.apache.org/jira/browse/SPARK-44780 > Project: Spark > Issue Type: Task > Components: Spark Core >Affects Versions: 3.4.2 >Reporter: Serge Rielau >Priority: Major > > SQL Session variables have been added with: SPARK-42849. > Here we add the docs for it. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-44780) Docuement SQL Session variables
Serge Rielau created SPARK-44780: Summary: Docuement SQL Session variables Key: SPARK-44780 URL: https://issues.apache.org/jira/browse/SPARK-44780 Project: Spark Issue Type: Task Components: Spark Core Affects Versions: 3.4.2 Reporter: Serge Rielau SQL Session variables have been added with: SPARK-42849. Here we add the docs for it. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-44680) parameter markers are not blocked from DEFAULT (and other places)
Serge Rielau created SPARK-44680: Summary: parameter markers are not blocked from DEFAULT (and other places) Key: SPARK-44680 URL: https://issues.apache.org/jira/browse/SPARK-44680 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 3.4.0 Reporter: Serge Rielau scala> spark.sql("CREATE TABLE t11(c1 int default :parm)", args = Map("parm" -> 5)).show() -> success scala> spark.sql("describe t11"); [INVALID_DEFAULT_VALUE.UNRESOLVED_EXPRESSION] Failed to execute EXISTS_DEFAULT command because the destination table column `c1` has a DEFAULT value :parm, which fails to resolve as a valid expression. This likely extends to other DDL-like places. I can only find protection against placement in the body of a CREATE VIEW. I see two ways out of this: * Raise an error (as we do for CREATE VIEW v1(c1) AS SELECT ? ) * Improve the way we persist queries/expressions to substitute the at-DDL-time bound parameter value (it's not a bug, it's a feature) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
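To make the second option concrete, a sketch of the intended effect, under the assumption that the parameter is substituted at DDL time (the persisted text shown is illustrative, not current behavior):
{code:java}
// Option 2 illustrated (hypothetical behavior):
spark.sql("CREATE TABLE t11(c1 INT DEFAULT :parm)", args = Map("parm" -> 5))
// The catalog would persist the bound literal instead of the marker:
//   CREATE TABLE t11(c1 INT DEFAULT 5)
// so a later DESCRIBE t11 resolves the default without the parameter.
{code}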
[jira] [Created] (SPARK-44577) INSERT BY NAME returns nonsensical error message
Serge Rielau created SPARK-44577: Summary: INSERT BY NAME returns nonsensical error message Key: SPARK-44577 URL: https://issues.apache.org/jira/browse/SPARK-44577 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 3.4.0 Reporter: Serge Rielau CREATE TABLE bug(c1 INT); INSERT INTO bug BY NAME SELECT 1 AS c2; ==> Multi-part identifier cannot be empty. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-43438) Fix mismatched column list error on INSERT
[ https://issues.apache.org/jira/browse/SPARK-43438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17740789#comment-17740789 ] Serge Rielau edited comment on SPARK-43438 at 7/6/23 8:17 PM: -- spark-sql (default)> INSERT INTO tabtest SELECT 1; This should NOT succeed. was (Author: JIRAUSER288374): spark-sql (default)> INSERT INTO tabtest SELECT 1; This should NOT succeed. > Fix mismatched column list error on INSERT > -- > > Key: SPARK-43438 > URL: https://issues.apache.org/jira/browse/SPARK-43438 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 3.4.0 >Reporter: Serge Rielau >Priority: Major > > This error message is pretty bad, and common > "_LEGACY_ERROR_TEMP_1038" : { > "message" : [ > "Cannot write to table due to mismatched user specified column > size() and data column size()." > ] > }, > It can perhaps be merged with this one - after giving it an ERROR_CLASS > "_LEGACY_ERROR_TEMP_1168" : { > "message" : [ > " requires that the data to be inserted have the same number of > columns as the target table: target table has column(s) but > the inserted data has column(s), including > partition column(s) having constant value(s)." > ] > }, > Repro: > CREATE TABLE tabtest(c1 INT, c2 INT); > INSERT INTO tabtest SELECT 1; > `spark_catalog`.`default`.`tabtest` requires that the data to be inserted > have the same number of columns as the target table: target table has 2 > column(s) but the inserted data has 1 column(s), including 0 partition > column(s) having constant value(s). > INSERT INTO tabtest(c1) SELECT 1, 2, 3; > Cannot write to table due to mismatched user specified column size(1) and > data column size(3).; line 1 pos 24 > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-43438) Fix mismatched column list error on INSERT
[ https://issues.apache.org/jira/browse/SPARK-43438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17740789#comment-17740789 ] Serge Rielau commented on SPARK-43438: -- spark-sql (default)> INSERT INTO tabtest SELECT 1; This should NOT succeed. > Fix mismatched column list error on INSERT > -- > > Key: SPARK-43438 > URL: https://issues.apache.org/jira/browse/SPARK-43438 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 3.4.0 >Reporter: Serge Rielau >Priority: Major > > This error message is pretty bad, and common > "_LEGACY_ERROR_TEMP_1038" : { > "message" : [ > "Cannot write to table due to mismatched user specified column > size() and data column size()." > ] > }, > It can perhaps be merged with this one - after giving it an ERROR_CLASS > "_LEGACY_ERROR_TEMP_1168" : { > "message" : [ > " requires that the data to be inserted have the same number of > columns as the target table: target table has column(s) but > the inserted data has column(s), including > partition column(s) having constant value(s)." > ] > }, > Repro: > CREATE TABLE tabtest(c1 INT, c2 INT); > INSERT INTO tabtest SELECT 1; > `spark_catalog`.`default`.`tabtest` requires that the data to be inserted > have the same number of columns as the target table: target table has 2 > column(s) but the inserted data has 1 column(s), including 0 partition > column(s) having constant value(s). > INSERT INTO tabtest(c1) SELECT 1, 2, 3; > Cannot write to table due to mismatched user specified column size(1) and > data column size(3).; line 1 pos 24 > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-43918) Cannot CREATE VIEW despite columns explicitly aliased
Serge Rielau created SPARK-43918: Summary: Cannot CREATE VIEW despite columns explicitly aliased Key: SPARK-43918 URL: https://issues.apache.org/jira/browse/SPARK-43918 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 3.4.0 Reporter: Serge Rielau spark.sql("CREATE VIEW v(c) AS SELECT b AS c FROM (SELECT (SELECT 1)) AS T(b)").show() org.apache.spark.sql.AnalysisException: Not allowed to create a permanent view `spark_catalog`.`default`.`v` without explicitly assigning an alias for expression c. The problem seems to be the scalar subquery (SELECT 1) not being aliased. But that shouldn't matter: an "AS x" alias should be the backstop. In fact, AS T(b) should have been sufficient. Not to speak of the fact that the column is named c in the view header. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
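Assuming the analyzer only requires the innermost expression to carry an alias, a possible workaround sketch (an untested assumption, not a confirmed fix) is to alias the scalar subquery directly:
{code:java}
// Possible workaround (assumption: aliasing the scalar subquery satisfies the check):
spark.sql("CREATE VIEW v(c) AS SELECT b AS c FROM (SELECT (SELECT 1) AS b) AS T")
{code}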
[jira] [Created] (SPARK-43884) Allow parameter markers in DDL (again)
Serge Rielau created SPARK-43884: Summary: Allow parameter markers in DDL (again) Key: SPARK-43884 URL: https://issues.apache.org/jira/browse/SPARK-43884 Project: Spark Issue Type: New Feature Components: Spark Core Affects Versions: 3.5.0 Reporter: Serge Rielau When we introduced parameter markers initially, we allowed them in any SQL statement. Subsequently, we have limited their use to DML and queries because that aligns better with the industry and we saw no immediate use for broader support. However, we have introduced the IDENTIFIER() clause, which allows templating table- and function-identifiers in DDL statements. To exploit this, we need parameter markers as arguments: spark.sql("CREATE TABLE IDENTIFIER(:tableName) (c1 INT)", args = Map("tableName" -> "mytable")) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-43472) Mexico has changed observation of DST, this breaks from_utc_timestamp()
Serge Rielau created SPARK-43472: Summary: Mexico has changed observation of DST, this breaks from_utc_timestamp() Key: SPARK-43472 URL: https://issues.apache.org/jira/browse/SPARK-43472 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 3.4.0 Reporter: Serge Rielau [https://www.timeanddate.com/time/change/mexico/mexico-city?year=2023#:~:text=Daylight%20Saving%20Time%20(DST)%20Not,was%20on%20October%2030%2C%202022.] Mexico has stopped observing DST. This leads to wrong results for: from_utc_timestamp([timestamp], 'America/Mexico_City') -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
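An illustration of the failure mode, under the assumption that the runtime ships timezone rules predating the October 2022 change (the exact outputs depend on the tzdata version in use):
{code:java}
-- Mexico City abolished DST in late 2022, so summer 2023 should be UTC-6 year-round.
-- With stale timezone rules the engine still applies DST (UTC-5):
SELECT from_utc_timestamp(timestamp'2023-07-01 12:00:00', 'America/Mexico_City');
-- stale rules:   2023-07-01 07:00:00   (wrong, UTC-5)
-- updated rules: 2023-07-01 06:00:00   (correct, UTC-6)
{code}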
[jira] [Created] (SPARK-43438) Fix mismatched column list error on INSERT
Serge Rielau created SPARK-43438: Summary: Fix mismatched column list error on INSERT Key: SPARK-43438 URL: https://issues.apache.org/jira/browse/SPARK-43438 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 3.4.0 Reporter: Serge Rielau This error message is pretty bad, and common "_LEGACY_ERROR_TEMP_1038" : { "message" : [ "Cannot write to table due to mismatched user specified column size() and data column size()." ] }, It can perhaps be merged with this one - after giving it an ERROR_CLASS "_LEGACY_ERROR_TEMP_1168" : { "message" : [ " requires that the data to be inserted have the same number of columns as the target table: target table has column(s) but the inserted data has column(s), including partition column(s) having constant value(s)." ] }, Repro: CREATE TABLE tabtest(c1 INT, c2 INT); INSERT INTO tabtest SELECT 1; `spark_catalog`.`default`.`tabtest` requires that the data to be inserted have the same number of columns as the target table: target table has 2 column(s) but the inserted data has 1 column(s), including 0 partition column(s) having constant value(s). INSERT INTO tabtest(c1) SELECT 1, 2, 3; Cannot write to table due to mismatched user specified column size(1) and data column size(3).; line 1 pos 24 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-43359) DELETE from Hive table results in INTERNAL error
Serge Rielau created SPARK-43359: Summary: DELETE from Hive table results in INTERNAL error Key: SPARK-43359 URL: https://issues.apache.org/jira/browse/SPARK-43359 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 3.4.0 Reporter: Serge Rielau spark-sql (default)> CREATE TABLE T1(c1 INT); spark-sql (default)> DELETE FROM T1 WHERE c1 = 1; [INTERNAL_ERROR] Unexpected table relation: HiveTableRelation [`spark_catalog`.`default`.`t1`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Data Cols: [c1#3], Partition Cols: []] org.apache.spark.SparkException: [INTERNAL_ERROR] Unexpected table relation: HiveTableRelation [`spark_catalog`.`default`.`t1`, org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, Data Cols: [c1#3], Partition Cols: []] at org.apache.spark.SparkException$.internalError(SparkException.scala:77) at org.apache.spark.SparkException$.internalError(SparkException.scala:81) at org.apache.spark.sql.execution.datasources.v2.DataSourceV2Strategy.apply(DataSourceV2Strategy.scala:310) at org.apache.spark.sql.catalyst.planning.QueryPlanner.$anonfun$plan$1(QueryPlanner.scala:63) at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:486) at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:492) at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:491) at org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:93) at org.apache.spark.sql.execution.SparkStrategies.plan(SparkStrategies.scala:70) at org.apache.spark.sql.catalyst.planning.QueryPlanner.$anonfun$plan$3(QueryPlanner.scala:78) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-43205) Add an IDENTIFIER(stringLiteral) clause that maps a string to an identifier
Serge Rielau created SPARK-43205: Summary: Add an IDENTIFIER(stringLiteral) clause that maps a string to an identifier Key: SPARK-43205 URL: https://issues.apache.org/jira/browse/SPARK-43205 Project: Spark Issue Type: New Feature Components: Spark Core Affects Versions: 3.5.0 Reporter: Serge Rielau There is a requirement for SQL templates, where the table and/or column names are provided through substitution. This can be done today using variable substitution: SET hivevar:tabname = mytab; SELECT * FROM ${ hivevar:tabname }; A straight variable substitution is dangerous since it does allow for SQL injection: SET hivevar:tabname = mytab, someothertab; SELECT * FROM ${ hivevar:tabname }; A way to get around this problem is to wrap the variable substitution with a clause that limits the scope to producing an identifier. This approach is taken by Snowflake: [https://docs.snowflake.com/en/sql-reference/session-variables#using-variables-in-sql] SET hivevar:tabname = 'tabname'; SELECT * FROM IDENTIFIER(${ hivevar:tabname }) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
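A sketch of how the clause could pair with parameter markers in Spark (illustrative; argument binding via parameter markers is the subject of SPARK-43884):
{code:java}
// Illustrative use of the proposed clause with a parameter marker:
spark.sql("SELECT * FROM IDENTIFIER(:tabname)", args = Map("tabname" -> "mytab"))
// An injection attempt such as "mytab, someothertab" would fail,
// because the argument must resolve to a single identifier.
{code}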
[jira] [Created] (SPARK-42919) SELECT * LIKE 'pattern' FROM ....
Serge Rielau created SPARK-42919: Summary: SELECT * LIKE 'pattern' FROM Key: SPARK-42919 URL: https://issues.apache.org/jira/browse/SPARK-42919 Project: Spark Issue Type: New Feature Components: Spark Core Affects Versions: 3.5.0 Reporter: Serge Rielau SparkSQL supports *regex_column_names.* [https://spark.apache.org/docs/latest/sql-ref-syntax-qry-select.html] However, support depends on a config: spark.sql.parser.quotedRegexColumnNames The reason is that it overloads proper identifier names. Here we propose a cleaner, compatible API: SELECT * LIKE 'pattern'; The semantics should follow the common patterns used for the LIKE operator, with the caveat that they should obey the identifier case-insensitivity setting. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
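A hypothetical usage sketch of the proposed syntax (the table and column names are illustrative):
{code:java}
-- Hypothetical usage of the proposed syntax:
CREATE TABLE emp(emp_id INT, emp_name STRING, dept_id INT);
SELECT * LIKE 'emp%' FROM emp;   -- would return only emp_id and emp_name
{code}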
[jira] [Updated] (SPARK-42919) SELECT * LIKE 'pattern' FROM ....
[ https://issues.apache.org/jira/browse/SPARK-42919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Serge Rielau updated SPARK-42919: - Description: SparkSQL supports *regex_column_names.* [https://spark.apache.org/docs/latest/sql-ref-syntax-qry-select.html] However, support depends on a config: spark.sql.parser.quotedRegexColumnNames The reason is that it overloads proper identifier names. Here we propose a cleaner, compatible API: SELECT * LIKE 'pattern' ... The semantics should follow the common patterns used for the LIKE operator, with the caveat that they should obey the identifier case-insensitivity setting. was: SparkSQL supports *regex_column_names.* [https://spark.apache.org/docs/latest/sql-ref-syntax-qry-select.html] However, support depends on a config: spark.sql.parser.quotedRegexColumnNames The reason is that it overloads proper identifier names. Here we propose a cleaner, compatible API: SELECT * LIKE 'pattern'; The semantics should follow the common patterns used for the LIKE operator, with the caveat that they should obey the identifier case-insensitivity setting. > SELECT * LIKE 'pattern' FROM > - > > Key: SPARK-42919 > URL: https://issues.apache.org/jira/browse/SPARK-42919 > Project: Spark > Issue Type: New Feature > Components: Spark Core >Affects Versions: 3.5.0 >Reporter: Serge Rielau >Priority: Minor > > SparkSQL supports *regex_column_names.* > [https://spark.apache.org/docs/latest/sql-ref-syntax-qry-select.html] > However, support depends on a config: spark.sql.parser.quotedRegexColumnNames > The reason is that it overloads proper identifier names. > Here we propose a cleaner, compatible API: > SELECT * LIKE 'pattern' ... > The semantics should follow the common patterns used for the > LIKE operator, with the caveat that they should obey the identifier case > insensitivity setting. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42546) SPARK-42045 is incomplete in supporting ANSI_MODE for round() and bround()
[ https://issues.apache.org/jira/browse/SPARK-42546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17702337#comment-17702337 ] Serge Rielau commented on SPARK-42546: -- Can't speak for Wenchen, but +1 [~ddavies1] > SPARK-42045 is incomplete in supporting ANSI_MODE for round() and bround() > -- > > Key: SPARK-42546 > URL: https://issues.apache.org/jira/browse/SPARK-42546 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 3.3.2, 3.4.0 >Reporter: Serge Rielau >Priority: Major > > Under ANSI mode, SPARK-42045 added error conditions instead of silent > overflows for edge cases in round() and bround(). > However it appears this fix works only for the INT data type. Trying it on a > smaller type, e.g. TINYINT, the function still returns wrong results: > {code:java} > spark-sql> select round(2147483647, -1); > [ARITHMETIC_OVERFLOW] Overflow. If necessary set "spark.sql.ansi.enabled" to > "false" to bypass this error.{code} > {code:java} > spark-sql> select round(127y, -1); > -126 {code} > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-42546) SPARK-42045 is incomplete in supporting ANSI_MODE for round() and bround()
[ https://issues.apache.org/jira/browse/SPARK-42546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17702337#comment-17702337 ] Serge Rielau edited comment on SPARK-42546 at 3/19/23 6:30 PM: --- Can't speak for [~cloud_fan], but +1 [~ddavies1] was (Author: JIRAUSER288374): Can't speak for Wenchen, but +1 [~ddavies1] > SPARK-42045 is incomplete in supporting ANSI_MODE for round() and bround() > -- > > Key: SPARK-42546 > URL: https://issues.apache.org/jira/browse/SPARK-42546 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 3.3.2, 3.4.0 >Reporter: Serge Rielau >Priority: Major > > Under ANSI mode, SPARK-42045 added error conditions instead of silent > overflows for edge cases in round() and bround(). > However it appears this fix works only for the INT data type. Trying it on a > smaller type, e.g. TINYINT, the function still returns wrong results: > {code:java} > spark-sql> select round(2147483647, -1); > [ARITHMETIC_OVERFLOW] Overflow. If necessary set "spark.sql.ansi.enabled" to > "false" to bypass this error.{code} > {code:java} > spark-sql> select round(127y, -1); > -126 {code} > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-42849) Session variables
Serge Rielau created SPARK-42849: Summary: Session variables Key: SPARK-42849 URL: https://issues.apache.org/jira/browse/SPARK-42849 Project: Spark Issue Type: New Feature Components: Spark Core Affects Versions: 3.5.0 Reporter: Serge Rielau Provide a type-safe, engine-controlled session variable: CREATE [ OR REPLACE ] TEMPORARY VARIABLE [ IF NOT EXISTS ] var_name [ type ] [ DEFAULT expression ] SET { variable = expression | ( variable [, ...] ) = ( subquery | expression [, ...] ) } DROP VARIABLE [ IF EXISTS ] variable_name -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
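A usage sketch following the grammar proposed above; the shipped syntax may well differ from this proposal:
{code:java}
-- Sketch using the proposed syntax (final syntax may differ):
CREATE TEMPORARY VARIABLE var1 INT DEFAULT 7;
SET var1 = var1 + 1;
SELECT var1;          -- 8
DROP VARIABLE var1;
{code}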
[jira] [Created] (SPARK-42638) current_user() is blocked from VALUES, but current_timestamp() is not
Serge Rielau created SPARK-42638: Summary: current_user() is blocked from VALUES, but current_timestamp() is not Key: SPARK-42638 URL: https://issues.apache.org/jira/browse/SPARK-42638 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 3.3.0 Reporter: Serge Rielau VALUES(current_user()); returns: cannot evaluate expression current_user() in inline table definition.; line 1 pos 8 The same with current_timestamp() works. It appears current_user() is recognized as non-deterministic. But it is constant within the statement, just like current_timestamp(). PS: It's not clear why we block non-deterministic functions to begin with. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
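For contrast, the asymmetry described above side by side (error text as reported in this issue):
{code:java}
-- Works today:
VALUES (current_timestamp());
SELECT current_user();
-- Fails: cannot evaluate expression current_user() in inline table definition
VALUES (current_user());
{code}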
[jira] [Created] (SPARK-42623) parameter markers not blocked in DDL
Serge Rielau created SPARK-42623: Summary: parameter markers not blocked in DDL Key: SPARK-42623 URL: https://issues.apache.org/jira/browse/SPARK-42623 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 3.4.0 Reporter: Serge Rielau Fix For: 3.4.0 The parameterized query code does not block DDL statements from referencing parameter markers. E.g.:
{code:java}
scala> spark.sql(sqlText = "CREATE VIEW v1 AS SELECT current_timestamp() + :later as stamp, :x * :x AS square", args = Map("later" -> "INTERVAL'3' HOUR", "x" -> "15.0")).show()
++
||
++
++
{code}
It appears we have some protection that fails us when the view is invoked:
{code:java}
scala> spark.sql(sqlText = "SELECT * FROM v1", args = Map("later" -> "INTERVAL'3' HOUR", "x" -> "15.0")).show()
org.apache.spark.sql.AnalysisException: [UNBOUND_SQL_PARAMETER] Found the unbound parameter: `later`. Please, fix `args` and provide a mapping of the parameter to a SQL literal.; line 1 pos 29
{code}
Right now I think the affected places are: * DEFAULT definition * VIEW definition but any other future standard expression popping up is at risk, such as SQL Functions, or GENERATED COLUMN. CREATE TABLE AS is debatable, since it executes the query at definition only. For simplicity I propose to block the feature from ANY DDL statement (CREATE, ALTER). -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-42546) SPARK-42045 is incomplete in supporting ANSI_MODE for round() and bround()
Serge Rielau created SPARK-42546: Summary: SPARK-42045 is incomplete in supporting ANSI_MODE for round() and bround() Key: SPARK-42546 URL: https://issues.apache.org/jira/browse/SPARK-42546 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 3.3.2, 3.4.0 Reporter: Serge Rielau Under ANSI mode, SPARK-42045 added error conditions instead of silent overflows for edge cases in round() and bround(). However it appears this fix works only for the INT data type. Trying it on a smaller type, e.g. TINYINT, the function still returns wrong results: {code:java} spark-sql> select round(2147483647, -1); [ARITHMETIC_OVERFLOW] Overflow. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.{code} {code:java} spark-sql> select round(127y, -1); -126 {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-42399) CONV() silently overflows returning wrong results
[ https://issues.apache.org/jira/browse/SPARK-42399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17688755#comment-17688755 ] Serge Rielau commented on SPARK-42399: -- Adding support is of course best, if it can be done quickly. If not, we should stop the wrong results first. > CONV() silently overflows returning wrong results > - > > Key: SPARK-42399 > URL: https://issues.apache.org/jira/browse/SPARK-42399 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.4.0 >Reporter: Serge Rielau >Priority: Critical > > spark-sql> SELECT > CONV(SUBSTRING('0x', > 3), 16, 10); > 18446744073709551615 > Time taken: 2.114 seconds, Fetched 1 row(s) > spark-sql> set spark.sql.ansi.enabled = true; > spark.sql.ansi.enabled true > Time taken: 0.068 seconds, Fetched 1 row(s) > spark-sql> SELECT > CONV(SUBSTRING('0x', > 3), 16, 10); > 18446744073709551615 > Time taken: 0.05 seconds, Fetched 1 row(s) > In ANSI mode we should raise an error for sure. > In non-ANSI mode either an error or a NULL may be acceptable. > Alternatively, of course, we could consider if we can support arbitrary > domains since the result is a STRING again. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-42399) CONV() silently overflows returning wrong results
Serge Rielau created SPARK-42399: Summary: CONV() silently overflows returning wrong results Key: SPARK-42399 URL: https://issues.apache.org/jira/browse/SPARK-42399 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 3.4.0 Reporter: Serge Rielau spark-sql> SELECT CONV(SUBSTRING('0x', 3), 16, 10); 18446744073709551615 Time taken: 2.114 seconds, Fetched 1 row(s) spark-sql> set spark.sql.ansi.enabled = true; spark.sql.ansi.enabled true Time taken: 0.068 seconds, Fetched 1 row(s) spark-sql> SELECT CONV(SUBSTRING('0x', 3), 16, 10); 18446744073709551615 Time taken: 0.05 seconds, Fetched 1 row(s) In ANSI mode we should raise an error for sure. In non-ANSI mode either an error or a NULL may be acceptable. Alternatively, of course, we could consider if we can support arbitrary domains since the result is a STRING again. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-42066) The DATATYPE_MISMATCH error class contains inappropriate and duplicating subclasses
Serge Rielau created SPARK-42066: Summary: The DATATYPE_MISMATCH error class contains inappropriate and duplicating subclasses Key: SPARK-42066 URL: https://issues.apache.org/jira/browse/SPARK-42066 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 3.4.0 Reporter: Serge Rielau The subclass WRONG_NUM_ARGS (with suggestions) semantically does not belong in DATATYPE_MISMATCH, and there is already a top-level error class with that same name. We should review the subclasses of this error class, which seems to have become a bit of a dumping ground... -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-42058) Harden SQLSTATE usage for error classes (2)
Serge Rielau created SPARK-42058: Summary: Harden SQLSTATE usage for error classes (2) Key: SPARK-42058 URL: https://issues.apache.org/jira/browse/SPARK-42058 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 3.4.0 Reporter: Serge Rielau Error classes are great, but for JDBC, ODBC, etc., the standard's SQLSTATEs reign. We have started adding SQLSTATEs but have not really paid attention to their correctness. Follow up to: https://issues.apache.org/jira/browse/SPARK-41994 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-41994) Harden SQLSTATE usage for error classes
Serge Rielau created SPARK-41994: Summary: Harden SQLSTATE usage for error classes Key: SPARK-41994 URL: https://issues.apache.org/jira/browse/SPARK-41994 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 3.4.0 Reporter: Serge Rielau Error classes are great, but for JDBC, ODBC, etc., the standard's SQLSTATEs reign. We have started adding SQLSTATEs but have not really paid attention to their correctness. Here is a unified view of SQLSTATEs used in the [industry|https://docs.google.com/spreadsheets/d/1hrQBSuHooiozUNAQTHiYq3WidS1uliHpl9cYfWpig1c/edit?usp=sharing]. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-41931) Improve UNSUPPORTED_DATA_TYPE message for complex types
Serge Rielau created SPARK-41931: Summary: Improve UNSUPPORTED_DATA_TYPE message for complex types Key: SPARK-41931 URL: https://issues.apache.org/jira/browse/SPARK-41931 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 3.4.0 Reporter: Serge Rielau spark-sql> SELECT CAST(array(1, 2, 3) AS ARRAY); [UNSUPPORTED_DATATYPE] Unsupported data type "ARRAY" (line 1, pos 30) == SQL == SELECT CAST(array(1, 2, 3) AS ARRAY) --^^^ This error message is confusing. We support ARRAY. We just require it to be typed. We should have an error like: [INCOMPLETE_TYPE_DEFINITION.ARRAY] The definition of type `ARRAY` is incomplete. You must provide an element type. For example: `ARRAY<elementType>`. Similarly for STRUCT and MAP. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
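For completeness, supplying the element type makes the cast valid today:
{code:java}
-- Providing the element type resolves the error:
SELECT CAST(array(1, 2, 3) AS ARRAY<BIGINT>);
{code}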
[jira] [Created] (SPARK-41670) Introduce builtin and session namespaces for builtin functions and temp views/functions
Serge Rielau created SPARK-41670: Summary: Introduce builtin and session namespaces for builtin functions and temp views/functions Key: SPARK-41670 URL: https://issues.apache.org/jira/browse/SPARK-41670 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 3.4.0 Reporter: Serge Rielau Spark today allows overloading between persisted relations and functions and temporary relations and functions. It also allows overloading between persisted functions and builtin functions. While Spark allows us to disambiguate a persisted object by qualifying it, there is no qualifier for temp or builtin objects. Here we propose to use `builtin` for builtin objects and `session` for session temporary objects. If there is a conflict with persisted schemas of these names, we can further declare that the catalog for both is `system`. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
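A sketch of the qualification this proposal would enable; the qualifier names are the proposal's, not shipped syntax:
{code:java}
-- Hypothetical qualification under the proposal:
SELECT builtin.abs(-1);         -- force the builtin function, even if abs() is overloaded
SELECT session.my_func(1);      -- force the temporary function
SELECT system.builtin.abs(-1);  -- fully qualified form if `builtin` itself conflicts
{code}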
[jira] [Created] (SPARK-41668) DECODE function returns wrong results when passed NULL
Serge Rielau created SPARK-41668: Summary: DECODE function returns wrong results when passed NULL Key: SPARK-41668 URL: https://issues.apache.org/jira/browse/SPARK-41668 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 3.1.2 Reporter: Serge Rielau The DECODE function was implemented for Oracle compatibility. It works similarly to the CASE expression, but it is supposed to have one major difference: NULL == NULL. [https://docs.oracle.com/database/121/SQLRF/functions057.htm#SQLRF00631] The Spark implementation does not observe this, however: select decode(null, 6, 'Spark', NULL, 'SQL', 4, 'rocks'); returns NULL. The result is supposed to be 'SQL'. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
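The expected semantics can be expressed with Spark's null-safe equality operator `<=>`; a sketch of the intended equivalence:
{code:java}
-- DECODE(expr, s1, r1, s2, r2, ...) is meant to behave like a null-safe CASE:
SELECT CASE WHEN NULL <=> 6    THEN 'Spark'
            WHEN NULL <=> NULL THEN 'SQL'
            WHEN NULL <=> 4    THEN 'rocks'
       END;
-- 'SQL', because <=> treats NULL as equal to NULL
{code}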
[jira] [Created] (SPARK-41353) UNRESOLVED_ROUTINE error class
Serge Rielau created SPARK-41353: Summary: UNRESOLVED_ROUTINE error class Key: SPARK-41353 URL: https://issues.apache.org/jira/browse/SPARK-41353 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 3.4.0 Reporter: Serge Rielau We want to unify and name: "_LEGACY_ERROR_TEMP_1041" : { "message" : [ "Undefined function ." ] }, "_LEGACY_ERROR_TEMP_1242" : { "message" : [ "Undefined function: . This function is neither a built-in/temporary function, nor a persistent function that is qualified as ." ] }, "_LEGACY_ERROR_TEMP_1243" : { "message" : [ "Undefined function: " ] } The proposal is: UNRESOLVED_ROUTINE. routineName => `a`.`b`.`func`, routineSignature => [INT, STRING], searchPath => [`builtin`, `session`, `hiveMetaStore`.`default`] This assumes agreement to introduce `builtin` as an optional qualifier for builtin functions, and `session` as an optional qualifier for temporary functions (separate PR). Q: Why ROUTINE? A: Some day we may want to support PROCEDURES, and they will follow the same name rules and share the same namespace. Q: Why a path? A: We do follow a hard-coded path today, with a fixed precedence rule. Q: Why provide the signature? A: Long-term we may support overloading of functions by arity, type, or even parameter name. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-41323) Support CURRENT_SCHEMA() as alias for CURRENT_DATABASE()
Serge Rielau created SPARK-41323: Summary: Support CURRENT_SCHEMA() as alias for CURRENT_DATABASE() Key: SPARK-41323 URL: https://issues.apache.org/jira/browse/SPARK-41323 Project: Spark Issue Type: New Feature Components: Spark Core Affects Versions: 3.4.0 Reporter: Serge Rielau CURRENT_SCHEMA is the keyword used in the SQL Standard to refer to the current namespace. It is also supported by multiple other vendors: PostgreSQL, Redshift, Snowflake, Db2, and others. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
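The requested alias in action; current_database() exists today, while current_schema() is the proposed addition:
{code:java}
SELECT current_database();  -- works today
SELECT current_schema();    -- proposed alias, same result
{code}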
[jira] [Created] (SPARK-41104) Can insert NULL into Hive table with NOT NULL column
Serge Rielau created SPARK-41104: Summary: Can insert NULL into Hive table with NOT NULL column Key: SPARK-41104 URL: https://issues.apache.org/jira/browse/SPARK-41104 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 3.4.0 Reporter: Serge Rielau spark-sql> CREATE TABLE tttd(c1 int not null); 22/11/10 14:04:28 WARN ResolveSessionCatalog: A Hive serde table will be created as there is no table provider specified. You can set spark.sql.legacy.createHiveTableByDefault to false so that native data source table will be created instead. 22/11/10 14:04:28 WARN HiveMetaStore: Location: file:/Users/serge.rielau/spark/spark-warehouse/tttd specified for non-external table:tttd Time taken: 0.078 seconds spark-sql> INSERT INTO tttd VALUES(null); Time taken: 0.36 seconds spark-sql> SELECT * FROM tttd; NULL Time taken: 0.074 seconds, Fetched 1 row(s) spark-sql> Does Hive not support NOT NULL? That's fine, but then we should fail at CREATE TABLE. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-40822) Use stable derived-column-alias algorithm, suitable for CREATE VIEW
Serge Rielau created SPARK-40822: Summary: Use stable derived-column-alias algorithm, suitable for CREATE VIEW Key: SPARK-40822 URL: https://issues.apache.org/jira/browse/SPARK-40822 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 3.4.0 Reporter: Serge Rielau Spark has the ability to derive column aliases for expressions if no alias was provided by the user. E.g. CREATE TABLE T(c1 INT, c2 INT); SELECT c1, `(c1 + 1)`, c3 FROM (SELECT c1, c1 + 1, c1 * c2 AS c3 FROM T); This is a valuable feature. However, the current implementation works by pretty printing the expression from the logical plan. This has multiple downsides: * The derived names can be unintuitive. For example the brackets in `(c1 + 1)`, or outright ugly names, such as: SELECT `substr(hello, 1, 2147483647)` FROM (SELECT substr('hello', 1)) AS T; * We cannot guarantee stability across versions, since the logical plan of an expression may change. The latter is a major reason why we cannot allow CREATE VIEW without a column list except in "trivial" cases. CREATE VIEW v AS SELECT c1, c1 + 1, c1 * c2 AS c3 FROM T; Not allowed to create a permanent view `spark_catalog`.`default`.`v` without explicitly assigning an alias for expression (c1 + 1). There are two ways we can go about fixing this: # Stop deriving column aliases from the expression. Instead generate unique names such as `_col_1` based on their position in the select list. This is ugly and takes away the "nice" headers on result sets. # Move the derivation of the name upstream. That is, instead of pretty printing the logical plan, we pretty print the lexer output, or a sanitized version of the expression as typed. The statement as typed is stable by definition. The lexer is stable because it has no reason to change, and if it ever did, we have a better chance to manage the change. In this feature we propose the following semantics: # If the column alias can be trivially derived (some of these can stack), do so: ** a (qualified) column reference => the unqualified column identifier cat.sch.tab.col => col ** A field reference => the field name struct.field1.field2 => field2 ** A cast(column AS type) => column cast(col1 AS INT) => col1 ** A map lookup with literal key => the key name map.key => key map['key'] => key ** A parameterless function => the unqualified function name current_schema() => current_schema # Otherwise, take the lexer tokens of the expression, eliminate comments, and append them. foo(tab1.c1 + /* this is a plus*/ 1) => `foo(tab1.c1+1)` Of course we want this change under a config. If the config is set we can allow CREATE VIEW to exploit this and use the derived names. PS: The exact mechanics of formatting the name are very much debatable. E.g. spaces between tokens, squeezing out comments, upper casing, preserving quotes or double quotes... -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
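A sketch of the intended end state, under the assumption that the gating config exists; the config name used here is hypothetical, since the proposal only says "under a config":
{code:java}
-- Hypothetical config name:
SET spark.sql.stableDerivedColumnAlias.enabled = true;
-- With lexer-based derivation, this view would no longer need an explicit column list:
CREATE VIEW v AS SELECT c1, c1 + 1, c1 * c2 AS c3 FROM T;
-- derived column names: c1, `c1+1`, c3
{code}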
[jira] [Created] (SPARK-40585) Support double-quoted identifiers
Serge Rielau created SPARK-40585: Summary: Support double-quoted identifiers Key: SPARK-40585 URL: https://issues.apache.org/jira/browse/SPARK-40585 Project: Spark Issue Type: New Feature Components: Spark Core Affects Versions: 3.4.0 Reporter: Serge Rielau In many SQL dialects, identifiers can be unquoted or quoted with double quotes. In Spark, double-quoted literals are strings. In this proposal we allow for a config, double_quoted_identifiers, which, when set, switches the interpretation from string literal to identifier. Note that backticks are still allowed. Also, the treatment of escapes is not changed as part of this work. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
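Illustrative behavior with the proposed config enabled; the config spelling follows the proposal above, and the shipped name may differ:
{code:java}
-- Hypothetical: config name as proposed above
SET double_quoted_identifiers = true;
SELECT "c1" FROM T;   -- "c1" now resolves to the column c1, not the string 'c1'
SELECT 'c1' FROM T;   -- single quotes remain string literals
SELECT `c1` FROM T;   -- backticks keep working
{code}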
[jira] [Commented] (SPARK-40521) PartitionsAlreadyExistException in Hive V1 Command V1 reports all partitions instead of the conflicting partition
[ https://issues.apache.org/jira/browse/SPARK-40521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17607897#comment-17607897 ] Serge Rielau commented on SPARK-40521: -- Hive does return the offending partition. We just need to dig it out: !Screen Shot 2022-09-21 at 10.08.44 AM.png!!Screen Shot 2022-09-21 at 10.08.52 AM.png! > PartitionsAlreadyExistException in Hive V1 Command V1 reports all partitions > instead of the conflicting partition > - > > Key: SPARK-40521 > URL: https://issues.apache.org/jira/browse/SPARK-40521 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 3.4.0 >Reporter: Serge Rielau >Priority: Minor > Attachments: Screen Shot 2022-09-21 at 10.08.44 AM.png, Screen Shot > 2022-09-21 at 10.08.52 AM.png > > > PartitionsAlreadyExistException in Hive V1 Command V1 reports all partitions > instead of the conflicting partition > When I run: > AlterTableAddPartitionSuiteBase for Hive > The test: partition already exists > Fails in my local build ONLY in that mode because it reports two > partitions as conflicting where there should be only one. In all other modes > the test succeeds. > The test is passing on master because the test does not check the partitions > themselves. > Repro on master: Note that c1 = 1 does not already exist. It should NOT be > listed > create table t(c1 int, c2 int) partitioned by (c1); > alter table t add partition (c1 = 2); > alter table t add partition (c1 = 1) partition (c1 = 2); > 22/09/21 09:30:09 ERROR Hive: AlreadyExistsException(message:Partition > already exists: Partition(values:[2], dbName:default, tableName:t, > createTime:0, lastAccessTime:0, > sd:StorageDescriptor(cols:[FieldSchema(name:c2, type:int, comment:null)], > location:file:/Users/serge.rielau/spark/spark-warehouse/t/c1=2, > inputFormat:org.apache.hadoop.mapred.TextInputFormat, > outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, > compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, > serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, > parameters:\{serialization.format=1}), bucketCols:[], sortCols:[], > parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], > skewedColValueLocationMaps:{}), storedAsSubDirectories:false), > parameters:null)) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.startAddPartition(HiveMetaStore.java:2744) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.add_partitions_core(HiveMetaStore.java:2442) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.add_partitions_req(HiveMetaStore.java:2560) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107) > at com.sun.proxy.$Proxy31.add_partitions_req(Unknown Source) > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.add_partitions(HiveMetaStoreClient.java:625) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) > at > 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) > at > org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:173) > at com.sun.proxy.$Proxy32.add_partitions(Unknown Source) > at org.apache.hadoop.hive.ql.metadata.Hive.createPartitions(Hive.java:2103) > at > org.apache.spark.sql.hive.client.Shim_v0_13.createPartitions(HiveShim.scala:763) > at > org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$createPartitions$1(HiveClientImpl.scala:631) > at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) > at > org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$withHiveState$1(HiveClientImpl.scala:296) > at > org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:227) > at > org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:226) > at > org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:276) > at > org.apac
[jira] [Updated] (SPARK-40521) PartitionsAlreadyExistException in Hive V1 Command V1 reports all partitions instead of the conflicting partition
[ https://issues.apache.org/jira/browse/SPARK-40521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Serge Rielau updated SPARK-40521: - Attachment: Screen Shot 2022-09-21 at 10.08.52 AM.png Screen Shot 2022-09-21 at 10.08.44 AM.png > PartitionsAlreadyExistException in Hive V1 Command V1 reports all partitions > instead of the conflicting partition > - > > Key: SPARK-40521 > URL: https://issues.apache.org/jira/browse/SPARK-40521 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 3.4.0 >Reporter: Serge Rielau >Priority: Minor > Attachments: Screen Shot 2022-09-21 at 10.08.44 AM.png, Screen Shot > 2022-09-21 at 10.08.52 AM.png > > > PartitionsAlreadyExistException in Hive V1 Command V1 reports all partitions > instead of the conflicting partition > When I run: > AlterTableAddPartitionSuiteBase for Hive > The test: partition already exists > Fails in my my local build ONLY in that mode because it reports two > partitions as conflicting where there should be only one. In all other modes > the test succeeds. > The test is passing on master because the test does not check the partitions > themselves. > Repro on master: Note that c1 = 1 does not already exist. It should NOT be > listed > create table t(c1 int, c2 int) partitioned by (c1); > alter table t add partition (c1 = 2); > alter table t add partition (c1 = 1) partition (c1 = 2); > 22/09/21 09:30:09 ERROR Hive: AlreadyExistsException(message:Partition > already exists: Partition(values:[2], dbName:default, tableName:t, > createTime:0, lastAccessTime:0, > sd:StorageDescriptor(cols:[FieldSchema(name:c2, type:int, comment:null)], > location:file:/Users/serge.rielau/spark/spark-warehouse/t/c1=2, > inputFormat:org.apache.hadoop.mapred.TextInputFormat, > outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, > compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, > serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, > parameters:\{serialization.format=1}), bucketCols:[], sortCols:[], > parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], > skewedColValueLocationMaps:{}), storedAsSubDirectories:false), > parameters:null)) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.startAddPartition(HiveMetaStore.java:2744) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.add_partitions_core(HiveMetaStore.java:2442) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.add_partitions_req(HiveMetaStore.java:2560) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107) > at com.sun.proxy.$Proxy31.add_partitions_req(Unknown Source) > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.add_partitions(HiveMetaStoreClient.java:625) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) > at > org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:173) > at com.sun.proxy.$Proxy32.add_partitions(Unknown Source) > at org.apache.hadoop.hive.ql.metadata.Hive.createPartitions(Hive.java:2103) > at > org.apache.spark.sql.hive.client.Shim_v0_13.createPartitions(HiveShim.scala:763) > at > org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$createPartitions$1(HiveClientImpl.scala:631) > at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) > at > org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$withHiveState$1(HiveClientImpl.scala:296) > at > org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:227) > at > org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:226) > at > org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:276) > at > org.apache.spark.sql.hive.client.HiveClientImpl.createPartitions(HiveClientImpl.scala:624) > at > org.apac
[jira] [Updated] (SPARK-40521) PartitionsAlreadyExistException in Hive V1 Command V1 reports all partitions instead of the conflicting partition
[ https://issues.apache.org/jira/browse/SPARK-40521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Serge Rielau updated SPARK-40521: - Description:
PartitionsAlreadyExistException in Hive V1 Command V1 reports all partitions instead of the conflicting partition.
When I run AlterTableAddPartitionSuiteBase for Hive, the test "partition already exists" fails in my local build ONLY in that mode, because it reports two partitions as conflicting where there should be only one. In all other modes the test succeeds. The test is passing on master because the test does not check the partitions themselves.
Repro on master (note that c1 = 1 does not already exist; it should NOT be listed):
create table t(c1 int, c2 int) partitioned by (c1);
alter table t add partition (c1 = 2);
alter table t add partition (c1 = 1) partition (c1 = 2);
22/09/21 09:30:09 ERROR Hive: AlreadyExistsException(message:Partition already exists: Partition(values:[2], dbName:default, tableName:t, createTime:0, lastAccessTime:0, sd:StorageDescriptor(cols:[FieldSchema(name:c2, type:int, comment:null)], location:file:/Users/serge.rielau/spark/spark-warehouse/t/c1=2, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), parameters:null))
 at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.startAddPartition(HiveMetaStore.java:2744)
 at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.add_partitions_core(HiveMetaStore.java:2442)
 at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.add_partitions_req(HiveMetaStore.java:2560)
 at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.base/java.lang.reflect.Method.invoke(Method.java:566)
 at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
 at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
 at com.sun.proxy.$Proxy31.add_partitions_req(Unknown Source)
 at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.add_partitions(HiveMetaStoreClient.java:625)
 at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.base/java.lang.reflect.Method.invoke(Method.java:566)
 at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:173)
 at com.sun.proxy.$Proxy32.add_partitions(Unknown Source)
 at org.apache.hadoop.hive.ql.metadata.Hive.createPartitions(Hive.java:2103)
 at org.apache.spark.sql.hive.client.Shim_v0_13.createPartitions(HiveShim.scala:763)
 at org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$createPartitions$1(HiveClientImpl.scala:631)
 at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
 at org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$withHiveState$1(HiveClientImpl.scala:296)
 at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:227)
 at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:226)
 at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:276)
 at org.apache.spark.sql.hive.client.HiveClientImpl.createPartitions(HiveClientImpl.scala:624)
 at org.apache.spark.sql.hive.HiveExternalCatalog.$anonfun$createPartitions$1(HiveExternalCatalog.scala:1039)
 at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
 at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:102)
 at org.apache.spark.sql.hive.HiveExternalCatalog.createPartitions(HiveExternalCatalog.scala:1021)
 at org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.createPartitions(ExternalCatalogWithListener.scala:201)
 at org.apache.spark.sql.catalyst.catalog.SessionCatalog.createPartitions(SessionCatalog.scala:1169)
 at org.apache.spark.sql.execution.command.AlterTableAddPartitionCommand.$anonfun$run$17(ddl.scala:514)
 at org.apache.spark.sql.execution.command.AlterTableAddPartitionCommand.$anonfun$run$17$adapted(ddl.scala:513)
 at scala.collection.Iterator.foreach(Itera
[jira] [Created] (SPARK-40521) PartitionsAlreadyExistException in Hive V1 Command V1 reports all partitions instead of the conflicting partition
Serge Rielau created SPARK-40521: Summary: PartitionsAlreadyExistException in Hive V1 Command V1 reports all partitions instead of the conflicting partition Key: SPARK-40521 URL: https://issues.apache.org/jira/browse/SPARK-40521 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 3.4.0 Reporter: Serge Rielau
PartitionsAlreadyExistException in Hive V1 Command V1 reports all partitions instead of the conflicting partition.
When I run AlterTableAddPartitionSuiteBase for Hive, the test "partition already exists" fails in my local build ONLY in that mode, because it reports two partitions as conflicting where there should be only one. In all other modes the test succeeds. The test is passing on master because the test does not check the partitions themselves.
-- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
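[Editor's note] A plausible shape for the fix, sketched under assumptions: the helper name raiseOnlyConflicting is hypothetical, and ExternalCatalog.listPartitions serves here only as a stand-in existence probe. The idea is to report just the partition specs that actually conflict, not every spec named in the ALTER TABLE statement.
{code:java}
import org.apache.spark.sql.catalyst.analysis.PartitionsAlreadyExistException
import org.apache.spark.sql.catalyst.catalog.ExternalCatalog
import org.apache.spark.sql.catalyst.catalog.CatalogTypes.TablePartitionSpec

// Hypothetical helper: raise only for the partition specs that truly conflict.
def raiseOnlyConflicting(
    catalog: ExternalCatalog,
    db: String,
    table: String,
    requested: Seq[TablePartitionSpec]): Unit = {
  // Keep a spec only if the catalog already has a matching partition.
  val conflicting = requested.filter { spec =>
    catalog.listPartitions(db, table, Some(spec)).nonEmpty
  }
  if (conflicting.nonEmpty) {
    // Report just the offenders (e.g. only c1 = 2 in the repro above).
    throw new PartitionsAlreadyExistException(db, table, conflicting)
  }
}
{code}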
[jira] [Created] (SPARK-40360) Convert some DDL exceptions to the new error framework
Serge Rielau created SPARK-40360: Summary: Convert some DDL exceptions to the new error framework Key: SPARK-40360 URL: https://issues.apache.org/jira/browse/SPARK-40360 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 3.4.0 Reporter: Serge Rielau
Tackling the following files:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/AlreadyExistException.scala
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/NoSuchItemException.scala
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CannotReplaceMissingTableException.scala
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/NonEmptyException.scala
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala
Here is the doc with the proposed text: https://docs.google.com/document/d/1TpFx3AwcJZd3l7zB1ZDchvZ8j2dY6_uf5LHfW2gjE4A/edit?usp=sharing
-- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
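[Editor's note] A hedged sketch of the kind of conversion these files would get: the exception carries an error class plus named parameters, and the English text lives only in error-classes.json. The error-class name TABLE_OR_VIEW_NOT_FOUND and the relationName parameter key are placeholders for illustration, not the wording from the linked doc.
{code:java}
import org.apache.spark.sql.AnalysisException

// Sketch: derive the message from an error class and named parameters
// instead of assembling a free-form string at the throw site.
class NoSuchTableException(db: String, table: String)
  extends AnalysisException(
    errorClass = "TABLE_OR_VIEW_NOT_FOUND",
    messageParameters = Map("relationName" -> s"`$db`.`$table`"))
{code}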
[jira] [Commented] (SPARK-40208) New OFFSET clause does not use new error framework
[ https://issues.apache.org/jira/browse/SPARK-40208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584351#comment-17584351 ] Serge Rielau commented on SPARK-40208:
--
[~maxgekk] (FYI) Also: I'm sure LIMIT is the same; maybe fix both in one fell swoop?
spark-sql> SELECT name, age FROM person ORDER BY name OFFSET -1;
Error in query: The offset expression must be equal to or greater than 0, but got -1;
Offset -1
+- Sort [name#185 ASC NULLS FIRST], true
   +- Project [name#185, age#186]
      +- SubqueryAlias person
         +- View (`person`, [name#185,age#186])
            +- Project [cast(col1#187 as string) AS name#185, cast(col2#188 as int) AS age#186]
               +- LocalRelation [col1#187, col2#188]

> New OFFSET clause does not use new error framework
> --------------------------------------------------
>
> Key: SPARK-40208
> URL: https://issues.apache.org/jira/browse/SPARK-40208
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 3.4.0
> Reporter: Serge Rielau
> Priority: Minor
>
> CREATE TEMP VIEW person (name, age)
> AS VALUES ('Zen Hui', 25),
>           ('Anil B' , 18),
>           ('Shone S', 16),
>           ('Mike A' , 25),
>           ('John A' , 18),
>           ('Jack N' , 16);
> SELECT name, age FROM person ORDER BY name OFFSET length(name);
> Error in query: The offset expression must evaluate to a constant value, but got length(person.name);
> Offset length(name#181)
> +- Sort [name#181 ASC NULLS FIRST], true
>    +- Project [name#181, age#182]
>       +- SubqueryAlias person
>          +- View (`person`, [name#181,age#182])
>             +- Project [cast(col1#183 as string) AS name#181, cast(col2#184 as int) AS age#182]
>                +- LocalRelation [col1#183, col2#184]
>
> Returning the plan here is quite pointless as well. The context would be more interesting.

-- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-40208) New OFFSET clause does not use new error framework
Serge Rielau created SPARK-40208: Summary: New OFFSET clause does not use new error framework Key: SPARK-40208 URL: https://issues.apache.org/jira/browse/SPARK-40208 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 3.4.0 Reporter: Serge Rielau
CREATE TEMP VIEW person (name, age)
AS VALUES ('Zen Hui', 25),
          ('Anil B' , 18),
          ('Shone S', 16),
          ('Mike A' , 25),
          ('John A' , 18),
          ('Jack N' , 16);
SELECT name, age FROM person ORDER BY name OFFSET length(name);
Error in query: The offset expression must evaluate to a constant value, but got length(person.name);
Offset length(name#181)
+- Sort [name#181 ASC NULLS FIRST], true
   +- Project [name#181, age#182]
      +- SubqueryAlias person
         +- View (`person`, [name#181,age#182])
            +- Project [cast(col1#183 as string) AS name#181, cast(col2#184 as int) AS age#182]
               +- LocalRelation [col1#183, col2#184]

Returning the plan here is quite pointless as well. The context would be more interesting.
-- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
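[Editor's note] For illustration, a sketch of how the OFFSET check could raise through the error framework instead of free-form text. The error-class name INVALID_OFFSET_EXPRESSION, its JSON entry, and the parameter key are assumptions, not the shipped fix.
{code:java}
import org.apache.spark.SparkException
import org.apache.spark.sql.catalyst.expressions.Expression

// Assumed error-classes.json entry:
//   "INVALID_OFFSET_EXPRESSION" : {
//     "message" : [ "The offset expression must evaluate to a constant non-negative value, but got <expr>." ]
//   }
def invalidOffsetError(expr: Expression): Throwable =
  new SparkException(
    errorClass = "INVALID_OFFSET_EXPRESSION",
    messageParameters = Map("expr" -> expr.sql),
    cause = null)
{code}
The same helper could serve LIMIT by parameterizing the clause name, which is the "one fell swoop" suggested in the comment above.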
[jira] [Commented] (SPARK-40156) url_decode() exposes a Java error
[ https://issues.apache.org/jira/browse/SPARK-40156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17582643#comment-17582643 ] Serge Rielau commented on SPARK-40156:
--
+ [~maxgekk] For new functions we should be using the new error framework: https://github.com/apache/spark/blob/master/core/src/main/resources/error/error-classes.json

> url_decode() exposes a Java error
> ---------------------------------
>
> Key: SPARK-40156
> URL: https://issues.apache.org/jira/browse/SPARK-40156
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 3.4.0
> Reporter: Serge Rielau
> Priority: Major
>
> Given a badly encoded string, Spark returns a Java error.
> It should return an ERROR_CLASS instead.
> spark-sql> SELECT url_decode('http%3A%2F%2spark.apache.org');
> 22/08/20 17:17:20 ERROR SparkSQLDriver: Failed in [SELECT url_decode('http%3A%2F%2spark.apache.org')]
> java.lang.IllegalArgumentException: URLDecoder: Illegal hex characters in escape (%) pattern - Error at index 1 in: "2s"
> at java.base/java.net.URLDecoder.decode(URLDecoder.java:232)
> at java.base/java.net.URLDecoder.decode(URLDecoder.java:142)
> at org.apache.spark.sql.catalyst.expressions.UrlCodec$.decode(urlExpressions.scala:113)
> at org.apache.spark.sql.catalyst.expressions.UrlCodec.decode(urlExpressions.scala)

-- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-40156) url_decode() exposes a Java error
[ https://issues.apache.org/jira/browse/SPARK-40156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17582452#comment-17582452 ] Serge Rielau commented on SPARK-40156:
--
It's new, [~Zing]: https://github.com/apache/spark/commit/e5c1b822016600e77fabcdf145ecb3ba93c692b3

> url_decode() exposes a Java error
> ---------------------------------
>
> Key: SPARK-40156
> URL: https://issues.apache.org/jira/browse/SPARK-40156
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 3.4.0
> Reporter: Serge Rielau
> Priority: Major
>
> Given a badly encoded string, Spark returns a Java error.
> It should return an ERROR_CLASS instead.
> spark-sql> SELECT url_decode('http%3A%2F%2spark.apache.org');
> 22/08/20 17:17:20 ERROR SparkSQLDriver: Failed in [SELECT url_decode('http%3A%2F%2spark.apache.org')]
> java.lang.IllegalArgumentException: URLDecoder: Illegal hex characters in escape (%) pattern - Error at index 1 in: "2s"
> at java.base/java.net.URLDecoder.decode(URLDecoder.java:232)
> at java.base/java.net.URLDecoder.decode(URLDecoder.java:142)
> at org.apache.spark.sql.catalyst.expressions.UrlCodec$.decode(urlExpressions.scala:113)
> at org.apache.spark.sql.catalyst.expressions.UrlCodec.decode(urlExpressions.scala)

-- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-40156) url_decode() exposes a Java error
Serge Rielau created SPARK-40156: Summary: url_decode() exposes a Java error Key: SPARK-40156 URL: https://issues.apache.org/jira/browse/SPARK-40156 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 3.4.0 Reporter: Serge Rielau
Given a badly encoded string, Spark returns a Java error. It should return an ERROR_CLASS instead.
spark-sql> SELECT url_decode('http%3A%2F%2spark.apache.org');
22/08/20 17:17:20 ERROR SparkSQLDriver: Failed in [SELECT url_decode('http%3A%2F%2spark.apache.org')]
java.lang.IllegalArgumentException: URLDecoder: Illegal hex characters in escape (%) pattern - Error at index 1 in: "2s"
at java.base/java.net.URLDecoder.decode(URLDecoder.java:232)
at java.base/java.net.URLDecoder.decode(URLDecoder.java:142)
at org.apache.spark.sql.catalyst.expressions.UrlCodec$.decode(urlExpressions.scala:113)
at org.apache.spark.sql.catalyst.expressions.UrlCodec.decode(urlExpressions.scala)
-- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
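[Editor's note] A minimal sketch of the wrapping this asks for, assuming the Map-based SparkIllegalArgumentException constructor from the new error framework; the error-class name CANNOT_DECODE_URL and the parameter key are assumptions.
{code:java}
import java.net.URLDecoder
import java.nio.charset.StandardCharsets
import org.apache.spark.SparkIllegalArgumentException

// Sketch: catch the Java-level failure and re-raise with an error class so the
// user sees an ERROR_CLASS payload instead of a bare IllegalArgumentException.
def decodeUrl(url: String): String =
  try {
    URLDecoder.decode(url, StandardCharsets.UTF_8.name())
  } catch {
    case _: IllegalArgumentException =>
      throw new SparkIllegalArgumentException(
        errorClass = "CANNOT_DECODE_URL",
        messageParameters = Map("url" -> url))
  }
{code}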
[jira] [Created] (SPARK-40151) Fix return type for new median(interval) function
Serge Rielau created SPARK-40151: Summary: Fix return type for new median(interval) function Key: SPARK-40151 URL: https://issues.apache.org/jira/browse/SPARK-40151 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 3.4.0 Reporter: Serge Rielau
median() right now returns an interval of the same type as the input. We should instead match mean() and avg(). The result type is computed from the argument type:
- year-month interval: The result is an INTERVAL YEAR TO MONTH.
- day-time interval: The result is an INTERVAL DAY TO SECOND.
- In all other cases the result is a DOUBLE.
-- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
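[Editor's note] The proposed typing rule as a small sketch over Catalyst's DataType; this mirrors the avg() behavior described above and is not the actual patch.
{code:java}
import org.apache.spark.sql.types._

// Result typing for median(), mirroring avg():
//   year-month interval -> INTERVAL YEAR TO MONTH
//   day-time interval   -> INTERVAL DAY TO SECOND
//   anything else       -> DOUBLE
def medianResultType(input: DataType): DataType = input match {
  case _: YearMonthIntervalType => YearMonthIntervalType()
  case _: DayTimeIntervalType   => DayTimeIntervalType()
  case _                        => DoubleType
}
{code}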
[jira] [Created] (SPARK-39558) Store error message parameters as Map instead of Array
Serge Rielau created SPARK-39558: Summary: Store error message parameters as Map instead of Array Key: SPARK-39558 URL: https://issues.apache.org/jira/browse/SPARK-39558 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 3.3.1 Reporter: Serge Rielau
Right now when we raise a SparkException we pass an array of arguments which are assigned to the message parameters by position. This has several downsides:
1. It makes it hard to later localize (or rework) the messages, since that may shuffle positions.
2. There could be an accidental mismatch when writing code, which is not detected in QA.
3. Sometimes we want to use the same parameter multiple times in a message. Repeating it as an argument seems silly.
All of these problems go away when we use a map aligning parameters and arguments. We already do this for CheckError.
-- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
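[Editor's note] To illustrate the point about repeated parameters, a minimal, self-contained sketch of name-based substitution; the regex and the formatMessage helper are illustrative, not Spark's actual implementation.
{code:java}
import scala.util.matching.Regex

// Substitute named parameters of the form <name> into a message template.
// Unknown names are left untouched.
def formatMessage(template: String, parameters: Map[String, String]): String = {
  val placeholder: Regex = "<([a-zA-Z0-9_]+)>".r
  placeholder.replaceAllIn(template, m =>
    Regex.quoteReplacement(parameters.getOrElse(m.group(1), m.matched)))
}

// The same parameter may appear twice, and template order no longer matters:
formatMessage(
  "Cannot cast <sourceType> to <targetType>: <sourceType> is not numeric.",
  Map("sourceType" -> "STRING", "targetType" -> "INT"))
// => "Cannot cast STRING to INT: STRING is not numeric."
{code}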
[jira] [Created] (SPARK-39492) Rework MISSING_COLUMN error class
Serge Rielau created SPARK-39492: Summary: Rework MISSING_COLUMN error class Key: SPARK-39492 URL: https://issues.apache.org/jira/browse/SPARK-39492 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 3.3.1 Reporter: Serge Rielau
"MISSING_COLUMN" : {
  "message" : [ "Column '<columnName>' does not exist. Did you mean one of the following? [<proposal>]" ],
  "sqlState" : "42000"
}
is unfortunately named. It is more accurate to talk about an UNRESOLVED_COLUMN or an UNRESOLVED_COLUMN_IDENTIFIER, since we could refer to an alias, a SQL UDF parameter, a field, or, in the future, a variable.
-- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
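[Editor's note] For comparison, one possible shape for a renamed entry, following the error-classes.json style shown elsewhere in these issues. The name, message text, placeholders, and SQLSTATE are assumptions sketched for illustration, not the final rework.
{code:java}
"UNRESOLVED_COLUMN" : {
  "message" : [
    "A column, alias, parameter, field, or variable with name <objectName> cannot be resolved. Did you mean one of the following? [<proposal>]"
  ],
  "sqlState" : "42703"
},{code}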
[jira] [Created] (SPARK-39432) element_at(*, 0) does not return INVALID_ARRAY_INDEX_IN_ELEMENT_AT
Serge Rielau created SPARK-39432: Summary: element_at(*, 0) does not return INVALID_ARRAY_INDEX_IN_ELEMENT_AT Key: SPARK-39432 URL: https://issues.apache.org/jira/browse/SPARK-39432 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 3.3.0 Reporter: Serge Rielau
spark-sql> SELECT element_at(array('a', 'b', 'c'), index) FROM VALUES(0), (2) AS T(index);
22/06/09 16:23:07 ERROR SparkSQLDriver: Failed in [SELECT element_at(array('a', 'b', 'c'), index) FROM VALUES(0), (2) AS T(index)]
java.lang.ArrayIndexOutOfBoundsException: SQL array indices start at 1
at org.apache.spark.sql.errors.QueryExecutionErrors$.sqlArrayIndexNotStartAtOneError(QueryExecutionErrors.scala:1206)
This should roll into INVALID_ARRAY_INDEX_IN_ELEMENT_AT. It makes no sense to create a new error class.
-- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
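[Editor's note] A sketch of folding the index-0 case into the existing class; the helper name, the use of SparkRuntimeException, and the parameter keys are assumptions for illustration.
{code:java}
import org.apache.spark.SparkRuntimeException

// Sketch: index 0 raises the same error class as an out-of-range index,
// instead of the one-off "SQL array indices start at 1" Java exception.
def invalidElementAtIndexError(index: Int, numElements: Int): Throwable =
  new SparkRuntimeException(
    errorClass = "INVALID_ARRAY_INDEX_IN_ELEMENT_AT",
    messageParameters = Map(
      "indexValue" -> index.toString,
      "arraySize" -> numElements.toString))
{code}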
[jira] [Created] (SPARK-39418) DECODE docs refer to Oracle instead of Spark
Serge Rielau created SPARK-39418: Summary: DECODE docs refer to Oracle instead of Spark Key: SPARK-39418 URL: https://issues.apache.org/jira/browse/SPARK-39418 Project: Spark Issue Type: Bug Components: Documentation Affects Versions: 3.2.0 Reporter: Serge Rielau
https://spark.apache.org/docs/latest/api/sql/index.html#decode
The page says: "If no match is found, then Oracle returns default. If default is omitted, returns null."
-- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-39416) When raising an exception, pass parameters as a map instead of an array
[ https://issues.apache.org/jira/browse/SPARK-39416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Serge Rielau updated SPARK-39416: - Description:
We have moved away from C-style parameters in error message texts towards symbolic parameters. E.g.:
{code:java}
"CANNOT_CAST_DATATYPE" : {
  "message" : [ "Cannot cast <sourceType> to <targetType>." ],
  "sqlState" : "22005"
},{code}
However, when we raise an exception we merely pass a simple array and assume positional assignment:
{code:java}
def cannotCastFromNullTypeError(to: DataType): Throwable = {
  new SparkException(errorClass = "CANNOT_CAST_DATATYPE",
    messageParameters = Array(NullType.typeName, to.typeName), null)
}{code}
This has multiple downsides:
# It's not possible to mention the same parameter twice in an error message.
# When reworking an error message we cannot shuffle parameters without changing the code.
# There is a risk that the error message and the exception go out of sync unnoticed, given we do not want to check for the message text in the code.
So in this PR we propose the following new usage:
{code:java}
def cannotCastFromNullTypeError(to: DataType): Throwable = {
  new SparkException(errorClass = "CANNOT_CAST_DATATYPE",
    messageParameters = Map("sourceType" -> NullType.typeName, "targetType" -> to.typeName),
    context = null)
}{code}
getMessage will then substitute the parameters in the message appropriately. Moving forward this should be the preferred way to raise exceptions.

> When raising an exception, pass parameters as a map instead of an array
> ------------------------------------------------------------------------
>
> Key: SPARK-39416
> URL: https://issues.apache.org/jira/browse/SPARK-39416
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 3.3.1
> Reporter: Serge Rielau
> Priority: Major
>
> We have moved away from C-style parameters in error message texts towards symbolic parameters. E.g.:
> {code:java}
> "CANNOT_CAST_DATATYPE" : {
>   "message" : [ "Cannot cast <sourceType> to <targetType>." ],
>   "sqlState" : "22005"
> },{code}
> However, when we raise an exception we merely pass a simple array and assume positional assignment:
> {code:java}
> def cannotCastFromNullTypeError(to: DataType): Throwable = {
>   new SparkException(errorClass = "CANNOT_CAST_DATATYPE",
>     messageParameters = Array(NullType.typeName, to.typeName), null)
> }{code}
> This has multiple downsides:
> # It's not possible to mention the same parameter twice in an error message.
> # When reworking an error message we cannot shuffle parameters without changing the code.
> # There is a risk that the error message and the exception go out of sync unnoticed, given we do not want to check for the message text in the code.
> So in this PR we propose the following new usage:
> {code:java}
> def cannotCastFromNullTypeError(to: DataType): Throwable = {
>   new SparkException(errorClass = "CANNOT_CAST_DATATYPE",
>     messageParameters = Map("sourceType" -> NullType.typeName, "targetType" -> to.typeName),
>     context = null)
> }{code}
> getMessage will then substitute the parameters in the message appropriately.
> Moving forward this should be the preferred way to raise exceptions.

-- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h.
[jira] [Created] (SPARK-39416) When raising an exception, pass parameters as a map instead of an array
Serge Rielau created SPARK-39416: Summary: When raising an exception, pass parameters as a map instead of an array Key: SPARK-39416 URL: https://issues.apache.org/jira/browse/SPARK-39416 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 3.3.1 Reporter: Serge Rielau
We have moved away from C-style parameters in error message texts towards symbolic parameters. E.g.:
{code:java}
"CANNOT_CAST_DATATYPE" : {
  "message" : [ "Cannot cast <sourceType> to <targetType>." ],
  "sqlState" : "22005"
},{code}
However, when we raise an exception we merely pass a simple array and assume positional assignment:
{code:java}
def cannotCastFromNullTypeError(to: DataType): Throwable = {
  new SparkException(errorClass = "CANNOT_CAST_DATATYPE",
    messageParameters = Array(NullType.typeName, to.typeName), null)
}{code}
This has multiple downsides:
# It's not possible to mention the same parameter twice in an error message.
# When reworking an error message we cannot shuffle parameters without changing the code.
# There is a risk that the error message and the exception go out of sync unnoticed, given we do not want to check for the message text in the code.
So in this PR we propose the following new usage:
{code:java}
def cannotCastFromNullTypeError(to: DataType): Throwable = {
  new SparkException(errorClass = "CANNOT_CAST_DATATYPE",
    messageParameters = Map("sourceType" -> NullType.typeName, "targetType" -> to.typeName),
    context = null)
}{code}
getMessage will then substitute the parameters in the message appropriately. Moving forward this should be the preferred way to raise exceptions.
-- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-39349) Add a CheckError() method to SparkFunSuite
Serge Rielau created SPARK-39349: Summary: Add a CheckError() method to SparkFunSuite Key: SPARK-39349 URL: https://issues.apache.org/jira/browse/SPARK-39349 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 3.3.1 Reporter: Serge Rielau
We want to standardize on a generic way to QA error messages without impeding the ability to enhance/rework error messages. CheckError() allows for efficient asserting on the "payload":
* Error class, subclass
* SQLSTATE
* Parameters (both names and values)
It does not test the actual English text, which is the feature.
-- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
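[Editor's note] A sketch of how a test written against such a helper could read, assuming a lower-camel checkError(exception, errorClass, sqlState, parameters) signature on the suite and a SQL test harness providing sql() and intercept; the error class and parameter keys are illustrative.
{code:java}
// Inside a suite mixing in the helper (e.g. a QueryTest/SparkFunSuite):
test("element_at with index 0 reports the error-class payload") {
  checkError(
    exception = intercept[SparkRuntimeException] {
      sql("SELECT element_at(array(1, 2), 0)").collect()
    },
    errorClass = "INVALID_ARRAY_INDEX_IN_ELEMENT_AT",
    sqlState = None,
    parameters = Map("indexValue" -> "0", "arraySize" -> "2"))
}
{code}
Because the assertion targets only the error class, SQLSTATE, and parameter map, the English message can be reworded freely without breaking the test.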
[jira] [Updated] (SPARK-39185) Convert *AlreadyExistsException to use error classes
[ https://issues.apache.org/jira/browse/SPARK-39185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Serge Rielau updated SPARK-39185: - Summary: Convert *AlreadyExistsException to use error classes (was: t *AlreadyExistsException to use error classes)

> Convert *AlreadyExistsException to use error classes
> -----------------------------------------------------
>
> Key: SPARK-39185
> URL: https://issues.apache.org/jira/browse/SPARK-39185
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 3.3.0
> Reporter: Serge Rielau
> Priority: Major
>
> XXX already exists is a pretty common error condition.
> We want to handle it as an error class

-- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-39185) t *AlreadyExistsException to use error classes
Serge Rielau created SPARK-39185: Summary: t *AlreadyExistsException to use error classes Key: SPARK-39185 URL: https://issues.apache.org/jira/browse/SPARK-39185 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 3.3.0 Reporter: Serge Rielau XXX already exists is a pretty common error condition. We want to handle it as an error class -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org