MaxGekk commented on code in PR #37887:
URL: https://github.com/apache/spark/pull/37887#discussion_r991513440


##########
sql/core/src/test/scala/org/apache/spark/sql/connector/AlterTableTests.scala:
##########
@@ -45,14 +47,15 @@ trait AlterTableTests extends SharedSparkSession {
 
   test("AlterTable: table does not exist") {
     val t2 = s"${catalogAndNamespace}fake_table"
+    val quoted = 
UnresolvedAttribute.parseAttributeName(s"${catalogAndNamespace}table_name")
+      .map(part => quoteIdentifier(part)).mkString(".")

Review Comment:
   Doesn't `toSQLId` do the same thing?



##########
core/src/main/resources/error/error-classes.json:
##########
@@ -536,12 +563,71 @@
       "Failed to set original permission <permission> back to the created 
path: <path>. Exception: <message>"
     ]
   },
+  "ROUTINE_ALREADY_EXISTS" : {
+    "message" : [
+      "Cannot create the function <routineName> because it already exists.",
+      "Choose a different name, drop or replace the existing function, or add 
the IF NOT EXISTS clause to tolerate a pre-existing function."
+    ],
+    "sqlState" : "42000"
+  },
+  "ROUTINE_NOT_FOUND" : {
+    "message" : [
+      "The function <routineName> cannot be found. Verify the spelling and 
correctness of the schema and catalog.",
+      "If you did not qualify the name with a schema and catalog, verify the 
current_schema() output, or qualify the name with the correct schema and 
catalog.",
+      "To tolerate the error on drop use DROP FUNCTION IF EXISTS."
+    ],
+    "sqlState" : "42000"
+  },
+  "SCHEMA_ALREADY_EXISTS" : {
+    "message" : [
+      "Cannot create schema <schemaName> because it already exists.",
+      "Choose a different name, drop the existing schema, or add the IF NOT 
EXISTS clause to tolerate pre-existing schema."
+    ],
+    "sqlState" : "42000"
+  },
+  "SCHEMA_NOT_EMPTY" : {
+    "message" : [
+      "Cannot drop a schema <schemaName> because it contains objects.",
+      "Use DROP SCHEMA ... CASCADE to drop the schema and all its objects."
+    ],
+    "sqlState" : "42000"
+  },
+  "SCHEMA_NOT_FOUND" : {
+    "message" : [
+      "The schema <schemaName> cannot be found. Verify the spelling and 
correctness of the schema and catalog.",
+      "If you did not qualify the name with a catalog, verify the 
current_schema() output, or qualify the name with the correct catalog.",
+      "To tolerate the error on drop use DROP SCHEMA IF EXISTS."
+    ],
+    "sqlState" : "42000"
+  },
   "SECOND_FUNCTION_ARGUMENT_NOT_INTEGER" : {
     "message" : [
       "The second argument of <functionName> function needs to be an integer."
     ],
     "sqlState" : "22023"
   },
+  "TABLE_OR_VIEW_ALREADY_EXISTS" : {
+    "message" : [
+      "Cannot create table or view <relationName> because it already exists.",
+      "Choose a different name, drop or replace the existing object, or add 
the IF NOT EXISTS clause to tolerate pre-existing objects."
+    ],
+    "sqlState" : "42000"
+  },
+  "TABLE_OR_VIEW_NOT_FOUND" : {
+    "message" : [
+      "The table or view <relationName> cannot be found. Verify the spelling 
and correctness of the schema and catalog.",
+      "If you did not qualify the name with a schema, verify the 
current_schema() output, or qualify the name with the correct schema and 
catalog.",
+      "To tolerate the error on drop use DROP VIEW IF EXISTS or DROP TABLE IF 
EXISTS."
+    ],
+    "sqlState" : "42000"
+  },
+  "TEMP_TABLE_OR_VIEW_ALREADY_EXISTS" : {
+    "message" : [
+      "Cannot create the temporary view <relationName> because it already 
exists.",

Review Comment:
   Why not create separate error classes? Or a main error class like 
`ALREADY_EXISTS` with sub-classes for every relation type.



##########
core/src/main/resources/error/error-classes.json:
##########
@@ -536,12 +563,71 @@
       "Failed to set original permission <permission> back to the created 
path: <path>. Exception: <message>"
     ]
   },
+  "ROUTINE_ALREADY_EXISTS" : {
+    "message" : [
+      "Cannot create the function <routineName> because it already exists.",
+      "Choose a different name, drop or replace the existing function, or add 
the IF NOT EXISTS clause to tolerate a pre-existing function."
+    ],
+    "sqlState" : "42000"
+  },
+  "ROUTINE_NOT_FOUND" : {
+    "message" : [
+      "The function <routineName> cannot be found. Verify the spelling and 
correctness of the schema and catalog.",
+      "If you did not qualify the name with a schema and catalog, verify the 
current_schema() output, or qualify the name with the correct schema and 
catalog.",
+      "To tolerate the error on drop use DROP FUNCTION IF EXISTS."
+    ],
+    "sqlState" : "42000"
+  },
+  "SCHEMA_ALREADY_EXISTS" : {
+    "message" : [
+      "Cannot create schema <schemaName> because it already exists.",
+      "Choose a different name, drop the existing schema, or add the IF NOT 
EXISTS clause to tolerate pre-existing schema."
+    ],
+    "sqlState" : "42000"
+  },
+  "SCHEMA_NOT_EMPTY" : {
+    "message" : [
+      "Cannot drop a schema <schemaName> because it contains objects.",
+      "Use DROP SCHEMA ... CASCADE to drop the schema and all its objects."
+    ],
+    "sqlState" : "42000"
+  },
+  "SCHEMA_NOT_FOUND" : {
+    "message" : [
+      "The schema <schemaName> cannot be found. Verify the spelling and 
correctness of the schema and catalog.",
+      "If you did not qualify the name with a catalog, verify the 
current_schema() output, or qualify the name with the correct catalog.",
+      "To tolerate the error on drop use DROP SCHEMA IF EXISTS."
+    ],
+    "sqlState" : "42000"
+  },
   "SECOND_FUNCTION_ARGUMENT_NOT_INTEGER" : {
     "message" : [
       "The second argument of <functionName> function needs to be an integer."
     ],
     "sqlState" : "22023"
   },
+  "TABLE_OR_VIEW_ALREADY_EXISTS" : {
+    "message" : [
+      "Cannot create table or view <relationName> because it already exists.",

Review Comment:
   The error class comes from
   
https://github.com/apache/spark/blob/29e45528315c43336ba60ee1717ef5ec0c001c00/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala#L809-L813
 where `oldName` is not used anymore. So, you are going to give users less 
information for troubleshooting the error.



##########
core/src/main/resources/error/error-classes.json:
##########
@@ -346,6 +346,18 @@
       }
     }
   },
+  "INDEX_ALREADY_EXISTS" : {

Review Comment:
   It should appear somewhere in the source code:
   ```
   $ find . -type f -print0|xargs -0 grep 'INDEX_ALREADY_EXISTS'
   ./core/target/scala-2.12/classes/error/error-classes.json:  
"INDEX_ALREADY_EXISTS" : {
   ./core/src/main/resources/error/error-classes.json:  "INDEX_ALREADY_EXISTS" 
: {
   ./sql/core/src/main/scala/org/apache/spark/sql/jdbc/H2Dialect.scala:         
 // INDEX_ALREADY_EXISTS_1
   ```
   but in fact it does not.



##########
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalogSuite.scala:
##########
@@ -1467,9 +1467,11 @@ abstract class SessionCatalogSuite extends AnalysisTest 
with Eventually {
       val e = intercept[AnalysisException] {
         catalog.registerFunction(
           newFunc("temp1", None), overrideIfExists = false, functionBuilder = 
Some(tempFunc3))
-      }.getMessage
-      assert(e.contains("Function temp1 already exists"))
-      // Temporary function is overridden
+      }
+      checkError(e,
+        errorClass = "ROUTINE_ALREADY_EXISTS",
+        parameters = Map("routineName" -> "`temp1`"))
+            // Temporary function is overridden

Review Comment:
   Please, fix indentation.



##########
core/src/main/resources/error/error-classes.json:
##########
@@ -346,6 +346,18 @@
       }
     }
   },
+  "INDEX_ALREADY_EXISTS" : {
+    "message" : [
+      "Cannot create the index because it already exists. <message>."
+    ],
+    "sqlState" : "42000"
+  },
+  "INDEX_NOT_FOUND" : {
+    "message" : [
+      "Cannot find the index. <message>."

Review Comment:
   We convert `SQLException` in such way as I can see at:
   
https://github.com/apache/spark/blob/34d5272663ce4852ca5b2daa665983a321b42060/sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala#L223-L232
   
   Can't we just pass the sqlState and `sqlException.getErrorCode` + vendor 
name? That information should be enough to identify the error, no?



##########
core/src/test/scala/org/apache/spark/SparkFunSuite.scala:
##########
@@ -374,6 +382,19 @@ abstract class SparkFunSuite
     checkError(exception, errorClass, sqlState, parameters,
       matchPVals = true, Array(context))
 
+  protected def checkErrorTableNotFound(

Review Comment:
   How about putting it in `QueryTest`, and passing a SQL statement or a plan 
into `checkErrorTableNotFound()`?



##########
core/src/main/resources/error/error-classes.json:
##########
@@ -346,6 +346,18 @@
       }
     }
   },
+  "INDEX_ALREADY_EXISTS" : {

Review Comment:
   Maybe pass the error class here?
   
https://github.com/apache/spark/blob/288bdd28635d71623dfc6fe82dc1e3aab5fa6c63/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/AlreadyExistException.scala#L84-L85



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to