beliefer commented on a change in pull request #32958:
URL: https://github.com/apache/spark/pull/32958#discussion_r656730245



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
##########
@@ -1647,4 +1643,300 @@ private[spark] object QueryCompilationErrors {
   def invalidYearMonthIntervalType(startFieldName: String, endFieldName: String): Throwable = {
     new AnalysisException(s"'interval $startFieldName to $endFieldName' is invalid.")
   }
+
+  def queryFromRawFilesIncludeCorruptRecordColumnError(): Throwable = {
+    new AnalysisException(
+      """
+        |Since Spark 2.3, the queries from raw JSON/CSV files are disallowed when the
+        |referenced columns only include the internal corrupt record column
+        |(named _corrupt_record by default). For example:
+        |spark.read.schema(schema).csv(file).filter($"_corrupt_record".isNotNull).count()
+        |and spark.read.schema(schema).csv(file).select("_corrupt_record").show().
+        |Instead, you can cache or save the parsed results and then send the same query.
+        |For example, val df = spark.read.schema(schema).csv(file).cache() and then
+        |df.filter($"_corrupt_record".isNotNull).count().
+      """.stripMargin)
+  }
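
For reference, a minimal sketch of the pattern this message forbids and the cached
workaround it suggests. The file path and schema below are illustrative, not part of
the PR:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.types._

    val spark = SparkSession.builder().master("local[*]").getOrCreate()
    import spark.implicits._

    // Assumed inputs: a CSV path and a user schema that includes _corrupt_record.
    val file = "/tmp/input.csv"
    val schema = new StructType()
      .add("id", IntegerType)
      .add("_corrupt_record", StringType)

    // Disallowed since Spark 2.3: the query references only the corrupt record column.
    // spark.read.schema(schema).csv(file).filter($"_corrupt_record".isNotNull).count()

    // Workaround from the message: cache the parsed result, then run the same query.
    val df = spark.read.schema(schema).csv(file).cache()
    df.filter($"_corrupt_record".isNotNull).count()
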
+
+  def userDefinedPartitionNotFoundInJDBCRelationError(
+      columnName: String, schema: String): Throwable = {
+    new AnalysisException(s"User-defined partition column $columnName not " +
+      s"found in the JDBC relation: $schema")
+  }
+
+  def invalidPartitionColumnTypeError(column: StructField): Throwable = {
+    new AnalysisException(
+      s"""
+         |Partition column type should be ${NumericType.simpleString},
+         |${DateType.catalogString}, or ${TimestampType.catalogString}, but
+         |${column.dataType.catalogString} found.
+       """.stripMargin.replaceAll("\n", " "))
+  }
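
For context, a hedged sketch of the partitioned JDBC read that these two checks
guard. The URL, table, and column names are placeholders, and it assumes the
SparkSession `spark` from the sketch above:

    // partitionColumn must exist in the relation and be numeric, date, or timestamp,
    // otherwise the two errors above are raised during analysis.
    val jdbcDf = spark.read
      .format("jdbc")
      .option("url", "jdbc:postgresql://localhost:5432/testdb")
      .option("dbtable", "events")
      .option("partitionColumn", "id")
      .option("lowerBound", "1")
      .option("upperBound", "100000")
      .option("numPartitions", "4")
      .load()
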
+
+  def tableOrViewAlreadyExistsError(name: String): Throwable = {
+    new AnalysisException(
+      s"Table or view '$name' already exists. SaveMode: ErrorIfExists.")
+  }
+
+  def columnNameContainsInvalidCharactersError(name: String): Throwable = {
+    new AnalysisException(
+      s"""
+         |Column name "$name" contains invalid character(s).
+         |Please use alias to rename it.
+       """.stripMargin.replaceAll("\n", " "))
+  }
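
A one-line illustration of the fix this message suggests, assuming an existing
DataFrame `df` with a made-up offending column name:

    // As the message suggests, alias the offending column to a valid name.
    val fixed = df.select(df("my col,1").alias("my_col_1"))
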
+
+  def textDataSourceWithMultiColumnsError(schema: StructType): Throwable = {
+    new AnalysisException(
+      s"Text data source supports only a single column, and you have 
${schema.size} columns.")
+  }
+
+  def cannotResolveFieldReferenceError(ref: FieldReference, query: LogicalPlan): Throwable = {
+    new AnalysisException(s"Cannot resolve '$ref' using ${query.output}")
+  }
+
+  def v2ExpressionUnsupportedError(expr: V2Expression): Throwable = {

Review comment:
       I just followed
       https://github.com/apache/spark/blob/f87e24d944332844d935ad6b3152ef628a10bdc8/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DistributionAndOrderingUtils.scala#L24
       I don't know the reason.
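
   A likely reason, for context: the linked line renames the DSv2 expression
   interface on import so it cannot be confused with Catalyst's own Expression
   when a file needs both. A minimal sketch of the pattern:

       // Two unrelated types share the simple name "Expression"; the import
       // alias keeps them distinct within a single file.
       import org.apache.spark.sql.catalyst.expressions.Expression
       import org.apache.spark.sql.connector.expressions.{Expression => V2Expression}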



