Github user yuchenhuo commented on a diff in the pull request:
https://github.com/apache/spark/pull/20953#discussion_r181241901
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileScanRDD.scala ---
@@ -179,7 +182,23 @@ class FileScanRDD(
currentIterator = readCurrentFile()
}
- hasNext
+ try {
+ hasNext
+ } catch {
+ case e: SchemaColumnConvertNotSupportedException =>
+ val message = "Parquet column cannot be converted in " +
+ s"file ${currentFile.filePath}. Column: ${e.getColumn}, " +
+ s"Expected: ${e.getLogicalType}, Found:
${e.getPhysicalType}"
+ throw new QueryExecutionException(message, e)
--- End diff ---
According to my discussion with @gatorsmile, we should throw an exception
that can be captured and displayed in a more readable form. Since we already
wrap exceptions at
https://github.com/apache/spark/blob/master/python/pyspark/sql/utils.py#L60, it
seems appropriate to use QueryExecutionException here instead of the original
SparkException.
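
For illustration only, here is a minimal sketch (not part of the patch) of how a
caller could observe the wrapped exception. The helper name drainPartition, the
Iterator[Any] element type, and the println call are placeholders, not names
from FileScanRDD:

    import org.apache.spark.sql.execution.QueryExecutionException

    // Hypothetical helper that drains a partition iterator and reports a
    // Parquet column-conversion failure surfaced as QueryExecutionException.
    def drainPartition(rows: Iterator[Any]): Unit = {
      try {
        while (rows.hasNext) {
          rows.next() // placeholder for real per-row processing
        }
      } catch {
        case e: QueryExecutionException =>
          // The message already names the offending file, column, and the
          // expected vs. found Parquet types, so it can be shown to the user
          // as-is (PySpark's utils.py re-raises it on the Python side).
          println(e.getMessage)
          throw e
      }
    }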
---