AngersZhuuuu edited a comment on pull request #34308:
URL: https://github.com/apache/spark/pull/34308#issuecomment-965262139


   After digging into this case, the exception is caused by Parquet files with
different schemas: if we don't set mergeSchema, Spark directly uses the first
file's schema to read all files. So when the first file stores a column as long
and the next file stores it as int, reading the second file requests long data
while that file's column descriptor is int, which throws this unsupported
decoding error. A minimal repro sketch follows.
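   
   A minimal repro sketch of the situation (the local session setup and paths
are my own assumptions, not from the report):
   
   ```
   // Minimal sketch, assuming a local SparkSession; the paths are hypothetical.
   import org.apache.spark.sql.SparkSession
   
   val spark = SparkSession.builder().master("local[1]").getOrCreate()
   import spark.implicits._
   
   val dir = "/tmp/parquet-mixed-types" // hypothetical output directory
   // First file writes column `a` as bigint (Parquet INT64) ...
   Seq(1L, 2L).toDF("a").write.parquet(s"$dir/f1")
   // ... second file writes the same column as int (Parquet INT32).
   Seq(3, 4).toDF("a").write.parquet(s"$dir/f2")
   
   // mergeSchema is off by default, so one file's footer decides the schema.
   // If the bigint schema wins, decoding the INT32 file fails with
   // "Parquet column cannot be converted ... Expected: bigint, Found: INT32".
   spark.read.parquet(s"$dir/f1", s"$dir/f2").collect()
   ```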
   
   But such a case should already be rejected by @sunchao's PR 
https://github.com/apache/spark/pull/32777: the mismatch is detected when 
`ParquetVectorUpdaterFactory.getUpdater()` is called, and an exception is 
thrown with the file path.
   
   ```
   [info]   Cause: org.apache.spark.sql.execution.QueryExecutionException: Parquet column cannot be converted in file file:///Users/yi.zhu/Documents/project/Angerszhuuuu/spark/target/tmp/spark-3eccc50d-9d9c-4970-9674-87de46ea1192/test-002.parquet/part-00000-4332031b-e514-4b95-b52a-e8d798c999e6-c000.parquet. Column: [a], Expected: bigint, Found: INT32
   [info]   at org.apache.spark.sql.errors.QueryExecutionErrors$.unsupportedSchemaColumnConvertError(QueryExecutionErrors.scala:586)
   [info]   at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:172)
   [info]   at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:93)
   ```
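   
   For intuition, here is a toy sketch of the kind of up-front check described
above (illustrative only, with assumed names; not Spark's actual
implementation):
   
   ```
   import org.apache.parquet.schema.PrimitiveType.PrimitiveTypeName
   import org.apache.spark.sql.types.{DataType, IntegerType, LongType}
   
   // Toy version of an updater-factory check: accept matching physical/requested
   // type pairs and reject everything else before any data is decoded.
   def checkSupported(physical: PrimitiveTypeName, requested: DataType): Unit =
     (physical, requested) match {
       case (PrimitiveTypeName.INT64, LongType)    => // bigint backed by INT64: ok
       case (PrimitiveTypeName.INT32, IntegerType) => // int backed by INT32: ok
       case _ => throw new UnsupportedOperationException(
         s"Unsupported conversion: Expected $requested, Found $physical")
     }
   ```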
   
   
   Thanks all for your help, @sunchao @cloud-fan @sadikovi.
   Hoping for your confirmation; then I will close this one.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


