Joe McDonnell created IMPALA-13720:
--------------------------------------

             Summary: Include information about the file size for parquet 
version number error
                 Key: IMPALA-13720
                 URL: https://issues.apache.org/jira/browse/IMPALA-13720
             Project: IMPALA
          Issue Type: Task
          Components: Backend
    Affects Versions: Impala 4.5.0
            Reporter: Joe McDonnell


When reading a Parquet file, we validate the Parquet version number and return 
an error if it has an unexpected value. For example:
{noformat}
'hdfs://path/to/table/xxxx.parquet' has an invalid Parquet version number: 0 0 
65 3 .
Please check that it is a valid Parquet file. This error can also occur due to 
stale metadata. If you believe this is a valid Parquet file, try running 
"refresh db_name.table_name".{noformat}
One way this can happen is if a file is overwritten or changes size after its 
metadata was loaded (which is why the message recommends doing a refresh). It 
would be useful to add an extra check that compares the file's actual size 
with the expected size. If the two differ, the error message can report both 
the actual and the expected size. The same check can also provide a more 
detailed error message for Iceberg when the actual size does not match the 
expected size recorded in the Iceberg manifest file.
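
A minimal sketch of the proposed check, written in Python for brevity (the 
Impala backend is C++). The function name, its parameters, the wording of the 
extra message, and the example sizes are all hypothetical. They illustrate the 
idea and are not taken from the Impala code base.

```python
from typing import Optional


def describe_size_mismatch(path: str, expected_size: int, actual_size: int,
                           is_iceberg: bool = False) -> Optional[str]:
    """Return extra error text if the sizes differ, otherwise None.

    Hypothetical helper: expected_size stands for the size recorded in the
    cached table metadata (or in the Iceberg manifest), and actual_size for
    the size the filesystem reports when the file is read.
    """
    if actual_size == expected_size:
        return None
    # Name the source of the expected size so the user knows what is stale.
    source = "Iceberg manifest" if is_iceberg else "cached table metadata"
    return (f"File '{path}' has size {actual_size} bytes, but the {source} "
            f"expected {expected_size} bytes.")


# Usage: append the detail to the existing error only when the sizes differ.
path = "hdfs://path/to/table/xxxx.parquet"
base_error = f"'{path}' has an invalid Parquet version number: 0 0 65 3 ."
# The sizes below are made-up values for illustration.
detail = describe_size_mismatch(path, expected_size=1048576, actual_size=524288)
message = base_error if detail is None else f"{base_error} {detail}"
print(message)
```

Returning None when the sizes match keeps the existing error message unchanged 
in that case, so the extra text appears only when it points at a likely cause.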



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
