Joe McDonnell created IMPALA-13720:
--------------------------------------
Summary: Include information about the file size for parquet version number error
Key: IMPALA-13720
URL: https://issues.apache.org/jira/browse/IMPALA-13720
Project: IMPALA
Issue Type: Task
Components: Backend
Affects Versions: Impala 4.5.0
Reporter: Joe McDonnell
When reading a Parquet file, we validate the Parquet version number and throw
an error if it has an unexpected value. For example:
{noformat}
'hdfs://path/to/table/xxxx.parquet' has an invalid Parquet version number: 0 0 65 3 .
Please check that it is a valid Parquet file. This error can also occur due to
stale metadata. If you believe this is a valid Parquet file, try running
"refresh db_name.table_name".
{noformat}
One way this can happen is if a file is overwritten or changes size, which is
why the message recommends doing a refresh. It would be useful to add an extra
check that compares the actual file size with the expected size. If the two
differ, the error message can include both the actual and the expected size.
The same check can also provide a more detailed error message for Iceberg when
the actual size does not match the expected size from the Iceberg manifest
file.
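
A minimal sketch of the proposed check, in Python for illustration only (the real change would be in the C++ backend). The function name, its parameters, and the wording of the extra sentence are all hypothetical. The sketch assumes the reader looks for the trailing "PAR1" magic at the offset implied by the expected size, so a stale size makes it read the wrong four bytes:

```python
import os

# Trailing magic bytes of an unencrypted Parquet file.
PARQUET_MAGIC = b"PAR1"


def describe_version_error(path, expected_size):
    """Return an error string if the trailing magic is wrong, else None.

    Hypothetical helper, not Impala code. expected_size stands in for the
    size recorded in cached metadata or in an Iceberg manifest.
    """
    actual_size = os.path.getsize(path)
    with open(path, "rb") as f:
        # Read where a reader trusting the expected size would look.
        # Seeking past EOF is allowed; read() then returns fewer bytes.
        f.seek(max(expected_size - 4, 0))
        tail = f.read(4)
    if tail == PARQUET_MAGIC:
        return None

    found = " ".join(str(b) for b in tail)
    msg = f"'{path}' has an invalid Parquet version number: {found} ."
    if actual_size != expected_size:
        # The extra detail proposed here: report both sizes.
        msg += (
            f" The file size is {actual_size} bytes, but the metadata"
            f" expected {expected_size} bytes."
        )
    return msg
```

With a size mismatch reported in the message, it is easier to tell stale metadata apart from a file that is genuinely corrupt.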