wuwenchi commented on issue #4074: URL: https://github.com/apache/iceberg/issues/4074#issuecomment-1034541847
@RussellSpitzer Thank you for your answer. I applied the PR you mentioned, but the problem persists. I traced through the process, and the cause seems to be the **isPartialFileScan** function. It works correctly for data files in Avro format. For data files in Parquet format, however, the file itself starts with a 4-byte initial offset, so the check is wrong: even for a complete Parquet file, it returns true. Should the check take the file format into account here? For Parquet, we would need to add the initial 4-byte offset to **fileScanTask.length()**.
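To make the suggestion concrete, here is a minimal, self-contained sketch of the comparison described above. It does not use Iceberg's actual classes: `Task`, `isPartialNaive`, and `isPartialFormatAware` are hypothetical names, and the naive check is only an assumption about what **isPartialFileScan** does (comparing the file size with the task length). The 4-byte value corresponds to the `PAR1` magic number at the start of every Parquet file.

```java
public class PartialScanSketch {

    // Length of the "PAR1" magic number at the start of a Parquet file.
    public static final long PARQUET_MAGIC_BYTES = 4L;

    // Hypothetical stand-in for a file scan task; not an Iceberg type.
    public static final class Task {
        final String format;        // "parquet" or "avro"
        final long fileSizeInBytes; // total size of the data file
        final long start;           // offset where the scan begins
        final long length;          // number of bytes covered by the scan

        public Task(String format, long fileSizeInBytes, long start, long length) {
            this.format = format;
            this.fileSizeInBytes = fileSizeInBytes;
            this.start = start;
            this.length = length;
        }
    }

    // Assumed current behavior: any mismatch between file size and
    // task length is treated as a partial scan.
    public static boolean isPartialNaive(Task task) {
        return task.fileSizeInBytes != task.length;
    }

    // Proposed behavior: for Parquet, add the initial 4-byte offset to
    // the task length before comparing it with the file size.
    public static boolean isPartialFormatAware(Task task) {
        long covered = task.length;
        if ("parquet".equals(task.format)) {
            covered += PARQUET_MAGIC_BYTES;
        }
        return task.fileSizeInBytes != covered;
    }

    public static void main(String[] args) {
        // A complete 1000-byte Parquet file scanned from offset 4 to the end.
        Task fullParquet = new Task("parquet", 1000L, 4L, 996L);
        // A complete 1000-byte Avro file scanned from offset 0 to the end.
        Task fullAvro = new Task("avro", 1000L, 0L, 1000L);

        System.out.println(isPartialNaive(fullParquet));       // true (misclassified)
        System.out.println(isPartialFormatAware(fullParquet)); // false
        System.out.println(isPartialNaive(fullAvro));          // false
        System.out.println(isPartialFormatAware(fullAvro));    // false
    }
}
```

The sketch adds the offset unconditionally for Parquet, as proposed in the comment. A stricter variant might also check that the task starts at offset 4, so that a scan of a later part of the file is not mistaken for a complete one.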
