pumbaaaaa opened a new issue, #51092: URL: https://github.com/apache/doris/issues/51092
### Search before asking

- [x] I had searched in the [issues](https://github.com/apache/doris/issues?q=is%3Aissue) and found no similar issues.

### Version

2.1.7

### What's Wrong?

```
mysql> set enable_file_cache=true;
Query OK, 0 rows affected (0.00 sec)

mysql> select date_format(cast(left(logtime, 19) as datetime(3)), '%Y-%m-%d') as dateValue from hive_36.streaming.mpaas_data_20230717 where pt_bd = '2025-04-28' group by dateValue limit 100;
ERROR 1105 (HY000): errCode = 2, detailMessage = (172.29.47.34)[CANCELLED]cur path: hdfs://bipcluster/user/hive/warehouse/streaming.db/mpaas_data_20230717/pt_td=2025-04-28/pt_bd=2025-04-28/compacted-part-c1607f8b-0b89-4aac-8555-9a0cb25dfd12-0-11348. Orc row reader nextBatch failed. reason = Buffer error in ZlibDecompressionStream::NextDecompress

mysql> set enable_file_cache=false;
Query OK, 0 rows affected (0.01 sec)

mysql> select date_format(cast(left(logtime, 19) as datetime(3)), '%Y-%m-%d') as dateValue from hive_36.streaming.mpaas_data_20230717 where pt_bd = '2025-04-28' group by dateValue limit 100;
+------------+
| dateValue  |
+------------+
| 2025-04-26 |
| 2025-04-18 |
| 2023-06-20 |
| 2024-09-01 |
+------------+
```

When querying a Hive external table through a catalog, the query fails when `enable_file_cache` is true and succeeds when it is false.

Aside from setting clear_file_cache = true to clear the file cache, how can this issue be resolved?

The following are all the error messages:

```
detailMessage = (xx)[CANCELLED]cur path: hdfs://xx/user/hive/warehouse/streaming.db/mpaas_data_20230717/pt_td=2025-04-28/pt_bd=2025-04-28/compacted-part-c1607f8b-0b89-4aac-8555-9a0cb25dfd12-0-11348. Orc row reader nextBatch failed. reason = Read past EOF in DecompressionStream::readBuffer
detailMessage = (xx)[CANCELLED]cur path: hdfs://xx/user/hive/warehouse/streaming.db/mpaas_data_20230717/pt_td=2025-04-16/pt_bd=2025-04-16/compacted-part-c1607f8b-0b89-4aac-8555-9a0cb25dfd12-0-7712. Orc row reader nextBatch failed. reason = Data error in ZlibDecompressionStream::NextDecompress
detailMessage = (xx)[CANCELLED]cur path: hdfs://xx/user/hive/warehouse/streaming.db/mpaas_data_20230717/pt_td=2025-04-28/pt_bd=2025-04-28/compacted-part-c1607f8b-0b89-4aac-8555-9a0cb25dfd12-0-11348. Orc row reader nextBatch failed. reason = Illegal run length for delta encoding: 1
detailMessage = (xx)[CANCELLED]cur path: hdfs://xx/user/hive/warehouse/streaming.db/mpaas_data_20230717/pt_td=2025-04-16/pt_bd=2025-04-16/compacted-part-c1607f8b-0b89-4aac-8555-9a0cb25dfd12-0-7773. failed to init reader, err: [INTERNAL_ERROR]Init OrcReader failed. reason = Invalid ORC postscript length
detailMessage = (xx)[CANCELLED]cur path: hdfs://xx/user/hive/warehouse/streaming.db/mpaas_data_20230717/pt_td=2025-04-28/pt_bd=2025-04-28/compacted-part-c1607f8b-0b89-4aac-8555-9a0cb25dfd12-0-11348. Orc row reader nextBatch failed. reason = Corrupt PATCHED_BASE encoded data (patchBitSize + pgw > 64)!
detailMessage = (xx)[CANCELLED]cur path: hdfs://xx/user/hive/warehouse/streaming.db/mpaas_data_20230717/pt_td=2025-04-28/pt_bd=2025-04-28/compacted-part-c1607f8b-0b89-4aac-8555-9a0cb25dfd12-0-11348. Orc row reader nextBatch failed. reason = Corrupt PATCHED_BASE encoded data (pl==0)!
detailMessage = (xx)[CANCELLED]cur path: hdfs://xx/user/hive/warehouse/streaming.db/mpaas_data_20230717/pt_td=2025-03-14/pt_bd=2025-03-14/part-41910-415c2648-d2db-4443-baa6-f346415998eb.c000.snappy.orc. Orc row reader nextBatch failed. reason = SnappyDecompressionStream choked on corrupt input
```

### What You Expected?

Resolve this issue

### How to Reproduce?

_No response_

### Anything Else?

_No response_

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [x] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)

--
This is an automated message from the Apache Git Service.
