Aggarwal-Raghav commented on PR #5391: URL: https://github.com/apache/hive/pull/5391#issuecomment-2284198010
As per my understanding:

1. One of the benefits of enabling _hive.orc.splits.include.file.footer_ is to reduce fs calls, as explained in HIVE-15038.
2. In the ORC code, _extractFileTail_ https://github.com/apache/orc/blob/7878691befc66ecc372ff41715cbdff97ec7aafd/java/core/src/java/org/apache/orc/impl/ReaderImpl.java#L569 makes an fs call to create the OrcTail. With the config enabled, that call was optimized away and the OrcTail object was created in OrcSplit.java instead: https://github.com/apache/hive/blob/d0d5d6d7d11b3eece0d0bc17b429cb30dec5dc79/ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcSplit.java#L230
3. In HIVE-15665, with hive.orc.splits.include.file.footer enabled, OrcEncodedDataReader.java requires the OrcTail to have serializedTail present (passing null or an empty BufferChunk won't help, as it will throw an NPE): https://github.com/apache/hive/blob/d0d5d6d7d11b3eece0d0bc17b429cb30dec5dc79/llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java#L669
4. A possible fix is either to "**_somehow_**" get the serializedTail while creating the OrcTail in OrcSplit.java, without making an additional fs call, or to revert HIVE-15038. Reverting will force the orcReader in OrcEncodedDataReader.java to perform _extractFileTail_, which will provide the serializedTail.
5. I have gone with reverting HIVE-15038.

Looking forward to suggestions on this.
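To illustrate the trade-off between the two ways of obtaining the tail, here is a minimal, self-contained Java sketch. It is a simplified model and not Hive or ORC code: `TailSketch`, `Tail`, `readTailFromFile`, `tailFromSplit`, and `cacheTail` are hypothetical names, and the "fs call" is only a counter. It assumes, as described in the comment, that a tail built from the split carries no serialized bytes while a tail read from the file does.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model of the two code paths; none of these names exist in Hive or ORC.
class TailSketch {

    // Stand-in for an ORC tail: parsed metadata plus, optionally, the raw tail bytes.
    static final class Tail {
        final String parsedFooter;        // stands in for the parsed footer metadata
        final ByteBuffer serializedTail;  // raw bytes; null when the tail came from a split

        Tail(String parsedFooter, ByteBuffer serializedTail) {
            this.parsedFooter = parsedFooter;
            this.serializedTail = serializedTail;
        }
    }

    // Counts simulated filesystem calls.
    static final AtomicInteger fsCalls = new AtomicInteger();

    // Path A (assumed): read the tail from the file. Costs one fs call, keeps the raw bytes.
    static Tail readTailFromFile(String path) {
        fsCalls.incrementAndGet();
        byte[] raw = ("tail-of-" + path).getBytes(StandardCharsets.UTF_8);
        return new Tail("footer:" + path, ByteBuffer.wrap(raw));
    }

    // Path B (assumed): build the tail from the footer carried in the split.
    // No fs call, but the raw bytes are not available.
    static Tail tailFromSplit(String footerInSplit) {
        return new Tail(footerInSplit, null);
    }

    // Stand-in for a consumer that needs the raw bytes, for example to cache them.
    // Throws NullPointerException when serializedTail is null.
    static int cacheTail(Tail tail) {
        return tail.serializedTail.remaining();
    }

    public static void main(String[] args) {
        Tail fromSplit = tailFromSplit("footer:a.orc");
        System.out.println("fs calls after split path: " + fsCalls.get());
        try {
            cacheTail(fromSplit);
        } catch (NullPointerException e) {
            System.out.println("split path: no serialized tail available");
        }

        Tail fromFile = readTailFromFile("a.orc");
        System.out.println("fs calls after file path: " + fsCalls.get());
        System.out.println("cached bytes: " + cacheTail(fromFile));
    }
}
```

In this model the split path saves the fs call but leaves the consumer without the bytes it needs, while the file path pays for one call and always has them.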