Aggarwal-Raghav commented on PR #5391:
URL: https://github.com/apache/hive/pull/5391#issuecomment-2284198010

   As per my understanding:
   
   1. One benefit of enabling _hive.orc.splits.include.file.footer_ is 
to reduce fs calls, as explained in HIVE-15038. In the ORC code, _extractFileTail_ 
https://github.com/apache/orc/blob/7878691befc66ecc372ff41715cbdff97ec7aafd/java/core/src/java/org/apache/orc/impl/ReaderImpl.java#L569
 makes an fs call to create the OrcTail. With the config enabled, this was 
optimized away: the OrcTail object was created in OrcSplit.java instead 
https://github.com/apache/hive/blob/d0d5d6d7d11b3eece0d0bc17b429cb30dec5dc79/ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcSplit.java#L230
   
   3. In HIVE-15665, with hive.orc.splits.include.file.footer enabled, the 
OrcTail must have serializedTail present (passing a null or empty 
BufferChunk won't help, as it will throw an NPE) 
https://github.com/apache/hive/blob/d0d5d6d7d11b3eece0d0bc17b429cb30dec5dc79/llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java#L669
        
   4. A possible fix is to "**_somehow_**" get the serializedTail while 
creating the OrcTail in OrcSplit.java, without making an additional fs call. 
Otherwise we need to revert HIVE-15038, which forces the orcReader in 
OrcEncodedDataReader.java to perform _extractFileTail_, which yields the 
serializedTail.
        
   5. I have gone with reverting HIVE-15038. Looking forward to 
suggestions on this.
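
   To make the trade-off in points 3 and 4 concrete, here is a toy model. It 
is not Hive or ORC code: every class, field, and method below is hypothetical 
and only mirrors the roles described above (a tail built from the split with 
no serialized bytes, versus a tail read from the file at the cost of one fs 
call).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Toy model only: none of these types are the real Hive/ORC classes.
public class TailSketch {

    // Hypothetical stand-in for an OrcTail: parsed footer plus optional raw bytes.
    static final class ToyTail {
        final String footer;
        final ByteBuffer serializedTail; // null when the tail was built from the split

        ToyTail(String footer, ByteBuffer serializedTail) {
            this.footer = footer;
            this.serializedTail = serializedTail;
        }
    }

    // Counts simulated file-system calls.
    static int fsCalls = 0;

    // Stand-in for the extractFileTail path: one fs call, raw bytes retained.
    static ToyTail readTailFromFile() {
        fsCalls++;
        byte[] raw = "footer-bytes".getBytes(StandardCharsets.UTF_8);
        return new ToyTail("footer", ByteBuffer.wrap(raw));
    }

    // Stand-in for the split path: no fs call, but no raw bytes either.
    static ToyTail tailFromSplit() {
        return new ToyTail("footer", null);
    }

    // Stand-in for a consumer that needs the serialized bytes.
    // Throws NullPointerException when serializedTail is null.
    static int consumeSerializedTail(ToyTail tail) {
        return tail.serializedTail.remaining();
    }

    public static void main(String[] args) {
        try {
            consumeSerializedTail(tailFromSplit());
        } catch (NullPointerException e) {
            System.out.println("split-built tail: NPE, fs calls = " + fsCalls);
        }
        int size = consumeSerializedTail(readTailFromFile());
        System.out.println("file-read tail: " + size + " bytes, fs calls = " + fsCalls);
    }
}
```

   In this sketch the split-built tail saves the fs call but fails in the 
consumer, while the file-read tail pays the call and works. The revert accepts 
that cost.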
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org