[ 
https://issues.apache.org/jira/browse/HIVE-11595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14705461#comment-14705461
 ] 

Prasanth Jayachandran commented on HIVE-11595:
----------------------------------------------

1) Can you rename the reader API from getSerializedFileMetadata() to 
getSerializedFooter()?
2) Can you also rename the variable fullFooterBuffer to serializedFooter?
3) Can you revert the signature of getAndCheckPostScript() to use Path instead 
of Object? I am assuming the metastore has enough information about where the 
ByteBuffer came from (i.e., the path that the ByteBuffer belongs to). It would 
be better to throw an exception with the path information instead of just 
"Byte buffer", which gives us no clue about the source. You can add another 
helper for extractMetaInfoFromFooter that accepts a Path as a parameter.
4) Can you rename getAndCheckPostScript() to extractPostScript(), to be in 
line with the other extract methods?
5) What is the purpose of the FooterInfo class? Apart from the serialized 
footer and metadata (serialized or non-serialized?), what other information is 
stored in the metastore?
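
To illustrate point 3, here is a rough sketch of a path-aware check. It is 
not the code from the patch: the class name, the method name 
extractPostScriptLength, and the use of a plain String in place of Hadoop's 
Path are all assumptions made to keep the example self-contained. The only 
ORC detail it relies on is that the last byte of the file tail holds the 
postscript length. The point is only that the exception message names the 
file the buffer came from.

```java
import java.io.IOException;
import java.nio.ByteBuffer;

// Hypothetical sketch; names are illustrative and not taken from the patch.
class FooterErrorSketch {

    /**
     * Returns the postscript length stored in the last byte of the buffer.
     * The path is used only to build a useful error message; a String
     * stands in for org.apache.hadoop.fs.Path here.
     */
    static int extractPostScriptLength(String path, ByteBuffer buffer)
            throws IOException {
        int size = buffer.remaining();
        if (size == 0) {
            throw new IOException(
                "Malformed ORC file " + path + ": empty footer buffer");
        }
        // Absolute read of the last byte, interpreted as an unsigned value.
        int psLen = buffer.get(buffer.limit() - 1) & 0xff;
        if (psLen > size - 1) {
            throw new IOException("Malformed ORC file " + path
                + ": postscript length " + psLen
                + " exceeds available " + (size - 1) + " bytes");
        }
        return psLen;
    }
}
```

With this shape, a corrupt cached footer fails with the file path in the 
message rather than a generic "Byte buffer" label.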

> refactor ORC footer reading to make it usable from outside
> ----------------------------------------------------------
>
>                 Key: HIVE-11595
>                 URL: https://issues.apache.org/jira/browse/HIVE-11595
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-10595.patch, HIVE-11595.01.patch
>
>
> If the ORC footer is read from cache, we want to parse it without having the 
> reader, opening a file, etc. I thought it would be as simple as protobuf 
> parseFrom bytes, but apparently there's a bunch of stuff going on there. It 
> needs to be accessible via something like parseFrom(ByteBuffer), or similar.



