[
https://issues.apache.org/jira/browse/HIVE-11595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14705461#comment-14705461
]
Prasanth Jayachandran commented on HIVE-11595:
----------------------------------------------
1) Can you rename the reader API from getSerializedFileMetadata() to
getSerializedFooter()?
2) Can you also rename the variable fullFooterBuffer to serializedFooter?
3) Can you revert the signature of getAndCheckPostScript() to use Path instead
of Object? I am assuming the metastore has enough information about where the
ByteBuffer came from (i.e., the path that the ByteBuffer belongs to). It would
be better to throw an exception with the path information instead of just
"Byte buffer", which gives us no clue about the source. You can add another
helper for extractMetaInfoFromFooter that accepts Path as a parameter.
4) Can you rename getAndCheckPostScript() to extractPostScript() to be in line
with the other extract methods?
5) What is the purpose of the FooterInfo class? Apart from the serialized
footer and metadata (serialized or non-serialized?), what other information is
stored in the metastore?
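
To illustrate point 3, here is a minimal sketch of what a Path-aware
extractPostScript() might look like. It is not taken from the patch: the
FooterSketch and PostScript classes are hypothetical stand-ins, java.nio.file.Path
is used in place of Hadoop's Path so the snippet compiles on its own, and the
validation is reduced to the postscript-length byte that ORC stores as the last
byte of the file.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Path;

final class FooterSketch {

    /** Hypothetical stand-in for the parsed postscript; only the length is kept. */
    static final class PostScript {
        final int length;

        PostScript(int length) {
            this.length = length;
        }
    }

    /**
     * Reads the postscript length from the tail of a serialized footer.
     * The path is used only to make error messages traceable to a file.
     * (Assumption: the real method would go on to parse and check the
     * protobuf postscript; that part is omitted here.)
     */
    static PostScript extractPostScript(ByteBuffer serializedFooter, Path path)
            throws IOException {
        int size = serializedFooter.remaining();
        if (size < 1) {
            throw new IOException("Malformed ORC file " + path
                + ": empty footer buffer");
        }
        // The last byte holds the postscript length as an unsigned value.
        int psLen = serializedFooter.get(serializedFooter.limit() - 1) & 0xff;
        if (psLen + 1 > size) {
            throw new IOException("Malformed ORC file " + path
                + ": invalid postscript length " + psLen);
        }
        return new PostScript(psLen);
    }
}
```

With a signature along these lines, a failure on a cached footer reports the
originating file rather than a generic "Byte buffer".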
> refactor ORC footer reading to make it usable from outside
> ----------------------------------------------------------
>
> Key: HIVE-11595
> URL: https://issues.apache.org/jira/browse/HIVE-11595
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Attachments: HIVE-10595.patch, HIVE-11595.01.patch
>
>
> If the ORC footer is read from cache, we want to parse it without having the
> reader, opening a file, etc. I thought it would be as simple as protobuf
> parseFrom bytes, but apparently there's a bunch of stuff going on there. It
> needs to be accessible via something like parseFrom(ByteBuffer), or similar.