[
https://issues.apache.org/jira/browse/ORC-202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16053051#comment-16053051
]
ASF GitHub Bot commented on ORC-202:
------------------------------------
Github user omalley commented on a diff in the pull request:
https://github.com/apache/orc/pull/132#discussion_r122580547
--- Diff: java/core/src/java/org/apache/orc/impl/OrcTail.java ---
@@ -70,8 +70,11 @@ public long getFileModificationTime() {
   public OrcFile.WriterVersion getWriterVersion() {
     OrcProto.PostScript ps = fileTail.getPostscript();
+    OrcProto.Footer footer = fileTail.getFooter();
+    OrcFile.WriterImplementation writer =
+        OrcFile.WriterImplementation.from(footer.getWriter());
     return (ps.hasWriterVersion()
--- End diff --
writerVersion is required for anything except the pre-HIVE-8732 Java ORC
files. Otherwise, as you say, readers will interpret the file as one of the
pre-HIVE-8732 files. I can simplify this condition: if the field isn't
present, it gets the default value of 0, which is correct.
The current code in this pull request would treat broken files that have
writer = presto, writerVersion = 0 as written by a future version. I guess I
could check for the case of presto or orc_cpp with version < 6 and throw an
exception.
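
A minimal sketch of the check suggested above, not the code in the pull request. The class WriterVersionCheck, the method resolveWriterVersion, the nested WriterImplementation enum with its numeric ids, and the choice of IllegalArgumentException are all hypothetical stand-ins for illustration; only the rule itself (presto or orc_cpp with version < 6 is rejected, and a missing writerVersion defaults to 0) comes from the comment.

```java
// Hypothetical, self-contained illustration; it does not use the real ORC classes.
final class WriterVersionCheck {

  // Stand-in for OrcFile.WriterImplementation; the ids here are assumed.
  enum WriterImplementation {
    ORC_JAVA(0), ORC_CPP(1), PRESTO(2), UNKNOWN(Integer.MAX_VALUE);

    private final int id;

    WriterImplementation(int id) {
      this.id = id;
    }

    // Map the numeric writer id from the footer to an enum value.
    static WriterImplementation from(int id) {
      for (WriterImplementation w : values()) {
        if (w.id == id) {
          return w;
        }
      }
      return UNKNOWN;
    }
  }

  // Threshold taken from the comment: non-Java writers must report >= 6.
  static final int MIN_NON_JAVA_VERSION = 6;

  // Returns the effective writer version, or throws for the broken case.
  static int resolveWriterVersion(WriterImplementation writer,
                                  boolean hasWriterVersion,
                                  int writerVersion) {
    // A missing field behaves like the default value 0,
    // which is the right answer for pre-HIVE-8732 Java files.
    int version = hasWriterVersion ? writerVersion : 0;

    boolean nonJavaWriter = writer == WriterImplementation.PRESTO
        || writer == WriterImplementation.ORC_CPP;
    if (nonJavaWriter && version < MIN_NON_JAVA_VERSION) {
      throw new IllegalArgumentException(
          "Broken file: writer = " + writer + ", writerVersion = " + version);
    }
    return version;
  }
}
```

Under these assumptions, resolveWriterVersion(WriterImplementation.from(0), false, 0) returns 0 for an old Java file, while a file with writer = presto and writerVersion = 0 is rejected instead of being silently misclassified.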
> Add enum that encodes which writer wrote a file
> -----------------------------------------------
>
> Key: ORC-202
> URL: https://issues.apache.org/jira/browse/ORC-202
> Project: ORC
> Issue Type: Bug
> Reporter: Owen O'Malley
> Assignee: Owen O'Malley
>
> Add a protobuf enum value in the footer that can encode which writer wrote
> the file:
> * ORC Java Writer
> * ORC C++ Writer
> * Presto Writer