[
https://issues.apache.org/jira/browse/HIVE-12257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14979641#comment-14979641
]
Eugene Koifman commented on HIVE-12257:
---------------------------------------
I think FileDump.printData() should include e.getMessage() in the error it
prints with
System.err.println("Unable to dump data for file: " + file);
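
A minimal sketch of that suggestion, not the actual FileDump code: the class
name, the describeFailure helper and the sample path are hypothetical and exist
only to show the exception message being appended to the existing text.

```java
import java.io.IOException;

public class DumpErrorMessage {
    // Hypothetical helper: keeps the original message text and appends the
    // exception's own message so the reason for the failure is visible.
    static String describeFailure(String file, Exception e) {
        return "Unable to dump data for file: " + file + " (" + e.getMessage() + ")";
    }

    public static void main(String[] args) {
        // Illustrative path only; not taken from the issue.
        String file = "delta_0000001_0000001/bucket_00000";
        try {
            throw new IOException("simulated read failure");
        } catch (IOException e) {
            System.err.println(describeFailure(file, e));
        }
    }
}
```

Note that e.getMessage() can return null for some exceptions, so the real change
might prefer e.toString() or printing the stack trace as well.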
I think the getReaderInfo(final Configuration conf, final Path sideFile, final
Path path) implementation may be unreliable.
See OrcRawRecordMerger.getLastFlushLength.
Instead of relying on NN (NameNode) metadata for the file length, it reads in a
while loop. I believe this is to make sure it reads until EOF even if the NN
doesn't yet have the latest length.
(I guess ReaderImpl.extractMetaInfoFromFooter can't use the same trick, since
that would be a perf problem.)
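
To illustrate the read-until-EOF idea, here is a rough sketch using only
java.io. It is not the OrcRawRecordMerger code. It assumes the side file is a
sequence of 8-byte big-endian longs, each one a footer offset, and the class and
method names are made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

public class LastFlushLength {
    // Reads longs until the stream itself reports EOF and returns the last
    // complete one, or -1 if none was read. The reported file length is never
    // consulted, so stale length metadata cannot cut the read short.
    static long readLastLength(InputStream in) {
        DataInputStream data = new DataInputStream(in);
        long last = -1L;
        while (true) {
            try {
                last = data.readLong();
            } catch (EOFException eof) {
                // Also reached when a trailing entry is only partly written.
                return last;
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    // Test helper: encodes values the way the sketch assumes they are stored.
    static byte[] encode(long... values) {
        ByteBuffer buf = ByteBuffer.allocate(Long.BYTES * values.length);
        for (long v : values) {
            buf.putLong(v);
        }
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] sideFile = encode(100L, 250L);
        System.out.println(readLastLength(new ByteArrayInputStream(sideFile)));
    }
}
```

The point of the loop is that the last complete entry wins, whatever length the
metadata reports at open time.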
> Enhance ORC FileDump utility to handle flush_length files
> ---------------------------------------------------------
>
> Key: HIVE-12257
> URL: https://issues.apache.org/jira/browse/HIVE-12257
> Project: Hive
> Issue Type: Bug
> Affects Versions: 1.3.0, 2.0.0
> Reporter: Prasanth Jayachandran
> Assignee: Prasanth Jayachandran
> Attachments: HIVE-12257.1.patch
>
>
> The ORC file dump utility currently does not handle delta directories that
> contain *_flush_length files. These files contain offsets to the footer in
> the corresponding delta file.