[jira] [Commented] (HIVE-12257) Enhance ORC FileDump utility to handle flush_length files

Eugene Koifman (JIRA) Wed, 28 Oct 2015 18:47:31 -0700

    [ https://issues.apache.org/jira/browse/HIVE-12257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14979641#comment-14979641 ]


Eugene Koifman commented on HIVE-12257:
---------------------------------------

I think FileDump.printData() should include e.getMessage() in the message it 
prints with System.err.println("Unable to dump data for file: " + file);
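A minimal sketch of what that could look like. The class name, the helper describeFailure, and the file path are hypothetical and only illustrate appending e.getMessage() to the existing message; this is not the actual FileDump code.

```java
import java.io.IOException;

public class DumpErrorMessage {
    // Hypothetical helper: builds the existing diagnostic line and appends
    // the exception message so the cause of the failure is visible.
    static String describeFailure(String file, Exception e) {
        return "Unable to dump data for file: " + file + " (" + e.getMessage() + ")";
    }

    public static void main(String[] args) {
        // Made-up path, used only for the demonstration.
        String file = "delta_0000001_0000001/bucket_00000";
        try {
            throw new IOException("simulated read failure");
        } catch (IOException e) {
            System.err.println(describeFailure(file, e));
        }
    }
}
```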


I think the implementation of getReaderInfo(final Configuration conf, final 
Path sideFile, final Path path) may be unreliable.
See OrcRawRecordMerger.getLastFlushLength.
Instead of relying on NN (NameNode) metadata for the file length, it reads in 
a while loop.  This (I believe) is to make sure it reads until EOF even if the 
NN doesn't yet have the latest info.
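A rough sketch of the read-until-EOF idea, not the actual OrcRawRecordMerger code. It assumes the side file is a sequence of 8-byte longs and uses plain java.io streams instead of Hadoop's FileSystem so it runs standalone; the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

public class LastFlushLength {
    // Reads longs until the stream is exhausted instead of trusting a
    // previously reported file length. Returns the last complete value,
    // or -1 if the stream holds no complete 8-byte entry.
    static long readLastFlushLength(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        long last = -1;
        while (true) {
            try {
                last = data.readLong();
            } catch (EOFException eof) {
                // Clean EOF or a partially written trailing entry:
                // keep the last complete value.
                return last;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeLong(100L);
        out.writeLong(250L);
        out.flush();
        // prints 250
        System.out.println(readLastFlushLength(new ByteArrayInputStream(bytes.toByteArray())));
    }
}
```

Catching EOFException rather than checking a length up front means a half-written final entry is ignored and the previous complete value is returned.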

(I guess ReaderImpl.extractMetaInfoFromFooter can't use the same trick since 
that would be a perf problem)

> Enhance ORC FileDump utility to handle flush_length files
> ---------------------------------------------------------
>
>                 Key: HIVE-12257
>                 URL: https://issues.apache.org/jira/browse/HIVE-12257
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 1.3.0, 2.0.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>         Attachments: HIVE-12257.1.patch
>
>
> The ORC file dump utility currently does not handle delta directories that 
> contain *_flush_length files. These files contain offsets to the footer in 
> the corresponding delta file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
