[ https://issues.apache.org/jira/browse/HIVE-8137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gopal V updated HIVE-8137:
--------------------------
    Fix Version/s:     (was: 0.14.0)
                   0.15.0

> Empty ORC file handling
> -----------------------
>
>                 Key: HIVE-8137
>                 URL: https://issues.apache.org/jira/browse/HIVE-8137
>             Project: Hive
>          Issue Type: Improvement
>          Components: File Formats
>    Affects Versions: 0.13.1
>            Reporter: Pankit Thapar
>             Fix For: 0.15.0
>
>         Attachments: HIVE-8137.2.patch, HIVE-8137.patch
>
>
> Hive 0.13 does not handle reading a zero-size ORC file properly. An ORC file
> is supposed to have a PostScript, which the ReaderImpl class reads in order
> to initialize the footer. If the file is empty (zero size), ReaderImpl runs
> into an IndexOutOfBoundsException because it attempts this read in its
> constructor.
> Code snippet:
> //get length of PostScript
> int psLen = buffer.get(readSize - 1) & 0xff;
> In the above code, readSize for an empty file is zero, so the index passed
> to buffer.get() is -1.
> I see that the ensureOrcFooter() method performs some sanity checks on the
> footer, so we can either:
> 1. move the above code snippet into ensureOrcFooter() and throw a
>    "Malformed ORC file" exception, or
> 2. create a dummy Reader that does not initialize the footer and whose
>    hasNext() returns false on the first call.
> Basically, I would like to know the correct way to handle an empty ORC file
> in a mapred job. Should we ignore it without throwing an exception, or
> should we throw an exception stating that the ORC file is malformed?
> Please let me know your thoughts on this.
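The two options raised in the description can be sketched in isolation. The following is a minimal illustration, not the attached patch and not Hive's actual ReaderImpl: the class name EmptyOrcGuard and both method names are hypothetical, and a plain IOException stands in for whatever "Malformed ORC file" exception type Hive would use. It only shows how a guard on readSize avoids the buffer.get(-1) call.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.OptionalInt;

// Hypothetical helper class, for illustration only; not part of Hive.
class EmptyOrcGuard {

    // Option 1: reject an empty file with a descriptive exception
    // instead of letting buffer.get(-1) throw IndexOutOfBoundsException.
    static int postScriptLength(ByteBuffer buffer, int readSize) throws IOException {
        if (readSize <= 0) {
            // Assumption: IOException stands in for Hive's malformed-file exception.
            throw new IOException("Malformed ORC file: file is empty, no PostScript to read");
        }
        // The last byte of the file tail holds the PostScript length (unsigned).
        return buffer.get(readSize - 1) & 0xff;
    }

    // Option 2: treat an empty file as "no data" so that a caller could
    // build a reader whose hasNext() returns false on the first call.
    static OptionalInt postScriptLengthOrEmpty(ByteBuffer buffer, int readSize) {
        if (readSize <= 0) {
            return OptionalInt.empty();
        }
        return OptionalInt.of(buffer.get(readSize - 1) & 0xff);
    }

    public static void main(String[] args) {
        // A made-up 4-byte tail whose last byte (0x17) is the PostScript length.
        ByteBuffer tail = ByteBuffer.wrap(new byte[] {0x4f, 0x52, 0x43, 0x17});
        ByteBuffer empty = ByteBuffer.allocate(0);
        try {
            System.out.println(postScriptLength(tail, 4));
            postScriptLength(empty, 0);
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
        System.out.println(postScriptLengthOrEmpty(empty, 0).isPresent());
    }
}
```

Option 1 surfaces a zero-size file as an error to the job, while option 2 lets the job skip it silently; which behaviour is preferable is exactly the open question in this issue.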