[ https://issues.apache.org/jira/browse/HIVE-20327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16572131#comment-16572131 ]
Eugene Koifman edited comment on HIVE-20327 at 8/7/18 7:01 PM:
---------------------------------------------------------------

patch 2 is a prototype and an unsuccessful attempt to repro this

    Reader deltaReader = OrcFile.createReader(deltaFile, OrcFile.readerOptions(conf).maxLength(length));
    recordReader = reader.rowsOptions(options, conf);

recordReader.hasNext() returns false when deltaFile is an empty file....

was (Author: ekoifman):
patch 2 is a prototype and an unsuccessful attempt to repro this

> Compactor should gracefully handle 0 length files and invalid orc files
> -----------------------------------------------------------------------
>
>                 Key: HIVE-20327
>                 URL: https://issues.apache.org/jira/browse/HIVE-20327
>             Project: Hive
>          Issue Type: Improvement
>          Components: Transactions
>    Affects Versions: 2.0.0
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>            Priority: Major
>         Attachments: HIVE-20327.02.patch
>
>
> Older versions of Streaming API did not handle interrupts well and could
> leave 0-length ORC files behind which cannot be read.
> These should be just skipped.
> Other cases of file where ORC Reader cannot be created
> 1. regular write (1 txn delta) where the client died and didn't properly
> close the file - this delta should be aborted and never read
> 2. streaming ingest write (delta_x_y, x < y). There should always be a side
> file if the file was not closed properly. (though it may still indicate that
> length is 0)
> If we check these cases and still can't create a reader, it should not
> silently skip the file since the system thinks it contains at least some
> committed data but the file is corrupted (and the side file doesn't point at
> a valid footer) - we should never be in this situation and we should throw so
> that the end user can try manual intervention (where the only option may be
> deleting the file)

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
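
The handling proposed in the issue description can be read as a small decision procedure: skip files that cannot hold committed data, read files for which a reader can be created, and fail loudly otherwise. The following is a minimal sketch of that reading, not the code in the attached patch or in Hive. The class `DeltaFileTriage`, the `Action` enum, `CorruptDeltaException`, and all four parameters are hypothetical names introduced for illustration. The boolean and length inputs stand in for checks the compactor would make against transaction state, the side file, and `OrcFile.createReader`. Treating a side-file length of 0 as "nothing committed, skip" is an assumption drawn from the parenthetical remark in case 2.

```java
import java.util.OptionalLong;

// Hypothetical sketch, not Hive code: every name below is an assumption.
final class DeltaFileTriage {

    enum Action { READ, SKIP }

    /** Thrown when a file expected to hold committed data cannot be opened. */
    static final class CorruptDeltaException extends RuntimeException {
        CorruptDeltaException(String message) {
            super(message);
        }
    }

    /**
     * Decide what to do with one delta file.
     *
     * @param physicalLength  size of the file on disk, in bytes
     * @param txnAborted      true if the delta belongs to an aborted transaction
     * @param sideFileLength  logical length recorded in the side file, if one exists
     * @param readerCreatable true if an ORC reader could be created for the file
     */
    static Action decide(long physicalLength,
                         boolean txnAborted,
                         OptionalLong sideFileLength,
                         boolean readerCreatable) {
        // Case 1: the writer died on a single-transaction delta; it is aborted and never read.
        if (txnAborted) {
            return Action.SKIP;
        }
        // 0-length files left behind by an interrupted streaming writer are just skipped.
        if (physicalLength == 0) {
            return Action.SKIP;
        }
        // Case 2 (assumption): a side file that records length 0 means no committed data.
        if (sideFileLength.isPresent() && sideFileLength.getAsLong() == 0) {
            return Action.SKIP;
        }
        if (readerCreatable) {
            return Action.READ;
        }
        // The system believes committed data exists but the file is unreadable: do not skip silently.
        throw new CorruptDeltaException(
            "Delta file should contain committed data but no ORC reader can be created");
    }

    public static void main(String[] args) {
        System.out.println(decide(0L, false, OptionalLong.empty(), false));     // prints SKIP
        System.out.println(decide(4096L, false, OptionalLong.of(2048L), true)); // prints READ
    }
}
```

Throwing in the final branch, rather than returning a third `Action`, mirrors the point in the description that this state should never occur and that the end user needs to see it in order to intervene manually.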