[ https://issues.apache.org/jira/browse/HIVE-20327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Eugene Koifman reassigned HIVE-20327:
-------------------------------------
> Compactor should gracefully handle 0 length files and invalid orc files
> -----------------------------------------------------------------------
>
> Key: HIVE-20327
> URL: https://issues.apache.org/jira/browse/HIVE-20327
> Project: Hive
> Issue Type: Improvement
> Components: Transactions
> Affects Versions: 2.0.0
> Reporter: Eugene Koifman
> Assignee: Eugene Koifman
> Priority: Major
>
> Older versions of the Streaming API did not handle interrupts well and could
> leave behind 0-length ORC files, which cannot be read.
> These should simply be skipped.
> Other cases where an ORC Reader cannot be created for a file:
> 1. Regular write (1 txn delta) where the client died and did not properly
> close the file: this delta should be aborted and never read.
> 2. Streaming ingest write (delta_x_y, x < y): there should always be a side
> file if the file was not closed properly (though the side file may still
> indicate that the length is 0).
> If we check these cases and still cannot create a reader, the Compactor should
> not silently skip the file: the system thinks it contains at least some
> committed data, but the file is corrupted (and the side file does not point at
> a valid footer). We should never be in this situation, so we should throw so
> that the end user can try manual intervention (where the only option may be
> deleting the file).
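
The decision order described above could be sketched roughly as follows. This is only an illustration of the rules in the description, not the actual Compactor code: the class `CompactorFileCheck`, the `Action` enum, and the `decide` method are hypothetical names, and the `canOpenReader` predicate stands in for whatever attempt is made to create an ORC Reader at a given logical length.

```java
import java.util.OptionalLong;
import java.util.function.LongPredicate;

// Hypothetical sketch; none of these names come from the Hive code base.
class CompactorFileCheck {

    enum Action { READ, SKIP }

    /**
     * Decides what to do with one delta file.
     *
     * @param fileLength     physical length of the file in bytes
     * @param txnAborted     true if the writing transaction is aborted
     *                       (e.g. a 1 txn delta whose client died)
     * @param sideFileLength logical length recorded in the side file, if a
     *                       side file exists (streaming ingest, delta_x_y)
     * @param canOpenReader  assumed probe: can a reader be created when the
     *                       file is treated as having the given length?
     */
    static Action decide(long fileLength,
                         boolean txnAborted,
                         OptionalLong sideFileLength,
                         LongPredicate canOpenReader) {
        // Case 1: an aborted delta must never be read.
        if (txnAborted) {
            return Action.SKIP;
        }
        // Case 2: prefer the side file's length when one is present.
        long logicalLength = sideFileLength.orElse(fileLength);
        // 0-length files (physical or per side file) hold no data: skip.
        if (logicalLength == 0L) {
            return Action.SKIP;
        }
        if (canOpenReader.test(logicalLength)) {
            return Action.READ;
        }
        // Committed data is expected but no valid footer was found:
        // fail loudly instead of silently skipping the file.
        throw new IllegalStateException(
            "Cannot create reader for file with logical length " + logicalLength);
    }
}
```

The point of throwing in the last branch, rather than returning `SKIP`, is that the failure becomes visible to the end user, who can then attempt manual intervention.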
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)