[
https://issues.apache.org/jira/browse/HBASE-16964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gary Helmling updated HBASE-16964:
----------------------------------
Attachment: HBASE-16964.patch
Attaching a patch against master.
This introduces a new FailedArchiveException, which can be thrown by
HFileArchiver.archiveStoreFiles() in order to retain information on which files
failed. Only the successfully archived store files will then get cleared from
the compactedfiles list.
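To make the intent concrete, here is a rough, self-contained sketch of the idea
(not the code in the attached patch). File names are plain strings, and
PartialArchiveSketch, getFailedFiles() and the Predicate-based archiveOne
callback are hypothetical stand-ins invented for illustration; the real
signatures in HFileArchiver and HStore differ.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;

class PartialArchiveSketch {

  /** Hypothetical stand-in for FailedArchiveException: remembers which files failed. */
  static class FailedArchiveException extends IOException {
    private final Collection<String> failedFiles;

    FailedArchiveException(String message, Collection<String> failedFiles) {
      super(message);
      this.failedFiles = Collections.unmodifiableList(new ArrayList<>(failedFiles));
    }

    Collection<String> getFailedFiles() {
      return failedFiles;
    }
  }

  /** Simplified archiver: attempts every file, then reports all failures together. */
  static void archiveStoreFiles(Collection<String> files, Predicate<String> archiveOne)
      throws FailedArchiveException {
    List<String> failed = new ArrayList<>();
    for (String file : files) {
      if (!archiveOne.test(file)) {
        failed.add(file);
      }
    }
    if (!failed.isEmpty()) {
      throw new FailedArchiveException("Failed to archive: " + failed, failed);
    }
  }

  /** Clears only the successfully archived files; failed ones stay for a later retry. */
  static List<String> removeCompactedFiles(List<String> compactedFiles,
      Predicate<String> archiveOne) {
    List<String> archived = new ArrayList<>(compactedFiles);
    try {
      archiveStoreFiles(compactedFiles, archiveOne);
    } catch (FailedArchiveException e) {
      // Keep the failed files in the list, drop everything else.
      archived.removeAll(e.getFailedFiles());
    }
    compactedFiles.removeAll(archived);
    return compactedFiles;
  }

  public static void main(String[] args) {
    List<String> compacted = new ArrayList<>(Arrays.asList("a", "b", "c"));
    // Pretend that archiving "b" fails while "a" and "c" succeed.
    System.out.println(removeCompactedFiles(compacted, f -> !f.equals("b"))); // prints [b]
  }
}
```

The point of carrying the failed files in the exception is that the caller can
still make progress on a partial failure instead of treating it as all or
nothing.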
I'd also like to call out a change to HFileArchiver.resolveAndArchiveFile().
Here I added an explicit catch for FileNotFoundException and do _not_ set
success to false in that case. My reasoning is that if we're going to archive
the file anyway, do we really care (or is it really an error) if the file does
not exist? I'd appreciate any thoughts or insights on the validity of this from
other perspectives.
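Roughly, the change in behavior looks like the sketch below. This is a
simplified illustration rather than the patch itself: MissingFileArchiveSketch
and the Mover interface are hypothetical names standing in for the real file
system call, and the actual method takes different arguments.

```java
import java.io.FileNotFoundException;
import java.io.IOException;

class MissingFileArchiveSketch {

  /** Hypothetical stand-in for the file system operation that moves a file to the archive. */
  interface Mover {
    void moveToArchive(String file) throws IOException;
  }

  /** Returns false only on a genuine failure; a file that is already gone is not one. */
  static boolean resolveAndArchiveFile(String file, Mover mover) {
    boolean success = true;
    try {
      mover.moveToArchive(file);
    } catch (FileNotFoundException fnfe) {
      // Nothing left to archive, so deliberately leave success as true.
    } catch (IOException e) {
      // Any other I/O problem still counts as a failed archive attempt.
      success = false;
    }
    return success;
  }

  public static void main(String[] args) {
    boolean missing = resolveAndArchiveFile("f1", f -> {
      throw new FileNotFoundException(f);
    });
    boolean broken = resolveAndArchiveFile("f2", f -> {
      throw new IOException("simulated I/O error");
    });
    System.out.println(missing + " " + broken); // prints true false
  }
}
```

The FileNotFoundException catch has to come before the IOException catch, since
it is the more specific type.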
[~ram_krish] please take a look.
> Successfully archived files are not cleared from compacted store file list if
> archiving of any file fails
> ---------------------------------------------------------------------------------------------------------
>
> Key: HBASE-16964
> URL: https://issues.apache.org/jira/browse/HBASE-16964
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Reporter: Gary Helmling
> Assignee: Gary Helmling
> Priority: Blocker
> Fix For: 2.0.0, 1.3.0, 1.4.0
>
> Attachments: HBASE-16964.patch
>
>
> In HStore.removeCompactedFiles(), we only clear archived files from
> StoreFileManager's list of compactedfiles if _all_ files were archived
> successfully. If we encounter an error archiving any of the files, then any
> files which were already archived will remain in the list of compactedfiles.
> Even worse, this means that all subsequent attempts to archive the list of
> compacted files will fail (as the previously successfully archived files
> still in the list will now throw FileNotFoundException), and the list of
> compactedfiles will never be cleared from that point on.
> Finally, when the region closes, we will again throw an exception out of
> HStore.removeCompactedFiles(), in this case causing a regionserver abort.