[ 
https://issues.apache.org/jira/browse/HBASE-23633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17026669#comment-17026669
 ] 

Pankaj Kumar commented on HBASE-23633:
--------------------------------------

In my test scenario, all of the corrupted hfiles were zero-length. We should 
check for and delete zero-length files during the recovered hfile bulk load, the 
same way they are handled when replaying recovered edits:

{code}
  private long loadRecoveredHFilesIfAny(Collection<HStore> stores) throws IOException {
    Path regionDir = getWALRegionDir();
    long maxSeqId = -1;
    for (HStore store : stores) {
      String familyName = store.getColumnFamilyName();
      FileStatus[] files =
          WALSplitUtil.getRecoveredHFiles(fs.getFileSystem(), regionDir, familyName);
      if (files != null && files.length != 0) {
        for (FileStatus file : files) {
          // Check and delete the zero length file
          if (isZeroLengthThenDelete(fs.getFileSystem(), file.getPath())) {
            continue;
          }
          store.assertBulkLoadHFileOk(file.getPath());
{code}
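
For illustration, the check-and-delete step can be sketched outside HBase. This is only a minimal stand-in under stated assumptions: it uses java.nio.file in place of the Hadoop FileSystem API, and the class name RecoveredFileFilter and the method dropZeroLength are hypothetical. Only the name isZeroLengthThenDelete comes from the snippet above, and its body here is a guess at the intended behaviour, not the HBase implementation.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RecoveredFileFilter {

  /** Deletes the file and returns true if it is zero-length; otherwise leaves it untouched. */
  static boolean isZeroLengthThenDelete(Path p) throws IOException {
    if (Files.size(p) > 0) {
      return false;
    }
    Files.delete(p);
    return true;
  }

  /** Hypothetical helper: returns the files that survive the zero-length check, in input order. */
  static List<Path> dropZeroLength(List<Path> files) throws IOException {
    List<Path> kept = new ArrayList<>();
    for (Path file : files) {
      if (isZeroLengthThenDelete(file)) {
        continue; // an empty file has nothing to load, so skip it
      }
      kept.add(file); // in HBase this is where validation and bulk load would follow
    }
    return kept;
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("recovered");
    Path empty = Files.createFile(dir.resolve("empty.hfile"));
    Path full = Files.write(dir.resolve("full.hfile"), new byte[] {1, 2, 3});
    List<Path> kept = dropZeroLength(Arrays.asList(empty, full));
    // prints "1 kept, empty exists: false"
    System.out.println(kept.size() + " kept, empty exists: " + Files.exists(empty));
  }
}
```

Note that this only covers the zero-length case seen in my test. A non-empty but corrupt file would still pass this check and fail later at validation.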

> Find a way to handle the corrupt recovered hfiles
> -------------------------------------------------
>
>                 Key: HBASE-23633
>                 URL: https://issues.apache.org/jira/browse/HBASE-23633
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Guanghao Zhang
>            Priority: Major
>
> Copy the comment from PR review.
>  
> If the file is a corrupt HFile, an exception will be thrown here, which will 
> cause the region to fail to open.
> Maybe we can add a new parameter to control whether to skip the exception, 
> similar to recovered edits, which have the parameter 
> "hbase.hregion.edits.replay.skip.errors".
>  
> Regions that can't be opened because of detached References or corrupt hfiles 
> are a fact of life. We need to work on this issue. This will be a new variant 
> of the problem, i.e. bad recovered hfiles.
> Adding a config to ignore bad files and just open is a bit dangerous, 
> as per @infraio, as it could mean silent data loss.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
