[ 
https://issues.apache.org/jira/browse/HBASE-27579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17679486#comment-17679486
 ] 

Hudson commented on HBASE-27579:
--------------------------------

Results for branch branch-2.5
        [build #283 on 
builds.a.o|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.5/283/]:
 (/) *{color:green}+1 overall{color}*
----
details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.5/283/General_20Nightly_20Build_20Report/]


(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.5/283/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.5/283/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.5/283/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> CatalogJanitor can cause data loss due to errors during cleanMergeRegion
> ------------------------------------------------------------------------
>
>                 Key: HBASE-27579
>                 URL: https://issues.apache.org/jira/browse/HBASE-27579
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Bryan Beaudreault
>            Assignee: Bryan Beaudreault
>            Priority: Blocker
>             Fix For: 2.6.0, 3.0.0-alpha-4, 2.4.16, 2.5.3
>
>
> In CatalogJanitor.cleanMergeRegion, there is the following check:
> {code:java}
> HRegionFileSystem regionFs = null;
> try {
>   regionFs = 
> HRegionFileSystem.openRegionFromFileSystem(this.services.getConfiguration(), 
> fs,
>     tabledir, mergedRegion, true);
> } catch (IOException e) {
>   LOG.warn("Merged region does not exist: " + mergedRegion.getEncodedName());
> }
> if (regionFs == null || !regionFs.hasReferences(htd)) {
>  .. do the cleanup ..
> } {code}
>  
> I think the assumption here is that an IOException would only be thrown if a 
> region doesn't exist? We had a very poorly timed NameNode failover, during 
> CatalogJanitor run, after a merge. The NameNode failover caused the 
> openRegionFromFileSystem call to fail, which logged:
> {code:java}
> WARN org.apache.hadoop.hbase.master.janitor.CatalogJanitor: Merged region 
> does not exist: 32c71224852c5a4b94a3ba271b4fcb15 {code}
> This region did in fact exist and had not fully compacted, so there were 
> still some lingering reference files.
> The cleanup process moves the parent regions to the archive directory, but 
> the default TTL for those files in the archive directory is only 5 minutes. 
> After that they are cleaned up and the data is now unrecoverable.
> This resulted in FileNotFoundExceptions trying to read or open this region. 
> Our only course of action was to move the lingering reference files aside, so 
> the data is unrecoverable.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to