[ 
https://issues.apache.org/jira/browse/HBASE-27974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhuoyue Huang updated HBASE-27974:
----------------------------------
    Description: 
We encountered an issue with the loss of HFile references in the snapshot after 
enabling the CompactionServer feature in our cluster.

The relevant log fragment is approximately as follows:
{code:java}
File does not exist: 
/.../.tmp/data/default/......../b72cab4efb074defb1bd9acd9087891f
File does not exist: 
/....../archive/data/default/........../b72cab4efb074defb1bd9acd9087891f
File does not exist: 
/....../data/default/......./b72cab4efb074defb1bd9acd9087891f {code}
From the HDFS audit logs below, we observed that this HFile 
'b72cab4efb074defb1bd9acd9087891f' was renamed first by the CompactionServer, 
then by the RegionServer, and eventually deleted by the HMaster.
{code:java}
2022-07-13,00:50:01,727 INFO FSNamesystem.audit:  
cmd=rename   operator=CompactionServer
src=/...../data/default/...../b72cab4efb074defb1bd9acd9087891f 
dst=/....../archive/data/default/....../b72cab4efb074defb1bd9acd9087891f 

2022-07-13,00:51:23,802 INFO FSNamesystem.audit:  
cmd=rename 
operator=RegionServer
src=/....../archive/data/default/....../b72cab4efb074defb1bd9acd9087891f
dst=/....../archive/data/default/....../b72cab4efb074defb1bd9acd9087891f.1657644683801
 

2022-07-13,01:51:57,823 INFO FSNamesystem.audit:        
cmd=delete     
operator=HMaster
src=/....../archive/data/default/....../b72cab4efb074defb1bd9acd9087891f.1657644683801
 {code}
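
One detail in the audit entries above is the numeric suffix that the RegionServer appended when it renamed the archived file. The following is a minimal sketch for decoding it, assuming the suffix is a Unix timestamp in milliseconds and that the audit log timestamps are in UTC+8; neither assumption is stated in the logs, and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the audit log timestamps are in UTC+8 (the logs do not say so).
ASSUMED_LOG_TZ = timezone(timedelta(hours=8))
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def decode_suffix(file_name: str) -> datetime:
    """Interpret the trailing '.<digits>' of a file name as epoch milliseconds."""
    millis = int(file_name.rsplit(".", 1)[1])
    # Integer arithmetic keeps the millisecond part exact.
    return (EPOCH + timedelta(milliseconds=millis)).astimezone(ASSUMED_LOG_TZ)


renamed = "b72cab4efb074defb1bd9acd9087891f.1657644683801"
print(decode_suffix(renamed).isoformat(timespec="milliseconds"))
# prints 2022-07-13T00:51:23.801+08:00
```

Under those assumptions the suffix falls within a millisecond of the RegionServer rename logged at 2022-07-13,00:51:23,802, which is consistent with the suffix being generated at the moment of that rename.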
 

Based on HBASE-26722 and HBASE-22163, we understand the cause as follows. If 
region A on RS1 has not been closed and another server RS2 (in this case, the 
CompactionServer) opens the same region A, RS2 may archive the region's HFiles. 
When RS1 later closes the region, it archives the same HFiles a second time, 
which results in the HFile being deleted from the archive directory. As a 
result, the snapshot loses its reference to the HFile.
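
The sequence described above can be sketched as a toy model. This is not HBase code: the function names, the in-memory "filesystem", and the two rules (an already archived file is moved aside under a timestamped name; cleanup deletes archived files whose exact name is not referenced) are assumptions inferred from the audit log entries.

```python
# Toy model of the sequence in the audit log. All names are hypothetical and
# the archive and cleanup rules are assumptions, not taken from HBase source.

HFILE = "b72cab4efb074defb1bd9acd9087891f"


def archive(fs: set, name: str, now_ms: int) -> None:
    """Move data/<name> to archive/<name>, backing up any existing archive copy."""
    data_path, archive_path = f"data/{name}", f"archive/{name}"
    if archive_path in fs:
        # Assumed rule: an existing archived file gets a timestamp suffix.
        fs.remove(archive_path)
        fs.add(f"{archive_path}.{now_ms}")
    if data_path in fs:
        fs.remove(data_path)
        fs.add(archive_path)


def clean_archive(fs: set, referenced: set) -> None:
    """Delete archived files whose exact name is not referenced (assumed rule)."""
    for path in sorted(fs):  # sorted() copies, so removing from fs is safe
        if path.startswith("archive/") and path.split("/", 1)[1] not in referenced:
            fs.remove(path)


def resolve(fs: set, name: str):
    """Return the first existing location of the file, or None if it is lost."""
    for candidate in (f"data/{name}", f"archive/{name}"):
        if candidate in fs:
            return candidate
    return None


def simulate() -> list:
    fs = {f"data/{HFILE}"}
    seen = [resolve(fs, HFILE)]
    archive(fs, HFILE, now_ms=1)            # CompactionServer archives the file
    seen.append(resolve(fs, HFILE))
    archive(fs, HFILE, now_ms=2)            # RegionServer archives again on close
    seen.append(resolve(fs, HFILE))
    clean_archive(fs, referenced={HFILE})   # HMaster cleanup
    seen.append(sorted(fs))
    return seen
```

In this model the file resolves from the data directory, then from the archive directory, and then from nowhere once the second archive call has moved the archived copy aside. The cleanup step then removes the timestamped copy because no reference matches its new name.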

 

  was:
We encountered an issue with the loss of HFile references in the snapshot after 
enabling the CompactionServer feature in our cluster.

The relevant log fragment is approximately as follows:
{code:java}
File does not exist: 
/.../.tmp/data/default/......../b72cab4efb074defb1bd9acd9087891f
File does not exist: 
/....../archive/data/default/........../b72cab4efb074defb1bd9acd9087891f
File does not exist: 
/....../data/default/......./b72cab4efb074defb1bd9acd9087891f {code}
From the HDFS audit logs below, we observed that this HFile 
'b72cab4efb074defb1bd9acd9087891f' was renamed first by the CompactionServer, 
then by the RegionServer, and eventually deleted by the HMaster.
{code:java}
2022-07-13,00:50:01,727 INFO FSNamesystem.audit:  
cmd=rename   operator=CompactionServer
src=/...../data/default/...../b72cab4efb074defb1bd9acd9087891f 
dst=/....../archive/data/default/....../b72cab4efb074defb1bd9acd9087891f 

2022-07-13,00:51:23,802 INFO FSNamesystem.audit:  
cmd=rename 
operator=RegionServer
src=/....../archive/data/default/....../b72cab4efb074defb1bd9acd9087891f
dst=/....../archive/data/default/....../b72cab4efb074defb1bd9acd9087891f.1657644683801
 

2022-07-13,01:51:57,823 INFO FSNamesystem.audit:        
cmd=delete     
operator=HMaster
src=/....../archive/data/default/....../b72cab4efb074defb1bd9acd9087891f.1657644683801
 {code}
 


> CompactionServer cause the loss of HFile references in snapshot
> ---------------------------------------------------------------
>
>                 Key: HBASE-27974
>                 URL: https://issues.apache.org/jira/browse/HBASE-27974
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Zhuoyue Huang
>            Assignee: Zhuoyue Huang
>            Priority: Major
>
> We encountered an issue with the loss of HFile references in the snapshot 
> after enabling the CompactionServer feature in our cluster.
> The relevant log fragment is approximately as follows:
> {code:java}
> File does not exist: 
> /.../.tmp/data/default/......../b72cab4efb074defb1bd9acd9087891f
> File does not exist: 
> /....../archive/data/default/........../b72cab4efb074defb1bd9acd9087891f
> File does not exist: 
> /....../data/default/......./b72cab4efb074defb1bd9acd9087891f {code}
> From the HDFS audit logs below, we observed that this HFile 
> 'b72cab4efb074defb1bd9acd9087891f' was renamed first by the CompactionServer, 
> then by the RegionServer, and eventually deleted by the HMaster.
> {code:java}
> 2022-07-13,00:50:01,727 INFO FSNamesystem.audit:  
> cmd=rename   operator=CompactionServer
> src=/...../data/default/...../b72cab4efb074defb1bd9acd9087891f 
> dst=/....../archive/data/default/....../b72cab4efb074defb1bd9acd9087891f 
> 2022-07-13,00:51:23,802 INFO FSNamesystem.audit:  
> cmd=rename 
> operator=RegionServer
> src=/....../archive/data/default/....../b72cab4efb074defb1bd9acd9087891f
> dst=/....../archive/data/default/....../b72cab4efb074defb1bd9acd9087891f.1657644683801
>  
> 2022-07-13,01:51:57,823 INFO FSNamesystem.audit:        
> cmd=delete     
> operator=HMaster
> src=/....../archive/data/default/....../b72cab4efb074defb1bd9acd9087891f.1657644683801
>  {code}
>  
> Based on HBASE-26722 and HBASE-22163, we understand the cause as follows. If 
> region A on RS1 has not been closed and another server RS2 (in this case, 
> the CompactionServer) opens the same region A, RS2 may archive the region's 
> HFiles. When RS1 later closes the region, it archives the same HFiles a 
> second time, which results in the HFile being deleted from the archive 
> directory. As a result, the snapshot loses its reference to the HFile.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
