[ 
https://issues.apache.org/jira/browse/HBASE-22193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-22193.
------------------------------------
       Resolution: Fixed
    Fix Version/s: 2.3.0
                   2.2.0
                   3.0.0

> Add backoff when region failed open too many times
> --------------------------------------------------
>
>                 Key: HBASE-22193
>                 URL: https://issues.apache.org/jira/browse/HBASE-22193
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Guanghao Zhang
>            Assignee: Guanghao Zhang
>            Priority: Major
>             Fix For: 3.0.0, 2.2.0, 2.3.0
>
>
>  
> {code:java}
> public static final String ASSIGN_MAX_ATTEMPTS =
>     "hbase.assignment.maximum.attempts";
> private static final int DEFAULT_ASSIGN_MAX_ATTEMPTS = Integer.MAX_VALUE;
> {code}
> The default value is currently Integer.MAX_VALUE, so a failed region open is 
> retried effectively without limit. 
>  
> {code:java}
> 2019-04-09,10:50:44,921 INFO 
> org.apache.hadoop.hbase.master.assignment.TransitRegionStateProcedure: 
> Retry=170813 of max=2147483647; pid=2849, ppid=2846, 
> state=RUNNABLE:REGION_STATE_TRANSITION_CONFIRM_OPENED, locked=true; 
> TransitRegionStateProcedure table=IntegrationTestBigLinkedList, 
> region=634feb79a583480597e1843647d11228, REOPEN/MOVE; rit=OPENING, 
> location=c4-hadoop-tst-st26.bj,29100,1554262369262
> {code}
> The ITBLL run failed to open the region because of HBASE-22163 and retried 
> 170813 times to reopen it. After I fixed the problem and restarted the 
> master, I found that it took a long time to init the old procedure logs 
> because there were too many old logs...
> Code in WALProcedureStore.java:
>  
> {code:java}
> private long initOldLogs(FileStatus[] logFiles) throws IOException {
>   if (logFiles == null || logFiles.length == 0) {
>     return 0L;
>   }
>   long maxLogId = 0;
>   for (int i = 0; i < logFiles.length; ++i) {
>     final Path logPath = logFiles[i].getPath();
>     leaseRecovery.recoverFileLease(fs, logPath);
>     if (!isRunning()) {
>       throw new IOException("wal aborting");
>     }
>     maxLogId = Math.max(maxLogId, getLogIdFromName(logPath.getName()));
>     ProcedureWALFile log = initOldLog(logFiles[i], this.walArchiveDir);
>     if (log != null) {
>       this.logs.add(log);
>     }
>   }
>   initTrackerFromOldLogs();
>   return maxLogId;
> }
> {code}
>  
>  
>  
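
The issue title asks for a backoff between failed open attempts, but the message above does not show the fix itself. As an illustration only, here is a minimal sketch of a capped exponential backoff. The class name RetryBackoff, its method, and the base and cap values are hypothetical and are not taken from the HBase patch.

```java
/**
 * Hypothetical helper (not HBase code): capped exponential backoff.
 * The delay doubles with each failed attempt until it reaches a ceiling.
 */
public final class RetryBackoff {
  private final long baseMillis;
  private final long maxMillis;

  public RetryBackoff(long baseMillis, long maxMillis) {
    if (baseMillis <= 0 || maxMillis < baseMillis) {
      throw new IllegalArgumentException("need 0 < baseMillis <= maxMillis");
    }
    this.baseMillis = baseMillis;
    this.maxMillis = maxMillis;
  }

  /**
   * Returns the delay before retry number {@code attempt} (0-based):
   * baseMillis * 2^attempt, capped at maxMillis.
   */
  public long delayMillis(int attempt) {
    if (attempt < 0) {
      throw new IllegalArgumentException("attempt must be >= 0");
    }
    long delay = baseMillis;
    for (int i = 0; i < attempt; i++) {
      // Stop before doubling would exceed the cap; this also prevents overflow
      // and bounds the loop even for very large attempt counts.
      if (delay > maxMillis / 2) {
        return maxMillis;
      }
      delay *= 2;
    }
    return delay;
  }
}
```

With a 1 second base and a 60 second cap, attempts 0 through 5 wait 1, 2, 4, 8, 16 and 32 seconds, and every later attempt waits 60 seconds. Under those assumed values, a retry loop is limited to roughly one attempt per minute once the cap is reached.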



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
