anoopsjohn commented on a change in pull request #1764:
URL: https://github.com/apache/hbase/pull/1764#discussion_r429521262
##########
File path:
hbase-server/src/main/java/org/apache/hadoop/hbase/tool/BulkLoadHFilesTool.java
##########
@@ -879,13 +881,21 @@ public void bulkHFile(ColumnFamilyDescriptorBuilder builder, FileStatus hfileSta
}
int maxRetries =
getConf().getInt(HConstants.BULKLOAD_MAX_RETRIES_NUMBER, 10);
-    maxRetries = Math.max(maxRetries, startEndKeys.size() + 1);
+
+    /**
+     * On the first attempt, retry up to the configured maximum retry number.
+     * Whenever we find that the region count has changed, set maxRetries to
+     * the region count; but if the region count has not changed, keep
+     * maxRetries at the configured BULKLOAD_MAX_RETRIES_NUMBER to avoid
+     * meaningless retry attempts.
+     */
+    if (count != 0 && previousRegionNum != startEndKeys.size())
Review comment:
This is a nice find.
Well, this will help your particular case, but I don't think it is a generic
solution.
What if a cluster has an issue like yours (a config issue) and a split also
happens during the configured retries? That would make the max retries very
large (when there are as many regions as in your case).
What if we instead reset the 'count' within the while loop only when we notice
a split happened between the current run and the previous one? If splits keep
happening at regular intervals this would still cause a never-ending loop, so
we would still need an upper bound. But blindly setting the retry count to
#regions + 1 just after seeing a split is something that concerns me.
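
A rough standalone sketch of that "reset the count on split, but keep a hard upper bound" idea. This is not the actual BulkLoadHFilesTool code; the class, method, and field names (`RetryPolicySketch`, `allowAttempt`, `absoluteCap`) are hypothetical and only illustrate the retry policy being suggested:

```java
// Standalone simulation of the suggested alternative (NOT HBase code; all
// names are hypothetical): keep the configured retry bound, reset the
// per-round attempt counter when the observed region count changes (a
// split/merge happened), and enforce a hard absolute cap so splits at
// regular intervals cannot produce a never-ending loop.
public class RetryPolicySketch {
  private final int maxRetries;     // analogue of BULKLOAD_MAX_RETRIES_NUMBER
  private final int absoluteCap;    // hard upper bound across all resets
  private int count = 0;            // attempts since the last observed split
  private int totalAttempts = 0;    // attempts overall, never reset
  private int previousRegionNum = -1;

  public RetryPolicySketch(int maxRetries, int absoluteCap) {
    this.maxRetries = maxRetries;
    this.absoluteCap = absoluteCap;
  }

  /** Returns true if one more bulk-load attempt should be made. */
  public boolean allowAttempt(int currentRegionNum) {
    if (previousRegionNum != -1 && currentRegionNum != previousRegionNum) {
      count = 0; // split detected between runs: grant a fresh round of retries
    }
    previousRegionNum = currentRegionNum;
    if (totalAttempts >= absoluteCap || count >= maxRetries) {
      return false;
    }
    count++;
    totalAttempts++;
    return true;
  }

  public static void main(String[] args) {
    RetryPolicySketch p = new RetryPolicySketch(3, 10);
    // Three attempts with a stable region count, then we give up...
    System.out.println(p.allowAttempt(5)); // true
    System.out.println(p.allowAttempt(5)); // true
    System.out.println(p.allowAttempt(5)); // true
    System.out.println(p.allowAttempt(5)); // false
    // ...unless a split is observed, which resets the counter.
    System.out.println(p.allowAttempt(6)); // true
  }
}
```

The point of `absoluteCap` is exactly the upper bound mentioned above: even if each retry round sees a fresh split, the loop terminates after at most `absoluteCap` total attempts, instead of growing with the number of regions.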
cc @saintstack - It seems you reviewed the original jira, so you might
remember the context. Any pointers, sir?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]