anoopsjohn commented on a change in pull request #3648:
URL: https://github.com/apache/hbase/pull/3648#discussion_r705525612



##########
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/bucket/BucketCacheStats.java
##########
@@ -35,14 +35,22 @@
   private static final long NANO_TIME = TimeUnit.MILLISECONDS.toNanos(1);
   private long lastLogTime = EnvironmentEdgeManager.currentTime();
 
+  /* Tracing failed Bucket Cache allocations. */
+  private LongAdder allocationFailCount = new LongAdder();
+  private LongAdder allocationFailSize  = new LongAdder();

Review comment:
       do we need this really? I believe alloc fail count is enough in  stat

##########
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/bucket/BucketCache.java
##########
@@ -969,10 +977,25 @@ void doDrain(final List<RAMQueueEntry> entries) throws 
InterruptedException {
           }
           index++;
         } catch (BucketAllocatorException fle) {
-          LOG.warn("Failed allocation for " + (re == null ? "" : re.getKey()) 
+ "; " + fle);
+          long currTE = System.currentTimeMillis()/1000; // Current time since 
Epoch in seconds.
+          // Record the warning.
+          cacheStats.allocationFailed(re.getData().getSerializedLength());
+          if (prevRecTE == 0) {
+            // The very first exception.
+            LOG.warn("Failed allocation for " + (re == null ? "" : 
re.getKey()) + "; " + fle +
+              " HFile: " + (re == null ? "" : re.getKey().getHfileName()));
+          } else {
+            if (currTE - prevRecTE > ALLOCATION_FAIL_TRACE_TIME_GAP) {
+              LOG.warn("Most recent failed allocation in " + 
ALLOCATION_FAIL_TRACE_TIME_GAP +
+                "seconds: key: " + (re == null ? "" : re.getKey()) + "; " + 
fle +
+                " HFile: " + (re == null ? "" : re.getKey().getHfileName()));
+            }
+            // else do nothing, the exception has already been recorded above.
+          }
           // Presume can't add. Too big? Move index on. Entry will be cleared 
from ramCache below.
           bucketEntries[index] = null;
           index++;
+          prevRecTE = currTE;

Review comment:
       This will cause prevRecTE to set to curTS whenever 
BucketAllocatorException  happens.  We need to do that only when we log it

##########
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/bucket/BucketCache.java
##########
@@ -969,10 +977,25 @@ void doDrain(final List<RAMQueueEntry> entries) throws 
InterruptedException {
           }
           index++;
         } catch (BucketAllocatorException fle) {
-          LOG.warn("Failed allocation for " + (re == null ? "" : re.getKey()) 
+ "; " + fle);
+          long currTE = System.currentTimeMillis()/1000; // Current time since 
Epoch in seconds.
+          // Record the warning.
+          cacheStats.allocationFailed(re.getData().getSerializedLength());
+          if (prevRecTE == 0) {

Review comment:
       May be just a common log is enoug?
   if( prevRecTE == 0 || currTE - prevRecTE > ALLOCATION_FAIL_TRACE_TIME_GAP)

##########
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/bucket/BucketCache.java
##########
@@ -247,6 +247,10 @@
    * */
   private String algorithm;
 
+  /* Tracing failed Bucket Cache allocations. */
+  private long prevRecTE; // previous recorded time since Epoch in seconds
+  private static final int ALLOCATION_FAIL_TRACE_TIME_GAP = 60; // Default 60 
seconds.

Review comment:
       Call it ALLOCATION_FAIL_LOG_TIME_PERIOD? 

##########
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/bucket/BucketCache.java
##########
@@ -247,6 +247,10 @@
    * */
   private String algorithm;
 
+  /* Tracing failed Bucket Cache allocations. */
+  private long prevRecTE; // previous recorded time since Epoch in seconds
+  private static final int ALLOCATION_FAIL_TRACE_TIME_GAP = 60; // Default 60 
seconds.

Review comment:
       Can keep this as millisec so that for below time check, you can just use 
System.currentTimeMillis() alone




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to