ayushtkn commented on code in PR #5497:
URL: https://github.com/apache/hadoop/pull/5497#discussion_r1148399591


##########
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/store/driver/TestStateStoreDriverBase.java:
##########
@@ -574,6 +580,38 @@ private static Map<String, Class<?>> getFields(BaseRecord record) {
     return getters;
   }
 
+  public long getMountTableCacheLoadSamples(StateStoreDriver driver) throws IOException {
+    final MutableRate mountTableCache = getMountTableCache(driver);
+    return mountTableCache.lastStat().numSamples();
+  }
+
+  private static MutableRate getMountTableCache(StateStoreDriver driver) throws IOException {
+    StateStoreMetrics metrics = stateStore.getMetrics();
+    final Query<MountTable> query = new Query<>(MountTable.newInstance());
+    driver.getMultiple(MountTable.class, query);
+    final Map<String, MutableRate> cacheLoadMetrics = metrics.getCacheLoadMetrics();
+    final MutableRate mountTableCache = cacheLoadMetrics.get("CacheMountTableLoad");
+    assertNotNull("CacheMountTableLoad should be present in the state store metrics",
+        mountTableCache);
+    return mountTableCache;
+  }
+
+  public void testCacheLoadMetrics(StateStoreDriver driver, long numRefresh)
+      throws IOException, IllegalArgumentException {
+    final MutableRate mountTableCache = getMountTableCache(driver);
+    // CacheMountTableLoadNumOps
+    final long mountTableCacheLoadNumOps = getMountTableCacheLoadSamples(driver);
+    assertEquals("Num of samples collected should match", numRefresh, mountTableCacheLoadNumOps);
+    // CacheMountTableLoadAvgTime ms
+    final double mountTableCacheLoadAvgTimeMs = mountTableCache.lastStat().mean();
+    // 2 seconds is a high enough value for the test, hence we expect mount table cache
+    // with very few entries to be loaded by this time duration, hence not have this test result
+    // show flaky behavior.
+    assertTrue(
+        "Mean time duration for cache load is expected to be less than 2000 ms. Actual value: "
+            + mountTableCacheLoadAvgTimeMs, mountTableCacheLoadAvgTimeMs < 2000d);
+  }

Review Comment:
   Even if it goes up to, say, 10K, that has nothing to do with your code: your code still captured the value, and we would be happy with that.
   
   If it is 0, though, it means the value wasn't captured, and that is where we need to be worried.
   
   If there is a chance it can stay at 0:
   
   - If only in tests: maybe make the default value -1 and assert that it isn't -1, and we should be sorted. If it is -1, then things didn't work the way we wanted.
   - If it can happen in prod as well: then we might have to think about lowering the unit or using dynamic units, because having these values at 0 in a prod cluster would just be confusing. It would mostly create doubts about whether the metric is being captured at all.
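
The two options above can be sketched in isolation. This is a minimal, hypothetical illustration and not the PR's code or any Hadoop API: `CacheLoadSample`, `NOT_RECORDED`, `record`, `wasRecorded` and `getLastLoadMicros` are names invented here. It assumes the load duration is measured in nanoseconds and stored in microseconds (the "lower unit" idea), with -1 as the "never recorded" default.

```java
/** Hypothetical holder for the duration of the most recent cache load. */
final class CacheLoadSample {
  /** Sentinel default: no load has been recorded yet. */
  static final long NOT_RECORDED = -1L;

  private volatile long lastLoadMicros = NOT_RECORDED;

  /**
   * Records one cache load. The duration is given in nanoseconds and stored
   * in microseconds, so a sub-millisecond load is less likely to read as 0.
   * Negative input is clamped to 0 so it can never collide with the sentinel.
   */
  void record(long elapsedNanos) {
    lastLoadMicros = Math.max(0L, elapsedNanos) / 1_000L;
  }

  /** Returns the last recorded duration in microseconds, or NOT_RECORDED. */
  long getLastLoadMicros() {
    return lastLoadMicros;
  }

  /** True once at least one load has been recorded, even a 0 microsecond one. */
  boolean wasRecorded() {
    return lastLoadMicros != NOT_RECORDED;
  }
}
```

With this shape, a test can assert `wasRecorded()` instead of putting an upper bound on the mean, so a very fast load that rounds down to 0 still counts as captured, while a value still at -1 signals that nothing was measured.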



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

