ayushtkn commented on code in PR #5497:
URL: https://github.com/apache/hadoop/pull/5497#discussion_r1148399591
##########
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/store/driver/TestStateStoreDriverBase.java:
##########
@@ -574,6 +580,38 @@ private static Map<String, Class<?>> getFields(BaseRecord record) {
     return getters;
   }
+  public long getMountTableCacheLoadSamples(StateStoreDriver driver) throws IOException {
+    final MutableRate mountTableCache = getMountTableCache(driver);
+    return mountTableCache.lastStat().numSamples();
+  }
+
+  private static MutableRate getMountTableCache(StateStoreDriver driver) throws IOException {
+    StateStoreMetrics metrics = stateStore.getMetrics();
+    final Query<MountTable> query = new Query<>(MountTable.newInstance());
+    driver.getMultiple(MountTable.class, query);
+    final Map<String, MutableRate> cacheLoadMetrics = metrics.getCacheLoadMetrics();
+    final MutableRate mountTableCache = cacheLoadMetrics.get("CacheMountTableLoad");
+    assertNotNull("CacheMountTableLoad should be present in the state store metrics",
+        mountTableCache);
+    return mountTableCache;
+  }
+
+  public void testCacheLoadMetrics(StateStoreDriver driver, long numRefresh)
+      throws IOException, IllegalArgumentException {
+    final MutableRate mountTableCache = getMountTableCache(driver);
+    // CacheMountTableLoadNumOps
+    final long mountTableCacheLoadNumOps = getMountTableCacheLoadSamples(driver);
+    assertEquals("Num of samples collected should match", numRefresh, mountTableCacheLoadNumOps);
+    // CacheMountTableLoadAvgTime ms
+    final double mountTableCacheLoadAvgTimeMs = mountTableCache.lastStat().mean();
+    // 2 seconds is a high enough value for the test, hence we expect mount table cache
+    // with very few entries to be loaded by this time duration, hence not have this test result
+    // show flaky behavior.
+    assertTrue(
+        "Mean time duration for cache load is expected to be less than 2000 ms. Actual value: "
+            + mountTableCacheLoadAvgTimeMs, mountTableCacheLoadAvgTimeMs < 2000d);
+  }
Review Comment:
Even if it goes to, say, 10K, that has nothing to do with your code; your code
still captured the value and we would be happy with that.
If it is 0, though, it means the value wasn't captured, and that is where we
need to be worried.
If there is a chance it can stay at 0:
- If only in tests: maybe make the default value -1 and assert that it isn't
still -1 afterwards. Then we should be sorted: if it is -1, things didn't work
the way we wanted.
- If it can happen in prod as well: then we might have to think about lowering
the unit, or having dynamic units or so, because seeing these values as 0 in a
prod cluster would just be confusing. It would mostly create doubts about
whether the metric is getting captured at all.
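
To make the two options above concrete, here is a minimal sketch of what I
have in mind. `CacheLoadTimer` and `NOT_RECORDED` are hypothetical names for
illustration only, not existing Hadoop classes, and this is not a proposal for
the exact implementation. It starts from a -1 sentinel and records in
nanoseconds, so a very fast load is less likely to show up as 0:

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper, not part of Hadoop: remembers how long the last cache load took. */
final class CacheLoadTimer {
  /** Sentinel meaning "no load has been recorded yet". */
  static final long NOT_RECORDED = -1L;

  private final AtomicLong lastLoadNanos = new AtomicLong(NOT_RECORDED);

  /** Runs the load and stores its elapsed time in nanoseconds. */
  void record(Runnable load) {
    final long start = System.nanoTime();
    load.run();
    // Clamp at 0 so a recorded value can never collide with the sentinel.
    lastLoadNanos.set(Math.max(0L, System.nanoTime() - start));
  }

  /** Returns NOT_RECORDED until record() has completed at least once. */
  long lastLoadNanos() {
    return lastLoadNanos.get();
  }

  public static void main(String[] args) {
    CacheLoadTimer timer = new CacheLoadTimer();
    if (timer.lastLoadNanos() != NOT_RECORDED) {
      throw new AssertionError("expected the sentinel before any load");
    }
    timer.record(() -> { /* stand-in for the real cache load */ });
    if (timer.lastLoadNanos() == NOT_RECORDED) {
      throw new AssertionError("load duration was not captured");
    }
    System.out.println("last load took " + timer.lastLoadNanos() + " ns");
  }
}
```

With something like this, the test only needs to assert that the value is no
longer the sentinel, rather than asserting an upper bound on the duration.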
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]