lhotari commented on a change in pull request #9393:
URL: https://github.com/apache/pulsar/pull/9393#discussion_r572656526
##########
File path:
pulsar-broker/src/main/java/org/apache/pulsar/broker/loadbalance/impl/LinuxBrokerHostUsageImpl.java
##########
@@ -75,18 +75,17 @@ public LinuxBrokerHostUsageImpl(int
hostUsageCheckIntervalMin,
this.lastCollection = 0L;
this.usage = new SystemResourceUsage();
this.overrideBrokerNicSpeedGbps = overrideBrokerNicSpeedGbps;
- executorService.scheduleAtFixedRate(this::calculateBrokerHostUsage, 0,
- hostUsageCheckIntervalMin, TimeUnit.MINUTES);
boolean isCGroupsEnabled = false;
try {
isCGroupsEnabled =
Files.exists(Paths.get(CGROUPS_CPU_USAGE_PATH));
} catch (Exception e) {
log.warn("Failed to check cgroup CPU usage file: {}",
e.getMessage());
}
-
this.isCGroupsEnabled = isCGroupsEnabled;
- calculateBrokerHostUsage();
Review comment:
I don't think that the call to `calculateBrokerHostUsage` should be
removed. It should remain so that the values are available after the
constructor returns. Instead of removing this line, the scheduling should
happen after this call and the initial delay of the scheduling should be set to
`hostUsageCheckIntervalMin` to prevent the race.
The problem in the race should also be fixed, I have created a few commits
in
https://github.com/apache/pulsar/compare/master...lhotari:issue-6368-flaky-test
to fix the issue that the concurrent calls to `calculateBrokerHostUsage` caused.
Some suggested improvements to de-risk actual production flakiness:
* [Make ResourceUsage class
immutable](https://github.com/lhotari/pulsar/commit/7c5ea55fffd49a2b09b53ca0b1132eef0b1e67cf)
* [Replace usage of Files.readAllBytes with
Files.copy](https://github.com/lhotari/pulsar/commit/97a3fb609b6aea81327146d2296412c5599988b2)
* [Pre-calculate the usage percentage in ResourceUsage
class](https://github.com/lhotari/pulsar/commit/32aec9de38674f959d090305cb40b14f8c994df7)
* [Fix initial cpu usage calculation and use System.nanoTime() for
calculating elapsed
time](https://github.com/apache/pulsar/commit/468550f471e3e1b4e5071fa9e5fac2f69713a398)
One possibility is to cherry-pick these commits to this same PR.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]