[
https://issues.apache.org/jira/browse/HDFS-17026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17729514#comment-17729514
]
ASF GitHub Bot commented on HDFS-17026:
---------------------------------------
goiri commented on code in PR #5714:
URL: https://github.com/apache/hadoop/pull/5714#discussion_r1218675945
##########
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/NamenodeHeartbeatService.java:
##########
@@ -348,44 +362,73 @@ private void updateJMXParameters(
String address, NamenodeStatusReport report) {
try {
// TODO part of this should be moved to its own utility
- String query = "Hadoop:service=NameNode,name=FSNamesystem*";
- JSONArray aux = FederationUtil.getJmx(
- query, address, connectionFactory, scheme);
- if (aux != null) {
- for (int i = 0; i < aux.length(); i++) {
- JSONObject jsonObject = aux.getJSONObject(i);
- String name = jsonObject.getString("name");
- if (name.equals("Hadoop:service=NameNode,name=FSNamesystemState")) {
- report.setDatanodeInfo(
- jsonObject.getInt("NumLiveDataNodes"),
- jsonObject.getInt("NumDeadDataNodes"),
- jsonObject.getInt("NumStaleDataNodes"),
- jsonObject.getInt("NumDecommissioningDataNodes"),
- jsonObject.getInt("NumDecomLiveDataNodes"),
- jsonObject.getInt("NumDecomDeadDataNodes"),
- jsonObject.optInt("NumInMaintenanceLiveDataNodes"),
- jsonObject.optInt("NumInMaintenanceDeadDataNodes"),
- jsonObject.optInt("NumEnteringMaintenanceDataNodes"));
- } else if (name.equals(
- "Hadoop:service=NameNode,name=FSNamesystem")) {
- report.setNamesystemInfo(
- jsonObject.getLong("CapacityRemaining"),
- jsonObject.getLong("CapacityTotal"),
- jsonObject.getLong("FilesTotal"),
- jsonObject.getLong("BlocksTotal"),
- jsonObject.getLong("MissingBlocks"),
- jsonObject.getLong("PendingReplicationBlocks"),
- jsonObject.getLong("UnderReplicatedBlocks"),
- jsonObject.getLong("PendingDeletionBlocks"),
- jsonObject.optLong("ProvidedCapacityTotal"));
- }
- }
+ if (shouldUpdateJmx()) {
+ this.lastJmxUpdateAttempt = Time.monotonicNow();
+ String query = "Hadoop:service=NameNode,name=FSNamesystem*";
+ this.fsNamesystemMetrics = FederationUtil.getJmx(
+ query, address, connectionFactory, scheme);
}
+ populateFsNamesystemMetrics(this.fsNamesystemMetrics, report);
} catch (Exception e) {
LOG.error("Cannot get stat from {} using JMX", getNamenodeDesc(), e);
}
}
+ /**
+ * Evaluates whether the JMX report should be refreshed by
+ * calling the Namenode, based on the following conditions:
+ * 1. JMX Updates must be enabled.
+ * 2. The last attempt to update JMX occurred before the
+ * configured interval (if any).
+ */
+ private boolean shouldUpdateJmx() {
+ if (this.updateJmxIntervalMs < 0) {
+ return false;
+ }
+
+ return Time.monotonicNow() - this.lastJmxUpdateAttempt > this.updateJmxIntervalMs;
+ }
+
+ /**
+ * Populates FSNamesystem* metrics into report.
+ * @param aux FSNamesystem* metrics from namenode.
+ * @param report Namenode status report to update with JMX data.
+ * @throws JSONException When invalid JSONObject is found.
+ */
+ private void populateFsNamesystemMetrics(JSONArray aux, NamenodeStatusReport report)
+ throws JSONException {
+ if (aux != null) {
Review Comment:
Return early when aux == null to reduce nesting.
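A minimal sketch of the early-exit shape suggested above. It is not the code from the PR: to stay self-contained it uses a List of Maps as a stand-in for the JSONArray of JMX beans and a plain Map as a stand-in for NamenodeStatusReport, and the class name EarlyExitSketch is hypothetical. Only two of the attributes from the diff are copied, for brevity.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of the guard-clause refactoring; not Hadoop code. */
final class EarlyExitSketch {
  static final String STATE_BEAN =
      "Hadoop:service=NameNode,name=FSNamesystemState";
  static final String FS_BEAN =
      "Hadoop:service=NameNode,name=FSNamesystem";

  private EarlyExitSketch() {
  }

  /**
   * Copies selected metrics into the report.
   * @param aux stand-in for the JSONArray of FSNamesystem* beans, may be null.
   * @param report stand-in for NamenodeStatusReport.
   */
  static void populate(List<Map<String, Object>> aux,
      Map<String, Object> report) {
    if (aux == null) {
      // Early exit: nothing was fetched, so leave the report untouched.
      return;
    }
    // The loop now sits one level shallower than with "if (aux != null) {".
    for (Map<String, Object> bean : aux) {
      Object name = bean.get("name");
      if (STATE_BEAN.equals(name)) {
        report.put("NumLiveDataNodes", bean.get("NumLiveDataNodes"));
      } else if (FS_BEAN.equals(name)) {
        report.put("FilesTotal", bean.get("FilesTotal"));
      }
    }
  }
}
```

The behavior is the same as the nested version; the guard clause only removes one level of indentation from the loop body.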
> RBF: NamenodeHeartbeatService should update JMX report with configurable
> frequency
> ----------------------------------------------------------------------------------
>
> Key: HDFS-17026
> URL: https://issues.apache.org/jira/browse/HDFS-17026
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: rbf
> Reporter: Hector Sandoval Chaverri
> Assignee: Hector Sandoval Chaverri
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: HDFS-17026-branch-3.3.patch
>
>
> The NamenodeHeartbeatService currently calls each Namenode's JMX
> endpoint every time it wakes up (every 5 seconds by default).
> In a cluster with 40 routers, we have observed service degradation on some of
> the Namenodes, since the JMX request obtains Datanode status and blocks
> other RPC requests. However, JMX report data doesn't seem to be used for
> critical paths on the routers.
> We should make the NamenodeHeartbeatService configurable so that it can
> update the JMX reports less frequently than the Namenode states, or
> disable the reports completely.
> The class marks the JMX request as optional, even though there is no
> way to turn it off:
> {noformat}
> // Read the stats from JMX (optional)
> updateJMXParameters(webAddress, report);{noformat}
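The interval check from shouldUpdateJmx() in the diff can be illustrated in isolation. The sketch below is an assumption-laden stand-in, not the PR's code: the class JmxUpdateGate and its injectable clock are hypothetical (the real service reads Time.monotonicNow() directly), and the initial value of 0 for the last attempt is assumed because the diff does not show the field's initialization. It keeps the two rules visible in the diff: a negative interval disables updates, and the comparison is strictly greater than.

```java
import java.util.function.LongSupplier;

/** Hypothetical throttle mirroring the interval check in the diff. */
final class JmxUpdateGate {
  private final long intervalMs;
  private final LongSupplier clock;
  // Assumed initial value; the diff does not show how the field starts out.
  private long lastAttemptMs = 0L;

  JmxUpdateGate(long intervalMs, LongSupplier clock) {
    this.intervalMs = intervalMs;
    this.clock = clock;
  }

  /** Same conditions as shouldUpdateJmx(): enabled, and interval elapsed. */
  boolean shouldUpdate() {
    if (intervalMs < 0) {
      return false; // A negative interval disables JMX updates entirely.
    }
    return clock.getAsLong() - lastAttemptMs > intervalMs;
  }

  /** Records the attempt time, as updateJMXParameters() does before fetching. */
  void markAttempt() {
    lastAttemptMs = clock.getAsLong();
  }
}
```

A caller would check shouldUpdate(), call markAttempt(), and then fetch fresh metrics; otherwise it would reuse the cached metrics from the previous fetch, which is what the diff does with this.fsNamesystemMetrics.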