mbalassi commented on code in PR #558:
URL:
https://github.com/apache/flink-kubernetes-operator/pull/558#discussion_r1155274830
##########
flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/service/AbstractFlinkService.java:
##########
@@ -627,14 +637,42 @@ public Map<String, String> getClusterInfo(Configuration
conf) throws Exception {
.toSeconds(),
TimeUnit.SECONDS);
- runtimeVersion.put(
+ clusterInfo.put(
DashboardConfiguration.FIELD_NAME_FLINK_VERSION,
dashboardConfiguration.getFlinkVersion());
- runtimeVersion.put(
+ clusterInfo.put(
DashboardConfiguration.FIELD_NAME_FLINK_REVISION,
dashboardConfiguration.getFlinkRevision());
}
- return runtimeVersion;
+
+ // JobManager resource usage can be deduced from the CR
+ var jmParameters =
+ new KubernetesJobManagerParameters(
+ conf, new
KubernetesClusterClientFactory().getClusterSpecification(conf));
+ var jmTotalCpu =
+ jmParameters.getJobManagerCPU()
+ * jmParameters.getJobManagerCPULimitFactor()
+ * jmParameters.getReplicas();
+ var jmTotalMemory =
+ Math.round(
+ jmParameters.getJobManagerMemoryMB()
+ * Math.pow(1024, 2)
+ * jmParameters.getJobManagerMemoryLimitFactor()
+ * jmParameters.getReplicas());
+
+ // TaskManager resource usage is best gathered from the REST API to
get current replicas
Review Comment:
Thanks @mateczagany, this approach looks good. If you have the bandwidth
would you mind pushing your suggestions to this PR branch so that the commit
can be attributed to you? 😏 I have invited you as a collaborator to my fork,
you might need to accept that.
I would ask the following if you have the time:
1. Get resource configuration from the config as you suggested uniformly for
JMs and TMs
2. Get JM replicas from config, TM replicas from the REST API (we are trying
to be careful with the TM replicas because we foresee that we might be changing
things dynamically there via the autoscaler soon)
3. Add a test to `FlinkDeploymentMetricsTest` that verifies that given that
the `status.clusterInfo` is properly filled out we fill out the metrics
properly.
Currently we do not have meaningful test for creating the clusterInfo and
since we are relying on the application's REST API I do not see an easy way of
testing it properly, so I would accept this change without that (but it might
merit a separate JIRA).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]