swatiksi273-ksolves commented on PR #1139: URL: https://github.com/apache/flink-kubernetes-operator/pull/1139#issuecomment-4743828850
@Dennis-Mircea, could you please review this PR? It addresses the root cause identified in FLINK-39925. The reporter confirmed that killing the JM and letting a new one spin up resolves the issue: after the restart, the REST API returns read-records-complete: true with correct values. This confirms that the JM loses its connection to the TMs over time, causing the REST API to return complete: false with zeros. This PR fixes it by checking the complete flag in IOMetrics.from() and skipping scaling decisions when metrics are incomplete, which prevents an incorrect scale-down.

-- 
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
