xBis7 commented on PR #4362: URL: https://github.com/apache/ozone/pull/4362#issuecomment-1491920080
@adoroszlai The latest changes fix the timeout issue. I've launched multiple workflows and it no longer occurs.

However, this revealed another underlying issue that might not be related to the test at all: during a leader change, the metrics don't get updated. `OMHAMetrics` relies on `OzoneManager.updatePeerList()` being called; at the end of that method we unregister the metrics and then register them again. My understanding was that every time an OM is started, stopped, or restarted, there is a conf change and `OMStateMachine` calls that method. That doesn't seem to be the case.

Here are the latest four workflow runs. None of them has a timeout failure; all failures are due to the metrics not getting updated.

https://github.com/xBis7/ozone/actions/runs/4566947556
https://github.com/xBis7/ozone/actions/runs/4567066352
https://github.com/xBis7/ozone/actions/runs/4574892711
https://github.com/xBis7/ozone/actions/runs/4574961334
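To make the suspected failure mode concrete, here is a minimal, self-contained sketch. It is not Ozone code: `Registry`, `Om`, `onLeaderChanged`, and `leaderSeenBy` are hypothetical stand-ins, and the sketch assumes the metrics source captures the leader ID at registration time. Under that assumption, a leader change that does not go through `updatePeerList()` leaves the registered value stale.

```java
import java.util.HashMap;
import java.util.Map;

class OmHaMetricsSketch {

    /** Hypothetical stand-in for a metrics system; NOT the Hadoop metrics API. */
    static final class Registry {
        private final Map<String, String> sources = new HashMap<>();

        void register(String name, String leaderId) {
            sources.put(name, leaderId);
        }

        void unregister(String name) {
            sources.remove(name);
        }

        String leaderSeenBy(String name) {
            return sources.get(name);
        }
    }

    /** Hypothetical stand-in for an OM; it only models the metrics refresh path. */
    static final class Om {
        static final String SOURCE = "OMHAMetrics";

        private final Registry registry;
        private String leaderId;

        Om(Registry registry, String initialLeaderId) {
            this.registry = registry;
            this.leaderId = initialLeaderId;
            // Assumption: the leader ID is captured when the source is registered.
            registry.register(SOURCE, leaderId);
        }

        /** Mirrors the described behaviour: unregister, then register again. */
        void updatePeerList() {
            registry.unregister(SOURCE);
            registry.register(SOURCE, leaderId);
        }

        /** Assumed path: a leader change that does not trigger updatePeerList(). */
        void onLeaderChanged(String newLeaderId) {
            this.leaderId = newLeaderId;
        }
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        Om om = new Om(registry, "om1");

        om.onLeaderChanged("om2");
        System.out.println("after leader change:  " + registry.leaderSeenBy(Om.SOURCE));

        om.updatePeerList();
        System.out.println("after updatePeerList: " + registry.leaderSeenBy(Om.SOURCE));
    }
}
```

If the real code behaves like this sketch, the metrics would need to be refreshed from whatever path handles leader changes, rather than relying on a conf change to trigger `updatePeerList()`.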
