adoroszlai opened a new pull request #697: [HDDS-1332] Attempt to fix flaky test testStartStopDatanodeStateMachine
URL: https://github.com/apache/hadoop/pull/697

## What changes were proposed in this pull request?

`testStartStopDatanodeStateMachine` is flaky, causing [occasional pre-commit build failures](https://builds.apache.org/job/hadoop-multibranch/job/PR-691/1/artifact/out/patch-unit-hadoop-hdds_container-service.txt). [HDDS-1332](https://issues.apache.org/jira/browse/HDDS-1332) added some logging to find out more about the cause.

I think the problem is not test-specific, and is caused by the following: `SCMConnectionManager#scmMachines` is a plain `HashMap`, guarded by a `ReadWriteLock` in most places where it's used, except `getValues()`. The method also returns the values collection without any write protection (though currently none of the callers modify it).

This is an attempt to fix the cause by acquiring the read lock and creating a read-only copy.

https://issues.apache.org/jira/browse/HDDS-1332

## How was this patch tested?

Ran affected unit tests several times, plus tried `ozone` docker-compose cluster.
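
For illustration, the "read lock plus read-only copy" approach described under "What changes were proposed" might look roughly like the sketch below. This is not the actual patch: `EndpointRegistry`, `mapLock`, `machines`, and `put` are hypothetical names standing in for `SCMConnectionManager` and its fields, and only `getValues()` mirrors a method named in the description.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for SCMConnectionManager; all names are illustrative.
class EndpointRegistry<K, V> {
  private final ReadWriteLock mapLock = new ReentrantReadWriteLock();
  // Plain HashMap: every access must hold mapLock.
  private final Map<K, V> machines = new HashMap<>();

  void put(K key, V value) {
    mapLock.writeLock().lock();
    try {
      machines.put(key, value);
    } finally {
      mapLock.writeLock().unlock();
    }
  }

  // Copies the values while holding the read lock, then wraps the copy so
  // callers can neither observe later writes nor modify the result.
  Collection<V> getValues() {
    mapLock.readLock().lock();
    try {
      return Collections.unmodifiableList(new ArrayList<>(machines.values()));
    } finally {
      mapLock.readLock().unlock();
    }
  }
}
```

Copying matters because `HashMap.values()` returns a live view backed by the map: a caller iterating over that view while another thread writes to the map can see inconsistent state or get a `ConcurrentModificationException`, even if the view was obtained under the lock.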