adoroszlai opened a new pull request #697: [HDDS-1332] Attempt to fix flaky 
test testStartStopDatanodeStateMachine
URL: https://github.com/apache/hadoop/pull/697
 
 
   ## What changes were proposed in this pull request?
   
   `testStartStopDatanodeStateMachine` is flaky, causing
[occasional pre-commit build failures](https://builds.apache.org/job/hadoop-multibranch/job/PR-691/1/artifact/out/patch-unit-hadoop-hdds_container-service.txt).
[HDDS-1332](https://issues.apache.org/jira/browse/HDDS-1332) added some
logging to find out more about the cause.
   
   I think the problem is not test-specific. The likely cause is that
`SCMConnectionManager#scmMachines` is a plain `HashMap`, guarded by a
`ReadWriteLock` in most places where it's used, but not in `getValues()`. That
method also returns the map's values collection directly, so callers could
modify the map through it (though currently none of them do).
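
   As a rough illustration of why an unguarded `getValues()` can be a problem,
the sketch below iterates the live values view of a `HashMap` while the map is
updated. The class, field, and key names are made up for the example and are
not the actual `SCMConnectionManager` code; the update is done from the same
thread only to make the effect deterministic, and the real test may fail in a
different way.

   ```java
   import java.util.Collection;
   import java.util.ConcurrentModificationException;
   import java.util.HashMap;
   import java.util.Map;

   // Hypothetical stand-in for the unguarded accessor; not the real Ozone class.
   class UnguardedValuesSketch {
     private final Map<String, String> machines = new HashMap<>();

     void add(String address, String endpoint) {
       machines.put(address, endpoint);
     }

     // Returns the live view: no lock is taken, and the view tracks the map.
     Collection<String> getValues() {
       return machines.values();
     }

     // Returns true if iterating the live view fails once the map changes.
     static boolean iterationBreaksOnUpdate() {
       UnguardedValuesSketch sketch = new UnguardedValuesSketch();
       sketch.add("scm-a", "endpoint-a");
       sketch.add("scm-b", "endpoint-b");
       try {
         for (String endpoint : sketch.getValues()) {
           // Stands in for another thread adding an entry mid-iteration.
           sketch.add("scm-c", "endpoint-c");
         }
         return false;
       } catch (ConcurrentModificationException e) {
         return true;
       }
     }
   }
   ```

   With real concurrency the outcome is timing-dependent, which would be
consistent with a test that only fails occasionally.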
   
   This is an attempt to fix the cause by acquiring the read lock and creating 
a read-only copy.
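
   A minimal sketch of that approach, assuming a map guarded by a
`ReentrantReadWriteLock`. The class, field, and method bodies here are
illustrative and simplified (string values instead of endpoint objects); they
are not copied from the patch.

   ```java
   import java.util.ArrayList;
   import java.util.Collection;
   import java.util.Collections;
   import java.util.HashMap;
   import java.util.Map;
   import java.util.concurrent.locks.ReadWriteLock;
   import java.util.concurrent.locks.ReentrantReadWriteLock;

   // Hypothetical stand-in for the guarded accessor; not the real Ozone class.
   class GuardedValuesSketch {
     private final ReadWriteLock mapLock = new ReentrantReadWriteLock();
     private final Map<String, String> machines = new HashMap<>();

     void add(String address, String endpoint) {
       mapLock.writeLock().lock();
       try {
         machines.put(address, endpoint);
       } finally {
         mapLock.writeLock().unlock();
       }
     }

     // Copies the values under the read lock and wraps the copy as read-only,
     // so callers get a stable snapshot they cannot use to modify the map.
     Collection<String> getValues() {
       mapLock.readLock().lock();
       try {
         return Collections.unmodifiableList(new ArrayList<>(machines.values()));
       } finally {
         mapLock.readLock().unlock();
       }
     }
   }
   ```

   The trade-off is one extra copy per call, which should be cheap when the
map is small; in exchange, callers can iterate the result without holding the
lock.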
   
   https://issues.apache.org/jira/browse/HDDS-1332
   
   ## How was this patch tested?
   
   Ran affected unit tests several times, plus tried `ozone` docker-compose 
cluster.

