adoroszlai opened a new pull request #650: HDDS-3089. TestSCMNodeManager 
intermittent crash
URL: https://github.com/apache/hadoop-ozone/pull/650
 
 
   ## What changes were proposed in this pull request?
   
   Based on the logs, the test crashes during teardown when the 
`EventQueue-DeadNodeForDeadNodeHandler` thread tries to delete a pipeline after 
the RocksDB for pipelines has already been closed during SCM stop.
   
   1. Clear the `db` reference in `RocksDBStore` after closing it, to avoid calls 
on a closed RocksDB.  A later call might then result in an NPE, which Ozone would 
normally handle, while a call on a closed DB might result in a segmentation fault 
in native code.
   2. Protect the `pipelineStore.close()` call with the write lock in 
`SCMPipelineManager` to avoid a concurrent `pipelineStore.delete()` (or other 
similar operations).
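
   The two changes above can be sketched roughly as follows. This is a minimal 
illustration, not the actual patch: `SketchDb`, `SketchStore` and 
`SketchPipelineManager` are hypothetical stand-ins for the native RocksDB handle, 
`RocksDBStore` and `SCMPipelineManager`, and the method names are assumptions.

```java
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for the native DB handle (not the RocksDB API). */
final class SketchDb {
  private boolean closed;

  void delete(String key) {
    if (closed) {
      // Real native code might crash the JVM here instead of throwing.
      throw new IllegalStateException("call on closed DB");
    }
  }

  void close() {
    closed = true;
  }
}

/** Hypothetical stand-in for RocksDBStore: clears the reference on close. */
final class SketchStore {
  private SketchDb db = new SketchDb();

  void delete(String key) {
    // Throws NullPointerException once db has been cleared by close().
    db.delete(key);
  }

  void close() {
    if (db != null) {
      db.close();
      db = null; // no later call can reach the closed handle
    }
  }
}

/** Hypothetical stand-in for SCMPipelineManager: close() takes the write lock. */
final class SketchPipelineManager {
  private final ReadWriteLock lock = new ReentrantReadWriteLock();
  private final SketchStore pipelineStore = new SketchStore();

  void removePipeline(String id) {
    lock.writeLock().lock();
    try {
      pipelineStore.delete(id);
    } finally {
      lock.writeLock().unlock();
    }
  }

  void close() {
    // Same lock as removePipeline, so close and delete cannot overlap.
    lock.writeLock().lock();
    try {
      pipelineStore.close();
    } finally {
      lock.writeLock().unlock();
    }
  }
}
```

   In this sketch, a `removePipeline` call that arrives after `close()` fails with 
a `NullPointerException` in Java code rather than reaching the closed handle, and 
one that is already in progress finishes before `close()` can proceed.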
   
   https://issues.apache.org/jira/browse/HDDS-3089
   
   ## How was this patch tested?
   
   Executed TestSCMNodeManager 2x20 times successfully:
   https://github.com/adoroszlai/hadoop-ozone/runs/491868681
   https://github.com/adoroszlai/hadoop-ozone/runs/491888786
   
   Regular CI with all checks (only it-freon failed):
   https://github.com/adoroszlai/hadoop-ozone/runs/491950482

