mcfatealan commented on issue #1077: ZOOKEEPER-3531: Synchronization on ACLCache cause cluster to hang whe…
URL: https://github.com/apache/zookeeper/pull/1077#issuecomment-534328771

@maoling I came across two other related ones today:

Lineage-driven Fault Injection [SIGMOD '15]
On Fault Resilience of OpenStack [SoCC '13]

Actually, right now we are working on a systematic fault injector and runtime checker framework that uses program analysis techniques to catch these types of failures. We have a short paper, [HotOS '19](https://www.cs.jhu.edu/~chlou/paper/watchdog-hotos19-preprint.pdf), describing our approach and some preliminary results, and we plan to release the tool to the open-source community when it is mature. We could let you know when it's available if you are interested. (Sorry for a little self-promotion here 😸)

Any feedback will be highly appreciated :)

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services
