Is there a JIRA open for the partial crash bug described in "Redundancy Does Not Imply Fault Tolerance: Analysis of Distributed Storage Reactions to Single Errors and Corruptions" by Aishwarya Ganesan, Ramnatthan Alagappan, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau (University of Wisconsin-Madison), 15th USENIX Conference on File and Storage Technologies (FAST ’17)?
From https://www.usenix.org/system/files/conference/fast17/fast17-ganesan.pdf:

"Unfortunately, ZooKeeper does not recover from write errors to the transaction head and log tail. On write errors during log initialization, the error handling code tries to gracefully shutdown the node but kills only the transaction processing threads; the quorum thread remains alive (partial crash). Consequently, other nodes believe that the leader is healthy and do not elect a new leader. However, since the leader has partially crashed, it cannot propose any transactions, leading to an indefinite write unavailability."

--
Best regards,
Andrew Purtell
