This seems interesting and has a low barrier to entry:

https://github.com/arosien/failure

On Tue, Nov 15, 2016 at 4:01 PM, Edward Capriolo <[email protected]>
wrote:

> I was doing some load testing and found that the current gating factor
> for the maximum number of instances running in the same JVM is the JMX-based
> notification system the failure detector uses.
>
> Currently a cluster of N instances requires N * (N-1) JMX notification
> threads. I started trying to remove this limit without building the
> accrual failure detector (22), but there were some nuanced bugs and I backed
> off because it did not seem worth the change.
>
> If anyone has any literature to contribute about building a consensus-based
> failure detector, please share it. Once we cut this release, that is likely
> where I will spend my attention.
>
> Thanks,
> Edward
>
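
To put the N * (N-1) figure quoted above in perspective, here is a quick sketch of how fast the thread count grows. It assumes one notification thread per ordered pair of instances, as the quoted message describes; the function name is hypothetical and not taken from the project.

```python
def notification_threads(n: int) -> int:
    """Threads needed if each of n instances keeps one
    notification thread for every other instance (assumption)."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return n * (n - 1)


if __name__ == "__main__":
    # Growth is quadratic: doubling the cluster roughly quadruples the threads.
    for n in (2, 10, 50, 100):
        print(f"{n:>4} instances: {notification_threads(n):>5} threads")
```

Under that assumption, 100 instances in one JVM would already need 9900 threads, which is consistent with this being the gating factor.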
