[
https://issues.apache.org/jira/browse/GOSSIP-74?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15931871#comment-15931871
]
ASF GitHub Bot commented on GOSSIP-74:
--------------------------------------
GitHub user makrusak opened a pull request:
https://github.com/apache/incubator-gossip/pull/43
GOSSIP-74 Critical bugs in FailureDetector
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/makrusak/incubator-gossip GOSSIP-74
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-gossip/pull/43.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #43
----
commit 9bae8bb9461fd2b88e368e92033c0f8787a6f636
Author: Maxim Rusak <[email protected]>
Date: 2017-03-19T18:20:32Z
GOSSIP-74 Critical bugs in FailureDetector
----
> Critical bugs in FailureDetector
> --------------------------------
>
> Key: GOSSIP-74
> URL: https://issues.apache.org/jira/browse/GOSSIP-74
> Project: Gossip
> Issue Type: Bug
> Reporter: Maxim Rusak
> Assignee: Maxim Rusak
>
> FailureDetector currently has (at least) 2 bugs (compared with the original paper):
> 1. latestHeartbeatMs is not updated on each heartbeat. So
> descriptiveStatistics consists not of the deltas between heartbeats but of
> the time elapsed since the first heartbeat.
> 2. When we create normalDistribution we pass the variance, not the standard
> deviation.
> Together they make FailureDetector practically insensitive, because the
> deviation becomes extremely high.
> Example: http://pastebin.com/xaeF52PP
> Here we send 100 heartbeats, one per second (for example), then check the
> state after 2000 seconds; compared against the threshold, the node is still
> reported as alive.
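
The two bugs described above can be reproduced with a small standalone sketch. It is not the project's FailureDetector code: the class and method names are hypothetical, the +/-10 ms jitter is an assumption added so that the corrected spread is non-zero, and a plain z-score (distance from the mean in units of the spread) stands in for the normal-distribution lookup. Only the idea is taken from the issue: samples must be deltas between consecutive heartbeats, and the spread handed to the distribution must be the standard deviation.

```java
import java.util.ArrayList;
import java.util.List;

public class HeartbeatStatsSketch {

    // Buggy sampling: the reference timestamp is set once and never advanced,
    // so every sample is the time elapsed since the first heartbeat.
    static List<Double> samplesSinceFirst(long[] arrivalsMs) {
        List<Double> samples = new ArrayList<>();
        long latest = -1;
        for (long now : arrivalsMs) {
            if (latest != -1) {
                samples.add((double) (now - latest));
            } else {
                latest = now;
            }
        }
        return samples;
    }

    // Fixed sampling: the reference timestamp advances on every heartbeat,
    // so every sample is the gap between two consecutive heartbeats.
    static List<Double> samplesBetweenBeats(long[] arrivalsMs) {
        List<Double> samples = new ArrayList<>();
        long latest = -1;
        for (long now : arrivalsMs) {
            if (latest != -1) {
                samples.add((double) (now - latest));
            }
            latest = now;
        }
        return samples;
    }

    static double mean(List<Double> xs) {
        double sum = 0;
        for (double x : xs) sum += x;
        return sum / xs.size();
    }

    // Unbiased sample variance (divides by n - 1).
    static double variance(List<Double> xs) {
        double m = mean(xs);
        double sq = 0;
        for (double x : xs) sq += (x - m) * (x - m);
        return sq / (xs.size() - 1);
    }

    // Distance of elapsedMs from the mean, measured in units of spread.
    static double zScore(double elapsedMs, double mean, double spread) {
        return (elapsedMs - mean) / spread;
    }

    // 100 arrivals roughly one second apart, with assumed +/-10 ms jitter.
    static long[] exampleArrivals() {
        long[] arrivals = new long[100];
        long t = 0;
        for (int i = 0; i < arrivals.length; i++) {
            arrivals[i] = t;
            t += (i % 2 == 0) ? 990 : 1010;
        }
        return arrivals;
    }

    public static void main(String[] args) {
        long[] arrivals = exampleArrivals();
        double elapsedMs = 2_000_000; // 2000 seconds without a heartbeat

        List<Double> buggy = samplesSinceFirst(arrivals);
        List<Double> fixed = samplesBetweenBeats(arrivals);

        // Both bugs together: growing samples, variance used as the spread.
        System.out.println("buggy z = "
                + zScore(elapsedMs, mean(buggy), variance(buggy)));
        // Both fixes together: true deltas, standard deviation as the spread.
        System.out.println("fixed z = "
                + zScore(elapsedMs, mean(fixed), Math.sqrt(variance(fixed))));
    }
}
```

Under these assumptions the buggy score stays far below 1, so 2000 seconds of silence looks unremarkable, while the corrected score is in the hundreds of thousands, which any reasonable threshold would flag as a failure.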