Edward Capriolo created GOSSIP-49:
-------------------------------------
Summary: Refactor Failure detector Lambda into named class
Key: GOSSIP-49
URL: https://issues.apache.org/jira/browse/GOSSIP-49
Project: Gossip
Issue Type: Improvement
Reporter: Edward Capriolo
When receiving a message, the PassiveGossipThread updates heartbeats. Currently
a lambda in the GossipManager periodically moves through the member list, marks
hosts as down, and fires the event notification listener:
{noformat}
scheduledServiced.scheduleAtFixedRate(() -> {
  try {
    for (Entry<LocalGossipMember, GossipState> entry : members.entrySet()) {
      Double result = null;
      try {
        result = entry.getKey().detect(clock.nanoTime());
        //System.out.println(entry.getKey() +" "+ result);
        if (result != null) {
          if (result > settings.getConvictThreshold()
              && entry.getValue() == GossipState.UP) {
            members.put(entry.getKey(), GossipState.DOWN);
            listener.gossipEvent(entry.getKey(), GossipState.DOWN);
          }
          if (result <= settings.getConvictThreshold()
              && entry.getValue() == GossipState.DOWN) {
            members.put(entry.getKey(), GossipState.UP);
            listener.gossipEvent(entry.getKey(), GossipState.UP);
          }
        }
      } catch (IllegalArgumentException ex) {
        //0.0 returns throws exception computing the mean.
        long now = clock.nanoTime();
        long nowInMillis =
            TimeUnit.MILLISECONDS.convert(now, TimeUnit.NANOSECONDS);
        if (nowInMillis - settings.getCleanupInterval()
            > entry.getKey().getHeartbeat()
            && entry.getValue() == GossipState.UP) {
          LOGGER.warn("Marking down");
          members.put(entry.getKey(), GossipState.DOWN);
          listener.gossipEvent(entry.getKey(), GossipState.DOWN);
        }
      } //end catch
    } // end for
  } catch (RuntimeException ex) {
    LOGGER.warn("scheduled state had exception", ex);
  }
{noformat}
This should be moved to a named class that is injected with the data members it
needs, which would make the logic easier to unit test with mocks. We still need
to run it periodically for the rare case that no messages are coming to us, but
we could also run it right after receiving a message rather than waiting for the
scheduled executor to trigger it. In many cases that would raise the alert
sooner.
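
As a rough illustration of the shape this could take, here is a minimal sketch.
It is not the project's code: State, Detectable, and FailureDetectorTask are
hypothetical stand-ins for GossipState, LocalGossipMember, and the new class,
and the clock, threshold, and listener are passed in as plain functional
interfaces. The IllegalArgumentException fallback from the lambda is left out
to keep it short.

```java
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.LongSupplier;

// Hypothetical stand-in for GossipState.
enum State { UP, DOWN }

// Hypothetical stand-in for LocalGossipMember.detect(long).
// Returns null when no score is available yet.
interface Detectable {
  Double detect(long nowNanos);
}

// Named replacement for the lambda; every collaborator is injected,
// so a test can supply a fake clock, members map, and listener.
final class FailureDetectorTask implements Runnable {
  private final Map<Detectable, State> members;
  private final LongSupplier nanoClock;
  private final double convictThreshold;
  private final BiConsumer<Detectable, State> listener;

  FailureDetectorTask(Map<Detectable, State> members,
                      LongSupplier nanoClock,
                      double convictThreshold,
                      BiConsumer<Detectable, State> listener) {
    this.members = members;
    this.nanoClock = nanoClock;
    this.convictThreshold = convictThreshold;
    this.listener = listener;
  }

  @Override
  public void run() {
    long now = nanoClock.getAsLong();
    for (Map.Entry<Detectable, State> entry : members.entrySet()) {
      Double result = entry.getKey().detect(now);
      if (result == null) {
        continue; // nothing to decide for this member yet
      }
      if (result > convictThreshold && entry.getValue() == State.UP) {
        // Replacing the value of an existing key, as the lambda does.
        members.put(entry.getKey(), State.DOWN);
        listener.accept(entry.getKey(), State.DOWN);
      }
      if (result <= convictThreshold && entry.getValue() == State.DOWN) {
        members.put(entry.getKey(), State.UP);
        listener.accept(entry.getKey(), State.UP);
      }
    }
  }
}
```

Because the task is a plain Runnable, the same instance could be handed to the
scheduled executor and also invoked directly after a message is received. If
both paths are used, the members map and listener would need to tolerate
concurrent calls.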