+1

On Fri, Dec 18, 2015 at 3:46 PM, Jianxia Chen <[email protected]> wrote:

> Hi all,
>
> I am trying to implement the stats for the Geode membership service health
> monitor, which monitors the health of the members of the distributed system
> by heartbeats. I will describe the stats that will be implemented. Please
> take a look and let me know what you think.
>
> Assuming you have basic knowledge of Geode, here is a very brief
> description of how the health monitor works. Every member exchanges
> heartbeat messages with its neighbors to make sure that each neighbor is
> alive. If, for some reason, a member doesn't receive a heartbeat from its
> neighbor, the member will send suspect member messages to the coordinator
> reporting the issue. Upon receiving a suspect member message, the
> coordinator will perform a final check with the suspect member by
> exchanging final check messages (similar to heartbeats) with it. Depending
> on the result of the final check, the coordinator decides whether to keep
> the suspect member in the membership or remove it. For details of the
> health monitor, please refer to GEODE-77 and/or GMSHealthMonitor.java.
>
> The proposed stats for health monitor are:
>
> 1) The number of heartbeat requests a member has sent
> 2) The number of heartbeat requests a member has received
> 3) The number of heartbeat (responses) a member has sent
> 4) The number of heartbeat (responses) a member has received
> 5) The number of suspect member messages a member has sent
> 6) The number of suspect member messages a member has received
> 7) The number of final check requests a member has sent
> 8) The number of final check requests a member has received
> 9) The number of final check responses a member has sent
> 10) The number of final check responses a member has received
>
> Note that there are two different types of final checks (TCP based and UDP
> based), so there are additional stats for each type:
>
> 11) The number of TCP final check requests a member has sent
> 12) The number of TCP final check requests a member has received
> 13) The number of TCP final check responses a member has sent
> 14) The number of TCP final check responses a member has received
> 15) The number of UDP final check requests a member has sent
> 16) The number of UDP final check requests a member has received
> 17) The number of UDP final check responses a member has sent
> 18) The number of UDP final check responses a member has received
>
> Thanks,
> Jianxia
>
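
For reference, here is a rough sketch of how the 18 counters listed in the
quoted proposal could be organized. It is plain Java and does not use Geode's
statistics classes or anything from GMSHealthMonitor.java; the class, enum,
and method names (HealthMonitorStatsSketch, Kind, Direction, increment, get)
are made up for illustration. It also assumes that the general final check
counts (7-10) are simply the sum of the TCP and UDP counts (11-18), which the
proposal does not state.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch only: not Geode's statistics API and not the
// actual GMSHealthMonitor implementation.
class HealthMonitorStatsSketch {

  /** Message kinds from the proposal; final checks are split by transport. */
  public enum Kind {
    HEARTBEAT_REQUEST,
    HEARTBEAT_RESPONSE,
    SUSPECT_MEMBER,
    TCP_FINAL_CHECK_REQUEST,
    TCP_FINAL_CHECK_RESPONSE,
    UDP_FINAL_CHECK_REQUEST,
    UDP_FINAL_CHECK_RESPONSE
  }

  /** Whether this member sent or received the message. */
  public enum Direction { SENT, RECEIVED }

  // One LongAdder per (direction, kind) pair: 2 x 7 = 14 stored counters.
  // The maps are fully populated in the constructor and only read afterwards,
  // so concurrent increments touch the thread-safe LongAdders alone.
  private final Map<Direction, Map<Kind, LongAdder>> counters =
      new EnumMap<>(Direction.class);

  public HealthMonitorStatsSketch() {
    for (Direction d : Direction.values()) {
      Map<Kind, LongAdder> perKind = new EnumMap<>(Kind.class);
      for (Kind k : Kind.values()) {
        perKind.put(k, new LongAdder());
      }
      counters.put(d, perKind);
    }
  }

  /** Record one message of the given kind in the given direction. */
  public void increment(Direction d, Kind k) {
    counters.get(d).get(k).increment();
  }

  /** Current value of a single stored counter (stats 1-6 and 11-18). */
  public long get(Direction d, Kind k) {
    return counters.get(d).get(k).sum();
  }

  /** Stats 7 and 8, derived here as TCP + UDP (an assumption). */
  public long finalCheckRequests(Direction d) {
    return get(d, Kind.TCP_FINAL_CHECK_REQUEST)
        + get(d, Kind.UDP_FINAL_CHECK_REQUEST);
  }

  /** Stats 9 and 10, derived here as TCP + UDP (an assumption). */
  public long finalCheckResponses(Direction d) {
    return get(d, Kind.TCP_FINAL_CHECK_RESPONSE)
        + get(d, Kind.UDP_FINAL_CHECK_RESPONSE);
  }
}
```

Deriving the four general final check totals avoids incrementing two counters
per message, at the cost of not being able to count a final check whose
transport is unknown. If the totals are meant to be tracked independently,
they would need their own Kind entries instead.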
