-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43496/
-----------------------------------------------------------

Review request for geode, Hitesh Khamesra and Udo Kohlmeyer.


Repository: geode


Description
-------

When a member crashes or there is a network partition, the new GMS was 
performing many final-checks in GMSHealthMonitor for the same member(s).  This 
overloaded the system and caused a large jump in the number of health 
monitor threads.
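
To illustrate the kind of guard that avoids redundant final-checks, here is a
minimal sketch. It is not the code in this patch: FinalCheckGate, runOnce and
the use of a String member id are hypothetical names chosen for the example.
The idea is only that a second request to check a member is skipped while a
check for that same member is already in flight.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; class and method names are NOT taken from the Geode patch.
final class FinalCheckGate {
    // Members that currently have a final check in flight.
    private final Set<String> inProgress = ConcurrentHashMap.newKeySet();

    /**
     * Runs finalCheck only if no check for memberId is already running.
     * Returns true if this call ran the check, false if it was skipped.
     */
    boolean runOnce(String memberId, Runnable finalCheck) {
        if (!inProgress.add(memberId)) {
            return false; // a check for this member is already in flight
        }
        try {
            finalCheck.run();
            return true;
        } finally {
            // Allow later checks once this one has finished.
            inProgress.remove(memberId);
        }
    }
}
```

With a gate like this, many threads reacting to the same crashed member would
produce one final check at a time instead of one per thread.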

JGroupsMessenger was also using checkIfAvailable() instead of suspect() when 
IOExceptions were thrown in JGroups messaging, which exacerbated the 
problem when the network was actually down.
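
The difference between the two reactions to a send failure can be sketched as
follows. The FailureDetector and Transport interfaces below are hypothetical
stand-ins, not Geode's interfaces, and the sketch assumes that reporting a
suspicion is cheap compared with running an availability check inline for
every failed send.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a health monitor; NOT the Geode interface.
interface FailureDetector {
    void suspect(String memberId, String reason);
}

// Hypothetical stand-in for the messaging layer.
interface Transport {
    void send(String memberId, byte[] payload) throws IOException;
}

final class SuspectOnSendFailure {
    private final Transport transport;
    private final FailureDetector detector;

    SuspectOnSendFailure(Transport transport, FailureDetector detector) {
        this.transport = transport;
        this.detector = detector;
    }

    /** Sends to every member and returns the members whose send failed. */
    List<String> sendAll(List<String> members, byte[] payload) {
        List<String> failed = new ArrayList<>();
        for (String member : members) {
            try {
                transport.send(member, payload);
            } catch (IOException e) {
                // Report the member as suspect and move on, rather than
                // running an availability check for each failed send.
                detector.suspect(member, "send failed: " + e.getMessage());
                failed.add(member);
            }
        }
        return failed;
    }
}
```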


Diffs
-----

  
gemfire-core/src/main/java/com/gemstone/gemfire/distributed/internal/membership/gms/fd/GMSHealthMonitor.java
 b20fe036ffa7931fbaad0b7af86ee934368a7f8f 
  
gemfire-core/src/main/java/com/gemstone/gemfire/distributed/internal/membership/gms/interfaces/JoinLeave.java
 3f2d8479f0dd4bbf9d0213267b269b5fb118cd3e 
  
gemfire-core/src/main/java/com/gemstone/gemfire/distributed/internal/membership/gms/messenger/JGroupsMessenger.java
 be2c405495268ea4c83380f4fe8b5c261539a9aa 

Diff: https://reviews.apache.org/r/43496/diff/


Testing
-------

network-down testing, precheckin


Thanks,

Bruce Schuchardt
