[ https://issues.apache.org/jira/browse/TRAFODION-3334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gonzalo E Correa resolved TRAFODION-3334.
-----------------------------------------
    Resolution: Fixed

> Communication IO between monitor processes must use timeouts and retries
> ------------------------------------------------------------------------
>
>                 Key: TRAFODION-3334
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-3334
>             Project: Apache Trafodion
>          Issue Type: Bug
>          Components: foundation
>    Affects Versions: 2.4
>            Reporter: Gonzalo E Correa
>            Priority: Major
>             Fix For: 2.4
>
>          Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Most communication channels used by monitor processes to exchange cluster
> state information and to handle failure detection must be changed to
> asynchronous IO with timeouts and retries, so that an unresponsive monitor
> process can be removed from the cluster communication. This prevents a
> 'Sync Thread Timeout' failure of the entire cluster instance when a monitor
> process or its host server becomes unresponsive due to a server or network
> failure.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)