GitHub user zcorrea opened a pull request:
https://github.com/apache/trafodion/pull/1392
[TRAFODION-2881] HA fixes
Fixed multiple problems in monitor Allgather() socket reconnect logic.
- Separated node down detection logic from communication errors and timeouts
to better handle multiple failure scenarios
- Better handling of network resets
- Additional trace information
- Fixed 'node up' hang in monitor shell due to TmSync race condition
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/zcorrea/trafodion TRAFODION-2881
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/trafodion/pull/1392.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #1392
----
commit e832d827507521998567d4cc5d92e4239007d19a
Author: Zalo Correa <zalo.correa@...>
Date: 2018-01-11T17:32:11Z
[TRAFODION-2881] HA fixes
Fixed multiple problems in monitor Allgather() socket reconnect logic.
- Separated node down detection logic from communication errors and timeouts
to better handle multiple failure scenarios
- Better handling of network resets
- Additional trace information
- Fixed 'node up' hang in monitor shell due to TmSync race condition
----