Since upgrading to v4.1.3.1 a couple of weeks ago, the standby servers in our High Availability pairs randomly become unreachable via ping, web, or ssh for anywhere from minutes to hours. The primaries stay up. This did not occur prior to this version. Rebooting the secondary resolves the problem immediately.
Has anyone encountered this issue? We use Dell 1850s with fiber-optic interfaces in centrally deployed, out-of-band gateway mode.
If I ping out from the secondary server, I see (via tcpdump) the outbound traffic on the fake0 interface, but it does not appear to go out the physical interface, as it never reaches the physical interface on the router it is directly attached to. As a result, there is no return traffic on eth0 either. The eth2 interface between the primary and secondary remains up, and I am able to ssh into the secondary from the primary via the untrusted interface (eth1/fake1) while the trusted interface (eth0/fake0) is unresponsive.
Any ideas are appreciated.
-Bill Davis
Colorado State University
