[Linux-ha-dev] LRM bug

2012-07-30 Thread Alan Robertson
The LRM logs operation timeouts as ERROR: messages, rather than treating them like ordinary failed operations, which only produce warnings. This violates the meaning of ERROR: messages in the code. We reserved ERROR: messages for things that the software did not expect and therefore possibly could not properly recover from. In

[Linux-ha-dev] Probable sub-optimal behavior in ucast

2012-07-30 Thread Alan Robertson
Hi, I have a 10-node cluster in which each node has 2 interfaces, so each ha.cf file has 18 ucast lines in it. If I read the code correctly, each heartbeat packet is then being received 18 times and sent to the master control process - where each is then uncompressed and 17
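
For reference, the figure of 18 ucast lines follows from (10 - 1) peers times 2 interfaces. The sketch below only illustrates that arithmetic and what such a ha.cf fragment might look like. The interface names (eth0, eth1), the 10.0.x.y addresses, and both function names are assumptions made up for the example, not taken from the original post, and nothing here models how heartbeat actually receives or processes packets.

```python
def ucast_line_count(nodes, interfaces_per_node):
    """Number of ucast directives one node needs to reach every peer interface."""
    return (nodes - 1) * interfaces_per_node


def ucast_lines(local_node, nodes, interfaces_per_node):
    """Build hypothetical ha.cf ucast directives for one node.

    Assumed layout (illustrative only): interface N is named ethN and
    sits on network 10.0.N.0/24, and node P has host address P on each.
    """
    lines = []
    for peer in range(1, nodes + 1):
        if peer == local_node:
            continue  # a node does not list itself
        for iface in range(interfaces_per_node):
            # ha.cf syntax: ucast <device> <peer-ip-address>
            lines.append(f"ucast eth{iface} 10.0.{iface}.{peer}")
    return lines


lines = ucast_lines(local_node=1, nodes=10, interfaces_per_node=2)
print(ucast_line_count(10, 2))
print("\n".join(lines[:4]))
```

The count grows linearly with cluster size per node, so the total number of ucast directives across all ha.cf files grows roughly with the square of the node count.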