Hi David,

I tried to reproduce the issue in our lab with IPv4 and IPv6 multiple times on Ubuntu 16.04, but could not reproduce it.
Could you please share the dtmd traces (/var/log/opensaf/osafdtmd) from both nodes (SC-1 and SC-2), one set from a run where it works and one from a run where it does not? You can enable dtmd traces by uncommenting the following line in /etc/opensaf/dtmd.conf on SC-1 and SC-2, and then running your tests:

    #args="--tracemask=0xffffffff"

Thanks & Regards
Mohan Kanakam | +91-8333082448
Software Engineer
High Availability Solutions
www.GetHighAvailability.com <http://www.gethighavailability.com/>
Get High Availability Today!
NJ, USA: +1 508-422-7725 | Hyderabad, India: +91 798-992-5293

From: Hoyt, David [mailto:dh...@rbbn.com]
Sent: Tuesday, May 05, 2020 10:58 PM
To: Mohan Kanakam
Cc: nagen...@gethighavailability.com; Opensaf-users@lists.sourceforge.net
Subject: RE: alias IP causing issues

Just an fyi, we found the issue occurs more frequently with IPv6.

Regards,
David

From: Mohan Kanakam <mo...@hasolutions.in>
Sent: Tuesday, May 5, 2020 1:18 PM
To: Hoyt, David <dh...@rbbn.com>
Cc: nagen...@gethighavailability.com; Opensaf-users@lists.sourceforge.net
Subject: RE: alias IP causing issues

Hi David,

Thanks for the example. Let me try to reproduce it in our lab.

Just as a word of caution: in such scenarios, when application traffic on the shared NIC increases, OpenSAF processes can miss their messages or time out. These could become Day 2 problems.

Thanks & Regards
Mohan Kanakam

From: Hoyt, David [mailto:dh...@rbbn.com]
Sent: Monday, May 04, 2020 11:43 PM
To: mo...@hasolutions.in
Cc: nagen...@gethighavailability.com; mo...@hasolutions.in; Opensaf-users@lists.sourceforge.net
Subject: alias IP causing issues

Hi Mohan,

Thanks for your response.
Maybe it is easier if I use an example. The following IPs are the fixed eth0 IP for each node:

    SC-1 IP = 1.2.3.444
    SC-2 IP = 1.2.3.777

The node with SC-1 is up and running with the active opensaf controller. The aliasIP is then added to SC-1’s eth0.

    aliasIP = 1.2.3.99

Opensaf is started on the SC-2 node. After opensaf on SC-2 is up and running, I sometimes see the following TCP connection. It shows that SC-1’s node IP (1.2.3.444:22606) is connected to SC-2’s port 6700 (1.2.3.777:6700). Opensaf communication between the nodes is good.

    [root@sc-1: ~]# ss -at | grep 6700
    State   Recv-Q  Send-Q  Local Address:Port   Peer Address:Port
    LISTEN  0       20      1.2.3.444:6700       *:*
    ESTAB   0       0       1.2.3.444:22606      1.2.3.777:6700

Other times, I see the connection as below, which shows the alias IP on SC-1 (1.2.3.99:15128) connected to SC-2’s port 6700. In this scenario, an si-swap of the application running on the two nodes will cause opensaf to go split-brain. That’s because when the application running on SC-1 goes quiesced, it removes the aliasIP. Once that occurs, the TCP connection is lost.

    [root@sc-1: ~]# ss -at | grep 6700
    State   Recv-Q  Send-Q  Local Address:Port   Peer Address:Port
    LISTEN  0       20      1.2.3.444:6700       *:*
    ESTAB   0       0       1.2.3.99:15128       1.2.3.777:6700

As I understand it, during the initial UDP message exchange between the two opensaf processes, the active opensaf controller provides its IP address. This IP is then used to establish a TCP connection between the two. The problem is that sometimes it’s the alias IP address that is provided.

So, is there a way to force opensaf to respond with the fixed IP address during that initial UDP message exchange?

Regards,
David
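
The manual check David describes (running `ss -at | grep 6700` on SC-1 and looking at the local address of the established connection) could be automated while trying to reproduce the issue. The following is only a minimal sketch under stated assumptions: the function and variable names are hypothetical, it is not part of OpenSAF, and it assumes the five-column `ss -at` layout and the placeholder addresses shown in the thread.

```python
from typing import List, NamedTuple, Tuple


class TcpConn(NamedTuple):
    state: str
    local_addr: str
    local_port: str
    peer_addr: str
    peer_port: str


def split_endpoint(endpoint: str) -> Tuple[str, str]:
    """Split 'addr:port' on the last colon; drop brackets around IPv6 addresses."""
    addr, _, port = endpoint.rpartition(":")
    return addr.strip("[]"), port


def parse_ss(output: str) -> List[TcpConn]:
    """Parse data rows of `ss -at` output; header and prompt lines are skipped."""
    conns = []
    for line in output.splitlines():
        fields = line.split()
        # A data row has exactly: State, Recv-Q, Send-Q, Local, Peer.
        if len(fields) != 5 or not fields[1].isdigit():
            continue
        state, _recv_q, _send_q, local, peer = fields
        conns.append(TcpConn(state, *split_endpoint(local), *split_endpoint(peer)))
    return conns


def unexpected_source_conns(output: str, fixed_ip: str,
                            peer_port: str = "6700") -> List[TcpConn]:
    """Return established connections to peer_port whose source is not fixed_ip."""
    return [
        c for c in parse_ss(output)
        if c.state == "ESTAB" and c.peer_port == peer_port and c.local_addr != fixed_ip
    ]


# Sample outputs copied from the thread (placeholder addresses, not real IPs).
GOOD_OUTPUT = """\
State Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0 20 1.2.3.444:6700 *:*
ESTAB 0 0 1.2.3.444:22606 1.2.3.777:6700
"""

BAD_OUTPUT = """\
State Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0 20 1.2.3.444:6700 *:*
ESTAB 0 0 1.2.3.99:15128 1.2.3.777:6700
"""

if __name__ == "__main__":
    for name, text in (("good", GOOD_OUTPUT), ("bad", BAD_OUTPUT)):
        hits = unexpected_source_conns(text, fixed_ip="1.2.3.444")
        print(name, [f"{c.local_addr}:{c.local_port}" for c in hits])
```

Run after each OpenSAF start on SC-2, a check like this would flag the runs in which the connection to port 6700 originates from the alias IP, which may help correlate those runs with the dtmd traces.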