[tickets] [opensaf:tickets] #2522 dtm: if TCP_USER_TIMEOUT closes socket, no attempt is make to reconnect
- **status**: review --> fixed
- **Comment**:

commit 3ac6c452d30d2814f1704af578617f2a90f439b7
Author: Alex Jones
Date: Tue Aug 15 11:36:41 2017 -0400

---

** [tickets:#2522] dtm: if TCP_USER_TIMEOUT closes socket, no attempt is make to reconnect**

**Status:** fixed
**Milestone:** 5.17.10
**Created:** Thu Jul 06, 2017 01:28 PM UTC by Alex Jones
**Last Updated:** Fri Aug 11, 2017 03:21 PM UTC
**Owner:** Alex Jones

If TCP is used for transport and TCP_USER_TIMEOUT is also set, a node that leaves the cluster because of a brief network outage does not rejoin the cluster automatically. For example, if TCP_USER_TIMEOUT is set to 1500 ms and the network outage on the link lasts 2000 ms, the node never comes back into the cluster.

---

Sent from sourceforge.net because opensaf-tickets@lists.sourceforge.net is subscribed to https://sourceforge.net/p/opensaf/tickets/

To unsubscribe from further messages, a project admin can change settings at https://sourceforge.net/p/opensaf/admin/tickets/options. Or, if this is a mailing list, you can unsubscribe from the mailing list.

Opensaf-tickets mailing list
Opensaf-tickets@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-tickets
[tickets] [opensaf:tickets] #2522 dtm: if TCP_USER_TIMEOUT closes socket, no attempt is make to reconnect
- **status**: accepted --> review

---

** [tickets:#2522] dtm: if TCP_USER_TIMEOUT closes socket, no attempt is make to reconnect**

**Status:** review
**Milestone:** 5.17.10
**Created:** Thu Jul 06, 2017 01:28 PM UTC by Alex Jones
**Last Updated:** Fri Aug 11, 2017 02:59 PM UTC
**Owner:** Alex Jones
[tickets] [opensaf:tickets] #2522 dtm: if TCP_USER_TIMEOUT closes socket, no attempt is make to reconnect
- **status**: unassigned --> accepted
- **assigned_to**: Alex Jones
- **Part**: - --> d

---

** [tickets:#2522] dtm: if TCP_USER_TIMEOUT closes socket, no attempt is make to reconnect**

**Status:** accepted
**Milestone:** 5.17.10
**Created:** Thu Jul 06, 2017 01:28 PM UTC by Alex Jones
**Last Updated:** Wed Aug 09, 2017 05:41 PM UTC
**Owner:** Alex Jones
[tickets] [opensaf:tickets] #2522 dtm: if TCP_USER_TIMEOUT closes socket, no attempt is make to reconnect
If I set DTM_INI_DIS_TIMEOUT_SECS to 5000 seconds, the nodes do relearn each other and come back into the cluster.

---

** [tickets:#2522] dtm: if TCP_USER_TIMEOUT closes socket, no attempt is make to reconnect**

**Status:** unassigned
**Milestone:** 5.17.10
**Created:** Thu Jul 06, 2017 01:28 PM UTC by Alex Jones
**Last Updated:** Fri Jul 21, 2017 03:53 AM UTC
**Owner:** nobody
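One possible reading of this result, assuming (as suggested elsewhere in this thread) that DTM only sends discovery messages during a limited window after it starts, is that rediscovery succeeds only while that window is still open. The sketch below models just that assumption. The function name `can_rediscover` and the value 5 used in the comparison are hypothetical and are not taken from the OpenSAF source.

```python
def can_rediscover(seconds_since_dtm_start: float,
                   ini_dis_timeout_secs: float) -> bool:
    """Return True if discovery messages would still be sent.

    Assumption (not confirmed by the ticket): discovery runs only during
    the first ini_dis_timeout_secs seconds after DTM starts.
    """
    return seconds_since_dtm_start < ini_dis_timeout_secs


# An outage that ends 60 s after startup:
# with the 5000 s value from the comment above, discovery is still running.
still_running = can_rediscover(60, 5000)
# With a short, purely illustrative window of 5 s, it has already stopped.
already_stopped = not can_rediscover(60, 5)
```

Under this model, a very large timeout only widens the window; it does not make the node reconnect once the window has closed.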
[tickets] [opensaf:tickets] #2522 dtm: if TCP_USER_TIMEOUT closes socket, no attempt is make to reconnect
I don't think Alex is talking about an initial discovery issue (the topology node discovery process), but in any case we can configure a very large value for `DTM_INI_DIS_TIMEOUT_SECS` in dtm.conf to verify.

---

** [tickets:#2522] dtm: if TCP_USER_TIMEOUT closes socket, no attempt is make to reconnect**

**Status:** unassigned
**Milestone:** 5.17.10
**Created:** Thu Jul 06, 2017 01:28 PM UTC by Alex Jones
**Last Updated:** Thu Jul 20, 2017 01:54 PM UTC
**Owner:** nobody
[tickets] [opensaf:tickets] #2522 dtm: if TCP_USER_TIMEOUT closes socket, no attempt is make to reconnect
I believe DTM sends broadcast (or multicast) messages on the network for a while after it has started, to discover other nodes on the network. But it stops doing this after a while, and that is why it fails to reconnect after a network disturbance. A solution could be:

* The node with the lowest node_id never stops broadcasting the discovery messages.
* A node that is connected to another node with a lower node_id never broadcasts discovery messages.
* The node with the lowest node_id informs all the other connected nodes about the topology of the cluster, in particular when a new node has appeared.

---

** [tickets:#2522] dtm: if TCP_USER_TIMEOUT closes socket, no attempt is make to reconnect**

**Status:** unassigned
**Milestone:** 5.17.10
**Created:** Thu Jul 06, 2017 01:28 PM UTC by Alex Jones
**Last Updated:** Thu Jul 06, 2017 01:28 PM UTC
**Owner:** nobody
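The first two rules of the proposal above can be sketched as a single predicate. This is only an illustration of the proposed rule, not DTM code: the function name `should_broadcast` and the representation of peers as a collection of integer node IDs are assumptions, and "lowest node_id" is interpreted as "lowest among the nodes this node is currently connected to". The third rule (topology updates from the lowest node) is not modelled.

```python
from typing import Iterable


def should_broadcast(my_node_id: int, connected_node_ids: Iterable[int]) -> bool:
    """Decide whether this node keeps sending discovery messages.

    A node broadcasts unless it is connected to a node with a lower
    node_id. A node with no connections is the lowest node it knows
    about, so it broadcasts.
    """
    return all(peer > my_node_id for peer in connected_node_ids)


# Three-node cluster {1, 2, 3}, fully connected: only node 1 broadcasts.
full = [should_broadcast(1, [2, 3]),
        should_broadcast(2, [1, 3]),
        should_broadcast(3, [1, 2])]   # [True, False, False]

# Node 3 is cut off by a network outage: node 3 starts broadcasting again,
# and node 1 is still broadcasting on the other side of the partition.
split = [should_broadcast(1, [2]),
         should_broadcast(2, [1]),
         should_broadcast(3, [])]      # [True, False, True]
```

With this rule each partition always contains exactly one broadcasting node, so the two sides can find each other again once the link recovers.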
[tickets] [opensaf:tickets] #2522 dtm: if TCP_USER_TIMEOUT closes socket, no attempt is make to reconnect
---

** [tickets:#2522] dtm: if TCP_USER_TIMEOUT closes socket, no attempt is make to reconnect**

**Status:** unassigned
**Milestone:** 5.17.10
**Created:** Thu Jul 06, 2017 01:28 PM UTC by Alex Jones
**Last Updated:** Thu Jul 06, 2017 01:28 PM UTC
**Owner:** nobody

If TCP is used for transport and TCP_USER_TIMEOUT is also set, a node that leaves the cluster because of a brief network outage does not rejoin the cluster automatically. For example, if TCP_USER_TIMEOUT is set to 1500 ms and the network outage on the link lasts 2000 ms, the node never comes back into the cluster.
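For context, `TCP_USER_TIMEOUT` is a Linux socket option giving the maximum time, in milliseconds, that transmitted data may remain unacknowledged before TCP forcibly closes the connection. The following minimal sketch shows how such a timeout is typically set on a socket. It is not DTM's code: the helper name `tcp_socket_with_user_timeout` is hypothetical, and the option is applied only where the platform exposes it.

```python
import socket


def tcp_socket_with_user_timeout(timeout_ms: int) -> socket.socket:
    """Create a TCP socket and set TCP_USER_TIMEOUT where available."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # The constant is only defined on platforms that support the option
    # (Linux); elsewhere the socket is returned unchanged.
    opt = getattr(socket, "TCP_USER_TIMEOUT", None)
    if opt is not None:
        sock.setsockopt(socket.IPPROTO_TCP, opt, timeout_ms)
    return sock


# 1500 ms, as in the report above. Once connected, data left unacknowledged
# for longer than this (for example during a 2000 ms outage) causes the
# kernel to close the connection; the application then has to notice the
# closed socket and establish a new connection itself.
sock = tcp_socket_with_user_timeout(1500)
sock.close()
```

The kernel never reopens a connection it has timed out, which is why reconnection has to be handled at the application level.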