On Fri, Oct 13, 2017 at 8:30 AM, Numan Siddique <[email protected]> wrote:
> On Fri, Oct 13, 2017 at 6:05 AM, Andy Zhou <[email protected]> wrote:
>
>> Hi, Numan,
>>
>> I am curious why default 5 seconds inactivity time does not work? Do
>> you have more details?
>>
>> Does the glitch usually happen around the HA switch over? If this
>> happens during normal operation, then this is not an HA-specific
>> issue, but an indication of some connectivity issues.
>>
>
> Hi Andy. This happens in the openstack deployment and when the
> neutron-server is busy handling lots of API requests.
> Normally the deployment would be having 3 controller nodes and
> neutron-server would be running in each node. On each controller node,
> neutron-server starts around 10 - 12 neutron workers (which are separate
> processes). Number of API workers is a configuration option and normally
> number of cores = number of neutron workers if not configured.
>
> I have tested in both physical nodes deployment and virtual deployment (3
> controllers running as vms in a node). Around 40 connections are opened to
> the OVN north ovsdb-server by all the neutron workers in the physical
> deployment and around 15 connections are opened in the virtual deployment.
> When neutron-server is loaded with many API requests, I have noticed that
> ovsdb-server drops the connections when it doesn't get the echo reply every
> 5 seconds. This leads to a lot of reconnections to the ovsdb-server and the
> response from the neutron-server is very slow and bad. With this patch it
> seems to work fine.
>
> The issue is not because of any network issues but because of lots of
> connections from the neutron-server workers to the ovsdb-server and failure
> by the idl clients to reply to the echo request every 5 seconds when the
> neutron-server is loaded.

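For reference, the probe interval can also be changed by hand on an existing deployment. A minimal sketch, assuming each database already has exactly one Connection row (so `.` can name it); `probe_cmd` is a hypothetical helper, not part of the OCF script, and it only prints the command instead of running it:

```shell
#!/bin/sh
# Sketch: print (do not run) the command that would change the
# inactivity probe on the single Connection row of the NB or SB database.
# Assumption: exactly one Connection row exists, so "." can name it.

# probe_cmd DB INTERVAL_MS  (hypothetical helper)
#   DB is "nb" or "sb"; INTERVAL_MS is the probe interval in milliseconds.
probe_cmd() {
    db=$1
    interval_ms=$2

    # Only the northbound and southbound databases are handled.
    case "$db" in
        nb|sb) ;;
        *) echo "probe_cmd: DB must be nb or sb" >&2; return 1 ;;
    esac

    # Reject anything that is not a non-negative integer.
    case "$interval_ms" in
        ''|*[!0-9]*) echo "probe_cmd: interval must be in milliseconds" >&2; return 1 ;;
    esac

    echo "ovn-${db}ctl set connection . inactivity_probe=${interval_ms}"
}

# 60000 ms is the value used in the patch under review.
probe_cmd nb 60000
probe_cmd sb 60000
```

Piping the output to `sh` on the node running the master ovsdb-servers would apply it.
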
We have had to disable the inactivity probe everywhere each time we have
done performance testing so far.

> I can make the patch to provide the configuration option to override the
> inactivity probe value so that it doesn't affect others who use the OVN OCF
> pacemaker script.
>
> Let me know your comments.

I think the default through this script should match the normal
default. It looks like it defaults to 60s in this patch instead of
5s? I would make it match.

I do like exposing the ability to change it, though. We could
consider setting a different default through our OpenStack work.

>
> Thanks
> Numan
>
>
>>
>> On Thu, Oct 12, 2017 at 11:08 AM, Andy Zhou <[email protected]> wrote:
>> > Sure, I will take a look.
>> >
>> > On Thu, Oct 12, 2017 at 10:49 AM, Ben Pfaff <[email protected]> wrote:
>> >> Hi Andy. In the IRC meeting today, Numan suggested that you might be an
>> >> appropriate reviewer for this patch, so if you agree and you have a
>> >> chance to look at this then it would be appreciated.
>> >>
>> >> Thanks,
>> >>
>> >> Ben.
>> >>
>> >> On Wed, Oct 11, 2017 at 02:22:33PM +0530, [email protected] wrote:
>> >>> From: Numan Siddique <[email protected]>
>> >>>
>> >>> In the case of OVN HA deployments with openstack, it has been noticed
>> >>> that the 5 seconds inactivity probe interval is not enough and ovsdb-servers
>> >>> time out.
>> >>> This patch
>> >>> - provides an option to configure this value.
>> >>> - creates a connection row in NB/SB dbs and sets the target and
>> >>> inactivity_probe values when the node is promoted to master.
>> >>>
>> >>> CC: Andy Zhou <[email protected]>
>> >>> Signed-off-by: Numan Siddique <[email protected]>
>> >>> ---
>> >>> ovn/utilities/ovndb-servers.ocf | 27 +++++++++++++++++++++++++++
>> >>> 1 file changed, 27 insertions(+)
>> >>>
>> >>> diff --git a/ovn/utilities/ovndb-servers.ocf b/ovn/utilities/ovndb-servers.ocf
>> >>> index fe1207c22..92620af6a 100755
>> >>> --- a/ovn/utilities/ovndb-servers.ocf
>> >>> +++ b/ovn/utilities/ovndb-servers.ocf
>> >>> @@ -8,6 +8,8 @@
>> >>> : ${SB_MASTER_PORT_DEFAULT="6642"}
>> >>> : ${SB_MASTER_PROTO_DEFAULT="tcp"}
>> >>> : ${MANAGE_NORTHD_DEFAULT="no"}
>> >>> +: ${INACTIVE_PROBE_DEFAULT="60000"}
>> >>> +
>> >>> CRM_MASTER="${HA_SBIN_DIR}/crm_master -l reboot"
>> >>> CRM_ATTR_REPL_INFO="${HA_SBIN_DIR}/crm_attribute --type crm_config --name OVN_REPL_INFO -s ovn_ovsdb_master_server"
>> >>> OVN_CTL=${OCF_RESKEY_ovn_ctl:-${OVN_CTL_DEFAULT}}
>> >>> @@ -17,6 +19,7 @@ NB_MASTER_PROTO=${OCF_RESKEY_nb_master_protocol:-${NB_MASTER_PROTO_DEFAULT}}
>> >>> SB_MASTER_PORT=${OCF_RESKEY_sb_master_port:-${SB_MASTER_PORT_DEFAULT}}
>> >>> SB_MASTER_PROTO=${OCF_RESKEY_sb_master_protocol:-${SB_MASTER_PROTO_DEFAULT}}
>> >>> MANAGE_NORTHD=${OCF_RESKEY_manage_northd:-${MANAGE_NORTHD_DEFAULT}}
>> >>> +INACTIVE_PROBE=${OCF_RESKEY_inactive_probe_interval:-${INACTIVE_PROBE_DEFAULT}}
>> >>>
>> >>> # Invalid IP address is an address that can never exist in the network, as
>> >>> # mentioned in rfc-5737. The ovsdb servers connects to this IP address till
>> >>> @@ -101,6 +104,14 @@ ovsdb_server_metadata() {
>> >>> <content type="string" />
>> >>> </parameter>
>> >>>
>> >>> + <parameter name="inactive_probe_interval" unique="1">
>> >>> + <longdesc lang="en">
>> >>> + Inactive probe interval to set for ovsdb-server.
>> >>> + </longdesc>
>> >>> + <shortdesc lang="en">Set inactive probe interval</shortdesc>
>> >>> + <content type="string" />
>> >>> + </parameter>
>> >>> +
>> >>> </parameters>
>> >>>
>> >>> <actions>
>> >>> @@ -138,6 +149,22 @@ ovsdb_server_notify() {
>> >>> ${OVN_CTL} --ovn-manage-ovsdb=no start_northd
>> >>> fi
>> >>>
>> >>> + conn=`ovn-nbctl get NB_global . connections`
>> >>> + if [ "$conn" == "[]" ]
>> >>> + then
>> >>> + ovn-nbctl -- --id=@conn_uuid create Connection \
>> >>> +target="p${NB_MASTER_PROTO}\:${NB_MASTER_PORT}\:${MASTER_IP}" \
>> >>> +inactivity_probe=$INACTIVE_PROBE -- set NB_Global . connections=@conn_uuid
>> >>> + fi
>> >>> +
>> >>> + conn=`ovn-sbctl get SB_global . connections`
>> >>> + if [ "$conn" == "[]" ]
>> >>> + then
>> >>> + ovn-sbctl -- --id=@conn_uuid create Connection \
>> >>> +target="p${SB_MASTER_PROTO}\:${SB_MASTER_PORT}\:${MASTER_IP}" \
>> >>> +inactivity_probe=$INACTIVE_PROBE -- set SB_Global . connections=@conn_uuid
>> >>> + fi
>> >>> +
>> >>> else
>> >>> if [ "$MANAGE_NORTHD" = "yes" ]; then
>> >>> # Stop ovn-northd service. Set --ovn-manage-ovsdb=no so that
>> >>> --
>> >>> 2.13.5
>> >>>
>> >>> _______________________________________________
>> >>> dev mailing list
>> >>> [email protected]
>> >>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev

-- 
Russell Bryant
