Hi Vladislav,

Sorry for the late reply.
Please see below for a few comments.

On Tue, May 16, 2023 at 3:42 PM Vladislav Odintsov <[email protected]> wrote:
>
> Hi Numan, Dumitru, Ilya,
>
> I guess I've managed to deal with some of these problems and have some conclusions.
> Please see inline.
>
> > On 14 Apr 2023, at 16:26, Vladislav Odintsov <[email protected]> wrote:
> >
> > And as a follow-up, I sometimes see the following error message in the ovn-controller log, though a much higher inactivity probe limit (in seconds) is configured for the chassis:
> >
> ovn-controller establishes 3 openflow connections to the local ovs-vswitchd for the following purposes:
> 1. openflow rules management (the openflow probe interval is configured through the ovn-openflow-probe-interval external_ids setting [1]; default: disabled [2]);
> 2. openflow features collection (hardcoded probe interval 5s [3]);
> 3. pinctrl thread (hardcoded probe interval 5s [4]).
>
> > ovs|04426|rconn|ERR|unix:/run/openvswitch/br-int.mgmt: no response to inactivity probe after 8 seconds, disconnecting
> >
> > ovs|26560|rconn|ERR|unix:/run/openvswitch/br-int.mgmt: no response to inactivity probe after 10 seconds, disconnecting
> >
>
> These messages are most likely related to the connection for features collection or the pinctrl thread, so ovn-openflow-probe-interval doesn't matter here.
>
> >
> > server with the first log message:
> > # ovs-vsctl get open . external_ids:ovn-openflow-probe-interval
> > "30"
> >
> > server with the second log message:
> > # ovs-vsctl get open . external_ids:ovn-openflow-probe-interval
> > "60"
> >
> >
> >> On 30 Mar 2023, at 23:19, Vladislav Odintsov <[email protected]> wrote:
> >>
> >> Hi all,
> >>
> >> I'm facing a false positive failover when the openflow connection gets dropped after an inactivity probe timeout, due to high load on ovn-controller handling a huge amount of ovsdb updates.
> >> I wonder whether it's a bug or whether I have to tune something. See the ovn-controller logs:
> >>
> >> 2023-03-30T07:57:24.469Z|58440|inc_proc_eng|INFO|node: logical_flow_output, recompute (failed handler for input port_groups) took 35774ms
> >> 2023-03-30T07:57:28.521Z|58441|lflow_cache|INFO|Detected cache inactivity (last active 40006 ms ago): trimming cache
> >> 2023-03-30T07:57:28.528Z|58442|timeval|WARN|Unreasonably long 40084ms poll interval (32947ms user, 6968ms system)
> >> 2023-03-30T07:57:28.528Z|58443|timeval|WARN|faults: 43468 minor, 0 major
> >> …..
> >> 2023-03-30T07:57:48.784Z|58496|reconnect|ERR|ssl:xxxx:6642: no response to inactivity probe after 60.2 seconds, disconnecting
> >> 2023-03-30T07:57:48.784Z|58497|reconnect|INFO|ssl:xxxx:6642: connection dropped
> >> 2023-03-30T07:57:48.905Z|58498|main|INFO|OVNSB commit failed, force recompute next time.
> >> 2023-03-30T07:57:49.786Z|58499|reconnect|INFO|ssl:yyyy:6642: connecting...
> >> 2023-03-30T07:57:49.924Z|58500|reconnect|INFO|ssl:yyyy:6642: connected
> >> 2023-03-30T07:57:29.781Z|58501|poll_loop|INFO|wakeup due to 0-ms timeout at lib/stream-ssl.c:838 (74% CPU usage)
> >> 2023-03-30T07:57:29.836Z|58502|poll_loop|INFO|wakeup due to 0-ms timeout at lib/stream-ssl.c:838 (74% CPU usage)
> >> ….
> >> 2023-03-30T07:57:05.976Z|58545|poll_loop|INFO|wakeup due to 0-ms timeout at lib/stream-ssl.c:838 (101% CPU usage)
> >> 2023-03-30T07:57:07.002Z|58546|memory|INFO|peak resident set size grew 54% in last 682257.2 seconds, from 4435696 kB to 6824820 kB
> >> 2023-03-30T07:57:07.002Z|58547|memory|INFO|idl-cells-OVN_Southbound:11491359 idl-cells-Open_vSwitch:14416 lflow-cache-entries-cache-expr:530298 lflow-cache-entries-cache-matches:31770 lflow-cache-size-KB:769553 local_datapath_usage-KB:630 ofctrl_desired_flow_usage-KB:577052 ofctrl_installed_flow_usage-KB:409538 ofctrl_sb_flow_ref_usage-KB:273707
> >> …
> >> 2023-03-30T07:57:31.667Z|58552|rconn|WARN|unix:/run/openvswitch/br-int.mgmt: connection dropped (Broken pipe)
> >> 2023-03-30T07:57:31.739Z|58553|binding|INFO|Releasing lport cr-xxx from this chassis (sb_readonly=0)
> >> 2023-03-30T07:57:31.739Z|58554|if_status|WARN|Dropped 1 log messages in last 1254725 seconds (most recently, 1254725 seconds ago) due to excessive rate
> >> 2023-03-30T07:57:31.739Z|58555|if_status|WARN|Trying to release unknown interface cr-xxx
> >> 2023-03-30T07:57:31.743Z|58556|binding|INFO|Releasing lport cr-yyy from this chassis (sb_readonly=0)
> >> 2023-03-30T07:57:31.743Z|58557|binding|INFO|Releasing lport cr-zzz from this chassis (sb_readonly=0)
> >> …
> >>
> >> After some time these ports are claimed back:
> >>
> >> 2023-03-30T07:57:49.320Z|59137|rconn|INFO|unix:/run/openvswitch/br-int.mgmt: connecting...
> >> 2023-03-30T07:57:49.320Z|59138|rconn|INFO|unix:/run/openvswitch/br-int.mgmt: connected
> >> 2023-03-30T07:57:49.523Z|59139|poll_loop|INFO|wakeup due to 0-ms timeout at controller/ovn-controller.c:4296 (99% CPU usage)
> >> 2023-03-30T07:57:49.574Z|59140|poll_loop|INFO|wakeup due to 0-ms timeout at controller/ovn-controller.c:4286 (99% CPU usage)
> >> 2023-03-30T07:58:01.451Z|59141|poll_loop|INFO|wakeup due to [POLLIN] on fd 22 (xxxx:34736<->yyyy:6642) at lib/stream-ssl.c:836 (99% CPU usage)
> >> 2023-03-30T07:58:02.449Z|59142|binding|INFO|Claiming lport cr-xxx for this chassis.
> >> 2023-03-30T07:58:02.449Z|59143|binding|INFO|cr-xxx: Claiming xx:xx:xx:xx:xx x.x.x.x/xx
> >> 2023-03-30T07:58:02.453Z|59144|binding|INFO|Claiming lport cr-yyy for this chassis.
> >> 2023-03-30T07:58:02.453Z|59145|binding|INFO|cr-yyy: Claiming yy:yy:yy:yy:yy:yy y.y.y.y/yy
> >> …
> >>
> >> Previously I ran into a similar problem when the openflow probe interval was 30 seconds, and I set it to 60 and then to 120 seconds.
> >> But it seems that OVS was still using the default openflow inactivity probe timeout of 60 seconds, so the error above occurred. In the OVS logs it looks like this:
> >>
> >> 2023-03-30T07:57:28.632Z|61003|rconn|ERR|br-int<->unix#8: no response to inactivity probe after 60 seconds, disconnecting
>
> When ovn-controller loses the openflow connection used for flow management, it considers OVS dead and releases its ha-chassis-group bindings. This logic was introduced with the patch [5].
>
> ovn-controller uses a unix socket to connect to the ovs-vswitchd OF socket, so it seems useless to configure inactivity probing, right?
> In this commit message [6] Han also disables OF inactivity probing by default.
>
> Moreover, it seems harmful to configure inactivity probing from OVS to ovn-controller and vice versa.
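As a practical aside for anyone debugging this on their own hosts: both probe knobs discussed above can be inspected before changing anything. This is only an illustrative sketch, assuming the integration bridge is named br-int and that a Controller record has been created for it:

  # ovn-controller side: OpenFlow probe interval for the flow-management
  # connection, in seconds; prints nothing if the key is unset, and "0"
  # means probing is disabled.
  ovs-vsctl --if-exists get open . external_ids:ovn-openflow-probe-interval

  # ovs-vswitchd side: probe interval it applies toward configured OpenFlow
  # controllers, in milliseconds; 0 disables it.
  ovs-vsctl --columns=target,inactivity_probe list controller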
> In case ovn-controller is for some reason busy with some work and can't answer OVS within two minutes (its default probe interval is 60s [7]), OVS destroys the OF connection with this ERR log:
> ovs|18206|rconn|ERR|br-int<->unix#17574: no response to inactivity probe after 60 seconds, disconnecting
>
> So, when ovn-controller resumes after a big chunk of work, it sees that the OF connection is closed and triggers the release of chassisredirect port bindings.
> This leads to datapath downtime and unnecessary load on OVN_Southbound, ovn-controllers, etc.
>
> To avoid such behaviour, a user may consider not configuring ovn-openflow-probe-interval AND disabling probing from the OVS side with the following command:
> ovs-vsctl --inactivity-probe=0 set-controller br-int punix://var/run/openvswitch/br-int.mgmt
>
> For this setting to take effect, the OF connection must be re-established (for instance, by restarting ovn-controller).
>
> My proposal:
>
> If ovn-controller uses only a unix socket for the OF connection, it should be safe to disable OF probing from its code:
> 1. configure the controller for br-int to disable probing (ensure the controller for br-int, or whichever bridge is configured, is created with inactivity_probe = 0);
> 2. delete the ovn-openflow-probe-interval setting and note in the documentation that for OVN versions >= 20.09 it's recommended not to configure ovn-openflow-probe-interval. For versions < 20.09 the default interval is 5 seconds, so it's reasonable to set it to a higher value in the config for loaded hypervisors;

You mean 23.09, right?

> 3. set the openflow probe interval for the ovs features and pinctrl OF connections to 0 (disable probing);
>
> What do you guys think about this?
> Please correct me if I've misunderstood something and the described changes can't be accepted.

The proposal looks reasonable to me. There may be one concern.

Let's say the physical link is fine (the one vswitchd runs BFD over) but vswitchd is down on chassis 'A' for some reason. With probing disabled, ovn-controller will never come to know that vswitchd is down until it tries to communicate with it. If this chassis 'A' is acting as a gateway, external traffic from VMs on other chassis will be tunnelled to it, and since vswitchd is down the traffic may not go out if no datapath flows exist to output the traffic to the physical interface of the provider bridge. Is this acceptable? At least with probing enabled, ovn-controller will release the chassis-resident port and the next chassis in the priority order will claim it.

IMO, this limitation is more acceptable than the issue you reported. Let's see what others have to say about this.

>
> >>
> >> This node is a dedicated l3gateway which has the is-interconn=true setting (interconnection case) and has a quite big openflow table, around 2.7M flows.
> >>
> >> So, I've got some questions for OVN and OVS:
> >> 1. I couldn't find where the logic for handling an openflow connection drop lives in the ovn-controller code. Is it a bug that the claimed cr-ports are released after an openflow connection drop? I guess, yes. Can you please point me to the correct place in the code to start investigating?
> >> 2. I see almost the same number of openflow flows on each l3gateway with the interconnection role. Even l3gateways which have no allocated cr-port-bindings have 2M+ openflow flows. Is this normal operation, or is it a place for possible optimisation? I mean, it seems that excess flows are installed.
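A quick way to put numbers on question 2 on an affected gateway, purely as an illustration (it assumes the integration bridge is br-int and that ovn-sbctl can reach the southbound database):

  # total number of OpenFlow flows currently installed on the integration
  # bridge (see the flow_count field in the reply)
  ovs-ofctl dump-aggregate br-int

  # number of MAC_Binding rows in the southbound DB; each record's listing
  # starts with a "_uuid" line, so counting those counts the rows
  ovn-sbctl list mac_binding | grep -c '^_uuid'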
>
> There are a lot of mac_binding records, but the question is why these flows still persist even when no datapaths are allocated to the node.

Maybe the datapath to which the mac_binding records belong is in ovn-controller's "local_datapath" [1].

[1] - https://github.com/ovn-org/ovn/blob/main/controller/local_data.h#L35

>
> >> 3. How should I handle long OVS OF updates, which block the inactivity probe for the OF connection so that it then gets dropped? From ovn-controller it's possible to configure the inactivity probe; is it possible, and should I, configure it from the ovs-vswitchd side?
> >> 4. There is a warning about trying to release an unknown interface, which was released right before. Is this a normal situation or should it also be investigated?

I think it can be ignored. Maybe @Dumitru Ceara can correct me or comment here.

Thanks
Numan

> >>
>
> For points 2-4 I still haven't found the answer yet. If someone knows the answer, please share it.
>
> >>
> >> Thanks.
> >>
> >> Regards,
> >> Vladislav Odintsov
> >>
> >
> >
> > Regards,
> > Vladislav Odintsov
> >
>
> 1: https://github.com/ovn-org/ovn/blob/v23.03.0/controller/ofctrl.c#L421
> 2: https://github.com/ovn-org/ovn/blob/v23.03.0/controller/ovn-controller.c#L108
> 3: https://github.com/ovn-org/ovn/blob/v23.03.0/lib/features.c#L85
> 4: https://github.com/ovn-org/ovn/blob/v23.03.0/controller/pinctrl.c#L3479
> 5: https://patchwork.ozlabs.org/project/openvswitch/patch/[email protected]/
> 6: https://github.com/ovn-org/ovn/commit/b8af8549396e62d6523be18e104352e334825783
> 7: https://github.com/openvswitch/ovs/blob/v3.1.0/vswitchd/bridge.c#L3860
>
> Regards,
> Vladislav Odintsov
