On Tue, Aug 9, 2022 at 12:53 PM Mark Michelson <[email protected]> wrote: > > On 8/9/22 15:12, Ilya Maximets wrote: > > On 8/9/22 20:59, Mark Michelson wrote: > >> On 8/8/22 09:31, Ilya Maximets wrote: > >>> On 8/5/22 08:28, Han Zhou wrote: > >>>> On Thu, Jun 30, 2022 at 12:55 PM Mark Michelson <[email protected]> wrote: > >>>>> > >>>>> Hi OVS and OVN devs, > >>>>> > >>>>> The OVN team has considered the idea of moving inactivity probes (i.e. > >>>>> OVSDB echo requests/replies) into a background thread. > >>>>> > >>>>> OVN logical networks can be very large, meaning that ovn components such > >>>>> as ovn-northd and ovn-controller may take a while to process everything > >>>>> in an OVSDB database. On large clusters, we end up seeing the following > >>>>> loop occur: > >>>>> > >>>>> 1. The OVN component connects to the database. > >>>>> 2. The OVN component must compute the entire contents of database. > >>>>> 3. While the OVN component is executing its main loop, the inactivity > >>>>> probe interval expires. The OVN component is disconnected from database. > >>>>> 4. The OVN component finishes its computation. > >>>>> 5. Since the OVN component is disconnected from the database, it must > >>>>> reconnect. Go to step 1. > >>>>> > >>>>> This makes for an unstable and slow experience. Typically if OVN can get > >>>>> past the initial loop after connecting to the database, then incremental > >>>>> processing will allow for subsequent loops to execute much more quickly. > >>>>> However, the constant disconnect-reconnect makes OVN operate at its > >>>>> slowest at all times. > >>>>> > >>>>> The way we've dealt with this before is to try to optimize the > >>>>> performance of OVN components, while also advising that the inactivity > >>>>> probe gets set to a high value. The problem is that as the demand for > >>>>> larger and larger logical networks grows, the execution time of OVN is > >>>>> hard to bring down much more, but the inactivity probes have to keep > >>>>> getting higher and higher to avoid the described scenario. Once OVN > >>>>> reaches its "stable" state where incremental processing makes loops > >>>>> execute quickly, this high inactivity probe becomes detrimental. It > >>>>> means that if there is a legitimate disconnection, then we don't detect > >>>>> it very quickly. > >>>>> > >>>> Hi Mark, > >>>> > >>>> Sorry for the late reply. > >>>> For the slowness of ovn-northd/ovn-controller, our practice is to disable > >>>> the inactivity probe from DB server side. It seems harmless because if the > >>>> client network is recovered it would just reconnect, right? > >>>> The problem is more with the other direction: probe from client to server. > >>>> This probe is required because if a client (e.g. ovn-controller) doesn't > >>>> need to send any transactions to the server for a long time then it would > >>>> not detect a server crash sooner, thus would not trigger reconnecting to > >>>> another server in the cluster in time. > >>>> With the client -> server probe enabled, there is a scale problem with the > >>>> SB server, if it is connected with a huge number of clients and if the > >>>> probe interval is not high enough, because when it is busy serving the > >>>> clients it may fail responding the probes in time, causing some clients > >>>> reconnecting and re-transmitting data, a cascaded failure. > >>>> So for the background thread approach may appear to be helpful for the > >>>> server side. > >>>> However, there is already an alternative to the server side scale problem > >>>> (I haven't tried yet), the OVSDB relay, which can reduce the number of > >>>> clients per server to a very low number. In this case, a short probe should > >>>> not matter. > >>>> So based on the above thoughts, I am not sure if it is really necessary to > >>>> have the background probe handling. > >>>> > >>>>> As mentioned at the top of the email, a possible solution is to put the > >>>>> inactivity probes into a background thread. Is this in the spirit of the > >>>>> inactivity probe? From my point of view, the inactivity probe should > >>>>> fail only in a serious error condition, such as a network outage, or a > >>>>> program crash. If a program is "busy" it is still "active" and should > >>>>> therefore not be subject to inactivity probe failures. However, I want > >>>>> to get the opinions of the list on this. > >>>> > >>>> This is indeed a controversial point. If a program is "busy" but "active", > >>>> I agree it shouldn't be subject to probe failures. However, if the program > >>>> is not responsive at all, due to bugs, e.g. the main thread is in a dead > >>>> loop (although it is not very likely to happen in OVN components), should > >>>> the probe fail? I think the answer is yes in this case, considering the > >>>> case when a SB server node is not responsive at all because of bugs while > >>>> the background thread is still responding to probes, the client wouldn't > >>>> notice the problem and would not reconnect to a healthy node, which defies > >>>> the purpose of the probe. > >>> > >>> I think, there is one more thing we should keep in mind - plain old TCP > >>> keepalive functionality. Instead of implementing some special background > >>> threads, users can just disable all inactivity probes and LD_PRELOAD > >>> keepalive library with desired configuration that will be automatically > >>> applied to all connections and it will be kernel's responsibility to > >>> handle probes. IIUC, that should cover all the same cases as a background > >>> thread, and will have the same drawbacks of not detecting the process being > >>> stuck in an infinite loop. > >>> > >>> Best regards, Ilya Maximets. > >>> > >> > >> A long time ago (July 2020), Anton Ivanov provided a patch that enabled keepalives at the kernel level. This is the latest version of the patch I could find: https://mail.openvswitch.org/pipermail/ovs-dev/2020-June/371262.html > >> > >> I had some findings on that particular patch, and AFAIK, there was never a follow-up patch sent (although I may have missed it when searching). > >> > >> Would something like that be a viable alternative? > > > > The beauty of this is that you don't need to change the code! > > Just install libkeepalive in you favorite distribution and > > LD_PREVOAD the library while starting the application. > > See some examples here: > > https://tldp.org/HOWTO/html_single/TCP-Keepalive-HOWTO/#libkeepalive > > > > Best regards, Ilya Maximets. > > > Wow, so if I understand correctly, we could handle this entirely in > ovn-ctl and the systemd service files, and we'd get inactivity probes > that work independent of what the application is doing. That's pretty nifty. > > It doesn't address Han's concern about timing out if the code is in a > dead loop or is deadlocked. But that's the sort of thing a local > watchdog could take care of instead.
Hi Mark, this concern was about moving the probe handling to a background thread. With the current probe handling mechanism in ovsdb-server and ovn-controller/ovn-northd this is not a concern because it is handled by the main loop and would definitely reflect the dead-loop/lock situation. Thanks, Han _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
