On 8/9/22 20:59, Mark Michelson wrote:
> On 8/8/22 09:31, Ilya Maximets wrote:
>> On 8/5/22 08:28, Han Zhou wrote:
>>> On Thu, Jun 30, 2022 at 12:55 PM Mark Michelson <[email protected]> wrote:
>>>>
>>>> Hi OVS and OVN devs,
>>>>
>>>> The OVN team has considered the idea of moving inactivity probes (i.e.
>>>> OVSDB echo requests/replies) into a background thread.
>>>>
>>>> OVN logical networks can be very large, meaning that ovn components such
>>>> as ovn-northd and ovn-controller may take a while to process everything
>>>> in an OVSDB database. On large clusters, we end up seeing the following
>>>> loop occur:
>>>>
>>>> 1. The OVN component connects to the database.
>>>> 2. The OVN component must compute the entire contents of database.
>>>> 3. While the OVN component is executing its main loop, the inactivity
>>>> probe interval expires. The OVN component is disconnected from database.
>>>> 4. The OVN component finishes its computation.
>>>> 5. Since the OVN component is disconnected from the database, it must
>>>> reconnect. Go to step 1.
>>>>
>>>> This makes for an unstable and slow experience. Typically if OVN can get
>>>> past the initial loop after connecting to the database, then incremental
>>>> processing will allow for subsequent loops to execute much more quickly.
>>>> However, the constant disconnect-reconnect makes OVN operate at its
>>>> slowest at all times.
>>>>
>>>> The way we've dealt with this before is to try to optimize the
>>>> performance of OVN components, while also advising that the inactivity
>>>> probe gets set to a high value. The problem is that as the demand for
>>>> larger and larger logical networks grows, the execution time of OVN is
>>>> hard to bring down much more, but the inactivity probes have to keep
>>>> getting higher and higher to avoid the described scenario. Once OVN
>>>> reaches its "stable" state where incremental processing makes loops
>>>> execute quickly, this high inactivity probe becomes detrimental. It
>>>> means that if there is a legitimate disconnection, then we don't detect
>>>> it very quickly.
>>>>
>>> Hi Mark,
>>>
>>> Sorry for the late reply.
>>> For the slowness of ovn-northd/ovn-controller, our practice is to disable
>>> the inactivity probe from DB server side. It seems harmless because if the
>>> client network is recovered it would just reconnect, right?
>>> The problem is more with the other direction: probe from client to server.
>>> This probe is required because if a client (e.g. ovn-controller) doesn't
>>> need to send any transactions to the server for a long time then it would
>>> not detect a server crash sooner, thus would not trigger reconnecting to
>>> another server in the cluster in time.
>>> With the client -> server probe enabled, there is a scale problem with the
>>> SB server, if it is connected with a huge number of clients and if the
>>> probe interval is not high enough, because when it is busy serving the
>>> clients it may fail responding the probes in time, causing some clients
>>> reconnecting and re-transmitting data, a cascaded failure.
>>> So for the background thread approach may appear to be helpful for the
>>> server side.
>>> However, there is already an alternative to the server side scale problem
>>> (I haven't tried yet), the OVSDB relay, which can reduce the number of
>>> clients per server to a very low number. In this case, a short probe should
>>> not matter.
>>> So based on the above thoughts, I am not sure if it is really necessary to
>>> have the background probe handling.
>>>
>>>> As mentioned at the top of the email, a possible solution is to put the
>>>> inactivity probes into a background thread. Is this in the spirit of the
>>>> inactivity probe? From my point of view, the inactivity probe should
>>>> fail only in a serious error condition, such as a network outage, or a
>>>> program crash. If a program is "busy" it is still "active" and should
>>>> therefore not be subject to inactivity probe failures. However, I want
>>>> to get the opinions of the list on this.
>>>
>>> This is indeed a controversial point. If a program is "busy" but "active",
>>> I agree it shouldn't be subject to probe failures. However, if the program
>>> is not responsive at all, due to bugs, e.g. the main thread is in a dead
>>> loop (although it is not very likely to happen in OVN components), should
>>> the probe fail? I think the answer is yes in this case, considering the
>>> case when a SB server node is not responsive at all because of bugs while
>>> the background thread is still responding to probes, the client wouldn't
>>> notice the problem and would not reconnect to a healthy node, which defies
>>> the purpose of the probe.
>>
>> I think, there is one more thing we should keep in mind - plain old TCP
>> keepalive functionality. Instead of implementing some special background
>> threads, users can just disable all inactivity probes and LD_PRELOAD
>> keepalive library with desired configuration that will be automatically
>> applied to all connections and it will be kernel's responsibility to
>> handle probes. IIUC, that should cover all the same cases as a background
>> thread, and will have the same drawbacks of not detecting the process being
>> stuck in an infinite loop.
>>
>> Best regards, Ilya Maximets.
>>
>
> A long time ago (July 2020), Anton Ivanov provided a patch that enabled
> keepalives at the kernel level. This is the latest version of the patch I
> could find:
> https://mail.openvswitch.org/pipermail/ovs-dev/2020-June/371262.html
>
> I had some findings on that particular patch, and AFAIK, there was never a
> follow-up patch sent (although I may have missed it when searching).
>
> Would something like that be a viable alternative?

The beauty of this is that you don't need to change the code! Just install
libkeepalive in your favorite distribution and LD_PRELOAD the library while
starting the application. See some examples here:
https://tldp.org/HOWTO/html_single/TCP-Keepalive-HOWTO/#libkeepalive

Best regards, Ilya Maximets.
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
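For anyone who wants to see what kernel-level keepalive amounts to per socket, here is a minimal sketch in Python. It is only an illustration, not code from OVS/OVN or from libkeepalive: the helper names `enable_keepalive` and `worst_case_detection_seconds` are made up for this example, and the default values (180 s idle, 60 s interval, 20 probes) are arbitrary placeholders. The per-socket tuning options are platform specific, so the sketch only sets the ones the local `socket` module exposes.

```python
import socket


def enable_keepalive(sock, idle=180, interval=60, count=20):
    """Enable kernel TCP keepalive on an existing socket (hypothetical helper).

    idle:     seconds of silence before the first probe is sent
    interval: seconds between unanswered probes
    count:    unanswered probes before the kernel drops the connection
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # TCP_KEEPIDLE / TCP_KEEPINTVL / TCP_KEEPCNT are not available on every
    # platform, so set only those that this Python build exposes.
    for name, value in (
        ("TCP_KEEPIDLE", idle),
        ("TCP_KEEPINTVL", interval),
        ("TCP_KEEPCNT", count),
    ):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
    return sock


def worst_case_detection_seconds(idle, interval, count):
    """Approximate time for the kernel to declare a silent peer dead."""
    return idle + interval * count


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    print("SO_KEEPALIVE:", s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE))
    print("worst case (s):", worst_case_detection_seconds(180, 60, 20))
    s.close()
```

Since the probes are answered by the kernel, a process that is busy in a long main loop keeps its connection, and, as noted earlier in the thread, a process stuck in an infinite loop keeps it too.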
