Dear Open vSwitch developers,

After a recent update of our OpenStack Wallaby cloud, in which OVS was upgraded from 2.15.0 to 2.15.2, we observe new OVSDB behavior: frequent (every 10-20 min) leadership transfers in the raft cluster. Leadership transfer on compaction was introduced by https://patchwork.ozlabs.org/project/openvswitch/patch/[email protected]/#2682913. This causes a lot of errors in our cloud in the neutron <-> OVN interaction.
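For clarity, here is a hedged, illustrative sketch of the behavior that the patch above introduced, as we understand it: before compacting, a raft leader first transfers leadership so that the blocking snapshot write does not stall the cluster. The `Server` class and all method names below are ours for illustration only, not actual OVS internals.

```python
# Illustrative model (NOT OVS code) of "transfer leadership before
# compaction": a leader steps down first, then writes its snapshot.

class Server:
    """Toy stand-in for one ovsdb-server raft member."""

    def __init__(self, name, is_leader):
        self.name = name
        self.is_leader = is_leader
        self.events = []  # record of actions, in order

    def transfer_leadership(self):
        self.events.append("transfer-leadership")
        self.is_leader = False

    def compact(self):
        self.events.append("compact")


def compact_with_leadership_transfer(server):
    """Older servers just compacted; with the patch, a leader steps
    down first, which matches the frequent transfers we now observe."""
    if server.is_leader:
        server.transfer_leadership()
    server.compact()


leader = Server("ovnsb-1", is_leader=True)
compact_with_leadership_transfer(leader)
print(leader.events)  # ['transfer-leadership', 'compact']
```

So every automatic compaction on the leader now also means a leadership change, which is why frequent compaction became visible to us as cluster churn.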
As a first workaround, we had to schedule regular (every 10 min) manual compaction to avoid the leadership transfers, which are unnecessary for us. We then tried to find out why the OVN database triggers compaction so often, and we observed the following:

1) After a restart of all OVN SB DB instances in the raft cluster, there is no compaction for about 20-24 hours;

2) The first compaction starts after 24 hours, or earlier if the DB size doubles (after restart the DB size was 105 MB; compaction was triggered at a DB size of ~210 MB);

3) After this first compaction, subsequent compactions happen every 10-20 min, and it is unclear why. http://www.openvswitch.org/support/dist-docs/ovsdb-server.1.txt says: "A database is also compacted automatically when a transaction is logged if it is over 2 times as large as its previous compacted size (and at least 10 MB)". But given our usual activity (below), the SB DB should not trigger compaction every 10-20 min, because over 10 min the DB grows by only ~1 MB:

Wed 03 Aug 2022 07:50:14 AM EDT
# ls -l /var/lib/docker/volumes/ovn_sb_db/_data/ovnsb.db
-rw-r----- 1 root root 105226017 Aug 3 07:50 /var/lib/docker/volumes/ovn_sb_db/_data/ovnsb.db
# docker exec ovn_sb_db ovs-appctl -t /var/run/ovn/ovnsb_db.ctl memory/show
cells:2422190 monitors:8 raft-connections:4 raft-log:2 sessions:88
# docker exec ovn_sb_db ovsdb-tool show-log /var/lib/openvswitch/ovn-sb/ovnsb.db | grep -c record
6

Wed 03 Aug 2022 07:59:39 AM EDT
# ls -l /var/lib/docker/volumes/ovn_sb_db/_data/ovnsb.db
-rw-r----- 1 root root 106027676 Aug 3 07:59 /var/lib/docker/volumes/ovn_sb_db/_data/ovnsb.db
# docker exec ovn_sb_db ovs-appctl -t /var/run/ovn/ovnsb_db.ctl memory/show
cells:2422190 monitors:8 raft-connections:4 raft-log:1925 sessions:88
# docker exec ovn_sb_db ovsdb-tool show-log /var/lib/openvswitch/ovn-sb/ovnsb.db | grep -c record
3852

4) Investigating older logs, we realized that we already had frequent compaction with OVS 2.15.0, for a long time, judging by regular
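To make the mismatch concrete, here is a small sketch of the auto-compaction rule as the man page documents it, applied to the sizes above. The function name and the 10 MB floor encoding are ours; this is just the documented condition, not the actual ovsdb-server code.

```python
# Sketch of the documented rule from ovsdb-server(1): compact when the
# log is over 2x the previous compacted size AND at least 10 MB.
# should_compact() is an illustrative name, not an OVS function.

MIN_COMPACT_SIZE = 10 * 1024 * 1024  # the 10 MB floor from the man page


def should_compact(current_size: int, prev_compacted_size: int) -> bool:
    """True if the man-page condition for automatic compaction holds."""
    return (current_size >= MIN_COMPACT_SIZE
            and current_size > 2 * prev_compacted_size)


# Sizes observed in our cluster: 105 MB after compaction, ~1 MB growth
# over the next 10 minutes.
prev = 105_226_017          # ovnsb.db right after restart/compaction
after_10_min = 106_027_676  # ovnsb.db ~10 minutes later

print(should_compact(after_10_min, prev))  # False: ~1 MB growth is far from 2x
print(should_compact(2 * prev + 1, prev))  # True: doubling does trigger it
```

By this rule, the next compaction should not be due for roughly another 100 MB of growth, yet we see compactions (and leadership transfers) every 10-20 minutes.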
messages like "Unreasonably long 2939ms poll interval" appearing every 10-20 min. We just did not see any impact or errors from that compaction, unlike with the "transfer leadership" patch.

Could you please help us find out whether this is a bug in the compaction trigger or expected behaviour?

P.S. It would be good to have a choice in OVSDB of whether to use leadership transfer or not.

Thank you.

Oleksandr Mykhalskyi, System Engineer
Netcracker Technology
_______________________________________________ discuss mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
