On 6/10/21 1:21 PM, Dumitru Ceara wrote: > On 6/8/21 3:17 PM, Ilya Maximets wrote: >> The symptom of this issue is that OVS bridge looses its IP address on >> restart. >> >> Simple reproducer: >> 0. start ovsdb-server and ovs-vswitchd >> 1. ovs-vsctl add-br br0 >> 2. ifconfig br0 10.0.0.1 up >> 3. ovs-appctl -t ovs-vswitchd exit >> 4. start ovs-vswitchd back. >> >> After step #3 ovs-vswitchd is down, but br0 interface exists and >> has configured IP address. After step #4 there is no IP address >> on the port br0. >> >> What happened: >> 1. ovsdb-cs connects to the database via ovsdb-idl and requests >> database lock. >> --> get_schema for _Server database >> --> lock request >> >> 2. ovsdb-cs receives schema for the _Server database. And sends >> monitor request. >> <-- schema for _Server >> --> monitor_cond for _Server >> >> 3. ovsdb-cs receives lock reply. >> <-- locked >> At this point ovsdb-cs generates OVSDB_CS_EVENT_TYPE_LOCKED >> event and passes it to ovsdb-idl. ovsdb-idl increases change_seqno. >> >> 4. ovsdb_idl_has_ever_connected() is 'true' now, because change_seqno >> is not zero. >> >> 5. ovs-vswitchd decides that it has connection with database and >> all the initial data, therefore initiates configuration of bridges. >> bridge_run():ovsdb_idl_has_ever_connected() --> true >> >> 6. Since monitor request for the Open_vSwitch database is not even >> sent yet, the database is empty. This leads to removal of all the >> ports and all other resources. >> >> 7. When data finally received, ovs-vswitchd re-creates bridges and >> ports, but IP addresses can not be restored. >> >> While splitting out ovsdb-cs from ovsdb-idl one part of the logic >> was lost. Particularly, before the split, ovsdb-idl updated >> change_seqno only in MONITORING state. >> >> Restoring the logic by updating the change_seqno only if may send >> transaction, i.e. lock is ours and ovsdb-cs is in the MONITORING >> state. This matches with the main purpose of increasing change_seqno >> at this point, i.e. to force the client to re-try the transaction. >> With this change ovsdb_idl_has_ever_connected() remains 'false' >> until the first monitor reply with the actual data received. >> >> This issue was reported several times during the last couple of weeks. >> >> Reported-at: https://bugzilla.redhat.com/1968445 >> Reported-at: >> https://mail.openvswitch.org/pipermail/ovs-dev/2021-June/383512.html >> Reported-at: >> https://mail.openvswitch.org/pipermail/ovs-discuss/2021-June/051222.html >> Fixes: 1c337c43ac1c ("ovsdb-idl: Break into two layers.") >> Signed-off-by: Ilya Maximets <[email protected]> >> --- >> lib/ovsdb-idl.c | 12 +++++++++--- >> 1 file changed, 9 insertions(+), 3 deletions(-) >> >> diff --git a/lib/ovsdb-idl.c b/lib/ovsdb-idl.c >> index 1d385ca2f..2198c69c6 100644 >> --- a/lib/ovsdb-idl.c >> +++ b/lib/ovsdb-idl.c >> @@ -429,9 +429,15 @@ ovsdb_idl_run(struct ovsdb_idl *idl) >> break; >> >> case OVSDB_CS_EVENT_TYPE_LOCKED: >> - /* If the client couldn't run a transaction because it didn't >> have >> - * the lock, this will encourage it to try again. */ >> - idl->change_seqno++; >> + if (ovsdb_cs_may_send_transaction(idl->cs)) { >> + /* If the client couldn't run a transaction because it >> didn't >> + * have the lock, this will encourage it to try again. */ >> + idl->change_seqno++; >> + } else { >> + /* We're setting up a session, so don't signal that the >> + * database changed. Finalizing the session will increment >> + * change_seqno anyhow. */ >> + } >> break; >> >> case OVSDB_CS_EVENT_TYPE_UPDATE: >> > > Looks good to me; I also ran the OVN tests with this change applied to > OVS and all passed. > > Acked-by: Dumitru Ceara <[email protected]>
Thanks! Applied to master and branch-2.15. Best regards, Ilya Maximets. _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
