A simplified ofproto destruction flow looks like this:

  ofproto_destroy()

    ofproto_flush__()
      ofproto_class->flush(ofproto);
        udpif_flush()
          udpif_stop_threads(udpif, true)
            revalidator_purge()
              ...
              ovsrcu_postpone(ukey_delete__, ukey)
          udpif_start_threads()

    ofproto_class->destruct(p, del)
      xlate_txn_start();
      xlate_remove_ofproto(ofproto);
      xlate_txn_commit();
      ...
      recirc_free_ofproto()
      ...
      ovsrcu_barrier()
      close_dpif_backer(ofproto->backer, del);

The recirc_free_ofproto() function is used to check if we leaked any
recirculation IDs, but it doesn't really wait for everything to be
completed and reports false positives.  There are two races here in
particular:

1. Flush frees recirculation IDs in an RCU-postponed ukey_delete__().
There is one ovsrcu_synchronize() call between udpif_flush() and the
recirc_free_ofproto() inside xlate_txn_commit(), but synchronization
only waits for all threads to enter quiescent state, it doesn't wait
for the postponed callbacks to be completed.  So, when the
recirc_free_ofproto() is called, some IDs can still be not freed.

2. udpif_flush() starts the threads back at the end, so they can still
add new ukeys in the short window between the flush and the
destruction.  In this case some new IDs can be allocated after the
flush and the leak check will find them.

We can solve both races by moving the leak check down to the end of
the ofproto_class->destruct().  We already have the ovsrcu_barrier()
there that actually waits for all the RCU-postponed cleanups.  And the
close_dpif_backer() will take care of all the ukeys created after the
flush in a non-postponed fashion.

It's hard to pinpoint the exact commit to blame, but citing one that
definitely played a role here.

Fixes: e672ff9b4d22 ("ofproto-dpif: Restore metadata and registers on 
recirculation.")
Signed-off-by: Ilya Maximets <[email protected]>
---
 ofproto/ofproto-dpif.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index a02afe8ef..46c6c27bc 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -1959,8 +1959,6 @@ destruct(struct ofproto *ofproto_, bool del)
     }
     guarded_list_destroy(&ofproto->ams);
 
-    recirc_free_ofproto(ofproto, ofproto->up.name);
-
     mbridge_unref(ofproto->mbridge);
 
     netflow_unref(ofproto->netflow);
@@ -1984,6 +1982,7 @@ destruct(struct ofproto *ofproto_, bool del)
     /* Wait for all the meter destroy work to finish. */
     ovsrcu_barrier();
     close_dpif_backer(ofproto->backer, del);
+    recirc_free_ofproto(ofproto, ofproto->up.name);
 }
 
 static int
-- 
2.53.0

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to