The 05/27/2021 12:43, Ilya Maximets wrote:
> On 5/27/21 12:29 PM, Jianbo Liu wrote:
> > There is a race condition between purger and handler in dpif-netlink.
> > Handler may create new ukey and install it while executing 'ovs-appctl
> > revalidator/purge' command. However, before handler calls
> > transition_ukey() in handle_upcalls(), purger may get this ukey from
> > umap, then evict and delete it. This will trigger ovs_abort in
> > transition_ukey() for handler because it is trying to set state to
> > EVICTED or OPERATIONAL, but ukey is already in DELETED state.
> > To fix this issue, purger must not delete ukey in VISIBLE state.
>
> Hi. This is not a good thing to trigger abort(), but "purge" means
> "purge". And, AFAIU, most of ukeys are visible. This is a purely
> debug interface that was introduced to test functionality of OVS and
> should not be used in production environment. The fact that "purge"
> doesn't remove visible ukeys, in my opinion, just makes the appctl
> "revalidator/purge" useless. Can we replace abort() with rate-limited

But currently a ukey is also not removed if we can't take its lock.
Besides, new ukeys will be installed while purging, so we can't make
sure that no visible/operational ukeys exist at the moment this command
finishes. So, in order to resolve the race, a visible ukey should be
kept; it is just like a newly incoming ukey.

> error message for this scenario instead to avoid process termination?

Not sure if there would be issues with that, because the ukey is deleted
(maybe freed) while the handler still accesses it through the pointer.

>
> BTW, how did you catch this? Is it reproducible with system tests?

We found this issue when testing CT offload, doing a purge without
stopping the traffic.

Thanks!
Jianbo

>
> Best regards, Ilya Maximets.

--
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
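
For readers following the thread, the interleaving discussed above can be replayed with a small toy model. This is only a sketch under stated assumptions, not the OVS code and not the patch itself: the toy_* names, the reduced state list, and the "forward moves only" check are hypothetical simplifications, and locking is left out entirely. The toy transition returns false where, per the thread, transition_ukey() would abort.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified ukey lifecycle (assumption: a reduced list of states, ordered
 * so that later states compare greater than earlier ones). */
enum ukey_state {
    UKEY_CREATED,
    UKEY_VISIBLE,
    UKEY_OPERATIONAL,
    UKEY_EVICTED,
    UKEY_DELETED
};

struct toy_ukey {
    enum ukey_state state;
};

/* Hypothetical stand-in for transition_ukey(): only forward moves are
 * legal.  Returns false where the real code is described as aborting. */
static bool
toy_transition(struct toy_ukey *ukey, enum ukey_state dst)
{
    if (dst <= ukey->state) {
        return false;
    }
    ukey->state = dst;
    return true;
}

/* Purger step.  With 'skip_visible' set, a VISIBLE ukey is left alone,
 * which is the rule proposed in the thread. */
static void
toy_purge(struct toy_ukey *ukey, bool skip_visible)
{
    if (skip_visible && ukey->state == UKEY_VISIBLE) {
        return;
    }
    toy_transition(ukey, UKEY_EVICTED);
    toy_transition(ukey, UKEY_DELETED);
}

/* Replays the problematic ordering in a single thread: the handler
 * installs the ukey, the purger runs, then the handler finishes its
 * transition.  Returns true if that last transition succeeds. */
static bool
toy_replay_race(bool skip_visible)
{
    struct toy_ukey ukey = { UKEY_CREATED };

    toy_transition(&ukey, UKEY_VISIBLE);    /* Handler installs the ukey. */
    assert(ukey.state == UKEY_VISIBLE);
    toy_purge(&ukey, skip_visible);         /* Purger runs in between. */
    return toy_transition(&ukey, UKEY_OPERATIONAL);
}
```

In this model toy_replay_race(false) fails at the handler's last step because the ukey is already DELETED, while toy_replay_race(true) succeeds because the purger skipped the VISIBLE ukey. The model says nothing about the use-after-free concern raised above, since nothing is freed here.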
