Hi,

to improve the MAC binding aging mechanism we need a way to ensure that rows which are still in use are preserved. This doesn't happen with the current implementation.

I propose the following solution, which should solve the issue; any questions or comments are welcome. If nothing major blocks this approach, I would start implementing it so that it can be available in 23.09.

For the approach itself: add a "mac_cache_use" action to the "lr_in_learn_neighbor" table (only to the flow that continues on a known MAC binding), changing

  match=(REGBIT_LOOKUP_NEIGHBOR_RESULT == 1 || REGBIT_LOOKUP_NEIGHBOR_IP_RESULT == 0), action=(next;)

to

  match=(REGBIT_LOOKUP_NEIGHBOR_RESULT == 1 || REGBIT_LOOKUP_NEIGHBOR_IP_RESULT == 0), action=(mac_cache_use; next;)

The "mac_cache_use" action would translate to a resubmit into a separate table with one flow per MAC binding, as follows:

  match=(ip.src=<MB_IP>, eth.src=<MB_MAC>, datapath=<MB_Datapath>), action=(drop;)

This should bump the flow statistics for the correct MAC binding every time it is used.

In ovn-controller we could periodically dump the flows from this table. The period would be set to MIN(mac_binding_age_threshold/2) across all local datapaths. The dump would happen in a separate thread with its own rconn to prevent backlogging issues. The thread would receive mapped data from an I-P node that keeps track of the mapping datapath -> cookies -> MAC bindings. This allows us to avoid constant lookups, but at the cost of keeping track of all local MAC bindings. To save some computation time, this I-P node could be relevant only for datapaths that actually have the threshold set.

If the "idle_age" of a particular flow is smaller than the datapath's "mac_binding_age_threshold", the MAC binding is still in use. To prevent a lot of updates when the traffic is still relevant on multiple controllers, we would check whether the row's timestamp is older than the "dump period"; if it is not, we don't have to update it, because another controller already did. Also, to "desync" the controllers, a random delay would be added to the "dump period".

All of this would be applicable to FDB aging as well. Does that sound reasonable?
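
To make the dump and refresh logic described above easier to discuss, here is a rough Python sketch of the decision only. It is not ovn-controller code, and nothing in it exists in OVN today: FlowStat, MacBinding, dump_period, bindings_to_refresh and the jitter fraction are all hypothetical names and values picked for illustration. It also assumes that idle_age, the threshold and the timestamp are all expressed in seconds.

```python
import random
from dataclasses import dataclass


@dataclass
class FlowStat:
    """One dumped flow from the per-MAC-binding table (hypothetical shape)."""
    cookie: int    # identifies the MAC binding the flow was generated for
    idle_age: int  # seconds since the flow last matched a packet


@dataclass
class MacBinding:
    """MAC binding as tracked by the I-P node (hypothetical shape)."""
    datapath: str
    ip: str
    mac: str
    timestamp: int  # last refresh time, assumed to be in seconds


def dump_period(thresholds, jitter=0.1, rng=random):
    """MIN(mac_binding_age_threshold / 2) over datapaths with aging enabled.

    A random delay of up to `jitter` times the base period is added to
    "desync" the controllers. The 10% default is an assumption.
    """
    enabled = [t for t in thresholds.values() if t > 0]
    if not enabled:
        return None  # no datapath has the threshold set, nothing to dump
    base = min(enabled) / 2
    return base + rng.uniform(0, base * jitter)


def bindings_to_refresh(flows, by_cookie, thresholds, now, period):
    """Return the MAC bindings whose timestamp should be bumped."""
    refresh = []
    for flow in flows:
        mb = by_cookie.get(flow.cookie)
        if mb is None:
            continue  # flow does not belong to a tracked MAC binding
        threshold = thresholds.get(mb.datapath, 0)
        if threshold <= 0:
            continue  # aging is not enabled on this datapath
        still_in_use = flow.idle_age < threshold
        # Skip rows refreshed within the last dump period: another
        # controller already updated the timestamp.
        stale = now - mb.timestamp > period
        if still_in_use and stale:
            refresh.append(mb)
    return refresh
```

The dump thread would call dump_period() once per cycle to pick its next wake-up time and pass the dumped flows, together with the cookie mapping from the I-P node, to bindings_to_refresh().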

Please let me know if you have any comments/suggestions.

Thanks,
Ales

--
Ales Musil
Senior Software Engineer - OVN Core
Red Hat EMEA <https://www.redhat.com>
IM: amusil
