On 15/09/2021 18:58, Mark Michelson wrote: > On 9/14/21 1:27 PM, Mark Gray wrote: >> On 13/09/2021 21:41, Mark Michelson wrote: >>> Hi Mark, >>> >>> Of all the patches in this series, this is the one I am least sure about. >> Thanks for looking over these. >> >>> >>> It seems simple enough, but it's not clear to me >>> >>> 1) Why would the number of nat entries on an ovn_datapath differ from >>> what's in the database? Since this series doesn't introduce any "real" >>> incremental processing, we should be recreating the ovn_datapath on >>> every iteration. And on each iteration, we should have a fresh view of >>> the database, so why would the values differ? >>> 2) Why is this only an issue in destroy_nat_entries()? There are several >>> other places in the code where we iterate over the number of nat entries >>> using od->nbr->n_nat. Why are those not affected? >>> 3) How does this patch address that issue? The patch just sets the >>> ovn_datapath's n_nat_entries to be the same as the database's count, >>> which according to the commit message, is not always accurate. >>> >>> Can you provide some clarity here? Maybe tell of a situation where >>> od->nbr->n_nat is incorrect? >> >> You are absolutely correct as the details in the commit message are not >> sufficient to understand. In fact, I had a hard time remembering the >> reason myself! >> >> IIRC, on each iteration, we need to call en_runtime_run(). This needs to >> initialize the runtime data .. but .. it also needs to destroy the >> runtime data from the previous run. This may not be the best way to do >> it, but the I-P framework doesn't have a convenient way to clear data >> between runs. >> >> Consider an example in which we have completed iteration x, and started >> iteration x+1. If we add a NAT entry between iteration x and iteration >> x+1, od->nbr->n_nat will contain the updated value of n_nat as the >> northbound record will have been updated. However, as we have not yet >> (re)initialized the nat entries in od, od->n_nat_entries will be equal >> to the previous value. These two values could then be different which >> makes this loop invalid. Does this make sense? If so, shall I update the >> commit message? How would you like me to proceed? > > OK, this makes a lot more sense now. It makes sense that the issue > specifically would be seen when destroying datapaths and not at any > other point. > > Here is my suggestion for how to proceed: > 1) Update the commit message to give an explanation similar to what > you've given here.
I posted a new series and added the comment. > 2) Correct the finding that I just found right now and that I mention > down below. I forgot to do this, I will fix in v2. Also, I am surprised I didn't see this as I though I compiled with --enable-Werror. > >> >>> >>> On 9/3/21 8:21 AM, Mark Gray wrote: >>>> destroy_nat_entries() iterates over nat_entries using the >>>> number of NAT entries from the NB database. When using I-P, >>>> this behaviour is incorrect as the number of NAT entries in >>>> the NBDB may differ from that stored in 'struct ovn_datapath' >>>> causing unexpected behaviour. >>>> >>>> Signed-off-by: Mark Gray <mark.d.g...@redhat.com> >>>> --- >>>> northd/northd.c | 5 ++++- >>>> 1 file changed, 4 insertions(+), 1 deletion(-) >>>> >>>> diff --git a/northd/northd.c b/northd/northd.c >>>> index 724baa3994d5..829c4479f14b 100644 >>>> --- a/northd/northd.c >>>> +++ b/northd/northd.c >>>> @@ -605,6 +605,8 @@ struct ovn_datapath { >>>> >>>> /* NAT entries configured on the router. */ >>>> struct ovn_nat *nat_entries; >>>> + size_t n_nat_entries; >>>> + > > n_nat_entries is declared and never used in this function. > >>>> bool has_distributed_nat; >>>> >>>> /* Set of nat external ips on the router. */ >>>> @@ -789,6 +791,7 @@ init_nat_entries(struct ovn_datapath *od) >>>> od->has_distributed_nat = true; >>>> } >>>> } >>>> + od->n_nat_entries = od->nbr->n_nat; >>>> } >>>> >>>> static void >>>> @@ -802,7 +805,7 @@ destroy_nat_entries(struct ovn_datapath *od) >>>> destroy_lport_addresses(&od->dnat_force_snat_addrs); >>>> destroy_lport_addresses(&od->lb_force_snat_addrs); >>>> >>>> - for (size_t i = 0; i < od->nbr->n_nat; i++) { >>>> + for (size_t i = 0; i < od->n_nat_entries; i++) { >>>> destroy_lport_addresses(&od->nat_entries[i].ext_addrs); >>>> } >>>> } >>>> >>> >> > _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev