On 15/09/2021 18:58, Mark Michelson wrote:
> On 9/14/21 1:27 PM, Mark Gray wrote:
>> On 13/09/2021 21:41, Mark Michelson wrote:
>>> Hi Mark,
>>>
>>> Of all the patches in this series, this is the one I am least sure about.
>> Thanks for looking over these.
>>
>>>
>>> It seems simple enough, but it's not clear to me
>>>
>>> 1) Why would the number of nat entries on an ovn_datapath differ from
>>> what's in the database? Since this series doesn't introduce any "real"
>>> incremental processing, we should be recreating the ovn_datapath on
>>> every iteration. And on each iteration, we should have a fresh view of
>>> the database, so why would the values differ?
>>> 2) Why is this only an issue in destroy_nat_entries()? There are several
>>> other places in the code where we iterate over the number of nat entries
>>> using od->nbr->n_nat. Why are those not affected?
>>> 3) How does this patch address that issue? The patch just sets the
>>> ovn_datapath's n_nat_entries to be the same as the database's count,
>>> which according to the commit message, is not always accurate.
>>>
>>> Can you provide some clarity here? Maybe tell of a situation where
>>> od->nbr->n_nat is incorrect?
>>
>> You are absolutely correct as the details in the commit message are not
>> sufficient to understand. In fact, I had a hard time remembering the
>> reason myself!
>>
>> IIRC, on each iteration, we need to call en_runtime_run(). This needs to
>> initialize the runtime data .. but .. it also needs to destroy the
>> runtime data from the previous run. This may not be the best way to do
>> it, but the I-P framework doesn't have a convenient way to clear data
>> between runs.
>>
>> Consider an example in which we have completed iteration x, and started
>> iteration x+1. If we add a NAT entry between iteration x and iteration
>> x+1, od->nbr->n_nat will contain the updated value of n_nat as the
>> northbound record will have been updated. However, as we have not yet
>> (re)initialized the nat entries in od, od->n_nat_entries will be equal
>> to the previous value. These two values could then be different which
>> makes this loop invalid. Does this make sense? If so, shall I update the
>> commit message? How would you like me to proceed?
> 
> OK, this makes a lot more sense now. It makes sense that the issue 
> specifically would be seen when destroying datapaths and not at any 
> other point.
> 
> Here is my suggestion for how to proceed:
> 1) Update the commit message to give an explanation similar to what 
> you've given here.

I posted a new series and added the comment.
> 2) Correct the finding that I just found right now and that I mention 
> down below.

I forgot to do this, I will fix in v2. Also, I am surprised I didn't see
this as I though I compiled with --enable-Werror.
> 
>>
>>>
>>> On 9/3/21 8:21 AM, Mark Gray wrote:
>>>> destroy_nat_entries() iterates over nat_entries using the
>>>> number of NAT entries from the NB database. When using I-P,
>>>> this behaviour is incorrect as the number of NAT entries in
>>>> the NBDB may differ from that stored in 'struct ovn_datapath'
>>>> causing unexpected behaviour.
>>>>
>>>> Signed-off-by: Mark Gray <mark.d.g...@redhat.com>
>>>> ---
>>>>    northd/northd.c | 5 ++++-
>>>>    1 file changed, 4 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/northd/northd.c b/northd/northd.c
>>>> index 724baa3994d5..829c4479f14b 100644
>>>> --- a/northd/northd.c
>>>> +++ b/northd/northd.c
>>>> @@ -605,6 +605,8 @@ struct ovn_datapath {
>>>>    
>>>>        /* NAT entries configured on the router. */
>>>>        struct ovn_nat *nat_entries;
>>>> +    size_t n_nat_entries;
>>>> +
> 
> n_nat_entries is declared and never used in this function.
> 
>>>>        bool has_distributed_nat;
>>>>    
>>>>        /* Set of nat external ips on the router. */
>>>> @@ -789,6 +791,7 @@ init_nat_entries(struct ovn_datapath *od)
>>>>                od->has_distributed_nat = true;
>>>>            }
>>>>        }
>>>> +    od->n_nat_entries = od->nbr->n_nat;
>>>>    }
>>>>    
>>>>    static void
>>>> @@ -802,7 +805,7 @@ destroy_nat_entries(struct ovn_datapath *od)
>>>>        destroy_lport_addresses(&od->dnat_force_snat_addrs);
>>>>        destroy_lport_addresses(&od->lb_force_snat_addrs);
>>>>    
>>>> -    for (size_t i = 0; i < od->nbr->n_nat; i++) {
>>>> +    for (size_t i = 0; i < od->n_nat_entries; i++) {
>>>>            destroy_lport_addresses(&od->nat_entries[i].ext_addrs);
>>>>        }
>>>>    }
>>>>
>>>
>>
> 

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to