Paolo Valerio <[email protected]> writes:

> Ilya Maximets <[email protected]> writes:
>
>> On 5/4/23 19:21, Paolo Valerio wrote:
>>> Ilya Maximets <[email protected]> writes:
>>> 
>>>> On 4/19/23 20:40, Paolo Valerio wrote:
>>>>> During the creation of a new connection, there's a chance both key and
>>>>> rev_key end up having the same hash. This is more common in the case
>>>>> of all-zero snat with no collisions. In that case, once the
>>>>> connection is expired, but not cleaned up, if a new packet with the
>>>>> same 5-tuple is received, an assertion failure gets triggered in
>>>>> conn_update_state() because of a previous failure of retrieving a
>>>>> CT_CONN_TYPE_DEFAULT connection.
>>>>>
>>>>> Fix it by releasing the nat_conn during the connection creation in the
>>>>> case of same hash for both key and rev_key.
>>>>
>>>> This sounds a bit odd.  Shouldn't we treat hash collision as a normal case?
>>>>
>>>> Looking at the code, I'm assuming that the issue comes from the following
>>>> part in process_one():
>>>>
>>>>     if (OVS_LIKELY(conn)) {
>>>>         if (conn->conn_type == CT_CONN_TYPE_UN_NAT) {
>>>>             ...
>>>>             conn_key_lookup(ct, &ctx->key, hash, now, &conn, &ctx->reply);
>>>>
>>>> And here we get the same connection again, because the default one is 
>>>> already
>>>> expired.  Is that correct?
>>>>
>>>> If so, maybe we should add an extra condition to conn_key_lookup() to
>>>> only look for DEFAULT connections instead, just for this case?  Since
>>>> we really don't want to get the UN_NAT one here.
>>>>
>>> 
>>> Hello Ilya,
>>> 
>>> It's a fair point.
>>> I initially thought about the approach you're suggesting, but I had some
>>> concerns about it that I'll try to summarize below.
>>> 
>>> For sure it would fix the issue (it could require the first patch to be
>>> applied as well for the branches with rcu exp lists).
>>> 
>>> Based on the current logic, new packets matching that expired connection
>>> but not evicted will be marked as +inv and further packets will be
>>> marked so for the whole sweep interval unless an exception like this get
>>> added:
>>> 
>>> uint32_t hash = conn_key_hash(&conn->rev_key, ct->hash_basis);
>>> /* the last flag indicates CT_CONN_TYPE_DEFAULT only */
>>> conn_key_lookup_(ct, &ctx->key, hash, now, &conn, &ctx->reply, true);
>>> /* special case where there's hash collision */
>>> if (!conn && ctx->hash != hash) {
>>>     pkt->md.ct_state |= CS_INVALID;
>>>     write_ct_md(pkt, zone, NULL, NULL, NULL);
>>>     ...
>>>     return;
>>> }
>>> 
>>> This would further require that subsequent lookup in the create_new_conn
>>> path are restricted to CT_CONN_TYPE_DEFAULT, e.g.:
>>> 
>>> uint32_t hash = conn_key_hash(&ctx->key, ct->hash_basis);
>>> /* Only check for CT_CONN_TYPE_DEFAULT */
>>> if (!conn_key_lookup_(ct, &ctx->key, hash, now, NULL, NULL, true)) {
>>>     conn = conn_not_found(ct, pkt, ctx, commit, now, nat_action_info,
>>>                           helper, alg_exp, ct_alg_ctl, tp_id);
>>> }
>>> 
>>> otherwise we could incur in a false positive which prevent to create a
>>> new connection.
>>
>> I'm not really sure if what described above is more correct way of doing
>> things or not...  Aaron, do you have opinion on this?
>>
>> Another thought: Can we expire the CT_CONN_TYPE_UN_NAT connection the
>> moment DEFAULT counterpart of it expires?  Or that will that be against
>> some logic / not possible to do?
>>
>
> As far as I can tell, this could not be straightforward as simply
> marking it as expired should not be reliable (e.g. doing it from the
> sweeper), and I guess that managing the expiration time field for the
> nat_conn as well would require updating the nat_conn every time the
> default one gets updated, probably making it a bit unpractical.
>
> Another approach would be removing the nat_conn [1] altogether.
> The problem in this case is backporting. Some adjustments that would add
> to the patch might be needed for older branches.
>
> [1]
> https://patchwork.ozlabs.org/project/openvswitch/patch/[email protected]/

I think that work was interesting, and maybe the best way to go
forward.  Backports would become difficult, though - agreed.

>>
>>> 
>>>> Best regards, Ilya Maximets.
>>>>
>>>>>
>>>>> Reported-by: Michael Plato <[email protected]>
>>>>> Fixes: 61e48c2d1db2 ("conntrack: Handle SNAT with all-zero IP address.")
>>>>> Signed-off-by: Paolo Valerio <[email protected]>
>>>>> ---
>>>>> In this thread [0] there are some more details. A similar
>>>>> approach here could be to avoid to add the nat_conn to the cmap and
>>>>> letting the sweeper release the memory for nat_conn once the whole
>>>>> connection gets freed.
>>>>> That approach could still be ok, but the drawback is that it could
>>>>> require a different patch for older branches that don't include
>>>>> 3d9c1b855a5f ("conntrack: Replace timeout based expiration lists with
>>>>> rculists."). It still worth to be considered.
>>>>>
>>>>> [0] 
>>>>> https://mail.openvswitch.org/pipermail/ovs-discuss/2023-April/052339.html
>>>>> ---
>>>>>  lib/conntrack.c |   21 +++++++++++++--------
>>>>>  1 file changed, 13 insertions(+), 8 deletions(-)
>>>>>
>>>>> diff --git a/lib/conntrack.c b/lib/conntrack.c
>>>>> index 7e1fc4b1f..d2ee127d9 100644
>>>>> --- a/lib/conntrack.c
>>>>> +++ b/lib/conntrack.c
>>>>> @@ -1007,14 +1007,19 @@ conn_not_found(struct conntrack *ct, struct 
>>>>> dp_packet *pkt,
>>>>>              }
>>>>>  
>>>>>              nat_packet(pkt, nc, false, ctx->icmp_related);
>>>>> -            memcpy(&nat_conn->key, &nc->rev_key, sizeof nat_conn->key);
>>>>> -            memcpy(&nat_conn->rev_key, &nc->key, sizeof 
>>>>> nat_conn->rev_key);
>>>>> -            nat_conn->conn_type = CT_CONN_TYPE_UN_NAT;
>>>>> -            nat_conn->nat_action = 0;
>>>>> -            nat_conn->alg = NULL;
>>>>> -            nat_conn->nat_conn = NULL;
>>>>> -            uint32_t nat_hash = conn_key_hash(&nat_conn->key, 
>>>>> ct->hash_basis);
>>>>> -            cmap_insert(&ct->conns, &nat_conn->cm_node, nat_hash);
>>>>> +            uint32_t nat_hash = conn_key_hash(&nc->rev_key, 
>>>>> ct->hash_basis);
>>>>> +            if (nat_hash != ctx->hash) {
>>>>> +                memcpy(&nat_conn->key, &nc->rev_key, sizeof 
>>>>> nat_conn->key);
>>>>> +                memcpy(&nat_conn->rev_key, &nc->key, sizeof 
>>>>> nat_conn->rev_key);
>>>>> +                nat_conn->conn_type = CT_CONN_TYPE_UN_NAT;
>>>>> +                nat_conn->nat_action = 0;
>>>>> +                nat_conn->alg = NULL;
>>>>> +                nat_conn->nat_conn = NULL;
>>>>> +                cmap_insert(&ct->conns, &nat_conn->cm_node, nat_hash);
>>>>> +            } else {
>>>>> +                free(nat_conn);
>>>>> +                nat_conn = NULL;
>>>>> +            }
>>>>>          }
>>>>>  
>>>>>          nc->nat_conn = nat_conn;
>>>>>
>>> 
>>> _______________________________________________
>>> dev mailing list
>>> [email protected]
>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>> 

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to