Ilya Maximets <[email protected]> writes:

> On 4/19/23 20:40, Paolo Valerio wrote:
>> During the creation of a new connection, there's a chance both key and
>> rev_key end up having the same hash. This is more common in the case
>> of all-zero snat with no collisions. In that case, once the
>> connection is expired, but not cleaned up, if a new packet with the
>> same 5-tuple is received, an assertion failure gets triggered in
>> conn_update_state() because of a previous failure of retrieving a
>> CT_CONN_TYPE_DEFAULT connection.
>> 
>> Fix it by releasing the nat_conn during the connection creation in the
>> case of same hash for both key and rev_key.
>
> This sounds a bit odd.  Shouldn't we treat hash collision as a normal case?
>
> Looking at the code, I'm assuming that the issue comes from the following
> part in process_one():
>
>     if (OVS_LIKELY(conn)) {
>         if (conn->conn_type == CT_CONN_TYPE_UN_NAT) {
>             ...
>             conn_key_lookup(ct, &ctx->key, hash, now, &conn, &ctx->reply);
>
> And here we get the same connection again, because the default one is already
> expired.  Is that correct?
>
> If so, maybe we should add an extra condition to conn_key_lookup() to
> only look for DEFAULT connections instead, just for this case?  Since
> we really don't want to get the UN_NAT one here.
>

Hello Ilya,

It's a fair point.
I initially thought about the approach you're suggesting, but I had some
concerns about it that I'll try to summarize below.

For sure it would fix the issue (it could require the first patch to be
applied as well for the branches with rcu exp lists).

Based on the current logic, new packets matching that expired connection
but not evicted will be marked as +inv and further packets will be
marked so for the whole sweep interval unless an exception like this get
added:

uint32_t hash = conn_key_hash(&conn->rev_key, ct->hash_basis);
/* the last flag indicates CT_CONN_TYPE_DEFAULT only */
conn_key_lookup_(ct, &ctx->key, hash, now, &conn, &ctx->reply, true);
/* special case where there's hash collision */
if (!conn && ctx->hash != hash) {
    pkt->md.ct_state |= CS_INVALID;
    write_ct_md(pkt, zone, NULL, NULL, NULL);
    ...
    return;
}

This would further require that subsequent lookup in the create_new_conn
path are restricted to CT_CONN_TYPE_DEFAULT, e.g.:

uint32_t hash = conn_key_hash(&ctx->key, ct->hash_basis);
/* Only check for CT_CONN_TYPE_DEFAULT */
if (!conn_key_lookup_(ct, &ctx->key, hash, now, NULL, NULL, true)) {
    conn = conn_not_found(ct, pkt, ctx, commit, now, nat_action_info,
                          helper, alg_exp, ct_alg_ctl, tp_id);
}

otherwise we could incur in a false positive which prevent to create a
new connection.

> Best regards, Ilya Maximets.
>
>> 
>> Reported-by: Michael Plato <[email protected]>
>> Fixes: 61e48c2d1db2 ("conntrack: Handle SNAT with all-zero IP address.")
>> Signed-off-by: Paolo Valerio <[email protected]>
>> ---
>> In this thread [0] there are some more details. A similar
>> approach here could be to avoid to add the nat_conn to the cmap and
>> letting the sweeper release the memory for nat_conn once the whole
>> connection gets freed.
>> That approach could still be ok, but the drawback is that it could
>> require a different patch for older branches that don't include
>> 3d9c1b855a5f ("conntrack: Replace timeout based expiration lists with
>> rculists."). It still worth to be considered.
>> 
>> [0] https://mail.openvswitch.org/pipermail/ovs-discuss/2023-April/052339.html
>> ---
>>  lib/conntrack.c |   21 +++++++++++++--------
>>  1 file changed, 13 insertions(+), 8 deletions(-)
>> 
>> diff --git a/lib/conntrack.c b/lib/conntrack.c
>> index 7e1fc4b1f..d2ee127d9 100644
>> --- a/lib/conntrack.c
>> +++ b/lib/conntrack.c
>> @@ -1007,14 +1007,19 @@ conn_not_found(struct conntrack *ct, struct 
>> dp_packet *pkt,
>>              }
>>  
>>              nat_packet(pkt, nc, false, ctx->icmp_related);
>> -            memcpy(&nat_conn->key, &nc->rev_key, sizeof nat_conn->key);
>> -            memcpy(&nat_conn->rev_key, &nc->key, sizeof nat_conn->rev_key);
>> -            nat_conn->conn_type = CT_CONN_TYPE_UN_NAT;
>> -            nat_conn->nat_action = 0;
>> -            nat_conn->alg = NULL;
>> -            nat_conn->nat_conn = NULL;
>> -            uint32_t nat_hash = conn_key_hash(&nat_conn->key, 
>> ct->hash_basis);
>> -            cmap_insert(&ct->conns, &nat_conn->cm_node, nat_hash);
>> +            uint32_t nat_hash = conn_key_hash(&nc->rev_key, ct->hash_basis);
>> +            if (nat_hash != ctx->hash) {
>> +                memcpy(&nat_conn->key, &nc->rev_key, sizeof nat_conn->key);
>> +                memcpy(&nat_conn->rev_key, &nc->key, sizeof 
>> nat_conn->rev_key);
>> +                nat_conn->conn_type = CT_CONN_TYPE_UN_NAT;
>> +                nat_conn->nat_action = 0;
>> +                nat_conn->alg = NULL;
>> +                nat_conn->nat_conn = NULL;
>> +                cmap_insert(&ct->conns, &nat_conn->cm_node, nat_hash);
>> +            } else {
>> +                free(nat_conn);
>> +                nat_conn = NULL;
>> +            }
>>          }
>>  
>>          nc->nat_conn = nat_conn;
>> 

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to