Paolo Valerio <[email protected]> writes: > Ilya Maximets <[email protected]> writes: > >> On 5/4/23 19:21, Paolo Valerio wrote: >>> Ilya Maximets <[email protected]> writes: >>> >>>> On 4/19/23 20:40, Paolo Valerio wrote: >>>>> During the creation of a new connection, there's a chance both key and >>>>> rev_key end up having the same hash. This is more common in the case >>>>> of all-zero snat with no collisions. In that case, once the >>>>> connection is expired, but not cleaned up, if a new packet with the >>>>> same 5-tuple is received, an assertion failure gets triggered in >>>>> conn_update_state() because of a previous failure of retrieving a >>>>> CT_CONN_TYPE_DEFAULT connection. >>>>> >>>>> Fix it by releasing the nat_conn during the connection creation in the >>>>> case of same hash for both key and rev_key. >>>> >>>> This sounds a bit odd. Shouldn't we treat hash collision as a normal case? >>>> >>>> Looking at the code, I'm assuming that the issue comes from the following >>>> part in process_one(): >>>> >>>> if (OVS_LIKELY(conn)) { >>>> if (conn->conn_type == CT_CONN_TYPE_UN_NAT) { >>>> ... >>>> conn_key_lookup(ct, &ctx->key, hash, now, &conn, &ctx->reply); >>>> >>>> And here we get the same connection again, because the default one is >>>> already >>>> expired. Is that correct? >>>> >>>> If so, maybe we should add an extra condition to conn_key_lookup() to >>>> only look for DEFAULT connections instead, just for this case? Since >>>> we really don't want to get the UN_NAT one here. >>>> >>> >>> Hello Ilya, >>> >>> It's a fair point. >>> I initially thought about the approach you're suggesting, but I had some >>> concerns about it that I'll try to summarize below. >>> >>> For sure it would fix the issue (it could require the first patch to be >>> applied as well for the branches with rcu exp lists). >>> >>> Based on the current logic, new packets matching that expired connection >>> but not evicted will be marked as +inv and further packets will be >>> marked so for the whole sweep interval unless an exception like this get >>> added: >>> >>> uint32_t hash = conn_key_hash(&conn->rev_key, ct->hash_basis); >>> /* the last flag indicates CT_CONN_TYPE_DEFAULT only */ >>> conn_key_lookup_(ct, &ctx->key, hash, now, &conn, &ctx->reply, true); >>> /* special case where there's hash collision */ >>> if (!conn && ctx->hash != hash) { >>> pkt->md.ct_state |= CS_INVALID; >>> write_ct_md(pkt, zone, NULL, NULL, NULL); >>> ... >>> return; >>> } >>> >>> This would further require that subsequent lookup in the create_new_conn >>> path are restricted to CT_CONN_TYPE_DEFAULT, e.g.: >>> >>> uint32_t hash = conn_key_hash(&ctx->key, ct->hash_basis); >>> /* Only check for CT_CONN_TYPE_DEFAULT */ >>> if (!conn_key_lookup_(ct, &ctx->key, hash, now, NULL, NULL, true)) { >>> conn = conn_not_found(ct, pkt, ctx, commit, now, nat_action_info, >>> helper, alg_exp, ct_alg_ctl, tp_id); >>> } >>> >>> otherwise we could incur in a false positive which prevent to create a >>> new connection. >> >> I'm not really sure if what described above is more correct way of doing >> things or not... Aaron, do you have opinion on this? >> >> Another thought: Can we expire the CT_CONN_TYPE_UN_NAT connection the >> moment DEFAULT counterpart of it expires? Or that will that be against >> some logic / not possible to do? >> > > As far as I can tell, this could not be straightforward as simply > marking it as expired should not be reliable (e.g. doing it from the > sweeper), and I guess that managing the expiration time field for the > nat_conn as well would require updating the nat_conn every time the > default one gets updated, probably making it a bit unpractical. > > Another approach would be removing the nat_conn [1] altogether. > The problem in this case is backporting. Some adjustments that would add > to the patch might be needed for older branches. > > [1] > https://patchwork.ozlabs.org/project/openvswitch/patch/[email protected]/
I think that work was interesting, and maybe the best way to go forward. Backports would become difficult, though - agreed. >> >>> >>>> Best regards, Ilya Maximets. >>>> >>>>> >>>>> Reported-by: Michael Plato <[email protected]> >>>>> Fixes: 61e48c2d1db2 ("conntrack: Handle SNAT with all-zero IP address.") >>>>> Signed-off-by: Paolo Valerio <[email protected]> >>>>> --- >>>>> In this thread [0] there are some more details. A similar >>>>> approach here could be to avoid to add the nat_conn to the cmap and >>>>> letting the sweeper release the memory for nat_conn once the whole >>>>> connection gets freed. >>>>> That approach could still be ok, but the drawback is that it could >>>>> require a different patch for older branches that don't include >>>>> 3d9c1b855a5f ("conntrack: Replace timeout based expiration lists with >>>>> rculists."). It still worth to be considered. >>>>> >>>>> [0] >>>>> https://mail.openvswitch.org/pipermail/ovs-discuss/2023-April/052339.html >>>>> --- >>>>> lib/conntrack.c | 21 +++++++++++++-------- >>>>> 1 file changed, 13 insertions(+), 8 deletions(-) >>>>> >>>>> diff --git a/lib/conntrack.c b/lib/conntrack.c >>>>> index 7e1fc4b1f..d2ee127d9 100644 >>>>> --- a/lib/conntrack.c >>>>> +++ b/lib/conntrack.c >>>>> @@ -1007,14 +1007,19 @@ conn_not_found(struct conntrack *ct, struct >>>>> dp_packet *pkt, >>>>> } >>>>> >>>>> nat_packet(pkt, nc, false, ctx->icmp_related); >>>>> - memcpy(&nat_conn->key, &nc->rev_key, sizeof nat_conn->key); >>>>> - memcpy(&nat_conn->rev_key, &nc->key, sizeof >>>>> nat_conn->rev_key); >>>>> - nat_conn->conn_type = CT_CONN_TYPE_UN_NAT; >>>>> - nat_conn->nat_action = 0; >>>>> - nat_conn->alg = NULL; >>>>> - nat_conn->nat_conn = NULL; >>>>> - uint32_t nat_hash = conn_key_hash(&nat_conn->key, >>>>> ct->hash_basis); >>>>> - cmap_insert(&ct->conns, &nat_conn->cm_node, nat_hash); >>>>> + uint32_t nat_hash = conn_key_hash(&nc->rev_key, >>>>> ct->hash_basis); >>>>> + if (nat_hash != ctx->hash) { >>>>> + memcpy(&nat_conn->key, &nc->rev_key, sizeof >>>>> nat_conn->key); >>>>> + memcpy(&nat_conn->rev_key, &nc->key, sizeof >>>>> nat_conn->rev_key); >>>>> + nat_conn->conn_type = CT_CONN_TYPE_UN_NAT; >>>>> + nat_conn->nat_action = 0; >>>>> + nat_conn->alg = NULL; >>>>> + nat_conn->nat_conn = NULL; >>>>> + cmap_insert(&ct->conns, &nat_conn->cm_node, nat_hash); >>>>> + } else { >>>>> + free(nat_conn); >>>>> + nat_conn = NULL; >>>>> + } >>>>> } >>>>> >>>>> nc->nat_conn = nat_conn; >>>>> >>> >>> _______________________________________________ >>> dev mailing list >>> [email protected] >>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev >>> _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
