On 5/18/23 09:37, Peng He wrote:
> Hi, Paolo,
> 
> IIRC, you have a revision version on these patches?
> I guess it should be closer to the upstream than mine?

I suppose, this one was the last revision of the patch:
  
https://patchwork.ozlabs.org/project/openvswitch/patch/[email protected]/

It was part of the 'buckets' approach for multithread
scalability and we didn't go with it.

Best regards, Ilya Maximets.

> 
> 
> Aaron Conole <[email protected] <mailto:[email protected]>> 于2023年5月17日周三 
> 21:54写道:
> 
>     Paolo Valerio <[email protected] <mailto:[email protected]>> writes:
> 
>     > Ilya Maximets <[email protected] <mailto:[email protected]>> writes:
>     >
>     >> On 5/4/23 19:21, Paolo Valerio wrote:
>     >>> Ilya Maximets <[email protected] <mailto:[email protected]>> writes:
>     >>>
>     >>>> On 4/19/23 20:40, Paolo Valerio wrote:
>     >>>>> During the creation of a new connection, there's a chance both key 
> and
>     >>>>> rev_key end up having the same hash. This is more common in the case
>     >>>>> of all-zero snat with no collisions. In that case, once the
>     >>>>> connection is expired, but not cleaned up, if a new packet with the
>     >>>>> same 5-tuple is received, an assertion failure gets triggered in
>     >>>>> conn_update_state() because of a previous failure of retrieving a
>     >>>>> CT_CONN_TYPE_DEFAULT connection.
>     >>>>>
>     >>>>> Fix it by releasing the nat_conn during the connection creation in 
> the
>     >>>>> case of same hash for both key and rev_key.
>     >>>>
>     >>>> This sounds a bit odd.  Shouldn't we treat hash collision as a 
> normal case?
>     >>>>
>     >>>> Looking at the code, I'm assuming that the issue comes from the 
> following
>     >>>> part in process_one():
>     >>>>
>     >>>>     if (OVS_LIKELY(conn)) {
>     >>>>         if (conn->conn_type == CT_CONN_TYPE_UN_NAT) {
>     >>>>             ...
>     >>>>             conn_key_lookup(ct, &ctx->key, hash, now, &conn, 
> &ctx->reply);
>     >>>>
>     >>>> And here we get the same connection again, because the default one 
> is already
>     >>>> expired.  Is that correct?
>     >>>>
>     >>>> If so, maybe we should add an extra condition to conn_key_lookup() to
>     >>>> only look for DEFAULT connections instead, just for this case?  Since
>     >>>> we really don't want to get the UN_NAT one here.
>     >>>>
>     >>>
>     >>> Hello Ilya,
>     >>>
>     >>> It's a fair point.
>     >>> I initially thought about the approach you're suggesting, but I had 
> some
>     >>> concerns about it that I'll try to summarize below.
>     >>>
>     >>> For sure it would fix the issue (it could require the first patch to 
> be
>     >>> applied as well for the branches with rcu exp lists).
>     >>>
>     >>> Based on the current logic, new packets matching that expired 
> connection
>     >>> but not evicted will be marked as +inv and further packets will be
>     >>> marked so for the whole sweep interval unless an exception like this 
> get
>     >>> added:
>     >>>
>     >>> uint32_t hash = conn_key_hash(&conn->rev_key, ct->hash_basis);
>     >>> /* the last flag indicates CT_CONN_TYPE_DEFAULT only */
>     >>> conn_key_lookup_(ct, &ctx->key, hash, now, &conn, &ctx->reply, true);
>     >>> /* special case where there's hash collision */
>     >>> if (!conn && ctx->hash != hash) {
>     >>>     pkt->md.ct_state |= CS_INVALID;
>     >>>     write_ct_md(pkt, zone, NULL, NULL, NULL);
>     >>>     ...
>     >>>     return;
>     >>> }
>     >>>
>     >>> This would further require that subsequent lookup in the 
> create_new_conn
>     >>> path are restricted to CT_CONN_TYPE_DEFAULT, e.g.:
>     >>>
>     >>> uint32_t hash = conn_key_hash(&ctx->key, ct->hash_basis);
>     >>> /* Only check for CT_CONN_TYPE_DEFAULT */
>     >>> if (!conn_key_lookup_(ct, &ctx->key, hash, now, NULL, NULL, true)) {
>     >>>     conn = conn_not_found(ct, pkt, ctx, commit, now, nat_action_info,
>     >>>                           helper, alg_exp, ct_alg_ctl, tp_id);
>     >>> }
>     >>>
>     >>> otherwise we could incur in a false positive which prevent to create a
>     >>> new connection.
>     >>
>     >> I'm not really sure if what described above is more correct way of 
> doing
>     >> things or not...  Aaron, do you have opinion on this?
>     >>
>     >> Another thought: Can we expire the CT_CONN_TYPE_UN_NAT connection the
>     >> moment DEFAULT counterpart of it expires?  Or that will that be against
>     >> some logic / not possible to do?
>     >>
>     >
>     > As far as I can tell, this could not be straightforward as simply
>     > marking it as expired should not be reliable (e.g. doing it from the
>     > sweeper), and I guess that managing the expiration time field for the
>     > nat_conn as well would require updating the nat_conn every time the
>     > default one gets updated, probably making it a bit unpractical.
>     >
>     > Another approach would be removing the nat_conn [1] altogether.
>     > The problem in this case is backporting. Some adjustments that would add
>     > to the patch might be needed for older branches.
>     >
>     > [1]
>     > 
> https://patchwork.ozlabs.org/project/openvswitch/patch/[email protected]/
>  
> <https://patchwork.ozlabs.org/project/openvswitch/patch/[email protected]/>
> 
>     I think that work was interesting, and maybe the best way to go
>     forward.  Backports would become difficult, though - agreed.
> 
>     >>
>     >>>
>     >>>> Best regards, Ilya Maximets.
>     >>>>
>     >>>>>
>     >>>>> Reported-by: Michael Plato <[email protected] 
> <mailto:[email protected]>>
>     >>>>> Fixes: 61e48c2d1db2 ("conntrack: Handle SNAT with all-zero IP 
> address.")
>     >>>>> Signed-off-by: Paolo Valerio <[email protected] 
> <mailto:[email protected]>>
>     >>>>> ---
>     >>>>> In this thread [0] there are some more details. A similar
>     >>>>> approach here could be to avoid to add the nat_conn to the cmap and
>     >>>>> letting the sweeper release the memory for nat_conn once the whole
>     >>>>> connection gets freed.
>     >>>>> That approach could still be ok, but the drawback is that it could
>     >>>>> require a different patch for older branches that don't include
>     >>>>> 3d9c1b855a5f ("conntrack: Replace timeout based expiration lists 
> with
>     >>>>> rculists."). It still worth to be considered.
>     >>>>>
>     >>>>> [0] 
> https://mail.openvswitch.org/pipermail/ovs-discuss/2023-April/052339.html 
> <https://mail.openvswitch.org/pipermail/ovs-discuss/2023-April/052339.html>
>     >>>>> ---
>     >>>>>  lib/conntrack.c |   21 +++++++++++++--------
>     >>>>>  1 file changed, 13 insertions(+), 8 deletions(-)
>     >>>>>
>     >>>>> diff --git a/lib/conntrack.c b/lib/conntrack.c
>     >>>>> index 7e1fc4b1f..d2ee127d9 100644
>     >>>>> --- a/lib/conntrack.c
>     >>>>> +++ b/lib/conntrack.c
>     >>>>> @@ -1007,14 +1007,19 @@ conn_not_found(struct conntrack *ct, struct 
> dp_packet *pkt,
>     >>>>>              }
>     >>>>> 
>     >>>>>              nat_packet(pkt, nc, false, ctx->icmp_related);
>     >>>>> -            memcpy(&nat_conn->key, &nc->rev_key, sizeof 
> nat_conn->key);
>     >>>>> -            memcpy(&nat_conn->rev_key, &nc->key, sizeof 
> nat_conn->rev_key);
>     >>>>> -            nat_conn->conn_type = CT_CONN_TYPE_UN_NAT;
>     >>>>> -            nat_conn->nat_action = 0;
>     >>>>> -            nat_conn->alg = NULL;
>     >>>>> -            nat_conn->nat_conn = NULL;
>     >>>>> -            uint32_t nat_hash = conn_key_hash(&nat_conn->key, 
> ct->hash_basis);
>     >>>>> -            cmap_insert(&ct->conns, &nat_conn->cm_node, nat_hash);
>     >>>>> +            uint32_t nat_hash = conn_key_hash(&nc->rev_key, 
> ct->hash_basis);
>     >>>>> +            if (nat_hash != ctx->hash) {
>     >>>>> +                memcpy(&nat_conn->key, &nc->rev_key, sizeof 
> nat_conn->key);
>     >>>>> +                memcpy(&nat_conn->rev_key, &nc->key, sizeof 
> nat_conn->rev_key);
>     >>>>> +                nat_conn->conn_type = CT_CONN_TYPE_UN_NAT;
>     >>>>> +                nat_conn->nat_action = 0;
>     >>>>> +                nat_conn->alg = NULL;
>     >>>>> +                nat_conn->nat_conn = NULL;
>     >>>>> +                cmap_insert(&ct->conns, &nat_conn->cm_node, 
> nat_hash);
>     >>>>> +            } else {
>     >>>>> +                free(nat_conn);
>     >>>>> +                nat_conn = NULL;
>     >>>>> +            }
>     >>>>>          }
>     >>>>> 
>     >>>>>          nc->nat_conn = nat_conn;
>     >>>>>
>     >>>
>     >>> _______________________________________________
>     >>> dev mailing list
>     >>> [email protected] <mailto:[email protected]>
>     >>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev 
> <https://mail.openvswitch.org/mailman/listinfo/ovs-dev>
>     >>>
> 
>     _______________________________________________
>     dev mailing list
>     [email protected] <mailto:[email protected]>
>     https://mail.openvswitch.org/mailman/listinfo/ovs-dev 
> <https://mail.openvswitch.org/mailman/listinfo/ovs-dev>
> 
> 
> 
> -- 
> hepeng

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to