On Fri, May 15, 2020 at 12:18 AM Han Zhou <[email protected]> wrote:
>
> When dp_hash is executed with slowpath actions, it results in endless
> recirc loop in kernel datapath, and finally drops the packet, with
> kernel logs:
>
> openvswitch: ovs-system: deferred action limit reached, drop recirc action
>
> The root cause is that the dp_hash value calculated by slowpath is not
> passed to datapath when executing the recirc action, thus when the
recirced
> packet miss upcall comes to userspace again, it generates the dp_hash
> and recirc action again, with same recirc_id, which in turn generates
> a megaflow with recirc action with the recird_id same as the recirc_id in
> its match condition, which causes a loop in datapath.
>
> For example, this can be reproduced with below setup of OVN environment:
>
>                          LS1            LS2
>                           |              |
>                           |------R1------|
>         VIF--LS0---R0-----|              |------R3
>                           |------R2------|
>
> Assume there is a route from the VIF to R3: R0 -> R1 -> R3, and there are
two
> routes (ECMP) from R3 to the VIF:
> R3 -> R1 -> R0
> R3 -> R2 -> R0
>
> Now if we ping from the VIF to R3, the OVS flow execution on the HV of
the VIF
> will hit the R3's datapath which has flows that responds to the ICMP
packet
> by setting ICMP fields, which requires slowpath actions, and in later flow
> tables it will hit the "group" action that selects between the ECMP
routes.
>
> By default OVN uses "dp_hash" method for the "group" action.
>
> For the first miss upcall packet, dp_hash value is empty, so the group
action
> will be translated to "dp_hash" and "recirc".
>
> During action execution, because of the previous actions that sets ICMP
fields,
> the whole execution requires slowpath, so it tries to execute all actions
in
> userspace in odp_execute_actions(), including dp_hash action, except the
> recirc action, which can only be executed in datapath. So the dp_hash
value
> is calculated in userspace, and then the packet is injected to datapath
for
> recirc action execution.
>
> However, the dp_hash calculated by the userspace is not passed to
datapath.
>
> Because of this, the packet recirc in datapath doesn't have dp_hash value,
> and the miss upcall for the recirced packet hits the same flow tables and
> triggers same "dp_hash" and "recirc" action again, with exactly same
recirc_id!
>
> This time, the new upcall doesn't require any slowpath execution, so both
> the dp_hash and recirc actions are executed in datapath, after creating a
> datapath megaflow like:
>
> recirc_id(XYZ),..., actions:hash(l4(0)),recirc(XYZ)
>
> with match recirc_id equals the recirc id in the action, thus creating a
loop.
>
> This patch fixes the problem by passing the calculated dp_hash value to
> datapath in odp_key_from_dp_packet().
>
> Signed-off-by: Han Zhou <[email protected]>
> ---
>  lib/odp-util.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/lib/odp-util.c b/lib/odp-util.c
> index b66d266..ac532fe 100644
> --- a/lib/odp-util.c
> +++ b/lib/odp-util.c
> @@ -6392,6 +6392,10 @@ odp_key_from_dp_packet(struct ofpbuf *buf, const
struct dp_packet *packet)
>
>      nl_msg_put_u32(buf, OVS_KEY_ATTR_PRIORITY, md->skb_priority);
>
> +    if (md->dp_hash) {
> +        nl_msg_put_u32(buf, OVS_KEY_ATTR_DP_HASH, md->dp_hash);
> +    }
> +
>      if (flow_tnl_dst_is_set(&md->tunnel)) {
>          tun_key_to_attr(buf, &md->tunnel, &md->tunnel, NULL, NULL);
>      }
> --
> 2.1.0
>

Ben and Ilya, this is the fix to the dp_hash problem we discussed in
yesterday's meeting. The actual fix is simpler that I thought it would be.
I didn't take the approach of executing dp_hash in datapath because in this
case since the flow is required to be slowpathed, all the following packets
for this flow will anyway get upcalled. If all the dp_hash for the flow is
executed in slowpath then there is no consistency problem. So I think it is
ok to keep the calculation in userspace and the fix is simple. Let me know
if you think differently.

Thanks,
Han
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to