>-----Original Message-----
>From: Ilya Maximets <i.maxim...@ovn.org>
>Sent: Wednesday, 2 April 2025 13:51
>To: Roi Dayan <r...@nvidia.com>; d...@openvswitch.org
>Cc: Maor Dickman <ma...@nvidia.com>; Eli Britstein <el...@nvidia.com>;
>i.maxim...@ovn.org
>Subject: Re: [ovs-dev] [PATCH 2/2] ofproto-dpif-xlate: Add a drop action for
>native tunnel failure.
>
>External email: Use caution opening links or attachments
>
>
>On 3/24/25 3:58 PM, Roi Dayan via dev wrote:
>> From: Eli Britstein <el...@nvidia.com>
>>
>> Upon tunnel output failure, due to routing failure for example, add an
>> explicit drop action, so it will appear in the dp-flows for better
>> visibility for that case.
>>
>> Signed-off-by: Eli Britstein <el...@nvidia.com>
>> Reviewed-by: Roi Dayan <r...@nvidia.com>
>> ---
>> ofproto/ofproto-dpif-xlate.c | 27 +++++++++++++++++++++++----
>> 1 file changed, 23 insertions(+), 4 deletions(-)
>
>Thanks, Eli and Roi for the updated series!
>
>Please, add a small test for this functionality to the tunnel-push-pop.at.
>It should be easy to simulate the missing neighbor cache entry, for example.
>Style-wise, please, use newer tests (at the end of the file) as a model for
>this
>new test. Datapath flow dumps do not print drop reasons, but it should be
>enough to just see an explicit cloned drop in the test.
Ack
>
>See some more comments below.
>
>Best regards, Ilya Maximets.
>
>>
>> diff --git a/ofproto/ofproto-dpif-xlate.c
>> b/ofproto/ofproto-dpif-xlate.c index eeb1853efd29..40ddd426e6f6 100644
>> --- a/ofproto/ofproto-dpif-xlate.c
>> +++ b/ofproto/ofproto-dpif-xlate.c
>> @@ -461,6 +461,10 @@ const char *xlate_strerror(enum xlate_error error)
>> return "Unknown error";
>> }
>>
>> +static void put_drop_action(struct ofproto_dpif *ofproto,
>> + struct ofpbuf *odp_actions,
>> + enum xlate_error error,
>> + bool is_last_action);
>> static void xlate_action_set(struct xlate_ctx *ctx); static void
>> xlate_commit_actions(struct xlate_ctx *ctx);
>>
>> @@ -3895,6 +3899,8 @@ native_tunnel_output(struct xlate_ctx *ctx,
>> const struct xport *xport,
>>
>> err = tnl_route_lookup_flow(ctx, flow, &d_ip6, &s_ip6, &out_dev);
>> if (err) {
>> + put_drop_action(ctx->xbridge->ofproto, ctx->odp_actions,
>> + XLATE_INVALID_TUNNEL_METADATA,
>> + is_last_action);
>
>Please, define new separate translation error values for each case.
>The invalid tunnel metadata error means that the metadata for the tunnel is
>not correct and cannot be transformed into required format.
>It's not the case for the routing failure or a missing MAC or a neighbor cache
>miss below. These should all be separate values.
>Please add them to the end of the xlate_error enum. And we'll need
>corresponding coverage counters in the odp-execute.
The reason I didn't want to do it was because they are defined in
include/linux/openvswitch.h and I was not sure it's ok to change from OVS
context or only from kernel side.
I'll add them. Is this OK? Other names?
diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index 406d083b3..556239594 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -449,6 +449,12 @@ const char *xlate_strerror(enum xlate_error error)
return "Too many MPLS labels";
case XLATE_INVALID_TUNNEL_METADATA:
return "Invalid tunnel metadata";
+ case XLATE_TUNNEL_ROUTING_FAIL:
+ return "Tunnel routing fail";
+ case XLATE_TUNNEL_OUTPUT_NO_ETHERNET:
+ return "Tunnel output no ethernet";
+ case XLATE_TUNNEL_NEIGH_LOOKUP_FAIL:
+ return "Tunnel neigh lookup fail";
case XLATE_UNSUPPORTED_PACKET_TYPE:
return "Unsupported packet type";
>
>Being able to distinguish drops caused by the neighbor lookup failures would
>actually be very useful.
>
>> xlate_report(ctx, OFT_WARN, "native tunnel routing failed");
>> return err;
>> }
>> @@ -3906,6 +3912,8 @@ native_tunnel_output(struct xlate_ctx *ctx, const
>struct xport *xport,
>> /* Use mac addr of bridge port of the peer. */
>> err = netdev_get_etheraddr(out_dev->netdev, &smac);
>> if (err) {
>> + put_drop_action(ctx->xbridge->ofproto, ctx->odp_actions,
>> + XLATE_INVALID_TUNNEL_METADATA,
>> + is_last_action);
>> xlate_report(ctx, OFT_WARN,
>> "tunnel output device lacks Ethernet address");
>> return err;
>> @@ -3920,6 +3928,8 @@ native_tunnel_output(struct xlate_ctx *ctx, const
>struct xport *xport,
>> if (err) {
>> struct in6_addr nh_s_ip6 = in6addr_any;
>>
>> + put_drop_action(ctx->xbridge->ofproto, ctx->odp_actions,
>> + XLATE_INVALID_TUNNEL_METADATA,
>> + is_last_action);
>> xlate_report(ctx, OFT_DETAIL,
>> "neighbor cache miss for %s on bridge %s, "
>> "sending %s request", @@ -6501,13 +6511,21 @@
>> put_ct_label(const struct flow *flow, struct ofpbuf *odp_actions,
>>
>> static void
>> put_drop_action(struct ofproto_dpif *ofproto, struct ofpbuf *odp_actions,
>> - enum xlate_error error)
>> + enum xlate_error error, bool is_last_action)
>> {
>> + size_t offset;
>> +
>> if (!ovs_explicit_drop_action_supported(ofproto)) {
>> return;
>> }
>>
>> + if (!is_last_action) {
>> + offset = nl_msg_start_nested(odp_actions,
>OVS_ACTION_ATTR_CLONE);
>> + }
>> nl_msg_put_u32(odp_actions, OVS_ACTION_ATTR_DROP, error);
>> + if (!is_last_action) {
>> + nl_msg_end_nested(odp_actions, offset);
>> + }
>> }
>>
>> static void
>> @@ -8134,7 +8152,7 @@ xlate_tweak_odp_actions(struct xlate_ctx *ctx)
>> }
>>
>> if (!last_action) {
>> - put_drop_action(ctx->xbridge->ofproto, actions, XLATE_OK);
>> + put_drop_action(ctx->xbridge->ofproto, actions, XLATE_OK,
>> + true);
>
>This is confusing as we're in the explicit !last_action case and we're calling
>this
>function with is_last_action == true.
We are not in the explicit !last_action case. We are here if the action list is
empty and the previous loop was not iterated even once.
I can add a comment to clarify.
>
>I'd suggest to create a new function for the cloned drop, e.g.
>put_cloned_drop_action() and use it for the tunnel case only. It may call the
If we have a dedicated function, it is true that current use of put_drop_action
remains, but in all the new tunnel drop cases, we will have to do this:
if (is_last_action) {
put_drop_action()
} else {
put_cloned_drop_action()
}
This is longer and repetitive code. Are you sure you prefer it this way?
>original put_drop_action(). There will be a case when we may add an empty
>clone() if the datapath doesn't support the explicit drop, but it should
>actually
>never happen and it's not a big problem even if that happens for any reason.
>
>> return;
>> }
>>
>> @@ -8167,7 +8185,7 @@ xlate_tweak_odp_actions(struct xlate_ctx *ctx)
>> last_observe_offset != UINT32_MAX &&
>> (unsigned char *) last_action == (unsigned char *) actions->data +
>> last_observe_offset) {
>> - put_drop_action(ctx->xbridge->ofproto, actions, XLATE_OK);
>> + put_drop_action(ctx->xbridge->ofproto, actions, XLATE_OK,
>> + true);
>> }
>> }
>>
>> @@ -8615,7 +8633,8 @@ exit:
>> if (xin->odp_actions) {
>> ofpbuf_clear(xin->odp_actions);
>> /* Make the drop explicit if the datapath supports it. */
>> - put_drop_action(ctx.xbridge->ofproto, xin->odp_actions,
>> ctx.error);
>> + put_drop_action(ctx.xbridge->ofproto, xin->odp_actions,
>> ctx.error,
>> + true);
>> }
>> } else {
>> /* In the non-error case, see if we can further optimize or
>> tweak
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev