On 20 Oct 2022, at 15:53, Ales Musil wrote:
> On Thu, Oct 20, 2022 at 3:41 PM Eelco Chaudron <[email protected]> wrote: > >> >> >> On 18 Oct 2022, at 19:34, Ales Musil wrote: >> >>> When the packet was traveling through patch port boundary >>> OvS would check if any of the actions is reversible, >>> if not it would clone the packet. However, the check >>> was only at the first level of the second bridge. >>> That caused some issues when the packet had gone >>> through more actions, some of them might have been >>> irreversible. >>> >>> In order to keep the semantics the same we might >>> need to run the actions twice in the worst case >>> scenario. During the clone there are 4 scenarios >>> that can happen. >>> >>> 1) The action is last one for that flow, >>> in that case we just continue without clone. >>> >>> 2) There is irreversible action in the action >>> set (first level). In this case we know that >>> there is at leas one irreversible action which >>> is enough to do clone. >>> >>> 3) All actions in first level are reversible, >>> we can try to run all actions as if we don't >>> need any clone and inspect the ofpbuf at the >>> end. In positive case there are no irreversible >>> actions so we will just submit the buffer and continue. >>> >>> 4) This is same as 3) with the difference that >>> there is irreversible action in the ofpbuf. >>> To keep the semantics we need to re-run the actions >>> and treat it as clone. This requires resotration >>> of the xlate_ctx. >>> >>> Add test cases for all irreversible actions >>> to see if they are properly cloned. >>> >>> Signed-off-by: Ales Musil <[email protected]> >>> --- >> >> Thanks for following through with all these revisions!! >> >> See some small comments below. >> >> //Eelco >> >> >> <SNIP> >> >> Why did you remove odp_sample_contains_irreversible_actions(), is it >> because OVN always makes this irreversible, or is it try that in any case >> OVS will add some reversible actions? If there is a way for OVS to add only >> reversible actions in sample, we should re-introduce this check. >> > > I have realized that it serves the same purpose as clone when clone is not > supported, so basically anything encapsulated in the sample should be ok. It’s also used by xlate_controller_action(), but looking at the code it seems that it’s always used with at least the meter action, so the code should be good as is. >>> + >>> +static bool >>> +odp_cpl_contains_irreversible_actions(const struct nlattr *attr) >>> +{ >>> + static const struct nl_policy ovs_cpl_policy[] = { >>> + [OVS_CHECK_PKT_LEN_ATTR_PKT_LEN] = {.type = NL_A_U16}, >>> + [OVS_CHECK_PKT_LEN_ATTR_ACTIONS_IF_GREATER] = {.type = >> NL_A_NESTED}, >>> + [OVS_CHECK_PKT_LEN_ATTR_ACTIONS_IF_LESS_EQUAL] = {.type = >> NL_A_NESTED}, >>> + }; >>> + struct nlattr *a[ARRAY_SIZE(ovs_cpl_policy)]; >>> + >>> + if (!nl_parse_nested(attr, ovs_cpl_policy, a, ARRAY_SIZE(a))) { >>> + return false; >>> + } >>> + >>> + const struct nlattr *greater = >>> + a[OVS_CHECK_PKT_LEN_ATTR_ACTIONS_IF_GREATER]; >>> + const struct nlattr *less = >>> + a[OVS_CHECK_PKT_LEN_ATTR_ACTIONS_IF_LESS_EQUAL]; >>> + const void *greater_data = nl_attr_get(greater); >>> + const void *less_data = nl_attr_get(less); >>> + size_t greater_len = nl_attr_get_size(greater); >>> + size_t less_len = nl_attr_get_size(less); >>> + >>> + return odp_contains_irreversible_action(greater_data, greater_len) >> || >>> + odp_contains_irreversible_action(less_data, less_len); >>> +} >>> + >>> +/* Check if any of the actions in the ofpbuf is irreversible. */ >>> +bool >>> +odp_contains_irreversible_action(const void *attrs, size_t attrs_len) >>> +{ >>> + const struct nlattr *attr; >>> + int left; >>> + >>> + NL_ATTR_FOR_EACH (attr, left, attrs, attrs_len) { >>> + uint16_t type = attr->nla_type; >>> + >>> + switch ((enum ovs_action_attr) type) { >>> + /* Skip "clone" or "sample" because it already encapsulates >> >> "because they already encapsulate" >> >> > Will update in v8. > > >> >>> + * irreversible actions. */ >>> + case OVS_ACTION_ATTR_CLONE: >>> + case OVS_ACTION_ATTR_SAMPLE: >> >> See comment above. >> >>> + continue; >>> + /* Check nested actions of "check_packet_len". */ >>> + case OVS_ACTION_ATTR_CHECK_PKT_LEN: >>> + if (odp_cpl_contains_irreversible_actions(attr)) { >>> + return true; >>> + } >>> + continue; >>> + /* Check any other action. */ >>> + case OVS_ACTION_ATTR_CT: >>> + case OVS_ACTION_ATTR_CT_CLEAR: >>> + case OVS_ACTION_ATTR_TRUNC: >>> + case OVS_ACTION_ATTR_PUSH_ETH: >>> + case OVS_ACTION_ATTR_POP_ETH: >>> + case OVS_ACTION_ATTR_PUSH_NSH: >>> + case OVS_ACTION_ATTR_POP_NSH: >>> + case OVS_ACTION_ATTR_METER: >>> + case OVS_ACTION_ATTR_TUNNEL_PUSH: >>> + case OVS_ACTION_ATTR_TUNNEL_POP: >>> + case OVS_ACTION_ATTR_UNSPEC: >>> + case OVS_ACTION_ATTR_OUTPUT: >>> + case OVS_ACTION_ATTR_USERSPACE: >>> + case OVS_ACTION_ATTR_SET: >>> + case OVS_ACTION_ATTR_PUSH_VLAN: >>> + case OVS_ACTION_ATTR_POP_VLAN: >>> + case OVS_ACTION_ATTR_RECIRC: >>> + case OVS_ACTION_ATTR_HASH: >>> + case OVS_ACTION_ATTR_SET_MASKED: >>> + case OVS_ACTION_ATTR_LB_OUTPUT: >>> + case OVS_ACTION_ATTR_ADD_MPLS: >>> + case OVS_ACTION_ATTR_PUSH_MPLS: >>> + case OVS_ACTION_ATTR_POP_MPLS: >>> + case OVS_ACTION_ATTR_DROP: >>> + if (nlattr_action_is_irreversible(type)) { >>> + return true; >>> + } >>> + continue; >>> + case __OVS_ACTION_ATTR_MAX: >>> + break; >>> + } >> >> Maybe add a small comment here, as it might not make sense for people >> reading this why you use the continue; logic here. Something like: >> >> /* All unknown action types fallthrough the switch statement and >> * are by default assumed irreversible. */ >> > > Will update in v8. > > >> >>> + return true; >>> + } >>> + return false; >>> +} >> >> <SNIP> >> >>> @@ -5858,37 +5963,22 @@ reversible_actions(const struct ofpact *ofpacts, >> size_t ofpacts_len) >>> } >>> >>> static void >>> -clone_xlate_actions(const struct ofpact *actions, size_t actions_len, >>> - struct xlate_ctx *ctx, bool is_last_action, >>> - bool group_bucket_action OVS_UNUSED) >>> +do_clone_xlate_actions(const struct ofpact *actions, size_t actions_len, >>> + struct xlate_ctx *ctx) >>> { >>> - struct ofpbuf old_stack = ctx->stack; >>> - union mf_subvalue new_stack[1024 / sizeof(union mf_subvalue)]; >>> - ofpbuf_use_stub(&ctx->stack, new_stack, sizeof new_stack); >>> - ofpbuf_put(&ctx->stack, old_stack.data, old_stack.size); >>> - >>> - struct ofpbuf old_action_set = ctx->action_set; >>> - uint64_t actset_stub[1024 / 8]; >>> - ofpbuf_use_stub(&ctx->action_set, actset_stub, sizeof actset_stub); >>> - ofpbuf_put(&ctx->action_set, old_action_set.data, >> old_action_set.size); >>> - >>> size_t offset, ac_offset; >>> - struct flow old_flow = ctx->xin->flow; >>> - >>> - if (reversible_actions(actions, actions_len) || is_last_action) { >>> - old_flow = ctx->xin->flow; >>> - do_xlate_actions(actions, actions_len, ctx, is_last_action, >> false); >>> - xlate_ctx_process_freezing(ctx); >>> - goto xlate_done; >>> - } >>> + union mf_subvalue stack_buffer[1024 / sizeof(union mf_subvalue)]; >>> + uint64_t act_set_buffer[1024 / 8]; I noticed you use act_set_buffer here but later you used sct_set_buffer, guess the latter is a cut/paste/change error? >> >> Why not make these two part of the xlate_ctx_clone_state structure itself? >> This might also help allocating this data structure on the heap if needed >> due to a large number of recursions (See discussion on Mike's patch with >> Ilya). >> > > If the buffer would be part of the state struct it would need to be on heap > because once assigned to ofpub we cannot move the struct around. > The v6 actually had this problem. I don't mind the buffer being on heap if > that's preferable (saw the discussion and it would make sense as clone is > one the main contributors to this problem). I think for now we can leave it on the stack. I think you might need a minimum change, something like (make check tested only): diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c index b576f4c7e..a2d908d32 100644 --- a/ofproto/ofproto-dpif-xlate.c +++ b/ofproto/ofproto-dpif-xlate.c @@ -5795,37 +5795,37 @@ struct xlate_ctx_clone_state { int resubmit; uint32_t sflow_n_outputs; + + union mf_subvalue stack_buffer[1024 / sizeof(union mf_subvalue)]; + uint64_t act_set_buffer[1024 / 8]; }; -static struct xlate_ctx_clone_state -xlate_ctx_clone_state_store(struct xlate_ctx *ctx, void *stack_buffer, - size_t stack_buffer_len, void *act_set_buffer, - size_t act_set_buffer_len) -{ - struct xlate_ctx_clone_state state = { - .stack = ctx->stack, - .action_set = ctx->action_set, - .odp_actions_size = ctx->odp_actions->size, - .xin_flow = ctx->xin->flow, - .base_flow = ctx->base_flow, - .was_mpls = ctx->was_mpls, - .conntracked = ctx->conntracked, - .recirc_size = ctx->xin->recirc_queue - ? ovs_list_size(ctx->xin->recirc_queue) - : 0, - .resubmit = ctx->resubmits, - .sflow_n_outputs = ctx->sflow_n_outputs, - }; +static void +xlate_ctx_clone_state_store(struct xlate_ctx *ctx, + struct xlate_ctx_clone_state *state) +{ + state->stack = ctx->stack; + state->action_set = ctx->action_set; + state->odp_actions_size = ctx->odp_actions->size; + state->xin_flow = ctx->xin->flow; + state->base_flow = ctx->base_flow; + state->was_mpls = ctx->was_mpls; + state->conntracked = ctx->conntracked; + state->recirc_size = ctx->xin->recirc_queue + ? ovs_list_size(ctx->xin->recirc_queue) + : 0; + state->resubmit = ctx->resubmits; + state->sflow_n_outputs = ctx->sflow_n_outputs; /* Set up a new stack from the provided buffer. */ - ofpbuf_use_stub(&ctx->stack, stack_buffer, stack_buffer_len); - ofpbuf_put(&ctx->stack, state.stack.data, state.stack.size); + ofpbuf_use_stub(&ctx->stack, state->stack_buffer, + sizeof state->stack_buffer); + ofpbuf_put(&ctx->stack, state->stack.data, state->stack.size); /* Set up a new action_set from the provided buffer. */ - ofpbuf_use_stub(&ctx->action_set, act_set_buffer, act_set_buffer_len); - ofpbuf_put(&ctx->action_set, state.action_set.data, state.action_set.size); - - return state; + ofpbuf_use_stub(&ctx->action_set, state->act_set_buffer, + sizeof state->act_set_buffer); + ofpbuf_put(&ctx->action_set, state->action_set.data, state->action_set.size); } /* Restore xlate_ctx member to the old_state, with additional steps for @@ -5967,11 +5967,9 @@ do_clone_xlate_actions(const struct ofpact *actions, size_t actions_len, struct xlate_ctx *ctx) { size_t offset, ac_offset; - union mf_subvalue stack_buffer[1024 / sizeof(union mf_subvalue)]; - uint64_t act_set_buffer[1024 / 8]; - struct xlate_ctx_clone_state old_state = - xlate_ctx_clone_state_store(ctx, stack_buffer, sizeof stack_buffer, - act_set_buffer, sizeof act_set_buffer); + struct xlate_ctx_clone_state old_state; + + xlate_ctx_clone_state_store(ctx, &old_state); /* Commit datapath actions before emitting the clone action to * avoid emitting those actions twice. Once inside @@ -6026,12 +6024,9 @@ clone_xlate_actions(const struct ofpact *actions, size_t actions_len, * there was any irreversible action we can restore the xlate_ctx and run * the do_clone instead. If the action is last we don't have to check and * can just proceed with the actions. */ - union mf_subvalue stack_buffer[1024 / sizeof(union mf_subvalue)]; - uint64_t sct_set_buffer[1024 / 8]; - struct xlate_ctx_clone_state old_state = - xlate_ctx_clone_state_store(ctx, stack_buffer, sizeof stack_buffer, - sct_set_buffer, sizeof sct_set_buffer); + struct xlate_ctx_clone_state old_state; + xlate_ctx_clone_state_store(ctx, &old_state); do_xlate_actions(actions, actions_len, ctx, is_last_action, false); void *added_actions = When Mike wants to do his testing, he could change your patch to something like: diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c index a2d908d32..c72f02905 100644 --- a/ofproto/ofproto-dpif-xlate.c +++ b/ofproto/ofproto-dpif-xlate.c @@ -5800,10 +5800,12 @@ struct xlate_ctx_clone_state { uint64_t act_set_buffer[1024 / 8]; }; -static void -xlate_ctx_clone_state_store(struct xlate_ctx *ctx, - struct xlate_ctx_clone_state *state) +static struct xlate_ctx_clone_state * +xlate_ctx_clone_state_store(struct xlate_ctx *ctx) { + struct xlate_ctx_clone_state *state = xmalloc( + sizeof(struct xlate_ctx_clone_state)); + state->stack = ctx->stack; state->action_set = ctx->action_set; state->odp_actions_size = ctx->odp_actions->size; @@ -5825,7 +5827,10 @@ xlate_ctx_clone_state_store(struct xlate_ctx *ctx, /* Set up a new action_set from the provided buffer. */ ofpbuf_use_stub(&ctx->action_set, state->act_set_buffer, sizeof state->act_set_buffer); - ofpbuf_put(&ctx->action_set, state->action_set.data, state->action_set.size); + ofpbuf_put(&ctx->action_set, state->action_set.data, + state->action_set.size); + + return state; } /* Restore xlate_ctx member to the old_state, with additional steps for @@ -5844,6 +5849,7 @@ xlate_ctx_clone_state_restore(struct xlate_ctx *ctx, /* Restore just common parts when we don't clone. */ if (no_clone_restore) { + free(old_state); return; } @@ -5880,6 +5886,7 @@ xlate_ctx_clone_state_restore(struct xlate_ctx *ctx, oftrace_recirc_node_destroy(node); } } + free(old_state); } /* Determine if an datapath action translated from the openflow action @@ -5967,16 +5974,14 @@ do_clone_xlate_actions(const struct ofpact *actions, size_t actions_len, struct xlate_ctx *ctx) { size_t offset, ac_offset; - struct xlate_ctx_clone_state old_state; - - xlate_ctx_clone_state_store(ctx, &old_state); + struct xlate_ctx_clone_state *old_state = xlate_ctx_clone_state_store(ctx); /* Commit datapath actions before emitting the clone action to * avoid emitting those actions twice. Once inside * the clone, another time for the action after clone. */ xlate_commit_actions(ctx); /* The base needs to be stored after xlate_commit_actions. */ - old_state.base_flow = ctx->base_flow; + old_state->base_flow = ctx->base_flow; /* The actions are not reversible, a datapath clone action is * required to encode the translation. Select the clone action @@ -6006,7 +6011,7 @@ do_clone_xlate_actions(const struct ofpact *actions, size_t actions_len, * report an error */ xlate_report_error(ctx, "Failed to compose clone action"); } - xlate_ctx_clone_state_restore(ctx, &old_state, false, false); + xlate_ctx_clone_state_restore(ctx, old_state, false, false); } static void @@ -6024,21 +6029,20 @@ clone_xlate_actions(const struct ofpact *actions, size_t actions_len, * there was any irreversible action we can restore the xlate_ctx and run * the do_clone instead. If the action is last we don't have to check and * can just proceed with the actions. */ - struct xlate_ctx_clone_state old_state; + struct xlate_ctx_clone_state *old_state = xlate_ctx_clone_state_store(ctx); - xlate_ctx_clone_state_store(ctx, &old_state); do_xlate_actions(actions, actions_len, ctx, is_last_action, false); void *added_actions = - ofpbuf_at_assert(ctx->odp_actions, old_state.odp_actions_size, 0); - uint32_t added_size = ctx->odp_actions->size - old_state.odp_actions_size; + ofpbuf_at_assert(ctx->odp_actions, old_state->odp_actions_size, 0); + uint32_t added_size = ctx->odp_actions->size - old_state->odp_actions_size; if (is_last_action || !odp_contains_irreversible_action(added_actions, added_size)) { xlate_ctx_process_freezing(ctx); - xlate_ctx_clone_state_restore(ctx, &old_state, true, false); + xlate_ctx_clone_state_restore(ctx, old_state, true, false); } else { - xlate_ctx_clone_state_restore(ctx, &old_state, false, true); + xlate_ctx_clone_state_restore(ctx, old_state, false, true); do_clone_xlate_actions(actions, actions_len, ctx); } } Or maybe Ilya/Mike thinks we should have this as a patch to this series? Ilya/Mike? >> >>> + struct xlate_ctx_clone_state old_state = >>> + xlate_ctx_clone_state_store(ctx, stack_buffer, sizeof >> stack_buffer, >>> + act_set_buffer, sizeof >> act_set_buffer); >>> >>> /* Commit datapath actions before emitting the clone action to >>> * avoid emitting those actions twice. Once inside >>> * the clone, another time for the action after clone. */ >>> xlate_commit_actions(ctx); >>> - struct flow old_base = ctx->base_flow; >>> - bool old_was_mpls = ctx->was_mpls; >>> - bool old_conntracked = ctx->conntracked; >> >> nit: add an extra cr/lf here, right before the below comment. >> >>> + /* The base needs to be stored after xlate_commit_actions. */ >>> + old_state.base_flow = ctx->base_flow; >>> >> >> <SNIP> >> >> > I'll send v8 once we settle the discussion. > > Thanks, > Ales > > -- > > Ales Musil > > Senior Software Engineer - OVN Core > > Red Hat EMEA <https://www.redhat.com> > > [email protected] IM: amusil > <https://red.ht/sig> _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
