On 13 May 2024, at 10:44, Adrian Moreno wrote:
> On 5/10/24 12:06 PM, Eelco Chaudron wrote: >> On 24 Apr 2024, at 21:53, Adrian Moreno wrote: >> >>> Offload the sample action if it contains psample information by creating >>> a tc "sample" action with the user cookie inside the action's cookie. >>> >>> Avoid using the "sample" action's cookie to store the ufid. >> >> I have some concerns about the sample action-only solution. What happened to >> the idea of adding an additional cookie to the TC match? Can we add a test >> case for the explicit drop? >> > > > I guess we can ask the kernel community. However, I'm not 100% convinced it > makes a lot of sense form a kernel perspective. There is already a cookie in > every action so there is plenty of space to store user data. Kernel does not > enforce any semantics on the cookies, it's up to userspace to interpret them > at will. Also, you cannot have a flow without an action IIUC. > So having a new cookie added just because it makes OVS code sligthly cleaner > doesn't seem to be a super strong argument. ACK, but I thought this was part of the earlier discussion, so it should be explored. I guess we can make the opposite argument, why a per-action cookie if we could have a per-flow one? Having a match cookie will free the action cookie for action related metadata. > > Regarding tc <-> ufid correlation. I'm a bit confused by the fact that flow > replacement seems to work on a tcf_id basis, meaning that each tc flow has > it's own unique tcf_id (or we would replace some other flow). Also, there is > a hmap mapping tcf_id to ufids already. Am I missing something? You are right, I guess this exists to support kernels where we did not have an actions cookie, so we use find_ufid() in the flow dump. So in theory if there is no action with a flow cookie it will all still work. >> In general, I think we should apply the sflow patch before this series. >> > > Agree. Are we all OK with the compatibility issues that will introduce? > >> //Eelco >> >>> Signed-off-by: Adrian Moreno <[email protected]> >>> --- >>> include/linux/automake.mk | 5 +- >>> include/linux/pkt_cls.h | 2 + >>> include/linux/tc_act/tc_sample.h | 26 ++++++++++ >>> lib/netdev-offload-tc.c | 67 +++++++++++++++++++++++++ >>> lib/tc.c | 84 ++++++++++++++++++++++++++++++-- >>> lib/tc.h | 7 +++ >>> utilities/ovs-psample.c | 8 +-- >>> 7 files changed, 190 insertions(+), 9 deletions(-) >>> create mode 100644 include/linux/tc_act/tc_sample.h >>> >>> diff --git a/include/linux/automake.mk b/include/linux/automake.mk >>> index ac306b53c..c2a270152 100644 >>> --- a/include/linux/automake.mk >>> +++ b/include/linux/automake.mk >>> @@ -5,9 +5,10 @@ noinst_HEADERS += \ >>> include/linux/pkt_cls.h \ >>> include/linux/psample.h \ >>> include/linux/gen_stats.h \ >>> + include/linux/tc_act/tc_ct.h \ >>> include/linux/tc_act/tc_mpls.h \ >>> include/linux/tc_act/tc_pedit.h \ >>> + include/linux/tc_act/tc_sample.h \ >>> include/linux/tc_act/tc_skbedit.h \ >>> include/linux/tc_act/tc_tunnel_key.h \ >>> - include/linux/tc_act/tc_vlan.h \ >>> - include/linux/tc_act/tc_ct.h >>> + include/linux/tc_act/tc_vlan.h >>> diff --git a/include/linux/pkt_cls.h b/include/linux/pkt_cls.h >>> index fb4a7ecea..c566ac1b5 100644 >>> --- a/include/linux/pkt_cls.h >>> +++ b/include/linux/pkt_cls.h >>> @@ -8,6 +8,8 @@ >>> #include <linux/types.h> >>> #include <linux/pkt_sched.h> >>> >>> +#define TC_COOKIE_MAX_SIZE 16 >>> + >>> /* Action attributes */ >>> enum { >>> TCA_ACT_UNSPEC, >>> diff --git a/include/linux/tc_act/tc_sample.h >>> b/include/linux/tc_act/tc_sample.h >>> new file mode 100644 >>> index 000000000..398f32761 >>> --- /dev/null >>> +++ b/include/linux/tc_act/tc_sample.h >>> @@ -0,0 +1,26 @@ >>> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ >>> +#ifndef __LINUX_TC_SAMPLE_H >>> +#define __LINUX_TC_SAMPLE_H >>> + >>> +#include <linux/types.h> >>> +#include <linux/pkt_cls.h> >>> +#include <linux/if_ether.h> >>> + >>> +struct tc_sample { >>> + tc_gen; >>> +}; >>> + >>> +enum { >>> + TCA_SAMPLE_UNSPEC, >>> + TCA_SAMPLE_TM, >>> + TCA_SAMPLE_PARMS, >>> + TCA_SAMPLE_RATE, >>> + TCA_SAMPLE_TRUNC_SIZE, >>> + TCA_SAMPLE_PSAMPLE_GROUP, >>> + TCA_SAMPLE_PAD, >>> + __TCA_SAMPLE_MAX >>> +}; >>> +#define TCA_SAMPLE_MAX (__TCA_SAMPLE_MAX - 1) >>> + >>> +#endif >>> + >>> diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c >>> index 36e814bee..0e7273431 100644 >>> --- a/lib/netdev-offload-tc.c >>> +++ b/lib/netdev-offload-tc.c >>> @@ -1038,6 +1038,21 @@ parse_tc_flower_to_actions__(struct tc_flower >>> *flower, struct ofpbuf *buf, >>> nl_msg_end_nested(buf, offset); >>> } >>> break; >>> + case TC_ACT_SAMPLE: { >>> + size_t offset; >>> + >>> + offset = nl_msg_start_nested(buf, OVS_ACTION_ATTR_SAMPLE); >>> + nl_msg_put_u32(buf, OVS_SAMPLE_ATTR_PROBABILITY, >>> + UINT32_MAX / action->sample.rate); >>> + nl_msg_put_u32(buf, OVS_SAMPLE_ATTR_PSAMPLE_GROUP, >>> + action->sample.group_id); >>> + nl_msg_put_unspec(buf, OVS_SAMPLE_ATTR_PSAMPLE_COOKIE, >>> + &action->sample.cookie[0], >>> + action->sample.cookie_len); >>> + >>> + nl_msg_end_nested(buf, offset); >>> + } >>> + break; >>> } >>> >>> if (action->jump_action && action->type != TC_ACT_POLICE_MTU) { >>> @@ -2054,6 +2069,53 @@ parse_check_pkt_len_action(struct netdev *netdev, >>> struct tc_flower *flower, >>> return 0; >>> } >>> >>> +static int >>> +parse_sample_action(struct tc_flower *flower, const struct nlattr *nl_act, >>> + struct tc_action *action) >>> +{ >>> + /* Only offloadable if it's psample only. Use the policy to enforce it >>> by >> >> Double space when starting a new sentence. >> >>> + * making psample arguments mandatory and omitting actions. */ >> >> I think a lot of this code is complex, and it part of the sflow/psample >> patch series, which I prefer to go in before this patch. Ilya any update on >> your review of that series? >> >>> + static const struct nl_policy ovs_sample_policy[] = { >>> + [OVS_SAMPLE_ATTR_PROBABILITY] = { .type = NL_A_U32 }, >>> + [OVS_SAMPLE_ATTR_PSAMPLE_GROUP] = { .type = NL_A_U32, }, >>> + [OVS_SAMPLE_ATTR_PSAMPLE_COOKIE] = { .type = NL_A_UNSPEC, >>> + .optional = true, >>> + .max_len = TC_COOKIE_MAX_SIZE >>> } >>> + }; >>> + struct nlattr *a[ARRAY_SIZE(ovs_sample_policy)]; >>> + uint32_t probability; >>> + >>> + if (!nl_parse_nested(nl_act, ovs_sample_policy, a, ARRAY_SIZE(a))) { >>> + return EOPNOTSUPP; >>> + } >>> + >>> + action->type = TC_ACT_SAMPLE; >>> + /* OVS probability and TC sampling rate have different semantics. >>> + * The former represents the number of sampled packets out of >>> UINT32_MAX >>> + * while the other represents the ratio between observed and sampled >>> + * packets. */ >>> + probability = nl_attr_get_u32(a[OVS_SAMPLE_ATTR_PROBABILITY]); >>> + if (!probability) { >>> + return EINVAL; >>> + } >>> + action->sample.rate = UINT32_MAX / probability; >>> + >>> + action->sample.group_id = >>> + nl_attr_get_u32(a[OVS_SAMPLE_ATTR_PSAMPLE_GROUP]); >>> + >>> + if (a[OVS_SAMPLE_ATTR_PSAMPLE_COOKIE]) { >>> + action->sample.cookie_len = >>> + nl_attr_get_size(a[OVS_SAMPLE_ATTR_PSAMPLE_COOKIE]); >>> + >>> + memcpy(&action->sample.cookie[0], >>> + nl_attr_get(a[OVS_SAMPLE_ATTR_PSAMPLE_COOKIE]), >>> + action->sample.cookie_len); >>> + } >>> + >>> + flower->action_count++; >>> + return 0; >>> +} >>> + >>> static int >>> netdev_tc_parse_nl_actions(struct netdev *netdev, struct tc_flower >>> *flower, >>> struct offload_info *info, >>> @@ -2195,6 +2257,11 @@ netdev_tc_parse_nl_actions(struct netdev *netdev, >>> struct tc_flower *flower, >>> if (err) { >>> return err; >>> } >>> + } else if (nl_attr_type(nla) == OVS_ACTION_ATTR_SAMPLE) { >>> + err = parse_sample_action(flower, nla, action); >>> + if (err) { >>> + return err; >>> + } >>> } else { >>> VLOG_DBG_RL(&rl, "unsupported put action type: %d", >>> nl_attr_type(nla)); >>> diff --git a/lib/tc.c b/lib/tc.c >>> index 7176991c3..08d23064d 100644 >>> --- a/lib/tc.c >>> +++ b/lib/tc.c >>> @@ -22,15 +22,16 @@ >>> #include <linux/if_ether.h> >>> #include <linux/if_packet.h> >>> #include <linux/rtnetlink.h> >>> +#include <linux/tc_act/tc_ct.h> >>> #include <linux/tc_act/tc_csum.h> >>> #include <linux/tc_act/tc_gact.h> >>> #include <linux/tc_act/tc_mirred.h> >>> #include <linux/tc_act/tc_mpls.h> >>> #include <linux/tc_act/tc_pedit.h> >>> +#include <linux/tc_act/tc_sample.h> >>> #include <linux/tc_act/tc_skbedit.h> >>> #include <linux/tc_act/tc_tunnel_key.h> >>> #include <linux/tc_act/tc_vlan.h> >>> -#include <linux/tc_act/tc_ct.h> >>> #include <linux/gen_stats.h> >>> #include <net/if.h> >>> #include <unistd.h> >>> @@ -1563,6 +1564,42 @@ nl_parse_act_police(const struct nlattr *options, >>> struct tc_flower *flower) >>> return 0; >>> } >>> >>> +static const struct nl_policy sample_policy[] = { >>> + [TCA_SAMPLE_PARMS] = { .type = NL_A_UNSPEC, >>> + .min_len = sizeof(struct tc_sample), >>> + .optional = false, }, >>> + [TCA_SAMPLE_PSAMPLE_GROUP] = { .type = NL_A_U32, >>> + .optional = false, }, >>> + [TCA_SAMPLE_RATE] = { .type = NL_A_U32, >>> + .optional = false, }, >>> +}; >>> + >>> +static int >>> +nl_parse_act_sample(struct nlattr *options, const struct nlattr *cookie, >>> + struct tc_flower *flower) >>> +{ >>> + struct nlattr *sample_attrs[ARRAY_SIZE(sample_policy)]; >>> + struct tc_action *action; >>> + >>> + if (!nl_parse_nested(options, sample_policy, sample_attrs, >>> + ARRAY_SIZE(sample_policy))) { >>> + VLOG_ERR_RL(&error_rl, "Failed to parse sample action options"); >>> + return EPROTO; >>> + } >>> + >>> + action = &flower->actions[flower->action_count++]; >>> + action->type = TC_ACT_SAMPLE; >>> + action->sample.group_id = >>> + nl_attr_get_u32(sample_attrs[TCA_SAMPLE_PSAMPLE_GROUP]); >>> + action->sample.rate = >>> + nl_attr_get_u32(sample_attrs[TCA_SAMPLE_RATE]); >>> + action->sample.cookie_len = nl_attr_get_size(cookie); >>> + memcpy(&action->sample.cookie[0], nl_attr_get(cookie), >>> + MIN(action->sample.cookie_len, TC_COOKIE_MAX_SIZE)); >>> + >>> + return 0; >>> +} >>> + >>> static const struct nl_policy mirred_policy[] = { >>> [TCA_MIRRED_PARMS] = { .type = NL_A_UNSPEC, >>> .min_len = sizeof(struct tc_mirred), >>> @@ -2078,6 +2115,8 @@ nl_parse_single_action(struct nlattr *action, struct >>> tc_flower *flower, >>> nl_parse_act_ct(act_options, flower); >>> } else if (!strcmp(act_kind, "police")) { >>> nl_parse_act_police(act_options, flower); >>> + } else if (!strcmp(act_kind, "sample")) { >>> + nl_parse_act_sample(act_options, act_cookie, flower); >>> } else { >>> VLOG_ERR_RL(&error_rl, "unknown tc action kind: %s", act_kind); >>> err = EINVAL; >>> @@ -2087,7 +2126,8 @@ nl_parse_single_action(struct nlattr *action, struct >>> tc_flower *flower, >>> return err; >>> } >>> >>> - if (act_cookie) { >>> + /* TC_ACT_SAMPLE uses the cookie to store action-specific metadata. */ >> >> I thought the plan was to add a psample specific cookie, was this rejected >> upstream? >> > > Well, from the kernel perspective, the sample action already has a cookie. > Isn't it better just to reuse it instead of adding a redundant argument? I was just wondering if it was proposed, and rejected upstream. But in this light it makes sense. >>> + if (act_cookie && strcmp(act_kind, "sample")) { >>> flower->flow_cookie.data = nl_attr_get(act_cookie); >>> flower->flow_cookie.len = nl_attr_get_size(act_cookie); >>> } >>> @@ -2901,6 +2941,29 @@ nl_msg_put_act_mirred(struct ofpbuf *request, int >>> ifindex, int action, >>> nl_msg_end_nested(request, offset); >>> } >>> >>> +static void >>> +nl_msg_put_act_sample(struct ofpbuf *request, struct tc_action *action, >>> + uint32_t action_pc) >>> +{ >>> + size_t offset; >>> + >>> + nl_msg_put_string(request, TCA_ACT_KIND, "sample"); >>> + offset = nl_msg_start_nested(request, TCA_ACT_OPTIONS | NLA_F_NESTED); >>> + { >>> + struct tc_sample parm = { .action = action_pc }; >>> + >>> + nl_msg_put_unspec(request, TCA_SAMPLE_PARMS, &parm, sizeof parm); >>> + nl_msg_put_u32(request, TCA_SAMPLE_RATE, action->sample.rate); >>> + nl_msg_put_u32(request, TCA_SAMPLE_PSAMPLE_GROUP, >>> + action->sample.group_id); >>> + } >>> + nl_msg_end_nested(request, offset); >>> + if (action->sample.cookie_len) { >>> + nl_msg_put_unspec(request, TCA_ACT_COOKIE, >>> &action->sample.cookie[0], >>> + action->sample.cookie_len); >>> + } >>> +} >>> + >>> static inline void >>> nl_msg_put_act_cookie(struct ofpbuf *request, struct tc_cookie *ck) { >>> if (ck->len) { >>> @@ -3220,6 +3283,7 @@ get_action_index_for_tc_actions(struct tc_flower >>> *flower, uint16_t act_index, >>> case TC_ACT_MPLS_SET: >>> case TC_ACT_GOTO: >>> case TC_ACT_CT: >>> + case TC_ACT_SAMPLE: >> >> Note that this is no longer true if the sample patch get applied later. >> > > Yep. I know. When the sflow patch gets applied I'll need to rebase. > >>> /* Increase act_index by one if we are sure this type of >>> action >>> * will only add one tc action in the kernel. */ >>> act_index++; >>> @@ -3506,13 +3570,27 @@ nl_msg_put_flower_acts(struct ofpbuf *request, >>> struct tc_flower *flower) >>> nl_msg_end_nested(request, act_offset); >>> } >>> break; >>> + case TC_ACT_SAMPLE: { >>> + act_offset = nl_msg_start_nested(request, act_index++); >>> + /* TC_ACT_SAMPLE uses the action cookie so intentionally >>> + * skipping flow action configuration. */ >>> + nl_msg_put_act_sample(request, action, action_pc); >>> + nl_msg_end_nested(request, act_offset); >>> + break; >>> + } >>> } >>> >>> prev_action_pc = action_pc; >>> } >>> } >>> >>> - if (!flower->action_count) { >>> + /* A flow with only a TC_ACT_SAMPLE action is possible. It is sampling >>> the >>> + * packet as it gets dropped. But given TC_ACT_SAMPLE uses the cookie >>> to >>> + * store action-specific data, add the explicit drop action to store >>> the >>> + * flow cookie. */ >> >> I guess this could be a problem, as the datapath rule now looks different >> than the rule we were asked to install. >> >> i.e., if you do a ovs-appctl dpctl/dump-flows for both kernel and tc, does >> the output show the same? It's been a while, but I think in the past the >> revalidator also complained if the rules did not match (but might before we >> had the ufid in the cookie). >> > > I didn't add a check but I remember testing it and it works. The comparison > you mention (we read the same we wrote), apart from doable visually from > ovs-appctl dpctl/dump-flows, is done programatically in tc_replace_flower > comparing the read tc_flower is the same as what we configured. I’m talking about the ‘ovs-appctl dpctl/dump-flows’ output. But if you tested it and it’s the same we should be good. Maybe add a test case for this, as suggested in patch 11/11. > > When actions are parsed (read) the explicit action is read, the ufid is > interpreted as the flow cookie but the actual action is discarted (i.e: made > implicit). See nl_parse_single_action and nl_parse_act_gact. > > >> There are other combinations of rules that could cause no ufid cookie to be >> installed, guess they should be handled here also. >> > > What combinations are those? If they are, I think they exist now as well. Check nl_msg_put_flower_acts() for all actions not having nl_msg_put_act_cookie(). But I guess you are right, so maybe we should not add the explicit drop here at all? >> Having said this, we always kept telling people not to use the act_cookie >> for anything else than ufid. So my preference would be to add a match >> cookie, to the kernel, and if that exists we can free up the action cookie >> for other use. > > > goto top comment > >> >>> + if (!flower->action_count || >>> + (flower->action_count == 1 && >>> + flower->actions[0].type == TC_ACT_SAMPLE)) { >>> act_offset = nl_msg_start_nested(request, act_index++); >>> nl_msg_put_act_gact(request, 0); >>> nl_msg_put_act_cookie(request, &flower->flow_cookie); >>> diff --git a/lib/tc.h b/lib/tc.h >>> index 40ea3d816..e6ad6950e 100644 >>> --- a/lib/tc.h >>> +++ b/lib/tc.h >>> @@ -201,6 +201,7 @@ enum tc_action_type { >>> TC_ACT_CT, >>> TC_ACT_POLICE, >>> TC_ACT_POLICE_MTU, >>> + TC_ACT_SAMPLE, >>> }; >>> >>> enum nat_type { >>> @@ -296,6 +297,12 @@ struct tc_action { >>> uint32_t result_jump; >>> uint16_t mtu; >>> } police; >>> + struct { >>> + uint32_t rate; >>> + uint32_t group_id; >>> + uint16_t cookie_len; >>> + uint8_t cookie[TC_COOKIE_MAX_SIZE]; >>> + } sample; >>> }; >>> >>> enum tc_action_type type; >>> diff --git a/utilities/ovs-psample.c b/utilities/ovs-psample.c >>> index 9127d065c..51c72cc30 100644 >>> --- a/utilities/ovs-psample.c >>> +++ b/utilities/ovs-psample.c >>> @@ -35,7 +35,7 @@ >>> #include "openvswitch/uuid.h" >>> #include "openvswitch/vlog.h" >>> >>> -VLOG_DEFINE_THIS_MODULE(ovs_sample); >>> +VLOG_DEFINE_THIS_MODULE(ovs_psample); >>> >>> /* -g, --group: Group id filter option */ >>> static uint32_t group_id = 0; >>> @@ -206,7 +206,7 @@ parse_psample(struct ofpbuf *buf, struct sample >>> *sample) { >>> struct nlmsghdr *nlmsg = ofpbuf_try_pull(&b, sizeof *nlmsg); >>> struct genlmsghdr *genl = ofpbuf_try_pull(&b, sizeof *genl); >>> struct nlattr *attr; >>> - const char *cookie; >>> + const uint32_t *cookie; >>> >>> struct nlattr *a[ARRAY_SIZE(psample_packet_policy)]; >>> if (!nlmsg || !genl >>> @@ -227,8 +227,8 @@ parse_psample(struct ofpbuf *buf, struct sample >>> *sample) { >>> if (attr && nl_attr_get_size(attr) == 8) { >>> cookie = nl_attr_get(attr); >>> sample->has_cookie = true; >>> - sample->obs_domain_id = (uint32_t) *(&cookie[0]); >>> - sample->obs_point_id = (uint32_t) *(&cookie[4]); >>> + sample->obs_domain_id = cookie[0]; >>> + sample->obs_point_id = cookie[1]; >>> } >>> return 0; >>> } >>> -- >>> 2.44.0 >> _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
