Re: [ovs-dev] [PATCH v26 4/8] netdev-offload-tc: Add sample offload API for TC
On 26 May 2023, at 10:18, Chris Mi wrote: > On 5/10/2023 7:42 PM, Eelco Chaudron wrote: >> On 26 Apr 2023, at 4:42, Chris Mi wrote: >> >> >/* DPIF_UC_ACTION only. */ >struct nlattr *userdata;/* Argument to > OVS_ACTION_ATTR_USERSPACE. */ > -struct nlattr *out_tun_key;/* Output tunnel key. */ > -struct nlattr *actions;/* Argument to OVS_ACTION_ATTR_USERSPACE. > */ > +struct nlattr *out_tun_key; /* Output tunnel key. */ > +struct nlattr *actions; /* Argument to > OVS_ACTION_ATTR_USERSPACE. */ > +struct flow flow; /* Caller provide 'flow' if 'key' is not > + available. */ "Caller provided 'flow' if the 'key' is not available." >> Did you fix this one? > Done. Also, multiline comments should start with a *. >>> But it is not true for most of data struct member comments. For example, >>> >>> struct dpif_dp_stats { >>> uint64_t n_hit; /* Number of flow table matches. */ >>> uint64_t n_missed; /* Number of flow table misses. */ >>> uint64_t n_lost; /* Number of misses not sent to userspace. >>> */ >>> uint64_t n_flows; /* Number of flows present. */ >>> uint64_t n_cache_hit; /* Number of mega flow mask cache hits for >>> flow table matches. */ >>> uint64_t n_mask_hit; /* Number of mega flow masks visited for >>> flow table matches. */ >>> uint32_t n_masks; /* Number of mega flow masks. */ >>> }; >>> >>> Could you please confirm with it? Thanks. >> Looks like this file has both, and more without the starting *, so I assume >> you can keep it as is. > OK, thanks. >> >> > +nl_parse_psample(struct offload_psample *psample, struct ofpbuf *buf) > +{ > +static const struct nl_policy ovs_psample_policy[] = { > +[PSAMPLE_ATTR_IIFINDEX] = { .type = NL_A_U16, .optional = true, > }, > +[PSAMPLE_ATTR_SAMPLE_GROUP] = { .type = NL_A_U32 }, > +[PSAMPLE_ATTR_GROUP_SEQ] = { .type = NL_A_U32 }, > +[PSAMPLE_ATTR_DATA] = { .type = NL_A_UNSPEC }, > +}; > +struct nlattr *a[ARRAY_SIZE(ovs_psample_policy)]; > +struct genlmsghdr *genl; > +struct nlmsghdr *nlmsg; > +struct ofpbuf b; > + > +b = ofpbuf_const_initializer(buf->data, buf->size); > +nlmsg = ofpbuf_try_pull(, sizeof *nlmsg); > +genl = ofpbuf_try_pull(, sizeof *genl); > +if (!nlmsg || !genl || nlmsg->nlmsg_type != psample_family > +|| !nl_policy_parse(, 0, ovs_psample_policy, a, > +ARRAY_SIZE(ovs_psample_policy))) { > +return EINVAL; > +} > + > +if (a[PSAMPLE_ATTR_IIFINDEX]) { > +psample->iifindex = nl_attr_get_u16(a[PSAMPLE_ATTR_IIFINDEX]); > +} Do } else { psample->iffindex = 0 }, to safe on the memset later. >>> I removed psample->ifindex. New code will get input ifindex also via >>> group_id mapping >>> instead of from TC. TC only provides group id and sampled packets. We will >>> get other info >>> all from the group_id mapping, eg. output tunnel, input ifindex. >> Can you elaborate a bit more on why you made this change? Is it because the >> value is not returned by the kernel? >> >> > I explained in patch #8. There is a limitation that ovs/tc can't get > in_ifindex from namespace. > And I think this change is a good design. It is extensible. > +upcall->userdata = sample->userdata; > +if (sample->tunnel) { > +memcpy(>tunnel, sample->tunnel, sizeof flow->tunnel); Are we ok memcpy will work here? We might need to initialize the sample->tunnel info correctly in the upcomming patches, see flow_tnl_copy__(). >>> From the testing result, it seems it is ok. Maybe flow_tnl_copy__() is >>> more accurate. >>> But we need to change a lot of functions non-static. Not sure if we should >>> do that. >>> What do you think? >> It’s already a static inline function in an included file, so maybe just >> removing the __ is all that needs to be done. >> This way we are sure we copy only the needed data (after memsetting the >> destination), and are not introducing any hidden problem that does not show >> up with the current tests. > Yes, indeed. Done. > +} > +if (sample->userspace_actions) { > +upcall->actions = sample->userspace_actions; > +} > +if (psample->iifindex) { > +flow->in_port.odp_port = > netdev_ifindex_to_odp_port(psample->iifindex); > +} If not provided I think the port should be set to ODPP_NONE (the default value), please confirm. If this is not true the above netdev_ifindex_to_odp_port() should cause this function to fail if ODPP_NONE is returned. >>> Now ifindex is
Re: [ovs-dev] [PATCH v26 4/8] netdev-offload-tc: Add sample offload API for TC
On 5/10/2023 7:42 PM, Eelco Chaudron wrote: On 26 Apr 2023, at 4:42, Chris Mi wrote: /* DPIF_UC_ACTION only. */ struct nlattr *userdata;/* Argument to OVS_ACTION_ATTR_USERSPACE. */ -struct nlattr *out_tun_key;/* Output tunnel key. */ -struct nlattr *actions;/* Argument to OVS_ACTION_ATTR_USERSPACE. */ +struct nlattr *out_tun_key; /* Output tunnel key. */ +struct nlattr *actions; /* Argument to OVS_ACTION_ATTR_USERSPACE. */ +struct flow flow; /* Caller provide 'flow' if 'key' is not + available. */ "Caller provided 'flow' if the 'key' is not available." Did you fix this one? Done. Also, multiline comments should start with a *. But it is not true for most of data struct member comments. For example, struct dpif_dp_stats { uint64_t n_hit; /* Number of flow table matches. */ uint64_t n_missed; /* Number of flow table misses. */ uint64_t n_lost; /* Number of misses not sent to userspace. */ uint64_t n_flows; /* Number of flows present. */ uint64_t n_cache_hit; /* Number of mega flow mask cache hits for flow table matches. */ uint64_t n_mask_hit; /* Number of mega flow masks visited for flow table matches. */ uint32_t n_masks; /* Number of mega flow masks. */ }; Could you please confirm with it? Thanks. Looks like this file has both, and more without the starting *, so I assume you can keep it as is. OK, thanks. +nl_parse_psample(struct offload_psample *psample, struct ofpbuf *buf) +{ +static const struct nl_policy ovs_psample_policy[] = { +[PSAMPLE_ATTR_IIFINDEX] = { .type = NL_A_U16, .optional = true, }, +[PSAMPLE_ATTR_SAMPLE_GROUP] = { .type = NL_A_U32 }, +[PSAMPLE_ATTR_GROUP_SEQ] = { .type = NL_A_U32 }, +[PSAMPLE_ATTR_DATA] = { .type = NL_A_UNSPEC }, +}; +struct nlattr *a[ARRAY_SIZE(ovs_psample_policy)]; +struct genlmsghdr *genl; +struct nlmsghdr *nlmsg; +struct ofpbuf b; + +b = ofpbuf_const_initializer(buf->data, buf->size); +nlmsg = ofpbuf_try_pull(, sizeof *nlmsg); +genl = ofpbuf_try_pull(, sizeof *genl); +if (!nlmsg || !genl || nlmsg->nlmsg_type != psample_family +|| !nl_policy_parse(, 0, ovs_psample_policy, a, +ARRAY_SIZE(ovs_psample_policy))) { +return EINVAL; +} + +if (a[PSAMPLE_ATTR_IIFINDEX]) { +psample->iifindex = nl_attr_get_u16(a[PSAMPLE_ATTR_IIFINDEX]); +} Do } else { psample->iffindex = 0 }, to safe on the memset later. I removed psample->ifindex. New code will get input ifindex also via group_id mapping instead of from TC. TC only provides group id and sampled packets. We will get other info all from the group_id mapping, eg. output tunnel, input ifindex. Can you elaborate a bit more on why you made this change? Is it because the value is not returned by the kernel? I explained in patch #8. There is a limitation that ovs/tc can't get in_ifindex from namespace. And I think this change is a good design. It is extensible. +upcall->userdata = sample->userdata; +if (sample->tunnel) { +memcpy(>tunnel, sample->tunnel, sizeof flow->tunnel); Are we ok memcpy will work here? We might need to initialize the sample->tunnel info correctly in the upcomming patches, see flow_tnl_copy__(). From the testing result, it seems it is ok. Maybe flow_tnl_copy__() is more accurate. But we need to change a lot of functions non-static. Not sure if we should do that. What do you think? It’s already a static inline function in an included file, so maybe just removing the __ is all that needs to be done. This way we are sure we copy only the needed data (after memsetting the destination), and are not introducing any hidden problem that does not show up with the current tests. Yes, indeed. Done. +} +if (sample->userspace_actions) { +upcall->actions = sample->userspace_actions; +} +if (psample->iifindex) { +flow->in_port.odp_port = netdev_ifindex_to_odp_port(psample->iifindex); +} If not provided I think the port should be set to ODPP_NONE (the default value), please confirm. If this is not true the above netdev_ifindex_to_odp_port() should cause this function to fail if ODPP_NONE is returned. Now ifindex is got from netdev_get_ifindex(netdev) in netdev_tc_flow_put(). It should be provided always. ACK ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v26 4/8] netdev-offload-tc: Add sample offload API for TC
On 26 Apr 2023, at 4:42, Chris Mi wrote: >>> /* DPIF_UC_ACTION only. */ >>> struct nlattr *userdata;/* Argument to OVS_ACTION_ATTR_USERSPACE. >>> */ >>> -struct nlattr *out_tun_key;/* Output tunnel key. */ >>> -struct nlattr *actions;/* Argument to OVS_ACTION_ATTR_USERSPACE. */ >>> +struct nlattr *out_tun_key; /* Output tunnel key. */ >>> +struct nlattr *actions; /* Argument to OVS_ACTION_ATTR_USERSPACE. >>> */ >>> +struct flow flow; /* Caller provide 'flow' if 'key' is not >>> + available. */ >> "Caller provided 'flow' if the 'key' is not available." Did you fix this one? >> Also, multiline comments should start with a *. > But it is not true for most of data struct member comments. For example, > > struct dpif_dp_stats { > uint64_t n_hit; /* Number of flow table matches. */ > uint64_t n_missed; /* Number of flow table misses. */ > uint64_t n_lost; /* Number of misses not sent to userspace. */ > uint64_t n_flows; /* Number of flows present. */ > uint64_t n_cache_hit; /* Number of mega flow mask cache hits for > flow table matches. */ > uint64_t n_mask_hit; /* Number of mega flow masks visited for > flow table matches. */ > uint32_t n_masks; /* Number of mega flow masks. */ > }; > > Could you please confirm with it? Thanks. Looks like this file has both, and more without the starting *, so I assume you can keep it as is. >>> +nl_parse_psample(struct offload_psample *psample, struct ofpbuf *buf) >>> +{ >>> +static const struct nl_policy ovs_psample_policy[] = { >>> +[PSAMPLE_ATTR_IIFINDEX] = { .type = NL_A_U16, .optional = true, }, >>> +[PSAMPLE_ATTR_SAMPLE_GROUP] = { .type = NL_A_U32 }, >>> +[PSAMPLE_ATTR_GROUP_SEQ] = { .type = NL_A_U32 }, >>> +[PSAMPLE_ATTR_DATA] = { .type = NL_A_UNSPEC }, >>> +}; >>> +struct nlattr *a[ARRAY_SIZE(ovs_psample_policy)]; >>> +struct genlmsghdr *genl; >>> +struct nlmsghdr *nlmsg; >>> +struct ofpbuf b; >>> + >>> +b = ofpbuf_const_initializer(buf->data, buf->size); >>> +nlmsg = ofpbuf_try_pull(, sizeof *nlmsg); >>> +genl = ofpbuf_try_pull(, sizeof *genl); >>> +if (!nlmsg || !genl || nlmsg->nlmsg_type != psample_family >>> +|| !nl_policy_parse(, 0, ovs_psample_policy, a, >>> +ARRAY_SIZE(ovs_psample_policy))) { >>> +return EINVAL; >>> +} >>> + >>> +if (a[PSAMPLE_ATTR_IIFINDEX]) { >>> +psample->iifindex = nl_attr_get_u16(a[PSAMPLE_ATTR_IIFINDEX]); >>> +} >> Do } else { psample->iffindex = 0 }, to safe on the memset later. > I removed psample->ifindex. New code will get input ifindex also via group_id > mapping > instead of from TC. TC only provides group id and sampled packets. We will > get other info > all from the group_id mapping, eg. output tunnel, input ifindex. Can you elaborate a bit more on why you made this change? Is it because the value is not returned by the kernel? >>> +upcall->userdata = sample->userdata; >>> +if (sample->tunnel) { >>> +memcpy(>tunnel, sample->tunnel, sizeof flow->tunnel); >> Are we ok memcpy will work here? We might need to initialize the >> sample->tunnel info correctly in the upcomming patches, see >> flow_tnl_copy__(). > From the testing result, it seems it is ok. Maybe flow_tnl_copy__() is more > accurate. > But we need to change a lot of functions non-static. Not sure if we should do > that. > What do you think? It’s already a static inline function in an included file, so maybe just removing the __ is all that needs to be done. This way we are sure we copy only the needed data (after memsetting the destination), and are not introducing any hidden problem that does not show up with the current tests. >>> +} >>> +if (sample->userspace_actions) { >>> +upcall->actions = sample->userspace_actions; >>> +} >>> +if (psample->iifindex) { >>> +flow->in_port.odp_port = >>> netdev_ifindex_to_odp_port(psample->iifindex); >>> +} >> If not provided I think the port should be set to ODPP_NONE (the default >> value), please confirm. >> If this is not true the above netdev_ifindex_to_odp_port() should cause this >> function to fail if ODPP_NONE is returned. > Now ifindex is got from netdev_get_ifindex(netdev) in netdev_tc_flow_put(). > It should be provided always. ACK ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v26 4/8] netdev-offload-tc: Add sample offload API for TC
On 4/12/2023 10:06 PM, Eelco Chaudron wrote: On 29 Mar 2023, at 13:42, Chris Mi wrote: Initialize psample socket. Add sample recv API to receive sampled packets from psample socket. Add sample recv wait API to add psample socket fd to poll list. Thanks for the update, see comments inline. //Eelco Signed-off-by: Chris Mi Reviewed-by: Roi Dayan --- lib/dpif.h| 6 +- lib/netdev-offload-provider.h | 17 +++ lib/netdev-offload-tc.c | 192 ++ lib/netdev-offload.c | 3 +- 4 files changed, 215 insertions(+), 3 deletions(-) diff --git a/lib/dpif.h b/lib/dpif.h index 129cbf6a1..5b094ce68 100644 --- a/lib/dpif.h +++ b/lib/dpif.h @@ -834,8 +834,10 @@ struct dpif_upcall { /* DPIF_UC_ACTION only. */ struct nlattr *userdata;/* Argument to OVS_ACTION_ATTR_USERSPACE. */ -struct nlattr *out_tun_key;/* Output tunnel key. */ -struct nlattr *actions;/* Argument to OVS_ACTION_ATTR_USERSPACE. */ +struct nlattr *out_tun_key; /* Output tunnel key. */ +struct nlattr *actions; /* Argument to OVS_ACTION_ATTR_USERSPACE. */ +struct flow flow; /* Caller provide 'flow' if 'key' is not + available. */ "Caller provided 'flow' if the 'key' is not available." Also, multiline comments should start with a *. But it is not true for most of data struct member comments. For example, struct dpif_dp_stats { uint64_t n_hit; /* Number of flow table matches. */ uint64_t n_missed; /* Number of flow table misses. */ uint64_t n_lost; /* Number of misses not sent to userspace. */ uint64_t n_flows; /* Number of flows present. */ uint64_t n_cache_hit; /* Number of mega flow mask cache hits for flow table matches. */ uint64_t n_mask_hit; /* Number of mega flow masks visited for flow table matches. */ uint32_t n_masks; /* Number of mega flow masks. */ }; Could you please confirm with it? Thanks. }; /* A callback to notify higher layer of dpif about to be purged, so that diff --git a/lib/netdev-offload-provider.h b/lib/netdev-offload-provider.h index 9108856d1..b3bf433d0 100644 --- a/lib/netdev-offload-provider.h +++ b/lib/netdev-offload-provider.h @@ -28,6 +28,8 @@ extern "C" { #endif +struct dpif_upcall; + struct netdev_flow_api { char *type; /* Flush all offloaded flows from a netdev. @@ -121,6 +123,21 @@ struct netdev_flow_api { int (*meter_del)(ofproto_meter_id meter_id, struct ofputil_meter_stats *stats); +/* Polls for sample offload packets for an upcall handler. If successful, This should be a general upcall handling API, so I would change this to: /* Polls for upcall offload packets... Done. + * stores the upcall into '*upcall', using 'buf' for storage. + * Also missing this part as suggested earlier, as it should also apply to the code supplied here (does it?): * The implementation should point 'upcall->key' and 'upcall->userdata' * (if any) into data in the caller-provided 'buf'. The implementation may * also use 'buf' for storing the data of 'upcall->packet'. If necessary * to make room, the implementation may reallocate the data in 'buf'. * * The caller owns the data of 'upcall->packet' and may modify it. If * packet's headroom is exhausted as it is manipulated, 'upcall->packet' * will be reallocated. This requires the data of 'upcall->packet' to be * released with ofpbuf_uninit() before 'upcall' is destroyed. However, * when an error is returned, the 'upcall->packet' may be uninitialized * and should not be released. Done. + * Return 0 if successful, otherwise returns a positive errno value. + * + * This function must not block. If no upcall is pending when it is + * called, it should return EAGAIN without blocking. */ +int (*recv)(struct dpif_upcall *upcall, struct ofpbuf *buf, +uint32_t handler_id); + +/* Arranges for the poll loop for an upcall handler to wake up when + * sample socket has a message queued to be received with the recv + * member functions. */ +void (*recv_wait)(uint32_t handler_id); + /* Initializies the netdev flow api. * Return 0 if successful, otherwise returns a positive errno value. */ int (*init_flow_api)(struct netdev *); diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c index c53ea5588..8c8f29af7 100644 --- a/lib/netdev-offload-tc.c +++ b/lib/netdev-offload-tc.c @@ -18,6 +18,8 @@ #include #include +#include +#include #include "dpif.h" #include "hash.h" @@ -124,6 +126,9 @@ struct sgid_node { struct offload_sample sample; }; +static struct nl_sock *psample_sock; +static int psample_family; + static struct ovs_mutex sgid_lock =
Re: [ovs-dev] [PATCH v26 4/8] netdev-offload-tc: Add sample offload API for TC
On 29 Mar 2023, at 13:42, Chris Mi wrote: > Initialize psample socket. Add sample recv API to receive sampled > packets from psample socket. Add sample recv wait API to add psample > socket fd to poll list. Thanks for the update, see comments inline. //Eelco > Signed-off-by: Chris Mi > Reviewed-by: Roi Dayan > --- > lib/dpif.h| 6 +- > lib/netdev-offload-provider.h | 17 +++ > lib/netdev-offload-tc.c | 192 ++ > lib/netdev-offload.c | 3 +- > 4 files changed, 215 insertions(+), 3 deletions(-) > > diff --git a/lib/dpif.h b/lib/dpif.h > index 129cbf6a1..5b094ce68 100644 > --- a/lib/dpif.h > +++ b/lib/dpif.h > @@ -834,8 +834,10 @@ struct dpif_upcall { > > /* DPIF_UC_ACTION only. */ > struct nlattr *userdata;/* Argument to OVS_ACTION_ATTR_USERSPACE. */ > -struct nlattr *out_tun_key;/* Output tunnel key. */ > -struct nlattr *actions;/* Argument to OVS_ACTION_ATTR_USERSPACE. */ > +struct nlattr *out_tun_key; /* Output tunnel key. */ > +struct nlattr *actions; /* Argument to OVS_ACTION_ATTR_USERSPACE. */ > +struct flow flow; /* Caller provide 'flow' if 'key' is not > + available. */ "Caller provided 'flow' if the 'key' is not available." Also, multiline comments should start with a *. > }; > > /* A callback to notify higher layer of dpif about to be purged, so that > diff --git a/lib/netdev-offload-provider.h b/lib/netdev-offload-provider.h > index 9108856d1..b3bf433d0 100644 > --- a/lib/netdev-offload-provider.h > +++ b/lib/netdev-offload-provider.h > @@ -28,6 +28,8 @@ > extern "C" { > #endif > > +struct dpif_upcall; > + > struct netdev_flow_api { > char *type; > /* Flush all offloaded flows from a netdev. > @@ -121,6 +123,21 @@ struct netdev_flow_api { > int (*meter_del)(ofproto_meter_id meter_id, > struct ofputil_meter_stats *stats); > > +/* Polls for sample offload packets for an upcall handler. If successful, This should be a general upcall handling API, so I would change this to: /* Polls for upcall offload packets... > + * stores the upcall into '*upcall', using 'buf' for storage. > + * Also missing this part as suggested earlier, as it should also apply to the code supplied here (does it?): * The implementation should point 'upcall->key' and 'upcall->userdata' * (if any) into data in the caller-provided 'buf'. The implementation may * also use 'buf' for storing the data of 'upcall->packet'. If necessary * to make room, the implementation may reallocate the data in 'buf'. * * The caller owns the data of 'upcall->packet' and may modify it. If * packet's headroom is exhausted as it is manipulated, 'upcall->packet' * will be reallocated. This requires the data of 'upcall->packet' to be * released with ofpbuf_uninit() before 'upcall' is destroyed. However, * when an error is returned, the 'upcall->packet' may be uninitialized * and should not be released. > + * Return 0 if successful, otherwise returns a positive errno value. > + * > + * This function must not block. If no upcall is pending when it is > + * called, it should return EAGAIN without blocking. */ > +int (*recv)(struct dpif_upcall *upcall, struct ofpbuf *buf, > +uint32_t handler_id); > + > +/* Arranges for the poll loop for an upcall handler to wake up when > + * sample socket has a message queued to be received with the recv > + * member functions. */ > +void (*recv_wait)(uint32_t handler_id); > + > /* Initializies the netdev flow api. > * Return 0 if successful, otherwise returns a positive errno value. */ > int (*init_flow_api)(struct netdev *); > diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c > index c53ea5588..8c8f29af7 100644 > --- a/lib/netdev-offload-tc.c > +++ b/lib/netdev-offload-tc.c > @@ -18,6 +18,8 @@ > > #include > #include > +#include > +#include > > #include "dpif.h" > #include "hash.h" > @@ -124,6 +126,9 @@ struct sgid_node { > struct offload_sample sample; > }; > > +static struct nl_sock *psample_sock; > +static int psample_family; > + > static struct ovs_mutex sgid_lock = OVS_MUTEX_INITIALIZER; > static struct cmap sgid_map = CMAP_INITIALIZER; > static struct id_pool *sample_group_ids OVS_GUARDED_BY(sgid_lock); > @@ -148,6 +153,14 @@ sgid_find(uint32_t id) > return node ? CONTAINER_OF(node, struct sgid_node, id_node) : NULL; > } > > +static struct offload_sample * > +sample_find(uint32_t id) > +{ > +struct sgid_node *node = sgid_find(id); > + > +return node ? >sample: NULL; > +} > + > static void > offload_sample_clone(struct offload_sample *new, > const struct offload_sample *old) > @@ -2931,6 +2944,55 @@ tc_cleanup_policer_actions(struct id_pool *police_ids, > hmap_destroy(); > } > > +static void > +psample_init(void)