On 3/29/22 20:36, Cpp Code wrote: > On Mon, Mar 21, 2022 at 9:46 AM Ilya Maximets <i.maxim...@ovn.org> wrote: >> >> On 3/18/22 22:34, Toms Atteka wrote: >>> IPv6 extension headers carry optional internet layer information >>> and are placed between the fixed header and the upper-layer >>> protocol header. >>> >>> This change adds a new OpenFlow field OFPXMT_OFB_IPV6_EXTHDR and >>> packets can be filtered using ipv6_ext flag. >>> >>> Some spacing style issues fixed. >>> >>> Tested-at: https://github.com/TomCodeLV/ovs/actions/runs/504185214 >>> Signed-off-by: Toms Atteka <cpp.code...@gmail.com> >>> --- >> >> Hi. Thanks for the patch! >> >> I didn't try to run it, but there are few issues highlighted inline >> that we might need to act on sooner rather than later as they might >> need some kernel uAPI changes. Please, take a look. >> Some less urgent issues also inline. >> >> Best regards, Ilya Maximets. >> >>> build-aux/extract-odp-netlink-macros-h | 1 + >>> build-aux/extract-ofp-fields | 4 +- >>> datapath/flow.c | 192 +++++++++++++++--- >>> datapath/flow.h | 14 ++ >>> datapath/flow_netlink.c | 25 ++- >> >> Please, don't include code changes for the kernel datapath. >> OOT kernel module is deprecated at this point and will be >> completely removed soon. Only uAPI header and build fixes >> (if any) should be part of that patch. > > At that point we would take kernel modules from linux direclty?
Yes. The idea is that users need to use the openvswitch module provided along with their kernel. > >> >>> .../linux/compat/include/linux/openvswitch.h | 6 + >>> include/openvswitch/flow.h | 8 +- >>> include/openvswitch/match.h | 2 + >>> include/openvswitch/meta-flow.h | 18 ++ >>> lib/dpif-netdev-extract-avx512.c | 2 +- >>> lib/flow.c | 188 ++++++++++++++++- >>> lib/flow.h | 6 +- >>> lib/match.c | 25 ++- >>> lib/meta-flow.c | 80 +++++++- >>> lib/meta-flow.xml | 14 ++ >>> lib/nx-match.c | 13 +- >>> lib/odp-execute.c | 2 + >>> lib/odp-util.c | 111 +++++++++- >>> lib/odp-util.h | 11 +- >>> lib/ofp-match.c | 2 +- >>> lib/packets.c | 28 +++ >>> lib/packets.h | 12 ++ >>> ofproto/ofproto-dpif-rid.h | 2 +- >>> ofproto/ofproto-dpif-sflow.c | 1 + >>> ofproto/ofproto-dpif-xlate.c | 2 +- >>> ofproto/ofproto-dpif.c | 47 +++++ >>> tests/ofproto.at | 4 +- >>> tests/system-traffic.at | 31 +++ >>> 28 files changed, 789 insertions(+), 62 deletions(-) >>> >> >> <snip> >> >>> diff --git a/datapath/linux/compat/include/linux/openvswitch.h >>> b/datapath/linux/compat/include/linux/openvswitch.h >>> index 8d9300091..b9e74543b 100644 >>> --- a/datapath/linux/compat/include/linux/openvswitch.h >>> +++ b/datapath/linux/compat/include/linux/openvswitch.h >>> @@ -387,6 +387,7 @@ enum ovs_key_attr { >>> OVS_KEY_ATTR_CT_ORIG_TUPLE_IPV4, /* struct ovs_key_ct_tuple_ipv4 */ >>> OVS_KEY_ATTR_CT_ORIG_TUPLE_IPV6, /* struct ovs_key_ct_tuple_ipv6 */ >>> OVS_KEY_ATTR_NSH, /* Nested set of ovs_nsh_key_* */ >>> + OVS_KEY_ATTR_IPV6_EXTHDRS, /* struct ovs_key_ipv6_exthdr */ >> >> In case you missed for some reason the latest kernel change, >> the uAPI header on net-next was flattened and looks differently >> now. Hence, the value of OVS_KEY_ATTR_IPV6_EXTHDRS is also >> different. >> >> I'll send a patch to partially backport the new header layout. > > Which kernel change you are reffering? Could you provide commit id? 1926407a4ab0 ("net: openvswitch: fix uAPI incompatibility with existing user space") I think, you were in CC list in the discussion. > >> >>> >>> #ifdef __KERNEL__ >>> /* Only used within kernel data path. */ >>> @@ -498,6 +499,11 @@ struct ovs_key_ipv6 { >>> __u8 ipv6_frag; /* One of OVS_FRAG_TYPE_*. */ >>> }; >>> >>> +/* separate structure to support backward compatibility with older user >>> space */ >>> +struct ovs_key_ipv6_exthdrs { >>> + __u16 hdrs; >>> +}; >>> + >>> struct ovs_key_tcp { >>> __be16 tcp_src; >>> __be16 tcp_dst; >>> diff --git a/include/openvswitch/flow.h b/include/openvswitch/flow.h >>> index 3054015d9..faa0fa412 100644 >>> --- a/include/openvswitch/flow.h >>> +++ b/include/openvswitch/flow.h >>> @@ -27,7 +27,7 @@ extern "C" { >>> /* This sequence number should be incremented whenever anything involving >>> flows >>> * or the wildcarding of flows changes. This will cause build assertion >>> * failures in places which likely need to be updated. */ >>> -#define FLOW_WC_SEQ 42 >>> +#define FLOW_WC_SEQ 43 >>> >>> /* Number of Open vSwitch extension 32-bit registers. */ >>> #define FLOW_N_REGS 16 >>> @@ -136,6 +136,8 @@ struct flow { >>> struct in6_addr ipv6_dst; /* IPv6 destination address. */ >>> struct in6_addr ct_ipv6_src; /* CT orig tuple IPv6 source address. */ >>> struct in6_addr ct_ipv6_dst; /* CT orig tuple IPv6 destination >>> address. */ >>> + uint16_t ipv6_exthdr; /* IPv6 flow extension headers. */ >>> + ovs_be16 pad4[3]; /* Pad to 64 bits. */ >> >> I'm not sure if that's the most efficient way of packing the >> structure. Though I didn't actually dig into that enough to >> suggest anything better. Will take another look once other >> issues are sorted out. > > So far I understand it has to be in some reasonable order, and padding > is mandatory. Yes, padding is mandatory, unless some fields can be packed together. I didn't look closely yet, so I don't know if the better placing is possible. > >> >>> ovs_be32 ipv6_label; /* IPv6 flow label. */ >>> uint8_t nw_frag; /* FLOW_FRAG_* flags. */ >>> uint8_t nw_tos; /* IP ToS (including DSCP and ECN). */ >>> @@ -167,8 +169,8 @@ BUILD_ASSERT_DECL(sizeof(struct ovs_key_nsh) % >>> sizeof(uint64_t) == 0); >>> >>> /* Remember to update FLOW_WC_SEQ when changing 'struct flow'. */ >>> BUILD_ASSERT_DECL(offsetof(struct flow, igmp_group_ip4) + sizeof(uint32_t) >>> - == sizeof(struct flow_tnl) + sizeof(struct ovs_key_nsh) >>> + 300 >>> - && FLOW_WC_SEQ == 42); >>> + == sizeof(struct flow_tnl) + sizeof(struct ovs_key_nsh) >>> + 308 >>> + && FLOW_WC_SEQ == 43); >>> >>> /* Incremental points at which flow classification may be performed in >>> * segments. >>> diff --git a/include/openvswitch/match.h b/include/openvswitch/match.h >>> index 2e8812048..5ba129190 100644 >>> --- a/include/openvswitch/match.h >>> +++ b/include/openvswitch/match.h >>> @@ -234,6 +234,8 @@ void match_set_ipv6_dst_masked(struct match *, const >>> struct in6_addr *, >>> const struct in6_addr *); >>> void match_set_ipv6_label(struct match *, ovs_be32); >>> void match_set_ipv6_label_masked(struct match *, ovs_be32, ovs_be32); >>> +void match_set_ipv6_exthdr(struct match *, uint16_t); >>> +void match_set_ipv6_exthdr_masked(struct match *, uint16_t, uint16_t); >>> void match_set_nd_target(struct match *, const struct in6_addr *); >>> void match_set_nd_target_masked(struct match *, const struct in6_addr *, >>> const struct in6_addr *); >>> diff --git a/include/openvswitch/meta-flow.h >>> b/include/openvswitch/meta-flow.h >>> index 045dce8f5..6f4d63b42 100644 >>> --- a/include/openvswitch/meta-flow.h >>> +++ b/include/openvswitch/meta-flow.h >>> @@ -1392,6 +1392,22 @@ enum OVS_PACKED_ENUM mf_field_id { >>> */ >>> MFF_IPV6_LABEL, >>> >>> + /* "ipv6_ext". >>> + * >>> + * IPV6 extension headers flags. >>> + * >>> + * Currently there are 9 IPV6 extension headers flags. >>> + * >>> + * Type: u16. >> >> This doesn't look right. OpenFlow spec defines the field as >> "16-bit integer with 7 most-significant bits forced to 0. Only the >> lower 9 bits have meaning". But that can not be achieved with the >> le field. In current implementation most significant bits will be >> set instead in most cases. So, that definition should be changed to >> >> Type: be16 (low 9 bits). > > I agree that there is missing bits part. But as it a pseudo field and the > value does not come from network it shouldn't be big endian. As per > convention we use big endian for values coming from network devices. The point here is that OpenFlow messages are coming from the network. So, if the OF controller is runnning on BE host, but OVS is running on LE host, they will not agree on the message format. > > So > Type: u16 (low 9 bits) > should be the way to go > >> >> I also see that this field is defined as u16 in the kernel uAPI. >> That might be fine to have it that way, but we'll need to convert >> between le and be every time we're exchanging messages between >> ovs-vswitch and the datapath. That might be not very convenient >> and prone to error. > > If I am not mistaken we don't exchange this value but generate it > independently > on userspace and in kernel. On upcall, the kernel will send it to the userspace in a netlink message. And OVS userspace will send the value generated in userspace to the kernel while installing datapath flows. We do have a conversion in palce for ct_state and it doesn't look bad (see ovs_to_odp_ct_state), so maybe that's not a big problem. But conversion has to be implemented, if format is different. <snip> _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev