[dpdk-dev] [PATCH v9 1/3] eal/arm64: add 128-bit atomic compare exchange
Add 128-bit atomic compare exchange on aarch64.

Suggested-by: Jerin Jacob
Signed-off-by: Phil Yang
Reviewed-by: Honnappa Nagarahalli
Tested-by: Honnappa Nagarahalli
Acked-by: Jerin Jacob
---
v9:
Updated 19.11 release note.

v8:
Fixed "WARNING:LONG_LINE: line over 80 characters" warnings with latest kernel checkpatch.pl

v7:
1. Adjust code comment.

v6:
1. Put the RTE_ARM_FEATURE_ATOMICS flag into EAL group. (Jerin Jacob)
2. Keep rte_stack_lf_stubs.h doing nothing. (Gage Eads)
3. Fixed 32 bit build issue.

v5:
1. Enable RTE_ARM_FEATURE_ATOMICS on octeontx2 by default. (Jerin Jacob)
2. Record the reason for introducing "rte_stack_lf_stubs.h" in git commit. (Jerin Jacob)
3. Fixed a conditional MACRO error in rte_atomic128_cmp_exchange. (Jerin Jacob)

v4:
1. Add RTE_ARM_FEATURE_ATOMICS flag to support LSE CASP instructions. (Jerin Jacob)
2. Fix possible arm64 ABI break by making casp_op_name noinline. (Jerin Jacob)
3. Add rte_stack_lf_stubs.h to reduce the ifdef clutter. (Gage Eads/Jerin Jacob)

v3:
1. Avoid code duplication with a macro. (Jerin Jacob)
2. Map an invalid memory order to the strongest barrier. (Jerin Jacob)
3. Update doc/guides/prog_guide/env_abstraction_layer.rst. (Gage Eads)
4. Fix 32-bit x86 build issue. (Gage Eads)
5. Correct documentation issues in UT. (Gage Eads)

v2:
Initial version.
config/arm/meson.build | 2 + config/common_base | 3 + config/defconfig_arm64-octeontx2-linuxapp-gcc | 1 + config/defconfig_arm64-thunderx2-linuxapp-gcc | 1 + .../common/include/arch/arm/rte_atomic_64.h| 163 + .../common/include/arch/x86/rte_atomic_64.h| 12 -- lib/librte_eal/common/include/generic/rte_atomic.h | 17 ++- 7 files changed, 186 insertions(+), 13 deletions(-) diff --git a/config/arm/meson.build b/config/arm/meson.build index 979018e..9f28271 100644 --- a/config/arm/meson.build +++ b/config/arm/meson.build @@ -71,11 +71,13 @@ flags_thunderx2_extra = [ ['RTE_CACHE_LINE_SIZE', 64], ['RTE_MAX_NUMA_NODES', 2], ['RTE_MAX_LCORE', 256], + ['RTE_ARM_FEATURE_ATOMICS', true], ['RTE_USE_C11_MEM_MODEL', true]] flags_octeontx2_extra = [ ['RTE_MACHINE', '"octeontx2"'], ['RTE_MAX_NUMA_NODES', 1], ['RTE_MAX_LCORE', 24], + ['RTE_ARM_FEATURE_ATOMICS', true], ['RTE_EAL_IGB_UIO', false], ['RTE_USE_C11_MEM_MODEL', true]] diff --git a/config/common_base b/config/common_base index 8ef75c2..2054480 100644 --- a/config/common_base +++ b/config/common_base @@ -82,6 +82,9 @@ CONFIG_RTE_MAX_LCORE=128 CONFIG_RTE_MAX_NUMA_NODES=8 CONFIG_RTE_MAX_HEAPS=32 CONFIG_RTE_MAX_MEMSEG_LISTS=64 + +# Use ARM LSE ATOMIC instructions +CONFIG_RTE_ARM_FEATURE_ATOMICS=n # each memseg list will be limited to either RTE_MAX_MEMSEG_PER_LIST pages # or RTE_MAX_MEM_MB_PER_LIST megabytes worth of memory, whichever is smaller CONFIG_RTE_MAX_MEMSEG_PER_LIST=8192 diff --git a/config/defconfig_arm64-octeontx2-linuxapp-gcc b/config/defconfig_arm64-octeontx2-linuxapp-gcc index f20da24..7687dbe 100644 --- a/config/defconfig_arm64-octeontx2-linuxapp-gcc +++ b/config/defconfig_arm64-octeontx2-linuxapp-gcc @@ -9,6 +9,7 @@ CONFIG_RTE_MACHINE="octeontx2" CONFIG_RTE_CACHE_LINE_SIZE=128 CONFIG_RTE_MAX_NUMA_NODES=1 CONFIG_RTE_MAX_LCORE=24 +CONFIG_RTE_ARM_FEATURE_ATOMICS=y # Doesn't support NUMA CONFIG_RTE_EAL_NUMA_AWARE_HUGEPAGES=n diff --git a/config/defconfig_arm64-thunderx2-linuxapp-gcc 
b/config/defconfig_arm64-thunderx2-linuxapp-gcc index cc5c64b..af4a89c 100644 --- a/config/defconfig_arm64-thunderx2-linuxapp-gcc +++ b/config/defconfig_arm64-thunderx2-linuxapp-gcc @@ -9,3 +9,4 @@ CONFIG_RTE_MACHINE="thunderx2" CONFIG_RTE_CACHE_LINE_SIZE=64 CONFIG_RTE_MAX_NUMA_NODES=2 CONFIG_RTE_MAX_LCORE=256 +CONFIG_RTE_ARM_FEATURE_ATOMICS=y diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h index 97060e4..14d869b 100644 --- a/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h +++ b/lib/librte_eal/common/include/arch/arm/rte_atomic_64.h @@ -1,5 +1,6 @@ /* SPDX-License-Identifier: BSD-3-Clause * Copyright(c) 2015 Cavium, Inc + * Copyright(c) 2019 Arm Limited */ #ifndef _RTE_ATOMIC_ARM64_H_ @@ -14,6 +15,9 @@ extern "C" { #endif #include "generic/rte_atomic.h" +#include +#include +#include #define dsb(opt) asm volatile("dsb " #opt : : : "memory") #define dmb(opt) asm volatile("dmb " #opt : : : "memory") @@ -40,6 +44,165 @@ extern "C" { #define rte_cio_rmb() dmb(oshld) +/* 128 bit atomic operations -*/ + +#define __HAS_ACQ(mo) ((mo) != __ATOMIC_RELAXED && (mo) != __ATOMIC_RELEASE) +#define __HAS_RLS(mo) ((mo) == __ATOMIC_RELEASE || (mo) == __ATOMIC_ACQ_REL || \ + (mo) == __ATOMIC_SEQ_CST) + +#define __MO_LOAD(mo) (__HAS_ACQ((mo)) ? __ATOMIC_ACQUIRE : __ATOMIC_RELAXED
[dpdk-dev] [PATCH v9 2/3] test/atomic: add 128b compare and swap test
Add 128b atomic compare and swap test for aarch64 and x86_64. Signed-off-by: Phil Yang Reviewed-by: Honnappa Nagarahalli Acked-by: Gage Eads Acked-by: Jerin Jacob Tested-by: Jerin Jacob --- app/test/test_atomic.c | 125 - 1 file changed, 123 insertions(+), 2 deletions(-) diff --git a/app/test/test_atomic.c b/app/test/test_atomic.c index 43be30e..0dad923 100644 --- a/app/test/test_atomic.c +++ b/app/test/test_atomic.c @@ -1,5 +1,6 @@ /* SPDX-License-Identifier: BSD-3-Clause * Copyright(c) 2010-2014 Intel Corporation + * Copyright(c) 2019 Arm Limited */ #include @@ -20,7 +21,7 @@ * Atomic Variables * * - * - The main test function performs three subtests. The first test + * - The main test function performs four subtests. The first test * checks that the usual inc/dec/add/sub functions are working * correctly: * @@ -61,11 +62,27 @@ * atomic_sub(&count, tmp+1); * * - At the end of the test, the *count* value must be 0. + * + * - Test "128b compare and swap" (aarch64 and x86_64 only) + * + * - Initialize 128-bit atomic variables to zero. + * + * - Invoke ``test_atomici128_cmp_exchange()`` on each lcore. Before doing + * anything else, the cores are waiting a synchro. Each lcore does + * these compare and swap (CAS) operations several times:: + * + * Acquired CAS update counter.val[0] + 2; counter.val[1] + 1; + * Released CAS update counter.val[0] + 2; counter.val[1] + 1; + * Acquired_Released CAS update counter.val[0] + 2; counter.val[1] + 1; + * Relaxed CAS update counter.val[0] + 2; counter.val[1] + 1; + * + * - At the end of the test, the *count128* first 64-bit value and + * second 64-bit value differ by the total iterations. 
*/ #define NUM_ATOMIC_TYPES 3 -#define N 1 +#define N 100 static rte_atomic16_t a16; static rte_atomic32_t a32; @@ -216,6 +233,78 @@ test_atomic_dec_and_test(__attribute__((unused)) void *arg) return 0; } +#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_ARM64) +static rte_int128_t count128; + +/* + * rte_atomic128_cmp_exchange() should update a 128 bits counter's first 64 + * bits by 2 and the second 64 bits by 1 in this test. It should return true + * if the compare exchange operation is successful. + * This test repeats 128 bits compare and swap operations 10K rounds. In each + * iteration it runs compare and swap operation with different memory models. + */ +static int +test_atomic128_cmp_exchange(__attribute__((unused)) void *arg) +{ + rte_int128_t expected; + int success; + unsigned int i; + + while (rte_atomic32_read(&synchro) == 0) + ; + + expected = count128; + + for (i = 0; i < N; i++) { + do { + rte_int128_t desired; + + desired.val[0] = expected.val[0] + 2; + desired.val[1] = expected.val[1] + 1; + + success = rte_atomic128_cmp_exchange(&count128, + &expected, &desired, 1, + __ATOMIC_ACQUIRE, __ATOMIC_RELAXED); + } while (success == 0); + + do { + rte_int128_t desired; + + desired.val[0] = expected.val[0] + 2; + desired.val[1] = expected.val[1] + 1; + + success = rte_atomic128_cmp_exchange(&count128, + &expected, &desired, 1, + __ATOMIC_RELEASE, __ATOMIC_RELAXED); + } while (success == 0); + + do { + rte_int128_t desired; + + desired.val[0] = expected.val[0] + 2; + desired.val[1] = expected.val[1] + 1; + + success = rte_atomic128_cmp_exchange(&count128, + &expected, &desired, 1, + __ATOMIC_ACQ_REL, __ATOMIC_RELAXED); + } while (success == 0); + + do { + rte_int128_t desired; + + desired.val[0] = expected.val[0] + 2; + desired.val[1] = expected.val[1] + 1; + + success = rte_atomic128_cmp_exchange(&count128, + &expected, &desired, 1, + __ATOMIC_RELAXED, __ATOMIC_RELAXED); + } while (success == 0); + } + + return 0; +} +#endif + static int 
test_atomic(void) { @@ -340,6 +429,38 @@ test_atomic(void) return -1; } +#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_ARM64) + /* +* This case tests the functionality of rte_atomic128b_cmp_exchange +* API. It calls rte_atomic128b_cmp_exchange with four kinds of memory +* models successively on each s
[dpdk-dev] [PATCH v9 3/3] eal/stack: enable lock-free stack for aarch64
Enable both c11 atomic and non c11 atomic lock-free stack for aarch64. Introduced a new header to reduce the ifdef clutter across generic and c11 files. The rte_stack_lf_stubs.h contains stub implementations of __rte_stack_lf_count, __rte_stack_lf_push_elems and __rte_stack_lf_pop_elems. Suggested-by: Gage Eads Suggested-by: Jerin Jacob Signed-off-by: Phil Yang Reviewed-by: Honnappa Nagarahalli Tested-by: Honnappa Nagarahalli Acked-by: Jerin Jacob --- doc/guides/prog_guide/env_abstraction_layer.rst | 4 +-- doc/guides/rel_notes/release_19_11.rst | 3 ++ lib/librte_stack/Makefile | 3 +- lib/librte_stack/rte_stack_lf.h | 4 +++ lib/librte_stack/rte_stack_lf_c11.h | 16 - lib/librte_stack/rte_stack_lf_generic.h | 16 - lib/librte_stack/rte_stack_lf_stubs.h | 44 + 7 files changed, 55 insertions(+), 35 deletions(-) create mode 100644 lib/librte_stack/rte_stack_lf_stubs.h diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst b/doc/guides/prog_guide/env_abstraction_layer.rst index 94f30fd..6e59fae 100644 --- a/doc/guides/prog_guide/env_abstraction_layer.rst +++ b/doc/guides/prog_guide/env_abstraction_layer.rst @@ -648,8 +648,8 @@ Known Issues Alternatively, applications can use the lock-free stack mempool handler. When considering this handler, note that: - - It is currently limited to the x86_64 platform, because it uses an -instruction (16-byte compare-and-swap) that is not yet available on other + - It is currently limited to the aarch64 and x86_64 platforms, because it uses +an instruction (16-byte compare-and-swap) that is not yet available on other platforms. - It has worse average-case performance than the non-preemptive rte_ring, but software caching (e.g. 
the mempool cache) can mitigate this by reducing the diff --git a/doc/guides/rel_notes/release_19_11.rst b/doc/guides/rel_notes/release_19_11.rst index 8490d89..60ffd70 100644 --- a/doc/guides/rel_notes/release_19_11.rst +++ b/doc/guides/rel_notes/release_19_11.rst @@ -56,6 +56,9 @@ New Features Also, make sure to start the actual text at the margin. = +* **Added Lock-free Stack for aarch64.** + + The lock-free stack implementation is enabled for aarch64 platforms. Removed Items - diff --git a/lib/librte_stack/Makefile b/lib/librte_stack/Makefile index 8d18ce5..c337ab7 100644 --- a/lib/librte_stack/Makefile +++ b/lib/librte_stack/Makefile @@ -24,6 +24,7 @@ SYMLINK-$(CONFIG_RTE_LIBRTE_STACK)-include := rte_stack.h \ rte_stack_std.h \ rte_stack_lf.h \ rte_stack_lf_generic.h \ - rte_stack_lf_c11.h + rte_stack_lf_c11.h \ + rte_stack_lf_stubs.h include $(RTE_SDK)/mk/rte.lib.mk diff --git a/lib/librte_stack/rte_stack_lf.h b/lib/librte_stack/rte_stack_lf.h index f5581f0..e67630c 100644 --- a/lib/librte_stack/rte_stack_lf.h +++ b/lib/librte_stack/rte_stack_lf.h @@ -5,11 +5,15 @@ #ifndef _RTE_STACK_LF_H_ #define _RTE_STACK_LF_H_ +#if !(defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_ARM64)) +#include "rte_stack_lf_stubs.h" +#else #ifdef RTE_USE_C11_MEM_MODEL #include "rte_stack_lf_c11.h" #else #include "rte_stack_lf_generic.h" #endif +#endif /** * @internal Push several objects on the lock-free stack (MT-safe). 
diff --git a/lib/librte_stack/rte_stack_lf_c11.h b/lib/librte_stack/rte_stack_lf_c11.h index 3d677ae..999359f 100644 --- a/lib/librte_stack/rte_stack_lf_c11.h +++ b/lib/librte_stack/rte_stack_lf_c11.h @@ -36,12 +36,6 @@ __rte_stack_lf_push_elems(struct rte_stack_lf_list *list, struct rte_stack_lf_elem *last, unsigned int num) { -#ifndef RTE_ARCH_X86_64 - RTE_SET_USED(first); - RTE_SET_USED(last); - RTE_SET_USED(list); - RTE_SET_USED(num); -#else struct rte_stack_lf_head old_head; int success; @@ -79,7 +73,6 @@ __rte_stack_lf_push_elems(struct rte_stack_lf_list *list, * to the LIFO len update. */ __atomic_add_fetch(&list->len, num, __ATOMIC_RELEASE); -#endif } static __rte_always_inline struct rte_stack_lf_elem * @@ -88,14 +81,6 @@ __rte_stack_lf_pop_elems(struct rte_stack_lf_list *list, void **obj_table, struct rte_stack_lf_elem **last) { -#ifndef RTE_ARCH_X86_64 - RTE_SET_USED(obj_table); - RTE_SET_USED(last); - RTE_SET_USED(list); - RTE_SET_USED(num); - - return NULL; -#else struct rte_stack_lf_head old_head; uint64_t len; int success; @@ -169,7 +154,6 @@ __rte_stack_lf_pop_elems(struct
Re: [dpdk-dev] [PATCH 3/6] net/mlx: fix meson build with custom dependency path
From: Thomas Monjalon > If rdma-core is not installed in a standard directory of the system, it is > possible to specify the location of the pkgconfig file via an environment > variable: > PKG_CONFIG_PATH=$PKG_CONFIG_PATH:~/rdma-core/build/lib/pkgconfig > > In this case, the dependency may become mandatory to specify for the > configuration tests (checking dependency symbols or fields). > > Some spacing is also fixed around. > > Fixes: 8e4937640022 ("net/mlx4: add external allocator for Verbs object") > Fixes: 1dd7c7e38c19 ("net/mlx4: support meson build") > Fixes: 96d7c62a70c7 ("net/mlx5: support meson build") > Cc: sta...@dpdk.org > > Suggested-by: Luca Boccassi > Signed-off-by: Thomas Monjalon Acked-by: Matan Azrad
Re: [dpdk-dev] [PATCH 1/2] test: replace license text with SPDX tag
Acked-by: Hemant Agrawal
Re: [dpdk-dev] [PATCH 2/2] doc: replace license text with SPDX tag
Acked-by: Hemant Agrawal
Re: [dpdk-dev] [PATCH v9 1/3] eal/arm64: add 128-bit atomic compare exchange
> -Original Message- > From: Phil Yang > Sent: Wednesday, August 14, 2019 1:58 PM > To: tho...@monjalon.net; Jerin Jacob Kollanukkaran ; > gage.e...@intel.com; dev@dpdk.org > Cc: hemant.agra...@nxp.com; honnappa.nagaraha...@arm.com; > gavin...@arm.com; n...@arm.com > Subject: [EXT] [PATCH v9 1/3] eal/arm64: add 128-bit atomic compare > exchange > +#define __HAS_ACQ(mo) ((mo) != __ATOMIC_RELAXED && (mo) != > +__ATOMIC_RELEASE) #define __HAS_RLS(mo) ((mo) == > __ATOMIC_RELEASE || (mo) == __ATOMIC_ACQ_REL || \ > + (mo) == __ATOMIC_SEQ_CST) > + > +#define __MO_LOAD(mo) (__HAS_ACQ((mo)) ? __ATOMIC_ACQUIRE : > +__ATOMIC_RELAXED) #define __MO_STORE(mo) (__HAS_RLS((mo)) ? > +__ATOMIC_RELEASE : __ATOMIC_RELAXED) > + > +#if defined(__ARM_FEATURE_ATOMICS) || > defined(RTE_ARM_FEATURE_ATOMICS) > +#define __ATOMIC128_CAS_OP(cas_op_name, op_string) \ > +static __rte_noinline rte_int128_t \ Could you check the cost of making it as __rte_noinline? If it is costly, How about having two versions, one with __rte_noinline to make compliance with arm64 procedure call standard for old gcc and clang. Other one without explicit register hardcoding + inline for latest gcc > +cas_op_name(rte_int128_t *dst, rte_int128_t old,\ > + rte_int128_t updated) \ > +{ \ > + /* caspX instructions register pair must start from even-numbered > + * register at operand 1. > + * So, specify registers for local variables here. > + */ \ > + register uint64_t x0 __asm("x0") = (uint64_t)old.val[0];\ > + register uint64_t x1 __asm("x1") = (uint64_t)old.val[1];\ > + register uint64_t x2 __asm("x2") = (uint64_t)updated.val[0];\ > + register uint64_t x3 __asm("x3") = (uint64_t)updated.val[1];\ > + asm volatile( \ > + op_string " %[old0], %[old1], %[upd0], %[upd1], [%[dst]]" \ > + : [old0] "+r" (x0), \ > + [old1] "+r" (x1)\ > + : [upd0] "r" (x2), \ > + [upd1] "r" (x3),\ > + [dst] "r" (dst) \ > + : "memory");\ > + old.val[0] = x0;\ > + old.val[1] = x1;\ > + return old; \ > +} > +
Re: [dpdk-dev] [PATCH v2] ethdev: add more protocol support in flow API
Hi Wang Ying, On Wed, Aug 14, 2019 at 11:24:30AM +0800, Wang Ying A wrote: > Add new protocol header match support as below > > RTE_FLOW_ITEM_TYPE_GTP_PSC > - matches a GTP PDU extension header (type is 0x85: > PDU Session Container) > RTE_FLOW_ITEM_TYPE_PPPOES > - matches a PPPoE Session header. > RTE_FLOW_ITEM_TYPE_PPPOED > - matches a PPPoE Discovery stage header. > > Signed-off-by: Wang Ying A OK, please split this patch, one for GTP and the other for PPPoE to make title less vague than "add more protocol support". For PPPoE, the distinction between session and discovery is not a bad idea but since discovery packets typically have a different format (tags in place of protocol), I'm not sure they should share a common structure. How about a single "PPPOE" item without proto_id to cover both session and discovery, then later add separate tag items on a needed basis for each possible/optional tag (e.g. PPPOE_TAG_SERVICE_NAME)? Likewise proto_id would be provided through a separate optional item if relevant (PPPOE_PROTO_ID). Such an approach is already used for IPV6 and IPV6_EXT. Another benefit is that applications could match PPPoE regardless of session or discovery when they do not care, while PPPOED/PPPOES make that distinction mandatory. Also by inserting new entries in the middle of the pattern items list, this patch breaks ABI. I think it's not on purpose, so please move them to the end (not grouping them with existing GTP stuff is fine, ABI compat is more important.) This must be reflected in rte_flow.h, rte_flow.c, testpmd and documentation. More nits below. > --- > --- > v2: Remove Gerrit Change-Id's. > --- > app/test-pmd/cmdline_flow.c | 80 > + > doc/guides/prog_guide/rte_flow.rst | 25 + > doc/guides/testpmd_app_ug/testpmd_funcs.rst | 10 > lib/librte_ethdev/rte_flow.c| 3 ++ > lib/librte_ethdev/rte_flow.h| 71 + > 5 files changed, 189 insertions(+) > > diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c [...] 
> + [ITEM_PPPOES] = { > + .name = "pppoes", > + .help = "match PPPoE Session header", Session => session > + .priv = PRIV_ITEM(PPPOES, sizeof(struct rte_flow_item_pppoe)), > + .next = NEXT(item_pppoe), > + .call = parse_vc, > + }, > + [ITEM_PPPOED] = { > + .name = "pppoed", > + .help = "match PPPoE Discovery stage header", Discovery => discovery > + .priv = PRIV_ITEM(PPPOED, sizeof(struct rte_flow_item_pppoe)), > + .next = NEXT(item_pppoe), > + .call = parse_vc, > + }, > + [ITEM_PPPOE_SEID] = { > + .name = "seid", > + .help = "Session identifier", Session => session [...] > diff --git a/doc/guides/prog_guide/rte_flow.rst > b/doc/guides/prog_guide/rte_flow.rst > index 821b524b3..d09c42071 100644 > --- a/doc/guides/prog_guide/rte_flow.rst > +++ b/doc/guides/prog_guide/rte_flow.rst > @@ -1055,6 +1055,31 @@ flow rules. > - ``teid``: tunnel endpoint identifier. > - Default ``mask`` matches teid only. > > +Item: ``GTP_PSC`` > +^^^ Too many "^^^"'s. [...] > +Item: ``PPPOES``, ``PPPOED`` > + > + > +Matches a PPPOE header. PPPOE => PPPoE > + > +Note: PPPOES and PPPOED use the same structure. PPPOES and PPPOED item item => items (or better, "pattern items") > +are defined for a user-friendly API when creating PPPOE-Session and > +PPPOE-Discovery flow rules. Super nit: use "PPPoE" when mentioning the protocol itself and "PPPOE" when mentioning the pattern item. > + > +- ``v_t_flags``: version (4b), type (4b). Why "flags"? I don't see any so you could name it "version_type" (same in documentation). > +- ``code``: Message type. Message => message > +- ``session_id``: Session identifier. Session => session > +- ``length``: Payload length. Payload => payload > +- ``proto_id``: PPP Protocol identifier. Protocol => protocol > +- Default ``mask`` matches session_id,proto_id. Missing space between session_id and proto_id. 
> + > Item: ``ESP`` > ^ > > diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst > b/doc/guides/testpmd_app_ug/testpmd_funcs.rst > index 313e0707e..0da36d5f1 100644 > --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst > +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst > @@ -3904,6 +3904,16 @@ This section lists supported pattern items and their > attributes, if any. > >- ``teid {unsigned}``: tunnel endpoint identifier. > > +- ``gtp_psc``: match GTPv1 entension header (type is 0x85). > + > + - ``pdu_type {unsigned}``: PDU type (0 or 1). > + - ``qfi {unsigned}``: QoS flow identifier. > + > +- ``pppoes``, ``pppoed``: match PPPOE header. PPPOE => PPPoE [...] > diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h > index b66b
Re: [dpdk-dev] [RFC] ethdev: allow multiple security sessions to use one rte flow
Hi all, Reminder...! If there are no concerns, I'll send the patch after adding the required changes in ipsec-secgw as well. Thanks, Anoob > -Original Message- > From: Anoob Joseph > Sent: Friday, August 2, 2019 11:05 AM > To: Anoob Joseph ; Akhil Goyal > ; Adrien Mazarguil ; > Declan Doherty ; Pablo de Lara > ; Thomas Monjalon > > Cc: Jerin Jacob Kollanukkaran ; Narayana Prasad Raju > Athreya ; Ankur Dwivedi > ; Shahaf Shuler ; > Hemant Agrawal ; Matan Azrad > ; Yongseok Koh ; Wenzhuo > Lu ; Konstantin Ananyev > ; Radu Nicolau ; > dev@dpdk.org > Subject: RE: [RFC] ethdev: allow multiple security sessions to use one rte > flow > > Hi Akhil, Adrien, Declan, Pablo, > > Can you review this proposal and share your feedback? > > Thanks, > Anoob > > > -Original Message- > > From: Anoob Joseph > > Sent: Wednesday, July 24, 2019 7:47 PM > > To: Akhil Goyal ; Adrien Mazarguil > > ; Declan Doherty > > ; Pablo de Lara > > ; Thomas Monjalon > > > > Cc: Anoob Joseph ; Jerin Jacob Kollanukkaran > > ; Narayana Prasad Raju Athreya > > ; Ankur Dwivedi ; > Shahaf > > Shuler ; Hemant Agrawal > > ; Matan Azrad ; > Yongseok > > Koh ; Wenzhuo Lu ; > > Konstantin Ananyev ; Radu Nicolau > > ; dev@dpdk.org > > Subject: [RFC] ethdev: allow multiple security sessions to use one rte > > flow > > > > The rte_security API which enables inline protocol/crypto feature > > mandates that for every security session an rte_flow is created. This > > would internally translate to a rule in the hardware which would do packet > classification. > > > > In rte_securty, one SA would be one security session. And if an > > rte_flow need to be created for every session, the number of SAs > > supported by an inline implementation would be limited by the number > > of rte_flows the PMD would be able to support. > > > > If the fields SPI & IP addresses are allowed to be a range, then this > > limitation can be overcome. Multiple flows will be able to use one > > rule for SECURITY processing. 
In this case, the security session provided as > conf would be NULL. > > > > Application should do an rte_flow_validate() to make sure the flow is > > supported on the PMD. > > > > Signed-off-by: Anoob Joseph > > --- > > lib/librte_ethdev/rte_flow.h | 6 ++ > > 1 file changed, 6 insertions(+) > > > > diff --git a/lib/librte_ethdev/rte_flow.h > > b/lib/librte_ethdev/rte_flow.h index f3a8fb1..4977d3c 100644 > > --- a/lib/librte_ethdev/rte_flow.h > > +++ b/lib/librte_ethdev/rte_flow.h > > @@ -1879,6 +1879,12 @@ struct rte_flow_action_meter { > > * direction. > > * > > * Multiple flows can be configured to use the same security session. > > + * > > + * The NULL value is allowed for security session. If security > > + session is NULL, > > + * then SPI field in ESP flow item and IP addresses in flow items > > + 'IPv4' and > > + * 'IPv6' will be allowed to be a range. The rule thus created can > > + enable > > + * SECURITY processing on multiple flows. > > + * > > */ > > struct rte_flow_action_security { > > void *security_session; /**< Pointer to security session structure. > > */ > > -- > > 2.7.4
[dpdk-dev] [patch] net/octeontx2: fix ptype get overflow
From: Pavan Nikhilesh

The function `rte_eth_dev_get_supported_ptypes` expects the underlying
ethernet device to return an array of supported ptypes. The ethernet device
must set `RTE_PTYPE_UNKNOWN` as the last element to mark the end of the
ptype array. Otherwise `rte_eth_dev_get_supported_ptypes` might read past
the end of the array.

Fixes: 6e892eabce11 ("net/octeontx2: support packet type")

Signed-off-by: Pavan Nikhilesh
---
 drivers/net/octeontx2/otx2_lookup.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/octeontx2/otx2_lookup.c b/drivers/net/octeontx2/otx2_lookup.c
index 99199d08a..3347e7014 100644
--- a/drivers/net/octeontx2/otx2_lookup.c
+++ b/drivers/net/octeontx2/otx2_lookup.c
@@ -53,6 +53,7 @@ otx2_nix_supported_ptypes_get(struct rte_eth_dev *eth_dev)
 		RTE_PTYPE_INNER_L4_UDP, /* LH */
 		RTE_PTYPE_INNER_L4_SCTP, /* LH */
 		RTE_PTYPE_INNER_L4_ICMP, /* LH */
+		RTE_PTYPE_UNKNOWN,
 	};
 
 	if (dev->rx_offload_flags & NIX_RX_OFFLOAD_PTYPE_F)
-- 
2.17.1
[dpdk-dev] [PATCH] common/cpt: add support for new firmware
From: Ankur Dwivedi

With the latest firmware, there are a few changes for zuc and snow3g:

1. The iv_source is now in bit 7 of the minor opcode. In the old firmware
   it was in bit 6.
2. Algorithm type is a 2-bit field in the new firmware. In the old firmware
   it was named cipher type and was a 1-bit field.

Signed-off-by: Ankur Dwivedi
Signed-off-by: Anoob Joseph
---
 drivers/common/cpt/cpt_mcode_defines.h | 4 ++--
 drivers/common/cpt/cpt_ucode.h         | 6 ++++--
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/common/cpt/cpt_mcode_defines.h b/drivers/common/cpt/cpt_mcode_defines.h
index c0adbd5..b7c3feb 100644
--- a/drivers/common/cpt/cpt_mcode_defines.h
+++ b/drivers/common/cpt/cpt_mcode_defines.h
@@ -303,8 +303,8 @@ struct cpt_ctx {
 	uint64_t hmac :1;
 	uint64_t zsk_flags :3;
 	uint64_t k_ecb :1;
-	uint64_t snow3g :1;
-	uint64_t rsvd :22;
+	uint64_t snow3g :2;
+	uint64_t rsvd :21;
 	/* Below fields are accessed by hardware */
 	union {
 		mc_fc_context_t fctx;
diff --git a/drivers/common/cpt/cpt_ucode.h b/drivers/common/cpt/cpt_ucode.h
index 7d9c31e..0dac12e 100644
--- a/drivers/common/cpt/cpt_ucode.h
+++ b/drivers/common/cpt/cpt_ucode.h
@@ -1467,7 +1467,8 @@ cpt_zuc_snow3g_enc_prep(uint32_t req_flags,
 	opcode.s.major = CPT_MAJOR_OP_ZUC_SNOW3G;
 	/* indicates CPTR ctx, operation type, KEY & IV mode from DPTR */
-	opcode.s.minor = ((1 << 6) | (snow3g << 5) | (0 << 4) |
+
+	opcode.s.minor = ((1 << 7) | (snow3g << 5) | (0 << 4) |
 			  (0 << 3) | (flags & 0x7));
 	if (flags == 0x1) {
@@ -1791,7 +1792,8 @@ cpt_zuc_snow3g_dec_prep(uint32_t req_flags,
 	opcode.s.major = CPT_MAJOR_OP_ZUC_SNOW3G;
 	/* indicates CPTR ctx, operation type, KEY & IV mode from DPTR */
-	opcode.s.minor = ((1 << 6) | (snow3g << 5) | (0 << 4) |
+
+	opcode.s.minor = ((1 << 7) | (snow3g << 5) | (0 << 4) |
 			  (0 << 3) | (flags & 0x7));
 	/* consider iv len */
-- 
2.7.4
Re: [dpdk-dev] [PATCH 4/6] net/mlx: fix build with make and recent gcc
From: Thomas Monjalon
> With VERBOSE=1, this error was seen in debug mode with gcc 9.1:
>
> In file included from /tmp/dpdk.auto-config-h.sh.c.w0VWMi:1:
> In file included from rdma-core/build/include/infiniband/mlx5dv.h:47:
> In file included from rdma-core/build/include/infiniband/verbs.h:46:
> In file included from rdma-core/build/include/infiniband/verbs_api.h:66:
> In file included from rdma-core/build/include/infiniband/ib_user_ioctl_verbs.h:38:
> include/rdma/ib_user_verbs.h:161:28: fatal error:
> zero size arrays are an extension [-Wzero-length-array]
> __aligned_u64 driver_data0;
> ^
> 1 error generated.
>
> As a result, buildtools/auto-config-h.sh was not generating a correct
> autoconf file, so the compilation was generating such error:
>
> fatal error: redefinition of
> 'mlx5_ib_uapi_flow_action_packet_reformat_type'
>
> It is fixed by disabling -pedantic option when calling auto-config-h.sh from
> the makefile-based system.
>
> Cc: adrien.mazarg...@6wind.com
> Cc: sta...@dpdk.org
>
> Signed-off-by: Thomas Monjalon

Acked-by: Matan Azrad

Consider creating more patches to clean up the compiler commands to ignore pedantic in the code.

Matan
Re: [dpdk-dev] [PATCH v9 1/3] eal/arm64: add 128-bit atomic compare exchange
> -Original Message- > From: Jerin Jacob Kollanukkaran > Sent: Wednesday, August 14, 2019 4:46 PM > To: Phil Yang (Arm Technology China) ; > tho...@monjalon.net; gage.e...@intel.com; dev@dpdk.org > Cc: hemant.agra...@nxp.com; Honnappa Nagarahalli > ; Gavin Hu (Arm Technology China) > ; nd > Subject: RE: [PATCH v9 1/3] eal/arm64: add 128-bit atomic compare exchange > > > -Original Message- > > From: Phil Yang > > Sent: Wednesday, August 14, 2019 1:58 PM > > To: tho...@monjalon.net; Jerin Jacob Kollanukkaran > ; > > gage.e...@intel.com; dev@dpdk.org > > Cc: hemant.agra...@nxp.com; honnappa.nagaraha...@arm.com; > > gavin...@arm.com; n...@arm.com > > Subject: [EXT] [PATCH v9 1/3] eal/arm64: add 128-bit atomic compare > > exchange > > +#define __HAS_ACQ(mo) ((mo) != __ATOMIC_RELAXED && (mo) != > > +__ATOMIC_RELEASE) #define __HAS_RLS(mo) ((mo) == > > __ATOMIC_RELEASE || (mo) == __ATOMIC_ACQ_REL || \ > > + (mo) == __ATOMIC_SEQ_CST) > > + > > +#define __MO_LOAD(mo) (__HAS_ACQ((mo)) ? __ATOMIC_ACQUIRE : > > +__ATOMIC_RELAXED) #define __MO_STORE(mo) (__HAS_RLS((mo)) ? > > +__ATOMIC_RELEASE : __ATOMIC_RELAXED) > > + > > +#if defined(__ARM_FEATURE_ATOMICS) || > > defined(RTE_ARM_FEATURE_ATOMICS) > > +#define __ATOMIC128_CAS_OP(cas_op_name, op_string) > > \ > > +static __rte_noinline rte_int128_t > > \ > > > Could you check the cost of making it as __rte_noinline? > If it is costly, How about having two versions, one with __rte_noinline > to make compliance with arm64 procedure call standard for > old gcc and clang. > Other one without explicit register hardcoding + inline for latest > gcc Hi Jerin, According to the stack_lf_perf_autotest, making it as __rte_noinline has no overhead on ThunderX2 with GCC 8.3. The 'Average cycles per object push/pop' numbers for __rte_noinline and __rte_always_inline versions are nearly the same. 
Test results:

## Two NUMA nodes ##

__rte_noinline
RTE>>stack_lf_perf_autotest
### Testing using two NUMA nodes ###
Average cycles per object push/pop (bulk size: 8): 24.10
Average cycles per object push/pop (bulk size: 32): 6.85
### Testing on all 18 lcores ###
Average cycles per object push/pop (bulk size: 8): 680.39
Average cycles per object push/pop (bulk size: 32): 146.38
Test OK

__rte_always_inline
RTE>>stack_lf_perf_autotest
### Testing using two NUMA nodes ###
Average cycles per object push/pop (bulk size: 8): 24.29
Average cycles per object push/pop (bulk size: 32): 6.92
### Testing on all 18 lcores ###
Average cycles per object push/pop (bulk size: 8): 683.92
Average cycles per object push/pop (bulk size: 32): 145.11
Test OK

## Single NUMA node ##

__rte_always_inline
RTE>>stack_lf_perf_autotest
### Testing on all 18 lcores ###
Average cycles per object push/pop (bulk size: 8): 582.92
Average cycles per object push/pop (bulk size: 32): 125.57
Test OK

__rte_noinline
RTE>>stack_lf_perf_autotest
### Testing on all 18 lcores ###
Average cycles per object push/pop (bulk size: 8): 537.56
Average cycles per object push/pop (bulk size: 32): 122.98
Test OK

Thanks,
Phil Yang

> >
> > +cas_op_name(rte_int128_t *dst, rte_int128_t old, \
> > +		rte_int128_t updated) \
> > +{ \
> > +	/* caspX instructions register pair must start from even-numbered
> > +	 * register at operand 1.
> > +	 * So, specify registers for local variables here.
> > +*/ \ > > + register uint64_t x0 __asm("x0") = (uint64_t)old.val[0];\ > > + register uint64_t x1 __asm("x1") = (uint64_t)old.val[1];\ > > + register uint64_t x2 __asm("x2") = (uint64_t)updated.val[0];\ > > + register uint64_t x3 __asm("x3") = (uint64_t)updated.val[1];\ > > + asm volatile( \ > > + op_string " %[old0], %[old1], %[upd0], %[upd1], [%[dst]]" \ > > + : [old0] "+r" (x0), \ > > + [old1] "+r" (x1)\ > > + : [upd0] "r" (x2), \ > > + [upd1] "r" (x3),\ > > + [dst] "r" (dst) \ > > + : "memory");\ > > + old.val[0] = x0;\ > > + old.val[1] = x1;\ > > + return old; \ > > +} > > +
[dpdk-dev] [Bug 339] net/af_packet: af_packet driver is leaving stale socket after device is removed
https://bugs.dpdk.org/show_bug.cgi?id=339

Bug ID: 339
Summary: net/af_packet: af_packet driver is leaving stale socket after device is removed
Product: DPDK
Version: 18.02
Hardware: x86
OS: Linux
Status: UNCONFIRMED
Severity: normal
Priority: Normal
Component: ethdev
Assignee: dev@dpdk.org
Reporter: abhishek.sac...@altran.com
CC: linvi...@tuxdriver.com
Target Milestone: ---

I am trying to use the af_packet DPDK PMD in "ovs with dpdk". It looks like
the af_packet driver leaves stale sockets behind after a device is removed.
The stale sockets can be confirmed from the output of the "ss -0" command.

Kindly let me know if this is a known/valid issue. I have suggested a code
change in dpdk net/af_packet to fix the issue; please advise whether this is
an appropriate fix so that I can submit a patch.

Below is the output of "ss -0" after adding and deleting a port in ovs:

[1] Add port in ovs using command "ovs-vsctl add-port br0 wan -- set Interface wan type=dpdk options:dpdk-devargs=eth_af_packet0,iface=veth0"

#ss -0
Netid  Recv-Q  Send-Q  Local Address:Port  Peer Address:Port
p_raw  0       0       *:veth0             *

[2] Delete port in ovs using command "ovs-vsctl del-port br0 wan"

#ss -0
Netid  Recv-Q  Send-Q  Local Address:Port  Peer Address:Port
p_raw  0       0       *:veth0             *

The "ss -0" output keeps showing the stale entry, and stale entries keep
piling up as add/del is triggered repeatedly for the same port in ovs. The
entries are removed only when the ovs process exits.

I went through the dpdk net/af_packet code; these are the findings:

[1] Ring buffers are memory mapped when the device is added using rte_dev_probe
[2] There is no corresponding munmap call when the device is removed/closed
[3] The issue is fixed by calling munmap from rte_pmd_af_packet_remove()

Patch with the changes made to fix the issue:

--- a/drivers/net/af_packet/rte_eth_af_packet.c
+++ b/drivers/net/af_packet/rte_eth_af_packet.c
@@ -972,6 +972,7 @@
 {
 	struct rte_eth_dev *eth_dev = NULL;
 	struct pmd_internals *internals;
+	struct tpacket_req *req;
 	unsigned q;
 
 	PMD_LOG(INFO, "Closing AF_PACKET ethdev on numa socket %u",
@@ -992,7 +993,10 @@
 		return rte_eth_dev_release_port(eth_dev);
 
 	internals = eth_dev->data->dev_private;
+	req = &(internals->req);
 	for (q = 0; q < internals->nb_queues; q++) {
+		munmap(internals->rx_queue[q].map,
+			2 * req->tp_block_size * req->tp_block_nr);
 		rte_free(internals->rx_queue[q].rd);
 		rte_free(internals->tx_queue[q].rd);
 	}

--
You are receiving this mail because:
You are the assignee for the bug.
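
The finding above, that a mapping created at probe time needs a matching munmap() at remove time with the same length, can be illustrated outside the driver. The following is only a minimal sketch, not the PMD code: struct ring_map, ring_map_create() and ring_map_destroy() are hypothetical names, and an anonymous mapping stands in for the packet-socket ring so the example runs without a network interface.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical stand-in for the per-queue mapping state kept by a driver. */
struct ring_map {
	void *map;      /* base address returned by mmap(), or NULL */
	size_t map_len; /* length passed to mmap(); munmap() needs it again */
};

/* Map one region sized for an RX ring plus a TX ring (hence the factor 2). */
static int
ring_map_create(struct ring_map *rm, size_t block_size, size_t block_nr)
{
	assert(rm != NULL);
	rm->map_len = 2 * block_size * block_nr;
	rm->map = mmap(NULL, rm->map_len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (rm->map == MAP_FAILED) {
		rm->map = NULL;
		rm->map_len = 0;
		return -1;
	}
	return 0;
}

/* Undo ring_map_create(); calling it a second time is a harmless no-op. */
static int
ring_map_destroy(struct ring_map *rm)
{
	int ret = 0;

	assert(rm != NULL);
	if (rm->map != NULL) {
		ret = munmap(rm->map, rm->map_len);
		rm->map = NULL;
		rm->map_len = 0;
	}
	return ret;
}
```

Storing the length next to the pointer avoids recomputing it from the ring request at teardown, which is one way to keep the two calls in sync; the proposed patch recomputes it from req instead.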
[dpdk-dev] [Bug 340] Can't build examples in Ubuntu 18 after commit 4131ad5db from 03/07/2019
https://bugs.dpdk.org/show_bug.cgi?id=340

Bug ID: 340
Summary: Can't build examples in Ubuntu 18 after commit 4131ad5db from 03/07/2019
Product: DPDK
Version: unspecified
Hardware: x86
OS: Linux
Status: UNCONFIRMED
Severity: normal
Priority: Normal
Component: examples
Assignee: dev@dpdk.org
Reporter: john.soa...@accelercomm.com
Target Milestone: ---

Created attachment 65
  --> https://bugs.dpdk.org/attachment.cgi?id=65&action=edit
Patch to reverse the commit

After bisecting I found out that the commit
4131ad5db79a016970287282b938ebed2f19bdb3 prevents building the examples.

This is the error message:
"gcc: error: rte_config.h: No such file or directory"

Everything works fine after applying a patch that reverts the mentioned
commit.

It might be my mistake but these are the commands that I run from the
project's folder:

make config T=x86_64-native-linux-gcc; cd build; make; make -C ../examples/;

This is the full message from bisecting:
"
4131ad5db79a016970287282b938ebed2f19bdb3 is the first bad commit
commit 4131ad5db79a016970287282b938ebed2f19bdb3
Author: Bruce Richardson
Date: Wed Jul 3 17:40:00 2019 +0100

    examples: fix pkg-config detection with older make

    Make versions before 4.2 did not have support for the .SHELLSTATUS
    variable, so use another method to detect shell success.

    Fixes: 22119c4591a0 ("examples: use pkg-config in makefiles")
    Cc: sta...@dpdk.org

    Signed-off-by: Bruce Richardson
    Acked-by: Luca Boccassi

:04 04 a19effc010e643f86f2d3951d86bf8651bec0d63 096fc047eab18a285d7f6d0c783ce47316fcc435 M examples
"

--
You are receiving this mail because:
You are the assignee for the bug.
Re: [dpdk-dev] [RFC] ethdev: allow multiple security sessions to use one rte flow
Hi Anoob,

> Hi all,
>
> Reminder...!

Sorry for the delayed response.

> If there are no concerns, I'll send the patch after adding the required
> changes in ipsec-secgw as well.
>
> Thanks,
> Anoob
>
> > -Original Message-
> > From: Anoob Joseph
> > Sent: Friday, August 2, 2019 11:05 AM
> > To: Anoob Joseph; Akhil Goyal; Adrien Mazarguil; Declan Doherty;
> > Pablo de Lara; Thomas Monjalon
> > Cc: Jerin Jacob Kollanukkaran; Narayana Prasad Raju Athreya;
> > Ankur Dwivedi; Shahaf Shuler; Hemant Agrawal; Matan Azrad;
> > Yongseok Koh; Wenzhuo Lu; Konstantin Ananyev; Radu Nicolau; dev@dpdk.org
> > Subject: RE: [RFC] ethdev: allow multiple security sessions to use one
> > rte flow
> >
> > Hi Akhil, Adrien, Declan, Pablo,
> >
> > Can you review this proposal and share your feedback?
> >
> > Thanks,
> > Anoob
> >
> > > -Original Message-
> > > From: Anoob Joseph
> > > Sent: Wednesday, July 24, 2019 7:47 PM
> > > To: Akhil Goyal; Adrien Mazarguil; Declan Doherty; Pablo de Lara;
> > > Thomas Monjalon
> > > Cc: Anoob Joseph; Jerin Jacob Kollanukkaran; Narayana Prasad Raju
> > > Athreya; Ankur Dwivedi; Shahaf Shuler; Hemant Agrawal; Matan Azrad;
> > > Yongseok Koh; Wenzhuo Lu; Konstantin Ananyev; Radu Nicolau; dev@dpdk.org
> > > Subject: [RFC] ethdev: allow multiple security sessions to use one rte
> > > flow
> > >
> > > The rte_security API which enables inline protocol/crypto feature
> > > mandates that for every security session an rte_flow is created. This
> > > would internally translate to a rule in the hardware which would do
> > > packet classification.
> > >
> > > In rte_security, one SA would be one security session. And if an
> > > rte_flow needs to be created for every session, the number of SAs
> > > supported by an inline implementation would be limited by the number
> > > of rte_flows the PMD would be able to support.
> > >
> > > If the fields SPI & IP addresses are allowed to be a range, then this
> > > limitation can be overcome. Multiple flows will be able to use one
> > > rule for SECURITY processing. In this case, the security session
> > > provided as conf would be NULL.

SPI values are normally used to uniquely identify the SA that needs to be
applied on a particular flow. I believe the SPI value should not be a range
for applying a particular SA or session.

Plain packet IP addresses can be a range. That is not an issue. Multiple
plain packet flows can use the same session/SA.

Why do you feel that the security session provided should be NULL to support
multiple flows? How will the keys and other SA-related info be passed to
the driver/HW?

> > >
> > > Application should do an rte_flow_validate() to make sure the flow is
> > > supported on the PMD.
> > >
> > > Signed-off-by: Anoob Joseph
> > > ---
> > >  lib/librte_ethdev/rte_flow.h | 6 ++
> > >  1 file changed, 6 insertions(+)
> > >
> > > diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
> > > index f3a8fb1..4977d3c 100644
> > > --- a/lib/librte_ethdev/rte_flow.h
> > > +++ b/lib/librte_ethdev/rte_flow.h
> > > @@ -1879,6 +1879,12 @@ struct rte_flow_action_meter {
> > >   * direction.
> > >   *
> > >   * Multiple flows can be configured to use the same security session.
> > > + *
> > > + * The NULL value is allowed for security session. If security session is NULL,
> > > + * then SPI field in ESP flow item and IP addresses in flow items 'IPv4' and
> > > + * 'IPv6' will be allowed to be a range. The rule thus created can enable
> > > + * SECURITY processing on multiple flows.
> > > + *
> > >   */
> > >  struct rte_flow_action_security {
> > >  	void *security_session; /**< Pointer to security session structure. */
> > > --
> > > 2.7.4
Re: [dpdk-dev] [PATCH 1/2] test: replace license text with SPDX tag
> -Original Message- > From: Legacy, Allain > Sent: Tuesday, August 13, 2019 8:20 AM > To: hemant.agra...@nxp.com > Cc: dev@dpdk.org; john.mcnam...@intel.com; marko.kovace...@intel.com; > cristian.dumitre...@intel.com; Peters, Matt > Subject: [PATCH 1/2] test: replace license text with SPDX tag > > Replacing full license text with SPDX tag. > > Signed-off-by: Allain Legacy > --- Acked-by: Matt Peters
Re: [dpdk-dev] [PATCH 2/2] doc: replace license text with SPDX tag
> -Original Message- > From: Legacy, Allain > Sent: Tuesday, August 13, 2019 8:20 AM > To: hemant.agra...@nxp.com > Cc: dev@dpdk.org; john.mcnam...@intel.com; marko.kovace...@intel.com; > cristian.dumitre...@intel.com; Peters, Matt > Subject: [PATCH 2/2] doc: replace license text with SPDX tag > > Replace full license text with SPDX tag. > > Signed-off-by: Allain Legacy Acked-by: Matt Peters
Re: [dpdk-dev] [PATCH v9 1/3] eal/arm64: add 128-bit atomic compare exchange
> -Original Message-
> From: Phil Yang (Arm Technology China)
> Sent: Wednesday, August 14, 2019 3:55 PM
> To: Jerin Jacob Kollanukkaran; tho...@monjalon.net;
> gage.e...@intel.com; dev@dpdk.org
> Cc: hemant.agra...@nxp.com; Honnappa Nagarahalli;
> Gavin Hu (Arm Technology China); nd; nd
> Subject: [EXT] RE: [PATCH v9 1/3] eal/arm64: add 128-bit atomic compare
> exchange
>
> External Email
>
> --
> > -Original Message-
> > From: Jerin Jacob Kollanukkaran
> > Sent: Wednesday, August 14, 2019 4:46 PM
> > To: Phil Yang (Arm Technology China); tho...@monjalon.net;
> > gage.e...@intel.com; dev@dpdk.org
> > Cc: hemant.agra...@nxp.com; Honnappa Nagarahalli;
> > Gavin Hu (Arm Technology China); nd
> > Subject: RE: [PATCH v9 1/3] eal/arm64: add 128-bit atomic compare
> > exchange
> >
> > > -Original Message-
> > > From: Phil Yang
> > > Sent: Wednesday, August 14, 2019 1:58 PM
> > > To: tho...@monjalon.net; Jerin Jacob Kollanukkaran;
> > > gage.e...@intel.com; dev@dpdk.org
> > > Cc: hemant.agra...@nxp.com; honnappa.nagaraha...@arm.com;
> > > gavin...@arm.com; n...@arm.com
> > > Subject: [EXT] [PATCH v9 1/3] eal/arm64: add 128-bit atomic compare
> > > exchange
> > >
> > > +#define __HAS_ACQ(mo) ((mo) != __ATOMIC_RELAXED && (mo) != __ATOMIC_RELEASE)
> > > +#define __HAS_RLS(mo) ((mo) == __ATOMIC_RELEASE || (mo) == __ATOMIC_ACQ_REL || \
> > > +			  (mo) == __ATOMIC_SEQ_CST)
> > > +
> > > +#define __MO_LOAD(mo)  (__HAS_ACQ((mo)) ? __ATOMIC_ACQUIRE : __ATOMIC_RELAXED)
> > > +#define __MO_STORE(mo) (__HAS_RLS((mo)) ? __ATOMIC_RELEASE : __ATOMIC_RELAXED)
> > > +
> > > +#if defined(__ARM_FEATURE_ATOMICS) || defined(RTE_ARM_FEATURE_ATOMICS)
> > > +#define __ATOMIC128_CAS_OP(cas_op_name, op_string) \
> > > +static __rte_noinline rte_int128_t \
> >
> > Could you check the cost of making it __rte_noinline?
> >
> > If it is costly, how about having two versions: one with
> > __rte_noinline, to comply with the arm64 procedure call standard
> > for old gcc and clang, and another without explicit register
> > hardcoding, plus inline, for the latest gcc?
>
> Hi Jerin,

Hi Phil Yang,

> According to the stack_lf_perf_autotest, making it __rte_noinline has no
> overhead on ThunderX2 with GCC 8.3.
> The 'Average cycles per object push/pop' numbers for the __rte_noinline and
> __rte_always_inline versions are nearly the same.

I tested with octeontx2 as well. It yields a similar result.
No change is expected in this patch then.
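
For readers following the quoted macros: __MO_LOAD and __MO_STORE split one combined memory order into the order used for the load half and the order used for the store half of the compare-exchange. The sketch below restates that mapping as plain functions so it can be checked on any GCC or Clang host, not just arm64. The function names mo_load() and mo_store() and the un-prefixed macro names are illustrative only; the __ATOMIC_* constants are the compiler's built-in ones.

```c
#include <assert.h>

/* Same predicates as the quoted patch, renamed to avoid reserved identifiers. */
#define HAS_ACQ(mo) ((mo) != __ATOMIC_RELAXED && (mo) != __ATOMIC_RELEASE)
#define HAS_RLS(mo) ((mo) == __ATOMIC_RELEASE || (mo) == __ATOMIC_ACQ_REL || \
		     (mo) == __ATOMIC_SEQ_CST)

/* Order for the load half: acquire unless the caller asked for relaxed/release. */
static int
mo_load(int mo)
{
	return HAS_ACQ(mo) ? __ATOMIC_ACQUIRE : __ATOMIC_RELAXED;
}

/* Order for the store half: release only for release, acq_rel and seq_cst. */
static int
mo_store(int mo)
{
	return HAS_RLS(mo) ? __ATOMIC_RELEASE : __ATOMIC_RELAXED;
}

/* Convenience check used below: both halves carry a barrier. */
static int
mo_is_full_barrier(int mo)
{
	assert(mo >= 0);
	return mo_load(mo) == __ATOMIC_ACQUIRE &&
	       mo_store(mo) == __ATOMIC_RELEASE;
}
```

With this mapping, __ATOMIC_ACQ_REL and __ATOMIC_SEQ_CST select a barrier on both halves, __ATOMIC_ACQUIRE only on the load, __ATOMIC_RELEASE only on the store, and __ATOMIC_RELAXED on neither.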
[dpdk-dev] [PATCH] bpf: hide internal program arg type
From: Jerin Jacob

RTE_BPF_ARG_PTR_STACK is used as an internal program arg type.
Rename it to RTE_BPF_ARG_RESERVED to avoid exposing the internal program type.

Signed-off-by: Jerin Jacob
---
 lib/librte_bpf/bpf_validate.c | 12 +++-
 lib/librte_bpf/rte_bpf.h | 2 +-
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/lib/librte_bpf/bpf_validate.c b/lib/librte_bpf/bpf_validate.c
index 0cf41fa27..6bd6f78e9 100644
--- a/lib/librte_bpf/bpf_validate.c
+++ b/lib/librte_bpf/bpf_validate.c
@@ -15,6 +15,8 @@
 #include "bpf_impl.h"
 
+#define BPF_ARG_PTR_STACK RTE_BPF_ARG_RESERVED
+
 struct bpf_reg_val {
 	struct rte_bpf_arg v;
 	uint64_t mask;
@@ -710,7 +712,7 @@ eval_ptr(struct bpf_verifier *bvf, struct bpf_reg_val *rm, uint32_t opsz,
 	if (rm->u.max % align != 0)
 		return "unaligned memory access";
 
-	if (rm->v.type == RTE_BPF_ARG_PTR_STACK) {
+	if (rm->v.type == BPF_ARG_PTR_STACK) {
 
 		if (rm->u.max != rm->u.min || rm->s.max != rm->s.min ||
 				rm->u.max != (uint64_t)rm->s.max)
@@ -764,7 +766,7 @@ eval_load(struct bpf_verifier *bvf, const struct ebpf_insn *ins)
 	if (err != NULL)
 		return err;
 
-	if (rs.v.type == RTE_BPF_ARG_PTR_STACK) {
+	if (rs.v.type == BPF_ARG_PTR_STACK) {
 		sv = st->sv + rs.u.max / sizeof(uint64_t);
 		if (sv->v.type == RTE_BPF_ARG_UNDEF || sv->mask < msk)
@@ -859,7 +861,7 @@ eval_store(struct bpf_verifier *bvf, const struct ebpf_insn *ins)
 	if (err != NULL)
 		return err;
 
-	if (rd.v.type == RTE_BPF_ARG_PTR_STACK) {
+	if (rd.v.type == BPF_ARG_PTR_STACK) {
 		sv = st->sv + rd.u.max / sizeof(uint64_t);
 		if (BPF_CLASS(ins->code) == BPF_STX &&
@@ -908,7 +910,7 @@ eval_func_arg(struct bpf_verifier *bvf, const struct rte_bpf_arg *arg,
 	 * pointer to the variable on the stack is passed
 	 * as an argument, mark stack space it occupies as initialized.
 	 */
-	if (err == NULL && rv->v.type == RTE_BPF_ARG_PTR_STACK) {
+	if (err == NULL && rv->v.type == BPF_ARG_PTR_STACK) {
 		i = rv->u.max / sizeof(uint64_t);
 		n = i + arg->size / sizeof(uint64_t);
@@ -2131,7 +2133,7 @@ evaluate(struct bpf_verifier *bvf)
 	/* initial state of frame pointer */
 	static const struct bpf_reg_val rvfp = {
 		.v = {
-			.type = RTE_BPF_ARG_PTR_STACK,
+			.type = BPF_ARG_PTR_STACK,
 			.size = MAX_BPF_STACK_SIZE,
 		},
 		.mask = UINT64_MAX,
diff --git a/lib/librte_bpf/rte_bpf.h b/lib/librte_bpf/rte_bpf.h
index cd4d56dea..cbf1cddac 100644
--- a/lib/librte_bpf/rte_bpf.h
+++ b/lib/librte_bpf/rte_bpf.h
@@ -32,7 +32,7 @@ enum rte_bpf_arg_type {
 	RTE_BPF_ARG_RAW,        /**< scalar value */
 	RTE_BPF_ARG_PTR = 0x10, /**< pointer to data buffer */
 	RTE_BPF_ARG_PTR_MBUF,   /**< pointer to rte_mbuf */
-	RTE_BPF_ARG_PTR_STACK,
+	RTE_BPF_ARG_RESERVED,   /**< reserved for internal use */
 };
 
 /**
--
2.22.0
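
The pattern this patch applies, keeping an enumerator's numeric slot in the public header while giving it a meaningful name only inside the implementation file, can be shown in isolation. This is a minimal sketch with hypothetical demo_* names mirroring the layout of rte_bpf_arg_type; it is not the library header.

```c
#include <assert.h>

/* "Public header" view: the last slot is reserved and carries no meaning. */
enum demo_arg_type {
	DEMO_ARG_UNDEF,       /* undefined */
	DEMO_ARG_RAW,         /* scalar value */
	DEMO_ARG_PTR = 0x10,  /* pointer to data buffer */
	DEMO_ARG_PTR_MBUF,    /* pointer to a packet buffer */
	DEMO_ARG_RESERVED,    /* reserved for internal use */
};

/* "Private" view, visible only in the implementation file. */
#define DEMO_INT_ARG_PTR_STACK DEMO_ARG_RESERVED

/* Internal helper: does this type denote a pointer into the program stack? */
static int
demo_is_stack_ptr(enum demo_arg_type type)
{
	/* The alias must stay inside the pointer range of the enum. */
	assert(DEMO_INT_ARG_PTR_STACK > DEMO_ARG_PTR);
	return type == DEMO_INT_ARG_PTR_STACK;
}
```

Because the alias expands to the reserved enumerator, the numeric value seen by existing binaries does not change; only the name applications can spell does.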
Re: [dpdk-dev] [RFC] ethdev: support hairpin queue
On Wed, 14 Aug 2019 06:05:13 +
Ori Kam wrote:

> > -Original Message-
> > From: Ori Kam
> > Sent: Wednesday, August 14, 2019 8:36 AM
> > To: Stephen Hemminger
> > Cc: Thomas Monjalon; ferruh.yi...@intel.com;
> > arybche...@solarflare.com; Shahaf Shuler; Slava Ovsiienko;
> > Alex Rosenbaum; dev@dpdk.org
> > Subject: RE: [dpdk-dev] [RFC] ethdev: support hairpin queue
> >
> > Hi Stephen,
> >
> > > -Original Message-
> > > From: Stephen Hemminger
> > > Sent: Tuesday, August 13, 2019 6:46 PM
> > > To: Ori Kam
> > > Cc: Thomas Monjalon; ferruh.yi...@intel.com;
> > > arybche...@solarflare.com; Shahaf Shuler; Slava Ovsiienko;
> > > Alex Rosenbaum; dev@dpdk.org
> > > Subject: Re: [dpdk-dev] [RFC] ethdev: support hairpin queue
> > >
> > > On Tue, 13 Aug 2019 13:37:48 +
> > > Ori Kam wrote:
> > >
> > > > This RFC replaces RFC[1].
> > > >
> > > > The hairpin feature (different name can be forward) acts as "bump on
> > > > the wire", meaning that a packet that is received from the wire can
> > > > be modified using offloaded action and then sent back to the wire
> > > > without application intervention, which saves CPU cycles.
> > > >
> > > > The hairpin is the inverse function of loopback, in which the
> > > > application sends a packet and it is then received again by the
> > > > application without being sent to the wire.
> > > >
> > > > The hairpin can be used by a number of different NVF, for example
> > > > load balancer, gateway and so on.
> > > >
> > > > As can be seen from the hairpin description, hairpin is basically an
> > > > RX queue connected to a TX queue.
> > > >
> > > > During the design phase I was thinking of two ways to implement this
> > > > feature: the first one is adding a new rte flow action, and the
> > > > second one is creating a special kind of queue.
> > >
> > > Life would be easier for users if the hairpin was an attribute
> > > of queue configuration, not a separate API call.
> >
> > I was thinking about it. The reason that I split the functions is that
> > they use different parameter sets. For example, the hairpin queue
> > doesn't need a memory region while it does need the hairpin
> > configuration. So in each case (hairpin queue / normal queue) there
> > will be parameters that are not in use. I think this is less preferred.
> > What do you think?
>
> Forgot in my last mail two more reasons I had for this:
> 1. Changing the existing function will break the API, and will force all
>    applications to update.
> 2. Two APIs are easier to document and explain.
> 3. The reason stated above, that there will be unused parameters in each
>    call.

New APIs are like system calls: they create longer-term support overhead.
It would be good if there was support for this on multiple NIC types.
[dpdk-dev] [RFC] devtools: add spdx license check tool
This is a simple script to print files that have missing SPDX
license tag and list of files with redundant boilerplate.

Signed-off-by: Stephen Hemminger
---
 devtools/spdx-check.sh | 19 +++
 1 file changed, 19 insertions(+)
 create mode 100755 devtools/spdx-check.sh

diff --git a/devtools/spdx-check.sh b/devtools/spdx-check.sh
new file mode 100755
index ..c77454a8b320
--- /dev/null
+++ b/devtools/spdx-check.sh
@@ -0,0 +1,19 @@
+#! /bin/sh
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright (c) 2019 Microsoft Corporation
+#
+# Produce a list of files without SPDX license identifiers
+
+echo "Files without SPDX License"
+echo "--"
+
+git grep -L SPDX-License-Identifier | \
+grep -v '^\.' |grep -v '^license/' |\
+grep -v '\.map$' | grep -v '\.png$' | grep -v '\.svg$'
+
+echo
+echo "Files with redundant BSD boilerplate"
+echo ""
+
+git grep -l SPDX-License-Identifier | \
+xargs grep -l 'Redistribution and use'
--
2.20.1
[dpdk-dev] [PATCH] net/virtio: Add support for vectorized functions on Power systems
Added the file virtio_rxtx_simple_altivec.c which implements Altivec code for the virtio vectorized RX functions. Updated the various build files. Cc: Maxime Coquelin Cc: Tiwei Bie Signed-off-by: David Christensen --- drivers/net/virtio/Makefile | 2 + drivers/net/virtio/meson.build | 2 + drivers/net/virtio/virtio_rxtx_simple_altivec.c | 202 3 files changed, 206 insertions(+) create mode 100644 drivers/net/virtio/virtio_rxtx_simple_altivec.c diff --git a/drivers/net/virtio/Makefile b/drivers/net/virtio/Makefile index 6c2c996..5144e7c 100644 --- a/drivers/net/virtio/Makefile +++ b/drivers/net/virtio/Makefile @@ -33,6 +33,8 @@ SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_rxtx_simple.c ifeq ($(CONFIG_RTE_ARCH_X86),y) SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_rxtx_simple_sse.c +else ifeq ($(CONFIG_RTE_ARCH_PPC_64),y) +SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_rxtx_simple_altivec.c else ifneq ($(filter y,$(CONFIG_RTE_ARCH_ARM) $(CONFIG_RTE_ARCH_ARM64)),) SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_rxtx_simple_neon.c endif diff --git a/drivers/net/virtio/meson.build b/drivers/net/virtio/meson.build index 7949054..04c7fdf 100644 --- a/drivers/net/virtio/meson.build +++ b/drivers/net/virtio/meson.build @@ -11,6 +11,8 @@ deps += ['kvargs', 'bus_pci'] if arch_subdir == 'x86' sources += files('virtio_rxtx_simple_sse.c') +elif arch_subdir == 'ppc_64' + sources += files('virtio_rxtx_simple_altivec.c') elif arch_subdir == 'arm' and host_machine.cpu_family().startswith('aarch64') sources += files('virtio_rxtx_simple_neon.c') endif diff --git a/drivers/net/virtio/virtio_rxtx_simple_altivec.c b/drivers/net/virtio/virtio_rxtx_simple_altivec.c new file mode 100644 index 000..f4eb4eb --- /dev/null +++ b/drivers/net/virtio/virtio_rxtx_simple_altivec.c @@ -0,0 +1,202 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2010-2015 Intel Corporation + */ + +#include +#include +#include +#include +#include + +#include + +#include +#include +#include +#include 
+#include +#include +#include +#include +#include +#include +#include +#include + +#include "virtio_rxtx_simple.h" + +#define RTE_VIRTIO_DESC_PER_LOOP 8 + +/* virtio vPMD receive routine, only accept(nb_pkts >= RTE_VIRTIO_DESC_PER_LOOP) + * + * This routine is for non-mergeable RX, one desc for each guest buffer. + * This routine is based on the RX ring layout optimization. Each entry in the + * avail ring points to the desc with the same index in the desc ring and this + * will never be changed in the driver. + * + * - nb_pkts < RTE_VIRTIO_DESC_PER_LOOP, just return no packet + */ +uint16_t +virtio_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts, + uint16_t nb_pkts) +{ + struct virtnet_rx *rxvq = rx_queue; + struct virtqueue *vq = rxvq->vq; + struct virtio_hw *hw = vq->hw; + uint16_t nb_used; + uint16_t desc_idx; + struct vring_used_elem *rused; + struct rte_mbuf **sw_ring; + struct rte_mbuf **sw_ring_end; + uint16_t nb_pkts_received = 0; + const vector unsigned char zero = {0}; + + const vector unsigned char shuf_msk1 = { + 0xFF, 0xFF, 0xFF, 0xFF, /* packet type */ + 4, 5, 0xFF, 0xFF, /* vlan tci */ + 4, 5, /* dat len */ + 0xFF, 0xFF, /* vlan tci */ + 0xFF, 0xFF, 0xFF, 0xFF + }; + + const vector unsigned char shuf_msk2 = { + 0xFF, 0xFF, 0xFF, 0xFF, /* packet type */ + 12, 13, 0xFF, 0xFF, /* pkt len */ + 12, 13, /* dat len */ + 0xFF, 0xFF, /* vlan tci */ + 0xFF, 0xFF, 0xFF, 0xFF + }; + + /* +* Subtract the header length. +* In which case do we need the header length in used->len ? 
+*/ + const vector unsigned short len_adjust = { + 0, 0, + (uint16_t)-vq->hw->vtnet_hdr_size, 0, + (uint16_t)-vq->hw->vtnet_hdr_size, 0, + 0, 0 + }; + + if (unlikely(hw->started == 0)) + return nb_pkts_received; + + if (unlikely(nb_pkts < RTE_VIRTIO_DESC_PER_LOOP)) + return 0; + + nb_used = VIRTQUEUE_NUSED(vq); + + rte_compiler_barrier(); + + if (unlikely(nb_used == 0)) + return 0; + + nb_pkts = RTE_ALIGN_FLOOR(nb_pkts, RTE_VIRTIO_DESC_PER_LOOP); + nb_used = RTE_MIN(nb_used, nb_pkts); + + desc_idx = (uint16_t)(vq->vq_used_cons_idx & (vq->vq_nentries - 1)); + rused = &vq->vq_split.ring.used->ring[desc_idx]; + sw_ring = &vq->sw_ring[desc_idx]; + sw_ring_end = &vq->sw_ring[vq->vq_nentries]; + + rte_prefetch0(rused); + + if (vq->vq_free_cnt >= RTE_VIRTIO_VPMD_RX_REARM_THRESH) { + virtio_rxq_rearm_vec(rxvq); + i
[dpdk-dev] [PATCH 0/2] fixes to resolve meson build issues on Power systems
Encountered a few issues while testing meson builds. The first is
related to the gcc compiler used on RHEL 7.6 and how it interprets
the "native" CPU type when running on a Power 9 system. The second
is a print format warning error on Power systems.

David Christensen (2):
  config: fix RHEL7.6 build errors on Power 9 systems
  vhost: fix build error caused by 64bit print formatting

 config/ppc_64/meson.build | 10 ++
 lib/librte_vhost/vhost_user.c | 9 +
 2 files changed, 15 insertions(+), 4 deletions(-)

--
1.8.3.1
[dpdk-dev] [PATCH 2/2] vhost: fix build error caused by 64bit print formatting
Use of %llx print formatting causes meson build error on Power systems
with RHEL 7.6 and gcc 4.8.5. Replace with PRIx64 macro.

Fixes: 9b62e2da1844 (vhost: register new regions with userfaultfd)
Cc: maxime.coque...@redhat.com

Signed-off-by: David Christensen
---
 lib/librte_vhost/vhost_user.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
index 0b72648..6a6d694 100644
--- a/lib/librte_vhost/vhost_user.c
+++ b/lib/librte_vhost/vhost_user.c
@@ -1086,10 +1086,11 @@
 				goto err_mmap;
 			}
 			RTE_LOG(INFO, VHOST_CONFIG,
-				"\t userfaultfd registered for range : %llx - %llx\n",
-				reg_struct.range.start,
-				reg_struct.range.start +
-				reg_struct.range.len - 1);
+				"\t userfaultfd registered for range : "
+				"%" PRIx64 " - %" PRIx64 "\n",
+				(uint64_t)reg_struct.range.start,
+				(uint64_t)reg_struct.range.start +
+				(uint64_t)reg_struct.range.len - 1);
 #else
 			goto err_mmap;
 #endif
--
1.8.3.1
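
The fix above relies on two things working together: casting the value to uint64_t and printing it with PRIx64, so that the format and the argument agree on every ABI regardless of how the kernel header types the field. A minimal standalone sketch of the same idiom follows; format_range() is a hypothetical helper, not part of librte_vhost, and it writes into a caller-supplied buffer instead of calling RTE_LOG.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/*
 * Format the inclusive address range [start, start + len - 1] in hex.
 * Returns the snprintf() result: the number of characters the full
 * string needs, excluding the terminating NUL.
 */
static int
format_range(char *buf, size_t buf_len, uint64_t start, uint64_t len)
{
	assert(buf != NULL && len > 0);
	/* PRIx64 expands to the right length modifier for uint64_t here. */
	return snprintf(buf, buf_len, "%" PRIx64 " - %" PRIx64,
			start, start + len - 1);
}
```

For example, format_range(buf, sizeof(buf), 0x1000, 0x2000) produces "1000 - 2fff" and returns 11.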
[dpdk-dev] [PATCH 1/2] config: fix RHEL7.6 build errors on Power 9 systems
gcc 4.8.5 used on RHEL 7.6 can identify a Power 9 CPU but cannot
generate Power 9 code when the "-mcpu=native" command line argument
is used. Test whether the compiler can generate Power 9 code and
adjust the machine setting appropriately.

Signed-off-by: David Christensen
---
 config/ppc_64/meson.build | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/config/ppc_64/meson.build b/config/ppc_64/meson.build
index 0e65f9d..495ef6f 100644
--- a/config/ppc_64/meson.build
+++ b/config/ppc_64/meson.build
@@ -7,6 +7,16 @@ endif
 dpdk_conf.set('RTE_ARCH', 'ppc_64')
 dpdk_conf.set('RTE_ARCH_PPC_64', 1)
 
+# RHEL 7.x uses gcc 4.8.X which doesn't generate code for Power 9 CPUs,
+# though it will detect a Power 9 CPU when the "-mcpu=native" argument
+# is used, resulting in a build failure.
+power9_supported = cc.has_argument('-mcpu=power9')
+if not power9_supported
+	machine = 'power8'
+	machine_args = ['-mcpu=power8', '-mtune=power8']
+	dpdk_conf.set('RTE_MACHINE','power8')
+endif
+
 # overrides specific to ppc64
 dpdk_conf.set('RTE_MAX_LCORE', 1536)
 dpdk_conf.set('RTE_MAX_NUMA_NODES', 32)
--
1.8.3.1
[dpdk-dev] [PATCH v1 0/2] examples/ipsec-secgw: add fallback session
Inline processing is limited to a specified subset of traffic. It is
often unable to handle more complicated situations, such as fragmented
traffic. When using inline processing such traffic is dropped.

Introduce multiple sessions per SA, allowing a fallback lookaside
session to be configured for packets that normally would be dropped.
The fallback session type is set in the SA configuration by adding
'fallback' with the 'lookaside-none' or 'lookaside-protocol' parameter
to determine the type of session.

The fallback session feature is available only when using librte_ipsec.

Marcin Smoczynski (2):
  examples/ipsec-secgw: ipsec_sa structure cleanup
  examples/ipsec-secgw: add fallback session feature

 doc/guides/sample_app_ug/ipsec_secgw.rst | 17 ++-
 examples/ipsec-secgw/esp.c | 35 --
 examples/ipsec-secgw/ipsec-secgw.c | 16 ++-
 examples/ipsec-secgw/ipsec.c | 99 ---
 examples/ipsec-secgw/ipsec.h | 61 +++--
 examples/ipsec-secgw/ipsec_process.c | 113 ++---
 examples/ipsec-secgw/sa.c | 153 +--
 7 files changed, 334 insertions(+), 160 deletions(-)

--
2.17.1
[dpdk-dev] [PATCH v1 1/2] examples/ipsec-secgw: ipsec_sa structure cleanup
Cleanup ipsec_sa structure by removing every field that is already in the rte_ipsec_session structure: * cryptodev/security session union * action type * offload flags * security context References to abovementioned fields are changed to direct references to matching fields of rte_ipsec_session structure. Such refactoring is needed to introduce many sessions per SA feature, e.g. fallback session for inline offload processing. Signed-off-by: Marcin Smoczynski --- examples/ipsec-secgw/esp.c | 35 +++ examples/ipsec-secgw/ipsec.c | 91 +++- examples/ipsec-secgw/ipsec.h | 27 ++--- examples/ipsec-secgw/ipsec_process.c | 30 - examples/ipsec-secgw/sa.c| 66 ++-- 5 files changed, 137 insertions(+), 112 deletions(-) diff --git a/examples/ipsec-secgw/esp.c b/examples/ipsec-secgw/esp.c index d6d7b1256..c1b49da1e 100644 --- a/examples/ipsec-secgw/esp.c +++ b/examples/ipsec-secgw/esp.c @@ -30,7 +30,8 @@ esp_inbound(struct rte_mbuf *m, struct ipsec_sa *sa, int32_t payload_len, ip_hdr_len; RTE_ASSERT(sa != NULL); - if (sa->type == RTE_SECURITY_ACTION_TYPE_INLINE_CRYPTO) + if (ipsec_get_action_type(sa) == + RTE_SECURITY_ACTION_TYPE_INLINE_CRYPTO) return 0; RTE_ASSERT(m != NULL); @@ -148,13 +149,16 @@ esp_inbound_post(struct rte_mbuf *m, struct ipsec_sa *sa, uint8_t *nexthdr, *pad_len; uint8_t *padding; uint16_t i; + struct rte_ipsec_session *ips; RTE_ASSERT(m != NULL); RTE_ASSERT(sa != NULL); RTE_ASSERT(cop != NULL); - if ((sa->type == RTE_SECURITY_ACTION_TYPE_INLINE_PROTOCOL) || - (sa->type == RTE_SECURITY_ACTION_TYPE_INLINE_CRYPTO)) { + ips = ipsec_get_session(sa); + + if ((ips->type == RTE_SECURITY_ACTION_TYPE_INLINE_PROTOCOL) || + (ips->type == RTE_SECURITY_ACTION_TYPE_INLINE_CRYPTO)) { if (m->ol_flags & PKT_RX_SEC_OFFLOAD) { if (m->ol_flags & PKT_RX_SEC_OFFLOAD_FAILED) cop->status = RTE_CRYPTO_OP_STATUS_ERROR; @@ -169,8 +173,8 @@ esp_inbound_post(struct rte_mbuf *m, struct ipsec_sa *sa, return -1; } - if (sa->type == RTE_SECURITY_ACTION_TYPE_INLINE_CRYPTO && - sa->ol_flags & 
RTE_SECURITY_RX_HW_TRAILER_OFFLOAD) { + if (ips->type == RTE_SECURITY_ACTION_TYPE_INLINE_CRYPTO && + ips->security.ol_flags & RTE_SECURITY_RX_HW_TRAILER_OFFLOAD) { nexthdr = &m->inner_esp_next_proto; } else { nexthdr = rte_pktmbuf_mtod_offset(m, uint8_t*, @@ -225,10 +229,12 @@ esp_outbound(struct rte_mbuf *m, struct ipsec_sa *sa, struct rte_crypto_sym_op *sym_cop; int32_t i; uint16_t pad_payload_len, pad_len, ip_hdr_len; + struct rte_ipsec_session *ips; RTE_ASSERT(m != NULL); RTE_ASSERT(sa != NULL); + ips = ipsec_get_session(sa); ip_hdr_len = 0; ip4 = rte_pktmbuf_mtod(m, struct ip *); @@ -277,9 +283,10 @@ esp_outbound(struct rte_mbuf *m, struct ipsec_sa *sa, } /* Add trailer padding if it is not constructed by HW */ - if (sa->type != RTE_SECURITY_ACTION_TYPE_INLINE_CRYPTO || - (sa->type == RTE_SECURITY_ACTION_TYPE_INLINE_CRYPTO && -!(sa->ol_flags & RTE_SECURITY_TX_HW_TRAILER_OFFLOAD))) { + if (ips->type != RTE_SECURITY_ACTION_TYPE_INLINE_CRYPTO || + (ips->type == RTE_SECURITY_ACTION_TYPE_INLINE_CRYPTO && +!(ips->security.ol_flags & +RTE_SECURITY_TX_HW_TRAILER_OFFLOAD))) { padding = (uint8_t *)rte_pktmbuf_append(m, pad_len + sa->digest_len); if (unlikely(padding == NULL)) { @@ -344,8 +351,9 @@ esp_outbound(struct rte_mbuf *m, struct ipsec_sa *sa, } } - if (sa->type == RTE_SECURITY_ACTION_TYPE_INLINE_CRYPTO) { - if (sa->ol_flags & RTE_SECURITY_TX_HW_TRAILER_OFFLOAD) { + if (ips->type == RTE_SECURITY_ACTION_TYPE_INLINE_CRYPTO) { + if (ips->security.ol_flags & + RTE_SECURITY_TX_HW_TRAILER_OFFLOAD) { /* Set the inner esp next protocol for HW trailer */ m->inner_esp_next_proto = nlp; m->packet_type |= RTE_PTYPE_TUNNEL_ESP; @@ -448,11 +456,14 @@ esp_outbound_post(struct rte_mbuf *m, struct ipsec_sa *sa, struct rte_crypto_op *cop) { + enum rte_security_session_action_type type; RTE_ASSERT(m != NULL); RTE_ASSERT(sa != NULL); - if ((sa->type == RTE_SECURITY_ACTION_TYPE_INLINE_PROTOCOL) || - (sa->type == RTE_SECURITY_ACTION_TYPE_INLINE_CRYPTO)) { + type = ipsec_get
[dpdk-dev] [PATCH v1 2/2] examples/ipsec-secgw: add fallback session feature
Inline processing is limited to a specified subset of traffic. It is often unable to handle more complicated situations, such as fragmented traffic. When using inline processing such traffic is dropped. Introduce multiple sessions per SA allowing to configure a fallback lookaside session for packets that normally would be dropped. A fallback session type in the SA configuration by adding 'fallback' with 'lookaside-none' or 'lookaside-protocol' parameter to determine type of session. Fallback session feature is available only when using librte_ipsec. Signed-off-by: Marcin Smoczynski --- doc/guides/sample_app_ug/ipsec_secgw.rst | 17 +++- examples/ipsec-secgw/esp.c | 4 +- examples/ipsec-secgw/ipsec-secgw.c | 16 ++-- examples/ipsec-secgw/ipsec.c | 10 +-- examples/ipsec-secgw/ipsec.h | 40 -- examples/ipsec-secgw/ipsec_process.c | 85 +--- examples/ipsec-secgw/sa.c| 99 7 files changed, 210 insertions(+), 61 deletions(-) diff --git a/doc/guides/sample_app_ug/ipsec_secgw.rst b/doc/guides/sample_app_ug/ipsec_secgw.rst index ad2d79e75..9ded5fb70 100644 --- a/doc/guides/sample_app_ug/ipsec_secgw.rst +++ b/doc/guides/sample_app_ug/ipsec_secgw.rst @@ -401,7 +401,7 @@ The SA rule syntax is shown as follows: .. code-block:: console sa - + where each options means: @@ -573,6 +573,21 @@ where each options means: * *port_id X* X is a valid device number in decimal + + + * Action type for packets that inline processor failed to process. + + * Optional: Yes, by default there is *no fallback* session type. 
+ + * Available options: + + * *lookaside-none*: use automatically chosen cryptodev to process packets + * *lookaside-protocol*: lookaside protocol offload to HW accelerator + + * Syntax: + + * *fallback lookaside-none* + * *fallback lookaside-protocol* Example SA rules: diff --git a/examples/ipsec-secgw/esp.c b/examples/ipsec-secgw/esp.c index c1b49da1e..bfa7ff721 100644 --- a/examples/ipsec-secgw/esp.c +++ b/examples/ipsec-secgw/esp.c @@ -155,7 +155,7 @@ esp_inbound_post(struct rte_mbuf *m, struct ipsec_sa *sa, RTE_ASSERT(sa != NULL); RTE_ASSERT(cop != NULL); - ips = ipsec_get_session(sa); + ips = ipsec_get_primary_session(sa); if ((ips->type == RTE_SECURITY_ACTION_TYPE_INLINE_PROTOCOL) || (ips->type == RTE_SECURITY_ACTION_TYPE_INLINE_CRYPTO)) { @@ -234,7 +234,7 @@ esp_outbound(struct rte_mbuf *m, struct ipsec_sa *sa, RTE_ASSERT(m != NULL); RTE_ASSERT(sa != NULL); - ips = ipsec_get_session(sa); + ips = ipsec_get_primary_session(sa); ip_hdr_len = 0; ip4 = rte_pktmbuf_mtod(m, struct ip *); diff --git a/examples/ipsec-secgw/ipsec-secgw.c b/examples/ipsec-secgw/ipsec-secgw.c index 0d1fd6af6..641ed3767 100644 --- a/examples/ipsec-secgw/ipsec-secgw.c +++ b/examples/ipsec-secgw/ipsec-secgw.c @@ -189,6 +189,7 @@ static uint32_t mtu_size = RTE_ETHER_MTU; /* application wide librte_ipsec/SA parameters */ struct app_sa_prm app_sa_prm = {.enable = 0}; +static const char *cfgfile; struct lcore_rx_queue { uint16_t port_id; @@ -1462,12 +1463,7 @@ parse_args(int32_t argc, char **argv) print_usage(prgname); return -1; } - if (parse_cfg_file(optarg) < 0) { - printf("parsing file \"%s\" failed\n", - optarg); - print_usage(prgname); - return -1; - } + cfgfile = optarg; f_present = 1; break; case 'j': @@ -2398,6 +2394,14 @@ main(int32_t argc, char **argv) if (ret < 0) rte_exit(EXIT_FAILURE, "Invalid parameters\n"); + /* parse configuration file */ + if (parse_cfg_file(cfgfile) < 0) { + printf("parsing file \"%s\" failed\n", + cfgfile); + print_usage(argv[0]); + return -1; + } + if 
((unprotected_port_mask & enabled_port_mask) != unprotected_port_mask) rte_exit(EXIT_FAILURE, "Invalid unprotected portmask 0x%x\n", diff --git a/examples/ipsec-secgw/ipsec.c b/examples/ipsec-secgw/ipsec.c index 8c60bd84b..8b0441028 100644 --- a/examples/ipsec-secgw/ipsec.c +++ b/examples/ipsec-secgw/ipsec.c @@ -404,7 +404,7 @@ enqueue_cop(struct cdev_qp *cqp, struct rte_crypto_op *cop) static inline void ipsec_enqueue(ipsec_xform_fn xform_func, struct ipsec_ctx *ipsec_ctx, - struct rte_mbuf *pkts[], struct ipsec_sa *sas[], + struct rte_mbuf *pkts[], void *sas[], uint16_t nb_pkts) { int32_t ret = 0, i; @@ -42
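The primary/fallback idea from the commit message can be sketched in isolation as below. This is only a minimal illustration, not the code from the patch: `struct sa_sketch`, `struct session`, `is_inline()` and `select_session()` are hypothetical names, and "fragmented" stands in for any packet the inline path cannot handle.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical session types, loosely mirroring the options in the patch. */
enum session_type {
	SESSION_INLINE_CRYPTO,
	SESSION_LOOKASIDE_NONE,
	SESSION_LOOKASIDE_PROTOCOL,
};

struct session {
	enum session_type type;
};

#define MAX_SESSIONS 2

/* Hypothetical SA: slot 0 is the primary session, slot 1 the fallback. */
struct sa_sketch {
	struct session sessions[MAX_SESSIONS];
	unsigned int nb_sessions; /* 1: primary only, 2: primary + fallback */
};

static bool
is_inline(const struct session *s)
{
	return s->type == SESSION_INLINE_CRYPTO;
}

/*
 * Pick the session for one packet. Returns NULL when the inline primary
 * session cannot handle the packet and no fallback is configured, in
 * which case the caller would drop the packet.
 */
static struct session *
select_session(struct sa_sketch *sa, bool fragmented)
{
	struct session *primary = &sa->sessions[0];

	if (!is_inline(primary) || !fragmented)
		return primary;
	return sa->nb_sessions > 1 ? &sa->sessions[1] : NULL;
}
```

Without a fallback the behaviour matches what the commit message describes for plain inline processing: the packet has no usable session and is dropped.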
Re: [dpdk-dev] [RFC] ethdev: support hairpin queue
> -Original Message- > From: Stephen Hemminger > Sent: Wednesday, August 14, 2019 5:56 PM > To: Ori Kam > Cc: Thomas Monjalon ; ferruh.yi...@intel.com; > arybche...@solarflare.com; Shahaf Shuler ; Slava > Ovsiienko ; Alex Rosenbaum > ; dev@dpdk.org > Subject: Re: [dpdk-dev] [RFC] ethdev: support hairpin queue > > On Wed, 14 Aug 2019 06:05:13 + > Ori Kam wrote: > > > > -Original Message- > > > From: Ori Kam > > > Sent: Wednesday, August 14, 2019 8:36 AM > > > To: Stephen Hemminger > > > Cc: Thomas Monjalon ; ferruh.yi...@intel.com; > > > arybche...@solarflare.com; Shahaf Shuler ; > Slava > > > Ovsiienko ; Alex Rosenbaum > > > ; dev@dpdk.org > > > Subject: RE: [dpdk-dev] [RFC] ethdev: support hairpin queue > > > > > > Hi Stephen, > > > > > > > -Original Message- > > > > From: Stephen Hemminger > > > > Sent: Tuesday, August 13, 2019 6:46 PM > > > > To: Ori Kam > > > > Cc: Thomas Monjalon ; ferruh.yi...@intel.com; > > > > arybche...@solarflare.com; Shahaf Shuler ; > Slava > > > > Ovsiienko ; Alex Rosenbaum > > > > ; dev@dpdk.org > > > > Subject: Re: [dpdk-dev] [RFC] ethdev: support hairpin queue > > > > > > > > On Tue, 13 Aug 2019 13:37:48 + > > > > Ori Kam wrote: > > > > > > > > > This RFC replaces RFC[1]. > > > > > > > > > > The hairpin feature (different name can be forward) acts as "bump on > the > > > > wire", > > > > > meaning that a packet that is received from the wire can be modified > using > > > > > offloaded action and then sent back to the wire without application > > > > intervention > > > > > which save CPU cycles. > > > > > > > > > > The hairpin is the inverse function of loopback in which application > > > > > sends a packet then it is received again by the > > > > > application without being sent to the wire. > > > > > > > > > > The hairpin can be used by a number of different NVF, for example load > > > > > balancer, gateway and so on. 
> > > > >
> > > > > As can be seen from the hairpin description, hairpin is basically RX queue
> > > > > connected to TX queue.
> > > > >
> > > > > During the design phase I was thinking of two ways to implement this
> > > > > feature the first one is adding a new rte flow action. and the second
> > > > > one is create a special kind of queue.
> > > >
> > > >
> > > > Life would be easier for users if the hairpin was an attribute
> > > > of queue configuration, not a separate API call.
> > >
> > > I was thinking about it. the reason that I split the functions is that they use
> > > different parameters sets. For example the hairpin queue doesn't need memory region
> > > while it does need the hairpin configuration. So in each case hairpin queue / normal
> > > queue there will be parameters that are not in use. I think this is less preferred.
> > > What do you think?
> > >
> >
> > Forgot in my last mail two more reasons I had for this for this:
> > 1. changing to existing function will break API, and will force all applications
> > to update date.
> > 2. 2 API are easier to document and explain.
> > 3. the reason stated above that there will be unused parameters in each call.
>
> New API's are like system calls, they create longer term support overhead.
> It would be good if there was support for this on multiple NIC types.

I don't know the capabilities of other NICs. I think this is a good feature that can be embraced and implemented by other NICs (maybe they can even have a SW implementation that still uses the CPU but gives a faster packet rate, since they know how their HW works).

Regarding long-term support, I'm sorry, but I don't see it as that big an issue; for this exact reason I think a dedicated API is much easier to maintain. Also, maybe in the future there will be a new type, and then the generic function will have a lot of unused code, which is hard to maintain and debug.

Thanks,
Ori
Re: [dpdk-dev] *rte_vhost_rx_queue_count* should be protected by vq->access_lock
On Wed, Aug 14, 2019 at 03:31:09AM +, He Peng wrote:
> Hi,
>
> We found that *rte_vhost_rx_queue_count* is not protected by vq->access_lock,
> and the access to vq->avail->idx is not thread-safe, since at the same time,
> the vq->avail might be set by *vring_invalidate* when some vhost-user messages
> arrived, such as VRING_SET_ADDRESS, VRING_SET_MEM_TABLE, etc.

You are right. And other similar APIs also need to be protected.
Thanks for reporting this!

Thanks,
Tiwei

>
>
> Thanks.
>
>
>
>
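For readers following the thread, the race and the fix can be sketched outside of DPDK as below. This is a simplified illustration under stated assumptions, not the vhost library code: `struct virtqueue_sketch`, `vq_invalidate()` and `rx_queue_count()` are hypothetical stand-ins, and a C11 `atomic_flag` spinlock plays the role of vq->access_lock.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the vhost structures. */
struct avail_ring {
	uint16_t idx;
};

struct virtqueue_sketch {
	atomic_flag access_lock;  /* plays the role of vq->access_lock */
	struct avail_ring *avail; /* may be reset by an invalidate */
	uint16_t last_avail_idx;
};

static void
vq_lock(struct virtqueue_sketch *vq)
{
	while (atomic_flag_test_and_set_explicit(&vq->access_lock,
			memory_order_acquire))
		; /* spin until the holder releases the lock */
}

static void
vq_unlock(struct virtqueue_sketch *vq)
{
	atomic_flag_clear_explicit(&vq->access_lock, memory_order_release);
}

/* Control path: drop the ring mapping while holding the lock. */
static void
vq_invalidate(struct virtqueue_sketch *vq)
{
	vq_lock(vq);
	vq->avail = NULL;
	vq_unlock(vq);
}

/* Data path: dereference vq->avail only while holding the lock. */
static uint32_t
rx_queue_count(struct virtqueue_sketch *vq)
{
	uint32_t count = 0;

	vq_lock(vq);
	if (vq->avail != NULL)
		count = (uint16_t)(vq->avail->idx - vq->last_avail_idx);
	vq_unlock(vq);
	return count;
}
```

The point of the sketch is that the NULL check and the dereference happen under the same lock the control path takes, so an invalidate cannot slip in between them.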
Re: [dpdk-dev] [RFC] ethdev: allow multiple security sessions to use one rte flow
Hi Akhil, Please see inline. Thanks, Anoob > -Original Message- > From: Akhil Goyal > Sent: Wednesday, August 14, 2019 4:37 PM > To: Anoob Joseph ; Adrien Mazarguil > ; Declan Doherty > ; Pablo de Lara > ; Thomas Monjalon > > Cc: Jerin Jacob Kollanukkaran ; Narayana Prasad Raju > Athreya ; Ankur Dwivedi > ; Shahaf Shuler ; > Hemant Agrawal ; Matan Azrad > ; Yongseok Koh ; Wenzhuo > Lu ; Konstantin Ananyev > ; Radu Nicolau ; > dev@dpdk.org > Subject: RE: [RFC] ethdev: allow multiple security sessions to use one rte > flow > > Hi Anoob, > > > > > Hi all, > > > > Reminder...! > > > Sorry for a delayed response. > > > If there are no concerns, I'll send the patch after adding the > > required changes in ipsec-secgw as well. > > > > Thanks, > > Anoob > > > > > -Original Message- > > > From: Anoob Joseph > > > Sent: Friday, August 2, 2019 11:05 AM > > > To: Anoob Joseph ; Akhil Goyal > > > ; Adrien Mazarguil > > > ; Declan Doherty > > > ; Pablo de Lara > > > ; Thomas Monjalon > > > > > > Cc: Jerin Jacob Kollanukkaran ; Narayana Prasad > > > Raju Athreya ; Ankur Dwivedi > > > ; Shahaf Shuler ; > Hemant > > > Agrawal ; Matan Azrad > ; > > > Yongseok Koh ; Wenzhuo Lu > > > ; Konstantin Ananyev > > > ; Radu Nicolau > > > ; dev@dpdk.org > > > Subject: RE: [RFC] ethdev: allow multiple security sessions to use > > > one rte flow > > > > > > Hi Akhil, Adrien, Declan, Pablo, > > > > > > Can you review this proposal and share your feedback? 
> > > > > > Thanks, > > > Anoob > > > > > > > -Original Message- > > > > From: Anoob Joseph > > > > Sent: Wednesday, July 24, 2019 7:47 PM > > > > To: Akhil Goyal ; Adrien Mazarguil > > > > ; Declan Doherty > > > > ; Pablo de Lara > > > > ; Thomas Monjalon > > > > > > > > Cc: Anoob Joseph ; Jerin Jacob Kollanukkaran > > > > ; Narayana Prasad Raju Athreya > > > > ; Ankur Dwivedi ; > > > Shahaf > > > > Shuler ; Hemant Agrawal > > > > ; Matan Azrad ; > > > Yongseok > > > > Koh ; Wenzhuo Lu ; > > > > Konstantin Ananyev ; Radu Nicolau > > > > ; dev@dpdk.org > > > > Subject: [RFC] ethdev: allow multiple security sessions to use one > > > > rte flow > > > > > > > > The rte_security API which enables inline protocol/crypto feature > > > > mandates that for every security session an rte_flow is created. > > > > This would internally translate to a rule in the hardware which > > > > would do packet > > > classification. > > > > > > > > In rte_securty, one SA would be one security session. And if an > > > > rte_flow need to be created for every session, the number of SAs > > > > supported by an inline implementation would be limited by the > > > > number of rte_flows the PMD would be able to support. > > > > > > > > If the fields SPI & IP addresses are allowed to be a range, then > > > > this limitation can be overcome. Multiple flows will be able to > > > > use one rule for SECURITY processing. In this case, the security > > > > session provided as > > > conf would be NULL. > > SPI values are normally used to uniquely identify the SA that need to be > applied on a particular flow. > I believe SPI value should not be a range for applying a particular SA or > session. > > Plain packet IP addresses can be a range. That is not an issue. Multiple plain > packet flows can use the same session/SA. > > Why do you feel that security session provided should be NULL to support > multiple flows. > How will the keys and other SA related info will be passed to the driver/HW. 
[Anoob] The SA configuration would be done via the rte_security session. The proposal here only changes the 1:1 dependency between rte_flow and rte_security session. The h/w could use the SPI field in the received packet to identify the SA (i.e., the rte_security session). If the h/w allows indexing into a table which holds SA information, then a per-SPI rte_flow is not required. This is in fact our case. For PMDs which don't do it this way, rte_flow_validate() would fail, and a per-SPI rte_flow would then need to be created.

In the present model, a security session is created, and then an rte_flow connects ESP packets with one SPI to one security session. Instead, when we create the security session, the h/w can populate entries in a DB that would be accessed during data path handling. The rte_flow could then say that all SPIs in some range get inline processed with the security session identified by the SPI.

Our PMD supports a limited number of flow entries, but our h/w can do SA lookup without flow entries (using the SPI instead). So the current approach of one flow per session creates an artificial limit on the number of SAs that can be supported.

>
> > > >
> > > > Application should do an rte_flow_validate() to make sure the flow
> > > > is supported on the PMD.
> > > >
> > > > Signed-off-by: Anoob Joseph
> > > > ---
> > > > lib/librte_ethdev/rte_flow.h | 6 ++
> > > > 1 file changed, 6 insertions(+)
> > > >
> > > > diff --git a/lib/librte_ethdev/rte_flow.h
> > > > b/lib/librte_ethdev/rte_flow.h in
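The "SA lookup by SPI without a per-SPI flow rule" idea from the reply above can be illustrated with a small software model. This is only a sketch under assumptions: the table size, `struct sa_entry`, `sa_add()` and `sa_lookup()` are hypothetical and say nothing about how any particular PMD or hardware implements the lookup.

```c
#include <stddef.h>
#include <stdint.h>

#define SA_TABLE_SIZE 1024 /* hypothetical capacity of the SA table */

/* Hypothetical SA table entry; spi == 0 marks an unused slot. */
struct sa_entry {
	uint32_t spi;
	void *session; /* stands in for the rte_security session */
};

static struct sa_entry sa_table[SA_TABLE_SIZE];

/* Session creation populates the table; no classification rule is added. */
static int
sa_add(uint32_t spi, void *session)
{
	struct sa_entry *e = &sa_table[spi % SA_TABLE_SIZE];

	if (spi == 0 || e->spi != 0)
		return -1; /* reserved SPI or slot already taken */
	e->spi = spi;
	e->session = session;
	return 0;
}

/*
 * Data path: one range rule steers all ESP packets here, and the SPI
 * from the packet selects the session. NULL means no SA for this SPI.
 */
static void *
sa_lookup(uint32_t spi)
{
	const struct sa_entry *e = &sa_table[spi % SA_TABLE_SIZE];

	return (spi != 0 && e->spi == spi) ? e->session : NULL;
}
```

In this model the number of SAs is bounded by the table size rather than by the number of classification rules, which is the limitation the RFC is trying to lift.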
Re: [dpdk-dev] [PATCH 2/2] vhost: fix build error caused by 64bit print formatting
On Wed, Aug 14, 2019 at 8:37 PM David Christensen wrote:
>
> Use of %llx print formatting causes meson build error on Power systems with
> RHEL 7.6 and gcc 4.8.5. Replace with PRIx64 macro.
>
> Fixes: 9b62e2da1844 (vhost: register new regions with userfaultfd)
> Cc: maxime.coque...@redhat.com
>
> Signed-off-by: David Christensen
> ---
> lib/librte_vhost/vhost_user.c | 9 +
> 1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
> index 0b72648..6a6d694 100644
> --- a/lib/librte_vhost/vhost_user.c
> +++ b/lib/librte_vhost/vhost_user.c
> @@ -1086,10 +1086,11 @@
> goto err_mmap;
> }
> RTE_LOG(INFO, VHOST_CONFIG,
> - "\t userfaultfd registered for range : %llx - %llx\n",
> - reg_struct.range.start,
> - reg_struct.range.start +
> - reg_struct.range.len - 1);
> + "\t userfaultfd registered for range : "
> + "%" PRIx64 " - %" PRIx64 "\n",
> + (uint64_t)reg_struct.range.start,
> + (uint64_t)reg_struct.range.start +
> + (uint64_t)reg_struct.range.len - 1);

struct uffdio_register {
	struct uffdio_range range;
...
struct uffdio_range {
	__u64 start;
	__u64 len;
};

You can drop those casts.

-- 
David Marchand
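As a standalone illustration of the PRIx64 pattern discussed in this patch, the sketch below formats an inclusive address range. It is an assumption-laden example rather than the vhost code: `struct addr_range` is a hypothetical stand-in for struct uffdio_range that uses uint64_t fields directly, so no cast is involved either way.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for struct uffdio_range, using uint64_t fields. */
struct addr_range {
	uint64_t start;
	uint64_t len;
};

/*
 * Format "start - end" (inclusive end) in hex. PRIx64 expands to the
 * conversion specifier that matches uint64_t on the target, so the same
 * source builds cleanly whether uint64_t is long or long long.
 * Returns the value returned by snprintf().
 */
static int
format_range(char *buf, size_t size, const struct addr_range *r)
{
	return snprintf(buf, size, "%" PRIx64 " - %" PRIx64,
			r->start, r->start + r->len - 1);
}
```

Whether a cast is needed in the real code depends on how the kernel's __u64 and the C library's uint64_t are defined on a given architecture; the sketch deliberately sidesteps that question.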